|
|
 |
Re: FN-FORUM: [OT] flippancy - What's the best development platform for creating Web 2.0 business applications?
date posted 22nd November 2007 13:20
Graham Stark wrote:
> Hi.
>
>
>> On 22 Nov 2007 09:59:14 -0000, Graham Stark wrote
>>
>>> "But relying on these as the main tools to describe the
>>> relationships in a database causes lots of people lots of problems -
>>> especially unique incrementing IDs as primary keys."
>>>
>> Can you give an example of a problem caused by auto ints as I'm not sure
>> what you are getting at here?
>>
>>
>
> I'll try: I shouldn't make a statement and not be prepared to justify
> it, I suppose. I've given up my lunch break to write this!
>
> Suppose you have a discussion board. It has users, with characteristics
> like password, birthdate, gender and so on. Users identify themselves in
> some way, perhaps a nickname. The board has topics. Users post messages
> onto topic pages.
>
> A "single incrementing id field version" implementation of this, would
> be, I think:
>
> table user ( id integer primary key, );
> table topic( id integer primary key, );
> table message( id integer primary key,
> );
>
>
> OTOH, a relational/'academic bollocks' version with 'natural keys' might
> be:
>
> table user ( name char(30) primary key, );
> table topic( name char(30) primary key, );
> table message( user_name, topic_name, message_id, body,
> primary key( user_name, topic_name, message_id )
> foreign key user_name references user,
> foreign key topic_name reference topic );
>
> In the user table, the 'name char(30) primary key' enforces the fact
> that each user has to have a unique nickname.
>
> In the message table,
>
> 'primary key( user_name, topic_name, message_id )'
>
> tells you that each message is distinguished by the user who posted it,
> the topic and some number identifier (which could auto-increment if the
> database supports that). The
>
> 'foreign key user_name references user'
>
> tells you that the user must exist in the user table before she can
> post. Likewise for the topic.
>
> The first version seems to me close to how these things are most
> commonly done. I'm working at the moment on integrating a discussion
> board with almost exactly that structure. The second version is
> undoubtedly a little harder to set up, and results in larger (and for
> some purposes slower) tables. But I suggest it is better. I think there
> are three things to worry about about the first version:
>
> 1) If you *just* have ids as the primary keys, then you can't (legally)
> use foreign keys to specify how tables relate to one another. And that's
> a loss. On practical reason it's a loss is because modern code
> generators can use these relationships to generate nice code for you.
> For example, the PHP Propel system would generate a $user->getMessages()
> method to get all messages for a user;
>
> 2) If there are things that really make something unique - for example,
> that no two users can have the same nickname, then in the first case you
> have to express that somewhere else, either in application code or in
> some otherwise redundant index;
>
> 3) It's often much easier to query the second version. An example: a
> typical task might be to display all messages posted by 'john' about
> 'football'. Messages in the 'normalised/natural key' version might look
> like:
>
> user_name | topic_name | message_id | body
> 'John' 'Football' 3 stuff.....
>
> So the query is just:
> select body from message where user_name = 'john' and 'topic_name' =
> 'football';
>
> Whereas, in the unique id version, a row might look like:
>
> user_id | topic_id | message_id | body
> 3 22 2 Some stuff.....
>
> So we don't know which user_id refers to 'John' and which topic_id is
> 'football'. You have either to look them up in the query by joining all
> three tables:
>
> select body from message, topic, user where
> message.topic_id = topic.id and
> message.user_id = user.id and
> user.user_name = 'John' and
> topic.topic_name = 'Football';
>
> Or issue multiple queries in your application code:
>
> select id from user where user_name = 'John' into :user_id
> select id from topic where topic_name = 'Football' into :message_id
> select body from message
> where
> message_id = :message_id and
> user_id = :user_id
>
> My impression is that frequently any speed advantage from having simple
> integer primary keys gets wiped out by the number of joins or queries
> you have to make to get useful information out.
>
> In an example with three tables, none of this is unmanageably
> complicated, but in a big system with lots of tables it can hurt.
>
>
>
>>> It's also the single main problem with the persistence model in Ruby
>>> On Rails, which is where we came in. Look at item one on this list, for
>>> example:
>>>
>>> http://discuss.joelonsoftware.com/default.asp?joel.3.309321.3
>>>
>> Item one on that list speaks of the relational model being represented in
>> the business layer, it doesn't speak of problems of using auto ints as a
>> primary key in the database.
>>
>>
>
> It's really the same point: if you have single auto increment fields (or
> single keys of any kind) as the primary keys in all your tables, then
> the relationships between your tables must be described somewhere other
> than in the database itself. Which is a shame, as databases are good at
> this.
>
> Graham
>
>
>> --
>> Gary Short
>> http://www.garyshort.org
>>
>>
>>
>
>
>
What if John undergoes sex-reassignment and wants to be known as Jane
from now on?
Tom
|
 |
|