|
|
 |
Re: FN-FORUM: [OT] flippancy - What's the best development platform for creating Web 2.0 business applications?
date posted 22nd November 2007 12:53
Hi.
> On 22 Nov 2007 09:59:14 -0000, Graham Stark wrote
> > "But relying on these as the main tools to describe the
> > relationships in a database causes lots of people lots of problems -
> > especially unique incrementing IDs as primary keys."
>
> Can you give an example of a problem caused by auto ints as I'm not sure
> what you are getting at here?
>
I'll try: I shouldn't make a statement and not be prepared to justify
it, I suppose. I've given up my lunch break to write this!
Suppose you have a discussion board. It has users, with characteristics
like password, birthdate, gender and so on. Users identify themselves in
some way, perhaps a nickname. The board has topics. Users post messages
onto topic pages.
A "single incrementing id field version" implementation of this, would
be, I think:
table user ( id integer primary key, );
table topic( id integer primary key, );
table message( id integer primary key,
);
OTOH, a relational/'academic bollocks' version with 'natural keys' might
be:
table user ( name char(30) primary key, );
table topic( name char(30) primary key, );
table message( user_name, topic_name, message_id, body,
primary key( user_name, topic_name, message_id )
foreign key user_name references user,
foreign key topic_name reference topic );
In the user table, the 'name char(30) primary key' enforces the fact
that each user has to have a unique nickname.
In the message table,
'primary key( user_name, topic_name, message_id )'
tells you that each message is distinguished by the user who posted it,
the topic and some number identifier (which could auto-increment if the
database supports that). The
'foreign key user_name references user'
tells you that the user must exist in the user table before she can
post. Likewise for the topic.
The first version seems to me close to how these things are most
commonly done. I'm working at the moment on integrating a discussion
board with almost exactly that structure. The second version is
undoubtedly a little harder to set up, and results in larger (and for
some purposes slower) tables. But I suggest it is better. I think there
are three things to worry about about the first version:
1) If you *just* have ids as the primary keys, then you can't (legally)
use foreign keys to specify how tables relate to one another. And that's
a loss. On practical reason it's a loss is because modern code
generators can use these relationships to generate nice code for you.
For example, the PHP Propel system would generate a $user->getMessages()
method to get all messages for a user;
2) If there are things that really make something unique - for example,
that no two users can have the same nickname, then in the first case you
have to express that somewhere else, either in application code or in
some otherwise redundant index;
3) It's often much easier to query the second version. An example: a
typical task might be to display all messages posted by 'john' about
'football'. Messages in the 'normalised/natural key' version might look
like:
user_name | topic_name | message_id | body
'John' 'Football' 3 stuff.....
So the query is just:
select body from message where user_name = 'john' and 'topic_name' =
'football';
Whereas, in the unique id version, a row might look like:
user_id | topic_id | message_id | body
3 22 2 Some stuff.....
So we don't know which user_id refers to 'John' and which topic_id is
'football'. You have either to look them up in the query by joining all
three tables:
select body from message, topic, user where
message.topic_id = topic.id and
message.user_id = user.id and
user.user_name = 'John' and
topic.topic_name = 'Football';
Or issue multiple queries in your application code:
select id from user where user_name = 'John' into :user_id
select id from topic where topic_name = 'Football' into :message_id
select body from message
where
message_id = :message_id and
user_id = :user_id
My impression is that frequently any speed advantage from having simple
integer primary keys gets wiped out by the number of joins or queries
you have to make to get useful information out.
In an example with three tables, none of this is unmanageably
complicated, but in a big system with lots of tables it can hurt.
> > It's also the single main problem with the persistence model in Ruby
> > On Rails, which is where we came in. Look at item one on this list, for
> > example:
> >
> > http://discuss.joelonsoftware.com/default.asp?joel.3.309321.3
>
> Item one on that list speaks of the relational model being represented in
> the business layer, it doesn't speak of problems of using auto ints as a
> primary key in the database.
>
It's really the same point: if you have single auto increment fields (or
single keys of any kind) as the primary keys in all your tables, then
the relationships between your tables must be described somewhere other
than in the database itself. Which is a shame, as databases are good at
this.
Graham
> --
> Gary Short
> http://www.garyshort.org
>
>
|
 |
|