Database Design

I’m finally getting back to my PHP, MySQL and Apache book and started to plan out my first-ever database. The idea is to create a database-driven wish list to replace my current PHP-include experiment. While this is probably child’s play for the database programmers out there, it has required a bit of work on my part just to get a first-pass design.

To begin, I jotted down all of the fields I think I’ll need:

  • Primary Key
  • Person
  • Item
  • URI for item (if applicable)
  • Category

I then began the process of normalization. Since I created the original list based on person, that’s how I started. In other words, I want items 1, 2 and 3; Sarah wants items a, b and c.

It was easy to see that the item category was repeated across the lists, so I pulled that out in an attempt to reduce redundancy:
Attempt at First Normal Form

There is, however, still redundancy in the Wish_List table: Name. Obviously Sarah and myself wish for more than one item, so names are a repeated field. Plus, it needs to scale (i.e. to add Olivia). In trying to visualize how that relationship worked, I mapped out new relationships:
Attempt at second normal form

But something wasn’t right.

Finally it dawned on me — I kept approaching the structure from the person point-of-view. Essentially, I was using the person as a primary key.

So I starting over with the item as the primary key, instead of creating the arbitrary primary key. Basically, I changed my view from “Tim wants a book entitled How to Program” to “How to Program is a book Tim wants.” This really helped the relationships stand out. In terms of writing, I switched from active voice to passive voice. The result:
Nearing Third Normal Form

Much closer to being Third Normal Form, but I didn’t think that it was technically correct. I believe that Wish_List actually needs to be a mapping table between Category and the items. One last modification resulted in:
Third Normal Form

I think I now understand how to create an efficient database. We’ll see when I actually start building. If any expert database types out there want to weigh in on how I’m doing, I’d love to hear from you. Feel free to point out where I’ve gone wrong, as well as what I’ve got correct.

Up next – actually building the tables and programming.

Filed under:

The URI to TrackBack this entry is: http://www.tjameswhite.com/blog/archives/2005/06/database-design/trackback/

3 Comments »

  1. Hm. It would help if you explained what each table is supposed to do. What is ‘Category’ supposed to keep? Things like ‘Home’, ‘Personal’, ‘Fun’, ‘Baby Room’?

    It almost looks as though the Item should have the CategoryID, rather than the wishlisted item. And it looks like the wishlisted item should have a person, not the item.

    So, I’d suggest:

    Tables:
    tblItems {ItemID(Key), ItemName(text), URL(text, link), Category(Foriegn Key)}

    tblCategories {CategoryID(Key), CategoryName(text)}

    tblPerson {PersonID(Key), PersonName(Text)}

    tblWishlist {WishlistID(Key), PersonID(Foriegn Key), ItemID(Foriegn Key)}

    This set up works if you’re planning on having one category per item and one or more people who want one item. So, for example:

    Tim - Vasoline
    Sarah - Vasoline
    Olivia - Vasoline
    Tim - How to Program Book
    Sarah - The Complete Works of Shakespeare

    But the type is always the same:
    Vasoline - Baby Goods
    Shakespeare - Fiction Books
    How to Program - Non-Fiction

    The goal is to remove duplicated information with keys, since storing a number is less space than text, but tables also need to contain information about one complete object (a la Object Oriented). Each table handles everything about the one object.

    Does this makes sense? It’s been a rough day ;)

    Comment by Patrick — Thursday, 6/16/2005 @ 3:51 pm

  2. PS
    Every table should have a Key, even if it’s junk. Sometimes it’s the only way to uniquely identify a record. I’d also suggest putting some kind of Boolean flag in there to mark when you’ve received the item. It’s Bad, Bad, Bad database to delete things, since once it’s gone, it’s gone and you can never recover historical information on it. So, in the schema above, I’d put: Inactive(Bool) on the Wishlist table.

    Let’s talk database sometime ;)

    Comment by Patrick — Thursday, 6/16/2005 @ 3:53 pm

  3. Patrick,

    The tblWishlist you have mapped makes sense, and what actually my original thought — I just didn’t know if it worked that way. : )

    As for the rest of it, I think it makes sense. I’m going to have to print it out and compare it to my sketches. I’ll put up a follow-up post with more details about what everything should be doing.

    And excellent thought about the ‘got it’ flag. I had thought that far ahead.

    Comment by Tim — Thursday, 6/16/2005 @ 4:12 pm

RSS feed for comments on this post.

Leave a comment

Comments are moderated, if you've commented before, it will show up automatically. If not, it will be submitted for approval. Please leave a name and e-mail address. They are for my verification only and do not appear online in any way shape or form. Without a name and/or and email address I don't know who you are your comment will not be approved.

Line and paragraph breaks automatic, HTML allowed: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>

(required)

(required)