The Problem with Databases
November 22, 2008
by Sam McDonald
After working on both web and desktop apps, I must say that there is a gaping hole in web development in the area of databases. Specifically, there is a large disconnect between the data structure (schema) and the business logic, that for the most part, developers are ignoring.
The Problem
When writing a desktop application, data storage is simple and elegant. I simply get a top level view of the data (usually a dictionary), and send it a message to turn into file data. Every object in that dictionary knows how to package itself, and how to unpack itself. With this simple elegant approach, saving and loading is solved on the desktop. Here is an example from an iphone app I recently wrote:
- (NSDictionary*)dictionaryForSaving
{
NSDictionary* dictionary =
[NSDictionary dictionaryWithObject:cardsArray forKey:@"cards"];
return dictionary;
}
Of course the web can’t be quite as simple as the desktop. For one, instead of saving data for one user, I am saving data for everyone, but this isn’t where my gripe is. My problem is simply that the data storage is completely disconnected from the code written to access it. If I were to hand you a PHP application without any information on the schema, it would be a nightmare for you to run it. This means that that open source php apps such as wordpress are forced to write database creation code. Although the code should create the database as well as use it, the facilities to do this elegantly in PHP just don’t exist.
The MVC Approach
In the web a programming world there has been a large jump towards MVC (Model View Controller). CakePHP and KohanaPHP are great examples. Abstracting the model out in this way is a great push in the right direction, however this does little to solve the problem of the disconnect, if I am given this code without knowledge of the schema, I still have a lot of work to do.
A move in the right direction
So what is the right way? After using Django to program this blog, it is definitely a push in the right direction. Take a look at my model for a comment on the blog:
class Comment(models.Model):
post = models.ForeignKey(Post, blank=True)
text = models.TextField()
name = models.CharField(max_length = 50)
email = models.EmailField(blank=True)
url = models.CharField(max_length = 100, blank=True)
def __unicode__(self):
return self.email
This segment of code contains everything I need for comments. I don’t have to write a single query. Instead of writing an SQL query to get all posts for a particular post my code looks like this:
comments = Comment.objects.filter(post__id=post.pk)
Now to get back to the main issue, schema description. Those 8 lines of code aren’t only everything I need to have full control over the model, but as you can see, they also describe my schema for a Comments table. After running a simple python command that model is converted to this database schema:
mysql> desc blog_comment;
+---------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| post_id | int(11) | NO | MUL | NULL | |
| text | longtext | NO | | NULL | |
| name | varchar(50) | NO | | NULL | |
| email | varchar(75) | NO | | NULL | |
| url | varchar(100) | NO | | NULL | |
+---------+--------------+------+-----+---------+----------------+
6 rows in set (0.00 sec)
Where it should be
Although this is a step in the right direction, there is still much more that should be done. First, this type of model needs to be happening in more places specifically php. If anyone knows of anybody that is currently doing this, I would be very interested, but so far I don’t see any popular attempts. Also, Django is definitely not purpose, there is no reason why my database can’t be edited when I edit my model, but the current solution is to recreate the table upon editing.

comment2
comment3