About Henrique Netzka

Designing a database can be a very simple task, but can also become the biggest headache. Some mistakes you make can really compromise your project, and I can’t really imagine one person in this world who actually likes to migrate data between databases.

So here’s me, gladly writing again to DATAVERSITY, and this time trying to help you avoid some of the common mistakes I have seen.

#5: The middleware dependency

One of the most common mistakes I see is people mixing up the app layers, and storing information in a database that is useless without the middleware – or worse, the front-end. I’ve always been frustrated by people or things that can tell you only half of the story – and this includes your database.

Imagine a scenario where your history table says a person became “status 1” last week, and just yesterday the same person became “status 12”. “What the heck is ‘status 12’?” – one will wonder. And they’ll have no clue. This is just a bad definition.

Whenever you define numbered status columns, it’s a good thing to design a support table, and helpful for your data analysts. First, design one of those simple tables that your programmer will hardly access if he’s using a database middleware layer like Hibernate or LINQ. Still, your database should be aware of what “status 12” means.

This is a poor design.

Much better now, with a support table being able to translate the statuses.

#4: Create your own storage protocol – and dump it into a BLOB field

Programmers seem to love BLOB fields. They just create their own protocol in their middleware and dump that XML, JSON whatsoever inside it. Their apps run as smooth as a milkshake, and all that data becomes completely useless in every context other than the apps.

Ann Winblad once said: Data is the new oil. I do agree with her, and I think every database designer should think about it too. Once you put your data in a BLOB field with a custom protocol, you’re giving up all the database capability to process that data – filtering, pivoting, indexing and so on. So, avoid that. Unless you have some very specific case and you’ll never want that data indexed, you’ll certainly be better off creating all the columns you need and storing it accordingly.

#3: Don’t think about Globalization and Localization

Okay, it might be a fact that some software will hardly be globalized or need to be localized. But it’s also a fact that the world lost most of its frontiers, and… Well, you never know, right? You might wake up someday and a person from the other side of the world will be navigating your website, using Google Chrome’s live translation, and this person decides to use your software. What then?

So, even if you won’t translate your app in the beginning, it’s almost costless to have a “lang[uage] char(5)” column in your support tables, as well as storing time zones accordingly. For the latter, you can simply add a TimeZone column to your table, and a TimeZone column to your Users table. This should be enough to guarantee you’ll be structured once your app gains the world.

#2: Triggers for everyone!

Someday you’ll certainly run into someone who’ll ask you: “You’re not a trigger guy, are you?” Triggers have been loved and hated for several years, but you don’t need to do either: you can just use them. The biggest problem regarding triggers is that they can trick you seamlessly. You fire an update, which ends up inserting in another table, which updates another one, which deletes one record, which sends an e-mail, and that’s the moment you realize that humble update changed 30 tables in your database. Now, what if one of those was mistaken? You’ll spend a lifetime trying to figure the bug out.

I like to use triggers to do simple updates, mostly the log-types. For instance, to keep track of a person’s status change, or to log high-level database operations, triggers should do the job. I must say this is a personal opinion, but I’m not sure triggers are the right place to put your business rules in.

#1: The misuse of the data types

Every programmer knows integers, strings, floats and… Well, let’s say everybody knows up to there. However, databases usually offer several variants of those types, regarding globalization, floating point precision, storage space and so on. So, topping the list of the mistakes you should avoid, is the misuse of the data types.

Before designing any database, you should ask yourself what are the options. For instance, your database can offer a Boolean type, or you’ll need to use a small integer to store it. Date and time columns, floating point and decimal numbers… some databases will even support arrays and spatial data! So, do whatever you want to do, but don’t miss the chance to define the right data type to each column. It can save a lot of time and money later!

Well, that’s it for today’s blog. I hope you’ve enjoyed reading and see you next time!