I've used MySQL for a while now, and there were lots of surprising things I needed to cater for. This is from a Django and MySQL 5.5 perspective. Later on you'll see the horrible things I did to work around them. It was a terrible experience, not least because I've also used PostgreSQL ...

Feel free to add your own experiences in the comments section.

If this is the same with the migration system in 1.7 (it seems it is, but you only get a plain traceback), then it's ideal to keep migrations small in scope if you have custom SQL that can easily fail, and squash them later, after you have successfully run them.

DDL statements in MySQL are not only slow and prone to locking tables (that means downtime), they also ignore transactions. The almighty InnoDB can't save your sanity when an ALTER is used in a transaction. Thus, migrations in MySQL must be approached with great care (test them well), and to avoid downtime you need specific external tools.

# Now we have a mild degree of confidence :-)

Unfortunately the defaults are hard to change without breaking existing apps, so you have to work this out in your connection settings. Now the settings are:
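The settings block is missing from this copy. A sketch of the idea, with hypothetical database names: since changing the server-wide default isolation level could break other apps, you can force READ COMMITTED per connection from Django's `OPTIONS`:

```python
# Hypothetical Django settings sketch (names are illustrative, not from the
# original post): init_command runs on every new connection, so each Django
# connection gets READ COMMITTED without touching the server defaults.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "myapp",
        "OPTIONS": {
            "init_command": (
                "SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED"
            ),
        },
    }
}
```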

Unfortunately, getting the collation and encodings right won't make get_or_create work flawlessly. The default transaction isolation level for MySQL is REPEATABLE READ, which means reads are consistent within the same transaction - they will return the same result even if the data has changed outside the transaction. In other words, REPEATABLE READ will break get_or_create in a transaction - it's possible that code in a transaction won't "see" an object created outside that transaction (like in another process).

Note that this will take a long time to run (MySQL seems to copy all the tables for some reason) and it will change the encoding and collation for all the columns.

# Needs to be filled, to figure it out run:
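The query the comment refers to is missing from this copy. Something along these lines against information_schema (schema name is illustrative) would list the tables and their current collations, which is what you'd need to fill in the conversion statements:

```sql
-- Hypothetical reconstruction: list each table's collation so you know
-- which ones still need converting ('myapp' is a placeholder schema name).
SELECT table_name, table_collation
FROM information_schema.tables
WHERE table_schema = 'myapp';
```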

"This migration probably took 2 hours, you don't really want to rollback ..."

# locks the tables, since it copies all the data.

With the new migrations in Django 1.7:
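The snippet itself is gone from this copy; a sketch of what such a migration could look like in 1.7, assuming a hypothetical app named myapp (RunSQL with no reverse SQL makes the migration explicitly irreversible):

```python
# -*- coding: utf-8 -*-
# Hypothetical sketch, not the original code: app, table, and charset names
# are illustrative.
from django.db import migrations


class Migration(migrations.Migration):

    dependencies = [
        ("myapp", "0001_initial"),
    ]

    operations = [
        # Locks the table while MySQL copies all the data; no reverse SQL is
        # given on purpose, so a rollback fails loudly instead of running for
        # another couple of hours.
        migrations.RunSQL(
            "ALTER TABLE myapp_item "
            "CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin;"
        ),
    ]
```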

"This migration probably took 2 hours, you don't really want to rollback ..."

# locks the tables, since it copies all the data.

To fix the existing data, a south migration would look like:
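The migration code is missing from this copy; a sketch of the shape it would have, with illustrative table and charset names (db.execute is South's raw-SQL escape hatch):

```python
# Hypothetical South migration sketch, not the original code.
from south.db import db
from south.v2 import SchemaMigration


class Migration(SchemaMigration):

    def forwards(self, orm):
        # Locks the table, since MySQL copies all the data.
        db.execute(
            "ALTER TABLE myapp_item "
            "CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin;"
        )

    def backwards(self, orm):
        # "This migration probably took 2 hours, you don't really
        # want to rollback ..."
        raise RuntimeError("Refusing to roll back a charset conversion.")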

Unfortunately I didn't know this from the start. So I had to fix the existing data and installation scripts. When you create a database, make sure that you specify the encoding and collation:
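The statement itself is missing from this copy; it would be along these lines (the database name is illustrative, and the charset/collation pair is whatever you settled on after the collation discussion):

```sql
-- Hypothetical reconstruction: create the database with an explicit
-- encoding and collation instead of inheriting the server defaults.
CREATE DATABASE myapp
    CHARACTER SET utf8
    COLLATE utf8_bin;
```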

# Note that later we find out that this is still not enough. Read on.

The solution is to use the utf8_bin collation and specify a different one only when you need special accent handling and case folding. The settings now look like this:
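The settings block is missing from this copy. A sketch under assumptions (names are illustrative): Django's MySQL backend takes the connection charset in `OPTIONS`, and one way to force the binary collation per connection is `SET NAMES` in `init_command`:

```python
# Hypothetical settings sketch: utf8 connection charset plus utf8_bin
# collation enforced on every new connection via SET NAMES.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "myapp",
        "OPTIONS": {
            "charset": "utf8",
            "init_command": "SET NAMES utf8 COLLATE utf8_bin",
        },
    }
}
```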

If you have a case-insensitive collation, all sorts of queries that aren't a text search will behave strangely. You'll notice that get_or_create doesn't work as expected when you have accents or a different case.

All the default collations are case insensitive. This means your queries and indexes are also going to be case insensitive. I'd say they are just insensitive. To your pain.

But that's not that bad - you should get some errors if you fail to set the correct encoding. Collations, however, are more trippy. A collation is a set of rules for comparing characters in a character set.

Yes, the index on utf8 columns is limited to 255 characters: InnoDB caps index keys at 767 bytes, and utf8 needs up to 3 bytes per character (767 / 3 ≈ 255). If you need to store emoticons like an angry face 😠 - and most probably you'd want to, given the predicament - then you're out of luck: you need a different charset, utf8mb4. But at 4 bytes per character that means an even smaller index, only 191 characters (767 / 4 ≈ 191).

So far I've used utf8 as the connection charset, because you never know what the default one is (probably latin1). However, note that the charset affects the maximum index size for VARCHAR columns. Ever wonder why you often see this:
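The snippet is missing from this copy; the pattern in question is the ubiquitous 255, e.g. in a Django model (field name illustrative):

```python
# 255 is the largest max_length that still fits a single utf8 InnoDB index
# key: 255 characters x 3 bytes = 765 bytes, just under the 767-byte limit.
name = models.CharField(max_length=255, db_index=True)
```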

And yes, you can set this in the MySQL settings, but you don't really want that. Your app will misbehave and corrupt data if you happen to forget to change /etc/mysql/my.cnf before deployment - and by the time you notice something is not quite right, your data is already corrupted.

There's an open Django ticket to make this the default.

# Note that later we find out that this is not enough. Read on.

And no, ORMs won't save you from this pain by default. For Django you need to have something like this in the settings:
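The settings block is missing from this copy. A sketch of the idea (database name illustrative): switch the connection into strict mode so MySQL raises errors instead of silently mangling data:

```python
# Hypothetical settings sketch: STRICT_TRANS_TABLES makes MySQL reject
# out-of-range and truncated values with an error instead of "fixing" them.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "myapp",
        "OPTIONS": {
            "init_command": "SET sql_mode='STRICT_TRANS_TABLES'",
        },
    }
}
```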

With the default settings, MySQL truncates and does other unspeakable things to your data for the sake of not raising errors. Where this is ever the correct choice is hard to say.

If you used any other database then it's going to be a very perplexing experience.

MySQL supports a large part of the ANSI SQL 99 standard; however, the default settings are nowhere close to that.

The inflexibility *

There's something wrong with the query optimizer: it tends to use temporary tables for most queries that have a handful of joins and a GROUP BY. This is quite common with Django, but the why and how is too large a topic to tackle here. Maybe later ...

For now, a story of grief.

I have a model like this:

class Item(models.Model):
    class Meta:
        unique_together = "type", "name", "parent"

    type = models.CharField(max_length=20, db_index=True)
    name = models.CharField(max_length=200, db_index=True)
    parent = models.ForeignKey("self",
                               null=True,  # Notice anything peculiar?
                               blank=True)

For reasons unknown I went ahead with this model. Why it was designed like that, and why it remained like that, need not be questioned. But what is wrong with it?

You see, creating a unique index over nullable columns is a bad idea: NULL values are never considered equal to each other, so the uniqueness check simply doesn't apply to rows where parent is NULL. The database will disallow inserting "foobar" twice but will allow inserting a boundless number of NULL values. For example, this would be allowed:

>>> Item.objects.create(name="stuff", type="foobar").pk
1L
>>> Item.objects.create(name="stuff", type="foobar").pk
2L

Given unfavorable conditions (parallelism), get_or_create would create duplicate objects - because the database lets it.

This is normal in SQL, and ideally you'd redesign the model - MySQL is not to be blamed, right? But, alas, no. You see, I was using an ORM and I really liked the convenience of the ForeignKey field. Pity me for my weakness, for I was like a thirsty fool in the desert.

The options I didn't have:

Keep the constraint but replace the parent in the index with a computed column. But MySQL doesn't have computed columns!

Use a conditional index. Nope, MySQL doesn't allow conditional indexes.

Create a view. Make the index on the view. No, MySQL doesn't allow creating indexes on views.

To add insult to injury, PostgreSQL had all of these.

Creating some workaround on the client side was not an option. The application was highly parallel - this had to be handled in the database. But I fought on and found a solution: TRIGGERS.

In my mad quest to solve this quickly I brought this into existence:

CREATE TRIGGER check_unique_on_item BEFORE INSERT ON myapp_item
FOR EACH ROW
BEGIN
    IF NEW.parent_id IS NULL THEN
        IF (SELECT COUNT(*) FROM myapp_item item
            WHERE item.parent_id IS NULL
              AND item.name = NEW.name
              AND item.type = NEW.type) > 0 THEN
            SET NEW = 'Error: Cannot insert this item. There is already an existing entry.';
        END IF;
    END IF;
END;

Because I was doing something invalid just to stop the insert, get_or_create had to deal with a new type of error: DatabaseError. Thus the Item model needed a custom manager:

from django.db.utils import DatabaseError


class ItemManager(models.Manager):
    def get_or_create(self, **lookups):
        try:
            return super(ItemManager, self).get_or_create(**lookups)
        except DatabaseError as e:
            try:
                return self.get(**lookups), False
            except self.model.DoesNotExist:
                raise e

Now, if you look closely at the trigger, you'll notice that it's not really a real constraint, and thus not atomic. I realized during load tests that it doesn't really work ...

So I went back to the drawing board and came up with a new idea: create a shadow table that has all the constraints, and triggers to update that table before the real one gets changed.

So I removed the unique_together from the Item model and created this:

class ItemUniqueFixup(models.Model):
    class Meta:
        unique_together = "type", "name", "parent"

    type = models.CharField(max_length=20)
    name = models.CharField(max_length=200)
    parent = models.PositiveIntegerField()

And created the triggers and filled the shadow table:

DROP TRIGGER IF EXISTS check_unique_on_item;

TRUNCATE TABLE myapp_itemuniquefixup;

INSERT INTO myapp_itemuniquefixup (id, type, name, parent)
SELECT old.id AS id,
       old.type AS type,
       old.name AS name,
       IF(old.parent_id IS NULL, 0, old.parent_id) AS parent
FROM myapp_item old;

CREATE TRIGGER check_unique_on_item_insert BEFORE INSERT ON myapp_item
FOR EACH ROW
BEGIN
    INSERT INTO myapp_itemuniquefixup (id, type, name, parent)
    VALUES (NEW.id, NEW.type, NEW.name,
            IF(NEW.parent_id IS NULL, 0, NEW.parent_id));
END;

CREATE TRIGGER check_unique_on_item_update BEFORE UPDATE ON myapp_item
FOR EACH ROW
BEGIN
    UPDATE myapp_itemuniquefixup fix
    SET fix.type = NEW.type,
        fix.name = NEW.name,
        fix.parent = IF(NEW.parent_id IS NULL, 0, NEW.parent_id)
    WHERE NEW.id = fix.id;
END;

CREATE TRIGGER check_unique_on_item_delete BEFORE DELETE ON myapp_item
FOR EACH ROW
BEGIN
    DELETE FROM myapp_itemuniquefixup WHERE id = OLD.id;
END;

Now you see, this was very tricky to get right. With conditional indexes in PostgreSQL this would have been as easy as:
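The closing snippet is missing from this copy; a PostgreSQL partial unique index along these lines is what makes it easy (the index name is illustrative):

```sql
-- Uniqueness for (type, name) only among rows with no parent; rows with a
-- parent are already covered by the regular unique index on all three columns.
CREATE UNIQUE INDEX myapp_item_unique_without_parent
    ON myapp_item (type, name)
    WHERE parent_id IS NULL;
```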