Optimizing for Human Understanding

Long ago, I worked on a commercial game that loaded a lot of data from text files. Eventually some of these grew to over a megabyte. That doesn't sound like a lot now, but they were larger than the available buffer for decoding them, so I looked at reducing the size of the files.

The majority of the data was for placement of 3D objects. The position of each object was a three-element floating point vector delineated with square brackets like this:

[ 659.000000 -148.250000 894.100000 ]

An orientation was a 3x3 matrix, where each row was a vector:

[ [ 1.000000 0.000000 0.000000 ] [ 0.000000 1.000000 0.000000 ] [ 0.000000 0.000000 1.000000 ] ]

Now this format looks clunky here, but imagine a text file filled with hundreds of these. The six-digits after the decimal point was to keep some level of precision, but in practice many values ended up being integers. Drop the decimal point and everything after it, and the orientation matrix becomes:

[ [ 1 0 0 ] [ 0 1 0 ] [ 0 0 1 ] ]

which is a big improvement. In the vector example, there's "-148.250000" which isn't integral, but those last four zeros don't buy anything. It can be reduced to "-148.25".

The orientation still isn't as simple as it could be. It's clearly an identity matrix, yet all nine values are still specified. I ended up using this notation:

[ I ]

I also found that many orientations were simply rotations around the up vector (as you would expect in a game with a mostly flat ground plane), so I could reduce these to a single value representing an angle, then convert it back to a matrix at load time:

[ -4.036 ]

I don't remember the exact numbers, but the savings were substantial, reducing the file size by close to half. At the time the memory mattered, but half a megabyte is trivial to find on any modern system. This also didn't result in simpler code, because the save functions were now doing more than just fprintf -ing values.

What ended up being the true win, and the reason I'd do this again, is because it makes the data easier to visually interpret. Identity matrices are easy to pick out, instead of missing that one of the other values is "0.010000" instead of "0.000000". Common rotations are clearly such, instead of having to mentally decode a matrix. And there's less noise in "0.25" than "0.250000" (and come to think of it, I could have simplified it to ".25"). It's optimized for humans.

(If you liked this, you might enjoy Optimization in the Twenty-First Century.)

permalink September 4, 2016

previously