Finally: Data Structure Constants in Erlang

Here's a simple Erlang function to enclose some text in a styled paragraph, returning a deep list:

para(Text) -> ["<p class=\"normal\">", Text, "</p>"].

Prior to the new R12B release of Erlang, this little function had some less than ideal behavior.

Every time para was called, the two constant lists (a.k.a. strings) were created, which is to say that there weren't true data structure constants in Erlang.

Each string was built-up, letter by letter, via a series of BEAM virtual machine instructions. The larger the constant, the more code there was to generate it, and the more time it took to generate.

Because a new version of each string was created for each para call, there was absolutely no sharing of data. If para was called 200 times, 400 strings were created (200 of each). Remember, too, that each element of a list/string in 32-bit Erlang is 8 bytes. Doing the math on this example is mildly unsettling: 22 characters * 8 bytes per character * 200 instances = 35,200 bytes of "constant" string data.

As a result of more data being created, garbage collection occurred more frequently and took longer (because there was more live data).

Years ago, this problem was solved in HIPE, the standard native code compiler for Erlang, by putting constant data structures into a pool. Rather than using BEAM instructions to build constants, each constant list or tuple or more complex structure is simply a pointer into the pool. For some applications, such as turning a tree into HTML, the HIPE approach is a significant win.

And now, finally, as of the December 2007 R12B release, true data structure constants are supported in BEAM.

permalink December 9, 2007

previously