Implementing data structures in a way that uses memory efficiently should always be on your mind. I do not mean going overboard and micro-optimizing memory allocation right down to the bit; I mean organizing data structures in memory so that you can avoid pointers, use contiguous memory segments, and so on. Normally, minimizing storage by avoiding extra pointers when possible will benefit your program in at least two ways.

First, the reduced memory requirement will make your data structure fit in cache more easily. Remember that while pointers are 4 bytes long in 32-bit environments, they are 8 bytes long in 64-bit environments. This yields better run-time performance because you maximize your chances of having the data you need in cache.

Second, contiguous memory layouts also allow for efficient scans of data structures. For example, if you have a classical binary tree, implemented using nodes that each hold two pointers, you will have to use a tree traversal algorithm, possibly recursive, to enumerate the tree’s content, which is quite cumbersome if you don’t really care about the order in which the nodes are visited.

It turns out that for a special class of trees, complete trees, there is a contiguous, and quite simple, layout.

Consider the following figure:

[Figure: a complete binary tree with its nodes numbered in level order, root first.]

You notice that each node is numbered, and that you have the following relations holding between the nodes:

For a node n, you have

- that its parent is given by ⌊(n−1)/2⌋,
- that its left child is given by 2n+1,
- that its right child is given by 2n+2,

assuming that the root is numbered 0. This suggests the layout:

[Figure: the same nodes laid out contiguously in an array.]

where the arrows correspond to parent/child links.

The proof that this mapping works correctly is somewhat self-evident. What is less evident is that such mappings exist for trees with higher branching factors. In general, if you have a tree with a branching factor of b, you will have, for a node n, that

- the parent is given by ⌊(n−1)/b⌋,
- the children are given by bn+1, bn+2, …, bn+b.

So, plugging b = 2 into the previous equations leads us to the binary tree case. The cases b > 2 correspond to higher-order trees. Letting b = 4 gives us a layout for quad-trees, which are extensively used in computer graphics. b = 8 yields octrees, which are also used in computer graphics applications such as collision detection, ray tracing, and volume rendering. Note that if you put b = 1, you get a list, where the parent is given correctly by n−1 and the (only) child by n+1.

The addresses of the first node in a row, for:

- b = 2, are given by 0, 1, 3, 7, 15, … or 2^r − 1, which is Sloane’s A000225,
- b = 3, are given by 0, 1, 4, 13, 40, 121, … or (3^r − 1)/2, which is Sloane’s A003462,
- b = 4, are given by 0, 1, 5, 21, 85, 341, … or (4^r − 1)/3, which is Sloane’s A002450,

where r is the row number, the root being in row 0. The general formula is (b^r − 1)/(b − 1).

* * *

The cases with b = 2, b = 4, and b = 8 can be implemented very efficiently depending on the type of processor you are using. Of course, multiplications by powers of two can be replaced by left shifts. On Intel processors, the following two functions:

inline int left_child(int x)  { return 2*x+1; }
inline int right_child(int x) { return 2*x+2; }

can each be replaced by a single lea instruction. Assuming the parameter value is passed through eax, left_child should compile to lea eax, [2*eax+1]. Since the normal calling convention for Intel is to return integer types through eax, we’re done.

* * *

If you think this addressing scheme is way cool, you’ll find it even more interesting that the scheme dates from way back [1]. Although Williams considers it only in terms of a binary heap (for heapsort) and expresses the addressing relations only in terms of the address of the parent, one can understand from that very terse paper (a page and a few lines) that the addressing can be generalized. Williams’ paper is the earliest reference I could find on the addressing scheme. However, I am not sure that it is the original paper; surely the 1964 C.ACM report is based on earlier work?
