Don't Structure Data All The Way Down

Let's write some functions to operate on circles, where a circle is a defined by a 2D center point and a radius. In Erlang we've got some options for how to represent a circle:

{X, Y, R} % raw tuple {circle, X, Y, R} % tagged tuple #circle{x = X, y = Y, r = R} % icky record

Hmmm...why is a circle represented as a structure, but a point is unwrapped, so to speak, into two values? Attempt #2:

{{X,Y}, R} % raw tuples {circle, {point,X,Y}, R} % tagged tuples ... % gonna stop with records

Now let's write a function to compute the area of a circle, using this new representation:

area({circle, {point,_X,_Y}, R}) -> math:pi() * R * R.

Simple enough. But take a few steps back and look this. First, we're not actually making use of the structure of the data in area . We're just destructuring it to get the radius. And to do that destructuring, there's a bunch of code generated for this function: to verify the parameter is a tuple of size 3, to verify that the first element is the atom circle , to verify the second element is a tuple of size 3 with the atom point as the first element, and to extract the radius. Then there's a trivial bit of math and we've got an answer.

Now suppose we want to find the area of a circle of radius 17.4. We've got a nice function all set to go...sort of. We need the radius to be part of a circle, so we could try this:

area({circle, {point,0,0}, 17.4})

Kind of messy. What about a function to build a circle for us? Then we could do this:

area(make_circle(0, 0, 17.4))

We could also have a shorter version of make_circle that only takes a radius, defaulting the center point to 0,0. Okay, stop, we're engineering ourselves to death. All we need is a simple function to compute the area of a circle:

area(R) -> math:pi() * R * R.

Resist the urge to wrap it into an abstract data type or an object. Keep it raw and unstructured and simple. If you want structure, add it one layer up, don't make it part of the foundation. In fact, I'd go so far as to say that if you pass a record-like data structure to a function and any of the elements in that structure aren't being used, then you should be operating on a simpler set of values and not a data structure. Keep the data flow obvious.

permalink January 20, 2008

previously