Software Estimation in the Fractal Dimension

How long is a piece of string?

Preface

Software estimation is a contentious topic. This piece does not intend to enter the fray with a diagnosis of estimation’s maladies or a prescription for their cure. Its purpose is to provide another perspective on an aspect of commercial software development that is, at least for some, a source of dissatisfaction…

The year is 1950. Lewis Fry Richardson, an English scientist and Quaker, sits at his office desk, deep in concentration. Hunched over a large map of Western Europe, illuminated by the softly humming glow of a single table lamp, he gently holds a set of metal compasses between his right thumb and forefinger. He deftly walks the compasses across the map, tracing the jagged border between Spain and Portugal. Upon reaching the border’s north-western Atlantic coast, he scribbles upon a curled leaf of note paper, adjusts the compasses and walks back south. Walk, scribble, adjust, repeat.

Richardson was an ardent pacifist, fascinated by the causes of international conflict. He hypothesised (somewhat speculatively) that the probability of armed conflict between neighbouring countries is proportional to the length of their adjoining border. Interesting though this might have been, he noticed something far more intriguing in the process: published measurements of international borders varied wildly between sources. Some sources quoted the Spanish/Portuguese border length as less than 1,000 km, while others quoted over 1,200 km.

This sparked the startling realisation that the measured length of a border varies as a function of the length of the ruler employed in its measurement. Specifically, the longer the ruler, the shorter the measured length.

What’s going on here? To get a feel for what this means, let’s take a brief walk down a lane called reductio ad absurdum.

Imagine for a minute that you’re an enormous giant. Your left foot is in the Atlantic, your right is in the Mediterranean Sea and you’re gripping a 1,000 km long ruler in your gargantuan fist. You stoop down over the Iberian peninsula and smash your monstrous ruler smack bang onto the border between Spain and Portugal. The loss of life is tremendous, but you’ve got what you came for: a clear measurement of the length of the border. 1,000 km, give or take…

Now, imagine you’re a tiny little ant. You’re eagerly scrabbling around in the dust somewhere at the southern tip of Spain, and you’re clasping a centimetre-long ruler between your creepy little mandibles. You start measuring, scuttling northwards over every divot, crevice and crenellation until you’ve traced every millimetre of the border between Spain and Portugal. You collapse in exhaustion, but you’ve got what you came for: a clear measure of the length of the border. 27,000 km, give or take…

It’s fairly intuitive to understand what’s going on here. Country borders tend not to be perfectly straight lines. They’re jagged. If you place a long, perfectly straight ruler against a jagged line, you measure its length as a crow might fly. You miss the jaggedness, the hills, the rivers, the peninsulas, the escarpments.

The detail.

The ant, on the other hand, paces every minute peak and perturbation along the way, measuring roughly 27 times the distance in the process (if you’re wondering where the number 27 comes from, read on). Equally, if the journey were to be taken by a creature 100 times smaller than an ant, it would walk a longer distance still.

An intriguing implication of this is that if you wanted to know how far you would have to walk to trace the Spanish/Portuguese border, you’d have to obtain a measurement made using a ruler the length of your average stride. To put this another way, if you want an accurate measurement, the best way would be to actually walk it yourself.

It took a certain Benoît Mandelbrot to formalise Richardson’s discovery nearly 20 years later. Mandelbrot coined the term “fractal” to describe shapes (such as country borders) that exhibit self similarity at different levels of magnification. There’s plenty of maths and science to this, but let’s not bother with that quite yet. Instead, let’s teleport forwards about 70 years, to an office near you.

Oh boy…

The year is 2017, it’s 9pm on a mid-November evening and James Foster, a software developer, is working late. A half-finished mug of coffee is slowly growing a skin as he hunches over his laptop, absorbed in thought. The soft glow of the screen illuminates deep furrows in his brow as he skims the requirements for an upcoming software project for the 10th time since his boss left for the evening. Her last words before heading for the lift were, “We need that estimate first thing in the morning, James. I know you won’t let me down.”

The project dances indistinctly in his mind’s eye, rotating in 3D, morphing between focus and fuzz, detail and diffusion. Its size is ambiguous, the details are foggy but what the heck…from this distance, with the information to hand, it looks to James like a 6 month-er. The creeping feeling that 6 months will come and go with little to show lingers in his stomach. He shuts his laptop and heads to the lift in darkness.

That night, James dreams of marauding giants in southern Europe…

If you’re a software developer, you’ll no doubt have walked in James Foster’s shoes at one time or another. You may even be there now, bearing the burden of having to estimate the delivery date for something, the details of which will never be less clear than they are right now.

Emotions conflict heavily in this situation. Your current self will urge you to exude competency by estimating low, while your future self beseeches caution, admonishing you to estimate high. You’ll fear the estimate turned promise, the scope creep, the emergent complexity, the known unknowns, the unknown unknowns, the unknowable unknowns and you’ll reluctantly, squeamishly, hesitantly mutter… “um, about 6 months… give or take.”

And with these words, you’ll know for sure that either your current or your future self is damned. Probably both.

You see, software development is special (some might say). It’s not like building a house or a road or a car. It’s not merely complicated, it’s complex, meaning it’s difficult to predict. Estimating software delivery is not like costing up a new kitchen, or a washing machine repair, or something as mundane as measuring the length of the border between two countries.

Or is it?

Software isn’t special

Richardson realised that a physical measurement is meaningless unless paired with knowledge of the size of the tool used to measure it. Our marauding giant and our plucky ant disagreed for this very reason. If an objective measure of physical properties is so prone to error, what hope do we have for the estimation of something as invisible, intangible and subjective as software? Let’s have a look at the ‘Richardson Effect’ in a bit more detail to see if it helps.

Benoît Mandelbrot formalised Richardson’s discovery in his 1967 paper “How Long Is the Coast of Britain? Statistical Self-Similarity and Fractional Dimension”. He observed that coastlines and borders are examples of self similar fractals. This means that the jaggedness of a border remains similar as you magnify it, i.e. from a distance you see a jagged shape and as you zoom in, more jaggedness reveals itself, ad infinitum. Many natural and man-made phenomena exhibit this property, including trees, lightning bolts, electrocardiograms and the movements of the stock market.

Software complexity anyone?

Mandelbrot defined a new dimension, the ‘fractal dimension’, in addition to the standard up/down, left/right, forwards/backwards ones. The fractal dimension represents the jaggedness, or crinkliness, of a line. This dimension can be used to calculate how the measured length of a line changes with the scale of the measurer. For example, the fractal dimension of the Spanish/Portuguese border is 1.18. This means that if you halve the length of your ruler, you need 2¹∙¹⁸ ≈ 2.27 placements of the shorter ruler for every single placement of the longer one. A stretch that the longer ruler covered in one placement therefore now measures (ruler_length / 2) ✕ 2¹∙¹⁸.

Therefore, if a 1,000 km ruler measures a border length of 1,000 km, a ruler half its length will measure 500 ✕ 2¹∙¹⁸ ≈ 1,133 km.

So what length will a 1 cm ruler measure? That’s easy. 1,000 km = 100,000,000 cm, so the 1 cm ruler has to be placed 100,000,000¹∙¹⁸ ≈ 2,754,228,703 times, giving 2,754,228,703 cm, or about 27,542 km. That’s 27 times the length measured by the 1,000 km ruler.

Poor ant.
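The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration rather than anything taken from Richardson or Mandelbrot: the function name `measured_length` and its parameters are my own, and it assumes the simple power-law scaling used in this essay, where the measured length grows as (reference ruler / ruler) raised to the power (fractal dimension − 1).

```python
def measured_length(ruler: float, reference_ruler: float,
                    reference_length: float, dimension: float) -> float:
    """Length measured with `ruler`, scaled from a known reference measurement.

    The number of ruler placements grows as (reference_ruler / ruler) ** dimension,
    so the measured length grows as (reference_ruler / ruler) ** (dimension - 1).
    All lengths must be in the same unit.
    """
    if ruler <= 0 or reference_ruler <= 0:
        raise ValueError("ruler lengths must be positive")
    return reference_length * (reference_ruler / ruler) ** (dimension - 1)


D_BORDER = 1.18            # fractal dimension quoted in the text
GIANT_RULER_KM = 1_000.0   # the giant's ruler, which spans the border once
GIANT_LENGTH_KM = 1_000.0  # the length the giant measured
CM_PER_KM = 100_000

half_ruler_km = measured_length(500.0, GIANT_RULER_KM, GIANT_LENGTH_KM, D_BORDER)
ant_km = measured_length(1 / CM_PER_KM, GIANT_RULER_KM, GIANT_LENGTH_KM, D_BORDER)

print(f"500 km ruler: {half_ruler_km:,.0f} km")  # roughly 1,133 km
print(f"1 cm ruler:   {ant_km:,.0f} km")         # roughly 27,542 km
```

Writing the scaling relative to a reference measurement means the same function works for any pair of rulers, not just for a ruler that happens to span the whole border in one go.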

A perfectly smooth line has a fractal dimension of 1, meaning its measurement is the same regardless of ruler length. It remains smooth at all levels of magnification. An infinitely jagged line, on the other hand, has a fractal dimension of 2. Such a line scales in exactly the same way as a surface, i.e. it’s effectively 2 dimensional. Each time the ruler length halves, the number of ruler placements quadruples and the measured length of such a line doubles!

All natural lines have a fractal dimension of somewhere between 1 and 2. The west coast of Great Britain has a fractal dimension of 1.25, fiord-festooned Norway has a crenellated 1.51, and smooth-as-silk South Africa has a remarkably un-crinkly fractal dimension of 1.02. (values sourced from Scale: The Universal Laws of Life and Death in Organisms, Cities and Companies by Geoffrey West).
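To get a feel for what these numbers mean, here is a small Python sketch comparing how quickly the measured length grows for each of the dimensions quoted above, alongside the two theoretical extremes. The helper `growth_factor` is a hypothetical name of my own, and it assumes the same power-law scaling as the border example: shrinking the ruler by some factor multiplies the measured length by that factor raised to the power (dimension − 1).

```python
# Fractal dimensions quoted in the text, plus the two theoretical extremes.
DIMENSIONS = {
    "perfectly smooth line": 1.00,
    "South Africa": 1.02,
    "west coast of Great Britain": 1.25,
    "Norway": 1.51,
    "infinitely jagged line": 2.00,
}


def growth_factor(dimension: float, ruler_shrink: float = 2.0) -> float:
    """Factor by which measured length grows when the ruler shrinks by `ruler_shrink`."""
    return ruler_shrink ** (dimension - 1)


for name, d in DIMENSIONS.items():
    print(f"{name:<28} D={d:.2f}  "
          f"halve the ruler: x{growth_factor(d):.3f}  "
          f"ruler 1,000 times shorter: x{growth_factor(d, 1000):.1f}")
```

A smooth line gives a factor of exactly 1 however short the ruler gets, while the differences between the coastlines, barely visible after one halving, become dramatic once the ruler is a thousand times shorter.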

Think of the fractal dimension as a measure of how much detail reveals itself as you look closer.

So what on earth does this tell us about software estimation?

Well, on the surface, not a lot. But look closer and it might reveal itself. I have no mathematical proof that the complexity of a software project scales as a self similar fractal but it certainly feels like it does, it looks like it does, it smells like it does and it quacks like…

Up-front estimation is like a giant’s eye view of the Spanish/Portuguese border (it looks kinda straight and turns left at the top). 1,000 km, 6 months…tops. The problem is, the giant isn’t making the journey. The journey is made one small step at a time.

The ant knows…

While our ant is scurrying over every crack and crater, every fissure and fault line, software developers are navigating through thought-space, one brain cycle at a time.

The software development landscape at ground level looks completely different to the distant amorphous globule that floated in James Foster’s worried mind. On closer inspection, obvious assumptions reveal themselves to be nuanced. The smooth path visualised from 10 km up is pock marked to hell at ground level. This affects the distance one has to travel. A lot.

The vague architectural and system level assumptions that formed the basis of James Foster’s 6 month ‘guesstimate’ fall into sharp relief as the journey actually begins. As we zoom in through the system architecture, technology choices, platforms, domain design, APIs, protocols, UI design, interfaces, classes, tests, algorithms, functions, collections, attributes, variables and semicolons (or not) at the end of each line, each step is a potentially jagged riddle of complication that is indistinct until encountered.

High level programming languages, frameworks and platforms effectively lengthen our software development legs, allowing us to step over lower levels of complexity. Just as the ant steps over microscopic cracks, our technology choices can (frequently) allow us to vault the complexity associated with operating systems, kernels, compilers, assembly, logic gates, transistors, silicon, electricity, electrons, quantum physics. I make this point to illustrate that complexity does indeed continue ‘all the way down’ in a way that (at the very least) metaphorically mirrors the fractal nature of everything.

And it gets worse…

If complexity were limited to the hardware and software domain alone, estimation (for what it’s worth) wouldn’t be too problematic. However, hardware and software are nothing without wetware. Brains. Juicy, juicy brains marinating in sloppy bags of oxygen, carbon, nitrogen and hydrogen called Human Beings, people, or as project managers like to call us… “Resources”.

These fickle biological specimens are a significant source of complexity in our system. They are a soup of needs and emotions which, when combined with others of their kind, communicate through a low fidelity, highly lossy communication protocol called ‘language’. Joking aside, from organisational sociology, through intra and inter team dynamics, to individual relationships and the psychology and needs of individuals themselves, the software development system is entirely human.

This realisation is the punchline to the enduring practical joke of computer science as a refuge for the socially unrefined.

As far as estimation is concerned, the human element cannot be overstated. Productivity for example doesn’t vary linearly with the number of people involved (though Gantt wielding types may assume so). People are not fungible, and knowledge is an emergent phenomenon that routinely doesn’t exist (in a pertinent, specific form) at the start of an endeavour.

This barely scratches the surface, and we must move on.

So software’s special after all?

When one estimates the ‘delivery time’ for a nascent project, one is effectively measuring with an extremely long ruler. A ruler which will flatten mountains, straighten corners, and bridge valleys. These mountains, corners and valleys will however still have to be navigated at implementation time. The actual amount of time required depends on the complexity (the fractal dimension) of the project, the system in which you work, and the conceptual stride length with which you’ll be walking the journey. The conceptual stride length depends on your technical choices and your intra and inter human factors, both of which I acknowledge are virtually impossible to quantify.

It may, however, be useful to consider whether your project is a highly complex Norway (fractal dimension 1.51) or a buttery smooth South Africa (fractal dimension 1.02) and to consider what one might do to one’s system to reduce this fractal dimension.
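Purely as a thought experiment, the Norway versus South Africa comparison can be played out numerically. The Python sketch below is not a real estimation method, and every name and number in it is an assumption of mine: `walked_effort` is a hypothetical function, the stride ratio of 100 is invented (an estimate made in 6 month chunks, work done in steps a hundred times smaller), and borrowing coastline dimensions for software projects is exactly the unproven analogy this essay is making.

```python
def walked_effort(giant_estimate: float, stride_ratio: float, dimension: float) -> float:
    """Scale a coarse estimate by stride_ratio ** (dimension - 1).

    giant_estimate: the up-front estimate, in any time unit
    stride_ratio:   how many times smaller the working step is than the
                    granularity at which the estimate was made (hypothetical)
    dimension:      assumed 'fractal dimension' of the project, between 1 and 2
    """
    if stride_ratio < 1:
        raise ValueError("stride_ratio must be at least 1")
    if not 1.0 <= dimension <= 2.0:
        raise ValueError("dimension must be between 1 and 2")
    return giant_estimate * stride_ratio ** (dimension - 1)


ESTIMATE_MONTHS = 6.0  # James Foster's giant's eye view
STRIDE_RATIO = 100.0   # assumption: working steps 100 times finer than the estimate

# Coastline dimensions quoted earlier from Geoffrey West's book, used as stand-ins.
for label, d in [("South Africa", 1.02), ("Great Britain", 1.25), ("Norway", 1.51)]:
    months = walked_effort(ESTIMATE_MONTHS, STRIDE_RATIO, d)
    print(f"{label:<14} D={d:.2f}: {months:5.1f} months")
```

Under these made-up assumptions a South Africa of a project stays close to its 6 month estimate, while a Norway overshoots it roughly tenfold. The absolute numbers mean nothing; the point is how sensitive the outcome is to the dimension.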

One might be forgiven at this stage for concluding that all one needs to do to counter the challenge of emergent complexity is to lock everyone in a room together until all the details have been discovered and chiselled into a Gantt chart made of granite.

There’s a problem with this however, and this is where the fractal analogy breaks down somewhat.

Software isn’t made of rock

And that’s good, because neither are ideas. Software isn’t made of the stuff of modern day planet Earth. It’s made of stuff more akin to its primordial ancestor. It’s malleable, changeable, plastic, volcanic. A metaphorical walk within a software development landscape would be witness to canyons opening under one’s feet, islands sprouting from the ocean, mountains becoming valleys within days and destinations shifting continuously. So in addition to the fact that complicatedness reveals itself only at implementation time, we have the added complexity of this complicatedness shifting beneath our feet. In fact, our very footsteps can be the cause of this emergent complexity. The journey actually changes the destination.

While this might be perceived as an inconvenience to those who are incentivised by the sticks of ‘on time’ and ‘on budget’, it’s really a huge advantage for those incentivised by the carrot of meeting folks’ needs. We have the opportunity, throughout software’s development, to demonstrate, to listen, to understand, to realise, to empathise, to learn, to grow and to change solutions in real time, as we journey towards meeting people’s evolving needs.

So lock people in a room if you think it’s going to help, but remember that the detail won’t reveal itself until you’re on the ground. And the ground itself will be seismically active.

So how should I estimate software?

The tone of this essay may betray a certain hopelessness as to the potential of ending the frustration associated with software estimation. I don’t wish to denigrate those who have a need for predictability and reliability. I truly empathise with this.

This essay has however presented a few points which may be helpful in meeting this need.

1. Measuring anything from a distance is prone to error.
2. The further the distance you’re measuring from, the larger the error.
3. If the thing you’re trying to measure exhibits self similar fractal characteristics, this error can be understood via the concept of a ‘fractal dimension’.
4. In addition to complication only becoming fully visible at the point of execution, emergent complexity will change the route and the destination (the desired or necessary solution) upon every step.

Point 4 is a good thing.

I will conclude with an observation and some questions. The observation is as follows.

Lewis Fry Richardson realised that to accurately measure how far he’d need to walk to complete the journey along the Spanish/Portuguese border, he’d need to actually walk it himself. This tells us something incontrovertible about software estimation. If you, with your organisation, with your team, with your technology, in your marketplace, wish to accurately estimate how long a certain problem will take to solve through software, the best way will be to actually do it and measure how long it took.

I’m fully aware that this is distinctly unhelpful, so here come the questions:

If you can’t just do it and measure how long it took, how might the experience of others be able to help you out?

If you can’t just do it and measure how long it took, how might your previous experience help you out? What system conditions (org, teams, individuals, technology choices etc.) might render your previous experience helpful? Or unhelpful?

If you can’t just do it and measure how long it took, how might doing some of it and measuring how long that took help?

If software complexity is a self similar fractal, what does the visible complexity from a distance tell you about the likely complexity up close? Might this be quantifiable?

If the distance from which you’re estimating increases the likely error, what does this tell you about when to estimate?

If the error increases with the size of the thing you’re estimating, what does that tell you about the size of things you’d be wise to estimate?

If estimating one thing is difficult, what effect will the knowledge that you’ll be working on something else at the same time have on this difficulty?

Is your system more of a Norway (fractal dimension 1.51) or a South Africa (fractal dimension 1.02)? Can your system’s fractal dimension be reduced?

I apologise if you were looking for answers, only to find more questions. I do however hope that these questions will help you to find your own.

Benoît Mandelbrot put it nicely when he said, “Smooth shapes are very rare in the wild but extremely important in the ivory tower or the factory.” Software development, in the wild, is rarely smooth, and this is understandably difficult to grasp when viewed from up high. Ivory tower or not.

The pursuit of predictability can be approached in two ways. Firstly, one can improve one’s skills in predicting the future. Secondly, one can nurture one’s system to become more predictable. The first approach is commonplace, the second less so.

Which one to choose?

References/Further reading

Scale: The Universal Laws of Life and Death in Organisms, Cities and Companies by Geoffrey West.

How Long Is the Coast of Britain? Statistical Self-Similarity and Fractional Dimension by Benoît Mandelbrot

The Collected Papers of Lewis Fry Richardson