Scott Ambler is Chief Methodologist/Agile and SOA at IBM Rational

Many people who are new to agile seem to struggle with how estimating occurs on agile projects. I suspect that this is due in part to the fact that many people still cling to very serious misunderstandings about estimation, such as being able to predict up front with precision what the cost of a project will be, and due in part to the rhetoric within the agile community around the "evil practices" of traditional project management. In this article I start with a brief discussion of the need for realistic expectations for initial estimating, then I describe the process of initial estimating from an agile viewpoint, and then I describe how to update your estimates accurately throughout the agile life cycle.

As Barry Boehm has shown us again and again and again, we need to give ranged estimates which reflect the amount of uncertainty in the information on which our estimates are based. This holds for both cost and schedule estimates. Early in the project there is greater uncertainty so the range is greater, but as the project progresses the quality of the information improves, decreasing the uncertainty and therefore the range of your estimates. This is something which Boehm calls the "cone of uncertainty." At the beginning of a project, due to the inherent uncertainty, it is reasonable to give estimates in the +/- 30% range. Sadly, senior management, ever optimistic and unfailingly unable to learn from previous experiences, asks for a +/- 11% range on average (see the Dr. Dobb's July 2009 State of the IT Union Survey) and on average gets +/- 19% in practice. If we're actually getting +/- 19%, why am I recommending +/- 30%? In addition to the rather large body of knowledge about estimation, some of which recommends a far greater range, the Dr. Dobb's survey also found that to get +/- 19% project managers often had to do a fair bit of fudging, such as padding the budget, dropping scope once the budget ran out, or even updating the original estimates at the end of the project to better align with the actual results (yikes). This implies that we need an initial range far greater, on average, than 19%.
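To make the difference between these ranges concrete, here is a minimal Python sketch that applies each one to a point estimate. The $500,000 figure and the function name `ranged_estimate` are assumptions made purely for illustration; only the three percentages come from the discussion above.

```python
def ranged_estimate(point_estimate, percent):
    """Return (low, high) for a +/- percent range around a point estimate."""
    # Integer arithmetic keeps the dollar amounts exact.
    delta = point_estimate * percent // 100
    return point_estimate - delta, point_estimate + delta


POINT_ESTIMATE = 500_000  # hypothetical initial cost estimate, in dollars

for label, percent in [("requested", 11), ("reported", 19), ("recommended", 30)]:
    low, high = ranged_estimate(POINT_ESTIMATE, percent)
    print(f"{label:>11}: +/- {percent}% -> ${low:,} to ${high:,}")
```

The wider range is less comfortable to present, but it is the only one of the three that does not depend on fudging the numbers later.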

There are many strategies for initially estimating the cost and schedule of an IT project, but they all boil down to the same basic idea: gather some information about the requirements, do some thinking about the potential solution(s), and then put this information through your estimation algorithm of choice. The individual strategies vary in the level of detail in the requirements, the format of those requirements, the amount of detail in the architecture/design specification, and the factors addressed by the algorithm. For example, COCOMO II and SLIM both require fairly detailed requirements and design specifications and vary mostly on algorithm minutiae (yes, as you'd expect, there are raging religious debates about this within the estimation community). Both techniques require you to analyze your detailed specifications to count certain functional and technical aspects to plug into their complex estimation algorithms. On the other hand, estimation strategies around use case points and user story points vary mainly on the format of the requirement, both of them requiring significantly less specification but relying on the ability of the team to guess at the size of each individual requirement. Once each high-level requirement, a use case or user story respectively, is given a point rating, the points are simply added up and multipliers applied to give cost and time estimates. Other estimation strategies go even further in that they skip over the detailed use case/user story point counting step and go straight to educated guess(es) from experienced people.
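The point-counting step can be sketched in a few lines of Python. The story names, point values, and both multipliers below are hypothetical, chosen only to show the arithmetic; real multipliers would come from your own team's history.

```python
# Hypothetical backlog: each high-level requirement gets a point rating.
story_points = {
    "Search catalog": 5,
    "Place order": 8,
    "Track shipment": 3,
}

# Assumed multipliers, for illustration only.
HOURS_PER_POINT = 6
COST_PER_HOUR = 125  # dollars

total_points = sum(story_points.values())
effort_hours = total_points * HOURS_PER_POINT
cost = effort_hours * COST_PER_HOUR

print(f"{total_points} points -> {effort_hours} hours -> ${cost:,}")
```

Note that all of the judgment sits in the point ratings and the multipliers; the calculation itself is trivial.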

On the surface of things, intuition tells us that the estimation techniques based on detailed specifications should be more accurate than the techniques which rely on the best guesses of the people involved. Sadly our intuition is pretty much wrong on this issue. The challenge is that there is still a lot of guessing going on with the detailed approaches. In theory, the errors in each of the little guesses should balance out and the overall error should be much smaller as a result. This would be true if the little guesses were independent of each other, but the problem is that they're not. All of the little guesses are dependent on the ability of the stakeholders to describe their requirements, something we know that people aren't good at; on the ability of the people doing the analysis; on the ability of the people developing the detailed architecture; on the ability of the estimator(s); and on the estimator's knowledge of the skills and ability of the people doing the actual work. In short, the guesses based on detailed information still have an incredible amount of uncertainty built into them, yet they appear to be less risky than the estimates based on less-detailed specifications because of the amount of effort put into calculating them. Arguably the only thing that these complex estimating techniques are doing is putting a "scientific façade" over an activity which is still mostly an art.
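The independence argument can be illustrated with a little arithmetic. The sketch below assumes a deliberately simple model that is not part of the discussion above: n component guesses, each with the same mean and the same relative error (cv), and the same correlation rho between every pair of guesses. Under that assumption the relative error of the total is cv * sqrt((1 + (n - 1) * rho) / n).

```python
import math


def relative_error_of_total(n, cv, rho):
    """Relative error of a sum of n equally sized, equally correlated guesses.

    Assumed model: every guess has relative error cv, and every pair of
    guesses has correlation rho (rho = 0 means fully independent).
    """
    return cv * math.sqrt((1 + (n - 1) * rho) / n)


# 100 detailed guesses, each off by about 50% (hypothetical numbers).
independent = relative_error_of_total(100, 0.5, 0.0)
correlated = relative_error_of_total(100, 0.5, 0.3)

print(f"independent guesses: {independent:.1%}")
print(f"correlated guesses:  {correlated:.1%}")
```

With independent guesses, a 50% error per item averages down to about 5% overall. With even a moderate shared dependency (a correlation of 0.3 here), the overall error stays near 28%, and counting more items barely helps.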

The reality is that you're going to need to do some initial estimating because at the beginning of a project someone is going to ask you to indicate what you think you're going to deliver, how you're going to do so, how much it's going to cost, and how long you think it will take. Because of the inherent uncertainty in the information which you have available to you at the beginning of a project you'll need to give ranged answers at first and then tighten up your answers as the project progresses. The Agile Modeling methodology describes techniques applicable to the first two questions. Let's explore how to address the questions about the cost and schedule estimates.

A common progress reporting technique in the agile community is something called a "burn down chart". Burn down charts show the amount of work remaining on the project along the vertical "Y" axis and time along the horizontal "X" axis. The change in remaining work to do is called the "burndown rate", something which varies throughout the project for several reasons. Based on how much work is left to do, and the burndown rate, you can easily project when you expect to be done. Then, given the expected end date and your current labor costs, you can estimate the overall labor portion of the expected cost of the project (non-labor costs, such as hardware costs, travel costs, and so on, will still need to be guessed).

Let's work through an example. You're a member of an agile project team which started on January 1. You spent the month of January in "sprint 0" -- which is also called iteration 0, the warm up phase, or the inception phase -- where you performed requirements envisioning to identify the initial high-level requirements, performed architecture envisioning to identify a realistic technical strategy, developed a shared vision, obtained support for the project, put together the team, and obtained a working environment. You've been asked by senior management to deliver the system by August 31. Your sprint length is two weeks, your guess at your initial velocity is 20 points per sprint, your burdened labor costs are $5,000 per week per employee, you think you need four people on the team, and you have 300 points of work on your product backlog. So, with a burndown rate of 20 points per sprint, your team projects that you will require 15 sprints to complete construction (300/20), which is 30 weeks (15*2). You also believe that it will take an additional four weeks to release the system into production, giving an expected release date 34 weeks from February 1st, which is September 24th. There is also an expected labor cost of $680,000 (34 weeks * $5,000 * 4 people).
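The arithmetic in this example is easy to script. The following is a minimal sketch using the figures above. The year 2010 is an assumption (the article gives only months and days), and the "minus three days" step assumes that construction starts on a Monday and that the release date is the Friday of the final week.

```python
import math
from datetime import date, timedelta

# Figures from the example; the year is an assumption.
backlog_points = 300
velocity = 20                  # points per sprint (initial guess)
sprint_weeks = 2
release_weeks = 4
weekly_cost = 5000             # burdened cost per person per week, in dollars
team_size = 4
construction_start = date(2010, 2, 1)  # a Monday

sprints = math.ceil(backlog_points / velocity)
total_weeks = sprints * sprint_weeks + release_weeks

# Step back from the Monday after the final week to that week's Friday.
release_date = construction_start + timedelta(weeks=total_weeks) - timedelta(days=3)
labor_cost = total_weeks * weekly_cost * team_size

print(f"{sprints} sprints, {total_weeks} weeks, "
      f"release {release_date:%B %d}, labor ${labor_cost:,}")
```

Like the text, this costs only the 34 weeks after sprint 0; adding January's labor would raise the total.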

Because you're also overly optimistic and unable to learn from previous experiences, your team feels that it's possible that they can deliver by the end of August as requested. Worse yet, because of cultural dysfunction within your organization, the team chooses not to give a ranged estimate because in the past senior management has simply forced the team to commit to the lower estimate in a naïve belief that this will improve the team's motivation to deliver. In the end senior management takes your August estimate as a commitment, even though this estimate was based on guesses about the scope and the ability of the team to deliver functionality.

Let's move ahead to March 29, the beginning of the fourth sprint, where you're doing sprint planning that morning. In the previous sprint you delivered 24 points of functionality, a bit better than you had initially guessed. However, because you've been getting great feedback from your stakeholders, you identified a lot of missing requirements in the first and second sprints (a common occurrence on software delivery projects regardless of paradigm), and in the third sprint you still identified six points of new functionality. At this point you have 320 points of functionality on your product backlog due to the influx of new requirements in the previous iterations.

Now the estimate of the delivery date becomes a bit more interesting because there are two options for your burndown rate: the gross velocity, which is the number of points of functionality delivered by the team each sprint, or the net velocity, which is the number of points by which the product backlog shrank the previous sprint. In this case the gross velocity is 24 points and the net velocity is 18 points (24-6). Using the gross velocity the construction phase is now 14 sprints (320/24, rounded up) and using net velocity the construction phase is 18 sprints (320/18, rounded up). Including the four weeks required for release, the gross velocity indicates that the team will deliver in 32 weeks (14*2+4), on November 5, for a total labor cost of $760,000 ((32+6) weeks * $5,000 * 4 people, where the six weeks are the three sprints already completed), and the net velocity indicates delivery in 40 weeks (18*2+4), on December 31st, for a total labor cost of $920,000 ((40+6) weeks * $5,000 * 4 people).
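A small helper makes the two forecasts easy to compare. This is a sketch under stated assumptions rather than a definitive implementation: the function name `forecast` and its parameters are hypothetical, the year 2010 is assumed, and the delivery date is taken to be the Friday of the final week. The six elapsed weeks are the figure used in the cost calculation above.

```python
import math
from datetime import date, timedelta


def forecast(backlog, velocity, start, elapsed_weeks,
             sprint_weeks=2, release_weeks=4, weekly_cost=5000, team_size=4):
    """Return (weeks remaining, delivery date, total labor cost)."""
    sprints = math.ceil(backlog / velocity)
    weeks = sprints * sprint_weeks + release_weeks
    # `start` is a Monday; step back to the Friday of the final week.
    finish = start + timedelta(weeks=weeks) - timedelta(days=3)
    cost = (weeks + elapsed_weeks) * weekly_cost * team_size
    return weeks, finish, cost


delivered, added = 24, 6
gross_velocity = delivered          # points completed last sprint
net_velocity = delivered - added    # points the backlog actually shrank by

today = date(2010, 3, 29)           # assumed year
best = forecast(320, gross_velocity, today, elapsed_weeks=6)
worst = forecast(320, net_velocity, today, elapsed_weeks=6)

for label, (weeks, finish, cost) in [("gross", best), ("net", worst)]:
    print(f"{label:>5}: {weeks} weeks, {finish:%B %d}, ${cost:,}")
```

The pair of results is the ranged estimate: the gross figure is the optimistic end and the net figure the conservative end.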

The most realistic thing to do is to take the conservative approach with net velocity, but either way it doesn't look like this team will make its date. Interestingly, we now have an easy strategy for giving a realistic ranged estimate for both the cost and schedule based on our actuals. We could have also given a ranged estimate at the very beginning of the project by estimating the initial velocity as a range.

At this point in time, you should inform senior management about the situation. For your IT governance strategy to be effective, project teams must be as open and honest about their current status as possible, even when the status is bad. You should discuss strategies for rectifying the problem, which could include letting the delivery date slip, cutting functionality, or increasing the size of the team. Adding people to the team risks running afoul of Brooks's Law, which states that adding more people to a late project makes it later, although the team is still early in the life cycle so this shouldn't be a problem. Sadly, senior management tells you to stick to the existing date with the existing team and still deliver all the required functionality. The project status goes red and team morale goes down.

Move forward to April 26, the beginning of the sixth sprint. There are now 285 points on the product backlog, with 22 points delivered the previous sprint and four points added to your product backlog. Now the expected delivery date is between 30 weeks (13 sprints * 2 + 4, from 285/22 rounded up) and 36 weeks (16 sprints * 2 + 4, from 285/18 rounded up) away, which is November 19 and December 31 respectively. The cost is now estimated to be between $800,000 (40 * $5,000 * 4) and $920,000 (46 * $5,000 * 4). Alternatively, to make a delivery date of August 27 (the last Friday in August) you have only 14 weeks of construction left, implying that you can do between 126 (14/2 * 18) and 154 (14/2 * 22) points of work -- so you need to drop between 131 (285-154) and 159 (285-126) points of functionality from the existing product backlog given your current gross and net velocities. Time for another discussion with senior management.
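The "how much scope must go" calculation can be sketched the same way. The variable names are hypothetical and the year 2010 is an assumption; the other figures come from the paragraph above.

```python
from datetime import date

sprint_weeks = 2
release_weeks = 4
backlog = 285
gross_velocity = 22
net_velocity = 22 - 4            # delivered minus newly added points

today = date(2010, 4, 26)        # a Monday; the year is assumed
deadline = date(2010, 8, 27)     # the last Friday in August

# Adding 3 days moves from the Friday deadline to the following Monday,
# so the span divides into whole weeks.
weeks_available = ((deadline - today).days + 3) // 7
construction_weeks = weeks_available - release_weeks
sprints_left = construction_weeks // sprint_weeks

capacity_low = sprints_left * net_velocity
capacity_high = sprints_left * gross_velocity
drop_min = backlog - capacity_high
drop_max = backlog - capacity_low

print(f"{construction_weeks} construction weeks, {sprints_left} sprints")
print(f"capacity {capacity_low} to {capacity_high} points")
print(f"drop {drop_min} to {drop_max} points")
```

Framing the conversation as "which 131 to 159 points do we drop" gives senior management a concrete trade-off rather than a vague warning.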

You can see where the scenario is going. The important thing is that it is relatively straightforward to update your estimates as you progress through an agile development project. I just want to leave you with a few more points.

First, it is common to have a negative net velocity in the first few sprints, because requirements increase when your stakeholders first see the working solution. In this case the top end of your estimated range would be infinite, a problem which will correct itself once your stakeholders have a better understanding of what they actually want.

Second, a slightly more complicated approach, although more accurate, would be to use average rates for your velocity calculated over several sprints instead of just the previous sprint's velocity.

Third, you can have intelligent conversations based on actual data, not on wishful thinking. Senior management might not like what they're hearing, but the numbers speak for themselves. If your project is in trouble it's better to correct the problems as early as possible instead of letting them fester over time.

Fourth, doing these calculations by hand is part of the hidden bureaucracy in agile software development that we don't often talk about. The good news is that with integrated and instrumented tooling it's possible to automate these calculations.
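As a small illustration of the first, second, and fourth points, the sketch below averages gross and net velocity over the most recent sprints and treats a non-positive net velocity as an unbounded upper estimate. The sprint history, the three-sprint window, and the function names are all hypothetical, and using the same backlog size for both forecasts is a simplification.

```python
import math


def average(values, window=3):
    """Mean of the most recent `window` values."""
    recent = values[-window:]
    return sum(recent) / len(recent)


def sprints_remaining(backlog, velocity):
    """Sprints needed to clear the backlog; infinite if it is not shrinking."""
    if velocity <= 0:
        return math.inf
    return math.ceil(backlog / velocity)


def forecast_range(history, backlog):
    """Return (optimistic, conservative) sprint counts from averaged velocities."""
    gross = average([done for done, _ in history])
    net = average([done - added for done, added in history])
    return sprints_remaining(backlog, gross), sprints_remaining(backlog, net)


# Hypothetical history of (points delivered, points added) per sprint.
history = [(18, 30), (20, 25), (21, 20), (24, 6), (23, 5), (22, 4)]
backlog = 285

early = forecast_range(history[:3], backlog)  # net velocity still negative
later = forecast_range(history, backlog)      # requirements have settled

print("after 3 sprints:", early)
print("after 6 sprints:", later)
```

With these made-up numbers the early forecast runs from 15 sprints to unbounded, and the later one narrows to between 13 and 16 sprints, which is the cone of uncertainty showing up in your own data.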

Hot Links

In Questioning Earned Value Management (EVM) on IT Projects, I examine the traditional strategy of EVM for monitoring your project actuals against the plan.

The Dr. Dobb's 2009 State of the IT Union Survey explored project management and governance issues.

Lies, Great Lies, and Software Development Project Plans explores the results of the Dr. Dobb's July 2009 State of the IT Union Survey.

The Danger of Detailed Speculations summarizes the evidence around how detailed specifications can increase, not decrease, the risk on your IT projects.

At www.jazz.net you can see the project dashboard of the Jazz project team, including the burn down chart which is automatically generated by the Jazz-based development tools used by the team to develop Jazz.

Introduction to the Agile Scaling Model summarizes the ASM framework which provides advice for scaling agile strategies to meet the unique needs of your project.

The Surveys Exploring the Current State of Information Technology Practices page links to the results of all the Dr. Dobb's surveys which I've run over the years.

My Agility@Scale blog discusses strategies for adopting and applying agile strategies in complex environments.