Agile programming notices that we’re bad at forecasting, but concludes that we’re systematically bad. The approach it takes is to ask the programmers to aim for consistency in their forecasts of individual tasks, and to put the learning in the next phase, which is planning. So as individual members of a development team, we’re supposed to believe two things at once: that we can make consistent forecasts of particular tasks, and that our estimates are consistently off in a way that a correction factor can fix.
Partly this is like learning to throw darts, and partly it’s a theory about aggregating biased estimates. What I mean by the dart-throwing example is that beginning dart throwers are taught to aim first for consistency. Once you get all your darts to land close to the same spot, you can adjust small things to move the spot around. The first requirement for being a decent dart thrower is being able to throw the same way each toss. Once you can do that, you can turn slightly, or learn other tricks to adjust how your aim point relates to the place the darts land.
The aggregation theory says that the problem in forecasting is not with the individual estimates, it’s with random and unforeseen factors that are easier to correct for in the aggregate. The problem with the individual forecasts might be overhead tasks that reliably steal time away, it might be that bugs are unpredictable, or it might be redesign work that only becomes apparent as you make progress. These are easier to account for in aggregate planning than when thinking about individual tasks. If you pad all your tasks for their worst case, you end up with too much padding.
Over the long term, the expansion factor from individual tasks to weeks or months of work can be fairly consistent.
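To make the correction-factor idea concrete, here is a minimal sketch in Python. The function names and the sample numbers are invented for illustration; they don't come from any particular agile tool or team. It assumes you have kept a record of estimated and actual durations for past tasks, and it simply scales the sum of new task estimates by the historical ratio.

```python
def expansion_factor(estimated, actual):
    """Ratio of total actual time to total estimated time over past tasks."""
    if not estimated or len(estimated) != len(actual):
        raise ValueError("need matching, non-empty lists of estimates and actuals")
    return sum(actual) / sum(estimated)


def project_forecast(new_estimates, factor):
    """Scale the sum of raw task estimates by the historical expansion factor."""
    return sum(new_estimates) * factor


# Hypothetical history, in days: what we guessed vs. what it took.
past_estimates = [2, 3, 1, 5, 4]
past_actuals = [3, 5, 1.5, 8, 5]

factor = expansion_factor(past_estimates, past_actuals)

# Raw estimates for the next batch of tasks, corrected in aggregate.
forecast = project_forecast([3, 3, 4], factor)
print(f"expansion factor: {factor:.2f}, forecast: {forecast:.1f} days")
```

Note that the correction is applied to the total, not to each task. The individual estimates stay as they are, which is the point of asking for consistency rather than accuracy.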
Providing slack at the project level instead of the task level is a really good idea, and has worked well in many fields outside of programming. It is analogous to insurance: the ROI on slack is higher when you pool many events whose errors are at least partially uncorrelated.
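A rough way to see the pooling effect is a small simulation. This is only a sketch under assumptions I’m making up for the purpose: eight tasks with a nominal 3-week duration, independent lognormal overruns, and a 90% confidence target. It compares padding every task to its own 90th percentile against padding the project total to its 90th percentile.

```python
import random


def simulate(n_tasks=8, nominal=3.0, trials=5000, quantile=0.9, seed=0):
    """Return (sum of per-task padded estimates, pooled project estimate)."""
    rng = random.Random(seed)
    task_samples = []
    project_totals = []
    for _ in range(trials):
        # Assumed error model: each task's duration is its nominal length
        # times an independent lognormal overrun factor.
        durations = [nominal * rng.lognormvariate(0.0, 0.5) for _ in range(n_tasks)]
        task_samples.extend(durations)
        project_totals.append(sum(durations))

    task_samples.sort()
    project_totals.sort()

    # Empirical quantiles: the duration not exceeded in `quantile` of samples.
    per_task_q = task_samples[int(quantile * len(task_samples))]
    pooled_q = project_totals[int(quantile * len(project_totals))]
    return n_tasks * per_task_q, pooled_q


padded_per_task, pooled = simulate()
print(f"pad each task:   {padded_per_task:.1f} weeks")
print(f"pad the project: {pooled:.1f} weeks")
```

Under these assumptions the pooled figure comes out smaller, because bad luck on one task is partly offset by good luck on another. If the errors were strongly correlated (say, one redesign that hits every task), the gap would shrink.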
One major problem with trying to fix estimates at the task level is that there are strong incentives not to finish a task too early. For example, if you estimated 6 weeks, and are almost done after 3, and something moderately urgent comes up, you’re more likely to switch and fix that urgent thing since you have time. On the other hand, if you estimated 4 weeks, you’re more likely to delay the other task (or ask someone else to do it).
As a result, I’ve found that teams really do tend to finish projects faster and with higher quality if you estimate the project as, say, eight 3-week tasks with 24 weeks of overall slack (so 48 weeks total) than if you estimate it as eight 6-week tasks.
This is somewhat counterintuitive but really easy to apply in practice if you have a bit of social capital.
We’re not systematically bad forecasters. We’re subject to widespread rewards for overconfidence.
Ah, interesting evidence, thanks! And I like the dart-throwing analogy.