I can offer “Delayed Value Lost” or “Delayed Value Shrinkage.”

I just read your article “Adaptive Agility – Managing Complexity and Uncertainty” and found you had come to the same conclusion I’ve reached in the hardware product development world. My book “Adaptive Product Management: Leading Complex and Uncertain Projects” (http://a.co/2wRFxjV) makes the same point: consider complexity and uncertainty, and pick a management style that fits the project.

Cheers,

Andy Silber

Thank you very much for the detailed explanations! The problem I have with linear burnup is that it is a point estimate that doesn’t tell me anything about the uncertainty. It produces a single value, and I don’t know what to do with it.

We had teams that started slowly; the linear burnup indicated a catastrophe, yet in the end the projects turned out to be remarkably successful.

We had other teams that started quickly and got into trouble close to the end.

Our approach was to try to get the most critical work done as early as possible and to rely on people’s judgment (which is, of course, far from perfect).

Currently, I’m doing a little research on Monte Carlo estimation (see https://sebastiankuebeck.wordpress.com/2017/03/15/planning-with-uncertainty-part-2/) and am testing the method by applying it to several open-source projects. I’ll publish the results as soon as I’m done.

The nice thing about it is that it requires no assumptions about the distribution and that it produces a distribution instead of a point estimate. The downside is that it requires a reasonable amount of historical data.
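The resampling idea described above can be sketched as follows. This is a minimal illustration of Monte Carlo forecasting from historic throughput, not the commenter’s actual tooling, and the weekly throughput numbers are made up:

```python
import random

def mc_forecast(weekly_throughput, remaining_stories, runs=10000, seed=42):
    """Bootstrap a completion-time distribution from historic throughput.

    Resamples observed weekly story counts (with replacement) until the
    remaining backlog is exhausted; returns the sorted simulated durations
    in weeks. No assumption about the throughput distribution is made.
    """
    rng = random.Random(seed)
    durations = []
    for _ in range(runs):
        done, weeks = 0, 0
        while done < remaining_stories:
            done += rng.choice(weekly_throughput)
            weeks += 1
        durations.append(weeks)
    return sorted(durations)

# Hypothetical team: eight weeks of observed throughput, 40 stories left.
sims = mc_forecast([3, 5, 2, 6, 4, 0, 5, 3], remaining_stories=40)
p10, p50, p90 = (sims[int(q * len(sims))] for q in (0.10, 0.50, 0.90))
```

The output is a whole distribution, so one can read off a P10/P50/P90 forecast instead of a single date.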

On the effort of story point estimation: the effort depends on the team and company culture. We used Magic Estimation to estimate backlogs of 100+ stories in an hour or so. I have also seen teams that spent a whole day per sprint discussing estimates without getting better results.


Well, if the result is as overwhelming as you wrote, then there is no need to verify the method by using a different method. The terrible things that have been done with and to statistics (see e.g. http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124) have made me a little bit paranoid. Sorry for that.

Thanks for the great post. What I have found is that linear burnup extrapolation is a good, but not great, tool for predictions. For this data, using the burnup gives an EQF (Estimation Quality Factor) median of 6.0 for both story point and throughput extrapolation. This compares well with industry data from DeMarco and Lister, who report a median of 3.8 and judged a median EQF of 5.0 as pretty good. I think this improvement in EQF is primarily gained by de-biasing the estimates. That is the one thing burnups do well, while we frequently see biases in estimates from humans. Burnup charts do not, however, reduce uncertainty unless the underlying data shows such a reduction.
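For readers unfamiliar with the metric, here is a discrete sketch of DeMarco and Lister’s EQF: the actual result divided by the time-weighted mean absolute estimation error. It assumes the estimates were sampled at equal intervals over the project; the numbers are hypothetical, not from this study:

```python
def eqf(estimates, actual):
    """Estimation Quality Factor (after DeMarco/Lister).

    estimates: running estimates of the final total, sampled at equal
    intervals over the project's life; actual: the true final total.
    Higher is better; a perfect estimator yields infinite EQF.
    """
    mean_abs_err = sum(abs(e - actual) for e in estimates) / len(estimates)
    if mean_abs_err == 0:
        return float("inf")
    return actual / mean_abs_err

# A project that truly took 100 days, with estimates converging over time:
score = eqf([70, 80, 90, 95, 100], actual=100)  # 100 / 13 ≈ 7.69
```

An estimator that converges early accrues little error area and scores high, which is why de-biasing the early estimates moves the median so much.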

You are correct that this study does not show that story points are bad in general. What we see is that for most conditions, there is little difference between using story points and using throughput. The only condition we studied that showed a significant difference is when the backlog contains stories with a wide distribution of story points. There may be other reasons why story points could be bad or good.

But I would disagree with your contention that counting stories would not reduce estimation time. There is a big difference between quantifying story points for each story and simply identifying which stories are so big that they need to be split.

As for the P90/P10 ratio, I did not post here all the rationale for why we used it. The primary reason is that it is essentially a means of stating the variance for distributions which are lognormal-ish (lognormal or Weibull). For such skewed distributions, we state the variance as a ratio rather than as a plus/minus, which would be proper for a normal symmetric distribution. As we reported, we found the distributions to be close to either lognormal or Weibull. The nice thing about the P90/P10 ratio is that it uniquely describes the shape factor of either a lognormal or a Weibull distribution. For a lognormal distribution, the log of the P90/P10 ratio is proportional to the standard deviation of the log of the distribution. For a Weibull distribution, it is easy to derive the shape factor from the ratio. In addition to looking at the P90/P10 ratios, we also compared the resulting curves visually (they are virtually identical) and used Q-Q plots.
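The back-derivations mentioned here follow directly from the standard quantile functions: for a lognormal, ln(P90/P10) = 2·z₉₀·σ, and for a Weibull with shape k, P90/P10 = (ln 10 / ln(10/9))^(1/k). A sketch (my own illustration, not the study’s code):

```python
import math

Z90 = 1.2815515655446004  # 90th-percentile quantile of the standard normal

def lognormal_sigma(p90_p10):
    """Log-space sigma of a lognormal with the given P90/P10 ratio."""
    return math.log(p90_p10) / (2 * Z90)

def weibull_shape(p90_p10):
    """Shape parameter k of a Weibull with the given P90/P10 ratio.

    From the quantile function P_q = scale * (-ln(1 - q))**(1/k):
    P90/P10 = (ln 10 / ln(10/9))**(1/k).
    """
    return math.log(math.log(10) / math.log(10 / 9)) / math.log(p90_p10)
```

So a single number (the ratio) pins down the shape of either family, which is what makes it a convenient stand-in for a variance statement.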

Lastly, you indicated that you found that burnup chart extrapolations did not produce reliable predictions. I’m curious what approaches you have used that are better.

First of all, thank you for taking the time to create this study!

If I got this right, you conclude that linear extrapolation of the burnup chart doesn’t produce reliable predictions. This confirms my personal experience.

However, the study does not provide evidence that story point estimation is bad in general, nor does it suggest an alternative that creates better results.

Just counting stories doesn’t really reduce estimation effort, as you still have to estimate whether a story is an ordinary nut or a ten-ton super nut from outer space that has to be split into parts.

Another thing I noticed is that you only used the P90/P10 method, without verifying that its assumptions are actually met. I’d personally use a few methods, just to be sure the choice of method doesn’t influence the result.

Thanks for the question. We aren’t normalizing to compare SP across projects, but rather to compare the overall projects against each other. We normalize SP by dividing by the total SP delivered in the project, and we do a similar normalization for time. This ensures that the burnup chart for each project starts at (0, 0) and eventually concludes at (1, 1). Using this approach allows us to compare projects on the same scale.
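A minimal sketch of this normalization (the day and SP numbers are hypothetical, not from the study’s data):

```python
def normalize_burnup(times, cumulative_sp):
    """Rescale a burnup series so it runs from (0, 0) to (1, 1).

    Each timestamp is divided by the total elapsed time and each
    cumulative story-point total by the total SP delivered, making
    projects of any size and duration comparable on one chart.
    """
    t0, t1 = times[0], times[-1]
    total = cumulative_sp[-1]
    return [((t - t0) / (t1 - t0), sp / total)
            for t, sp in zip(times, cumulative_sp)]

# Hypothetical project: 80 SP delivered over 40 days.
curve = normalize_burnup([0, 10, 25, 40], [0, 20, 60, 80])
```

Since every normalized curve occupies the unit square, deviations in shape (slow start, late rush) become directly comparable across projects.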

Hours and dollars are cardinal measures, so comparing hours and dollars between projects can be done. How is this done with SPs?