# Test Estimation – VI

So far, we have two ways to predict project outcome:

First, by comparing the test effort to other projects and suggesting it is “between the size of project X and project Y”, thus giving us a range.

Second, by calculating test costs as a percentage of the development (or total project) effort, then looking at the official schedule and projecting our expected project length. If we’re smart, we also take slips into account when we derive that percentage.
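To make the second approach concrete, here is a minimal sketch. The function names and every number in it are hypothetical, made up for illustration; the only idea taken from the text is "test effort as a percentage of development effort, adjusted for slips."

```python
def historical_test_ratio(test_days, actual_dev_days):
    """Test effort as a fraction of the development effort actually spent."""
    return test_days / actual_dev_days


def projected_test_days(planned_dev_days, test_ratio, slip_factor=1.0):
    """Scale the official plan by the expected slip, then apply the test ratio."""
    return planned_dev_days * slip_factor * test_ratio


# Hypothetical past project: planned 80 dev days, took 100, used 40 test days.
ratio = historical_test_ratio(test_days=40, actual_dev_days=100)
slip = 100 / 80  # assumed: the new project slips the way the old one did

# Hypothetical new project with 120 dev days on the official schedule.
estimate = projected_test_days(planned_dev_days=120, test_ratio=ratio, slip_factor=slip)
print(f"ratio={ratio:.2f}, slip={slip:.2f}, estimate={estimate:.1f} test days")
```

Whether last project's slip says anything about this one is a judgment call, which is part of why this method only gives a rough number.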

A third approach I can suggest is to predict the cycle time – time to run through all the testing once. I find that teams are often good at predicting cycle time. The problem is they predict that everything will go right.

It turns out that things don’t go right.

Team members find defects. That means they have to stop, reproduce the issue, document the issue, and start over, and that takes time. More than that, it takes mental energy; the tester has to “switch gears.” Plus, each defect found means a defect that needs to be verified at some later point. Some large percentage of defects require conversation, triage, and additional mental effort.

Then there is the inevitable waiting for the build, waiting for the environment, the “one more thing I forgot.”

So each cycle time should be larger than ideal – perhaps by 30 to 40%.
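As a quick sketch of that padding (the function name is hypothetical, and the ten-day ideal cycle is an invented example), the 30 to 40% buffer turns a single ideal number into a range:

```python
def padded_cycle_days(ideal_days, low_pad=0.30, high_pad=0.40):
    """Return a (low, high) range for one test cycle after padding the ideal time."""
    if ideal_days < 0:
        raise ValueError("ideal_days must not be negative")
    return ideal_days * (1 + low_pad), ideal_days * (1 + high_pad)


# Hypothetical: the team says one pass takes 10 days if everything goes right.
low, high = padded_cycle_days(10)
print(f"Plan for {low:.1f} to {high:.1f} days per cycle")
```

The defaults are only a starting point; a team with its own history should replace them with its own numbers.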

Then we need to predict the number of cycles based on previous projects. Four is usually a reasonable number to start with — of course, it depends if “code complete” means the code is actually complete or not. If “code complete” means “the first chunks of code big enough to hand to test are done”, you’ll need more cycles.

If you start to hear rhetoric about “making it up later” or “the specs took longer than we expected, but now that they are solid, development should go faster”, you’ll need more cycles.

(Hint: When folks plan to make it up later, that means the software is more complex, probably buggier, than the team expected. That means it’ll take more time to test than you’d hoped, not less.)
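Put together, the cycle-based estimate is just padded cycle time multiplied by the number of cycles. The sketch below starts from four cycles and adds one per warning sign. That "one extra cycle per warning sign" rule is purely my assumption for illustration; the right bump depends on your own project history.

```python
def estimate_cycles(base_cycles=4, warning_signs=0, extra_per_sign=1):
    """Start from a base cycle count and add cycles for each warning sign heard."""
    return base_cycles + warning_signs * extra_per_sign


def total_test_days(cycle_days, cycles):
    """Total test time: (already padded) days per cycle times number of cycles."""
    return cycle_days * cycles


# Hypothetical: a padded cycle of 13 days, and two warning signs
# ("we'll make it up later", plus an optimistic "code complete").
cycles = estimate_cycles(warning_signs=2)
print(f"{cycles} cycles, {total_test_days(13, cycles)} test days")
```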

So now we have three different methods to come up with estimates. With these three measures we can do something called triangulation – where we average the three. (Or average the ranges, if you came up with ranges.)
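Here is a small sketch of averaging the ranges. The three ranges are invented numbers, and `triangulate` is a hypothetical helper name. A method that produced a single figure is written as a range whose low and high are equal.

```python
from statistics import mean


def triangulate(ranges):
    """Average the low ends and the high ends of several (low, high) estimates."""
    lows = [low for low, _ in ranges]
    highs = [high for _, high in ranges]
    return mean(lows), mean(highs)


# Hypothetical estimates in test days.
estimates = [
    (50, 70),  # comparison to other projects
    (60, 60),  # percentage of development effort (a single number)
    (52, 56),  # padded cycle time multiplied by number of cycles
]
low, high = triangulate(estimates)
print(f"Triangulated estimate: {low} to {high} days")
```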

When that happens, it’s human nature to tend to throw out the outliers – the weird numbers that are too big or too small.

I don’t recommend that. Instead, ask why the outliers are big or small. “What’s up with that?”

Only throw out the outlier if you can easily figure out why it is conceptually invalid. Otherwise, listen to the outlier.
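One way to make "listen to the outlier" a habit is to flag outliers for a conversation instead of dropping them. The sketch below is an assumption on my part: it compares each estimate to the median, and the 50% tolerance is an arbitrary threshold picked for the example.

```python
from statistics import median


def flag_outliers(estimates, tolerance=0.5):
    """Return estimates that differ from the median by more than tolerance * median.

    Nothing is discarded; the result is a list of numbers to ask questions about.
    """
    mid = median(estimates.values())
    return {
        name: value
        for name, value in estimates.items()
        if abs(value - mid) > tolerance * mid
    }


# Hypothetical estimates in test days.
estimates = {"comparison": 60, "percentage": 55, "cycle time": 110}
for name, value in flag_outliers(estimates).items():
    print(f"What's up with '{name}' at {value} days?")
```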

Which brings up a problem — all the estimating techniques I’ve listed so far have a couple of major conceptual flaws. And I haven’t talked about iterative or incremental models yet.

They are just a start.

Still more to come.

## 3 comments on “Test Estimation – VI”

1. blog says:

So each cycle time should be larger than ideal – perhaps by 30 to 40%.

Why 30-40%? Do you use a formula or is that STSP (standard test schedule padding) based on your experience? If it's your experience, how do you know it will work for others?

You are also saying (I think) that each test cycle is equal – do you have to run the same tests in each cycle? If you learn something on the first test pass, you may want to run more tests on subsequent passes. Conversely, you may learn something different and run fewer tests – or different tests? Or is there an unwritten rule that says "thou shalt only add as many tests as thou removes"?

Still trying to figure out what you're getting at – should I just wait for you to finish, or can I make one of those "choose your own ending" books?

2. Matthew says:

Logic tells us that we need /some/ padding, right?

The buffer percentage is going to come from experience and intuition – how much longer does it take us to actually test the software than the sum of the tasks?

That will be different by team. If your team has no experience with this sort of thing at all, well, 35% is a place to start.

Likewise, we'll say a 'cycle' is whatever we need to do to get some coverage of the app. If we find a ton of bugs in module A and none in module B, and all the fixes are made to A, and the devs say 'we promise' that none of the changes impact module B "at all" – well, cycle two will be very different than cycle one.

But it's probably a fair first-order approximation to say that cycle two will probably take about as long as cycle one did, no?

As for what I'm "getting at" – well, I'm getting at quite a few things. The general pattern is to lay out an estimation system, break it, and improve on it with the next one. I hope that's an interesting style of essay where we can at least learn a little – and be a little entertaining along the way.

I hope you agree.

3. blog says:

So…iterate, learn, and be smart (or whatever order is most appropriate) :}