Interesting insights from Software Estimation: Demystifying the Black Art

Note: This is a letter that I wrote to Steve McConnell after I read his most recent book. I'm publicly sharing it because I learned something that may be of general interest from thinking about the topic, and I'd like to see more people read the book.

I finally got around to reading your most recent book. As always with your books, I found it fascinating. So fascinating that I'm writing you an open letter which I am going to publicly post elsewhere.

First of all, as I expected, you hit it out of the park. You've done a very good job on a difficult topic that our industry normally does a horrible job on. Very few of your readers have any clue how poorly they judge what is 90% likely. It is incredibly helpful to be conscious of how people confuse estimates, targets and commitments. You were absolutely right to back up your advice on how to create accurate estimates with advice on how to defend those estimates from organizational pressures to replace them with wishful thinking. And, as always, all of the advice is backed up by invaluable compiled (and meticulously referenced) data on everything from how uncertain the best possible estimates are at various stages in the software lifecycle to how wide the productivity variation is from company to company.

As with any book, it is not perfect. However the overall quality is extremely high and the remaining imperfections are small. Furthermore you took into account all of my criticisms for the one chapter that I reviewed. Since I had the opportunity to review the rest of the book and didn't, I feel that any oversights that I notice are more my fault than yours.

Needless to say I highly recommend this book to everyone involved in the software development process. And my main difficulty now is identifying who in my immediate environment I should lend it to first. (That is, who would create the biggest positive impact on the company I'm in?)

As is often the case with your work, of even higher value is how close reading leads to or reinforces insights on other parts of software development. Sometimes this is presented in an understated way. Such as the paragraph on page 64 that says, "...individual performance varies by a factor of 10 or more. Within any particular organization, however, your estimates probably won't need to account for that much variation because both top-tier and bottom-tier developers tend to migrate toward organizations that employ other people with similar skill levels." (A fact which you then provide 2 references to.) I laughed aloud at that one.

Sometimes the tangential gems are presented very directly. For a random example on page 69 you point out that multi-site development increases needed effort an average of 56%. As you say, this effect should be carefully considered by organizations considering outsourcing. And while most software professionals understand that this is a significant factor, very few of us can quantify it. Which makes it hard for us to get businesses to take it seriously.

And sometimes the insights are not directly presented. They are just implicit in the copious data that you've presented, waiting to reward the careful reader who can spot them. I'd like to talk about one of those.

It has long been a mantra among people who like dynamic languages that developers are more productive in small groups, and so there is great value in delivering languages that make small groups as productive as possible. I cannot count how many times I have seen variations on this theme, nor can I count how many times I have personally repeated it. Supporting anecdotal evidence is easy to find. However until I read your book, I'd never seen concrete quantitative evidence that I could quote to support what is common knowledge in some circles.

Well I'd long known evidence for part of that assertion. Variations on the chart that you reproduce as table 5-3 on pages 64-65 have been circulating for ages. And while I agree with your conclusion that it is more productive to use a language such as Java instead of a language like C, I'd also point out that it is more productive to use a language such as Perl instead of a language such as Java. Interpolating from that chart with too much precision, about 2.4 times. I hadn't before seen the more detailed table 18-3 that you offer on page 202. Judging from that, average Java programmers need 2.75 times as much code as average Perl programmers to do the same task. Those estimates agree since neither is very precise - Java takes somewhere between 2 and 3 times as much work for the same task as Perl.

Of course coding is but one of the tasks that needs to happen in software development. If only half of your development time would have gone to coding (a reasonable estimate based on table 21-4 on page 236), then reducing coding time to 40% of what it was only saves you 30% of overall effort. Still that is a significant reduction. Why don't people pay more attention to it?
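To make that arithmetic explicit, here is a minimal sketch. The function name and parameters are my own illustration, not anything from the book; the inputs are the rough figures quoted above (half of the effort is coding, and coding shrinks to 40% of what it was).

```python
def overall_saving(coding_share: float, coding_time_ratio: float) -> float:
    """Fraction of total effort saved when only the coding portion shrinks.

    coding_share: fraction of total effort spent on coding (assumed 0.5 here).
    coding_time_ratio: new coding time as a fraction of the old (assumed 0.4 here).
    """
    return coding_share * (1.0 - coding_time_ratio)


# Half of the effort is coding, and coding drops to 40% of its former size.
saving = overall_saving(coding_share=0.5, coding_time_ratio=0.4)
print(f"Overall effort saved: {saving:.0%}")
```

The non-coding half of the project is untouched, which is why a 60% cut in coding time turns into only a 30% cut overall.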

The catch is, of course, that Java has many features that make it much better than Perl for handling the challenges of development in large teams. Therefore it is easy to dismiss the productivity benefit because "Perl is not scaleable." And it is easy to likewise dismiss the anecdotal accounts of exactly how productive small teams are because common sense keeps us from accepting that 6 people do more than a dozen.

Which is part of the reason why I am grateful to you for reproducing figure 20-3 on page 229. I've heard estimates before that it takes a team of about 20 people to match the output of a team of 6-7, but I'd never before seen concrete data backing that up.

Anecdotally the primary cause is well-understood: people are most effective in a flat team, but that only works for teams up to about 6-8 people. With that structure you have little to no overhead from having to manage process, or from people not being able to find out what they need to know when they need to know it. But that falls apart when there are too many lines of communication. The solution to that problem is to introduce process to cut down who needs to talk to whom, when. However adding process drops productivity per person significantly, meaning you have to add more people. And this cascades until you get to the same productivity with a far larger team. At that point you can keep scaling for a lot longer, but at far higher cost.
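One rough way to see how quickly the lines of communication pile up is to count every pair of people as a potential line. That pairwise model is an assumption of this sketch, not a figure taken from the book, and the function name is hypothetical.

```python
def communication_lines(team_size: int) -> int:
    """Number of distinct pairs in a team where everyone may talk to everyone."""
    return team_size * (team_size - 1) // 2


# Compare a flat team of 7 with a team of 20 (the sizes discussed above).
for size in (7, 20):
    print(f"{size} people -> {communication_lines(size)} possible lines")
```

Under that assumption a team of 7 has 21 possible lines and a team of 20 has 190, which gives a feel for why a flat structure stops working somewhere in between.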

There are secondary issues that are also well understood. For instance you're likely to find a higher portion of good developers in the small team environment. Why? Well there are a lot of reasons. First of all it is clear that it is easier for an individual to be productive in the small team than in the large one. People who are drawn to productive environments are likely to be people who value their personal productivity, who are therefore likely to be productive people. Conversely it is much harder for an incompetent developer to hide in a small group than a large one, so the worst developers don't stay. Additionally, given comparable turnover rates, one can maintain staffing levels in a small group while being more selective about candidates than one can in a large group. And finally a company that understands the cost benefits of having a small group of good people can justify higher individual salaries for those people.

So the 3-1 individual productivity difference in lines of code between small teams and large teams has a number of causes. It really isn't as simple as saying, "Move 2/3 of your 20 person team away and you'll get the same productivity." However that said, the line of code measure may be hiding some more dramatic productivity differences.

Some are very hard to quantify. For example common sense tells us that a team of 6 people that all talk to each other is going to have more consistency across 57,000 lines of code than a team of 20 people who are deliberately being kept from talking all the time. That lack of consistency is going to show up in all sorts of bad ways, from re-invented wheels to misunderstood internal APIs.

But one is easy to quantify: the small team is much more likely to be using a productive interpreted language than the large one. So the 57,000 line project delivered by the 7 person team might well have 2-3 times the functionality of the 57,000 line project delivered by a 20 person team in about the same time. (As I've noted, the productivity difference comes from a combination of factors, including having better people.) Even if you're paying those programmers 50% more per person, your productivity per dollar is about 5 times better with the small team than the large one. That's a pretty dramatic difference. While I'll be the first to admit that there are limits to what small teams of good people can do, I'll also stand in line to point out that those limits are farther out than most people realize, and there is a very good business case for relying on small teams whenever you can.
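For anyone who wants to check the "about 5 times" figure, here is a back-of-the-envelope sketch using only the numbers above: 7 people versus 20, 2-3 times the functionality, and a 50% salary premium. The function and its parameter names are hypothetical, and salaries are treated as the only cost, which is a simplifying assumption.

```python
def per_dollar_advantage(small_team: int, large_team: int,
                         functionality_multiplier: float,
                         salary_premium: float) -> float:
    """How many times more functionality per dollar the small team delivers.

    Costs are measured in units of one large-team salary over the same schedule.
    """
    small_cost = small_team * (1.0 + salary_premium)
    large_cost = float(large_team)
    return functionality_multiplier * large_cost / small_cost


# Try the low, middle and high ends of the 2-3x functionality range.
for multiplier in (2.0, 2.5, 3.0):
    advantage = per_dollar_advantage(7, 20, multiplier, 0.5)
    print(f"{multiplier:.1f}x functionality -> {advantage:.1f}x per dollar")
```

With these inputs the advantage comes out somewhere between roughly 4 and 6 times, so "about 5" is the middle of the range.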

Anyways, congratulations on yet another excellent book, and I'm sure that I'll be digesting its consequences for a long time to come.

Cheers,
Ben

Re: Interesting insights from Software Estimation: Demystifying the Black Art by talexb (Chancellor) on Jun 11, 2007 at 17:38 UTC
Interesting -- I'll have to see if I can pick up a copy of this book. However, it does make me wonder what literature there is for smaller teams, say 1-3.

When I started work at my current employer, I was joining a one-man team, but about 18 months later, my team-mate left. So I've been working as a one-man team for the last three years or so. That has its advantages and disadvantages, depending, I suppose, on the personality involved.

For example, with two people on the team, we often used extreme programming (XP) to write reams of code. Now that I'm on my own, there's a SysAdmin who sometimes sits with me as I write code. While he's a smart guy, he's probably not getting 100% of what I'm writing in Perl or SQL. It's still nice to have someone in the co-pilot's seat while I write code and explain my thought processes.

I do know one thing about a small team -- you have to get on well with each other, otherwise productivity goes way down.

The good news about a one-man team is that I can have a team meeting in my head. And .. that can be a bad thing too. :)

Actually, I've found that the best time for a team meeting is The Next Day, if I'm really stuck. Another approach is to explain the problem to someone like my wife; she doesn't know much about software development, but she's a really good listener. This usually results in my explanation tailing off as I race off to look for a pen, so that I can write down the solution that's just occurred to me.

My other outs are dropping in on Perlmonks or visiting #perl on freenode .. asking and answering questions on-line always gets my creative juices flowing. And, of course, without coffee I'd be lost.

Anyone else have comments about a Development Team of One?

Alex / talexb / Toronto
"Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds
Re^2: Interesting insights from Software Estimation: Demystifying the Black Art by tilly (Archbishop) on Jun 11, 2007 at 18:00 UTC
If there is extensive research on the performance of very small teams, he did not present it. All that he presented was one graph from Lawrence Putnam for schedule and effort to complete a medium sized project for various team sizes. Medium was defined as 35,000-95,000 lines of code, and averaged 57,000 lines. (The averages were close to this for all team sizes.) Eyeballing the graph, it looks like teams of 1.5-3 people completed their projects in about 14 months, 3-5 in about a year or so, and 5-7 in just under a year. No data is presented about the relative quality of those projects, or the distribution of languages used. If you wish more detail than that, the reference that he gives is Putnam and Myers, 2003. Which would be Five Core Metrics. I have not read that book.
Re^2: Interesting insights from Software Estimation: Demystifying the Black Art by itub (Priest) on Jun 12, 2007 at 08:46 UTC
Poor wife... why bug her when you can use a teddy bear? ;-)

Another effective technique is to explain your code to someone else. This will often cause you to explain the bug to yourself. Sometimes it takes no more than a few sentences, followed by an embarrassed "Never mind, I see what's wrong. Sorry to bother you." This works remarkably well; you can even use non-programmers as listeners. One university computer center kept a teddy bear near the help desk. Students with mysterious bugs were required to explain them to the bear before they could speak to a human counselor.
--Brian Kernighan and Rob Pike

See also: Teddy Bear code reviews
Re^3: Interesting insights from Software Estimation: Demystifying the Black Art by talexb (Chancellor) on Jun 12, 2007 at 14:09 UTC
;) I ask my wife because although she knows little about software development, or even computers, she has a very good grasp of logic and is quite able to ask intelligent questions. And asking questions comes in handy when the 'Aha!' moment doesn't occur. And sometimes, there isn't a 'great' solution -- you just have to pick one of the available 'less great' options and go with it.

Alex / talexb / Toronto
"Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds
Re^2: Interesting insights from Software Estimation: Demystifying the Black Art by samizdat (Vicar) on Jun 12, 2007 at 16:32 UTC
One of the advantages of big teams is that you ask for a few books and you get back "Investing in our people is always in ____'s best interests." Ordered. In regard to the productivity of small teams, I spent most of my career as "it". I think it surely has both advantages and disadvantages. One disadvantage is that you don't get valid design and code reviews, but one of its advantages is that you need fewer of them. I'll second the personality and politics caution. My very first programming-only job was 8051 assembly coding for a handbuilt wire processing assembly line. The guy who hired me started out being quite egotistical about his skills, but ended up getting quieter and quieter as my skills surpassed his. I won't say that I _was_ better than he was, but since I was learning this from scratch, it was obvious that I was going to be pretty good quickly. It never turned into a problem, but it could have. He was having difficulty not being the only star. I don't think that's an issue only for small teams (!!!!), but it's definitely more of a showstopper potential for them. I heartily second tilly's point that small teams can do a lot with Perl or other dynamic interpreted languages. There are times when teams (because of management or programmer stupidity) choose Java, say, because it's the 'next wave for Enterprise development', and end up killing the project and the company because of the design and CPU overhead. I'll need to read McConnell's book to get to the source of the thread, but I'll agree that an awful lot of projects that are coded in C or C++ or Java had no need to be done that way and would have been better done in Perl. A lot of this 'embedded Linux' stuff has no need to be compiled C, and we'd be better served to get it out the door more quickly than to save a few cycles of CPU. 
We're definitely seeing every one of tilly's concerns about big teams here, from personality conflicts to API problems to inefficiencies to crappy programmers hiding in the weeds. I've slipped in a few scripts that have been well received, mostly for generating C header files and array declarations early in the build, but the truth is that there's very little code here that couldn't be all Perl. Food for thought. Thanks, tilly, for the link and the commentary.

Don Wilde
"There's more than one level to any answer."
Re: Interesting insights from Software Estimation: Demystifying the Black Art by blahblahblah (Priest) on Jun 12, 2007 at 13:54 UTC
I also would recommend this book to all programmers. It's a very understandable blend of guidelines plus the math and research that led to the formulation of those guidelines. It has greatly helped me in my current project, which was one of the biggest projects I've ever undertaken.

I often get asked for quick estimates of projects. My self-taught estimation technique had evolved like this over the years:

1. give a quick guess (usually turned out 3x too short)
2. make a quick guess, then triple that (usually closer, but still too short)
3. give a wildly pessimistic guess (never short, but not what my boss wants to hear)

Since reading the book, I feel my boss & I are both happier with the estimates/schedule that we figure out together. In addition to presenting an easy-to-grasp summary of research into various individual factors that influence an estimate/schedule (like tilly's examples), the book gives a lot of practical ideas for making better estimates.

Joe
Re^2: Interesting insights from Software Estimation: Demystifying the Black Art by dpavlin (Friar) on Jun 16, 2007 at 21:08 UTC
I had a similar process when finding out my estimation factor. I also noticed that my (positive) estimates are three times too short. However, when I need to estimate time for a group of people, even if I know them really well, and can estimate time for each of them individually, I would probably take an optimistic guess, multiply it by three and then take the next time unit. Yes, convert hours to days, days to weeks, weeks to months... Months would go to quarters or years, but you would already know that :-) This technique proved very successful in estimating time for a small team of 7 people with various skill levels. It also leaves enough time for testing, documentation and deployment :-)

2share!2flame...
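The "triple it, then move up one time unit" heuristic described above can be sketched in a few lines. The unit ladder and the function name are assumptions made for illustration; in particular, the reply leaves open whether months become quarters or years, and this sketch picks quarters.

```python
from typing import Tuple

# Assumed ladder of time units, smallest to largest.
UNITS = ["hours", "days", "weeks", "months", "quarters", "years"]


def pad_estimate(amount: float, unit: str) -> Tuple[float, str]:
    """Triple an optimistic guess and promote it to the next larger unit.

    The largest unit has nowhere to go, so it is only tripled.
    """
    index = UNITS.index(unit)  # raises ValueError for an unknown unit
    next_unit = UNITS[min(index + 1, len(UNITS) - 1)]
    return amount * 3, next_unit


# An optimistic "2 days" for the group becomes "6 weeks".
print(pad_estimate(2, "days"))
```

This is only a rule of thumb turned into code, not one of the estimation techniques from the book.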

