WHAT METRICS SAY ABOUT XP
by Michael Mah, Senior Consultant, Cutter Consortium
Frothy eloquence neither convinces nor satisfies me. I am from Missouri. You have got to show me.
-- Missouri Congressman Willard Duncan Vandiver, 1899
In a recent Cutter Consortium Agile Project Management Executive Update, Senior Consultant E.M. Bennatan quoted the good congressman Vandiver and considered the possibility that, while he was the inspiration for the Missouri epithet as the "Show-me" state, he might have also unwittingly been the father of metrics, more than 100 years ago.
By that same token, I had often wondered about the "bottom line" of XP projects, outside of the frothy eloquence put forth by Kent Beck, Jim Highsmith, and other authors of the Agile Manifesto when it was first published. While the new movement positioned itself as a break from the "heavy methodologies" (in a CNN Crossfire-style left-to-right debate), it seemed to me that the authors were in essence claiming "better software, faster, at lower cost" compared to those who espoused "process." (Ironically, "better, faster, cheaper" was exactly what the CMM was striving for, starting with its inception in the late 1980s.)
Would XP projects deliver software faster, with fewer defects? How might they measure up?
Recently, we had the opportunity to answer this question for senior executives of a medical devices company, whose management team would not rest with just the frothy eloquence. They demanded actual before-and-after productivity numbers. The company -- specializing in medical instrumentation and diagnostics -- considered their software strategic. That is to say, the functionality provided by the software inside their products is a differentiator for them in the marketplace. That's why their customers buy from them. Management wanted hard proof of fewer defects and faster cycle time, if it existed. They might as well have been from Missouri, since they wanted to know -- based on facts -- whether their XP implementation was yielding the productivity and quality improvements they sought.
To answer those questions, we visited the company and collected actual project history for projects built using both traditional methods and XP methods. Through a series of one-and-a-half-hour-long interviews for each project, we gathered start and end dates for the "before" projects, sketched out the shape of the staffing profile to determine the amount of work effort expended, and tallied the volume of functionality produced by the teams in terms of modules, programs, objects, and new/changed code. We also collected defect statistics recorded as the number of bugs encountered during testing. Since reducing defects and improving quality were critical to this company (after all, they were in the medical devices industry), they wanted a defect baseline. Defects in the "after" scenario would be plotted and compared against this "before" baseline. (For details on the approach, see "Secrets of a Benchmarking Consultant" in the August 2001 issue of IT Metrics Strategies, as well as the Cutter research report "IT Organization, Benchmark Thyself.")
Next, we gathered the same type of information (time, staffing/effort, software size, and defects) on the company's XP releases [1]. For the sizing information, we also collected the number of user stories and story cards, in addition to the modules, objects, and new/changed code. All in all, we gathered 10 projects -- five in the "before" scenario and five in the "after" scenario. The people were the same, and the applications were virtually of the same type. The most meaningful change in this controlled experiment (for the most part) was the shift to Industrial XP practices [2]. Industrial XP is a set of XP methods adapted for large-scale applications. Each data sample was then plotted against industry trendlines. From the 7,000+ projects in our industry database, we specifically extracted cost, schedule, and defect benchmarks for medical/scientific applications and then conducted an apples-to-apples comparison.
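As a rough illustration, the four measures gathered for each project (time, effort, size, and defects) could be held in a record like the one sketched below. The class name, field names, and the defects-per-KLOC normalization are assumptions made for this example and are not taken from the study; the sample values are placeholders, not project data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ProjectRecord:
    """One project's core measures: time, effort, size, defects (hypothetical layout)."""
    name: str
    method: str                  # "traditional" or "xp" (labels assumed for this sketch)
    start: date
    end: date
    effort_person_months: float  # area under the staffing profile
    size_new_changed_loc: int    # new/changed lines of code
    defects_in_test: int         # bugs encountered during testing

    def duration_months(self) -> float:
        # Approximate calendar months using an average month length of 30.44 days.
        return (self.end - self.start).days / 30.44

    def defects_per_kloc(self) -> float:
        # Normalize defects by size so projects of different sizes can be compared.
        return self.defects_in_test / (self.size_new_changed_loc / 1000)


# Placeholder values for illustration only; these are not figures from the study.
example = ProjectRecord(
    name="sample",
    method="xp",
    start=date(2000, 1, 1),
    end=date(2000, 7, 1),
    effort_person_months=40.0,
    size_new_changed_loc=20_000,
    defects_in_test=50,
)
print(round(example.duration_months(), 1), example.defects_per_kloc())
```

Keeping the "before" and "after" projects in the same record shape is what makes a like-for-like comparison against a benchmark straightforward.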
The findings were fascinating. In a nutshell, the cycle time for the XP projects was about 25%-30% faster than for the traditional projects. But the real story also seemed to be in the defects, which fell by a factor of about four.
In the "before" scenario, defects discovered during testing were running at about two times the industry average. For the most part, this was an almost predictable effect of compressing schedules by using large teams. Most of the traditional projects were handed aggressive deadlines. To try to make these dates without cutting functionality, projects ramped up staff. The natural consequence of this is higher defects, which we often see in industry data. It's a "law of software project physics" that seems to follow from trying to compress too many features into a tight time frame. In knowledge and design work, more bugs are a natural consequence of the higher communication complexity of large teams.
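One common way to picture the communication complexity of large teams is to count pairwise communication paths, n(n-1)/2 for a team of n people. This model is an assumption brought in for illustration; it is not something measured in the study. A minimal sketch:

```python
def communication_paths(team_size: int) -> int:
    """Number of distinct person-to-person paths in a team of the given size."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


# Team sizes chosen only to show how quickly the count grows.
for n in (5, 10, 20):
    print(n, communication_paths(n))
```

Under this model, doubling a team from 10 to 20 people roughly quadruples the number of paths (45 to 190), which is one plausible reason that adding staff to meet a deadline tends to raise defect counts.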
But in the "after" scenario, the defects were running about half the industry average. That's a four-fold reduction. Interestingly, the XP projects, while still using more people than the norm, showed fewer defects, not more. Were we seeing a shift in the software physics laws?
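The arithmetic behind the four-fold figure can be checked directly from the two ratios quoted above. The function name below is hypothetical, and the inputs are the approximate ratios to the industry average stated in the text.

```python
def reduction_factor(before_ratio: float, after_ratio: float) -> float:
    """How many times lower the 'after' defect ratio is than the 'before' ratio."""
    if after_ratio <= 0:
        raise ValueError("after_ratio must be positive")
    return before_ratio / after_ratio


# Ratios to the industry average, as reported in the text.
BEFORE = 2.0  # about two times the industry average
AFTER = 0.5   # about half the industry average

print(reduction_factor(BEFORE, AFTER))  # 4.0
```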
Perhaps. Truth be told, it will take more data to conclude whether this is inherent in XP methods or not. I will say that it is entirely possible, because of two distinct attributes of the XP process: colocated client teams and pair programming. Lower defects might result from these practices in two ways: (1) getting what the client wants right the first time (or close to it), and (2) the instantaneous peer review of the code that pairing enables. Both may be contributing to fewer mistakes making their way into the code in the first place.
Time will tell if this is something definitive across XP projects at other companies. We intend to gather further XP project metrics whenever the opportunity arises. If your company would like to participate in this research, please let us know. Stay tuned.
-- Michael Mah, Senior Consultant, Cutter Consortium
NOTES
[1] For the record, the Software Engineering Institute refers to this as the "minimum data set."
[2] For more information on Industrial XP, see the profile on Cutter Consortium Senior Consultant Joshua Kerievsky at http://www.cutter.com/meet-our-experts/kerievskyj.html.
