Notes from Waltzing With Bears: Managing Risk on Software Projects (by Tom DeMarco and Timothy Lister)
Required reading for all serious students of software engineering.
If a project has no risk, don’t do it. No risk, no reward.
Risk defined: a possible future event that will lead to an undesirable outcome; the undesirable outcome itself. Better definition of risk: a weighted pattern of possible outcomes and their associated consequences.
Transition indicator: a harbinger that the risk is likely to materialize.
“Project managers often tell us that their clients would never do any projects if they understood the downside. Such managers see themselves as doing a positive service to their clients by shielding them from the ugliness that lies ahead. Concealing the potential for delay and failure, as they see it, is a kindness that helps clients marshal sufficient gumption to give the go-ahead. Then, the project can very gently introduce them to bad news, a little bit at a time, as it happens.”
Risk Management Decriminalizes Risk. “Can-do thinking” is the norm in corporate America; when you put a structure of risk management in place, you authorize people to think negatively, at least part of the time.
Risk Management Protects Against Invisible Transfers of Responsibility. When a client negotiates away a contingency fee that was meant to cover certain risks, responsibility for those risks has likely migrated from the contractor to the client.
Risk Management Requires Organizational Buy-In. Telling the truth where optimism/lying is the norm puts you at a huge disadvantage. Use your risk management knowledge in secret, unless your organization explicitly provides for this; otherwise you lose out to the hungry peer who says “Give me the project and I will deliver on time, guaranteed”.
Bad risk management: only dealing with problems for which you have solutions. To vaccinate the team against this, start the first go-round of what would normally be risk identification by naming all the catastrophic outcomes you can imagine. Then work backwards and describe the scenarios that might lead to each one.
Risk management: where your project planning is very much focused on what to do if you don’t catch breaks. Projects that start off as personal challenges seldom have their risks managed sensibly. Luck should never have to be built into the plan. Offer reasonable stretch goals, but make sure that real expectations make room for the breaks that don’t happen.
The pathology of setting a deadline to the earliest articulable date essentially guarantees that the schedule will be missed.
For the software industry as a whole, window size of delivery is in the range of 150 to 200% of the allocated time. That means that, in general, you can expect projects to take up to two times as long as you think they should, even when you do thorough, bottom-up estimates. This is why I personally multiply all my estimates by 4.
When a project strays from schedule, it’s seldom because the work planned just took longer than anyone had thought; a much more common explanation is that the project got bogged down doing work that wasn’t planned at all.
Totally mechanical beginning to the business of risk management: run a few postmortems of projects good and bad and look for ways in which they deviated from their initial expectations. Trace each deviation back to its cause and call that cause a risk. Give it a number and carry on. Yesterday’s problem is today’s risk.
In my personal experience (having worked mostly in the industry since 2003), the common risks (the ones that keep appearing AGAIN AND AGAIN) are:
- Key personnel turnover
  - You have someone great, but they aren’t being treated or compensated properly. Enjoy your amazing deal while it lasts, but find a way to treat people fairly and still profit. Improve the quality of codebase README documentation to reduce the onboarding time of new hires. Every new hire should make README improvements that fully resolve all areas of confusion.
- Building systems that haven’t been designed for potential scale
  - You haven’t measured or modeled what it will take to scale delivery of services and how those costs will scale.
- Changing product requirements
  - Use a spec and have 1-day development cycles, with team standups at the start and at the end of the day. Start of day: what I did yesterday, what I plan to do today, where I’m blocked. End of day: what I did today, what I plan to do tomorrow, where I’m blocked.
- Growing too slowly; having no predictable channel for acquiring customers
  - Building an acquisition channel can be done in parallel with building the product.
- Users find the product unintuitive
  - Easily mitigated through blind usability studies and then actually responding to the feedback.
- Confusing and undocumented codebase functionality
  - All commits to version control should reference a JIRA/Trello ticket.
There are 5 core risks common to all software projects: irrational deadline, requirements inflation (“scope creep”), spec ambiguity, employee turnover, and poor productivity.
What to do about a risk? Avoid it, contain it, mitigate it, or evade it. Avoiding a risk means forgoing the reward that comes with it. You contain a risk when you set aside time and money to pay for it should it materialize. You mitigate a risk when you take steps before its materialization to reduce eventual containment costs. These are the steps required in advance so that the containment strategy you’ve chosen will be implementable at transition time. Evading a risk is just like crossing your fingers and getting lucky: risk management is not the same as worrying about your project.
The client has every right to nominate certain risks for the contractor to manage, and vice versa. If you are the client, your safest posture is to assume that only those risks specifically allocated to the contractor are his, and that all the rest are yours. Incentives or penalties in the contract allocate risk.
The contractor’s risks are those that endanger the successful completion of the contract or diminish the value of completion to the contractor. Everything else is judged by the contractor to be somebody else’s risk, and thus a candidate for exclusion from his risk management. That means that you, as the client, have to manage these risks or no one will.
A common class of litigation arises out of projects in which the client is surprised to find that certain important risks never made it onto the contractor’s radar. Usually, fault lies with the contract that failed to assign those risks. As a general rule, there are no contracts that successfully transfer all responsibility to a single party. If you are either client or contractor, expect to have to do some risk management.
If you calculate exposure for all your risks (exposure being the probability of materialization multiplied by the cost of materialization) and set aside a risk reserve equal to the total exposure, that risk reserve will, on average, be sufficient to pay for the risks that do materialize. Your best guess about likely materialization may come from industry data, previous lists, or just a flat-out guess… Don’t excuse yourself from this essential act just because any answer you come up with will never be demonstrably correct. Risks need to be budgeted for in time as well as in money.
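As a rough sketch of the reserve arithmetic above: the `Risk` class, its field names, and every figure below are hypothetical, chosen only to illustrate the calculation, and exposure is taken to be probability times impact.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    risk_id: int
    name: str
    probability: float     # estimated chance of materialization, 0.0 to 1.0
    cost_impact: float     # extra cost if the risk materializes
    weeks_impact: float    # extra weeks if the risk materializes

    def cost_exposure(self) -> float:
        return self.probability * self.cost_impact

    def schedule_exposure(self) -> float:
        return self.probability * self.weeks_impact


def risk_reserve(risks):
    """Return (cost reserve, schedule reserve) as the sum of all exposures."""
    cost = sum(r.cost_exposure() for r in risks)
    weeks = sum(r.schedule_exposure() for r in risks)
    return cost, weeks


# Hypothetical census entries, for illustration only.
census = [
    Risk(1, "Key personnel turnover", 0.25, 80_000, 8),
    Risk(2, "Requirements inflation", 0.50, 40_000, 6),
    Risk(3, "Specification breakdown", 0.10, 200_000, 20),
]

cost_reserve, schedule_reserve = risk_reserve(census)
print(f"Cost reserve: {cost_reserve:,.0f}")
print(f"Schedule reserve: {schedule_reserve:.1f} weeks")
```

The reserve is only sufficient on average: a single large risk materializing can still exceed it, which is why the time budget is tracked alongside the money budget.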
Showstopper risks: these are risks that, should they materialize, will fully kill a project. The rule here is that a risk owned above you in the hierarchy is an assumption to you. The risk still belongs on your risk list, but it should be explicitly noted as a project assumption. You would do well to make a little ritual of passing this risk upward. When you present your risk management plan, formally delegate the management of some risks upward to someone above you in the hierarchy.
For each managed risk, you need to choose one or more early indications of materialization. For example:
|Risk|Transition indicator|
|---|---|
|Startup won’t acquire enough users|Company misses one of its early growth goals|
|Key personnel turnover|Person is uncommunicative during one-on-one meetings|
Steps for risk management:
- Use a risk-discovery process to compile a census of risks facing your project
- Make sure all of the core risks of software projects are represented in your census
  - Inherent schedule flaw
    - Managers who come up with or agree to seriously flawed schedule commitments are performing poorly. The key point is that when a project overruns its schedule, it is in spite of, not due to, developer performance. Schedules should be based on bottom-up estimates of work rather than arbitrary commitments.
  - Requirements inflation
    - Well-managed projects change at less than 1% per month (US Department of Defense standard)
  - Employee turnover
  - Specification breakdown (ambiguity in specification)
    - 10-15% of software projects are canceled without delivering anything. Each project has a cancellation risk that is closed once all parties sign off on the boundary data going into and out of the product, and on definitions down to the data element level of all dataflows arriving at or departing from the software to be constructed. Data inflow and outflow descriptions are less prone to ambiguity than function descriptions. Force yourself to get agreement on data inflow and outflow before 15% of the way through the project. If you can’t attain consensus by that point, the best option is project cancellation.
  - Poor productivity
- Do all of the following homework on a per-risk basis:
  - Give the risk a name and id.
  - Brainstorm to find a transition indicator – the earliest practical indication of materialization – for the risk.
  - Estimate the cost and schedule impact of risk materialization.
  - Estimate the probability of risk materialization.
  - Calculate the schedule and budget exposure for the risk.
  - Determine in advance what contingency actions the project will need to take if and when transition occurs.
  - Determine what mitigation actions need to be taken in advance of the transition to make the selected contingency actions feasible.
  - Add mitigation actions to the overall project plan.
- Designate showstoppers as project assumptions. Perform the ritual of passing each of these risks upward.
- Make a first pass at schedule estimation by assuming that no risk will materialize.
- Use min/max bottom-up estimates for each of your functionality points to construct a risk diagram that shows the earliest and latest possible delivery for the project.
- Express all commitments using risk diagrams, explicitly showing the uncertainty associated with each projected date and budget.
- Monitor all risks for materialization or expiration, and execute contingency plans whenever materializations occur.
- Keep the risk-discovery process going throughout the project, to cope with late-apparent risks.
- Force a complete design partitioning prior to any implementation. Use this as input to the process of creating an incremental delivery plan.
- Assess value to the same precision as cost.
- Break the requirements contained in the spec down to their elemental level. Number them in a rank-order by priority. Use net value to the user and technical risk as the two criteria for prioritization.
- Create a release plan in which the product is broken into versions (enough to schedule a new version every week or so). Assign all the elemental requirements to their versions, with the higher-priority items coming in earlier. Calculate the expected value for each version and record it in the plan. Treat the incremental delivery plan as a major project deliverable.
- Create an overall final product-acceptance test, divided into releases; one per version.
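One way to sketch the risk diagram step from the list above is a small simulation over min/max estimates. Sampling each task uniformly between its minimum and maximum is an assumption made here for simplicity, not the book's method, and the task figures and function names are hypothetical.

```python
import random


def simulate_delivery(tasks, runs=10_000, seed=42):
    """Sample the total duration `runs` times; each task is (min_weeks, max_weeks)."""
    rng = random.Random(seed)
    totals = [sum(rng.uniform(lo, hi) for lo, hi in tasks) for _ in range(runs)]
    totals.sort()
    return totals


def percentile(sorted_totals, fraction):
    """Return the duration below which roughly `fraction` of the runs fall."""
    index = min(int(fraction * len(sorted_totals)), len(sorted_totals) - 1)
    return sorted_totals[index]


# Hypothetical min/max estimates in weeks for four pieces of functionality.
tasks = [(2, 4), (3, 8), (1, 2), (4, 10)]
totals = simulate_delivery(tasks)

earliest = sum(lo for lo, _ in tasks)  # every task hits its minimum
latest = sum(hi for _, hi in tasks)    # every task hits its maximum
print(f"Possible range: {earliest} to {latest} weeks")
for pct in (10, 50, 90):
    weeks = percentile(totals, pct / 100)
    print(f"{pct}% of simulated runs finish within {weeks:.1f} weeks")
```

Quoting a commitment as "90% likely by week N" rather than as a single date is the point of the exercise: the uncertainty becomes part of the commitment.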
Keep your risk census public if the politics allow for it.
The hidden meaning of “I don’t know”: an essential part of project management is coming up with the answers to key questions such as, When will you be done? Will your user accept and use the product? Our point is that you need to recognize these I-don’t-know questions because they are always indicators of risk. Force yourself each time to ask a subsidiary question: What do I know (or what could I know) about what I don’t know?
Unwritten rules of corporate culture:
- Don’t be a negative thinker.
- Don’t raise a problem unless you have a solution for it.
- Don’t say something is a problem unless you can prove it is.
- Don’t be the spoiler.
- Don’t articulate a problem unless you want its immediate solution to become your responsibility.
Introduce a ritual that makes it okay to share fears about a project.
- Brainstorm disasters
- Describe scenarios that could lead to disaster
- Run root cause analysis
WinWin management: the project makes an up-front commitment to seek out all stakeholders and solicit from each one the so-called win conditions that would make the project a success from his or her point of view. The requirement is defined as the set of win conditions. Nothing can be considered a requirement if no one can be found to identify it as one of his or her win conditions. Ask participants, “Can you think of an obvious win condition for this project that is in conflict with somebody’s win condition?” Each identified conflict is a potential risk.
Incremental delivery is a way to reduce risk, but doesn’t make sense if you’re only shipping a total of two or three versions. A proactive approach to incremental delivery involves prioritizing value delivered to the stakeholder and confirmation of risk hypotheses. The risk-aware manager will want to force the portions involving serious technical risk into the early versions.
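A minimal sketch of that kind of prioritization, assuming each requirement carries a value score and a technical-risk score and that the two are simply added to rank it. The scoring scale, the additive rule, and all names and figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    name: str
    net_value: int       # hypothetical 1-10 score: value to the user
    technical_risk: int  # hypothetical 1-10 score: how risky it is to build


def plan_versions(requirements, per_version):
    """Rank by combined score (an assumed rule) and slice into versions."""
    ranked = sorted(
        requirements,
        key=lambda r: r.net_value + r.technical_risk,
        reverse=True,
    )
    return [ranked[i:i + per_version] for i in range(0, len(ranked), per_version)]


# Hypothetical requirements, for illustration only.
requirements = [
    Requirement("Export to CSV", 3, 1),
    Requirement("Payment integration", 8, 9),
    Requirement("Login", 9, 2),
    Requirement("Real-time sync", 6, 8),
    Requirement("Dark mode", 2, 1),
]

versions = plan_versions(requirements, per_version=2)
for number, version in enumerate(versions, start=1):
    names = ", ".join(r.name for r in version)
    value = sum(r.net_value for r in version)
    print(f"Version {number}: {names} (value delivered: {value})")
```

With these scores the two riskiest items land in the first version, so the project learns early whether its biggest technical bets pay off.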
Projects with a critical deadline require an early start:
An IT manager and a normal person are both working in Chicago on a Wednesday afternoon when they learn that they have to be in San Francisco for a noon meeting on Friday and that it’s imperative to be on time. The normal person (let’s call her Diane) takes a Thursday evening flight and checks herself into that pleasant little hotel just down the block from the San Francisco office. She has a leisurely dinner at Hunam and wanders over to Union Street to take in a film. The next morning, she has a relaxed breakfast and works on her laptop until eleven. She checks out at 11:30 and strolls into the office ten minutes early.
Meanwhile, the IT manager, Jack, has booked himself on the 8:40, Friday morning. He catches a cab midtown at 7:05 and runs into a traffic jam on the Eisenhower. He complains angrily to the cabdriver all the way to O’Hare. The stupid driver just can’t be made to understand that it is essential that Jack make this flight. When he checks in at United, he tells the check-in clerk rather forcefully that the flight must take off and land on time, no excuses. He tells her that he will be “very, very disappointed” with any lateness. When a gate hold is announced, Jack jumps up and objects loudly. When a revised departure time is announced, he digs deep into his bag of managerial tools and delivers the ultimate pronouncement: “If you people don’t get me into San Francisco in time for my noon meeting, HEADS WILL ROLL!”
How to decide what to build: costs and benefits need to be specified with equal precision. When a benefit cannot be stated more precisely than “We gotta have it,” then the cost specification should be “It’s gonna be expensive.”
“The savings figures also are classified by whether they are reductions or avoided costs. The difference is important. Reductions are decrements from current approved budget levels. You (the requesting manager) already have the money; you are agreeing to give it up if you get the money for the new system. Avoided cost savings are of this form: ‘If I don’t have this system, I will have to hire another in . But if I do get this system, I can avoid this cost.’ This is every system requester’s favorite kind of benefit: all promise, no pain. The catch is that there is no reason to believe you’d ever have been funded to hire those additional workers. You’re trading off operating funds you may never get in the future for capital funds today. Very tempting, but most request-evaluators see this coming miles away. The correct test for avoided-cost benefits is whether the putative avoidable cost is otherwise unavoidable, in other words, that the future budget request would inevitably be approved. This is a tough test, but real avoided-cost benefits can and do pass it.” – Steve McMenamin, Atlantic Systems Guild
If increasing the size of a product exposes you to more-than-proportional increases in cost, then decreasing product size offers the possibility of more-than-proportional savings. Eliminating those portions of the system where the value/cost ratio is low is probably the easiest and best way to relax constraints on time and budget.
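To make the more-than-proportional effect concrete, here is a small sketch under a stated assumption: cost is modeled as size raised to an exponent of 1.2. The exponent, the feature list, and the use of size as a stand-in for per-feature cost are all hypothetical, not figures from the book.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    value: float  # hypothetical benefit score
    size: float   # hypothetical size, used here as a proxy for cost


def project_cost(total_size, exponent=1.2):
    """Assumed cost model: cost grows faster than size when exponent > 1."""
    return total_size ** exponent


def prune(features, min_ratio):
    """Keep features whose value/size ratio is at least min_ratio."""
    return [f for f in features if f.value / f.size >= min_ratio]


# Hypothetical feature list, for illustration only.
features = [
    Feature("Core workflow", 100, 40),
    Feature("Reporting", 40, 30),
    Feature("Custom themes", 5, 20),
    Feature("Legacy import", 4, 10),
]
kept = prune(features, min_ratio=1.0)

full_size = sum(f.size for f in features)
kept_size = sum(f.size for f in kept)
size_cut = 1 - kept_size / full_size
cost_cut = 1 - project_cost(kept_size) / project_cost(full_size)
value_cut = 1 - sum(f.value for f in kept) / sum(f.value for f in features)

print(f"Size cut: {size_cut:.0%}, cost cut: {cost_cut:.0%}, value lost: {value_cut:.0%}")
```

Under these assumed numbers the cost falls by a larger share than the size does, while only a small share of the value is lost, which is the trade the paragraph above describes.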