The latest enthusiasm for hypothesis testing in the Agile community is a good thing… until it turns bad. If we're not careful about how we do hypothesis testing, that's exactly what could happen.
Hypothesis testing means applying the scientific method, which involves doing something really, really hard: putting our cherished beliefs to the test, not to prove them, but to disprove them. Any fool can come up with “evidence” to support a hypothesis. Why do I think that matching socks keep disappearing after I do the laundry? Demons steal them. How do I know? If I’m really committed to this explanation, I’ll find some way to support this novel viewpoint.
Without this core commitment to testing to disprove, we inevitably fall victim to confirmation bias. We see this human weakness in operation all the time, such as when apocalyptic sects reaffirm their beliefs after the world doesn’t end on a predicted date, or people still believe that they have the power to influence streetlights. Less kooky examples exist. In fact, you probably succumbed to confirmation bias several times today, without being aware of it. My neighbor is obsessed with his lawn. My boss never listens. New Yorkers are loud and opinionated. We like our beliefs, so we surround them with “evidence” that protects them. See? Donald Trump is loud and opinionated, and he’s from New York!
Most of the time, our mistaken beliefs have trivial consequences, or none at all. So what if my neighbor really isn’t hyper-focused on landscape maintenance? However, when we’re talking about work, these biases do matter. If we’re wrong about what customers really want, we may waste millions of dollars in developing our software, and perhaps risk the fortunes of the company in the process. And if I test my hypotheses about customers the wrong way, looking for confirmation instead of challenging my assumptions, the risk increases dramatically.
Having worked in software companies, I’ve seen my share of executives waving “proof” that their ideas were good in the face of anyone impolite enough to point out the mounting evidence that, in fact, the ideas were pretty bad. I have here a list of strategic customers who have asked for this feature. I have the reports from industry analysts that prove there’s a multi-billion dollar market for this product. When polled, a majority of sample customers gave a thumbs up to this idea. Nearly every bad product, from New Coke to the Zune, had market research that “supported” it.
By now, many of you reading this blog post may be saying, “Yeah, I’ve heard that already.” But let me challenge you on your complacency. When was the last time you really put your idea to the test, looking for ways of disproving it that were clear and unambiguous?
Let’s take an example from IT. Suppose that you have been implementing an ERP system, and you keep hearing that the new software is “too difficult to use.” Obviously, you want some clarification about what that statement means, so you spend some time polishing the hypothesis until you arrive at the following:
When it takes contracts professionals more than 1 hour to assemble a new contract electronically, they will revert to the paper alternative.
That assertion is not only clear, but eminently testable. You can observe people in Contracts to see what happens when the ERP system gives them grief. If the frustration lasts for more than 1 hour, do they switch to paper forms?
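To make the falsification logic concrete, here is a minimal sketch of how you might evaluate such observations. The record structure and the 60-minute threshold mirror the hypothesis above; all field names and data are hypothetical, not from any real study. The key move is that the code hunts for *disconfirming* cases, not supporting ones.

```python
from dataclasses import dataclass

# Hypothetical observation records; field names are illustrative.
@dataclass
class Observation:
    minutes_spent: float      # time spent assembling the contract electronically
    reverted_to_paper: bool   # did the person abandon the ERP system for paper?

def counterexamples(observations):
    """Return the observations that DISCONFIRM the hypothesis:
    people who struggled for more than 60 minutes yet stayed with the software."""
    return [o for o in observations if o.minutes_spent > 60 and not o.reverted_to_paper]

# The hypothesis survives only if the disconfirming set is (near) empty.
data = [
    Observation(45, False),  # under an hour, stayed electronic: irrelevant to the test
    Observation(75, True),   # over an hour, reverted: consistent with the hypothesis
    Observation(90, False),  # over an hour, stayed electronic: a counterexample
]
print(len(counterexamples(data)))  # prints 1
```

Note what the sketch does not do: it never counts the confirming cases. Piling up confirmations is exactly the trap described above; one clear counterexample tells you more than a dozen supporting anecdotes.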
While that’s definitely a test, it’s actually not a very good one. The results may be suggestive, but not definitive. Here are a couple of ways to cast doubt on the experiment:
- Observers might interfere with the results. No one likes looking foolish. If you’re watching people create new contracts, they might work with the software longer than if they were not being observed, just to show that they’re not stupid or incompetent.
- Other parties might interfere with the results. A manager in Contracts might make it clear to her staff that, during the test, it's important to dramatize the problems with the new software. As a result, employees might give up sooner than they otherwise would.
Both of these scenarios show how easy it is to challenge the results. If I think the Contracts people are whiners, I can raise either possibility to deflect any criticism of the software’s usability.
To get better results, you need to construct a critical test, designed to provide unambiguous results that are immune to this kind of challenge. One of the most important tests between Newtonian and Einsteinian physics, for example, was the orbit of Mercury. Newtonian mechanics could not fully account for the precession of Mercury's perihelion; general relativity predicted the unexplained remainder, about 43 arcseconds per century, almost exactly. For general relativity to be the better model, Mercury had to appear here rather than there at critical moments in its orbit. If Mercury appeared there rather than here, no amount of hand-waving could save relativity.
Those are the kinds of tests that you need to construct, whenever possible. They're not that hard to do. To go back to our ERP usability problem, one way to test the hypothesis is to remove all paper forms from the Contracts department and see what happens. If the system is genuinely "too difficult to use," then the department would screech to a halt. While that might seem like a radical test, the value of the information it generates might easily outweigh the short-term cost.
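Even a radical test like this needs a pre-agreed, measurable outcome, or its results can be argued away afterward. One way to make "screech to a halt" unambiguous is to compare contract throughput during the paper-free trial against a baseline. The sketch below is purely illustrative; the daily counts and the threshold for "halt" are invented assumptions you would agree on with stakeholders before running the test.

```python
# Hypothetical daily counts of completed contracts; all numbers are illustrative.
baseline = [12, 14, 13, 15, 12]          # normal weeks, paper forms available
paper_free_trial = [11, 13, 12, 14, 12]  # trial week, paper forms removed

def mean(xs):
    return sum(xs) / len(xs)

# Relative drop in throughput once the paper escape hatch is gone.
drop = 1 - mean(paper_free_trial) / mean(baseline)

# If the system were genuinely "too difficult to use," removing paper should
# crater throughput. A small drop disconfirms the hypothesis.
print(f"throughput drop: {drop:.0%}")
```

The point is to commit to the success and failure criteria up front, so that neither side can reinterpret the outcome after the fact.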
We’re just not used to running critical tests, or using other techniques to disprove our cherished beliefs. I had the honor of being a reviewer for the Learning track for Agile 2016, and I was dearly hoping to see at least one presentation proposal talk about “hypothesis testing” in these terms. A few came close, but none made a clear connection to the scientific method. The Agile community dearly needs this kind of guidance, or else we’ll just devise ways to confirm our own biases at an Agile pace.
Need help with hypothesis testing? Here’s what we can do for you:
- Getting Better Customer Insights Faster (workshop)
- Combining Agile Practice And Analytics (workshop)
- Serious Games As Innovation Tools (workshop)