Wednesday, December 23, 2009

Why vanity metrics are dangerous

In a previous post, I defined two kinds of metrics: vanity metrics and actionable metrics. In that post, I took it for granted that vanity metrics are bad for you, and focused on techniques for creating and learning from actionable metrics. In this post, I'd like to talk about the perils of vanity metrics.

My personal favorite vanity metric is "hits." It has all the worst features of a vanity metric. It violates the "metrics are people, too" rule: it counts a technical process, not a number of human beings. It's a gross number, not per-customer: one hit each from a million people is a very different thing than a million hits from just one person. Most normal people don't understand it: what counts as a hit, anyway (an image, a page, a JS file...)? And it has no built-in measure of causality: if we get a million hits this month, what caused them? How could we generate more hits? Are those hits all of equal value? Since each of these questions requires diving into a different sub-metric, why not just use those metrics instead? That's the essence of actionable metrics.
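To make the "metrics are people, too" point concrete, here's a minimal sketch in Python. The log format and the names in it are made up for illustration; the point is just how different the same traffic looks when you count requests versus when you count people.

```python
# Illustrative sketch (hypothetical log format, not from any real system):
# "hits" count requests -- images, JS files, pages -- while a per-customer
# metric counts human beings.
from collections import Counter

log = [
    {"user_id": "alice", "path": "/pricing"},
    {"user_id": "alice", "path": "/logo.png"},
    {"user_id": "alice", "path": "/app.js"},
    {"user_id": "bob",   "path": "/pricing"},
]

hits = len(log)                                      # every request counts as a "hit"
unique_visitors = len({e["user_id"] for e in log})   # actual people
hits_per_visitor = Counter(e["user_id"] for e in log)

print(hits)              # 4 hits...
print(unique_visitors)   # ...from only 2 human beings
print(hits_per_visitor)  # Counter({'alice': 3, 'bob': 1})
```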

But actionable metrics are more work. So it's reasonable to ask: what's wrong with vanity metrics, at least as a proxy for customer behavior? If hits are bigger this month than last month, that's progress. Why do we need to ask hard questions about the metric, if it's at least directionally correct?

When we rely on vanity metrics, a funny thing happens. When the numbers go up, I've personally witnessed everyone in the company naturally attributing that rise to whatever they were working on at the time. That's not too bad, except for this correlate: when the numbers go down, we invariably blame someone else. Over time, this allows each person in the company to live in their own private reality. As these realities diverge, it becomes increasingly difficult for teams to reach consensus on what to do next. If those teams are functionally organized, the problem is amplified. If all the engineers work on the same thing at the same time, and all the marketers do the same, and QA, and ops, all the way down the line, then each department develops its own team-based private reality. Now picture product prioritization meetings in such a company. Each team can't believe those idiots down the hall want to try yet another foo project when it's so evident that foo projects always fail.

Have you ever built one of those charts that shows a metric over time, annotated with "key events" that explain what happened to the numbers at key inflection points? If you never have, you can create your own using Google Finance. Go ahead and try it, then come back. You've just experienced vanity metrics hell. Everyone knows those charts are totally unpersuasive. At best, they can only show correlation, not causation. Sure, I can build a stock chart, like this one, that shows that eBay's stock price went into a four-year decline immediately after "eBay Inc Acquires Dutch Company Marktplaats.nl." But do you really believe that's what caused eBay's problems? Of course not. At worst, these kinds of vanity metrics can easily be used for gross distortions. And that potential can cripple companies at just those key moments when they need to understand their data the most.

Let me take an example from a company that was going through a tough "crossing the chasm" moment. They had just experienced two down quarters after many quarters of steady growth. Naturally, they had just raised money, and their new investors were understandably pissed. The company struggled mightily with how to explain this bad news to their board. They were accustomed to measuring their progress primarily by gross revenue compared to their targets. When the numbers started to go down, they started to investigate. It turned out that, during the course of the decline, one customer segment was losing customers while another was gaining customers. It's just that the declining segment's customers were more valuable. In retrospect, I can see the irony of the situation perfectly. This decline was actually the result of the company successfully executing its strategy. The customers on their way out were more valuable in the short-term, but the new customers coming in were where the real growth was going to happen in the long-term. Unfortunately, the magnitude of the shift, and the relative values of the two segments, took the company by surprise.
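To see how this can happen, here's a hypothetical illustration of the arithmetic. The numbers are invented, not the company's actual figures; they just show how a shrinking high-value segment can drag gross revenue down even while the segment the strategy is betting on grows.

```python
# Hypothetical numbers, for illustration only: a declining high-value segment
# can mask growth in the segment the strategy is actually targeting.
segments = {
    # customers per quarter (last, this) and revenue per customer
    "legacy":     {"customers": (1000, 800),  "revenue_per_customer": 500},
    "new_market": {"customers": (2000, 2600), "revenue_per_customer": 100},
}

for quarter in (0, 1):
    total = sum(s["customers"][quarter] * s["revenue_per_customer"]
                for s in segments.values())
    print(f"quarter {quarter}: gross revenue = {total}")

# quarter 0: gross revenue = 700000
# quarter 1: gross revenue = 660000
# Gross revenue is down, even though the "new_market" segment grew by 30%.
```

Measured only by gross revenue against targets, that looks like two down quarters. Broken down per segment, per customer, it looks like the strategy doing exactly what it was supposed to do.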

Guess how well those board meetings went. All of a sudden, now that the company was struggling, new metrics were being used to judge success. The vanity charts appeared, showing the changes the company had made to its strategy and the subsequent changes in customer behavior broken down by segment. All very reasonable, well designed, well argued. In other words, a total disaster. This board had no way to know if they were hearing real insight or just well-crafted excuses. The insight turned out to be correct (it's always clear in hindsight). Too bad several of the executives making that presentation weren't around to be vindicated.

The whole situation could have been avoided if the company had used actionable metrics to set and evaluate goals from the start. The strategy changes could have been rolled out gradually, segment by segment, in controlled trials. The data from those trials could have been used to predict the future effects, and allowed the company to make smarter decisions. Actionable metrics don't guarantee you'll make good decisions. But at least you can have facts in the room at the time.
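Here's a minimal sketch of what such a controlled trial might look like. The data and field names are hypothetical; the idea is simply to roll the change out to part of each segment, hold the rest back, and compare the per-customer metric within each segment.

```python
# A minimal sketch (hypothetical data) of evaluating a strategy change with a
# controlled trial per segment: compare customers who got the new strategy
# against a holdout group in the same segment.
customers = [
    # (segment, got the new strategy?, revenue this period)
    ("legacy",     True,  420), ("legacy",     False, 510),
    ("legacy",     True,  400), ("legacy",     False, 490),
    ("new_market", True,  140), ("new_market", False,  90),
    ("new_market", True,  130), ("new_market", False, 100),
]

def avg_revenue(segment, treated):
    values = [rev for seg, t, rev in customers if seg == segment and t == treated]
    return sum(values) / len(values)

for segment in ("legacy", "new_market"):
    treated = avg_revenue(segment, True)
    control = avg_revenue(segment, False)
    print(f"{segment}: treated={treated:.0f} control={control:.0f} "
          f"lift={treated - control:+.0f}")

# legacy:     treated=410 control=500 lift=-90  <- the change hurts the old segment
# new_market: treated=135 control=95  lift=+40  <- but helps the new one
```

With numbers like these in hand before the full rollout, the magnitude of the shift wouldn't have been a surprise; it would have been a prediction.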

And that brings me to my metrics heuristic. Consider a team's last few critical decisions. Not the ones with carefully choreographed discussion and a formal agenda. Those are the easy meetings. I'm talking about the ad hoc crisis decisions, the periodic product prioritization meetings, and the failure post-mortems. How much actionable data was in the room at the time? With data, teams have an opportunity to improve their decision making over time, systematically training their intuition to conform to reality. Without it, they're just rolling the dice.

And that's why vanity metrics are dangerous.