Monday, September 13, 2010

The Superbowl ad test

I am a firm believer in the danger of vanity metrics, numbers that give the illusion of progress but often mask the true relationship between cause and effect. Since I first started writing about vanity metrics, I’ve met more and more entrepreneurs who are struggling with a simple question: how can I tell a vanity metric when I see it?

From the outside, vanity metrics are a lot easier to see than from the inside, precisely because of the psychology behind them. Everyone wants to believe that the work they are doing is making a difference. So it’s easy to read positive causes into noisy data, whether they are really there or not. (This is called “the illusion of cause” and is discussed at length in the extremely readable book The Invisible Gorilla.) Even worse, entrepreneurs are faced with a constant barrage of vanity metrics from competitors and other companies engaged in PR. Vanity metrics are generally bigger. And everyone knows bigger is better, right?

News publications print vanity metrics because they want to give their readers information about the companies they cover. Companies want the coverage, but they don’t actually want to reveal anything useful about their operations. The solution? Vanity metrics. By only releasing vanity metrics, companies co-opt the press into helping them mislead others. Is that really news? I’ll leave that to professional journalists to sort out. For the general public, it’s probably OK to treat company updates as entertainment. But for entrepreneurs, investors, analysts and competitors, it’s quite dangerous.

Here’s my quick heuristic for telling if a given number, graph, or chart is a vanity metric: could it have been caused by the company secretly running a Superbowl ad and not disclosing it?

If yes, it’s very likely to be a vanity metric. Let’s take a look at an example, one of my favorites, the "billions of messages" claim.

Here's Mashable's coverage of Facebook chat reaching "a billion messages a day." Or take a recent TechCrunch article about a startup I won't name: “X billion messages sent since June 2009.” These articles treat this as a huge number, and it is. It probably represents tremendous success for the company in question. It sits side by side with a number of other vanity metrics. But notice what’s not listed: messages sent per person, churn rates of active users, or activation rate of new users. Even worse, we have no indication of how these numbers are moving over time. Is the company growing because of an amazing viral loop paired with a strong engagement loop? It’s possible, but the article doesn’t say. Most of the article is about the features – new and old – of the product. The unstated implication is that these features are driving this tremendous growth. But is that true? Isn’t it equally possible that this company is spending more money on advertising or marketing than its competitors? Or that there is some other external factor at work?

I have no insight into these questions, and I don’t mean to pick on these companies in particular. My point is that this article does not contain the kind of information we’d need to draw reasonable inferences, which is by design. That’s what you pay PR firms for: to get an article written that is entirely factual and yet still provides positive spin for your company. (For context, "some 740 billion text messages were sent in the first half of 2009." The PR firm helpfully left out that context.)

So, could these numbers have been generated by a Superbowl ad? Of course. We have no idea when the billions of messages were sent. They could all have been sent very recently. That’s the magic of vanity metrics – you never know what’s really going on. The trouble comes when companies and investors come to rely on these numbers to make consequential business decisions. How should a company like the one above prioritize their next set of features? Hopefully, they have internal reports that show the true correlation between their features and customer results. Are employees paying more attention to those reports than to the positive press coverage? I sure hope so.

Notice that cohort-based and conversion-based metrics do not suffer from this problem. When we look at the same conversion percentage for cohort after cohort, we are effectively getting a new, independent report card for our efforts each period. Each cohort is mostly unaffected by the behavior of earlier cohorts. And it is much more insulated from external effects, like an advertising or PR blitz, than your typical vanity metric.
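
To make the contrast concrete, here is a minimal sketch in Python. Every number in it is made up for illustration: the third month simulates an undisclosed advertising blitz that brings in ten times the usual signups while the product stays exactly the same. The names `months` and `summarize` are hypothetical, not taken from any real reporting system.

```python
# Hypothetical monthly numbers: (new signups, signups who sent a message).
# The third month simulates an undisclosed ad blitz: ten times the signups,
# with no change to the product itself.
months = {
    "2010-01": (1000, 200),
    "2010-02": (1100, 220),
    "2010-03": (11000, 2200),
}

def summarize(months):
    """Return (month, cumulative senders, cohort conversion) for each month."""
    rows = []
    total = 0
    for month, (signups, senders) in months.items():
        total += senders                    # gross, ever-growing number
        conversion = senders / signups      # per-cohort report card
        rows.append((month, total, conversion))
    return rows

for month, total, conversion in summarize(months):
    print(f"{month}  cumulative senders: {total:>5}  cohort conversion: {conversion:.0%}")
```

In this toy data the cumulative count jumps from 420 to 2,620 in the blitz month, while the conversion rate for each cohort stays at 20%. The gross number passes the Superbowl ad test with flying colors; the cohort number barely notices the ad.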

It is not difficult to translate a gross metric like total messages sent into cohort terms. Since I’m picking on the TechCrunch example from above, we’re talking about more than a year’s worth of data. Let’s divide it into monthly cohorts. For each month, messages are sent by two kinds of people: new customers and returning customers. In order to make each cohort as meaningful as possible, let’s define them as follows:

New customer: someone who registered for the service in a given month.
Returning customer: someone who used the service in the immediately preceding month.

I choose the “preceding month” definition in order to give us a sense for individual people’s behavior. A huge advertising blitz might cause a temporary winback effect by bringing in lots of old customers, but this is generally the kind of effect we want to ignore (unless we’re measuring the short-term effectiveness of the advertising).

Now, let’s plot a single number for each cohort: the percentage of customers in that cohort who sent at least one message in that time period. That makes our numbers denominated in people, not messages, which is much easier to understand. (Remember, metrics are people, too.) If we wanted to get fancy, we could also plot the average number of messages sent per person in each cohort.
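
As a rough sketch of that calculation, the Python below computes both numbers for the new and returning cohorts of a given month. The data layout (`registered`, `used`, `messages`), the people in it, and the helper names are all assumptions made up for illustration; a real product would pull the same facts from its own database.

```python
# Hypothetical raw data; every name and number here is illustrative.
registered = {"ann": "2010-01", "bob": "2010-01", "cat": "2010-02", "dan": "2010-02"}
# Months in which each person used the service at all.
used = {
    "ann": {"2010-01", "2010-02"},
    "bob": {"2010-01"},
    "cat": {"2010-02"},
    "dan": {"2010-02"},
}
# Messages sent, keyed by (person, month).
messages = {("ann", "2010-01"): 5, ("ann", "2010-02"): 3, ("cat", "2010-02"): 1}

def previous_month(month):
    """Return the 'YYYY-MM' string for the month before the given one."""
    year, mon = map(int, month.split("-"))
    return f"{year - 1}-12" if mon == 1 else f"{year}-{mon - 1:02d}"

def cohort_report(month):
    """Percentage of senders and average messages per person, per cohort."""
    prev = previous_month(month)
    cohorts = {
        # New customer: registered in this month.
        "new": [p for p, m in registered.items() if m == month],
        # Returning customer: used the service in the preceding month.
        "returning": [p for p, active in used.items() if prev in active],
    }
    report = {}
    for name, people in cohorts.items():
        if not people:
            report[name] = None  # empty cohort, nothing to measure
            continue
        counts = [messages.get((p, month), 0) for p in people]
        senders = sum(1 for c in counts if c > 0)
        report[name] = {
            "people": len(people),
            "pct_sent": senders / len(people),
            "avg_messages": sum(counts) / len(people),
        }
    return report

for month in ("2010-01", "2010-02"):
    print(month, cohort_report(month))
```

Running `cohort_report` once per month gives one point per cohort per month, which is exactly the series to plot. Note that both numbers are divided by the number of people in the cohort, so a surge of signups changes the denominator along with the numerator.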

If these numbers are flat month-to-month, then we can draw some strong conclusions about the product features we’re working on: they are basically having no effect on customer behavior. Hopefully, that’s not the case. Hopefully, the numbers are steadily improving month after month.

The data needed to generate this simple graph already exists: it’s the same basic data you’d need to get an accurate count of the total number of messages sent, just presented in a different form. For understanding what’s really going on with a product, this alternate form is far superior. Is it any wonder companies don’t want the press to have it?

It’s my hope that, in time, our industry will start to reject vanity metrics as a serious part of the discourse about customers. But this will take a long time. Investors and journalists have the most leverage to start making this change. Entrepreneurs have a part to play, too. Playing with vanity metrics is a dangerous game. Even if you intend to “only” give that sugar rush to publicists or investors, it’s all too easy to be taken in yourself. Your employees probably read the same press you are trying to influence. Your investors may be taken in today, but they will use those same vanity metrics to hold you accountable tomorrow. It’s much easier to rely on actionable metrics in the first place.