Tuesday, December 30, 2008

Assessing fit with the Wisdom of Crowds

When I wrote earlier about how to conduct a good technical interview, I had only a few things to say about how to assess whether the candidate fits in with the team, including this:
This responsibility falls squarely to the hiring manager. You need to have a point of view about how to put together a coherent team, and how a potential candidate fits into that plan. Does the candidate have enough of a common language with the existing team (and with you) that you'll be able to learn from each other? Do they have a background that provides some novel approaches? Does their personality bring something new?
A few commenters have taken issue with the idea that it's solely the hiring manager's responsibility to assess fit, arguing correctly that the whole team should participate in the evaluation and decision. I completely agree. Still, I do think fit is a quality that requires special treatment, because it is the hardest attribute to evaluate.

Unlike the other attributes we look for in an interview candidate (like drive, brains or empathy) fit is not an individual quality. It's caught up in group dynamics. Worse, it has a self-referential quality to it. The very team that is making the assessment is being asked to assess itself at the same time as the candidate. How else can they tell whether the new team that will be created by the addition of this person will be superior to the team as it is presently constituted?

I have found James Surowiecki's book The Wisdom of Crowds particularly helpful in thinking through these issues. This book is Tipping Point-esque, full of anecdotes and interesting social science citations. Its central thesis is that, under the right circumstances, groups of people can be smarter than even their smartest single member. I find his observations compelling, and I feel good recommending the book to you, even though I know there are many among us who find "argument-by-anecdote" irritating. You don't have to buy the argument, but the facts and citations are worth the price of admission.

Let me briefly summarize the part of the book I find most helpful (leaving out a few parts that aren't germane to today's topic). Not all crowds are wise. In order to get optimal results from a group-based effort, you need three things: diversity, independence, and an objective method for aggregating results. For example, we conduct elections with a secret ballot, which ensures independence (since nobody can directly influence your vote); we let everyone vote, which ensures diversity (since even extreme opinions can be heard); and we use a numerical count of the results (which, recent experiences notwithstanding, is supposed to mean that everyone's vote counts equally according to an objective formula). Similar mechanisms are at work in the stock market, Google PageRank, and guessing an ox's weight at the state fair.

Remove any of these essential ingredients, and you can find examples of groups gone bad: the Bay of Pigs invasion fiasco, market bubbles, and pretty much every one of Dilbert's team meetings.

Anyway, back to fit. Surowiecki has helped me in two ways:
  1. Assessing fit in the context of what makes a good team. In order to improve the performance of a team, it's not enough just to keep adding smart people. You actually need to find a diverse set of people whose strengths and weaknesses complement each other. The problem most teams have with fit, in my experience, is they confuse it with liking. Many great engineers I've worked with (especially the type I talked about in the hacker's lament) seem to think that if they get along well with someone in an interview, they'll be a good addition to the team. This kind of homogeneity can lead to groupthink.

  2. Putting together a process for helping a team assess fit. Another dangerous group dynamic in fit questions is that people are very sensitive to the opinions of their peers when talking about the group itself. In order to make an informed decision, the team needs a process of gathering and combining their opinions without having anyone's voice stifled.
For example, let's say you are on a team that prides itself on its all-hands-on-deck-at-all-hours style of working. There are a lot of great teams that work this way, preferring a series of sprints and lulls to a steady pace. Now let's say you're interviewing someone who doesn't seem to like to work that way. They work steady hours, but not for lack of drive; their references say they have exceptional output. Now picture the group meeting to talk about this prospective candidate. First up, the alpha-hacker on the team says something like "we should pass, this candidate is lazy."

Now it's your turn. Even if you are convinced that this candidate would make a great addition to the team, how much courage does it take to say so? You might convince the group, but you might not. And if you don't, will it raise suspicion about how dedicated you are? Will you be implicitly calling into question the validity of the team's values? Will they then be watching you for signs of laziness in the future? What about that vacation you've been planning to take... and so on. In my experience, there are plenty of situations where dissenting voices simply opt out. There's a clear danger to speaking up, but a pretty murky benefit.

If you find this line of reasoning confusing (nobody on my team feels that way) or paranoid (let's just all be rational), you may be surprised what the other people in the room are thinking. Take a look at some of the social science research in this area, like the Asch conformity experiments. In those, a group of people are asked to answer a simple question about their observations, one at a time. The first few people are actually actors, and they all give the same patently false answer. The experiment measures the likelihood that the last person in the sequence, the real experimental subject, will conform to what the previous people have said, or dissent. Even though the answer is obvious, and the other people in the room are all strangers, a surprising number of people choose to conform. I have found the pressure is much higher in situations where the answer is unclear, and the other group members are coworkers.

Combating these tendencies is the real job of the hiring manager. If that's you (or you are on a team whose hiring manager abdicates that responsibility), here are three suggestions that have worked for me to take advantage of the wisdom of crowds in hiring. Each of these is based on changes I've made to my hiring process in response to a five whys analysis of previous hiring mistakes.
  • Before you meet the candidate, spend time thinking about the strengths and weaknesses of your team. Try to brainstorm some archetypes that would probably interview badly but would actually succeed in filling out your team. Having been through this exercise in advance can help you listen carefully to what team members say about the candidate, and see whether their objections are simply fear of what is different, or whether they reflect a more serious concern.

    One experience I had was with a candidate who was absolutely convinced that our development methodology wouldn't work. He spent an inordinate amount of time grilling us on exactly how we work and why, asking smart but tough questions. He made us nervous, but we took a risk and hired him anyway. After we hired him, he spent weeks driving the team crazy with his critical (but, we had to admit, accurate) eye. Then, all of a sudden, a remarkable thing happened. Another new hire started to complain about the way we worked. Our former critic promptly set him straight, shooting down his complaints with the same ruthless efficiency he had previously devoted to analyzing our work. He had been converted, and from that point on acted as "defender of the faith" far better than I ever could.
  • Maintain strict independence for each interviewer. Our rule was always that no interviewer was allowed to talk to any other interviewer once their session was concluded. The first time we'd exchange any words at all was during the end-of-day assessment meeting. This prevents an earlier interview from biasing a later one. For example, I've seen situations where even a positive comment, like "wow, that candidate is smart!", caused disaster. The next interviewer, armed with an expectation of brilliance, chooses harder questions or becomes disappointed by an "only above average" performance.

  • Aggregate results carefully. When sitting in a room talking about the candidate, I have had success with two precautions. First, we would always share our experiences in reverse-seniority order. That meant that the most junior person would be forced to speak first, without knowing anything about what his or her manager thought. As the hiring manager, I would speak last, and I'd do my best to avoid giving any indication in advance of what I thought.

    We'd also have a structured discussion in two parts: in the first part, each person talks only objectively about what happened in the interview, without giving opinions or passing judgment. Others are allowed to ask "clarifying questions" only - no leading questions or comments. Only in the second round does each person give their opinion about whether to hire the person or not. This helps get the facts on the table in an objective way. If our team had ever struggled to do it, I would have insisted on written comments shared anonymously. In other words: do whatever you have to do to get an objective discussion going.
If you've been the victim of groupthink in hiring, or have suggestions for ways to avoid it, please share. All of us are team members, family members, coworkers and leaders. Tell us what you've learned in those different contexts. Maybe it'll help someone else avoid the same mistakes.



Tuesday, December 16, 2008

Engagement loops: beyond viral

There's a great and growing corpus of writing about viral loops, the step-by-step optimizations you can use to encourage maximum growth of online products by having customers invite each other to join. Today, I was comparing notes with Ed Baker (one of the gurus of viral growth). We were trying to broaden the conversation beyond just viral customer acquisition. Many viral products have flamed out over the years, able to capture large numbers of users, but proving transient in their value because they failed to engage customers for the long-term. Our goal is to understand the metrics, mechanics, and levers of engagement.

Levers of engagement
Let's start with the levers of engagement. What can you do to your product and marketing message to increase engagement?
  1. Synthetic notifications. The bluntest instrument is to simply reach out and contact your customers on a regular basis. This is such an obvious tactic that a surprising number of companies overlook it. For example, IMVU runs frequent promotional campaigns that offer discounts, special events, and other goodies to its customers. From a strictly "promotional marketing" point of view, they probably run those campaigns more than is optimal (there's always fatigue that diminishes the ROI on promotions the more you use them). But there is a secondary benefit from these activities: to remind customers that IMVU exists, and encourage them to come back to the site. The true ROI of a synthetic notification has to balance the direct promotional return, customer fatigue, and the engagement effects of the campaign itself.

    When you live with your own product every day, it's easy to lose sight of just how busy your customers are, and just how many things they are juggling in their own lives. A lot of engagement problems are caused by the customer completely forgetting about the provider of the service. Direct notifications can help ameliorate that problem.

  2. Organic notifications. Facebook, LinkedIn, and other successful social networks have elevated this technique to a high art. They do everything in their power to encourage customers to take actions that have a side-effect of causing other customers to re-engage. For example, from an engagement standpoint, it's a pretty good thing to automatically notify a person's friends whenever they upload pictures. But it's exponentially more engaging to have each person tag their friends in each picture, because the notification is so much more interesting: "you've been tagged in a photo, click to find out which one!" Similarly, the mechanics of sending users notifications when new friends of theirs join the site is a great organic re-engagement tactic. From the point of view of the existing customer, it goes beyond reminding them that the site exists; it also provides social validation of their choice to become a member in the first place.

    As with synthetic notifications, organic notifications are subject to fatigue if they are not used judiciously. On Facebook, "poking" seems to have fairly high fatigue, whereas "photos" has low (close to zero?) fatigue. Ed adds this account: "When I first joined Facebook, I used to poke my friends and get poked back for the first few weeks, but now I rarely, if ever, poke people. Photos, on the other hand, is probably the primary reason I go to Facebook every day. Because they are constantly new and changing, I doubt I will ever get tired of looking at my friends' photos, and I will probably always get especially excited to see a new photo that I have been tagged in."

  3. Positioning (the battle for your mind). The ultimate form of engagement is when the company doesn't have to do anything explicit to make it happen. For example, World of Warcraft never needs to send you an email reminding you to log in. And they don't need to prompt you to tell your guild-mates about the new epic loot you just won. The underlying dynamics of the product, your guild, and the fun you anticipate take care of those impulses. This is true, to a greater or lesser extent, for every product. After you've acquired a customer, why would they bother to come back to your service? What do they get out of it? What is going on in their head when that happens?

    I wrote about this challenge for iPhone developers, in an essay on retention competition: the battle over what icon the user will click when they go to the home screen. At that point, there's no opportunity for marketing or sales; the battle is already won or lost in the person's mind. It's analogous to walking down the aisle in a supermarket. Just because you're already a Tide customer doesn't necessarily mean you'll always buy Tide again. However, if you've come to believe that Tide is simply the only detergent in the world that can solve your cleaning problems, you're pretty unlikely to even notice the other competitors sitting on the shelf. Great iPhone apps work the same way.

    Marketing has a whole discipline devoted to creating those effects in the minds of customers; it's called positioning. The best introduction to the topic is Positioning (I highly recommend it, it's a very entertaining classic). But you don't have to be a marketing expert to use this tactic; you just need to think clearly about the key use cases for your product. Who is using it? What were they doing right before? And what causes them to choose one product over another? For example, a common use case for teenagers is: "I just got home from school, I'm bored, and I want to kill some time." If your product and its messaging are all about passing time while having fun, you might be able to get to the point where that is an automatic association, and they stop seriously considering other alternatives. That's exactly what the world's best video games do.

Seeing the engagement loop
We're just starting to weave these techniques into a broad-based theory of engagement, that would complement the work that has been done to date on viral marketing and viral loops. Notice that all of these techniques are attempting to affect one of a handful of specific behaviors that have to happen for a product to have high engagement. Do these sound at all familiar?
  1. A customer decides to return to your product, as a result of either natural interest, or a notification (organic or synthetic).
  2. They decide to take some action, perhaps influenced by the way in which they came back.
  3. This action may have side effects, such as sending out notifications or changing content on a website.
  4. These side effects affect other customers, and some side effects are more effective than others.
  5. Some of those affected customers decide to return to your product...
This is essentially a version of the viral loop. Let's look at a specific example, and start to think through what the metrics might look like if we attempted to measure it:
  1. Customer gets a synthetic message saying: "upload some photos!" Some percentage of customers click through.
  2. Some percentage of those actually upload.
  3. Those customers get prompted to tag their friends in their photos. Some percentage of them do (A), and these result in a certain number of emails sent (B).
  4. Each friend that's tagged gets an email that lets them know they've been tagged. Some percentage of them click through. (C)
  5. Of those, some percentage are themselves convinced to upload photos. (D)

Calculating the "engagement ratio"
If we combine the quantities A-D using the same kinds of formulas we use for viral loop optimization and the result is greater than one, we should see an ever-increasing number of engagement notifications being sent. This will lead to some reactivation of dormant customers as well as some fatigue, as existing customers receive many notifications. Our theory is that the key to long-term retention is creating an engagement loop where the reactivation rate exceeds the rate of fatigue. This will yield a true "engagement ratio" that is akin to the viral ratio.
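
To make the arithmetic concrete, here's a minimal sketch in Python of how the funnel quantities A-D might be combined into an engagement ratio. The numbers and variable names are invented for illustration; they aren't from any real product:

```python
# Hypothetical funnel numbers for the photo-tagging loop described above.
# None of these figures are real data; they only illustrate the calculation.
tag_rate           = 0.40   # (A) fraction of uploaders who tag friends in their photos
notifications_each = 3.0    # (B) average "you've been tagged" emails sent per uploader
friend_click_rate  = 0.30   # (C) fraction of tagged friends who click through
friend_upload_rate = 0.25   # (D) fraction of those friends who upload photos themselves

def engagement_ratio():
    """New photo-uploading customers generated, on average, per uploading customer."""
    notifications_per_uploader = tag_rate * notifications_each
    new_uploaders_per_notification = friend_click_rate * friend_upload_rate
    return notifications_per_uploader * new_uploaders_per_notification

if __name__ == "__main__":
    r = engagement_ratio()
    print(f"engagement ratio = {r:.3f}")
    print("self-sustaining loop" if r > 1 else "loop decays without synthetic pushes")
```

With these made-up numbers the ratio comes out well below one, which is the typical starting point: the loop decays unless you keep feeding it synthetic notifications.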

This makes intuitive sense, since the key to minimizing fatigue is to keep things new, exciting, and relevant. For example, user-generated content that involves your friends, especially if it includes you ("Joe tagged you in a photo. Click here to find out which one!"), is usually going to be newer, more exciting, and more relevant than synthetic notifications ("Did you know you can now upload multiple photos at a time with our new photo uploader?"), or even than more generic organic notifications ("You've been poked by Joe."). High "engagement growth" with low fatigue is how you get the stickiness of a product to near 100%. You can try to churn out, but your friends keep pulling you back in. That's an engagement loop at work.

Seeing the whole
Engagement loops are a powerful concept all by themselves, and they can help you to make improvements to your product or service in order to optimize the drivers of growth for your business. But I think the value in this framework is that it can help make overall business decisions that require thinking about the whole rather than just one of the parts.

For example, let's say you have a viral ratio of 1.4. Your site is growing like wildfire, but your engagement isn't too good. You decide to do some research into why customers don't stay involved. When asked to describe your product, customers say something like "Product X is a place to connect with my friends online." Turns out, when optimizing your viral loop, this was the winning overall marketing message. It's stamped on your emails, landing pages, UI elements - everywhere. Removing a single instance of that message would make your viral ratio go down, and you know that for a fact, because you've split-tested every single possible variation.

As you talk to customers, you notice the following dilemma. Customers have a lot of options for connecting with their friends online. And, compared to market leaders like Facebook and Myspace, you discover that your product isn't really that much better. Consequently, you are losing the positioning battle for your customers when they get home from school and ask themselves, "how can I connect with my friends right now?" Worse, your product isn't really about connecting with friends; that's just the messaging that worked best for the viral loop, where customers aren't that familiar with your product anyway.

To win the positioning battle, you could try and make your product better than the competition, or find a different positioning that allows you to be the best at something else. Let's assume for the sake of argument that your competitors' offerings are "good enough" and that you can't figure out how to beat them at their own game. So you decide to try to reposition around a different value proposition, one that more closely matches what your product is best at. You could try and drive home that positioning with an expensive PR campaign, Super Bowl ads, and whatnot. But you don't have to - you have a perfectly good viral loop that is slowly but surely exposing the entire world to your positioning messages.

Here's what this long example is all about. When you go to change your messaging, imagine that your viral ratio drops from 1.4 to 1.2. Disaster, right? Not necessarily. Since your viral ratio is still above one, it's still getting your message out, albeit a little slower. But if your new positioning message improves your engagement loop by more than the cost to your viral loop, you have a net win on your hands. Without measuring your engagement loop, can your business actually make tradeoff decisions like this one?
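
To see why, here's a back-of-the-envelope sketch of that tradeoff. Every number below is hypothetical, and the model is deliberately crude: each cycle, the active base invites new users at the viral ratio, while the engagement loop retains some fraction of existing users.

```python
# Crude, hypothetical model of growth over several cycles.
def active_users(viral_ratio, retention_rate, seed_users=1000, cycles=6):
    active = seed_users
    for _ in range(cycles):
        invited = active * viral_ratio       # growth from the viral loop
        retained = active * retention_rate   # customers kept by the engagement loop
        active = retained + invited
    return int(active)

# Old message: higher viral ratio, weak engagement.
# New message: lower viral ratio, but far better retention.
print("old positioning:", active_users(viral_ratio=1.4, retention_rate=0.4))
print("new positioning:", active_users(viral_ratio=1.2, retention_rate=0.7))
```

In this toy model the message with the lower viral ratio ends up with more active users after six cycles, which is exactly the kind of whole-business win you can't see if the viral loop is the only thing you measure.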

Connecting engagement and viral loops
The two loops are intimately connected, in a figure-eight pattern. Customers exit the viral loop and become part of the engagement loop. As your engagement improves, it becomes easier and easier to get customers to reenter the viral loop process and bring even more friends in. And as in all dynamic systems, there's no way to optimize a sub-part without sub-optimizing the whole. If you're focused on viral loops without measuring the effect of your changes on other parts of your business (of which engagement is just one), you're at risk of missing the truly big opportunities.

Hopefully, this theory will prompt some interesting responses. We'd love to hear your feedback and hear your stories. Have you struggled with engagement and retention? What's worked (and not worked) for you? Share your stories, and we'll incorporate them as we continue to flesh out this theory. Thanks for being part of the conversation.

Monday, December 8, 2008

Continuous integration step-by-step

Let's start with the basics: Martin Fowler's original article lays out the mechanics of how to set up a CI server and the essential rules to follow while doing it. In this post I want to talk about the nuts and bolts of how to integrate continuous integration into your team, and how to use it to create two important feedback loops.

First, a word about why continuous integration is so important. Integration risk is the term I use to describe the costs of having code sitting on some, but not all, developers' machines. It happens whenever you're writing code on your own machine, or you have a team working on a branch. It also happens whenever you have code that is checked-in, but not yet deployed anywhere. The reason it's a risk is that, until you integrate, you don't know if the code is going to work. Maybe two different developers made changes to the same underlying subsystem, but in incompatible ways. Maybe operations has changed the OS configuration in production in a way that is incompatible with some developer's change.

In many traditional software organizations, branches can be extremely long-lived, and integrations can take weeks or months. Here's how Fowler describes it:
I vividly remember one of my first sightings of a large software project. I was taking a summer internship at a large English electronics company. My manager, part of the QA group, gave me a tour of a site and we entered a huge depressing warehouse stacked full with cubes. I was told that this project had been in development for a couple of years and was currently integrating, and had been integrating for several months. My guide told me that nobody really knew how long it would take to finish integrating.
For those of you with some background in lean manufacturing, you may notice that integration risk sounds a lot like work-in-progress inventory. I think they are the same thing. Whenever you have code that is un-deployed or un-integrated, it's helpful to think of it as a huge stack of not-yet-installed parts in a widget factory. The more code, the bigger the pile. Continuous integration is a technique for reducing those piles of code.

Step 1: get a continuous integration server.
If you've never practiced CI before, let me describe briefly what it looks like. Whenever you check in code to your source control repository, an automated server notices and kicks off a complete "build and test" cycle. It runs all the automated tests you've written, and keeps track of the results. Generally, if all tests pass, it's happy (a green build), and if any tests fail, it will notify you by email. Most CI servers also maintain a waterfall display that shows a timeline of every past build. (To see what this looks like, take a look at the CI server BuildBot's own waterfall.)
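
To make that concrete, here's a stripped-down sketch of the cycle a CI server automates. This is not BuildBot's actual configuration or API; the repository URL, working directory, and build command are placeholders:

```python
# Stripped-down sketch of a CI loop: watch for new check-ins, then build and test.
# The repository URL, working directory, and test command are placeholders.
import shutil
import subprocess
import time

REPO = "https://example.com/project.git"     # placeholder repository
WORKDIR = "/tmp/ci-workdir"

def latest_revision():
    out = subprocess.run(["git", "ls-remote", REPO, "HEAD"],
                         capture_output=True, text=True, check=True)
    return out.stdout.split()[0]

def build_and_test():
    shutil.rmtree(WORKDIR, ignore_errors=True)              # always start from a clean checkout
    subprocess.run(["git", "clone", REPO, WORKDIR], check=True)
    result = subprocess.run(["make", "test"], cwd=WORKDIR)   # run the full automated test suite
    return result.returncode == 0                            # True means a green build

def main():
    last_built = None
    while True:
        rev = latest_revision()
        if rev != last_built:                                 # someone checked in new code
            status = "GREEN" if build_and_test() else "RED (notify the team by email)"
            print(f"build of {rev[:8]}: {status}")
            last_built = rev
        time.sleep(60)                                        # poll once a minute

if __name__ == "__main__":
    main()
```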

Continuous integration works to reduce integration risk by encouraging all developers to check in early and often. Ideally, they'll do it every day or even multiple times per day. That's the first key feedback loop of continuous integration: each developer gets rapid feedback about the quality of their code. As they introduce more bugs, they have slower integrations, which signals to them (and others) that they need help. As they get better, they can go faster. In order for that to work, the CI process has to be seamless, fast, and reliable. As with many lean startup practices, it's getting started that's the hard part.

Step 2: start with just one test.
You may already have some unit or acceptance tests that get run occasionally. Don't use those, at least not right away. The reason is that if your tests are only being run by some people or in some situations, they probably are not very reliable. Starting with crappy tests will undermine the team's confidence in CI right from the start. Instead, I recommend you set up a CI server like BuildBot, and then have it run just a single test. Pick something extremely simple that you are convinced could never fail (unless there's a real problem). As you gain confidence, you can start to add in additional tests, and eventually make it part of your team-wide TDD practice.
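
For instance, that very first test can be as trivial as this sketch (Python's unittest here, but use whatever language and test runner your project already has; the package name is a placeholder):

```python
# smoke_test.py -- the one test that seeds the CI pipeline. It should only fail
# if something is genuinely broken, such as the application not even importing.
import importlib
import unittest

class SmokeTest(unittest.TestCase):
    def test_application_imports(self):
        # "myapp" is a placeholder for your project's top-level package.
        self.assertIsNotNone(importlib.import_module("myapp"))

if __name__ == "__main__":
    unittest.main()
```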

Step 3: integrate with your source control system.
Most of the times I've tried to introduce TDD, I've run into this problem: some people write and run tests religiously, while others tend to ignore them. That means that when a test fails, it's one of the testing evangelists who inevitably winds up investigating and fixing it - even if the problem was caused by a testing skeptic. That's counter-productive: the whole point of CI is to give each developer rapid feedback about the quality of their own work.

So, to solve that problem, add a commit hook to your source control system, with this simple rule: nobody can check in code while the build is red. This forces everyone to learn to pay attention to the waterfall display, and makes a failed test automatically a big deal for the whole team. At first, it can be frustrating, especially if there are any intermittent or unreliable tests in the system. But you already started with just one test, right?

The astute among you may have noticed that, since you can't check in when the build is red, you can't actually fix a failing test. There are two ways to modify the commit hook to solve that problem. The first, which we adopted at IMVU, was to allow any developer to add a structured phrase to their check-in comment that would override the commit hook (we used the very creative "fixing buildbot"). Because commits are mailed out to the whole team, anyone who was using this for nefarious purposes would be embarrassed. The alternative is to insist that the build be fixed on the CI server itself. In that case, you'd allow only the CI account to check in during a red build.

Either way, attaching consequences to the status of the build makes it easier to get everyone on the team to adopt it at once. Naturally, you should not just impose this rule from on high; you have to get the team to buy-in to trying it. Once it's in place, it provides an important natural feedback loop, slowing the team down when there are problems caused by integration risk. This provides the space necessary to get to the root cause of the problem. It becomes literally impossible for someone to ignore the failures and just keep on working as normal.
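
Here's a minimal sketch of what such a commit hook might look like, using the override-phrase approach. The build-status URL, the override phrase, and the way the commit message reaches the hook are all placeholders you'd adapt to your own source control system:

```python
#!/usr/bin/env python
# Pre-commit hook sketch: reject check-ins while the CI build is red, unless the
# commit message contains the agreed-upon override phrase. The URL and phrase
# below are placeholders, not a real setup.
import sys
import urllib.request

BUILD_STATUS_URL = "http://ci.example.com/status"   # placeholder; returns "green" or "red"
OVERRIDE_PHRASE = "fixing buildbot"

def build_is_green():
    with urllib.request.urlopen(BUILD_STATUS_URL, timeout=10) as response:
        return response.read().decode().strip().lower() == "green"

def main():
    commit_message = sys.argv[1] if len(sys.argv) > 1 else ""
    if build_is_green():
        return 0                               # green build: allow the check-in
    if OVERRIDE_PHRASE in commit_message.lower():
        return 0                               # someone is explicitly fixing the build
    sys.stderr.write("Build is red; check-ins are blocked until it is green.\n")
    return 1                                   # non-zero exit rejects the commit

if __name__ == "__main__":
    sys.exit(main())
```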

As you get more comfortable with continuous integration, you can take on more advanced tactics. For example, when tests fail, I encourage you to get into the habit of running a five whys root-cause analysis to take corrective action. And as the team grows, the clear-cut "no check-ins allowed" rule becomes too heavy-handed. At IMVU, we eventually built out a system that preserved the speed feedback, but had finer-grained effects on each person's productivity. Still, my experience working with startups has been that too much time spent talking about advanced topics can lead to inaction. So don't sweat the details - jump in and start experimenting.




Sunday, December 7, 2008

The hacker's lament

One of the thrilling parts of working and writing in Silicon Valley is the incredible variety of people I've had the chance to meet. Sometimes, I meet someone that I feel a visceral connection with, because they are struggling with challenges that I've experienced myself. In a few cases, they are clearly smart people in a bad situation, and I've written about their pain in The product manager's lament and The engineering manager's lament.

Today I want to talk about another archetype: the incredibly high-IQ hacker who's trying to be a leader. (As always, this is a fictionalized account; I'm blending several people I've known into a single composite. And please forgive the fact that I use male pronouns to describe the archetype. There is terrible gender bias in our profession, but that's a subject for another day. Suffice to say, most of the hackers I've known have been men. As a last disclaimer, please consult the definition of the word hacker if you're not familiar with the controversies surrounding that term.)

It's common to find a hacker at the heart of almost any successful technology company. I know him right away - we can talk high-level architecture all the way down to the bits and bytes of his system. When I want to know about some concurrency issues between services in his cluster, he doesn't blink an eye when I suggest we get the source code and take a look. And as soon as I point out an issue, he can instantly work out the consequences in his head, and invent solutions on the fly.

This kind of person is used to being the smartest person in the room. In fact, it's a rare person who can be subjected to recurring evidence of just how stupid the people around them are, and not become incredibly arrogant. Those who have the endurance are the ones that tend to lead teams and join startups, because you just can't be successful in a startup situation without empathy. I would characterize them as intolerant but not arrogant.

When a startup encounters difficult technical problems, this is the guy you want solving them. He's just as comfortable writing code as racking servers, debugging Windows drivers, or devising new interview questions. As the company grows, he's the go-to person for almost everything technical, and so he's very much in demand. He throws off volumes of code, and it works. When scalability issues arise, for example, he's in the colo until 2am doing whatever it takes to fix them.

But life is not easy, either. As the company grows, the number of things he's called on to do is enormous, and the level of interruptions is getting intense. It's almost as if he were a country immune to the economic theory of comparative advantage. Since he's better at everything, he winds up doing everything - even the unimportant stuff. There's constant pressure for him to delegate, of course, but that doesn't necessarily work. If he delegates a task, and it gets messed up, he's the one that will get called in to deal with it. Better just to take care of it himself, and see that it's done right.

When you're the physical backstop putting dozens of fingers in the dam to prevent it from bursting, you might get a little irritated when people try to "help" you. The last thing you need is a manager telling you how to do your job. You're not very receptive to complaints that when you take on a task, it's unpredictable when you'll finish: "you try getting anything done on schedule when you're under constant interruptions!" Worst of all, your teammates are constantly wanting to have meetings. When they see a problem with the team's process, why don't they just fix it? When the architecture needs modifying - why do we need a meeting? Just change it. And we can't hire new engineers any faster, because you can't be interviewing and debugging and fixing all at the same time!

The picture I'm trying to paint is one of a bright individual contributor stretched to the breaking point. I've been there. Trust me, it's not a lot of fun. And I've also been on the receiving end; and that's not much fun either. Yet, quite often these dynamics play out with ever-increasing amplitude, until finally something drastic happens. Unfortunately, more often than not, it's the hacker who gets fired. What a waste.

What's wrong with this picture?

One of the most exhilarating things about a startup is that feeling of intense no-holds-barred execution. Especially in the early days, you're fighting for survival every day. Every day counts, every minute counts. Even if, in a previous life, you were a world expert in some functional specialty, like in-depth market research or scalable systems design, the compressed timeline of a startup makes it irrelevant. You get to figure things out from first principles all the time, experiment wildly, and invest heavily in what works. From the outside, it looks a lot like chaos. To a hacker, it looks a lot like heaven.

But even a tiny amount of success requires growth. Even with the highest standards imaginable, there's no way to hire just genius hackers. You need a diversity of skills and backgrounds. Suddenly, things slow down a little bit. To me, this is the critical moment, when startups either accept that "process = bureaucracy" or reject that thinking to realize that "process = discipline." And it's here that hackers fall down the most. We're just not naturally that good at thinking about systems of people; we're more comfortable with systems of computers.

If you've ever been abused by a bad manager in your career, it's easy to become traumatized. I think this is the origin of the idea among hackers that managers are idiots who just get in the way. The variations on this theme are legion: the pointy-haired boss, the ivory-tower architect, and of course the infinite variety of marketroids. But whenever groups of people assemble for a common purpose, they adopt process and create culture. If nobody is thinking about it, you're rolling the dice on how they turn out. And, at first, it's OK if the person who's doing that thinking is part-time, but eventually you're going to need to specialize. The alpha-hacker simply can't do everything.

Even in the areas that hackers specialize in, this go-it-alone attitude doesn't work. Building a good application architecture is not just coding. It's more like creating a space for other people to work in. A good architect should be judged, not by the beauty of the diagram, but by the quality of the work that the team does using it. The "just fix it" mentality is counter-productive here. Every bug or defect needs to go through the meta-analysis of what it means for the architecture. But that's impossible if you're constantly fire-fighting. You need to make time to do root cause analysis, to correct the systemic mistakes all of us tend to make.

And taking on too many projects at once is a classic sub-optimization. Sure, it seems efficient. But when there is a task half-done, it's actually slowing the team down. That's because nobody else can work on the task, but it's costly to hand it off. Imagine a team working from a forced-rank priority queue. Naturally, the best person should work on the #1 priority task, right? Not necessarily. If that person is subject to a lot of interruptions, then as the people working on the less-important tasks finish, they're forced to keep working down the list. Meanwhile, the #1 task is still not done. It would have been faster for the team as a whole to have someone else work on the task, even if they were much slower. And there's a secondary benefit: as people work on tasks they don't know much about, they learn and become more capable.

The reason this situation reaches a breaking-point is that it's constantly getting worse. As the team grows, the number of things that can go wrong grows with it. If a single person stays the bottleneck, they can't scale fast enough to handle all those interruptions - no matter how smart they are. And the interruptions themselves make looking for solutions increasingly difficult. Each time you look for solutions, you see a conundrum of this form: you can't hire because you're too busy, but you can't delegate because you can't hire.

All is not lost, though. When I get involved in companies that struggle with this problem, here is the kind of advice I think can help:
  • Introduce TDD and continuous integration. This is one of the bedrock practices of any lean startup, and so it's a common piece of advice I give out. However, it's particularly helpful in this situation. Without requiring a lot of meetings, it changes the perspective of the team (and its leadership) from fire-fighting to prevention. Every test is a small investment in preventing a specific class of bugs from recurring; once you've been successful at building this system, it's pretty easy to see the analogy to other kinds of preventative work you could do. It also helps ratchet down the pressure, since so many of the interruptions that plague the typical hacker are actually the same bugs recurring over and over. TDD plus continuous integration works as a natural feedback loop: if the team is working "too fast" to produce quality code reliably, tests fail, which requires the team to slow down and fix them.

  • Use pair programming and collective code ownership. These are two other Extreme Programming practices that are explicitly designed to counteract the problems inherent in this situation. Pair programming is the most radical, but also the most helpful. If your team isn't ready or able to adopt pair-programming across the board, try this technique instead: whenever anyone is becoming a bottleneck (like the proverbial hacker in this post), pass a rule that they are only allowed to pair program until they are not the bottleneck anymore. So each time someone comes to interrupt them, that person will be forced to pair in order to get their problem solved. In the short term, that may seem slower, but the benefits will quickly become obvious. It's another natural feedback loop: as the interruptions increase, so does the knowledge-transfer needed to prevent them.

  • Do five whys. This is a generalization of the previous two suggestions. It requires that we change our perspective, and instead treat every interruption as an opportunity to learn and invest in prevention.

  • Hire a CTO or VP Engineering. A really good technology executive can notice problems like the ones I'm talking about today and address them proactively. The trick is to hire a good one - I wrote a little about this in What does a startup CTO actually do? Sometimes, a great hacker has the potential to grow into the CTO of a company, and in those cases all you need is an outside mentor who can work with them to develop those skills. I've been privileged to have been the recipient of that kind of coaching, and to have done it a few times myself.
At the end of the day, the product development team of a startup (large or small) is a service organization. It exists to serve the needs of customers, and it does this by offering its capabilities to other functions in the company, and partnering with them. That's only possible if those interactions are constructive, which means having the time and space for people of different backgrounds and skills to come together for common purpose. That's the ultimate task for the company's technology leadership.

I strongly believe that all hackers have the innate ability to become great leaders. All that's required is a shift in perspective: at their root, all technology problems are human problems. So, fellow hackers, I'd love to hear from you. Does this sound familiar? Are you ready to try something different?

Getting started with split-testing

One of the startup founders I work with asked me a smart question recently, and I thought I'd share it. Unlike most of the people who've endured my one-line split-testing talk, this team has taken it to heart. They're getting started creating their first A/B tests, and asked "Should we split-test EVERYTHING?" In other words, how do you know what to split-test, and what just to ship as-is? After all, isn't it a form of waste to split-test something like SSL certs, that you know you have to do?

I love questions like this, because there is absolutely no right answer. Split-testing, like almost everything in a lean startup, requires judgment. It's an art, not a science. When I was just starting out with practices like split-testing, I too sought out hard and fast rules. (You can read about some of the pitfalls I ran into with split-testing in "When NOT to listen to your users; when NOT to rely on split-tests"). That said, I think it's important to get your split-testing initiative off to a good start, and that means being selective about what features you tackle with it.

The goal in the early days of split-testing is to produce unequivocal results. It's a form of waste to generate reports that people don't understand, and it impedes learning if there is a lot of disagreement about the facts. As you get better at it, you can start to apply it to pretty much everything.

In the meantime, here are three guidelines to get you started:
  1. Start simple. Teams that have been split-testing for a long time get pretty good at constructing tests for complex features, but this is a learned skill. In the short term, tackling something too complex is more likely to lead to a lot of wasted time arguing about the validity of the test. A good place to start is to try moving UI elements around. My favorite is to rearrange the steps of your registration process for new customers. That almost always has an effect, and is usually pretty easy to change.

  2. Make a firm prediction. Later, you'll want to use split-tests as exploratory probes, trying things you truly don't understand well. Don't start there. It's too easy to engage in after-the-fact rationalization when you don't go in with a strong opinion about what's going to happen. Split-testing is most powerful when it causes you to make your assumptions explicit, and then challenge them. So, before you launch the test, write down your belief about what's going to happen. Try to be specific; it's OK to be wrong. This can work for a change you are sure is going to have no effect, too, like changing the color of a button or some minor wording. Either way, make sure you can be wrong. If you're a founder or top-level executive, have the courage to be wrong in a very public way. You send the signal that learning always trumps opinion, even for you.

  3. Don't give up. If the first test shows that your feature has no effect, avoid these two common extreme reactions: abandoning the feature altogether, or abandoning split-testing forever. Set the expectation ahead of time that you will probably have to iterate the feature a few times before you know if it's any good. Use the data you collect from each test to inform the next iteration. If you don't get any effect after a few tries, then you can safely conclude that you're not on the right track.
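
If you're curious what the basic mechanics look like, here's a minimal sketch of a split test in Python. The experiment name, variants, and example numbers are placeholders; the point is simply deterministic bucketing plus an honest report of the results:

```python
# Minimal split-test sketch: deterministic variant assignment plus a simple report.
# The experiment name, variants, and example numbers are placeholders.
import hashlib

def assign_variant(user_id, experiment="registration_step_order", variants=("A", "B")):
    """Assign a user to a variant deterministically, so they always see the same one."""
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

def report(results):
    """results maps variant -> (conversions, visitors); print each conversion rate."""
    for variant, (conversions, visitors) in sorted(results.items()):
        rate = conversions / visitors if visitors else 0.0
        print(f"variant {variant}: {conversions}/{visitors} converted ({rate:.1%})")

if __name__ == "__main__":
    # Write down your prediction before looking: e.g. "B will beat A by two points."
    report({"A": (120, 1000), "B": (138, 1000)})
```

Hashing the user ID keeps each customer's experience consistent across visits, which makes the results much easier to interpret.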

Most importantly, have fun with split-testing. Each experiment is like a little mystery, and if you can get into a mindset of open-mindedness about the answer, the answers will continually surprise and amaze you. Once you get the hang of it, I promise you'll learn a great deal.

Saturday, December 6, 2008

The four kinds of work, and how to get them done: part three

Those startups that manage to build a product people want have to deal with the consequences of that success. Having early customers means balancing the needs of your existing customers with the desire to find new ones. And still being a startup means continuing to innovate as well as keep the lights on. In short, it's hard, and easy to mess up - as I've certainly done more than my fair share.

In part one of this series, I talked about the four fundamental kinds of work companies do, and in part two I talked in some detail about the conflicts these differences inevitably create. Today, I want to talk about solutions. In short, how do you grow your team when it's time to grow your company?

For starters, there are whole volumes that need to be written about how to actually find and hire the people your startup needs. For example, you can read my previous post about how to do a technical interview. But most startups succeed in hiring, one way or another, and are still left with the problems of organizing the people that rapid growth brings in.

To mitigate these problems, we need a process that recognizes the different kinds of work a company does, and creates teams to get them done. Here are my criteria for a good growth process: it should let the company set global priorities and express them as budgets, it should delegate trade-off decisions to a clearly-identified leader who is tasked with making trade-offs within a single kind of work, and it should allow personnel to circulate "backwards" through the kinds of work by facilitating clear hand-offs of products and features between teams.
  1. Build strong cross-functional teams for each kind of work. Let's start with the most important thing you can do to help product teams succeed: make them cross-functional. That means every team should have a representative from absolutely every department that the team will have to interact with. To begin with, see if you can get designers, programmers, and QA on the same team together. When you've mastered that, consider adding operations, customer service, marketing, product management, business development - the idea is that when the team needs to get approval or support from another department, they already have an "insider" who can make it happen.

    The advantages of cross-functional teams are well documented, and for a thorough treatment I recommend the theory in the second half of Agile Software Development with Scrum. I want to focus here on one particular strength: the ability to operate semi-autonomously. Whatever process you use for deciding what work is going to get done at a macro-level, once the team accepts its mission, it's free to get the job done by whatever means necessary. If the team has true representation from all parts of the company, it won't have the incredibly demoralizing experience of coming up with a solution and then having some outsider veto it. The flip side of that is that you can have strong accountability: all the usual political excuses are removed.

    I have been amazed at the levels of creativity these kinds of teams unlock.

  2. Create iteration cycles that provide clear deadlines and opportunities to revise budgets. Opinions vary about the optimal cycle length, but I do think it's important to pick a standard length. Scrum recommends 30 days; I have worked in cycles ranging from one or two weeks up to about three months. At IMVU, we found 60 days was just about right. It allows each team to do multiple iterations themselves, before having to be accountable to the whole company for their work. You could easily manage that as two 30-day scrums, or four two-week sprints. Either way, create a clear cycle begin and end date, and use the cycle breaks for two things: to hold teams accountable and to reallocate personnel between teams.

    In order to prevent people from bunching up in the later stages of the work pipeline, those leaders need to be focused on automation and continuous improvement. Make sure you keep track of whether that's happening. Between cycles, put some pressure on those later teams to do more with fewer people. If they can keep their team sizes constant as the company grows, you'll be making it possible to have more people working on R&D and Strategy.

  3. Find a balance between long-term ownership and learning new things. As the team grows, it's tempting to want to keep people on the same teams for a long duration. That's more efficient, in a way, because the longer people stay, the better they get to know each other and the part of the product their team is working on. However, I think it's more efficient for the company overall if people are moving between teams and learning new skills. That keeps the work interesting, and prevents specialists from becoming company-wide bottlenecks.

    Some people will naturally move with the work they do, as projects pass between teams. Managers need to keep an eye out for opportunities to move people simply for the sake of helping them take on new challenges. Team leaders should rotate less frequently, because too much churn works against accountability. A good target is to try and circulate about 30% of a given team between cycles.

  4. Formalize and celebrate the hand-off between teams. When a project is done, there are three possible outcomes. Either that team is going to keep iterating it in the next cycle, or they are going to hand it off to another team, or they are going to euthanize it and remove it from the product. If you get serious about these hand-offs, you can prevent a common problem that startups encounter: lots of features that nobody has ownership of. If you celebrate these hand-offs, you reinforce the idea that different kinds of work are equally valuable, and you help people let go when it's time for transitions to happen. For example, the level of attention a project gets when it moves into maintenance mode is going to be less than it had before. Acknowledge that up-front, and celebrate the people who made it possible for that feature to run its course and help the company succeed.

  5. Prefer serializing work to doing it in parallel. When possible, try to have the whole team work on one project at a time, rather than have many going simultaneously. It may seem more efficient to have lots of projects going (each with one or two people on them), but that's not been true in my experience. Lots of little projects erode teamwork and build up work-in-progress. I believe that having the whole team work from a common task list, in priority order, lets everyone feel that "all hands on deck" sense that the founders experienced in the early days of the company. You can still do multiple projects in a single cycle, just try and have as few "in flight" at any given time as possible. You'll also be able to put more of an emphasis on finishing and shipping than if everyone has to rush to finish their projects all at the same time (stepping on each others' toes, naturally).
This process can seem daunting, especially if you're currently running in ad-hoc mode. That's OK; like all lean transformations, you need to undertake this one in small steps. Remember the mantra: incremental investment for incremental gain. Each process change you make should pay for itself in short order, so as to build credibility and momentum for further refinements.

Where should you start? I recommend you try building a cross-functional team and see what happens. At first, choose only a few functions, keep the team small, and give them a modest goal. Try your hardest to give them space to operate without supervision, which means choosing a problem that doesn't have huge risks associated with it (or you'll be too scared to let them out of your sight). Keep the team size small, maybe just two or three people. If the goal is clear, and the team is willing to embrace it, you may be surprised just how fast they can execute.

One last thought. These ideas are easier to give lip service to than to actually implement. So if it doesn't work right away, don't give up. Maybe you think you've created a semi-autonomous team, but the team feels like it's just business as usual. Or you may have too high a level of background interruptions for them to really stay the course. Or they may decide just to slack off instead of accepting the mission. Try to see these as opportunities to learn: at the end of each cycle do a five whys post-mortem. I guarantee you'll learn something interesting.

Saturday, November 29, 2008

The ABCDEF's of conducting a technical interview

I am incredibly proud of the people I have hired over the course of my career. Finding great engineers is hard; figuring out who's good is even harder. The most important step in evaluating a candidate is conducting a good technical interview. If done right, a programming interview serves two purposes simultaneously. On the one hand, it gives you insight into what kind of employee the candidate might be. But it also is your first exercise in impressing them with the values your company holds. This second objective plays no small part in allowing you to hire the best.

Balancing competing objectives is a recurring theme on this blog - it's the central challenge of all management decisions. Hiring decisions are among the most difficult, and the most critical. The technical interview is at the heart of these challenges when building a product development team, and so I thought it deserved an entire post on its own.

In this post I'll follow what seems to be a pattern for me: lay out a theory of what characterizes a good interview, and then talk practically about how to conduct one.

When I train someone to participate in a technical interview, the primary topic is what we're looking for in a good candidate. I have spent so much time trying to explain these attributes, that I even have a gimmicky mnemonic for remembering them. The six key attributes spell ABCDEF:
  • Agility. By far the most important thing you want to hire for in a startup is the ability to handle the unexpected. Most normal people have a fairly narrow comfort zone, where they excel in their trained specialty. Those people also tend to go crazy in a startup. Now, we're not looking for people who thrive on chaos or, worse, cause chaos. We want someone who is a strong lateral thinker, who can apply what they've learned to new situations, and who can un-learn skills that were useful in a different context but are lethal in a new one. When talking about their past experience, candidates with agility will know why they did what they did in a given situation. Beware anyone who talks too much about "best practices" - if they believe that there are practices that are ideally suited to all situations, they may lack adaptability.

    To probe for agility, you have to ask the candidate questions involving something that they know little about.

  • Brains. There's no getting around the fact that at least part of what you should screen for is raw intelligence. Smart people tend to want to work with smart people, so it's become almost a cliche that you want to keep the bar as high as you can for as long as you can. Microsoft famously uses brainteasers and puzzles as a sort of quasi-IQ test, but I find this technique difficult to train people in and apply consistently. I much prefer a hands-on problem-solving exercise, in a discipline related to the job they are applying for. For software engineers, I think this absolutely has to be a programming problem solved on a whiteboard. You learn so much about how someone thinks by looking at code you know they've written that it's worth all the inconvenience of having to write, analyze and debug it by hand.

    I prefer to test this with a question about the fundamentals. The best candidates have managed to teach me something about a topic I thought I already knew a lot about.

  • Communication. The "lone wolf" superstar is usually a disaster in a team context, and startups are all about teams. We have to find candidates that can engage in dialog, learning from the people around them and helping find solutions to tricky problems.

    Everything you do in an interview will tell you something about how the candidate communicates. To probe this deeply, ask them a question in their area of expertise. See if they can explain complex concepts to a novice. If they can't, how is the company going to benefit from their brilliance?

  • Drive. I have been burned most by hiring candidates who had incredible talents, but lacked the passion to actually bring them to work every day. You need to ask: 1) does the person care about what they work on? and 2) can they get excited about what your company does? For a marketing job, for example, it's reasonable to expect that a candidate will have done their homework and used your product (maybe even talked to your customers) before coming in. I have found this quite rare in engineers. At IMVU, most of them thought our product was ridiculous at best; hopeless at worst. That's fine for the start of their interview process. But if we haven't managed to get them fired up about our company mission by the end of the day, it's unlikely they are going to make a meaningful contribution.

    To test for drive, ask about something extreme, like a past failure or a peak experience. They should be able to tell a good story about what went wrong and why.

    Alternatively, ask about something controversial. I remember once being asked in a Microsoft group interview (and dinner) about the ActiveX security model. At the time, I was a die-hard Java zealot. I remember answering "What security model?" and going into a long diatribe about how insecure the ActiveX architecture was compared to Java's pristine sandbox. At first, I thought I was doing well. Later, the other candidates at the table were aghast - didn't I know who I was talking to?! Turns out, I had been lecturing the creator of the ActiveX security model. He was perfectly polite, not defensive at all, which was why I had no idea what was going on. Then I thought I was toast. Later, I got the job. Turns out, he didn't care that I disagreed with him, only that I had an opinion and wasn't afraid to defend it. Much later, I realized another thing. He wasn't defensive because, as it turns out, he was right and I was completely wrong (Java's sandbox model looked good on paper but its restrictions greatly retarded its adoption by actual developers).

  • Empathy. Just as you need to know a candidate's IQ, you also have to know their EQ. Many of us engineers are strong introverts, without fantastic people skills. That's OK, we're not trying to hire a therapist. Still, a startup product development team is a service organization. We're there to serve customers directly, as well as all of the other functions of the company. This is impossible if our technologists consider the other types of people in the company idiots, and treat them that way. I have sometimes seen technical teams that have their own "cave" that others are afraid to enter. That makes cross-functional teamwork nearly impossible.

    To test for empathy, I always make sure that engineers have one or two interviews with people of wildly different backgrounds, like a member of our production art department. If they can treat them with respect, it's that much less likely we'll wind up with a siloed organization.

  • Fit. The last and most elusive quality is how well the candidate fits in with the team you're hiring them into. I hear a lot of talk about fit, but also a lot of misunderstandings. Fit can wind up being an excuse for homogeneity, which is lethal. When everyone in the room thinks the same way and has the same background, teams tend to drink the proverbial Kool-Aid. The best teams have just the right balance of common background and diverse opinions, which I have found true in my experience and repeatedly validated in social science research (you can read a decent summary in The Wisdom of Crowds).

    This responsibility falls squarely to the hiring manager. You need to have a point of view about how to put together a coherent team, and how a potential candidate fits into that plan. Does the candidate have enough of a common language with the existing team (and with you) that you'll be able to learn from each other? Do they have a background that provides some novel approaches? Does their personality bring something new?
It's nearly impossible to get a good read on all six attributes in a single interview, so it's important to design an interview process that will give you a good sampling of data to look at. Exactly how to structure that process is a topic for another day, however, because I want to focus on the interview itself.

My technique is to structure a technical interview around an in-depth programming and problem-solving exercise. If it doesn't require a whiteboard, it doesn't count. You can use a new question each time, but I prefer to stick with a small number of questions that you can really get to know well. Over time, it becomes easier to calibrate a good answer if you've seen many people attempt it.

For the past couple of years I've used a question that I once was asked in an interview, in which you have the candidate produce an algorithm for drawing a circle on a pixel grid. As they optimize their solution, they eventually wind up deriving Bresenham's circle algorithm. I don't mind revealing that this is the question I ask, because knowing that ahead of time, or knowing the algorithm itself, confers no advantage to potential candidates.

That's because I'm not interviewing for the right answer to the questions I ask. Instead, I want to see how the candidate thinks on their feet, and whether they can engage in collaborative problem solving with me. So I always frame interview questions as if we were solving a real-life problem, even if the rules are a little far-fetched. For circle-drawing, I'll sometimes ask candidates to imagine that we are building a portable circle-drawing device with a black and white screen and low-power CPU. Then I'll act as their "product manager," who can answer questions about what customers think, and also as their combined compiler, interactive debugger, and QA tester.
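For the curious, here's a rough sketch (in Python, though in the interview it all happens on a whiteboard) of the two stages candidates typically move through: a brute-force first pass, followed by something in the spirit of the integer-only midpoint/Bresenham refinement. It's illustrative only, not the "right answer" I'm grading against.

```python
# Illustrative sketch of the circle-drawing exercise, not a model answer.

def circle_brute_force(cx, cy, r):
    """First pass: test every pixel in the bounding box and keep the ones
    whose distance from the center is close to the radius (crude tolerance)."""
    points = []
    for x in range(cx - r, cx + r + 1):
        for y in range(cy - r, cy + r + 1):
            if abs((x - cx) ** 2 + (y - cy) ** 2 - r * r) <= r:
                points.append((x, y))
    return points

def circle_midpoint(cx, cy, r):
    """Refinement in the spirit of Bresenham: integer-only error term,
    walk one octant and mirror each point into the other seven."""
    points = []
    x, y, err = r, 0, 1 - r
    while x >= y:
        for dx, dy in ((x, y), (y, x), (-y, x), (-x, y),
                       (-x, -y), (-y, -x), (y, -x), (x, -y)):
            points.append((cx + dx, cy + dy))
        y += 1
        if err < 0:
            err += 2 * y + 1
        else:
            x -= 1
            err += 2 * (y - x) + 1
    return points
```

The interesting part of the interview isn't either function; it's the conversation that gets a candidate from the first version to the second, and the questions they ask their "compiler" along the way.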

You learn a lot from how interested a candidate is in why they are being asked to solve a particular problem. How do they know when they're done? What kind of solution is good enough? Do they get regular feedback as they go, or do they prefer to think, think, think and then dazzle with the big reveal?

My experience is that candidates who "know" the right answer do substantially worse than candidates who know nothing of the field. That's because they spend so much time trying to remember the final solution, instead of working on the problem together. Those candidates have a tendency to tell others that they know the answer when they only suspect that they do. In a real-world situation, they tend to wind up without credibility or forced to resort to bullying.

No matter what question you're asking, make sure it has sufficient depth that you can ask a lot of follow-ups, but that it has a first iteration that's very simple. An amazing number of candidates cannot follow the instruction to Do the Simplest Thing That Could Possibly Work. Some questions have a natural escalation path (like working through the standard operations on a linked-list) and others require some more creativity.
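As an example of an escalation path, a bare singly linked list gives you a ladder of follow-ups: insertion, then traversal, then reversal, and onward (find the middle, detect a cycle) for as long as the candidate has room to run. A minimal sketch, with names that are purely illustrative:

```python
# Minimal singly linked list, shown here only to illustrate an escalation path.

class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def push_front(head, value):
    """Step 1: insertion at the head."""
    return Node(value, head)

def to_list(head):
    """Step 2: traversal."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out

def reverse(head):
    """Step 3: in-place reversal, a natural follow-up question."""
    prev = None
    while head is not None:
        head.next, prev, head = prev, head, head.next
    return prev
```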

For example, I would often ask a candidate to explain to me how the C code they are writing on the whiteboard would be rendered into assembly by the compiler. There is almost no earthly reason that someone should know about this already, so candidates answer in a wide variety of ways: some have no idea; others make something up; some have the insight to ask questions like "what kind of processor does this run on?" or "what compiler are we using?"; and some just write the assembly down like it's a perfectly normal question. Any of these answers can work, and depending on what they choose, it usually makes sense to keep probing along these lines: which operations are the most expensive? What happens if we have a pipelined architecture?

Eventually, either the candidate just doesn't know, or they wind up teaching you something new. Either way, you'll learn something important. There are varying degrees of not-knowing, too.
  1. Doesn't know, but can figure it out. When you start to probe the edges of someone's real skills, they will start to say "I don't know" and then proceed to reason out the answer, if you give them time. This is usually what you get when you ask about big-O notation, for instance. They learned about it some time ago, don't remember all the specifics, but have a decent intuition that n-squared is worse than log-n.

  2. Doesn't know, but can deduce it given the key principles. Most people, for example, don't know exactly how your typical C++ compiler lays out objects in memory. But that's usually because most people don't know anything about how compilers work, or how objects work in C++. If you fill them in on the basic rules, can they reason with them? Can those insights change the code you're trying to get them to write?

  3. Doesn't understand the question. Most questions require a surprising amount of context to answer. It doesn't do you any good to beat someone up by forcing them through terrain that's too far afield from their actual area of expertise. For example, I would often work the circle-drawing question with candidates who had only ever programmed in a web-based scripting language like PHP. Some of them could roll with the punches and still figure out the algorithmic aspects of the answer. But it was normally useless to probe into the inner workings of the CPU, because it wasn't something they knew about, and it can't really be taught in less than a few hours. You might decide that this knowledge is critical for the job you're hiring for, and that's fine. But it's disrespectful and inefficient to waste the candidate's time. Move on.
My purpose in elaborating these degrees of not-knowingness is to emphasize this essential point: you want to keep as much of the interview as possible in categories one and two. In other words, you want to keep asking questions on the boundaries of what they know. That's the only way to probe for agility and brains, and the best way to probe for communication. In the real world, the vast majority of time (especially in startups) is spent encountering novel situations without a clear answer. What matters is how good your thinking is at times like those, and how well you can communicate it. (It's kind of like playing Fischer Random Chess, where memorizing openings is useless).

Let me return to my topic at the top of the post: using the interview to emphasize values as well as evaluate. The best interviews involve both the interviewer and the candidate learning something they didn't know before. Making clear that your startup doesn't have all the answers, but that your whole team pushes their abilities to their limits to find them, is a pretty compelling pitch. Best of all, it's something you just can't fake. If you go into an interview with the intention of lording your knowledge over a candidate, showing them how smart you are, they can tell. And if you ask questions but don't really listen to the answers, it's all too obvious. Instead, dive deep into a problem and, together, wrestle the solution to the ground.



Saturday, November 22, 2008

Net Promoter Score: an operational tool to measure customer satisfaction

I've mentioned Net Promoter Score (NPS) in a few previous posts, but haven't had a chance to describe it in detail yet. It is an essential lean startup tool that combines seemingly irreconcilable attributes: it provides operational, actionable, real-time feedback that is truly representative of your customers' experience as a whole. It does it all by asking your customers just one magic question.

In this post I'll talk about why NPS is needed, how it works, and show you how to get started with it. I'll also reveal the Net Promoter Score for this blog, based on the data you've given me so far.

How can you measure customer satisfaction?
Other methods for collecting data about customers have obvious drawbacks. Doing in-depth customer research, with long questionnaires and detailed demographic and psychographic breakdowns, is very helpful for long-range planning, interaction design and, most importantly, creating customer archetypes. But it's not immediately actionable, and it's far too slow to be a regular part of your decision loop.

At the other extreme, there's the classic A/B split-test, which provides nearly instantaneous feedback on customer adoption of any given feature. If your process for creating split-tests is extremely light (for example, it requires only one line of code), you can build a culture of lightweight experimentation that allows you to audition many different ideas, and see what works. But split-tests also have their drawbacks. They can't give you a holistic view, because they only tell you how your customers reacted to that specific test.
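To make the "one line of code" idea concrete, here is a minimal sketch of what such a split-test helper might look like (not the actual IMVU implementation; every name in it is invented for illustration): hash the user id into a bucket, log the exposure, and return the variant.

```python
# Hypothetical split-test helper that keeps the call site to one line.
import hashlib

def ab_test(experiment, user_id, variants=("control", "treatment")):
    """Deterministically assign a user to a variant and record the exposure."""
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    variant = variants[int(digest, 16) % len(variants)]
    log_exposure(experiment, user_id, variant)
    return variant

def log_exposure(experiment, user_id, variant):
    """Stub: a real system would write this to its metrics pipeline."""
    print(experiment, user_id, variant)

# The single line in product code then reads something like:
# color = "green" if ab_test("green_button", user_id) == "treatment" else "grey"
```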

You could conduct an in-person usability test, which is very useful for getting a view of how actual people perceive the totality of your product. But that, too, is limited, because you are relying on a very small sample, from which you can only extrapolate broad trends. A major usability problem is probably experienced similarly by all people, but the absence of such a defect doesn't tell you much about how well you are doing.

Net Promoter Score
NPS is a methodology that comes out of the service industry. It involves using a simple tracking survey to constantly get feedback from active customers. It is described in detail by Fred Reichheld in his book The Ultimate Question: Driving Good Profits and True Growth. The tracking survey asks one simple question: How likely are you to recommend Product X to a friend or colleague? The answer is then put through a formula to give you a single overall score that tells you how well you are doing at satisfying your customers. Both the question and the formula are the result of a lot of research claiming that this methodology can predict the success of companies over the long term.

There's a lot of controversy surrounding NPS in the customer research community, and I don't want to recapitulate it here. I think it's important to acknowledge, though, that lots of smart people don't agree with the specific question that NPS asks, or the specific formula used to calculate the score. For most startups, though, I think these objections can safely be ignored, because there is absolutely no controversy about the core idea that a regular and simple tracking survey can give you customer insight.

Don't let the perfect be the enemy of the good. If you don't like the NPS question or scoring system, feel free to use your own. I think any reasonably neutral approach will give you valuable data. Still, if you're open to it, I recommend you give NPS a try. It's certainly worked for me.

How to get started with NPS
For those who want to follow the NPS methodology, I will walk you through how to integrate it into your company, including how to design the survey, how to collect the answers, and how to calculate your score. Because the book is chock-full of examples of how to do this in older industries, I will focus on my experience integrating NPS into an online service, although it should be noted that it works equally well if your primary contact with customers is through a different channel, such as the telephone.

Designing the survey
The NPS question itself (again, "How likely are you to recommend X to a friend or colleague?") is usually asked on a 0-10 point scale. It's important to let people know that 10 represents "most likely" and 0 represents "least likely," but it's also important not to use words like promoter or detractor anywhere in the survey itself.

The hardest part about creating an NPS survey is to resist the urge to load it up with lots of questions. The more questions you ask, the lower your response rate, and the more you bias your results towards more-engaged customers. The whole goal of NPS is to get your promoters and your detractors alike to answer the question, and this requires that you not ask for too much of their time. Limit yourself to two questions: the official NPS question, and exactly one follow-up. Options for the follow-up could be a different question on a 10-point scale, or just an open ended question asking why they chose the rating that they did. Another possibility is to ask "If you are open to answering some follow-up questions, would you leave your phone number?" or other contact info. That would let you talk to some actual detractors, and get a qualitative sense of what they are thinking, for example.

For an online service, just host the survey on a webpage with as little branding or decoration as possible. Because you want to be able to produce real-time graphs and results, this is one circumstance where I recommend you build the survey yourself, versus using an off-the-shelf hosted survey tool. Just dump the results in a database as you get them, and let your reports calculate scores in real-time.
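To make that concrete, a bare-bones version of the storage side might look like the sketch below, with SQLite standing in for whatever database you use; the table and column names are invented for illustration.

```python
# Sketch of "dump the results in a database as you get them."
import sqlite3

conn = sqlite3.connect("nps.db")
conn.execute("""CREATE TABLE IF NOT EXISTS nps_responses (
                    invite_code TEXT PRIMARY KEY,
                    score       INTEGER NOT NULL,
                    comment     TEXT,
                    answered_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)""")

def record_response(invite_code, score, comment=""):
    """Called by the survey page's form handler; one row per response."""
    conn.execute(
        "INSERT INTO nps_responses (invite_code, score, comment) VALUES (?, ?, ?)",
        (invite_code, score, comment))
    conn.commit()
```

Your reporting page can then compute the score straight from this table whenever it's loaded.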

Collecting the answers
Once you have the survey up and running, you need to design a program to have customers take it on a regular basis. Here's how I've set it up in the past. Pick a target number of customers to take the survey every day. Even if you have a very large community, I don't think this number needs to be higher than 100. Even just 10 might be enough. Build a batch process (using Gearman, cron, or whatever you use for offline processing) whose job is to send out invites to the survey.

Use whatever communication channel you normally rely on for notifying your customers. Email is great; of course, at IMVU, we had our own internal notification system. Either way, have the process gradually ramp up the number of outstanding invitations throughout the day, stopping when it's achieved 100 responses. This way, no matter what the response rate, you'll get a consistent amount of data. I also recommend that you give each invitation a unique code, so that you don't get random people taking the survey and biasing the results, and that you let each invite expire, for the same reason.

Choose the people to invite to the survey according to a consistent formula every day. I recommend a simple lottery among people who have used your product that same day. You want to catch people when their impression of your product is fresh - even a few days can be enough to invalidate their reactions. Don't worry about surveying churned customers; you need to use a different methodology to reach them. I also normally exclude anyone from being invited to take the survey more than once in any given time period (you can use a month, six months, anything you think is appropriate).
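Putting the last few paragraphs together, here is a rough sketch of what that invite job might look like. Everything in it (the function names, the batch size, the expiry window) is a placeholder for illustration, not a description of any real system.

```python
# Sketch of the daily invite batch: a lottery among today's active users,
# excluding anyone surveyed recently, with unique, expiring invite codes.
import random
import uuid
from datetime import datetime, timedelta

INVITE_TTL = timedelta(days=3)   # invites expire so stale links can't skew results

def pick_invitees(active_today, recently_surveyed, batch_size=25):
    """Simple lottery among users active today, skipping recent invitees."""
    candidates = [u for u in active_today if u not in recently_surveyed]
    random.shuffle(candidates)
    return candidates[:batch_size]

def make_invite(user_id):
    """Unique code per invite, so only invited users can answer (and only once)."""
    return {"user_id": user_id,
            "code": uuid.uuid4().hex,
            "expires": datetime.utcnow() + INVITE_TTL}

def run_batch(active_today, recently_surveyed, responses_so_far, target=100):
    """Called every hour or so by cron/Gearman; stops once the daily target is hit."""
    if responses_so_far >= target:
        return []
    return [make_invite(u) for u in pick_invitees(active_today, recently_surveyed)]
```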

Calculate your score
Your NPS score is derived in three steps:
  1. Divide all responses into three buckets: promoters, detractors, and others. Promoters are anyone who chose 9 or 10 on the "likely to recommend" scale, and detractors are those who chose any number from 0-6.
  2. Figure out the percentage of respondents that fall into the promoter and detractor buckets.
  3. Subtract your detractor percentage from your promoter percentage. The result is your score. Thus, NPS = P% - D%. (A short code sketch of this calculation follows below.)
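Here is a minimal sketch of that scoring step; the sample numbers at the bottom are made up purely to show the arithmetic.

```python
def net_promoter_score(scores):
    """scores: the raw 0-10 answers to the 'likely to recommend' question."""
    promoters = sum(1 for s in scores if s >= 9)    # 9s and 10s
    detractors = sum(1 for s in scores if s <= 6)   # 0 through 6
    return 100.0 * (promoters - detractors) / len(scores)

# 60 promoters, 30 passives, 10 detractors out of 100 responses:
print(net_promoter_score([10] * 60 + [8] * 30 + [4] * 10))  # -> 50.0
```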
You can then compare your score to people in other industries. Any positive score is good news, and a score higher than +50 is considered exceptional. Here are a few example scores taken from the official Net Promoter website:

  • Apple: 79
  • Adobe: 46
  • Google: 73
  • Barnes & Noble online: 74
  • American Express: 47
  • Verizon: 10
  • DIRECTV: 20

Of course, the most important thing to do with your NPS score is to track it on a regular basis. I used to look at two NPS-related graphs on a regular basis: the NPS score itself, and the response rate to the survey request. These numbers were remarkably stable over time, which, naturally, we didn't want to believe. In fact, there were some definite skeptics about whether they measured anything of value at all, since it is always dismaying to get data that says the changes you're making to your product are not affecting customer satisfaction one way or the other.

However, at IMVU one summer, we had a major catastrophe. We made some changes to our service that wound up alienating a large number of customers. Even worse, the way we chose to respond to this event was terrible, too. We clumsily gave our community the idea that we didn't take them seriously, and weren't interested in listening to their complaints. In other words, we committed the one cardinal sin of community management. Yikes.

It took us months to realize what we had done, and to eventually apologize and win back the trust of those customers we'd alienated. The whole episode cost us hundreds of thousands of dollars in lost revenue. In fact, it was the revenue trends that eventually alerted us to the magnitude of the problem. Unfortunately, revenue is a trailing indicator. Our response time to the crisis was much too slow, and as part of the post-mortem analysis of why, I took a look at the various metrics that all took a precipitous turn for the worse during that summer. Of everything we measured, it was Net Promoter Score that plunged first. It dropped to an all-time low, and stayed there for the entire duration of the crisis, while other metrics gradually came down over time.

After that, we stopped being skeptical and started to pay very serious attention to changes in our NPS. In fact, I didn't consider the crisis resolved until our NPS peaked above our previous highs.

Calculating the NPS of Lessons Learned
I promised that I would reveal the NPS of this blog, which I recently took a snapshot of by offering a survey in a previous post. Here's how the responses break down, based on the first 100 people who answered the question:
  • Number of promoters: 47
  • Number of detractors: 22
  • NPS: 25
Now, I don't have any other blogs to compare this score to. Plus, the methodology was deeply flawed: I offered the survey just by putting a link in a single post, I didn't target specific people to take it, and the invitation was impersonal. Still, all things considered, I'm pretty happy with the result. Of course, now that I've described the methodology in detail, I've probably poisoned the well for taking future unbiased samples. But that's a small price to pay for having the opportunity to share the magic of NPS.

I hope you'll find it useful. If you do, come on back and post a comment letting us all know how it turned out.



Wednesday, November 19, 2008

Lo, my 1032 subscribers, who are you?

When I first wrote about the advantages of having a pathetically small number of customers, I only had 5 subscribers. When I checked my little badge on the sidebar today, I was shocked to see it read 1032. As it turns out, it was much harder to get those first five subscribers than the next thousand, thanks to great bloggers like Andrew Chen, Dave McClure, and the fine folks over at VentureHacks. Thank you all for stopping by.

Of course, 1000 customers is pretty pathetically small too. When startups achieve that milestone, it's a mixed blessing. On the one hand, having a little traction is a good thing. But on the other hand, figuring out what's going on starts to get more difficult. You can't quite talk to everyone on the phone. You have to start filtering and sorting, deciding which feedback to listen to and which loud people to ignore. It's also time to start thinking about customer segments. Do you have a particular set of early adopters that share some common traits? If so, they might be pointing the way towards a much bigger set of people who share those traits, but are not early adopters.

Let's take an example of a startup I was advising a few years ago. Of their early customers, about 1/3 of them turned out to be high school or middle school teachers. This wasn't an education product - it was a pretty surprising group to find using it. What all these teachers had in common were two things: they were technology early adopters that were willing to take a chance on a new software product, and they all had similar problems organizing their classes and students. At that early stage, it was the company's first glimpse of what a crossing the chasm strategy might look like: use these early adopters to build a whole product for the education market. Then sell it to mainstream educators, schools, and school districts, who shared the same problem of organizing classes, but were not themselves early adopters.

So how do you get started with customer segmentation? If you've already been talking to customers one-on-one, don't stop now (and if you haven't, this is still a good time to start). Those conversations are the best way to look for patterns in the noise. As you start to see them, collect your hypotheses and start using broader-reach tools to find out how they break down. I would recommend periodic surveys, along with some kind of forum or other community tool where the most passionate customers can congregate. You can also use Twitter, your blog (with comments), or even a more structured tool like uservoice.

I'd start with a simple survey (I use SurveyMonkey), combining the NPS question with a handful of more in-depth optional questions. In fact, I feel like I should eat my own dogfood, take my own medicine, or whatnot. Here's my survey for Lessons Learned:
As a loyal subscriber, I'd like to invite you to take the first Lessons Learned customer survey: Click Here to take survey
I put this together using the free version of SurveyMonkey, to show just how easy it is. If you're serious about this, you probably want to use their premium version, which will let you do things like add logic to let people easily skip the second page if they choose to, and send them to a "thank you page" afterward. Be sure to make the thank you page have a call to action (like a link to subscribe, for example) - after all, you're dealing with a customer passionate enough to talk to you.

So, to those of you who take the time to fill out the survey: thanks for the feedback! And to everyone who's taken the time to read, comment, or subscribe: thank you.