Monday, December 28, 2009

Continuous deployment for mission-critical applications

Having evangelized the concept of continuous deployment for the past few years, I've come into contact with almost every conceivable question, objection, or concern that people have about it. The most common reaction I get is something like, "that sounds great - for your business - but that could never work for my application." Or, phrased more hopefully, "I see how you can use continuous deployment to run an online consumer service, but how can it be used for B2B software?" Or variations thereof.

I understand why people would think that a consumer internet service like IMVU isn't really mission critical. I would posit that those same people have never been on the receiving end of a phone call from a sixteen-year-old girl complaining that your new release ruined her birthday party. That's where I learned a whole new appreciation for the idea that mission critical is in the eye of the beholder. But, even so, there are key concerns that lead people to conclude that continuous deployment can't be used in mission critical situations.

Implicit in these concerns are two beliefs:

1. That mission critical customers won't accept new releases on a continuous basis.
2. That continuous deployment leads to lower quality software than software built in large batches.

These beliefs are rooted in fears that make sense. But, as is often the case, the right thing to do is to address the underlying cause of the fear instead of avoiding improving the process. Let's take each in turn.

Another release? Do I have to?
Most customers of most products hate new releases. That's a perfectly reasonable reaction, given that most releases of most products are bad news. It's likely that the new release will contain new bugs. Even worse, the sad state of product development generally means that the new "features" are as likely to be ones that make the product worse, not better. So asking customers if they'd like to receive new releases more often usually leads to a consistent answer: "No, thank you." On the other hand, you'll get a very different reaction if you ask customers "next time you report an urgent bug, would you prefer to have it fixed immediately or to wait for a future arbitrary release milestone?"

Most enterprise customers of mission critical software mitigate these problems by insisting on releases on a regular, slow schedule. This gives them plenty of time to do stress testing, training, and their own internal deployment. Smaller customers and regular consumers rely on their vendors to do this for them, and are otherwise at their mercy. Switching these customers directly to continuous deployment sounds harder than it really is. That's because of the anatomy of a release. A typical "new feature" release is, in my experience, about 80% changes to underlying APIs or architecture. That is, the vast majority of the release is not actually visible to the end-user. Most of these changes are supposed to be "side effect free" although few traditional development teams actually achieve that level of quality. So the first shift in mindset required for continuous deployment is this: if a change is supposedly "side effect free," release it immediately. Don't wait to bundle it up with a bunch of other related changes. If you do that, it will be much harder to figure out which change caused the unexpected side effects.

The second shift in mindset required is to separate the concept of a marketing release from the concept of an engineering release. Just because a feature is built, tested, integrated and deployed doesn't mean that any customers should necessarily see it. When deploying end-user-visible changes, most continuous deployment teams keep them hidden behind "flags" that allow for a gradual roll-out of the feature when it's ready. (See this blog post from Flickr for how they do this.) This allows the concept of "ready" to be much more all-encompassing than the traditional "developers threw it over the wall to QA, and QA approved of it." You might have the interaction designer who designed it take a look to see if it really conforms to their design. You might have the marketing folks who are going to promote it double-check that it does what they expect. You can train your operations or customer service staff on how it works - all live in the production environment. Although this sounds similar to a staging server, it's actually much more powerful. Because the feature is live in the real production environment, all kinds of integration risks are mitigated. For example, many features have decent performance themselves, but interact badly when sharing resources with other features. Those kinds of features can be immediately detected and reverted by continuous deployment. Most importantly, the feature will look, feel, and behave exactly like it does in production. Bugs that are found in production are real, not staging artifacts.
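To make the "flags" idea concrete, here's a minimal sketch of a percentage-based feature flag in Python. The flag name, the in-process store, and the bucketing scheme are illustrative assumptions, not any particular team's implementation:

```python
import hashlib

# Hypothetical in-process flag store mapping flag name -> rollout percent.
FLAGS = {
    "new_checkout": 10,  # show the feature to 10% of users
}

def is_enabled(flag_name, user_id):
    """Deterministically bucket a user; stable across requests and deploys."""
    rollout = FLAGS.get(flag_name, 0)
    # Hash flag + user so each user lands in a stable bucket per flag.
    digest = hashlib.md5(f"{flag_name}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < rollout
```

Because the bucketing is deterministic, a gradual roll-out is just a matter of raising the percentage, and a misbehaving feature can be "reverted" by setting it to zero without redeploying any code.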

Plus, you want to get good at selectively hiding features from customers. That skill set is essential for gradual roll-outs and, most importantly, A/B split-testing. In traditional large batch deployment systems, split-testing a new feature seems like considerably more work than just throwing it over the wall. Continuous deployment changes that calculus, making split-tests nearly free. As a result, the amount of validated learning a continuous deployment team achieves per unit time is much higher.

The QA dilemma
A traditional QA process works through a checklist of key features, making sure that each feature works as specified before allowing the release to go forward. This makes sense, especially given how many bugs in software involve "action at a distance" or unexpected side-effects. Thus, even if a release is focused on changing Feature X, there's every reason to be concerned that it will accidentally break Feature Y. Over time, the overhead of this approach to QA becomes very expensive. As the product grows, the checklist has to grow proportionally. Thus, in order to get the same level of coverage for each release, the QA team has to grow (or, equivalently, the amount of time the product spends in QA has to grow). Unfortunately, it gets worse. In a successful startup, the development team is also growing. That means that there are more changes being implemented per unit time as well. Which means that either the number of releases per unit time is growing or, more likely, the number of changes in each release is growing. So for a growing team working on a growing product, the QA overhead is growing polynomially, even if the team is only growing linearly.

For organizations that have the highest standards for mission critical, and the budget to do it, full coverage can work. In fact, that's just what happens for organizations like the US Army, who have to do a massive amount of integration testing of products built by their vendors. Having those products fail in the field would be unacceptable. In order to achieve full coverage, the Army has a process for certifying these products. The whole process takes a massive amount of manpower, and requires a cycle time that would be lethal for most startups (the major certifications take approximately two years). And even the Army recognizes that improving this cycle time would have major benefits.

Very few startups can afford this overhead, and so they simply accept a reduction in coverage instead. That solves the problem in the short term, but not in the long term - because the extra bugs that get through the QA process wind up slowing the team down over time, imposing extra "firefighting" overhead, too.

I want to directly challenge the belief that continuous deployment leads to lower quality software. I just don't believe it. Continuous deployment offers five significant advantages over large batch development systems. Some of these benefits are shared by agile systems which have continuous integration but large batch releases, but others are unique to continuous deployment.
  1. Faster (and better) feedback. Engineers working in a continuous deployment environment are much more likely to get individually tailored feedback about their work. When they introduce a bug, performance problem, or scalability bottleneck, they are likely to know about it immediately. They'll be much less likely to hide behind the work of others, as happens with large batch releases - when a release has a bug, it tends to be attributed to the major contributor to that release, even if that association is untrue. 
  2. More automation. Continuous deployment requires living the mantra: "have every problem only once." This requires a commitment to realistic prevention and learning from past mistakes. That necessarily means an awful lot of automation. That's good for QA and for engineers. QA's job gets a lot more interesting when we use machines for what machines are good for: routine repetitive detailed work, like finding bug regressions. 
  3. Monitoring of real-world metrics. In order to make continuous deployment work, teams have to get good at automated monitoring and reacting to business and customer-centric metrics, not just technical metrics. That's a simple consequence of the automation principle above. There are huge classes of bugs that "work as designed" but cause catastrophic changes in customer behavior. My favorite: changing the checkout button in an e-commerce flow to appear white on a white background. No automated test is going to catch that, but it still will drive revenue to zero. Continuous deployment teams will get burned by that class of bug only once.
  4. Better handling of intermittent bugs. Most QA teams are organized around finding reproduction paths for bugs that affect customers. This made sense in eras where successful products tended to be used by a small number of customers. These days, even niche products - or even big enterprise products - tend to have a lot of man-hours logged by end-users. And that, in turn, means that rare bugs are actually quite exasperating. For example, consider a bug that happens only one time in a million uses. Traditional QA teams are never going to find a reproduction path for that bug. It will never show up in the lab. But for a product with millions of customers, it's happening (and being reported to customer service) multiple times a day! Continuous deployment teams are much better able to find and fix these bugs.
  5. Smaller batches. Continuous deployment tends to drive the batch size of work down to an optimal level, whereas traditional deployment systems tend to drive it up. For more details on this phenomenon, see Work in small batches and the section on the "batch size death spiral" in Product Development Flow.
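Point 3 above can be made concrete with a small sketch: after each deploy, compare a customer-centric metric against its pre-deploy baseline and revert automatically if it craters. The metric, names, and threshold here are assumptions for illustration:

```python
def should_revert(baseline_rate, current_rate, tolerance=0.5):
    """Revert the deploy if the metric falls below tolerance x baseline."""
    if baseline_rate == 0:
        return False  # nothing meaningful to compare against
    return current_rate < baseline_rate * tolerance

# e.g. checkout conversions per session, before and after a deploy
print(should_revert(baseline_rate=0.040, current_rate=0.001))  # True
print(should_revert(baseline_rate=0.040, current_rate=0.039))  # False
```

The white-on-white checkout button passes every functional test, but a check like this catches it within minutes of deployment.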
For those of you who are new to continuous deployment, these benefits may not sound realistic. In order to make sense of them, you have to understand the mechanics of continuous deployment. To get started, I recommend these three posts: Continuous deployment in 5 easy steps, Timothy Fitz's excellent Continuous Deployment at IMVU: Doing the impossible fifty times a day, and Why Continuous Deployment?

Let me close with a question. Imagine with me for a moment that continuous deployment doesn't prevent us from doing staged releases for customers, and it actually leads to higher quality software. What's preventing you from using it for your mission-critical application today? I hope you'll share your thoughts in a comment.

Wednesday, December 23, 2009

Why vanity metrics are dangerous

In a previous post, I defined two kinds of metrics: vanity metrics and actionable metrics. In that post, I took it for granted that vanity metrics are bad for you, and focused on techniques for creating and learning from actionable metrics. In this post, I'd like to talk about the perils of vanity metrics.

My personal favorite vanity metric is "hits." It has all the worst features of a vanity metric. It violates the "metrics are people, too" rule: it counts a technical process, not a number of human beings. It's a gross number, not per-customer: one hit each from a million people is a very different thing than a million hits from just one person. Most normal people don't understand it: what counts as a hit, anyway (an image, a page, a JS file...)? And it has no built-in measure of causality: if we get a million hits this month, what caused them? How could we generate more hits? Are those hits all of equal value? Since each of these questions requires diving into a different sub-metric, why not just use those metrics instead? That's the essence of actionable metrics.
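A tiny example with made-up log data shows why the sub-metrics matter: counting hits hides how many people are behind them.

```python
from collections import Counter

# Made-up access log: (user, resource) pairs.
log = [
    ("alice", "/home"), ("alice", "/logo.png"), ("alice", "/home"),
    ("bob", "/home"),
    ("carol", "/pricing"), ("carol", "/home"),
]

hits = len(log)                              # the vanity number
people = len({user for user, _ in log})      # "metrics are people, too"
hits_per_person = Counter(user for user, _ in log)

print(hits)    # 6
print(people)  # 3
print(hits_per_person.most_common(1))  # [('alice', 3)]
```

Six hits sounds twice as good as three, but it's the same three people, and one of them accounts for half the total.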

But actionable metrics are more work. So it's reasonable to ask: what's wrong with vanity metrics, at least as a proxy for customer behavior? If hits are bigger this month than last month, that's progress. Why do we need to ask hard questions about the metric, if it's at least directionally correct?

When we rely on vanity metrics, a funny thing happens. When the numbers go up, I've personally witnessed everyone in the company naturally attributing that rise to whatever they were working on at the time. That's not too bad, except for this correlate: when the numbers go down, we invariably blame someone else. Over time, this allows each person in the company to live in their own private reality. As these realities diverge, it becomes increasingly difficult for teams to reach consensus on what to do next. If those teams are functionally organized, the problem is amplified. If all the engineers work on the same thing at the same time, and all the marketers do the same, and QA, and ops, all the way down the line, then each department develops its own team-based private reality. Now picture product prioritization meetings in such a company. Each team can't believe those idiots down the hall want to try yet another foo project when it's so evident that foo projects always fail.

Have you ever built one of those charts that shows a metric over time, annotated with "key events" that explain what happened to the numbers at key inflection points? If you never have, you can create your own using Google Finance. Go ahead and try it, then come back. You've just experienced vanity metrics hell. Everyone knows those charts are totally unpersuasive. At best, they can only show correlation, not causation. Sure I can build a stock chart, like this one, that shows that eBay's stock price went into a four-year decline immediately after "eBay Inc Acquires Dutch Company." But do you really believe that's what caused eBay's problems? Of course not. At worst, these kinds of vanity metrics can be easily used for gross distortions. And that potential can cripple companies at just those key moments when they need to understand their data the most.

Let me take an example from a company that was going through a tough "crossing the chasm" moment. They had just experienced two down quarters after many quarters of steady growth. Naturally, they had just raised money, and their new investors were understandably pissed. The company struggled mightily with how to explain this bad news to their board. They were accustomed to measuring their progress primarily by gross revenue compared to their targets. When the numbers started to go down, they started to investigate. It turned out that, during the course of the decline, one customer segment was losing customers while another was gaining customers. It's just that the declining segment's customers were more valuable. In retrospect, I can see the irony of the situation perfectly. This decline was actually the result of the company successfully executing its strategy. The customers on their way out were more valuable in the short-term, but the new customers coming in were where the real growth was going to happen in the long-term. Unfortunately, the magnitude of the shift, and the relative values of the two segments, took the company by surprise.

Guess how well those board meetings went. All of a sudden, now that the company was struggling, new metrics were being used to judge success. The vanity charts appeared, showing the changes the company had made to its strategy and the subsequent changes in customer behavior broken down by segment. All very reasonable, well designed, well argued. In other words, a total disaster. This board had no way to know if they were hearing real insight or just well-crafted excuses. The insight turned out to be correct (it's always clear in hindsight). Too bad several of the executives making that presentation weren't around to be vindicated.

The whole situation could have been avoided if the company had used actionable metrics to set and evaluate goals from the start. The strategy changes could have been rolled out gradually, segment by segment, in controlled trials. The data from those trials could have been used to predict the future effects, and allowed the company to make smarter decisions. Actionable metrics don't guarantee you'll make good decisions. But at least you can have facts in the room at the time.

And that's my metrics heuristic. Consider a team's last few critical decisions. Not the ones with carefully choreographed discussion and a formal agenda. Those are the easy meetings. I'm talking about the ad hoc crisis decisions, the periodic product prioritization meetings, and the failure post-mortems. How much actionable data was in the room at the time? With data, teams have an opportunity to improve their decision making over time, systematically training their intuition to conform to reality. Without it, they're just rolling the dice.

And that's why vanity metrics are dangerous.

Wednesday, December 16, 2009

What is Lean about the Lean Startup?

The first step in a lean transformation is learning to tell the difference between value-added activities and waste. That foundational idea, so clearly articulated in books like Lean Thinking, is what originally led me to start using the term lean startup. I admit that I haven't always done such a good job emphasizing this connection; after all, there's an awful lot to the lean startup theory, and I'm always struggling with how best to explain it fully. Luckily, I've had some excellent backup.

The following is a guest post for Startup Lessons Learned by the legendary Kent Beck. One of the most amazing things about the past year has been the opportunity to meet many legends and personal heroes. And yet, I have a confession to make. Many of these heroes have proved disappointing: some have been defensive, stand-offish, and downright mean. Not so with Kent Beck. 

Longtime readers will recall how I first met him. I was giving my first-ever webcast on the lean startup. For those who've heard it, it contains a lengthy discourse on the subject of agile software development and extreme programming, including its weaknesses when applied to startups. Now, this webcast was packed; hundreds of people were logged in. The chat stream was flying by in my peripheral vision, a constant distraction, hard to focus on. As I'm pontificating about agile, I see the name Kent Beck in my peripheral vision. I was truly terrified, and I almost completely lost my train of thought. Was that really the Kent Beck? I assumed he was there to refute my critique of extreme programming, but nothing could be further from the truth. In fact, of all the gurus and leaders I've had the chance to meet, he has been by far the most open-minded. He instantly understood what I was saying, and since that first encounter, our exchanges have made me a lot smarter. 

So when he weighed into a recent thread on the Lean Startup Circle mailing list on this very subject, I asked if he would expand his comments into a guest post. The following is the result. - Eric

Names matter. By pulling in a web of associations, names help people quickly assess ideas. Chosen well, they draw attention from people likely to appreciate the ideas they identify.

There is a dark side to naming. When a name is misused, as with some of the claims to "agility" extant, the initial interest is followed by disappointment when customers discover there is no corned beef between the slices of rye. It's tempting to ride the coattails of a popular idea by using a word with momentum, but in the end it backfires for the idea and the word.

The naming question has been raised about the "lean" in Lean Startups. Are lean startups really lean or was the word chosen because it is widely recognized and popular?

I had a background in lean manufacturing (book knowledge, anyway) and lean software development (hands on) before encountering Lean Startups. When I read Eric's blog I immediately felt at home: the principles were the same even though some of the practices were different.

The foundation of TPS (Toyota Production System) is that people need to be (and feel) productive and society needs people to produce value. This value is evident in Lean Startups. We are all engaged in creating valuable (we hope) services for society in some form or other and simultaneously meeting our own need to feel significant and productive.

Another basic principle of TPS is respect for people. One form of respect is not wasting the time of people who are creating new products and services. Another form of respect is inviting customers to be part of the process of creating those products and services. At times on this lean startup mailing list I hear an undercurrent of "ha, ha, I got you to give me feedback on this fake landing page even though I gave you nothing in return," which is a violation of this principle of respect. Overall, though, Lean Startups seems far more respectful to me than the "build something big and shove it down customers' throats" model I have participated in (and failed with) over and over.

TPS focuses on eliminating waste. Rather than try to create value faster (I'm thinking of Charlie Chaplin in "Modern Times" or Lucille Ball's Candy Factory scene), a lean organization creates more value by eliminating waste. This principle appears throughout Lean Startups, starting with the biggest waste of all in a startup--building something no one uses.

A TPS tactic that is familiar in lean startups is reducing inventory. If you can split a feature in half and get good feedback about the first half, do it. The lack of inventory enables quick changes of direction, something seen in Lean Startups in the pivot and in TPS in the ability of a single production line to create multiple vehicles. I haven't seen an equivalent of the systematic elimination of work-in-process inventory in Lean Startups, however. (Those who are interested in work-in-process might want to take a look at Work in small batches and Continuous deployment - Eric)

Some specific TPS practices appear in Lean Startups. A/B testing is set-based design. 5 Whys is straight out of the Toyota playbook. Conversion optimization is a form of kaizen. Whether practices work directly is not as important as whether the principles are alive, though. I see the lean principles throughout Lean Startups.

Is the "lean" in Lean Startups an illegitimate attempt to steal some of the "lean foo"? I don't think so. It doesn't look precisely like manufacturing cars, but the principles are shared between the Lean Startups and a lean manufacturing system like TPS, and to some extent even the practices. I expect the cross-fertilization to continue, even as Lean Startups discover what is unique about the startup environment and what calls for a unique response.

Monday, December 14, 2009

Business ecology and the four customer currencies

Lately, I’ve been rethinking the concept of “business model” for startups, in favor of something I call “business ecology.” In an ecosystem, each participant acts according to its own imperatives, but these selfish actions have an aggregate effect. Some ecosystems are stable, others malign, and others grow and prosper. A successful startup strives for this latter case. I think this concept is necessary in order to answer the truly vexing startup questions, like: “Should startups charge customers money from day one?”

Let’s begin with the four customer currencies. I had a lot of use for this concept back when I worked on game design and virtual worlds. In order to maintain game play balance, game designers have to take into account the needs of customers who have an excess of four different assets: time, money, skill, and passion.

If players with more money than others can simply buy their way to the top of the heap, a multiplayer game fails – because this makes the game un-fun for other players. The same is true if kids who have an unlimited amount of time on their hands are guaranteed the top spot – this isn’t going to be very fun for the busy professionals who want to play only casually. Chess is only fun for those who have the requisite skill to play well – and even then, only if there are ranking systems to make sure that players of relatively equal skill play each other. If you could buy a higher chess ranking or, worse, simply grow it by logging more hours, that would ruin the system for everyone. And passionate players are often the backbone of game communities – especially online. They run the clubs, forums, groups and mailing lists that make the game more fun overall. If they are barred from participating (say, because they lack the skill needed to prevent advanced players from killing them all the time), the game is worse off.

Each of these four currencies represents a way for a customer to “pay” for services from a company. And this is true outside of games. Constructing a working business model is a form of ecosystem design. A great product enables customers, developers, partners, and even competitors to exchange their unique currencies in combinations that lead to financial success for the company that organizes them.

Here’s the ecosystem we built at IMVU, just to give one example. We cultivated a passionate community that nurtured a skilled set of developers. Those developers create an incredible variety of virtual goods: 3D models, textures, homepage stickers, music, and much more – more than three million in total last time I checked. This variety entices millions of end-users to invest their time and passion with IMVU, providing many incentives for a small fraction of those users to become paying customers. Those paying customers provide IMVU with sufficient profits to reinvest in the core experience for everyone. It’s a working, growing, ecosystem.

Having a balanced ecosystem is what game designers strive for. But startups strive for something else: growth. Thus, business ecology is concerned with both ecosystem design and finding a driver of growth for that ecosystem. In a previous post, I covered the three main drivers of growth: Paid, Sticky, and Viral. When a startup finds a working value-creating ecosystem that supports one of these drivers of growth, watch out. They’re off to cross the chasm.

And this is why questions like “Should a company charge money from day one?” are nonsensical. Some companies definitely should. Others definitely shouldn’t. In order to tell which is which, you have to understand the unique ecology of the business in question.  Let’s look at some examples:

  • In a traditional business, customers pay money for a physical artifact (a product) or a service. Companies use that money to market the product or service to more customers. This is the simplest ecosystem and simplest driver of growth. A business that strives for something like this should absolutely be charging money from day one, in order to establish baselines for their two key metrics: CPA (the cost to acquire a new customer) and LTV (the lifetime value of each acquired customer). In other words, the minimum viable product is designed to answer the question: does the product generate enough demand and margin to support a growing ecosystem?  
  • Now consider a traditional media business. By paying money to content creators (i.e., writers, producers, talent), the business builds up assets that are of interest to other consumers. Those other consumers pay for this content sometimes with money, but more often with their attention. This attention is valuable to yet another set of people: namely, the traditional businesses (see above) who are using marketing to grow, and are looking to advertise to new prospects. The value of the attention that the media company collects determines how profitable it is. In the old days, these media companies would then themselves plow this profit back into marketing and advertising, and grow. Today, many of these businesses are suffering because the ecosystem no longer balances thanks to the Internet. (Sorry about that.) If you’re starting a new media company, does it make sense to charge from day one? Probably not – you need to be finding an audience, making sure that audience will trade you their attention for your content, and – most importantly – establishing a baseline for how much that attention is worth to advertisers. A minimum viable product in this category must answer the question: does my media content or channel command the attention of a valuable audience?

  • Let’s look at a viral growth company, like Facebook. They are a classic case of a company that doesn’t seem to care about charging customers money. Here’s Andrew Chen’s description:

    “it strikes me that consumer internet companies often don’t care much whether or not they have viable businesses in the short run. If you are building a large, viral, ad-support consumer internet property, you just want to go big! As soon as possible!”
    This is a common sentiment, but I don't agree. I think it uses the phrase “viable business” in too narrow a sense. When Facebook launched early-on at college campuses, it was immediately apparent that they had a viable business, even though they weren’t charging customers for anything. Why? Because they were collecting truly massive amounts of attention and they had an amazing driver of growth. Those two factors made it relatively easy for them to raise enough money to avoid having to build a profitable business in the short term. But that doesn’t mean they didn’t have a viable one. The ecosystem worked, and was growing. Figuring out how to turn that attention into cash seems to have been pretty obvious to Mark Zuckerberg. For a true viral ecosystem, the minimum viable product is designed to answer the question: can I unlock viral growth mechanics while still keeping my ecosystem alive? As many viral companies have found to their chagrin, quite a few viral products are fundamentally useless. Although they grow, they don’t actually collect enough of any customer currency to be viable.

    In fact, the viral metaphor is actually more apt than many people realize, once you look at it from an ecological perspective. Facebook is actually quite rare – many other viral products didn’t really build their own working ecology: they colonized someone else’s. That was true for PayPal colonizing eBay and for YouTube colonizing MySpace, and could still be true of Slide, Zynga, or RockYou – we’ll see.

    Now, Andrew’s excellent piece that I quoted from above correctly diagnoses two situations where consumer internet companies often get in trouble:

    1. They focus too much on short-term revenue, getting caught in a local maximum via constant optimization. They aren’t really engaged in customer development, they aren’t getting inside their customers’ heads, and they aren’t crafting a robust ecosystem. For a consumer internet company in particular, this is often due to a lack of design thinking.

    2. They get focused solely on growth. This isn’t helpful either, as countless companies have shown. If you haven’t figured out the ecosystem, growth is useless – whether it is an acquisition-only viral loop, like Tagged, or an advertising blitz like countless dot-bombs.

  • Let’s consider one last example, a sticky-growth company like eBay or World of Warcraft. Here the goal is to create a product whose ecosystem makes it hard for customers to leave. eBay offers their customers an opportunity to monetize their skill and passion via online trading for hard currency. World of Warcraft offers beautifully balanced and addictive game play, for which customers trade all four currencies in bewildering combinations. As on eBay, these investments are best understood as trades between players, which is what makes multiplayer game design so much harder than its single-player counterpart. What these products all have in common is the question their minimum viable product is attempting to answer: does this product have high natural retention built-in?

Understanding the four customer currencies allows us to avoid these problems, and also unify a number of different concepts that have been floating around. Take the minimum viable product, for starters. How should the word viable be understood? Here’s the original definition I proposed for MVP:
“the minimum viable product is that version of a new product which allows a team to collect the maximum amount of validated learning about customers with the least effort.”

Of course, this raises the question: what are we trying to learn? Now I think I can answer this question with some certainty: we want to learn how to construct a business ecology harnessed to the right driver of growth. And how do we validate that learning? By creating a model of the ecosystem we want, and showing that actual customer behavior conforms to that model. And, of course, if customers don’t behave the way we expect, it’s time to pivot.

That necessarily means that different types of startups will be seeking to learn different things. Minimum viable product is a tactic for mitigating risk. It doesn’t say anything about which risks it should be used to mitigate. Startup founders need to use their own judgment to ask: which is the riskiest assumption underlying my business plan? In each of the ecosystem examples I gave above, the tactics of the minimum viable product are quite different. In some cases, Tim Ferriss-style landing page tests will suffice. Others require Steve Blank-style problem/solution presentations. And others require an early product prototype. The level of design required will vary. The level of engineering quality will vary. The amount of traditional business modeling will vary.

And now we can answer the biggest question of all: how do we know it’s time to scale? Or, to borrow Steve Blank’s formulation, how do we know it’s time to move from Customer Validation to Customer Creation? I think I have a solid answer here too: when we have enough data that shows our business ecology is value-creating and also ready to grow via a specific driver of growth.

Founders struggle with this question. Successful startups don’t. In almost every case I’m aware of, the question never had to be asked. When an ecosystem is thriving and growing, it takes work just to keep up with its scaling needs. This was true at Facebook, eBay, and Google – and at countless other successful startups. Marc Andreessen has already coined a phrase for what it looks like: product/market fit. One clue that you don’t have product/market fit: you’re trying to evaluate your business to see if you have it. It’s probably time to pivot.

These concepts have important implications for any lean startup. My whole goal with the lean startup movement has been to learn how to tell the difference between value-creating activities and waste in startups – and then to start eliminating waste. In an entrepreneurial situation, this is hard, because the artifacts we are creating (products, code, marketing campaigns, even revenue) are of secondary importance. The real value we create is the learning itself: how to craft a profitable ecosystem. So, when evaluating any activity, ask: is this helping me learn more about my startup’s ecosystem? If not, eliminate it. If so, ask: how could I get even more learning while doing even less work?

Most of all, beware one-size-fits-all startup advice. In order to figure out what applies to your unique situation, focus on the principles. Who are the customers? What currencies do they have? And what problems do they need solved? Look for a balanced ecosystem and a driver of growth. And be sure to hold on to the reins once you find it.

Thursday, November 12, 2009

New York: three straight days of Lean Startup (two of which are free)

Greetings from Europe, where I'm just wrapping up an incredible (and exhausting) speaking tour. I've been so busy with talks and travel that I haven't had much chance to post updates to the blog, but many of the events here have had video, and I will try to post details soon. Before I get to go home, I am doing a full week of events on the east coast in New York and Boston: five events in five days - if I survive.

For those who want to come see me at Web 2.0 Expo in New York, there's good news. Thanks to web2open, there are two Lean Startup sessions that are completely free and open to the public. We did similar events at the Expo in San Francisco, and had a lot of fun. (To get a preview, you can read my write up of that event here.) Here are details on the three New York events; registration details follow:
  • Monday, November 16 at 9:00am, we'll kick off the week with a Lean Startup Workshop. This will be an abbreviated three-hour version of the Master Class I've been doing with O'Reilly. (For those that want the full thing, there is one scheduled in New York on December 10.)
  • Tuesday, November 17 at 4:20pm, you can come get a 50-minute introduction to the lean startup theory and practice. This is a designated web2open hybrid session - which means you can attend for free even if you're not attending the rest of the conference. It's the only hybrid session happening on Tuesday. Unlike in San Francisco, the open Q&A session won't happen until the next day.
  • Wednesday, November 18 at 3:30pm you can come have all of your lean startup-related questions answered at a web2open session. Again, this is free and open to the public. I'm hoping this will also be an opportunity to organize a meeting of the New York Lean Startup Meetup. Stay tuned to their mailing list for details.
So, if you'd like to attend, please do as follows. Sign up at the Expo registration form here using the code webny09opn by Nov 15. You'll be offered a $100 discount on a full conference registration, but that is strictly optional. If you just want to come to the web2open unconference, you can register for free.

The Boston event at MIT on the 19th is already sold out. On the 20th, I'm doing a lunchtime recap at Dog Patch Labs for Polaris Ventures (who are also a generous sponsor of the MIT talk); they have a few extra tickets to accommodate some additional people. Click here for details.

As always, if you're a reader, come say hello and introduce yourself after the event. And do keep the feedback coming - that's what makes these events worthwhile. And for those that can't make it, you can always follow along at the #leanstartup hashtag on Twitter.

And for those of you in San Francisco - the Lean Startup Cohort program is going to begin on December 17. Thanks to the many of you who have supported the idea, and especially to our brave early adopters!

Monday, October 26, 2009

A real Customer Advisory Board

A reader recently asked on a previous post about the technique of having customers periodically produce a “state of the company” progress report. I consider this an advanced technique, and it is emphatically not for everyone.

Many companies seek to involve customers directly in the creation of their products. This is a lot harder than it sounds. Hearing occasional input is one thing, but building an institutional commitment to acting on this feedback is hard. For one, there are all the usual objections to customer feedback: it is skewed in favor of the loud people, customers don’t know what they want, and it is fundamentally our job to figure out what to build. All of those objections are valid, but that can’t be the end of the story. Just because we don’t blindly obey what our customers say doesn’t absolve us of the responsibility of hearing them out.

The key to successful integration of customer feedback is to make each kind of feedback collection part of the regular company discipline of building and releasing products. In previous posts, I’ve mentioned quite a few of these techniques, including these most important ones:
  • having engineers post on the forums in their own name when they make a change
  • routinely split-testing new changes
  • routinely conducting in-person usability tests and interviews
  • Net Promoter Score
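The last of these, Net Promoter Score, reduces to simple arithmetic: the percentage of promoters (customers answering 9–10 on the 0–10 “how likely are you to recommend us?” question) minus the percentage of detractors (those answering 0–6). Here is a minimal sketch in Python, using made-up survey responses:

```python
def net_promoter_score(ratings):
    """Compute NPS from 0-10 'how likely are you to recommend us?' answers.

    Promoters answer 9-10, detractors 0-6; passives (7-8) count only
    in the denominator. NPS = % promoters - % detractors.
    """
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)

# Made-up batch: 4 promoters, 2 passives, 4 detractors
print(net_promoter_score([10, 9, 9, 10, 8, 7, 6, 5, 3, 2]))  # → 0.0
```

Passives dilute the score without swinging it either way, which is why a batch with equal numbers of promoters and detractors nets out to zero.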
Each of these techniques is fundamentally bottom-up. They assume that each person on the team is genuinely interested in testing their work and ideas against the reality of what customers want. Anyone who has worked in a real-world product development team can tell you how utopian that sounds. In real life, teams are under tremendous time pressure, they are trying to balance the needs of many stakeholders, and they are human. They make mistakes. And when they do, they are prone to all the normal human failings when it comes to bad news: the desire to cover it up, rationalize the failure away, or redefine success.

To counteract those tendencies, it helps to supplement with top-down process as well. One example is having a real Customer Advisory Board. Here’s what it looks like. In a previous company, we put together a group of passionate early adopters. They had their own private forum, and a company founder (aka me) personally ran the group in its early days. Every two months, the company would have a big end-of-milestone meeting, with our Board of Directors, Business Advisory Board, and all employees present. At this meeting, we’d present a big package of our progress over the course of the cycle. And at each meeting, we’d also include an unedited, uncensored report direct from the Customer Advisory Board.

I wish I could say that these reports were always positive. In fact, we often got a failing grade. And, as you can see in my previous post on “The cardinal sin of community management” the feedback could be all over the map. But we had some super-active customers who would act as editors, collecting feedback from all over the community and synthesizing it into a report of the top issues. It was a labor of love, and it meant we always had a real voice of the customer right there in the meeting with us. It was absolutely worth it.

Passionate online communities are real societies. What we call “community management” is actually governance. It is our obligation to govern well, but – as history has repeatedly shown – this is incredibly hard. The decisions that a company makes with regard to its community are absolute. We aspire to be benevolent dictators. And unlike in many real-world societies, our decisions are not rendered as law but as code. (For more on this idea, see Lawrence Lessig’s excellent Code is Law.) The people who create that code are notoriously bad communicators, even when they are allowed to communicate directly to their customers.

A customer advisory board that has the ear of the company’s directors acts as a kind of appeals process for company decisions. As I mentioned in “The cardinal sin of community management,” many early adopters will accept difficult decisions as long as they feel listened to. As a policy matter, this is easy to say and very hard to implement. That’s why the CAB is so valuable. They provide a forum for dissenting voices to be heard. The members of the CAB have a stake in providing constructive feedback, since they will tend to be ignored if they pass on vitriol. In turn, they become company-sanctioned listeners. By leveraging them, the company is able to make many more customers feel heard.

The CAB report acts as a BS detector for top management. It’s a lot harder to claim everything is going smoothly, and that customers are dying for Random New Feature X when the report clearly articulates another point of view. Sometimes the right thing to do is to ignore the report. After all, listening to customers is not intrinsically good. As always, the key is to synthesize the customer feedback with the company’s unique vision. But that’s often used as an excuse to ignore customers outright. I know I was guilty of this many times. It’s all-too-easy to convince yourself that customers will want whatever your latest brainstorm is. And it’s so much more pleasant to just go build it, foist it on the community, and cross your fingers. It sure beats confronting reality, right?

Let me give one small example. Early in IMVU’s life, IM was a core part of the experience. Yet we were very worried about having to re-implement every last feature that modern IM clients had developed: away messages, file transfer, voice and video, etc. As a result, we tried many different stratagems to avoid giving the impression that we were a fully-featured IM system, going so far as to build our initial product as an add-on to existing IM programs. (You can read how well that went in another post here.)

This strategy was simply not working. Customers kept demanding that we add this or that IM feature, and we were routinely refusing. Eventually, the CAB decided to weigh in on the matter in their board-level report. I remember it so clearly, because their requests were actually very simple. They asked us to implement five – and only five – key IM features. For weeks we debated whether to do what they asked. We were afraid that this was just the tip of the iceberg, and that once we “gave in” to these five demands there would be five more, ad infinitum. It actually took courage to do what they wanted – as it does for all visionaries. Every time you listen to customers, you fear diluting your vision. That’s natural. But you have to push through the fear, at least on occasion, to make sure you’re not crazy.

In this particular example, it turned out they were right. Just those few IM features made the product dramatically better. And, most importantly, that was the end of IM feature creep. Nobody even mentioned it as an issue in subsequent board meetings. That felt good – but it also gave our Board tremendous confidence that we could change the kind of feedback we were getting by improving the product.

This technique is not for everybody. It gets much harder as the company – and the community – scales, and, in fact, IMVU uses a different system of gathering community feedback today. But, if your community is giving you a headache, give this a try. Either way, I hope you’ll share your experiences, too.

Friday, October 23, 2009

Case Study: Using an LOI to get customer feedback on a minimum viable product

How much work should you do on a new product before involving customers? If you subscribe to the theory of the minimum viable product, the answer is: only enough to get meaningful feedback from early adopters. Sometimes the best way to do this is to put up a public beta and drive a limited amount of traffic to it. But other times, the right way to learn is actually to show a product prototype to customers one-on-one. This is especially useful in situations, like most B2B businesses, where the total number of customers is likely to be small.

This case study illustrates one company’s attempt to do customer development by testing their vision with customers before writing a single line of code. In the process, they learned a lot by asking initial prospects to sign a non-binding letter of intent to buy the software. As you’ll see, this quickly separated the serious early adopters from everyone else. Mainstream customers don’t have enough motivation to buy an early product, and so building in response to their feedback is futile.

Along the way, this case study raises interesting ethical issues. The lean startup methodology is based on enlisting customers as allies, which requires honesty and integrity. If you deceive customers by showing them screenshots of a product that is “in-development” but for which you have written no code, are you lying to them? And, if so, will that deception come back to haunt you later? Read on and judge for yourself.

The following was written by an actual lean startup practitioner. It was originally posted anonymously to the Lean Startup Circle mailing list, and then further developed on the Lean Startup Wiki’s Case Studies section. If you’re interested in writing a future case study, or commenting/contributing to one, please join the mailing list or head on over to the wiki. What follows is a brief introduction by me, the case study itself, and then some Q&A led by LSC creator Rich Collins. Disclaimer: claims and opinions expressed by the authors of case studies are theirs alone; I can’t take credit or responsibility. – Eric Ries

In April of 2009 my partner and I had an idea for a web app, a B2B platform that we are selling as SaaS [software-as-a-service]. We decided from the get-go that, while we clearly saw the benefits and necessity of our concept, we would remain fiercely skeptical of our own ideas and implement the customer development process to vet the idea, market, customers, etc., before writing a single line of code.

My partner was especially adamant about this as he had spent the last 6 months in a cave writing a monster, feature-rich web app for the financial sector that a potential client had promised to buy, but backed out at the last second.  They then tried to shop the app around, and found no takers.  Thousands of lines of code, all for naught -- as is usually the case without a customer development process. (See Throwing away working code  for more on this unfortunate phenomenon. -Eric)

We made a few pencil drawings of what the app would look like, which we then gave to a graphic designer.  With that, the graphic designer created a Photoshop image. We had him create what we called our "screenshots" (a word which suggests that an app actually existed at the time) and had him wrap them in one of these freely available PS Browser Templates. Now, armed with 4 "screenshots" and a story, we approached our target market, some of which was through warm introductions, and some, very literally, through simple cold-calling.

Once we secured a meeting, we told our potential customers that we were actively developing our web app (implying that code was being written) and wanted to get potential user input into the development process early on.  Looking at paper print-outs of our "screenshots", no one could tell that this was simply a printout of a PSD, and not a live app sitting on a server somewhere. We walked them through what we thought would be the major application of our product.  Most people were quite receptive and encouraging.  What proved to be very interesting was that we quickly observed a bimodal distribution with regards to understanding the problem and our proposed solution:

  • people either became very excited and started telling us what we should do, what features it needed and how to run with this, or
  • they didn't think there was a real problem here, much less a needed solution.
We ruminated on this for a while. The vehemence of those that didn't get it surprised us.  Perhaps we had a super-duper-hyper-ultra-cool idea  --- but not enough customers existed to make it worth the effort. We visited each potential customer a minimum of twice, if not three times.  Each time we would come back with a few more "screenshots" and tell them that development was progressing nicely and ask them for more input. We also solicited information as to how they were currently solving the problem and how much they paid for their solution.

On the third visit, we pressed those who saw merit in the idea to sign a legally non-binding Letter of Intent.  Namely, that they agree to use the product free of charge if we deliver it to them and it is capable of X, Y and Z.  And not only do they agree to use it, but they intend to purchase it by Y date at X price if it meets their needs.

By the way, this LOI was not written in legalese.  Three quarters of it was simple everyday English.  In fact, we customer dev-ed the LOI itself.  The first time, we asked a client to sign it before we had even written it.  When they agreed to sign it, we quickly whipped it up while sitting in a coffee shop and emailed it off to them.  This would help us separate the wheat from the chaff when it came to determining interest and commercial viability.  Once we had two LOIs signed and in-hand, we actually began to write code.

We also implicitly used the LOIs for price structure and price discovery - which we are still working on.  We backed into prices from all sorts of angles, estimating the time-cost of equivalent functionality, competitive offerings, other tools we were potentially displacing -- but in the end, we lobbed a few numbers at them and waited to see if they flinched.

Customer A got X price, Customer B got X + Y price, and so on.  So far, our customers have never mentioned price as an objection, which suggests to me that at this point we are very much underpriced. The LOI was also useful as leverage: we approached the competitor of one signee and simply let them know that their competitor would be using our app.  They returned our cold intro email within 8 mins.

We have two customers that have balked at signing LOIs, but want to use our product.  This has been somewhat of a quandary for us.  When we decided to go the LOI route, we thought that we would not bend and that we would only service those customers who would sign the LOI.  In the end, we decided that these two customers were large enough to help us with exposure, provide good usage data and worth the risk of them wasting our time.  Time will tell if this theory proves correct.

Right now, the app itself is pretty ugly, a bit buggy and slow -- and doesn't even do a lot.  It is borderline embarrassing.  Don't get me wrong, it does the few necessary things.  BUT it definitely does NOT have the super-duper-hyper-ultra-cool Web 2.0 spit and polish about it. Interestingly enough, our ratio of positive comments to negative comments from actual users is about 10 to 1.  One of our first customers had a disastrous launch with it, yet has signed on to try it again (granted, they did get it for free and we did offer it for free for this next time). But they didn't hesitate.  I thought we would have to plead, beg and beseech.  But for them, it was a no-brainer.  So, we have to be doing something right.

Our feature set is very limited and being developed almost strictly from user input.  While I personally have all sorts of super-duper-hyper-ultra-cool Web 2.0 ideas --- we are holding ourselves back, and forcing ourselves to wait for multiple, explicit and overlapping user requests.  We have seen our competitors whose feature sets are very rich, to say the least, but we think in some cases, are as over-engineered as they are feature-rich.

Only time and the market will tell if they are innovative and we are slow, lazy pigs or they have gotten ahead of themselves/the market and our minimalist solution will be better received.

Rich Collins, founder of the Lean Startup Circle, responded to the poster with some Q&A.
LSC: What is your response to some of the people on Hacker News that questioned the ethics of taking this approach?

Some of the commenters have some good points.  It definitely explores ethical boundaries.  However, I don't think we indulged in any zero-sum game type deception.  By that, I mean our intentional fuzziness about the state of development did not cause harm in any manner to our prospective clients.  In fact, just us showing up at their offices and talking about our screenshots benefited our prospective clients tremendously as:

  1. Those clients who had never even entertained the functionality we were proposing gained significant knowledge.
  2. With that knowledge, they could (and did) Google our competition and start exploring the space and current offerings. 
We did, in fact, tell one of our prospects in the beginning that our screenshots were simply mock-ups.  However, that makes the prospect feel as if you are wasting their time and they then are unlikely to provide input.

"Oh, this is just a Photoshop file?  Well, come back to us when you are further along." which defeats the whole purpose of getting face time for Customer Development!

When you tell them the app is in development (and it was – even before coding, we were spending a lot of time on what we wanted and didn't want, how it would look, use cases, etc.), the prospects are interested in providing input and shaping the product.  They need to feel and see some momentum.

LSC: Your use of a non-binding letter of intent was another interesting tactic.  Did the customers that signed it end up paying for your product?

Yes and no.  We had a dispute with one signee and couldn't convert them.  However, we successfully converted others.  I should also mention that there was one client who refused to sign an LOI, but we are in the process of converting them.

The LOI was designed to give us hard, non-bullshit-able feedback instantly.  Too often people will affirm your idea so that you (or they) can save face, which BTW is a form of well-intentioned and socially acceptable deception.  This is why, IMHO, friends, wives, and significant others are probably not good people to talk to about your idea.  At the end of the day, no one knows if the idea is any good.  The market will tell you.

LSC: Would you respond to a few selected Hacker News comments?
"If I were one of your prospects, I would never sign a letter of intent based on drawings only. I'd make you come back later with something, anything I could play with ... Come back when you have something real to show. Until then you're no different from any other poser."

I myself probably would never sign an LOI on screenshots only.  However, our customers did a lot of stuff that I would never do.  Lesson learned:  I am not my customer.  We think differently.  We solve our problems differently.  We have different needs and wants.  Repeat after me:  You are not your customer.

LSC: And one more: "Except the LOIs in this case are utterly meaningless. I've been on the customer side of LOIs that were signed on request, knowing that it obligated us to nothing."

Wrong.  We got instantaneous feedback on the validity of the idea and started our sales process concurrently.  While legally non-binding, customers who have signed an LOI are a lot less likely to disappear or make themselves hard to get a hold of.  LOIs, while clearly not as good as a signed sales contract, do have meaning and are valuable.  I encourage B2B startups to keep them in their customer development arsenal.

Special thanks to Rich Collins, the Lean Startup Circle practitioners, and to everyone who has contributed to the Case Studies on the wiki. And thanks to these entrepreneurs for sharing their story. Have a case study you’d like to share? Head on over to the Lean Startup Wiki.

Monday, October 19, 2009

Myth: Entrepreneurship Will Make You Rich

I have a new guest post on GigaOm today, called Myth: Entrepreneurship Will Make You Rich. Here's an excerpt:
One of the unfortunate side effects of all the publicity and hype surrounding startups is the idea that entrepreneurship is a guaranteed path to fame and riches. It isn’t. Building a startup is incredibly hard, stressful, chaotic and – more often than not – results in failure. That doesn’t mean it’s not a worthwhile thing to do, just that it’s not a good way to make money.

A more rational career path for money-making is one that rewards effort, in the form of promotions, increased security, salary and status. Startups, unfortunately, punish effort that doesn’t yield results. In fact, the biggest source of waste in a startup is building something nobody wants. While in an academic R&D lab, creation for creation’s sake will often get you praise, in a startup, it will often put you out of business.

So why become an entrepreneur instead of developing technology in an R&D lab? Three reasons: change the world, make customers’ lives better and create an organization of lasting value. If you only want to do one of these things, there are better options. But only startups combine all three.

Take this fictional example of a Seedcamp attendee (actually a composite), which I will refer to as Hairbrush 2.0...

Read the rest of Myth: Entrepreneurship Will Make You Rich

Also take a look at the great Hacker News discussion of this essay. It includes several gems, including this comment from davidu:
1) Being an entrepreneur, for me, isn't about being wealthy, it's about being successful.
2) Rich is a variable term, and intended to be so.
Entrepreneurship may not make you wealthy, but it can certainly make you rich.
I enjoy the freedom and independence afforded by starting EveryDNS and OpenDNS. Both contain a passion for a system I love, the DNS, and both have let me help millions of consumers around the world. I even like knowing I control the DNS for millions and millions of Internet users. That's an awesome responsibility and it certainly makes me feel rich about everything I do.
And when it comes to money, Eric is only somewhat right. He says you should get a job that rewards and promotes effort. But lots of lawyers and finance kids in New York thought they had stable jobs that would make them rich. Ask them today and most will tell you a different story altogether. Now they hate their jobs and have no job security or path to becoming really wealthy.
So like I said, being entrepreneur, for me, isn't about being wealthy, it's about being successful. That's a measuring stick that's far more important.
and this one from gits_tokyo:
People that I've spoken with in the past more often than not associated the idea of me doing a startup in the tech industry with gaining massive wealth. While I may entertain this, deep down I find it lacking as there's so much more than wealth to be had.
How about, living in a world... some distant future from the everyday-everyday where day-by-day you toil piecing together a vision, one day injecting it into the present, in order to influence a whole new set of social behaviors while also unfolding valuable opportunities. How about, the day of flipping that proverbial switch, releasing this vision out in the wild. How about, the potential of millions interacting with your vision, it becoming a staple part of a users online experiences. There's something undeniably provoking about all this, rush of my life.
Wealth, although a welcomed aside pales in comparison. Hell I would even go so far as to say, in a world where sex is constantly peddled as a cure all, let me say it, sex pales in comparison to the feeling I get from being an entrepreneur.

Inc Magazine on Minimum Viable Product (and a response)

Inc Magazine has a great new piece up about the increasing use of the Minimum Viable Product by businesses (and not just startups). Here's an excerpt; some of my comments are below:

One of the most gut-wrenching moments for a company is the rollout of a new product. A significant swing and miss can break a company's momentum -- and maybe its bank account. Unfortunately, after months or even years of development, many companies discover that customers aren't willing to buy their new wares. That's why some entrepreneurs are trying another approach to product launches: marketing a product online before spending much on research and development or inventory.

Consider the method used by TPGTEX Label Solutions, a Houston-based software company that specializes in bar codes and labels for manufacturers and chemical companies. Like many companies, TPGTEX rolls out new products several times a year. But instead of spending the time and money to develop products on spec, TPGTEX creates mocked-up webpages that list the features of a potential new product -- such as a system for making radio-frequency identification, or RFID, labels -- along with its price. Then, the company spends no more than a few hundred dollars marketing the product through search engines and to the contacts in its sales database and LinkedIn. It isn't until a customer actually clicks or calls to place an order that TPGTEX's developers will build the software. "We do not develop a product until we get a paying customer," says Orit Pennington, who co-founded the six-employee company with her husband in 2002. Development time is typically no more than two to three weeks, and it generally takes just a few orders to cover development costs.

TPGTEX's approach is an example of a trend in business that has been dubbed minimum viable product or microtesting. The idea is to develop something with the minimum amount of features or information needed to gauge the marketability of a product online. That might mean mocking up a website with potential features and seeing how many visitors click on the item. It might also involve buying pay-per-click ads to see how easy it is to gain potential customers. Or it might mean selling a few products on a site like eBay to see how well they perform before ordering in bulk from a wholesaler.
What sets this approach apart from practices like using focus groups is that companies base product development decisions not just on what customers say they want but on how they vote with their wallets.

Read the rest...
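The microtesting loop the article describes is simple enough to sketch in a few lines of code. The sketch below is purely illustrative (the class, its thresholds, and the decision rule are my assumptions, not anything TPGTEX actually uses): count visits and paid orders on a mocked-up product page, and only green-light development once the orders on hand would cover the estimated build cost.

```python
class MicroTest:
    """Track demand for a mocked-up product page before building anything."""

    def __init__(self, product, price, dev_cost_estimate):
        self.product = product
        self.price = price                        # price shown on the mock page
        self.dev_cost_estimate = dev_cost_estimate
        self.visits = 0
        self.orders = 0

    def record_visit(self):
        self.visits += 1

    def record_order(self):
        # A real order click or phone call, not a focus-group opinion.
        self.orders += 1

    @property
    def conversion_rate(self):
        return self.orders / self.visits if self.visits else 0.0

    def should_build(self):
        # TPGTEX's stated bar: a few paying orders cover development cost.
        return self.orders * self.price >= self.dev_cost_estimate
```

The key design point is that the build/no-build decision keys off revenue actually committed by customers, not clicks or survey answers.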

This article is part of a trend that has taken me a bit by surprise: the adoption of lean startup techniques outside the traditional domain of high-tech startups. The theory predicts this, of course, because the definition of a startup as “a human institution creating a new product or service under conditions of extreme uncertainty” says nothing about sector, size of company, or industry. Still, it’s always a relief to see practice and theory converge.

Of course, as more people attempt to use the Minimum Viable Product as a tactic, there are a lot of possible misconceptions. The biggest is the confusion over why this tactic is useful. The Inc story, like many others, does a good job of emphasizing its lean-ness. By allowing customers to “pull” value from the company in small batches, you reduce the risk of building a product that nobody wants. Like all lean transformations, this is powerful – it increases the value of every dollar invested in new product creation.

But MVP is most powerful when it is used as part of an overall strategy of learning and discovery. This is where the most confusion arises, because under this strategy the point of an MVP is not simply to build a minimal product. For that, release early, release often will suffice. But if our aspiration is to change the world, we need something more.

The key ideas are customer development, the pivot, MVP, and root cause analysis. Each is described in separate essays on this blog, but let me say a few words about how they work together – especially for companies with big ambitions. Big visions take a long time to develop, and require an exceptionally high degree of product/market fit. That’s just a fancy way of saying: customers have to really, really like your product. To be specific, it means that their behavior powers one of the three fundamental drivers of growth with a large coefficient. But if big products and big visions take a long time to develop, it’s exceptionally risky to build one based on vision alone. That’s because for a big product to take off, it needs to be right in many key respects. Miss just one, and you can find yourself a few degrees off course – and moving with too much momentum to change direction. Think Friendster, the “achieving a failure” startup I’ve written about, Apple’s Newton, Webvan, etc. In each of these cases, the failure of the initial idea led to the failure of the company (or division).

Building an MVP can help mitigate that risk. But it’s not enough. What if customers hate the MVP? Does that mean your product vision is fundamentally flawed, or just that your initial product sucks? There is no way to know for sure. That’s why entrepreneurship in a lean startup is really a series of MVPs, each designed to answer a specific question (hypothesis). Being systematic about these hypotheses is what customer development is all about. Each failed hypothesis leads to a new pivot, where we change just one element of the business plan (customer segment, feature set, positioning) – but don’t abandon everything we’ve learned. In order to work, these pivots have to head in a coherent direction, which is why vision is still such a critical part of entrepreneurship, even in a data-based decision making environment. (See “It’s a startup, not a spreadsheet” for more.)

And yet, even that is not enough. The more visionary the entrepreneur, the more difficult it is to really pivot, really seek out what’s in customers’ heads, and really create a minimum viable product. And so startups – great and terrible alike – are prone to give these ideas lip service, but fail to take maximum advantage of them. That’s why a process of rigorous root cause analysis is so critical. After every major milestone, the company has to ask: what did we learn? Why didn’t we learn more? And, most importantly, it has to make incremental investments to do better next time. This is the ultimate startup discipline, the hardest to master, and the one that pays the biggest dividends. If you can embrace continuous improvement from day one, you can actually speed up as you scale. It’s an awesome thing to watch.

Sunday, October 11, 2009

Innovation inside the box

I was recently privy to a product prioritization meeting in a relatively large company. It was fascinating. The team spent an hour trying to decide on a new pricing strategy for their main product line. One of the divisions, responsible for the company’s large accounts, was requesting data about a recent experiment that had been conducted by another division. They were upset because this other team had changed the prices for small accounts to make the product more affordable. The large-account division wanted to move the pricing in the other direction – making the low-end products more expensive, so their large customers would have an increased incentive to upgrade.

Almost the entire meeting was taken up with interpreting data. The problem was that nobody could quite agree what the data meant. Many custom reports had been created for this meeting, and the data warehouse team was in the meeting, too. The more they were asked to explain the details of each row on the spreadsheet, the more evident it became that nobody understood how those numbers had been derived.

Worse, nobody was quite sure exactly which customers had been exposed to the experiment. Different teams had been responsible for implementing different parts of it, and so different parts of the product had been updated at different times. The whole process had taken a long time. And by now, the people who had originally conceived the experiment were in a separate division from the people who had executed it.

Listening in, I assumed this would be the end of the meeting. With no agreed-upon facts to help make the decision, I assumed nobody would have any basis for making the case for any particular action. Boy, was I wrong. The meeting was just getting started. Each team simply took whatever interpretation of the data supported their position best, and started advocating. Other teams would chime in with alternate interpretations that supported their positions, and so on. In the end, decisions were made – but not based on any actual data. Instead, the executive running the meeting was forced to decide based on the best arguments.

The funny thing to me was how much of the meeting had been spent debating the data, when in the end, the arguments that carried the day could have been made right at the start of the meeting. It was as if each advocate sensed that they were about to be ambushed: if another team managed to bring clarity to the situation, that clarity might benefit them – so the rational response was to obfuscate as much as possible. What a waste.

Ironically, meetings like this had given data and experimentation a bad name inside this company. And who can blame them? The data warehousing team was producing classic waste – reports that nobody read (or understood). The project teams felt these experiments were a waste of time, since they involved building features halfway, which meant they were never quite any good. And since nobody could agree on each outcome, it seemed like “running an experiment” was just code for postponing a hard decision. Worst of all, the executive team was getting chronic headaches. Their old product prioritization meetings may have been a battle of opinions, but at least they understood what was going on. Now they first had to sit through a ritual involving complex math that reached no definite outcome – and then have the battle of opinions anyway!

When a company gets wedged like this, the solution is often surprisingly simple. In fact, I call this class of solutions “too simple to possibly work” because the people inside the situation can’t conceive that their complex problem could have a simple solution. When I’m asked to work with companies like this as a consultant, 99% of my job is to find a way to get the team to get started with a simple – but correct – solution.

Here was my prescription for this situation. I asked the team to consider creating what I call a sandbox for experimentation. The sandbox is an area of the product where the following rules are strictly enforced:
  1. Any team can create a true split-test experiment that affects only the sandboxed parts of the product, however:
  2. One team must see the whole experiment through end-to-end.
  3. No experiment can run longer than a specified amount of time (usually a few weeks).
  4. No experiment can affect more than a specified number of customers (usually expressed as a % of total).
  5. Every experiment has to be evaluated based on a single standard report of 5-10 (no more) key metrics.
  6. Any team that creates an experiment must monitor the metrics and customer reactions (support calls, forum threads, etc) while the experiment is in-progress, and abort if something catastrophic happens.
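For an online service, the mechanical core of these rules can be sketched in code. The sketch below is my own illustration, not a system from the post: the experiment name, metric names, and specific caps are assumptions, but the structure enforces rules 3, 4, and 5 directly – a bounded run time, a hard exposure cap with deterministic bucketing, and one standard report.

```python
import hashlib
from datetime import datetime, timedelta

# The standard 5-10 key metrics every experiment is judged on (rule 5).
# These particular names are illustrative assumptions.
STANDARD_METRICS = ["registration_rate", "activation_rate",
                    "paid_conversion", "support_tickets", "revenue_per_user"]

class SandboxExperiment:
    MAX_DURATION = timedelta(weeks=3)   # rule 3: bounded run time
    MAX_EXPOSURE = 0.10                 # rule 4: at most 10% of customers

    def __init__(self, name, started_at):
        self.name = name
        self.started_at = started_at

    def bucket(self, customer_id):
        """Deterministically assign a customer to 'control' or 'treatment',
        or None if they fall outside the exposure cap (rule 4)."""
        digest = hashlib.sha256(f"{self.name}:{customer_id}".encode()).hexdigest()
        fraction = int(digest[:8], 16) / 0xFFFFFFFF   # uniform in [0, 1]
        if fraction >= self.MAX_EXPOSURE:
            return None                               # not in the experiment
        return "treatment" if fraction < self.MAX_EXPOSURE / 2 else "control"

    def is_expired(self, now):
        return now - self.started_at > self.MAX_DURATION

    def report(self, metrics_by_arm):
        """Rule 5: every experiment is judged on the same short report,
        given per-arm metric values, e.g. {"control": {...}, "treatment": {...}}."""
        return {m: {arm: vals.get(m) for arm, vals in metrics_by_arm.items()}
                for m in STANDARD_METRICS}
```

Hashing the customer id with the experiment name means the same customer always sees the same arm, and different experiments slice the population independently – which is what makes the results of a true split-test interpretable without anyone debating who saw what.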
Putting a system like this in place is relatively easy; especially for any kind of online service. I advocate starting small; usually, the parts of the product that start inside the sandbox are low-effort, high-impact aspects like pricing, initial landing pages, or registration flows. These may not sound very exciting, but because they control the product’s positioning for new customers, they often allow minor changes to have a big impact.

Over time, additional parts of the product can be added to the sandbox, until eventually it becomes routine for the company to conduct these rigorous split-tests for even very large new features. But that’s getting ahead of ourselves. The benefits of this approach are manifest immediately. Right from the beginning, the sandbox achieves three key goals simultaneously:

  1. It forces teams to work cross-functionally. The first few changes, like a price change, may not require a lot of engineering effort. But they require coordination across departments – engineering, marketing, customer service. Teams that work this way are more productive, as long as productivity is measured by their ability to create customer value (and not just stay busy).
  2. Everyone understands the results. True split-test experiments are easy to classify as successes or failures, because top-level metrics either move or they don’t. Either way, the team learns immediately whether their assumptions about how customers would behave were correct. By using the same metrics each time, the team builds literacy across the whole company about those key metrics.
  3. It promotes rapid iteration. When people have a chance to see a project through end-to-end, and the work is done in small batches, and has a clear verdict delivered quickly, they benefit from the power of feedback. Each time they fail to move the numbers, they have a real opportunity for introspection. And, even more importantly, to act on their findings immediately. Thus, these teams tend to converge on optimal solutions rapidly, even if they start out with really bad ideas.
Putting it all together, let me illustrate with an example from another company. This team had been working for many months in a standard agile configuration: a disciplined engineering team taking direction from a product owner who would prioritize the features they should work on. The team was adept at responding to changes in direction from the product owner, and always delivered quality code.

But there was a problem. The team rarely received any feedback about whether the features they were building actually mattered to customers. Whatever learning took place was confined to the product owner; the rest of the team was just heads-down implementing features.

This led to a tremendous amount of waste, of the worst kind: building features nobody wants. We discovered this reality when the team started working inside a sandbox like the one I described above.

When new customers would try this product, they weren’t required to register at first. They could simply come to the website and start using it. Only after they started to have some success would the system prompt them to register – and after that, start to offer them premium features to pay for. It was a slick example of lazy registration and a freemium model. The underlying assumption was that making it seamless for customers to ease into the product was optimal. In order to support that assumption, the team had written a lot of very clever code to create this “tri-mode” experience (every part of the product had to treat guests, registered users, and paying users somewhat differently).

One day, the team decided to put that assumption to the test. The experiment was easy to build (although hard to decide to do): simply remove the “guest” experience, and make everyone register right at the start.  To their surprise, the metrics didn’t move at all. Customers who were given the guest experience were not any more likely to register, and they were actually less likely to pay. In other words, all that tri-mode code was complete waste.

By discovering this unpleasant fact, the team had an opportunity to learn. They discovered, as is true of many freemium and lazy registration systems, that easy is not always optimal. When registration is too easy, customers can get confused about what they are registering for. (This is similar to the problem that viral loop companies have with the engagement loop: by making it too easy to join, they actually give away the positioning that allows for longer-term engagement.) More importantly, the experience led to some soul-searching. Why was a team this smart, this disciplined, and this committed to waste-free product development creating so much waste?

That’s the power of the sandbox approach.

Tuesday, October 6, 2009

A large batch of videos, slides, and audio

I've been trying very hard to avoid turning this blog into a travelogue. Normally, I try to make my post-event writeups more than just a transcript, by including reactions and comments. On this speaking tour, that's been simply impossible, so I've decided to let the following collection of videos, podcasts, and slides batch up for a little while. If you're interested in more real-time updates during my speaking tour, please tune into my twitter feed.

In the meantime, I hope you enjoy all this multimedia content. In addition to some of my recent talks, you can learn more about the Startup Visa movement and enjoy two really interesting lean startup case studies.

My Stanford Entrepreneurial Thought Leader Seminar courtesy of Stanford Ecorner (audio podcast only for now, video coming soon):

If you'd like to follow along with slides, they are here:

From high atop the BT Tower in London, this brief BT Tradespace interview:

Why do we need a Startup Visa? A Tale of 2 Erics:

Also in London, I took up a lot of airtime during day two of Seedcamp. You can read highlights on their blog, or watch this short video:

Seedcamp - Day 2 Highlights from Seedcamp on Vimeo.

Or watch my full #leanstartup presentation at Seedcamp in London:

And two bonus videos that are well worth watching (weally):

Timothy Fitz, who worked for me at IMVU, giving an in-depth presentation on the details of the continuous deployment system that we built there.

With accompanying slides:

pbWorks (formerly pbWiki) was one of the first companies that ever invited me to join their advisory board. I like to think that had some small part in causing their subsequent success. Judge for yourself by watching David Weekly's #leanstartup case study (pbWorks):

Thanks to everyone who has helped plan, organize, record and attend these many events!