Tuesday, September 30, 2008

What does a startup CTO actually do?

What does your Chief Technology Officer do all day? Often, people seem to think the title is synonymous with "that guy who gets paid to sit in the corner and think 'technical' deep thoughts" or "that guy who gets to swoop in and rearrange my project at the last minute on a whim." I've tried hard not to live up (or down?) to those stereotypes, but it's not easy. We lack a consistent and clear definition of the job.

When I've asked mentors of mine who have worked in big companies about the role of the CTO, they usually talk about the importance of being the external face of the company's technology platform; an evangelist to developers, customers, and employees. That's an important job, for sure, and I've been called upon to do it from time to time. But I don't think most startups really have a need for someone to do that on a full time basis.

So what does CTO mean, besides just "technical founder who really can't manage anyone?"

I always assumed I wouldn't manage anybody. Being a manager didn't sound fun - deep down, who really wants to be held accountable for other people's actions? I mean, have you seen other people? They might do anything! So I initially gravitated to the CTO title, and not VP of Engineering. I figured we'd bring in a professional to do the managing and scheduling-type stuff, and I could stay focused on making sure we built really awesome technology. But along the way, something strange happened. It became harder and harder to separate how the software is built from how the software is structured. If you're trying to design an architecture to maximize agility, how can that work if some people are working in TDD and others not? How can it work if some folks are pre-building and others use five why's to drive decisions? And what about if deployment takes forever? Some options can improve the performance of the software at the expense of readability, deployability, or scalability. Should you take them? These sounded to me like technical problems, but when you do any kind of root cause analysis they turn out to be people problems. And there's really no way to tackle people problems from the sidelines.

So I wound up learning the discipline of managing other people. Turns out, I wasn't too bad at it, and I found out just how rewarding it can be. But since I spent a long time in a hybrid CTO/VP Engineering role, I still have this nagging question. Just what is the CTO supposed to do?

Here's my take. The CTO's primary job is to make sure the company's technology strategy serves its business strategy. If that sounds either too simple or too generic, think for a second if any companies you know do the reverse. Have you ever heard a technologist use technical mumbo-jumbo to make it sound like a business idea he or she didn't like was basically impossible? That's what we should be trying to avoid.

I'll try and break it down into five specific skills.
  • Platform selection and technical design - if your business strategy is to create a low-burn, highly iterative lean startup, you'd better be using foundational tools that make that easy rather than hard. Massive proprietary databases? I don't think so. Can the company dig into its tools when they fail and fix them? If not, who's going to insist we switch to free and open source software? When projects are getting off the ground, who can the team check with to make sure their plans are viable? Who will hold them accountable for their project's impact on the platform as a whole?

  • Seeing the big picture (in graphic detail) - the CTO should be the person in the room who can keep everything your technology can and can't do in their head. That means knowing what's written and what's not, what the architecture can and can't support, and how long it would take to build something new. That's more than just drawing architecture diagrams, though. Being able to see the macro and micro simultaneously is a hallmark of all of the really great technologists I've had the privilege to work with.

  • Provide options - another mark of a good CTO is that they never say "that's impossible" or "we'd never do that." Instead, they find options and can communicate them to everyone in the company. If the CEO wants to completely change the product in order to serve a new customer segment, you need someone in the room who can digest the needs of the new (proposed) business, and lay out the costs of each possible approach. Some technologists have a tendency just to "decide for you" and give you the "best" option, but that's dangerous. You can't have an honest dialog if one party knows all the answers.

  • Find the 80/20 - this was my favorite part of the job. Sometimes, you're in a meeting where someone wants to build a new feature. And in their mind, they've got it all spec'ed out. It slices, it dices, and probably washes your car too. In my mind, they're racking up costs (one month for that part, two months for that other part, uh oh). On a bad day, I'd just give them the sobering news. But a good day looked like this. Once I understood what the objective of their feature was for customers, I could sometimes see a way to get 80% of the benefit for 20% of the cost. "Would you be able to learn what you need to learn if that feature just sliced, but not diced? Because if we don't have to add a dicing module, we can repurpose the flux capacitor via solar flares...." I was constantly amazed how often the answer was something like "really? dicing is what's expensive?! I just threw that in there on a whim!"

  • Grow technical leaders - I like to formalize this responsibility by eventually designating some engineers as "Technical Leads" and delegating to them the work of guiding the technical direction of more and more projects. This is the only way to scale. It also forced me to get clear about which aspects of our company's technical direction were really important principles, and which were just artifacts of how we got there. With multiple people trying to work to the same standard, we had to be a lot crisper in our definitions. Was the fact that we were primarily using PHP essential, or could we add new tools written in other languages? Was it an important or irrelevant fact that most of our web code was procedural and not object-oriented? What if someone wanted to write their module in OOP style? By delegating and training, we created a corps of leaders who could step in to provide CTO-like services on demand. And by working together, we created a team whose whole was greater than the sum of its parts.
I want to add one last idea, even though I recognize it is controversial, bordering on the boundary between the CTO and VP Engineering. I don't know how much I'm being influenced by having worn both hats, but I think it's important enough to go out on a limb and add.
  • Own the development methodology - in a traditional product development setup, the VP Engineering or some other full-time manager would be responsible for making sure the engineers wrote adequate specs, interfaced well with QA, and ran the scheduling "trains" for releases. But I think in a lean startup, the development methodology is too important to be considered "just management." If the team is going to use TDD or JIT scalability, for example, these choices have enormous impact on what the architecture must look like. At a minimum, I think it's the CTO and Tech Leads that have to be responsible for five why's-style root cause analysis of defects. Otherwise, how can they find out what their blind spots are and make sure the team and the architecture are adjusting? That job calls for someone who sees the big picture.
Your CTO might be a great architect, evangelist, interface designer or incredible debugger. Those are great skills to have, and I'm curious what you've seen work and not work. I'll be the first to admit that my experience is limited, so I'm collecting anecdotes. Have you worked with or for a great CTO? What made them exceptional? What's one thing a brand-new first-time CTO could learn from them?

Monday, September 29, 2008

Q&A with an actual reader

One of my favorite things about having a blog is the feedback I get in comments and by email. Today, I thought I'd answer a few questions that came in from a very thoughtful comment from Andrew Meyer. (He's also a blogger, at Inquiries Into Alignment).

Question 1:

When you're adding features to a product used by an existing user base, do you still do split testing to determine usage patterns?
Absolutely, yes. Sometimes, testing with existing customers is more complicated than with new customers. Existing customers already have an expectation about how your product works, and it's important to take this into consideration when adding or changing features. For example, it's almost always the case that a new layout or UI will confuse some customers, even if it's a lot better than the old one. You have to be prepared for that effect, so it doesn't discourage you prematurely. If you're worried about it, either run the test against new customers or run it for longer than usual. We would usually give changes like this a few extra days to see if customers eventually recover their enthusiasm for the product.

On the other hand, existing customers can be a testing benefit. For example, let's say you are adding a new feature in response to customer feedback. Here, you expect that customers will find the feature a logical or natural extension, and so they should immediately gravitate to it. If they don't, it probably means you misunderstood their feedback. I have made this mistake many times. At IMVU, for example, we used to hear the feedback that people wanted to "try IMVU by themselves" before inviting their friends to use it. Because many on our team came from a games background, we just assumed this meant they were asking for a "single-player mode" where they could dress their avatar and try controlling it on their own.

As it turned out, shipping that feature didn't make much impact when we looked at the data. What customers really meant was "let me use IMVU with somebody I don't know" so they could get a feel for the social features of the product without incurring the social risk of recommending it to their friends. Luckily, the metrics helped us figure out the difference.

Question 2:

If your product has areas where people read and then different areas where people interact, are there ways to do metrics to determine where people spend their time? Could this be done on mouse focus, commenting amounts, answer percentages, download percentages, etc?

There are ways to measure customer behavior in tremendous detail, and in some situations these metrics are important. But lately I have been recommending in stronger and stronger terms that we not get too caught up in detailed metrics, especially when we are split-testing. Let's run a thought experiment. Imagine you have incredibly precise metrics about every minute that every customer spends with your product, every mouse click, movement - everything. So you do a split-test, and you discover that Feature X causes people to spend 20% more time on a given part of your product, say a particular web page.

Is that a success? I would argue that you really don't know. It might be that the extra time they are spending there is awesome, because they are highly engaged, watching a video or reading comments. Or it could be that they are endlessly pecking through menus, totally confused about what to do next. Either way, you would have been better off focusing your split-test on high level metrics that measure how much customers like your product as a whole. Revenue is always my preferred measure, but you can use anything that is important to your business: retention, activation, viral invites, or even customer satisfaction in the form of something like net promoter score. If an optimization has an effect at the micro level that doesn't translate into the macro level - who cares?

For more on the details of how to do simple and repeatable split-testing, take a look at The one line split-test, or how to A/B all the time.
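
To make the thought experiment concrete, here is a minimal sketch in Python of a split-test that is judged on a macro metric (revenue per customer) rather than on time-on-page. The function names, the experiment name, and the sample revenue figures are all hypothetical, invented for illustration; this is not the implementation from the linked post.

```python
import hashlib
from collections import defaultdict


def assign_group(customer_id: int, experiment: str) -> str:
    """Deterministically place a customer in 'control' or 'test'.

    Hashing the experiment name together with the customer id means the
    same customer always sees the same variant, without storing any state.
    """
    digest = hashlib.sha256(f"{experiment}:{customer_id}".encode()).hexdigest()
    return "test" if int(digest, 16) % 2 else "control"


def revenue_per_customer(purchases, experiment, customer_ids):
    """Average revenue per customer in each group; non-buyers count as zero."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cid in customer_ids:
        group = assign_group(cid, experiment)
        counts[group] += 1
        totals[group] += purchases.get(cid, 0.0)
    return {group: totals[group] / counts[group] for group in counts}


if __name__ == "__main__":
    # Made-up data: customer id -> dollars spent during the test window.
    purchases = {1: 10.0, 4: 25.0, 7: 5.0, 12: 40.0}
    report = revenue_per_customer(purchases, "feature_x", range(1, 21))
    for group, value in sorted(report.items()):
        print(f"{group}: ${value:.2f} per customer")
```

Dividing by everyone in the group, not just buyers, is what keeps the comparison at the macro level: a feature that holds attention but sells nothing shows up as no change. A real test would also need enough customers to tell a true difference from noise, which this sketch does not attempt.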

Sunday, September 28, 2008

The lean startup comes to Stanford

I'm going to be talking about lean startups (and the IMVU case in particular) three times in the next two weeks at Stanford. It's exciting to see the theory and methodology being discussed in an academic context. The entrepreneurship programs of the business, engineering, and undergraduate schools are all tackling the subject this semester, and I'm honored to be part of it. Even better, my friend Mike Maples, one of the pioneers of microcap investing in startups, is teaching a unit in Stanford's E145 on "The New Era of Lean Startups."

It's a real challenge to communicate honestly in these classes. I struggle to try and make the students actually experience how confusing and frustrating startup environments are. When we do the IMVU case, we generally get complete consensus in the class that several of the zany things we did are 100% right. Complete consensus? We didn't even think they were 100% right. And we still argue about whether our success came from those decisions, or some exogenous factor.

It's one of the hard things about learning just from hindsight, and it matters in the board room every bit as much as in the classroom. You can only learn from being wrong, but our brains are excellent rationalizers. When something works, it's too easy to invent a story about how that was your intention all along. If you don't make predictions ahead of time, there's no way to call you on it.

In fact, in the early days, when IMVU would experience unexpected surges of revenue or traffic, it was inevitable that every person in the company was convinced that their project was responsible. Those stories would be retold and repeated, and eventually achieved mythological status as "facts" that guided future action. But making decisions on the basis of myths is dangerous territory.

How did we combat this tendency? I don't pretend that we did it well. But many of the tools of lean startups are designed for just this purpose:
  • Regular checking in with and regular talking to customers surfaces bogus theories pretty fast
  • Split-tests make it harder to take credit for some external factor making you successful
  • Cross-functional teams tend to examine their assumptions harder and with more skepticism than purely single-function teams
  • Working in small batches tends to make it less likely that you'll attribute big results to small changes (because the fact that small changes sometimes do lead to big results is counter-intuitive)
  • Rapid iteration makes it easy to test and re-test your assumptions to give you many opportunities to drive out superstition
  • Open source code invites criticism and active questioning
Still, it's hard to make the case that these solutions are needed, because the problems seem so obvious. I hear some variation of this pretty often: "I mean, sure those guys were rationalizing and kidding themselves. But our team would never do that, right? We'll just be more vigilant." Good luck.

Let me end with a challenge: see if you can find and kill just one myth in your development team. My suggestion: take a much-loved feature and split-test it with some new customers to see if it really makes a difference. If you try, share your story here. I'm especially interested in what you used to share the idea with your colleagues. What language should we use? What arguments are persuasive? What works and what doesn't?

Monday, September 22, 2008

You don't need as many tools as you think

I'm always excited to see someone else writing about lessons learned from their startup, and wanted to link today to Untitled - Startup Lessons Learned -- Take it with a grain of salt. Here's something I can relate to:
We used assembla for subversion, scrums, milestones, wikis, and for general organizational purposes. We had all the tools in place but we didn’t actually practice agile development. Scrum reports would come in once a month, nobody was actually responsible for anything ... No fancy tools needed here — it’s about the mentality, attention to detail, and the actions that foster agile development, not the tools & systems you set up in place to facilitate this.
It's a natural assumption that, in order to implement a new process, we need fancy new tools. This is generally false. And even when new tools are needed, my experience has been that you can only figure out what tools you need once you have done the relevant task by hand. It's another example of the refactoring principle in action.

My favorite instance of this is scheduling software. Now, there are some situations where scheduling needs to be incredibly complex, like when many teams are working on something that is heavily synchronized and where the consequences of failure are severe. But it's incredibly easy to fool yourself into thinking you're in that situation, so it's worth being a little skeptical.

If you read Lean Thinking, you can enjoy numerous examples of companies that replaced multi-million dollar MRP systems with a simple white board. The lean manufacturing guys call this visual control and it's very powerful. When you make progress evident to everyone on the team, you allow for decentralized initiative, and foster a focus on team (rather than individual goals). If you've never worked in this way, you'd be surprised how many people can lend a hand in areas way out of their specialty, if given the opportunity. Most engineers are terrible visual designers, but if the designer on their team is struggling, maybe they can help out with an icon or two. And wait until you see a "non-technical" designer writing simple code to try and speed up a release.

In order for progress to be evident it has to be:
  • Simple - everyone on the team has to understand what it means. A typical setup might be to have cards representing tasks (as in XP story cards) and have them move across a board from an "in progress" column to "complete." I usually recommend a three-part board, where the left column is where the scrum-style "product backlog" is maintained in priority order, the middle represents tasks in progress, and the right column is for tasks recently finished.

  • Visible - the whole team has to be able to see the status, and not just when they are actively having a problem. A board posted in the same room as the team is best, because you can't help notice it when you come in. In continuous integration systems, a single colored status webpage can work (it looks like this). But putting the status on a webpage only works if members of the team have a reason to check it all the time. For example, I recommend that you wire up your source control system to disallow checkins while any of your unit tests are failing. This has lots of benefits, but one strong one is that it causes everyone to check the status of the build all the time.

  • Accurate - if you've ever been managed by task-tracking software, you'll relate to the feeling that you either have to spend ludicrous amounts of time posting updates or the information goes stale. It's exponentially more frustrating if some people on the team update their status religiously, and others don't. The more visible the status is, the more likely people will update it, but it's also important that updates be a natural part of your workflow.

    For example, I recommend a simple rule: each team member is allowed to have only one task "in progress" at any given time. This rule is easy to enforce; just look up at the board and count the cards in the "in progress" column. It's also a natural accuracy-enhancer, since when you want to work on a new task, you need to move your old task to complete. If the task is not complete, you force people to surface the issue quickly. (It also has the nice side-effect of driving down the batch size of work, but that's for another post)
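
The one-task-in-progress rule is simple enough to check by looking at the board, but it can also be written down as a few lines of Python. This is only a sketch: the board layout (a dictionary of columns holding owner and task pairs), the names, and the tasks are all made up for illustration.

```python
from collections import Counter


def wip_violations(board, limit=1):
    """Return owners with more than `limit` cards in the 'in progress' column.

    `board` maps a column name to a list of (owner, task) cards.
    """
    in_progress = Counter(owner for owner, _task in board.get("in progress", []))
    return sorted(owner for owner, count in in_progress.items() if count > limit)


if __name__ == "__main__":
    board = {
        "backlog": [("unassigned", "export to CSV")],
        "in progress": [
            ("ana", "login page"),
            ("raj", "billing bug"),
            ("ana", "new icon"),
        ],
        "complete": [("raj", "signup email")],
    }
    print(wip_violations(board))  # ['ana']
```

If a physical board already makes the violation obvious at a glance, a script like this is probably unnecessary.
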
This framework for making progress evident applies to more than just scheduling, of course. Acceptance tests can make the progress of your features evident, assuming they are simple enough for everyone to understand, visible to all, and accurately reflect the goals of your project. Cacti graphs can serve the same purpose for quality of service, and a good business dashboard can help with business goals.
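
The checkin rule from the Visible item can be sketched as a small gate script. This assumes a setup the post does not describe: a continuous integration job that writes the word `green` or `red` to a status file, and a source control hook that runs the script and rejects the checkin on a non-zero exit code (Subversion's pre-commit hook works this way). The file path and function names are hypothetical.

```python
import sys
from pathlib import Path

# Hypothetical location where the CI job records the result of the last test run.
STATUS_FILE = Path("/var/ci/build_status.txt")


def checkin_allowed(status_text: str) -> bool:
    """Allow checkins only when the last recorded build is green."""
    return status_text.strip().lower() == "green"


def main(status_file: Path = STATUS_FILE) -> int:
    """Return 0 to accept the checkin, 1 to reject it."""
    try:
        status = status_file.read_text()
    except OSError:
        # If the status is unknown, fail closed and say why.
        print("Build status unavailable; checkin rejected.", file=sys.stderr)
        return 1
    if not checkin_allowed(status):
        print("Unit tests are failing; fix the build before checking in.", file=sys.stderr)
        return 1
    return 0

# When installed as a hook script, the last line would be: sys.exit(main())
```

A team using this would also need an escape hatch for the checkin that fixes the build, for example letting the person who broke it through; that policy is left out of the sketch.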

In each case, though, think twice before you set up an elaborate automated system. I used to think a giant flat-panel screen that broadcast our company's key metrics would be what was required to get everyone to pay attention to them. I never tried it, but I recently got to meet a startup who did. The result: as soon as the novelty value wore off, nobody paid attention anymore. A much better solution, I think, would be to have each project leader be in charge of physically printing out a one-page report every week with the relevant stats for their project, and present it to the whole company. If they are going to be judged by the output of that report, you had better believe they are going to: 1) make sure it is accurate, 2) check it often, and 3) make their team understand it.

The three drivers of growth for your business model. Choose one.

Master of 500 Hats: Startup Metrics for Pirates (SeedCamp 2008, London)

This presentation should be required reading for anyone creating a startup with an online service component. The AARRR model (hence pirates, get it?) is an elegant way to model any service-oriented business:
  1. Acquisition
  2. Activation
  3. Retention
  4. Referral
  5. Revenue
We used a very similar scheme at IMVU, although we weren't lucky enough to have started with this framework, and so had to derive a lot of it ourselves via trial and error. Dave's done a great job of articulating the key metrics you want to look at in each of these five areas, and I won't bother repeating them here (go read the presentation already). He also has a discussion of how your choice of business model determines which of these metric areas you want to focus on. That's where I'd like to pick up the discussion.

I think the salient question to ask about any business model is: what is the primary driver of growth? I break the answer to that question down into three engines:
  1. Viral - this is the business model identified in the presentation as "Get Users." Here, the key metrics are Acquisition and Referral, combined into the now-famous viral coefficient. If the coefficient is > 1.0, you generally have a viral hit on your hands. You get increasing growth by optimizing the viral loop, and you get revenue as a side-effect, assuming you have even the most anemic monetization scheme baked into your product. The law of large numbers (of customers) says you can't help but make at least some money - your valuation is determined by how well you monetize the tidal wave of growth. Examples of this are well-known, and (in my definition) include any product that causes new customers to sign up as a necessary side-effect of existing customers' normal usage: Facebook, Myspace, AIM/ICQ, Hotmail, Paypal.

  2. Paid - if your product monetizes customers better than your competitors, you have the opportunity to use your lifetime value advantage to drive growth. In this model, you take some fraction of the lifetime value of each customer and plow that back into paid acquisition through SEM, banner ads, PR, affiliates, etc. The spread between your LTV and blended CPA determines either your profitability or your rate of growth, and a high valuation depends on balancing these two factors. To the extent that you have good word-of-mouth, activation or retention, these factors tend to drive down your CPA or drive up your LTV, and so are nice bonuses. But because paid traffic is fundamentally a bidding war, it's important that you have a differentiated ability to monetize customers better than other people who are bidding for the same traffic. Otherwise, your CPA will get driven up close to or exceeding your LTV, and you can't grow profitably anymore. IMVU is in this business, as are Amazon, Netflix, Match.com, and CafePress.

  3. Sticky - Dave calls this "Drive Usage" and I think it's the model that causes the greatest confusion. Because Activation is a key term in the Viral business model equation, and Retention is a key term in the Paid business model, it's easy to mix up this type of business with the other two. For example, you often hear eBay or Neopets described as having viral growth, but I don't think that's correct. What those sites have in common (despite their very different audiences) is that something is causing their customers to become addicted to their product, and so no matter how they acquire a new customer, they tend to keep them. This has led to exponential growth. For eBay, this is caused by the incredible network effects of their business (so-called demand-side increasing returns and supply-side increasing returns). For Neopets, it's simply a side-effect of their game-like product design. Either way, you can use any marketing channel that's available to bring in new customers, including word of mouth, traditional advertising, SEO, SEM - wherever you can find prospects who are going to find your product addicting. But it's not really viral growth, even when it's exponential. Again, looking at eBay - most buyers and sellers would be 100% happy with eBay if it already had a critical mass of people to bid or create auctions. Although many eBay fans love to tell their friends about it, they really don't have a need to bring them on board. As far as they are concerned, that's eBay's job. That's why eBay advertises on search engines, and Facebook doesn't.
In my opinion, every startup needs to "pick a major" among these three drivers of growth. It's simply too hard to focus on more than one. It's a choice that has to be made at the level of strategy; notice how similar the tactics are between them. All three probably make attempts at word-of-mouth marketing - it's just that for Viral, it's life-or-death. Similarly, it probably makes sense for everyone to take advantage of SEO (hey, it's nearly-free traffic). But a Viral company who is focused there is probably going out of business.
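
The arithmetic behind the Viral and Paid engines can be sketched in a few lines of Python. The formula used for the viral coefficient (invites sent per customer times the fraction of invites that convert) is a common definition but an assumption here, since the post does not spell one out, and every number below is made up.

```python
def viral_coefficient(invites_per_customer: float, invite_conversion_rate: float) -> float:
    """Average number of new customers each existing customer brings in."""
    return invites_per_customer * invite_conversion_rate


def customers_after(cycles: int, seed: float, k: float) -> float:
    """Total customers after a number of invite cycles, starting from `seed`.

    Each cycle, only the newest cohort invites, and it brings in k times its size.
    """
    total = cohort = seed
    for _ in range(cycles):
        cohort *= k
        total += cohort
    return total


def paid_spread(ltv: float, blended_cpa: float) -> float:
    """Margin per acquired customer, available as profit or for reinvestment."""
    return ltv - blended_cpa


if __name__ == "__main__":
    k = viral_coefficient(invites_per_customer=4, invite_conversion_rate=0.3)
    print(f"k = {k:.2f}")                               # k = 1.20
    print(round(customers_after(3, seed=100, k=k)))     # 537
    print(round(customers_after(3, seed=100, k=0.5)))   # 188
    print(paid_spread(ltv=30.0, blended_cpa=22.0))      # 8.0
```

With k above 1.0 each cohort is larger than the one before it, so growth compounds; below 1.0 the cohorts shrink and the total levels off. For the Paid engine, the spread shrinking toward zero is the bidding-war outcome described above.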

The difficulty is exacerbated by the fact that these models also cut across business functions. Sometimes we have the attitude that the Product Development team is the one responsible for Activation and Retention (hey, a great product would do that naturally) or that the Marketing team is responsible for Revenue and Referral (hey, go get me some money or free customers already). In reality, the key metrics for your growth drivers have to be jointly owned. They cannot be delegated, unlike the minor metrics, which can easily be owned by one part of the business, or even outsourced. For example, it's always nice to have someone constantly optimizing your SEM accounts, driving down your CPA. They might even occasionally make "optimizations" that improve CPA but negatively impact LTV (or vice versa). But if you were using the Paid driver of growth, you just outsourced your heart while it was still beating. Oops!

One last thought. Beware the new hire who has "extensive experience" in startups or big companies - using a different growth driver. Make sure you test to see if they are truly open minded, because otherwise you risk them banging their head against a wall, trying to use the tactics that worked so well in their previous company. Be-double-ware of the new hire who has "years of industry experience in multiple companies" all in a different growth driver. It's the rare person who truly understands not just what worked in the past but why.

Thoughts on scientific product development

I enjoyed reading a post today from Laserlike (Mike Speiser), on Scientific product development.
By embracing a scientific approach to product development, not only will your business have a much higher probability of success, but it will also be a more fun and creative place to work. Nothing kills innovation like the fear of failure. And nothing leads to failure like a process that resembles astrology more than it does astronomy.
There are two concepts I want to delve into a little more. The first is his version of the Code/Data/Learn feedback loop.

He focuses on having a clear idea of the problem you are trying to solve, running experiments, and then: "Did the results of your test match what you expected? If not, kill the feature and start over." My experience is this kind of thinking can run into trouble. The two extremes our brains tend to swing to are either "I know it's right, just do it" or, if things don't go as planned, "screw it, just give up." The goal of iterative development is to give us guard rails so we don't veer off to either extreme. When an experiment doesn't go as planned, that's the time to learn. Why didn't reality conform to our expectations? Unpacking our assumptions usually leads to important insights that should be plowed into a next edition of the feature. You can't do science and be defensive at the same time. If the product people on your team think they have to get every experiment right on the first try, there's no chance you're going to iterate and learn. So don't just kill the feature - iterate. Only when you've tried everything you can think of, and you're not learning any more, is it time to bring out the axe. Keep split-testing, but keep this iron rule: if it doesn't change customer behavior, it's not a feature. Kill it.

A second idea is that less is more in product design:
A key tenet of the philosophy is that uncluttered products with fewer, better features are preferred to similar products with more features. I agree with the less is more product development approach, but for a different reason.

The reason I like less is more as an approach is that it allows for a more scientific approach to product development. By starting a new product off with as few features as possible (1?), you can be incredibly scientific. With 10 features in a single release, you may spend more time trying to figure out what is working and what isn’t working than it took to build the thing in the first place. As you incrementally experiment with your product, you can observe the impact of a particular feature one at a time and adjust accordingly.

This is a great articulation of the principle that working in small batches allows problems to become instantly localized. The smaller the unit of a release is, the more you can learn about it, because the number of variables that have changed is so few. It's worth doing this not just for features, but for any changes you make to your product. For example, trying to make server software more scalable. Our instinct is sometimes to do giant big-bang rewrites. These rarely work on the first try - usually small mistakes wind up blowing away all the benefit of the macro change. If we can find ways to break the change down into tiny pieces, we can find and revert the small mistakes as they are introduced.

When I've worked to get everyone to build in small batches, I would see this pattern in action. Lots of engineers are busy checking in and deploying their work. Someone has managed to convince themselves that they have to do their big architecture change in one fell swoop. So they work on their own branch for a while, and then try to check in their massive changeset. All day long, I keep noticing them "bouncing" their changes against production: constantly checking it in, realizing something's not right, and reverting. What's not right? First, the tests don't pass. Then it turns out there's some integration problem because their long-lived branch diverged too much. Then, during deployment, the cluster immune system rejects the change... etc. Meanwhile, a dozen other engineers are getting annoyed, because this constant bouncing is making it harder for them to work (they can't deploy changes while tests or the cluster are in red-alert mode). When we think that working solo, on a branch, in large batches is more efficient, this is the kind of waste we forget about. Sure, we kept that one engineer busy while they toiled away on their own, but did that optimize the whole team's efforts?

Working in a scientific and iterative way is not about cold, calculating facts. On the contrary - creativity being constantly challenged by reality is beautiful. This is a point practitioners of science are always trying to make. My experience is that this way of working is liberating. We don't have to sit around and argue about whose opinion is correct. We can just blitz through new, creative ideas, testing to see what's real. Now that is fun.

Thursday, September 18, 2008

Lo, my 5 subscribers, who are you?

It's not always fun being small. When you have an infinitesimal number of customers, it can be embarrassing. Some might look at my tiny "5 readers" badge and laugh. But as long as your ego can take it, there are huge advantages to having a small number of customers.

Most importantly, you can get to know those few customers in a way that people with zillions of customers can't. You can talk to them on the phone. You can provide personalized support. You can find out what it would take for them to adopt your product, and then follow up a week later and see if they did. Same with finding out what it would take to get them to recommend your product to a friend. You can even meet the friend.

For companies in the early-adopter phase, you can play "the earlyvangelist game" whenever a customer turns out to be too mainstream for your product. Pick a similar product that they do use, and ask them "who was the first person you know who started using [social networking, mobile phones, plasma TV, instant messaging...]? can I talk to them?" If your subject is willing to answer, you can keep going, following the chain of early-adoption back to someone who is likely to want to early-adopt you.

That level of depth can help you build a strong mental picture of the people behind the numbers. It's enormously helpful when you need to generate new ideas about what to do, or when you face a product problem you don't know how to solve.

(For example, we used to be baffled at IMVU by the significant minority of people who would download the software but never chat with anyone. It wasn't until we met a few of them in person that we realized that they were having plenty of fun dressing up their avatar and modeling clothes. They wanted to get their look just right before they showed it to anyone else - they would even pay money to do it. But all of our messaging and "helpful tutorials" were pushing them to chat way before they were ready. How annoying!)

And since I have a blog, I have a way to ask questions directly to you. If you have a minute, post your answers in a comment, or email me. Here's what I want to know:
  1. First of all, the NPS question: On a scale of 1-10 (where 10 is most likely), how likely is it that you would recommend this blog to a friend or colleague?
  2. How did you hear about it?
  3. What led you to become a subscriber, versus just reading an article and leaving like everybody else? (or, if you're not a subscriber, what would it take to convince you?)
  4. What do you hope to see here in the future?
Thanks, you loyal few. I am grateful for your time and feedback.

How to get distribution advantage on the iPhone

I have had the opportunity to meet a lot of iPhone-related companies lately. Many of them have really cool products shipping or about to be released, and I wholeheartedly agree with my friends at the iFund that the next generation of applications is going to be amazing.

I've also been playing around with the App Store. From a technical point of view, it's amazing. You just install app after app after app, and it just works. My home screen is a giant mess, because installing apps is just so much fun.

But from a customer experience point of view, I'm not yet sold. Figuring out which apps are going to be any good is almost impossible. Even with only a few months of development, third parties have crammed every single category in the store full of apps. Most of my time in the store is spent scrolling through endless lists. And what distinguishes a good app? I can't really tell. All I see is a name, an icon, a price, the developer's name, and a review star-rating. The reviews are all over the map. When I choose to read them, it seems totally random what I'll find. But even clicking through to see a screenshot and some reviews is incredibly time consuming, given the hundreds of apps in most categories. Most of the time, I have no idea if I'm going to like the app after I install it.

So how's a normal person to choose? I think this is a major challenge for companies that hope to build dominance in some category on the iPhone. Today, the fact that the store is open and has almost no barriers to entry is great for the companies I meet, because they can get their first versions in front of customers quickly, and start iterating fast. But if they are lucky enough to have success, the store is going to become a nightmare, because it will give all of their competitors easy access to their customers and an opportunity to compete with them on an even playing field. The app store is not set up to allow anyone to achieve a durable advantage.

Browsing the app store is an awful lot like shopping in a retail grocery store. You see row after row of tiny boxes, each vying for your attention. They can't present much information, unless you take the box off the shelf and look at it. They rely on impressions, branding and price to try and get you to do that. The store determines which products sit on which shelves, and which yours sits next to. Of course, for a few extra dollars, the right people can get their products on more shelves, or in premium locations, or in giant promotional stands.

Sound familiar? In a world where competition is based on brief looks in predefined categories, it's hard to just "build a better mousetrap" and hope for the best. This is what brand marketers and consumer packaged goods companies have been studying and refining for years: how to win the battle in your mind before you ever set foot in the store. Once you have come to think of Crest as the #1 toothpaste, and, more importantly, your toothpaste, it's unlikely you're going to pay attention to the other boxes on the shelf, no matter how shiny they are.

There are other models, in other distribution channels. On Facebook, viral distribution has proved decisive. Those companies who have learned to build apps that optimize the viral loop dominate in every category where they compete. Not many customers ever browse the app directory or search for specific apps - they don't have to, they find out about apps by being invited by a friend. If you sell an online service that solves a defined problem, you can compete in SEO or SEM. If your site is consistently ranked #1 for a given search term, you can make it very hard for someone else to compete for new customers. In other markets, he who controls the directory has the power, like Download.com in the world of Windows shareware.

Word of mouth is a powerful force multiplier in all of these models. If everyone I know is using a specific product, in most markets that's a heavy influence. And in some markets it's decisive, because of well-known network effects (as happened with Microsoft, eBay, and many others). But if your product category doesn't have strong network effects, word of mouth alone is not usually enough to fend off a competitor who also has a quality product.

So what model will prevail on the iPhone? So far, I don't see any apps that have much in the way of viral distribution. Do any apps really cause my friends to sign up, as a natural side-effect of my using the app? I haven't found any yet. And I don't see much searching for apps going on. Do most people know what kind of app they want? And how can they tell the best app for a given search? For example, I did a search for "taxi" in the app store. I got 4 results, 3 free, one for $0.99. I downloaded and tried all four of them (because I had time to kill) - and I'm still not sure which one was the best. Back when I was staring at the search screen, it really was a crapshoot.

So unless someone cracks the code on one of these other models, I think we may revert to the retail model, where good positioning and good branding will win. When I'm scrolling through the endless list of games in that category, the icon that I've come to associate with "that company that makes amazing iPhone games" is going to get a disproportionate share of my attention. Does that mean existing brands have the advantage? I'm not sure. For a lot of brands, their iPhone products will run into the line-extension trap (see The 22 Immutable Laws of Marketing). That looked to me like what's happening with EA's iPhone offerings. So there is an opportunity to build new brands with attributes like "the most amazing mobile apps" but I think building a company around that strategy means really thinking through how to do it. Just bringing a good app to market isn't going to be enough.

So for those who are thinking of starting a new company to build iPhone apps, here's the question I would be pondering. After I've built my first successful app, and all kinds of competitors have copied me and have similar apps right next to mine in the store, how will I continue to get new customers? How will new customers know that my apps are superior?

Tuesday, September 16, 2008

How to Usability Test your Site for Free

Noah Kagan has a great discussion of usability testing which can help get you over the "that's too hard" or "that's too expensive" fear.

At Facebook we never did testing or looked at analytics. At Mint, Aaron (CEO) was very very methodical and even flew in his dad who is a usability expert. We did surveys, user testing and psychological profiles. This was extremely useful in identifying the types of users we may have on the site and especially for seeing how people use the site. I never really did this before and was AMAZED how people use the site vs. what I expected. Most people know I am very practical or as my ex-gfs call it “cheap.” Anyways, here how our new start-up user tests.

His tips are both very practical and very effective - I've used craigslist, surveymonkey, and, yes, even cafes too. Usability testing is great for coming up with ideas about what to change in your product, but don't forget to split-test those ideas to make sure they work, too.

Read more at How to Usability Test your Site for Free | Noah Kagan's Okdork.com

Monday, September 15, 2008

The one line split-test, or how to A/B all the time

Split-testing is a core lean startup discipline, and it's one of those rare topics that comes up just as often in a technical context as in a business-oriented one when I'm talking to startups. In this post I hope to talk about how to do it well, in terms appropriate for both audiences.

First of all, why split-test? In my experience, the majority of changes we make to products have no effect at all on customer behavior. This can be hard news to accept, and it's one of the major reasons people avoid split-testing. Who among us really wants to find out that our hard work is for nothing? Yet building something nobody wants is the ultimate form of waste, and the only way to get better at avoiding it is to get regular feedback. Split-testing is the best way I know to get that feedback.

My approach to split-testing is to try to make it easy in two ways: incredibly easy for the implementers to create the tests and incredibly easy for everyone to understand the results. The goal is to have split-testing be a continuous part of our development process, so much so that it is considered a completely routine part of developing a new feature. In fact, I've seen this approach work so well that it would be considered weird and kind of silly for anyone to ship a new feature without subjecting it to a split-test. That's when this approach can pay huge dividends.

Let's start with the reporting side of the equation. We want a simple report format that anyone can understand, and that is generic enough that the same report can be used for many different tests. I usually use a "funnel report" that looks like this:

              Control        Hypothesis A   Hypothesis B
Registered    1000 (100%)    1000 (100%)    500 (100%)
Downloaded    650 (65%)      750 (75%)      200 (40%)
Chatted       350 (35%)      350 (35%)      100 (20%)
Purchased     100 (10%)      100 (10%)      25 (5%)

In this case, you could run the report for any time period. The report is set up to show you what happened to customers who registered in that period (a so-called cohort analysis). For each cohort, we can learn what percentage of them did each action we care about. This report is set up to tell you about new customers specifically. You can do this for any sequence of actions, not just ones relating to new customers.
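To make the cohort idea concrete, here is a minimal sketch of how such a funnel report could be computed. It is written in Python for brevity, and everything in it is an assumption for illustration: the stage names are borrowed from the funnel described in this post, and the input format (one record per registered customer, listing the hypothesis they saw and the stages they reached) is hypothetical rather than a description of any real system.

```python
from collections import defaultdict

# Funnel stages in order. These names are illustrative assumptions.
STAGES = ["registered", "downloaded", "chatted", "purchased"]


def funnel_report(customers):
    """Build a funnel report for one cohort.

    customers: iterable of (hypothesis, set_of_stages_reached) pairs, one
    per customer who registered in the period being reported on.

    Returns {hypothesis: [(stage, count, percent_of_registered), ...]}.
    """
    # Each hypothesis gets its own zeroed counter per stage.
    counts = defaultdict(lambda: dict.fromkeys(STAGES, 0))
    for hypothesis, reached in customers:
        for stage in STAGES:
            if stage in reached:
                counts[hypothesis][stage] += 1

    report = {}
    for hypothesis, by_stage in counts.items():
        base = by_stage["registered"]  # 100% of the cohort for this column
        report[hypothesis] = [
            (stage, n, round(100.0 * n / base, 1) if base else 0.0)
            for stage, n in by_stage.items()
        ]
    return report
```

Because every percentage is taken against the number who registered, the same function works for any time period and any experiment, which is what keeps the report generic.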

If you take a look at the dummy data above, you'll see that Hypothesis A is clearly better than Hypothesis B, because it beats out B in each stage of the funnel. But compared to control, it only beats it up through the "Chatted" stage. This kind of result is typical when you ship a redesign of some part of your product. The new design improved on the old one in several ways, but these improvements didn't translate all the way through the funnel. Usually, I think that means you've lost some good aspect of the old design. In other words, you're not done with your redesign yet. The designers might be telling you that the new design looks much better than the old one, and that's probably true. But it's worth conducting some more experiments to find a new design that beats the old one all the way through. In my previous job, this led us to confront the disappointing reality that sometimes customers actually prefer an uglier design to a pretty one. Without split-testing, your product tends to get prettier over time. With split-testing, it tends to get more effective.

One last note on reporting. Sometimes it makes sense to measure the micro-impact of a micro-change. For example, by making this button green, did more people click on it? But in my experience this is not useful most of the time. That green button was part of a customer flow, a series of actions you want customers to complete for some business reason. If it's part of a viral loop, it's probably trying to get them to invite more friends (on average). If it's part of an e-commerce site, it's probably trying to get them to buy more things. Whatever its purpose, try measuring it only at the level that you care about. Focus on the output metrics of that part of the product, and you make the problem a lot more clear. It's one of those situations where more data can impede learning.

I had the opportunity to pioneer this approach to funnel analysis at IMVU, where it became a core part of our customer development process. To promote this metrics discipline, we would present the full funnel to our board (and advisers) at the end of every development cycle. It was actually my co-founder Will Harvey who taught me to present this data in the simple format we've discussed in this post. And we were fortunate to have Steve Blank, the originator of customer development, on our board to keep us honest.

To make split-testing pervasive, it has to be incredibly easy. With an online service, we can make it as easy to do a split-test as to not do one. Whenever you are developing a new feature, or modifying an existing feature, you already have a split-test situation. You have the product as it will exist (in your mind), and the product as it exists already. The only change you have to get used to as you start to code in this style, is to wrap your changes in a simple one-line condition. Here's what the one-line split-test looks like in pseudocode:

if( setup_experiment(...) == "control" ) {
    // do it the old way
} else {
    // do it the new way
}

The call to setup_experiment has to do all of the work, which for a web application involves a sequence something like this:
  1. Check if this experiment exists. If not, make an entry in the experiments list that includes the hypotheses included in the parameters of this call.
  2. Check if the currently logged-in user is part of this experiment already. If she is, return the name of the hypothesis she was exposed to before.
  3. If the user is not part of this experiment yet, pick a hypothesis using the weightings passed in as parameters.
  4. Make a note of which hypothesis this user was exposed to. In the case of a registered user, this could be part of their permanent data. In the case of a not-yet-registered user, you could record it in their session state (and translate it to their permanent state when they do register).
  5. Return the name of the hypothesis chosen or assigned.
From the point of view of the caller of the function, they just pass in the name of the experiment and its various hypotheses. They don't have to worry about reporting, or assignment, or weighting, or, well, anything else. They just ask "which hypothesis should I show?" and get the answer back as a string. Here's what a more fleshed out example might look like in PHP:

$hypothesis = setup_experiment("new_design_test", // experiment name (illustrative placeholder)
    array(array("control", 50),
          array("design1", 50)));
if( $hypothesis == "control" ) {
    // do it the old way
} elseif( $hypothesis == "design1" ) {
    // do it the fancy new way
}

In this example, we have a simple even 50-50 split test between the way it was (called "control") and a new design (called "design1").
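For the curious, the five steps listed above can be sketched in a few lines. This is a rough illustration in Python, not the implementation we used: the in-memory dictionaries stand in for whatever persistent storage you have, and the explicit user_id parameter is an assumption (in a web application the function would more likely look up the currently logged-in user or session itself).

```python
import random

# In-memory stand-ins for persistent storage (assumptions for this sketch).
experiments = {}   # experiment name -> list of (hypothesis, weight)
assignments = {}   # (experiment name, user id) -> hypothesis name


def setup_experiment(name, hypotheses, user_id, rng=random):
    """Return the hypothesis this user should see, assigning one if needed.

    hypotheses is a list of (hypothesis_name, weight) pairs.
    """
    # 1. Register the experiment the first time it is seen.
    experiments.setdefault(name, list(hypotheses))

    # 2. A user who is already in the experiment keeps the same hypothesis.
    key = (name, user_id)
    if key in assignments:
        return assignments[key]

    # 3. Otherwise pick a hypothesis using the weights passed in.
    names = [h for h, _ in hypotheses]
    weights = [w for _, w in hypotheses]
    chosen = rng.choices(names, weights=weights, k=1)[0]

    # 4. Record the assignment so later calls are consistent.
    assignments[key] = chosen

    # 5. Return the name of the hypothesis chosen.
    return chosen
```

The part that matters is step 2: once a customer has seen one hypothesis, they keep seeing it, so the funnel report compares like with like.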

Now, it may be that these code examples have scared off our non-technical friends. But for those that persevere, I hope this will prove helpful as an example you can show to your technical team. Most of the time when I am talking to a mixed team with both technical and business backgrounds, the technical people start worrying that this approach will mean massive amounts of new work for them. But the discipline of split-testing should be just the opposite: a way to save massive amounts of time. (See Ideas. Code. Data. Implement. Measure. Learn for more on why this savings is so valuable)

Hypothesis testing vs hypothesis generation
I have sometimes opined that split-testing is the "gold standard" of customer feedback. This gets me into trouble, because it conjures up for some the idea that product development is simply a rote mechanical exercise of linear optimization. You just constantly test little micro-changes and follow a hill-climbing algorithm to build your product. This is not what I have in mind. Split-testing is ideal when you want to put your ideas to the test, to find out whether what you think is really what customers want. But where do those ideas come from in the first place? You need to make sure you don't get away from trying bold new things, using some combination of your vision and in-depth customer conversations to come up with the next idea to try. Split-testing doesn't have to be limited to micro-optimizations, either. You can use it to test out large changes as well as small. That's why it's important to keep the reporting focused on the macro statistics that you care about. Sometimes, small changes make a big difference. Other times, large changes make no difference at all. Split-testing can help you tell which is which.

Further reading
The best paper I have read on split-testing is "Practical Guide to Controlled Experiments on the Web: Listen to Your Customers not to the HiPPO" - it describes the techniques and rationale used for experiments at Amazon. One of the key lessons they emphasize is that, in the absence of data about what customers want, companies generally revert to the Highest Paid Person's Opinion (hence, HiPPO). But an even more important idea is that it's important to have the discipline to insist that any product change that doesn't change metrics in a positive direction should be reverted. Even if the change is "only neutral" and you really, really, really like it better, force yourself (and your team) to go back to the drawing board and try again. When you started working on that change, surely you had some idea in mind of what it would accomplish for your business. Check your assumptions: what went wrong? Why did customers like your change so much that they didn't change their behavior one iota?

Sunday, September 14, 2008

How to listen to customers, and not just the loud people

Frequency is more important than talking to the "right" customers, especially early on. You'll know when the person you're talking to is not a potential customer - they just won't understand what you're saying. In the very early days, the trick is to find anyone at all who can understand you when you are talking about your product.

In our first year at IMVU, we thought we were building a 3D avatar chat product. It was only when we asked random people we brought in for usability tests "who do you think of as our competitors?" that we learned different. As product people, we thought of competition in terms of features. So the natural comparison, we thought, would be to other 3D avatar based products, like The Sims and World of Warcraft. But the early customers all compared it to MySpace. This was 2004, and we had never even heard of MySpace, let alone had any understanding of social networking. It required hearing customers say it over and over again for us to take a serious look, and eventually to realize that social networking was core to our business.

Later, when the company was much larger, we had everyone on our engineering team agree to sit in on one usability test every month. It wasn't a huge time commitment, but it meant that every engineer was getting regular contact with an actual customer, which was invaluable. Most of the people building our product weren't themselves target customers. So there was simply no substitute for seeing actual customers with the product, live.

Today, when I talk to startup founders, the most common answer I get to the question "do you talk to your customers?" is something like "yes, I personally answer the customer support emails." That's certainly better than nothing, but it's not a good substitute for proactively reaching out. As Seth writes this week in Seth's Blog: Listening to the loud people, the most aggressive customers aren't necessarily the ones you want to hear from. For example, my experience with teenagers is that they are very reluctant to call or email asking for support, even when they have a severe problem. They just don't need another authority figure in their life.

Don't confuse passion with volume. The people who are the lifeblood of an early-stage startup are earlyvangelists. These are people who understand the vision of your company even before the product lives up to it, and, most importantly, will buy your product on that basis. In some situations, they are also the vocal minority who wants to reach out and get in your face when you do something wrong, but not always. If you're just getting negativity from someone, they are more likely an internet troll - not an earlyvangelist. (For more on earlyvangelists and why they are so important, see Steve Blank's The Four Steps to the Epiphany)

Here's the suggestion from Seth Godin I want to emphasize:
And here's one thing I'd do on a regular basis: Get a video camera or perhaps a copy machine and collect comments and feedback from the people who matter most to your business. Then show those comments to the boss and to your staff and to other customers. Do it regularly. The feedback you expose is the feedback you'll take to heart.
It's not enough to just look at the feedback that comes across your desk. You need to foster situations where you - and everyone you work with - are likely to see feedback that matters. Some techniques that I've found especially helpful:

  1. Build your own tracking survey, using a methodology like Net Promoter Score (NPS) to identify and get a regular check-up from promoters (and to screen out detractors). As a nice side-effect, NPS gives you a very reliable report card on customer satisfaction.
  2. Create a members-only forum where only qualified customers (perhaps, paying customers) can post. Let them connect with each other, but also with you. Treat these people as VIPs, and listen to what they have to say.
  3. Establish a customer advisory board. Hand pick a dozen customers who "get" your vision. The way I have run these in the past (when I was dealing with extremely passionate customers) is to have them periodically produce a "state of your company" progress report. I would insist that this report be included in the materials at every board meeting, uncensored and unvarnished.
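For the tracking survey in the first item, the arithmetic behind a Net Promoter Score is simple enough to sketch. The version below assumes the standard NPS convention: answers on a 0-10 scale, with 9-10 counted as promoters and 0-6 as detractors. The function name is made up for illustration.

```python
def net_promoter_score(ratings):
    """Compute NPS from integer answers on the standard 0-10 scale.

    NPS = percentage of promoters (9-10) minus percentage of detractors (0-6).
    The result ranges from -100.0 to 100.0.
    """
    ratings = list(ratings)
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)
```

The same split into promoters and detractors gives you the two lists you need: who to follow up with, and who to screen out.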

Saturday, September 13, 2008

SEM on five dollars a day

How do you build a new product with constant customer feedback while simultaneously staying under the radar? Trying to answer that question at IMVU led me to discover Google AdWords and the world of search engine marketing.

SEM is a simple idea. You declare how much someone clicking an advertisement is worth to you, and then the search engine does its best to get you as many clicks as it can at that price. There's a lot of complexity that I'm leaving out, naturally, because I want to stay focused on the simple idea at the center of SEM: that you can pay peanuts to have people come to your website. In a mature company with a mature product, the goal is to pay for lots of people to come to your website. But I think the genius of Google's innovation is that it allows you to pay for just a few people. Think of it as micropayments for beta testers.

My first AdWords campaign was limited to five dollars a day, and we were buying clicks at five cents a click. That yields 100 clicks a day, every day. Probably if I had been an experienced marketer, I would have known that tiny volume to be insignificant, and I would have been embarrassed. Luckily, I didn't know any better. 100 clicks might not sound like very much, but look at it this way: it meant that every single day, 100 human beings were coming to our website and being offered our product.

We expected that many of those people would buy our product (that's why we charged from day one). But anyone who had done direct response marketing before would have known better. At first, zero people bought anything from us. We tried tinkering with the payment system. Still zero. Maybe the problem was helping people find the payment system. Still nothing. We kept working our way backwards, until we realized that nobody was even making it past the first landing page. Oops. Slowly, over time, we optimized (or eliminated) each step in the process of becoming a customer by giving us money. And one day a remarkable thing happened: we started making more than five dollars a day in revenue.

In the process, we had also built a simple cohort-based analytics system. Its simplicity made it effective - everyone in the company could use and understand it. It just answered this question: for any given time range, for the 100% of people who registered in that period, what percentage of them downloaded? chatted once? chatted five times? bought something? That simple funnel analysis became our scorecard, and helped us refine our product with constant customer input.

After we were making more than five dollars a day, we could take the profits and reinvest them in raising the budget. As we would find more keywords to bid on (always bidding the minimum five cents per click), sometimes the increase in volume would drive our funnel percentages down. When that happened, we'd stop raising the budget, and keep optimizing the product. In this way, we gradually built out a more and more mainstream product.
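The budgeting rule described above fits in a few lines. This is only a sketch under stated assumptions: amounts are in whole cents, all of the day's profit is reinvested, and a single funnel percentage stands in for the whole funnel. The function names are hypothetical.

```python
def clicks_per_day(budget_cents, cost_per_click_cents):
    """Clicks a fixed daily budget buys at a fixed price per click."""
    return budget_cents // cost_per_click_cents


def next_budget(budget_cents, revenue_cents, funnel_rate, previous_funnel_rate):
    """Reinvest profit into the daily budget unless the funnel got worse.

    funnel_rate and previous_funnel_rate are the same funnel percentage
    (for example, the share of registrants who bought something) measured
    for the current and the previous cohort.
    """
    profit = revenue_cents - budget_cents
    if profit <= 0 or funnel_rate < previous_funnel_rate:
        # Hold the budget steady and keep optimizing the product.
        return budget_cents
    return budget_cents + profit
```

With the numbers from this post, clicks_per_day(500, 5) is the 100 clicks a day that five dollars buys at five cents a click.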

So how do you find those initial 100 clicks a day? We started by using features of our product as keywords, but this was of limited volume. Eventually Steve Blank, one of our early investors, suggested a technique that not only increased the number of clicks, but also started to use AdWords as a learning and discovery tool. We ran ad campaigns against every single product we could think of in an adjacent market space to ours. We tried obvious competitors as well as long-shots. Since we were only paying per click, it didn't cost us anything to cast a wide net. We would pretty much bid on any phrase that was "[name of competitive product] chat" and variations like that. And then we would use that simple analytics system I mentioned to monitor the conversion rates of customers from each campaign. Those rates gave us a map that told us a lot about our customers; insights that proved stable even when the company grew orders of magnitude bigger.

Only much later did I realize that this was an application of customer development to online marketing. It's now a technique I recommend for any web-based startup.

Friday, September 12, 2008

Andrew Chen: Growing renewable audiences

Growing renewable audiences (a talk at O’Reilly Alphatech Ventures) | Futuristic Play by @Andrew_Chen

In fact, I’ll describe press and blog traffic as “fool’s gold” because of the associated emotions that it brings. It’s easy to overestimate the impact of this kind of traffic because it just feels good to have your name and company featured. It strokes your ego. You might get a bunch of inbound emails from other press and partners, and all of these things can contribute to a feeling that you’re on your way to getting tons of traffic. Problem is, you inevitably become yesterday’s old news.

vs. sustainable:

Compare this to the renewable strategies, like viral marketing, SEO, widgets, and ads, which can scale into 10s of millions of users but are primarily centered around tough, non-user centric work. These are things that if you get right, you can optimize your way into a big, sustainable audience.

In an enterprise sales context, this is called a "repeatable and scalable sales process" - once you know how to do this, your company can graduate from early adopters and make an attempt at the mainstream.

Wednesday, September 10, 2008

Marc Prensky's Weblog: Cell Phones in Class

Marc's writing has been a huge influence on me in thinking through the consequences of the way the current generation of "digital natives" is educated. Are today's kids apathetic? He's argued that a kid who can't pay attention in class but can master the latest Halo in 14 straight hours doesn't have an attention problem, he or she has a boredom problem.

Among his ideas are that today's kids should be taught in class in a way that is relevant to their actual lives, which necessarily means allowing them to use technology. Why ban cell phones when we can take advantage of the fact that, in most classes, they are pervasive?

Apparently a school in Australia has taken his suggestion, and you can read all about it at Marc Prensky's Weblog: Cell Phones in Class -- A Huge Breakthrough.

A new version of the Joel Test (draft)

(This article is a draft - your comments are especially welcome as I think through these issues. Please leave feedback!)

I am convinced one of Joel Spolsky's lasting contributions to the field of managing software teams will turn out to be the Joel Test, a checklist of 12 essential practices that you could use to rate the effectiveness of a software product development team. He wrote it in 2000, and as far as I know has never updated it.

I have been thinking a lot about what a new version of this test would look like, given what I've seen work and not work in startups. Like many forms of progress, most of the items on the new test don't replace items from Joel's - they either supplement or extend the ideas on which the original is based.

Let's start with the original list:

  1. Do you use source control? This is still an essential practice, especially on the web. There was a time when "web content" was considered "not code" and therefore not routinely source controlled. But I have not seen that dysfunction in any of the startups I advise, so hopefully it's behind us. Joel mentions that "CVS is fine" and so is Subversion, its successor. I know plenty of people who prefer more advanced source control systems, but my belief is that many agile practices diminish the importance of advanced features like branching.
  2. Can you make a build in one step? You'd better. But if you want to practice rapid deployment, you need to be able to deploy that build in one step as well. If you want to do continuous deployment, you'd better be able to certify that build too, which brings us to...
  3. Do you make daily builds? Daily builds are giving way to true continuous integration, in which every checkin to the source control system is automatically run against the full battery of automated tests. At IMVU, our engineering team accumulated thousands upon thousands of tests, and we had a build cluster (using BuildBot) that ran them. We did our best to keep the runtime of the tests short (10 minutes was our target), and we always treated a failing test as a serious event (generally, you couldn't check in at all if a test was failing). For more on continuous deployment, see Just-in-time Scalability.
  4. Do you have a bug database? Joel's Painless Bug Tracking is still the gold standard.
  5. Do you fix bugs before writing code? Increasingly, we are paying better lip service to this idea. It's incredibly hard to do. Plus, as product development teams in lean startups become adept at learning-and-discovery (as opposed to just executing to spec), it's clear that some bugs shouldn't be fixed. See the discussion of defects later in this post for my thoughts on how to handle those.
  6. Do you have an up-to-date schedule? This and #7, "Do you have a spec?", are the parts of the Joel Test I think are most out-of-date. It's not that the idea behind them is wrong, but I think agile team-building practices make scheduling per se much less important. In many startup situations, ask yourself "Do I really need to accurately know when this project will be done?" When the answer is no, we can cancel all the effort that goes into building schedules and focus on making progress evident. Everyone will be able to see how much of the product is done vs undone, and see the finish line either coming closer or receding into the distance. When it's receding, we rescope. There are several ways to make progress evident - the Scrum team model is my current favorite.
  7. Do you have a spec? I think the new question needs to be "does the team have a clear objective?" If you have a true cross-functional team, empowered (a la Scrum) to do whatever it takes to succeed, it's likely they will converge on the result quickly. You can keep the team focused on customer-centric results, rather than conformance to spec. Now, all well-run teams have some form of spec that they use internally, and Joel's advice on how to craft that spec is still relevant. But increasingly we can move to a world where teams are chartered to accomplish results instead of tasked with creating work on spec.
  8. Do programmers have quiet working conditions? Joel is focused on the fact that in many environments, programmers are considered "just the hired help" akin to manual labor, and not treated properly. We always have to avoid that dysfunction - even the lean manufacturing greats realized that they couldn't afford to see their manual-labor workforce that way. I think we need to modify this question to "Do programmers have access to appropriate working conditions?" We want every knowledge worker to be able to retreat into a quiet haven whenever they need deep concentration. But it's not true that energized programmers primarily do solitary work; certainly that's not true of the great agile teams I've known. Instead, teams should have their own space, under their control, with the tools they need to do the job.
  9. Do you use the best tools money can buy? Joel said it: "Top notch development teams don't torture their programmers." Amen.
  10. Do you have testers? I think reality has changed here. To see why, take a look at Joel's Top Five (Wrong) Reasons You Don't Have Testers. Notice that none of those five reasons deals with TDD or automated testing, which have changed the game. Automated testing dramatically reduces the cost of certifying changes, because it removes all of the grunt work QA traditionally does in software. Imagine a world where your QA team never, ever worries about bug regressions. They just don't happen. All of their time is dedicated to finding novel reproduction paths for tricky issues. That's possible now, and it means that the historical ratio of QA to engineering is going to have to change (on the other hand, QA is now a lot more interesting of a job).
  11. Do new candidates write code during their interview? Completely necessary. I would add, though, a further question: Do new employees write code on their first day? At IMVU, our rule was that a new engineer needed to push code to production on their first day. Occasionally, it'd have to be their second day. But if it languished until the third day, something was seriously wrong. This is a test of many key practices: do you have a mentoring system? Is your build environment difficult to set up? Are you afraid someone might be able to break your product without your automated defenses knowing about it?
  12. Do you do hallway usability testing? I love Joel's approach to usability, and I still recommend his free online book on UI design. Some people interpret this to mean that you have to do your usability "right" the first time. I strongly disagree. Usability design is a highly iterative process, and the more customers who are involved (via in-person interview, split-test experiment, etc) the better.

Now let's take a look at some new questions:

Do you work in small batches? Just like in lean manufacturing, it's generally more efficient to drive down the batch size. I try to encourage engineers to check in anytime they have the software in a working state in their sandbox. This dramatically reduces the waste of integration risk. We rarely have code conflicts, since nobody gets out of sync for very long. And it's way easier to deploy small bits of code, since if something goes wrong, the problem is automatically localized and easy to revert.

Do you routinely split-test new features? I hope to write at a future date about how to build your application so that A/B tests are just as easy as not doing them.
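One way to make a split-test nearly as cheap as not running one is to assign each user to a variant deterministically, so no extra state has to be stored. The following is a minimal sketch of that idea, not a description of how IMVU did it; the function name, the experiment name, and the variant labels are all hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment, variants=("control", "treatment")):
    """Map a user to a variant of an experiment, deterministically.

    Hashing the experiment name together with the user id means the same
    user always lands in the same variant of a given experiment, while
    different experiments split the user base independently.
    (Hypothetical helper, for illustration only.)
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


# Usage: branch on the variant wherever the feature is rendered.
if assign_variant("user-42", "new-signup-flow") == "treatment":
    page = "new signup flow"
else:
    page = "old signup flow"
```

Because the assignment is a pure function of the user and the experiment, any server can compute it without coordination, and the analysis side can recompute it later from the same inputs.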

Do you practice Five Why's? Joel himself has written about this topic, in the context of doing root cause analysis to provide excellent quality of service without SLAs. I'm not aware of anyone using this tool as extensively as we did at IMVU, where it became the key technique we used to drive infrastructure and quality improvements. Instead of deciding upfront what might go wrong, we use what actually went wrong to teach us what prevention tactics we need. Our version of this was to insist that, for every level of the problem that the post-mortem analysis uncovered, we'd take at least one corrective action. So if an employee pushed code that broke the site, we'd ask: why didn't our cluster immune system catch that? Why didn't our automated tests catch it? Why couldn't the engineer see the problem in their sandbox? Why didn't they write better code? Why weren't they trained adequately? And then we'd make all five of those fixes.
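The rule "at least one corrective action per level" is easy to check mechanically once a post-mortem is written down as data. Here is a rough sketch under that assumption; the class, the function, and the corrective actions listed are invented for illustration, and only the five questions come from the example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Why:
    """One level of a Five Whys analysis (hypothetical structure)."""
    question: str
    corrective_actions: List[str] = field(default_factory=list)


def levels_missing_action(whys):
    """Return the questions that still have no corrective action."""
    return [w.question for w in whys if not w.corrective_actions]


# The corrective actions below are made up for the sake of the example.
post_mortem = [
    Why("Why didn't our cluster immune system catch that?",
        ["Add an alert on the metric that regressed"]),
    Why("Why didn't our automated tests catch it?",
        ["Write a test that reproduces the failure"]),
    Why("Why couldn't the engineer see the problem in their sandbox?",
        ["Make the sandbox data resemble production more closely"]),
    Why("Why didn't they write better code?", []),
    Why("Why weren't they trained adequately?",
        ["Pair the engineer with a mentor on the next change"]),
]

unresolved = levels_missing_action(post_mortem)
```

In this example `unresolved` contains the fourth question, which signals that the post-mortem is not finished until that level also gets a fix.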

Do you write tests before fixing bugs? If a bug is truly a defect, then it's something that we don't want to ever see again. Fixing the underlying problem in the code is nice, but we need to go further. We need to prevent that bug from ever recurring. Otherwise, the same blindspot that led us to create the bug in the first place is likely to allow it to happen again. This is the approach of test-driven-development (TDD). Even if you've developed for years without automated tests, this one practice is part of a remarkable feedback loop. As you write tests for the bugs you actually find and fix, you'll tend to spend far more time testing and refactoring the parts of the code that are slowing you down the most. As the code improves, you'll spend less time testing. Pretty soon, you'll have forgotten that pesky impulse to do a ground-up rewrite.
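As a small illustration of the habit, suppose a hypothetical `average_rating` function crashed on an empty list. The sketch below shows the order of work: first a test that reproduces the report, then the fix that makes it pass. The function and the bug are invented; only the practice comes from the text above.

```python
import unittest


def average_rating(ratings):
    """Hypothetical function that used to raise ZeroDivisionError on []."""
    if not ratings:
        # The fix, added only after the regression test below had been
        # written and observed to fail against the old code.
        return 0.0
    return sum(ratings) / len(ratings)


class AverageRatingRegressionTest(unittest.TestCase):
    def test_empty_list_does_not_crash(self):
        # Written first, to reproduce the bug exactly as it was reported.
        self.assertEqual(average_rating([]), 0.0)

    def test_existing_behavior_is_preserved(self):
        # Guards against the fix breaking the normal case.
        self.assertEqual(average_rating([4, 5]), 4.5)
```

Saved in a file, these tests can be run with `python -m unittest`. Once the test is in the suite, the bug cannot quietly come back, and the part of the code that produced it now has some coverage for the next change.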

Can you tell defects from polish? Bugs that slow you down are defects, and have to be fixed right away. However, bugs that are really problems with the experience design of your product should only be fixed if they are getting in the way of learning about customers. This is an incredibly hard distinction to understand, because we're so used to a model of product development teams as pure "execution to spec" machines. In that model, anything that the product owner/designer doesn't like is a bug, and Joel's right that we should always fix before moving on (else you pile up an infinite mound of debt). However, in the learning phase of a product's life, we're still trying to figure out what matters. If we deploy a half-done feature, and customers complain about some UI issues (or split-tests demonstrate them), we should refine and fix. But oftentimes, nobody cares. There are no customers for that feature, UI issues or no. In that case, you're better off throwing the code away, rather than fixing the UI. The hardest part is forcing yourself to make this decision binary: either continue to invest and polish or throw the code out. Don't leave it half-done and move on to new features; that's the fallacy Joel tried to warn us about in the first place.

Do your programmers understand the product they are building and how it relates to your company's strategy? How can they iterate and learn if they don't know what questions are being asked at the company's highest levels? At IMVU, we opened up our board meetings to the whole company, and invited all of our advisers to boot. Sometimes it put some serious heat on the management team, but it was well worth it because everyone walked out of that room feeling at a visceral level the challenges the company faced.

What other questions would you ask a brand-new startup about its product development practices? What answers would predict success?

Smarticus — 10 things you could be doing to your code right now

A great checklist of techniques and tools for making your development more agile, written from a Rails perspective. Of the techniques he mentioned, I think four are fundamental and critical for any lean startup:

TDD (or the even more politely named TATFT)
Continuous integration
Automate your deployments
Collect statistics

The tools to help you do these things are getting better and better every day, but don't confuse tools with process. Whatever state your code or team is in, you can always start going faster. Or to borrow from a military context (John Boyd) "people-ideas-hardware, in that order."

Seth Godin: How often should you publish?

Is it too self-referential to post a blog entry about someone else's blog entry about how often to write a blog entry? I dunno. But Seth Godin is a great writer, so I don't see why I can't crib from him whenever. His post is ostensibly about how often to release new work (whether you're a blog writer, movie star, software team...) but it's really about how to manage your effort between what he calls the frontlist and backlist. Here's my favorite part:

If you've got a team, part of the team should obsess about the backlist, honing it, editing it and promoting it, while the rest work to generate (as opposed to promote) the frontlist.

The opportunity isn't to give into temptation and figure out how to recklessly and expensively market the frontlist. It is to adopt a long and slow and ultimately profitable strategy of marketing your ever-growing backlist.

I see startups struggle with this all the time. Life is so easy in the days before the "launch" - you just focus on building and polishing those new features. But then what? You have customers, they are using your product, and you are trying to help them. But you're also trying to build and polish new features. And fix the ones from before. And polish them more. And maybe even learn something along the way.

My career has been full of "student body right" moments, where the whole team is suddenly forced to change direction. Often, it's just a reaction to a deficit of frontlist or backlist work. The leadership art is to balance the needs of the present with the needs of the future. Seth's post doesn't tell us how to do that, but he at least clues us in to this essential idea: that we have to make sure to do it at all. I wish he'd mentioned it a little sooner...

Update: bonus thought from Dharmesh Shah's 8 Startup Insights Inspired By The Mega Mind of Seth Godin:

6. Beware The Need for Critical Mass

I’m going to lead with a quote from Seth on this one: “Failing for small audiences is a loud cue that you will fail even bigger with big audiences.” Too often, startup founders talk about how they are pushing to get to “critical mass” and how “economies of scale” are going to kick in. That’s all fine and dandy. I get it. I’ve been in the software industry for a long time. But, is it absolutely, positively necessary to get to some “critical mass” before your business starts to make any sense at all? Is that mass all that critical? Does it have to be?

Can’t you make some kind of business out of something that looks a bit like this:

Mass You Have < The Magical Mass That Is Critical

Why do so many startups have these mythical, magical numbers (“once we hit 1,000,000 users, rainbows are going to spontaneously pop out of nowhere and magic fairy dust will fall out of the sky and make our financials look sooo much better”)?

Monday, September 8, 2008

Waves of technology platforms

I still remember the first time I switched to LAMP. I was building a new startup in 1999, and wanted to do it right. I had heard that all great companies built their applications on Oracle. So one of the first things we did was to hire an Oracle expert and get to work. Our paltry funding didn't allow us to buy expensive Sun or SGI boxes, but we had a pretty beefy Intel-based box from Dell. We had just heard about Oracle's support for Linux, so we installed Red Hat and tried to install Oracle. We tried for weeks. I don't really remember why it didn't work. It wouldn't boot. It would boot and crash. We tried to learn how to create a schema, or really anything at all, and mostly failed. It felt like software that grownups were supposed to use, and we didn't qualify.

Meanwhile, we were building our app in PHP, using a generic DB driver and MySQL, "for the time being." As the days turned into weeks, eventually the MySQL version of our app became the only version, and at some point we just decided to give up on Oracle, fire the expert, and see what happened.

That startup didn't turn out so well, but not for lack of technology. It cost us a few hundred thousand dollars to get our app up and running, but none of that was dollars spent on software licenses or professional services. We just had our app support a few tens of thousands of customers, and it did well. Some combination of the dot-com crash and a just terrible business plan prevented us from having to take our scalability problems to the next level.

Looking back, that was a special moment. I can't really imagine how much it cost our "grownup" counterparts at other dot-com startups to get their first app up and running. My guess is many millions more than we spent.

Our open source counterparts who did solve the scale problem had some serious hardware costs to deal with. So did I when I finally found myself building an app with real scalability needs, a few years later, but a combination of our just-in-time scalability technique and great open source scaling tools made it manageable.

We're in a new wave of platform evolution. Now you can build an app at true web scale without incurring any software costs, or any fixed hardware costs. You don't need to invent a new architecture, and you don't need to even build your architecture up-front. You can turn your entire application infrastructure investment into a pay-as-you-go variable cost, and bring new products to market at speeds an order of magnitude faster than just 10 years ago.

The lean startup

(Update April, 2011: In September, 2008 I wrote the following post in which I published my thoughts on the term "lean startup" for the first time. In the interests of preserving that history, I have left the original post unchanged and unedited. To learn more about the progress of the lean startup movement since 2008, click here.)

I've been thinking for some time about a term that could encapsulate trends that are changing the startup landscape. After some trial and error, I've settled on the Lean Startup. I like the term because of two connotations:
  1. Lean in the sense of low-burn. Of course, many startups are capital efficient and generally frugal. But by taking advantage of open source, agile software, and iterative development, lean startups can operate with much less waste.
  2. The lean startup is an application of Lean Thinking. I am heavily indebted to earlier theorists, and highly recommend the books Lean Thinking and Lean Software Development. I also owe a great debt to Kent Beck, whose Extreme Programming Explained: Embrace Change was my first introduction to this kind of thinking. (So far, I have found "lean startup" works better with the entrepreneurs I've talked to than "agile startup" or even "extreme startup.")
What are the characteristics of a lean startup? It is powered by three drivers, each of which is part of a major trend:
  1. The use of platforms enabled by open source and free software. At the application-stack layer, I see LAMP + Danga as the most common combination. In recent years, we've also got great new options all up and down the stack, in particular things like Amazon EC2 and RightScale (none of which would be possible without the free software movement).
  2. The application of agile development methodologies which dramatically reduce waste and unlock creativity in product development. (See Customer Development Engineering for my first stab at articulating the theory involved)
  3. Ferocious customer-centric rapid iteration, as exemplified by the Customer Development process.
My belief is that these lean startups will achieve dramatically lower development costs, faster time to market, and higher quality products in the years to come. Whether they also lead to dramatically higher returns for investors is a question I'm looking forward to studying.

Sunday, September 7, 2008

Customer Development Engineering

Yesterday, I had the opportunity to guest lecture again in Steve Blank's entrepreneurship class at the Berkeley-Columbia executive MBA program. In addition to presenting the IMVU case, we tried for the first time to do an overview of a software engineering methodology that integrates practices from agile software development with Steve's method of Customer Development.

I've attempted to embed the relevant slides below. The basic idea is to extend agile, which excels in situations where the problem is known but the solution is unknown, into areas of even greater uncertainty, such as your typical startup. In a startup, both the problem and solution are unknown, and the key to success is building an integrated team that includes product development in the feedback loop with customers.

As always, we had a great discussion with the students, which is helping refine how we talk about this. As usual, I'm heavy on the theory and not on the specifics, so I thought I'd share some additional thoughts that came up in the course of the classroom discussion.

  1. Can this methodology be used for startups that are not exclusively about software? We talk about taking advantage of the incredible agility offered by modern web architecture for extremely rapid deployment, etc. What about a hardware business with some long-lead-time components?

    To be clear, I have never run a business with a hardware component, so I really can't say for sure. But I am confident that many of these ideas still apply. One major theory that has influenced the way I think about processes comes from Lean Manufacturing, where they use these same techniques to build cars. If you can build cars with it, I'm pretty sure you can use it to add agility and flexibility to any product development process.

  2. What's an example of a situation where "a line of working code" is not a valid unit of progress?

    This is incredibly common in startups, because you often build features that nobody wants. We had lots of these examples at IMVU; my favorite is the literally thousands of lines of code we wrote for IM interoperability. This code worked pretty well, was under extensive test coverage, worked as specified, and was generally a masterpiece of amazing programming (if I do say so myself). Unfortunately, positioning our product as an "IM add-on" was a complete mistake. Customers found it confusing and it turned out to be at odds with our fundamental value proposition (which really requires an independent IM network). So we had to completely throw that code away, including all of its beautiful tests and specs. Talk about waste.

  3. There were a lot of questions about outsourcing/offshoring and startups. It seems many startups these days are under a lot of pressure to outsource their development organization to save costs. I haven't had to work this model under those conditions, so I can't say anything definitive. I do have faith that, whatever situation you find yourself in, you can always find ways to increase the speed of iteration. I don't see any reason why having the team offshore is any more of a liability in this area than, say, having to do this work while selling through a channel (and hence, not having direct access to customers). Still, I'm interested in exploring this - some of the companies I work with as an advisor are tackling this problem as we speak.

  4. Another question that always comes up when talking about customer development is whether VCs and other financial backers are embracing this way of building companies. Of course, my own personal experience has been pretty positive, so I think the answer is yes. Still, I thought I'd share this email that happened to arrive during class. Names have, of course, been changed to protect the innocent:

    Hope you're well; I thought I'd relay a recent experience and thank you.
    I've been talking to the folks at [a very good VC firm] about helping them with a new venture ... Anyway, a partner was probing me about what I knew about low-burn marketing tactics, and I mentioned a book I read called "Four Steps to the…"
    It made me a HUGE hit, with the partner explaining that they "don't ramp companies like they used to, and have very little interest in marketing folks that don't know how to build companies in this new way."

Anyway, thanks to Steve and all of his students - it was a fun and thought-provoking experience.