Sunday, December 7, 2008

Getting started with split-testing

One of the startup founders I work with asked me a smart question recently, and I thought I'd share it. Unlike most of the people who've endured my one-line split-testing talk, this team has taken it to heart. They're getting started creating their first A/B tests, and asked "Should we split-test EVERYTHING?" In other words, how do you know what to split-test, and what to just ship as-is? After all, isn't it a form of waste to split-test something like SSL certs that you know you have to do anyway?

I love questions like this, because there is absolutely no right answer. Split-testing, like almost everything in a lean startup, requires judgment. It's an art, not a science. When I was just starting out with practices like split-testing, I too sought out hard and fast rules. (You can read about some of the pitfalls I ran into with split-testing in "When NOT to listen to your users; when NOT to rely on split-tests"). That said, I think it's important to get your split-testing initiative off to a good start, and that means being selective about what features you tackle with it.

The goal in the early days of split-testing is to produce unequivocal results. It's a form of waste to generate reports that people don't understand, and it impedes learning if there is a lot of disagreement about the facts. As you get better at it, you can start to apply it to pretty much everything.
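One way to keep results unequivocal is to agree up front on a simple statistical bar before anyone argues about the numbers. As a minimal sketch (not from the post; the function name and sample figures are my own), a two-proportion z-test is a common way to check whether a difference in conversion rates is large enough to take seriously:

```python
import math

def z_score(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: how unequivocal is the difference
    between variant A's and variant B's conversion rates?"""
    p_a = conv_a / n_a          # observed conversion rate, variant A
    p_b = conv_b / n_b          # observed conversion rate, variant B
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under "no difference"
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Hypothetical numbers: 120/1000 vs 160/1000 conversions.
# |z| >= 1.96 corresponds roughly to 95% confidence.
z = z_score(conv_a=120, n_a=1000, conv_b=160, n_b=1000)
print(abs(z) >= 1.96)
```

If the team agrees on the threshold before the test launches, there's much less room for after-the-fact disagreement about the facts.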

In the meantime, here are three guidelines to get you started:
  1. Start simple. Teams that have been split-testing for a long time get pretty good at constructing tests for complex features, but this is a learned skill. In the short term, tackling something too complex is more likely to lead to a lot of wasted time arguing about the validity of the test. A good place to start is to try moving UI elements around. My favorite is to rearrange the steps of your registration process for new customers. That almost always has an effect, and is usually pretty easy to change.

  2. Make a firm prediction. Later, you'll want to use split-tests as exploratory probes, trying things you truly don't understand well. Don't start there. It's too easy to engage in after-the-fact rationalization when you don't go in with a strong opinion about what's going to happen. Split-testing is most powerful when it causes you to make your assumptions explicit, and then challenge them. So, before you launch the test, write down your belief about what's going to happen. Try to be specific; it's OK to be wrong. This can work for a change you are sure is going to have no effect, too, like changing the color of a button or some minor wording. Either way, make sure you can be wrong. If you're a founder or top-level executive, have the courage to be wrong in a very public way. You send the signal that learning always trumps opinion, even for you.

  3. Don't give up. If the first test shows that your feature has no effect, avoid these two common extreme reactions: abandoning the feature altogether, or abandoning split-testing forever. Set the expectation ahead of time that you will probably have to iterate the feature a few times before you know if it's any good. Use the data you collect from each test to inform the next iteration. If you still don't get any effect after a few tries, then you can safely conclude that the feature isn't worth further investment.
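Iterating on a feature across several tests only works if each user keeps seeing the same variant for the duration of a test. A minimal sketch of deterministic assignment (my own illustration; the function and experiment names are hypothetical, not from the post) hashes the user and experiment together so the split is stable and repeatable:

```python
import hashlib

def variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically assign a user to a variant so the same
    user always sees the same version for a given experiment."""
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]

# The same user always lands in the same bucket for this experiment:
print(variant("user-42", "registration-flow-v2"))
```

Because assignment depends only on the user and experiment IDs, no per-user state needs to be stored, and a new experiment name reshuffles the buckets for the next iteration.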

Most importantly, have fun with split-testing. Each experiment is like a little mystery, and if you can get into a mindset of open-mindedness about the answer, the answers will continually surprise and amaze you. Once you get the hang of it, I promise you'll learn a great deal.

2 comments:

  1. A few days ago I open-sourced a framework for split-testing, A/B testing or continuous optimization.

    http://github.com/gregdingle/genetify/wikis

    I'd love to hear what anybody thinks.

  2. I'd be interested to hear your thoughts on split testing SSL certificates and the effect of domain names.

    For example, does an Extended Validation (EV) SSL certificate gain higher conversions than a standard SSL? Does having an expensive Verisign branded SSL and site logo gain higher conversion than the low-cost GoDaddy equivalent? Does using a subdomain for SSL areas gain better conversion results (https://domain.com against https://secure.domain.com, etc.)?

    This would make for a good discussion topic, hopefully you'll catch my post and have as much interest.

    Thanks
