Monday, June 15, 2009

Why Continuous Deployment?

Of all the tactics I have advocated as part of the lean startup, none has provoked as many extreme reactions as continuous deployment, a process that allows companies to release software in minutes instead of days, weeks, or months. My previous startup, IMVU, has used this process to deploy new code an average of fifty times a day. This has stirred up some controversy, with some claiming that this rapid release process contributes to low-quality software or prevents the company from innovating. If we accept the verdict of customers instead of pundits, I think these claims are easy to dismiss. Far more common, and far more difficult, is the range of questions from people who simply wonder whether it's possible to apply continuous deployment to their business, industry, or team.

The particulars of IMVU’s history give rise to a lot of these concerns. As a consumer internet company with millions of customers, it may seem to have little relevance for an enterprise software company with only a handful of potential customers, or for a computer security company whose customers demand a rigorous audit before accepting a new release. I think these objections really miss the point of continuous deployment, because they focus on the specific implementations instead of general principles. So, while most of the writing on continuous deployment so far focuses on the how of it, I want to focus today on the why. (If you're looking for resources on getting started, see "Continuous deployment in 5 easy steps.")

The goal of continuous deployment is to help development teams drive waste out of their process by simultaneously reducing the batch size and increasing the tempo of their work. This makes it possible for teams to get – and stay – in a state of flow for sustained periods, which in turn makes it much easier to innovate, experiment, and achieve sustained productivity. And it nicely complements other continuous improvement systems, such as Five Whys.

One large source of waste in development is “double-checking.” For example, imagine a team operating in a traditional waterfall development system, without continuous deployment, test-driven development, or continuous integration. When a developer wants to check in code, it is a scary moment. He or she has a choice: check in now, or double-check to make sure everything still works and looks good. Both options have some attraction. If they check in now, they can claim the rewards of being done sooner. On the other hand, if they cause a problem, their previous speed will be counted against them. Why didn't they spend just another five minutes making sure they didn't cause that problem? In practice, how developers respond to this dilemma is determined by their incentives, which are driven by the culture of their team. How severely is failure punished? Who will ultimately bear the cost of their mistakes? How important are schedules? Does the team value finishing early?

But the thing to notice in this situation is that there is no right answer. People who agonize over the choice reap the worst of both worlds. As a result, developers tend toward two extremes: those who believe in getting things done as fast as possible, and those who believe that work should be carefully checked. Any intermediate position is untenable over the long term. When things go wrong, any nuanced explanation of the trade-offs involved is going to sound unsatisfying. After all, you could have acted a little sooner or a little more carefully – if only you’d known what the problem was going to be in advance. Viewed through the lens of hindsight, most of those judgments look bad. An extreme position, on the other hand, is much easier to defend. Both extremes have built-in excuses: “sure, there were a few bugs, but I consistently over-deliver on an intense schedule, and it’s well worth it” or “I know you wanted this done sooner, but you know I only ever deliver when it’s absolutely ready, and it’s well worth it.”

These two extreme positions lead to factional strife in development teams, which is extremely unpleasant. Managers start to take note of who’s in which faction, and then assign projects accordingly. Got a crazy last-minute feature? Get the Cowboys to take care of it – and then let the Quality Defenders clean it up in the next release. Both sides start to think of their point of view in moralistic terms: “those guys don’t see the economic value of fast action, they only care about their precious architecture diagrams” or “those guys are sloppy and have no professional pride.” Having been called upon to mediate these disagreements many times in my career, I can attest to just how wasteful they are.

However, they are completely logical outgrowths of a large-batch-size development process that forces developers to make trade-offs between time and quality, using the old “time-quality-money, pick two” fallacy. Because feedback is slow in coming, the damage caused by a mistake is felt long after the decisions that caused it were made, which makes learning difficult. Because everyone gets ready to integrate with the release batch around the same time (there being no incentive to integrate early), conflicts are resolved under extreme time pressure. Features are chronically on the bubble, about to get deferred to the next release. But when they do get deferred, they tend to have their scope increased (“after all, we have a whole release cycle, and it’s almost done…”), which leads to yet another time crunch, and so on. And, of course, the code rarely performs in production the way it does in the testing or staging environment, which leads to a series of hot-fixes immediately following each release. These come at the expense of the next release batch, meaning that each release cycle starts off behind.

Many times when I interview a development team caught in the pincers of this situation, they want my help "fixing people." Thanks to a phenomenon called the Fundamental Attribution Error in psychology, humans tend to become convinced that other people’s behavior is due to their fundamental attributes, like their character, ethics, or morality – even while we excuse our own actions as being influenced by circumstances. So developers stuck in this world tend to think the other developers on their team are either, deep in their souls, plodding pedants or sloppy coders. Neither is true – they just have their incentives all messed up.

You can’t change the underlying incentives of this situation by getting better at any one activity. Better release planning, estimating, architecting, or integrating will only mitigate the symptoms. The only traditional technique for solving this problem is to add massive queues in the form of schedule padding, extra time for integration, code freezes, and the like. In fact, most organizations don’t realize just how much of this padding is already built into the estimates that individual developers learn to generate. But padding doesn’t help, because it slows down the whole process. And as any development team will tell you – time is always short. In fact, excess time pressure is exactly why they think they have these problems in the first place.

So we need to find solutions that operate at the systems level to break teams out of this pincer action. The agile software movement has made numerous contributions: continuous integration, which helps accelerate feedback about defects; story cards and kanban, which reduce batch size; daily stand-ups, which increase tempo. Continuous deployment is another such technique, one with a unique power to change development team dynamics for the better.

Why does it work?

First, continuous deployment separates two different definitions of the term “release.” One is used by engineers to refer to the process of getting code fully integrated into production. The other is used by marketing to refer to what customers see. In traditional batch-and-queue development, these two concepts are linked: all customers see the new software as soon as it’s deployed. This requires that all of the testing of the release happen before it is deployed to production, in special staging or testing environments, and it leaves the release vulnerable to unanticipated problems during the window of time after the code is written but before it’s running in production. On top of that, conflating the marketing release with the technical release dramatically increases the coordination overhead required to ship anything.

Under continuous deployment, as soon as code is written, it’s on its way to production. That means we are often deploying just 1% of a feature – long before customers would want to see it. In fact, most of the work involved in a new feature is not the user-visible parts of the feature itself. Instead, it’s the millions of tiny touch points that integrate the feature with all the other features that were built before. Think of the dozens of little API changes that are required when we want to pass new values through the system. These changes are generally supposed to be “side effect free,” meaning they don’t affect the behavior of the system at the point of insertion – emphasis on supposed. In fact, many bugs are caused by unusual or unnoticed side effects of these deep changes. The same is true of small changes that conflict only with configuration parameters in the production environment. It’s much better to get this feedback as soon as possible, which continuous deployment offers.
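
To see what deploying just 1% of a feature – without showing it to customers – can look like, here is a minimal sketch of gating the new code path behind a flag. The flag store, the names, and the percentage-rollout logic below are purely illustrative, not IMVU’s actual implementation:

    import hashlib

    # Hypothetical flag store; in practice this would live in a config system
    # or database the team already operates.
    FEATURE_FLAGS = {
        "new_checkout_flow": {"enabled": False, "percent_rollout": 0},
    }

    def feature_enabled(flag_name, user_id):
        """Return True if this user should see the feature."""
        flag = FEATURE_FLAGS.get(flag_name)
        if flag is None:
            return False
        if flag["enabled"]:
            return True
        # Optional gradual rollout: hash the user into a stable 0-99 bucket.
        digest = hashlib.sha1(("%s:%s" % (flag_name, user_id)).encode()).hexdigest()
        return int(digest, 16) % 100 < flag["percent_rollout"]

    def render_checkout(user_id):
        # The partially built feature is already deployed to production, but the
        # flag keeps it invisible until the marketing release is ready.
        if feature_enabled("new_checkout_flow", user_id):
            return "new checkout page"   # hypothetical new code path
        return "existing checkout page"  # current behavior, unchanged

Flipping the flag (or ramping up the percentage) then becomes the marketing release; the engineering release already happened, in small, low-risk pieces.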

Continuous deployment also acts as a speed regulator. Every time the deployment process encounters a problem, a human being needs to get involved to diagnose it. During this time, it’s intentionally impossible for anyone else to deploy. When teams are ready to deploy but the process is locked, they become immediately available to help diagnose and fix the deployment problem (the alternative – that they continue to generate, but not deploy, new code – just serves to increase batch sizes to everyone’s detriment).

This speed regulation is a tricky adjustment for teams that are accustomed to measuring their progress via individual efficiency. In such a system, the primary goal of each engineer is to stay busy, spending as close to 100% of his or her time coding as possible. Unfortunately, this view ignores the overall throughput of the team. Even if you don’t adopt a radical definition of progress, like the “validated learning about customers” that I advocate, it’s still sub-optimal to keep everyone busy. When you’re in the midst of integration problems, any code that someone is writing is likely to have to be revised as a result of conflicts. The same goes for configuration mismatches or multiple teams stepping on each other’s toes. In such circumstances, it’s much better for overall productivity for people to stop coding and start talking. Once they figure out how to coordinate their actions so that the work they are doing doesn’t have to be reworked, it’s productive to start coding again.
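
Here is a rough sketch of that regulator, assuming a hypothetical lock file and deploy script; a real pipeline would use whatever locking or queueing mechanism its deploy tooling already provides:

    import os
    import subprocess
    import sys

    LOCK_FILE = "/tmp/deploys_halted"  # hypothetical lock location

    def deploy(revision):
        if os.path.exists(LOCK_FILE):
            # The line is stopped: the productive move is to help diagnose the
            # failure, not to pile up more undeployed code and grow the batch.
            sys.exit("Deploys halted: " + open(LOCK_FILE).read())
        result = subprocess.run(["./push_to_production.sh", revision])  # placeholder script
        if result.returncode != 0:
            with open(LOCK_FILE, "w") as f:
                f.write("Deploy of %s failed; investigating.\n" % revision)
            sys.exit("Deploy failed; pipeline locked until a human diagnoses the problem.")

    if __name__ == "__main__":
        deploy(sys.argv[1] if len(sys.argv) > 1 else "HEAD")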

Returning to our development team divided into Cowboy and Quality factions, let’s take a look at how continuous deployment can change the calculus of their situation. For one, continuous deployment fosters learning and professional development – on both sides of the divide. Instead of having to argue with each other about the right way to code, each individual has an opportunity to learn directly from the production environment. This is the meaning of the axiom “let your defects be your teacher.”

If an engineer has a tendency to ship too soon, they will find themselves grappling with the cluster immune system, the continuous integration server, and the Five Whys master more often. These encounters, far from being the high-stakes arguments inherent in traditional teams, are low-risk, mostly private or small-group affairs. Because the feedback is rapid, Cowboys start to learn what kinds of testing, preparation, and checking really do let them work faster. They learn the key truth that there is such a thing as “too fast” – many quality problems actually slow you down.
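
For readers unfamiliar with the term, the cluster immune system is the part of the pipeline that watches production health after each deploy and automatically reverts anything that makes it worse. A minimal sketch, with made-up metric names, thresholds, and a placeholder rollback command standing in for a real monitoring integration:

    import subprocess

    # Maximum tolerated relative drop per metric before we presume the deploy is bad.
    THRESHOLDS = {
        "signups_per_minute": 0.10,
        "payments_per_minute": 0.10,
    }

    def fetch_metric(name):
        """Placeholder: read the current value from your monitoring system."""
        raise NotImplementedError("wire this up to real monitoring")

    def deploy_with_immune_system(revision):
        baseline = {name: fetch_metric(name) for name in THRESHOLDS}
        subprocess.run(["./push_to_production.sh", revision], check=True)  # placeholder
        for name, max_drop in THRESHOLDS.items():
            before, after = baseline[name], fetch_metric(name)
            if before > 0 and (before - after) / before > max_drop:
                # Revert automatically and stop the line; a human (and a Five
                # Whys session) takes it from here.
                subprocess.run(["./push_to_production.sh", "previous-good"], check=True)
                raise SystemExit("%s reverted: %s regressed after deploy" % (revision, name))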

Engineers who have a tendency to wait too long before shipping have lessons to learn, too. For one, the larger the batch size of their work, the harder it will be to get it integrated. At IMVU, we would occasionally hire someone from a more traditional organization who had a hard time letting go of their “best practices” and habits. Sometimes they’d advocate doing their work on a separate branch and only integrating at the end. Although I’d always do my best to convince them otherwise, if they were insistent I would encourage them to give it a try. Inevitably, a week or two later, I’d enjoy the spectacle of watching them engage in something I called “code bouncing.” It’s like throwing a rubber ball against the wall. In a code bounce, someone tries to check in a huge batch. First they hit integration conflicts, which require talking to various people on the team to resolve properly. Of course, while they are resolving those conflicts, new changes are being checked in, so new conflicts appear. This cycle repeats until they either catch up to all the conflicts or ask the rest of the team for a general check-in freeze. Then the fun part begins. Getting a large batch through the continuous integration server, incremental deploy system, and real-time monitoring system almost never works on the first try, so the large batch gets reverted. While the problems are being fixed, more changes are being checked in. Unless we freeze the work of the whole team, this can go on for days. But if we do impose a general check-in freeze, then we’re driving up the batch size of everyone else – which will lead to future episodes of code bouncing. In my experience, just one or two episodes are enough to cure anyone of the desire to work in large batches.

Because continuous deployment encourages learning, teams that practice it are able to get faster over time. That’s because each individual’s incentives are aligned with the goals of the whole team. Each person works to drive down waste in their own work, and this true efficiency gain more than offsets the incremental overhead of having to build and maintain the infrastructure required to do continuous deployment. In fact, if you practice Five Whys too, you can build all of this infrastructure in a completely incremental fashion. It’s really a lot of fun.

One last benefit: morale. At a recent talk, an audience member asked me about the impact of continuous deployment on morale. This manager was worried that moving their engineers to a more rapid release cycle would stress them out, making them feel like they were always firefighting and releasing, and never had time for “real work.” As luck would have it, one of IMVU’s engineers happened to be in the audience at the time, and they provided a better answer than I ever could. They explained that by reducing the overhead of doing a release, each engineer gets to work on their own release schedule. That means that as soon as they are ready to deploy, they can. So even if it’s midnight, if your feature is ready to go, you can check in, deploy, and start talking to customers about it right away. No extra approvals, meetings, or coordination required. Just you, your code, and your customers. It’s pretty satisfying.

21 comments:

  1. Nice post. Batch size and delay explain so much. My last job was as a build and release manager. I'd be constantly redefining "done" for developers. What amazed me was the waste: changes queued for over a year, meetings to decide if a change was safe to release, and changes that would still fail in production. This approach gives you the feedback with less waste - which was the issue for the business. They were trying to do things while we did theatre.

  2. Amen!

    When people first have a problem, it seems so obvious to them that the thing to do is to slow down and add layers of manual approval. When they continue to have problems, they double down, making things even worse.

    Even if short cycles and small batch sizes were harder, I think the value of faster product feedback would still outweigh that. But once a team picks up the necessary supporting skills and attitudes, small batches are so much easier!

  3. Eric,

    I love reading your posts, you make strong arguments and provide clear proof points.

    One thing you seem to be treating lightly is how to use continuous development in business systems. For companies that build CRM software, marketing software, HR systems, etc., where thousands of users repeatedly use the same features, isn't training and process change a major issue? Corporate customers seem to prefer regular but distinct releases. I'd like to hear from you more about how you propose dealing with these issues, or from anyone else who has hands-on experience.

    Thanks,
    Steve

  4. Steve et al -

    I agree, most clients' processes have to change to accommodate the implications of continuous deployment, or the educators and leaders in client organizations will have a hard time staying in sync.

    The processes in question are tied to communication and learning, and they are more tied to human interaction than to interaction with software or curriculum.

    Processes that, in the Industrial Age, were tightly scripted, with predictable outcomes, must, in the Networked World, become adaptable and contextual, and focus on getting measurable (often surprising!) results rather than specific scripted outcomes.

    Thanks for the post, Eric. Very well-thought-through. (and thanks to Dmitry Shapiro for calling it to my attention.)

  5. If the true goal of continuous deployment is to catch bugs early by continually examining changes in small batches, I wonder if you are doing it a disservice by insisting on the deployment of those batches all the way through to the live production site?

    While it's tantalizing from a developer perspective... you are going to run head-on into business, marketing, and product management interests if you insist on automatically deploying to production as soon as a build has passed automated tests (no matter how broad your test coverage is).

    Think about financial services or e-commerce companies. There are a whole range of valid reasons why non-developers would want to dictate the production release schedule (Seasonal/timing issues, marketing, fulfillment concerns, documentation/training, revenue controls, legal/regulatory, etc.). Additionally, those same people are going to be put off by the fact that you seemingly advocate subjecting their customers to what could potentially be a sub-par experience (even if it's just 1%). This is why they gladly pay top dollar for a QA group that is supposed to be that "1%" that finds the bugs first (now, how good a job they actually do is a whole different subject!).

    In fighting to have developers checking in code dictate production releases, you risk looking unreasonable and diluting a powerful message that everyone should agree on: "small batches of changes that are automatically and continuously tested".

  6. Very well stated! We use continuous integration, testing and deployment. We wouldn't be where we are today without it ... and we aren't even close to where we want to be. You've articulated things well.

  7. tl dr pc (probably crap)

  8. great post.

    Can you elaborate on the flow of working with the version control system?

    I work with git, and for every new feature/bug fix/spike I use a branch and try to merge it back as quickly as possible.

    Is it different on your team?

  9. "Think about financial services or e-commerce companies. There are a whole range of valid reasons why non-developers would want to dictate the production release schedule (Seasonal/timing issues, marketing, fulfillment concerns, documentation/training, revenue controls, legal/regulatory, etc.)."

    Actually, this process works really well in a financial company. It lets the customer and development team spot problems with calculations almost immediately. Also, the customer is able to receive the small features they are constantly asking for almost daily, and the larger features show up on a weekly basis.

  10. I've been selling this idea internally for the last couple of years as well.

    One of the distinctions that has helped at the executive level is to stress that you're reducing the COST of failure, rather than its PROBABILITY. If the cost of each failure is dirt cheap, then you can afford a lot more failures in the course of learning.

  11. Many of the comments bring up important points about continuous deployment's applicability to the business systems used daily by large organizations. One major challenge lies in training.

    If software companies can convince client executives that continuous deployment won't expand the training budget, then implementation will be much more acceptable. As Bonifer mentions, the key is to enhance collaboration among all parties involved. As collaboration starts to foster learning over time, the software company will be able to use its acquired knowledge to offer solutions to alleviate potential training concerns.

  12. This is extremely inspiring. I've been accused of being the Cowboy and also of being the Quality Defender, both while I've attempted to stay on message. It really appears that continuous deployment is the way to find that right balance and have it agreed upon by the whole team.

  13. Besides training, how does documentation/Help keep up?

    Also, if your software goes into even one other language, how can translation (localization) keep up?

    It seems you would need documentation writers and translators to be on the same schedule as the developers (50 times a day). Since most translation is outsourced, can somebody elaborate on how this has been handled in their teams?

  14. Great post Eric!

    Philosophically I totally agree with you. I would bet almost every software team could benefit from a slightly smaller batch size.

    How small is the question, which gets into the How of it. I believe you also need to think about the Who... you know, "customers define quality". If you're releasing CRM software (like one commenter noted) that requires a lot of training for a large workforce, then 50 changes a day might not work. Some customers might not like Google Docs changing every day. Part of putting this into action is requiring your team to have a very deep understanding of their customers.

  15. Would you buy a car that constantly had to go back to the dealer to be improved? I should think not. Why do software developers think this is acceptable?

  16. Thanks for this. We've got a standard iterative process with many weeks between deploys; some of these ideas may help. Unfortunately, in modern webapps there are many tests that are very hard to automate, so I'd love to hear more ideas on UI testing.


    FYI: your "cluster immune system" link seems to be pointing to the wrong place; probably should point here: http://startuplessonslearned.blogspot.com/2008/11/five-whys.html

  17. Eric, continuous deployment has its place in projects where the risk of not following change control policies is not a big deal. As long as you can trust that your job will be secure when bad changes impact customers or fail audits, it works.

    I can appreciate the benefits of a fast develop-build-test cycle improving quality, team satisfaction, and productivity, but automated deployment will take some getting used to.

  18. The most irresponsible idea I have heard of in years. Glad you don't develop software for space shuttles.

  19. > The most irresponsible idea I have heard of in years. Glad you don't
    > develop software for space shuttles.

    Exactly! I don't develop software for space shuttles, and (if I did) I wouldn't use the same process I use in a startup. It's equally insane to use a "space shuttle" style process in a context of extreme uncertainty.

  20. Great post ;-) Continuous deployment helps reduce software inventory, which has numerous positive effects on flow: less rework (and therefore less context switching), lower costs as code does not become stale in the pipeline, lower cycle times, shorter time to market, earlier feedback from customers, greater ability to respond to change...

  21. Thank you for this write-up. My company has recently embraced CD. As long as solid testing patterns are employed, that strategy has worked very well.
