Saturday, August 8, 2009

Revisiting the Software Design Manifesto (and what's changed since then)

My recent article on technical debt and its positive uses generated a fair bit of controversy. One of the topics that raised heated debate was whether I had conflated technical design with product design, because I made the admittedly counter-intuitive claim that sometimes good technical design actually leads to increased technical debt. You can follow some of that debate here and here; I continue to believe that this idea is correct.

The argument itself got me thinking a lot about design and its role in building products. As a profession, we have a set of intuitions about what good design looks like, and I've come to believe that some of these intuitions have become obsolete. In this post, I'd like to explore the reasons why.

I thought a good place to start was with the origins of the idea that "software design" should be considered a discipline in its own right, on par with computer science, software engineering, and computer programming. Over the years, many people have advocated for this idea, but I wanted to go back to an early source: Mitch Kapor's original Software Design Manifesto. We owe a lot to this seminal document. Re-reading, I was struck by how much of it we now take for granted. And as Kapor himself points out, the core ideas have even older origins:

The Roman architecture critic Vitruvius advanced the notion that well-designed buildings were those which exhibited firmness, commodity, and delight.

The same might be said of good software. Firmness: A program should not have any bugs that inhibit its function. Commodity: A program should be suitable for the purposes for which it was intended. Delight: The experience of using the program should be a pleasurable one. Here we have the beginnings of a theory of design for software.

This simple three-part framework underlies almost all discussions about technical design today, and it was clearly on display in the recent debates over technical debt. What's interesting to me is how much we have tended to focus on Firmness and Delight as the key elements of technical design. A Firm design is one that works reliably, that has a transparent internal structure, and is easy to change. Great engineers see it and smile. And Delight is a similar feeling, but for a different constituency: the end-user. In the more than ten years since the original Manifesto, we've made strides in both areas. User-centric and interaction design, test-driven development, continuous integration, service-oriented architectures - the list goes on. Although some of these practices are counter-intuitive, they have all been gradually adopted as their benefits became clear.

But what about Commodity? I think this is the area where our intuitions are most out of step with the new reality we are living in. In antiquity just as much as in the early days of software engineering, Commodity was rightly understood as a mostly static quality. Sure, during the requirements and specification phases, there might be a lot of prototyping and iterating. But once the design was locked and implementation began, the intended purpose was relatively well understood and not subject to revision.

To be clear, that didn't mean that the design didn't change. Kapor addresses that directly:
In general, the programming and design activities of a project must be closely interrelated. During the course of implementing a design, new information will arise, which many times will change the original design. If design and implementation are in watertight compartments, it can be a recipe for disaster because the natural process of refinement and change is prevented.

These principles are every bit as true today as then. What's changed is that these interactions used to be confined primarily to the implementation phase of the project. The kinds of "new information" in the quote above are implementation details. The design may call for a certain look-and-feel that is impossible to implement, or has negative performance implications, which would require changes in the design, which might uncover additional issues, etc. This back-and-forth would continue up until the project entered its certification phase. Over time, if everything's working right, the magnitude of the design changes should become smaller and smaller, as the team converges on the final design.

But notice something interesting about this process. At no point is the overall purpose of the design changing. It doesn't start life as a toaster and end the design process as a microwave. Of course, it's possible that after the product is shipped and customer feedback is solicited, the next product design might be different. But think of the time-scale involved - in antiquity as well as a few decades ago. Building a cathedral takes years, and so even if the design of one cathedral affects the next, that's not particularly relevant to practitioners in the here-and-now. The same is true of a traditional waterfall-style IT project (although hopefully measured in months or years, and not decades). Yet a huge class of modern software projects is being developed in a very different context.

When it becomes possible to build products "live" with customers, the cycle time changes and design becomes a much more dynamic process. We still struggle to create Firm software that is defect-free, and it still requires customer insight (and maybe some customer development) to discover what will Delight. But it's Commodity that has become the most unstable. Every time we execute a product pivot - changing some elements of our vision but not others - we change the very purpose of the product being designed. I believe it is this increase in the rate of change that is causing our technical design intuitions to go haywire. It's like our compass no longer points to true north (like on Lost).

Let me quote an example that I used recently:

Remember IMVU's initial IM add-on product? It had a pretty good technical design. Here's why:

- it kept each IM network in its own separate module, and made it really easy to add new IM networks by composing a set of common objects
- it separated the underlying transport from the IM "session" itself, so it was robust in the face of the underlying client acting strangely, going away, or even having conversations switch clients altogether
- it compacted all of its information into brief, human-readable text messages that could be sent over any IM network in the clear

Those were strictly technical design decisions, and I think they were really good. Unfortunately, when we realized the product design was not what customers wanted, we had to pivot to a new product. But we had to bring that old codebase with us. Now the assumptions and abstractions that had served us well started to serve us badly. When we became a standalone network, it didn't matter how easy it was to add new networks, since we never did. And having the session abstracted from the transport made debugging much harder. Worst of all, the plaintext codes we were used to sending were considered non-authoritative, since they could be pulled off a third-party network. This made the actual transport much more difficult on our first-party network than was really necessary.

As a result, we have had to be constantly refactoring this design, a little bit at a time, to smooth out these rough edges. These design changes feel a lot like the interest payments incurred by technical debt. My argument is that there is no distinction to be had. That "good design" turned out to be technical debt, after all.
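To make the three design decisions in the IMVU example concrete, here is a minimal sketch in Python. It is not IMVU's actual code: every name in it (Transport, FakeTransport, Session, encode) is hypothetical, and the message format is an assumption chosen only to be short and human-readable. It shows one module per network behind a common interface, a session that survives a change of transport, and plain-text messages that any network could carry in the clear.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


class Transport(Protocol):
    """Common interface each per-network module would implement (hypothetical)."""
    name: str

    def send(self, text: str) -> None: ...


@dataclass
class FakeTransport:
    """Stand-in for one IM network module; it only records what was sent."""
    name: str
    sent: List[str] = field(default_factory=list)

    def send(self, text: str) -> None:
        self.sent.append(text)


def encode(kind: str, **fields: str) -> str:
    """Pack an event into a brief, human-readable text message (assumed format)."""
    body = " ".join(f"{key}={value}" for key, value in sorted(fields.items()))
    return f"[{kind}] {body}".strip()


@dataclass
class Session:
    """A conversation that is independent of whichever transport carries it."""
    transport: Transport
    log: List[str] = field(default_factory=list)

    def say(self, kind: str, **fields: str) -> str:
        message = encode(kind, **fields)
        self.log.append(message)
        self.transport.send(message)
        return message

    def switch_transport(self, new_transport: Transport) -> None:
        # The session keeps its history even if the underlying client changes.
        self.transport = new_transport


network_a = FakeTransport("network-a")
network_b = FakeTransport("network-b")
session = Session(network_a)
session.say("wave", to="alice")
session.switch_transport(network_b)
session.say("move", x="3", y="4")
```

The sketch also hints at why these abstractions became a burden after the pivot: with a single first-party network, the Transport indirection and the switchable Session add layers to debug without buying any flexibility that is actually used.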

What I object to most is the idea that technical design is a linear quantity. There's no such thing as "improving the technical design" in any absolute sense. You can only improve it with regard to whatever the purpose of the current product is. When that purpose is changing, we're necessarily chasing a moving target.

There are huge opportunities that become unlocked when we recognize this change. For one, we have to abandon any pretense of a linear design process that imagines we'll design something, implement it, and then get feedback on it. As has been going on in the world of manufacturing for many decades now, we have to engage in these activities concurrently. This is called set-based concurrent engineering (SBCE). [1] We also have to recognize the important impact of batch size on the work that we do. When we work on a product in small increments, we accelerate feedback to each participant who works on the product. This includes the designers as well as the engineers and product managers. This is what allows them to have a constant stream of insights about the true Commodity of their design, and to change it when it's time to pivot.

This has big implications for where we should spend energy. As I mentioned in the technical debt piece, our choices are usually framed as a set of either-or trade-offs between quick-and-dirty hacks and slower but more elegant designs. Lean methods present a third option: to invest in our process so that our design gets more feedback sooner and is more adaptable to changes in purpose. (The economics of these process trade-offs are discussed in the Principles of Product Development Flow.)

Returning to the subject of technical design, this yields a new criterion for a good dynamic technical design. It should still be Firm, and still promote Delight for our current customers. But it should also be resilient to changes in purpose, even dramatic ones. That means that the internal design of the product is now inseparable from the process that is used to build it. It is time for software design to grow up, the same way manufacturing had to evolve beyond Taylorism. And as with all scientific evolutions, it's not that the old principles are discarded or proved to be false. What's new is that we have learned to apply those principles in new contexts, like the extreme uncertainty that is the soil in which startups grow. We may have to change our practices to adapt to this new reality, but that doesn't mean we don't owe a debt of gratitude to those who helped us get here. So, in that spirit: thanks, Mitch. We'll do our best to leave the next generation something of comparable value.



[1] For more on SBCE, see this MIT Sloan Management Review article. Here's an excerpt:

In a previous article, we called Toyota’s product development system the “second Toyota paradox.” TPS was the first; its features seem wasteful but result in a more efficient overall system, such as changing over manufacturing processes more frequently (presumably inefficient) in order to create short manufacturing lead times. The second paradox can be summarized in this way: Toyota considers a broader range of possible designs and delays certain decisions longer than other automotive companies do, yet has what may be the fastest and most efficient vehicle development cycles in the industry.

Traditional design practice, whether concurrent or not, tends to quickly converge on a solution, a point in the solution space, and then modify that solution until it meets the design objectives. This seems an effective approach unless one picks the wrong starting point; subsequent iterations to refine that solution can be very time consuming and lead to a suboptimal design.

By contrast, what we call “set-based concurrent engineering” (SBCE) begins by broadly considering sets of possible solutions and gradually narrowing the set of possibilities to converge on a final solution. A wide net from the start, and gradual elimination of weaker solutions, makes finding the best or better solutions more likely. As a result, Toyota may take more time early on to define the solutions, but can then move more quickly toward convergence and, ultimately, production than its point-based counterparts.
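The contrast in the excerpt above can be illustrated with a toy sketch in Python. This is a loose illustration, not Toyota's process: the function narrow, the keep_fraction parameter, the candidate names, and the scores are all made up for the example. Each round stands for new information arriving, and the set-based approach drops the weaker candidates a round at a time instead of committing to the early leader.

```python
from typing import Dict, List


def narrow(candidates: List[str],
           rounds: List[Dict[str, float]],
           keep_fraction: float = 0.5) -> List[str]:
    """Keep a set of candidates alive and eliminate the weakest after each round."""
    remaining = list(candidates)
    for scores in rounds:
        ranked = sorted(remaining, key=lambda design: scores[design], reverse=True)
        keep = max(1, int(len(ranked) * keep_fraction))
        remaining = ranked[:keep]
    return remaining


candidates = ["A", "B", "C", "D"]
# Hypothetical scores: round 1 is early information, round 2 arrives later.
round_1 = {"A": 0.9, "B": 0.4, "C": 0.7, "D": 0.2}
round_2 = {"A": 0.5, "B": 0.9, "C": 0.8, "D": 0.1}

# Point-based: commit to whichever design looks best on early information.
point_based_choice = max(candidates, key=lambda design: round_1[design])

# Set-based: carry several designs forward and narrow gradually.
survivors = narrow(candidates, [round_1, round_2])

print(point_based_choice)  # A
print(survivors)           # ['C']
```

In this made-up data the early leader A looks weaker once later information arrives, so the point-based choice would need rework, while the set-based pass still holds C when it matters.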


5 comments:

  1. gr8 blog. it is very helpful for me. Thanks for sharing this.

  2. Are there examples of using Set Based Concurrent Engineering (SBCE) productively in the customer discovery phase of a startup?

    SBCE feels like it should fit into the lean startup nicely, but I can't quite figure out how. Let's accept the idea that the unit of progress for a startup is the customer-fact. How does Set Based Concurrent Engineering help?

    Toyota uses it as a purely technical process. No customer is asked if they like the bigger battery but heavier hybrid or smaller battery and lighter car. The engineers making those decisions are bathed in customer knowledge but the decisions are made on a technical basis. There is no iteration in the process, in fact the SBCE serves to remove iterations (rework).

    I guess the crux of the problem is SBCE gives you a quicker path to a better solution of a known problem. But in the early stages of a startup you are still trying to home in on what exactly the problem is. No?

    One place I can see the benefit of SBCE is by rubbing alternative designs against each other you should be able to discover what your unconscious assumptions are. How else can it be used?

  3. "Returning to the subject of technical design, this yields a new criteria for a good dynamic technical design. It should still be Firm, and still promote Delight for our current customers. But it should also be resilient to changes in purpose, even dramatic ones"

    I think that's spot on especially if we qualify exactly what "resilient to changes" means. A common interpretation would be endless flexibility in a design which leads to gold plating etc. In contrast we require a well partitioned design with appropriate levels of coupling and cohesion etc.

    Just wondering if we could summarise the overall theme with a variation on "No battle plan survives contact with the enemy" (Moltke):

    "No design survives contact with the customer"

    Thus we should plan accordingly...

  4. Customer Development is itself an example of SBCE. The way I translate it into the lean startup framework is to have a separate problem team and solution team - that work concurrently to shape the product that the company will ultimately build.

    The key to understanding how to translate these traditional product development or manufacturing processes into a startup context is to remember that startups break down the traditional barriers between departments. There is not "just product development" in a startup - the context of extreme uncertainty that we operate in means that every choice we make impacts the whole company.

  5. One reason to distinguish between technical design and product design is that technical debt and product debt are different. The problem with the minimum viable product is that it has product debt: some absence of features, functionality, usability, polish, whatever. With technical debt, you pay the interest. With product debt, your users pay the interest. And compared to technical debt, the impact of product debt is much harder to measure. There's lots of information about performance and metrics out there, as well as a peer expectation that being concerned about those things is an important part of being a competent engineer.

    But product debt doesn't have this focus. I think the interest in quick and easy usability testing (avoids time/money being redirected away from engineering concerns), and split-testing (often misused to simplify product design decisions to binaries) are signs of the problem, not the solution.

    But what if the solution is staring us in the face? If rapid iteration and feedback is desired, why not iterate on throw-away prototypes that are trivial to produce and incur zero technical debt? This seems to be what SBCE points us to: considering a broad range of design options with enough fidelity to a real product to get useful feedback. Jeff Hawkins, the inventor of the Palm Pilot, is said to have carried a block of wood around in his shirt pocket for a week to validate the concept.

    This isn't a very popular option because it runs counter to common engineering intuitions (or are they taboos?). Prototypes are easy to build and meant to be thrown away, which violates the intuition that engineers have that value is created by building things to last and doing things that are (technically) difficult. Incidentally, these also seem to be the same intuitions that lead to the typical attitude to technical debt. Perhaps this is a bit too provocative, but what if this means that getting past those taboos requires abandoning one's identity as an engineer?
