The Advanced Software Design Course continues to run, and continues to make a big impact on students' lives. A former student of ours in Paris wrote to me recently:
Yesterday was my last day of a 13-month freelance project as a backend architect in a SaaS startup. It was so rewarding to be able to apply the course concepts to build their product from scratch. I've read a lot of books about SWE, but found myself only using the ones from the course most of the time. That's how useful they are! Thanks to the things I learned in this course, the code has always been able to evolve without rewrites when new requirements were introduced.
One of my favorite articles on data modeling is qntm’s Gay Marriage: The database engineering perspective. In a series of worked examples, it shows the evolution of a database design representing a common concept, starting with something unworkably bad, and fixing its limitations until it could faithfully represent the world 20 years ago, the world now, and finally the world in some unrealized future. It shows how a hotly-debated policy change — aside from questions about its effect on the social fabric — would definitely alter countless software systems. It urges readers to think about a definition of “marriage” that would stand the test of time.
And that is an invitation into a rabbit hole, a great way to spend hours debating something abstract with no contribution to the product. At least it is according to Benoît Fleury, a very senior engineer at Stripe and a “software philosopher” currently serving as course staff here at the-company-being-renamed-to-Mirdin. This sparked a discussion between us and a third staff member which became so interesting that I had to immortalize it as a newsletter.
Why are we talking about this?
Beginning with philosophical questions like “what is marriage” is one way to start a design discussion. But it’s detached from the problem at hand. It can be satisfying, but seems more like an exercise done for its own sake than a productive discussion. What if we instead start with the requirements? What do you need to implement your system? Why does it care about marriage? A tax accounting firm and a restaurant giving discounts on Valentine’s Day may both care about “marriage,” but for totally different reasons.
If you start off by asking “what is marriage,” you may capture facts about the true structure of the world which will serve the system well in the face of unpredicted requirements changes. Or you may drown in irrelevant details. (And what exactly is “the true structure of the world,” if there exists such a thing?) We glean an alternate approach from Understanding Computers and Cognition, the revered 1986 tome by Terry Winograd, the famous AI researcher, and Fernando Flores, a Chilean philosopher and government official.
Or so I’m told. If you’re someone who likes reading Continental Philosophy (the first half is an explainer on Heidegger and a few others), you’ll probably love this book. I am not, so I slogged through the relevant chapters, grasping only a fraction of its message. But that was enough to encounter a very different kind of thinking than my own.
People advocate ontological discussions because “requirements are often wrong.” But one can flip that around: a major part of designing software is debugging the requirements. One can spend time analyzing and debugging those requirements, or one can surrender prematurely because "they're often wrong" and have lengthy discussions about ontologies, discussions which may lead nowhere, or to over-generalized solutions. From one perspective, having design discussions without judging the choices against identified problems is like debating what tie to wear without considering the shirt. According to Winograd and Flores, distinctions about what marriage is or is not do not exist except in the context of what to do about it and how to talk about it. A marriage exists in its effect on the world. In that our system is part of the world, attempting to define it in our system invites circular definitions like “a marriage is something that gets me 50% off cake on Valentine’s Day” when our system is what makes it so. It is thereby meaningless to debate its true ontology because our system design creates the ontology.
Under this thought, instead of debating what something is, or even basing it on current debugged requirements, we should choose what to model and how by looking at the domain and identifying its “breakdowns,” a term attributed to Heidegger which roughly means “a distinction not supported by your current model, causing it to break down.” Let’s switch our example from marriage to addresses. As Winograd and Flores explain, in the context of a tailoring business concerned with making clothing fit perfectly:
As an obvious example, we can ask what a customer’s ‘address’ is. The immediate response is “For what?” (or, “What is the conversation in which it determines a condition of satisfaction?”). There are two distinct answers. Some conversations with customers involve requests for the physical transfer of goods while others involve correspondence. Different conditions of satisfaction require different kinds of address. This is a standard case, and most business forms and computer data bases will distinguish “shipping address” and “billing address.” But we may also need an address where the person can be found during the day to perform further measurements. In every case, the relevant ‘property’ to be associated with the person is determined by the role it plays in an action. This grounding of description in action pervades all attempts to formalize the world into a linguistic structure of objects, properties, and events.
Focusing on these breakdowns in design, proponents argue, provides an objective measure of design success. When you discuss what things are, you have no yardstick, and discussions never end. It also lets you get your design out into the world faster so as to iterate, and further debug the requirements by discovering new breakdowns. Winograd and Flores continue:
This also leads us to the recognition that the development of any computer-based system will have to proceed in a cycle from design to experience and back again. It is impossible to anticipate all of the relevant breakdowns and their domains. They emerge gradually in practice. System development methodologies need to take this as a fundamental condition of generating the relevant domains, and to facilitate it through techniques such as building prototypes early in the design process and applying them in situations as close as possible to those in which they will eventually be used.
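As a rough sketch of the tailoring example, one could key each address by the action it supports rather than by what an address "is." Everything named here (`AddressRole`, `Customer`, `address_for`) is hypothetical and chosen for illustration; none of it comes from the book.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class AddressRole(Enum):
    """What an address is *for*. Each role names an action, not a kind of place."""
    SHIPPING = "shipping"  # physical transfer of goods
    BILLING = "billing"    # correspondence
    DAYTIME = "daytime"    # where to find the customer for further measurements


@dataclass
class Customer:
    name: str
    # Assumption for this sketch: addresses are free text, keyed by role.
    addresses: Dict[AddressRole, str] = field(default_factory=dict)

    def address_for(self, role: AddressRole) -> str:
        """Return the address for an action, or raise so the gap is visible."""
        try:
            return self.addresses[role]
        except KeyError:
            raise LookupError(
                f"no {role.value} address on file for {self.name}"
            ) from None


customer = Customer("A. Customer", {AddressRole.SHIPPING: "12 Example Street"})
shipping = customer.address_for(AddressRole.SHIPPING)
```

In this sketch a missing role raises instead of silently falling back to another address, so a new need (say, the first request for a fitting at the customer's workplace) shows up as a breakdown to be handled in the next design iteration.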
Two sides of the same coin
The breakdown-finder begins by asking the questions “For what do we need an address?” and “What are things we may want to do with some addresses but not others?” Contrast the ontologist who, detractors argue, would be likely to say “An address is a string in a certain format which identifies a building” and miss all the relevant issues.
But I am an aficionado of ontology, and that’s not how I would do it.
Given the question “What is an address,” I would think about how an address identifies a building, except that sometimes it identifies just part of a building and sometimes a large group of buildings. I would think about how in contracts it’s sometimes just used to uniquely identify a person with a common name. I would recall an article I read in middle school about Managua, a city without addresses, or my time in Dubai, where nearly every business is on the same un-numbered street and you order goods by specifying a building name and nearby landmarks. I might even realize that what I have in the database is not an address, but is instead whatever the user typed in and said was their address, in the same way that the famous painting below emblazoned with French for “This is not a pipe” is indeed not a pipe, but rather a painting of a pipe (and the image below is actually a computer reproduction of a scan of a painting of a pipe). A million questions would pop into my mind to research (are they associated with GPS coordinates? Do they change? Can there be multiple streets in the same city with the same name?), and I would mentally flit through them to identify which ones would be hard to fix later if ignored. And I would similarly scoff at my image of the short-sighted requirements-based designer unwilling to go out into the world and face these questions.
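The "this is not a pipe" observation can be made concrete in a data model. The following is a minimal sketch under assumed names (`ClaimedAddress`, `raw_text`, `is_resolved` are all hypothetical): the record stores what the user typed, and treats any interpretation of it as separate, optional information.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClaimedAddress:
    """What a user typed when asked for their address, not the place itself."""
    raw_text: str  # kept verbatim, never "corrected" in place
    # Filled in only if some later, hypothetical step resolves the text.
    latitude: Optional[float] = None
    longitude: Optional[float] = None

    @property
    def is_resolved(self) -> bool:
        """True only when both coordinates have been supplied."""
        return self.latitude is not None and self.longitude is not None


# A landmark-style address is still a valid claim, just an unresolved one.
claim = ClaimedAddress("Blue tower, opposite the big mall")
```

Keeping the raw text separate from its interpretation means that questions such as "do coordinates change?" can be answered later without losing what the user originally said.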
Both approaches share the goal of finding a data model that cleanly expresses what the software needs to do its job. Both approaches will use skills from the other’s toolbox. The skilled requirements-based designer actively searches for ways in which reality causes a breakdown in their proposed model; the skilled ontologist always keeps in mind what is important and what can be ignored.
The requirements approach starts with your software's intended effects on the world and explores its internal organization and how nouns in the latter relate to the former, like how an "address" can be an input to the postal system, an input to the credit card system, or even something more metaphorical like a bitcoin or web address.
The ontological approach starts with the world, in all its gory detail (of which the implementer's knowledge is necessarily woefully incomplete) and agglomerates, approximates, abstracts, and avoids things until all that's left is what the software needs to care about.
As with most things where there are two approaches coming at the same problem from opposite sides, in practice they are often used in a forward-backward manner, either a "requirements-directed ontology" or "breakdown identification."
So each side can learn from the other. Perhaps the debate between requirements and ontology is just one where you need to reverse any advice you hear?
Thank you to Benoît Fleury and Alex Rozenshteyn for the discussion that sparked this newsletter. Significant portions of the text above are paraphrased from their arguments. I have archived this discussion and made it available here.