A great platform as a product paper, and a fun platform philosophy thereof

I like this platform as a product paper a lot. You should check it out if you’re into DevOps, SRE, platform engineering, whatever. It’s also available in O’Reilly if you have that subscription and don’t want to lead-gen yourself.

Here are some fun parts:

Adopting a product mindset starts with continually evaluating the business context to manage “build versus buy” decisions. Contextual factors such as scale, compliance requirements, or the diversity of the workforce skill base and technology stacks often require organizations to opt out of an off-the-shelf solution and instead invest in a set of integrated capabilities designed for its specific needs. The resulting platform has users, and it requires design, iteration, feedback, and a clear value proposition. Without a mindset that takes all this into account, platform teams risk becoming internal service providers chasing feature requests rather than strategic enablers focused on outcomes.

That is, there’s always some long list of reasons to DIY your platform. If we accept that people don’t believe/care that building your own platform is, generally, a provably bad idea, do you fight that, or adapt your platform philosophy to it?

The list of “contextual factors” leaves off resume-driven development and that builders just like building things, no matter the “business value.”

Next is a long passage. But it’s worth reading because what I think it’s saying - how I read it - is that the “DevOps team” lives in the app layer doing developer stuff. The platform team is not the DevOps team:

Many organizations arrived at platform engineering through their experiences with DevOps. DevOps promotes a culture of shared responsibility and faster delivery, but it is often interpreted as meaning that application developers will, or can, take on all infrastructure and operational concerns directly. In practice, this blurred responsibility leads to fractured tooling, inconsistent environments, and overloaded teams.

What is sometimes overlooked is that DevOps principles should apply across two layers: the application layer, where application teams own and operate the services they build, and the platform layer, where the platform team builds, runs, and continuously improves the platform itself as a product. Both layers use automation, continuous delivery, observability, and feedback loops, but they do so through different APIs aimed at different consumers.

This is not about reintroducing the old silos of “dev,” “ops,” or “QA.” Platform engineering involves layering APIs to ensure clarity around responsibilities, allowing teams to focus on developing for and supporting their domain without being disconnected from the rest of the delivery system. Application teams consume the platform through well-defined contracts, while platform teams consume infrastructure in the same way. Each layer can improve and evolve independently, provided the APIs remain stable and well documented. Accordingly, each team owns its flow of value through a separation of concerns. This promotes speed of delivery and limits the blast radius when things inevitably go wrong. In addition, loosely coupled teams end up creating a more cohesive experience for their end users/consumers.

The part I bolded up there is important. It makes a call-back/forward to another big prescription in the paper: a platform is largely about wrapper layers of abstraction on top of other systems to (a) codify the “contract” (think SRE SLOs that actually have teeth and can be automated/coded) between layers, and (b) exist at the seams/sharp edges of the entire stack. Thus, if there is an API/contract layer, that means there’s a different team. The API/contract layers are the thick lines around your Conway charts. Which is intended to be good, and probably is.
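To make the “contract between layers” idea a little more concrete, here’s a minimal sketch of how I picture it in code. None of this is from the paper: the names (`SLO`, `Infrastructure`, `Platform`, `FakeCloud`, `InternalPlatform`, `ship`) are all made up for illustration, and a real platform contract would cover a lot more than one method. The point is only that each team codes against the interface one layer down, and that an SLO can be a thing code checks rather than a slide.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class SLO:
    """A contract term that code can check (hypothetical shape)."""
    name: str
    target: float  # e.g. 0.99 means 99% of attempts must succeed

    def is_met(self, good: int, total: int) -> bool:
        # No traffic means nothing has violated the contract yet.
        if total == 0:
            return True
        return good / total >= self.target


class Infrastructure(Protocol):
    """What the platform team consumes from the layer below."""
    def provision(self, cpu: int) -> str: ...


class Platform(Protocol):
    """What application teams consume from the platform team."""
    def deploy(self, app: str) -> str: ...


class FakeCloud:
    """Stand-in for real infrastructure; satisfies Infrastructure."""
    def provision(self, cpu: int) -> str:
        return f"vm-{cpu}cpu"


class InternalPlatform:
    """Platform team's product; satisfies Platform, consumes Infrastructure."""
    def __init__(self, infra: Infrastructure) -> None:
        self._infra = infra

    def deploy(self, app: str) -> str:
        host = self._infra.provision(cpu=2)
        return f"{app} running on {host}"


def ship(platform: Platform, app: str) -> str:
    # App team code: it only knows the Platform contract, not the cloud.
    return platform.deploy(app)


deploy_slo = SLO(name="deploy-success", target=0.99)
result = ship(InternalPlatform(FakeCloud()), "checkout")
```

Swap `FakeCloud` for something else and `ship` doesn’t change, which is the “each layer can evolve independently” claim in miniature. The two `Protocol` classes are the thick lines on the Conway chart.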

That, again, leads to the platform engineering world view that DevOps is for the developers and platform engineering is for running the platform. These are separate roles and responsibilities.

You kind of can’t even use an “inner loop” and “outer loop” framing on that. The platform team sits below both loops? Some of the loops? If you have the DevOps/platform team philosophy of the above, the whole way the loops are drawn doesn’t work. Which is fine. That was always a kind of tortured metaphor for what could more easily just be described as the activities, roles, and responsibilities involved in going from idea to code to production.

There’s more in there that’s fun. For example, the four “one metric”s to rule them all they have are good. You can see that they’re going for the same sociotechnical glue that SLOs are: “a reminder to have a conversation,” silo-diplomacy, and practical gauges…maybe even OKR/MBO/KPI seasoning.

Also, if you’re a connoisseur of these kinds of things, take notice that there’s little to no “in these fast moving times” talk. The urgency for doing all of this is driven by technical needs, unlike a lot of text like this where the “why do I care?” is driven by existential melt-downs in the boardroom.

I mean, that’s my read at least. If you’ve read this far, you should read it.