SRE × Team Topologies

SRE and Team Topologies have always felt like they are coming from the same place - a real-life view of how organisations actually work.

What emerges from the notes, papers, and field observations I’ve collected is something much more integrated. SRE × Team Topologies isn’t a mash-up. It’s the combined operating system for how modern organisations build and run software.

The unspoken problem is that SRE keeps being dropped in without changing the rest of the organisational structure.

If you’ve ever joined an organisation “implementing SRE,” you’ll recognise the pattern immediately. A handful of SREs are hired. Some from a DevOps background, some with an ops pedigree, some aspiring software engineers who want to work at scale. Leadership hands down a vague mandate - “make things more reliable” - and then everyone looks around waiting for clarity.

The Google SRE book acknowledges this: SRE moves in and out of teams depending on scale, complexity, and operability. Sometimes SRE coaches. Sometimes it partners. Sometimes it owns production. Sometimes it steps away.

But organisations rarely build the conditions for these relationships to work. They shove SRE into a static organisational chart, often tucked away next to platform or operations, and assume it will “just work” like a normal team.

It won’t. It can’t.

Because it was never designed to operate that way.

Team Topologies provides the missing grammar for SRE - an explicit language for how teams should interact, how cognitive load is distributed, and how platform capabilities should be consumed.

The cognitive load lens changes everything

Team Topologies introduced the idea that cognitive load - not expertise, not headcount, not budget - is the real limiting factor in digital organisations. When you look at SRE through that lens, things click into place immediately.

What is SRE, functionally, if not an attempt to rebalance cognitive load?

SLIs and SLOs lower the cognitive load of guessing operational health.
Standard dashboards lower the cognitive load of knowing what “normal” looks like.
Production readiness reviews lower the cognitive load of preparing for release.
Golden paths lower the cognitive load of doing things the right way.
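
To make the first item in that list concrete, here is a minimal sketch of how an SLO collapses raw request counts into one number a team can act on. The names (`Slo`, `sli`, `error_budget_remaining`), the 99.9% target, and the request counts are illustrative assumptions, not taken from any particular tool or from the SRE book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """Hypothetical availability SLO over some rolling window."""
    name: str
    target: float  # e.g. 0.999 means 99.9% of requests should succeed


def sli(good: int, total: int) -> float:
    """Availability SLI: fraction of requests that succeeded."""
    if total == 0:
        return 1.0  # no traffic: treat the objective as met
    return good / total


def error_budget_remaining(slo: Slo, good: int, total: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed_failures = (1.0 - slo.target) * total
    if allowed_failures == 0:
        # Either no traffic or a 100% target: any failure exhausts the budget.
        return 1.0 if good == total else float("-inf")
    actual_failures = total - good
    return 1.0 - actual_failures / allowed_failures


# Assumed example: 1,000,000 requests in the window, 600 of them failed.
checkout = Slo(name="checkout-availability", target=0.999)
remaining = error_budget_remaining(checkout, good=999_400, total=1_000_000)
print(f"{checkout.name}: {remaining:.0%} of error budget left")
```

With these assumed numbers, roughly 40% of the budget is left. The point is less the arithmetic than the effect: instead of guessing whether the service is "healthy", the team reads a single agreed figure.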

But here’s the catch - none of this works if the surrounding organisation keeps increasing the load faster than SRE can reduce it. That’s why some SRE teams feel permanently underwater: not because they’re bad at SRE, but because the topology guarantees that cognitive load flows toward them like gravity.

Team Topologies gives us a solution - a way to distribute ownership so reliability becomes a shared priority rather than a bottleneck.

Some organisations treat SRE as a high-level ops team - ticket queues, on-call fire-fighting, and a graveyard of half-finished tooling. Others treat it as a platform team responsible for standardising everything from dashboards to deployments.

But SRE is a pattern of collaboration, not a fixed node in an org chart.

Once you see that, it becomes clear why one model of SRE behaves exactly like Team Topologies’ Enabling Team pattern: short engagements, deliberate coaching, production-readiness guidance, and then stepping away so teams become self-sufficient.

Platforms, interfaces, and fractals

One of the changes introduced - or at least clarified - in the Team Topologies 2nd Edition is the idea that platforms are not single entities. They’re groupings - fractals of internal services, each with its own stream of value and its own internal customers. The platform isn’t the thing; the platform is the shape of things.

When you apply that to SRE, a new organisational reality emerges: there is no such thing as “SRE”.

There is instead a constellation of internal services that collectively enable reliability:

Observability
Deployment safety
Incident tooling
Traffic engineering
Chaos testing
Performance engineering
Authentication and identity
Cost and capacity visibility
Data pipelines and streaming

No single team can hold all this.

Nor should they.

It is too much cognitive load for any one team to absorb.

Instead, SRE capabilities must be embedded across multiple teams inside the broader platform grouping.

“You build it, you run it” only works if you know how to run it

The DevOps movement popularised this phrase, but it was always incomplete. In practice, it left teams holding the pager without the support structures that make ownership humane and sustainable.

Team Topologies completes the sentence:

“You build it, you run it…

with a platform that reduces your cognitive load,

and enabling teams that grow your capabilities.”

Then SRE completes it a different way:

“You build it, you run it…

guided by SLOs, informed by observability, supported by paved roads.”

Only when you fuse these layers does “you build it, you run it” become something empowering rather than punitive. Ownership without capability uplift is cruelty. Ownership with SRE as an enabler leads to mastery.

SRE as an organisational balancing act

Throughout the lifecycle of a service, the SRE relationship evolves. Early on, the development team handles everything. As the system scales, SRE steps in as coach and partner, lending expertise and helping teams adopt operability practices. At certain levels of complexity, SRE may temporarily take operational responsibility - but only if the service’s scale justifies it and the application team continues to improve its operability. If operability stagnates, SRE steps back and the team regains full ownership.

The companies that get this right don’t talk about SRE as a team or a function. They don’t build giant centralised platforms; they build modular internal services with clear boundaries. And they don’t treat “You build it, you run it” as a threat or a badge of honour. They treat it as an agreement between adults who want to improve things.

SRE × Team Topologies is less a framework and more a recognition of how modern software organisations actually work when they’re at their best: distributed, composable, sense-and-respond ecosystems where reliability emerges from clarity, ownership, and shared patterns - not heroics.

If we treat reliability as an organisational design problem first and an engineering problem second, everything else falls into place.
