"Microservice everything" isn't always good

I watched a video recently about the hidden costs of microservices, and it put words to something I've been thinking about for a while. The video walks through the usual pitch - break the monolith apart, give each function its own service, deploy independently. And it's honest about the benefits. Independent deployment and isolated scaling are real advantages when the conditions are right. But the part that stuck with me was the cost side. Not the cloud bill, which is at least visible, but the labor cost. The platform engineers, the on-call rotations, the hours spent debugging systems that nobody fully understands. That cost is invisible until someone actually goes looking for it.

I've seen this firsthand. I worked on a team of five or six engineers that decided to go the microservices route. The problem was that the workflows we were splitting up were inherently linear. Step A had to finish before Step B could start, and Step B had to finish before Step C. There was no independent scaling to be gained, no team autonomy to unlock. We added network overhead between steps that had to run in order anyway.
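To make the shape of the problem concrete, here's a minimal sketch of that kind of workflow (the step names are hypothetical, not the actual services we ran). The dependency chain A -> B -> C means the steps can never run concurrently, so every service boundary along the chain adds a network hop and a serialization cycle without unlocking any scaling:

```python
def extract(order_id: str) -> dict:
    """Step A: load the raw order."""
    return {"order_id": order_id, "items": ["widget"]}

def validate(order: dict) -> dict:
    """Step B: can only start once Step A has finished."""
    order["valid"] = len(order["items"]) > 0
    return order

def fulfill(order: dict) -> dict:
    """Step C: can only start once Step B has finished."""
    order["status"] = "fulfilled" if order["valid"] else "rejected"
    return order

def run_pipeline(order_id: str) -> dict:
    # In a monolith, the whole workflow is three function calls in one
    # process. As microservices, each arrow becomes an RPC that still
    # has to wait for the previous step, in order.
    return fulfill(validate(extract(order_id)))
```

Nothing in the chain can be scaled or deployed usefully on its own; the boundaries only add latency and failure modes.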

Debugging became the real problem. In a monolith, when something breaks, you read a stack trace. One file, one line number. When we had the same failure spread across three or four services, nobody could see the whole picture. You'd find that Service A reported success, Service B timed out, and Service C was working with stale data. Figuring out what actually happened meant stitching together logs from multiple containers and hoping the relevant lines hadn't rotated out. We spent more time on that than on building features.
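For what it's worth, one generic mitigation (not something we had in place at the time, and it reduces the pain rather than removing it) is to mint a correlation ID at the system's edge and thread it through every service's log lines, so the fragments can at least be grepped back together. A minimal sketch, with hypothetical service names:

```python
import uuid

def with_correlation(payload: dict) -> dict:
    # Mint an ID at the edge of the system; downstream services reuse it.
    payload.setdefault("corr_id", uuid.uuid4().hex)
    return payload

def service_a(payload: dict, log: list) -> dict:
    payload = with_correlation(payload)
    log.append(f"service_a corr={payload['corr_id']} status=success")
    return payload

def service_b(payload: dict, log: list) -> dict:
    # Logs with the caller's ID instead of logging anonymously.
    log.append(f"service_b corr={payload['corr_id']} status=timeout")
    return payload

logs: list = []  # stand-in for lines scattered across containers
service_b(service_a({"order": 1}, logs), logs)
# Every line for this request now shares one corr= token, so a single
# grep across all containers reconstructs the request's path.
```

It doesn't give you back the single stack trace, but it turns "stitch together logs and hope" into a mechanical search.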

The video covers the Amazon Prime Video case study, and I think it's the clearest example of this pattern. Their video quality monitoring pipeline had been built as distributed microservices - separate services, separate compute, separate operational overhead. When their team actually looked at the numbers, they collapsed it into a monolith and cut infrastructure costs by about 90%. The bottleneck had been the overhead between services, not the compute inside them. The pipeline was sequential, and distributing it just added overhead without any scaling benefit.

What's worth noting is that Prime Video didn't abandon microservices everywhere. Their consumer-facing features, recommendation systems, and content delivery stayed distributed. They just recognized that one specific workload was paying the full complexity tax without getting any of the benefits back.

The video also brings up Conway's law, and I think that's the piece most teams skip. If your org structure doesn't change when you adopt microservices, you end up with all the operational complexity - the Kubernetes clusters, the service mesh, the distributed tracing - without the team independence that makes any of it worthwhile. You trade merge conflicts in a shared codebase for cross-team tickets to a platform team, and the coordination overhead comes back in a different form.

To me, the question is not monolith versus microservices. It's whether the complexity you're taking on is actually delivering value, or whether you're just paying for it. Every service boundary you introduce is a network hop, a serialization cycle, a potential point of failure, and something someone has to be on call for at 2 AM. If the workload is genuinely independent and the team structure supports it, that tradeoff can be worth it. But if the workflow is inherently sequential and the team isn't structured for independent ownership, the complexity isn't buying you anything.
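The serialization cycle alone is easy to underestimate. A rough sketch, illustrative rather than a rigorous benchmark: even with the network removed entirely, a JSON encode/decode round trip per hop costs far more than the in-process call it replaces:

```python
import json
import timeit

payload = {"order_id": "o-1", "items": list(range(50)), "valid": True}

def in_process(p: dict) -> bool:
    # In a monolith, crossing a "boundary" is just a function call.
    return p["valid"]

def across_boundary(p: dict) -> bool:
    # Each microservice hop pays at least one serialize/deserialize
    # cycle, before any network latency or retries are added.
    wire = json.dumps(p)
    return json.loads(wire)["valid"]

t_call = timeit.timeit(lambda: in_process(payload), number=100_000)
t_hop = timeit.timeit(lambda: across_boundary(payload), number=100_000)
# t_hop comes out substantially larger than t_call, and a real
# deployment stacks network time and failure handling on top of it.
```

Multiply that by every boundary in a sequential chain and the Prime Video numbers stop looking surprising.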