Breaking up the Monolith

You’re breaking up a monolith and replacing it with microservices. But if you’re not thinking about coupling and cohesion, you’re missing the point.

Monoliths

A monolith is often described, in so many words, as a single service that handles all back-end tasks for a given application. This description has two difficulties. The first difficulty is that the scope of an application is not always clear. Depending on your point of view, you might consider a system to consist of one (larger) application or several (smaller) related applications. That, in turn, means that whether a service is a monolith turns in part on how you define the scope of your application. But "breaking up" a monolith involves changes to the service, not just changes to some definition. Clearly, there must be more going on here.

The second difficulty is that it is a rare service that truly handles all back-end tasks. Taken literally, that would imply that the service has its own implementation of queues, databases, caches, and so on. While that’s not impossible, it’s not realistic. Realistically, even a monolithic service is making use of separate services for these purposes. Even a monolithic service can rely on other services for help.

A more apt, if still fuzzy, definition might define a monolith as a service that can be decomposed into two or more smaller services. Put differently: a monolith is a service that does two or more separable tasks. It need not be all the tasks. It need not be associated with a single application. Decomposing it may not even be a good idea. (More on this in a moment.) But these are the minimal conditions under which we can at least consider breaking it up into smaller ("micro") services.

English majors may squirm at this definition, given that the word "monolith" originally refers to a single, indivisible stone. But here the term is used in the more recent sense of an organized whole acting as a single unit. If there is some internal structure, then we can at least consider breaking it apart.

A practical and useful definition of a monolith, then, is a service composed of multiple components acting as a single unit. And if it does more than one thing, then decomposition is possible. But is it a good idea?

Microservices

The microservice label, like the monolith label, suffers from a degree of under-definition. A trivial definition might describe microservices as services that do just one thing, but of course the definition of "thing" is itself slippery.

Indeed, the "micro" prefix in the name is not a reference to an absolute size, but rather a relative term: microservices are "micro" because they are smaller than monolithic services. Thus, while the basic notion of replacing a monolith with microservices is clear—a set of smaller services replace one larger one—there is no intrinsic or absolute metric describing just how far to take the decomposition.

Coupling

Why might implementing more than one task in a single service cause trouble? The root problem is that merely packaging multiple tasks into a single service can create unnecessary coupling between those different tasks. That is, the different tasks can end up being connected to each other in ways that produce negative effects: reducing performance, increasing the cost of change, and more.

Broadly speaking, coupling can occur on three axes: code, data, and compute.

Code coupling occurs whenever code is shared between two or more tasks. As always, sharing code has costs and benefits. Sharing code tends to be beneficial when the shared code addresses a discrete, solved problem. For example, it makes sense to use a shared library to compute Fast Fourier Transforms (FFTs), as used in various signal-processing applications. The algorithm is well-known, its interface is stable, and highly-optimized versions are available. If you have two tasks in a single service that both invoke an FFT, it makes all the sense in the world for them to share that code.

Shared code becomes problematic when the it is not well-isolated, not stable, or the sharing is inadvertent. When one or more of these conditions applies, changes to the shared code on behalf of one task have a highly likelihood of impacting the other task. That is: because the two tasks are coupled via this shared code, the shared code becomes a conduit via which changes to one affect the other. If such changes are not made carefully, they may break the other dependent tasks. If such changes are made with care, then they take more time for coordination, review, and testing. Either way, code coupling reduces development velocity.

Monolithic service architectures don’t force code to be shared. Sufficient development discipline can keep tasks and their dependencies isolated, even if they are deployed in the same service. Many service frameworks (Jakarta Servlets, ASP.NET, etc.) try to encourage that sort of discipline. However, in practice, code separation is difficult to maintain without enforcement. Even when code bases are cleanly factored and every change is properly reviewed, unexpected depedendencies can sneak in. As a result, changes that should be isolated aren’t, and changes that should only impact one task disturb others. Testing for and debugging those occurrences slows you down.

If you ever had code fail to deploy because your production deployment didn’t include your unit test framework, then you’ve experienced a variation of code coupling. You probably had rules and reviews designed to prevent production code relying on testing code. And yet!

Data coupling occurs when different tasks operate on the same data. Now, in trivial cases, this issue won’t arise. If one task in your service reads customer records from the customer database, and a second task reads product records from a product database, those two tasks do not exhibit any data coupling—even if they are part of the same service.

In a simple form of data coupling, records of different types—such as the customer records and product records in our example—are accessed via the same database. Of course, the database is not the microservice, and it’s not the mere fact that both records are in the same database that creates coupling in the microservice. The coupling occurs when this knowledge of data placement works its way into the monolith’s implementation. For example, the service might create a single database connection, sharing it across both tasks.

In a more complex example, the two tasks may work on the same data. For example, consider an e-commerce system. One task manages customer records, including shipping addresses. Another task accepts orders. When accepting an order, the default shipping address is taken from the customer’s record. Both tasks work with shipping addresses.

The implications of this relationship depend on how much coupling is introduced in the implementation. If the two tasks are in separate services, then at most, one will depend on the other to retrieve the shipping address. Furthermore, the address will be retrieved via an interface, which limits coupling. Nothing about placing both tasks in the same service prevents the discipline requires tighter coupling. And yet: placing both tasks in the same service certainly reduces barriers to tighter coupling. Once the two tasks are in the same service, it may seem easier, faster, and even more reasonable for them to jointly manage shipping address data, rather than holding each other at arm’s length.

For example, records for active customers might be cached in memory. With both tasks in the same service, both could be implemented to operate against that cache. That will speed up the order task, which can now use an in-memory copy of the shipping address. But now the two are much more tightly coupled. They depend not only on the form of the shipping address (i.e., its schema) but also the cache protocol, which manages concurrent access to cache entries. Any changes to the schema or the cache impact both tasks and such changes must be rolled out with tight coordination.

Object-oriented design seems to encourage this type of coupling. By wrapping access to, say, the shipping address in a customer object, it gives the designer the sense that access to the data has been properly encapsulated. However, while object-oriented design works well enough within a program, it has nothing intrinsic to say about these larger issues of deployment and database connections. Domain-Driven Design is, in part, a reaction to the downsides of just this kind of coupling created by shared data and shared objects. It explicitly favors duplicating some information—shipping addresses, in this example—across domains so as to avoid such coupling. And while Domain-Driven Design doesn’t forbid the use of monolithic services, microservices are certainly more philosophically aligned with this design approach.

Compute coupling occurs when different tasks compete for computational resources. Compute coupling is insidious, capable of occurring even when you have no code or data coupling. Compute coupling arises in monolithic services because the service is also your deployable unit. Whether your service is deployed to a virtual machine, container, or even just a single process, all tasks in that deployment unit will be competing for the available compute capacity on that target.

For example, suppose a service consists of one compute-intensive task and one memory-intensive task. If you deploy them together, you’ll need a machine with both a fast CPU and lots of memory. If the load on the compute-intensive task increases, you can allocate more machines to this service. But because you need a machine that supports both tasks, you’ll end up allocating both more CPU (which you need) and more memory (which will go unused). Whereas, if you decoupled the two, you could pick different—and more suitable—machine profiles for each.

Or, suppose you deploy two compute-intensive tasks together, one of which is called frequently and the other infrequently. Each time the infrequently-called task is invoked, it will force information out of the CPU cache as it competes with the frequently-called task. These infrequent calls disrupt and slow down the frequently-called task simply because they happen to be deployed to the same machine. This contention can be avoided only by separating the two such that they are not coupled via CPU, cache, or other compute resources.

Function-as-a-Service (FaaS) architectures take the notion of decoupling compute to an extreme, forcing every invocable element in your code to be deployed separately. For reasons we’ll discuss below, and as is so often true in software engineering, these "extreme" designs tend to perform poorly. Engineering is compromise, and the sweet spot in design is often somewhere in the middle.

Benefits

Decomposing your monolithic service into a set of smaller services can help reduce coupling on each of these fronts:

If your code has to run in separate services, you’ll catch unexpected dependencies on shared code for other tasks when compiling, packaging, or testing. You can still arrange to deliberately share code via code libraries packaged with each microservice.
To access data from (what are now separate) services, you’ll need to move it behind an API. APIs can evolve in backwards-compatible ways more readily than sharing direct access to a cache or databases.
Services can be independently deployed to different compute resources, giving you greater control over the provisioning of those resources to each service. Using containers, you can make a separate decision as to whether deploy them to the same virtual machine for efficiency, or separately to reduce contention.

These changes can pay real benefits in development velocity. A successful decomposition will minimize all forms of coupling between the individual microservices. It’s easier to make changes, because changes are isolated to specific services. And it’s easier to deploy those changes, because deployments need not be coordinated. Decoupling accelerates the rate at which each microservice can evolve.

Drawbacks

So far that all sounds pretty promising. But there’s no free lunch, and not every decomposition will achieve those outcomes.

The problem is that reduced coupling also imposes costs. Those costs tend to show up in communication overhead. For each call dependency between tasks, you’ll be replacing what was a local, in-process function call with a remote, out-of-process network call. And network calls are both orders of magnitude slower than function calls and much less reliable.

Now, if the tasks in your monolith truly had no connections, then separating the pieces into separate services won’t result in any additional network calls, and you won’t pay this cost. But then again, if that was the case, the lack of such connections means you probably weren’t suffering much from your monolithic structure to begin with. Possibly you had some compute coupling via deployment to shared compute resources. And that’s the sweet spot for service decomposition, where you’re just trying to sidestep coupling via compute.

The toughest cases involve data coupling. Suppose that you have two tasks that were sharing an in-memory data cache for some entity. (That was the shipping address in our earlier example.) There are different methods for decoupling data, but no matter which approach you take, you’ll be paying extra network costs to transfer the data between services. You may also take on the overhead of storing duplicate copies of the data in separate databases. That’s a lot of overhead added to what was a very fast access path.

Cohesion

So how does one determine when a decomposition is likely to be worth the effort? The answer comes down to cohesion.

Cohesion describes the degree to which code in a function, module, or service "belongs together." Intuitively, a service with high cohesion comprehensively addresses just one need. A service with low cohesion is a random assortment, serving either multiple needs or only partial needs, or even some odd combination of the two.

When a service addresses two unrelated needs, it has relatively lower cohesion. In turn, the coupling created by virtue of being a single service is incidental. And there is always induced coupling because, even if the code and data serving the two needs are unrelated, they’ll be coupled via shared compute. In this scenario, both needs will be better served by splitting the service in two and allowing compute to be assigned separately. Because the needs are unrelated, there are no relationships between the two services to counteract that gain. The result is two loosely-coupled, high-cohesion services.

Whereas, if you split a service that addresses just one need, it begins with high cohesion. Breaking it up will result in two services with lower cohesion (each contains only part of the solution) and high coupling (the two services still need to work together). Thus, breaking up a single service into two parts can also result in two highly-coupled, low-cohesion services. That’s a poor trade-off, and the service is better left intact.

Decomposition

Decomposition is the design phase in which designers balance the competing concerns of cohesion and coupling, ultimately determining which capabilities will be organized into the same components and which will be separated. Loose coupling and high cohesion should be paramount concerns during decomposition.

All too often, decomposition is instead approached based on the technology at hand. If you’re using virtual machines and your process for building and deploying a new virtual machine is burdensome, it will be easier to "glom on" the next task to your existing service rather than to develop and deploy a new service. Conversely, if you use lighter-weight technologies, such as containers, to develop services, it may seem easier to spin up a new service than to modify an existing one.

When you fall into either of these modes, you’re reasoning based on the wrong criteria. And if the technology you’re using to build and deploy services gets in the way of create loosely-coupled, high-cohesion services then it may be time to make a technology change. Our technology choices should serve good design, not override it.

Decomposition is a significant topic in its own right. If you wish to explore it further, consider Righting Software by Juval Löwy. Chapter 2, titled simply "Decomposition", is an excellent exploration of what he calls volatility-based decomposition. I would argue cohesion is the more conventional term for what he describes, but that doesn’t minimize the value of his explanations.

Summary

While the "monolith vs. microservice" framing is oversimplified, it exists because it touches on a deeper truth. If you’ve chucked too many tasks into a single service, you’ve probably created a service with high coupling and low cohesion. Other things being equal, continued development on services with these qualities tends to be slow and difficult. The high coupling means makes it difficult to make independent changes. And the low cohesion means all that unrelated work is happening in the same service.

Decomposing highly-coupled, low-cohesion monoliths (services that do two or more tasks) into loosely-coupled, high-cohesion microservices (each doing one task) will address these shortcomings. You’ll end up with a set of services that accelerate, not slow, your development efforts. Loose coupling will allow you to evolve each service independently, and high cohesion will ensure that changes addressing one need occur in one unique service.

But if you just break up your monolithic services without due consideration, you risk making the problem worse. Go too far, and you’ll find yourself creating a set of tightly-coupled microservices, each with no internal cohesion. That will slow your development velocity and damage system performance just as much as any monolithic service architecture. Indeed, if you find yourself struggling to get a set of microservices to work or perform well, you may have gone too far—and should consider re-composing them back into a larger, single service.

Much of the velocity and longevity of a software system can be attributed to its levels of coupling and cohesion. Systems composed of elements with low coupling and high cohesion tend to be easier to maintain, scale, and change. Systems with low coupling and high cohesion are the systems that evolve and stay relevant.

The trend towards breaking up monoliths is rooted in these facts. But it is not always the right move. So before you break up your monolith, do your homework. Analyze the coupling and cohesion in your system and find the sweet spot that works best for your system.

Back to the Miscellany