As solutions architects today, we wear a rich tool belt and often solve complex problems. It’s an exciting time. At the same time, I’ve been part of confusing conversations that tried to navigate solutions amid complex requirements and hype. Almost all of those conversations were about facing ever-increasing demand: demand for services, for products, or for increased efficiency in delivering either. Ever-increasing demand has made all of us rethink how we can scale our business functions and our services to maximize throughput. However, history teaches us that this is not a recent phenomenon; it is why IT moved from mainframes to RPC, from RPC to SOAP/XML-RPC, and then to modern RESTful styles.

We understand that disaggregation of systems can accommodate increasing demand. That's fairly straightforward. However, I feel the confusion lies in our technology decision-making process. This rant is about that confusion and the misguided decisions that follow from it.

What Do You Mean by Disaggregated Architectures?

This is your old CORBA platform, your in-house SOA initiative, or the newer microservices implementation. All of these are implementations of a software architecture that outlines a framework for building and communicating in a distributed system. Rather than building a monolithic application and scaling it to accommodate demand, you create a domain model and implement each component of the model individually and network them together to provide a cohesive business experience.

When I look at distributed systems architectures, I broadly see two categories: Distributed Object Architectures (DOA) and Distributed Services Architectures (DSA). All others are patterns derived from these two. DSA is more liberal in how it communicates, while DOA performs better and is more conservative in communication. I would categorize SOA with REST/HTTP as DSA and microservices with gRPC as DOA. I am a big fan of service contracts, loose coupling, domain-driven design, and fine-grained systems. These are attributes of SOA and, not so surprisingly, they are the key attributes of MSA too. You can call these principles by different names in each paradigm, but the essence and the end goal are the same. I see these principles as the pillars of disaggregated architectures.

Cost of Disaggregation

We often overlook the cost of disaggregation. This is not just a recent phenomenon in the age of microservices. Historically, architects and technical leaders have underestimated the cost of implementing a fully distributed system. Their thinking was shaped by positive experiences of programming the monolith, and it overlooked the complexities of the distributed system. Remember the fallacies of distributed computing? The distributed systems developer should be aware of the network; it is a much more chaotic environment to operate in than the constrained monolith. The rise of libraries, frameworks, and technologies to solve this set of problems proves that this added burden takes a toll on those opting for distributed architectures.
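To make that burden concrete, here is a minimal sketch of what "being aware of the network" can mean in code: a call that would be a plain function call in a monolith now needs a retry policy. The names `NetworkError`, `call_with_retry`, and `make_flaky_lookup` are hypothetical and exist only for this illustration; real systems typically rely on a resilience library rather than hand-rolled loops.

```python
import time


class NetworkError(Exception):
    """Stands in for a timeout, reset connection, or unreachable host."""


def call_with_retry(operation, attempts=3, base_delay=0.01, sleep=time.sleep):
    """Run `operation`, retrying on NetworkError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except NetworkError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * (2 ** attempt))


def make_flaky_lookup(failures_before_success):
    """Build a fake remote call that fails a fixed number of times first."""
    state = {"calls": 0}

    def lookup():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise NetworkError("simulated timeout")
        return {"customer_id": 42, "calls": state["calls"]}

    return lookup


# Two simulated failures, then success on the third attempt.
print(call_with_retry(make_flaky_lookup(2), attempts=3))
```

Even this toy version raises questions the monolith never had to answer: how many attempts, how long to wait, and whether the remote operation is safe to repeat.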

Investment in tooling, processes, and governance is significant in comparison. Maturity in the SDLC contributes massively to the success of a distributed systems implementation. This is one reason why many SOA projects failed in the past and why MSA seems more realistic today: the maturity of tooling and processes has increased significantly.

I do not like the idea of explaining MSA by saying it’s “SOA done right.” Does it mean SOA was practiced in some wrong way years ago? (Fun fact: WSDL v1.0 was released in 2000; that’s 18 years ago.) At the time SOA emerged, the technological landscape was very different. CI/CD had not evolved to today’s level, infra as code was limited to shell/bash scripts, and devs were deploying code mostly on bare metal and barely on VMs. So the “right” way of doing SOA as perceived by today’s architects has a lot to do with the massive advancement of the enabling systems landscape and almost nothing to do with how the architecture was implemented.

I cannot stress enough the importance of developer discipline and responsibility. A disaggregated services platform essentially means that the ownership of, and the responsibility for, best practices are also distributed. This means investing in a mature engineering team, in coaching, mentoring, and training, and equally or more in site reliability engineering (SRE). It’s true that these types of costs apply to the monolith too, but in comparison, disaggregated platforms depend heavily on these smart investments.

The cost of polyglot development is debatable. On one hand, by enabling polyglot development, you enable faster development with the best technology for the job by the masters of that technology. On the other hand, organizationally you need to invest in different tooling and technology ecosystems to make it possible. I see this as a delicate balance and something that can quickly get out of hand.

Confusion in Architectural Principles

SOA has defined the means for distributed systems modeling and communication quite elegantly. I believe it’s the most well-thought-out architectural paradigm, one that can hold true through time. The way we implement SOA will evolve with technological advancement and rich tooling. If you look back, SOA was all about heavy specifications and strict rules (remember WS-*?): communication standards were well structured and rigid (WS-A, WS-RM), and governance relied on specification-heavy registries like UDDI. But then came REST/HTTP and the rules were relaxed. WS-* solved almost everything with a SOAP-based specification, while REST/HTTP went with alternatives (like doing RM with messaging, instead of handshakes and ACKs over the wire). SOA evolved over time, and today we see SOA being implemented with microservices.

As I wrote before, demand is driving the need for disaggregation: demand for throughput, efficiency, and productivity. Demand broke the monolithic application into services. Demand later made us realize that those services are monoliths themselves and can be reduced further to micro-components (microservices). Today we are reorganizing our distributed systems landscape with microservices that are very fine-grained and follow shared-nothing designs. In my opinion, not realizing what drives the need for disaggregation is the foremost confusion. Choosing the right architecture for the need comes first. A monolith is not always a bad thing.

“ESB is dead in microservices architecture,” #noESB - Confusing the “ESB” with the “service bus” pattern, and the place of ESBs in microservices architecture, is a sensitive matter. This is where I think revisiting fundamentals is important. MSA advises against a centralized intelligent bus, and this happens to be the “ESB” in SOA. The anti-pattern is putting domain-specific business logic in centralized infrastructure. Doing so kills agility and makes the system hard to manage, control, and trace. This is true. It is one of the caveats of how some of us have implemented SOA using the service bus pattern. It is also partly how technology vendors positioned their specialized service buses: instead of using the tool to implement a service bus, you give in to all the bells and whistles of the vendor product (workflows/choreography, transactions, messaging, compensation, EDI transformations, data mapping, etc.) for convenience. This ultimately makes your entire services ecosystem in a large distributed system depend on a centralized master or orchestrator and, worse, makes you depend on one vendor and one technology stack.

SOA states that you can decouple services through messaging and that communication will then be more asynchronous, highlighting the “service bus” and “messaging” as architectural patterns that solve a problem. The mistake we made in the past was to use one bulky piece of infrastructure to hold all that decoupling logic in a very centralized fashion. This is also a reason why frameworks like Apache Synapse and Apache Camel became very popular back in the day (and still are). They were lightweight and enabled devs to deploy service bus (mediation) logic in a more decentralized fashion. However, my view is that the elements that enable us to be more disaggregated today were not present five years ago. So today we have the luxury of thinking “distributed first” in a much more efficient manner. This is why I see “ESB” as a confusing term and pity the #noESB rhetoric. I try to highlight that the service bus is a pattern that can be used effectively in microservices architectures as well.

“Smart endpoints and dumb pipes” is another misunderstood idea. Under this characteristic, Martin Fowler explains why an ESB is an anti-pattern. But does that mean MSA advises point-to-point connections? This certainly bothered me for a while. The best explanation I found was in Sam Newman’s book “Building Microservices,” where he explains orchestration and choreography with regard to MSA. He argues in favor of choreography over orchestration for service interactions in MSA, which of course increases decoupling by putting more “smarts” into the endpoint.

However, he emphasizes a mixed architecture where certain services act as orchestrators, chaining anemic CRUD services. So that is definitely not point-to-point connections. In a mixed architecture of orchestration and choreography, you will have certain units of logic acting as service coordinators or buses and some as message producers or subscribers. Now, if you unpack the phrase “smart endpoints and dumb pipes,” you understand that MSA gives prominence to choreography and advises it. But the question for architects is: can you design every system in such a reactive, asynchronous way? The point I am making is that there are great use cases for choreography and some for orchestration, but almost never for point-to-point coupling. This is very much aligned with the principles of SOA too.
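The difference between the two styles can be sketched in a few lines. This is only an in-process illustration under simplifying assumptions: the `EventBus` class, the topic names, and the order functions are hypothetical, and the bus delivers events synchronously, whereas a real deployment would put a message broker in its place.

```python
from collections import defaultdict


# --- Orchestration: one coordinator calls each service in turn. ---
def reserve_stock(order):
    return {**order, "stock_reserved": True}


def charge_payment(order):
    return {**order, "paid": True}


def place_order_orchestrated(order):
    """The orchestrator owns the flow and knows every step."""
    return charge_payment(reserve_stock(order))


# --- Choreography: services react to events on a dumb pipe. ---
class EventBus:
    """A 'dumb pipe': it only routes events by topic, no business logic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self._subscribers[topic]:
            handler(event)


def wire_choreography(bus, log):
    """Each 'smart endpoint' decides for itself which events it reacts to."""

    def on_order_placed(event):
        log.append("stock_reserved")
        bus.publish("stock.reserved", event)

    def on_stock_reserved(event):
        log.append("paid")
        bus.publish("payment.completed", event)

    bus.subscribe("order.placed", on_order_placed)
    bus.subscribe("stock.reserved", on_stock_reserved)


bus, log = EventBus(), []
wire_choreography(bus, log)
bus.publish("order.placed", {"order_id": 1})
print(place_order_orchestrated({"order_id": 1}), log)
```

In the orchestrated version the flow is readable in one place but the coordinator is coupled to every step. In the choreographed version no component knows the whole flow, so adding a new subscriber requires no change to existing services, at the price of a process that is harder to see end to end.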

“Microservices architecture is about building a service mesh” is another misconception. My opinionated view is that if your microservices architecture is a mesh network of microservices, connected point to point over the wire (HTTP/gRPC), then that’s a mess in the making. This doesn’t mean I undermine service mesh frameworks (like Istio, Consul, etc.). They are great technologies that provide a god’s-eye view of a chaotic landscape. However, I believe using a service mesh framework to validate or justify an architecture built as a mesh network of microservices is illogical.

In contrast, when I work on microservices application projects I try to avoid creating a massive mesh of services. Whether I have a service mesh framework or not, a mesh of services is harder to maintain and troubleshoot over time. I prefer to use messaging (queuing/pub-sub), that is, choreography, for inter-service communication and to limit service chaining and orchestration to must-have request-response flows. Looking at the two diagrams below, you can see that I depict the use of a service mesh in both models; however, I am biased towards the model on the right, as it avoids point-to-point service coupling.

Asynchronous inter-service communication is a mental shift; not believing so is a fallacy. I feel this is something we often overlook. As we disaggregate more and decouple our architectures with messaging, as discussed before, the burden of realizing that architecture falls upon the developers. Integrating services through messaging is not the same as programming request-response flows. The mental shift is in process modeling, handling transactions, providing acknowledgments to upstream services, compensating for errors, and programming for disaggregated fault tolerance when there is no apparent call stack.
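A small sketch may help show what "no apparent call stack" means in practice. It assumes a single in-memory queue standing in for a broker, and the message kinds, the `run_until_idle` function, and the order fields are all hypothetical. The point is that a failure is reported as just another message, and the earlier step is undone by a compensating action rather than by an exception unwinding through callers.

```python
from collections import deque


def run_until_idle(queue, inventory, ledger):
    """Drain the queue, handling each message on its own.

    Handlers never return a value to a caller; they only emit follow-up
    messages. Returns the list of terminal events, in order.
    """
    outcomes = []
    while queue:
        kind, order = queue.popleft()
        if kind == "order.placed":
            # Inventory service: reserve stock, then announce it.
            inventory[order["sku"]] -= order["qty"]
            queue.append(("stock.reserved", order))
        elif kind == "stock.reserved":
            # Payment service: failure is reported as an event, not raised.
            if order["amount"] > order["credit_limit"]:
                queue.append(("payment.failed", order))
            else:
                ledger.append(order["amount"])
                queue.append(("order.confirmed", order))
        elif kind == "payment.failed":
            # Inventory service compensates for its earlier reservation.
            inventory[order["sku"]] += order["qty"]
            queue.append(("order.cancelled", order))
        else:
            # Terminal events act as the acknowledgment to upstream services.
            outcomes.append((kind, order["order_id"]))
    return outcomes


inventory, ledger = {"widget": 10}, []
queue = deque([
    ("order.placed", {"order_id": 1, "sku": "widget", "qty": 2,
                      "amount": 50, "credit_limit": 100}),
    ("order.placed", {"order_id": 2, "sku": "widget", "qty": 3,
                      "amount": 500, "credit_limit": 100}),
])
print(run_until_idle(queue, inventory, ledger), inventory, ledger)
```

Order 2 fails payment after its stock was already reserved, so the reservation is released by a compensating step and the inventory ends at 8 rather than 5. Nothing "rolled back" in the transactional sense; each service had to be programmed to undo its own work.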

Every Problem Is Not an Internet-Scale Problem

As architects, we are exposed to a rich set of tools. Today information and knowledge are freely available. Open source is winning industries. Large internet companies are opening up their powerful frameworks, technologies, and infrastructure for the masses to learn, consume, and improve. If demand is the cause of disaggregation, we have to clearly understand the type and the scale of the problem we are solving.

We have to realize that not every problem is internet scale. In fact, we rarely see the internet-scale problems that Netflix or Twitter try to solve every day. If you are opting for a microservices architecture, then the scope of each microservice is up to you to define. As enterprise solutions architects, we can decide whether a given microservice is an anemic CRUD service or a dedicated service for read (or write) operations following a pattern like CQRS. We have to decide how economical it is to choose finer granularity for the sake of an architectural fad. We have to realize the costs of disaggregation, as there is no architectural pattern without its tradeoffs.
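To illustrate the granularity choice mentioned above, here is a minimal, in-memory sketch of the CQRS idea: one service accepts writes and emits events, another answers reads from a view built from those events. The class names and event shape are hypothetical, and the events are handed over by a simple loop, where a real system would use a message channel and accept eventual consistency.

```python
class OrderWriteService:
    """Handles commands; owns the authoritative state and emits events."""

    def __init__(self):
        self._orders = {}
        self.events = []

    def place_order(self, order_id, amount):
        if order_id in self._orders:
            raise ValueError(f"order {order_id} already exists")
        self._orders[order_id] = amount
        self.events.append(("order_placed", order_id, amount))


class OrderReadService:
    """Answers queries from a view built by applying events."""

    def __init__(self):
        self._total = 0
        self._count = 0

    def apply(self, event):
        kind, _order_id, amount = event
        if kind == "order_placed":
            self._total += amount
            self._count += 1

    def summary(self):
        return {"orders": self._count, "revenue": self._total}


writer, reader = OrderWriteService(), OrderReadService()
writer.place_order(1, 30)
writer.place_order(2, 40)
for event in writer.events:
    reader.apply(event)
print(reader.summary())
```

Compared with a single CRUD service, this doubles the number of deployable units and adds an event channel between them, which is exactly the kind of cost that should be weighed against the actual read and write load.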