TL;DR: Introducing codon, a concurrency-first REST framework that makes writing aggregation services easy. Create API workflows and manage tasks spanning multiple services.

As a startup, we need to experiment with our product a lot, constantly changing our UI and our backend processes to support new experimental features and figure out what works best for our customers. This requires our developers to move fast: writing new functions, changing the process flow of our APIs, and then discarding or cleaning up the failed experiments that do not make it.

Adopting a microservice architecture gave us the agility and scalability we needed to evolve our APIs. It allowed us to distribute API development across teams based on functionality, and gave each team the freedom to choose its own technology stack. But the transition wasn't without issues, and we would like to discuss some of them along with their solutions.

Issues we faced

A monolithic system allows you to share utilities, databases, and classes. There is no clear boundary between code for one functionality and another. An API implementation in a monolithic system may call various functions to give a cohesive response to the client.

A microservice, on the other hand, is supposed to be as independent of other microservices as possible to reap the full benefits of this architecture. Microservices that deal only with their core functionality are easy to develop and maintain. However, to give a cohesive response to the client, multiple microservices have to be called. Some installations do this by chaining the API calls: the API Gateway calls Service A, which calls Service B, and so on. Each service adds its business value to the response.
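To illustrate the chaining pattern, here is a minimal sketch. The service names and data are hypothetical, and plain functions stand in for what would be HTTP calls between services:

```python
# Hypothetical sketch of chained microservices: the gateway calls
# Service A, which in turn calls Service B, each layer adding its
# own business data to the response.

def service_b(payload):
    # Innermost service: contributes only its core data.
    return {**payload, "inventory": {"sku-1": 12}}

def service_a(payload):
    # Service A must call Service B to complete its own response,
    # coupling the two services together.
    enriched = service_b(payload)
    return {**enriched, "pricing": {"sku-1": 4.99}}

def api_gateway(request):
    # The client sees one cohesive response, but it was built
    # by a serial chain of dependent calls.
    return service_a(request)

response = api_gateway({"cart": ["sku-1"]})
```

Note that the gateway never talks to Service B directly, yet the response depends on it; this implicit dependency is exactly what makes the chain fragile as it grows.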

Increasing interdependence of microservices

At Grofers, we initially chained our microservices. Our API Gateway would forward the request to one of our primary services, which would call a bunch of other services to generate the final response. The glue code that had been part of the monolithic service was rewritten into different microservices, and as it grew more complex we were in danger of writing a monolith all over again, just with the view functions hosted on different machines.

The first implementation of an API is usually fairly simple. You probably have one or two database calls, and maybe an API call to another service. Such simple APIs are easy to implement using any web API framework like Flask or Django. But as the API evolves and more feature requests are implemented, the call flow becomes increasingly complex. The number of microservices requested during an API call also increases. Microservices start to become interdependent, and are no longer complete systems on their own.
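The evolution described above can be sketched as follows. All the helper names here are invented stand-ins for real database and service clients:

```python
# Sketch of how a simple view grows: v1 makes one database call,
# while a later version piles on serial calls to other services.
# Every helper below is a hypothetical stand-in for a real client.

def fetch_product(product_id):
    # One database call — all the first version needs.
    return {"id": product_id, "name": "apples"}

def fetch_stock(product_id):
    # Call to a separate inventory service.
    return {"in_stock": True}

def fetch_offers(product_id):
    # Call to a separate offers service.
    return {"discount": 0.1}

def product_view_v1(product_id):
    # The first implementation: trivially simple.
    return fetch_product(product_id)

def product_view_v3(product_id):
    # A few feature requests later: three serial upstream calls,
    # and the view now depends on three other systems.
    product = fetch_product(product_id)
    product.update(fetch_stock(product_id))
    product.update(fetch_offers(product_id))
    return product
```

Each individual addition looks harmless, but the view now breaks whenever any of its three upstreams does.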

This increases dependencies and obligations between teams, reduces their autonomy and slows down development.

Increasing code complexity

Chaining microservices also increases code complexity for the microservice that ends up with the responsibility of gluing together data from various sources. In many of these cases, the final result contains data that should not even be the responsibility of that microservice. The code just doesn't seem to belong there, but for lack of alternatives the glue code ends up stuck there.

It is possible to have the client make multiple requests and glue the data together on the client side. But this is not suitable for mobile clients, as any change in logic would need a full app release. Moreover, it would leave backend data open to corruption and abuse. For an ecommerce app, the backend needs to expose APIs as if it were a monolith while actually being powered by microservices.

Increasing latency

Almost every incremental change to an API in a microservice is small relative to the existing functionality. When experimenting and moving fast, concurrency of upstream requests is not a priority; these changes are usually small hacks that call microservices serially. Individually they have minimal impact on code complexity and latency. Over time, though, the hacks add up and the code becomes too complex to optimize without halting development of that function altogether.
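To make the latency cost concrete, here is a small sketch in which `time.sleep` stands in for a network round trip of roughly 100 ms. Three independent upstream calls cost about 300 ms in sequence, but only about 100 ms when issued concurrently:

```python
# Why serial hacks add up: three independent ~100 ms upstream calls
# take ~300 ms one after another, but ~100 ms when run concurrently.
import time
from concurrent.futures import ThreadPoolExecutor

def call_service(name, delay=0.1):
    time.sleep(delay)  # simulated network round trip
    return {name: "ok"}

services = ["pricing", "inventory", "offers"]

start = time.monotonic()
serial = [call_service(s) for s in services]  # one after another
serial_time = time.monotonic() - start

start = time.monotonic()
with ThreadPoolExecutor() as pool:
    concurrent = list(pool.map(call_service, services))  # in parallel
concurrent_time = time.monotonic() - start

assert concurrent_time < serial_time  # concurrency cuts the latency
```

The results are identical either way; only the wall-clock time differs, which is exactly the gain that gets harder to retrofit the longer the serial hacks accumulate.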

Every single time you work on that API and look at the code, you think you should refactor it to add concurrency wherever possible and cut latency. If you work with a microservice architecture, you will probably also recognize tasks that don't belong in that microservice. The code looks unclean, but the cost of refactoring is high. So you make the small change to the view function, increasing the cost of refactoring even further.

This continues until the latency is dangerously high and the API performs so many tasks in sequence that rewriting the whole thing is the only sane option. Lather, rinse, repeat.