The inventory platform is powered by MongoDB. As soon as inventory data is consumed by the edge services, it is stored in the database, and transformation operations are applied at the database level to keep the load on the backend application minimal.
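As a rough illustration of what "transformation at the database level" means, here is a minimal sketch assuming pymongo and hypothetical collection and field names: an aggregation pipeline reshapes raw feed documents entirely inside MongoDB (the $merge stage needs MongoDB 4.2+), so the backend never streams the data through itself.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["inventory"]  # hypothetical database name

# The transformation runs inside MongoDB: raw feed documents are matched,
# reshaped, and written into the articles collection in one aggregation,
# without the backend application ever loading them into memory.
db["raw_feed"].aggregate([
    {"$match": {"store_id": {"$exists": True}}},
    {"$project": {
        "sku": "$item_code",
        "store_id": 1,
        "quantity": {"$ifNull": ["$qty", 0]},
    }},
    {"$merge": {"into": "articles", "whenMatched": "replace"}},
])
```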

The core service in inventory is a data pipeline service that performs a series of tasks: data transformation, validation, enrichment, finding anomalies in the inventory, and applying business rules and alerts on top of them. It exposes a set of APIs to create, update, fetch, delete, and execute pipelines. The pipeline is designed so that only configuration is passed between tasks; the actual data to be processed stays in the database.
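To make the config-only contract concrete, here is a minimal sketch with hypothetical task and field names: each task receives a small config dict, operates on the batch in MongoDB by reference, and hands the same config to the next task.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["inventory"]

def transform(config):
    # Operate on the batch inside the database; only config travels onward.
    db[config["collection"]].update_many(
        {"batch_id": config["batch_id"]},
        {"$set": {"stage": "transformed"}},
    )
    return config

def validate(config):
    # Flag incomplete documents in place instead of loading them.
    db[config["collection"]].update_many(
        {"batch_id": config["batch_id"], "quantity": {"$exists": False}},
        {"$set": {"valid": False}},
    )
    return config

TASKS = [transform, validate]  # enrichment, anomaly checks, etc. would follow

def execute_pipeline(config):
    for task in TASKS:
        config = task(config)

execute_pipeline({"collection": "articles", "batch_id": "batch-001"})
```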

1.2 How Does Fynd Receive Inventory

The nature of inventory sources can be broadly divided into three major categories:

Figure: Inventory Sources and Nature

1. Low Velocity, High Volume

This means the frequency of inventory feeds is low, but the number of article updates per feed is high. This is usually the case with brands that use POS systems as their Inventory Management Systems.

2. High Velocity, Low Volume

Inventory updates in this category arrive frequently, but the number of articles per update is low. This is mostly the case for brands operating from a single stock point that have integrated with an omni-channel platform, i.e. a platform that manages both orders and inventory across marketplaces. We have built real-time integrations with omni-channel platforms such as Unicommerce and Vinculum, to name a few.
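As an illustration of what such a real-time integration can look like, here is a minimal webhook sketch using Flask; the route, payload shape, and enqueue helper are hypothetical, not the actual Unicommerce or Vinculum contract.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def enqueue(payload):
    print("queued", payload)  # stand-in for the real edge-service queue

# The omni-channel platform calls this endpoint the moment stock changes
# at the brand's single stock point.
@app.route("/webhooks/inventory", methods=["POST"])
def inventory_update():
    payload = request.get_json(force=True)
    # e.g. {"sku": "TSHIRT-M", "location": "WH1", "quantity": 42}
    enqueue(payload)
    return jsonify({"status": "accepted"}), 202
```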

3. Medium Velocity, Medium Volume

The above two categories cover most brands; however, many brands have no such automation readily available. For them, we have other ways of sharing inventory, such as web services or an SFTP/FTP dump: we either provide endpoints for them to hit, or we hit their endpoints. These are not exactly real time, but we can increase the frequency to make them near real time.
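A scheduled pull over SFTP might look like the following minimal sketch, assuming paramiko and hypothetical host, credentials, and paths; running it from cron at a short interval is what pushes the feed toward near real time.

```python
import paramiko

# Hypothetical host and credentials for the brand's SFTP drop.
transport = paramiko.Transport(("sftp.brand.example.com", 22))
transport.connect(username="fynd", password="secret")
sftp = paramiko.SFTPClient.from_transport(transport)

# Pull every pending inventory dump, then delete it to avoid reprocessing.
for name in sftp.listdir("/outbox"):
    if name.endswith(".csv"):
        sftp.get(f"/outbox/{name}", f"/tmp/{name}")
        sftp.remove(f"/outbox/{name}")

transport.close()
```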

2. What Were the Challenges We Faced

We had three different sets of challenges in scaling the inventory platform: one with the infrastructure, another with database scaling, and the last with the design of the inventory platform itself.

2.1 Infrastructure Challenges

Our complete infrastructure is on AWS. The inventory platform ran on a single EC2 instance hosting all of its services, which created the challenges below:

Downtime during deployment: We had only one copy of each service, so every deployment caused 60–120 seconds of downtime.

High possibility of a Single Point of Failure (SPOF): Since the complete platform was running on a single instance, if that instance went down, the entire platform could go down with it.

Unable to scale horizontally: We did not have the infrastructure in place to scale services horizontally; the only option left was to upgrade the CPU and RAM of the instance. However, we found another way, which we'll cover in the third section.

2.2 Database Scaling Challenges

Database bottleneck: Due to the very high number of transformation operations performed in the database, the platform started hitting a database bottleneck. We needed a solution to distribute the database load across multiple database instances.

Hampered throughput: Requests for High Velocity inventory data were far more frequent than requests for Low Velocity inventory, which is expected; however, they crowded out other work and reduced the total number of jobs processed, impacting throughput.

2.3 Challenge with Design

We had only one design challenge:

Figure: Before Centralised Control Architecture

No centralised control: To handle downstream system failures without affecting the continuous inventory feed from multiple sources, we needed a central control point where we could pause or resume inventory processing.

We had a queue between the edge services and the inventory core service to control the inventory feed, but some edge services were communicating with the core service directly, skipping the queue, which forced us to start and stop those individual services to pause inventory feed processing.
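Here is a minimal sketch of the kind of centralised control we were missing, with hypothetical collection and flag names: every consumer checks one shared flag before pulling the next message, so the whole feed can be paused or resumed with a single write.

```python
import time
from queue import Queue
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
control = client["inventory"]["pipeline_control"]

def is_paused(pipeline: str) -> bool:
    doc = control.find_one({"_id": pipeline})
    return bool(doc and doc.get("paused"))

def consume(queue: Queue, pipeline: str = "inventory-feed") -> None:
    while True:
        if is_paused(pipeline):
            time.sleep(5)  # back off while the feed is paused
            continue
        message = queue.get()
        print("processing", message)  # stand-in for the core service call

# Pausing every consumer is then one write, no matter how many are running:
# control.update_one({"_id": "inventory-feed"},
#                    {"$set": {"paused": True}}, upsert=True)
```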

3. How We Overcame the Challenges and Achieved Our Goal

Now that we knew the challenges ahead of us, we started tackling them one by one.

3.1 Infrastructure: A New Approach

The idea was to make every service in the inventory platform scalable; in turn, the complete inventory platform would become scalable.

We had three options open for scaling the platform:

1. Legacy approach: deploying a service on EC2 instances in an auto scaling group behind a load balancer

2. Moving to Kubernetes

3. Moving to AWS ECS

We did not choose Kubernetes: our existing infrastructure is on AWS, which at the time did not offer a managed Kubernetes service, and setting up and operating a Kubernetes cluster ourselves would have been complex and time-consuming. The legacy EC2 approach carries the pain of managing AMI builds and per-environment Ansible configurations on each host, causing operational issues and consuming engineering effort that could be spent elsewhere. That left moving to AWS ECS, which is what we did.

ECS is a managed container service provided by AWS. It offers easy cluster management, flexible scheduling of containers, and many other useful features out of the box. We dockerized all the platform services, migrated them to a cluster in ECS, and thereby ensured high availability of all services.

We also implemented deployment strategies to avoid downtime during deployment: a rolling deployment strategy for some services and a blue-green strategy for others, as sketched below.
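As a rough illustration of the rolling strategy, here is a minimal boto3 sketch; the region, cluster, service, and task definition names are hypothetical. ECS keeps the configured minimum of healthy tasks in service while it starts replacements running the new revision.

```python
import boto3

ecs = boto3.client("ecs", region_name="ap-south-1")  # hypothetical region

# Rolling deployment: never drop below 100% healthy tasks, and allow up to
# 200% of the desired count while tasks with the new revision start up.
ecs.update_service(
    cluster="inventory-platform",        # hypothetical cluster name
    service="inventory-core",            # hypothetical service name
    taskDefinition="inventory-core:42",  # new task definition revision
    deploymentConfiguration={
        "minimumHealthyPercent": 100,
        "maximumPercent": 200,
    },
)
```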

3.2 Resolving Database Scaling

First, we’ll understand how we tackled bottlenecking of a database: