10.3 Million Concurrent People Watched It On Hotstar!

This article talks at a high level about how we developed the Hotstar mobile applications from scratch before IPL 2018, the challenges we faced and how we overcame them. We will be following up with deep dives in all these areas.

Data

Data and analytics drive Product and Engineering at Hotstar. Our data tells us that tens of millions of customers engage with Hotstar every day, and at that scale we had reason to worry, because the IPL was just around the corner.

However, we were unfazed and committed to rewriting everything from scratch so that we could build a world-class mobile app for our customers, one that, in the hands of more than 10M customers, helped us reach a streaming milestone!

Why Rewrite?

Let’s start with the elephant in the room — Why?

Well, a simple rule of programming is “Don’t mess with something that is already working”. BUT it wasn’t really working out for us. Product was flooding our JIRA board with new features, and we could not keep up because of the restrictive legacy scaffolding.

So we wanted to rewrite the app because:

The “legacy” architecture and design of the app wasn’t really scalable, and fixing such a codebase was more painful than rewriting it from scratch. It was costly too, in the sheer engineering hours needed to clean things up.

Our backend services were planning an upgrade for similar reasons. So the app had to adopt the new backend anyway, while still supporting the old backend during the transition phase. That would mean even more mess on the app side if we didn’t fix the existing code. Imagine a backend that needs too many API calls to display some data, and an app that then has to process and translate the responses to extract what it actually needs!

We wanted to adopt new design guidelines and a lot of new things on the architecture side to make the app faster and smoother, and to deliver a seamless streaming experience!

Challenges!

Given the impact it could have on our customers, we were reluctant to make any drastic changes to the existing, working code! So we started brainstorming:

Is it a good idea to deliver a completely new app, developed from scratch, to a 100 million user base just like that?

What if something breaks after the app goes live to all these millions of users? Do we have a backup plan?

If not from scratch then how do we do it?

…and a few more challenges like these!

Rocky!

Once upon a time, it all started with one monolithic structure called “app”, which contained everything (UI, business logic, network layer, etc.). We decided to build a new module, internally titled “Rocky”, with some supporting modules to fight the new challenges! Rather than do a “big bang” release, we’d layer on newer “Rocky” modules and start swapping functionality in place: release incrementally, learn, and keep going, so as to ease the impact of any new changes.

This is what our project structure looks like:

App — Old App

Rocky — New App (UI + Business Logic)

Network Module — Supporting Network Module For Rocky.

Player — Handle One Or Multiple Players.

Common Utils

Download — Download Manager

Now “app” and “rocky” were co-existing and we didn’t have to worry about touching the forbidden app! Phew…

Strategy For The Battle

Strategy 1: Delivering the app in patches instead of one big update!

We picked features one by one and started rewriting each of them from scratch in Rocky. We created two-way syncing between app and Rocky, so that the two could communicate with each other directly or through an interface. Rocky is a separate module and is directly accessible to app, but the challenge was to make Rocky aware of all the data that app was storing or updating at runtime.

We solved this by setting up an interface link between app and Rocky. For example, a PreferenceChangeListener would help us sync changes in preferences and apply them in both app and Rocky.
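A minimal sketch of what such an interface link could look like, written in plain Java so it runs without the Android SDK. Only the name PreferenceChangeListener comes from the text above; PreferenceStore, PreferenceSync and their methods are hypothetical names chosen for illustration, and the real modules would more likely sit on top of Android's preference storage.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Callback fired when a preference value changes.
// The method signature is an assumption; only the type name is from the article.
interface PreferenceChangeListener {
    void onPreferenceChanged(String key, String value);
}

// Hypothetical in-memory preference store; app and Rocky would each own one.
class PreferenceStore {
    private final Map<String, String> values = new HashMap<>();
    private final List<PreferenceChangeListener> listeners = new ArrayList<>();

    void addListener(PreferenceChangeListener listener) {
        listeners.add(listener);
    }

    String get(String key) {
        return values.get(key);
    }

    // Stores a non-null value and notifies listeners, but only if the value changed.
    // The early return is what stops two linked stores from notifying each other forever.
    void put(String key, String value) {
        if (value.equals(values.get(key))) {
            return;
        }
        values.put(key, value);
        for (PreferenceChangeListener listener : listeners) {
            listener.onPreferenceChanged(key, value);
        }
    }
}

// Hypothetical helper that wires two stores together in both directions.
class PreferenceSync {
    static void link(PreferenceStore first, PreferenceStore second) {
        first.addListener(second::put);  // changes in first flow into second
        second.addListener(first::put);  // changes in second flow into first
    }
}
```

With the two stores linked, a write on either side shows up on the other: after `PreferenceSync.link(app, rocky)` and `app.put("language", "hi")`, a call to `rocky.get("language")` returns `"hi"`.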

We set up A/B tests through Firebase Remote Config. For example, when a new user comes to the app and wishes to sign up or sign in, the Firebase A/B test decides whether the user continues with app or with the shiny new Rocky. Basically, every feature was config gated.

Once a feature built in Rocky was stable enough, we made it available to 100% of users. We just had to set the “User in Random Percentile” user property condition in Firebase to 100%. That’s it! One complete feature had now moved from app to Rocky! We migrated all the other features from app to Rocky the same way.

Had anything gone wrong in Rocky, we could have switched back to app quickly, without any code changes.
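To make the idea of a config-gated, percentage-based rollout concrete, here is a small self-contained sketch. It is not the Firebase API: with Remote Config the percentile condition is evaluated by Firebase itself, whereas this stand-in does the bucketing locally. RolloutConfig, FeatureGate, bucketOf and useRocky are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for remote configuration: feature name -> rollout percentage.
class RolloutConfig {
    private final Map<String, Integer> rolloutPercent = new ConcurrentHashMap<>();

    void setRollout(String feature, int percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be between 0 and 100");
        }
        rolloutPercent.put(feature, percent);
    }

    // Features without an entry stay at 0%, i.e. on the legacy path.
    int rolloutFor(String feature) {
        return rolloutPercent.getOrDefault(feature, 0);
    }
}

// Decides, per user and per feature, whether to route to the new module.
class FeatureGate {
    private final RolloutConfig config;

    FeatureGate(RolloutConfig config) {
        this.config = config;
    }

    // Maps a user to a stable bucket in [0, 100). String.hashCode is deterministic,
    // so the same user always lands in the same bucket for a given feature.
    static int bucketOf(String userId, String feature) {
        return Math.floorMod((feature + ":" + userId).hashCode(), 100);
    }

    // True when the user's bucket falls inside the current rollout percentage.
    boolean useRocky(String userId, String feature) {
        return bucketOf(userId, feature) < config.rolloutFor(feature);
    }
}
```

Because a user's bucket never changes, raising the percentage only ever adds users to the new path, and setting it back to 0 sends everyone to the old path without shipping a new build.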

Strategy 2: Backup Plan!

Rocky was being developed against the old backend services, but we needed an architecture in which moving to a new, more robust and scalable backend would not require touching our UI layer. With that in mind, we designed and developed the Network Module.

This module can switch between the old and the new backend or, to put it more generically, between multiple backends at runtime. If we found any issue at any point during the transition phase, we wanted to be able to jump back to the old, working, stable backend.

So here is what we did:

Rocky was made completely agnostic to the backend.

The Network Module was made responsible for carrying that burden and delivering the response Rocky expects, irrespective of the backend.

High Level Design Displaying Linking Between App, Rocky & Network Module

Rocky was built using Android Architecture Components (let’s jump there later). The ViewModel layer in Rocky implemented the HS Interface Layer from the Network Module.

The Network Module was responsible for filling the Rx Observables that Rocky was expecting. It used the same Firebase Remote Config to decide between backend services.

So another problem was solved!
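The shape of such a backend-agnostic layer can be sketched as follows. Everything here is illustrative: ContentItem, ContentRepository, LegacyBackend, NewBackend and SwitchingRepository are hypothetical names, the BooleanSupplier stands in for a Remote Config flag, and CompletableFuture is used in place of Rx Observables only to keep the example free of third-party dependencies.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.BooleanSupplier;

// Hypothetical model that the UI layer consumes, identical for every backend.
class ContentItem {
    final String id;
    final String title;

    ContentItem(String id, String title) {
        this.id = id;
        this.title = title;
    }
}

// Hypothetical contract the UI module codes against; it never sees a backend directly.
interface ContentRepository {
    CompletableFuture<ContentItem> fetchContent(String id);
}

// Stand-in for the old backend. A real one might need several API calls
// and some translation before it can produce a ContentItem.
class LegacyBackend implements ContentRepository {
    @Override
    public CompletableFuture<ContentItem> fetchContent(String id) {
        return CompletableFuture.completedFuture(new ContentItem(id, "legacy:" + id));
    }
}

// Stand-in for the new backend, returning the same model type.
class NewBackend implements ContentRepository {
    @Override
    public CompletableFuture<ContentItem> fetchContent(String id) {
        return CompletableFuture.completedFuture(new ContentItem(id, "new:" + id));
    }
}

// Picks a backend on every call, so flipping the flag takes effect at runtime.
class SwitchingRepository implements ContentRepository {
    private final ContentRepository legacy;
    private final ContentRepository next;
    private final BooleanSupplier useNewBackend;

    SwitchingRepository(ContentRepository legacy, ContentRepository next,
                        BooleanSupplier useNewBackend) {
        this.legacy = legacy;
        this.next = next;
        this.useNewBackend = useNewBackend;
    }

    @Override
    public CompletableFuture<ContentItem> fetchContent(String id) {
        ContentRepository target = useNewBackend.getAsBoolean() ? next : legacy;
        return target.fetchContent(id);
    }
}
```

The caller only ever holds a ContentRepository, so switching from the old backend to the new one (or back) changes nothing on the UI side.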

SOLID Architecture

At the time we decided to rewrite the app, Android Architecture Components were still blooming and maturing. We used the following Android components extensively to keep the design as loosely coupled as we could.

We used Fragments extensively. They helped us keep everything modular. Just look at the list of some of the features that the player page alone holds:

Player

Vertically and horizontally scrolling lists showing other content. The type of data displayed and the design of these lists vary based on the type of content.

Watch and Play, Emojis.

Heatmap and Key Moments.

Different types of player controllers: Live, Ads, VoD (episodes, movies, etc.)

Different types of ad formats

Nudge to ask the user to log in.

Nudge to ask the user to pay for All Live Sports.

Chromecast

Content Description

Error View and a lot more!

Performance

At Hotstar we are relentless when it comes to performance! We analysed our data and figured out that we could cache certain responses and data rather than requesting them from the backend every time. This improved our app start time significantly and reduced the load on the backend too!

We made sure users get their personalised content right on the home page, so that they can jump into playback in the minimum number of clicks.

We monitored every severe error and fatal exception occurring on millions of devices and tried to figure out its root cause. We fixed them or added a retry mechanism wherever it was applicable. Where the data was not enough, we added more metrics to our analytics to get more data around the problem, e.g. carrier, type of network, country, etc. As the IPL started, we started getting appreciation on social media.
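As an illustration of the kind of response caching described above, here is a small time-to-live cache. The article does not say how the cache was built or how entries were expired, so the TTL policy, the TtlCache name and the injected clock are all assumptions made for this sketch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical cache that serves a stored response until it is older than ttlMillis.
class TtlCache<K, V> {

    // A cached value together with the time at which it stops being valid.
    private static final class Entry<T> {
        final T value;
        final long expiresAt;

        Entry(T value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so that expiry can be tested without sleeping

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns the cached value if it is still fresh; otherwise calls the loader
    // (for example a backend request), stores the result and returns it.
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> cached = entries.get(key);
        if (cached != null && now < cached.expiresAt) {
            return cached.value;
        }
        V fresh = loader.apply(key);
        entries.put(key, new Entry<>(fresh, now + ttlMillis));
        return fresh;
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`, and the loader would be the call that fetches the response from the backend. Repeated reads inside the TTL window never reach the loader, which is where both the faster start and the lower backend load would come from.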

Appreciation On Social Media On The 2nd Day Of IPL 2018

We kept monitoring our data, pushed more updates through weekly releases, and made our app stronger every day and every week! We are proud that we created a robust and scalable app in 4–5 months that sailed through such a major tournament.

After The Battle

We are not done yet, and we have more battles to win! So we are hiring! Find out more about openings, tech stack and key people here: