At Tapjoy we have a team that focuses on optimizing our infrastructure. We've identified areas of our infrastructure that can be rewritten to reduce our cloud expenses and to prepare us to scale up as our traffic grows. To this end we're employing more and more Go, because it's a strongly typed, compiled language that is easy to learn and gives performance benefits right out of the box compared to other VHLLs. Lately we've been looking to inject Go into compute-heavy areas that would normally require scaling up clusters and increasing our costs. Without giving away too much of the secret sauce, we're breaking off pieces of a monolith and rewriting them in Go. More of a migration, if you will, than a microservices story.

One of the problems we faced in this migration was that the models we needed were partially computed, and the focus of the migration was on porting the logic, not on porting those models and computations. We didn't have the time to do that work. The models were, however, behind JSON hyper-schema managed endpoints, so we could easily retrieve the raw and computed data in JSON format. We found that generating and retrieving the JSON document containing all of the required models (5,000+ models, 43M raw / 7M compressed) via a single HTTP GET took too long. Instead we elected to periodically generate the JSON document and push the data to S3. The new Go service periodically retrieves the data from S3 and deserializes the JSON back into the models we need. This decoupling allows the new service to periodically refresh its data without blocking requests, with the downside that the data might be slightly out-of-date, yet still well within our SLA. We left the database behind, for now, but avoided the cost of porting over a lot more code than we wanted. We also leveraged the existing legacy system to do a lot of our work, which let us focus on the key business issue: reducing our cluster size by migrating our compute-intensive area to a more efficient language.
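To illustrate that decoupling, the refresh can run on a ticker in a background goroutine and atomically swap in the newly deserialized models, so in-flight requests never wait on S3. This is only a minimal sketch under assumed names (`Model`, `ModelStore`, and the `fetch` callback are hypothetical), not our actual implementation:

```go
package models

import (
	"sync/atomic"
	"time"
)

// Model is a stand-in for the structs we deserialize from the JSON blob.
type Model struct {
	ID    string  `json:"id"`
	Score float64 `json:"score"`
}

// ModelStore holds the current set of models behind an atomic value so
// readers never block while a refresh is in progress.
type ModelStore struct {
	current atomic.Value // holds []Model
}

// Refresh periodically pulls a fresh snapshot and swaps it in.
// fetch is expected to download and deserialize the JSON from S3.
func (s *ModelStore) Refresh(interval time.Duration, fetch func() ([]Model, error)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for range ticker.C {
		models, err := fetch()
		if err != nil {
			// Keep serving the previous (slightly stale) snapshot on failure.
			continue
		}
		s.current.Store(models)
	}
}

// Models returns the most recently loaded snapshot.
func (s *ModelStore) Models() []Model {
	v, _ := s.current.Load().([]Model)
	return v
}
```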

Synchronizing our models from S3 proved quite simple and efficient: it takes roughly 3.5 seconds to download the 7M JSON file, decompress it, and deserialize it into our structs. In development mode we found it useful, nay, necessary at times to override S3 and read from a file. If a field type changed during development, we would have to modify the hyper-schema definition in the legacy system, re-generate the JSON blob, and then upload that document to S3 so the new service could read it. But then the JSON blob's schema would have changed, breaking the service for everyone else on the team. We found it was better to read from a local file for the duration of development and testing. How could we add this simply? We could have found a package to mock S3, but we wanted something lighter and under our control.
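For context, the synchronization step itself boils down to a GET from S3, a gzip decode, and a JSON decode. Here is a minimal sketch using the AWS SDK for Go; the bucket, key, and `Model` type (reused from the hypothetical sketch above) are assumptions for illustration, not our production code:

```go
package models

import (
	"compress/gzip"
	"encoding/json"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

// fetchModels downloads the compressed JSON blob from S3, decompresses it,
// and deserializes it into model structs.
func fetchModels(bucket, key string) ([]Model, error) {
	sess, err := session.NewSession()
	if err != nil {
		return nil, err
	}
	svc := s3.New(sess)

	out, err := svc.GetObject(&s3.GetObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
	})
	if err != nil {
		return nil, err
	}
	defer out.Body.Close()

	gz, err := gzip.NewReader(out.Body)
	if err != nil {
		return nil, err
	}
	defer gz.Close()

	var models []Model
	if err := json.NewDecoder(gz).Decode(&models); err != nil {
		return nil, err
	}
	return models, nil
}
```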

Let's start with a contrived code example that we'll morph to use a simple pattern to abstract S3: