War Stories

In 2010, the night before we launched Instagram v1, my co-founder Kevin and I bet on how many people would download the app its first day in the wild. Kevin guessed 2,500, and in an especially optimistic moment, I went big and guessed 25,000. The next day, the realist in me couldn’t believe I had hit it on the nose.

Now, on our 5th birthday, Instagram has 400 million users all over the world who upload 80 million photos and videos a day. Looking back, we’ve balanced the simplicity and craft in our original product while, just in the last year, revamping search & discovery, launching a brand-new take on Instagram Direct, and continuing to release creative tools like Layout.

While our team has (thankfully) grown and evolved over the last 5 years, we’ve stayed committed to our mantra of doing the simple thing first, and are keeping it at the core of how we continue to scale into the next five years. Here’s a look at some of our biggest milestones from building Instagram over the past five years — the good, the bad, and the surprising. I hope there are takeaways that help you build and grow your own teams and companies.

Milestone #1: 1 million users in 3 months

File Under: Biggest Challenge

The first months after launch were pretty much a blur — 3AM server alert pages were the norm rather than the exception. After exploding into 25,000 users the first day, we continued to grow rapidly until we hit 1 million.

There is no motivation stronger than people actually wanting to use your product, and we went into high gear to make sure we could support the growing demand. When we started, we were running on a single server in LA with less computing power than a MacBook Pro. When I called the hosting provider asking for another server given our first day’s growth, they quoted me a four-day turnaround, or 48 hours if we rushed it. Given how unpredictable our growth looked, we decided to move onto Amazon Web Services.

Given that neither of us had deep infrastructure experience, we had to soak up as much knowledge as we could. There were great conference videos from QCon and Velocity, and articles from Facebook, Netflix, Twitter, and others. The open culture of sharing technical insights is one of the best things about our industry, and the main motivator behind our engineering blog.

Takeaway: Our mantra, “Do the simple thing first,” took shape during these first weeks and months. Since there were only two of us, we had to determine the fastest, simplest fix each time we faced a new challenge. If we had tried to future-proof everything we did, we might have been paralyzed into inaction. By determining the most important problems to solve, and choosing the simplest solution, we were able to support our exponential growth.

Milestone #2: Launching Android

File Under: Most Anticipated Launch

For the first couple of years of Instagram, Kevin and I would get a single question every single time we were onstage: “When is the Android app coming out!?”

We’d started iOS-only because we wanted to be able to iterate on our product quickly, and we were just two engineers. As we entered 2012, though, it was time to expand to multiple platforms. In typical Instagram style, our Android app was built in three months with three engineers: Philip, who joined us from building Gowalla’s Android app and leads Instagram’s mobile efforts to this day, and two others who learned Android to complete the project.

Part of my role at the time became “Professional eBay Shopper”, since we wanted to test our app on as many devices as possible, including something called the “M865 Ascend II 2 Touch”. More often than not, we’d unpack a new phone arrival in our office, load up our work-in-progress app, and stand amazed at how well the app worked on it. The breadth of Android devices has posed some challenges for us — especially when we built our Instagram Video product — but it was pretty amazing to launch to such a wide variety of devices with minimal customization required.

Over a million new people joined Instagram in the first 12 hours of our launch — it was an incredible response. At the time, I wrote up some of our lessons learned on infrastructure too. Over time, our Android app has evolved to feel more native on the platform, and today is one of the fastest, highest-rated Android apps.

Takeaway: Starting on a single platform allowed us to focus and iterate quickly without having to implement everything twice (we often say “do fewer things better” inside Instagram). When it came time to expand to multiple platforms, we built a small team combining deep Android expertise with talented engineers who were new to the platform. Over time, building a full-fledged Android team has allowed us to adapt our app more closely to the platform.

Milestone #3: 2012 Virginia Storms

File Under: Worst Outage

I was in Portland for a quick three-day weekend getaway in 2012 when my phone buzzed: “Instagram.com is DOWN”. A quick check online showed that it was beyond just Instagram; Netflix and others were experiencing issues as well. I ran back to our hotel, brought up my laptop and saw a dreaded message on the Amazon Web Services status page: “Power event in us-east”. A huge storm had blown through Virginia, and almost half of our instances had lost power. The next 36 hours would be a brutal rebuilding of almost our entire infrastructure. The silver lining is that it generated this meme image:

At the time, our whole backend team consisted of me, our first engineer Shayne, and Rick, who had started at Instagram less than a month prior. No user data had been lost, but this outage exposed how much work we had left to do in automating our infrastructure.

This outage was the kick in the butt we needed to move to a more repeatable server provisioning process. Over the next year, we moved all of our provisioning away from fragile shell scripts towards a full Chef system and substantially lowered the bar for new team members to work with our infrastructure.

We also moved away from relying on Amazon’s Elastic Block Store (EBS) for database backups, instead adopting WAL-E and Postgres’ WAL-shipping replication. In addition, we kicked off a reliability initiative that most recently yielded our Cross-Data Center effort, which has gotten Instagram running in geographically distributed data centers.

Takeaway: Having a scriptable infrastructure requires upfront work but can pay huge dividends in bringing new engineers onto your infra team, as well as helping in disaster-recovery scenarios. Also, I was so glad we’d hired engineers with the right stuff — when faced with an unimaginably bad scenario, both Shayne and Rick rolled their sleeves up and started bringing us back up, one issue at a time, Mark-Watney-style.

Milestone #4: Instagration

File Under: Most Ambitious Engineering Project

October 5, 2010: 0 users 😬

October 6, 2010: 25,000 users 😜

November 2010: 1 million users 😅

2012: 30 million users 😆

2013: 200 million users 😵

By 2013 we had 200 million people using Instagram every month and over 20 billion photos stored. Our team was growing but small, and we were thrilled by the continued growth of the Instagram community.

As time went on, we kept finding new integrations we wanted to do with Facebook’s existing backend systems; for example, their Site Integrity systems would be critical to helping us fight spam. But doing these integrations would be difficult while we were on Amazon Web Services, and the longer we waited the harder it would be to migrate our ever-growing (and ever-pricier) infrastructure.

It was clear we should migrate to Facebook’s infrastructure, but we didn’t want to disrupt our services while we moved millions of people and billions of photos. And so began Instagration, or what I like to refer to as swapping out all of a car’s parts while it’s going 100mph. A small team of eight Instagram and Facebook engineers worked to first build a common network to move Instagram from EC2 to Amazon’s Virtual Private Cloud (VPC) using a tool we built in-house called Neti. Then we meticulously migrated our systems and tooling, including building an “ig” command-line tool that bridged the patterns our developers were familiar with from AWS into the new FB datacenter environment. The end result was a huge migration with minimal disruptions.

Takeaway: Don’t reinvent the wheel. By moving to Facebook’s servers we were able to give our infrastructure a faster, more efficient home, as well as take advantage of Facebook’s other tools, such as its spam-fighting systems. We’re able to stay small but take advantage of Facebook’s resources and experience, and move that much more quickly.

Milestone #5: Trends on Instagram

File Under: Next Big Bet

Earlier this year, we revamped Search & Explore and expanded the ability to easily find interesting moments on Instagram as they happen in the world. We introduced trending hashtags and places, and built all new infrastructure to support identifying, ranking and presenting the best content on Instagram.

Our first take on trending, back in 2010, was our “Popular” page, which was available at Instagram’s launch. The algorithm was pretty simple: effectively the number of likes on each photo, decayed by the age of the photo over 4 hours. This worked great when our community was smaller, but over time we realized we needed a more nuanced approach.
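To make the likes-with-decay idea concrete, here is a minimal Python sketch. It illustrates the general technique and is not the code that actually ran behind Popular: the exponential form, the reading of “4 hours” as a half-life, the names popular_score and rank_popular, and the sample photos are all assumptions. The only details taken from the description above are likes as the signal and a 4-hour decay window.

```python
from datetime import datetime, timedelta

# The 4-hour window comes from the description above; treating it as a
# half-life is an assumption made for this sketch.
DECAY_WINDOW_HOURS = 4.0


def popular_score(likes: int, age_hours: float) -> float:
    """Likes discounted by age: the score halves every DECAY_WINDOW_HOURS."""
    if age_hours < 0:
        raise ValueError("age_hours must be non-negative")
    return likes * 0.5 ** (age_hours / DECAY_WINDOW_HOURS)


def rank_popular(photos, now):
    """Return photos sorted from highest to lowest score.

    Each photo is a (photo_id, likes, posted_at) tuple; this shape is
    hypothetical and chosen only to keep the example short.
    """
    def score(photo):
        _, likes, posted_at = photo
        age_hours = (now - posted_at).total_seconds() / 3600
        return popular_score(likes, age_hours)

    return sorted(photos, key=score, reverse=True)


now = datetime(2010, 10, 6, 12, 0)
photos = [
    ("old_hit", 300, now - timedelta(hours=12)),  # three half-lives old
    ("fresh", 50, now - timedelta(minutes=30)),   # barely decayed
    ("steady", 120, now - timedelta(hours=4)),    # exactly one half-life old
]
print([photo_id for photo_id, _, _ in rank_popular(photos, now)])
```

In this toy data, a photo with 300 likes from twelve hours ago ranks below a 120-like photo from four hours ago, which is the behavior a time-decayed like count is meant to produce: recent activity outweighs raw totals.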

Given our larger community, in 2014 we worked on personalizing Explore, bringing infinitely scrollable pages of photos & videos tailored to each person. Within a few months, our users were interacting with content at 5x the rate of our un-personalized Explore. This year, we brought back the intention of the original Popular page — a glimpse at the gestalt of Instagram — as our Trending product. With the ranking and machine learning experts that had since joined our team, we were able to adapt well-known trending algorithms to the nuances of Instagram’s community.

Takeaway: Doing the simple thing first doesn’t mean your solution will work forever. We’ve learned to be open to evolving our product, and spinning up purpose-built teams like our Datagram team, to adapt to our rapidly scaling community.

The last five years have been a wild ride for many of us, and it’s been nice to pause and reflect on the occasion of our birthday. I’m sure as our community continues to grow and our product continues to evolve, there’ll be no shortage of things to talk about in my “looking back 10 years” Medium post. Here’s to the next five years!