I started at the top, working first on the dashboard that would serve the executive team. It would take the form of a weekly tearsheet of metrics, handed out at the executive meeting every Tuesday. I worked closely with our VP Product and VP Operations, iterating on different ways the company could be measured.

We went back and forth over a few solutions, and after a few weeks the format was settled. The tearsheet would list the most important metrics relevant to each team, going back five weeks.

In separate tabs, each metric in this document would be clearly documented with precise definitions.

Next to each weekly number, there would also be a projected number, drawn from our master planning document. We could compare the two to see whether we were on track with our overall strategy.

This exec tearsheet is our most important dashboard. It lays the foundation for how people at 500px should be thinking about their work. From this tearsheet, dashboards for the other teams followed.

I ended up creating a set of dashboards for each of the following teams:

Product

Web

Mobile

Marketing

Sales

These dashboards were built in Periscope and were visual. Each dashboard included the metrics that the team was responsible for in the exec tearsheet, plus other supporting metrics. There would be one chart per metric, usually a line chart. Each chart had its definition included with it, so users could hover over the info icon next to the chart and read the definition.

There were two ways to access these dashboards. Users could log in to see their dashboards anytime they needed to. I also set Periscope to email each team their dashboards every morning.

Emailing dashboards out was especially helpful to developers. They didn’t need to log into Periscope very often and thus didn’t need to build the habit. They could get everything they needed through email.

On Defining Metrics

There are better resources than my own experience for how to measure companies. But I will talk briefly about the themes for each dashboard.

At the executive level, the tearsheet includes metrics that are mostly daily averages by week. This makes it easy to stay within one frame of reference, for example when comparing against daily counts of active users.
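To make "daily averages by week" concrete, here is a minimal sketch in Python. The function name, the grouping by ISO week, and the sample numbers are all assumptions made for illustration; the real tearsheet was built from the data warehouse, not from this code.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def weekly_daily_average(daily_counts):
    """Average the daily values that fall within each ISO week.

    daily_counts maps a date to that day's count (for example, active users).
    Returns a dict keyed by (ISO year, ISO week number).
    """
    by_week = defaultdict(list)
    for day, count in daily_counts.items():
        iso_year, iso_week, _ = day.isocalendar()
        by_week[(iso_year, iso_week)].append(count)
    return {week: mean(counts) for week, counts in sorted(by_week.items())}


# Hypothetical data: two full weeks of daily active users, starting on a Monday.
start = date(2015, 1, 5)
daily_active_users = {start + timedelta(days=i): 1000 + 10 * i for i in range(14)}

weekly = weekly_daily_average(daily_active_users)
for (year, week), average in weekly.items():
    print(f"{year}-W{week:02d}: {average} daily active users on average")
```

Because every weekly figure is still expressed in "per day" units, it can be read side by side with any daily count without rescaling.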

The metrics that we measure exist in relation to our engagement funnels. You can think of three funnels at 500px:

visitor -> signup -> daily active -> daily engaged -> paid subscriber

visitor -> signup -> photo upload -> photo submit to marketplace -> photo sold on marketplace

visitor -> signup -> purchase photo from marketplace
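One way to read a funnel like these is as a series of step-to-step conversion rates. The sketch below is illustrative only: the stage counts are made up, and `funnel_conversion` is a hypothetical helper, not something we ran at 500px.

```python
def funnel_conversion(stages):
    """Return step-to-step conversion rates for an ordered list of (name, count)."""
    rates = []
    for (prev_name, prev_count), (name, count) in zip(stages, stages[1:]):
        # Guard against an empty stage so the division never fails.
        rate = count / prev_count if prev_count else 0.0
        rates.append((f"{prev_name} -> {name}", rate))
    return rates


# Hypothetical counts for the subscription funnel.
subscription_funnel = [
    ("visitor", 10000),
    ("signup", 1000),
    ("daily active", 400),
    ("daily engaged", 100),
    ("paid subscriber", 5),
]

rates = funnel_conversion(subscription_funnel)
for step, rate in rates:
    print(f"{step}: {rate:.1%}")
```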

Each team owns different parts of this funnel for different products:

The marketing teams own the top (page views) and bottom (revenue) of this funnel

The product teams put less emphasis on top-of-funnel metrics

The development teams (web and mobile) want to see the entire funnel with respect to their own products.

The book Lean Analytics promotes the use of ratios (uploads/dau per date) over counts (uploads per date). Our team decided that counts would be easier to socialize up and down the leadership chain.

Counts are used for everything except product metrics, where we focus on measuring features in two ways:

Frequency of use (# of times a user uploads photos a month)

Magnitude of use (# of photos per upload)
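As a rough illustration of these two measures, the sketch below computes both from a month of upload events. The event shape `(user_id, photos_in_upload)`, the choice to average frequency over uploading users only, and the sample data are assumptions for the example.

```python
from collections import Counter


def usage_metrics(upload_events):
    """Compute frequency and magnitude of use from one month of uploads.

    upload_events is a list of (user_id, photos_in_upload) tuples.
    Frequency is uploads per uploading user; magnitude is photos per upload.
    """
    uploads_per_user = Counter(user_id for user_id, _ in upload_events)
    total_uploads = len(upload_events)
    total_photos = sum(photos for _, photos in upload_events)
    frequency = total_uploads / len(uploads_per_user) if uploads_per_user else 0.0
    magnitude = total_photos / total_uploads if total_uploads else 0.0
    return frequency, magnitude


# Hypothetical month of upload events for three users.
events = [("a", 3), ("a", 5), ("b", 1), ("c", 2), ("c", 4), ("c", 3)]
frequency, magnitude = usage_metrics(events)
print(f"uploads per user: {frequency}, photos per upload: {magnitude}")
```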

Implementation

Rolling out the infrastructure and new measurement system requires training. People need to learn how to use Periscope and have a clear idea of what the metrics are all about.

Periscope Training

Periscope is based on SQL, so people need to know how to write SQL to use it effectively. There are two great resources that I point people to:

Once they have the SQL skills, I give them the Schema Definitions Document. In this Google Sheet I define every table and column in the data warehouse. If a column is categorical, I list its possible values.

Metrics Evangelism

Just giving everyone dashboards and reports is not enough. Everyone needs to get on the same page with respect to what we are measuring.

I hosted an hour-long lunch and learn where I described the measurement framework to the entire company. I spent a good 45 minutes talking about every metric that we use. I tried my best to be fun and engaging in spite of such dry material.

This is necessary. Not everyone knows what a daily active user is. It’s not obvious. There is great confusion when dozens of metrics are thrown around but there’s no consensus on definitions.

On top of this, for a company to be data driven, everyone needs to know how their work fits with respect to the metrics. So metrics need to be ingrained in the culture, talked about from when people set their quarterly objectives to when problems are solved tactically.

The communications campaign needs to be ongoing.

Data Accuracy

Metrics are often wrong. I think this is rarely talked about publicly, out of general embarrassment. But every company faces this.

There’s so much room for failure in this system:

Logs could get parsed wrong, and you undercount or over-count an event

The log sender on an API server could fail and you don’t notice and you miss a fraction of your events

There might be network issues one day, and 10% of your log entries might fail to send to S3

Some metrics that are important might be hard to query in the database and you might pull the wrong number

etc

This doesn’t include hard-fail events like your ETLs failing and the metrics not being refreshed. The errors listed above are silent.

Errors will happen. Someone’s butt is gonna get kicked every time metrics need to be restated to the board. But it’s hard to avoid completely.

How do you combat this?

Always be auditing.

Tie important numbers to external sources of truth. We pull Daily Active Users from our data warehouse and compare them against Google Analytics daily. We also do a match of our subscription metrics to Stripe/Paypal.
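A daily comparison like this can be automated. The sketch below is one possible shape for it, not the check we actually ran: the 5% tolerance, the function name, and the sample numbers are all assumptions. Two independent sources are unlikely to match exactly, so the check flags only the days where they diverge by more than the tolerance.

```python
def find_discrepancies(internal, external, tolerance=0.05):
    """Return the days on which two sources disagree by more than `tolerance`.

    internal and external map a day (any sortable key) to a count.
    The relative difference is measured against the larger of the two values.
    """
    flagged = {}
    for day in sorted(set(internal) | set(external)):
        ours = internal.get(day)
        theirs = external.get(day)
        if ours is None or theirs is None:
            flagged[day] = "missing in one source"
            continue
        baseline = max(ours, theirs)
        if baseline and abs(ours - theirs) / baseline > tolerance:
            flagged[day] = f"warehouse={ours}, external={theirs}"
    return flagged


# Hypothetical daily active users from the warehouse and from an external tool.
warehouse_dau = {"2015-06-01": 1000, "2015-06-02": 1020, "2015-06-03": 700}
external_dau = {"2015-06-01": 990, "2015-06-02": 1015, "2015-06-03": 1005}

flagged = find_discrepancies(warehouse_dau, external_dau)
for day, detail in flagged.items():
    print(f"{day}: {detail}")
```

In this made-up data only the third day is flagged, which is the kind of drop a failed log sender would produce.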

We also cross-reference the fact tables that we create from the logs against the dimension tables that we create from MySQL. This is a check against parse errors. For example, we compare new users in our users dimension table to signup events in the signups fact table.
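The users-versus-signups comparison boils down to a set difference in both directions. Here is a minimal sketch, assuming both tables can be reduced to a list of user IDs for the same period; the IDs and the `cross_check` helper are hypothetical.

```python
def cross_check(dimension_user_ids, fact_signup_user_ids):
    """Compare new users in the dimension table with signup events in the fact table."""
    dim = set(dimension_user_ids)
    fact = set(fact_signup_user_ids)
    return {
        # Users that exist in MySQL but have no signup event in the logs.
        "missing_from_fact": sorted(dim - fact),
        # Signup events in the logs with no matching user row.
        "missing_from_dimension": sorted(fact - dim),
    }


# Hypothetical user IDs for one day of signups.
new_users_in_dimension = [101, 102, 103, 104]
signup_events_in_fact = [101, 102, 104, 105]

report = cross_check(new_users_in_dimension, signup_events_in_fact)
print(report)
```

Either list being non-empty is a reason to look at the log parser or the log sender before trusting the day's numbers.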

Lastly, we rely on our teams to report when they think the metrics look funny. Unexplainable outliers, jumps or dips in numbers, or a pair of inconsistent metrics — these are all signals that something might be wrong.

But the system is still crazy complex and hard to QA. After talking to people in analytics roles at many different companies, my realistic takeaway is that you should trust your numbers with at most 80% confidence.

This is a hard truth to swallow.

Working with Engineering

Analytics is a partnership between several business units: product, marketing, customer support, sales, and engineering.

In particular, care must be taken to manage the relationship with engineering.

Analytics needs to be on the engineering roadmap. At each feature release:

data models need to be easily queried for analytics

event logging needs to be put in for relevant user actions

At the same time, analysts need to be aware of how engineers work. They need to know the cadence of the sprints, how work is distributed on the teams, and how best to put in requests for engineering time.

I was fortunate that as my needs grew, engineering and product were both considerate. On my side, I had a bit to learn about how to work with established engineering teams.

20% Rule — Analytics Needs Evangelism

It’s funny. Considering that in my last role I sold software to tech companies, I greatly underestimated how much selling this role would need.

Without the right structure, the work that analysts do is hard to understand. They are alone in an organization, drifting from team to team, project to project. It’s hard to stay motivated without visibility and a support network.

I’ll give you my own experience.

My plan at 500px was to build analytics in stages:

infrastructure

reporting

analytics

predictive modelling

Analytics, the part where analysts dig deep into the data to figure out the business, wouldn’t be effective to do until reporting and the infrastructure were fixed.

But when questions start floating down from the top about why you aren’t doing your job, why the analytics part isn’t being done while you are still building out the infrastructure, it’s highly demotivating. It’s hard to push back unless you had buy-in on your roadmap in the first place.

Time also needs to be spent communicating wins. People need to feel excited about the progress happening on the data front. It helps with getting stuff done that you need help on, and keeping politics off your back.

People don’t naturally get data, why it’s important, and how it’s done. All three are important to the success of the analyst.

If I had to do anything differently, or if I do this again, I would allocate 20% of my time towards evangelism. Building processes for communication. Lunch and learns. Presentations. Socialising. Selling.

Spend 20% of your time selling. It’s important.

Reflection

In the past year, we’ve fixed the data infrastructure and have a sane reporting system. We’ve rolled this out to the company, with training and socialization.

Hurdles along the way included a lack of resources and internal company politics. But they didn’t stop us from getting things done.

We now have a system where people know how the company is doing, and can self-serve their own analysis and data pulls without asking the analyst. We’re far from where we started and well on our way to transforming 500px into a top data-driven start-up.

What’s Next?

What a year. We haven’t even scratched the surface yet with respect to analytics. There’s so much more to do:

Front end analytics like Google Analytics aren’t well understood

Modelling out the whole pirate metrics by channel

Segmentation of our user base

Causal modelling for the impact of feature changes

Creating an LTV model to use for marketing

A process for A/B testing

Expanding the team

It’s going to be an exciting time.

But I’m stepping away from my role. My plan for coming into 500px was always that it would be temporary, and that I would eventually move west to San Francisco.

In July, I will be moving to SF to join Wish. I’ll be working on their data platform.

I’m excited to head to tech mecca. And I’m looking to meet others that are interested in data.

If you want to grab coffee in SF, or are interested in replacing me at 500px (we are looking for a senior data engineer, business analyst, and data analyst), send me a message at shanzhen.hu@gmail.com. All jobs are posted at https://500px.com/jobs.

Thanks to @josephby for being a great boss, @andyyangstar for bringing me in, Kevin Martin for taking over the infrastructure after I left, @robbraun for showing me how Uken Games implemented its analytics infrastructure, and @seanpower for a great talk about data. It’s been a good year.