Source: Tech in Asia, https://www.techinasia.com

After five years of working in the game industry as an analyst and data scientist (DS), I decided to make the transition to startup life and join a company in the financial technology (FinTech) sector. While this is a new domain with new types of problems to solve, many of the skills that I developed in the game industry are still relevant in my new role. The goal of this post is to show how data science skills learned in one domain can transfer to new, unexpected opportunities.

One of the key differences in my new role was a shift from business-to-consumer (B2C) to business-to-business (B2B). Rather than performing analysis on millions of consumers, I’m now building data products that help hundreds of non-profit organizations and for-profit companies. There’s also less of a focus on product analytics, and more of a focus on providing accurate data to our end users. Rather than exploring tracking data from video games and video streaming services, I’m now working with data from a variety of sources, including real-estate sales and political contributions.

My primary goal in my new role is to build predictive models that provide accurate estimates of net worth for affluent households in the US. The data sources are different and our customers are businesses rather than consumers, but many of the key functions of data science translate well across these domains. Here are some of the skills I’ve put to use in this new role.

Exploratory Data Analysis (EDA)

Being able to dig into data sets is important for any domain. A data scientist should be able to visualize distributions of data, identify outliers, and find correlations in the data.

Often the goal of exploratory analysis is to determine if there is a signal in the data for predicting an outcome. For example, in the game industry a common goal is to identify if a user is about to churn, or lapse in gameplay. This can be represented as a classification problem, but usually providing aggregate statistics, such as a funnel analysis, was sufficient for providing feedback to a game team. In my FinTech role, a common goal is to identify features that are correlated with net worth, and to determine if we can engineer additional features that will improve the accuracy of our net worth estimates.

Here are the types of questions I investigated in these domains:

Gaming: Is a new subscription model going to cannibalize sales?

FinTech: Are political contributions correlated with net worth?

I used similar methods to explore these types of questions, including SQL for data munging, and R for visualization and correlation analysis.
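As a rough illustration of the correlation and outlier checks described above, here is a minimal Python sketch (the analysis in this post was done in SQL and R, so this is a stand-in rather than the original code). The function names `pearson` and `iqr_outliers`, the 1.5 × IQR outlier rule, and the sample figures are all assumptions made up for the example.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def iqr_outliers(values, k=1.5):
    """Return the values that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


if __name__ == "__main__":
    # Invented example data: yearly political contributions (USD) and
    # household net worth (millions of USD). Not real figures.
    contributions = [200, 500, 1000, 2500, 5000, 10000]
    net_worth = [0.4, 0.9, 1.5, 3.0, 4.8, 60.0]

    print("correlation:", pearson(contributions, net_worth))
    print("net worth outliers:", iqr_outliers(net_worth))
```

A single extreme household can dominate a Pearson correlation, so it is usually worth checking for outliers before reading much into the coefficient.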

Experimentation

One of the core tenets at Twitch was “experiment to decide,” and the data science team used a combination of A/B testing and staged rollouts to realize this goal. An important element of experimentation is being able to measure the results of a change and determine whether the change had a significant impact on user behavior. In my current role, we often need to determine whether a marketing campaign had a measurable impact for our customers, and similar methods can be used for this type of task. In both cases, I’ve used bootstrapping to measure the difference between treatment and holdout groups in an experiment and to test for statistical significance.
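To make the bootstrapping idea concrete, here is a minimal Python sketch of a percentile bootstrap for the difference in means between a treatment group and a holdout group. It is one common way to do this, not necessarily the exact procedure used at Twitch or in my current role; the function name `bootstrap_diff`, its parameters, and the sample values are assumptions for illustration.

```python
import random


def bootstrap_diff(treatment, holdout, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for mean(treatment) - mean(holdout).

    Returns (observed_difference, lower_bound, upper_bound).
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping the original group sizes.
        t = rng.choices(treatment, k=len(treatment))
        h = rng.choices(holdout, k=len(holdout))
        diffs.append(sum(t) / len(t) - sum(h) / len(h))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    observed = sum(treatment) / len(treatment) - sum(holdout) / len(holdout)
    return observed, lower, upper


if __name__ == "__main__":
    # Invented per-user values (for example, sessions per week). Not real data.
    treatment = [10, 11, 12, 9, 10, 11]
    holdout = [1, 2, 1, 2, 1, 2]
    observed, lower, upper = bootstrap_diff(treatment, holdout)
    print("observed difference:", observed)
    print("95% interval:", (lower, upper))
```

If the interval excludes zero, the difference is treated as statistically significant at the chosen `alpha`; an interval that straddles zero suggests the change had no measurable effect.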

Here are examples of experiments in these different domains:

Gaming: Did an app redesign improve mobile user retention?

FinTech: Did a marketing campaign drive incremental revenue?

Similar methodologies can be used to measure the impact of these experiments, even though the domains are significantly different.

Predictive Modeling

Another important skill for most data science roles is building predictive models. In the game industry, this often involves building models to predict which users are likely to purchase or lapse in gameplay. If you can detect which users are most likely to perform these actions, you can nudge them to take a desired action, such as sending an email prompting players to log in and collect an in-game reward. Another use of predictive models is to evaluate feature importance in order to guide game design. For example, at Electronic Arts I built a regression model which identified that users who explored more of the playbook generally had lower retention rates than players who focused on optimizing a few specific plays.

One of the changes I had to make when switching domains was adopting new metrics for evaluating the performance of a predictive model. Metrics that were useful for classification tasks in the game industry, such as F1 score and ROC curves, didn’t translate well to tasks in FinTech such as predicting the value of a home. For this kind of regression task, you can use relative error, correlation coefficients, or other error metrics such as mean log error.
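Here is a minimal Python sketch of two of the regression metrics mentioned above. The exact definitions are assumptions: “relative error” is taken as the mean of |predicted − actual| / actual, and “mean log error” as the mean absolute difference between the logs of predicted and actual values. The function names and home values are invented for the example.

```python
import math


def mean_relative_error(actual, predicted):
    """Mean of |predicted - actual| / actual (actual values must be non-zero)."""
    errors = [abs(p - a) / a for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def mean_abs_log_error(actual, predicted):
    """Mean of |log(predicted) - log(actual)| (all values must be positive)."""
    errors = [abs(math.log(p) - math.log(a)) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Invented home values in USD. Not real data.
    actual = [250_000, 400_000, 1_200_000]
    predicted = [275_000, 380_000, 900_000]
    print("mean relative error:", mean_relative_error(actual, predicted))
    print("mean abs log error:", mean_abs_log_error(actual, predicted))
```

Both metrics measure error in proportion to the true value, which matters when targets span several orders of magnitude: a $50,000 miss on a $250,000 home is far worse than the same miss on a $5 million home.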

Here are examples of predictive models I built for these domains:

Gaming: Which users are most likely to purchase a subscription?

FinTech: Which households are most likely to donate to a nonprofit?

As with EDA, the same tools can be used for building predictive models across these domains, and I used R for prototyping models in both.

Data Products

Beyond prototyping predictive models, data scientists should have a process in place for scaling up models for deployment. In the game industry, this typically involved handing off a model specification to an engineering team, while in my current role I am much more hands-on in productizing models. In both domains, you can use an intermediate model format such as PMML to separate model training from model deployment.
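The idea of separating training from deployment through an intermediate model format can be sketched in a few lines of Python. Note the substitution: this example uses a small JSON document as a simplified stand-in for PMML (which is an XML-based standard) and is not a PMML implementation. The function names, the `square_feet` feature, and the numbers are all hypothetical.

```python
import json


def train_linear(xs, ys, feature_name):
    """Fit a one-feature least-squares line and return it as a plain dict spec."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return {
        "model": "linear_regression",
        "feature": feature_name,
        "slope": slope,
        "intercept": intercept,
    }


def export_spec(spec):
    """Training side: serialize the model spec (JSON stand-in for PMML)."""
    return json.dumps(spec)


def score_batch(spec_json, records):
    """Deployment side: score records using only the serialized spec."""
    spec = json.loads(spec_json)
    return [spec["intercept"] + spec["slope"] * r[spec["feature"]] for r in records]


if __name__ == "__main__":
    # Invented training data: square footage vs. sale price in USD.
    spec_json = export_spec(
        train_linear([1000, 1500, 2000], [200_000, 290_000, 410_000], "square_feet")
    )
    print(score_batch(spec_json, [{"square_feet": 1800}, {"square_feet": 2400}]))
```

The scoring function never sees the training code or data, only the serialized spec, which is what lets the two sides be built, deployed, and scaled independently.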

In both of these domains, I’ve built batch-mode data products that calculate values for millions of records on a weekly basis:

Gaming: Calculate a subscription conversion score.

FinTech: Create estimated valuations for real estate.

My tool set for performing these tasks has changed between roles. I previously worked with engineering teams to build custom solutions, and have shifted to using open tools such as PMML and Apache Beam to productize models.

Conclusion

I made a big shift in my data science career from the game industry to FinTech and work with very different data sources in my new role, but many of the skills I developed in gaming have translated well. Exploratory analysis, experimentation, predictive modeling, and building data products are useful skills for all data scientists, regardless of the domain in which they are applied.

My day-to-day focus has also changed, but this is due to a change in the type of role as much as the change in domain. I’ve shifted from a role focused on product analytics to one more focused on machine learning. I spend less time building dashboards, and more time writing code and submitting PRs.