Top 5 Skills you need to acquire before transitioning to Applied AI

1 — Statistics

To understand Machine Learning, a solid grasp of statistics fundamentals is essential. This involves understanding the following:

Different ways to measure model success (precision, recall, area under ROC curve, etc.), and how your choice of loss function and evaluation metric biases the outputs of your model.

How to understand overfitting and underfitting, and the bias/variance tradeoff.

What confidence you can attribute to the results of your model.
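As a quick illustration of how these metrics behave, here is a pure-Python sketch (toy labels, no libraries; in practice you would likely reach for something like scikit-learn) showing the precision/recall tradeoff for an aggressive classifier:

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# A model that predicts positives aggressively: perfect recall, weaker precision.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0]
p, r = precision_recall(y_true, y_pred)
print(p, r)  # precision 0.6, recall 1.0
```

Which of the two you optimize for depends entirely on the cost of false positives versus false negatives in your application.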

2 — Machine Learning Theory

When you are training a neural network, what is actually happening? What makes some tasks doable and others not? A good approach to this might be to first try to understand Machine Learning through graphics and examples, before diving deeper into the theory.

Concepts to understand range from how different loss functions work to why backpropagation is useful and what a computational graph is. A deep understanding is crucial both for building a functional model and for communicating about it effectively to the rest of the organization. Following are a few resources, starting with high-level overviews and diving deeper.
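To make the computational-graph idea concrete, here is a minimal sketch (a toy one-parameter model, not any particular framework's API) of what "actually happens" during a forward and backward pass:

```python
# Backpropagation on a tiny computational graph: loss = (w * x - y) ** 2,
# differentiated node by node via the chain rule.
def forward_backward(w, x, y):
    # Forward pass: evaluate each node and keep intermediates.
    pred = w * x          # node 1
    diff = pred - y       # node 2
    loss = diff ** 2      # node 3
    # Backward pass: propagate gradients from the loss back to w.
    d_loss = 1.0
    d_diff = 2 * diff * d_loss   # d(loss)/d(diff)
    d_pred = d_diff              # d(diff)/d(pred) = 1
    d_w = d_pred * x             # d(pred)/d(w) = x
    return loss, d_w

# One gradient-descent step reduces the loss.
w, x, y, lr = 0.0, 2.0, 4.0, 0.1
loss, grad = forward_backward(w, x, y)
w -= lr * grad
print(loss, forward_backward(w, x, y)[0])  # loss drops from 16.0
```

Frameworks like TensorFlow automate exactly this bookkeeping over graphs with millions of parameters.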

Another fundamental skill is the ability to read, understand and implement research papers. It can seem like a daunting task at first, so a good way to start is to look up a paper that already has code attached to it (on GitXiv for example) and try to understand the implementation in depth.

3 — Data Wrangling

Ask any Data Scientist and they’ll tell you 90% of the work they do is data munging. This is just as important for Applied AI, as the success of your model correlates hugely with the quality (and quantity) of your data. Data work takes many forms and falls into a few categories:

Data acquisition (finding good data sources, accurately gauging the quality and taxonomy of the data, acquiring and inferring labels)

Data pre-processing (missing data imputation, feature engineering, data augmentation, data normalization, cross-validation split)

Data post-processing (making the outputs of the model usable, cleaning out artifacts, handling special cases and outliers)

The best way to get familiar with data wrangling is to grab a dataset in the wild and try to use it. There are many datasets online, and many social media and news outlet sites have great APIs. Following the steps above, a good way to learn is to:

Grab an open dataset and examine it. How big is it (number of observations and features)? How is the data distributed? Are there any missing values or clear outliers?
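A first examination can be sketched in a few lines; the rows below are hypothetical stand-ins for whatever open dataset you pick (real work would typically use pandas, but the checks are the same):

```python
import statistics

# Toy dataset standing in for an open dataset: rows of (age, income),
# with None marking missing values (hypothetical data for illustration).
rows = [(25, 40000), (31, None), (45, 88000), (29, 52000), (52, 910000)]

n_obs, n_features = len(rows), len(rows[0])
incomes = [r[1] for r in rows if r[1] is not None]
n_missing = sum(1 for r in rows if None in r)

print(f"{n_obs} observations x {n_features} features")
print(f"rows with missing values: {n_missing}")
print(f"income mean: {statistics.mean(incomes):.0f}, "
      f"median: {statistics.median(incomes):.0f}")
# A mean far above the median hints at outliers (here, the 910000 income).
```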

Start building a pipeline of transformations to go from raw data to usable data. How will you backfill missing values? What is a proper way to deal with outliers? How will you normalize data? Can you create more expressive features?
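Such a pipeline can be as simple as chaining a few column transformations; here is a stdlib-only sketch of median imputation followed by z-score normalization (toy values, hypothetical helper names):

```python
import statistics

def impute_median(values):
    """Backfill missing values (None) with the column median."""
    present = [v for v in values if v is not None]
    med = statistics.median(present)
    return [med if v is None else v for v in values]

def zscore(values):
    """Normalize a column to zero mean and unit variance."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]

raw = [3.0, None, 5.0, 7.0, None]
clean = impute_median(raw)   # [3.0, 5.0, 5.0, 7.0, 5.0]
normalized = zscore(clean)
print(normalized)
```

A key discipline is to fit statistics like the median and standard deviation on the training split only, then reuse them on validation and test data.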

Examine your transformed dataset. If everything looks good, move on to the next part!

4 — Debugging/Tuning models

Debugging Machine Learning algorithms that fail to converge or to give sensible results involves a very different process from debugging code. In the same vein, finding the right architecture and hyperparameters requires solid theoretical fundamentals, but also good infrastructure work to be able to test different configurations out.

Because of the pace at which the field moves, the methods for debugging models are constantly evolving. Here are a few “sanity checks” from our discussions and experience deploying models; they mirror in some ways the KISS principles familiar to many Software Engineers.

Start with a simple model that has been proven to work on similar datasets to get a baseline as soon as possible. Classical statistical learning models (linear regression, nearest neighbors, etc.) or simple heuristics or rules will often get you 80% of the way and be much faster to implement. Start with solving the problem in the simplest way (see the first few points of Google’s rules of ML).

If you decide to train a more complex model to improve upon that baseline, start by training it to overfit on a very small sub-section of your dataset. This ensures that your model at least has the capacity to learn. Iterate on your model until you can overfit on 5% of your data.

Once you start training on more data, hyperparameters start playing a bigger role. Understand the theory behind those parameters to know what reasonable values to explore are.

Take a principled approach to tuning your model. At the bare minimum, write down the configurations you’ve used and a summary of their results. Ideally, use an automated hyperparameter search strategy. Random search can be sufficient at first; feel free to explore more principled approaches.
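The "write everything down, search randomly" advice can be sketched in a few lines; the scoring function below is a hypothetical stand-in for a real training run:

```python
import random

random.seed(0)

def train_and_score(lr, hidden_units):
    """Stand-in for a real training run; returns a validation score.
    (Hypothetical objective that peaks near lr=0.01, hidden_units=64.)"""
    return -((lr - 0.01) ** 2) * 1e4 - ((hidden_units - 64) / 64) ** 2

# Random search: sample configurations, log every trial, keep the best.
log = []
for _ in range(20):
    config = {
        "lr": 10 ** random.uniform(-4, -1),        # log-uniform learning rate
        "hidden_units": random.choice([16, 32, 64, 128, 256]),
    }
    score = train_and_score(**config)
    log.append((score, config))                    # write down every trial

best_score, best_config = max(log, key=lambda t: t[0])
print(best_config, best_score)
```

Sampling the learning rate log-uniformly matters: reasonable values span several orders of magnitude, so a uniform grid wastes most trials.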

A lot of those steps can be accelerated significantly by your development skills, which brings us to our last skill.

5 — Software Engineering

A lot of Applied Machine Learning will allow you to leverage Software Engineering skills, sometimes with a little twist. These skills include:

Testing various aspects of the pipeline (data pre-processing and augmentation, input and output sanitization, model inference time).

Building code in a modular and reusable way to accelerate the speed at which you can experiment.

Backing up (checkpointing) models at different points in training.

Setting up a distributed infrastructure to run training, hyperparameter search, or inference more efficiently.
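Checkpointing, for instance, needs very little machinery; here is a minimal sketch (a simulated training loop with pickle, whereas real frameworks provide their own checkpoint formats):

```python
import os
import pickle
import tempfile

def save_checkpoint(path, step, weights):
    """Back up model state so a long training run can be resumed."""
    with open(path, "wb") as f:
        pickle.dump({"step": step, "weights": weights}, f)

def load_checkpoint(path):
    with open(path, "rb") as f:
        return pickle.load(f)

# Simulated training loop that checkpoints every few steps.
weights = [0.0, 0.0]
ckpt = os.path.join(tempfile.gettempdir(), "model_step.ckpt")
for step in range(1, 11):
    weights = [w + 0.1 for w in weights]   # stand-in for a gradient update
    if step % 5 == 0:
        save_checkpoint(ckpt, step, weights)

state = load_checkpoint(ckpt)
print(state["step"])  # 10
```

The payoff is resilience: a crashed spot instance or a killed job costs you minutes of progress instead of days.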

For more details on some of the software skills we recommend acquiring to become a quality Machine Learning Engineer, check out our post dedicated to transitioning to Applied AI from Academia.

Putting the tools to work

The resources above will help you approach and tackle actual Machine Learning problems. But the field of Applied AI changes extremely quickly, and the best way to learn is to get your hands dirty and actually try to build out an end-to-end solution to solve a real problem.

Action Items:

Find an interesting product you could build. What would make your life more efficient? What tool could improve the way something is done using data? What is a data-driven way to solve an interesting problem?

Search for datasets related to the question. For most tractable problems today, labelled data is what you are looking for here. If no labelled datasets exist for exactly your problem, it is time to get creative. What are ways you can find similar data, or label it efficiently, or bootstrap it some other way?

Start by exploring the data and see if the task you are trying to accomplish is possible with the amount and quality of data you have. Before you bring out TensorFlow, it is a good idea to look online for ways that people have solved similar problems. What are some relevant blog posts, and papers you could read to get up to speed on good avenues to explore?

Find some inspiration, then dive in! Remember that while Machine Learning Engineering is about building products at heart, there is a research aspect to it. You will explore models and paradigms that will prove unsuccessful, and that is perfectly fine, as it will lead you to understand the intricacies of the problem better.

Accelerate your transition

Iterating rapidly on modeling and deployment, and learning from those experiences, is the best way to quickly get up to speed. Because of this, individuals looking to make the transition to applied AI roles need to take advantage of GPU compute to accelerate their progress. We recommend using Paperspace for a hassle-free way of doing this.

Paperspace is a cloud computing platform with cutting-edge GPUs and all the latest AI frameworks. Their systems make it possible to get models up and running in a matter of minutes and to prototype recently published research within days. You can launch your own ML machine for as little as $5 using the code INSIGHTAI5.

Keeping up to date

AI is an exciting, ever-changing field. The demand for Machine Learning Engineers is strong, and it is easy to get overwhelmed with the amount of news surrounding the topic. We recommend following a few serious sources and newsletters, to be able to separate PR and abstract research from innovations that are immediately relevant to the field. Here are some sources to help out: