Top 5 Skills you need to acquire before transitioning to Applied AI

1 — System Design

When machine learning and deep learning are employed to solve business problems, you must design systems that account for the overall business operations. The system’s components should be modular, follow clear logic, and be thoroughly documented for others.

Questions to ask for each project:

How do you efficiently train your model without impacting day-to-day production?

Where do you store and back up models?

How do you run quick inference?

What are concrete metrics you can relate to your model?

If needed, how do you integrate a human feedback loop?

Do you need deep learning to solve the problem, and if so, why?

2 — Structured ML Modules

Jupyter notebooks, while wildly popular for rapidly prototyping deep learning models, are not meant to be deployed in production. For this reason, academics should push themselves to build structured ML modules that both follow best practices and demonstrate that they can build solutions others can use.

Action Items:

Take a look at this GitHub repo (and related blog) from an Insight AI Fellow that, in addition to having some exploratory notebooks, converts these ideas into a well-structured repo that can be called from the command line.

Read up on Tensor2Tensor (repo) and work to expose your model’s training and inference through an elegant API.
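As a rough illustration of what "callable from the command line" can look like, here is a minimal sketch using only the Python standard library. Everything in it is hypothetical: the toy "model" is just the mean of the training values, and the names `train`, `predict`, `build_parser`, and the `toy_model` program name are placeholders rather than anything taken from the repos mentioned above.

```python
import argparse
import json
from pathlib import Path


def train(data_path: str, model_path: str) -> dict:
    """Fit a toy 'model' (the mean of the inputs) and save it as JSON."""
    values = [float(tok) for tok in Path(data_path).read_text().split()]
    model = {"mean": sum(values) / len(values)}
    Path(model_path).write_text(json.dumps(model))
    return model


def predict(model_path: str, value: float) -> float:
    """Load the saved model and return how far `value` is from the mean."""
    model = json.loads(Path(model_path).read_text())
    return value - model["mean"]


def build_parser() -> argparse.ArgumentParser:
    """Expose training and inference as two subcommands."""
    parser = argparse.ArgumentParser(prog="toy_model")
    sub = parser.add_subparsers(dest="command", required=True)

    train_cmd = sub.add_parser("train", help="fit and save the model")
    train_cmd.add_argument("--data", required=True)
    train_cmd.add_argument("--model", required=True)

    predict_cmd = sub.add_parser("predict", help="score a single value")
    predict_cmd.add_argument("--model", required=True)
    predict_cmd.add_argument("--value", type=float, required=True)
    return parser


def main(argv=None) -> None:
    """Dispatch to train or predict; argv=None falls back to sys.argv."""
    args = build_parser().parse_args(argv)
    if args.command == "train":
        print(train(args.data, args.model))
    else:
        print(predict(args.model, args.value))
```

In a real module you would typically call `main()` under an `if __name__ == "__main__":` guard, so the same functions can be imported by other code or run as `python toy_model.py train --data data.txt --model model.json`.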

3 — Software Testing

Academics often run code to find and eliminate errors in an ad hoc manner, but building AI products requires a shift towards using a testing framework to systematically check if systems are functioning correctly. Using services like Travis CI or Jenkins for automatic code testing is a great first step to showing you can work in a company’s production environment.

Action Items:

Check out a good starter blog post on testing by Alex Gude (Insight alumnus, now Staff Data Scientist at Intuit).

Read Thoughtful Machine Learning, which goes more in depth on how to test machine learning systems.

Read this paper on the tests and monitoring that companies care about for production-ready ML.

Work through how you would test machine learning algorithms. For example, design tests that ensure a piece of your ML system is modifying data in the way you assumed (e.g. correctly preprocessing image data by making it the correct size for the model to use).

Check out testing options in Python.

Read this article on the differences between different continuous integration services. We recommend trying out Travis CI as a quick intro into this industry-standard practice.
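The image preprocessing example mentioned above could be tested along the following lines. This is only a sketch under stated assumptions: images are represented as plain nested lists so the example needs nothing beyond the standard library, and `TARGET_SIZE`, `resize_nearest`, `scale_pixels`, and `preprocess` are hypothetical names, not part of any particular framework.

```python
import unittest

TARGET_SIZE = (4, 4)  # hypothetical (height, width) expected by the model


def resize_nearest(image, size=TARGET_SIZE):
    """Resize a 2-D list of pixel values with nearest-neighbour sampling."""
    src_h, src_w = len(image), len(image[0])
    dst_h, dst_w = size
    return [
        [image[r * src_h // dst_h][c * src_w // dst_w] for c in range(dst_w)]
        for r in range(dst_h)
    ]


def scale_pixels(image):
    """Map 0-255 integer pixels to floats in [0, 1]."""
    return [[p / 255.0 for p in row] for row in image]


def preprocess(image, size=TARGET_SIZE):
    """Full preprocessing step: resize, then scale."""
    return scale_pixels(resize_nearest(image, size))


class PreprocessTest(unittest.TestCase):
    def test_output_has_model_input_size(self):
        image = [[0] * 7 for _ in range(9)]  # a 9x7 input image
        out = preprocess(image)
        self.assertEqual(len(out), TARGET_SIZE[0])
        self.assertTrue(all(len(row) == TARGET_SIZE[1] for row in out))

    def test_pixels_are_scaled_to_unit_range(self):
        image = [[0, 255], [128, 64]]
        flat = [p for row in preprocess(image) for p in row]
        self.assertTrue(all(0.0 <= p <= 1.0 for p in flat))
        self.assertEqual(min(flat), 0.0)
        self.assertEqual(max(flat), 1.0)
```

Saved in a file such as `test_preprocess.py` (a placeholder name), these tests can be run locally with `python -m unittest`, and a continuous integration service can run the same command on every commit.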

4 — Integrating with data infrastructure

No matter what company you join, you will have to access their often large data stores to provide the training and testing data you need for your experiments and model building. To show that you would be able to contribute on day one, demonstrate that you are able to interface with structured data records.

Academics typically experience a world where all the data they use can be stored locally, which is rarely the case in industry. Similarly, many competitions and research problems are structured in a way that academics only need to use a folder of images.

To demonstrate industry know-how, academics should show that they can (1) query from large datasets and (2) construct more efficient datasets for deep learning training.
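One way to practice the first point is to stream query results in batches instead of loading a whole table into memory. The sketch below uses the standard library's `sqlite3` with an in-memory database as a stand-in for a real data store. The `samples` table, its columns, and the helper names `make_demo_db` and `iter_batches` are all assumptions made for illustration.

```python
import sqlite3
from typing import Iterator, List, Tuple


def make_demo_db(n_rows: int = 10) -> sqlite3.Connection:
    """Create an in-memory table standing in for a large data store."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE samples (id INTEGER PRIMARY KEY, feature REAL, label INTEGER)"
    )
    conn.executemany(
        "INSERT INTO samples (feature, label) VALUES (?, ?)",
        [(i * 0.5, i % 2) for i in range(n_rows)],
    )
    conn.commit()
    return conn


def iter_batches(
    conn: sqlite3.Connection, query: str, batch_size: int = 1000
) -> Iterator[List[Tuple]]:
    """Yield query results in fixed-size batches rather than all at once."""
    cursor = conn.execute(query)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


conn = make_demo_db(10)
query = "SELECT feature, label FROM samples WHERE label = 1 ORDER BY id"
for batch in iter_batches(conn, query, batch_size=2):
    # Each batch could be handed to a training step here.
    print(len(batch), batch[0])
```

The same generator pattern carries over to other database drivers that follow the Python DB-API, since `fetchmany` is part of that interface.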

Action Items:

5 — Model Serving

It’s one thing to have built a solid ML or deep learning model that has excellent accuracy. It’s another thing to turn that model into a package that can be incorporated into products and services. While many academics using ML are very familiar with model metrics (e.g. accuracy, precision, recall, F1 scores, etc.), they need to become familiar with the metrics that companies care about when it comes to fast, reliable, and robust ML services.
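Latency percentiles and throughput are common examples of such service-level metrics, although the article does not name specific ones. The sketch below assumes a hypothetical `predict_fn` callable standing in for a model and measures it with the standard library only. The helper names `measure_latency` and `summarize` are placeholders.

```python
import statistics
import time


def measure_latency(predict_fn, inputs, warmup=5):
    """Time each call to predict_fn and return latencies in milliseconds."""
    for x in inputs[:warmup]:
        predict_fn(x)  # warm-up calls are not timed
    latencies_ms = []
    for x in inputs:
        start = time.perf_counter()
        predict_fn(x)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def summarize(latencies_ms):
    """Report median and tail latency plus sequential throughput."""
    cuts = statistics.quantiles(latencies_ms, n=100)  # 99 cut points
    total_s = sum(latencies_ms) / 1000.0
    return {
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "p99_ms": cuts[98],
        "throughput_rps": len(latencies_ms) / total_s if total_s > 0 else float("inf"),
    }


def fake_predict(x):
    """Hypothetical stand-in for a real model's inference call."""
    return sum(i * x for i in range(1000))


report = summarize(measure_latency(fake_predict, list(range(200))))
print(report)
```

Tail percentiles such as p95 and p99 are reported alongside the median because a service can look fast on average while still being slow for a noticeable fraction of requests.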

Action Items:

Accelerate your transition

Iterating rapidly on modeling and deployment, and learning from those experiences, is the best way to quickly get up to speed. Because of this, individuals looking to make the transition to applied AI roles need to take advantage of GPU compute to accelerate their progress. We recommend using Paperspace for a hassle-free way of doing this.

Paperspace is a cloud computing platform with cutting-edge GPUs and all the latest AI frameworks. Their systems make it possible to get models up and running in a matter of minutes and to prototype recently published research within days. You can launch your own ML machine for as little as $5 using the code INSIGHTAI5.

Keeping up to date

AI is an exciting, ever-changing field. The demand for Machine Learning Engineers is strong, and it is easy to get overwhelmed by the amount of news surrounding the topic. We recommend following a few serious sources and newsletters so you can separate PR and abstract research from innovations that are immediately relevant to the field. Here are some sources to help out: