
While flashy deep learning research grabs headlines, what happens to models after they are trained is equally important. To build a great product, you need to plan for the entire lifecycle of machine learning models, from data collection and training to deployment and monitoring. This becomes even more critical when deploying ML models outside of the cloud, directly in mobile apps where you face the unique challenges of supporting multiple platforms, hundreds of chipsets, and billions of installs.

The good news is that the same best practices used to create lovable experiences still apply when neural networks are involved. Your tools, though — version control, continuous integration, monitoring, security, etc. — need to be built with mobile machine learning in mind. Below are 7 stages of a model’s lifecycle you need to manage in order to deliver reliable, scalable mobile experiences.

1. Collect

Gathering an initial dataset is the first step in any machine learning project regardless of where the results will be deployed. When targeting mobile devices, though, think carefully about the conditions in which applications will be used and augment training data accordingly.

A model may achieve high accuracy on the bright, well-lit images of the ImageNet dataset but perform poorly in the low-light settings smartphone users actually encounter. Augmenting your data by dimming, blurring, and adding noise to images — or collecting them directly from mobile cameras — will boost the performance of models in production. Smartphones also come equipped with a large array of high-quality sensors you can use to create your own proprietary data.
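The dim-and-add-noise augmentation described above can be sketched in a few lines of NumPy. The function name, dimming factor, and noise level here are illustrative defaults, not tuned values:

```python
import numpy as np

def augment_low_light(image, dim_factor=0.4, noise_std=8.0, seed=None):
    """Simulate low-light mobile capture: darken the image, then add
    Gaussian sensor noise. `image` is an HxWxC uint8 array."""
    rng = np.random.default_rng(seed)
    dimmed = image.astype(np.float32) * dim_factor            # darken
    noisy = dimmed + rng.normal(0.0, noise_std, image.shape)  # sensor noise
    return np.clip(noisy, 0, 255).astype(np.uint8)

# A flat mid-gray "photo" comes back darker and noisier.
img = np.full((32, 32, 3), 200, dtype=np.uint8)
aug = augment_low_light(img, seed=0)
```

In a real pipeline you would apply transforms like this (plus blur, rotation, and color shifts) on the fly during training, e.g. via your framework's data-loading hooks.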

Continue collecting data even after your app is deployed. Capture the inputs and outputs of models running on devices and monitor accuracy to improve models over time. Make sure to respect storage, bandwidth, and connectivity limitations on devices: you don’t want to fill up storage with cached images or deplete a data allowance streaming video back to the cloud. In some cases you’ll need to filter out sensitive data for greater privacy.
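One simple way to respect a device's storage budget while capturing model inputs and outputs is a bounded buffer that evicts the oldest samples first. This is a hypothetical sketch (the class and byte limit are invented for illustration), not any particular SDK's API:

```python
from collections import deque

class CaptureBuffer:
    """Hold at most `max_bytes` of recently captured samples on device,
    evicting the oldest first so the cache never exceeds its budget."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.samples = deque()

    def add(self, sample: bytes):
        self.samples.append(sample)
        self.used += len(sample)
        while self.used > self.max_bytes:           # evict oldest first
            self.used -= len(self.samples.popleft())

buf = CaptureBuffer(max_bytes=10)
buf.add(b"aaaa")
buf.add(b"bbbb")
buf.add(b"cccc")   # pushes usage to 12 bytes, so "aaaa" is evicted
```

The same idea extends naturally to uploads: drain the buffer only when the device is on Wi-Fi and charging, so data collection never eats into a user's cellular allowance.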

2. Train

Today, most model training happens in the cloud. Datasets are large, and optimizing hundreds of millions of parameters requires a lot of processing power. In the future, AI-specific mobile processors will enable training directly on mobile devices, keeping user data private and secure. For now, a variety of cloud-based training platforms support exporting trained models directly to mobile-friendly formats like Core ML or TensorFlow Lite.

Regardless of where your model is trained, the best results are achieved when the training environment matches the deployment environment as closely as possible. Make sure to simulate optimizations like quantization inside the training loop to keep accuracy high. Finally, keep track of the metadata for each model you train, including the datasets used, hyperparameters, the platform it’s targeting, and any other versioning information.
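Simulating quantization inside the training loop usually means "fake quantization": round weights to the integer grid they will occupy on device, then convert back to float so gradients still flow. The sketch below shows a simplified symmetric per-tensor scheme in NumPy, assuming int8-style deployment; a real quantization-aware training setup would come from your framework's tooling:

```python
import numpy as np

def fake_quantize(w, num_bits=8):
    """Quantize-then-dequantize a float tensor so the forward pass sees
    the same rounding error the deployed integer model will.
    Symmetric per-tensor scheme; a simplified sketch, not full QAT."""
    qmax = 2 ** (num_bits - 1) - 1               # e.g. 127 for int8
    scale = max(np.max(np.abs(w)) / qmax, 1e-8)  # map range to int grid
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return (q * scale).astype(w.dtype)           # back to float

w = np.array([-1.0, -0.5, 0.0, 0.5, 1.0], dtype=np.float32)
wq = fake_quantize(w)  # close to w, but snapped to the int8 grid
```

Applying `fake_quantize` to weights (and optionally activations) during training lets the network adapt to quantization error before export, which is why on-device accuracy stays close to the float baseline.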