This is why we use third-party data labelling providers to clean our datasets, so that we end up with high-quality data for AI training. Making this switch several months ago was the single most impactful change we implemented to reach our goal.

Knowing the Business Domain

Sterblue’s strategy has always been to address business verticals one after another. This proved highly valuable, as we realized that deep business domain knowledge is a critical success factor in applied machine learning.

This business knowledge allowed us to design appropriate data representations, optimize labelling tools, ensure data quality, and improve many other aspects of the data science pipeline.

One of the many classes of defects found on distribution power lines

Training deep neural networks without business knowledge would be like teaching a topic you don’t know anything about just by following a book: you’d think it could work in theory, but it fails badly in practice.

On the other hand, the detailed business knowledge we gained by working with real-world data over the course of several years is an invaluable asset for us now.

Making Hard Science Happen

As its name implies, data science is a science. Some people portray it as an art form, but they could not be more wrong.

Data science is science, and that means assessing results against real-life objective measurements and never losing sight of the hard truths. It’s easy to fool yourself and get enthusiastic about the few amazing results your AI produced while they sit in a sea of garbage results. Consistent, objective metrics are a good way to know precisely where you are and whether you are heading in the right direction. Anecdotal results are not.
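
To make the idea of a consistent objective metric concrete, here is a minimal sketch in Python. It computes precision and recall for a binary "defect / no defect" classifier on a held-out set. The function name `precision_recall` and the labels and predictions are hypothetical and made up for illustration; this is not a description of Sterblue's actual evaluation pipeline.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 means 'defect'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # defects correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed defects
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical held-out ground truth and model predictions (illustrative only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints "precision=0.75 recall=0.75"
```

Tracking numbers like these on the same held-out set after every change is what separates a measured improvement from a handful of impressive-looking examples.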

Monitoring neural network training

Success in AI is not achieved by performing one-of-a-kind stunts, but by methodically applying state-of-the-art methods, with a healthy dose of pragmatism and creativity along the way.

The Power of Mixing

Machine learning is not a fully solved problem. This means that, for a given problem, several methods can provide good results. As a data scientist, you need to stay open to various approaches and try them in order to find the ones most relevant to the use case.

A neural network architecture

Often, the robustness of the final solution is achieved by using a smart mix of several solutions. Adversarial examples illustrate why a single approach can be fragile: a simple sticker or a few changed pixels can fool some neural network architectures into confusing one object with another, seemingly unrelated one.

Just as pure-breed animals tend to be more fragile and mixed-breed animals more robust, a pure-breed application of neural nets is sometimes fragile. A product that mixes various applied machine learning approaches tends to be more robust.
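
One simple way to mix approaches is majority voting across independent models. The sketch below is an assumption chosen for illustration, not the mixing strategy Sterblue uses; the function `majority_vote` and the three sets of labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample by majority vote."""
    combined = []
    # zip(*predictions) groups the labels that each model gave to the same sample.
    for labels in zip(*predictions):
        most_common_label, _count = Counter(labels).most_common(1)[0]
        combined.append(most_common_label)
    return combined


# Hypothetical outputs of three different approaches on the same four images.
cnn_labels = ["defect", "ok", "defect", "ok"]
rules_labels = ["defect", "ok", "ok", "ok"]
trees_labels = ["ok", "ok", "defect", "defect"]

print(majority_vote([cnn_labels, rules_labels, trees_labels]))
# prints ['defect', 'ok', 'defect', 'ok']
```

With an odd number of models and two possible labels, ties cannot occur. The point of the sketch is that an input that fools one model does not change the final answer unless it also fools a majority of the others.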

Tooling Support

As we have seen, achieving valuable results in applied machine learning relies on the key elements listed above.

What I haven’t mentioned yet is that each of these elements relies on a lot of tooling to be carried out efficiently. Developing tools that support and optimize all these steps is actually what applied machine learning is about.

Various Sterblue tools on the platform

Developing new neural network models is just a tiny part of what makes for success in applied machine learning. The bulk of the work actually lies in all the supporting tools around them.

From highly efficient data labelling interfaces to optimal drone flight planning, all Sterblue tools help make machine learning efficient to use at scale.