With the Python SDK and integrated Jupyter notebooks, any statistics and visualizations are straightforward to implement.

Consensus labeling scenario

It’s quite common for several people to be given the task of labeling the same set of images. Images for which the manual annotations are similar are considered successfully labeled, while images that are annotated differently are subject to manual revision. The way to measure annotation differences is task-specific: tag matching for classification, Intersection over Union for object detection and semantic segmentation. The Python SDK is used to build a script that compares annotations and associates a specific tag (“pending review”) with images for which the annotations disagree. This can be done via the integrated Jupyter Notebook in a couple of clicks using prepared cookbooks.
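As an illustration, here is a minimal sketch of such a comparison for segmentation masks. The function names and the IoU threshold are assumptions for the sake of the example, not the actual cookbook code; in the real script each flagged image would then get the “pending review” tag via the Supervisely SDK.

```python
import numpy as np

IOU_THRESHOLD = 0.8  # assumed agreement threshold; tune per task

def mask_iou(mask_a, mask_b):
    """Intersection over Union for two boolean segmentation masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union > 0 else 1.0

def needs_review(annotations):
    """True if any pair of annotator masks disagrees beyond the threshold."""
    for i in range(len(annotations)):
        for j in range(i + 1, len(annotations)):
            if mask_iou(annotations[i], annotations[j]) < IOU_THRESHOLD:
                return True
    return False
```

For classification, the same loop would compare tag sets instead of computing IoU.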

Figure 12. Example of Python code to perform automatic comparison of annotations

The reviewer goes through the images automatically marked with the “pending review” tag and manually resolves conflicts.

Figure 13. In this example “Alex” made a correct annotation, while “Dima’s” labeling lacks precision

Additional materials:

Collection of Jupyter Notebooks — to automate common data processing workflows.
Supervisely SDK overview — an outline of available modules and the areas they cover.
Working with annotation primitives with the SDK — the basics of working with annotated images in Supervisely format using the Python SDK.

What’s next

In the next section we further extend the range of tasks covered by the platform by introducing neural networks. As you will see, active learning, AI-assisted labeling and multi-stage processing pipelines are natural extensions of the platform functionality described above.

5. Neural networks

A unique feature of Supervisely is the largest collection of Deep Learning models for computer vision available online, which can be used for training and inference in a unified way without any coding (figure 14).

Below is a brief summary of the models available:

Figure 14. Overview of Deep Learning models available

The underlying reason we are able to provide so many production-level and state-of-the-art models is that Supervisely is framework agnostic.

Figure 15. Docker and the Supervisely SDK make it possible to support Deep Learning models from a variety of frameworks

We rely on Docker technology and the Supervisely SDK to integrate neural networks from TensorFlow, PyTorch, Keras, Darknet and other frameworks that provide a Python interface (figure 15).

We also understand that machine learning experts have already implemented tons of task-specific neural network architectures, so here is a guideline on integrating your own model into the platform.

Once a model is chosen (either one from our library or a model you have integrated yourself), it’s time to leverage more training data to improve its performance.

Active learning


Here is where Supervisely shines: since it incorporates the labeling interface, data manipulation, team management and neural networks into a single environment, implementing the Active Learning approach becomes straightforward.

Each time the model needs improvement, the following steps should be taken:

1. Training step. Use the existing Training Set to train a model.
2. Evaluation step. Run the model on labeled images from the Test Set and calculate performance metrics. Run the model on unlabeled images and save its predictions.
3. Labeling step. Identify mislabeled images, send them for manual labeling and then add the newly labeled images to the Training Set. Go to step 1.
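One round of the loop above can be sketched in plain Python. Here `train_model`, `evaluate`, `predict` and `request_labeling` are hypothetical stand-ins for the corresponding platform operations, and confidence-based selection is just one common way to pick images for relabeling:

```python
def active_learning_round(train_set, test_set, unlabeled,
                          train_model, evaluate, predict, request_labeling,
                          confidence_threshold=0.5):
    # 1. Training step: fit a model on the current training set.
    model = train_model(train_set)

    # 2. Evaluation step: score the model on the held-out test set
    #    and run inference on the unlabeled pool.
    metrics = evaluate(model, test_set)
    predictions = predict(model, unlabeled)  # [(image, confidence), ...]

    # 3. Labeling step: send low-confidence images for manual labeling
    #    and fold the newly labeled images back into the training set.
    uncertain = [img for img, conf in predictions if conf < confidence_threshold]
    newly_labeled = request_labeling(uncertain)
    return train_set + newly_labeled, metrics
```

Calling this function repeatedly, with the returned training set fed into the next call, gives the continuous improvement cycle shown in figure 16.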

Figure 16. Continuous model improvement via Active Learning

Each time model improvement is needed, steps 1–3 are taken (figure 16). If dozens of tools are involved in the process, the time spent is high and mistakes due to data conversions and transformations may arise. These kinds of mistakes are extremely expensive and hard to identify.

Training data verification

Applying a model to annotated images and tagging the images for which prediction errors are high is an easy way to identify annotation mistakes in the training data. If these mistakes are systematic, identifying and correcting them early on may save months of data scientists’ time.
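A minimal sketch of this verification step, assuming a hypothetical `predict` callable and per-image ground-truth labels; the real script would attach a review tag via the Supervisely SDK instead of returning a list:

```python
def find_suspicious_annotations(labeled_images, predict, error_threshold=0.5):
    """Return ids of images whose predictions disagree strongly with labels."""
    suspicious = []
    for image_id, ground_truth in labeled_images:
        prediction = predict(image_id)
        # Error here is the fraction of mismatching class labels;
        # for segmentation tasks, 1 - IoU would be a natural choice.
        error = sum(p != g for p, g in zip(prediction, ground_truth)) / len(ground_truth)
        if error > error_threshold:
            suspicious.append(image_id)
    return suspicious
```

Images returned by this function are then reviewed by a human, exactly as in the consensus-labeling scenario described earlier.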

Multi-stage data processing pipelines

A common scenario in production settings is applying several models sequentially, where each model relies on the processing results of the previous one. Figure 17 illustrates a road quality inspection pipeline where semantic segmentation and detection models are applied in turn to identify road defects.

Figure 17. Road quality inspection pipeline

A similar pipeline, but for a nutrition facts recognition task, is shown in figure 18. In this case, a semantic segmentation model is applied to find the Nutrition Facts label on the box. Then a rectangle is put around each line of text. Finally, a CNN-LSTM OCR model processes the images within the rectangles to recognize the text.
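The structure of such a pipeline can be sketched as a simple function chain, where each stage consumes the previous stage’s output. All three models here are hypothetical stubs, not the platform’s actual interfaces:

```python
def run_pipeline(image, segment_label, detect_text_lines, recognize_text):
    # Stage 1: semantic segmentation finds the nutrition-facts label.
    label_region = segment_label(image)
    # Stage 2: a detection model puts a rectangle around each line of text.
    line_boxes = detect_text_lines(label_region)
    # Stage 3: an OCR model recognizes the text inside each rectangle.
    return [recognize_text(label_region, box) for box in line_boxes]
```

Inside the platform, each stage is a separately trained model, so any single stage can be retrained and swapped without touching the rest of the pipeline.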

Figure 18. Nutrition facts recognition pipeline

The pipelines above are production systems that solve the task in an end-to-end manner and can be continuously improved inside the platform as more labeled data becomes available.

AI-assisted labeling

AI-assisted labeling is the key to getting more high-quality training data in a shorter period of time. There are several ways to leverage deep learning models to label data more efficiently.

1. Leverage available models for classification, object / landmark detection, and semantic segmentation.

A model is applied to unlabeled images, and the automatically generated predictions are used as a starting point in the annotation process (figure 19).

Figure 19. Using a pre-trained detection model to speed up the labeling process

2. Leverage Deep Learning models that were designed to interact with the user and to minimize the number of clicks required to label an object.

A great example of this approach is our Smart tool — basically, a neural network trained in a class-agnostic way to obtain a segmentation mask for the dominant object inside a specified rectangle (figures 20, 21).

Figure 20. Smart tool for pixel-wise annotation of a car

Figure 21. Smart tool for pixel-wise food annotation

The interactive approach above is applicable not only to the semantic segmentation task but to landmark detection as well. This feature is currently at the research stage and will be released in the future.

The Smart tool is yet another neural network and can be trained the same way other Deep Learning models are trained inside the platform. So, usually, it’s a matter of several mouse clicks to run the data preparation scripts and initiate the training procedure to obtain a version of the Smart tool optimized for your specific objects and domain.


What’s next

Let’s add compute power on top of the previous components. As you will see, adding more computational resources is a matter of running a single command.

6. Computational cluster

Figure 22 illustrates the process of adding one more machine with GPUs to the computational cluster. Essentially, it’s a one-step procedure — just execute the autogenerated command on your Linux machine with Docker and Nvidia Docker installed. After running the command, a Supervisely Agent (an open-sourced tiny Python program) is installed on the machine, and it can be used to run training or inference tasks.

Figure 22. Attaching GPU machine to the platform

Here is an important note:

It does not really matter whether you use AWS machines, your home or office computers, or any combination of these resources. As soon as the Supervisely agent is installed, your computational cluster is available 24/7.


What’s next

The last section is devoted to potential Enterprise customers with a developed internal infrastructure, for whom custom integration is required.

7. Custom integration options

For our corporate customers we offer Supervisely Enterprise Edition. Supervisely EE was designed to run inside a private network without any access to the internet, so you can be sure that no services or data will be exposed to potential threats from the web.

You can also use any S3-compatible storage if you want to keep your data in a local cloud storage, or attach your custom data storage backend. Your existing user management system can be integrated via OpenID or LDAP.

But even if the existing integration mechanisms don’t meet your needs, you can always extend Supervisely via the low-level RESTful API, attach a custom plugin as a Docker image or use the integrated Python Notebooks.

Figure 23. Supervisely infrastructure overview

Additional materials:

Check out our API Reference

Next we will outline the future directions of our work.

Future work

There are a number of directions to follow that will make the platform even stronger:

1. More Deep Learning models available

Even today Supervisely provides the largest collection of models for semantic segmentation, object detection and classification tasks. Ideally, every state-of-the-art neural network related to computer vision would be available in our platform. We will keep monitoring the latest research to keep the neural network library up to date, as well as support more computer vision tasks and the corresponding Deep Learning models.

2. AI assisted labeling

There are categories of tasks for which traditional ways of labeling do not work at all. Take the portrait segmentation task as an example. It’s impossible to manually create high-quality segmentation masks for objects like hair — probably, the only way to address it is by using Photoshop. But Photoshop is not designed to work with training data, and we see AI-powered photo-editing tools as part of the platform in the future.

3. Additional annotation primitives

In the next release we are going to include a tool that allows annotating images with custom graphs — an ideal tool for landmark annotation. As Deep Learning methods are used more extensively in AI products, more and more complex structures need to be supported.

4. Further workflow simplification

Time-to-market is critical, so we will keep simplifying common workflows to further shorten the path from raw data to production models.

Conclusions

The Supervisely platform is a good starting point for AI-powered, production-level applications in Computer Vision. The platform is designed to address a wide range of tasks, from data annotation to building and deploying the latest Deep Learning models.

Because of our exclusive focus on Computer Vision, we were able to design a platform that allows annotators, reviewers, machine learning and domain experts to work within a single web-based environment in a collaborative manner, use and customize data processing pipelines, and move fast from raw images to a production application.

We have introduced seven major components of the platform that address the tasks on the way to a product in a systematic manner. We highly encourage you to try the Community Edition of Supervisely for free or speak with us about an Enterprise solution for your business.

Also, most components of Supervisely are open sourced, so everyone is very welcome to contribute to our GitHub repo.