🚘 Traveling the ML. Next hop: a step-by-step guide to recognizing drivable area

We are proud to release the “Supervisely Country Roads” dataset, share our research as a tutorial on building a semantic segmentation model for drivable area, and provide the source code, so you can see how to build, download and use custom models with Supervisely and easily reproduce every step below.

Motivation

The self-driving industry is huge today: companies invest in hardware, software, and labeled data. And it is not only about self-driving cars; it is also about courier robots that deliver goods, road quality inspection and other applications.

The main purpose of this post is to give the quickest possible guide to the most common task in the self-driving industry: drivable area recognition.

What we want is to …

Take training data, spend as little time as possible on data conversion / processing routines

Pick a model, something Unet-like, train & run it

Visualize the results in a Jupyter notebook

Homework: try to improve our initial solutions by leveraging public datasets and more complex neural network architectures

What we don’t want is to …

Spend hours downloading gigabytes of annotated images and converting them from one format to another

Know anything about data formats

Try various GitHub repos until we find the right one: an implementation of a semantic segmentation model that actually works

Now that our preferences are defined, let’s think about the overall approach to the task. There are two important challenges to deal with:

It’s highly desirable to perform the entire research within a single environment. In this case, there is no need for unnecessary data conversion or for switching between different tools and GitHub repos.

Though there are public datasets (Cityscapes or Mapillary) that we can use to extract drivable area segmentation masks, our dataset should be lightweight, simple and easy to use.

So, doing all the work inside the Supervise.ly platform solves the first challenge. To address the second, we have released the “Supervisely Country Roads” dataset.

All steps of this tutorial will be done inside Supervisely without any coding. The high-level plan is the following: you will add the “Supervisely Country Roads” dataset to your account, use DTL (Data Transformation Language) to create an augmented training dataset, train a NN and apply it to test images. Then you can download the NN weights and run inference right in a Jupyter notebook.

Let’s get started!

Step 1. Add “Supervisely Country Roads” dataset to your account.

The Cityscapes and Mapillary datasets are mostly about the city environment. But there are a lot of cases when autonomous vehicles have to drive on country roads: autonomous harvesters, trucks and agricultural robots. Another reason why you cannot use these datasets for this task directly is that many classes (e.g. “road shoulder”) are marked as neutral, which leads to inaccurate prediction of drivable area on country roads.

We at DeepSystems had a project on this topic, and we are happy to share our internal dataset and a few experiments with the community. We believe this contribution will be useful for many researchers. We release 1000 labeled images and 500 test images. We tried to keep the labeled images varied.

To get this dataset, just go to the “Import” -> “Datasets Library” page, choose the “Supervisely country roads” dataset and set the name of the resulting project to “country_roads_labeled”. That’s all.

Here are a few examples of labeled images:

Step 2. Prepare training dataset

Inside Supervisely we have a special module called DTL (Data Transformation Language). It allows you to merge datasets, make class mappings, filter objects and images, and of course apply various augmentations to your data.

You just have to stack the data operations together to build a custom computational graph (by analogy with the graph of layers in a neural network). Each image and its annotations go through this graph, and the results are saved to a new project. In most cases automatic augmentations significantly increase the accuracy.

We prepared a DTL query for you. Just go to the “DTL” page, copy and paste the json file (link to github), and press the “Start” button. As a result of the DTL query, a new project with 12k images will be created. It will contain the original images, their flipped versions, and a number of other augmentations: random crops, brightness, contrast and color transformations, and blurring. This example illustrates how easy and quick it is to prepare a training dataset with our platform.
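For readers who want to see what such a stack of operations does to a single image, here is a minimal sketch in Python with NumPy. It is not the DTL query and not how the platform implements it; the function names, the crop fraction and the brightness range are our own illustrative assumptions. The point it shows is that geometric operations (flip, crop) must be applied identically to the image and its mask, while photometric ones (brightness) touch only the image.

```python
import numpy as np

def flip_horizontal(image, mask):
    """Mirror image and mask left-to-right so they stay aligned."""
    return image[:, ::-1], mask[:, ::-1]

def random_crop(image, mask, rng, fraction=0.8):
    """Cut the same random window out of the image and its mask."""
    h, w = mask.shape
    ch, cw = int(h * fraction), int(w * fraction)
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    return (image[top:top + ch, left:left + cw],
            mask[top:top + ch, left:left + cw])

def random_brightness(image, mask, rng, max_delta=40):
    """Shift pixel intensities; the mask is left untouched."""
    delta = int(rng.integers(-max_delta, max_delta + 1))
    shifted = np.clip(image.astype(np.int16) + delta, 0, 255).astype(np.uint8)
    return shifted, mask

def augment(image, mask, rng):
    """Yield the original pair, its flipped copy, and one cropped,
    brightness-shifted variant of each."""
    yield image, mask
    flipped = flip_horizontal(image, mask)
    yield flipped
    for src_image, src_mask in ((image, mask), flipped):
        cropped_image, cropped_mask = random_crop(src_image, src_mask, rng)
        yield random_brightness(cropped_image, cropped_mask, rng)

# Synthetic stand-in for one labeled photo (hypothetical data).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(100, 200, 3), dtype=np.uint8)
mask = np.zeros((100, 200), dtype=np.uint8)
mask[60:, :] = 1  # pretend the bottom 40 rows are road

variants = list(augment(image, mask, rng))
print(len(variants), "image/mask pairs from one source image")
```

Each source image yields several training pairs here, which is the same idea that turns the labeled dataset into a much larger training project.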

You will find the detailed description of applied transformations below.

Step 3. Train neural network for semantic segmentation

Before training the NN you have to connect a computer with a GPU to your account. Go to the “Cluster” page and click the “Add node” button. Then execute the following command on your computer.

The Docker image will be pulled, and the Supervisely Agent (our task manager) will be started on your computer. It will automatically connect your PC to your account, and you will be able to use your GPU for all NN-related tasks (train/inference/deploy). The source code is also available in our public git repo.

Let’s train a segmentation neural network to recognize drivable area. Just go to the “Neural Networks” -> “Model Zoo” page, choose “UNet V2 (VGG weights)” and click the “Add model” button. This model will be added to your account. Then click the “Train” button near the model, choose the training project “training_01”, set the name of the resulting NN (latest checkpoint) to “nn_roads”, choose the training configuration from the template (it is added automatically after you add the model from Model Zoo) and press the “Start training” button. You can find an example with tons of screenshots in our docs here.

After training, the model “nn_roads” will appear in your models list. During training you can monitor metrics in real time:

Step 4. Visualize predictions

Let’s add test images to your account from “Import” -> “Datasets Library” -> “Supervisely country roads [test]” as the project “country_roads_test”.

Now you can press the “Test” button near the model you have just trained, choose the “country_roads_test” project, set the name of the resulting project that will contain the test images and neural network predictions, and press the “Start inference” button. Here are a few examples of NN predictions on test images:

Step 5. Download the final model and use it the way you want (e.g. in a Jupyter notebook)

The idea is that you can use Supervisely to perform a lot of research experiments without any coding. But how do you use the model in production? Option 1: deploy it as an API (we described the entire process in our previous post). In this tutorial we would like to show you another way, option 2: you can download the NN weights and then use our source code to run the model right in a Jupyter notebook. The notebook contains an example of inference.

By default the model is stored on your machine. Just click the “Upload to Supervisely” button and then the “Download” button (link to docs). As a result, a tar archive will be downloaded.

Here is the link to our github with the example for this tutorial. Just follow the instructions in readme.md (unpack the model to a certain directory and run the docker container).
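Once you have a predicted mask in the notebook, a quick way to inspect it is to blend it over the input image. The snippet below is a minimal sketch with plain NumPy and is independent of the code in the repository; `overlay_mask`, its default color and the `alpha` value are illustrative assumptions, and the tiny arrays stand in for a real photo and a real prediction.

```python
import numpy as np

def overlay_mask(image, mask, color=(0, 255, 0), alpha=0.5):
    """Blend `color` into the pixels where `mask` is non-zero.

    image: H x W x 3 uint8 array, mask: H x W array (0 = background).
    """
    blended = image.astype(np.float32)
    selected = mask.astype(bool)
    tint = np.array(color, dtype=np.float32)
    blended[selected] = (1.0 - alpha) * blended[selected] + alpha * tint
    return blended.round().astype(np.uint8)

# Hypothetical stand-ins for a test image and a predicted road mask.
image = np.full((4, 6, 3), 100, dtype=np.uint8)
mask = np.zeros((4, 6), dtype=np.uint8)
mask[2:, :] = 1  # bottom half predicted as drivable

result = overlay_mask(image, mask)
```

In a notebook, `result` can then be displayed with any image viewer you already use, for example matplotlib's `imshow`.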

Conclusion

In this post we have shown how to build a drivable area segmentation model from scratch.

Although the task gives you a taste of what self-driving companies are working on, production-ready solutions have more challenges to overcome.

Multiple classes of objects from various datasets. In more complicated scenarios we have to build a model capable of segmenting multiple classes of objects like cars, pedestrians, bikes, road landmarks and so on. In practice, constructing “multi-class datasets” takes time and effort because the data from various sources has to be labeled, merged, filtered and cleaned.

Class imbalance problem. In a multi-class setting, some of the objects are underrepresented. For example, in road scenes, bikes are far less frequent than cars. This makes it challenging to train a neural network that achieves the required level of accuracy for the rare classes.

Continuous model improvement. Production-level solutions should be stable in the sense that “edge case” images are recognized properly. So we need to identify these “edge case” images first, then label them and, finally, retrain the model. This process is continuous and is sometimes called Active Learning. It’s highly desirable to automate it as much as possible.
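To make the last two challenges a bit more concrete, here is a small sketch in plain Python. It is not part of this tutorial's pipeline. It shows two common heuristics, chosen by us for illustration: inverse-frequency class weights for an imbalanced dataset, and ranking images by prediction entropy to decide which ones to label next. All function names, pixel counts and probabilities are made up.

```python
import math

def inverse_frequency_weights(pixel_counts):
    """Weight each class by total / (n_classes * count): rare classes weigh more."""
    total = sum(pixel_counts.values())
    n_classes = len(pixel_counts)
    return {name: total / (n_classes * count)
            for name, count in pixel_counts.items()}

def mean_entropy(probabilities):
    """Average binary entropy (in bits) of per-pixel 'road' probabilities."""
    def entropy(p):
        if p <= 0.0 or p >= 1.0:
            return 0.0
        return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return sum(entropy(p) for p in probabilities) / len(probabilities)

def rank_for_labeling(predictions):
    """Sort image names from the most to the least uncertain prediction."""
    return sorted(predictions,
                  key=lambda name: mean_entropy(predictions[name]),
                  reverse=True)

# Hypothetical pixel counts per class in a multi-class dataset.
pixel_counts = {"road": 600_000, "car": 300_000, "bike": 100_000}
weights = inverse_frequency_weights(pixel_counts)

# Hypothetical per-pixel road probabilities for two unlabeled images.
predictions = {
    "clear_road.jpg": [0.99, 0.98, 0.01, 0.02],
    "muddy_track.jpg": [0.55, 0.45, 0.60, 0.50],
}
queue = rank_for_labeling(predictions)
```

Weights like these would typically be passed to a weighted loss, and the head of `queue` is what an annotator would see first.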

If you have any questions, feel free to ask in our public Slack. We are happy to help!

We will address the challenges above in the next posts. Meanwhile, if you found this article interesting, give it some 👏 so that more people can see it!
