1. What is Image Segmentation?

Before the relatively recent rise of convolutional neural networks, one of the most widely used computer vision techniques was color image segmentation, which consists of partitioning an image into regions in order to extract the different objects it contains.

Segmentation methods can be divided into two families: those that analyze the color distribution of pixels in the image plane, and those that analyze it in a color space.

Methods that analyze the color distribution of pixels in a color space treat each pixel in the image as a color point in that space. The most commonly used color space is RGB, in which the coordinates of a color point are the red, green, and blue levels of the corresponding pixel.
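To make the idea of a "color point" concrete, here is a minimal sketch using only plain Python. The tiny image and the helper name `to_color_points` are made up for illustration; a real pipeline would typically load the image with an imaging library instead.

```python
# A tiny 2x3 RGB image stored as rows of (R, G, B) tuples with 8-bit levels.
# The pixel values are invented for this illustration.
image = [
    [(255, 0, 0), (250, 5, 5), (0, 0, 255)],
    [(252, 2, 1), (0, 3, 250), (1, 0, 254)],
]


def to_color_points(img):
    """Flatten an image into a list of color points, discarding pixel positions."""
    return [pixel for row in img for pixel in row]


points = to_color_points(image)
print(len(points), "color points, first one:", points[0])
```

Once the image is flattened this way, the spatial layout is gone: the three reddish pixels and the three bluish pixels simply appear as two groups of nearby points in RGB space.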

Other color spaces can be used, and the performance of an image segmentation process depends on the choice of color space. Several authors have attempted to determine which color spaces are best suited to their specific image segmentation problems.
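As a small illustration of why the choice matters, the sketch below converts two pixels from RGB to HSV with Python's standard `colorsys` module. The helper `rgb8_to_hsv` and the two sample colors are assumptions chosen for the example, not something prescribed by a particular segmentation method.

```python
import colorsys


def rgb8_to_hsv(pixel):
    """Convert an 8-bit (R, G, B) pixel to (H, S, V), each in the range [0, 1]."""
    r, g, b = (channel / 255.0 for channel in pixel)
    return colorsys.rgb_to_hsv(r, g, b)


# Two illustrative pixels: the same red surface under dim and bright lighting.
dark_red = (128, 0, 0)
bright_red = (255, 0, 0)

hsv_dark = rgb8_to_hsv(dark_red)
hsv_bright = rgb8_to_hsv(bright_red)
print(hsv_dark, hsv_bright)
```

In RGB the two points sit far apart along the red axis, whereas in HSV they share the same hue and saturation and differ only in value. Whether that property helps depends on the images being segmented.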

Unfortunately, there is no single color space that provides satisfactory results for the segmentation of all types of images. It’s generally assumed that regions of the image with homogeneous colors constitute color-point clouds in the color space, with each cloud corresponding to a class of pixels that share similar colorimetric properties.

Classes are constructed by a cloud identification process, performed either by color histogram analysis or by cloud analysis. Once the classes are built, each pixel is assigned to one of them by a decision rule. Finally, to construct the segmented image, connected pixels assigned to the same class receive the same region label.
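The three steps described above (identify the clouds, apply a decision rule, label connected regions) can be sketched in plain Python. This is only one possible reading: k-means is used here as a stand-in for cloud identification, the decision rule is "nearest class center", and regions are found with 4-connectivity. The function names, the toy image, and the initial centers are all assumptions made for the example.

```python
from collections import deque


def sq_dist(a, b):
    """Squared Euclidean distance between two color points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Decision rule: index of the closest class center."""
    return min(range(len(centers)), key=lambda k: sq_dist(point, centers[k]))


def kmeans(points, centers, iterations=10):
    """Refine class centers with plain k-means (stand-in for cloud identification)."""
    centers = [tuple(map(float, c)) for c in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [
            tuple(sum(ch) / len(cl) for ch in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
    return centers


def label_regions(class_map):
    """Give each 4-connected group of same-class pixels its own region label."""
    h, w = len(class_map), len(class_map[0])
    labels = [[-1] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if labels[y][x] != -1:
                continue
            labels[y][x] = next_label
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w and labels[ny][nx] == -1
                            and class_map[ny][nx] == class_map[cy][cx]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels


# Toy image: a blue stripe separating two red areas.
RED, BLUE = (250, 5, 5), (5, 5, 250)
image = [
    [RED, BLUE, RED],
    [RED, BLUE, RED],
]
points = [p for row in image for p in row]
centers = kmeans(points, centers=[(200, 0, 0), (0, 0, 200)])
class_map = [[nearest(p, centers) for p in row] for row in image]
regions = label_regions(class_map)
print(class_map)  # two classes
print(regions)    # three regions
```

Note the difference between classes and regions in the toy image: both red areas belong to the same class, but because the blue stripe separates them they end up as two distinct regions.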

Semantic Segmentation

Semantic segmentation is a basic building block for scene understanding. By densely classifying every pixel of an image, it becomes possible to build abstract representations that capture the objects and their shapes.
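"Dense" classification simply means that every pixel receives a class. As a rough sketch, assume some model has already produced one score per class for each pixel; the label map is then the per-pixel argmax. The class names and score values below are hypothetical and do not come from a real network.

```python
# Hypothetical class names, for illustration only.
CLASSES = ["background", "building", "road"]

# Hypothetical per-pixel class scores for a 2x2 image: scores[y][x][class].
scores = [
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.2, 0.1, 0.7], [0.3, 0.3, 0.4]],
]


def dense_labels(score_map):
    """Assign each pixel the index of its highest-scoring class."""
    return [
        [max(range(len(px)), key=px.__getitem__) for px in row]
        for row in score_map
    ]


label_map = dense_labels(scores)
for row in label_map:
    print([CLASSES[k] for k in row])
```

The output has the same height and width as the input image, which is what distinguishes semantic segmentation from assigning a single label to the whole image.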

Fully convolutional networks (FCNs) are particularly effective for semantic segmentation across many types of images: multimedia, aerial, medical, or autonomous-driving imagery.

Example of segmentation for aerial images using different segmentation techniques

However, the literature regularly reports problems of imprecise inter-class boundaries or noisy segmentations, which require post-hoc regularization to smooth the result. The community has therefore explored various post-processing steps to sharpen contours and to constrain segmentations to respect the same topology as the ground truth. Often, these are graphical models added at the end of the network, or methods that exploit specific prior knowledge.
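To give a feel for what such post-processing does, here is a deliberately simple sketch: a 3x3 majority vote over the predicted label map. This is a much cruder technique than the graphical models mentioned above and is used only as an illustration. The function name `majority_filter` and the noisy label map are made up for the example.

```python
from collections import Counter


def majority_filter(label_map):
    """Replace each label by the most common label in its 3x3 neighborhood."""
    h, w = len(label_map), len(label_map[0])
    smoothed = [row[:] for row in label_map]
    for y in range(h):
        for x in range(w):
            votes = Counter(
                label_map[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            )
            best, count = votes.most_common(1)[0]
            # Keep the current label on ties to avoid arbitrary flips.
            if count > votes[label_map[y][x]]:
                smoothed[y][x] = best
    return smoothed


# Illustrative prediction: one isolated "1" inside a region of "0"s.
noisy = [
    [0, 0, 0, 1, 1],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
clean = majority_filter(noisy)
for row in clean:
    print(row)
```

In this toy case the isolated pixel is removed while the boundary between the two regions is left alone. On real predictions a plain majority vote can also erode thin structures, which is one reason more principled regularizers are used in practice.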