Hi everyone!

The first summer month has been quite busy for us. It is time to share the news with our community!

New neural network for prostate cancer recognition on histopathological slides

This neural network has been in development for a long time, and now it is time to unveil the details.

This ANN helps analyze histopathological slides: digitized images of prostate biopsies, captured at very high resolution.

All of the related open-source data are labeled superficially and can only be used to detect initial signs of cancer presence, which is far from enough to start proper treatment. It is very important to know the type, grade and stage of the tumor, as well as other abnormalities that matter for diagnosis.

To create a proper dataset for our research, we started a joint project with a group of experienced pathologists. They have labeled more than 500 unique slides with various abnormalities. 500 may seem like a small number, but the resolution of each image can reach 200,000 pixels in one dimension.

One of the slides we used for training

During development we also actively consulted the doctors. We found that a comprehensive image analysis is required to detect and confirm cancer accurately and precisely.

Some indicators of cancer presence can be noticed only at the microlevel, at maximum zoom (cancer nucleoli in the nucleus), while others (such as the shape of gland cells, the shape of gland groups, etc.) can be seen only at the macrolevel. The solution therefore consists of several neural networks, each of which contributes to the final output.

At the microlevel, several neural networks with various architectures analyze the structure of the glands and their nuclei. We built these networks from classic residual blocks, but designed our own architecture because the initial ones did not give very good results. Each network analyzes a patch (a part of a huge slide) of 512x512 pixels.
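To give a feeling for what working with 512x512 patches means on such large slides, here is a minimal sketch of how a slide could be split into patch positions. It is an illustration only, not our actual pipeline: the function name patch_origins, the stride parameter and the way the slide edge is handled are assumptions made for this example.

```python
def patch_origins(width, height, patch=512, stride=512):
    """Return top-left (x, y) corners of patches covering a width x height slide.

    Hypothetical helper: the edge handling below is one possible choice.
    """
    if width < patch or height < patch:
        raise ValueError("slide is smaller than one patch")

    def axis_starts(length):
        # Regular grid positions along one axis.
        starts = list(range(0, length - patch + 1, stride))
        # If the grid stops short of the far edge, add one patch
        # shifted back so that the edge is covered as well.
        if starts[-1] + patch < length:
            starts.append(length - patch)
        return starts

    xs = axis_starts(width)
    ys = axis_starts(height)
    return [(x, y) for y in ys for x in xs]


if __name__ == "__main__":
    origins = patch_origins(1100, 512)
    print(origins)  # [(0, 0), (512, 0), (588, 0)]
```

With a stride equal to the patch size, the patches do not overlap except at the slide edge. A smaller stride would produce overlapping patches, at the cost of more positions to process.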

To analyze the indicators of cancer at the macrolevel, we also use several neural networks with a large receptive field, so that they focus not on individual pixels but on the larger-scale picture. We used an encoder-decoder U-Net with atrous separable convolutions as a basis.
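Atrous (dilated) convolutions are a common way to enlarge the receptive field without adding weights. The short sketch below computes the receptive field of a stack of convolution layers so the effect can be compared. The layer configurations are made up for illustration and do not describe our networks; receptive_field is a hypothetical helper name.

```python
def receptive_field(layers):
    """Receptive field (in input pixels, along one axis) of stacked convolutions.

    layers: iterable of (kernel_size, stride, dilation) tuples, first layer first.
    """
    rf = 1    # a single output pixel sees one input pixel before any layer
    jump = 1  # distance in input pixels between neighbouring outputs
    for kernel, stride, dilation in layers:
        # A dilated kernel spans (kernel - 1) * dilation steps of size `jump`.
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


if __name__ == "__main__":
    plain = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]   # three ordinary 3x3 convolutions
    atrous = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]  # same layers with growing dilation
    print(receptive_field(plain))   # 7
    print(receptive_field(atrous))  # 15
```

Both stacks have the same number of weights, but the dilated one sees roughly twice as much context along each axis.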

After gathering all the signs from both levels, we feed them into a final classifying neural network, which provides the final output about the pathology and its borders.

The output of a neural network

The main problems we faced were the overfitting of several ANNs on various labels and the unbalanced dataset: some pathologies are quite rare and appear on slides 10 or even 100 times less often than others. To address these problems we used various techniques, including data augmentation, dynamic data rebalancing and, of course, the assistance of doctors.
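One common way to rebalance such a dataset is to sample training examples with weights inversely proportional to the frequency of their class. The sketch below shows that idea under stated assumptions: it is not necessarily the rebalancing scheme we use, and the names balanced_weights and draw_batch are hypothetical.

```python
import random
from collections import Counter


def balanced_weights(labels):
    """Per-sample weights such that every class gets the same total weight."""
    counts = Counter(labels)
    n_classes = len(counts)
    # Each class shares 1 / n_classes equally among its samples.
    return [1.0 / (n_classes * counts[label]) for label in labels]


def draw_batch(samples, labels, batch_size, rng):
    """Draw a batch with replacement, favouring samples of rare classes."""
    weights = balanced_weights(labels)
    return rng.choices(samples, weights=weights, k=batch_size)


if __name__ == "__main__":
    # Illustrative labels only: one class is 100 times rarer than the other.
    labels = ["common"] * 100 + ["rare"]
    samples = list(range(len(labels)))
    batch = draw_batch(samples, labels, batch_size=8, rng=random.Random(0))
    print(batch)
```

Recomputing the weights whenever new slides are labeled, or once per epoch, would be one way to make the rebalancing dynamic.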

We are still adding images to the dataset and experimenting with other training methods. At the moment our neural network can detect and recognize the most common types of prostate cancer, such as adenocarcinoma, foam-cell and ductal carcinoma, with an accuracy of 91%. As for other outputs, such as inflammation and gland atrophy, we are still gathering data and looking for the optimal training methods.

Apart from defining the areas of cancer presence, we also have a separate algorithm for grading the tumor on the Gleason score, which is also very important for establishing the diagnosis.