So, now we know how to start working with satellite images in Open CV. Let’s try to play with it a bit!

We’re gonna automatically detect some areas on this image. Let’s start with detecting water.

First, let’s try to work with the same image of Manhattan, that we used before

Our goal is somehow to distinguish water from other areas.

We’re gonna use thresholding – a simple but very powerful method.

In thresholding key thing is thresh value. For now we’ll define it manually: 65

Here’s the code for simple thresholding operation for detecting water:

destinationFile = 'thresholded.jpeg' inpuDataset = gdal . Open( 'NYC_clipped.gtiff' ) format = 'JPEG' driver = gdal . GetDriverByName(format) outputDataset = driver . CreateCopy(destinationFile, inpuDataset, 0 ) inpuDataset = None outputDataset = None nycImage = cv2 . imread(destinationFile, cv2 . IMREAD_LOAD_GDAL & cv2 . IMREAD_GRAYSCALE) th, processedPic = cv2 . threshold(nycImage, 65 , 255 , cv2 . THRESH_BINARY) cv2 . imwrite(destinationFile, processedPic) cv2 . imshow( 'thresholded' , processedPic) cv2 . waitKey( 0 ) cv2 . destroyAllWindows()

The most important part for us is

th, processedPic = cv2 . threshold(nycImage, 65 , 255 , cv2 . THRESH_BINARY)

Though performing different operations on rasters in natural colors (which we see in services like Google Maps, Apple Maps, Bing Maps) is a rare case in GIS and remote sensing we’ll try to start with natural colors and then move on.

In the snippet above we try to perform binary thresholding. This means that we’re taking under consideration only pixels with values less then 65 (which should give us water), “highlight” them with black color and ignore all other values.

In addition here we’re creating output thresholded image in JPEG format. Here it is shown that with GDAL it’s easy to do.

Let’s run this snippet and browse in QGIS what we get:

Looks promising!

Now water areas are black and all the rest is mainly white. But this should be in theory. Here we see, that many places on our thresholded picture where there is no water at all are marked black.

Which is of course wrong.

Compare how it looks on original RGB image. These areas should not be touched, because these are parks.

In addition, not only parks were marked as water, but also shadows of skyscrapers across the whole Manhattan (see below).

So, we definitely need more advanced technique for detecting water.

But for now let’s make one small step aside and see what’s gonna happen if we load satellite image into Open CV objects without GRAYSCALE flag. I mean, what’s gonna happen if we change

nycImage = cv2 . imread(destinationFile, cv2 . IMREAD_LOAD_GDAL & cv2 . IMREAD_GRAYSCALE)

to

nycImage = cv2 . imread(destinationFile, cv2 . IMREAD_LOAD_GDAL )

Let’s run our snippet with proposed change and see what we get

Interesting! 😉 But for binary thresholding we need grayscale image, so let’s get back to our grayscale image.

As we said before, we need to find a way to exclude “green” parks areas and skyscrapers shadows from “blue” water.

It’s important to understand why these areas have similar pixel values. There could be many reasons for that, but the most obvious is different types of water on this image have different pixel values which could be very close to values of other areas. Here we see sea-water of Atlantic ocean mixed with fresh-water of East river. In addition pollution, water plants and many other factors (like time of the year, temperature, etc) could make these “blue” water areas have similar values to “green” parks.

In this situation we can only play with value of the thresholding parameter. By increasing or decreasing this variable we can get more or less accurate results.

But let’s try to find a better source image for thresholding. This may increase quality of our outcome. We definitely need to find a way to distinguish water and plants.

NDVI will help us in this case. We’re gonna calculate a special value for each pixel that will give us difference between water and plants. Very-very simplified idea behind is that water should not contain “plants” (which is simplification but for this case this would be enough).

There is a great article with explanation what each band of Sentinel-2 image means and how each of them can be used.

NDVI for Sentinel-2 images requires 2 spectral bands for calculation: Band 8 and Band 4.

Calculating NDVI in SNAP is easy. Since we downloaded from Sentinel-2 hub all bands, we have both Band 8 and Band 4. There are couple ow ways to do this in SNAP. I followed the simplest way

Let’s open them in SNAP

Then, we should use Band Maths functionality

By clicking Edit Expression we can edit expression for NDVI

Here, in this Expression area $1.band_1 variable means Band 8 (which is accessible through Product dropdown) and band_1 means Band 4.

By Clicking Ok a new band is created

Let’s Open Image Window (right click on this newly created band) and see what we get:

Looks much better! Now we see, that park areas look “white” (see below), buildings area look “gray” (including most of the shadows from skyscrapers) and water is “black”. This is exactly what we want!

After couple of conversion steps in SNAP (which surprisingly was not so straightforward) we have a new georeferenced .tif file (for some reason jpeg format had no spatial reference).

Now we can try to run once again our snippet for new source file and destination. So, we should have now

destinationFile = 'thresholded_NDVI.jpeg' inpuDataset = gdal . Open( 'NDVI_clipped.tif' )

In addition, we should play a bit with thresholding value. It turned out that now better results are with value 110.

Here’s final version of snippet:

destinationFile = 'thresholded_NDVI.jpeg' inpuDataset = gdal . Open( 'NDVI_clipped.gtiff' ) format = 'JPEG' driver = gdal . GetDriverByName(format) outputDataset = driver . CreateCopy(destinationFile, inpuDataset, 0 ) inpuDataset = None outputDataset = None nycImage = cv2 . imread(destinationFile, cv2 . IMREAD_LOAD_GDAL & cv2 . IMREAD_GRAYSCALE) th, processedPic = cv2 . threshold(nycImage, 110 , 255 , cv2 . THRESH_BINARY) cv2 . imwrite(destinationFile, processedPic) cv2 . imshow( 'thresholded_NDVI' , processedPic) cv2 . waitKey( 0 ) cv2 . destroyAllWindows()

Let’s run it now and see what we have in QGIS:

This looks much better! Parks and almost all shadows and other “noise” disappeared.

Let’s compare it with thresholded value 110 for initial (non-NDVI ) input dataset:

Difference is huge! Parks, streets, shadows, water – everything is black.

So, at the end we can say that to certain extent NDVI-based input images could give relatively good outcome for very-very simplified water detection methods in Open CV based on thresholding functionality. Playing with threshold value could give us some flexibility, but quality of input images is also very important.

We touched a lot of questions in this post that require separate posts (like calculating NDVI not in SNAP but programmatically with the help of GDAL, saving outcome of detection to shape file, etc). Later on we’ll cover this topics in details 😉