Today we are going to talk about an experiment we did at Esri together with our Australian business partner, AAM Group, which specializes in the collection, analysis, presentation, and delivery of geospatial information. In particular, AAM collects airborne LiDAR point clouds for electric utility companies to detect power lines and any vegetation growth or other easement encroachment into the power lines' safety corridor.

Vegetation and encroachment monitoring is a critical task that needs to be performed on a regular basis to ensure the safety of both transmission and distribution networks. Utility companies operate tens of thousands of miles of power lines, and missing a single tree canopy growing too close can lead to a massive wildfire or a power outage affecting thousands of consumers.

To minimize the risk of such events, typically, an annual survey is performed on the entire grid by flying low-altitude manned airplanes or drones equipped with LiDAR sensors. Once the point cloud is collected, the power line points are manually labeled inside a GIS / CAD system, then any intrusions into the safety zone are automatically detected and placed into a work-order system for field teams to address.

Manual labeling of the wires in raw point clouds is an extremely labor-intensive process: just for one of its customers, AAM Group invests about 50,000 man-hours a year to label the points that belong to overhead conductors.

Can a Deep Neural Network help?

Previously, we experimented with a deep neural network called PointCNN which allows for efficient semantic segmentation (automatic assignment of classes like Ground, Water, Building, Vegetation, etc. to each point) of raw point clouds. Back then, we trained a PointCNN model to label building points and achieved quite impressive results which, in most cases, outperformed traditional deterministic algorithms.

PointCNN deep neural network labeled raw XYZ point cloud.

But buildings are much simpler to detect than overhead wires: while the former, especially in urban areas, are represented by relatively balanced class sets, the latter account for a vanishingly small number of points, e.g. only 12,500 out of 3.6M points in the LAS file below.

Significant class imbalance: ~12,500 wire points (red), ~5,100 utility pole points (blue), ~3,607,000 points of everything else in this particular LAS file.
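To see just how skewed such a dataset is before training, it helps to tally the classification codes directly. Below is a minimal NumPy sketch; the class codes and counts are toy values for illustration, not the actual file's contents:

```python
import numpy as np

def class_distribution(classification: np.ndarray) -> dict:
    """Count points per class code and report each class's share of the total."""
    codes, counts = np.unique(classification, return_counts=True)
    total = classification.size
    return {int(c): (int(n), n / total) for c, n in zip(codes, counts)}

# Toy array standing in for a LAS file's per-point classification field:
# 0 = other, 1 = wires, 3 = utility poles (hypothetical codes for this sketch).
labels = np.array([0] * 9_000 + [1] * 30 + [3] * 12)
dist = class_distribution(labels)
print(dist[1])  # wires are a tiny fraction of the total point count
```

Running this kind of census on each training tile makes the imbalance explicit and can guide class-weighting decisions later.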

Another challenge: a building, due to its size, is easier to discern from surrounding noise, e.g. touching or overhanging tree canopies, adjacent bushes, street furniture, etc. Overhead conductors, on the other hand, form thin, non-planar, near-zero-area point neighborhoods and are much harder to discriminate from nearby buildings, trees, and utility poles.

Spoiler Alert. Given the above, we were conservative in our expectations about PointCNN's ability to learn the general rules needed to detect and label overhead power lines. The good news: we were wrong.

The experiment

We took a fairly small subset of a manually labeled point cloud partially covering an Australian city, containing about 540M points in total with an average density of ~60 points per square meter. After some preliminary filtering and compression done in ArcGIS Pro, these are the classes we ended up training the model with: 0 — Other, 1 — Wires, 2 — Stay-Wires, 3 — Utility Poles.
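The preliminary remapping of source classification codes into this compact set of training classes can be sketched as below. The source codes follow standard ASPRS LAS conventions and are an assumption here, since the actual labeling scheme used in the source data may differ:

```python
import numpy as np

# Hypothetical mapping from source LAS classification codes to the four
# training classes: ASPRS code 14 (Wire - Conductor), 13 (Wire - Guard),
# and 15 (Transmission Tower) are used as example source codes.
SOURCE_TO_TRAINING = {
    14: 1,  # -> Wires
    13: 2,  # -> Stay-Wires
    15: 3,  # -> Utility Poles
}

def remap_classes(classification: np.ndarray) -> np.ndarray:
    """Map raw LAS class codes to training classes; everything else -> 0 (Other)."""
    out = np.zeros_like(classification)
    for src, dst in SOURCE_TO_TRAINING.items():
        out[classification == src] = dst
    return out

codes = np.array([2, 14, 14, 13, 15, 5])
print(remap_classes(codes))  # [0 1 1 2 3 0]
```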

Classes and their distribution in the training set.

We used the TensorFlow-based implementation of the PointCNN architecture and a single NVIDIA Quadro GV100 card with 32GB of VRAM to train and test a model on the above LAS dataset.

Data Prep and Know-How

Data preparation is somewhat involved when working with point clouds in general and PointCNN in particular. The framework splits the input points into two sets of 50%-overlapping voxels, so the inner points get processed four times, probing each point in various local neighborhoods.
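The 50%-overlap scheme can be illustrated with a small sketch: laying square tiles on a grid whose stride is half the tile size means every interior point is covered by four tiles. This is only an illustration of the idea; the actual PointCNN framework performs the split internally:

```python
import numpy as np

def overlapping_tile_ids(xy: np.ndarray, tile: float) -> np.ndarray:
    """Return the indices of the four 50%-overlapping tiles covering each point.

    Tiles of side `tile` are laid out with a stride of tile/2, so a tile with
    index (i, j) spans [i*stride, i*stride + tile) on each axis and every
    interior point falls inside exactly four tiles.
    """
    stride = tile / 2.0
    base = np.floor(xy / stride).astype(int)  # finest-grid cell of each point
    # The four covering tiles start at offsets (0,0), (-1,0), (0,-1), (-1,-1).
    offsets = np.array([[0, 0], [-1, 0], [0, -1], [-1, -1]])
    return base[:, None, :] + offsets[None, :, :]  # shape (n_points, 4, 2)

pts = np.array([[3.7, 8.2]])
tiles = overlapping_tile_ids(pts, tile=10.0)
print(tiles.shape)  # (1, 4, 2) — one point, four covering tiles
```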

If your GPU does not have a good amount of VRAM (in our experiments we used a card with 32GB), you may hit Out-Of-Memory errors with the default settings. Multiple options exist to deal with this, from reducing the mini-batch size (which leads to slower convergence) to thinning the voxels. The latter may be a better alternative when working with larger objects: if a voxel's point count exceeds the pre-configured limit, the framework thins it by sampling the points into multiple 100%-overlapping voxels. This leads to a trade-off though: large voxels are needed to capture larger objects, while smaller objects require a higher point density per voxel.
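The thinning behavior can be sketched as follows: an oversized voxel's points are shuffled and split into several subsets that share the same spatial extent ("100% overlapping"), each under the point cap. This is a simplified stand-in for the framework's own sampling, not its actual code:

```python
import numpy as np

def split_oversized_voxel(points: np.ndarray, max_points: int, rng=None) -> list:
    """Split a voxel's points into same-extent subsets, each <= max_points."""
    rng = rng or np.random.default_rng(0)
    n = len(points)
    if n <= max_points:
        return [points]
    n_splits = -(-n // max_points)  # ceiling division
    idx = rng.permutation(n)       # shuffle so each subset samples the whole voxel
    return [points[chunk] for chunk in np.array_split(idx, n_splits)]

# A voxel holding 60,000 points with a 24,576-point cap gets split in three.
voxel = np.random.default_rng(1).random((60_000, 3))
parts = split_oversized_voxel(voxel, max_points=24_576)
print(len(parts))  # 3
```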

To capture the best of both worlds, you really want a GPU with the largest VRAM you can get your hands on.

If you do not have a GPU with large VRAM, but still need to label objects of significantly different sizes, it may make sense to train separate PointCNN models with different voxel sizes and point densities.

Since our primary focus was on the Wire class, we chose voxels with a 250 m² cross section and a maximum density of 24,576 points.

Not too surprisingly, but still worth mentioning: we achieved better results when training not just on pure XYZ values, but also adding the Intensity and Number of Returns attributes to the set of input features.
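Assembling those extra attributes into the per-point input feature matrix might look like the following sketch; the normalization constants (16-bit intensity range, a maximum of 5 returns) are assumptions made for illustration, not values from the experiment:

```python
import numpy as np

def build_features(xyz: np.ndarray, intensity: np.ndarray,
                   num_returns: np.ndarray) -> np.ndarray:
    """Stack extra LiDAR attributes onto XYZ to form a (n, 5) input matrix."""
    feats = np.column_stack([
        xyz,                   # raw coordinates
        intensity / 65535.0,   # LAS intensity is typically stored as uint16
        num_returns / 5.0,     # discrete return counts scaled into [0, 1]
    ])
    return feats.astype(np.float32)

xyz = np.zeros((4, 3))
f = build_features(xyz, np.array([0, 100, 65535, 3000]), np.array([1, 2, 5, 1]))
print(f.shape)  # (4, 5)
```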

LiDAR point cloud symbolized by the Number Of Returns: you can see the wires standing out from the background.

LiDAR point cloud symbolized by the Intensity: also a strong signal here, particularly helpful in discriminating wires from tree canopies.

Another important observation: TensorBoard shows signs of overfitting on the validation loss much earlier than the Recall stops growing, while the Precision dynamics match the validation loss fluctuations fairly closely. Therefore, it may make sense to train a bit longer, sacrificing some Precision in return for a higher Recall and a better overall F1-score.
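The trade-off is easy to see from the F1 formula (the harmonic mean of Precision and Recall): a small loss in Precision can be outweighed by a larger gain in Recall. The numbers below are illustrative only, not from the experiment:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Stopping at the validation-loss minimum vs. training longer:
early = f1(0.95, 0.80)   # high precision, recall still climbing
later = f1(0.92, 0.90)   # some precision traded for more recall
print(round(early, 3), round(later, 3))
```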

Results

We obtained the best Precision value around the minimum of the validation loss, at iteration 67,000. The best Recall, though, came after almost double that number of iterations:

*- Update from August 2019 (more details below): after training on a larger training set, we achieved a Precision of 0.966 for Wires and 0.82 for Poles, and a Recall of 0.981 and 0.775, respectively.
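For reference, the F1-scores implied by those precision/recall figures can be computed directly:

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# F1-scores implied by the August 2019 figures quoted above
print(round(f1(0.966, 0.981), 3))  # Wires:  0.973
print(round(f1(0.820, 0.775), 3))  # Poles:  0.797
```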

Below are some side-by-side comparisons from the test set: ground truth (left) and PointCNN predictions (right) from the 116,000th iteration.