Needing to correctly align one set of data with another is a common task in many geospatial analyses. One example of this is snapping a set of locations to a linear network. This is usually an essential step in network analyses because you need to know the exact location of that thing on a given network. For instance, in order to calculate the travel distance to the nearest electric scooter, you need to know where it is within a city’s grid network of streets. If you need to know how many miles of river are upstream of a dam, you need to know where that dam is located on the river network.

In both cases, there is usually some measurement error in the location of the scooter or the dam, and it may not correctly align with the network. Or the location is correct, but the data source for the network has measurement error. Or both are wrong, which is often the case. Take your pick.

To get around this issue, you need to “snap” the locations to the nearest point on the network. This involves a few steps:

Find the nearest line to the original location Find the nearest point on that line to the original location Move the original location to that point Proceed with the network analysis

This is relatively straightforward for a few points and lines but can be a slow operation if you have to do this with lots of data. In my case, I needed a reasonably fast way to snap the locations of hundreds of thousands of dams and culverts against aquatic networks with millions of line segments. I have been working with the Southeast Aquatic Resources Partnership to help analyze the relationships between aquatic networks and aquatic barriers like dams as part of building the Southeast Aquatic Barrier Prioritization Tool. In order to analyze aquatic connectivity, we first needed to overcome the fact that the locations of many dams were not actually on the latest available aquatic network data from the National Hydrography Dataset. You would think everything would be in the correct place, right? Nope.