Recently I wanted to benchmark PostGIS spacial engine with massive amounts of data. If I were to be a GIS professional I would probably have a “go-to” place for huge datasets, but I am not so I couldn’t. I wanted to benchmark spatial queries against 1 million records, just to get a good idea of where PostGIS stands.

Postgre Function

After abandoning the quest for a huge dataset, I figured I would create my own Postgre function. Half an hour of troubleshooting/trial and error resulted in the following.

How To Use It

1.) Your Table

The function looks to insert records into a table called points with 2 column location and name_of_place , the name of the place being a string and the location a geo_point.

2.) Create a Function

I created a function in my postgre database and named it fillPoints . Copy and paste the above code as your function’s structure and save.

If you would like to use this function to fill your own data table with predefined columns. Simply edit the INSERT statement to match your structure.

3.) Use The Function

Now that everything is in place, lets fill that table. All you need to do is execute a select statement that call the function and pass the number of records you want to create as a parameter. I used an Amazon EC2 micro instance for this test so filling my table with 1 million records took about 9 minutes.

SELECT fillPoints(1000000);

Benchmark: 1 Million Records

-- This will search 1/4 of the overall region. -- My Result: 249,881 points found in ~800ms SELECT count(*) FROM points WHERE ST_Within( location, 'POLYGON((0 0, 0 180, 180 180, 180 0, 0 0))' ); -- This will search 1/8 of the overall region. -- My Result: 62,420 points found in ~200ms SELECT count(*) FROM points WHERE ST_Within( location, 'POLYGON((0 0, 0 90, 90 90, 90 0, 0 0))' );

Conclusion

To say the least I was totally surprised by the benchmark. The latter is so amazing! 64,000 records in 200ms is an incredible speed for a spatial querying app/api. This experiment with PostGIS was sparked by the idea of a mobile listings application. I will need a speedy spatial database engine behind my application. I am happy with the performance I’ve seen thus far in PostGIS. Perfect opensource solution for a small one-man project like mine.

This would also be a great alternative for REALTOR websites. Where an MLS can have 60,000 listings at any one time. Searching 60,000 records using spatial data (search homes inside polygon) would be a simple task that could be done in much less than 100ms.