TLDR: Tegola, a vector tile server written in Go, now comes with support for running on AWS Lambda. You can run a vector tile server that is cost effective, fast and reliable on serverless architecture. Check out the README for instructions.

Vector tiles are all the rage in the world of online mapping. They offer numerous benefits over their raster predecessors, including dynamic client-side styling, smaller payloads and crisp feature detail across device resolutions. Since the release of the MVT specification the FOSS4G vector tile ecosystem has been growing fast, but running the server-side component is still complex and expensive.

The cost of running a vector tile stack

There are two primary costs with running any server: infrastructure and cognitive. From the infrastructure standpoint, at least some number of servers need to be online, with a strategy for handling traffic spikes. From the cognitive standpoint, at least someone (but it’s best to have multiple someones) needs to implement and manage the infrastructure. As infrastructure demands increase, so typically do cognitive demands. The more technology, the more moving pieces, the more that can and will go wrong. All of these factors convert into business costs.

Running a vector tile server is no trivial task. There are a lot of moving parts, and in order to get great performance you will need multiple layers of caching and a well-tuned horizontal scaling strategy. Generating vector tiles is CPU intensive, so a single server will quickly lock up under load.

There are a few strategies for running a vector tile backend:

Pre-generate all the tiles: In order to pre-generate the entire planet you will need to consider the zoom range from 0–20, which will result in hundreds of billions of tiles being generated. Depending on the architecture of the system this will take days to weeks to process. With that many files you will also need to start thinking about your file system’s capabilities, or leverage an object store like S3. Also note that when the data changes, at least some part of the tile generation process will need to run again.

Serve tiles on demand: A vector tile server responds to tile requests on demand. The server will fetch data from a data provider, perform geo processing and then respond to the request. This strategy can work if data density is low and a good horizontal scaling strategy is in place. Without a scaling strategy the server will lock up and the user experience will quickly degrade. Additionally, without a caching strategy there’s a lot of unnecessary pressure on the database.

Pre-generate and serve: Pre-generate the coarser zooms (e.g. 0–12) and then serve the fine-grained zooms (13–20+) on demand. Coupled with an object store (e.g. S3) for pre-generated tiles and caching, along with a content delivery network (e.g. CloudFront) for edge caching, a fairly robust backend starts to take shape.
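To put the pre-generation numbers in perspective: the tile grid quadruples at every zoom level, so zoom z holds 4^z tiles. A quick back-of-the-envelope calculation for the full 0–20 pyramid (in practice only tiles that actually cover data need to be generated, which is where a lower estimate like “hundreds of billions” comes from):

```go
package main

import "fmt"

func main() {
	// Each zoom level quadruples the grid: zoom z holds 4^z tiles.
	var total uint64
	for z := uint(0); z <= 20; z++ {
		total += 1 << (2 * z) // 4^z as a bit shift
	}
	fmt.Println(total)
}
```

This prints 1466015503701 — roughly 1.5 trillion tiles for the full pyramid, which makes it clear why pre-generating everything takes days to weeks and why an object store is all but required.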

Unless you choose to pre-generate all the tiles, the problem of managing a server cluster to handle traffic spikes still exists, and so do the discussions around zero-downtime deployments, server upgrades, security patches, load balancing, etc.

Enter serverless.

The evolution of serverless for vector tiles

AWS introduced Lambda to the world in November 2014, ushering in a new approach to computing. AWS describes Lambda as:

AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume — there is no charge when your code is not running.

Only pay for what we use?! Sounds fantastic, but unfortunately the first version of Lambda was not capable of running a vector tile backend. Over time there have been several improvements to the service which have made this possible:

Addition of binary support (November 2016) — Vector tiles use binary encoding. Prior to this announcement only JSON responses were supported. Without binary support serving vector tiles was not possible.

Custom domain names with regional deployments (November 2017) — Prior to custom domain names with regional deployments there was no way to leverage a content delivery network for edge caching. Lambda could run on the edge, but this adds upwards of 500ms of response time, which is not ideal for a vector tile backend.

Native Go support (January 2018) — Go is not needed for running a vector tile backend, but there’s a great open source, MIT licensed, vector tile server written in Go called Tegola. ;-)
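The binary support point is worth a concrete illustration. With API Gateway’s Lambda proxy integration, a binary payload like an MVT travels as a base64-encoded string with a flag set on the response. The struct below is a dependency-free stand-in sketching the shape of that response (the real type is `events.APIGatewayProxyResponse` in the `github.com/aws/aws-lambda-go` package):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// proxyResponse is a local stand-in for the response shape API Gateway
// expects from a Lambda proxy integration (the real type is
// events.APIGatewayProxyResponse in github.com/aws/aws-lambda-go).
type proxyResponse struct {
	StatusCode      int
	Headers         map[string]string
	Body            string
	IsBase64Encoded bool
}

// tileResponse wraps raw MVT bytes for API Gateway: binary payloads
// must be base64-encoded and flagged as such, or they arrive corrupted.
func tileResponse(mvt []byte) proxyResponse {
	return proxyResponse{
		StatusCode: 200,
		Headers: map[string]string{
			"Content-Type": "application/vnd.mapbox-vector-tile",
		},
		Body:            base64.StdEncoding.EncodeToString(mvt),
		IsBase64Encoded: true,
	}
}

func main() {
	resp := tileResponse([]byte{0x1a, 0x05}) // stand-in tile bytes
	fmt.Println(resp.IsBase64Encoded, resp.Body)
}
```

API Gateway decodes the body back to raw bytes before it reaches the client, provided the binary media type is configured, which is exactly what the November 2016 announcement made possible.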

Tegola, a vector tile server written in Go

Over the last couple of years I have been working on a vector tile server in Go called Tegola. After AWS announced Lambda support for Go, I started looking into what it would take to run Tegola on Lambda. After a week of fiddling with configurations I landed on an architecture I call the “holy grail” deployment, and it’s beautiful. A quick overview of the stack:

RDS: For running PostGIS and housing our geospatial data. Alternatively, GeoPackage could be used. Note that depending on your caching strategy and traffic load, RDS is another pressure point that may need a good scaling strategy.

S3: For storing pre-generated and cached tiles.

Lambda for Go: For running Tegola.

API Gateway: For proxying requests to Lambda. Used with a custom domain name (regional).

CloudFront: A content delivery network for edge caching.
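Here is how the pieces fit at request time: CloudFront and S3 absorb repeat requests, so Lambda only does the expensive PostGIS-backed generation on a cache miss. A minimal cache-first sketch of that flow, with an in-memory map standing in for S3 and a stub standing in for tile generation (the function and type names are illustrative, not Tegola’s actual API):

```go
package main

import (
	"fmt"
	"sync"
)

// tileCache sketches the cache-first request path: check the cache
// (S3 in the real stack) and only fall through to generation (a
// PostGIS query plus MVT encoding in the real stack) on a miss.
type tileCache struct {
	mu    sync.Mutex
	tiles map[string][]byte
}

func newTileCache() *tileCache {
	return &tileCache{tiles: make(map[string][]byte)}
}

// getTile returns the tile bytes and whether it was a cache hit.
func (c *tileCache) getTile(z, x, y int, generate func() []byte) ([]byte, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	key := fmt.Sprintf("%d/%d/%d.mvt", z, x, y)
	if tile, ok := c.tiles[key]; ok {
		return tile, true // cache hit: no database pressure
	}
	tile := generate() // cache miss: hit the data provider
	c.tiles[key] = tile
	return tile, false
}

func main() {
	cache := newTileCache()
	gen := func() []byte { return []byte{0x1a} } // stub tile bytes
	_, hit := cache.getTile(14, 2620, 6331, gen)
	fmt.Println(hit) // first request misses and generates
	_, hit = cache.getTile(14, 2620, 6331, gen)
	fmt.Println(hit) // repeat request is served from cache
}
```

The same shape applies one layer up: CloudFront plays the role of the cache in front of API Gateway, so a popular tile may never even reach Lambda.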

With this stack you’re getting an extremely affordable, performant, low-ops, scalable vector tile backend, and it’s easy to set up and update. We have tested this against a planet-scale deployment of OpenStreetMap (around a 160GB database) and the deployment just purrs. Lambda solves so many problems, including:

Only pay for what you use: Lambda is billed on a per-second basis, so you’re only paying when users are interacting with your service.

Individual processes: Each request is run in a separate Lambda instance, so tile requests no longer compete for CPU resources. This is truly a beautiful thing, especially with the CPU demands of geo processing.

Scalability: Lambda scales out horizontally on demand. No need to worry about auto scaling rules and spinning up servers.

Reduced cognitive load: As mentioned in the first part of this article, the cognitive load of running the vector tile backend is dramatically lower. Deployments consist of an archive with the Tegola binary and a Tegola configuration file.

If you’re interested in giving this setup a try, check out Tegola and review the README for running it on Lambda. If you have any questions feel free to ping me on Twitter @arolek or open an issue on the GitHub repo.