Solution 2: TimescaleDB

TimescaleDB's architecture allows for a nearly constant ingestion rate as data grows over time, as explained in several of their Medium posts.

Timescale is an extension built on top of the popular SQL database PostgreSQL: it provides all of PostgreSQL's features, plus a set of utilities designed specifically for time-series data.

It organizes and indexes data in time-based partitions, called chunks (or buckets), which allows for very fast retrieval on range queries. Tables partitioned this way are called hypertables.
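As a sketch of the idea (table and column names here are hypothetical), a range query on a hypertable only has to touch the chunks that overlap the requested interval:

```sql
-- Hypothetical hypertable "metrics", partitioned on its "time" column.
-- TimescaleDB scans only the chunks overlapping the requested range
-- (chunk exclusion) instead of the whole table.
SELECT time, device_id, value
FROM metrics
WHERE time >= '2019-01-01' AND time < '2019-02-01';

-- Inspect the chunks backing the hypertable:
SELECT show_chunks('metrics');
```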

Timescale was still in beta when I tried it, so I was a bit skeptical. However, it looked like too good a match for my requirements to pass up, so I decided to give it a try.

The setup experience was really pleasant: all I had to do was install the extension on my database and call create_hypertable on the vanilla SQL tables I wanted to add the time-bucket index to.
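Assuming a plain table of sensor readings (all names here are illustrative), the setup looks roughly like this:

```sql
-- Enable the extension (once per database).
CREATE EXTENSION IF NOT EXISTS timescaledb;

-- A vanilla SQL table...
CREATE TABLE metrics (
  time      TIMESTAMPTZ NOT NULL,
  device_id INTEGER     NOT NULL,
  value     DOUBLE PRECISION
);

-- ...turned into a hypertable, chunked on the "time" column.
SELECT create_hypertable('metrics', 'time');
```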

Then I proceeded with the bulk import of the CSV files, using the wonderful pg-promise library. And again I hit an unexpected result: the indexes Timescale was creating underneath were so heavy that insertion times had become huge.

So I skipped the NodeJS wrapper and switched to a Go tool that Timescale developed for ingesting large amounts of data. Luckily the process sped up by orders of magnitude, but I started to wonder whether the bottleneck created by all those indexes was actually worth it.
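If memory serves, the tool in question is timescaledb-parallel-copy. A minimal sketch of such an ingestion run (connection string, database, table, and file names are all hypothetical) looks like this:

```
# Parallel CSV ingestion into a hypertable with 4 workers.
timescaledb-parallel-copy \
  --connection "host=localhost user=postgres sslmode=disable" \
  --db-name sensors \
  --table metrics \
  --file metrics.csv \
  --workers 4 \
  --copy-options "CSV"
```

Under the hood it splits the input and issues PostgreSQL COPY commands from several parallel connections, which is what makes it so much faster than row-by-row inserts.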

Furthermore, at the time of my tests, Timescale did not support any kind of Foreign Key constraint on hypertables. That was a big no-no for me, as referential integrity is one of the main selling points of SQL databases.

Eventually I decided to ditch Timescale and look for other solutions. Still, I want to praise their wonderful Slack support and their community, which has grown a lot since last year. Plus, Timescale now supports FKs on hypertables!