Data processing at more than a billion rows per second



presented by Kohei KaiGai

Nowadays, GPUs are used not only for compute-intensive workloads but also for I/O-intensive big-data workloads.

This talk introduces SSD-to-GPU Direct SQL, implemented as an extension of PostgreSQL, and explains how it optimizes the data flow from storage to processors over the PCIe bus for efficient execution of analytic and reporting workloads.

Combining this technology with comprehensive database features (e.g., columnar store, partitioned tables, ...) draws out the maximum capability of the latest hardware and enables data processing at more than a billion rows per second on a single-node PostgreSQL server.

Its main focus is log-data processing in the IoT/M2M area, where huge volumes of data are generated every day. Our approach simplifies the system landscape and lets engineers reuse their existing knowledge and experience of PostgreSQL.

In short, this talk covers the following items from a technology viewpoint:

SSD-to-GPU Direct SQL

Columnar-store (Arrow_Fdw)

PCIe-bus level optimization using table partitioning

Benchmark results

Customer case (under negotiation)

For your reference: