Hi, hackers!

We have held education project at Sirius edu center (Sochi, Russia) with mentors from Yandex. The group of 5 students was working on improving the shared buffers eviction algorithm: Andrey Chausov, Yuriy Skakovskiy, Ivan Edigaryev, Arslan Gumerov, Daria Filipetskaya. I’ve been a mentor for the group. For two weeks we have been looking into known caching algorithms and tried to adapt some of them for PostgreSQL codebase.

While a lot of algorithms appeared to be too complex to be hacked in 2 weeks, we decided to implement and test the working version of tweaked LRU eviction algorithm.

===How it works===

Most of the buffers are organized into the linked list. Firstly admitted pages jump into 5/8th of the queue. The last ⅛ of the queue is governed by clock sweep algorithm to improve concurrency.

===So how we tested the patch===

We used sampling on 4 Yandex.Cloud compute instances with 16 vCPU cores, 8GB of RAM, 200GB database in 30-minute YCSB-like runs with Zipf distribution. We found that on read-only workload our algorithm is showing consistent improvement over the current master branch. On read-write workloads we haven’t found performance improvements yet, there was too much noise from checkpoints and bgwriter (more on it in TODO section).

Charts are here: [0,1]

We used this config: [2]

===TODO===

We have taken some ideas expressed by Ben Manes in the pgsql-hackers list. But we could not implement all of them during the time of the program. For example, we tried to make LRU bumps less write-contentious by storing them in a circular buffer. But this feature was not stable enough.

The patch in its current form also requires improvements. So, we shall reduce the number of locks at all (in this way we have tried bufferization, but not only it). “Clock sweep” algorithm at the last ⅛ part of the logical sequence should be improved too (see ClockSweepTickAtTheAnd() and places of its usage).

Unfortunately, we didn’t have more time to bring CAR and W-TinyLFU to testing-ready state.

We have a working implementation of frequency sketch [3] to make a transition between the admission cycle and LRU more concise with TinyLFU filter. Most probably, work in this direction will be continued.

Also, the current patch does not change bgwriter behavior: with a piece of knowledge from LRU, we can predict that some pages will not be changed in the nearest future. This information should be used to schedule the background writes better.

We also think that with proper eviction algorithm shared buffers should be used instead of specialized buffer rings.

We will be happy to hear your feedback on our work! Thank you :)

[0] LRU TPS https://yadi.sk/i/wNNmNw3id22nMQ

[1] LRU hitrate https://yadi.sk/i/l1o710C3IX6BiA

[2] Benchmark config https://yadi.sk/d/L-PIVq--ujw6NA

[3] Frequency sketch code https://github.com/heyfaraday/postgres/commit/a684a139a72cd50dd0f9d031a8aa77f998607cf1

With best regards, almost serious cache group.