FoundationDB (FDB) is an ACID-compliant, multi-model, distributed database. The software started out life almost ten years ago. In March of 2015, Apple acquired the company behind FDB and, in 2018, open sourced the software under the Apache 2.0 license. VMware's Wavefront runs an FDB cluster with at least a petabyte of capacity, Snowflake uses FDB for the metadata storage of its cloud data warehouse service and Apple uses FDB as the backend for CloudKit.

FDB uses the concept of Layers to add functionality. There are layers for MongoDB API-compatible document storage, record-oriented storage and SQL support, among others.
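A layer is ordinary client code that maps a higher-level data model onto FDB's sorted key-value pairs. As a rough illustration of the idea, the sketch below flattens nested documents into key tuples; a plain Python dict stands in for a real FDB transaction, so none of these names come from the fdb client API.

```python
# Sketch of the "layer" idea: map a document model onto flat
# key-value pairs. A plain dict stands in for an FDB transaction;
# a real layer would issue these sets through the fdb client API.

def doc_to_kv(doc_id, doc, prefix=("docs",)):
    """Flatten a nested document into (key-tuple, value) pairs."""
    pairs = {}

    def walk(path, node):
        if isinstance(node, dict):
            for k, v in node.items():
                walk(path + (k,), v)
        else:
            pairs[path] = node

    walk(prefix + (doc_id,), doc)
    return pairs

store = {}  # stand-in for the key-value store
store.update(doc_to_kv("u1", {"name": "Ada", "address": {"city": "London"}}))

# Each leaf field becomes its own key, so reads and updates
# can target individual fields rather than whole documents.
print(store[("docs", "u1", "address", "city")])  # London
```

Because the keys are tuples with a shared prefix, fetching every field of one document is just a range read over that prefix, which is the pattern most layers are built on.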

FDB is optimised for SSDs to the point that you need to decide between HDD- and SSD-specific configurations when setting up a database. The clustering support allows for scaling both up and down, with data being automatically rebalanced. The ssd storage engine is built on a modified version of SQLite.

FDB itself is written in Flow, a programming language the engineers behind FDB developed. The language adds actor-based concurrency, along with new keywords and control-flow primitives, to C++11. As of this writing FDB comprises 100K lines of Flow / C++11 code and a further 83K lines of C code.

In this post I'll walk through setting up a FoundationDB cluster and running a simple leaderboard example using Python. The leaderboard code used in this post originated in this forum post.
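The core trick behind a leaderboard on FDB is that keys are stored in sorted order, so encoding the score into the key lets a single range read return the top players. A stdlib-only sketch of that idea follows, with a sorted list standing in for FDB's ordered keyspace; the class and method names here are illustrative and not part of the fdb package.

```python
import bisect

# FDB keeps keys sorted. A leaderboard can encode (-score, player)
# into the key so a range read from the start of the keyspace
# yields the highest scores first. A sorted list stands in for
# the ordered keyspace in this sketch.

class Leaderboard:
    def __init__(self):
        self._keys = []  # sorted list of (-score, player) "keys"

    def set_score(self, player, score):
        # Clear any previous key for this player, then insert the
        # new one in sorted position (analogous to clearing and
        # re-setting a key inside a transaction).
        self._keys = [k for k in self._keys if k[1] != player]
        bisect.insort(self._keys, (-score, player))

    def top(self, n):
        # Equivalent to a range read of the first n keys.
        return [(player, -neg) for neg, player in self._keys[:n]]

board = Leaderboard()
board.set_score("alice", 310)
board.set_score("bob", 250)
board.set_score("carol", 420)
print(board.top(2))  # [('carol', 420), ('alice', 310)]
```

Negating the score makes higher scores sort first, which is the same reason the real FDB version stores an encoded descending score in the key.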

A FoundationDB Cluster, Up & Running

I've put together a cluster of three m5d.xlarge instances on AWS EC2. These instances each come with 4 vCPUs, 15 GB of RAM, 150 GB of NVMe SSD storage and up to 10 Gbit/s of networking connectivity. The three instances cost $0.75 / hour to run.

On all three instances I'll first format the NVMe partition using the XFS file system. This file system was first created by Silicon Graphics in 1993 and has excellent performance characteristics when run on SSDs.

```
$ sudo mkfs -t xfs /dev/nvme1n1
$ sudo mkdir -p /var/lib/foundationdb/data
$ sudo mount /dev/nvme1n1 /var/lib/foundationdb/data
```

I'll then install some prerequisites for the Python code in this post.

```
$ sudo apt update
$ sudo apt install \
    python-dev \
    python-pip \
    virtualenv
```

On the first server I'll create a virtual environment and install the FoundationDB and Pandas Python packages.

```
$ virtualenv ~/.fdb
$ source ~/.fdb/bin/activate
$ pip install foundationdb pandas
```

FoundationDB's server package depends on the client package being installed beforehand, so I'll download and install that first. The following was run on all three instances.

```
$ wget -c https://www.foundationdb.org/downloads/6.0.18/ubuntu/installers/foundationdb-clients_6.0.18-1_amd64.deb
$ sudo dpkg -i foundationdb-clients_6.0.18-1_amd64.deb
```

The following will install the server package and was run on all three instances as well.

```
$ wget -c https://www.foundationdb.org/downloads/6.0.18/ubuntu/installers/foundationdb-server_6.0.18-1_amd64.deb
$ sudo dpkg -i foundationdb-server_6.0.18-1_amd64.deb
```

I'll run a command on the first server to switch its binding from the local network interface to the private network instead. This way it'll be reachable by the other two servers without being exposed to the wider internet.
```
$ sudo /usr/lib/foundationdb/make_public.py
```

```
/etc/foundationdb/fdb.cluster is now using address 172.30.2.218
```

I'll then take the contents of /etc/foundationdb/fdb.cluster on the first server and place them in the same file on the other two servers. With the cluster configuration synced between all three machines I'll restart FDB on each of the systems.

```
$ sudo service foundationdb restart
```

I'll then configure FDB for SSD storage, triple replication and set all three instances up as coordinators.

```
$ fdbcli
```

```
configure triple ssd
coordinators auto
```

This is the resulting status after those changes.

```
status details
```

```
Using cluster file `/etc/foundationdb/fdb.cluster'.

Configuration:
  Redundancy mode        - triple
  Storage engine         - ssd-2
  Coordinators           - 3

Cluster:
  FoundationDB processes - 3
  Machines               - 3
  Memory availability    - 15.1 GB per process on machine with least available
  Fault Tolerance        - 0 machines (1 without data loss)
  Server time            - 02/18/19 08:53:13

Data:
  Replication health     - Healthy (Rebalancing)
  Moving data            - 0.000 GB
  Sum of key-value sizes - 0 MB
  Disk space used        - 0 MB

Operating space:
  Storage server         - 142.4 GB free on most full server
  Log server             - 142.4 GB free on most full server

Workload:
  Read rate              - 14 Hz
  Write rate             - 0 Hz
  Transactions started   - 9 Hz
  Transactions committed - 1 Hz
  Conflict rate          - 0 Hz

Backup and DR:
  Running backups        - 0
  Running DRs            - 0

Process performance details:
  172.30.2.4:4500        (  1% cpu;  0% machine; 0.000 Gbps;  0% disk IO; 0.3 GB / 15.1 GB RAM  )
  172.30.2.137:4500      (  1% cpu;  0% machine; 0.000 Gbps;  0% disk IO; 0.4 GB / 15.2 GB RAM  )
  172.30.2.218:4500      (  1% cpu;  0% machine; 0.026 Gbps;  0% disk IO; 0.4 GB / 15.1 GB RAM  )

Coordination servers:
  172.30.2.4:4500  (reachable)
  172.30.2.137:4500  (reachable)
  172.30.2.218:4500  (reachable)
```
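As a sanity check when syncing /etc/foundationdb/fdb.cluster between machines, the file's format is simple enough to parse by hand: `description:ID@host:port,host:port,...`, where the addresses are the coordinators. The hand-rolled parser below is for illustration only (it is not part of the fdb package) and can be used to confirm all three machines agree on the coordinator list.

```python
# The cluster file format is "description:ID@host:port,host:port,...".
# This hand-rolled parser (not part of the fdb package) extracts the
# coordinator addresses so the files on each machine can be compared.

def parse_cluster_file(contents):
    line = contents.strip()
    ident, _, coords = line.partition("@")
    description, _, cluster_id = ident.partition(":")
    return {"description": description,
            "id": cluster_id,
            "coordinators": coords.split(",")}

example = "mycluster:a1b2c3d4@172.30.2.218:4500"
info = parse_cluster_file(example)
print(info["coordinators"])  # ['172.30.2.218:4500']
```

After running `coordinators auto`, the file on every machine should list all three private addresses; if the parsed coordinator lists differ between machines, the cluster configuration was not synced correctly.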