One of the most important components of a network automation strategy is how to manage your data and/or Source of Truth (SoT). In networking, the data elements are often massive. As an example, most Fortune 500 type companies will have to manage a few hundred thousand to potentially millions of switchports, each with several data points (VLAN, MTU, IP, etc) per interface. The scale and criticality of this data present several challenges. The intention of this blog is describe the need for a versionable database in network automation and serve as an primer to a versionable database.

Problem Space

When managing and storing your Source of Truth data, there are two primary mechanisms, each with it’s own pros and cons.

Source Control (Git)

Provides the ability to version data in such a way the exact state of the data at any point in the past is known.

Provides the ability to know who is the owner of the data.

Provides the ability to populate data in a staging area, without modifying the production data.

Tooling integration with things such as CI Systems.

Decentralized storage of the data.

Database

Provides ACID-based transactions of the data.

Has a native querying language to obtain, filter, and all around work with the data.

Provides schema enforcement of the data.

The ability to scale to large datasets.

There is a clear dichotomy between these two choices. On one hand there is great integrations with all standard DevOps tooling, on the other hand there is an enterprise-grade manager of the data.

History

For several years, a few of us have been searching for solutions that combine these two concepts. If one could build a database with Git constructs, this would allow versionable management of your data, with the ability to query, effectively store, and manage at scale. Through various searches it would seem that this has not been solved in any meaningful way. The closest I have come across was a project called NOMS, but it has limitations.

Recently a startup company called liquidata was founded and is creating a solution–a library called Dolt. Dolt is is written on top of the NOMS project, in Go. The intention is to support all Git semantics (such as branch, add, commit, etc..), as well as all MySQL semantics (such as insert, create, update, etc…).

By supporting all MySQL semantics, applications that use a MySQL server should be able to simply use a Dolt-SQL server with no disruption. Whether you are using an ODBC driver or raw SQL, in theory it should still work. You would naturally have to build in the capability for your application to take advantage of the Git capabilities or use traditional “Git-like” workflows.

By supporting all Git semantics, data should be able to be managed in a decentralized fashion using Git workflows to branch, add, diff, and merge the data. In a similar vein to GitHub, they have developed DoltHub, to provide that level of tooling expected in a standard Git user interface, such as forking, API’s, webhooks, and CI integrations.

Primer

In this brief introduction to the technology, we will create a Dolt data repository, called “simple-inventory” and run it on a dolt-sql server. If you care to follow along the only requirements are to have Docker and internet access.

Setup

Start the Docker container.

docker run -it --rm --name dolt golang:1.12.14

Install Dolt by running curl -L https://github.com/liquidata-inc/dolt/releases/download/v0.12.0/install.sh | bash

root@2a29f8a34dd3:/go# curl -L https://github.com/liquidata-inc/dolt/releases/download/v0.12.0/install.sh | bash % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 601 0 601 0 0 2647 0 -- :--:-- -- :--:-- -- :--:-- 2647 100 3038 100 3038 0 0 8392 0 -- :--:-- -- :--:-- -- :--:-- 8392 Downloading: https://github.com/liquidata-inc/dolt/releases/download/v0.12.0/dolt-linux-amd64.tar.gz Installing dolt, git-dolt and git-dolt-smudge to /usr/local/bin. root@2a29f8a34dd3:/go#

You can create a Dolt data repository by creating a folder, and then initializing the repository. Just like Git, in Dolt, you need to add yourself to the Dolt config. If you are familiar with Git, these commands will be familiar, only changing the command from git to dolt , but keeping all of the other options the same.

root@2a29f8a34dd3:/go# mkdir simple-inventory root@2a29f8a34dd3:/go# cd simple-inventory root@2a29f8a34dd3:/go/simple-inventory# dolt config --global --add user.email "ken@celenza.org" Config successfully updated. root@2a29f8a34dd3:/go/simple-inventory# dolt config --global --add user.name "itdependsnetworks" Config successfully updated. root@2a29f8a34dd3:/go/simple-inventory# dolt init Successfully initialized dolt data repository. root@2a29f8a34dd3:/go/simple-inventory# dolt status On branch master nothing to commit, working tree clean root@2a29f8a34dd3:/go/simple-inventory# dolt log commit hc8gq0434ckkjeo36rccn55mo14gk87q Author: itdependsnetworks <ken@celenza.org> Date: Tue Jan 28 04:22:33 +0000 2020 Initialize data repository root@2a29f8a34dd3:/go/simple-inventory#

Creating Tables

Ideally, if you are familiar with MySQL the only difference should be using Dolt engine instead of a MySQL server, everything should be the same. To enter the Dolt SQL server, simply use the command dolt sql .

root@2a29f8a34dd3:/go/simple-inventory# dolt sql # Welcome to the DoltSQL shell. # Statements must be terminated with ';'. # "exit" or "quit" (or Ctrl-D) to exit. doltsql>

Create tables as you normally would, in this example, we will create two tables. The table device_inventory will have the inventory of the devices, with columns for hostname and IP address. The table vlan will have the hostname (with foreign key relationaship to the device_inventory table) column, VLAN , and name of the VLAN.

Note: Though the foreign key relationaship syntax is defined, it is not a currently supported Dolt feature.

doltsql > CREATE TABLE device_inventory ( -> hostname varchar ( 32 ) NOT NULL , -> ip_address varchar ( 15 ) NOT NULL , -> primary key ( `hostname` ) -> ); doltsql > CREATE TABLE vlan ( -> hostname varchar ( 32 ), -> vlan int NOT NULL , -> name varchar ( 32 ) NOT NULL , -> PRIMARY KEY ( hostname , vlan ), -> FOREIGN KEY ( hostname ) REFERENCES device_inventory ( hostname ) -> ); doltsql > exit

Here is where it starts to get interesting–we can combine these two different concepts and see how they interact. So far, the data has not been committed into master, only staged into our local environment. We can view this by issuing standard Git-like commands when we return back to the command line from the dolt-sql server.

Run the dolt status , dolt diff , and dolt diff -q commands.

root@2a29f8a34dd3:/go/simple-inventory# dolt status On branch master Untracked files: ( use "dolt add <table>" to include in what will be committed ) new table: device_inventory new table: vlan root@2a29f8a34dd3:/go/simple-inventory# dolt diff diff --dolt a/device_inventory b/device_inventory added table diff --dolt a/vlan b/vlan added table root@2a29f8a34dd3:/go/simple-inventory# dolt diff -q CREATE TABLE ` device_inventory ` ( ` hostname ` LONGTEXT NOT NULL COMMENT 'tag:0' , ` ip_address ` LONGTEXT NOT NULL COMMENT 'tag:1' , PRIMARY KEY ( ` hostname ` ) ) ; CREATE TABLE ` vlan ` ( ` hostname ` LONGTEXT NOT NULL COMMENT 'tag:0' , ` vlan ` BIGINT NOT NULL COMMENT 'tag:1' , ` name ` LONGTEXT NOT NULL COMMENT 'tag:2' , PRIMARY KEY ( ` hostname ` , ` vlan ` ) ) ; root@2a29f8a34dd3:/go/simple-inventory#

The dolt status command shows that the schema has not “taken effect” by being merged into master. The dolt diff with optional -q flag, shows that a new table has been created. Note the additional tag parameters, this is logic to track column names, which may change over time and cause name conflicts. The new schema can be committed into master.

Issue the commands dolt add -a , dolt commit -m 'initial schema' , and dolt log .

root@2a29f8a34dd3:/go/simple-inventory# dolt add -a root@2a29f8a34dd3:/go/simple-inventory# dolt commit -m 'initial schema' commit 4n29eaqb1v9pmnpnda2r43fc91p1vp9l Author: itdependsnetworks <ken@celenza.org> Date: Tue Jan 28 04:22:41 +0000 2020 initial schema root@2a29f8a34dd3:/go/simple-inventory# dolt log commit 4n29eaqb1v9pmnpnda2r43fc91p1vp9l Author: itdependsnetworks <ken@celenza.org> Date: Tue Jan 28 04:22:41 +0000 2020 initial schema commit hc8gq0434ckkjeo36rccn55mo14gk87q Author: itdependsnetworks <ken@celenza.org> Date: Tue Jan 28 04:22:33 +0000 2020 Initialize data repository root@2a29f8a34dd3:/go/simple-inventory#

Awesome! The first piece of information of the database was committed.

Adding data

Defining schema is not all the valuable without the data. Again, using standard SQL constructs we can commit the data. Let’s use proper branching strategies this time.

Create a branch and Enter dolt-sql

root@2a29f8a34dd3:/go/simple-inventory# dolt checkout -b intial_data Switched to branch 'intial_data' root@2a29f8a34dd3:/go/simple-inventory# dolt sql # Welcome to the DoltSQL shell. # Statements must be terminated with ';'. # "exit" or "quit" (or Ctrl-D) to exit. doltsql>

Add data

doltsql > INSERT INTO device_inventory ( hostname , ip_address ) VALUES ( "nyc-sw01" , "10.1.1.1" ); + ---------+ | updated | + ---------+ | 1 | + ---------+ doltsql > INSERT INTO device_inventory ( hostname , ip_address ) VALUES ( "nyc-sw02" , "10.1.1.2" ); + ---------+ | updated | + ---------+ | 1 | + ---------+ doltsql > INSERT INTO vlan ( hostname , vlan , name ) VALUES ( "nyc-sw01" , 10 , "user" ); + ---------+ | updated | + ---------+ | 1 | + ---------+ doltsql > INSERT INTO vlan ( hostname , vlan , name ) VALUES ( "nyc-sw01" , 20 , "printer" ); + ---------+ | updated | + ---------+ | 1 | + ---------+ doltsql > INSERT INTO vlan ( hostname , vlan , name ) VALUES ( "nyc-sw01" , 30 , "wap" ); + ---------+ | updated | + ---------+ | 1 | + ---------+ doltsql > INSERT INTO vlan ( hostname , vlan , name ) VALUES ( "nyc-sw02" , 10 , "user" ); + ---------+ | updated | + ---------+ | 1 | + ---------+ doltsql > INSERT INTO vlan ( hostname , vlan , name ) VALUES ( "nyc-sw02" , 20 , "printer" ); + ---------+ | updated | + ---------+ | 1 | + ---------+ doltsql > INSERT INTO vlan ( hostname , vlan , name ) VALUES ( "nyc-sw02" , 30 , "wap" ); + ---------+ | updated | + ---------+ | 1 | + ---------+ doltsql > exit Bye

Now that we have staged the data, we can view the date in two different ways. The first is via raw sql entries, and the other is more akin to a unix diff.

View the diff. In the console, diffs are actually color coded green and red for add and remove respectively.

root@2a29f8a34dd3:/go/simple-inventory# dolt diff diff --dolt a/device_inventory b/device_inventory --- a/device_inventory @ tktob1spsoos6isfdj4o9benp9c57iic +++ b/device_inventory @ b37haqr0n9j1e5bckj35sqt4kntojlp9 +-----+----------+------------+ | | hostname | ip_address | +-----+----------+------------+ | + | nyc-sw01 | 10.1.1.1 | | + | nyc-sw02 | 10.1.1.2 | +-----+----------+------------+ diff --dolt a/vlan b/vlan --- a/vlan @ 7f8lqlv2k9cpdth68kob9hl8atuejrpn +++ b/vlan @ e2kovelv2sfjuo584o0v6hp76207k720 +-----+----------+------+---------+ | | hostname | vlan | name | +-----+----------+------+---------+ | + | nyc-sw01 | 10 | user | | + | nyc-sw01 | 20 | printer | | + | nyc-sw01 | 30 | wap | | + | nyc-sw02 | 10 | user | | + | nyc-sw02 | 20 | printer | | + | nyc-sw02 | 30 | wap | +-----+----------+------+---------+ root@2a29f8a34dd3:/go/simple-inventory# dolt diff -q INSERT INTO ` device_inventory ` ( ` hostname ` , ` ip_address ` ) VALUES ( "nyc-sw01" , "10.1.1.1" ) ; INSERT INTO ` device_inventory ` ( ` hostname ` , ` ip_address ` ) VALUES ( "nyc-sw02" , "10.1.1.2" ) ; INSERT INTO ` vlan ` ( ` hostname ` , ` vlan ` , ` name ` ) VALUES ( "nyc-sw01" ,10, "user" ) ; INSERT INTO ` vlan ` ( ` hostname ` , ` vlan ` , ` name ` ) VALUES ( "nyc-sw01" ,20, "printer" ) ; INSERT INTO ` vlan ` ( ` hostname ` , ` vlan ` , ` name ` ) VALUES ( "nyc-sw01" ,30, "wap" ) ; INSERT INTO ` vlan ` ( ` hostname ` , ` vlan ` , ` name ` ) VALUES ( "nyc-sw02" ,10, "user" ) ; INSERT INTO ` vlan ` ( ` hostname ` , ` vlan ` , ` name ` ) VALUES ( "nyc-sw02" ,20, "printer" ) ; INSERT INTO ` vlan ` ( ` hostname ` , ` vlan ` , ` name ` ) VALUES ( "nyc-sw02" ,30, "wap" ) ; root@2a29f8a34dd3:/go/simple-inventory#

Personally, I’m pretty impressed with what is happening here, but there is still more to do to complete this workflow. The data needs to be committed, and merged from the “feature” branch into master branch.

Commit to the feature branch.

root@2a29f8a34dd3:/go/simple-inventory# dolt add -a root@2a29f8a34dd3:/go/simple-inventory# dolt commit -m 'initial data' commit gbt6r21mtbd0cvs5iahog912essjhn0n Author: itdependsnetworks <ken@celenza.org> Date: Tue Jan 28 04:22:50 +0000 2020 initial data root@2a29f8a34dd3:/go/simple-inventory#

Merge into the master branch.

root@2a29f8a34dd3:/go/simple-inventory# dolt checkout master Switched to branch 'master' root@2a29f8a34dd3:/go/simple-inventory# dolt branch intial_data * master root@2a29f8a34dd3:/go/simple-inventory# dolt merge intial_data Updating 4n29eaqb1v9pmnpnda2r43fc91p1vp9l..gbt6r21mtbd0cvs5iahog912essjhn0n Fast-forward root@2a29f8a34dd3:/go/simple-inventory# dolt log commit gbt6r21mtbd0cvs5iahog912essjhn0n Author: itdependsnetworks <ken@celenza.org> Date: Tue Jan 28 04:22:50 +0000 2020 initial data commit 4n29eaqb1v9pmnpnda2r43fc91p1vp9l Author: itdependsnetworks <ken@celenza.org> Date: Tue Jan 28 04:22:41 +0000 2020 initial schema commit hc8gq0434ckkjeo36rccn55mo14gk87q Author: itdependsnetworks <ken@celenza.org> Date: Tue Jan 28 04:22:33 +0000 2020 Initialize data repository root@2a29f8a34dd3:/go/simple-inventory#

So far this is showing the “Create” part of standard CRUD operations is working, meaning we can add data to the repository.

Naturally, as soon as data is entered, you will want to modify it. The process is the same to modify: branch, sql statements, add, commit, and merge.

Checkout a new branch, and enter Dolt SQL.

root@2a29f8a34dd3:/go/simple-inventory# dolt checkout -b change_vlan Switched to branch 'change_vlan' root@2a29f8a34dd3:/go/simple-inventory# dolt sql # Welcome to the DoltSQL shell. # Statements must be terminated with ';'. # "exit" or "quit" (or Ctrl-D) to exit. doltsql>

Update the data.

doltsql > update vlan set name = "prnt" where name = "printer" ; + ---------+---------+ | matched | updated | + ---------+---------+ | 2 | 2 | + ---------+---------+ doltsql > exit Bye

View the diff.

root@2a29f8a34dd3:/go/simple-inventory# dolt diff diff --dolt a/vlan b/vlan --- a/vlan @ bjg5ou111lmu03ciu6aisphf7m6jbk1r +++ b/vlan @ 7f8lqlv2k9cpdth68kob9hl8atuejrpn +-----+----------+------+---------+ | | hostname | vlan | name | +-----+----------+------+---------+ | < | nyc-sw01 | 20 | printer | | > | nyc-sw01 | 20 | prnt | | < | nyc-sw02 | 20 | printer | | > | nyc-sw02 | 20 | prnt | +-----+----------+------+---------+ root@2a29f8a34dd3:/go/simple-inventory# dolt diff -q UPDATE ` vlan ` SET ` name ` = "prnt" WHERE ( ` hostname ` = "nyc-sw01" AND ` vlan ` = 20 ) ; UPDATE ` vlan ` SET ` name ` = "prnt" WHERE ( ` hostname ` = "nyc-sw02" AND ` vlan ` = 20 ) ; root@2a29f8a34dd3:/go/simple-inventory#

A keen eye will note the captured update statements shown in the diff are not the same as the update statement inputted in the SQL server. This conversion makes sense, since the data could be merged on a different set of data that had more or less data in it. It also illustrates the complexity of building this technology.

Commit to a feature branch and merge to master.

root@2a29f8a34dd3:/go/simple-inventory# dolt add -a root@2a29f8a34dd3:/go/simple-inventory# dolt commit -m 'update data' commit jqe3vb84fhfunn7uapksdt5thb4rr7r7 Author: itdependsnetworks <ken@celenza.org> Date: Tue Jan 28 04:23:03 +0000 2020 update data root@2a29f8a34dd3:/go/simple-inventory# dolt checkout master Switched to branch 'master' root@2a29f8a34dd3:/go/simple-inventory# dolt merge change_vlan Updating gbt6r21mtbd0cvs5iahog912essjhn0n..jqe3vb84fhfunn7uapksdt5thb4rr7r7 Fast-forward root@2a29f8a34dd3:/go/simple-inventory# dolt sql # Welcome to the DoltSQL shell. # Statements must be terminated with ';'. # "exit" or "quit" (or Ctrl-D) to exit. doltsql>

View the data from dolt sql.

doltsql > select * from vlan ; + ----------+------+------+ | hostname | vlan | name | + ----------+------+------+ | nyc - sw01 | 10 | user | | nyc - sw01 | 20 | prnt | | nyc - sw01 | 30 | wap | | nyc - sw02 | 10 | user | | nyc - sw02 | 20 | prnt | | nyc - sw02 | 30 | wap | + ----------+------+------+ doltsql >

You can also view who is the owner of the data by tracking to the commit, which includes the auther, time, and commit hash metadata as well, using the dolt blame <table> command.

root@2a29f8a34dd3:/go/simple-inventory# dolt blame vlan +----------+------+--------------+-------------------+------------------------------+-------------------+ | HOSTNAME | VLAN | COMMIT MSG | AUTHOR | TIME | COMMIT | +----------+------+--------------+-------------------+------------------------------+-------------------+ | nyc-sw01 | 30 | initial data | itdependsnetworks | Tue Jan 28 04:22:50 UTC 2020 | gbt6r21mtbd0cvs5i | | nyc-sw02 | 10 | initial data | itdependsnetworks | Tue Jan 28 04:22:50 UTC 2020 | gbt6r21mtbd0cvs5i | | nyc-sw02 | 20 | update data | itdependsnetworks | Tue Jan 28 04:23:03 UTC 2020 | jqe3vb84fhfunn7ua | | nyc-sw02 | 30 | initial data | itdependsnetworks | Tue Jan 28 04:22:50 UTC 2020 | gbt6r21mtbd0cvs5i | | nyc-sw01 | 10 | initial data | itdependsnetworks | Tue Jan 28 04:22:50 UTC 2020 | gbt6r21mtbd0cvs5i | | nyc-sw01 | 20 | update data | itdependsnetworks | Tue Jan 28 04:23:03 UTC 2020 | jqe3vb84fhfunn7ua | +----------+------+--------------+-------------------+------------------------------+-------------------+ root@2a29f8a34dd3:/go/simple-inventory#

Summary

This is just a primer, and there is still a lot of work to be done to truly have feature parity with both MySQL and Git, but this is a great step in the right direction. I plan to continue to monitor and follow up with examples, use cases, and library updates over time. Specifically, next time, I want to extend the workflow to include DoltHub to review those capabilities as well.

-Ken