Introduction to the fast new UnQLite Python Bindings

About a year ago, I blogged about some Python bindings I wrote for the embedded NoSQL document store UnQLite. One year later I'm happy to announce that I've rewritten the library using Cython and operations are, in most cases, an order of magnitude faster.

This was my first real attempt at using Cython, and the experience was just the right mix of challenging and rewarding. I bought the O'Reilly Cython book, which came in super handy, so if you're interested in getting started with Cython I recommend picking up a copy.

In this post I'll quickly touch on the features of UnQLite, then show you how to use the Python bindings. When you're done reading you should hopefully be ready to use UnQLite in your next Python project.

What is UnQLite?

UnQLite is a serverless JSON document store built on a fast key/value database. The key/value features make UnQLite kin to DBM-style databases (BerkeleyDB, KyotoCabinet), while the JSON document store is closer to something like MongoDB. UnQLite occupies an unusual place in the NoSQL world, though, through its use of a special scripting language, Jx9, to manage the JSON document store. To make an analogy, UnQLite is to MongoDB what SQLite is to Postgres, and the Jx9 scripting language serves the same purpose in UnQLite as SQL does in SQLite. (Note that although UnQLite sounds like SQLite, the projects are not affiliated.)

Here is a quick run-down of some of the features UnQLite's creators, Symisc Systems, decided were worth putting on the project's homepage:

Serverless, NoSQL database.

Transactional (ACID) database.

Zero config.

Single database file, no temporary files.

Cross-platform file format.

Self-contained C library without dependencies.

Standard key/value store with powerful disk storage engine supporting O(1) lookup time.

Document store (JSON) database via Jx9.

Cursors for linear record traversal.

Supports Terabyte sized databases.

BSD licensed.

Installing UnQLite

To get started, let's create a virtualenv and install unqlite-python. unqlite-python comes with a pre-generated C source-code file for the extension, but if you'd like you can install Cython and a new source file will be generated.

$ virtualenv unqlite-demo
New python executable in unqlite-demo/bin/python2
Also creating executable in unqlite-demo/bin/python
Installing setuptools, pip...done.
$ cd unqlite-demo
$ source bin/activate
(unqlite-demo) $ pip install Cython unqlite
...
Successfully built Cython unqlite
Installing collected packages: Cython, unqlite
Successfully installed Cython-0.22.1 unqlite-0.4.1

You can verify your install worked by running the following, which should produce no output:

$ python -c "import unqlite; unqlite.UnQLite()"
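If you would rather run the same check from inside a script, say as part of a project's setup step, a small helper along these lines may do. This is only a sketch: the name can_import is made up for illustration, and it relies on nothing but the standard library's importlib.

```python
import importlib


def can_import(module_name):
    """Return True if `module_name` can be imported, False otherwise.

    `can_import` is a hypothetical helper written for this post; it is not
    part of unqlite-python.
    """
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == '__main__':
    # Prints True when the unqlite package is importable in this environment.
    print(can_import('unqlite'))
```

Note that this only confirms the module imports; the one-liner above goes a step further by also instantiating a database.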

Key/Value Features

If UnQLite were only a key/value store, it would still be a fantastic database thanks to its speed, cursors and transaction support. In this section we'll take a look at how to use the key/value features of UnQLite.

UnQLite databases can reside in a single file on disk, or entirely in memory. To begin working with UnQLite, the first step is to create a database object:

>>> from unqlite import UnQLite
>>> db = UnQLite()

The above statements will create an in-memory database. To use a file, you would instead pass in the filename when instantiating your db object.

unqlite-python implements a similar API to Python's dict object, so it should feel pretty familiar:

>>> db['foo'] = 'bar'
>>> print db['foo']
bar
>>> 'foo' in db
True
>>> del db['foo']
>>> len(db)
0
>>> db.update({'huey': 'kitty', 'mickey': 'puppy'})
>>> print [item for item in db]
[('huey', 'kitty'), ('mickey', 'puppy')]

As shown in the example above, you can iterate directly over the database, which will yield key/value pairs. unqlite-python databases also support keys() and values() methods.
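Because iteration yields key/value pairs, copying a small database into a plain dict (for debugging or in a test, say) can be a one-liner. The helper below is a sketch with a made-up name, snapshot; it assumes only the iteration behaviour described above.

```python
def snapshot(db):
    """Copy every key/value pair from `db` into a plain dict.

    Assumes that iterating over `db` yields (key, value) pairs, as described
    above. `snapshot` is a hypothetical helper, not a library function.
    """
    return {key: value for key, value in db}
```

Calling snapshot(db) on the database from the example should give a regular dict holding the same pairs. Keep in mind that this loads everything into memory, so it is best kept to small databases.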

For finer-grained iteration, you can use Cursors.

>>> for i in range(7):
...     db['k%s' % i] = 'v%s' % i
...
>>> with db.cursor() as cursor:
...     cursor.seek('k4')
...     print cursor.value()
...     for key, value in cursor:  # Cursors are also iterable.
...         print (key, value)
...
v4
('k4', 'v4')
('k5', 'v5')
('k6', 'v6')
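The seek-then-iterate pattern above wraps up neatly as a generator. This is a sketch rather than part of the library: items_from is a made-up name, and it uses only the cursor calls shown in the example (cursor(), seek() and iteration).

```python
def items_from(db, start_key):
    """Yield (key, value) pairs starting at `start_key`.

    `db` is assumed to provide cursor() as a context manager, with seek()
    and iteration behaving as in the example above. `items_from` is a
    hypothetical helper, not a library function.
    """
    with db.cursor() as cursor:
        cursor.seek(start_key)
        for key, value in cursor:
            yield key, value
```

With the database from the example, list(items_from(db, 'k4')) should walk the same records the loop above printed. The cursor stays open until the generator is exhausted or closed, so it is best to consume it promptly.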

If you're using a file-backed database, UnQLite supports transactions. The simplest way to use transactions is as a context manager:

>>> db = UnQLite('/tmp/test.udb')
>>> with db.transaction():
...     db['foo'] = 'bar'
...
>>> print db['foo']
bar
>>> with db.transaction():
...     db['foo'] = 'baze'
...     db.rollback()  # Undo the changes.
...
>>> print db['foo']  # Prints the original value.
bar
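One place this comes in handy is all-or-nothing batch writes. The sketch below uses only the two calls shown above, transaction() and rollback(); the helper name set_many is made up for illustration. It deliberately rolls back by hand inside the block, the same way the example does, rather than relying on how the context manager treats exceptions.

```python
def set_many(db, items):
    """Write all (key, value) pairs in one transaction.

    Returns True if every write succeeded, or False if a write failed and
    the changes were rolled back. `db` is assumed to provide transaction()
    and rollback() as shown above; `set_many` is a hypothetical helper.
    """
    with db.transaction():
        try:
            for key, value in items:
                db[key] = value
        except Exception:
            db.rollback()  # Undo the partial writes.
            return False
    return True
```

A call might look like set_many(db, [('huey', 'kitty'), ('mickey', 'puppy')]) against a file-backed database. Swallowing the exception keeps the sketch short; in real code you would probably want to log it or re-raise it after rolling back.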

JSON Document Store

UnQLite has this crazy scripting language baked-in, which is used to query the JSON document store. Jx9 serves the same purpose in UnQLite as SQL does in SQLite, but it can also do a whole lot of crazy stuff.

I took some care to make it really easy to pass Python values into the Jx9 scripts, and pull them back out after execution. Here is a silly example:

>>> script = """
...     $my_data = {
...         os_name: uname(),  // jx9 builtin function
...         date: __DATE__,  // another builtin
...         foo: $py_value  // just a simple key/value.
...     };
... """
>>> with db.vm(script) as jx9_vm:
...     jx9_vm['py_value'] = {'baze': 'nugget'}  # Set the value of $py_value
...     jx9_vm.execute()
...     print jx9_vm['my_data']  # Extract $my_data from the executed script.
...
{'date': '2015-07-21', 'os_name': 'Linux 4.0.7-2-ARCH #1 ... lambda x86_64', 'foo': {'baze': 'nugget'}}
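If you find yourself repeating the set-variables, execute, read-result dance, it folds into a small function. This is a sketch built only from the calls in the example above (db.vm(), item assignment, execute() and item lookup); run_jx9 is a made-up name, not something the library provides.

```python
def run_jx9(db, script, variables, result_name):
    """Run a Jx9 script and return one variable from it.

    `variables` maps Jx9 variable names (without the leading $) to Python
    values, and `result_name` is the variable to read back afterwards.
    `db` is assumed to provide vm() as shown above; `run_jx9` is a
    hypothetical helper.
    """
    with db.vm(script) as vm:
        for name, value in variables.items():
            vm[name] = value  # Set inputs before the script runs.
        vm.execute()
        return vm[result_name]
```

The example above would then become run_jx9(db, script, {'py_value': {'baze': 'nugget'}}, 'my_data').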

This procedural scripting language is used to work with Collections of JSON documents. Rather than forcing you to write Jx9 scripts, unqlite-python abstracts away some of the most common operations behind a Collection class:

>>> users = db.collection('users')
>>> users.create()  # Create the collection.
>>> users.store({
...     'name': 'Charlie',
...     'pets': [{'name': 'mickey'}, {'name': 'huey'}],
...     'best friends': ['Leslie', 'Connor'],
... })
0

When we store an object, UnQLite returns the __id of the newly-created document. Multiple objects can be stored by passing in a list of dictionaries, in which case the return value is the __id of the last document stored (2 in the example below). To update an object, specify the __id and the dictionary of new data:

>>> users.store([{'name': 'Leslie'}, {'name': 'Connor', 'type': 'baby'}])
2
>>> users.update(1, {'name': 'Leslie', 'favorite_color': 'green'})
True

To view the documents in a collection, you can call Collection.all() :

>>> users.all()
[{'__id': 0, 'best friends': ['Leslie', 'Connor'], 'name': 'Charlie', 'pets': [{'name': 'mickey'}, {'name': 'huey'}]},
 {'__id': 1, 'name': 'Leslie', 'favorite_color': 'green'},
 {'__id': 2, 'name': 'Connor', 'type': 'baby'}]

Things get neat when it comes to filtering collections. To filter a collection, you write a filter function in Python, and unqlite-python will expose it to a Jx9 script that performs the filtering:

>>> def babies(document):
...     return document.get('type') == 'baby'
...
>>> users.filter(babies)
[{'__id': 2, 'name': 'Connor', 'type': 'baby'}]
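A handy way to think about a filter function is that it is an ordinary predicate: it receives one document as a dict and returns something truthy to keep it. The plain-Python stand-in below illustrates that contract without touching UnQLite at all, which also makes filter functions easy to unit test. It is a mental model only, not how unqlite-python implements filtering, and filter_documents is a made-up name.

```python
def babies(document):
    """Keep only documents whose 'type' field is 'baby'."""
    return document.get('type') == 'baby'


def filter_documents(documents, predicate):
    """Plain-Python stand-in for filtering: keep documents the predicate accepts.

    `filter_documents` is a hypothetical helper for illustration; the real
    filtering is performed by a Jx9 script, as described above.
    """
    return [doc for doc in documents if predicate(doc)]


# Sample documents shaped like the ones stored above.
users = [
    {'__id': 0, 'name': 'Charlie'},
    {'__id': 1, 'name': 'Leslie', 'favorite_color': 'green'},
    {'__id': 2, 'name': 'Connor', 'type': 'baby'},
]

# Only Connor's document has type 'baby', so only that one is kept.
print(filter_documents(users, babies))
```

Using document.get('type') rather than document['type'] matters here: most documents have no 'type' key, and get() returns None for them instead of raising a KeyError.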

For more information about collections, check out the unqlite-python documentation.

Thanks for reading

Thanks for taking the time to read this post. I hope you found the content interesting. I think UnQLite is one of those weird, quirky projects that would be fun to build all sorts of little apps with. It could be used as a cache, of course, but you could also go deep into learning Jx9 and write all sorts of crazy stuff.

If you have any questions or comments, please leave a comment or contact me. If you are trying out unqlite-python and believe you've found a bug, don't hesitate to create a ticket on GitHub.
