Tutorial: Use FlatBuffers in Go

October 26, 2015 FlatBuffers Golang

This is a post in an ongoing series on FlatBuffers.

The FlatBuffers project is awesome. In this tutorial, you’ll learn how to use it in Go.

To learn more about why we need yet another way to encode data, go read my post Why FlatBuffers.

FlatBuffers is a serialization format from Google. It’s really fast at reading and writing your data: much quicker than JSON or XML, and often faster than Google’s other format, Protocol Buffers. It’s schema-versioned, which means your data has integrity (like in a relational database). FlatBuffers supports six programming languages: C++, C#, Go, Java, JavaScript, and Python.

This post will show you how to set up FlatBuffers and then use it in a demo Go program. We’ll finish with speed measurements, because we all love micro-benchmarks!

(Full disclosure: I maintain the Go and Python ports.)

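This tutorial has seven short parts:

1. Install the FlatBuffers compiler
2. Write a schema definition
3. Generate Go accessor code from the schema
4. Install the FlatBuffers Go runtime library
5. Write a demo Go program to encode and decode example data
6. Write and run benchmarks
7. Learn more and get involved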

If you’d like to see all of the code in one place, I’ve put the project up at a GitHub repository.

1. Install the FlatBuffers compiler

First things first: let’s install the compiler.

The compiler is used only in development. That means you have no new system dependencies to worry about in production environments!

Installation with Homebrew on OSX

On my OSX system, I use Homebrew to manage packages. To update the Homebrew library and install FlatBuffers, run:

    $ brew update
    $ brew install flatbuffers

Personally, I like to install the latest development version from the official Git repository:

    $ brew update
    $ brew install flatbuffers --HEAD

If successful, you will have the flatc program accessible from your shell. To verify it’s installed, execute flatc:

    $ flatc
    flatc: missing input files
    ...

Other installation methods

If you’d like to install from source, install a Windows executable, or build for Visual Studio, head over to my post Installing FlatBuffers for more.

2. Write a schema definition

All data in FlatBuffers are defined by schemas. Schemas in FlatBuffers are plain text files, and they are similar in purpose to schemas in databases like Postgres.

We’ll work with data that make up user details for a website. It’s a trivial example, but good for an introduction. Here’s the schema:

    // myschema.fbs
    namespace users;

    table User {
      name:string;
      id:ulong;
    }

    root_type User;

Create a new directory for our tutorial, and place the above code in a file called myschema.fbs.

This schema defines User, which holds one user’s name and id. The namespace for these types is users (which will be the generated Go package name). The topmost type in our object hierarchy is the root type User.

Schemas are a core part of FlatBuffers, and we’re barely scratching the surface with this one. It’s possible to have default values, vectors, objects-within-objects, enums, and more. If you’re curious, go read the documentation on the schema format.

3. Generate Go accessor code from the schema

The next step is to use the flatc compiler to generate Go code for us. It takes as input a schema file and outputs ready-to-use Go code.

In the directory with the myschema.fbs file, run the following command:

flatc -g myschema.fbs

This will generate Go code under the directory users, which was the namespace we declared in the schema file. Here’s what the directory looks like afterwards:

    $ tree
    .
    ├── myschema.fbs
    └── users
        └── User.go

    1 directory, 2 files

One file is generated for each first-class datatype. In our case, there is one file, for User.

A quick browse of users/User.go shows that there are three sections to the generated file. Here’s how to think about the different function groups:

Type definition and initialization:

    type User struct { ... }
    func GetRootAsUser(buf []byte, offset flatbuffers.UOffsetT) *User { ... }
    func (rcv *User) Init(buf []byte, i flatbuffers.UOffsetT) { ... }

Instance methods providing read access to User data:

    func (rcv *User) Name() []byte { ... }
    func (rcv *User) Id() uint64 { ... }

Functions used to create new User objects:

    func UserStart(builder *flatbuffers.Builder) { ... }
    func UserAddName(builder *flatbuffers.Builder, name flatbuffers.UOffsetT) { ... }
    func UserAddId(builder *flatbuffers.Builder, id uint64) { ... }
    func UserEnd(builder *flatbuffers.Builder) flatbuffers.UOffsetT { ... }

We’ll use these functions when we write the demo program.

4. Install the FlatBuffers Go runtime library

The FlatBuffers Go runtime package is go get -able. However, because this article is a self-contained tutorial, I’m going to mangle the GOPATH environment variable to make installation local to this directory:

GOPATH=$(pwd) go get github.com/google/flatbuffers/go

(pwd prints the absolute path of the current directory.)

Your project directory should now have one file and three directories at the top level:

    $ ls -1
    myschema.fbs
    pkg
    src
    users

5. Write a demo Go program to encode and decode example data

Let’s create a full program to write and read our User FlatBuffers.

Imports

The following code provides the package name and imports.

Copy this into a new file, main.go:

    // main.go part 1 of 4
    package main

    import (
        "fmt"

        "./users"
        flatbuffers "github.com/google/flatbuffers/go"
    )

This code imports fmt for printing, ./users to access our generated code, and the flatbuffers runtime library.

Writing

FlatBuffer objects are stored directly in byte slices. Each object is constructed using the generated functions we made with the flatc compiler.

Append the following snippet to your main.go:

    // main.go part 2 of 4
    func MakeUser(b *flatbuffers.Builder, name []byte, id uint64) []byte {
        // re-use the already-allocated Builder:
        b.Reset()

        // create the name object and get its offset:
        name_position := b.CreateByteString(name)

        // write the User object:
        users.UserStart(b)
        users.UserAddName(b, name_position)
        users.UserAddId(b, id)
        user_position := users.UserEnd(b)

        // finish the write operations by making our User the root object:
        b.Finish(user_position)

        // return the byte slice containing encoded data:
        return b.Bytes[b.Head():]
    }

This function takes a FlatBuffers Builder object and uses generated methods to write the user’s name and ID. (Note how the string value is created before the creation of the User object. This is needed because variable-length data are built ‘bottom to top’. I’ll write more about this in a future article.)
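To get a feel for why the string has to exist before the object that points to it, here is a toy back-to-front writer that uses only the standard library. It is a sketch, not the FlatBuffers wire format: backBuilder, buildUser, readUser, and the 16-byte record layout are all hypothetical and invented for this illustration. The only point is that when a buffer is filled from the end toward the start, a child has to be written first so that the parent can record where it lives.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// backBuilder is a hypothetical toy type (not part of FlatBuffers). It fills
// a fixed-size buffer from the end toward the start.
type backBuilder struct {
	buf  []byte
	head int // index of the first byte written so far
}

func newBackBuilder(size int) *backBuilder {
	return &backBuilder{buf: make([]byte, size), head: size}
}

// prepend copies p in front of everything written so far and returns its
// position, counted backwards from the end of the buffer.
// (No bounds checking: this toy assumes the buffer is large enough.)
func (b *backBuilder) prepend(p []byte) int {
	b.head -= len(p)
	copy(b.buf[b.head:], p)
	return len(b.buf) - b.head
}

// buildUser writes the name first, then a 16-byte record that refers to it:
// id (8 bytes), name position (4 bytes), name length (4 bytes).
func buildUser(b *backBuilder, name []byte, id uint64) int {
	namePos := b.prepend(name) // the child must exist before the parent
	var rec [16]byte
	binary.LittleEndian.PutUint64(rec[0:8], id)
	binary.LittleEndian.PutUint32(rec[8:12], uint32(namePos))
	binary.LittleEndian.PutUint32(rec[12:16], uint32(len(name)))
	return b.prepend(rec[:])
}

// readUser follows the record's reference back to the name bytes.
func readUser(buf []byte, recPos int) (name []byte, id uint64) {
	rec := buf[len(buf)-recPos:]
	id = binary.LittleEndian.Uint64(rec[0:8])
	nameStart := len(buf) - int(binary.LittleEndian.Uint32(rec[8:12]))
	nameLen := int(binary.LittleEndian.Uint32(rec[12:16]))
	return buf[nameStart : nameStart+nameLen], id
}

func main() {
	b := newBackBuilder(64)
	recPos := buildUser(b, []byte("Arthur Dent"), 42)
	name, id := readUser(b.buf, recPos)
	fmt.Printf("%s has id %d\n", name, id)
}
```

If buildUser tried to write the record before the name, it would have no name position to store, which is the same ordering constraint MakeUser respects by calling CreateByteString before UserStart.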

Reading

FlatBuffer objects are stored as byte slices, and we access the data inside using the generated functions (that the flatc compiler made for us in ./users).

Append the following code to your main.go:

    // main.go part 3 of 4
    func ReadUser(buf []byte) (name []byte, id uint64) {
        // initialize a User reader from the given buffer:
        user := users.GetRootAsUser(buf, 0)

        // point the name variable to the bytes containing the encoded name:
        name = user.Name()

        // copy the user's id (since this is just a uint64):
        id = user.Id()

        return
    }

This function takes a byte slice as input, and initializes a FlatBuffer reader for the User type. It then gives us access to the name and ID values in the byte slice.
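One consequence worth spelling out: because the name is pointed at bytes inside the buffer rather than copied, it shares memory with buf. The sketch below shows that behaviour with plain Go slices only. The helpers view and detach are hypothetical names made up for this example, and the padded byte string merely stands in for an encoded buffer.

```go
package main

import "fmt"

// view returns a sub-slice of buf. No bytes are copied, so the result
// shares memory with buf.
func view(buf []byte, start, n int) []byte {
	return buf[start : start+n]
}

// detach returns an independent copy that survives changes to the source.
func detach(p []byte) []byte {
	return append([]byte(nil), p...)
}

func main() {
	buf := []byte("xxArthur Dentxx") // stand-in for an encoded buffer
	name := view(buf, 2, 11)         // aliases buf
	kept := detach(name)             // owns its own bytes

	buf[2] = 'M' // simulate the buffer being overwritten

	fmt.Printf("aliased: %s\n", name)
	fmt.Printf("copied:  %s\n", kept)
}
```

In the demo program this matters if you hold on to a name and then call MakeUser again with the same Builder: Reset lets the Builder overwrite the bytes the old name points at. Copying the name (as detach does here) is one way to keep it around.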

The main function

Now we tie it all together. This is the main function:

    // main.go part 4 of 4
    func main() {
        b := flatbuffers.NewBuilder(0)
        buf := MakeUser(b, []byte("Arthur Dent"), 42)
        name, id := ReadUser(buf)
        fmt.Printf("%s has id %d. The encoded data is %d bytes long.\n", name, id, len(buf))
    }

This function writes, reads, then prints our data. Note that buf is the byte slice with encoded data. (This is the object you could send over the network or save to a file.)

Running it

Now, we run it:

    $ GOPATH=$(pwd) go run main.go
    Arthur Dent has id 42. The encoded data is 48 bytes long.

To recap, what we’ve done here is write a short program that uses generated code to write, then read, a byte slice in which we encoded data for an example user.

6. Write and run benchmarks

To conclude, write a short benchmark program, then run it.

Place the following code in main_test.go:

    // main_test.go
    package main

    import (
        "bytes"
        "testing"

        flatbuffers "github.com/google/flatbuffers/go"
    )

    func BenchmarkWrite(b *testing.B) {
        builder := flatbuffers.NewBuilder(0)
        b.ReportAllocs()
        for i := 0; i < b.N; i++ {
            builder.Reset()
            buf := MakeUser(builder, []byte("Arthur Dent"), 42)
            if i == 0 {
                b.SetBytes(int64(len(buf)))
            }
        }
    }

    func BenchmarkRead(b *testing.B) {
        builder := flatbuffers.NewBuilder(0)
        name := []byte("Arthur Dent")
        buf := MakeUser(builder, name, 42)
        b.SetBytes(int64(len(buf)))
        b.ReportAllocs()
        for i := 0; i < b.N; i++ {
            got_name, _ := ReadUser(buf)
            // do some work to prevent cheating the benchmark:
            bytes.Equal(got_name, name)
        }
    }

    func BenchmarkRoundtrip(b *testing.B) {
        builder := flatbuffers.NewBuilder(0)
        b.ReportAllocs()
        for i := 0; i < b.N; i++ {
            builder.Reset()
            buf := MakeUser(builder, []byte("Arthur Dent"), 42)
            got_name, _ := ReadUser(buf)
            if i == 0 {
                b.SetBytes(int64(len(buf)))
            }
            // do some work to prevent cheating the benchmark:
            bytes.Equal(got_name, []byte("Arthur Dent"))
        }
    }

Now, invoke it like this:

$ GOPATH=$(pwd) go test -test.bench .

On my system, these are the results:

    BenchmarkWrite-4        10000000    214 ns/op    223.35 MB/s    0 B/op    0 allocs/op
    BenchmarkRead-4         20000000    72.4 ns/op   662.90 MB/s    0 B/op    0 allocs/op
    BenchmarkRoundtrip-4     5000000    302 ns/op    158.71 MB/s    0 B/op    0 allocs/op
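The ns/op column converts directly into throughput. As a small sketch of that arithmetic, assuming the 48-byte buffer from the demo program (the helper names opsPerSecond and megabytesPerSecond are hypothetical, not part of the testing package):

```go
package main

import "fmt"

// opsPerSecond converts a ns/op figure into operations per second.
func opsPerSecond(nsPerOp float64) float64 {
	return 1e9 / nsPerOp
}

// megabytesPerSecond converts bytes per operation and ns/op into MB/s,
// taking 1 MB as 1e6 bytes. Bytes per nanosecond is GB/s, hence the 1e3.
func megabytesPerSecond(bytesPerOp, nsPerOp float64) float64 {
	return bytesPerOp / nsPerOp * 1e3
}

func main() {
	const bufLen = 48 // size of the encoded User buffer in the demo

	fmt.Printf("write: %.0f ops/s, %.1f MB/s\n",
		opsPerSecond(214), megabytesPerSecond(bufLen, 214))
	fmt.Printf("read:  %.0f ops/s, %.1f MB/s\n",
		opsPerSecond(72.4), megabytesPerSecond(bufLen, 72.4))
}
```

Plugging in 214 ns/op gives about 4.67 million writes per second, and 48 bytes at 72.4 ns/op works out to roughly 663 MB/s, in line with the MB/s column (small differences come from the ns/op figures being rounded).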

Some things to note:

No heap allocations occur. We achieved this by using the Reset method on the Builder object, and by directly using []byte slices instead of strings.

We can write 1e9 / 214 ≈ 4,700,000 objects per second.

We can access 1e9 / 72.4 ≈ 13,800,000 objects per second.

Because this is FlatBuffers, our encoded data is schema-versioned, platform-independent, and requires no memory allocations to write or read.
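The zero-allocation result in the first note comes down to reusing memory that has already been allocated. As a standard-library-only sketch of that general idea (not the Builder's actual implementation), the program below compares allocating a fresh slice on every call with truncating and refilling a scratch slice. encodeFresh, encodeReuse, and sink are hypothetical names, and the exact counts may vary with the Go version and compiler optimizations.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is package-level so that results escape to the heap and the
// compiler cannot optimize the work away.
var sink []byte

// encodeFresh builds its output in a newly allocated slice on each call.
func encodeFresh(name []byte) []byte {
	return append(make([]byte, 0, 64), name...)
}

// encodeReuse truncates scratch and appends into the memory it already
// owns, loosely analogous to calling Reset on a Builder before reuse.
func encodeReuse(scratch, name []byte) []byte {
	return append(scratch[:0], name...)
}

func main() {
	name := []byte("Arthur Dent")
	scratch := make([]byte, 0, 64) // allocated once, up front

	fresh := testing.AllocsPerRun(1000, func() { sink = encodeFresh(name) })
	reuse := testing.AllocsPerRun(1000, func() { sink = encodeReuse(scratch, name) })

	fmt.Printf("fresh: %.0f allocs/op, reuse: %.0f allocs/op\n", fresh, reuse)
}
```

testing.AllocsPerRun is the same kind of measurement that b.ReportAllocs surfaces in the benchmark output, just callable from an ordinary program.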

7. Learn more and get involved

FlatBuffers is an active open-source project, with backing from Google. It’s Apache-licensed, and available for C++, Java, C#, Go, Python, and JavaScript (with more languages on the way!).

Here are some resources to get you started:

I’ll be writing about FlatBuffers a lot on this blog, so stay tuned!