Work on storage serving live traffic

Your tasks may consume the same data resources (disk IOPS, CPU, …) as your live traffic. Bistro only starts tasks if the relevant resource slots are available on the data hosts, rack switches, or any other physical bottleneck that might otherwise impact live traffic.

Bistro does not dictate how you store your data. At Facebook, we have used it with:

MySQL and PostgreSQL databases

Files on GlusterFS

HBase regions

f4 Blob Storage System

Divide work in space or time

Map your work over nodes, Bistro’s way of modeling multi-part computations. For example:

prioritized workqueue: 1 node per job

map-only jobs: nodes are map shards

sharded, replicated database: a DAG of nodes — the DB hosts are sinks (for tracking resources), and DB instances are children of their hosts

periodic work (like cron): a data node has a child node for each time period

If needed, MapReduce computations can be modeled as a sequence of 3 dependent jobs: map, shuffle, reduce.

Flexible, extensible, scriptable

Concise JSON configuration

Command-line tooling

Tunable, scalable web UI

REST-style scheduler API

Customizable via small plugins

Powerful job model: dependencies, priorities, host placement, node filters

Many deployment styles: one scheduler with many workers; sharded share-nothing workers, with results aggregated by the UI; multi-scheduler/multi-worker



Why Use Bistro?

You should consider using Bistro instead of rolling your own solution, because in our experience at Facebook, every heavily used task-execution service eventually has to build out a very similar set of layers:

Execution shell: Start, wait for, kill tasks

Worker pool: Maintain the distributed state of what tasks are running where

Data model: Currently available tasks; resource accounting

Configuration: Add/remove jobs; tune parameters

Persistence: If the service crashes, don’t forget task completions

Scheduling logic: Priorities; dependencies; resource constraints; host placement

Task logs: Make logs accessible, while controlling disk usage

Command-line tools: Scriptable, yet friendly

Web UI: Minimal learning curve; great for visualizing data

RPC API: When command-line scripting is too slow or clunky

Bistro provides all of the above, and improves continuously, so instead of re-inventing, you can focus on innovating.

Early-release software

Although Bistro has been in production at Facebook for over 3 years, the present public release is partial, including just the server components. The CLI tools and web UI will ship shortly.

The good news is that the parts that are released are continuously verified by build & unit tests on Travis CI, so they should work out of the box.

There is already a lot of code to play with, so if you have any thoughts, write us a note! Open-source projects thrive and grow on encouragement and feedback.

As flexible as a library

Although Bistro comes as a ready-to-deploy service, it may be helpful to think of it as a library. In our experience at Facebook, many teams’ distributed computing needs are highly individual, requiring more customization over time.

Luckily, Bistro can adapt to your needs in three different ways.

1. Configuration: most aspects of a Bistro deployment can be tuned:

Configuration: continuously re-read a file, or a more custom solution

Persistence: use no database, SQLite, or something fancier

Code execution: run alongside the scheduler, or on remote workers

Data shards: manually configured, periodic, or script-generated
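As an illustration of the first knob, "continuously re-read a file" is just a polling pattern; the sketch below is a generic Python rendition of that pattern (class and field names are ours, not Bistro's implementation):

```python
import json
import os

class ConfigPoller:
    """Re-reads a JSON config file whenever its mtime changes."""

    def __init__(self, path):
        self.path = path
        self.mtime = None
        self.config = {}

    def poll(self):
        """Return the current config, reloading it if the file changed."""
        mtime = os.stat(self.path).st_mtime
        if mtime != self.mtime:
            with open(self.path) as f:
                self.config = json.load(f)
            self.mtime = mtime
        return self.config
```

Calling `poll()` on every scheduler tick keeps the in-memory config in sync with the file without restarting the service.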

2. Plugins: Bistro’s architecture is highly pluggable, so adding your own code is a cinch.

Every plugin you contribute back to the community makes the next person’s customization easier.

3. Embedding: link just the pieces you want into your custom application. Need a C++ cron library? An optionally persistent task status store?

Getting help, and helping out

Questions and bug reports

To contact the maintainers, post in this Facebook Group or file a GitHub Issue.

Tips for questions and bug reports:

Describe what you want to do, and what is going wrong.

Include your OS, the git hash of your build, the commands you ran, and the scheduler & worker logs.

Contributions

We gladly welcome contributions to both code and documentation — whether you refactor a whole module, or fix one misspelling. Please send pull requests for code against the master branch and for the website or documentation against the gh-pages branch.

Our CONTRIBUTING.md aims to expedite the acceptance of your pull request. It includes a 15-line C++ style guide, and explains Facebook’s streamlined contributor license agreement.

Case study: data-parallel job with data and worker resources

While Bistro can serve as a capable traditional task queue, one of its distinctive features is its ability to account not just for the resources of the machines executing the tasks, but also for any bottlenecks imposed by the data being queried.

Here is a simple example:

Your data resides on 500 database hosts serving live traffic.

Each database host has 10-20 logical databases.

The hosts have enough performance headroom to run at most two batch tasks per host, without impacting production traffic.

You have batch jobs that must process all the databases.

Bistro will use 1 scheduler, and a worker pool of 100 machines.
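A little arithmetic shows the scale of this setup (the 15 databases per host below is our assumed midpoint of the stated 10-20 range):

```python
# Back-of-the-envelope numbers for the case study above.
db_hosts = 500
dbs_per_host = 15          # assumed midpoint of the 10-20 range
tasks_per_host_limit = 2   # headroom: at most 2 batch tasks per DB host
worker_machines = 100

total_shards = db_hosts * dbs_per_host             # logical DBs to process
data_side_limit = db_hosts * tasks_per_host_limit  # concurrency the data allows
tasks_per_worker_needed = data_side_limit // worker_machines
```

So there are roughly 7,500 shards to process, the data side permits at most 1,000 concurrent tasks, and saturating that limit means each of the 100 worker machines runs about 10 tasks at once.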

Levels

To configure Bistro in such a setup, first describe the structure of the computation in terms of nodes at several levels:

Database host nodes — to limit tasks per DB host.

Logical database nodes, each connected to its DB host node. A running task locks its logical DB.

If remote worker hosts are in use, a node per worker host is automatically created.

If global concurrency constraints are desired, a top-level scheduler instance node is also available.

In a typical setup, each running task associates with one node on each level:

database: The logical DB is the work shard being processed.

host: What host to query for data?

worker: Where is the process running?
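The per-level association can be sketched as follows (a toy data model with illustrative names, not Bistro's node API): each task binds exactly one node on the database, host, and worker levels.

```python
# Hypothetical mapping of DB hosts to their logical databases.
host_to_dbs = {
    "db-host-1": ["shard-1a", "shard-1b"],
    "db-host-2": ["shard-2a"],
}
workers = ["worker-1", "worker-2"]

def tasks_for_job(host_to_dbs, workers):
    """One task per logical database, tagged with its DB host (data
    locality level) and a round-robin worker (execution level)."""
    tasks = []
    i = 0
    for host, dbs in sorted(host_to_dbs.items()):
        for db in dbs:
            tasks.append({"database": db, "host": host,
                          "worker": workers[i % len(workers)]})
            i += 1
    return tasks

tasks = tasks_for_job(host_to_dbs, workers)
```

Each resulting task record names one node on every level, which is exactly what lets the scheduler charge the task against resources on all three.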

Resources

Before running jobs, define resources for the nodes at various levels:

host nodes should be protected from excessive queries to avoid slowing down live production traffic — allocate 2 host_concurrency slots to honor the two-tasks-per-host requirement above.

typical worker resources might be, e.g., cpu_cores, ram_gigabytes, etc. Bistro supports non-uniform resource availability among its worker hosts.
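The effect of these slot counts can be modeled as a simple admission check: a task starts only if every node it touches still has free slots. This is a toy sketch of the accounting idea, not Bistro's scheduler code:

```python
def try_start(free_slots, demand):
    """Admit a task only if every node it touches still has the needed
    resource slots; claim the slots on success."""
    # First verify availability on every node the task touches...
    for node, needs in demand.items():
        for resource, amount in needs.items():
            if free_slots[node].get(resource, 0) < amount:
                return False  # a slot is exhausted; leave state untouched
    # ...then claim the slots.
    for node, needs in demand.items():
        for resource, amount in needs.items():
            free_slots[node][resource] -= amount
    return True

free = {
    "db-host-1": {"host_concurrency": 2},  # the 2-tasks-per-host headroom
    "worker-7":  {"cpu_cores": 4},
}
# Each task takes one host_concurrency slot and two cores.
demand = {"db-host-1": {"host_concurrency": 1},
          "worker-7": {"cpu_cores": 2}}
```

With these numbers, two tasks can run against db-host-1 at once; a third is refused until one finishes and its slots are returned.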

Nodes

Hosts fail, and therefore our host-to-database mapping will change continuously. To deal with this, Bistro continuously polls for node updates — it can periodically run your “node fetcher” script, or read a file, or poll a custom in-process plugin. One of the standard plugins creates and removes nodes corresponding to time intervals, akin to Cron. When nodes change, tasks are started on the new nodes, and stopped on the deleted ones. Bistro’s node model also naturally accommodates database replicas.
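Poll-and-diff, plus a cron-like time-bucket fetcher, can be sketched like this (function names and the 15-minute period are illustrative, not Bistro's plugin interface):

```python
def diff_nodes(old, new):
    """When the node set changes, tasks should start on added nodes
    and stop on removed ones."""
    return new - old, old - new

def time_nodes(now_epoch, period=900, keep=4):
    """Cron-like fetcher: one node per 15-minute bucket, keeping the
    most recent `keep` buckets, named by bucket start time."""
    start = now_epoch - now_epoch % period
    return {f"period-{start - i * period}" for i in range(keep)}

old = time_nodes(1_000_000)
new = time_nodes(1_000_000 + 900)   # one period later
added, removed = diff_nodes(old, new)
```

After each poll, the scheduler would launch tasks for every node in `added` and stop tasks on every node in `removed`; the expired bucket falling out of `keep` is what retires old periodic work.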

Jobs

Lastly, specify the jobs to run — a path to a binary will suffice, and the binary will be run against every database node found above. Many other job options and metadata are available. You can even implement your tasks as custom RPC requests.

Ready to go

The above configuration is continuously polled from a file (or other configuration source) by the Bistro scheduler. Bistro’s worker processes need only minimal configuration — tell them how to connect to the scheduler, and you are in business (they do have plenty of advanced reliability-related settings).

You can then interact with the scheduler (and tasks) via HTTP REST, via a command-line tool, or via a web UI.

Read our USENIX ATC’15 paper for a more principled explanation of this resource model. Note that the model Bistro implements in practice is an extension of the “resource forest model” that is the focus of the paper. Bistro’s data model is being refined and improved, so in the future you can expect it to handle even more applications with ease.

License

Bistro is BSD-licensed. We also provide an additional patent grant.