Pancakes, bicycles and Apache Spark

by Max Smolaks 8 October 2019

American startup Databricks, established by the original authors of the Apache Spark framework, is planning to spend €100 million ($109.65m) over the next three years to expand its AI lab in Amsterdam.

Databricks says it tripled the size of its engineering team in Amsterdam over the past two years, and its European Development Center will require many more data scientists to work on its Unified Data Analytics Platform.

The companys growth

is a testament to its ability to attract a skilled workforce that wants to live

and work in a vibrant city like Amsterdam, as well as Dutch infrastructure like

our world-class broadband network, said Henny

Jacobs, executive director of the Americas for the Netherlands Foreign

Investment Agency.

From Silicon Valley to Silicon Canals

Databricks was co-founded in 2013 by a team of academics who met at Berkeley, including computer scientist Matei Zaharia, who developed Spark as a PhD thesis in 2009 and later co-created the Apache Mesos cluster manager. Both projects were released under an open source license.

Spark is a cluster computing engine that relies on in-memory processing, and it has become the de facto standard for handling very large datasets. Although it wasn't developed specifically for machine learning, Spark has been embraced by the AI community for its scalability, language compatibility, and speed.

The open source version of Spark, maintained by the Apache Software Foundation, is free to use; Databricks makes its money by selling a fully managed version of the software, hosted in the cloud. This is true open source, not the frequently maligned "open core" model.

And this model certainly works: a few years ago, Databricks reached a valuation of more than $1 billion, which meant some people inevitably started calling the company a "unicorn." Today, the valuation stands somewhere around $2.7bn, with Databricks securing $250 million in its most recent funding round in February.

In June, the company capitalized on the popularity of Spark among machine learning enthusiasts by releasing MLflow, a machine learning management engine designed to simplify AI projects.

MLflow enables data scientists to track and distribute experiments, package and share models across frameworks, and deploy them, whether the target environment is a personal laptop or a cloud data center. Just like Spark, MLflow is available for free, and Databricks sells a managed version hosted on either AWS or Azure.

The company brought its software to Europe in 2017, and the Amsterdam office is expected to total 200 staff by the end of 2019.

"Our investments in Amsterdam over the next three years will support our mission to help data teams solve the world's toughest problems, and continuing to build a top-notch engineering squad in Amsterdam is integral to our success," said Ali Ghodsi, co-founder and CEO at Databricks.

Amsterdam is currently competing against London and Berlin for both tech talent and corporate investment. The city is one of Europe's largest hubs for digital infrastructure; together with Frankfurt, London and Paris, it makes up what data center professionals sometimes call the FLAP markets.

AI Business will be reporting from the upcoming Spark+AI Summit in Amsterdam, taking place on 15-17 October.