Update: MapD rebranded as OmniSci in 2018.

I was blown away when I recently heard MapD was going to make the source code for their GPU-powered database freely available on GitHub. MapD has always dominated the top of my benchmarks recap board, but until now you needed to buy a commercial license or run MapD's AMI on AWS to use it. Now anyone can compile the database from source and run it on any machine with as many GPUs as they'd like, or take the compiled binaries and run them on any GPU-backed AWS, Google Cloud or Azure instance.

MapD can easily run workloads two orders of magnitude quicker than many other popular analytics engines I've worked with, and it comes with a web-based charting and query interface, so I suspect this news is going to cause an earthquake in the data world. Now that the cost barrier has been removed, more developers can explore MapD and I expect its deployment numbers to grow like never before. Anyone running an Nvidia GPU on Linux can now compile, run and analyse the source code of the most advanced GPU-driven database I've worked with to date.

This should also be a big win for Nvidia as MapD uses their CUDA platform and GPU hardware to achieve its performance. That said, although MapD relies on Nvidia GPUs for its speed, the software will still run without a GPU present. On a GPU-less machine the Nvidia driver will complain that no devices were found and MapD will fall back to CPU mode. I haven't conducted any benchmarks using CPU mode so I can't comment on the performance penalty, but MapD nonetheless seems to function without issue.
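To see ahead of time which mode a machine is likely to end up in, one can count the GPUs the driver reports. The following is a minimal sketch and not MapD's own detection logic: `gpu_count` and the `NVIDIA_SMI` override variable are names I've made up for illustration, and it assumes `nvidia-smi -L` prints one line per device beginning with `GPU `.

```shell
# Hypothetical helper: count the GPUs reported by nvidia-smi.
# NVIDIA_SMI is an illustrative override that defaults to the real tool.
gpu_count() {
    smi="${NVIDIA_SMI:-nvidia-smi}"
    if command -v "$smi" >/dev/null 2>&1; then
        # `nvidia-smi -L` lists one device per line, e.g. "GPU 0: ...".
        # grep -c exits non-zero when the count is 0, hence the `|| true`.
        "$smi" -L 2>/dev/null | grep -c '^GPU ' || true
    else
        # The driver tools aren't installed, so no GPU is usable.
        echo 0
    fi
}

if [ "$(gpu_count)" -gt 0 ]; then
    echo "GPU(s) visible: expect GPU mode"
else
    echo "No GPU visible: expect the CPU fallback"
fi
```

This only reports what the driver can see. Whether MapD actually ends up in CPU mode is something its own start-up output will confirm.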

In this blog post I'll walk through compiling and running MapD from source. As a heads up, if you're following along and you run into any issues please do head over to the MapD community forum to try and get your questions answered.

My Hardware & OS Setup

I'm using a machine with an Intel Core i5 4670K clocked at 3.4 GHz, 8 GB of DDR3 RAM, a SanDisk SDSSDHII960G 960 GB SSD and an Nvidia GTX 1080 running on a fresh install of Ubuntu 16.04.2 Server LTS. I've picked this version of Ubuntu as it will be supported until April 2021.

Installing MapD's Dependencies

I'll start by enabling the source code repositories in apt's sources list.

    $ sudo sed -i -- \
        's/# deb-src/deb-src/g' \
        /etc/apt/sources.list

I'll then refresh apt's sources lists and install 39 packages.

    $ sudo apt update
    $ sudo apt install \
        autoconf \
        autoconf-archive \
        binutils-dev \
        bison++ \
        bisonc++ \
        build-essential \
        clang-3.8 \
        clang-format-3.8 \
        cmake \
        cmake-curses-gui \
        default-jdk \
        default-jdk-headless \
        default-jre \
        default-jre-headless \
        flex \
        git-core \
        golang \
        google-perftools \
        libboost-all-dev \
        libcurl4-openssl-dev \
        libdouble-conversion-dev \
        libevent-dev \
        libgdal-dev \
        libgflags-dev \
        libgoogle-glog-dev \
        libgoogle-perftools-dev \
        libiberty-dev \
        libjemalloc-dev \
        libldap2-dev \
        liblz4-dev \
        liblzma-dev \
        libncurses5-dev \
        libpng-dev \
        libsnappy-dev \
        libssl-dev \
        llvm-3.8 \
        llvm-3.8-dev \
        maven \
        zlib1g-dev

I'll then download and install version 8.0 of Nvidia's CUDA Toolkit. This toolkit installs, among other things, graphics card drivers and will replace any existing drivers currently installed.

    $ curl -L -O https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
    $ sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
    $ sudo apt update
    $ sudo apt install cuda

With the new drivers in place I'll reboot the system.

    $ sudo reboot

Once the system is back up, Nvidia's System Management Interface should display diagnostics for the driver and GPU(s) installed.

    $ nvidia-smi

MapD uses Thrift to communicate between its clients and server. I'll install it from source as the 0.10.0 release of Thrift is known to work well with MapD.

    $ sudo apt build-dep thrift-compiler
    $ curl -O http://apache.claz.org/thrift/0.10.0/thrift-0.10.0.tar.gz
    $ tar xvf thrift-0.10.0.tar.gz
    $ pushd thrift-0.10.0
    $ ./configure \
        --with-lua=no \
        --with-python=no \
        --with-php=no \
        --with-ruby=no \
        --prefix=/usr/local/mapd-deps
    $ make -j $(nproc)
    $ sudo make install
    $ popd

Folly is a library of C++11 components published by Facebook and is used throughout MapD's source code. Below are the steps I ran to build the library from source.

    $ curl -O -L https://github.com/facebook/folly/archive/v2017.04.10.00.tar.gz
    $ tar xvf v2017.04.10.00.tar.gz
    $ pushd folly-2017.04.10.00/folly
    $ autoreconf -ivf
    $ ./configure \
        --prefix=/usr/local/mapd-deps
    $ make -j $(nproc)
    $ sudo make install
    $ popd

Bison++ is one of the two tools MapD uses to generate its SQL parser. Below are the steps I ran to build it from source.

    $ curl -O -L https://github.com/jarro2783/bisonpp/archive/1.21-45.tar.gz
    $ tar xvf 1.21-45.tar.gz
    $ pushd bisonpp-1.21-45
    $ ./configure
    $ make -j $(nproc)
    $ sudo make install
    $ popd

Below I'll make sure we're using the intended version of LLVM's binaries prior to MapD's compilation.

    $ for BIN in llvm-config llc clang clang++ clang-format
      do
          sudo update-alternatives \
              --install \
              /usr/bin/$BIN \
              $BIN \
              /usr/lib/llvm-3.8/bin/$BIN \
              1
      done

I'll set up the executable and library path environment variables with the following script.

    $ sudo vi /etc/profile.d/mapd-deps.sh

    LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
    LD_LIBRARY_PATH=/usr/lib/jvm/default-java/jre/lib/amd64/server:$LD_LIBRARY_PATH
    LD_LIBRARY_PATH=/usr/local/mapd-deps/lib:$LD_LIBRARY_PATH
    LD_LIBRARY_PATH=/usr/local/mapd-deps/lib64:$LD_LIBRARY_PATH
    PATH=/usr/local/cuda/bin:$PATH
    PATH=/usr/local/mapd-deps/bin:$PATH
    export LD_LIBRARY_PATH PATH

    $ sudo chmod +x /etc/profile.d/mapd-deps.sh
    $ source /etc/profile.d/mapd-deps.sh
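Before moving on to the build, it can be worth confirming that the tools installed above actually resolve on the PATH. Below is a rough sketch; `missing_tools` is a helper name of my own, and the list of commands is only my guess at the ones the build is most likely to call, drawn from the steps above.

```shell
# Hypothetical helper: print each named command that is not on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Commands provided by the steps above (this selection is an assumption).
missing=$(missing_tools nvcc thrift llvm-config clang cmake mvn go)

if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names print on a single line.
    echo "Not found on PATH:" $missing
else
    echo "All checked tools were found"
fi
```

If `nvcc` or `thrift` show up as missing, the most likely cause is that /etc/profile.d/mapd-deps.sh hasn't been sourced in the current shell.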

Compiling MapD

I'll clone MapD's core source code repository and check out the 21fc39 commit. It's a good idea to stick to known good releases and/or the master branch, but for the sake of these instructions working consistently I've pinned this walk-through to that specific commit.

    $ git clone https://github.com/mapd/mapd-core.git
    $ cd mapd-core
    $ git checkout 21fc39

I'll create a build folder for MapD and compile the source code with debugging enabled.

    $ mkdir -p ~/mapd-core/build
    $ cd ~/mapd-core/build
    $ cmake -DCMAKE_BUILD_TYPE=debug ..
    $ make -j $(nproc)
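Since this walk-through is pinned to a specific commit, it's worth confirming the working tree really is at that commit, ideally before kicking off a long build. The snippet below is a small sketch; `commit_matches` is a hypothetical helper, and it assumes the abbreviated ID 21fc39 is an unambiguous prefix of the full commit hash.

```shell
# Hypothetical helper: succeed if a full commit hash starts with the prefix.
commit_matches() {
    case "$1" in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Compare the current HEAD against the pin; run from inside the checkout.
if current=$(git rev-parse HEAD 2>/dev/null); then
    if commit_matches "$current" 21fc39; then
        echo "On the pinned commit"
    else
        echo "HEAD is $current, not the pinned commit"
    fi
else
    echo "Not inside a git repository"
fi
```

Comparing against the full hash from `git rev-parse HEAD` avoids depending on how many characters git chooses to show for an abbreviated ID.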