A Moment of Science: LLVM may mean the best of both worlds is possible for analytic computing¶

I got my start in programming with C++ (not counting hacking DOS video games like Duke Nukem in a hex editor). One summer in college I had an undergraduate grant with a professor who had me write a neural network algorithm using backpropagation. The next summer the same professor had me recode it all in Java, then the hot new language. So from the start, my programming needs have been in the context of analytics, often exploratory in nature, which is why I quickly gravitated from statically compiled languages like C++ and Java towards dynamic (runtime-compiled) languages like MATLAB, R, and Python.

The trade-off in using dynamic languages over compiled ones is speed. Dynamic languages better enable the exploratory programming needed when tackling new analytical problems, which leads to faster development of solutions. The cost of this flexibility is that the compiler lacks the information about code and data it needs to optimize run-time execution. In many settings, trading increased computer time for saved human time is a good deal, but sometimes execution speed is critical even in exploratory work (e.g., it is hard to iteratively refine a modeling approach when the code takes many hours to run). Wouldn't it be great if there were a way to have the flexibility and expressiveness afforded by dynamic languages but with more of the execution performance of compiled languages?

This "wouldn't it be great" wish has been around for a while, but three events centering on the LLVM compiler over the past year have brought it a lot closer to reality:

While this seems like an unrelated mix of events, the common thread is that important tools in the modern technical computing stack are moving towards using LLVM. So, I figured it was about time I became more familiar with the topic and the rest of this post shows some of my recent foray.

An example to dive in to: matrix factorization and recommendation systems¶

Matrix factorization methods serve as the backbone of many statistical algorithms. Two particular approaches -- singular value decomposition (SVD) and non-negative matrix factorization -- gained broader exposure with the 2009 Netflix Prize for improving movie recommendations. Both are computationally expensive algorithms, the kind you want implemented in a compiled language. But there is a nice tutorial on using matrix factorization for recommendation systems that codes the algorithm in Python, making it easy to follow and test. I'll be using a slightly modified version of the code from this tutorial as the basis for my explorations.

Mathematically, what we are doing is taking an N by M matrix R of ratings by N people over M items and decomposing it into two constituent matrices, P and Q (I'll stick with the notation used in the tutorial). P will be an N by K matrix that we can think of as N people's preference weightings over K "hidden" (or latent) features of the items. Q will be a K by M matrix that we can think of as the "loadings" on the K features by each of these M items. The algorithm seeks P and Q such that the difference between R and PQ is minimized (in general, this solution is non-unique).
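As a minimal sketch of the setup just described (using NumPy, with made-up dimensions and random data standing in for the ratings), the shapes involved look like this:

```python
import numpy as np

# Hypothetical dimensions: N people, M items, K latent features
N, M, K = 5, 4, 2

rng = np.random.default_rng(0)
R = rng.random((N, M))  # N x M ratings matrix
P = rng.random((N, K))  # N x K preference weightings over the K hidden features
Q = rng.random((K, M))  # K x M loadings of each item on those features

# The factorization seeks P and Q that minimize the difference between R and PQ
approx = P @ Q                     # N x M reconstruction of the ratings
error = np.sum((R - approx) ** 2)  # squared Frobenius norm of the residual

print(approx.shape)  # (5, 4)
```

Here `error` is the quantity the algorithm drives down (typically by gradient descent over the entries of P and Q); the non-uniqueness mentioned above shows up as many different (P, Q) pairs achieving the same reconstruction.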

In what follows I'll compare algorithm speed across three scenarios:

- Python
- C
- Python on LLVM using numba

Baseline code¶