Welcome back to Across the Network — Lab41’s weekly look at what is going on in the world of AI. As always these are all links that I pulled from the Lab41 Slack channels.

Articles

Extraordinary Link Between Deep Learning and the Nature of the Universe — This is an “extraordinarily” exciting title. Physics → meet Deep Learning. The article describes a paper recently written by professors at MIT and Harvard. They were seeking to explain how a neural network with just thousands or millions of parameters can generalize to tasks (like object recognition) that have a seemingly infinite number of possible inputs. Mathematicians have struggled to explain it. But the authors of the paper described in this article claim to have discovered the connection — in the laws of our universe. Are you hooked yet?

The answer is somewhat anticlimactic. The authors believe that the universe is not one of infinite possibilities. If you look at physics formulas, you find polynomial functions with small integer exponents (2, 3, 4). You don’t see physical phenomena described by formulas where the exponent is 19. And because the universe doesn’t have infinite possibilities, “small” networks can describe large data.
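The intuition can be sketched numerically (my own illustration, not from the paper): data generated by a low-order physical law is fully captured by a few parameters, even if we allow a much more expressive model. Here, kinetic-energy data fitted with an over-provisioned degree-4 polynomial still only needs the quadratic term:

```python
import numpy as np

# Kinetic energy E = 0.5 * m * v**2, a typical low-order physical law.
v = np.linspace(0.0, 10.0, 100)   # velocities
E = 0.5 * 2.0 * v**2              # energies for a mass of 2 kg

# Allow up to degree 4; the fit recovers essentially only the
# quadratic term, so the extra capacity goes unused.
coeffs = np.polyfit(v, E, deg=4)
# coeffs are ordered highest degree first: [c4, c3, c2, c1, c0]
print(coeffs)  # c4, c3 ≈ 0 and c2 ≈ 1.0
```

The same data could in principle have required all five coefficients; because the generating law is low-order, it doesn’t.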

Machine Learning with Weak Supervision — This blog post describes a new open source project called Snorkel that takes on the task everyone hates — finding and creating datasets — by enabling researchers to create labeled data rapidly and automatically. The problem, as described by the authors, is that there isn’t enough labeled data for data scientists to create meaningful products. Sure, there are giant sets of labeled images of dog breeds or faces, but these don’t help people in fields like finance, healthcare, or government, whose data requires deep expertise to curate and label. Snorkel is all about combining a set of heuristics, or rules of thumb, created by experts (this is called weak supervision) into a single model that can accurately label data. The labeled data created via this approach is remarkably effective, and is definitely worth checking out!
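The core idea fits in a few lines. Below is a toy majority-vote combiner with hypothetical labeling functions, not Snorkel’s actual API; Snorkel’s real label model goes further and estimates each heuristic’s accuracy and correlations before combining them:

```python
# Hypothetical expert heuristics ("labeling functions") for flagging
# customer complaints. Each votes on a label or abstains.
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1

def lf_refund(text):
    # Heuristic: mentions of "refund" suggest a complaint.
    return POSITIVE if "refund" in text else ABSTAIN

def lf_exclamation(text):
    # Heuristic: repeated exclamation marks suggest a complaint.
    return POSITIVE if text.count("!") >= 2 else ABSTAIN

def lf_thanks(text):
    # Heuristic: gratitude suggests a non-complaint.
    return NEGATIVE if "thanks" in text else ABSTAIN

def majority_vote(text, lfs):
    # Combine the non-abstaining votes into a single label.
    votes = [lf(text) for lf in lfs]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    return max(set(votes), key=votes.count)

lfs = [lf_refund, lf_exclamation, lf_thanks]
label = majority_vote("I want a refund!! Now!!", lfs)
print(label)  # 1 (POSITIVE)
```

The labels this produces are noisy, which is exactly why Snorkel models the heuristics’ accuracies rather than trusting a raw vote.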

How Hillary’s Campaign is (Almost Certainly) Using Big Data — We try to stay away from politics and religion here at Gab41, but this was just too interesting not to mention. The linked article on Scientific American discusses how campaigns use terms like “uplift” and “persuasion modeling” when thinking about voters. Those are the same terms that you’ll see advertisers using. The last few political campaign seasons have seen a marked increase in the use of analytics, both by the campaigns and by the news sources that cover them (including my favorite — fivethirtyeight).

Papers

Stealing Machine Learning Models via Prediction APIs — Say that you’re a researcher and you’ve spent the last 5 years developing the world’s best model to identify different types of cars. You’re the world’s best at telling the difference between the Aston Martin Vanquish and Vantage (without resorting to reading the back of the car — which is what I do).

Aston Martin Vanquish (image courtesy of www.astonmartin.com)

Aston Martin Vantage (image courtesy of www.astonmartin.com)

You decide that this is a commercially interesting product and so you start a company. You choose to make your product available via an API so that anyone who wants images of cars labeled can use your service. You should reconsider, according to the authors of this paper. They describe some pretty generic attacks that enable an API user to very quickly extract relevant information about the underlying models that provide answers to their queries. It is pretty interesting work about the leakiness of machine learning models.
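To make the leakiness concrete, one of the simplest attacks the paper describes is an equation-solving attack against a logistic regression model that returns confidence scores. The “victim” model and its weights below are hypothetical; the point is that d + 1 well-chosen queries recover the model exactly:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A hypothetical "victim" model behind an API: logistic regression
# whose weights are unknown to the attacker. The API helpfully
# returns a confidence score, not just a hard label.
true_w, true_b = np.array([2.0, -1.0, 0.5]), 0.3
def api_query(x):
    return sigmoid(true_w @ x + true_b)

# Each returned probability p satisfies logit(p) = w.x + b, which is
# linear in (w, b). With d + 1 queries we get a solvable linear system.
d = 3
X = np.vstack([np.eye(d), np.zeros(d)])   # d + 1 query points
p = np.array([api_query(x) for x in X])
logits = np.log(p / (1 - p))
A = np.hstack([X, np.ones((d + 1, 1))])   # rows of [x, 1]
stolen = np.linalg.solve(A, logits)       # recovers [w1, w2, w3, b]
print(stolen)  # ≈ [2.0, -1.0, 0.5, 0.3]
```

Four queries, and the attacker now has a perfect copy of the model — no five years of research required.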

As an aside, if you have been working on classifying vehicles, let me know.

Unsupervised Monocular Depth Estimation — This paper is outside the typical interests at Lab41, but I still found it to be really interesting. The problem the authors of this paper are contending with is that of depth estimation. A lot of ink has been spilled in describing different methods to do depth estimation of objects within images when you have multiple images (which provide perspective — and therefore aid in depth estimation). New research is focused on doing the same thing — but with a single image.

The authors contend that most current approaches to monocular depth estimation rely on ground-truth depth data collected in advance. They don’t need it, because they learn depth in an unsupervised fashion. What’s their trick, you might ask? They use binocular cameras. Huh? I could have sworn that the title of the paper had the word monocular in it. Still, the technique they developed is novel. At training time they use stereo pairs: they train a CNN to predict, or reconstruct, what the right-side camera photographed using just the left-side image. A network that can do that has implicitly learned the depth of the scene, so at test time it needs only a single image. Pretty cool.
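The training signal can be sketched with a toy 1-D example (my own illustration, assuming a single constant disparity rather than the per-pixel disparity map the CNN actually predicts):

```python
import numpy as np

# Toy 1-D "images": a bright object seen from two horizontally
# offset cameras appears shifted by a disparity that depends on depth.
left  = np.array([0., 0., 1., 2., 3., 0., 0., 0.])
right = np.array([0., 0., 0., 0., 1., 2., 3., 0.])

disparity = 2  # in the paper this is predicted per-pixel by a CNN

# Reconstruct the right view by warping (here, shifting) the left view.
reconstructed_right = np.roll(left, disparity)

# Photometric loss the network minimizes during training; it is zero
# exactly when the predicted disparity is correct.
loss = np.mean((reconstructed_right - right) ** 2)
print(loss)  # 0.0
```

Because disparity and depth are directly related for a calibrated stereo rig, a network that drives this loss to zero has learned depth without ever seeing a depth label.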

Resources

15 Favorite Data Science Resources — The folks over at Kaggle have put together a list of their favorite Data Science resources on the Internet. It’s a pretty good list — one that we’re happy to recommend to you. It includes everything from personal blogs to article aggregators to newsletters. There’s only one problem with the list. Where is Gab41?!

Unofficial Self-Organizing Conference on Machine Learning — Everyone’s favorite group of AI specialists over at OpenAI are having their own conference. And because they are ML celebrities, their official conference got overbooked. So they are supporting an unofficial conference as well. If you want to see whether a “self-organizing” conference can work and happen to be in the Bay Area on Oct 1, you should attend.

Shameless Plugs

My shameless plug this week is for our fearless leader — Bob. A few months ago, Bob wrote what is still the most popular article of all time at Gab41 — I need an AI BS Meter. Yes, the title was a bit “click-baity” (but at least it wasn’t — 4 reasons you should be worried about AI, you’ll never believe #3!), but the article discussed results provenance. This concept of trusting the results from your model is something that resonated with our audience. And so, Bob is back with a new article — A Chatbot? Are you Sirious?

Bob appears to have traded click-bait in his title for puns, but the important takeaway is that he presents a serious option for how you might build a system that helps analysts and data scientists with results provenance. I suggest you take a look.

That’s all I’ve got but I hope to see you again next week on Across the Network!