News in AI and machine learning

For more AI news and analysis, sign up to my newsletter here.

Reporting from 19th January 2017 through 28th March 2017

I’m Nathan Benaich — welcome to issue #18 of my AI newsletter! I will synthesise a narrative that analyses and links important happenings, data, research and startup activity from the AI world. Grab your hot beverage of choice ☕ and enjoy the read! A few quick points before we start:

1. Personal announcement: after almost 4 years of investing and building Playfair Capital, I’ve left to capitalise on exciting opportunities in AI. If you’re looking to invest, research, build, or buy AI-driven companies, do hit reply and drop me a line.

2. I gave a talk on How to start an AI company at the Oxford AI Society, which includes frameworks to find technology-problem fit, product design tips for an AI-driven software model, and financing advice.

3. My top picks: a) Five AI startup predictions by Bradford Cross (tl;dr death to bots and ML as a service, long vertical applications, but remember to tread with caution), and b) So your company wants to do AI by Eder Santana.

Referred by a friend? Sign up here. Help share by giving it a tweet :)

*Technology news, trends and opinions*

🚗 Department of Driverless Cars

a. Large incumbents

In a massive deal this quarter, Intel agreed to purchase Mobileye for $15.3bn. The 18-year-old NYSE-listed Israeli company holds a portfolio of camera-based computer vision, sensor fusion, mapping and driving policy products for advanced driver assistance features such as pedestrian, vehicle and sign detection, as well as relationships with Tier 1 OEMs. Intel takes the view that “whoever has the best data can develop the best AI”. Intel already has a strong position in silicon (in-house + Nervana and Movidius), memory and communications. Mobileye therefore adds the “brains” component to accelerate “rack scale” end-to-end autonomous solutions that customers are asking for. This includes the 16m installed EyeQ3 chips (those Tesla used) that run tools like Mobileye’s Road Experience Management (REM) product for mapping and localisation (going HD in 2018), as well as next generation chips for Level 3–4 automation. The Mobileye-Intel combination is set to compete head to head with NVIDIA (more details in this official presentation).

In an effort to distribute the challenge of building an automated safety and awareness processing stack, Udacity, which now has a permit to test AVs in California, and leading Chinese ride-sharing startup DiDi announced a $100k self-driving car challenge. The dataset includes Velodyne LiDAR point clouds, radar objects and camera image frames (here).

Waymo, the self-driving car spin-out from Google X, is proving far more effective on Californian roads than its competitors. Data shows that Waymo logged 30x more autonomous miles in 2016 than others and only required human intervention 0.2 times per thousand miles for safety reasons. Waymo also made the news for pursuing a federal civil lawsuit against Uber. It focuses on Anthony Levandowski, who was a key engineer at Waymo and co-founder of Otto, a 7-month-old driverless software company for trucks that Uber purchased for $700m. Waymo claims that Levandowski stole 14,000 confidential files describing key self-driving software and LiDAR IP from Waymo servers before selling Otto to Uber. It has subsequently transpired that Levandowski consulted for Uber on self-driving technology six months before he started Otto. Putting Uber further into the spotlight, one of its AVs was involved in a serious crash in Arizona, where the company’s cars retreated after their testing ban from SF streets.

Baidu has spent the better part of a $2.9bn R&D budget on AI over the last 2.5 years — an endeavor run by 1,300 researchers. The company was also a victim of an attempted cyber attack directed against its autonomous driving IP. High stakes at play here.

Tesla announced it was close to releasing version 2.0 of Autopilot (AP2) based on NVIDIA Drive PX2 and in-house software (vs. 1.0 Mobileye system). While the speed limit for Autosteer has been upped, only 1 out of 8 camera sensors on the new hardware stack is being used. As such, a US law firm is seeking to sue Tesla for selling AP2 to customers before it’s ready. On the other hand, an Ohio car insurance provider is offering reduced premiums for Tesla owners with Autopilot.

Ford, in contrast to BMW/VW/Mercedes-Benz, is said to be considering removing all driving controls from its self-driving cars that are set to debut in 2021. The company doesn’t buy that resting drivers can react fast enough to intervene when needed, which means Ford would skip from Level 3 to Level 5 autonomy.

NVIDIA announced DRIVE PX platform collaborations with Bosch, the world’s largest automotive supplier, and PACCAR, a leading global truck manufacturer.

b. The startups

AutoX, a US self-driving startup authorised to test on public Californian roads, released an impressive video demo of a 2017 Lincoln MKZ navigating autonomously in day, night, light rain and cloudy darkness using only front-mounted cameras. The company’s stated go-to-market model is to provide the self-driving OS to OEMs instead of selling aftermarket or operating their own service. The team draws roots from Princeton’s vision group.

Oxbotica, the Oxford spinout led by Paul Newman and Ingmar Posner that has quietly built impressive mobile autonomy software, was featured in the FT. This team tightly couples fundamental research at the University’s Oxford Robotics Institute with real-world applications for self-driving. In 3 years, it’s accomplished significant feats without venture financing, releasing an autonomous control system, Selenium. #LongUKAI! A prime CMU/Uber-style acquisition target here…

Comma.ai announced the Panda, a circuit board that extracts granular driving data from a vehicle and can issue accelerator and brake commands to the car. The only way to get your hands on one is by accumulating sufficient points on the company’s Chffr dashcam video recording app. The (updated) longer-term goal is to aggregate worldwide driving data, presumably as a pseudo-Mobileye REM product.

Slightly more information emerged about Drive.ai, another startup that can test on public roads in California. The company retrofits a roof-mounted rig equipped with nine HD cameras, two radars and six Velodyne Puck LiDAR sensors and uses sensor fusion with deep learning to translate inputs to driving instructions. Current limitations include altering the vehicle path on the fly to compensate for obstructions that suddenly appear. The company is also said to focus on logistics in dense geographic areas as opposed to transporting people.

The big boys

Apple joined the Partnership on AI to Benefit People and Society, appointing Siri co-founder/CTO Tom Gruber to the board along with representatives from DeepMind, Amazon, Microsoft, Facebook and IBM. Apple is also building out its engineering and AI research footprint in Seattle, following its acquisition of Turi last year. This includes a $1m endowed professorship in ML at the University of Washington. The company also released a new app, Clips, for native iOS video editing empowered by computer vision, NLP and AR tools. Furthermore, the new iOS 10.3 update includes a consent for Apple to read user iCloud data (following differential privacy manipulation) to improve predictive features in Siri.

Google made lots of announcements at Cloud NEXT 2017 including the acquisition of data science community Kaggle, GA release of a) Cloud ML Engine for training and deploying proprietary models to the cloud, and b) Cloud Vision API. There were also releases to help data scientists visually explore and prepare data (Cloud Dataprep) as well as integrate data from BigQuery and Commercial Datasets, and the fully-managed data processing pipeline for batch and streamed data (Dataflow). This shows that ML infrastructure is indeed still a nascent space where opportunities exist for specialised startups. Separately, YouTube announced that it had reached 1 billion machine-generated video captions for their audio content. More on video understanding later!

MIT Tech Review ran a piece on Goldman Sachs’ efforts to breathe automation into their business. Starting with replacing 4 currency traders with 1 software engineer, the firm has mapped the 146 steps required to take a company public to identify many that are “begging to be automated”. For its part, JP Morgan has made significant investments to develop an internal cloud infrastructure and environment to build and run machine learning applications. This includes their Contract Intelligence software, which interprets commercial loan agreements. The product cuts down on the 360k human hours required to analyse 12k contracts a year.

Facebook “today cannot exist without AI”, says Joaquin Candela (head of applied ML group) in this Backchannel piece on the group’s genesis and its impact on Facebook, Instagram and Messenger over the last two years.

Hardware

British chip maker ARM, which was acquired by SoftBank for $32bn last year, announced their DynamIQ technology. It applies to ARM’s Cortex-A CPUs and enables custom configurations of large and small CPUs in a single cluster. It also provides a shared memory subsystem, faster data transfer with accelerators and power savings that collectively focus on delivering performance and efficiency for running AI applications at the edge. ARM recently passed the 100 billion chips sold milestone since 1991.

Intel announced a first product with their 3D XPoint memory technology that is positioned to replace hard drives or SSDs by providing greater density and performance.

NVIDIA keeps expanding the universe of cloud providers offering their Pascal architecture-based Tesla GPUs. They’ve just added Tencent Cloud, followed by a collaboration with Microsoft to develop a new hyperscale GPU accelerator powered by 8x Tesla P100 GPUs for AI cloud computing.

AI research in production

NYT ran a profile on various efforts to automatically create music. The piece includes samples from Jukedeck and DeepMind. Jukedeck’s CEO also discusses their progress on the BBC podcast that also features Geoff Hinton.

Innovation in AI, whether it occurs in the real world or research lab, stands on the shoulders of published research. There are two fundamental flaws in the implementation of research: papers a) seldom devote much space to solving and openly discussing engineering problems, and b) are fraught with a lack of rigor and reproducibility. These are important problems that we must work to correct as a community. DistillPub, a new open-source publication for machine learning, can help here. It provides new data visualization opportunities, transparency over methods and cash prizes for clearly communicated work.

Big ideas!

It’s clear that talent is a bottleneck in software engineering and even more in AI. In order to deliver on promises for AI, we need to drive more talent from diverse backgrounds into the field and do so by sustaining the institutions that educate future generations.

Turns out that the $100m investment of Braintree founder Bryan Johnson in Kernel to create a neural interface between humans and machines isn’t as sound as he hoped. The project was apparently “too complex, too speculative, and too far from becoming a medical reality”. Note: make sure sci-fi projects are grounded in scientific reality. Meanwhile, Elon Musk finally announced the launch of the Neuralink project!

Ben Medlock argues that the missing link between current AI systems and true AGI is an embodied system for the AI agent. He points to AI systems as only replicating one of the many layers of human cognition, where the others are the biological substrates and complexity of eukaryotic systems.

NYT runs a piece on Santiago Ramon y Cajal, a 20th century Spanish neuroscientist who published fundamental work on how information flows through the neurons and synapses in the brain. Equipped with a microscope, he painstakingly sketched these neural structures and quite incredibly built up his reasoning from there.

Last issue we talked about a new frontier in training AI agents: complex simulation environments. At Google NEXT 2017, Improbable founder Herman Narula presented a quick talk on their SpatialOS distributed computation infrastructure for simulation that you can watch here.

Policy and governance

Researchers in Cambridge published a sharply critical piece on the collaboration between Google DeepMind and the UK’s National Health Service (NHS). Based on analysis of information reported last year by the New Scientist, the authors argue that the breadth of patient data shared between parties was far greater than originally announced and concerns more than patients under direct care for acute kidney disorder. They claim that plans for a consolidated and canonical data infrastructure for the NHS are beyond the original stated remit of the collaboration (see DeepMind Blockchain project). More importantly, the authors state that minimal consultation was had with public bodies governing data privacy, health research and medical device regulation. The NHS and DeepMind responded saying that this paper misrepresents the use of data and makes both factual mistakes and analytical errors. While the tone of this piece is also unfairly harsh, it highlights the careful balance that needs to be struck between sufficiently complying with incumbent regulatory frameworks and streamlining these procedures to catalyse necessary upgrades to core NHS services.

The list of signatories to the Asilomar AI Principles run by the Future of Life Institute continues to grow. Videos from this year’s conference at which questions of ethics, values and longer-term goals are discussed can be found here.

Following Bill Gates earlier this year, French Socialist Party candidate Benoit Hamon suggested a corporation tax on economic value generated as a result of AI (“robot tax”) that will go to fund universal basic income. Pro case: it’s an effective way to prevent further wealth disparity between the rich who can afford robots to work for them and the less well off who can’t. Con case: a robot tax stifles innovation and automation isn’t the only factor that affects the incentive to participate in labor markets (e.g. education, safety nets, trade) and thus shouldn’t be targeted in isolation.

Next frontiers for AI

Video understanding: Developing systems that understand the contents of video in real-time remains a complex, unsolved problem. This is largely because current static image ML tools don’t go much beyond object recognition, semantic segmentation (labelling each pixel) and captioning. Facebook, which has users consume over 100 million hours of video a day, has set its sights on this problem because “video understanding is going to be ridiculously impactful”. Google launched a Kaggle competition using the YouTube-8M dataset, but that only focuses on predicting video labels from 4,716 classes (e.g. “electric guitar”, “cuisine” and “talent show”). Meanwhile, a startup in Berlin called TwentyBN is attacking video understanding from a unique angle. First, they build a dataset of crowd-acted videos that depict short segments of objects interacting with one another (e.g. placing/pushing/dropping an object onto/on/off a table). Next, they train networks to accurately predict the correct action labels, so that the networks learn common sense about the 3D world in which objects interact that can be transferred to new problems.

Learning to learn: Several research groups have shown that machine learning can be used to improve how learning systems learn (termed “learning to learn”). Jeff Dean of Google Brain stated that this “automated machine learning” is the most promising avenue his team are working on.

Data efficiency: WIRED features a piece on a few researchers and companies working on data efficient means of handling uncertainty. This is key in the real world where there are only a few examples of driving accidents as a proportion of regular driving footage. AI systems must reason on this uncertainty to make the best (interpretable) decisions.

Hardware for computation: Graphcore, the British semiconductor startup developing novel silicon optimised for intelligent applications, released beautiful teaser visualisations of networks at work on their hardware. Watch this space as the company unveils aspects of its core technology this quarter!

Healthcare

Arterys received 510(k) clearance from the FDA to market its deep learning solution for automated ventricle segmentation on cardiac MRI images. This is allegedly the first regulated implementation of cloud-based deep learning in the clinical setting and adds to a CE Mark received in December 2016.

MedyMatch announced a collaboration with IBM Watson Health to integrate and market its deep learning-based non-contrast CT system to help assess patients suspected of head trauma or stroke and rule out brain bleeds. The company is conducting a clinical trial and working towards PMA Class III regulation with the FDA.

Eleven Two Capital outline opportunities for data-driven health technology. I do agree that there’s huge value to be created in diagnostics (imaging and physiological sensor), therapeutic discovery and development (see this piece by NVIDIA), treatment & care monitoring, as well as clinical & administrative workflow optimisation. Recent examples include Grail and Freenome (liquid biopsies).

Researchers at the University of Toronto Scarborough (commercialised via Structura Bio) have demonstrated they’re able to reconstruct the 3D structure of protein molecules from tens of thousands of low-resolution 2D electron cryomicroscopy images. Existing methods require days to weeks and as much as 500,000 CPU hours and prior understanding of the target structure — new methods overcome these bottlenecks to speed up drug discovery. Paper here.

People tracker

Andrew Ng, Chief Scientist at Baidu and the original lead of Google Brain, announced his departure from the Chinese search giant. Andrew remains a driving motivational and educational force behind the adoption of AI in companies and by students (e.g. via his Coursera ML lessons) worldwide. Wang Haifeng steps up to lead AI at Baidu.

Zoubin Ghahramani, Professor of Information Engineering at the University of Cambridge, steps up to Chief Scientist at Uber in connection with the acquisition of Geometric Intelligence. Zoubin is a world-leader in probabilistic modelling and machine learning, focused on decision making under uncertainty and learning efficiently from limited data. Zoubin will move to the West Coast.

Murray Shanahan, Professor of Cognitive Robotics at Imperial College London, took up an appointment at DeepMind as a Senior Research Scientist. He moves to part time at Imperial. His early work focused on symbolic reasoning and cognitive robotics, and he has increasingly focused on unifying symbolic reasoning with reinforcement learning.

Ian Goodfellow, formerly part of OpenAI’s founding team, has moved back to Google Brain.

Clement Farabet, who co-founded Madbits (acq. Twitter) and then served as tech lead for Twitter’s Cortex AI team, has left to join NVIDIA as head of AI infrastructure.

*Research*

Pixel recursive super resolution, Google Brain. Lots of recent work has tackled the problem of taking a low resolution photograph and mapping it to a high resolution version (“super resolution”). However, these approaches tend to work poorly when considering a low-quality high magnification image where there are multiple reasonable high-quality mappings. Here, the authors train a probabilistic pixel-by-pixel CNN on pairs of low and high quality highly-magnified images. The model can be sampled to produce multiple plausible high resolution images that fool naive human observers.

What uncertainties do we need in Bayesian deep learning for computer vision?, University of Cambridge. The world around us is full of inherent uncertainty, which makes understanding the present and reasoning about the future a challenge. In this work, the authors present a framework for learning models to account for uncertainty a) inherent in environmental observations and b) in the learned model as it applies to computer vision tasks. Their approach unifies modelling of both uncertainties to achieve new state-of-the-art results on segmentation and depth regression benchmarks for street level and home interior images.
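To make the two kinds of uncertainty concrete, here is a minimal sketch for a single regression output. It assumes the network predicts both a value and a log-variance (the noise in the observation), and that uncertainty in the model is estimated from the spread across several stochastic forward passes (e.g. with dropout left on). The function names and the toy numbers are hypothetical, and this is an illustration of the general idea rather than the authors' code.

```python
import math
from statistics import mean, pvariance


def attenuated_loss(y_true, y_pred, log_var):
    """Squared error scaled down by the predicted noise, plus a penalty on that noise.

    log_var is the model's predicted log-variance for this input (observation noise).
    """
    return 0.5 * math.exp(-log_var) * (y_true - y_pred) ** 2 + 0.5 * log_var


def predictive_uncertainty(passes):
    """Combine several stochastic forward passes, each a (prediction, log_var) pair.

    Returns (mean prediction, model uncertainty, observation uncertainty).
    """
    predictions = [p for p, _ in passes]
    model_uncertainty = pvariance(predictions)  # spread across passes
    observation_uncertainty = mean(math.exp(s) for _, s in passes)  # average predicted noise
    return mean(predictions), model_uncertainty, observation_uncertainty


# Hypothetical outputs of three dropout-enabled forward passes for one pixel.
passes = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
print(predictive_uncertainty(passes))
```

The loss lets the model discount errors on inputs it flags as noisy, while the log-variance term stops it from declaring everything noisy.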

One shot imitation learning, OpenAI and UC Berkeley. This paper considers the problem of a robot efficiently learning a) a task by watching just one demonstration (start and finish) and b) generalising to new conditions and tasks unseen in training data, also with just one demonstration. This is interesting because while it is possible to use behavioural cloning (supervised learning) and inverse reinforcement learning (reward function that explains the behaviour), these methods don’t allow a robot to accelerate its learning to imitate new skills. Reinforcement learning, on the other hand, requires many examples of trial and error.

Overcoming catastrophic forgetting in neural networks, DeepMind and UCL. Today’s neural networks are very effective at learning a single task with supervision. However, in order for a network trained for task #1 to perform well on a new task #2, it must be retrained with data for task #2. In doing so, it loses its ability to solve task #1, a major limitation towards general intelligence that is termed “catastrophic forgetting”. This work proposes an approach to overcome this problem by slowing the updating of weights in a neural network that were key to its ability to solve task #1 while it’s learning task #2. This selective decreasing of weight plasticity protects prior knowledge and enables continual learning in challenging reinforcement learning scenarios of Atari 2600 games. Blog post here from DeepMind.
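The idea of slowing down updates to important weights can be sketched as a quadratic penalty that pulls each weight back towards its task #1 value, scaled by how much that weight mattered for task #1. In the sketch below, the per-weight importance values are simply supplied by hand (an assumption for illustration; the paper estimates them from the network itself), and all names are hypothetical.

```python
def forgetting_penalty(weights, task1_weights, importance, strength=1.0):
    """Quadratic cost for moving important weights away from their task #1 values."""
    return 0.5 * strength * sum(
        f * (w - a) ** 2 for w, a, f in zip(weights, task1_weights, importance)
    )


def protected_step(weights, task2_grads, task1_weights, importance,
                   strength=1.0, lr=0.1):
    """One gradient step on task #2 loss plus the penalty above.

    The penalty's gradient for each weight is strength * importance * (w - task1_w),
    so weights with high importance are pulled back harder.
    """
    return [
        w - lr * (g + strength * f * (w - a))
        for w, g, a, f in zip(weights, task2_grads, task1_weights, importance)
    ]


# Toy example: the first weight mattered for task #1, the second did not.
task1_weights = [1.0, 1.0]
importance = [10.0, 0.0]
weights = [0.0, 0.0]          # both weights have drifted away from task #1 values
task2_grads = [0.0, 0.0]      # pretend task #2 is indifferent at this point
print(protected_step(weights, task2_grads, task1_weights, importance))
```

Only the important weight is pulled back towards its old value; the unimportant one stays free to be reused for task #2.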

Generative temporal models with memory, DeepMind. In order to model temporal and sequential data (e.g. language, time series, video streams), it is important to learn long temporal dependencies inherent in the data. Using Long Short-Term Memory (LSTM) RNNs to store and protect information over the longer term, however, doesn’t scale with large capacity storage. The authors present a generative temporal model where computation is separate from memory, which can store early information from a sequence and efficiently reuse the information in the future.

Neural Episodic Control, DeepMind. Deep reinforcement learning methods learn very slowly. For example, state of the art agents require >200 hours of gameplay to perform as well as a human with 2 hours of experience. Here, the authors introduce Neural Episodic Control as a method to dramatically improve the speed of learning and discovery of highly successful strategies in Atari 2600 environments. This is accomplished by writing all the agent’s experiences to memory and updating its memory faster than the rest of the deep neural network. Separately, researchers at Carnegie Mellon University published an approach to storing 2D memory images in order to solve long-term navigation in 3D mazes using deep RL (Neural Map: Structured memory for deep reinforcement learning).
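A rough sketch of what such an episodic memory could look like: experiences are written as (state embedding, value estimate) pairs, and a new state is valued by a distance-weighted average of its nearest stored neighbours. This is a heavily simplified illustration with hypothetical names. It appends every write, uses a brute-force search, and leaves out the learned embedding network entirely.

```python
class EpisodicMemory:
    """Append-only store of (key, value) pairs with nearest-neighbour lookup."""

    def __init__(self, neighbours=2, delta=1e-3):
        self.neighbours = neighbours  # how many stored entries to average over
        self.delta = delta            # avoids division by zero on exact matches
        self.entries = []

    def write(self, key, value):
        """Store one experience immediately; no gradient step is needed."""
        self.entries.append((tuple(key), float(value)))

    def lookup(self, query):
        """Estimate the value of `query` from its closest stored keys."""
        if not self.entries:
            raise LookupError("memory is empty")
        scored = sorted(
            (sum((q - k) ** 2 for q, k in zip(query, key)), value)
            for key, value in self.entries
        )[: self.neighbours]
        weights = [1.0 / (dist + self.delta) for dist, _ in scored]
        return sum(w * v for w, (_, v) in zip(weights, scored)) / sum(weights)


memory = EpisodicMemory(neighbours=2)
memory.write((0.0,), 0.0)   # hypothetical state embedding and its observed return
memory.write((2.0,), 4.0)
print(memory.lookup((1.0,)))  # query halfway between the two stored states
```

Because a write takes effect on the very next lookup, a single good experience can change behaviour straight away, which is the intuition behind the faster learning described above.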

*Resources*

Researchers at Facebook and others released PyTorch, a Python package that offers a GPU-ready Tensor library to replace NumPy and a framework to build neural networks using dynamic instead of static graphs, which handle variable workloads better. Stephen Merity discusses why that’s useful here.

Have a basic knowledge of ML and keen to learn best practices from Google? Here’s a 43-rule playbook by Martin Zinkevich, Research Scientist at Google.

Listen to Ian Goodfellow and Richard Mallah’s highlights for AI in 2016 on the Future of Life Institute podcast.

Best practices for training deep learning networks: a high level infographic!

*Financings and exits*

135 deals (64% US and 25% EU) totalling $1.26bn (43% US and 4% EU).

Big rounds

DataRobot, which automates laborious steps in data preparation, predictive model design, training and evaluation, raised a $55m first tranche of a Series C round led by NEA. This marks significant interest in ML-specific infrastructure software to bring tools that otherwise only really exist in AI-first companies like Facebook and Google to the masses.

Chorus.ai, a software product to record, analyse and enhance sales call effectiveness, raised a $16m Series A led by Redpoint Ventures and Emergence Capital. The goal is to rigorously discover opportunities on a live call and upskill sales staff.

Uptake Technologies, which offers data science software for predictive applications in industry, raised $70m of a planned $125m Series C led by RevolutionGrowth.

Early rounds

DeepScale, which published a significantly compressed yet powerful model for computer vision tasks (SqueezeNet) that can be applied to edge devices for self-driving, raised a $3m seed round from Bessemer, Greylock and Auto Tech Ventures.

BluHaptics, which develops software that uses video fusion to enable humans to assist robots in automating high-value tasks where there is low margin for error, raised a $1.36m Series A round led by Seattle Angel. The team draws its experience from the University of Washington and work with the Navy to clean up weapons and munitions from the seafloor.

Y-Combinator graduated (W17: day 1 and day 2) a few ML-based companies including: 1) AlemHealth (telemedicine for machine-driven analysis of CT scans in emerging markets), 2) Vinsight (crop yield optimisation using satellite and weather data), 3) Clover Intelligence (voice-based analytics for sales calls aimed at tracking and improving performance), 4) Quiki (using NLP to transform previous customer support interactions into website FAQs) and 5) lvl5 (3D point clouds from street level images for autonomous vehicle localisation and navigation).

Entrepreneur First graduated (#EF7) many ML-driven companies including: 1) Transformative (predicting onset of sudden cardiac arrest), 2) Observe (optimising fish feeding in aquafarms using underwater video footage), 3) Optimal (controlling greenhouse environments to optimise farming yields).

18 acquisitions (of which 5 are still in progress), including:

Mobileye acquired by Intel for $15.3bn as discussed earlier.

Neokami, a German startup offering data security solutions, was acquired by Relayr to beef up the security of their IoT implementation platform.

Kaggle, the largest community of data scientists online who compete on contributed data problems in a tournament style, was acquired by Alphabet. This move buys Google mindshare amongst data scientists and provides a new channel to distribute cloud infrastructure services to a burgeoning market.

Dextro, which developed video search and discovery solutions applied to live streaming, was acquired by TASER International to form a computer vision group working on police crime video data.

MightyTV, which offered a video content recommendation service, was acquired by Spotify.

—

Anything else catch your eye? Do you have feedback on the content/structure of this newsletter? Just hit reply!