Moving from Desks to Pockets

Machine learning is currently transitioning from PCs to mobile devices, with growing effort going into running machine learning applications efficiently on phones. Companies have started to develop dedicated hardware, as well as software tools and frameworks like TensorFlow Lite, Core ML, and Caffe, to give mobile developers an easy entry into machine learning.
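To give a sense of how low the barrier is, here is a minimal sketch of the TensorFlow Lite workflow: convert a (trivial, illustrative) Keras model to the compact .tflite format and run it with the TFLite interpreter. The model architecture and shapes are invented for the example, not taken from any real app.

```python
import numpy as np
import tensorflow as tf

# A trivial stand-in for a trained network (shapes are illustrative).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])

# Convert to the compact .tflite format used for on-device inference.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# On a phone this buffer would ship inside the app bundle;
# here we load it straight from memory.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

sample = np.random.rand(1, 4).astype(np.float32)
interpreter.set_tensor(input_details[0]["index"], sample)
interpreter.invoke()
probs = interpreter.get_tensor(output_details[0]["index"])
```

On Android, the same .tflite buffer can be handed to hardware delegates so that inference runs on a GPU or NPU instead of the CPU.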

Employing machine learning on-device has a few benefits: it makes ML more accessible to the general public and much easier to use. For example, on-device ML lets you apply filters to images wherever you are, or identify the species and condition of flowers you come across. You obviously wouldn't want to carry a laptop around to do all that.

However, mobile phones have far less powerful CPUs and GPUs than desktop computers and laptops. Making predictions on these lower-power processors can take a few seconds, which is too slow for real-time applications.

In late 2017, major chip makers like HiSilicon, Qualcomm, and Apple realized the potential of AI and ML on mobile and began allocating more resources to the area. They made their GPUs more powerful and started integrating Neural Processing Units (NPUs) dedicated to on-device machine learning.

Neural Processing Unit (NPU)

According to WikiChip:

A neural processor or a neural processing unit (NPU) is a microprocessor that specializes in the acceleration of machine learning algorithms, typically by operating on predictive models such as artificial neural networks (ANNs) or random forests (RFs).

Integrating neural processing units into mobile chips enables faster and more power-efficient processing of neural networks on-device. Carrying out such computationally intensive tasks on-device has several advantages:

With an NPU on the device itself, there is no dependence on cloud services. This cuts server-side costs, speeds up inference, and makes machine learning available entirely offline, so users with poor internet connectivity can still take full advantage of it. Not depending on the cloud also means all code executes on the device, which keeps user data more private and secure.

Neural processors take architectural inspiration from the brain, with compute units arranged to mimic networks of neurons and synapses. This design lets complex convolutional neural networks (CNNs) run many calculations in parallel to quickly recognize and analyze images, audio, video, and text.
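The parallelism an NPU exploits can be made concrete: most of the work in a CNN layer reduces to one large matrix multiplication (the classic im2col trick), and a matrix multiply is exactly the kind of operation that maps onto many parallel multiply-accumulate units. The sketch below is illustrative (NumPy on a CPU, toy shapes), not how any particular NPU is programmed.

```python
import numpy as np

def conv2d_as_matmul(image, kernels):
    """'Valid' 2D convolution via im2col: unroll patches, then one matmul."""
    h, w = image.shape
    k = kernels.shape[1]  # kernel size (assumes square kernels)
    out_h, out_w = h - k + 1, w - k + 1
    # Gather every k-by-k patch into a row -> (out_h*out_w, k*k)
    patches = np.stack([
        image[i:i + k, j:j + k].ravel()
        for i in range(out_h) for j in range(out_w)
    ])
    # A single matmul evaluates every filter at every position at once --
    # the parallel multiply-accumulate workload NPUs are built for.
    flat = patches @ kernels.reshape(kernels.shape[0], -1).T
    return flat.reshape(out_h, out_w, -1)

image = np.arange(25, dtype=np.float32).reshape(5, 5)
kernels = np.ones((4, 3, 3), dtype=np.float32)  # 4 filters of 3x3
out = conv2d_as_matmul(image, kernels)          # shape (3, 3, 4)
```

Because all filter-position products are independent, a chip with thousands of multiply-accumulate units can compute them simultaneously rather than one at a time.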

Apple (with its Neural Engine), Google (with its Android Neural Networks API), and Qualcomm (with its Snapdragon Neural Processing Engine SDK) have taken big first steps in bringing some form of hardware acceleration to their platforms.

Apple’s A12 Bionic Chip

Apple released its first chip with a Neural Engine, the A11 Bionic, in 2017. It was arguably the most capable smartphone chip of its time, until Apple replaced it with the chip released in the iPhone XS: the A12 Bionic, which also happens to be the world's first 7-nanometer smartphone chip.

Along with a six-core CPU and a four-core GPU, the A12 includes an eight-core Neural Engine dedicated to neural network workloads. The Neural Engine allows Apple to run neural networks and machine learning tasks in a far more energy-efficient manner.

Apple claims that the A12 Bionic can perform up to a massive 5 trillion operations per second, and that the six extra Neural Engine cores make Core ML up to 9 times faster than it was on the A11 Bionic.

Core ML performance by device. Higher is better. Note the y-axis is logarithmic. Data from Fritz.

Apple wants developers to use Core ML and the A12 Bionic's power to build new and innovative ML applications. Apple itself shows what's possible by combining the Neural Engine with another hardware accelerator dedicated to image processing, the Image Signal Processor (ISP): a very fast Face ID (the iPhone's secure 3D face unlock), Animoji and Memoji with real-time 3D face tracking, and augmented reality (AR) applications and games. The ISP processes images captured by the camera to make them look beautiful and realistic, enabling advanced modes such as Smart HDR and bokeh.

For the past few years, Apple has been the leading mobile chip maker, largely because its hardware is tightly integrated with its software. Even so, it isn't the only company developing chips suited for on-device machine learning.