Emerging technologies like machine learning (ML), virtual reality (VR), augmented reality (AR), and computer vision (CV) are providing fantastic opportunities for innovation and business growth across the whole Arm ecosystem and in all market segments.

Software developers throughout the semiconductor supply chain should be able to focus their efforts on innovation, not on re-implementing common technologies and optimizations. The same applies to product development, where our partners should be spending their time on features and differentiation, not on basic enablement.

Happy days! As part of our continued efforts to support our partners and developer ecosystem, today we are making the Arm Compute Library publicly available.

The Arm Compute Library is a collection of low-level software functions optimized for Arm Cortex CPU and Arm Mali GPU architectures and targeted at a variety of use cases, including image processing, computer vision and machine learning. It is available free of charge under the permissive MIT open-source license.

The Arm Compute Library initially contains a large number of functions implemented for the Cortex-A family of CPUs and for the Midgard and Bifrost families of Mali GPUs. It is a convenient repository of low-level, optimized software functions that developers can source individually or use as part of complex pipelines in order to accelerate their algorithms and applications.

What is included in the Arm Compute Library?

The first release of this library includes a comprehensive set of functions built over years of experience working with partners and developers on imaging- and vision-based products, as well as our experience optimizing machine learning frameworks such as Google TensorFlow.

The library includes the following categories of functions:

Basic arithmetic, mathematical and binary operator functions

Colour manipulation (conversion, channel extraction, and more)

Convolution filters (Sobel, Gaussian, and more)

Canny Edge, Harris corners, optical flow and more

Pyramids (such as Laplacians)

HOG (Histogram of Oriented Gradients)

SVM (Support Vector Machines)

H/SGEMM (Half and Single precision General Matrix Multiply)

Convolutional Neural Networks building blocks (Activation, Convolution, Fully connected, Locally connected, Normalization, Pooling, Soft-max)
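To make one of these categories concrete, here is a minimal sketch of what a single-precision GEMM computes: C = alpha * A * B + beta * C. This is a plain, unoptimized reference loop written for illustration only. The function name sgemm_reference and the row-major layout are assumptions made for this sketch; it is not the library's API or its optimized implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference (unoptimized) single-precision GEMM: C = alpha * A * B + beta * C.
// A is M x K, B is K x N, C is M x N, all stored row-major.
// Hypothetical helper for illustration; not part of the Arm Compute Library API.
void sgemm_reference(std::size_t M, std::size_t N, std::size_t K,
                     float alpha, const std::vector<float>& A,
                     const std::vector<float>& B,
                     float beta, std::vector<float>& C) {
    assert(A.size() == M * K);
    assert(B.size() == K * N);
    assert(C.size() == M * N);
    for (std::size_t i = 0; i < M; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            // Dot product of row i of A with column j of B.
            float acc = 0.0f;
            for (std::size_t k = 0; k < K; ++k) {
                acc += A[i * K + k] * B[k * N + j];
            }
            C[i * N + j] = alpha * acc + beta * C[i * N + j];
        }
    }
}
```

An optimized version would typically tile these loops and vectorize the inner dot product, but the result it has to produce is the same as this reference.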

We have listened to our partners and we’ll continue to do so - please get in touch with your feedback: developers@arm.com.

What are the benefits of the Arm Compute Library?

Apart from being a comprehensive, one-stop solution for common performance-optimized CV and ML routines, an important characteristic of the Arm Compute Library is portability. The CPU functions are implemented using NEON intrinsics, which enables developers to re-compile them for their target architecture. This means the code is transferable between Armv7 and Armv8 processor implementations and can be compiled for both 32-bit and 64-bit targets. The GPU version of the library consists of kernel programs written using the OpenCL standard API, which again is portable across multiple processors and architectures (albeit specifically optimized for Arm Mali GPUs).
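As a rough illustration of why intrinsics help with portability, the sketch below adds two 8-bit image buffers with saturation. It uses the standard NEON intrinsics vld1q_u8, vqaddq_u8 and vst1q_u8 from arm_neon.h, so the same source can be re-compiled for 32-bit or 64-bit Arm targets. The helper name saturating_add_u8, the SKETCH_HAS_NEON macro and the scalar fallback are assumptions made for this example; this is not code taken from the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define SKETCH_HAS_NEON 1
#else
#define SKETCH_HAS_NEON 0
#endif

// Hypothetical helper (not a library function): saturating add of two
// 8-bit buffers, dst[i] = min(a[i] + b[i], 255).
std::vector<std::uint8_t> saturating_add_u8(const std::vector<std::uint8_t>& a,
                                            const std::vector<std::uint8_t>& b) {
    assert(a.size() == b.size());
    std::vector<std::uint8_t> dst(a.size());
    std::size_t i = 0;
#if SKETCH_HAS_NEON
    // Process 16 pixels per iteration using 128-bit NEON registers.
    for (; i + 16 <= a.size(); i += 16) {
        const uint8x16_t va = vld1q_u8(a.data() + i);
        const uint8x16_t vb = vld1q_u8(b.data() + i);
        vst1q_u8(dst.data() + i, vqaddq_u8(va, vb));
    }
#endif
    // Scalar loop: handles the tail, and everything on targets without NEON.
    for (; i < a.size(); ++i) {
        const unsigned sum = static_cast<unsigned>(a[i]) + b[i];
        dst[i] = static_cast<std::uint8_t>(sum > 255u ? 255u : sum);
    }
    return dst;
}
```

Because the vector path is written with intrinsics rather than hand-written assembly, the compiler handles register allocation and instruction selection for whichever Arm architecture it is targeting.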

The library functions can be used to accelerate key stages of computer vision and machine learning pipelines

The library is operating system agnostic and has already been deployed on a broad variety of modern Linux and Android Arm-based system-on-chip platforms.

It is a useful tool that can significantly reduce cost and effort for developers targeting imaging, vision and machine learning applications, enabling them to focus on differentiation and reduce their products’ time to market. The Arm Compute Library is mature and tested, and has already been used in products by several consumer and mobile silicon vendors and OEMs, as well as by many ISVs across the globe.

How about other similar open-source libraries?

If you have developed or prototyped any computer vision software, you have very likely used the OpenCV library. OpenCV is a fantastic tool, the most comprehensive toolkit anyone may need to enable fast prototyping of products and solutions in the field of CV and ML.

However, the level of support for mobile and embedded processors in OpenCV is still limited. Today the OpenCV project contains around 40 NEON-accelerated functions. There is also an OpenCL module, which enables acceleration of key functions on compatible processors, but the code is tuned for selected desktop-class GPU architectures and does not perform well (or, in some cases, run at all) on mobile OpenCL implementations.

Compared to existing open source alternatives such as OpenCV, the Arm Compute Library provides a much more comprehensive set of functions as well as superior performance out of the box. We tested functions that are common between the two libraries on modern smartphones such as the Huawei Mate 9 (HiSilicon Kirin 960) and found that in all cases the Arm Compute Library outperformed OpenCV. The diagrams below show the performance uplift that was observed, with results grouped by function category.

Arm Compute Library vs OpenCV, single-threaded, CPU (NEON), tested on HiSilicon Kirin 960

The Arm Compute Library complements the landscape of Arm-optimized libraries by providing optimized primitives specifically targeting ML and CV. Other libraries worth highlighting that provide NEON optimizations include:

The HPC Performance Library, a collection of standard core math libraries for high-performance computing applications on Arm (Optimized BLAS, LAPACK and FFT). These can be evaluated for free or licensed for a fee.

Ne10, an open source C library hosted by Arm on GitHub, containing a set of the most common processing-intensive functions, heavily optimized for Arm.

libyuv, an open source project that includes YUV scaling and conversion functionality.

Skia, an open source 2D graphics library used as the graphics engine for Google Chrome and Chrome OS, Android, Mozilla Firefox and Firefox OS, and many other products.

Who is using this library today?

A large number of silicon vendors, OEMs and ISVs are currently using this library to improve performance of their vision and AI products.

At Mobile World Congress we demonstrated a smartphone application that estimates food calorie count through a combination of CV and ML technologies. The demonstration was developed by our valued partner ThunderView (ThunderSoft) and made use of some of the optimized primitives Arm has developed. You can read more about this in my previous blog and in the report from Peter Clark of EE Times, and see it in action in this video report from Reuters (the Arm piece starts at 1:58).

Watch this space to learn more about what our partners are doing with the Arm Compute Library.

View Arm Compute Library resources