Machine learning and quantum computing have staggering levels of technology hype in common. But certain aspects of their mathematical foundations are also strikingly similar. In a paper in Nature, Havlíček et al.¹ exploit this link to show how today’s quantum computers can, in principle, be used to learn from data, by mapping the data into a space in which only quantum states exist.

Read the paper: Supervised learning with quantum-enhanced feature spaces

One of the first things one learns about quantum computers is that these machines are extremely difficult to simulate on a classical computer such as a desktop PC. In other words, classical computers cannot, in general, be used to obtain the results of a quantum computation in any practical amount of time. The reason is that an enormous quantity of numbers is required to describe each internal step of the computation, and that quantity doubles with every quantum bit added to the machine. Consider the multi-step procedure that many people learn at school for dividing large numbers. If this were a quantum computation being simulated on a classical computer, every intermediate step could easily need more numbers to describe it than there are atoms in the observable Universe.

The state of a quantum system when described by a collection of numbers is known as a quantum state. And if a quantum state is associated with many values, it is said to ‘live’ in a large space. For certain quantum computers that are based on continuous variables, such spaces can even be infinitely large.
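To put rough numbers on this growth, here is a minimal Python sketch. It relies on the standard fact that a general state of n quantum bits is described by 2^n complex numbers. The function names are made up for illustration, and the figure of 10^80 atoms in the observable Universe is a commonly quoted order-of-magnitude estimate that is assumed here, not taken from the paper.

```python
# Assumed order-of-magnitude estimate, for illustration only.
ATOMS_IN_OBSERVABLE_UNIVERSE = 10 ** 80


def num_amplitudes(n_qubits: int) -> int:
    """Number of complex numbers describing a general n-qubit state."""
    return 2 ** n_qubits


def qubits_to_exceed(threshold: int) -> int:
    """Smallest number of qubits whose state needs more than `threshold` numbers."""
    n = 0
    while num_amplitudes(n) <= threshold:
        n += 1
    return n


# Sizes of the state description for a few small devices.
for n in (2, 5, 20):
    print(f"{n} qubits -> {num_amplitudes(n)} numbers")

# Number of qubits at which the description outgrows the assumed atom count.
print(qubits_to_exceed(ATOMS_IN_OBSERVABLE_UNIVERSE))
```

Python integers have arbitrary precision, so the comparison with 10^80 is exact rather than a floating-point approximation.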

Machine learning, by comparison, analyses data that live in much smaller spaces; that is, the data are described by far fewer values. For example, a photograph that contains one million pixels records just three million numbers to describe the amount of red, green and blue in each pixel. A prominent task of machine learning could be to guess the content of the image, or to produce similar images. However, a well-established theory in machine learning called kernel methods² treats data in a way that has a similar feel to how quantum theory deals with data.

In a nutshell, kernel methods carry out machine learning by defining which data points are similar to each other and which are not. Mathematically speaking, similarity is a distance in data space — that is, a distance between the representations of data points as numbers. Similar images are assumed to have similar content, and distances between data points can be crucial in machine learning. But defining similarities is not as straightforward as it sounds. For example, what is the distance in data space between two images if derived on the basis of the amount of red in each image?

Kernel theory showed that many definitions of similarity in data space are mathematically equivalent to a simple measure of similarity in a much larger, possibly infinitely large, space (Fig. 1). Consequently, every time two images are compared, the images are implicitly mapped to a representation in a huge space, and a simple similarity is computed. No ordinary computer can calculate this large representation explicitly. But perhaps a quantum computer can? Because quantum computers carry out computations in extremely large spaces, what happens if data are mapped into the space that is inhabited by quantum states?
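The equivalence described above can be checked by hand for a small case. The sketch below uses a standard textbook example, the degree-2 polynomial kernel, which is chosen here for illustration and is not the similarity measure used by Havlíček et al. The function names are invented. The similarity (x·y)² computed directly in a 3-dimensional data space equals an ordinary inner product in a 9-dimensional space made up of all pairwise products of the inputs.

```python
import math


def poly_kernel(x, y):
    """Similarity computed directly in data space: (x . y) squared."""
    return sum(a * b for a, b in zip(x, y)) ** 2


def feature_map(x):
    """Explicit map to the larger space: every pairwise product x_i * x_j."""
    return [a * b for a in x for b in x]


def inner(u, v):
    """Ordinary inner product, the 'simple similarity' in the larger space."""
    return sum(a * b for a, b in zip(u, v))


x = [1.0, 2.0, 3.0]
y = [0.5, -1.0, 2.0]

direct = poly_kernel(x, y)                         # never leaves data space
explicit = inner(feature_map(x), feature_map(y))   # goes via the larger space

print(len(x), len(feature_map(x)))  # dimensions of the two spaces
print(math.isclose(direct, explicit))
```

For this kernel the larger space is still small enough to write down. For other kernels it can be infinitely large, and then only the direct computation is possible on an ordinary computer.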

Figure 1 | Quantum-enhanced machine learning. Havlíček et al.¹ demonstrate how quantum computers could improve the performance of machine-learning algorithms. In this simple illustration, a conventional (classical) computer uses machine learning to classify images of animals. Images whose pixels contain similar colours are positioned close together in data space. The classical computer sends these data to a quantum computer that maps each of the images to a particular quantum state in a space of such states. Images that are close together in data space, but are different in content, are represented by states that are far apart in quantum space. The quantum computer sends the distances between the quantum states to the classical computer to improve the image classification. Polar bear: Art G. (CC BY)

Havlíček et al. and my research team³ recognized this potentially powerful link between machine learning and quantum computing at roughly the same time. Remarkably, both groups proposed essentially the same two strategies for using the idea to design quantum algorithms for machine learning. The first strategy makes only minimal use of the quantum computer, as a mere hardware addition to a conventional machine-learning system: the quantum device returns similarities when given two data points. The second strategy carries out the actual learning on the quantum computer, aided by the classical one.
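The data flow of the first strategy can be sketched with a toy classical simulation. This is not the circuit of Havlíček et al. It assumes a deliberately simple, hypothetical encoding in which each data value sets the rotation angle of one quantum bit, and it takes the squared overlap of two encoded states as their similarity. All function names are invented. An encoding this simple is easy to simulate classically, so it illustrates the interface only and offers no quantum advantage.

```python
import math


def qubit_state(angle):
    """One-qubit state cos(angle/2)|0> + sin(angle/2)|1> (real amplitudes)."""
    return [math.cos(angle / 2), math.sin(angle / 2)]


def encode(x):
    """Hypothetical encoding: one qubit per data value, combined by tensor product."""
    state = [1.0]
    for angle in x:
        q = qubit_state(angle)
        state = [a * b for a in state for b in q]
    return state


def quantum_kernel(x, y):
    """Similarity of two data points: squared overlap of their encoded states."""
    overlap = sum(a * b for a, b in zip(encode(x), encode(y)))
    return overlap ** 2


def gram_matrix(points):
    """All pairwise similarities, the only thing the classical learner receives."""
    return [[quantum_kernel(p, q) for q in points] for p in points]


# Two nearby points and one distant point, each with two values (two qubits).
points = [[0.1, 0.2], [0.15, 0.25], [2.5, 3.0]]
K = gram_matrix(points)

print(len(encode(points[0])))  # dimension of the two-qubit state space
for row in K:
    print([round(value, 4) for value in row])
```

On real hardware each entry of `K` would be estimated from repeated measurements rather than computed exactly. A conventional kernel method, such as a support vector machine, could then be trained on the matrix without ever seeing the quantum states themselves.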

A key contribution from Havlíček et al. is that they implemented the two strategies in a proof-of-principle experiment on a real quantum computer: one of IBM’s quantum chips. Despite the inflated claims of some news reports, anyone who has tried quantum computing in the cloud knows that collecting meaningful data from these devices is notoriously difficult, owing to the high levels of experimental noise in the computation. This is probably why the authors’ experiment is stripped to its bare bones; in some people’s view, perhaps too far. The quantum space has only four dimensions, because the set-up uses two quantum bits (qubits) of IBM’s smallest, five-qubit chip, at a time when the IBM cloud service already offers access to a 20-qubit device. The data set is likewise hand-engineered in such a way that it is simple to analyse in this four-dimensional space.

Nevertheless, Havlíček and colleagues’ work presents an intriguing proof-of-principle demonstration of a potentially revolutionary way of using quantum computers for machine learning. After many studies offering various attempts to mould the much more popular artificial neural networks into quantum computing, kernel methods provide a refreshingly natural bridge between machine learning and quantum theory. However, recognizing this bridge is only the beginning.

For instance, it remains to be seen whether the way in which Havlíček et al. represent data in quantum space is actually useful for real-world machine-learning applications. That is, it is not known whether the approach is associated with a meaningful measure of similarity that, for example, in classifying images of animals, places cat pictures close to cat pictures but not to dog pictures. Moreover, it is unclear whether there are other strategies that would work better. And would these techniques be good enough to beat almost 30 years of classical methods? If so, the desperate search for a ‘killer application’ for quantum computers would be over. But the answer to this question is probably more complicated.