D-Wave, a company we have covered extensively in the past, has taken some heat because a number of tests have shown that its adiabatic quantum optimizer is no faster than a regular computer. But the company is now claiming that its machine is between two and 600 times faster than a classical computer. What happened to produce such a huge speed-up? A new benchmark, that's what.

I remember when every OS manufacturer, chip maker, and video card vendor would either cheat or invent a new benchmark in order to claim to be the fastest. So my eyes initially rolled so hard that Newton's third law just about kicked me out of my chair. After recovering from the muscle strain, I got a hold of the paper, and much to my surprise, I agree with the authors.

Inappropriate benchmarks

Before I get to the D-Wave benchmark, let's take a quick look at how a naive benchmark might work. You can set your computer a task and time how long it takes to perform it. Or you can set it a repetitive set of fundamental operations (say multiplying two numbers) and see how many operations it can perform per second.
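The second kind of naive benchmark can be sketched in a few lines of Python. This is purely my own illustration of the idea, not anything from the paper; the loop count and the choice of multiplication as the "fundamental operation" are arbitrary:

```python
import time

def ops_per_second(n=1_000_000):
    """Time n floating-point multiplications and report throughput."""
    start = time.perf_counter()
    acc = 1.0
    for _ in range(n):
        acc *= 1.0000001  # the repeated fundamental operation being timed
    elapsed = time.perf_counter() - start
    return n / elapsed

print(f"{ops_per_second():,.0f} multiplications per second")
```

Note that this only makes sense for hardware that performs discrete, countable operations, which is exactly where it breaks down below.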

I'm not going to discuss how the D-Wave annealer works (I've described it here, here, and here). But one thing you do need to know is that, unlike traditional computers, the hardware doesn't perform discrete operations. So counting individual operations clearly isn't going to work.

The surprise is that setting a task and timing it is also not a good measure. Let's take an example: imagine that I am in charge of figuring out the shortest route for a courier to take over the course of a day. There might be several million different possible routes, and maybe one of them is actually the shortest by a significant margin. But it's more likely that there are 10 or so solutions that are within 0.1 percent of the shortest route, and there might be a hundred or more that are within one percent.

Why do we care about these non-optimum routes? Because the chance of finding one of the hundred solutions that are within one percent of the shortest route is much higher than the chance of finding the single shortest route. In other words, the longer I search for a solution, the better the solution I will get. But unless you test every single possible route, you can never tell when you have found the solution, only that you have found a solution that is good enough.

For these sorts of problems, the scaling involved in exhaustive searches is very dramatic. If your computer takes a few milliseconds to find a top-100 solution, it might take an hour to find a top-five solution. Time is definitely not on your side for these problems.

If you're going to use the time it takes to find a solution as a benchmark, then you first have to limit yourself to problems for which you can find the solution. In other words, very small examples of the routing problem (in this instance). But if the performance of two different computer systems scales very differently with problem size, then a small example is not going to be predictive of what it's like to actually use them on hard problems. Furthermore, it simply isn't representative of how the computer will be used.

Time to temperature?

The new benchmark tries to capture this difference. It takes advantage of the fact that both classical computers and the D-Wave annealer use the same basic approach to computation, called annealing. The idea is that you set up an energy landscape where the configuration of bits that has the lowest energy is the solution to the problem. You start with the landscape in a configuration that you understand and place the bits in the lowest energy state. You then slowly modify the landscape so that it corresponds to that of the problem you want to solve.

If you do it right, the bits flip back and forth to stay in the lowest energy configuration. At the end of the computation, you read out the bits, and that is the solution. What actually happens, though, is that bits don't stay in the lowest energy configuration, but they often stay near it.

In the D-Wave system, each bit is actually a physical object (a ring of superconducting current), so the annealing is performed by adjusting the coupling between the rings. In a classical computer, all of this is simulated. The important point is that each coupling between bits represents an energy, so the sum of all of them can be measured. The smaller that sum, the better your solution.
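As a rough illustration of that energy sum, here's a toy Ising-style calculation in Python. The three-spin problem, the coupling strengths, and the field values are all invented for the example; they aren't taken from D-Wave's hardware or the paper:

```python
def ising_energy(bits, couplings, fields):
    """Total energy of a spin configuration: sum of local-field terms
    h_i * s_i plus coupling terms J_ij * s_i * s_j. Lower is better;
    the lowest-energy configuration encodes the problem's answer."""
    e = sum(h * s for h, s in zip(fields, bits))
    e += sum(J * bits[i] * bits[j] for (i, j), J in couplings.items())
    return e

# A made-up three-spin problem (spins take the values +1 or -1)
bits = [1, -1, 1]
couplings = {(0, 1): 1.0, (1, 2): -0.5}  # hypothetical J_ij values
fields = [0.0, 0.2, -0.1]                # hypothetical h_i values
print(ising_energy(bits, couplings, fields))
```

The benchmark never needs to know the true minimum of this sum; it only compares the energies that different runs actually reach.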

To create a benchmark, you choose a problem and run it many times. From this, you obtain a distribution of energies—you can judge the quality of these solutions by their deviation from the mean. Remember, we don't know what the actual minimum energy is; instead, we say that a good solution has an energy below a certain extreme point of the distribution. Having set these thresholds, you then measure how many times you have to perform the anneal before you get a solution that's at or below the target energy. Given a fixed run-time per sweep, you then have the time it takes to get to a target energy.
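My reading of that protocol, sketched in Python. The random-draw "annealer" is a stand-in for a real solver, and the sample size, target quantile, and per-sweep run-time are placeholder assumptions, not the paper's numbers:

```python
import random

def anneal_once(rng):
    """Stand-in for one annealing run: returns the energy it reached.
    (Just a random draw here; a real solver would go in its place.)"""
    return rng.gauss(0.0, 1.0)

def time_to_target(runs=1000, quantile=0.01, sweep_time=1e-3, seed=1):
    rng = random.Random(seed)
    # Run the anneal many times to build up a distribution of energies
    energies = sorted(anneal_once(rng) for _ in range(runs))
    # "Good" = at or below an extreme point of that distribution
    target = energies[int(quantile * runs)]
    # Estimate how often a single anneal hits the target...
    p_hit = sum(e <= target for e in energies) / runs
    repeats = 1 / p_hit  # ...and thus the expected anneals per success
    # Fixed run-time per sweep converts repeats into wall-clock time
    return repeats * sweep_time

print(f"estimated time to target: {time_to_target():.3f} s")
```

The output is a time, so different machines can be compared directly even when one of them never performs a discrete operation.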

A fair benchmark?

The results of the benchmarks show that the D-Wave system pretty much spanks all the other solvers, but mostly in the cases where the problem size is large. For small problems, the difference is much smaller and sometimes reverses once you take into account the time required to read the solution out from the D-Wave chip.

The key difference is that the D-Wave chip optimizes all of its bits at the same time as the couplings between them sweep in an analog fashion between pre-programmed beginning and end points. To simulate this on a classical computer, each bit's value needs to be recomputed based on the updated couplings and the values of its neighboring bits. That means the classical computer must perform many operations per bit, then repeat them for every bit at every step in the sweep. This is slow, and it gets much worse as the number of bits grows.
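To see why, here's a naive sketch of that inner loop in Python. A greedy flip rule stands in for a real annealing schedule, and the work counter is my own addition to make the scaling visible; the cost grows as steps times bits times neighbors:

```python
def simulated_sweep(bits, couplings, steps):
    """Naive inner loop of a simulated annealer: at every step, revisit
    each bit and flip it if that lowers the energy given its neighbors."""
    n = len(bits)
    # Build neighbor lists from the coupling dictionary
    nbrs = {i: [] for i in range(n)}
    for (i, j), J in couplings.items():
        nbrs[i].append((j, J))
        nbrs[j].append((i, J))
    work = 0
    for _ in range(steps):
        for i in range(n):
            # Recompute the local field bit i feels from its neighbors
            field = sum(J * bits[j] for j, J in nbrs[i])
            work += len(nbrs[i])
            if bits[i] * field > 0:  # flipping lowers the J*s_i*s_j terms
                bits[i] = -bits[i]
        # (a real annealer would also lower a temperature here)
    return bits, work

# A four-bit chain with made-up ferromagnetic couplings settles into
# an all-aligned configuration
bits, work = simulated_sweep([1, -1, 1, -1],
                             {(0, 1): -1, (1, 2): -1, (2, 3): -1}, steps=5)
print(bits, work)
```

Every step touches every bit, and every bit touches every neighbor; the chip does all of that in parallel, in physics, for free.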

While the bits in D-Wave's chip update themselves, it is slow to read out, and setting up the problem space probably takes some time as well. These are fixed costs, but they become less important as the time spent on the actual computation grows.

The message being, quantum or not, real annealing should beat simulated annealing any day of the week for large problems.

In that respect, the new benchmark shows exactly the advantages of a physical annealing system compared to a simulated annealing system. It will also be very useful to compare the performance between two different physical annealing systems. So yes, this is an excellent benchmark.

On the other hand

This result, though, kind of feels like the situation where I happen to have a digital computer and you happen to have a calculator. For any single computation, the calculator wins, but for a long series, the computer wins. We generally don't run benchmarks to compare the two because we accept that the two are not used for the same problems.

This comes into play in two ways. One is that the D-Wave system still has too few bits to be used for real-world problems—it's still the equivalent of a computer that can only compete with calculators. The benchmark only works within the space of problems that fit on the chip, which makes it unrealistic for now, although there is no denying that it shows the potential of D-Wave hardware.

At present, D-Wave is aiming to compete with simulated annealers and equivalent solvers that can run on standard hardware. But once the company has scaled up to the point where its chip is useful, the real competition (and the real use of the benchmark) will be other physical annealing systems.

arXiv.org, 2015, arXiv:1508.05087