Amplitude estimation,12 formally introduced in Supplementary Information, allows us to estimate a in the state \({\cal A}\left| 0 \right\rangle _{n + 1} = \sqrt {1 - a} \left| {\psi _0} \right\rangle _n\left| 0 \right\rangle + \sqrt a \left| {\psi _1} \right\rangle _n\left| 1 \right\rangle\) for an operator \({\cal A}\) acting on n + 1 qubits. AE uses m additional sampling qubits and Quantum Phase Estimation18 to produce an estimator \(\tilde a = {\mathrm{sin}}^2(y\pi {\mathrm{/}}M)\) of a, where y ∈ {0, …, M − 1} and \(M = 2^m\) is the number of samples. The estimator \(\tilde a\) satisfies

$$\left| {a - \tilde a} \right| \le \frac{\pi }{M} + \frac{{\pi ^2}}{{M^2}} = O\left( {M^{ - 1}} \right),$$ (1)

with probability of at least \(8/\pi^2\). This represents a quadratic speed-up compared to the \(O(M^{-1/2})\) convergence rate of classical Monte Carlo methods.1
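As a concrete illustration (a classical Python sketch, not part of the original algorithm), the discrete grid of values AE can return and the bound of Eq. (1) can be tabulated directly:

```python
import math

def ae_estimates(m):
    """All values an m-qubit amplitude estimation can return.

    AE measures y in {0, ..., M-1} with M = 2**m and maps it to
    a_tilde = sin^2(y*pi/M); note the grid is non-uniform in [0, 1].
    """
    M = 2 ** m
    return [math.sin(y * math.pi / M) ** 2 for y in range(M)]

def ae_error_bound(M):
    """Error bound of Eq. (1): |a - a_tilde| <= pi/M + pi^2/M^2 = O(1/M)."""
    return math.pi / M + math.pi ** 2 / M ** 2

# With m = 4 sampling qubits (M = 16) the grid has 16 points in [0, 1]:
grid = ae_estimates(4)
assert min(grid) == 0.0 and max(grid) <= 1.0
```

Doubling M (one more sampling qubit) halves the bound, which is the quadratic speed-up discussed above.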

To use AE to estimate quantities related to a random variable X we must first represent X as a quantum state. Using n qubits we map X to the set {0, …, N − 1}, where \(N = 2^n\). X is then represented by the state

$${\cal R}\left| 0 \right\rangle _n = \left| \psi \right\rangle _n = \mathop {\sum}\limits_{i = 0}^{N - 1} \sqrt {p_i} \left| i \right\rangle _n\quad {\mathrm{with}}\quad \mathop {\sum}\limits_{i = 0}^{N - 1} {\kern 1pt} p_i = 1$$ (2)

created by the operator \({\cal R}\). Here pi ∈ [0, 1] is the probability of measuring the state \(\left| i \right\rangle _n\) and i ∈ {0, …, N − 1} is one of the N possible realizations of X. Next, we consider a function f:{0, …, N − 1} → [0, 1] and a corresponding operator

$$F:\left| i \right\rangle _n\left| 0 \right\rangle \mapsto \left| i \right\rangle _n\left( {\sqrt {1 - f(i)} \left| 0 \right\rangle + \sqrt {f(i)} \left| 1 \right\rangle } \right),$$ (3)

for all i ∈ {0, …, N − 1}, acting on an ancilla qubit. Applying F to \(\left| \psi \right\rangle _n\left| 0 \right\rangle\) yields

$$\mathop {\sum}\limits_{i = 0}^{N - 1} \sqrt {1 - f(i)} \sqrt {p_i} \left| i \right\rangle _n\left| 0 \right\rangle + \mathop {\sum}\limits_{i = 0}^{N - 1} \sqrt {f(i)} \sqrt {p_i} \left| i \right\rangle _n\left| 1 \right\rangle .$$

With AE we approximate the probability of measuring \(\left| 1 \right\rangle\) in the last qubit, which equals \(\mathop {\sum}\nolimits_{i = 0}^{N - 1} p_if(i) = {\Bbb E}\left[ {f(X)} \right]\). AE can thus be used to approximate the expected value of a random variable.19,20 By choosing f(i) = i/(N − 1) we estimate \({\Bbb E}\left[ {\frac{X}{{N - 1}}} \right]\) and hence \({\Bbb E}[X]\). If we choose \(f(i) = i^2/(N - 1)^2\) we can efficiently estimate \({\Bbb E}[X^2]\) and obtain the variance \({\mathrm{Var}}(X) = {\Bbb E}[X^2] - {\Bbb E}[X]^2\).
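The quantity AE estimates can be checked classically. The following sketch (the toy distribution below is arbitrary, standing in for the \(p_i\) of the prepared state) evaluates the ancilla \(\left| 1 \right\rangle\) probability \(\sum_i p_i f(i)\) for the two choices of f discussed above:

```python
import numpy as np

# A toy 3-qubit distribution (N = 8); in the text the p_i are the
# squared amplitudes of the prepared state |psi>_n.
n = 3
N = 2 ** n
rng = np.random.default_rng(7)
p = rng.random(N); p /= p.sum()

# Probability of measuring |1> on the ancilla after applying F is
# sum_i p_i * f(i); with f(i) = i/(N-1), AE estimates E[X/(N-1)].
f = np.arange(N) / (N - 1)
a = float(np.dot(p, f))
mean_X = a * (N - 1)                # undo the scaling to get E[X]

# f(i) = i^2/(N-1)^2 gives E[X^2] and hence the variance.
a2 = float(np.dot(p, f ** 2))
var_X = a2 * (N - 1) ** 2 - mean_X ** 2
assert abs(mean_X - float(np.dot(p, np.arange(N)))) < 1e-9
```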

We now extend this technique to risk measures such as VaR and CVaR. For a given confidence level α ∈ [0, 1], \({\mathrm{VaR}}_\alpha (X)\) is the smallest value x ∈ {0, …, N − 1} such that \({\Bbb P}[X \le x] \ge (1 - \alpha )\). To find \({\mathrm{VaR}}_\alpha (X)\) on a quantum computer, we define the function \(f_l(i) = 1\) if i ≤ l and \(f_l(i) = 0\) otherwise, where l ∈ {0, …, N − 1}. Applying \(F_l\), the operator corresponding to \(f_l\), to \(\left| \psi \right\rangle _n\left| 0 \right\rangle\) produces

$$\mathop {\sum}\limits_{i = l + 1}^{N - 1} \sqrt {p_i} \left| i \right\rangle _n\left| 0 \right\rangle + \mathop {\sum}\limits_{i = 0}^l \sqrt {p_i} \left| i \right\rangle _n\left| 1 \right\rangle .$$ (4)

The probability of measuring \(\left| 1 \right\rangle\) for the last qubit is \(\mathop {\sum}\nolimits_{i = 0}^l p_i = {\Bbb P}[X \le l]\). Therefore, with a bisection search over l we find the smallest level \(l_\alpha\) such that \({\Bbb P}[X \le l_\alpha ] \ge 1 - \alpha\) in at most n steps. The smallest level \(l_\alpha\) is equal to \({\mathrm{VaR}}_\alpha (X)\). This estimation of \({\mathrm{VaR}}_\alpha (X)\) has accuracy \(O(M^{-1})\), i.e. a quadratic speed-up compared to classical Monte Carlo methods (omitting the additional logarithmic complexity of the bisection search).
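Classically, the bisection search reads as follows; in the quantum algorithm each CDF evaluation would instead be one AE run with the comparator \(F_l\) (the distribution p below is an arbitrary illustration):

```python
import numpy as np

def var_level(p, alpha):
    """Bisection search for the smallest l with P[X <= l] >= 1 - alpha.

    Classically the CDF is read off directly; on the quantum side each
    probe of P[X <= l] is one AE run with the comparator F_l, so the
    search costs at most n = log2(N) AE runs.
    """
    N = len(p)
    cdf = np.cumsum(p)
    lo, hi = 0, N - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if cdf[mid] >= 1 - alpha:
            hi = mid            # mid already satisfies the condition
        else:
            lo = mid + 1        # the level must lie strictly above mid
    return lo

p = np.array([0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.04, 0.01])
l_alpha = var_level(p, alpha=0.05)   # 95% confidence level
```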

\({\mathrm{CVaR}}_\alpha (X)\) is the conditional expectation of X restricted to \(\{0, \ldots ,l_\alpha\}\), where we compute \(l_\alpha = {\mathrm{VaR}}_\alpha (X)\) as before. To estimate CVaR we apply the operator F corresponding to the function \(f(i) = \frac{i}{{l_\alpha }} \cdot f_{l_\alpha }(i)\) to \(\left| \psi \right\rangle _n\left| 0 \right\rangle\) to create

$$\begin{array}{r}\left( {\mathop {\sum}\limits_{i = l_\alpha + 1}^{N - 1} \sqrt {p_i} \left| i \right\rangle _n + \mathop {\sum}\limits_{i = 0}^{l_\alpha } \sqrt {1 - \frac{i}{{l_\alpha }}} \sqrt {p_i} \left| i \right\rangle _n} \right)\left| 0 \right\rangle \\ + \mathop {\sum}\limits_{i = 0}^{l_\alpha } \sqrt {\frac{i}{{l_\alpha }}} \sqrt {p_i} \left| i \right\rangle _n\left| 1 \right\rangle .\end{array}$$ (5)

The probability of measuring \(\left| 1 \right\rangle\) for the ancilla, approximated using AE, equals \(\mathop {\sum}\nolimits_{i = 0}^{l_\alpha } \frac{i}{{l_\alpha }}p_i\). However, since \(\mathop {\sum}\nolimits_{i = 0}^{l_\alpha } p_i\) does not sum to one but to \({\Bbb P}[X \le l_\alpha ]\), as evaluated during the VaR estimation, we must normalize the probability of measuring \(\left| 1 \right\rangle\) to get

$${\mathrm{CVaR}}_\alpha (X) = \frac{{l_\alpha }}{{{\Bbb P}[X \le l_\alpha ]}}\mathop {\sum}\limits_{i = 0}^{l_\alpha } \frac{i}{{l_\alpha }}p_i.$$ (6)

We also multiply by \(l_\alpha\); otherwise we would estimate \({\mathrm{CVaR}}_\alpha \left( {\frac{X}{{l_\alpha }}} \right)\) instead. Even though we replace \({\Bbb P}[X \le l_\alpha ]\) by an estimate, the error bound on CVaR, see Supplementary Information, still achieves a quadratic speed-up compared to classical Monte Carlo methods.
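The two normalizations, multiplying by \(l_\alpha\) and dividing by \({\Bbb P}[X \le l_\alpha]\), can be verified with a short classical sketch of Eq. (6) (toy distribution, for illustration only):

```python
import numpy as np

def cvar(p, l_alpha):
    """CVaR per Eq. (6): conditional expectation of X on {0, ..., l_alpha}.

    AE estimates sum_{i<=l} (i/l) p_i (the ancilla |1> probability);
    multiplying by l_alpha and dividing by P[X <= l_alpha] undoes the
    two normalizations discussed in the text.
    """
    i = np.arange(l_alpha + 1)
    tail_prob = p[: l_alpha + 1].sum()           # P[X <= l_alpha]
    a = float(np.dot(i / l_alpha, p[: l_alpha + 1]))  # AE's raw estimate
    return l_alpha * a / tail_prob

p = np.array([0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.04, 0.01])
c = cvar(p, l_alpha=5)
# Direct check: E[X | X <= 5] computed the classical way.
direct = float(np.dot(np.arange(6), p[:6])) / p[:6].sum()
assert abs(c - direct) < 1e-12
```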

We have shown how to choose f to calculate the expected value, variance, VaR and CVaR of a random variable X represented by \(\left| \psi \right\rangle _n\). However, we are usually interested in \({\Bbb E}[f(X)]\) for a more general function f:{0, …, N − 1} → {0, …, N′ − 1}, \(N\prime = 2^{n\prime }\), \(n\prime \in {\Bbb N}\), for instance representing some payoff or loss depending on X. In some cases, as shown later, we can adjust F accordingly. In other cases, we can apply a corresponding operator \(\left| i \right\rangle _n\left| 0 \right\rangle _{n\prime } \mapsto \left| i \right\rangle _n\left| {f(i)} \right\rangle _{n\prime }\) and use the previously introduced algorithms on the second register. There exist numerous quantum algorithms for arithmetic operations21,22,23,24,25 as well as tools to translate classical logic into quantum circuits.26,27

Quantum circuits

We now show how the previously discussed algorithms can be mapped to quantum circuits. We start with the construction of \(\left| \psi \right\rangle _n\), see Eq. (2), representing the probability distribution of X. In general, the best known upper bound for the number of gates required to create \(\left| \psi \right\rangle _n\) is \(O(2^n)\).28 However, approximations with polynomial complexity in n are possible for many distributions, e.g., log-concave distributions.29

Approximating \({\Bbb E}[X]\) using AE requires the operator F corresponding to f(x) = x/(N − 1), defined in Eq. (3). In general, representing F for the expected value or for the CVaR either requires an exponential \(O(2^n)\) number of gates or additional ancillas to pre-compute the (discretized) function f into qubits, using quantum arithmetic, before applying the rotation.30 The exact number of ancillas depends on the desired accuracy of the approximation of F. Another approach consists of piecewise polynomial approximations of f.31 However, this also implies a significant overhead in terms of the number of ancillas and gates. In the following, we show how to overcome these hurdles by approximating F without ancillas using polynomially many gates, at the cost of a lower—but still faster than classical—rate of convergence. Note that the operator required for estimating VaR is easier to construct and we can always achieve the optimal rate of convergence as discussed later in this section.

Our contribution rests on the fact that an operator mapping \(\left| x \right\rangle _n\left| 0 \right\rangle\) to \(\left| x \right\rangle _n\left( {{\mathrm{cos}}(\zeta (x))\left| 0 \right\rangle + {\mathrm{sin}}(\zeta (x))\left| 1 \right\rangle } \right)\), for a given polynomial \(\zeta (x) = \mathop {\sum}\nolimits_{j = 0}^k \zeta _jx^j\) of degree k, can be efficiently constructed using multi-controlled Y-rotations. Single qubit operations with n − 1 control qubits can be constructed exactly, e.g., using O(n) gates and O(n) ancillas or \(O(n^2)\) gates without any ancillas. They can also be approximated with accuracy \(\epsilon \; > \; 0\) using \(O(n\,{\mathrm{log}}(1{\mathrm{/}}\epsilon ))\) gates.32 For simplicity, we use O(n) gates and O(n) ancillas. Since expanding ζ in the binary variables representing x leads to at most \(n^k\) terms, the corresponding operator can be constructed using \(O(n^{k+1})\) gates and O(n) ancillas. An example for a second order polynomial is shown in Supplementary Information.
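The term count can be reproduced with a few lines of Python: expanding a degree-k polynomial in the binary digits of x and applying \(q_i^2 = q_i\) leaves one multilinear monomial, i.e. one multi-controlled rotation, per distinct subset of at most k qubits (the constant term \(\zeta_0\) is a single uncontrolled rotation and is not counted):

```python
from itertools import combinations_with_replacement

def multilinear_terms(n, k):
    """Monomials surviving the expansion of zeta(x) with x = sum_i 2^i q_i.

    Since q_i^2 = q_i for binary variables, every monomial reduces to a
    product of at most k *distinct* qubits. Each surviving monomial
    becomes one multi-controlled Y-rotation; with each (up to
    (k-1)-controlled) rotation costing O(n) gates, the total gate count
    is O(n^{k+1}) as stated in the text.
    """
    monomials = set()
    for degree in range(1, k + 1):
        for qubits in combinations_with_replacement(range(n), degree):
            monomials.add(tuple(sorted(set(qubits))))  # apply q_i^2 = q_i
    return monomials

# Degree-2 polynomial on n = 4 qubits: 4 single-qubit terms
# plus C(4, 2) = 6 two-qubit terms.
terms = multilinear_terms(4, 2)
assert len(terms) == 4 + 6
```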

For every analytic function f, there exists a sequence of polynomials such that the approximation error converges exponentially fast to zero with increasing degree of the polynomials.33 Thus, for simplicity, we assume that f is a polynomial of degree s.

If we can find a polynomial ζ(y) such that \({\mathrm{sin}}^2(\zeta (y)) = y\), then we can set y = f(x), and the previous discussion provides a way to construct the operator F. Since the expected value is linear, we may choose to estimate \({\Bbb E}\left[ {c\left( {f(X) - \frac{1}{2}} \right) + \frac{1}{2}} \right]\) instead of \({\Bbb E}[f(X)]\) for a parameter c ∈ (0, 1], and then map the result back to an estimator for \({\Bbb E}[f(X)]\). The rationale behind this choice is that \({\mathrm{sin}}^2\left( {y + \frac{\pi }{4}} \right) = y + \frac{1}{2} + O(y^3)\). Thus, we want to find ζ(y) such that \(c\left( {y - \frac{1}{2}} \right) + \frac{1}{2}\) is sufficiently well approximated by \({\mathrm{sin}}^2\left( {c\zeta (y) + \frac{\pi }{4}} \right)\). Setting the two terms equal and solving for ζ(y) leads to

$$\zeta (y) = \frac{1}{c}\left( {\sin ^{ - 1}\left( {\sqrt {c\left( {y - \frac{1}{2}} \right) + \frac{1}{2}} } \right) - \frac{\pi }{4}} \right),$$ (7)

and we choose ζ(y) as a Taylor approximation of Eq. (7) around y = 1/2. Note that Eq. (7) defines an odd function around y = 1/2, and thus the even terms in the Taylor series equal zero. The Taylor approximation of order 2u + 1 leads to a maximal approximation error for Eq. (7) of

$$\frac{{c^{2u + 3}}}{{2(2u + 3)}} + O(c^{2u + 5}),$$ (8)

for all y ∈ [0, 1], as shown in Supplementary Information.
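Numerically, one can check the bound of Eq. (8) by noting that Eq. (7) simplifies to \(\zeta(y) = \sin^{-1}(2c(y - 1/2))/(2c)\), so its Taylor series around y = 1/2 is the odd arcsine series. The sketch below (illustrative parameter choices) compares the resulting approximation error of \(c(y - 1/2) + 1/2\) against the leading term of Eq. (8):

```python
import math
import numpy as np

def zeta_taylor(y, c, u):
    """Order-(2u+1) Taylor polynomial of Eq. (7) around y = 1/2.

    Differentiating Eq. (7) gives zeta'(y) = 1/sqrt(1 - (2c(y-1/2))^2),
    hence zeta(y) = asin(2c(y - 1/2))/(2c), and the Taylor series is the
    odd arcsine series in w = 2c(y - 1/2) (even terms vanish, as stated).
    """
    w = 2 * c * (y - 0.5)
    s = 0.0
    for j in range(u + 1):
        coeff = math.comb(2 * j, j) / (4 ** j * (2 * j + 1))
        s += coeff * w ** (2 * j + 1)
    return s / (2 * c)

c, u = 0.25, 1
y = np.linspace(0, 1, 2001)
approx = np.sin(c * zeta_taylor(y, c, u) + np.pi / 4) ** 2
target = c * (y - 0.5) + 0.5
max_err = float(np.max(np.abs(approx - target)))
bound = c ** (2 * u + 3) / (2 * (2 * u + 3))  # leading term of Eq. (8)
assert max_err < bound
```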

Now we consider the resulting polynomial ζ(f(x)) of order s(2u + 1). The number of gates required to construct the corresponding circuit scales as \(O(n^{s(2u+1)+1})\). The smallest scenario of interest is s = 1 and u = 0, i.e., both f and ζ are linear functions, which leads to a circuit for F whose number of gates scales quadratically with the number of qubits n representing \(\left| \psi \right\rangle _n\).

Thus, using AE to estimate \({\Bbb E}\left[ {c\left( {f(X) - \frac{1}{2}} \right) + \frac{1}{2}} \right]\) leads to a maximal error

$$\frac{\pi }{M} + \frac{{c^{2u + 3}}}{{2(2u + 3)}} + O\left( {c^{2u + 5} + M^{ - 2}} \right),$$ (9)

where we ignore the higher order terms in the following. Since our estimation uses cf(X), we also need to analyze the scaled error \(c\epsilon\), where \(\epsilon > 0\) denotes the resulting estimation error for \({\Bbb E}[f(X)]\). Setting Eq. (9) equal to \(c\epsilon\) and reformulating leads to

$$c\epsilon - \frac{{c^{2u + 3}}}{{2(2u + 3)}} = \frac{\pi }{M}.$$ (10)

Maximizing the left-hand-side with respect to c, i.e. minimizing the number of required samples M to achieve a target error \(\epsilon\), results in \(c^ \ast = (2\epsilon )^{1/(2u + 2)}\). Plugging c* into Eq. (10) gives

$$2^{\frac{1}{{2u + 2}}}\left( {1 - \frac{1}{{2u + 3}}} \right)\epsilon ^{1 + \frac{1}{{2u + 2}}} = \frac{\pi }{M}.$$ (11)

Translating this into a rate of convergence for the estimation error \(\epsilon\) with respect to the number of samples M leads to \(\epsilon = O\left( {M^{ - \frac{{2u + 2}}{{2u + 3}}}} \right).\) For u = 0, we get \(O(M^{-2/3})\), which is already better than the classical convergence rate of \(O(M^{-1/2})\). For increasing u, the convergence rate quickly approaches the optimal rate of \(O(M^{-1})\).
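A quick numerical consistency check of Eqs. (10) and (11): solving Eq. (11) for M at two target errors confirms the stated rate \(\epsilon = O(M^{-(2u+2)/(2u+3)})\) (classical sketch, not part of the algorithm):

```python
import math

def samples_needed(eps, u):
    """Solve Eq. (11) for M given target error eps and Taylor order 2u+1,
    using the optimal scaling c* = (2 eps)^{1/(2u+2)} from Eq. (10)."""
    lhs = (2 ** (1 / (2 * u + 2))
           * (1 - 1 / (2 * u + 3))
           * eps ** (1 + 1 / (2 * u + 2)))
    return math.pi / lhs

# The error scales as M^{-rate} with rate = (2u+2)/(2u+3), so reducing
# eps by a factor of 10 must multiply M by 10^{1/rate}:
for u in (0, 1, 4):
    rate = (2 * u + 2) / (2 * u + 3)
    M1, M2 = samples_needed(1e-2, u), samples_needed(1e-3, u)
    assert abs(math.log10(M2 / M1) - 1 / rate) < 1e-9
```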

For the estimation of the expectation we exploited \({\mathrm{sin}}^2\left( {y + \frac{\pi }{4}} \right) \approx y + \frac{1}{2}\) for small |y|. For the variance we apply the same idea but use \({\mathrm{sin}}^2(y) \approx y^2\). We employ this approximation to estimate \({\Bbb E}\left[ {f(X)^2} \right]\) and then, together with the estimate of \({\Bbb E}\left[ {f(X)} \right]\), we evaluate \({\mathrm{Var}}\left( {f(X)} \right) = {\Bbb E}\left[ {f(X)^2} \right] - {\Bbb E}\left[ {f(X)} \right]^2\). The resulting convergence rate is again \(O\left( {M^{ - \frac{{2u + 2}}{{2u + 3}}}} \right)\).

The previous discussion shows how to build quantum circuits to estimate \({\Bbb E}[f(X)]\) and Var(f(X)) more efficiently than possible classically. In the following, we extend this to VaR and CVaR.

Consider the state \(\left| \psi \right\rangle _n\) corresponding to the random variable X on {0, …, N − 1} and a fixed l ∈ {0, …, N − 1}. To estimate VaR, we need an operator \(F_l\) that maps \(\left| x \right\rangle _n\left| 0 \right\rangle\) to \(\left| x \right\rangle _n\left| 1 \right\rangle\) if x ≤ l and to \(\left| x \right\rangle _n\left| 0 \right\rangle\) otherwise, for all x ∈ {0, …, N − 1}. Then, for the fixed l, AE can be used to approximate \({\Bbb P}[X \le l]\), as shown in Eq. (4). With (n + 1) ancillas, adder-circuits can be used to construct \(F_l\) using O(n) gates,23 and the resulting convergence rate is \(O(M^{-1})\). For a given level α, a bisection search can find the smallest \(l_\alpha\) such that \({\Bbb P}[X \le l_\alpha ] \ge 1 - \alpha\) in at most n steps, and we get \(l_\alpha = {\mathrm{VaR}}_\alpha (X)\).

To estimate the CVaR, we apply the circuit \(F_{l_\alpha }\) to an ancilla qubit and use this ancilla as a control for the operator F used to estimate the expected value, but with a different normalization, as shown in Eq. (5). Based on the previous discussion, it follows that AE can then be used to approximate \({\mathrm{CVaR}}_\alpha (X)\) with the same trade-off between circuit depth and convergence rate as for the expected value.

T-Bill on a single period binomial tree

Our first model consists of a zero coupon bond discounted at an interest rate r. See Supplementary Information for an introduction to the financial concepts used in this article. We seek the value of the bond today given that in the next time step the rate r may rise by δr. The value of the bond today, with face value \(V_F\), i.e. the amount of money the bond holder receives when the bond matures, is

$$V = \frac{{(1 - p)V_F}}{{1 + r + \delta r}} + \frac{{pV_F}}{{1 + r}} = (1 - p)V_{{\mathrm{low}}} + pV_{{\mathrm{high}}},$$ (12)

where p and (1 − p) denote the probabilities of a constant interest rate and a rise, respectively. This model is the first step of a binomial tree. Binomial trees can be used to price securities with a path dependency such as bonds with embedded options.34

The simple scenario in Eq. (12) could correspond to a market participant who bought a 1-year T-bill the day before a Federal Open Market Committee announcement and expects a δr = 0.25%-point increase of the Federal Funds Rate with a (1 − p) = 70% probability and no change with a p = 30% probability.
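As a sanity check of Eq. (12) and of the mapping used in the experiment (the face value and today's rate r below are illustrative assumptions; the text only fixes p and δr), the rescaled bond value equals p exactly:

```python
# Scenario of the text: 70% chance of a 0.25%-point rate rise, 30%
# chance of no change. V_F = $100 and r = 2% are assumed values for
# illustration only.
p, r, dr, V_F = 0.30, 0.02, 0.0025, 100.0

V_low = V_F / (1 + r + dr)    # value if the rate rises
V_high = V_F / (1 + r)        # value if the rate stays put
V = (1 - p) * V_low + p * V_high   # Eq. (12)

# After mapping V_low -> $0 and V_high -> $1 as in the experiment,
# the rescaled value is exactly the probability p estimated by AE:
V_mapped = (V - V_low) / (V_high - V_low)
assert abs(V_mapped - p) < 1e-9
```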

We show how to calculate the value of the investor's T-bill by running AE on the IBM Q Experience and mapping V to [0, 1] such that \(V_{{\mathrm{low}}}\) and \(V_{{\mathrm{high}}}\) correspond to $0 and $1, respectively.

Here, we only need a single qubit to represent the uncertainty and the objective and we have \({\cal A} = R_y(\theta _p)\), a Y-rotation of angle \(\theta _p = 2\,{\mathrm{sin}}^{ - 1}\left( {\sqrt p } \right)\), and thus, \({\cal A}\left| 0 \right\rangle = \sqrt {1 - p} \left| 0 \right\rangle + \sqrt p \left| 1 \right\rangle\).

AE requires applying powers \(Q^j\) of an operator Q derived from \({\cal A}\). For the single qubit case \(Q = R_y(2\theta _p)\). This implies \(Q^j = R_y(2j\theta _p)\), which allows us to construct the AE circuit efficiently to approximate the parameter \(p = {\Bbb E}[X] = 30\%\).
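This single-qubit structure is easy to verify with NumPy (a classical sketch): Y-rotations about the same axis compose by adding angles, so each power \(Q^j\) costs a single rotation rather than j repeated gates:

```python
import numpy as np

def ry(theta):
    """Single-qubit Y-rotation matrix."""
    return np.array([[np.cos(theta / 2), -np.sin(theta / 2)],
                     [np.sin(theta / 2),  np.cos(theta / 2)]])

p = 0.30
theta_p = 2 * np.arcsin(np.sqrt(p))

# A|0> = sqrt(1-p)|0> + sqrt(p)|1>: the |1> amplitude squared is p.
state = ry(theta_p) @ np.array([1.0, 0.0])
assert abs(state[1] ** 2 - p) < 1e-12

# Q = Ry(2 theta_p), hence Q^j = Ry(2 j theta_p):
Q = ry(2 * theta_p)
j = 5
assert np.allclose(np.linalg.matrix_power(Q, j), ry(2 * j * theta_p))
```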

Although a single period binomial tree is a very simple model, it is straightforward to extend it to multi-period multinomial trees with path-dependent assets. Thus, it represents the smallest building block for interesting scenarios of arbitrary complexity.

We run several experiments on real quantum hardware in which we apply AE with different numbers of evaluation qubits m = 1, 2, 3, 4, corresponding to M = 2, 4, 8, 16 samples, respectively, to estimate \(p = {\Bbb E}[X]\). This requires at most five qubits and can be run on the IBM Q 5 Yorktown (ibmqx2) quantum processor.35 Since the success probability of AE is larger than \(8/\pi^2\), we would in principle only need, for instance, 24 repetitions to achieve a success probability of 99.75%.36 However, current quantum hardware introduces additional errors. Therefore, we repeat every circuit 8192 times (i.e., the maximal number of shots in the IBM Q Experience) to get a reliable estimate. We distinguish between the number of repetitions required by the imperfections of the hardware and the convergence rate of our algorithm, and thus consider the 8192 repetitions as a constant overhead, which we ignore when comparing the quantum and classical algorithms. The quantum circuit for m = 3 compiled to the IBM Q 5 quantum processor is illustrated in Fig. 1. The connectivity of the IBM Q 5 quantum processor, shown in Supplementary Information, requires swapping two qubits in the middle of the circuit between the application of the controlled Q operators and the inverse Quantum Fourier Transform. The results of the algorithm are illustrated in Fig. 2a: the most frequent estimator approaches the real value p, and the resolution of the algorithm increases with m. The quantum algorithm presented in this paper already outperforms the Monte Carlo method for M = 16 samples (i.e. m = 4 evaluation qubits), the largest scenario we performed on the real hardware, see Fig. 2b. The details of this convergence analysis are discussed in Supplementary Information.

Fig. 1 AE circuit for the T-Bill problem with m = 3. Dashed boxes highlight, from left to right: the controlled Q operators, the swap of two qubits, and the inverse Quantum Fourier Transform (QFT). The swap is needed to overcome the limited connectivity of the chip. \(U_2\) and \(U_3\), formally introduced in Supplementary Information, indicate single qubit rotations whose parameters are omitted. Note that the circuit could be further optimized, e.g., the adjoint CNOT gates at the beginning of the SWAP would cancel out, but we kept them for illustration

Fig. 2 a Results of running AE on real hardware for m = 1, …, 4 with 8192 shots each. The error bars on the histograms, of order \(1/\sqrt{8192}\), are too small to be shown. The green bars indicate the probability of the most frequent estimate and the blue bars the probability of the other estimates. The red dashed lines indicate the target value of 30%. The gray dashed lines show the probability of the second most frequent value to highlight the resulting contrast. The possible values are not equally distributed on the x-axis, since AE first returns a number y ∈ {0, …, M − 1} that is then classically mapped to \(\tilde a = {\mathrm{sin}}^2\left( {\frac{{y\pi }}{M}} \right)\). b Comparison of the convergence of the error of Monte Carlo simulation and our algorithm with respect to the number of samples M. Although the quantum algorithm starts with a larger estimation error, for M ≥ 16 (m ≥ 4) the better convergence rate of the quantum algorithm takes over and the error stays below the Monte Carlo results. The blue solid line shows the error for our real experiments using up to m = 4 evaluation qubits. The blue dashed line shows how the estimation error would further decrease for experiments with m = 5, 6 evaluation qubits, respectively

Two-asset portfolio

We now illustrate how to use our algorithm to calculate the daily risk in a portfolio made up of 1-year US Treasury bills and 2-year US Treasury notes with face values \(V_{F_1}\) and \(V_{F_2}\), respectively. We chose a simple portfolio in order to put the focus on the AE algorithm applied to VaR. The portfolio is worth

$$V(r_1,r_2) = \frac{{V_{F_1}}}{{1 + r_1}} + \mathop {\sum}\limits_{i = 1}^4 \frac{{r_cV_{F_2}}}{{(1 + r_2{\mathrm{/}}2)^i}} + \frac{{V_{F_2}}}{{(1 + r_2{\mathrm{/}}2)^4}},$$ (13)

where \(r_c\) is the annual coupon rate, i.e. the annual interest payment that the bondholder receives divided by the face value of the bond, paid every 6 months by the 2-year treasury note, and \(r_1\) and \(r_2\) are the yields to maturity of the 1-year bill and 2-year note, respectively.
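Eq. (13) can be evaluated directly; with the parameters used later in the text (\(V_{F_1} = V_{F_2} = \$100\), \(r_c = 2.5\%\), \(r_{1,0} = 1.8\%\), \(r_{2,0} = 2.25\%\)) the portfolio is worth roughly $203.6 today. A sketch:

```python
def portfolio_value(r1, r2, V_F1=100.0, V_F2=100.0, r_c=0.025):
    """Two-asset portfolio value of Eq. (13): a 1-year zero-coupon bill
    plus a 2-year note paying a coupon of r_c * V_F2 every 6 months
    (four semi-annual periods, discounted at r2/2 per period)."""
    bill = V_F1 / (1 + r1)
    coupons = sum(r_c * V_F2 / (1 + r2 / 2) ** i for i in range(1, 5))
    principal = V_F2 / (1 + r2 / 2) ** 4
    return bill + coupons + principal

# Today's yields as given in the text.
V0 = portfolio_value(0.018, 0.0225)
```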

US Treasuries are usually assumed to be default free.37 The cash-flows are thus known ex ante and the changes in the interest rates are the primary risk factors. Therefore, a proper understanding of the yield curve, i.e. the yield of bonds versus their maturity, suffices to model the risk in this portfolio. We use the Constant Maturity Treasury (CMT) rates to model the uncertainty in \(r_1\) and \(r_2\). To calculate the daily risk of our portfolio we study the difference in the CMT rates from 1 day to the next. These differences are highly correlated (as are the initial CMT rates), see Fig. 3a, making it unnecessary to model each of them individually when simulating more complex portfolios. A principal component analysis reveals that the first three principal components, named shift, twist and butterfly, account for 96% of the variance,38,39 see Fig. 3b, c. Therefore, when modeling a portfolio of US Treasury securities it suffices to study the distribution of these three factors. This dimensionality reduction also lowers the amount of resources needed by our quantum algorithm.

Fig. 3 Daily change in the CMT rates. a Correlation matrix. The high correlation between the rates can be exploited to reduce the dimension of the problem. b Shift, Twist, and Butterfly components expressed in terms of the original constant maturity treasury rates. c Eigenvalues of the principal components. The numbers show the cumulative explained variance. d Historical constant maturity treasury rates (1-year against 2-years to maturity) as well as the resulting principal components: shift (longer vector), and twist (shorter vector). e 8-bin histogram of historical shift data (bars) as well as fitted distribution (dashed line). f 4-bin histogram of historical twist data (bars) as well as fitted distribution (dashed line). In both cases the labels show the quantum state that will occur with the corresponding probability. g, h show the quantum circuits used to load the distributions of e, f, respectively, into the quantum computer

To study the daily risk in the portfolio we write \(r_i = r_{i,0} + \delta r_i\) for i = 1, 2, where \(r_{i,0}\) is the yield to maturity observed today and the random variable \(\delta r_i\) follows the historical distribution of the 1-day changes in the CMT rate with maturity i. For our demonstration we set \(V_{F_1} = V_{F_2} = \$ 100\), \(r_{1,0} = 1.8\%\), \(r_{2,0} = 2.25\%\), and \(r_c = 2.5\%\) in Eq. (13). We perform a principal component analysis of \(\delta r_1\) and \(\delta r_2\) and retain only the shift S and twist T components. Figure 3d illustrates the historical data as well as S and T, related to \(\delta r_i\) by

$$\left( {\begin{array}{*{20}{c}} {\delta r_1} \\ {\delta r_2} \end{array}} \right) = {\boldsymbol{W}}\left( {\begin{array}{*{20}{c}} S \\ T \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {0.703} & { - 0.711} \\ {0.711} & {0.703} \end{array}} \right)\left( {\begin{array}{*{20}{c}} S \\ T \end{array}} \right).$$ (14)

The correlation coefficient between shift and twist is −1%. We thus assume them to be independent and fit discrete distributions to each separately. We retained only the first two principal components to illustrate the use of principal component analysis despite the fact that, in this example, there is no dimensionality reduction. Furthermore, this allows us to simulate our algorithm in a reasonable time on classical hardware by keeping the number of required qubits low. We expect that all three components would be retained when running this algorithm on real quantum hardware for larger portfolios.

To model the uncertainty in the quantum computer we use three qubits, denoted by \(q_0, q_1, q_2\), to represent the distribution of S, and two, denoted by \(q_3, q_4\), for T. Following Eq. (2), the probability distributions are encoded by the states \(\left| {\psi _S} \right\rangle = \mathop {\sum}\nolimits_{i = 0}^7 \sqrt {p_{i,S}} \left| i \right\rangle _3\) and \(\left| {\psi _T} \right\rangle = \mathop {\sum}\nolimits_{i = 0}^3 \sqrt {p_{i,T}} \left| i \right\rangle _2\) for S and T, which can thus take eight and four different values, respectively. We use more qubits for S than for T since the shift explains a larger part of the variance. Additional qubits may be used to represent the probability distributions at a higher resolution. The qubits naturally represent integers via binary encoding and we apply the affine mappings

$$S = 0.0626{\kern 1pt} x - 0.2188,$$ (15)

$$T = 0.0250{\kern 1pt} y - 0.0375.$$ (16)

Here x ∈ {0, …, 7} and y ∈ {0, …, 3} denote the integer representations of S and T, respectively. Given the almost perfect symmetry of the historical data we fit symmetric distributions to it. The operator \({\cal R}\) that we define prepares a quantum state \({\cal R}\left| 0 \right\rangle _5\), illustrated by the dots in Fig. 3e, f, that represents the distributions of S and T, up to the aforementioned affine mapping.

Next, we show how to construct the operator F to translate the random variables x and y into a portfolio value. Equations (13) through (16) allow us to define the portfolio value V in terms of x and y, instead of r 1 and r 2 . For simplicity, we use a first order approximation

$$\tilde f(x,y) = 203.5170 - 13.1896x - 1.8175y$$ (17)

of V around the mid points x = 3.5 and y = 1.5. From a financial perspective, the first order approximation \(\tilde f\) of V corresponds to studying the portfolio from the point of view of its duration.40 Higher order expansions, e.g. convexity, could be considered at the cost of increased circuit depth.

To map the approximated value of the portfolio \(\tilde f\) to a function f with target set [0, 1] we compute \(f = \left( {\tilde f - \tilde f_{{\rm min}}} \right){\mathrm{/}}\left( {\tilde f_{{\rm max}} - \tilde f_{{\rm min}}} \right)\), where \(\tilde f_{{\rm min}} = \tilde f(7,3)\) and \(\tilde f_{{\rm max}} = \tilde f(0,0)\), i.e., the minimum and maximum values \(\tilde f\) can take for the considered values of x ∈ {0, …, 7} and y ∈ {0, …, 3}. This leads to

$$f(x,y) = 1 - 0.1349x - 0.0186y.$$ (18)
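The normalization can be checked numerically; the coefficients of Eq. (18) follow from Eq. (17) to the printed precision (a short classical sketch):

```python
def f_tilde(x, y):
    """First-order approximation of the portfolio value, Eq. (17)."""
    return 203.5170 - 13.1896 * x - 1.8175 * y

# Normalize to [0, 1] with f_min = f~(7, 3) and f_max = f~(0, 0), as in
# the text, to obtain Eq. (18).
f_min, f_max = f_tilde(7, 3), f_tilde(0, 0)

def f(x, y):
    return (f_tilde(x, y) - f_min) / (f_max - f_min)

assert abs(f(7, 3)) < 1e-12 and abs(f(0, 0) - 1.0) < 1e-12
# Matches the printed coefficients of Eq. (18):
assert abs(f(1, 0) - (1 - 0.1349)) < 1e-3
assert abs(f(0, 1) - (1 - 0.0186)) < 1e-3
```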

Polynomial approximations allow us to construct an operator F corresponding to f for a given scaling parameter c ∈ (0, 1].

We simulate the two-asset portfolio assuming an ideal quantum computer for different numbers m of sampling qubits to show the behavior of the accuracy and convergence rate. We repeat this task twice, once for a processor with all-to-all connectivity and once for a processor with a connectivity corresponding to the IBM Q 20 Tokyo chip (https://quantumexperience.ng.bluemix.net/qx/devices, accessed 2018-05-22), see Supplementary Information. This highlights the overhead imposed by a realistic chip connectivity. For \(M = 2^m\) samples, we need a total of m + 12 qubits for the expected value and VaR, and m + 13 qubits for CVaR. Five of these qubits are used to represent the distribution of the interest rate changes, one qubit is needed to create the state in Eq. (3) used by AE, and six ancillas are needed to implement the controlled Q operator. For CVaR we need one more ancilla for the comparison to the level l. Once the shift and twist distributions are loaded into the quantum computer, using the circuits shown in Fig. 3g, h, we apply the operator F to create the state defined in Eq. (3).

We compare the quantum estimation of risk to the exact 95% VaR level of $0.288. Based on Eq. (18), this classical VaR corresponds to 0.093, shown by the vertical line in Fig. 4. The quantum estimation of risk rapidly approaches this value as m is increased, see Fig. 4. With m = 5 sample qubits the difference between the classical and quantum estimates is 9%. The number of CNOT gates needed to calculate VaR approximately doubles each time a sample qubit is added, see Table 1, i.e. it scales as O(M) with a resulting error of \(O(M^{-1})\). We find that the connectivity of the IBM Q 20 Tokyo chip increases the number of CNOT gates by a factor of 2.3 compared to a chip with all-to-all connectivity.

Fig. 4 VaR estimated through a simulation of a perfect quantum computer. As the number of sample qubits m is increased the quantum estimated VaR approaches the classical value indicated by the vertical blue line. The dashed lines are intended as guides to the eye. The stars indicate the most probable values

Table 1 Summary of the number of CNOT gates to estimate VaR as a function of m for a processor architecture featuring an all-to-all qubit connectivity and an architecture with a qubit connectivity corresponding to the IBM Q 20 Tokyo chip with 20 qubits

Computing the expected value or risk measures for the two-asset portfolio requires a long circuit. However, it suffices for AE to return the correct state with the highest probability, i.e. measurements do not need to yield this state with 100% probability. We now run simulations with errors to investigate how much imperfection can be tolerated before the correct state can no longer be identified.

We study the effect of two types of errors: energy relaxation and cross-talk, where the latter is only considered for two-qubit gates (CNOT gates). We believe this captures the leading error sources. Errors and gate times for single qubit gates are in general an order of magnitude lower than for two-qubit gates.41,42,43 Furthermore, our algorithm requires the same order of magnitude in the number of single and two-qubit gates. Energy relaxation is simulated using a relaxation rate γ such that after a time t each qubit has a probability 1 − exp(−γt) of relaxing to \(\left| 0 \right\rangle\).44 We set the duration of the CNOT gates to 100 ns and assume that the single qubit gates are done instantly and are thus exempt from errors. We also include qubit–qubit cross-talk in our simulation by adding a ZZ error-term in the generator of the CNOT gate

$${\mathrm{exp}}\{ - i\pi (ZX + \alpha ZZ){\mathrm{/}}4\} .$$ (19)

Typical cross-resonance45 CNOT gate rates are of the order of 5 MHz whilst cross-talk on IBM Q chips is of the order of −100 kHz.43 We thus estimate a reasonable value of α, i.e. the strength of the cross-talk, to be −2% and simulate its effect over the range [−3%, 0%].
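The noisy CNOT of Eq. (19) can be constructed explicitly; since the generator is Hermitian, the matrix exponential reduces to an eigendecomposition (a NumPy sketch; the trace-overlap metric below is an illustrative choice, not the error measure of the simulations reported here):

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
ZX, ZZ = np.kron(Z, X), np.kron(Z, Z)

def noisy_cnot(alpha):
    """exp{-i pi (ZX + alpha ZZ)/4} of Eq. (19).

    The exponent is Hermitian (times -i), so expm is computed via eigh.
    For alpha = 0 this is the ideal cross-resonance gate, locally
    equivalent to a CNOT.
    """
    H = np.pi * (ZX + alpha * ZZ) / 4
    w, v = np.linalg.eigh(H)
    return v @ np.diag(np.exp(-1j * w)) @ v.conj().T

U_ideal = noisy_cnot(0.0)
U_noisy = noisy_cnot(-0.02)   # alpha = -2%, the estimate in the text

# Normalized trace overlap |Tr(U_ideal^dag U_noisy)| / 4 as a simple
# closeness measure; it stays near 1 for small cross-talk.
overlap = abs(np.trace(U_ideal.conj().T @ U_noisy)) / 4
```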

We illustrate the effect of these errors by computing the expected value of the portfolio. Since the distributions are symmetric around zero and mapped to the interval [0, 1] we expect a value of 0.5, i.e. from one day to the next we do not expect a change in the portfolio value. This simulation is run with m = 2 sample qubits since this suffices to exactly estimate 0.5. The algorithm is successful if it manages to identify 0.5 with a probability >50%. With our error model this is achieved for relaxation rates \(\gamma < 10^4\,{\mathrm{s}}^{-1}\) and cross-talk strength |α| < 1%, see Fig. 5a–c, despite the 4383 gates needed. A generous estimation of current hardware capabilities with \(\gamma = 10^4\,{\mathrm{s}}^{-1}\) (loosely based on \(T_1 = 100\,\mu {\mathrm{s}}\)) and α = −2%, shown as red lines in Fig. 5, indicates that this simulation may be possible in the near future as long as other error sources (such as measurement error and unitary errors resulting from improper gate calibrations) are kept under control.