Classical computation

The one- and two-electron integrals defining the fermionic Hamiltonian in Eq. (3) are obtained from a preliminary HF calculation that is assumed to be easily feasible on a classical computer. In non-relativistic theory the one-electron integrals are given by

$${h}_{pq}=\int {\rm{d}}{\bf{r}}\,{\phi }_{p}^{* }({\bf{r}})\left(-\frac{1}{2}{\nabla }_{{\bf{r}}}^{2}+V({\bf{r}})\right){\phi }_{q}({\bf{r}}),$$ (17)

where \(V({\bf{r}})\) is the electron-nuclear attraction potential from fixed nuclei at positions \({{\bf{R}}}_{i}\). The two-electron integrals are given by,

$${g}_{pqrs}=\iint {\rm{d}}{{\bf{r}}}_{1}{\rm{d}}{{\bf{r}}}_{2}\frac{{\phi }_{p}^{* }({{\bf{r}}}_{1}){\phi }_{q}({{\bf{r}}}_{1}){\phi }_{r}^{* }({{\bf{r}}}_{2}){\phi }_{s}({{\bf{r}}}_{2})}{{r}_{12}}.$$ (18)

For simplicity, we used a finite difference technique to compute the matrix representations of the perturbations corresponding to a change in nuclear coordinates and to an external electric field,

$$\frac{\partial \hat{H}}{\partial \lambda }\approx \frac{\hat{H}(\lambda +\delta \lambda /2)-\hat{H}(\lambda -\delta \lambda /2)}{\delta \lambda },$$ (19)

and

$$\frac{{\partial }^{2}\hat{H}}{\partial {\lambda }^{2}}\approx \frac{\hat{H}(\lambda +\delta \lambda )+\hat{H}(\lambda -\delta \lambda )-2\hat{H}(\lambda )}{\delta {\lambda }^{2}},$$ (20)

where \(\delta \lambda =0.001\) corresponds to a small change in \(\lambda\). The above (perturbed) quantum chemical Hamiltonians have been determined within the Dirac program51 and transformed into qubit Hamiltonians using the OpenFermion52 package. This transformation uses the newly developed, freely available53 OpenFermion-Dirac interface, which allows for the simulation of relativistic quantum chemistry calculations on a quantum computer. While a finite difference technique was sufficient for the present purpose, such schemes are sensitive to numerical noise and have a high computational cost when applied to larger molecular systems. A discussion of the analytical calculation of energy derivatives can be found in the Supplementary methods.
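As a concrete illustration, the finite difference formulas of Eqs. (19) and (20) can be checked on a toy parametrized Hamiltonian. In this sketch the \(2\times 2\) matrix \(H(\lambda)\) is invented for illustration (it is not one of the chemical Hamiltonians discussed above); for its polynomial entries the central differences recover the analytic derivatives:

```python
import numpy as np

def hamiltonian(lam):
    # Toy parametrized Hamiltonian standing in for H(lambda).
    return np.array([[lam**2, lam],
                     [lam,    2.0 - lam**2]])

def dH(lam, dlam=0.001):
    # Eq. (19): symmetric first difference.
    return (hamiltonian(lam + dlam / 2) - hamiltonian(lam - dlam / 2)) / dlam

def d2H(lam, dlam=0.001):
    # Eq. (20): central second difference.
    return (hamiltonian(lam + dlam) + hamiltonian(lam - dlam)
            - 2 * hamiltonian(lam)) / dlam**2

# Analytic derivatives for comparison: dH/dlam = [[2*lam, 1], [1, -2*lam]],
# d2H/dlam^2 = [[2, 0], [0, -2]].
print(np.round(dH(0.5), 8))
print(np.round(d2H(0.5), 8))
```

For quadratic matrix elements both stencils are exact up to floating-point error; for general matrix elements the truncation error is \(O(\delta\lambda^2)\), which is why small \(\delta\lambda\) trades accuracy against numerical noise.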

Approximate bound calculation details

In this section we detail our method for calculating the approximate bounds in Table 1. We first estimate the error \(\epsilon\) (Table 2, first row); details of the non-trivial calculations for the PPE and ETA methods are given in Sections IV H and IV M, respectively. Separately, we calculate the time cost by multiplying the number of circuits, the number of repetitions of said circuits (\({n}_{{\rm{m}}}\), \({N}_{{\rm{m}}}\), and \(K\), depending on the method), and the time cost of each circuit (Table 2, second row). (This assumes access to only a single quantum processor, and can in some situations be improved by simultaneous measurement of commuting terms, as discussed in Section IV C.) We then choose the scaling of the number of circuit repetitions as a function of the other metaparameters to hold \(\epsilon\) constant (Table 2, third row). We finally substitute the lower and upper bounds for these metaparameters in terms of the system size, as stated throughout the remaining sections. For reference, we summarize these bounds in Table 3.

Table 2 Intermediate steps for the approximate calculation of the computational complexity as a function of the system size given in Table 1.

Table 3 Summary of approximations used to derive the scaling laws with \({N}_{{\rm{sys}}}\) in Table 1.

Quantum simulation of the electronic structure problem — preliminaries

To represent the electronic structure problem on a quantum computer, we need to rewrite the fermionic creation and annihilation operators \({\hat{c}}_{i}^{\dagger }\), \({\hat{c}}_{i}\) in terms of qubit operators (e.g., elements of the Pauli basis \({{\mathbb{P}}}^{N}={\{I,X,Y,Z\}}^{\otimes N}\)). This is necessarily a non-local mapping, as local fermionic operators anti-commute, while qubit operators commute. A variety of such transformations are known, including the Jordan–Wigner,54,55 and Bravyi–Kitaev56 transformations, and more recent developments.57,58,59,60,61,62,63,64
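A minimal numerical sketch of the Jordan–Wigner transformation (two spin-orbitals, dense matrices, purely illustrative) shows both ingredients of such a mapping: the local \((X+iY)/2\) ladder operator and the non-local \(Z\) string that restores fermionic anti-commutation:

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0]).astype(complex)

def kron(*ops):
    out = np.eye(1, dtype=complex)
    for op in ops:
        out = np.kron(out, op)
    return out

sigma_minus = (X + 1j * Y) / 2    # |0><1|, with |0> as the vacuum
c0 = kron(sigma_minus, I2)        # c_0: no Z string needed
c1 = kron(Z, sigma_minus)         # c_1: Z string on qubit 0

def anti(a, b):
    return a @ b + b @ a

# Canonical fermionic anticommutation relations:
assert np.allclose(anti(c0, c0.conj().T), np.eye(4))       # {c_0, c_0^dag} = 1
assert np.allclose(anti(c1, c1.conj().T), np.eye(4))
assert np.allclose(anti(c0, c1), np.zeros((4, 4)))          # {c_0, c_1} = 0
assert np.allclose(anti(c0, c1.conj().T), np.zeros((4, 4)))
print("fermionic anticommutation relations verified")
```

Without the \(Z\) string on \(c_1\), the two operators would commute rather than anti-commute, which is the non-locality referred to above.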

After a suitable qubit representation has been found, we need to design quantum circuits to implement unitary transformations. Such circuits must be constructed from an appropriate set of primitive units, known as a (universal) gate-set. For example, one might choose the set of all single-qubit operators, and a two-qubit entangling gate such as the controlled-NOT, C-Phase, or iSWAP gates.65 One can then build the unitary operators \({e}^{i\theta \hat{P}}\) for \(\hat{P}\in {{\mathbb{P}}}^{N}\) exactly (in the absence of experimental noise) with a number of units and time linear in the size of \(\hat{P}\).66 (Here, size refers to the number of non-identity tensor factors of \(\hat{P}\).) Optimizing the scheduling and size of these circuits is an open area of research, but many improvements are already known.67
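As an example of such a construction, the standard circuit identity for \({e}^{i\theta \hat{P}}\) with \(\hat{P}=Z\otimes Z\) (a single-qubit \(Z\) rotation conjugated by CNOTs; a sketch with dense matrices, not tied to any particular hardware gate-set) can be verified directly:

```python
import numpy as np

theta = 0.37
Z = np.diag([1.0, -1.0])
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

# Single-qubit rotation exp(i theta Z) on the target qubit.
Rz = np.diag(np.exp(1j * theta * np.diag(Z)))
circuit = CNOT @ np.kron(np.eye(2), Rz) @ CNOT

# Direct exponential: Z (x) Z is diagonal, so exponentiate its diagonal.
ZZ = np.kron(Z, Z)
target = np.diag(np.exp(1j * theta * np.diag(ZZ)))

assert np.allclose(circuit, target)
print("exp(i theta Z x Z) built from two CNOTs and one Rz")
```

For a weight-\(w\) Pauli string the same pattern uses a ladder of \(w-1\) CNOTs on each side of the rotation, which is the linear-in-size scaling quoted above.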

Transformations of a quantum state must be unitary, which is an issue if one wishes to prepare, e.g., \(\partial \hat{H}/\partial \lambda \left|{\Psi }_{0}\right\rangle\) on a quantum register (\(\partial \hat{H}/\partial \lambda\) is almost always not unitary). To circumvent this, one must decompose \(\partial \hat{H}/\partial \lambda\) as a sum of \({N}_{{\rm{U}}}\) unitary operators, perform a separate circuit for each unitary operator, and then combine the resulting measurements as appropriate. Such a decomposition may always be performed using the Pauli group (although better choices may exist). Each such decomposition incurs a multiplicative cost of \({N}_{{\rm{U}}}\) to the computation time, and further increases the error in any final result by at worst a factor of \({N}_{{\rm{U}}}^{1/2}\). This makes the computational complexities reported in Table 2 highly dependent on \({N}_{{\rm{U}}}\). The scaling of \({N}_{{\rm{U}}}\) with the system size depends strongly on the operator to be decomposed and the choice of decomposition. When approximating this in Table 3 we use a range between \(O({N}_{{\rm{sys}}})\) (which would suffice for a local potential in a lattice model) and \(O({N}_{{\rm{sys}}}^{4})\) (for a two-body interaction).
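A sketch of such a unitary decomposition over the Pauli group, applied to a toy Hermitian operator standing in for \(\partial \hat{H}/\partial \lambda\) (the operator and the two-qubit system size here are invented for illustration): the coefficients follow from the trace inner product, and \(N_{\rm U}\) is simply the number of non-zero terms.

```python
import numpy as np
from itertools import product

paulis = {'I': np.eye(2, dtype=complex),
          'X': np.array([[0, 1], [1, 0]], dtype=complex),
          'Y': np.array([[0, -1j], [1j, 0]]),
          'Z': np.diag([1.0, -1.0]).astype(complex)}

def pauli_matrix(label):
    P = np.eye(1, dtype=complex)
    for l in label:
        P = np.kron(P, paulis[l])
    return P

def pauli_decompose(D):
    # c_P = Tr(P D) / 2^n; each P is unitary, so this is a unitary decomposition.
    n = int(np.log2(D.shape[0]))
    coeffs = {}
    for labels in product('IXYZ', repeat=n):
        label = ''.join(labels)
        c = np.trace(pauli_matrix(label) @ D) / 2**n
        if abs(c) > 1e-12:
            coeffs[label] = c
    return coeffs

# Toy Hermitian operator standing in for dH/dlambda on two qubits.
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
D = (A + A.conj().T) / 2

coeffs = pauli_decompose(D)
recon = sum(c * pauli_matrix(lbl) for lbl, c in coeffs.items())
assert np.allclose(recon, D)
print("N_U =", len(coeffs), "Pauli terms")
```

For a generic dense operator all \(4^{N}\) coefficients are non-zero, which is why structured decompositions (locality, two-body structure) are needed to keep \(N_{\rm U}\) polynomial.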

To interface with the outside world, a quantum register needs to be appropriately measured. Similarly to unitary transformations, one builds these measurements from primitive operations, typically the measurement of a single qubit in the \(Z\) basis. This may be performed by a prior unitary rotation, or by decomposing an operator \(\hat{O}\) into \({N}_{{\rm{T}}}\) Hermitian terms \({\hat{O}}_{i}\) (which may be measured separately). \({N}_{{\rm{T}}}\) differs from \({N}_{{\rm{U}}}\) defined above, as the first is for a Hermitian decomposition of a derivative operator and the second is for a unitary decomposition. Without a priori knowledge that the system is near an eigenstate of an operator \(\hat{O}\) to be measured, one must perform \({n}_{{\rm{m}}}\) repeated measurements of each \({\hat{O}}_{i}\) to estimate \(\langle \hat{O}\rangle\) to an accuracy \(\propto {n}_{{\rm{m}}}^{-1/2}{N}_{{\rm{T}}}^{1/2}\). As such measurement is destructive, this process requires \({n}_{{\rm{m}}}{N}_{{\rm{T}}}\) preparations and pre-rotations on top of the measurement time. This makes the computational costs reported in Table 2 highly dependent on \({N}_{{\rm{T}}}\). The scaling of \({N}_{{\rm{T}}}\) with the system size \({N}_{{\rm{sys}}}\) is highly dependent on the operator \(\hat{O}\) to be measured and the choice of measurements to be made.11,33 In Table 3, we assume a range between \(O({N}_{{\rm{sys}}})\) and \(O({N}_{{\rm{sys}}}^{4})\) to calculate the approximate computation cost. This is a slight upper bound, as terms can be measured simultaneously if they commute, and error bounds may be tightened by accounting for the covariance between non-commuting terms.11 The details on how this would improve the asymptotic scaling are still lacking in the literature, however, and so we leave this as an obvious target for future work.
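The \({n}_{{\rm{m}}}^{-1/2}\) statistical scaling can be illustrated with a simple sampling simulation; the single-qubit state and its outcome probability below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)
p0 = 0.8                       # probability of the +1 outcome (toy state)
exact = 2 * p0 - 1             # <Z> = P(+1) - P(-1)

def estimate(n_m):
    # n_m destructive single-shot measurements, each returning +/-1.
    outcomes = rng.choice([1.0, -1.0], size=n_m, p=[p0, 1 - p0])
    return outcomes.mean()

for n_m in (100, 10000):
    print(n_m, abs(estimate(n_m) - exact))
```

Increasing the shot count by a factor of 100 reduces the typical error by roughly a factor of 10, and each of the \(N_{\rm T}\) terms of \(\hat{O}\) requires its own such set of repetitions.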

Throughout this text we require the ability to measure a phase \({e}^{i\phi }\) between the \(\left|0\right\rangle\) and \(\left|1\right\rangle\) states of a single qubit. (This information is destroyed by a measurement in the \(Z\) basis, which may only obtain the amplitude on either state.) Let us generalize this to a mixed state on a single qubit, which has the density matrix65

$$\rho =\left(\begin{array}{cc}{p}_{0}&{p}_{+}{e}^{i\phi }\\ {p}_{+}{e}^{-i\phi }&{p}_{1}\end{array}\right),$$ (21)

where \({p}_{0}+{p}_{1}=1\), and \(0\le {p}_{+}\le \sqrt{{p}_{0}{p}_{1}}\le 0.5\). If one repeatedly performs the two circuits in Fig. 5 (which differ by a gate \(R=I\) or \(R={R}_{Z}={e}^{i\frac{\pi }{4}Z}\)), and estimates the probability of a final measurement \(m=0,1\), one may calculate

$$\begin{array}{l}2{p}_{+}{e}^{i\phi }\,=\,P(m=0| R=I)-P(m=1| R=I)\\ +\,iP(m=0| R={R}_{Z})-iP(m=1| R={R}_{Z}).\end{array}$$ (22)

We define the circuit element \({M}_{T}\) throughout this work as the combination of the two circuits to extract a phase using this equation. As above, the error in estimating the real and imaginary parts of \(2{p}_{+}{e}^{i\phi }\) scales as \({n}_{{\rm{m}}}^{-1/2}\).

Fig. 5 Definition of the circuit element \({M}_{T}\) used throughout this work to estimate the phase \({e}^{i\phi }\) on a single qubit. This is done by repeatedly preparing and measuring the qubit along two different axes, via a combination of rotation and Hadamard gates and a measurement \({M}_{Z}\) in the computational basis. The final measurements may then be combined via Eq. (22).
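The combination in Eq. (22) can be checked by simulating both circuits on the density matrix of Eq. (21). In this sketch the gate ordering (prepare, rotate, Hadamard, measure) and the sign of the exponent in the \(Z\) rotation are assumptions, chosen so that the combination reproduces \(2{p}_{+}{e}^{i\phi }\); they must be matched to the conventions of Fig. 5. The numerical values of \(p_0\), \(p_+\), and \(\phi\) are invented:

```python
import numpy as np

# Density matrix of Eq. (21), with invented values (p_+ <= sqrt(p0 p1) ~ 0.49).
p0, p1, p_plus, phi = 0.6, 0.4, 0.4, 1.1
rho = np.array([[p0, p_plus * np.exp(1j * phi)],
                [p_plus * np.exp(-1j * phi), p1]])
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard

def prob_diff(R):
    # P(m=0) - P(m=1) after applying R, then H, then measuring in the Z basis.
    out = H @ R @ rho @ R.conj().T @ H.conj().T
    return (out[0, 0] - out[1, 1]).real

I2 = np.eye(2)
Rz = np.diag(np.exp(np.array([-1j, 1j]) * np.pi / 4))  # sign fixed by this ordering
signal = prob_diff(I2) + 1j * prob_diff(Rz)            # combination of Eq. (22)
assert np.allclose(signal, 2 * p_plus * np.exp(1j * phi))
print("recovered phase:", np.angle(signal))
```

The \(R=I\) circuit measures the real part of the coherence and the rotated circuit measures the imaginary part; neither is accessible from a bare \(Z\)-basis measurement.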

Hamiltonian simulation

Optimal decompositions for complicated unitary operators are not in general known. For the electronic structure problem, one often wants to perform time evolution by a Hamiltonian \(\hat{H}\), requiring a circuit for the unitary operator \(U={e}^{i\hat{H}t}\). For a local (fermionic or qubit) Hamiltonian, the length \({T}_{{\rm{U}}}\) of the required circuit is polynomial in the system size \({N}_{{\rm{sys}}}\). However, the coefficient of this polynomial is often quite large; it depends on the chosen Hamiltonian, its basis set representation, the filling factor \(\eta\) (i.e., number of particles), and whether additional ancilla qubits are used.4,5 Moreover, such circuits usually approximate the target unitary \(U\) by some \(\tilde{U}\), with bounds on the error \({\epsilon }_{{\rm{H}}}=\parallel U-\tilde{U}{\parallel }_{{\rm{S}}}\). This bound \({\epsilon }_{{\rm{H}}}\) is proportional to the evolution time \(t\), providing a ‘speed limit’ for such simulation.68 For the electronic structure problem, current methods achieve scaling between \(O({N}_{{\rm{sys}}}^{2})\)69 and \(O({N}_{{\rm{sys}}}^{6})\)67,70 for the circuit length \({T}_{{\rm{U}}}\), assuming \(\eta \propto {N}_{{\rm{sys}}}\) (and fixed \(t\), \(\epsilon\)). (When \(\eta\) is sublinear in \({N}_{{\rm{sys}}}\), better results exist.71) The proven \(O({N}_{{\rm{sys}}}^{6})\) scaling is an upper bound, and most likely reduced by recent work.72,73 For simpler models, such as the Hubbard model, scalings between \(O(1)\) and \(O({N}_{{\rm{sys}}})\) are available.64,74 As we require \(t\propto {N}_{{\rm{sys}}}^{-1}\) for the purposes of phase estimation (described in Section IV G), this scaling is reduced by an additional factor throughout this work (though this cannot reduce the scaling below \(O(1)\)). For Table 3, we use a range of \({T}_{{\rm{U}}}=O(1)\) and \({T}_{{\rm{U}}}=O({N}_{{\rm{sys}}}^{5})\) when approximating the scaling of our methods with the system size.
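The flavor of such approximate circuits and their error \(\epsilon_{\rm H}\) can be seen in a dense-matrix sketch of first-order Trotterization. The two-qubit Hamiltonian \(H=A+B\) below is a toy model (not an electronic-structure Hamiltonian); the spectral-norm error shrinks as the number of Trotter steps grows:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)
I2 = np.eye(2, dtype=complex)

def expiH(Hm, t):
    # Exact e^{iHt} for Hermitian Hm via eigendecomposition.
    vals, vecs = np.linalg.eigh(Hm)
    return vecs @ np.diag(np.exp(1j * vals * t)) @ vecs.conj().T

A = np.kron(Z, Z)                      # diagonal ("interaction") term
B = np.kron(X, I2) + np.kron(I2, X)    # off-diagonal ("hopping") term
t = 1.0
U = expiH(A + B, t)                    # target unitary

def trotter_error(n):
    # First-order Trotter: (e^{iAt/n} e^{iBt/n})^n; spectral-norm error.
    step = expiH(A, t / n) @ expiH(B, t / n)
    return np.linalg.norm(U - np.linalg.matrix_power(step, n), 2)

errs = [trotter_error(n) for n in (1, 10, 100)]
print(errs)
```

The error decreases roughly as \(1/n\) for this first-order formula, controlled by the commutator \([A,B]\); higher-order product formulas and the more advanced methods cited above improve this trade-off.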

Ground state preparation and measurement

A key requirement for our derivative estimation methods is the ability to prepare the ground state \(\left|{\Psi }_{0}\right\rangle\), or an approximation to it, on the system register. Various methods exist for such preparation, including QPE (see Section IV G), adiabatic state preparation,75 VQE,10,11 and more recent developments.76,77 Some of these preparation methods (in particular adiabatic and variational methods) are unitary, whilst others (phase estimation) are projective. Given a unitary preparation method, one may determine whether the system remains in the ground state by inverting the unitary and measuring in the computational basis (Section IV C). By contrast, such determination for QPE requires another phase estimation round, either via multiple ancilla qubits or by extending the methods in Section IV G. Unitary preparation methods have a slight advantage in estimating expectation values of unitary operators \(\hat{U}\); the amplitude amplification algorithm39 improves the convergence of estimating \(\langle \hat{U}\rangle\) from \(\epsilon \propto {T}^{-1/2}\) to \(\epsilon \propto {T}^{-1}\) (in a total computation time \(T\)). However, this algorithm requires repeated application of the unitary preparation while maintaining coherence, which is probably not achievable in the near term. We list the computation time in Tables 1 and 2 both when amplitude amplification is available (marked with \(* *\)) and when it is not.

Regardless of the method used, state preparation has a time cost that scales with the total system size. For QPE, this is the time cost \(K{T}_{{\rm{U}}}\) of applying the required estimation circuits, where \(K\) is the total number of applications of \({e}^{i\hat{H}t}\).78 The scaling of a VQE depends on the variational ansatz chosen.11,33 The popular UCCSD ansatz for the electronic structure problem has an \(O({N}_{{\rm{sys}}}^{5})\) computational cost if implemented naively. However, recent work suggests aggressive truncation of the number of variational terms can reduce this as far as \(O({N}_{{\rm{sys}}}^{2})\).33 We take this as the range of scalings for our approximations in Table 1.

Systematic error from ground state approximations (state error)

Traditionally, VQEs are not guaranteed to prepare the true ground state \(\left|{\Psi }_{0}\right\rangle\), but instead prepare some approximate ground state

$$\left|{\tilde{\Psi }}_{0}\right\rangle =\sum _{j}{a}_{j}\left|{\Psi }_{j}\right\rangle .$$ (23)

Recent work has provided means of expanding VQEs iteratively to make the wavefunction error \(1-| {a}_{0}{| }^{2}\) arbitrarily small,79,80,81 although it is unclear how the time cost of these procedures scales with the system size. One may place very loose bounds on the error induced in the energy,

$$2\parallel \hat{H}{\parallel }_{{\rm{S}}}(1-| {a}_{0}{| }^{2})\ge | {E}_{0}-{\tilde{E}}_{0}| =\sum _{j>0}{a}_{j}^{* }{a}_{j}({E}_{j}-{E}_{0})\ge \delta (1-| {a}_{0}{| }^{2})\ge 0,$$ (24)

where here \(\parallel \hat{H}{\parallel }_{{\rm{S}}}\) is the spectral norm of the Hamiltonian (its largest eigenvalue) and \(\delta\) is the gap between the ground and first excited states. (Note that while in general \(\parallel \hat{H}{\parallel }_{{\rm{S}}}\) is difficult to calculate, reasonable approximations are usually obtainable.) As \(\left|{\tilde{\Psi }}_{0}\right\rangle\) is chosen to minimize the approximate energy \({\tilde{E}}_{0}\), one expects to be much closer to the lower bound than the upper. For an operator \(\hat{D}\) (such as a derivative operator \(\partial \hat{H}/\partial \lambda\)) other than the Hamiltonian, cross-terms will contribute an additional error in the expectation value \({D}_{0}=\langle {\Psi }_{0}| \hat{D}| {\Psi }_{0}\rangle\),

$$\begin{array}{ll}&| {\tilde{D}}_{0}-{D}_{0}| =\left|\sum _{i,j>0}{a}_{i}^{* }{a}_{j}\langle {\Psi }_{i}| \hat{D}| {\Psi }_{j}\rangle \right.\\ &\left.+\,2\,{\rm{Re}}\sum _{j}{a}_{j}^{* }{a}_{0}\langle {\Psi }_{j}| \hat{D}| {\Psi }_{0}\rangle +(| {a}_{0}{| }^{2}-1){D}_{0}\right|.\end{array}$$ (25)

One can bound this above in turn using the fact that

$$\left|\sum _{i,j>0}{a}_{i}^{* }{a}_{j}\langle {\Psi }_{i}| \hat{D}| {\Psi }_{j}\rangle \right|\le \parallel \hat{D}{\parallel }_{{\rm{S}}}\sum _{j>0}| {a}_{j}{| }^{2}=\parallel \hat{D}{\parallel }_{{\rm{S}}}(1-| {a}_{0}{| }^{2}),$$ (26)

which leads to

$$| {\tilde{D}}_{0}-{D}_{0}| \le 2\parallel \hat{D}{\parallel }_{{\rm{S}}}\left[(1-| {a}_{0}{| }^{2})+| {a}_{0}| \sqrt{1-| {a}_{0}{| }^{2}}\right].$$ (27)

Combining this with the error in the energy gives the bound

$$| {\tilde{D}}_{0}-{D}_{0}| \le 2\parallel \hat{D}{\parallel }_{{\rm{S}}}\left(\frac{| {\tilde{E}}_{0}-{E}_{0}{| }^{1/2}}{{\delta }^{1/2}}+\frac{| {\tilde{E}}_{0}-{E}_{0}| }{\delta }\right).$$ (28)

This ties the error in our derivative to the error in our energy, but with a square root factor that unfortunately slows down the convergence when the error is small. (This factor arises precisely because we do not expect to be in an eigenstate of the derivative operator.) Unlike the energy error above, we cannot expect this bound to be loose without a good reason to believe that the orthogonal component \({\sum }_{j>0}{a}_{j}\left|{\Psi }_{j}\right\rangle\) has a similar energy gradient to the ground state. This will often not be the case; the low-energy excited state manifold is usually strongly coupled to the ground state by a physically-relevant excitation, causing the energies to move in opposite directions. Finding methods to circumvent this issue is an obvious target for future research. For example, one could optimize a VQE on a cost function other than the energy. One could also calculate the gradient in a reduced Hilbert space (see Section IV M) using eigenstates of \({\hat{H}}^{({\rm{QSE}})}+\epsilon {\hat{D}}^{({\rm{QSE}})}\) with small \(\epsilon\) to ensure the coupling is respected.
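The validity (and looseness) of the bound in Eq. (27) can be probed numerically. In this sketch we work directly in the eigenbasis of \(\hat{H}\), so the eigenstates are basis vectors; the random Hermitian \(\hat{D}\), the dimension, and the fixed ground-state overlap are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
dim = 6
M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
D = (M + M.conj().T) / 2                      # random Hermitian "derivative"
norm_D = np.linalg.norm(D, 2)                 # spectral norm ||D||_S
D0 = D[0, 0].real                             # <Psi_0|D|Psi_0> in the eigenbasis of H

for _ in range(100):
    a = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    a[0] = 3.0                                # keep a large ground-state overlap a_0
    a = a / np.linalg.norm(a)
    D0_tilde = np.real(a.conj() @ D @ a)      # expectation in the approximate state
    o2 = abs(a[0])**2                         # |a_0|^2
    bound = 2 * norm_D * ((1 - o2) + np.sqrt(o2 * (1 - o2)))   # Eq. (27)
    assert abs(D0_tilde - D0) <= bound + 1e-12
print("Eq. (27) holds on 100 random trials")
```

As expected from the derivation, the observed error typically sits well below the bound, but the \(\sqrt{1-|a_0|^2}\) cross-term dominates once the wavefunction error is small.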

Quantum phase estimation

Non-trivial measurement of a quantum computer is of similar difficulty to non-trivial unitary evolution. Beyond learning the expectation value of a given Hamiltonian \(\hat{H}\), one often wishes to know specific eigenvalues \({E}_{i}\) (in particular for the electronic structure problem, the ground and low-excited state energies). This may be achieved by QPE.9 QPE entails repeated Hamiltonian simulation (as described above), conditional on an ancilla qubit prepared in the \(\left|+\right\rangle =\frac{1}{\sqrt{2}}(\left|0\right\rangle +\left|1\right\rangle )\) state. (The resource cost in making the evolution conditional is constant in the system size.) Such evolution causes phase kickback on the ancilla qubit; if the system register was prepared in the state \({\sum }_{j}{a}_{j}\left|{\Psi }_{j}\right\rangle\), the combined (system plus ancilla) state evolves to

$$\frac{1}{\sqrt{2}}\sum _{j}{a}_{j}\left|{\Psi }_{j}\right\rangle \otimes (\left|0\right\rangle +{e}^{ik{E}_{j}t}\left|1\right\rangle ).$$ (29)

Repeated tomography at various \(k\) allows for the eigenphases \({E}_{j}\) to be inferred, up to a phase \({E}_{j}t+2\pi \equiv {E}_{j}t\). This inference can be performed directly with the use of multiple ancilla qubits,9 or indirectly through classical post-processing of a single ancilla tomographed via the \({M}_{T}\) gate of Fig. 5.39,78,82,83,84,85

The error in phase estimation comes from two sources: the error in Hamiltonian simulation and the error in the phase estimation itself. The error in Hamiltonian simulation may be bounded by \({\epsilon }_{{\rm{H}}}\) (as calculated in the previous section), which in turn sets the time \({T}_{U}\) for a single unitary. Assuming a sufficiently large gap to nearby eigenvalues, the optimal scaling of the error in estimating \({E}_{j}\) is \({A}_{j}^{-1}{t}^{-1}{K}^{-1}\) (where \({A}_{j}=| {a}_{j}{| }^{2}\) is the amplitude of the \(j\)th eigenstate in the prepared state). Note that the phase equivalence \({E}_{j}t+2\pi \equiv {E}_{j}t\) sets an upper bound on \(t\); in general we require \(t\propto \parallel \hat{H}{\parallel }_{S}^{-1}\), and \(\parallel \hat{H}{\parallel }_{S}\) typically scales with \({N}_{{\rm{sys}}}\). (This scaling was incorporated into the estimates of \({T}_{{\rm{U}}}\) in the previous section.) The scaling of the ground state amplitude \({A}_{0}\) with the system size is not well characterized, although numeric bounds suggest that it scales approximately as \(1-\alpha {N}_{{\rm{sys}}}\)3 for reasonable \({N}_{{\rm{sys}}}\), with \(\alpha\) a small constant. Approximating this as \({N}_{{\rm{sys}}}^{-1}\) implies that \(K\propto {N}_{{\rm{sys}}}^{2}\) is required to obtain a constant error in estimating the ground state energy.

The error in estimating an amplitude \({A}_{j}\) during single-ancilla QPE has not been thoroughly investigated. A naive least-squares fit to dense estimation leads to a scaling of \({n}_{{\rm{m}}}^{-1/2}{k}_{\max }^{-1/3}\), where \({n}_{{\rm{m}}}\) is the number of experiments performed at each point \(k=1,\ldots ,{k}_{\max }\). One must perform controlled time evolution for up to \({k}_{\max }\approx \max ({t}^{-1},{A}_{j}^{-1})\) coherent steps in order to guarantee separation of \({\phi }_{j}\) from other phases. To obtain a constant error, one must then perform \({n}_{{\rm{m}}}\propto {k}_{\max }^{-\frac{2}{3}}\propto \min ({A}_{j}^{\frac{2}{3}},{t}^{\frac{2}{3}})\) measurements at each \(k\), implying that \(K={n}_{{\rm{m}}}{k}_{\max }\propto \max ({A}_{j}^{-\frac{1}{3}},{t}^{-\frac{1}{3}})\) applications of \({e}^{i\hat{H}t}\) are required. For the ground state (\({A}_{j}={A}_{0}\)), this gives scaling \(K\propto {N}_{{\rm{sys}}}^{\frac{1}{3}}\). By contrast, multiple-ancilla QPE requires \({N}_{{\rm{m}}}\) repetitions of \({e}^{i\hat{H}t}\) with \({k}_{\max }=\max ({A}_{0}^{-1},{t}^{-1})\) to estimate \({A}_{0}\) with an error of \({({A}_{0}(1-{A}_{0}){N}_{{\rm{m}}}^{-1})}^{1/2}\). This implies that \({N}_{{\rm{m}}}\propto {A}_{0}\) measurements are sufficient, and that \(K\propto {N}_{{\rm{m}}}{A}_{0}^{-1}\) may be held constant as a function of the system size for constant error in the estimation of \({A}_{0}\). Though this has not yet been demonstrated for single-round QPE, we expect it to be achievable and assume this scaling in this work.
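The classical post-processing step behind these estimates can be sketched with a two-mode linear-prediction (Prony-type) fit to a noiseless synthetic signal \(g(k)=\sum_j A_j e^{i\omega_j k}\). The phases and amplitudes below are invented, and real data would require noise-robust variants of this fit:

```python
import numpy as np

w = np.array([0.7, 2.1])        # invented eigenphases omega_j = E_j t
A = np.array([0.8, 0.2])        # invented amplitudes A_j
k = np.arange(12)
g = (A[None, :] * np.exp(1j * np.outer(k, w))).sum(axis=1)   # synthetic g(k)

# Linear prediction g[k+2] = c1 g[k+1] + c0 g[k]; the roots of
# z^2 - c1 z - c0 are the signal poles e^{i omega_j}.
M = np.stack([g[1:-1], g[:-2]], axis=1)
c1, c0 = np.linalg.lstsq(M, g[2:], rcond=None)[0]
w_est = np.sort(np.angle(np.roots([1.0, -c1, -c0])))

# Amplitudes from the Vandermonde system g = V a.
V = np.exp(1j * np.outer(k, w_est))
A_est = np.linalg.lstsq(V, g, rcond=None)[0]
print(w_est, A_est.round(6))
```

On exact data both phases and amplitudes are recovered to numerical precision; with shot noise the accuracy degrades according to the \({n}_{{\rm{m}}}^{-1/2}\) and \({k}_{\max }\) scalings discussed above.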

The propagator and phase estimation method

In this section, we outline the circuits required and calculate the estimation error for our newly developed PPE method for derivative estimation. A prototype application of this method to a one-qubit system may be found in the Supplementary Notes of this work.

Estimating expectation values with single-ancilla QPE

Though single-ancilla QPE only weakly measures the system register, and does not project it into an eigenstate \(\left|{\Psi }_{j}\right\rangle\) of the chosen Hamiltonian, it can still be used to learn properties of the eigenstates beyond their eigenvalues \({E}_{j}\). In particular, if one uses the same ancilla qubit to control a further unitary operation \(\hat{U}\) on the system register, the combined (system plus ancilla) state evolves from Eq. (29) to

$$\frac{1}{\sqrt{2}}\sum _{j}{a}_{j}\left(\left|0\right\rangle \otimes \left|{\Psi }_{j}\right\rangle +{e}^{ik{E}_{j}t}\sum _{j^{\prime} }\langle {\Psi }_{j^{\prime} }| \hat{U}| {\Psi }_{j}\rangle \left|1\right\rangle \otimes \left|{\Psi }_{j^{\prime} }\right\rangle \right).$$ (30)

The phase accumulated on the ancilla qubit may then be calculated to be

$$g(k)=\sum _{j,j^{\prime} }{a}_{j}{a}_{j^{\prime} }^{* }\langle {\Psi }_{j^{\prime} }| \hat{U}| {\Psi }_{j}\rangle {e}^{ik{E}_{j}t}.$$ (31)

Note that the gauge degree of freedom is not present in Eq. (31); if one re-defines \(\left|{\Psi }_{j}\right\rangle \to {e}^{i{\theta }_{j}}\left|{\Psi }_{j}\right\rangle\), one must send \({a}_{j}\to {e}^{-i{\theta }_{j}}{a}_{j}\), and the phase cancels out. One may obtain \(g(k)\) at multiple points \(k\) via tomography of the ancilla qubit (Fig. 5). From here, either Prony's method or Bayesian techniques may be used to extract phases \({\omega }_{j}\approx {E}_{j}t\) and corresponding amplitudes \({\alpha }_{j}\approx {\sum }_{j^{\prime} }{a}_{j}{a}_{j^{\prime} }^{* }\langle {\Psi }_{j^{\prime} }| \hat{U}| {\Psi }_{j}\rangle\).85 The amplitudes \({\alpha }_{j}\) are often not terribly informative, but this changes if one extends this process over a family of operators \(\hat{U}\). For instance, if one chooses \(\hat{U}={e}^{ik^{\prime} \hat{H}t}\hat{V}{e}^{ik\hat{H}t}\) (with \(\hat{V}\) unitary), an application of Prony's method in \(k\) returns amplitudes of the form

$${\alpha }_{j}(k^{\prime} )\approx \sum _{j^{\prime} }{a}_{j}{a}_{j^{\prime} }^{* }{e}^{ik^{\prime} {E}_{j^{\prime} }t}\langle {\Psi }_{j^{\prime} }| \hat{V}| {\Psi }_{j}\rangle ,$$ (32)

from which a second application of Prony’s method obtains phases \({\omega }_{j^{\prime} }={E}_{j^{\prime} }t\) with corresponding amplitudes

$${\alpha }_{j,j^{\prime} }\approx {a}_{j}{a}_{j^{\prime} }^{* }\langle {\Psi }_{j^{\prime} }| \hat{V}| {\Psi }_{j}\rangle .$$ (33)

Each subsequent application of QPE requires taking data for each value of \(k=1,\ldots ,K\) to resolve \(K\) individual frequencies (and corresponding eigenvalues). However, if one is simply interested in expectation values \(\langle {\Psi }_{j}| \hat{V}| {\Psi }_{j}\rangle\) (i.e., when \(j=j^{\prime}\)), one may fix \(k=k^{\prime}\) and perform a single application of Prony's method, reducing the number of circuits that need to be applied from \(O({K}^{2})\) to \(O(K)\) (see Fig. 6). The error in the estimator \({\alpha }_{j,j}\) (Eq. (33)) may be bounded above by the error in the estimator \({\alpha }_{j}\) (Eq. (32)). However, to estimate \(\langle {\Psi }_{j}| \hat{V}| {\Psi }_{j}\rangle\) from Eq. (33), one needs to divide by \({A}_{j}\). This propagates directly to the error, which then scales as \({A}_{j}^{-1/2}{N}_{{\rm{m}}}^{-1/2}\). Thus, constant error in estimating \(\langle {\Psi }_{j}| \hat{V}| {\Psi }_{j}\rangle\) is achieved if \(K\propto {N}_{{\rm{sys}}}\).

Fig. 6 A circuit to measure \(\langle {\Psi }_{j}| \hat{V}| {\Psi }_{j}\rangle\) without preparing \(\left|{\Psi }_{j}\right\rangle\) on the system register. The tomography box \({M}_{T}\) is defined in Fig. 5. Full size image

PPE circuits

As presented, the operator \(\hat{V}\) in Fig. 6 must be unitary. However, if one applies additional phase estimation within \(\hat{V}\) itself, one can extend this calculation to non-unitary operators, such as those given in Eq. (9). This is similar in nature to calculating the time-domain Green's function for small \(t\) on a quantum computer (which has been studied before in refs. 41,42,43), but performing the transformation to frequency space with Prony's method instead of a Fourier transform to obtain better spectral resolution. It can also be considered a generalization of ref. 37 beyond the linear response regime. To calculate an \(X\)th-order amplitude (Eq. (10)), one may set

$$\hat{V}=\left(\prod _{x=1}^{X-1}{\hat{P}}_{x}{e}^{i{k}_{x}\hat{H}t}\right){\hat{P}}_{X},$$ (34)

which is unitary if the \({\hat{P}}_{x}\) are chosen to be a unitary decomposition of \(\partial \hat{H}/\partial {\lambda }_{x}\). In Fig. 7, we show two circuits for the estimation of a second order derivative with \(\hat{P}=\partial \hat{H}/\partial {\lambda }_{1}\), \(\hat{Q}=\partial \hat{H}/\partial {\lambda }_{2}\) (or some piece thereof). The circuits differ by whether QPE or a VQE is used for state preparation. If QPE is used for state preparation, the total phase accumulated by the ancilla qubit over the circuit is

$$g({k}_{0},{k}_{1})=\sum _{j,m,n}{a}_{m}^{* }{a}_{n}\langle {\Psi }_{m}| \hat{P}| {\Psi }_{j}\rangle \langle {\Psi }_{j}| \hat{Q}| {\Psi }_{n}\rangle \,{e}^{i{k}_{0}t({E}_{m}+{E}_{n})}{e}^{i{k}_{1}t{E}_{j}}.$$

Applying Prony's method in \({k}_{0}\) at fixed \({k}_{1}\) will obtain a signal at phase \(2t{E}_{0}\) with amplitude

$${\alpha }_{0,0}({k}_{1})\approx \sum _{j}{a}_{0}^{* }{a}_{0}\langle {\Psi }_{0}| \hat{P}| {\Psi }_{j}\rangle \langle {\Psi }_{j}| \hat{Q}| {\Psi }_{0}\rangle {e}^{i{k}_{1}t{E}_{j}}.$$ (35)

A second application of Prony's method, now in \({k}_{1}\), allows us to obtain the required amplitudes

$${\alpha }_{0,j,0}\approx {a}_{0}^{* }{a}_{0}\langle {\Psi }_{0}| \hat{P}| {\Psi }_{j}\rangle \langle {\Psi }_{j}| \hat{Q}| {\Psi }_{0}\rangle ,$$ (36)

and the eigenvalues \({\omega }_{j}\approx {E}_{j}t\), allowing classical calculation of both the amplitudes and energy coefficients required to evaluate Eq. (9). If a VQE is used for state preparation instead, one must post-select on the system register being returned to \(|\overrightarrow{0}\rangle\). Following this, the ancilla qubit will be in the state

$$\frac{1}{\sqrt{2}}\left[\left|0\right\rangle +\left|1\right\rangle \sum _{j}{e}^{ikt{E}_{j}}\left\langle {\Psi }_{0}^{({\rm{VQE}})}| \hat{P}| {\Psi }_{j}\right\rangle \left\langle {\Psi }_{j}| \hat{Q}| {\Psi }_{0}^{({\rm{VQE}})}\right\rangle \right],$$

with an accumulated phase \(g(k)={\alpha }_{0,0}(k)\) (where \({\alpha }_{0,0}\) is as defined above). Here, \(|{\Psi }_{0}^{({\rm{VQE}})}\rangle\) is the state prepared by the VQE unitary (which may not be the true ground state of the system). Both methods may be extended immediately to estimate higher-order amplitudes by inserting additional excitations and rounds of QPE, resulting in amplitudes of the form \({\alpha }_{0,0}({k}_{1},\ldots ,{k}_{X})\).

Fig. 7 Circuits for calculating path amplitudes (Eq. (10)) for a second-order derivative on a quantum computer (individual units described throughout Section IV C), using either a VQE (top) or QPE (bottom) for state preparation. Both circuits require an \(N\)-qubit system register and a single ancilla qubit. Repeated measurements of these circuits at different values of \(k\) (top) or \({k}_{0}\) and \({k}_{1}\) (bottom) allow for the inference of the amplitude, as described in the text. \({M}_{Z}\) refers to a final measurement of all qubits in the computational basis, which is required for post-selection.

We note that the VQE post-selection does not constitute “throwing away data”; if the probability of post-selecting \(\left|{\Psi }_{0}\right\rangle\) is \(p\), we have

$$\sum _{{k}_{1},\ldots ,{k}_{X}}| {\alpha }_{0,0}({k}_{1},\ldots ,{k}_{X}){| }^{2}=p,$$ (37)

and as the variance in any term \({\alpha }_{i,j}({k}_{1},\ldots ,{k}_{X})\) scales as \(| {\alpha }_{i,j}({k}_{1},\ldots ,{k}_{X})(1-{\alpha }_{i,j}({k}_{1},\ldots ,{k}_{X}))|\), the absolute error in estimating a derivative scales as \({p}^{1/2}\) (note the positive exponent), while the relative error scales as \({p}^{-1/2}\).

Energy discretization (resolution error)

The maximum number of frequencies that can be estimated from a signal \(g(0),\ldots ,g(k)\) is \((k+1)/2\). (This can be seen by parameter counting; it differs from the bound of \(k\) for QPE85 as the amplitudes are not real.) As the time required to obtain \(g(k)\) scales at best linearly with \(k\) (Section IV D), we cannot expect fine enough resolution of all \({2}^{{N}_{{\rm{sys}}}}\) eigenvalues present in an \({N}_{{\rm{sys}}}\)-qubit system. Instead, a small amplitude \({\mathcal{A}}({j}_{1},\ldots ,{j}_{X})\) (\(| {\mathcal{A}}({j}_{1},\ldots ,{j}_{X})| \le \Delta\)) will be binned with paths \({\mathcal{A}}^{\prime} ({l}_{1},\ldots ,{l}_{X})\) of similar energy (\(\delta ={\max }_{x}| {E}_{{j}_{x}}-{E}_{{l}_{x}}| \ll \Delta\)), and labeled with the same energy \({E}_{{B}_{x}}\approx {E}_{{j}_{x}}\approx {E}_{{l}_{x}}\).85 Here, \(\Delta\) is controlled by the length of the signal \(g(k)\), i.e., \(\Delta \propto \max {(k)}^{-1}\). This grouping does not affect the amplitudes; as evolution by \({e}^{ik\hat{H}t}\) does not mix eigenstates (regardless of energy), terms of the form \(\left|{\Psi }_{{j}_{x}}\right\rangle \left\langle {\Psi }_{{l}_{x}}\right|\) do not appear. (This additional amplitude error would occur if one attempted to calculate single amplitudes of the form \(\langle {\Psi }_{j}| \hat{P}| {\Psi }_{k}\rangle\) on a quantum device, e.g., using the method in Section IV G or that of ref. 37, and multiply them classically to obtain a derivative of order \(d\ >\ 1\).) The PPE method then approximates Eq. (9) as

$$\begin{array}{ll}&D=\sum _{{\mathcal{A}}}\sum _{{B}_{1},\ldots ,{B}_{{X}_{{\mathcal{A}}}-1}}\, {{\mathcal{A}}}_{{B}_{1},\ldots ,{B}_{{X}_{{\mathcal{A}}}-1}}{f}_{{\mathcal{A}}}({E}_{0};{E}_{{B}_{1}},\ldots ,{E}_{{B}_{{X}_{{\mathcal{A}}}-1}}),\\ &{{\mathcal{A}}}_{{B}_{1},\ldots ,{B}_{{X}_{{\mathcal{A}}}-1}}=\sum _{{j}_{x}\in {B}_{x}}2\ {\rm{Re}}({\mathcal{A}}({j}_{1},\ldots ,{j}_{{X}_{{\mathcal{A}}}-1})).\end{array}$$ (38)

Classical post-processing then need only sum over the (polynomially many in \(\Delta\)) bins \({B}_{x}\) instead of the (exponentially many in \({N}_{{\rm{sys}}}\)) eigenstates \(\left|{\Psi }_{{j}_{x}}\right\rangle\), making the sum tractable.
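To make the binning step concrete, the following sketch (our own illustration, with hypothetical helper names, not code from this work) groups estimated spectral components into energy bins of width \(\Delta\) and accumulates \(2\,{\rm{Re}}({\mathcal{A}})\) per bin, mirroring the binned amplitudes of Eq. (38):

```python
import numpy as np

def bin_amplitudes(energies, amplitudes, delta):
    """Group spectral components into energy bins of width `delta` and
    accumulate 2*Re(A) per bin, as in the binned sum of Eq. (38).
    Returns {bin index: (bin-centre energy, summed amplitude)}."""
    bins = {}
    for e, a in zip(energies, amplitudes):
        b = int(np.floor(e / delta))
        centre, total = bins.get(b, ((b + 0.5) * delta, 0.0))
        bins[b] = (centre, total + 2 * np.real(a))
    return bins
```

The number of occupied bins is set by the spectral range and \(\Delta\), not by the number of eigenstates, which is what makes the subsequent classical sum tractable.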

To bound the resolution error in the approximation \({f}_{{\mathcal{A}}}({E}_{0};{E}_{{j}_{1}},\ldots ,{E}_{{j}_{{X}_{{\mathcal{A}}}-1}})\to {f}_{{\mathcal{A}}}({E}_{0};{E}_{{B}_{1}},\ldots ,{E}_{{B}_{{X}_{{\mathcal{A}}}-1}})\), we consider the error incurred if each \({E}_{j}\) were drawn randomly from a bin of width \(\parallel \hat{H}{\parallel }_{{\rm{S}}}\Delta\) (where \(\parallel \hat{H}{\parallel }_{{\rm{S}}}\) is the spectral norm). The energy functions \(f\) take the form of products of \({X}_{{\mathcal{A}}}-1\) factors \(\frac{1}{{E}_{{j}_{x}}-{E}_{0}}\). If each factor is independent, the resulting error may be bounded as

$${\epsilon }_{f}\le {X}_{{\mathcal{A}}}{\delta }^{({X}_{{\mathcal{A}}}-2)}\parallel \hat{H}{\parallel }_{{\rm{S}}}\Delta ,$$ (39)

where \(\delta\) is the gap between the ground and excited states. Then, as the excitations \(\hat{P}\) are unitary, for each amplitude \({\mathcal{A}}\) one may bound

$$\sum _{{B}_{1},\ldots ,{B}_{{X}_{{\mathcal{A}}}-1}}| {{\mathcal{A}}}_{{B}_{1},\ldots ,{B}_{{X}_{{\mathcal{A}}}-1}}{| }^{2}\le 1.$$ (40)

Propagating variances then yields

$${\epsilon }_{D}\le d{N}_{{\mathcal{A}}}^{1/2}{\delta }^{d-2}\parallel \hat{H}{\parallel }_{{\rm{S}}}\Delta ,$$ (41)

where \({N}_{{\mathcal{A}}}\) is the number of amplitudes in the estimation of \(D\). As we must decompose each operator into unitaries to implement in a circuit, \({N}_{{\mathcal{A}}}\propto {N}_{{\rm{U}}}^{d}\).

This bound is quite loose; in general we expect excitations \(\partial \hat{H}/\partial \lambda\) to couple to low-level excited states, which lie in a constant energy window (rather than one of width \(\parallel \hat{H}{\parallel }_{S}\)), and that contributions from different terms should be correlated (implying that \({N}_{{\mathcal{A}}}\) should be treated as constant here). This implies that one may take \(\Delta\) roughly constant in the system size, which we assume in this work.
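The sensitivity underlying these bounds, namely the shift in \(1/(E-{E}_{0})\) when an energy label is jittered within its bin, can be checked numerically. Below is a minimal sketch of ours (arbitrary numbers, not from the source); for small bin half-width \(w\) and gap \(\delta\), the worst-case shift approaches the first-order estimate \(w/{\delta }^{2}\):

```python
import random

def worst_case_shift(e0, e, half_width, trials=20_000, seed=7):
    """Jitter E uniformly within [E - w, E + w] and return the largest
    observed change in f(E) = 1/(E - E0), modelling an energy label
    drawn randomly from a bin of half-width w."""
    rng = random.Random(seed)
    f0 = 1.0 / (e - e0)
    return max(
        abs(1.0 / (e + rng.uniform(-half_width, half_width) - e0) - f0)
        for _ in range(trials)
    )

# For gap delta = E - E0 = 1 and half-width w = 0.01, the first-order
# estimate w / delta**2 = 0.01 is very close to the observed maximum.
```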

Sampling noise error

We now consider the error in calculating Eq. (38) from a finite number of experiments (which is separate from the resolution error above). Following Section IV G, we have that, if QPE is used for state preparation,

$${\rm{Var}}[{{\mathcal{A}}}_{{B}_{1},\ldots ,{B}_{{X}_{{\mathcal{A}}}-1}}]\propto | {{\mathcal{A}}}_{{B}_{1},\ldots ,{B}_{{X}_{{\mathcal{A}}}-1}}| {A}_{0}^{-1}{N}_{{\rm{m}}}^{-1}$$ (42)

$$\begin{array}{l}{\rm{Var}}[f({E}_{0};{E}_{{B}_{1}},\ldots ,{E}_{{B}_{{X}_{{\mathcal{A}}}-1}})]\, \propto\, {X}_{{\mathcal{A}}}{\delta }^{2{X}_{{\mathcal{A}}}-4}{K}^{-2}{t}^{-2}| {{\mathcal{A}}}_{{B}_{1},\ldots ,{B}_{{X}_{{\mathcal{A}}}-1}}{| }^{-2}{A}_{0}^{-2}.\end{array}$$ (43)

If one were to use a VQE for state preparation, the factors of \({A}_{0}\) would be replaced by the state error of Section IV F. We have not included this calculation in Table 2 for the sake of simplicity. Then, assuming each term in Eq. (38) is independently estimated, we obtain

$$\begin{array}{ll}{\rm{Var}}[D]={\sum \limits_{\mathcal{A}}}{\sum \limits_{{B}_{1},\ldots ,{B}_{{X}_{\mathcal{A}}-1}}} \left\{{\rm{Var}}[{f}_{{\mathcal{A}}}({E}_{0};{E}_{{B}_{1}},\ldots ,{E}_{{B}_{{X}_{{\mathcal{A}}}-1}})]{\left|{{\mathcal{A}}}_{{B}_{1},\ldots ,{B}_{{X}_{{\mathcal{A}}}-1}}\right|}^{2}\right.\\ \left.+\,{\rm{Var}}[{{\mathcal{A}}}_{{B}_{1},\ldots ,{B}_{{X}_{{\mathcal{A}}}-1}}]{\left|{f}_{{\mathcal{A}}}({E}_{0};{E}_{{B}_{1}},{E}_{{B}_{2}},\ldots ,{E}_{{B}_{{X}_{{\mathcal{A}}}-1}})\right|}^{2}\right\}.\end{array}$$ (44)

Substituting the individual scaling laws one obtains

$$\begin{array}{l}{\rm{Var}}[D]\propto {\sum \limits_{\mathcal{A}}}{\sum \limits_{{B}_{1},\ldots ,{B}_{{X}_{\mathcal{A}}-1}}}\left\{{X}_{{\mathcal{A}}}{\delta }^{2d-4}{K}^{-2}{t}^{-2}{A}_{0}^{-2}\right.\\ \left.+\,{\delta }^{2d-2}| {{\mathcal{A}}}_{{B}_{1},\ldots ,{B}_{{X}_{{\mathcal{A}}}-1}}| {A}_{0}^{-1}{N}_{{\rm{m}}}^{-1}\right\}\end{array}$$ (45)

$$\le {N}_{{\mathcal{A}}}\left\{{\delta }^{2d-2}{\Delta }^{\frac{d}{2}}{N}_{{\rm{m}}}^{-1}{A}_{0}^{-1}+d{\delta }^{2d-4}{\Delta }^{d}{K}^{-2}{t}^{-2}{A}_{0}^{-2}\right\},$$ (46)

where again \({N}_{{\mathcal{A}}}\propto {N}_{{\rm{U}}}^{d}\). This result is reported in Table 2.
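The propagation step behind Eq. (44) is the standard first-order rule \({\rm{Var}}[fA]\approx {\rm{Var}}[f]| A{| }^{2}+{\rm{Var}}[A]| f{| }^{2}\) for independently estimated factors. A quick Monte Carlo sanity check of this rule (our own sketch, with arbitrary numbers):

```python
import random

def sampled_product_variance(f, var_f, a, var_a, n=200_000, seed=1):
    """Monte Carlo variance of the product of two independent noisy
    estimators with means f, a and variances var_f, var_a."""
    rng = random.Random(seed)
    xs = [
        (f + rng.gauss(0.0, var_f ** 0.5)) * (a + rng.gauss(0.0, var_a ** 0.5))
        for _ in range(n)
    ]
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def first_order_variance(f, var_f, a, var_a):
    """First-order propagation: Var[f*a] ~ var_f*a**2 + var_a*f**2."""
    return var_f * a ** 2 + var_a * f ** 2
```

For small relative errors the exact cross term \({\rm{Var}}[f]{\rm{Var}}[A]\) is negligible and the two estimates agree closely.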

Eigenstate truncation approximation details

In this section, we outline the classical post-processing required to evaluate Eq. (9) in the ETA, using the QSE to generate approximate eigenstates. We then calculate the computational cost of this estimation, and discuss the systematic error that Hilbert-space truncation introduces into an arbitrary response approximation.

The chosen set of approximate excited states \(\left|{\tilde{\Psi }}_{j}\right\rangle\) defines a subspace \({{\mathcal{H}}}^{({\rm{QSE}})}\) of the larger FCI Hilbert space \({{\mathcal{H}}}^{({\rm{FCI}})}\). To calculate expectation values within this subspace, we project the operators \(\hat{O}\) of interest (such as derivatives like \(\partial \hat{H}/\partial \lambda\)) onto \({{\mathcal{H}}}^{({\rm{QSE}})}\), giving a set of reduced operators \({\hat{O}}^{({\rm{QSE}})}\) (\({O}_{i,j}^{({\rm{QSE}})}=\langle {\tilde{\Psi }}_{i}| \hat{O}| {\tilde{\Psi }}_{j}\rangle\)). These are \({N}_{{\rm{E}}}\times {N}_{{\rm{E}}}\)-dimensional classical matrices, which may be stored and operated on in polynomial time. Methods to obtain the matrix elements \({O}_{i,j}^{({\rm{QSE}})}\) depend on the form of the \(\left|{\tilde{\Psi }}_{j}\right\rangle\) chosen. Within the QSE, one can obtain these by directly measuring40

$$\langle {\chi }_{i}| \hat{O}| {\chi }_{j}\rangle =\langle {\Psi }_{0}| {\hat{E}}_{i}^{\dagger }\hat{O}{\hat{E}}_{j}| {\Psi }_{0}\rangle ,$$ (47)

using the techniques outlined in Section IV C, and rotating the basis from \(\{|{\chi }_{j}\rangle \}\) to \(\{|{\tilde{\Psi }}_{j}\rangle \}\) (using Eq. (14)).
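Once the raw matrix \(\langle {\chi }_{i}| \hat{O}| {\chi }_{j}\rangle\) has been measured, the rotation into the approximate-eigenstate basis is a small classical matrix conjugation. A minimal sketch (ours, assuming the eigenvector coefficients from Eq. (14) are available as the columns of a matrix \(C\)):

```python
import numpy as np

def rotate_to_eigenbasis(o_chi, c):
    """Rotate measured matrix elements O_ij = <chi_i|O|chi_j> into the
    approximate-eigenstate basis {|tilde Psi_j>}, whose expansion
    coefficients in {|chi_j>} form the columns of c."""
    c = np.asarray(c, dtype=complex)
    return c.conj().T @ np.asarray(o_chi, dtype=complex) @ c
```

Since both matrices are only \({N}_{{\rm{E}}}\times {N}_{{\rm{E}}}\), this step is negligible compared with the quantum measurement cost.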

The computational complexity for a derivative calculation within the QSE is roughly independent of the choice of \(\left|{\tilde{\Psi }}_{j}\right\rangle\). The error \(\epsilon\) may be bounded above by the error in each term of the \({N}_{{\rm{E}}}\times {N}_{{\rm{E}}}\) projected matrices, which scales as either \({N}_{{\rm{T}}}^{1/2}{n}_{{\rm{m}}}^{-1/2}\) (when directly estimating), \({A}_{j}^{-1/2}{N}_{{\rm{m}}}^{-1/2}\) (when estimating via QPE), or \({N}_{{\rm{U}}}^{1/2}{K}^{-1}{n}_{{\rm{m}}}^{-1/2}\) (using the amplitude estimation algorithm). We assume that the \({N}_{{\rm{E}}}^{2}\) terms are independently estimated, in which case \(\epsilon\) scales with \({N}_{{\rm{E}}}\). In general this will not be the case, and \(\epsilon\) could scale as badly as \({N}_{{\rm{E}}}^{2}\), but we do not expect this to be typical. Indeed, one can potentially use the covariance between different matrix elements to improve the error bound.40 As we do not know the precise improvement this will provide, we leave any potential reduction in the computational complexity stated in Table 2 to future work. The calculation requires \({n}_{{\rm{m}}}\) repetitions of \({N}_{{\rm{T}}}\) circuits for each pair of \({N}_{{\rm{E}}}\) excitations, leading to a total number of \({n}_{{\rm{m}}}{N}_{{\rm{T}}}{N}_{{\rm{E}}}^{2}\) preparations (each of which has a time cost \({T}_{{\rm{P}}}\)), as stated in Table 2. (With the amplitude estimation algorithm, the dominant time cost comes from running \(O({N}_{{\rm{T}}}{N}_{{\rm{E}}}^{2})\) circuits of length \(K{T}_{{\rm{P}}}\).)

Regardless of the method of generating eigenstates, the ETA incurs a systematic truncation error from approximating an exponentially large number of true eigenstates \(\left|{\Psi }_{j}\right\rangle\) by a polynomial number of approximate eigenstates \(\left|{\tilde{\Psi }}_{j}\right\rangle ={\sum }_{l}{\tilde{A}}_{j,l}\left|{\Psi }_{l}\right\rangle\). This truncation error can be split into three pieces. Firstly, an excitation \(\hat{P}\left|{\Psi }_{0}\right\rangle\) may not lie within the response subspace \({{\mathcal{H}}}^{({\rm{QSE}})}\), in which case the pieces lying outside the space will be truncated away. Secondly, the term \(\hat{P}\left|{\tilde{\Psi }}_{j}\right\rangle \left\langle {\tilde{\Psi }}_{j}\right|\hat{Q}\) may contain terms of the form \(\hat{P}\left|{\Psi }_{j}\right\rangle \left\langle {\Psi }_{l}\right|\hat{Q}\), which do not appear in the original resolution of the identity. Thirdly, the approximate energies \({\tilde{E}}_{j}\) may not be close to the true energies \({E}_{j}\) (especially when \(\left|{\tilde{\Psi }}_{j}\right\rangle\) is a sum of true eigenstates \(\left|{\Psi }_{l}\right\rangle\) with large energy separation \({E}_{j}-{E}_{l}\)). If one chooses excitation operators \({\hat{E}}_{j}\) in the QSE so that \(\hat{P}={\sum }_{j}{p}_{j}{\hat{E}}_{j}\), one completely avoids the first error source. By contrast, if one chooses a truncated set of true eigenstates \(\left|{\tilde{\Psi }}_{j}\right\rangle =\left|{\Psi }_{j}\right\rangle\), one avoids the second and third error sources exactly. In the Supplementary Notes of this work, we expand on this point, and place some loose bounds on these error sources.

Experimental methods

The experimental implementation of the geometry optimization algorithm was performed using two of the three transmon qubits in a circuit QED quantum processor. This is the same device used in ref. 86 (the raw data are the same as in Fig. 1e of that reference, but with heavy subsequent processing). The two qubits have individual microwave lines for single-qubit gating, flux-bias lines for frequency control, and dedicated readout resonators with a common feedline. Individual qubits are addressed in readout via frequency multiplexing. The two qubits are connected via a common bus resonator that is used to achieve an exchange gate

$$\left(\begin{array}{cccc}1&0&0&0\\ 0&\cos (\theta )&i\sin (\theta )&0\\ 0&i\sin (\theta )&\cos (\theta )&0\\ 0&0&0&1\end{array}\right),$$ (48)

via a flux pulse on the high-frequency qubit, with an uncontrolled additional single-qubit phase that was canceled out in post-processing. The exchange angle \(\theta\) may be fixed to a resolution of \(\pi /6000\) by using the pulse duration (at \(1\,{\rm{ns}}\) resolution) as a coarse knob and fine-tuning with the pulse amplitude. Repeated preparation and measurement of the state generated by exciting to \(\left|01\right\rangle\) and exchanging through one of \(41\) different choices of \(\theta\) yielded estimates of \(41\) two-qubit density matrices \({\rho }_{i}\) via linear inversion tomography of \(1{0}^{4}\) single-shot measurements per prerotation.87 All circuits were executed in eQASM88 code compiled with the QuTech OpenQL compiler, with measurements performed using the qCoDeS89 and PycQED90 packages.
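The unitary of Eq. (48) is straightforward to write down and check. The sketch below (our own, for illustration) constructs it in the \(\left|00\right\rangle ,\left|01\right\rangle ,\left|10\right\rangle ,\left|11\right\rangle\) basis and verifies that it is unitary for any exchange angle:

```python
import numpy as np

def exchange_gate(theta):
    """Exchange gate of Eq. (48): a partial iSWAP that rotates the
    |01>, |10> subspace by the exchange angle theta."""
    c, s = np.cos(theta), 1j * np.sin(theta)
    return np.array([
        [1, 0, 0, 0],
        [0, c, s, 0],
        [0, s, c, 0],
        [0, 0, 0, 1],
    ])
```

At \(\theta =0\) the gate reduces to the identity, and at \(\theta =\pi /2\) it fully exchanges the single-excitation states (up to the factor of \(i\)).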

To use the experimental data to perform geometry optimization for H\({}_{2}\), the ground state was estimated via a VQE.10,11 The Hamiltonian at a given H–H bond length \({R}_{{\rm{H-H}}}\) was calculated in the STO-3G basis using the Dirac package,51 converted to a qubit representation using the Bravyi–Kitaev transformation, and reduced to two qubits via exact block-diagonalization,12 using the OpenFermion package52 and the OpenFermion–Dirac interface.53 With the Hamiltonian \(\hat{H}({R}_{{\rm{H-H}}})\) fixed, the ground state was chosen variationally: \(\rho ({R}_{{\rm{H-H}}})={{\rm{argmin}}}_{{\rho }_{i}}{\rm{Trace}}[\hat{H}({R}_{{\rm{H-H}}}){\rho }_{i}]\). The gradient and Hessian were then calculated from \(\rho ({R}_{{\rm{H-H}}})\) using the Hellmann–Feynman theorem (Section II B) and the ETA (Section IV M), respectively. For the ETA, we generated eigenstates using the QSE, with the Pauli operator \(XY\) as a single excitation. This operator acts within the number-conserving subspace of the two-qubit Hilbert space and, being imaginary, will not have the real-valued H\({}_{2}\) ground state as an eigenstate. (This in turn guarantees that the generated excited state is linearly independent of the ground state.) In future work one would want to optimize the choice of \(\theta\) at each distance \({R}_{{\rm{H-H}}}\); this was not performed here due to time constraints. We have also not implemented the error mitigation strategies studied in ref. 86, for the sake of simplicity.
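The variational selection step described above amounts to a simple classical minimization over the measured density matrices. A minimal sketch (hypothetical names; assumes \(\hat{H}({R}_{{\rm{H-H}}})\) and the \({\rho }_{i}\) are available as matrices):

```python
import numpy as np

def select_ground_state(h, rhos):
    """Return the index and energy of the measured density matrix that
    minimizes Tr[H rho], i.e., the variational ground-state choice."""
    energies = [float(np.real(np.trace(np.asarray(h) @ np.asarray(r))))
                for r in rhos]
    i = int(np.argmin(energies))
    return i, energies[i]
```

This search is over only the \(41\) tomographically reconstructed states, so its classical cost is negligible.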

Simulation methods

Classical simulations of the quantum device were performed in the full-density-matrix simulator (quantumsim).49 A realistic error model of the device was built using experimentally calibrated parameters to account for qubit decay (\({T}_{1}\)), pure dephasing (\({T}_{2}^{* }\)), residual excitations of both qubits, and additional dephasing of qubits fluxed away from the sweet spot (which reduces \({T}_{2}^{* }\) to \({T}_{2}^{* ,{\rm{red}}}\) for the duration of the flux pulse). This error model further accounted for differences in the observed noise model on the individual qubits, as well as for fluctuations in coherence times and residual excitation numbers. Further details of the error model may be found in ref. 86 (with device parameters in Table S1 of that reference).

With this error model, \(100\) simulated experiments were performed at each of the \(41\) experimental angles. Each simulated experiment used unique coherence-time and residual-excitation values (drawn from a distribution of the observed experimental fluctuations), and had appropriate levels of sampling noise added. These density matrices were then resampled \(100\) times for each simulation.