Framework

Here we review the computational mechanics framework used to express our results (a more complete overview may be found elsewhere2,3). We consider continuous-time discrete-alphabet stochastic point processes. Such a process \({\cal P}\) is characterised by a sequence of observations \((x_n, t_n)\), drawn from a probability distribution \(P(X_n, T_n)\).34 Here, the \(x_n\), drawn from an alphabet \({\cal A}_n\), are the symbols emitted by the process, while the \(t_n\) record the times between emissions n − 1 and n. For shorthand, we denote the dual \({\boldsymbol{x}}_n = (x_n, t_n)\), and similarly \({\boldsymbol{X}}_n\) for the associated stochastic variable. We denote a contiguous string of observations of emitted symbols and their temporal separations by the concatenation \({\boldsymbol{x}}_{l:m} = {\boldsymbol{x}}_l{\boldsymbol{x}}_{l + 1} \ldots {\boldsymbol{x}}_{m - 1}\), and for a stationary process we mandate that \(P({\boldsymbol{X}}_{0:L}) = P({\boldsymbol{X}}_{s:s + L})\,\forall s,L \in {\Bbb Z}\). Note that the discrete-time case consists of either coarse-graining the \(t_n\), or considering processes where such dwell times are either identical or irrelevant.

We define the past of a process \(\overleftarrow {\boldsymbol{x}} = {\boldsymbol{x}}_{ - \infty :0}\left( {\emptyset ,t_{0^ + }} \right)\), where 0 is the current emission step (i.e., the next emitted symbol will be \(x_0\), and \({\emptyset}\) denotes that this symbol is currently undetermined), and \(t_{0^ + }\) is the time since the last emission, with associated random variable \(T_{0^ + }\). Analogously, defining \(t_{0^ - }\) as the time to the next emission, we can denote the future \(\overrightarrow {\boldsymbol{x}} = \left( {x_0,t_{0^ - }} \right){\boldsymbol{x}}_{1:\infty }\).27 The causal states of the process are then the equivalence classes defined according to a predictive equivalence relation;1 two past sequences \(\overleftarrow {\boldsymbol{x}}\) and \(\overleftarrow {\boldsymbol{x}} ^\prime\) belong to the same causal state (i.e. \(\overleftarrow {\boldsymbol{x}} \sim _e\overleftarrow {\boldsymbol{x}} ^\prime\)) iff they satisfy

$$P\left( {\overrightarrow {\boldsymbol{X}} |\overleftarrow {\boldsymbol{X}} = \overleftarrow {\boldsymbol{x}} } \right) = P\left( {\overrightarrow {\boldsymbol{X}} |\overleftarrow {\boldsymbol{X}} = \overleftarrow {\boldsymbol{x}} ^\prime } \right).$$ (1)

We use the notation \(S_j\) to represent the causal state labelled by some index j.

We desire models that are predictive, wherein the internal memory of a simulator implementing the model contains all (and no additional) information relevant to the future statistics that can be obtained from the entire past. The first part of this entails the simulator memory having the same predictive power as knowledge of the entire past (prescience2), while the second ensures that knowledge of the memory provides no further predictive power than observing the entire past output (information about the future accessible in this manner is referred to as oracular,35 and implies the simulator having decided aspects of its future output in advance). This notion of predictive models is stricter than the broader class of generative models, which must only be able to faithfully reproduce future statistics; internal states of models in the broader class may contain additional information that allows for better prediction of future outputs than knowledge of the past, violating the non-oracular condition. We note that while there exist generative models that can operate with lower memory than the optimal predictive models we will now introduce, as this is achieved by leveraging oracular information we do not consider such models here.

The provably optimal predictive classical models, termed ‘ε-machines’, operate on the causal states.1,2 In general the systematic structure of these models is well understood only for discrete-time processes, though, as we later discuss, recent efforts have been made towards constructing corresponding continuous-time machines. A discrete-time ε-machine may be represented by an edge-emitting hidden Markov model, in which the hidden states are the causal states, the transitions (edges) between these states involve the emission of a symbol from the process alphabet, and the string of emitted symbols forms the process. The edges are defined by a dynamic \(T_{kj}^{(x)}\) describing the probability of transitioning from causal state \(S_j\) to \(S_k\) while emitting symbol x. The \(T_{kj}^{(x)}\) are thus defined by the statistics of the process, and because they depend only on the current hidden state the model is Markovian. Further, as the predictive equivalence relation ensures that the system is always in a definite causal state defined wholly and uniquely by its past output, ε-machines are unifilar.2 This means that for a given initial causal state and subsequent emission(s), the current causal state is known with certainty.

The quantity of interest for our study is the statistical complexity \(C_\mu\), which answers the question “What is the minimal information required about the past in order to accurately predict the future?”. It is defined as the Shannon entropy36 of the steady state distribution π of the causal states \(S_j\);

$$C_\mu = - \mathop {\sum}\limits_j {\kern 1pt} \pi \left( {S_j} \right){\mathrm{log}}_2\left( {\pi \left( {S_j} \right)} \right).$$ (2)

The use of Shannon entropy is motivated by considering the memory to be the average information stored about the past (alternatively, it can be viewed as the average information communicated in the process from the past to the future). Due to the ergodic nature of the processes considered, the time average and the ensemble average are equivalent. However, one could also consider the Hartley entropy, that is, the size of the substrate into which the memory is encoded (i.e., the logarithm of the number of states).1 It can be shown that the ε-machine also optimises this measure,2 though we shall here focus on the former measure, and implicitly consider an ensemble scenario. That is, when operating N independent simulators, the total memory required tends to \(NC_\mu\) as N → ∞.36 The statistical complexity is lower-bounded by the mutual information between the past and future of the process, referred to as the excess entropy \(E = I\left( {\overleftarrow {\boldsymbol{X}} ;\overrightarrow {\boldsymbol{X}} } \right)\).2
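
As a minimal numerical illustration of Eq. (2), the following Python/NumPy sketch computes \(C_\mu\) for a hypothetical steady-state distribution over three causal states; the distribution is invented purely for illustration, and the later sketches in this section follow the same conventions.

```python
import numpy as np

# Hypothetical steady-state distribution over three causal states (illustrative only).
pi = np.array([0.5, 0.3, 0.2])

# Statistical complexity C_mu = Shannon entropy of the steady-state distribution (Eq. 2).
C_mu = -np.sum(pi * np.log2(pi))
print(C_mu)          # ~1.485 bits

# Ensemble interpretation: N independent simulators require ~N * C_mu bits as N grows large.
N = 1000
print(N * C_mu)
```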

Although the predictive equivalence relation defines the optimal model for both discrete-time processes and continuous-time processes, as noted earlier, most works so far have been devoted to studying the ε-machines of discrete-time processes. It is only recently that a similar systematic causal architecture has been uncovered for a restricted set of continuous-time processes, renewal processes.27 Renewal processes form a special case of the above, where each emission occurs at an independent and identically distributed (IID) probabilistic time, and emits the same symbol. Such processes are defined entirely by this emission probability density ϕ(t), and the sequence is fully described by the emission times alone. It is useful to define the following quantities for a renewal process: the survival probability \({\mathrm{\Phi }}(t) = {\int}_t^\infty {\kern 1pt} \phi (t^{\prime})\mathrm{d}t^{\prime}\); and the mean firing rate \(\mu = \left( {{\int}_0^\infty {\kern 1pt} t\phi (t)\mathrm{d}t} \right)^{ - 1}\).
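
These definitions are straightforward to check numerically. The sketch below evaluates \(\Phi(t)\) and \(\mu\) on a grid for an assumed Poissonian density \(\phi(t) = \lambda e^{-\lambda t}\), chosen only because both quantities are then known in closed form.

```python
import numpy as np

# Assumed emission density: Poissonian, phi(t) = lam * exp(-lam * t) (illustrative choice).
lam = 2.0
t = np.linspace(0.0, 20.0, 200001)   # grid truncating the infinite upper integration limit
dt = t[1] - t[0]
phi = lam * np.exp(-lam * t)

# Survival probability Phi(t) = int_t^inf phi(t') dt', here as 1 minus the cumulative integral.
Phi = 1.0 - np.cumsum(phi) * dt

# Mean firing rate mu = ( int_0^inf t phi(t) dt )^-1.
mu = 1.0 / np.sum(t * phi * dt)

print(mu)                                      # ~ lam: the mean waiting time of this density is 1/lam
print(Phi[10000], np.exp(-lam * t[10000]))     # both ~ exp(-lam t): Phi matches the analytic form
```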

In Fig. 2 we show a generative model for such a process. Because of the IID nature of the process, the only relevant part of the past in predicting the future statistics is the time since the last emission \(t_{0^ + }\), and this assists us only in predicting the time to the next emission \(t_{0^ - }\).27 Thus, the causal equivalence relation simplifies to

$$t_{0^ + }\sim _et_{0^ + }^\prime \Leftrightarrow P\left( {T_{0^ - }|T_{0^ + } = t_{0^ + }} \right) = P\left( {T_{0^ - }|T_{0^ + } = t_{0^ + }^\prime } \right).$$ (3)

We label the causal states \(S_{t_{0^ + }}\) according to the minimum \(t_{0^ + }\) belonging to the equivalence class. Depending on the form of ϕ(t), we can determine which \(t_{0^ + }\) belong to the same causal state. Notably, if ϕ(t) is Poissonian, the time since the last emission is irrelevant (as the decay rate is constant), and hence all \(t_{0^ + }\) belong to the same causal state; the process is memoryless and has \(C_\mu = 0\). All other processes involve a continuum of causal states, which may either extend indefinitely, terminate in a single state at a certain time, or eventually enter a periodic continuum (see Methods A). The steady state probability density \(\pi(S_t)\) of the causal states depends on this causal architecture (Methods B). We specifically highlight that states in the initial continuum have \(\pi(S_t) = \mu {\mathrm{\Phi }}(t)\); as we will later discuss, this is the only necessary part of the architecture once we turn to quantum causal states.

Fig. 2 Generative model for a renewal process. Diagram depicting a generative model for a renewal process. The labelling indicates that a symbol 0 is emitted with probability 1, at time t with probability density ϕ(t), and the system returns to the same state.

The statistical complexity of the process can be defined in correspondence with Eq. (2), by taking the continuous limit of a discretised analogue of the process;

$$C_\mu = \mathop {{{\mathrm{lim}}}}\limits_{\delta t \to 0} - \mathop {\sum}\limits_{n = 0}^\infty {\kern 1pt} \pi \left( {S_{n\delta t}} \right)\delta t{\kern 1pt} {\mathrm{log}}_2\left( {\pi \left( {S_{n\delta t}} \right)\delta t} \right).$$ (4)

This quantity will however either be zero (for a Poissonian emission probability density) or infinite (for all other distributions), due to the infinitesimal coarse-graining. Classically, therefore, it is not the most enlightening measure of complexity, and this has motivated earlier work on this topic27 to instead use the differential entropy for the statistical complexity; \(C_\mu ^{({\mathrm{DE}})} = - {\int}_0^\infty {\kern 1pt} \mathrm{d}t\,\pi \left( {S_t} \right){\mathrm{log}}_2{\kern 1pt} \pi \left( {S_t} \right)\). While this quantity allows for a comparison of the complexity of two processes, we find it lacking as an absolute measure of complexity, as it requires one to take logarithms of dimensionful quantities, and loses the original physical motivation of being the information contained within the process about its past. Instead, we will employ the true continuum limit of the Shannon entropy Eq. (4) as the measure of a process’ statistical complexity, accepting the infinities as faithfully representing that classical implementations of such models do indeed require infinite memory.
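
The divergence of Eq. (4) is easy to observe numerically. The sketch below evaluates the discretised sum for the uniform-interval density treated later in the Examples section (\(\phi(t) = 1/\tau\) on \([0,\tau)\), so that \(\pi(S_t) = \mu{\mathrm{\Phi}}(t)\) takes a simple form), and exhibits the roughly \({\mathrm{log}}_2(1/\delta t)\) growth as the coarse-graining is refined.

```python
import numpy as np

# Discretised classical complexity, Eq. (4), for the uniform density phi(t) = 1/tau on [0, tau).
# Here pi(S_t) = mu * Phi(t) with mu = 2/tau and Phi(t) = 1 - t/tau.
tau = 1.0
mu = 2.0 / tau
for delta_t in [1e-1, 1e-2, 1e-3, 1e-4]:
    t = np.arange(0.0, tau, delta_t)
    p = mu * (1.0 - t / tau) * delta_t       # pi(S_{n delta_t}) * delta_t
    p = p[p > 0]
    C_mu = -np.sum(p * np.log2(p))
    print(delta_t, C_mu)                     # grows roughly as log2(1/delta_t): no finite continuum limit
```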

Quantum causal states

It has been shown that a quantum device simulating a discrete-time process can in general require less memory than the optimal classical model.17 In order to assemble such a device, for each causal state \(S_j\) one must construct a corresponding quantum causal state \(\left| {S_j} \right\rangle = \mathop {\sum}\nolimits_{xk} {\kern 1pt} \sqrt {T_{kj}^{(x)}} \left| x \right\rangle \left| k \right\rangle\), where, as defined above, the transition dynamic \(T_{kj}^{(x)}\) is the probability that a system in \(S_j\) will transition to \(S_k\), while emitting symbol x. The machine then operates by mapping the state \(\left| k \right\rangle\) with a blank ancilla to \(\left| {S_k} \right\rangle\), following which measurement of the \(\left| x \right\rangle\) subspace will produce symbol x with the correct probability, while leaving the remaining part of the system in \(\left| {S_k} \right\rangle\). The internal steady state of the machine is given by \(\rho = \mathop {\sum}\nolimits_j {\kern 1pt} \pi \left( {S_j} \right)\left| {S_j} \right\rangle \left\langle {S_j} \right|\). We refer to such constructions as q-machines, and their internal memory \(C_q\) can be described by the von Neumann entropy36 of the steady state;

$$C_q = - {\mathrm{Tr}}\left( {\rho {\kern 1pt} {\mathrm{log}}_{\mathrm{2}}\rho } \right).$$ (5)

Unlike classical causal states, the overlaps \(\left\langle {S_j|S_k} \right\rangle\) of different quantum causal states are in general non-zero, and hence \(C_q \le C_\mu\) (typically the inequality is strict); thus, the q-machine has a lower internal memory requirement than the corresponding ε-machine.17 Physically, this memory saving can be understood as the lack of a need to store information that allows complete discrimination between two pasts when they have some overlap in their conditional futures. This entropy reduction acquires operational significance when one considers an ensemble of independent simulators of a process sharing a common total memory.17 As with the classical case, \(C_q\) is also lower bounded by the excess entropy of the process. Note that while this quantum construction is superior to the optimal classical model, it does not necessarily provide the optimal quantum model. Indeed, for particular classes of process, constructions involving several symbol outputs are known that have even lower internal memory,19,20 and there may exist as yet unknown further optimisations beyond this. Such known improvements are however not relevant for the processes we consider.
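
A compact numerical sketch of this discrete-time construction is given below; the two-state, two-symbol transition dynamic is a hypothetical example chosen purely for illustration. The code builds the \(\left| {S_j} \right\rangle\), forms the steady-state density matrix, and compares \(C_q\) (Eq. (5)) against \(C_\mu\) (Eq. (2)).

```python
import numpy as np

# Hypothetical transition dynamic T[x, k, j]: probability of going from causal state j to k
# while emitting symbol x (values chosen for illustration only).
p = 0.3
T = np.zeros((2, 2, 2))
T[0, 0, 0] = 1 - p    # from state 0: emit 0, remain in 0
T[1, 1, 0] = p        # from state 0: emit 1, move to 1
T[1, 1, 1] = 1 - p    # from state 1: emit 1, remain in 1
T[0, 0, 1] = p        # from state 1: emit 0, move to 0

# Steady-state distribution over causal states, from the symbol-summed (column-stochastic) matrix.
M = T.sum(axis=0)                                  # M[k, j] = P(next state k | current state j)
w, v = np.linalg.eig(M)
pi = np.real(v[:, np.argmax(np.real(w))])
pi /= pi.sum()

# Quantum causal states |S_j> = sum_{x,k} sqrt(T[x,k,j]) |x>|k>, written as 4-component vectors.
S = np.sqrt(T).reshape(4, 2)                       # column j is |S_j> in the |x>|k> product basis

# Steady-state density matrix and its von Neumann entropy C_q (Eq. 5).
rho = sum(pi[j] * np.outer(S[:, j], S[:, j]) for j in range(2))
lam = np.linalg.eigvalsh(rho)
lam = lam[lam > 1e-12]
C_q = -np.sum(lam * np.log2(lam))

# Classical statistical complexity for comparison (Eq. 2).
C_mu = -np.sum(pi * np.log2(pi))
print(C_q, C_mu)    # C_q < C_mu because <S_0|S_1> = 2 sqrt(p(1-p)) is non-zero
```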

We now seek to extend this quantum memory reduction advantage to the realm of continuous-time processes. To do so, we first define a wavefunction \(\psi (t) = \sqrt {\phi (t)}\). We can rephrase the survival probability and mean firing rate in terms of this wavefunction: \({\mathrm{\Phi }}(t) = {\int}_t^\infty \left| {\psi (t^{\prime})} \right|^2\mathrm{d}t^{\prime}\); and \(\mu = \left( {{\int}_0^\infty {\kern 1pt} t\left| {\psi (t)} \right|^2\mathrm{d}t} \right)^{ - 1}\). Inspired by the quantum construction for discrete-time processes, we wish to construct quantum causal states \(\left| {S_t} \right\rangle\) such that when a measurement is made of the state (in a predefined basis), it reports a value t′ with probability (density) \(P\left( {T_{0^ - } = t^{\prime}|T_{0^ + } = t} \right)\). We may view the quantum causal state as a continuous alphabet (representing the value of \(t_{0^ - }\)) analogue of the discrete case, with only a single causal state (\(S_0\)) to which the system may transition after emitting this symbol.

The probability density \(P\left( {T_{0^ - } = t^{\prime}|T_{0^ + } = t} \right)\) is given by \(\phi \left( {t + t^{\prime}} \right){\mathrm{/}}{\int}_t^\infty {\kern 1pt} \phi (t^{\prime\prime})\mathrm{d}t^{\prime\prime} = \phi (t + t^{\prime}){\mathrm{/\Phi }}(t)\). By analogy with the discrete case we construct our quantum causal states as \(\left| {S_t} \right\rangle = {\int}_0^\infty {\kern 1pt} \mathrm{d}t^{\prime}\sqrt {P\left( {T_{0^ - } = t^{\prime}|T_{0^ + } = t} \right)} \left| {t^{\prime}} \right\rangle\), and thus:

$$\left| {S_t} \right\rangle = \frac{1}{{\sqrt {{\mathrm{\Phi }}(t)} }}{\int}_0^\infty {\kern 1pt} \mathrm{d}t^{\prime}\psi (t + t^{\prime})\left| {t^{\prime}} \right\rangle .$$ (6)

We emphasise that while the wavefunction is encoding information about time in the modelled process, the q-machine used for simulation may encode it in any practicable continuous variable, such as the position of a particle. The measurement basis used to obtain the correct statistics is of course that defined by \(\left\{ {\left| t \right\rangle } \right\}\) (that is, measurement outcome t′ occurs with probability density \(\left| {\left\langle {t^{\prime}|S_t} \right\rangle } \right|^2 = \left| {\psi (t + t^{\prime})} \right|^2{\mathrm{/\Phi }}(t)\) when the system is in state \(\left| {S_t} \right\rangle\)).

When the first segment \([0,\tilde t)\) of the continuous variable in a quantum causal state is swept across, if the system is not found to be in this region the state is modified by application of the projector \({\mathrm{{\Pi}}}_{\tilde t} = {\int}_{\tilde t}^\infty {\kern 1pt} \mathrm{d}t\left| t \right\rangle \left\langle t \right|\) and appropriate renormalisation. When this projector is applied to the state \(\left| {S_t} \right\rangle\), the resulting state is simply \(\left| {S_{t + \tilde t}} \right\rangle\) displaced by \(\tilde t\). After correcting for this displacement, the effect of the measurement sweep is identical to the change in the internal memory of the machine when no emission is observed in a time period \(\tilde t\); thus, the quantum causal states automatically update when measurement sweeps are used to simulate the progression of time.
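
The construction of Eq. (6) and this self-updating property can be verified with a simple discretisation; the sketch below assumes the uniform-interval density from the Examples section purely as an illustrative choice.

```python
import numpy as np

# Discretised quantum causal states for the uniform density phi(t) = 1/tau on [0, tau), tau = 1.
tau, dt = 1.0, 1e-4
grid = np.arange(0.0, tau, dt)
Phi = lambda t: max(1.0 - t / tau, 0.0)                    # survival probability

def S(t):
    """Discretised |S_t>: amplitudes psi(t + t') / sqrt(Phi(t)) on the t' grid (Eq. 6)."""
    shifted = np.where(grid + t < tau, 1.0 / np.sqrt(tau), 0.0)
    return shifted * np.sqrt(dt) / np.sqrt(Phi(t))

t = 0.3
state = S(t)
print(np.sum(state**2))                    # ~1: the state is normalised

# Measurement statistics: |<t'|S_t>|^2 / dt should equal phi(t + t') / Phi(t).
idx = 2000                                 # corresponds to t' = 0.2
print(state[idx]**2 / dt, (1.0 / tau) / Phi(t))

# Self-updating under a measurement sweep of [0, t_tilde) with no emission found:
# project onto t' >= t_tilde, renormalise, shift back by t_tilde, and compare with |S_{t + t_tilde}>.
t_tilde = 0.25
n = int(round(t_tilde / dt))
projected = state.copy()
projected[:n] = 0.0
projected /= np.linalg.norm(projected)
shifted = np.concatenate([projected[n:], np.zeros(n)])     # undo the displacement by t_tilde
print(np.abs(np.dot(shifted, S(t + t_tilde))))             # ~1: the states coincide
```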

The overlap of two quantum causal states can straightforwardly be calculated:

$$\left\langle {S_a|S_b} \right\rangle = \frac{1}{{\sqrt {{\mathrm{\Phi }}(a){\mathrm{\Phi }}(b)} }}{\int}_0^\infty {\kern 1pt} \mathrm{d}t\psi (t + a)\psi (t + b).$$ (7)

By their very construction, these quantum states automatically merge states with identical future statistics, even if we neglect the underlying causal architecture. Recall the causal equivalence relation Eq. (3). Since these probabilities wholly define the quantum states, if two quantum states have the same future statistics they are identical by definition. Due to the linearity of quantum mechanics, the steady state probabilities of the identical quantum states are added together to find the total probability for the state, in much the same way as the underlying state probabilities are added together when merging states to form the classical causal states. Thus, when constructing the quantum ‘causal’ states, we are at liberty to ignore the classical causal architecture as described in Methods A, without any penalty to the information that is stored by the q-machine, and instead construct quantum states for all t ≥ 0 according to the prescription of Eq. (6). Note that the causal architecture can still be used as a calculational aid.
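
As a numerical check of Eq. (7), the sketch below compares a direct discretised evaluation of the overlap with its closed form for the uniform-interval density (again an illustrative assumption), \(\left\langle {S_a|S_b} \right\rangle = \sqrt {(1 - {\mathrm{max}}(a,b)/\tau )/(1 - {\mathrm{min}}(a,b)/\tau )}\).

```python
import numpy as np

# Overlap of quantum causal states, Eq. (7), for the uniform density phi(t) = 1/tau on [0, tau).
tau = 1.0
t = np.arange(0.0, tau, 1e-5)
dt = t[1] - t[0]
psi = lambda x: np.where((x >= 0) & (x < tau), 1.0 / np.sqrt(tau), 0.0)
Phi = lambda x: max(1.0 - x / tau, 0.0)

def overlap(a, b):
    return np.sum(psi(t + a) * psi(t + b)) * dt / np.sqrt(Phi(a) * Phi(b))

a, b = 0.2, 0.6
print(overlap(a, b))                                            # numerical Eq. (7)
print(np.sqrt((1 - max(a, b) / tau) / (1 - min(a, b) / tau)))   # closed form for this density
```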

Memory of continuous-time q-machines

From Eq. (7) we see that in general the overlaps of the quantum causal states are non-zero, unlike the corresponding classical states, which are orthogonal. Because of this reduced distinguishability of the quantum causal states, the entropy of their steady state distribution is less than that of the classical causal states. Hence, the amount of information that must be stored by the q-machine to accurately predict future statistics is less than that of the optimal classical machine, evincing a quantum advantage for the simulation of continuous-time stochastic processes. We will later show with our examples that this advantage can be unbounded, wherein q-machines have only a finite memory requirement for the simulation of processes for which the ε-machine requires an infinite amount of information about the past. Note that even when we consider coarse-graining the time since the last emission to a resolution of finite intervals δt, we shall still see a quantum advantage due to the non-orthogonality of the quantum states. Note also that decoherence of the memory into the measurement basis destroys the quantum advantage, and will result in the classical internal memory cost \(C_\mu\) (see Methods C).

The density matrix describing the internal state of the q-machine is given by \(\rho = {\int}_0^\infty {\kern 1pt} \mathrm{d}t\pi \left( {S_t} \right)\left| {S_t} \right\rangle \left\langle {S_t} \right|.\) As discussed above, we can construct the quantum states \(\left| {S_t} \right\rangle\) for all t, in which case their steady state probability density \(\pi(S_t)\) is given by \(\mu {\mathrm{\Phi }}(t)\). We thus find that the elements of the density matrix are given by \(\rho (a,b) = \mu {\int}_0^\infty {\kern 1pt} \mathrm{d}t\psi (t + a)\psi (t + b)\). From this, we can construct a characteristic equation to find the eigenvalues \(\lambda_n\) that diagonalise the density matrix:

$$\mu {\int}_0^\infty {\kern 1pt} \mathrm{d}b{\int}_0^\infty {\kern 1pt} \mathrm{d}t\psi (t + a)\psi (t + b)f_n(b) = \lambda _nf_n(a).$$ (8)

The information stored by the q-machine can then be expressed in terms of these eigenvalues; \(C_q = - \mathop {\sum}\nolimits_n {\kern 1pt} \lambda _n{\kern 1pt} {\mathrm{log}}_2\lambda _n\). We find that this quantity is invariant under rescaling of the time variable in the emission probability density (see Methods D for details).
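
A direct numerical route to \(C_q\) is to discretise the kernel ρ(a, b) on a grid and diagonalise it. The sketch below does so for the uniform-interval density (an illustrative choice; the grid sizes are likewise arbitrary), and the resulting entropy can be compared with the analytic value quoted in the Examples section.

```python
import numpy as np

# Numerical diagonalisation of the kernel rho(a, b) = mu * int_0^inf psi(t+a) psi(t+b) dt,
# discretised on a grid, for the uniform density phi(t) = 1/tau on [0, tau), tau = 1.
tau = 1.0
mu = 2.0 / tau
N = 2000
a = np.linspace(0.0, tau, N, endpoint=False)       # grid of state labels; no support beyond tau
da = a[1] - a[0]

t = np.linspace(0.0, tau, 4000, endpoint=False)    # integration grid (psi vanishes beyond tau)
dt_int = t[1] - t[0]
psi = lambda x: np.where((x >= 0) & (x < tau), 1.0 / np.sqrt(tau), 0.0)

P = psi(a[:, None] + t[None, :])                   # P[i, k] = psi(a_i + t_k)
rho = mu * (P @ P.T) * dt_int                      # rho(a_i, a_j)

# The continuum operator acts with measure da, so we diagonalise rho * da (trace ~ 1).
lam = np.linalg.eigvalsh(rho * da)
lam = lam[lam > 1e-12]
C_q = -np.sum(lam * np.log2(lam))
print(lam.sum(), C_q)                              # trace ~ 1; C_q approaches ~1.28 bits
```

The same routine applies, with a suitable truncation of the integration range, to any density for which ψ can be evaluated on a grid.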

Building q-machine simulators of renewal processes

While we have explained in the abstract sense how one constructs the quantum causal states, it is interesting to also consider the structure of a device that would actually perform such simulations. In fact, a digital simulation of the process, which simply emits a sequence \(t_{0^ - :L}\) on demand drawn from the correct probability distribution \(P\left( {T_{0^ - :L} = t_{0^ - :L}|T_{0^ + } = t_{0^ + }} \right)\), would be very straightforward to assemble in principle: one must prepare the state \(\left| {S_{t_{0^ + }}} \right\rangle\), and L − 1 copies of \(\left| {S_0} \right\rangle\) (the states are all independent due to the renewal process emissions being IID). Measurement of the first state provides the \(t_{0^ - }\), while measurement of the others provides the \(t_{1:L}\). Because of the self-updating nature of the quantum causal states under partial measurement sweeps \([0,\tilde t)\), measurement over such a range can be used to simulate the effect of waiting for a time \(\tilde t\) for an emission.
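
Measuring the prepared states in the \(\left\{ {\left| {t^{\prime}} \right\rangle } \right\}\) basis yields samples from \(\phi (t_{0^ + } + t^{\prime}){\mathrm{/\Phi }}(t_{0^ + })\) for the first emission and from ϕ itself for the subsequent IID emissions; the sketch below reproduces these measurement statistics classically by direct sampling, assuming the uniform-interval density as an example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Uniform-interval density phi(t) = 1/tau on [0, tau): assumed example.
tau = 1.0
t0_plus = 0.3          # time already elapsed since the last emission

# Measuring |S_{t0_plus}> in the {|t'>} basis gives t' with density phi(t0_plus + t')/Phi(t0_plus),
# which for this phi is uniform on [0, tau - t0_plus). Subsequent emissions are IID draws from phi.
L = 5
t_first = rng.uniform(0.0, tau - t0_plus)          # statistics of measuring |S_{t0_plus}>
t_rest = rng.uniform(0.0, tau, size=L - 1)         # statistics of measuring L-1 copies of |S_0>
emission_gaps = np.concatenate([[t_first], t_rest])
print(emission_gaps)                               # simulated sequence t_{0^-}, t_1, ..., t_{L-1}
```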

However, this scheme is unsatisfactory as one must manually switch to a new state after each emission. Rather, a device that automatically begins operating on the state for the next emission after the previous state is finished would be preferable. We now describe such a construction, and even go a step further, by devising a setup that enables an analogue simulation of the process, and is thus able to provide emission times in (scaled) real time. For illustrative purposes, we first describe the protocol for discrete timesteps (that may be coarse-grained arbitrarily finely), and then discuss how it can be performed in continuous-time.

The procedure for the discrete-time case is as follows. Consider an infinite chain of qubits (two-state quantum systems) labelled from 0 to ∞. Using \(\left| {1_n} \right\rangle\) to denote the state where all qubits are in state \(\left| 0 \right\rangle\) apart from the nth, which is in state \(\left| 1 \right\rangle\), we can express the discretised analogues \(\left| {\sigma _t} \right\rangle\) of the quantum causal states \(\left| {S_t} \right\rangle\) as \(\left| {\sigma _t} \right\rangle = \mathop {\sum}\nolimits_n \sqrt {P\left( {T_{0^ - } = n\mathrm{\delta} t|T_{0^ + } = t} \right)\mathrm{\delta} t} \left| {1_n} \right\rangle\), where \(P\left( {T_{0^ - } = n\mathrm{\delta} t|T_{0^ + } = t} \right) \to \phi (t + n\mathrm{\delta} t){\mathrm{/\Phi }}(t)\) as δt → 0. The location n of the qubit in state \(\left| 1 \right\rangle\) then represents the time nδt at which the emission occurs. We initialise the system in state \(\left| {\sigma _{t_{0^ + }}} \right\rangle\), according to the desired initial \(t_{0^ + }\). The chain is then processed sequentially, one qubit at a time, by performing a control gate on the qubit, which has the effect of mapping the next block of the chain to the state \(\left| {\sigma _0} \right\rangle\) if the qubit is in state \(\left| 1 \right\rangle\), and doing nothing otherwise (explicitly, the mapping required is \(\left| 0 \right\rangle \left| {1_n} \right\rangle \to \left| 0 \right\rangle \left| {1_n} \right\rangle\) \(\forall n \in {\Bbb Z}^ +\) and \(\left| 1 \right\rangle \left| 0 \right\rangle ^{ \otimes \infty } \to \left| 1 \right\rangle \left| {\sigma _0} \right\rangle\), where by construction these are the only possible input states). The qubit is then ejected from the machine (where measurement can be used to determine whether an emission event occurs at this time), and the machine then acts on the next qubit in the chain (Fig. 3a). This operation has the effect of preparing the chain in a state that provides the correct conditional probabilities if no emission is observed, and prepares the state with the correct distribution for the next emission step if an emission is observed.

Fig. 3 q-machine simulators of renewal processes. a Analogue simulator for a discrete-time renewal process, where a continuous chain of qubits is used to encode the quantum causal state. The simulator sweeps along the chain and alters the future of the chain conditional on the current qubit, with the mappings \(\left| 0 \right\rangle \left| {1_n} \right\rangle \to \left| 0 \right\rangle \left| {1_n} \right\rangle\) and \(\left| 1 \right\rangle \left| 0 \right\rangle ^{ \otimes \infty } \to \left| 1 \right\rangle \left| {\sigma _0} \right\rangle\). Measurement of the qubit state signifies whether an emission occurs in a given timestep. b Analogue simulator for continuous-time renewal processes, where the quantum causal state is encoded into the position of a particle. The simulator sweeps along this position and generates additional particles encoding future emissions conditional on the presence of the particle. Detection of the particle signals an emission event.

To operate this protocol in continuous-time, instead of encoding the state onto a discrete chain, we instead use a continuous degree of freedom, such as spatial position (henceforth referred to as the ‘tape’). As with the discrete case, we process sequentially along the tape, performing a unitary gate on the future of the tape, controlled on the current segment. Each emission step has its emission time encoded by the position of a particle on the tape (Fig. 3b); the first particle on the tape is initialised in \(\left| {S_t} \right\rangle = \left( {1{\mathrm{/}}\sqrt {{\mathrm{\Phi }}(t)} } \right){\int}_0^\infty {\kern 1pt} \mathrm{d}x\psi (t + x)\left| x \right\rangle\), where x labels the position on the tape. Since the controlled unitary operation must be performed in discrete time, on a discrete length of tape, it is designed such that it acts, controlled on the presence of a particle in the block, by placing a particle in state \(\left| {S_0} \right\rangle\), displaced to have its zero at the location of the control particle, and does nothing otherwise, akin to the discrete case above (that is, if the present particle is at position x, the combined state of the old and new particle is mapped to \(\left| x \right\rangle \left| {S_{ - x}} \right\rangle\), where we clarify that ψ(t) = 0 if t is negative). More formally, this can be written as the transformation \({\int}_0^\infty {\kern 1pt} \mathrm{d}t{\int}_L {\kern 1pt} \mathrm{d}x\psi (x + t)a_{x + t}^\dagger a_x^\dagger a_x\), where L is the block of tape upon which the gate acts, and \(a_x^\dagger\) creates a particle at x. Strictly, the gate should act in a nested fashion, by further generating an additional particle in an appropriately displaced state, when the new particle is placed within the current block. The machine then progresses to perform the same operation contiguously on the next block, while feeding out the previous block (equivalently, the tape can be fed through a static machine). Measurement of the positions of particles on the tape fed out then provides the simulated emission times.

Examples

We illustrate our proposal with two examples. We show for both of these examples that not only is there a reduction in the memory requirement of the q-machine compared to the ε-machine, but also that the q-machine needs only a finite amount of memory, while the classical machine requires an infinite amount. Here we summarise the results; the technical details may be found in Methods E and F.

The first example is a uniform emission probability over the interval [0, τ). The corresponding emission probability density is ϕ(t) = 1/τ for 0 ≤ t < τ, and zero elsewhere (Fig. 4a). The mean firing rate and survival probability are given by μ = 2/τ and Φ(t) = 1 − t/τ (t < τ) respectively. The quantum causal states are given by \(\left| {S_t} \right\rangle = {\int}_0^{\tau - t} {\kern 1pt} \mathrm{d}t^{\prime}\left( {1{\mathrm{/}}\sqrt {\tau - t} } \right)\left| {t^{\prime}} \right\rangle\), and we can solve Eq. (8) to find that \(\lambda _n = 8{\mathrm{/}}(\pi (2n - 1))^2\) for \(n \in {\Bbb Z}^ +\). We can use an integral test (see Methods E) to show that \(C_q = - \mathop {\sum}\nolimits_{n = 1}^\infty {\kern 1pt} \lambda _n{\kern 1pt} {\mathrm{log}}_2\lambda _n\) is bounded, and moreover, that \(C_q \approx 1.2809\). In Fig. 4b we show how the memory required by the q-machine tends towards this value as we use an increasingly fine coarse-graining of the discretised analogue of the process to approach the continuous limit, while the memory needed by the optimal classical machine diverges logarithmically. The quantum memory requirement exceeds the lower bound set by the excess entropy \(E = {\mathrm{log}}_2{\kern 1pt} e - 1 \approx 0.4427\).
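
The value \(C_q \approx 1.2809\) can be reproduced directly from the analytic eigenvalues by truncating the (slowly converging) entropy series at a large cutoff, as in the short sketch below.

```python
import numpy as np

# C_q for the uniform-interval renewal process from the analytic eigenvalues
# lambda_n = 8 / (pi (2n - 1))^2, n = 1, 2, ...
n = np.arange(1, 10**6 + 1)
lam = 8.0 / (np.pi * (2 * n - 1))**2
print(lam.sum())                               # ~1: the eigenvalues form a valid distribution
print(-np.sum(lam * np.log2(lam)))             # ~1.2809 bits

# Lower bound from the excess entropy of this process.
print(np.log2(np.e) - 1.0)                     # ~0.4427 bits
```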

Fig. 4 Uniform emission probability. a The corresponding emission probability density for a process with uniform emission probability in an interval [0,τ). b The classical memory \(C_\mu\) required to simulate the process diverges logarithmically as the discretisation becomes finer (N states), while the quantum memory \(C_q\) converges on a finite value.

For our second example, we consider a delayed Poisson process (\(\phi (t) = (1{\mathrm{/}}\tau _L){\kern 1pt} {\mathrm{exp}}( - (t - \tau _R){\mathrm{/}}\tau _L)\) for t > \(\tau _R\) and 0 elsewhere), representing a process that exhibits an exponential decay with lifetime \(\tau _L\), and a rest period \(\tau _R\) between emissions (Fig. 5a), forming, for example, a very crude model of a neuron firing. For this emission distribution we find that \(\mu = (\tau _L + \tau _R)^{ - 1}\), and \({\mathrm{\Phi }}(t) = 1\) for t ≤ \(\tau _R\) and \({\mathrm{exp}}( - (t - \tau _R){\mathrm{/}}\tau _L)\) for t > \(\tau _R\). We can then show somewhat indirectly (see Methods F) that the corresponding quantum memory requirement is bounded for finite \(\tau _R{\mathrm{/}}\tau _L\) (and vanishes as this ratio tends to zero), while in contrast \(C_\mu\) is infinite whenever this ratio is non-zero. Further, due to the timescale invariance of the quantum memory, \(C_q\) depends only on this ratio, and not the individual values of \(\tau _R\) and \(\tau _L\). Varying this ratio allows us to sweep between a simple Poisson process with lifetime \(\tau _L\), and a periodic process where the system is guaranteed to emit within an arbitrarily small interval at time \(\tau _R\) after the last emission. The quantum memory \(C_q\) correspondingly increases with this ratio as we interpolate between the two limits (Fig. 5b), with the pure Poisson process being memoryless, and a periodic process requiring increasing memory with the sharpness of the peak. We also plot the excess entropy, given by \(E = {\mathrm{log}}_2(\tau _R{\mathrm{/}}\tau _L + 1) - {\mathrm{log}}_2{\kern 1pt} \mathrm{e}{\mathrm{/}}(\tau _L{\mathrm{/}}\tau _R + 1)\), which exhibits similar qualitative behaviour.
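
The finiteness and timescale invariance of \(C_q\) for this example can be checked numerically with the same kernel-diagonalisation approach sketched earlier; the snippet below evaluates \(C_q\) for two pairs \((\tau _R,\tau _L)\) with equal ratio (the grid sizes and truncation are illustrative choices), and also evaluates the excess entropy formula for a given ratio.

```python
import numpy as np

def cq_delayed_poisson(tau_R, tau_L, N=1500, t_max_lifetimes=25.0):
    """C_q for phi(t) = (1/tau_L) exp(-(t - tau_R)/tau_L) for t > tau_R, 0 otherwise."""
    t_max = tau_R + t_max_lifetimes * tau_L            # truncation of the infinite supports
    mu = 1.0 / (tau_L + tau_R)
    a = np.linspace(0.0, t_max, N, endpoint=False)     # grid of state labels
    da = a[1] - a[0]
    t = np.linspace(0.0, t_max, 2 * N, endpoint=False) # integration grid
    dt = t[1] - t[0]
    psi = lambda x: np.where(x > tau_R, np.sqrt(np.exp(-(x - tau_R) / tau_L) / tau_L), 0.0)
    P = psi(a[:, None] + t[None, :])
    lam = np.linalg.eigvalsh(mu * (P @ P.T) * dt * da)
    lam = lam[lam > 1e-12]
    return -np.sum(lam * np.log2(lam))

# Same ratio tau_R / tau_L, different absolute timescales: C_q should (approximately) coincide.
print(cq_delayed_poisson(1.0, 2.0))
print(cq_delayed_poisson(2.0, 4.0))

# Excess entropy lower bound for a given ratio r = tau_R / tau_L.
r = 0.5
E = np.log2(r + 1.0) - np.log2(np.e) / (1.0 / r + 1.0)
print(E)
```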