The parameter space of C-2U has over one thousand dimensions. Quantities of interest are almost certainly not convex functions of this space. Furthermore, machine performance is strongly affected by uncontrolled time-dependent factors such as vacuum impurities and electrode wear. Under nominal operating conditions, plasma shots can be taken with a cadence of about eight minutes. Experiments are run over an eight-hour shift, producing up to 60 plasma shots per day. Given these facts, efficient optimisation of system performance appears close to intractable. Nonetheless, we will show that the Optometrist Algorithm can overcome these difficulties. Pseudocode for the Optometrist Algorithm is given in Alg. 1, with a detailed explanation following. Mathematical details are given in the Methods section.

Algorithm 1 The Optometrist Algorithm.

While C-2U has thousands of configurable parameters, most subsystems can be effectively described by a much smaller space of so-called “meta-parameters” (MPs). We can adjust the effects of most subsystems with one or two meta-parameters, capturing the most important behaviours. In this work we used sets of fewer than 30 MPs, described in detail in the Methods section.
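To make the meta-parameter idea concrete, the following sketch shows how a small MP dictionary might expand into a larger set of raw machine settings. The MP names, the specific settings, and the mappings are all invented for illustration; the paper's actual MPs are described in the Methods section.

```python
def expand_meta_parameters(mps):
    """Map a small dict of MPs to a larger dict of raw machine settings.

    Each MP controls a whole group of underlying settings; the specific
    relationships below are hypothetical, purely for illustration.
    """
    settings = {}
    # One MP scaling an entire bank of coil currents together.
    for i in range(8):
        settings[f"coil_current_{i}"] = mps["field_strength"] * 1.0
    # One MP applying a common timing offset to several triggers.
    for name in ("gun_north", "gun_south"):
        settings[f"trigger_{name}"] = 1e-3 + mps["timing_offset"]
    return settings

settings = expand_meta_parameters({"field_strength": 0.5, "timing_offset": 2e-4})
print(len(settings))  # 2 MPs expand into 10 raw settings
```

The point of the reduction is that a random walk over a few dozen MPs is feasible within a 60-shot day, whereas a walk over thousands of raw settings is not.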

At the start of each day’s run, we repeat a small number (fewer than ten) of recent experiments. The human operator chooses the best among them as a reference. All following experiments are based on this initial reference.

New settings are chosen in the reduced space of MPs via a stochastic algorithm. Starting from the known operating point, we move in a random direction in MP-space, adjusting the parameters by a relative amount. As the values and units of the MP dimensions are heterogeneous, relative changes allow us to compare them in a meaningful way.
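A minimal sketch of such a relative stochastic step is shown below. The step size `rel_scale` and the uniform noise model are assumptions for illustration; the source only specifies that the step is random and relative so that heterogeneous MP dimensions remain comparable.

```python
import random

def propose_step(reference_mps, rel_scale=0.1):
    """Perturb each MP by a random relative amount.

    Multiplicative noise keeps steps comparable across MPs whose values
    and units differ widely; rel_scale (here 10%) is an assumed knob.
    """
    return {name: value * (1.0 + rel_scale * random.uniform(-1.0, 1.0))
            for name, value in reference_mps.items()}

reference = {"field_strength": 0.5, "timing_offset": 2e-4}
candidate = propose_step(reference)  # each MP moved by at most 10%
```

Because each MP moves by at most a fixed fraction of its current value, a step of the same nominal size means the same thing for a field strength in tesla and a timing offset in seconds.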

After a step in MP-space is taken, actual machine settings are derived by undoing the functional mapping for each MP. These settings are loaded into the control system and the shot is taken. Measurements of the experimental outcome are recorded and shown to a human expert in the next step.
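As a toy illustration of “undoing the functional mapping”: if an MP were defined as the logarithm of a raw setting (a plausible choice to make relative steps well-behaved, though the actual mappings are given in the Methods section), recovering the machine setting means applying the inverse function before the shot is loaded.

```python
import math

def setting_to_mp(setting_value):
    """Hypothetical forward mapping: raw machine setting -> MP (log scale)."""
    return math.log(setting_value)

def mp_to_setting(mp_value):
    """Inverse mapping: after a step in MP-space, recover the raw setting."""
    return math.exp(mp_value)

# Round trip: the setting derived from an unperturbed MP is unchanged.
raw = 3.7
assert abs(mp_to_setting(setting_to_mp(raw)) - raw) < 1e-9
```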

The Optometrist Algorithm asks the human to compare two shots: the reference shot and the new shot just generated. The human expert can use his or her judgement about which shot is better, using all the available measurements that are shown in a visualisation panel. The human can choose one of the two shots as better, or rate them “about the same”. “About the same” is explained to the human as “50% likely to be as good, if the shot were to be repeated”. This allows for human judgement on partially successful experiments.
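The three-way verdict can be represented as follows; the keystroke interface and the verdict labels are assumptions for illustration, not the paper's actual visualisation panel.

```python
# Three possible verdicts from the pairwise comparison (labels assumed).
NEW_BETTER, REFERENCE_BETTER, ABOUT_THE_SAME = "new", "reference", "same"

def parse_verdict(answer):
    """Map the expert's keystroke to one of the three verdicts.

    'About the same' encodes the expert's judgement that the new shot is
    50% likely to be as good if the experiment were repeated.
    """
    return {"n": NEW_BETTER,
            "r": REFERENCE_BETTER,
            "s": ABOUT_THE_SAME}[answer.strip().lower()]
```

Crucially, the expert supplies only an ordinal comparison, never a numeric score, which is what makes partially successful experiments easy to judge.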

The human expert looks at visualisations related to plasma initial conditions, confinement times, and stability. The expert can choose based on multiple criteria, instead of just a pre-defined metric. Further, the goals of the optimisation can be changed during a run based on newly-discovered plasma behaviour.

The inclusion of expert oversight in parameter exploration is especially important given the large number of simultaneously-perturbed MPs. For instance, a priori identification of safe operating regimes cannot always be made for all combinations of machine parameters. As such, it is possible to set the machine to an unsafe state due to unanticipated nonlinear interactions between settings. This may not become clear until the unsafe shot is actually run. Furthermore, safe settings may evolve over the course of experiments. For example, as part of the learning and experimentation process, deliberate hardware or procedural changes may be made to the machine or its operation. These changes can quickly impact vacuum and vessel wall conditions. For these reasons human oversight is crucial. The human expert can reject settings that yield excellent plasma performance but carry a high likelihood of damaging the machine.

If the new shot is better than, or “about the same” as, the old reference shot, it becomes the new reference shot. Otherwise, the settings are rejected and the reference shot remains the same. This strategy avoids getting stuck in local maxima. This is analogous to the Metropolis-Hastings acceptance algorithm for Monte Carlo optimisation [3, 16], which accepts steps that degrade the optimisation function with some probability.
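The acceptance rule and the overall loop can be sketched as below. This is a schematic of the procedure described above, with `propose`, `run_shot`, and `compare` as assumed callbacks: `propose` perturbs the MPs, `run_shot` takes a shot and returns its measurements, and `compare` returns the expert's verdict as one of "new", "reference", or "same".

```python
def accept(verdict):
    """New shot becomes the reference unless the expert prefers the old one."""
    return verdict in ("new", "same")

def optimise(initial_mps, propose, run_shot, compare, n_shots):
    """Run the accept/reject loop over a day's worth of shots (sketch)."""
    reference_mps = initial_mps
    reference_shot = run_shot(reference_mps)
    for _ in range(n_shots):
        candidate_mps = propose(reference_mps)
        candidate_shot = run_shot(candidate_mps)
        if accept(compare(reference_shot, candidate_shot)):
            # Accepting "about the same" lets the walk drift sideways off
            # plateaus, loosely analogous to Metropolis-Hastings accepting
            # some non-improving steps.
            reference_mps, reference_shot = candidate_mps, candidate_shot
    return reference_mps
```

Unlike textbook Metropolis-Hastings, there is no explicit temperature or acceptance probability here: the stochasticity of "about the same" verdicts plays that role.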

We note that a few of the best shots taken on a given day will be repeated on the next experiment day in the “Choose reference settings” phase. This strategy connects the exploration of MPs across experiment days, even permitting new exploration of previously dead ends.