Experiments can be complex and produce large volumes of heterogeneous data, which make their execution, analysis, independent replication and meta-analysis difficult. We propose a mathematical model for experimentation and analysis in physiology that addresses these problems. We show that experiments can be composed from time-dependent quantities, and be expressed as purely mathematical equations. Our structure for representing physiological observations can carry information of any type and therefore provides a precise ontology for a wide range of observations. Our framework is concise, allowing entire experiments to be defined unambiguously in a few equations. In order to demonstrate that our approach can be implemented, we show the equations that we have used to run and analyse two non-trivial experiments describing visually stimulated neuronal responses and dynamic clamp of vertebrate neurons. Our ideas could provide a theoretical basis for developing new standards of data acquisition, analysis and communication in neurophysiology.

1. Introduction

Reproducibility and transparency are cornerstones of scientific methodology. As scientific experiments and analysis become increasingly complex, become reliant on computer code and produce larger volumes of data, the feasibility of independent verification and replication has, in practice, been undermined. Primary data sharing is now standard in applications of high throughput biology, but not in the many fields that produce heterogeneous observations [1,2]. Even when raw data and code used for experiments and analysis are fully disclosed, only a minority of findings can be reproduced without discrepancies [3–5]. It is often not realistic to verify that computer code used in analyses correctly implements an intended mathematical algorithm, yet errors can undermine a large body of work [6]. The combination of bias, human error and unverified software has led to the suggestion that many published research findings are flawed [7,8].

Some of these problems may be mitigated by developing explicit models of experimentation, evidence representation and analysis. A good model should simultaneously: (i) introduce a system of categorization directly relevant to the scientific field, such that scientists can define experiments and reason about observations in familiar terms, (ii) be machine executable; therefore unambiguous and practically useful, and (iii) consist exclusively of terms that directly correspond to mathematical entities, enabling algebraic reasoning about, and manipulation of, procedures. Experiments are difficult to formalize in terms of relations between mathematical objects because they produce heterogeneous data [2], and because they interact with the physical world. In creating a mathematical framework for experiments, we take advantage of progress in embedding input and output [9,10] into programming languages where the only mechanism of computation is the evaluation of mathematical functions [11].

Here, we define a formal framework for physiology that satisfies the earlier-mentioned criteria. We show that there is a large conceptual overlap between physiological experimentation and functional reactive programming (FRP; [12,13]), a concise and purely functional formulation of time-dependent reactive computer programs. Consequently, physiological experiments can be concisely defined in the vocabulary of signals and events introduced by FRP. Such a language does not describe the physical components of biological organisms; it has no concept of networks, cells or proteins. Instead, it describes the observation and calculation of the mathematical objects that constitute physiological evidence (‘observations’).

Our framework provides:

— an explicitly defined ontology of physiological observations. Physiological databases have not been widely adopted [14,15] despite many candidates being available [16–19]. This contrasts with bioinformatics and neuroanatomy, where databases are routinely used [20,21]. We suggest that a flexible, concise and simple structure for physiological quantities can remedy some of the shortcomings [1,15] of existing databases and thus facilitate the sharing of physiological data and metadata [22];

— a concise language for describing complex experiments and analysis procedures in physiology using only mathematical equations. Experimental protocols can be communicated unambiguously, highlighting differences between studies and facilitating replication and meta-analysis. The provenance [23–25] of any observation can be extracted as a single equation that includes postacquisition processing and censoring. In addition, analysis procedures in languages with a clear mathematical denotation are verifiable as their implementation closely follows their specification [26];

— the theoretical basis for new tools that are practical, powerful and generalize to complex and multi-modal experiments. In order to demonstrate this, we have implemented our framework as a new programming language and used it for non-trivial neurophysiological experiments and data analyses. A strength of our approach is that its individual elements could, alternatively, be adopted separately or in different ways to suit different demands.

2. Results

In order to introduce the calculus of physiological evidence (CoPE), we first define some terminology and basic concepts. We assume that time is global and is represented by a real number, as in classical physics. An experiment is an interaction between an observer and a number of organisms for a defined time period. An experiment consists of one or more trials: non-overlapping time periods during which the observer is running a program—instructions for manipulating the environment and for constructing mathematical objects, the observations. The analyses are further programs to be run during or after the experiment that construct other mathematical objects pertaining to the experiment. In the following sections, we give precise definitions of these concepts using terms from programming language theory and type theory, while providing an introduction to the terms for a general audience.

2.1. Type theory for physiological evidence

What kinds of mathematical objects can be used as physiological evidence? We answer this question within simple type theory [27,28], which introduces an intuitive classification of mathematical objects by assigning to every object exactly one type. These types include base types, such as integers ℤ, text strings String and the Boolean type Bool with the two values True and False, as well as the real numbers ℝ (which can be approximated in a programming language with Float). These base types are familiar to the users of most programming languages. In addition, modern type systems, including simple type theory, allow types to be arbitrarily combined in several ways. For instance, if α and β are types, the type α × β is the pair formed by one element of α and one of β; [α] is a list of αs; α → β is the type of functions that calculate a value in the type β from a value in α. The ability to write flexible type schemata and generic functions containing type variables (α, β,…), which can later be substituted with any concrete type, is called ‘parametric polymorphism’ [27] (or ‘templates’ and ‘generics’ in the programming languages C++ and Java, respectively) and is essential to the simplicity and flexibility of CoPE.

We distinguish three type schemata in which physiological evidence can be values. These differ in the manner in which measurements appear in a temporal context, but all derive their flexibility from parametric polymorphism. Signals capture the notion of quantities that change in time. In physiology, observed time-varying quantities often represent scalar quantities, such as membrane voltages or muscle force, but there are also examples of non-scalar signals such as the two- or three-dimensional location of an animal or of a body part. Here, we generalize this notion such that for any type α, a signal of α is defined as a function from time to a value in α, written formally as

Signal α = Time → α

For instance, the output of a differential voltage amplifier might be captured in a Signal Float.

In order to model occurrences pertaining to specific instances in time, FRP defines events as a list of pairs of time points and values in a type α, called the ‘tags’:

Event α = [Time × α]

For example, an event can be constructed from a number-valued signal that represents the time of the largest amplitude value of the signal, with that amplitude in the tag. Events that do not have a value of interest to associate with the time point at which they occurred can be tagged with the unit type (), which has only one element (i.e. no information). Events can therefore represent measurements where the principal information is when something happened, or measurements that concern what happened.

A third kind of information describes the properties of whole time periods. We define a duration of type α as a list of pairs, of which the first component is a pair denoting a start time and an end time. The last component is again a value of any type α:

Duration α = [(Time × Time) × α]

Durations are useful for manipulating information about a whole trial or a single annotation of an entire experiment, but could also be observations in their own right, such as open times of individual ion channels, or periods in which activity of a system exceeds a set threshold (e.g. during bursts of action potentials). Lastly, durations could be used for information that spans multiple trials—for instance, the presence or absence of a drug.

As signals, events and durations can be instantiated for any type, they form a simple but flexible framework for representing many physiological quantities. We show a list of such examples primarily drawn from neurophysiology in table 1. A framework in any type system that does not support parametric polymorphism would have to represent these quantities fundamentally differently, thus removing the possibility of re-using common analysis procedures. Although parametric polymorphism is conceptually simple and the distinctions we are introducing are intuitive, common biomedical ontologies [29] cannot accommodate these definitions.

Table 1. Representation of physiological observations and quantities in the calculus of physiological evidence.

quantity                                    type
voltage across the cell membrane            Signal Float
ion concentration                           Signal Float
animal location in two-dimensional space    Signal (Float × Float)
action potential                            Event ()
action potential waveforms                  Event (Signal Float)
spike detection threshold                   Duration Float
spike interval                              Duration ()
synaptic potential amplitude                Event Float
drug present                                Duration ()
trial with parameter α                      Duration α
visual stimulus                             Signal Shape
laboratory notebook                         Event String
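To make the three type schemata concrete for readers more familiar with mainstream languages, the following is a minimal sketch of how they might be transcribed into Python type aliases. It is an illustration only and not the implementation used in this work: the alias names, the use of None in place of the unit type () and the example values are all assumptions made for this sketch.

```python
from typing import Callable, List, Tuple, TypeVar

A = TypeVar("A")  # plays the role of the type variable α

Time = float  # global time, a real number approximated by a float

# Signal α = Time → α : a function from time to a value in α
Signal = Callable[[Time], A]
# Event α = [Time × α] : a list of (time, tag) pairs
Event = List[Tuple[Time, A]]
# Duration α = [(Time × Time) × α] : a list of ((start, end), value) pairs
Duration = List[Tuple[Tuple[Time, Time], A]]

# Hypothetical values for three rows of table 1 (numbers are invented).
membrane_voltage: Signal[float] = lambda t: -0.065        # Signal Float
action_potentials: Event[None] = [(0.12, None), (0.31, None)]  # Event ()
drug_present: Duration[None] = [((10.0, 60.0), None)]     # Duration ()
```

Because the aliases are parametric in A, the same three containers serve for numbers, shapes, strings or nested signals, which is the property the table relies on.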

2.2. Calculating with signals and events

From direct observations, one often needs to process events and signals, create new events from signals, filter data and calculate statistics. Here, we formulate these transformations in terms of the lambda calculus [11], a family of formal languages for computation based solely on evaluating functions. These languages, unlike conventional programming languages, retain an important characteristic of mathematics: a term can freely be replaced by another term with identical meaning. This property (referential transparency; [30]) facilitates algebraic manipulation of and reasoning about programs [26]. The lambda calculus allows functions to be used as first-class entities: that is, they can be referenced by variables and passed as arguments to other functions. On the other hand, the lambda calculus disallows changing the value of variables or global states. These properties together mean that the lambda calculus combines verifiable correctness with a high level of abstraction, leading to programs that are in practice more concise [31] than those written in conventional programming languages. The lambda calculus or variants thereof has been used as a foundation for mathematics [32], classical [33] and quantum mechanics [34], evolutionary biochemistry [35], mechanized theorem provers [36,37] and functional programming languages [38].

In the lambda calculus, calculations are performed by function abstraction and application. λx → e denotes the function with argument x and body e, and f e the application of the function f to the expression e (more conventionally written f(e)). For instance, the function add2 = λx → x + 2, which can be written more conveniently as add2 x = x + 2, adds two to its argument; hence, add2 3 = (λx → x + 2) 3 = 3 + 2 by substituting arguments in the function body.

We now present the concrete syntax of CoPE, in which we augment the lambda calculus with constructs to define and manipulate signals and events. This calculus borrows some concepts from earlier versions of FRP, but focuses exclusively on signals and events as mathematical objects and their relations. It does not have any control structures for describing sequences of system configurations, where a signal expression depends on the occurrence of events [12,13], although such constructs may be useful for simulations. As a result, CoPE is quite different from conventional FRP, which is also reflected in its implementation.

Let the construct {: e :} denote a signal with the value of the expression e at every time point, and let the construct 〈: s :〉 denote the current value of the signal s in the temporal context created by the surrounding {: … :} braces. For instance,

{: 1 :}

denotes the signal that always has the value 1; and the function defined as λf → λs → {: f 〈: s :〉 :} transforms, for any two types α and β, the signal s of α into a signal of β by applying the function f of type α → β to the value of the signal at every time point.
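The two constructs just described, a constant signal and the pointwise application of a function to a signal, can be sketched in Python by treating a signal as a function of time. This is a hedged illustration rather than CoPE itself; the names constant and lift are hypothetical and chosen for this example.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
Signal = Callable[[float], A]  # a signal is a function from time to a value


def constant(value: A) -> "Signal[A]":
    """Signal that has the same value at every time point."""
    return lambda t: value


def lift(f: Callable[[A], B], s: "Signal[A]") -> "Signal[B]":
    """Apply f to the value of the signal s at every time point."""
    return lambda t: f(s(t))


# Usage: double the constant signal 1, and rectify a ramp.
doubled = lift(lambda x: 2 * x, constant(1))
rectified = lift(abs, lambda t: -t)
```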

The differential operator D differentiates a real-valued signal with respect to time, such that Ds denotes its first derivative and DDs the second derivative of the signal s. When the differential operator appears on the left-hand side of a definition, it introduces a differential equation (see §2.5).
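On sampled or numerically evaluated signals, the differential operator has to be approximated. The sketch below uses a central finite difference, which is one common choice and is our assumption here; the text does not state how D is evaluated. The step size h is a hypothetical parameter.

```python
from typing import Callable

Signal = Callable[[float], float]  # real-valued signal


def D(s: Signal, h: float = 1e-4) -> Signal:
    """Approximate the time derivative of s by a central difference."""
    def derivative(t: float) -> float:
        return (s(t + h) - s(t - h)) / (2.0 * h)
    return derivative


# Usage: first and second derivatives of a cubic position signal.
position = lambda t: t ** 3
velocity = D(position)          # approximately 3 t^2
acceleration = D(D(position))   # approximately 6 t
```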

Events and durations can be manipulated as lists. Thus, a large number of transformations can be defined with simple recursive equations, including filters, folds and scans that are pivotal in functional programming languages [31]. In addition, we have added a special construct to detect events from existing signals. For instance, a threshold detector generates an occurrence of an event whenever the value of a signal crosses a specific level from below. Here, we generalize the threshold detector to an operator ?? that takes a predicate (i.e. a function of type α → Bool), applies it to the instantaneous value of a signal and generates an event whenever the predicate becomes true. For instance,

(λx → x > 5) ?? s

denotes the event that occurs whenever the value of the signal s starts to satisfy the predicate λx → x > 5; that is, whenever it becomes greater than 5 after having been smaller. The expression (λx → x > 5) ?? s thus defines a threshold detector restricted to threshold crossings with a positive slope.
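A discrete-time version of this predicate-based detector can be sketched as follows. The function name detect, the fixed sampling step and the choice to tag each occurrence with the signal value at that time are assumptions made for this illustration.

```python
from typing import Callable, List, Tuple


def detect(predicate: Callable[[float], bool],
           signal: Callable[[float], float],
           t_start: float, t_end: float, dt: float) -> List[Tuple[float, float]]:
    """Return (time, value) pairs at samples where the predicate becomes true."""
    events = []
    n_steps = int(round((t_end - t_start) / dt))
    previous = predicate(signal(t_start))
    for k in range(1, n_steps + 1):
        t = t_start + k * dt
        current = predicate(signal(t))
        # An occurrence is generated only on a False -> True transition.
        if current and not previous:
            events.append((t, signal(t)))
        previous = current
    return events


# Usage: a ramp sampled once per second crosses the level 5 at the sample t = 6.
crossings = detect(lambda x: x > 5, lambda t: t, 0.0, 10.0, 1.0)
```

Samples at which the predicate merely stays true generate no further occurrences, so a signal that remains above the threshold yields a single event.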

Table S1 in the electronic supplementary material presents an informal overview of the syntax of CoPE; table S2 in the electronic supplementary material details the types and names of some of the functions we have defined using these definitions.

2.3. Interacting with the physical world

In the previous examples, signals, events and durations exist as purely mathematical objects. In order to describe experiments, however, it must also be possible to observe particular values from real-world systems, and to create controlled stimuli to perturb these systems. For this purpose, we introduce sources and sinks that act as a bridge between purely mathematical equations and the physical world.

A source is an input port through which the value of some external quantity can be observed during the course of an experiment by binding it to a variable. If the quantity is time-varying, then the bound variable will denote a signal. For instance, binding a variable to a source denoting a typical analogue-to-digital converter yields a signal of real numbers. However, a source may also refer to a time-invariant quantity.

The source-binding construct binds the value or signal resulting from the observation of a source during the course of an experiment to a variable. For a concrete example, a simple experiment consists of a single such binding with a sampling rate of 20 kHz, where kHz x = 1000 · x. This describes the observation of the voltage signal on channel 0 of an analogue-to-digital converter at 20 kHz, binding the whole signal to the variable. We have also used sources to sample values from probability distributions (see the electronic supplementary material).

In addition to making appropriate observations, an experiment may also involve a perturbation of the experimental preparation. For example, the manipulation could control the amount of electric current injected into a cell. Alternatively, non-numeric signals are used below to generate visual stimuli on a computer screen. Such manipulations require the opposite of a source, a sink: an output port connected to a physical device capable of effecting the desired perturbation. The value at the output at any point in time during an experiment is defined by connecting the corresponding sink to a signal. This is performed through the following construct, mirroring the source construct introduced earlier:

As a concrete example, suppose we wish to output a sinusoidal stimulus. We first construct a time-varying signal describing the desired shape of the stimulus. In this case, we read a clock source that yields a signal counting the number of seconds since the experiment started:

The sine wave can now be defined as

where three real-valued constants specify the amplitude, frequency and phase, respectively. We then connect the sine-wave signal to a sink representing channel 0 of a digital-to-analogue converter at 20 kHz.
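The sinusoidal stimulus can be sketched in Python as a signal built from a clock signal and then sampled at the output rate. This is an illustration under stated assumptions: the function names are hypothetical, the frequency is taken to be in hertz (hence the factor 2π), and a list of samples stands in for the digital-to-analogue sink.

```python
import math
from typing import Callable, List

Signal = Callable[[float], float]


def khz(x: float) -> float:
    """kHz x = 1000 · x, as in the text."""
    return 1000.0 * x


def sine_wave(amplitude: float, frequency: float, phase: float,
              clock: Signal) -> Signal:
    """Sine-wave signal driven by a clock signal counting seconds."""
    return lambda t: amplitude * math.sin(2.0 * math.pi * frequency * clock(t) + phase)


def sample(signal: Signal, rate: float, duration: float) -> List[float]:
    """Evaluate the signal at t = k / rate; stands in for writing to a sink."""
    n_samples = int(round(rate * duration))
    return [signal(k / rate) for k in range(n_samples)]


# Usage: 50 Hz, amplitude 2, one period (20 ms) sampled at 20 kHz.
seconds = lambda t: t
samples = sample(sine_wave(2.0, 50.0, 0.0, seconds), khz(20), 0.02)
```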

We have implemented this calculus as a new programming language and used this software to define and run two detailed experiments in neurophysiology.

2.4. Example 1

In locusts, the descending contralateral movement detector (DCMD) neuron signals the approach of looming objects to a distributed nervous system [39]. We have constructed several experiments in CoPE to record the response of DCMD to visual stimuli that simulate objects approaching with different velocities. In order to generate these stimuli, we augmented CoPE with primitive three-dimensional geometric shapes and operations on them: one expression denotes a cube centred on the origin with a given side length; a second represents the shape that results from translating a given shape by a given vector; and a third indicates the shape identical to a given shape except for its colour, specified as red, green and blue intensities. These primitives are sufficient for the experiments reported here. More complex stimuli can alternatively be defined in, and loaded from, external files; for instance, a texture source loads the binary image in a file such as image.tga and creates a new function which, when applied to a shape, returns a similar but appropriately textured shape. Applying this function to a cube thus denotes a textured cube. Sources for loading complex polygons could be defined similarly.

As signals in CoPE are polymorphic, they can carry not just numeric values but also shapes; so we represent visual stimuli as values in SignalShape. The looming stimulus consists of a cube of side length l approaching a locust with constant velocity v. The time-varying distance from the locust to the cube in real-world coordinates is a real-valued signal:

The distance signal is the basis of shape-valued signal loomingSquare representing the approaching square:

loomingSquare differs from conventional protocols [40] for stimulating DCMD in that it describes an object that passes through the physical screen and the observer, and when displayed would thus disappear from the screen just before collision. In order not to evoke a large OFF response [41] at this point, the object is frozen in space as it reaches the plane of the surface onto which the animation is projected [42]. In order to achieve this effect, we define a new distance signal that has a lower bound of the distance from the eye to the visual display screen, obtained by taking at every time point the larger of the two numbers. loomingSquare′ is identical to loomingSquare except for the use of this bounded distance signal.

Finally, loomingSquare′ is connected to a screen signal sink that represents a visual display unit capable of projecting three-dimensional shapes onto a two-dimensional surface
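The lower-bounded distance signal can be illustrated with a short Python sketch. The linear form of the distance signal, the parameter names and the numerical values below are assumptions made for this example only; they are not the definitions used in the experiments.

```python
from typing import Callable

Signal = Callable[[float], float]


def approach_distance(speed: float, collision_time: float) -> Signal:
    """Distance of an object approaching at constant speed (assumed form)."""
    return lambda t: speed * (collision_time - t)


def bounded_below(lower: float, s: Signal) -> Signal:
    """Signal equal to s but never smaller than the given lower bound."""
    return lambda t: max(lower, s(t))


# Usage: hypothetical object at 2 m/s, nominal collision at t = 5 s,
# frozen at a hypothetical eye-to-screen distance of 0.2 m.
distance = approach_distance(2.0, 5.0)
distance_bounded = bounded_below(0.2, distance)
```

Before the object reaches the screen plane the two signals agree; afterwards the bounded signal stays at the screen distance, so the displayed object no longer changes.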

In our experiments, the extracellular voltage from the locust nerve (connective), in which the DCMD forms the largest amplitude spike, was amplified, filtered (see methods and listing 1 in the electronic supplementary material) and digitized:

Figure 1. Diagram of an experiment to record the looming response from a locust descending contralateral movement detector (DCMD) neuron, showing the first five recorded trials from one animal. Experiment design: blue lines, simulated object size-to-approach speed ratio (l/|v|) for given approach trial, red lines, simulated object distance, red triangles, apparent collision time. Observed signal: black lines, recorded extracellular voltage. The largest amplitude deflections are DCMD spikes. Analysis: green dots, DCMD spikes, with randomly jittered vertical placement for display, thin black line, spike rate histogram with 50 ms bin size. The inter-trial interval of 4 min is not shown.

loomingSquare′ and the recorded voltage signal thus define a single object approach and the recording of the elicited response. This approach was repeated every 4 min, with different values of l/|v|. Figure 1 shows l/|v| as values with type Duration Float, together with the distance and voltage signals for the first five trials of one experiment on a common time scale.

The simplest method for detecting spikes from a raw voltage trace is to search for threshold crossings, which works well in practice for calculating DCMD activity from recordings of the locust connectives [40]. (We have also implemented in CoPE a spike identification algorithm based on template matching; see the electronic supplementary material.) If the threshold voltage for spike detection is v th, then the event spike can be calculated with

where a library function replaces every tag in some event with a fixed value, so that spike has type Event (). The spike event detected by threshold crossing is displayed on the common time scale in figure 1. The top row displays the spike rate histogram H spike for each trial. This definition exploits the list semantics of events by using a generic list-processing function that takes as arguments a predicate and a list, and returns the list of elements of that list for which the predicate holds. Here, the predicate is a function that returns the first element of a pair (here, the occurrence time) composed (○) with a function that tests whether the last of three numbers lies between the first two.
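The list-based calculation described above can be sketched in Python with the built-in filter. This is an illustration, not the definition used for figure 1: the helper names, the half-open bin convention and the example spike times are assumptions.

```python
from typing import List, Tuple

Event = List[Tuple[float, None]]  # spike times tagged with None, standing in for ()


def between(lower: float, upper: float, x: float) -> bool:
    """Test whether the last of three numbers lies between the first two."""
    return lower < x <= upper


def fst(pair):
    """First element of a pair; for an event occurrence, its time."""
    return pair[0]


def spike_rate_histogram(spikes: Event, t_start: float, t_end: float,
                         bin_size: float) -> List[float]:
    """Spike rate (occurrences per second) in consecutive bins."""
    n_bins = int(round((t_end - t_start) / bin_size))
    rates = []
    for k in range(n_bins):
        lo = t_start + k * bin_size
        hi = lo + bin_size
        # Keep only occurrences whose time (fst) falls inside the bin.
        in_bin = list(filter(lambda ev: between(lo, hi, fst(ev)), spikes))
        rates.append(len(in_bin) / bin_size)
    return rates


# Usage: four hypothetical spikes, 50 ms bins.
spike = [(0.01, None), (0.02, None), (0.07, None), (0.12, None)]
h_spike = spike_rate_histogram(spike, 0.0, 0.15, 0.05)
```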

We examined how the DCMD spike response varied with changes in l/|v|. The average of H spike for three different values of l/|v| is shown in figure 2a; whereas figure 2b,c shows the total number of spikes (length spike) and largest value of H spike, for each approach, plotted against the value of l/|v| [42]. The code that describes and executes these trials is given in the electronic supplementary material, listing 1. This code includes a description, captured in CoPE variables and with appropriate temporal context, of the experimental context that is not machine-executable. This description is based on proposed standards for minimal information about electrophysiological experiments [43]. Correspondences between these standards and CoPE variables are given in electronic supplementary material, table S3.

Figure 2. (a) Spike rate histograms for approaches with l/|v| of 0.01 (small-dashed line), 0.02 (large-dashed line) and 0.04 s (solid line), with 50 ms bin size, with collision time indicated by a black triangle. (b) Scatter plot of number of counted spikes against approach l/|v| for individual trials. (c) Scatter plot of the maximum rate of spiking against l/|v| for individual trials. n = 1 animal, 272 approaches.

This experiment demonstrates that the calculus of physiological evidence can adequately and concisely describe visual stimuli, spike recording and relevant analyses for activation of a locust looming detection circuit. In order to demonstrate the versatility of this framework, we next show that it can be used to implement dynamic clamp in an in vivo patch-clamp recording experiment.

2.5. Example 2

Dynamic clamp experiments [44,45] permit the observation of real neuronal responses to added simulated ionic conductances; for instance, a synaptic conductance or an additional Hodgkin–Huxley type voltage-sensitive membrane conductance. A dynamic clamp experiment requires that the current injected into a cell is calculated at every time point based on the recorded membrane potential. Here, we use CoPE to investigate the effect of an A-type potassium conductance [46] on the response of a zebrafish spinal motor neuron to synaptic excitation.

The output current i is calculated at each time-step from the simulated conductance g and the measured membrane voltage V m:

The experiment is thus characterized by the conductance signal g (for clarity, here we omit the amplifier-dependent input and output gains).
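The relationship between conductance, membrane voltage and injected current can be illustrated with a short sketch based on Ohm's law for a conductance. The reversal potential parameter and the sign convention (positive current is depolarizing when the membrane voltage lies below the reversal potential) are assumptions made for this example, as are all names and values.

```python
from typing import Callable

Signal = Callable[[float], float]


def clamp_current(g: Signal, v_m: Signal, e_rev: float) -> Signal:
    """Current to inject at each time point: i = g · (E_rev − V_m)."""
    return lambda t: g(t) * (e_rev - v_m(t))


# Usage: a constant 10 nS conductance with a 0 V reversal potential,
# and a cell resting at -65 mV (all values hypothetical).
i = clamp_current(lambda t: 10e-9, lambda t: -0.065, 0.0)
```

In a real dynamic clamp the membrane voltage would be read from a source and the current written to a sink at every time-step, so the calculation has to complete within one sampling interval.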

In the simplest case, g is independent of V m, for instance when considering linear synaptic conductances [47]. We first consider the addition of a simulated fast excitatory synaptic conductance to a real neuron. Simple models of synapses approximate the conductance waveform with an alpha function [48]:
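For readers unfamiliar with the alpha function, one commonly used parameterization is g(t) = g_max · (t/τ) · exp(1 − t/τ) for t ≥ 0 and zero otherwise, which peaks at g_max when t = τ. The sketch below implements this form; it may differ from the exact parameterization used in the experiments, and the names and values are hypothetical.

```python
import math
from typing import Callable

Signal = Callable[[float], float]


def alpha_function(g_max: float, tau: float) -> Signal:
    """Alpha-function conductance waveform peaking at g_max when t = tau."""
    def g(t: float) -> float:
        if t < 0.0:
            return 0.0  # no conductance before the synaptic event
        return g_max * (t / tau) * math.exp(1.0 - t / tau)
    return g


# Usage: hypothetical 1 nS peak conductance with a 5 ms time constant.
g_unitary = alpha_function(1e-9, 0.005)
```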

In order to simulate a barrage of synaptic input to a cell, this waveform is convolved with a simulated presynaptic spike train. The spike train itself is first bound from a source representing a random probability distribution: in this case, a series of recurrent events of type Event () for which the inter-occurrence interval is Poisson distributed. Our standard library contains a function convolveSE, which convolves an impulse response signal with a numerically tagged event such that the impulse response is multiplied by the tag before convolution.
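The two steps described above can be sketched in Python. This is not the library function convolveSE itself but a hypothetical analogue; the spike train is generated here as a homogeneous Poisson process (exponentially distributed intervals), which is our assumption about the intended random source.

```python
import math
import random
from typing import Callable, List, Tuple

Signal = Callable[[float], float]
Event = List[Tuple[float, float]]  # (time, numeric tag)


def poisson_events(rate: float, duration: float, rng: random.Random) -> Event:
    """Event times of a Poisson process with the given mean rate, tagged 1.0."""
    t, events = 0.0, []
    while True:
        t += rng.expovariate(rate)  # exponentially distributed interval
        if t >= duration:
            return events
        events.append((t, 1.0))


def convolve_signal_events(impulse: Signal, events: Event) -> Signal:
    """Sum of copies of the impulse response, shifted to each event time
    and multiplied by that event's tag."""
    return lambda t: sum(tag * impulse(t - t_ev) for t_ev, tag in events)


# Usage: exponential-decay impulse response and two hand-written events.
decay = lambda t: math.exp(-t) if t >= 0.0 else 0.0
g_total = convolve_signal_events(decay, [(1.0, 2.0), (2.0, 1.0)])
train = poisson_events(100.0, 1.0, random.Random(0))
```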

The resulting synaptic conductance signal could be used directly in a dynamic clamp experiment using the above template (i.e. as the conductance signal g). Here, we will examine other conductances that modulate the response of the cell to synaptic excitation.

Both the subthreshold properties of a cell and its spiking rate can be regulated by active ionic conductances, which can also be examined with the dynamic clamp. In the Hodgkin–Huxley formalism for ion channels, the conductance depends on one or more state variables, for which the forward and backward rate constants depend on the membrane voltage. We show the equations for the activation gate of an A-type potassium current ([46]; following [49], but using International System of Units and absolute voltages). The equations for inactivation are analogous (see listing 2 in the electronic supplementary material).

We write the forward and backward rates of the activation gate as functions of the membrane voltage, each parameterized by three constants (with subscripts αa1 to αa3 and βa1 to βa3; see listing 2 in the electronic supplementary material). The time-varying state of the activation gate is given by a differential equation. We place the differential operator D on the left-hand side of a signal definition to denote the ordinary differential equation that is conventionally written as ds/dt = f(s, t), with starting conditions explicitly assigned to a separate variable. The differential equation for the activation variable a is

The inactivation state signal b is defined similarly.
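A gating variable of this kind is conventionally integrated numerically. The sketch below uses the standard Hodgkin–Huxley form da/dt = α(V)(1 − a) − β(V)a with a forward Euler step; both the form of the equation and the integration scheme are assumptions made for illustration, and the constant rate functions in the usage example are hypothetical stand-ins for the voltage-dependent rates.

```python
from typing import Callable, List


def integrate_gate(alpha: Callable[[float], float],
                   beta: Callable[[float], float],
                   voltage: Callable[[float], float],
                   a0: float, dt: float, n_steps: int) -> List[float]:
    """Forward Euler integration of da/dt = alpha(V)(1 - a) - beta(V) a."""
    a = a0
    trace = [a0]  # starting condition is the first sample
    for k in range(n_steps):
        v = voltage(k * dt)
        a += dt * (alpha(v) * (1.0 - a) - beta(v) * a)
        trace.append(a)
    return trace


# Usage: constant rates (3 and 1 per second) at a fixed voltage; the gate
# relaxes towards the steady state alpha / (alpha + beta) = 0.75.
trace = integrate_gate(lambda v: 3.0, lambda v: 1.0,
                       lambda t: -0.065, 0.0, 1e-3, 10000)
```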

The current signal from this channel is calculated from Ohm's law:

This is added to the signal i defined above to give the output current, thus completing the definition of this experiment:

Figure 3a,b shows the voltage response to a unitary synaptic conductance and a train of synaptic inputs, respectively, with g A ranging from 0 to 100 nS. By varying the value of rate, we can examine the input–output relationship of the neuron by measuring the frequency of postsynaptic spikes. Spikes were detected from the first derivative of the V m signal with

and the spike frequency calculated from the resulting spike events. This relationship between the postsynaptic spike frequency and the rate of simulated synaptic inputs is plotted in figure 3c for four different values of g A. The code for example 2 is given in the electronic supplementary material, listing 2.

Figure 3. (a) Recorded intracellular voltage following conductance injections of a unitary simulated synaptic conductance, in the presence of A-type potassium conductances of increasing magnitude (values given are for the maximal conductance g A; 0, solid line; 10, large-dashed line; 40, small-dashed line; 100 nS, small-dashed line). (b) Similar to (a), but with a simulated presynaptic spike train with inter-spike intervals drawn from a Poisson distribution (here a mean of 120 s⁻¹; the spike trains used to test the different levels of A-type conductance are identical). (c) The postsynaptic spike rate plotted against the rate of simulated presynaptic inputs, with g A as in (b).

3. Discussion

We present a new approach to performing and communicating experimental science. Our use of typed, functional and reactive programming overcomes at least two long-standing issues in bioinformatics: the need for a flexible ontology to share heterogeneous data from physiological experiments [15] and a language for describing experiments and data provenance unambiguously [23,50].

The types we have presented form a linguistic framework and an ontology for physiology. Thanks to the flexibility of parametric polymorphism, our ontology can form the basis for the interchange of physiological data and metadata without imposing unnecessary constraints on what can be shared. The ontology is non-hierarchical and would be difficult to formulate in the various existing semantic web ontology frameworks (web ontology language, [29], or resource description framework), which lack parametric polymorphism and functional abstraction. Nevertheless, by specifying the categories of mathematical objects that constitute evidence, it is an ontology in the classical sense of cataloguing the categories within a specific domain, and providing a vocabulary for that domain. We emphasize again that it is an ontology of evidence, not of the biological entities that give rise to this evidence. It is unusual as an ontology for scientific knowledge in being embedded in a computational framework, such that it can describe not only mathematical objects but also their transformations and observations. Recent work on metadata representation [51] has focused on delineating the information needed to repeat an experiment [43,52], but in practice it is often not clear a priori what aspects of an experiment could influence its outcome. With CoPE, our main goal was to describe unambiguously machine-executable aspects of the metadata of an experiment. Nevertheless, any information that can be captured in a type can be represented in the temporal contexts provided by CoPE, i.e. it can exist as signals, events or durations, as we have demonstrated in the code listings in the electronic supplementary material. Here, we make no fundamental distinction between the representation of data and metadata. All relevant information about an experiment is indexed by time and thus linked by overlap on a common time scale.

Parametric polymorphism and first-class functions are generally associated with research-oriented languages such as Haskell and ML rather than mainstream programming languages such as C++ or Java. It is likely that much of CoPE could be implemented in C++, where template metaprogramming implements a static form of parametric polymorphism, and template functors can be used to represent functions. These must all be resolved at compile-time, however; so it is difficult to use dynamically calculated or arbitrarily complex functions. The importance of true first-class functions and parametric polymorphism is becoming increasingly well recognized, and these features are now being implemented in the mainstream programming languages C++, C# and Java. We therefore expect that it will soon be possible to implement the formalism we are proposing in a wide range of programming languages.

Our mathematical definitions are unambiguous and concise, unlike typical definitions written in natural language, and are more powerful than those specified by graphical user interfaces or in formal languages that lack a facility for defining abstractions. Our framework is not only a theoretical formalism; we show that it can also be implemented as a practical tool. This tool consists of a collection of computer programs for executing experiments and analyses, and for carrying out data management. By using programs that share data formats, experimentation and analysis can be controlled efficiently, and many sources of human error can be eliminated. Existing tools use differential equations to define dynamic clamp experiments (model reference current injection; [53]) or simulations (X-Windows phase plane; [54]). Here, we show that a general (polymorphic) definition of signals and events, embedded in the lambda calculus, can define a much larger range of experiments and the evidence that they produce. Our experiment definitions have the further advantage that they compose; that is, more complex experiments can be formulated by joining together elementary building blocks.

Our full approach is particularly relevant to the execution of very complex and multi-modal experiments, which may need to be dynamically reconfigured based on previous observations, or to disambiguate difficult judgements about evidence [55]. Even if used separately, however, individual aspects of CoPE can make distinct contributions to scientific methodology. For instance, our ontology for physiological evidence can be used within more conventional programming languages or web applications that facilitate data sharing. In a similar way, the capabilities of CoPE for executing and analysing experiments could provide a robust core for innovative graphical user interfaces. We expect the formalism presented here to be applicable outside neurophysiology. Purely temporal information from other fields could be represented using signals, events and durations, or other kinds of temporal context formalized in type theory. In addition, the concepts of signals, events and durations could be generalized to allow not just temporal but also spatial or spatio-temporal contexts to be associated with specific values. Such a generalization is necessary for CoPE to accommodate data from the wider neuroscience community, including functional neuroanatomy and microscopy, and other scientific disciplines that observe and manipulate spatio-temporal data.

We have argued that observational data, experimental protocols and analyses formulated in CoPE (or similar frameworks) are less ambiguous and more transparent than those described using many current formulations. The structure of CoPE presents additional opportunities for mechanically excluding some types of procedural errors in drawing inferences from experiments. For instance, CoPE could incorporate an extension to simple type theory [56] that adds not only a consistency check for dimensional units, but also powerful aspects of dimensional analysis, such as Buckingham's π-theorem. Furthermore, in our formulation, experimentally observed values exist as mathematical objects within a computational framework that can be used to define probability distributions. Hierarchical probabilistic notation [57] permits the construction of flexible statistical models for the directly observed data. This means that CoPE could in principle be used to turn such probabilistic models into powerful data analysis tools accessible from within the CoPE formalism. Such data analysis procedures can largely be automated, for instance by calculating parameter estimates using statistical packages such as WinBUGS [58] or MLwiN [59]. Alternatively, it would be possible to compile descriptions of probabilistic models in CoPE to the specification format used by the AutoBayes [60] system, which would then be used to generate efficient code for statistical inference. When run, this code would return data to CoPE. This analysis workflow could be integrated seamlessly into an implementation of CoPE such that both the hierarchical model structure and the returned parameters would be defined by, and tagged with, the appropriate temporal contexts. This methodology could be used to quantify different aspects of uncertainty in the measurements, taking into account all available information, while largely avoiding ad hoc transformations of data.
Integrating well-developed statistical tools with data acquisition and manipulation within CoPE would create a powerful platform for validating inferences drawn from physiological experiments.
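The idea of a consistency check for dimensional units can be illustrated with phantom types in plain Haskell. This is only a rough sketch under stated assumptions: the unit tags, the `Quantity` wrapper and the functions `add` and `current` are hypothetical names, and the encoding is far weaker than the type-theoretic extension cited above, since it cannot derive the unit of a product automatically or perform dimensional analysis.

```haskell
module Main where

-- Hypothetical unit tags; they have no values and exist only at the type level.
data Volt
data Ampere
data Siemens

-- A number tagged with its unit. The tag `u` is a phantom type parameter.
newtype Quantity u = Quantity Double deriving (Show, Eq)

-- Addition is only defined for two quantities that carry the same unit.
add :: Quantity u -> Quantity u -> Quantity u
add (Quantity x) (Quantity y) = Quantity (x + y)

-- Ohm's law: a conductance times a voltage gives a current. The unit of
-- the result is stated by hand in the signature, not derived.
current :: Quantity Siemens -> Quantity Volt -> Quantity Ampere
current (Quantity g) (Quantity v) = Quantity (g * v)

main :: IO ()
main = do
  let g            = Quantity 2.0e-9 :: Quantity Siemens
      drivingForce = add (Quantity (-0.065)) (Quantity 0.010) :: Quantity Volt
  print (current g drivingForce)
  -- The following line would be rejected by the type checker, because a
  -- conductance cannot be added to a voltage:
  -- print (add g drivingForce)
```

The point of the sketch is that the unit error is caught before the program runs, which is the kind of mechanical exclusion of procedural errors discussed above.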

4. Experimental Procedures

4.1. Language implementation

We have used two different implementation strategies for reasons of rapid development and execution efficiency. For experimentation and simulation, we have implemented a prototype compiler that can execute some programs that contain signals and events defined by mutual recursion, as is necessary for the experiments in this study. The program is transformed by the compiler into a normal form that is translated to an imperative program that iteratively updates variables corresponding to signal values, with a time step that is set explicitly. The program is divided into a series of stages, where each stage consists of the signals and events defined by mutual recursion, subject to the constraints of input–output sources and sinks. This ensures that signal expressions can reference values of other signals at arbitrary time points (possibly in the future) as long as referenced signals are computed in an earlier stage.
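The target of this translation, a loop that updates signal variables with an explicit time step, might look roughly as follows. The sketch is written as a pure Haskell iteration instead of generated imperative code, and the `State` record, the passive membrane equation and all parameter values are assumptions chosen for illustration; they are not taken from the prototype compiler.

```haskell
module Main where

-- One snapshot of the variables that the update loop maintains.
data State = State { time :: Double, voltage :: Double } deriving Show

dt, vRest, rm, cm :: Double
dt    = 1.0e-4     -- explicit time step in seconds (assumed)
vRest = -0.065     -- resting potential in volts (assumed)
rm    = 1.0e8      -- membrane resistance in ohms (assumed)
cm    = 1.0e-10    -- membrane capacitance in farads (assumed)

-- Hypothetical input signal: a current step switched on at 10 ms.
injected :: Double -> Double
injected t = if t >= 0.01 then 1.0e-10 else 0

-- One update: the new voltage depends on the previous voltage, which is
-- how a recursively defined signal becomes an iteratively updated variable.
step :: State -> State
step (State t v) = State (t + dt) (v + dt * dv)
  where dv = (negate (v - vRest) / rm + injected t) / cm

-- The first n states, starting from the given initial state.
run :: Int -> State -> [State]
run n = take n . iterate step

initial :: State
initial = State 0 vRest

main :: IO ()
main = print (last (run 300 initial))
```

In this sketch the voltage stays at rest until the current step begins and then relaxes towards a depolarized value.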

In order to calculate a new value from existing observations after data acquisition, we have implemented the calculus of physiological evidence as a domain-specific language embedded in the purely functional programming language Haskell.
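A typical calculation of this kind derives events from a recorded signal, for example by detecting upward threshold crossings. The sketch below assumes a hypothetical `Sampled` record for a signal stored at a fixed sampling period and a hypothetical function `crossesUp`; neither name is taken from the CoPE source code.

```haskell
module Main where

type Time = Double

-- A recorded signal: start time, sampling period and the sample values.
data Sampled a = Sampled { start :: Time, period :: Time, samples :: [a] }

-- | Times at which the signal rises from below the threshold to at or
-- above it. The crossing is assigned to the later of the two samples.
crossesUp :: Double -> Sampled Double -> [(Time, ())]
crossesUp threshold (Sampled t0 dt xs) =
  [ (t0 + dt * fromIntegral i, ())
  | (i, (a, b)) <- zip [1 :: Int ..] (zip xs (drop 1 xs))
  , a < threshold
  , b >= threshold
  ]

-- Hypothetical recording sampled every 0.5 s.
recording :: Sampled Double
recording = Sampled 0 0.5 [0, 0, 1, 1, 0, 1]

main :: IO ()
main = print (crossesUp 0.5 recording)
```

The result is a list of time-stamped unit values, so it can be combined with other events or durations on the same time scale.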

For hard real-time dynamic clamp experiments, we have built a compiler back-end targeting the LXRT (user-space) interface to the real-time application interface (RTAI; http://rtai.org) extensions of the Linux kernel, and the Comedi (http://comedi.org) interface to data acquisition hardware. Geometric shapes were rendered using OpenGL (http://opengl.org).

All code used for experiments, data analysis and generating figures is available at http://github.com/glutamate/bugpan under the GNU general public licence (http://www.gnu.org/licenses/gpl.html).

4.2. Locust experiments

Locusts were maintained at 1600 m−3 in 50 × 50 × 50 cm cages under a standard light and temperature regime of 12 h light at 36°C : 12 h dark at 25°C. They were fed ad libitum with fresh wheat seedlings and bran flakes. Recordings from locust DCMD neurons were performed as described previously [61]. Briefly, locusts were fixed in plasticine, with their ventral side facing upwards. The head was fixed with wax and the connectives were exposed through an incision in the soft tissue of the neck. A pair of silver wire hook electrodes was placed underneath a connective, and the electrodes and connective were enclosed in petroleum jelly. The electrode signal was amplified 1000× and bandpass filtered between 50 and 5000 Hz, before analogue-to-digital conversion at 18 bits and 20 kHz with a National Instruments PCI-6281 board. The locust was placed in front of a 22″ CRT monitor running with a vertical refresh rate of 160 Hz. All aspects of the visual stimulus displayed on this monitor and of the analogue-to-digital conversion performed by the National Instruments board were controlled by programs written in CoPE running on a single computer.

The code for running the trials described in example 1, including relevant metadata, is given in the electronic supplementary material, listing 1.

4.3. Zebrafish experiments

Zebrafish were maintained according to established procedures [62] in approved tank facilities, in compliance with the Animals (Scientific Procedures) Act 1986 and according to University of Leicester guidelines. Intracellular patch-clamp recordings from motor neurons in the spinal cord of a 2-day-old zebrafish embryo were performed as described previously [63]. We used a National Instruments PCI-6281 board to record the output from a BioLogic patch-clamp amplifier in current-clamp mode, filtered at 3 kHz and digitized at 10 kHz, with the output current calculated at the same rate by programs written in CoPE targeted to the RTAI back-end (see earlier text). The measured jitter for updating the output voltage was 6 µs and was similar to that measured with the RTAI latency test tool for the experiment control computer.

The code for the Zebrafish experiment trials, including relevant metadata, is given in the electronic supplementary material, listing 2.

Acknowledgements

We thank Jonathan McDearmid for help with the Zebrafish recordings and Angus Silver, Guy Billings, Antonia Hamilton, Nick Hartell and Rodrigo Quian Quiroga for critical comments on the manuscript. This work was funded by a Human Frontier Science Project fellowship to T.N., a Biotechnology and Biological Sciences Research Council grant to T.M. and T.N., a BBSRC Research Development Fellowship to T.M., and Engineering and Physical Sciences Research Council grants to H.N. The author contributions were as follows: T.N. designed and implemented CoPE, carried out the experiments and data analyses, and wrote the draft of the paper. H.N. contributed to the language design, helped clarify the semantics and wrote several sections of the manuscript. T.M. contributed to the design of the experiments and the data analysis, and made extensive comments on drafts of the manuscript. All authors obtained grant funding to support this project.

Footnotes