Two dendrites

A comparison between the two types of learning processes is first examined using a prototypical feedforward network, a perceptron12,13,14 consisting of three input nodes, one output node and three weights with given delays and initial strengths (Fig. 2a). For synaptic learning, the adjustable parameters are the three weight strengths (color coded in Fig. 2a). For dendritic learning, the weight strengths (W) are unchanged and the adjustable parameters are the two dendritic strengths (\(W_D\) in Fig. 2b), connected to the first and to the last two input units, respectively. The synaptic and dendritic adaptations are identical and are based on the currently accepted modified Hebbian learning rule, known as spike-timing-dependent plasticity3,4,11. Specifically, the relative change in the strength of a weight, δW, during a learning step is a function of the time-lag, Δ, between an above-threshold stimulation, resulting in an evoked spike, and a stimulation that does not result in an evoked spike, e.g. a sub-threshold stimulation. A positive/negative Δ strengthens/weakens a weight following a typical profile (Fig. 2c).

Figure 2 Different stationary firing patterns and weights for learning by links and nodes in a feedforward network. (a) A schema of a perceptron with three input units, connected to one output unit with weights and delays (w, τ, color coded). The relative change in the strength of a weight during a learning step is δW (defined in c). The dynamics of each unit are governed by a leaky integrate-and-fire neuron model (Methods). (b) The same perceptron and delays as in a, but the first input is connected to the output node via the left-dendrite, while the two other inputs are connected via the right-dendrite. The dendritic weights and their relative changes during a learning step are denoted by \(W_D\) and δW, respectively. The initial weights for both dendrites are \(W_D\) = 1. (c) Left: A typical profile for δW = 0.05*exp(−|Δ|/15)*sign(Δ) during a learning step, where Δ stands for the time-lag between a spike and a sub-threshold stimulation, measured in ms. Right: Scenarios for positive/negative Δ; spikes are colored in orange and sub-threshold stimulations are denoted by (green and red) hills. (d) An example of the initial three weights (color coded), where the input units are simultaneously stimulated at 10 Hz. Left: The dynamical evolution of the three weights in a (bottom) and the firing timings of the output unit (dots at upper part), colored following the origin of the above-threshold stimulation. Right: Results for b, where initially \(W_D\) = 1. (e) Similar to d but with different initial weights. The stationary firing patterns in d and e are the same for synaptic learning in a, but differ for dendritic learning in b.
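
The profile quoted for Fig. 2c can be written as a small function. This is a minimal sketch that assumes only the formula given in the caption; the name `delta_w`, its parameter names, and the choice that a zero time-lag produces no change are illustrative assumptions, not details taken from the Methods.

```python
import math

def delta_w(lag_ms: float, amplitude: float = 0.05, decay_ms: float = 15.0) -> float:
    """Relative weight change for a time-lag (ms) between a spike and a
    sub-threshold stimulation: amplitude * exp(-|lag| / decay) * sign(lag)."""
    if lag_ms == 0:
        return 0.0  # assumption: sign(0) = 0, so simultaneous events change nothing
    sign = 1.0 if lag_ms > 0 else -1.0
    return amplitude * math.exp(-abs(lag_ms) / decay_ms) * sign

# Tabulate the profile for a few lags on both sides of the spike.
for lag in (-30, -5, 5, 30):
    print(f"lag = {lag:+4d} ms -> delta_w = {delta_w(lag):+.4f}")
```

The profile is antisymmetric: a sub-threshold stimulation arriving after the spike (positive Δ) strengthens the weight by the same amount that an equally early one (negative Δ) weakens it.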

The input units are stimulated above threshold simultaneously at 10 Hz, and the following standard leaky integrate-and-fire model15,16 is used to evaluate the dynamics of the output neuron for both scenarios (Fig. 2a,b). Specifically, we simulated a perceptron consisting of N = 3 excitatory leaky integrate-and-fire input neurons and one output neuron. The voltage V(t) of the output neuron is given by the equation:

$$\frac{dV}{dt}=-\frac{V-V_{st}}{\tau}+\sum_{i=1}^{N}(W_{i}\ast W_{Di})\sum_{n}\delta(t-t_{i}(n)-d_{i}) \tag{1}$$

where \(W_i\) and \(W_{Di}\) are the strengths of the connection and of the dendrite from neuron i to the output neuron, respectively, \(d_i\) is the delay from neuron i to the output neuron, and N stands for the number of input neurons. τ = 20 ms is the membrane time constant and \(V_{st}\) = −70 mV stands for the stable membrane (resting) potential. The summation over \(t_i(n)\) runs over all the firing times of neuron i. A neuronal threshold is defined as \(V_{th}\) = −54 mV, and a threshold crossing results in an evoked spike. In the event of synaptic learning, for every pair of a sub-threshold stimulation and an evoked spike, the weights, \(W_i\), were modulated according to the learning curve (Fig. 2c). Similarly, in the event of dendritic learning, for every pair of a sub-threshold stimulation and an evoked spike originating from two different dendrites, the dendritic weights, \(W_{Di}\), were modulated following the learning curve (Fig. 2c, see Methods for more details).
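
As an illustration of Eq. (1), the response of the output neuron to one simultaneous stimulation of the inputs can be sketched in an event-driven form. The sketch relies on several assumptions that are not stated in the text: the voltage is reset to the resting potential after a spike, no refractory period is modeled, and the weight values (in mV) are hypothetical. The delays of 12, 7 and 15 ms echo the example discussed for Fig. 2d, and the function name `simulate_one_stimulation` is illustrative.

```python
import math

TAU_MS = 20.0      # membrane time constant (from the text)
V_REST_MV = -70.0  # resting potential V_st (from the text)
V_TH_MV = -54.0    # firing threshold V_th (from the text)

def simulate_one_stimulation(weights, dendritic_weights, delays_ms):
    """Return (arrival_ms, input_index, spiked) for every delayed input."""
    arrivals = sorted((d, i) for i, d in enumerate(delays_ms))
    v, t_prev = V_REST_MV, 0.0
    events = []
    for t, i in arrivals:
        # Leak term: exponential relaxation toward rest since the last event.
        v = V_REST_MV + (v - V_REST_MV) * math.exp(-(t - t_prev) / TAU_MS)
        # Delta-function input: instantaneous jump by the effective weight.
        v += weights[i] * dendritic_weights[i]
        spiked = v >= V_TH_MV
        if spiked:
            v = V_REST_MV  # assumption: reset to rest, no refractory period
        events.append((t, i, spiked))
        t_prev = t
    return events

events = simulate_one_stimulation(
    weights=[17.0, 5.0, 5.0],           # hypothetical strengths in mV
    dendritic_weights=[1.0, 1.0, 1.0],  # initial dendritic strengths of 1
    delays_ms=[12.0, 7.0, 15.0],
)
for t, i, spiked in events:
    print(f"t = {t:4.1f} ms, input {i}: {'spike' if spiked else 'sub-threshold'}")
```

With these hypothetical weights only the 12 ms input crosses the 16 mV gap between rest and threshold, while the inputs arriving at 7 ms and 15 ms remain sub-threshold.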

For synaptic learning (Fig. 2d, left), evoked spikes are initially generated by the orange-weight, and the preceding/later sub-threshold stimulation weakens/strengthens the green/red weight, respectively (Fig. 2c). Asymptotically, the green-weight vanishes, the red-weight reaches threshold, and the perceptron repeatedly generates pairs of evoked spikes (Fig. 2d, left).

For dendritic learning (Fig. 2d, right), an evoked spike is generated by the left dendrite after 12 ms (orange) and two sub-threshold stimulations arrive via the right dendrite after 7 ms (green) and 15 ms (red), i.e. 5 ms before and 3 ms after the evoked spike, respectively. Consequently, the right dendrite strengthens on average (Fig. 2c). Asymptotically, all three effective weights, \(W \ast W_D\), are above threshold and generate triplets of evoked spikes (Fig. 2d, right).
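
The claim that the right dendrite strengthens on average can be checked with the learning profile of Fig. 2c. The sketch below assumes that the two contributions simply add within one stimulation cycle, which is a first-order reading of the rule and not a detail confirmed by the text; the helper `delta_w` is an illustrative name.

```python
import math

def delta_w(lag_ms, amplitude=0.05, decay_ms=15.0):
    """Learning profile quoted for Fig. 2c; lag = t_subthreshold - t_spike (ms)."""
    if lag_ms == 0:
        return 0.0  # assumption: no change for simultaneous events
    return amplitude * math.exp(-abs(lag_ms) / decay_ms) * (1.0 if lag_ms > 0 else -1.0)

spike_ms = 12.0                         # evoked spike via the left dendrite
right_dendrite_inputs_ms = (7.0, 15.0)  # sub-threshold arrivals via the right dendrite

changes = [delta_w(t - spike_ms) for t in right_dendrite_inputs_ms]
net_change = sum(changes)  # assumption: contributions add within one cycle

for t, c in zip(right_dendrite_inputs_ms, changes):
    print(f"input at {t:4.1f} ms: delta_w = {c:+.4f}")
print(f"net change per cycle: {net_change:+.4f}")
```

The earlier input (Δ = −5 ms) contributes about −0.036 and the later one (Δ = +3 ms) about +0.041, so the net change per cycle is small but positive, roughly +0.005.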

For synaptic learning, the same perceptron with different initial weights results in the same firing pattern and weight strengths (Fig. 2d,e, left). For dendritic learning (Fig. 2e, right), the right-dendrite strengthens such that the red-weight is effectively above threshold, while the green-weight is still sub-threshold (Fig. 2e, right at ~15 s). The firing pattern now consists of orange-red pairs of spikes (Fig. 2e, right); however, the learning process continues. The green sub-threshold stimulation arrives before the orange-spike, resulting in the weakening of the right-dendrite and the termination of red-spikes. The right-dendrite then strengthens again as in Fig. 2d, and so forth, resulting in a complex firing pattern with a longer periodicity.

The examples presented in Fig. 2d,e hint at two major differences between the two learning scenarios. For the same architecture but different initial weights, synaptic learning tends to stabilize on the same firing pattern, whereas dendritic learning may result in a variety of firing patterns. In addition, synaptic learning drives weights to extreme limits17,18, vanishing or threshold, whereas dendritic learning enables stabilization around intermediate values.

Three dendrites

An extension to a perceptron with seven inputs (N = 7 in Eq. (1)) and with three dendrites enriches the fundamental differences between the two adaptive dynamics (Fig. 3a,b). The seven delays (Fig. 3a,b, bottom) and initial weights (Fig. 3c1 and d1, top) are identical for both scenarios, and the input units are simultaneously stimulated above threshold at 10 Hz. Weights in synaptic learning are again driven toward vanishing or threshold limits (Figs 3c1 and 2d,e); dendritic learning, however, reveals a new phenomenon, oscillatory behavior of the weights. These trends are explained using several snapshots of the effective weights, color coded and ordered following their delays, representing different stages of the dynamics (Fig. 3c2 and d2). Since the neuronal voltage decays to the resting potential after an input arrival (Methods), a necessary condition to generate an evoked spike is an effective weight which reaches \(\widetilde{{\rm{Th}}}\), the difference between the threshold and the current neuronal voltage. For synaptic learning, initially only the dark-orange weight is at threshold (panel A in Fig. 3c2). Following the learning rule (Fig. 2c), the strengths of all weights with longer/shorter delays increase/decrease (panel B in Fig. 3c2), until only vanishing weights or weights at threshold remain (panel C in Fig. 3c2).
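
The spiking condition in terms of \(\widetilde{{\rm{Th}}}\) can be made concrete with a short sketch. It assumes that an effective weight acts as an instantaneous voltage jump measured in mV, as the delta-function input of Eq. (1) suggests; the function names and the 10 mV and 9 mV values are hypothetical.

```python
import math

TAU_MS = 20.0      # membrane time constant (from the text)
V_REST_MV = -70.0  # resting potential (from the text)
V_TH_MV = -54.0    # threshold (from the text)

def voltage_after(v_mv: float, elapsed_ms: float) -> float:
    """Voltage after leaking toward rest for elapsed_ms without new input."""
    return V_REST_MV + (v_mv - V_REST_MV) * math.exp(-elapsed_ms / TAU_MS)

def threshold_gap(v_mv: float) -> float:
    """Th-tilde: difference between the threshold and the current voltage."""
    return V_TH_MV - v_mv

def evokes_spike(effective_weight_mv: float, v_mv: float) -> bool:
    """An input evokes a spike if its effective weight reaches the gap."""
    return effective_weight_mv >= threshold_gap(v_mv)

# A hypothetical 10 mV sub-threshold input arrives at rest; 5 ms later the
# residual depolarization has lowered the gap that the next input must reach.
v_later = voltage_after(V_REST_MV + 10.0, elapsed_ms=5.0)
print(f"gap at rest:        {threshold_gap(V_REST_MV):.2f} mV")
print(f"gap 5 ms afterward: {threshold_gap(v_later):.2f} mV")
print("9 mV input at rest spikes:", evokes_spike(9.0, V_REST_MV))
print("9 mV input 5 ms later spikes:", evokes_spike(9.0, v_later))
```

Because of the residual depolarization, an effective weight that is sub-threshold at rest can evoke a spike when it arrives shortly after another input, which is why the gap rather than the fixed 16 mV distance between rest and threshold is the relevant quantity.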

Figure 3 Dendritic learning as a self-controlled mechanism for oscillating weights, governed by the weak links. (a) A schema of a perceptron with seven inputs with weights and delays (w, τ, color coded). Changes in weights during the learning are defined in Fig. 2c, and the dynamics of the output are governed by a leaky integrate-and-fire neuron model (Methods). (b) A similar perceptron and delays as in a, but the output unit has three dendrites (color coded), red/green/orange connecting 3/2/2 input units, respectively, and with given initial dendritic weights, \(W_D\). (c1) The initial seven weights in a are denoted (top, color coded). The input units are simultaneously stimulated at 10 Hz and the resulting dynamical evolution of the seven adaptive weights is presented. (c2) Schematic presentation of the seven weights, ordered following their delays, with respect to \(\widetilde{{\rm{Th}}}\), the difference between the threshold and the current neuronal voltage, at three denoted timings (A–C) in c1. (d1) Similar to c1, where the same initial seven weights are now time-independent and the three dendritic weights in b are dynamically updated. (d2) Similar to c2, where the seven effective weights \(W \ast W_D\) (color coded following their dendrites and ordered following their delays) are presented at five denoted timings (A–E) in d1. (e) and (f) present similar results as in c1 and d1, respectively, for a different set of seven initial weights.

For dendritic learning, initially only the effective orange weight (\(W \ast W_D\)) is at threshold (panel A in Fig. 3d2), generating a spike 40 ms after each input stimulation. Consequently, the red-dendrite, and effectively its three incoming weights, strengthen (panel B in Fig. 3d2), since its nearby sub-threshold input, via the 50 ms pink-weight, arrives 10 ms later (Fig. 3b). Similarly, the strength of the green-dendrite decreases, as it generates sub-threshold stimulations prior to the evoked spikes. Spikes are now generated after 5 ms and also after 50 ms (panel B in Fig. 3d2), and the strength of the orange-dendrite rapidly decreases, since its sub-threshold input arrives just before, after 46 ms (the origin of the orange decay-slope shape is demonstrated in Fig. S1). The red-evoked spike at 5 ms now rapidly strengthens the green-dendrite (with 20 ms and 25 ms delays) until it generates evoked spikes (panels B and C in Fig. 3d2). The sub-threshold stimulations of the 10 ms red-weight, arriving before the green-spikes, weaken the red-dendrite, and the red-spikes terminate (panel C in Fig. 3d2). In addition, the orange-dendrite strengthens and finally generates evoked spikes, as its sub-threshold stimulations arrive after the green-spikes (panel D in Fig. 3d2). The green-spikes now terminate, as a result of the green sub-threshold stimulation at 25 ms, prior to the orange-spikes (panels D,E in Fig. 3d2). A loop of the weight strengths emerges (panels E and A in Fig. 3d2), generating an oscillatory behavior (Fig. 3d1). Identical architectures (Fig. 3a,b) with different initial weights (Fig. 3e,f) again result in extreme limit weights for synaptic learning, but in a different oscillatory behavior for dendritic learning.
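
The direction of the first step in this loop (panel A) can be reproduced with the learning profile of Fig. 2c. The sketch below is an illustration under explicit assumptions: the delays and their assignment to dendrites are read off the prose above (red: 5, 10 and 50 ms; green: 20 and 25 ms; orange: 40 and 46 ms), contributions from all spike and sub-threshold pairs on different dendrites are summed, and the function names are hypothetical. The Methods may specify the pairing differently.

```python
import math

def delta_w(lag_ms, amplitude=0.05, decay_ms=15.0):
    """Learning profile quoted for Fig. 2c; lag = t_subthreshold - t_spike (ms)."""
    if lag_ms == 0:
        return 0.0  # assumption: no change for simultaneous events
    return amplitude * math.exp(-abs(lag_ms) / decay_ms) * (1.0 if lag_ms > 0 else -1.0)

# Assumption: delays (ms) per dendrite as inferred from the description in the text.
DELAYS_BY_DENDRITE = {"red": (5, 10, 50), "green": (20, 25), "orange": (40, 46)}

def net_dendritic_change(spikes, delays_by_dendrite):
    """Sum delta_w over every (spike, sub-threshold input) pair that
    originates from two different dendrites. spikes: set of (dendrite, ms)."""
    net = {}
    for dendrite, delays in delays_by_dendrite.items():
        total = 0.0
        for t_sub in delays:
            if (dendrite, t_sub) in spikes:
                continue  # this input evoked a spike, so it is not sub-threshold
            for spike_dendrite, t_spike in spikes:
                if spike_dendrite != dendrite:
                    total += delta_w(t_sub - t_spike)
        net[dendrite] = total
    return net

# Panel A: only the orange input with a 40 ms delay evokes a spike.
panel_a = net_dendritic_change({("orange", 40)}, DELAYS_BY_DENDRITE)
for dendrite, change in panel_a.items():
    print(f"{dendrite:>6}: net change {change:+.4f}")
```

Under these assumptions the red-dendrite gains strength, because its 50 ms input follows the spike closely, while the green-dendrite loses strength, because both of its inputs precede the spike, in line with the description of panels A and B.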

Synaptic learning terminates in vanishing or threshold weights, independent of the initial conditions (Figs 2 and 3), which is biologically unrealistic. In addition, the large fraction of very weak weights has practically no impact on the dynamics. In contrast, dendritic learning can stabilize weights with intermediate strengths (Fig. 2e, right) and oscillatory behaviors (Fig. 3d1,f), which are significantly and instantaneously governed by the sub-threshold stimulations originating from the weak effective couplings.

Experimental results

The theoretical concept of dendritic learning is supported by the following new type of in-vitro experiments, where synaptic blockers are added to neuronal cultures such that sparse synaptic connectivity is excluded (Methods). A multi-electrode array (Fig. S2) is used to stimulate extracellularly a patched neuron19 via its dendrites (Fig. 4a). An online method is used to identify a subset of extracellular electrodes which reliably generate intracellularly evoked spikes (Fig. 4b). Low stimulation rates (e.g. 1 Hz) ensure stable neuronal response latencies (NRL), i.e. the time-lag between the extracellular stimulation and the intracellularly recorded evoked spike, which is crucial for controlling the relative timings between pairs of intra- and extracellular stimulations20.

Figure 4 Experimental results indicating enhanced dendritic learning following the relative timings of the neuronal anisotropic inputs, similar to the mechanism currently attributed to links. (a1) A zoom-in of a micro-electrode array (MEA) consisting of 60 extracellular electrodes separated by 200 µm, indicating a patched neuron by an intracellular electrode (orange) and a nearby extracellular electrode (green). (a2) A patched neuron and its dendrites (red), growing in different directions and in proximity to extracellular electrodes, are presented using a reconstruction of a fluorescence image (Methods). (b1) A procedure for finding reliable evoked spikes by a subset of extracellular electrodes. The 60 electrodes are stimulated serially several times at 2 Hz (presented here once) and the recorded intracellular voltage (blue in case of an evoked spike) is presented (Methods). (b2) A zoom-in of the pink area in b1, presenting evoked spikes originating from different extracellular stimulating electrodes and the NRL. (b3) A zoom-in of 4 electrodes from b1, showing electrodes with/without recorded evoked spikes. (c) A typical stimulation scheduling of the learning process, consisting of an above-threshold intracellular stimulation (orange) and a sub-threshold extracellular stimulation (green), separated by a fixed time delay, 2–5 ms (Methods); such pairs are given 50 times at 1 Hz. (d) A comparison between the spike waveform resulting from an intracellular stimulation (red) and the waveform where the same intracellular stimulation is followed by an adjacent extracellular sub-threshold stimulation (light-blue; scheduling at the bottom). (e) An intracellular voltage recording of a patched neuron stimulated extracellularly 5 times at 1 Hz at each noted stimulation amplitude (bottom). Left: Initial response before training. Right: Several minutes after training (Methods). δ measures the height of the voltage peak (local depolarization), averaged over 5 stimulations with a given amplitude, in comparison to the resting potential, indicating an enhancement of δ by 200–300% by learning. (f) Similar to e, where stimulations were given at 0.5 Hz. The effect of the learning is expressed by the appearance of spikes after training instead of a small depolarization before (at 500 mV).

The learning process is based on a training set of typically 50 pairs of stimulations, each consisting of an above-threshold intracellular stimulation followed by an extracellular stimulation which does not result in an evoked spike, e.g. a sub-threshold stimulation (Fig. 4c), arriving after a predefined delay, typically 2–5 ms, to enhance possible adaptation (Fig. 2c, Methods). We take into account only experimental realizations where a local depolarization was visible following a sub-threshold stimulation given consecutively to the above-threshold one (Fig. 4d). The demonstrated results were quantitatively repeated tens of times on many cultures (see statistical analysis in Methods).
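
The timing of such a training set can be laid out explicitly. The following is only a sketch of the schedule described above (50 pairs at 1 Hz with a fixed delay in the 2–5 ms range); the function name `training_schedule`, the choice of a 3 ms delay and the use of milliseconds are illustrative assumptions, and no hardware control is implied.

```python
def training_schedule(n_pairs: int = 50, rate_hz: float = 1.0, delay_ms: float = 3.0):
    """Return (intracellular_ms, extracellular_ms) stimulation times per pair.

    The above-threshold intracellular stimulation opens each period and the
    sub-threshold extracellular stimulation follows after delay_ms.
    """
    period_ms = 1000.0 / rate_hz
    return [(k * period_ms, k * period_ms + delay_ms) for k in range(n_pairs)]

schedule = training_schedule()
print(f"{len(schedule)} pairs, total duration about {schedule[-1][1] / 1000.0:.1f} s")
for intra_ms, extra_ms in schedule[:3]:
    print(f"intracellular at {intra_ms:7.1f} ms, extracellular at {extra_ms:7.1f} ms")
```

In this sketch, a negative `delay_ms` would place the sub-threshold stimulation before the above-threshold one.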

The intracellular voltage recordings of a patched neuron stimulated extracellularly before and a few minutes after training (Fig. 4c) present a significant effect of the learning in the form of a 200–300% increase in the local depolarization (Fig. 4e). This learning effect emerges only a few minutes after the termination of the training procedure and was found to be stable and persistent over longer periods (by repeated measurements of solely extracellular stimulations over tens of minutes). Further evidence for such learning is the enhancement of the effect of extracellular stimulations from a small local depolarization to evoked spikes recorded intracellularly (Fig. 4f). Note that before training, the responsiveness of neurons was found to be time-independent over tens of minutes.

A reverse learning procedure, presenting the sub-threshold stimulation prior to the above-threshold one, was also examined in tens of experiments, indicating either no effect or a weakening of the local depolarization, but no strengthening (Fig. 5). This suggests, as indicated by some preliminary results, the possibility of first strengthening and then weakening the local depolarization, using sequential learning and reverse learning.