The most effective forms of learning in large networks of neurons rely on mechanisms that adjust synaptic weights according to errors that are detected further downstream14,39. In re-examining the conditions under which neural networks can exhibit such forms of deep learning, we have identified a new algorithm that we call feedback alignment. We show that in its simplest form, feedback alignment is able to make use of fixed, random connectivity patterns to update synaptic weights throughout a network. To our surprise, even with such minimal constraints on connectivity patterns, feedback alignment can achieve learning performances that are comparable to the backpropagation of error algorithm. Critically, this demonstrates that the kind of precise symmetric connectivity between layers of neurons that is required by backprop, is not essential to achieve effective transmission of errors across layers. In characterizing the performance of feedback alignment, we first demonstrated that the algorithm is effective in using error signals to update synaptic weights in simple linear and nonlinear networks. We then showed that feedback alignment is also effective in larger networks that incorporate multiple hidden layers and in networks that exhibit sparse connectivity or impose more realistic constraints on how activity is represented. Finally, our investigations into how feedback alignment works suggest that the algorithm’s power relies on the fact that the weight matrices of the forward going synapses evolve to align approximately with those in the feedback pathway. Taken together, our study reveals much lower architectural constraints on what is required for error propagation across layers of neurons and thus provides insights into how neural circuits might support fast learning in large deep networks.

Feedback alignment offers a surprising and simple solution to the problem of synaptic ‘weight transport’. As with many forms of learning that have been proposed to occur in the brain, it makes use of the idea that teaching signals could be carried by reciprocal connections7,21,24,25,29,40. However, in the case of feedback alignment we have shown that this does not depend on detailed symmetric reciprocal connectivity, and yet it is still able to train large networks quickly. There are, of course, many outstanding questions regarding how the brain could utilize learning processes that rely on error propagation to adapt upstream synaptic connections in a network21,22. This includes how exactly the brain computes and represents errors, and how the feedforward and feedback pathways might interact with one another. These issues are relevant for understanding any form of supervised learning and are not unique to the algorithm we describe. Nevertheless, these questions are important when considering the potential biological context for feedback alignment or future, related algorithms. In terms of error, an important question has been where the brain obtains labelled data for training a supervised system. A key insight has been that rather than requiring an external teacher, errors can result from mismatches between expected and actual perceptions, or between intended and realized motor consequences4,30,40. For example, it is possible to derive teaching signals from sensory input by trying to predict one modality from another, by trying to predict the next term in a temporal sequence40,41 or by trying to encode and reconstruct sensory information42,43. These processes can be thought of as supervised tasks, with the sensory activity itself playing the role of the teacher7,40,43,44. Indeed, experimental data from a range of systems have shown that neuronal populations represent prediction mismatch and motor errors in their activity1,2,3,5,6,7,8,45,46,47,48.

As with other forms of hierarchical learning, an important question is how feedforward and feedback pathways interact with one another in the brain. It is well established that there are extensive feedback pathways that carry information from ‘higher’ areas to ‘lower’ sensory areas and these connections have been shown to modulate the tuning properties and therefore the activity of neurons in lower areas49,50. It therefore seems likely (perhaps inevitable) that this top-down modulation of neuronal activity will impact the learning that goes on in the lower area neuron’s synapses. Indeed, recent work has demonstrated that learning in sensorimotor tasks alters representations in earlier cortical areas51. For higher layers to deliver error information that could enable lower layers to make useful changes to their synaptic weights, neurons in the lower layers should, at least in part, be able to differentiate a top-down error signal from activity originating in the forward pathway. Thus, a prediction is that one of the functions of the backward pathway is to ultimately modulate plasticity processes at the synapses of a neuron in the forward pathway. In this regard, it is interesting that experimental evidence has shown that various ‘third-factors’ can modulate the magnitude and sign of synaptic plasticity mechanisms. Depolarizing inputs arriving at specific times and/or subcellular compartments33,34,35, neuromodulators52,53 and different types of synapse37,38 can all regulate plasticity resulting from the pairing of pre- and post-synaptic activity. For example, Sjöström et al.33 demonstrated that Hebbian learning protocols that result in long-term potentiation at neocortical synapses can be altered to result in long-term depression if they occur simultaneously with local, subthreshold depolarizing inputs into the post-synaptic dendrite33,35. And more recently, an empirically grounded learning mechanism has been proposed in which forward and teaching signals are delivered concurrently into dendritic and somatic compartments, respectively36.

These observations suggest that there are a variety of plasticity mechanisms that would enable feedforward and feedback pathways to interact during learning. Indeed, any task-driven learning will require mechanisms that serve to modulate ongoing plasticity. Reinforcement learning, for example, requires the delivery of a global signal that can be thought of as a widespread third factor for regulating ongoing synaptic plasticity10,11. At the other end of the spectrum, a learning algorithm such as backprop would require a much more highly orchestrated computation and delivery of third factors to individual neurons in the hidden layer. By contrast, feedback alignment represents a surprising middle ground, in that it has many of the performance advantages of backprop, but it markedly reduces the complexity of the machinery for computing and delivering third factors: the modulatory signals in feedback alignment can be delivered via random connections by one or many neurons in the backward pathway, to one or many neurons in hidden layers, and the modulatory signals are themselves computed on the basis of random connections in the backward pathway.

Nevertheless, there remains many questions about how the brain uses error signals that are passed across multiple layers of neurons. For example, in our simplified models (for example, Fig. 2), error signals modulate the synaptic strengths of feedforward connections without affecting the post-synaptic activities. Whilst various third factors could play a role in delivering error signals without significantly altering activity in the forward path (see above), it seems more likely that feedback in real neuronal circuits will influence the post-synaptic activity in lower layers. Feedback alignment uses top-down connections to make small and gradual adjustments to the synaptic weights so that future data are processed in a more optimal manner. On a much faster timescale however, the same top-down connections are likely to be important for improving inference on the current inputs (that is, hidden variable estimation). A challenge for future work therefore, is to understand how top-down connections can be used to simultaneously support inference and learning15,40. A related question for the field is how error signals from higher layers can be integrated with bottom-up, unsupervised learning rules43,54.

A key insight from machine learning work is that the most powerful learning algorithms use some form of error propagation for gradient estimation, and that without gradient-based algorithms such as backprop, learning remains intractably slow for difficult problems. Recent advances in supervised learning have achieved state-of-the-art and even human-level performance by training deep networks on large data sets by applying variants of the backprop algorithm55,56. The most effective forms of reinforcement and unsupervised learning also rely on the ability to transmit detailed error information across multiple layers of neurons15,16. Recently, reinforcement learning has been used to achieve impressive results using simple temporal-difference error signals, but these results hinge crucially on backpropagation. These reinforcement signals are not delivered as a global modulatory signal, but are carefully backpropagated through the deep network that supports behaviour16. Unsupervised learning algorithms that obtain state-of-the-art results also rely on backprop, such as variational auto-encoders and networks that predict perceptual information in a sequence15,17.

Whilst theoretical, these advances in machine learning provide a context in which to examine different learning processes in the brain. In particular, they strengthen the case for looking beyond naive learning rules that broadcast the same global scalar summary of error to every neuron. Such rules are, on their own, likely too slow for training very large deep networks to perform difficult tasks. In this context it is again useful to think of feedback alignment as one of many algorithms that lie on a ‘spectrum’ between naive global updates and precise gradient-based updates. An interesting point on this spectrum is work showing that reinforcement learning on binary decision tasks can be sped up if, in addition to a global scalar reward, each neuron also receives information about the population decision11. An earlier study examined using asymmetric weights in the context of simple classification tasks solved via attention-gated reinforcement learning32, although this approach still made use of a global scalar reward. Moving a little closer to backprop, feedback alignment is extremely simple and makes few demands on connectivity, and yet it quickly learns to deliver useful estimates of the gradient tailored to individual neurons. Indeed, it is reasonable to suppose that there is a large family of algorithms that the brain might exploit to speed up learning by passing expectation or error information between layers of neurons. Although we have found that random feedback connections are remarkably good at conveying detailed error signals, we anticipate future algorithms that will better capture the details of neural circuits and incorporate different mechanisms for delivering effective teaching signals. Indeed, recent work presents further evidence that weight symmetry is not crucial for effective error propagation57 (Supplementary Notes 2–9). These experiments highlight the importance of the signs of the delivered gradients and that error propagation via asymmetric connections can be improved by techniques such as batch normalization58.

Our results also hint that more complex algorithms could benefit from the implicit dynamics inherent in feedback alignment, which naturally drive forward synapses into alignment with the backward matrices. For example, these dynamics may work well with architectures or circuits in which B is adjusted as well as W, to further encourage functional alignment between W and B (perhaps by training the backward weights to reproduce the activity of the layer below, as in layer-wise autoencoder training with untied weights). Finally, deep learning innovations may provide insight into other questions that have surrounded the implementation of learning in the brain. For example, while backprop is usually applied in artificial networks that transmit information using continuous rather than discrete stochastic values, recent developments in machine learning suggest roles for ‘spiking’ activities. Not only can backprop-like mechanisms work well in the context of discrete stochastic variables59, random transmission of activities also forms the basis of powerful regularization schemes like ‘dropout’56. These recent insights into learning in large, multi-layer networks provide a rich context for further exploring the potential for feedback alignment and related algorithms, which may help explain fast and powerful learning mechanisms in the brain.

The issue of how the brain might propagate detailed error signals from one region to another is a fundamental question in neuroscience. Recent theories of brain function have suggested that cortex uses hierarchical message passing wherein both predictions and prediction errors are communicated between layers, or areas, of cortex40,44. And recent experimental work has shown that high-level visual response properties in cortex are significantly better explained by models that are optimized by transmitting errors back across many layers of neurons60, than by models that are trained via layer-wise unsupervised learning. In the 1980s, new learning algorithms promised to provide insight into brain function14,21. But the most powerful of these (that is, learning by backpropagation of error) has seemed difficult to imagine implementing in the brain21,22. There are a number of questions about how neural circuits might implement error propagation, but one of the most central and enduring issues concerns the constraints on connectivity patterns between layers—that is, because backprop requires weight transport to tailor error signals for each neuron in a network21,22,24. Our observations and experimental results dispel the central assumptions implicit in the statement of the weight transport problem. Instead, we demonstrate that the constraints on the connectivity required to support effective error transport are much less demanding than previously supposed. Starting with random feedback, standard update rules quickly push the forward pathway into a soft alignment with the fixed feedback pathway, allowing relevant error information to flow. Taken together with recent theoretical and empirical advances, our work supports revisiting the idea of backprop-like learning in the brain and may provide insights into how neural circuits could implement fast learning in large deep networks.