The advent of inexpensive, high-quality recording equipment and the maturation of deep learning present an incredible opportunity for neuroscience, but the field needs to develop theoretical frameworks and computational tools to interpret the impending onslaught of data.

We are all familiar with the standard paradigm in systems neuroscience: study a simplified behavior in a model organism while measuring neural activity in brain structures thought to subserve the behavior, analyze responses from a single neuron or, more recently, a population of neurons in one area, and correlate those responses back to behavior. But have you ever wondered what systems neuroscience will look like when every laboratory is recording from many thousands of neurons simultaneously? The potential of these technologies is exciting: we finally have the opportunity to determine whether we can understand how populations of neurons give rise to the computations that lead to behavior. But if we are to have any hope of answering that question, we need to start thinking through how we will manage the flood of data such technologies will produce. Without theoretical frameworks and computational tools, neuroscience will drown in the deluge.

Two advances make the impending data surge an issue to address now. The first is the advent of widely available, low-cost recording technologies that are easy to use, such as Neuropixels probes. These probes promise high-neuron-count, high-temporal-resolution measurements that experimentalists can quickly and easily deploy across a wide range of animal models, and they can be combined with optical imaging to study neural activity across the brain and over time. The second is the maturation of deep learning, a catchphrase for a collection of very powerful artificial neural network concepts, along with the software and hardware that power them. Driven by algorithmic innovation and computing advances over the last decade, deep learning has emerged as an important framework for modeling huge quantities of high-dimensional, time-series data.

David Sussillo

Making progress without a theory

Technologies like Neuropixels make it possible to address the big question in systems neuroscience: how do large populations of neurons give rise to behavior? But we currently have no useful theory of biological neural computation to organize our thinking about this question or to guide the design of new experiments. In the absence of such a theory, we still need to decide which measurements are the most important to make. Here are some possible first steps:

1. Do what we do now, but record from more neurons across more brain regions. Researchers are already using technologies such as Neuropixels to measure brain-wide responses to simple, well-studied behaviors. These studies will give insight into how neurons in different brain regions work together to perform computations.

2. Do what we do now, but make the tasks we study more complex. To capture the brain's full computational repertoire, we may need to use much more complex behavioral tasks. Of course, more complex tasks will also produce more complex data, which will in turn be more challenging to interpret.

3. Systematize. We could also focus on previously understudied brain structures. So much of the brain is essentially uncharted territory. Recording methodologies that increase our understanding of population responses in critical structures such as the thalamus, basal ganglia or cerebellum will be of immense importance.

Deep learning as a powerful tool for neuroscience

These experiments will produce huge volumes of data, but the field generally lacks sophisticated ways to analyze simultaneous recordings from thousands of neurons. As a computational neuroscientist, I have focused on developing deep-learning tools to help neuroscientists make sense of these data.

In analogy to biological neural networks, artificial neural networks (ANNs) are composed of simple nonlinear elements that may be recurrently connected, and the synapses that connect the units determine the computation. These gross structural similarities offer some justification that ANNs may yield insight into the functioning of biological networks. However, further theoretical work is needed to understand which architectural details of an ANN better align its population activity with that of a biological neural network.

One successful deep-learning approach involves modeling behavioral tasks. Researchers build an ANN and optimize it to solve a task analogous to the one studied in animals. They then compare the internals of the trained ANN with the biological neural recordings, typically smoothed spike trains from a population of neurons. If there are quantitative similarities, researchers can attempt to reverse-engineer the ANN in order to develop a mechanistic explanation for how it solves the task. The insights found in the ANN can then lead to testable hypotheses for how the biological network implements the behavior. For example, we first used this approach in 2013 to understand how the brain might implement a context-dependent sensory integration task. Others are using neural networks to study visual processing.
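To make the task-modeling approach concrete, here is a minimal sketch, not the published model, of training a small recurrent ANN on a toy context-dependent sensory-integration task and then extracting its hidden-unit activity for comparison with smoothed spike trains. The task parameters, network size and training details are illustrative assumptions, and PyTorch is used only as a convenient framework.

```python
# A minimal sketch (illustrative assumptions throughout) of the "optimized modeling"
# approach: train a small recurrent ANN on a toy context-dependent integration task,
# then treat its hidden-unit activity as model "neural" data.
import torch
import torch.nn as nn

T, BATCH, HIDDEN = 100, 64, 128                   # time steps, trials, recurrent units

def make_trials(batch):
    """Two noisy evidence streams plus a context cue; the target is the sign of the
    cued stream's underlying evidence (a toy context-dependent integration task)."""
    coherence = torch.empty(batch, 2).uniform_(-0.5, 0.5)        # signal per stream
    streams = coherence[None] + 0.5 * torch.randn(T, batch, 2)   # noisy evidence
    context = torch.randint(0, 2, (batch,))                      # which stream is relevant
    cue = torch.nn.functional.one_hot(context, 2).float().expand(T, batch, 2)
    inputs = torch.cat([streams, cue], dim=-1)                   # (T, batch, 4)
    relevant = coherence.gather(1, context[:, None]).squeeze(1)
    targets = torch.sign(relevant)                               # +1 / -1 choice
    return inputs, targets

rnn = nn.RNN(input_size=4, hidden_size=HIDDEN)                   # vanilla recurrent network
readout = nn.Linear(HIDDEN, 1)
opt = torch.optim.Adam(list(rnn.parameters()) + list(readout.parameters()), lr=1e-3)

for step in range(1000):                                         # optimize on the task
    x, y = make_trials(BATCH)
    hidden, _ = rnn(x)                                           # (T, batch, HIDDEN)
    choice = readout(hidden[-1]).squeeze(-1)                     # decision at trial end
    loss = torch.mean((choice - y) ** 2)
    opt.zero_grad(); loss.backward(); opt.step()

# 'hidden' now plays the role of the model's data: its single-trial population
# trajectories can be compared against smoothed spike trains recorded from animals
# performing the analogous task.
```

A vanilla RNN is used here only for brevity; in practice the architecture, nonlinearity and training objective are themselves modeling choices that affect how well the ANN's population activity aligns with the biology, which is exactly the open theoretical question noted above.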

In the traditional modeling framework (top), a scientist trains an animal ('system', orange, right) to do a task (yellow) and collects neural data (green, right). The scientist then guesses the mechanism that generates the data (cyan) and builds a model (orange, left) by hand that incorporates that mechanism, from which model data is generated (green, left). If the model data and the system data are quantitatively similar, then the scientist argues that the model provides a hypothesis of how the system works. In the optimized modeling framework (bottom), a scientist trains both an animal ('system', orange, right) and an artificial neural network model (ANN) on the same or an analogous task (yellow). The scientist then generates data from both the system and the optimized model (green, right and left, respectively). If the model data and the system data are quantitatively similar, then the scientist attempts to discover the mechanism (cyan) in the optimized model. That mechanism provides a hypothesis of how the optimized model works. Credit: David Sussillo
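As one concrete, hypothetical illustration of what "quantitatively similar" might mean in the comparison step above, the sketch below smooths binned spike trains and computes canonical correlations between the system's and the model's population activity. The array shapes, smoothing width and choice of CCA are illustrative assumptions, not a prescribed analysis pipeline.

```python
# A minimal sketch of one common way to compare model data and system data:
# smooth binned spike counts into firing rates, then measure the overlap of
# population trajectories with canonical correlation analysis (CCA).
import numpy as np
from scipy.ndimage import gaussian_filter1d
from sklearn.cross_decomposition import CCA

def smooth_spikes(spikes, sigma_bins=2.0):
    """spikes: (trials, time bins, neurons) spike counts -> smoothed firing rates."""
    return gaussian_filter1d(spikes.astype(float), sigma=sigma_bins, axis=1)

def population_similarity(system_rates, model_rates, n_components=10):
    """Mean canonical correlation between system and model population activity.
    Both inputs have shape (trials, time bins, units) with matched trials and bins."""
    X = system_rates.reshape(-1, system_rates.shape[-1])   # (trials*time, neurons)
    Y = model_rates.reshape(-1, model_rates.shape[-1])     # (trials*time, model units)
    cca = CCA(n_components=n_components, max_iter=1000).fit(X, Y)
    Xc, Yc = cca.transform(X, Y)
    corrs = [np.corrcoef(Xc[:, i], Yc[:, i])[0, 1] for i in range(n_components)]
    return float(np.mean(corrs))

# Synthetic placeholders standing in for recorded spikes and ANN hidden activity:
spikes = np.random.poisson(0.1, size=(200, 100, 60))      # (trials, time bins, neurons)
model_activity = np.random.randn(200, 100, 128)           # (trials, time bins, model units)
print(population_similarity(smooth_spikes(spikes), model_activity))
```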