1. Mathis, A. et al. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nat. Neurosci. 21, 1281–1289 (2018).

2. Steinmetz, N. A., Koch, C., Harris, K. D. & Carandini, M. Challenges and opportunities for large-scale electrophysiology with Neuropixels probes. Curr. Opin. Neurobiol. 50, 92–100 (2018).

3. Marder, E. & Bucher, D. Central pattern generators and the control of rhythmic movements. Curr. Biol. 11, R986–R996 (2001).

4. Cullen, K. E. The vestibular system: multimodal integration and encoding of self-motion for motor control. Trends Neurosci. 35, 185–196 (2012).

5. Kim, J. S. et al. Space-time wiring specificity supports direction selectivity in the retina. Nature 509, 331–336 (2014).

6. Olshausen, B. A. & Field, D. J. What is the other 85 percent of V1 doing? in 23 Problems in Systems Neuroscience (eds van Hemmen, J. L. & Sejnowski, T. J.) 182–211 (Oxford Univ. Press, 2006).

7. Thompson, L. T. & Best, P. J. Place cells and silent cells in the hippocampus of freely-behaving rats. J. Neurosci. 9, 2382–2390 (1989).

8. Yamins, D. L. K. & DiCarlo, J. J. Using goal-driven deep learning models to understand sensory cortex. Nat. Neurosci. 19, 356–365 (2016).

9. Botvinick, M. et al. Reinforcement learning, fast and slow. Trends Cogn. Sci. 23, 408–422 (2019).

10. Kriegeskorte, N. & Douglas, P. K. Cognitive computational neuroscience. Nat. Neurosci. 21, 1148–1160 (2018).

11. Rumelhart, D. E., McClelland, J. L. & PDP Research Group. Parallel Distributed Processing (MIT Press, 1988).

12. Sacramento, J., Costa, R. P., Bengio, Y. & Senn, W. Dendritic cortical microcircuits approximate the backpropagation algorithm. Adv. Neural Inf. Proc. Sys. 31, 8735–8746 (2018).

13. Poirazi, P., Brannon, T. & Mel, B. W. Pyramidal neuron as two-layer neural network. Neuron 37, 989–999 (2003).

14. Guerguiev, J., Lillicrap, T. P. & Richards, B. A. Towards deep learning with segregated dendrites. eLife 6, e22901 (2017).

15. Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning (MIT Press, 2016).

16. Cichy, R. M., Khosla, A., Pantazis, D., Torralba, A. & Oliva, A. Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence. Sci. Rep. 6, 27755 (2016).

17. Kell, A. J. E., Yamins, D. L. K., Shook, E. N., Norman-Haignere, S. V. & McDermott, J. H. A task-optimized neural network replicates human auditory behavior, predicts brain responses, and reveals a cortical processing hierarchy. Neuron 98, 630–644.e16 (2018).

18. Richards, B. A. & Lillicrap, T. P. Dendritic solutions to the credit assignment problem. Curr. Opin. Neurobiol. 54, 28–36 (2019).

19. Roelfsema, P. R. & Holtmaat, A. Control of synaptic plasticity in deep cortical networks. Nat. Rev. Neurosci. 19, 166–180 (2018).

20. Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Proc. Sys. 25, 1097–1105 (2012).

21. Hannun, A. et al. Deep speech: scaling up end-to-end speech recognition. Preprint at arXiv https://arxiv.org/abs/1412.5567 (2014).

22. Radford, A. et al. Better language models and their implications. OpenAI Blog https://openai.com/blog/better-language-models/ (2019).

23. Gao, Y., Hendricks, L. A., Kuchenbecker, K. J. & Darrell, T. Deep learning for tactile understanding from visual and haptic data. in IEEE International Conference on Robotics and Automation (ICRA) 536–543 (2016).

24. Banino, A. et al. Vector-based navigation using grid-like representations in artificial agents. Nature 557, 429–433 (2018).

25. Finn, C., Goodfellow, I. & Levine, S. Unsupervised learning for physical interaction through video prediction. Adv. Neural Inf. Proc. Sys. 29, 64–72 (2016).

26. Silver, D. et al. Mastering the game of Go without human knowledge. Nature 550, 354–359 (2017).

27. Santoro, A. et al. A simple neural network module for relational reasoning. Adv. Neural Inf. Proc. Sys. 30, 4967–4976 (2017).

28. Khaligh-Razavi, S.-M. & Kriegeskorte, N. Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput. Biol. 10, e1003915 (2014).

29. Bashivan, P., Kar, K. & DiCarlo, J. J. Neural population control via deep image synthesis. Science 364, eaav9436 (2019).

30. Pospisil, D. A., Pasupathy, A. & Bair, W. ‘Artiphysiology’ reveals V4-like shape tuning in a deep network trained for image classification. eLife 7, e38242 (2018).

31. Singer, Y. et al. Sensory cortex is optimized for prediction of future input. eLife 7, e31557 (2018).

32. Watanabe, E., Kitaoka, A., Sakamoto, K., Yasugi, M. & Tanaka, K. Illusory motion reproduced by deep neural networks trained for prediction. Front. Psychol. 9, 345 (2018).

33. Wang, J. X. et al. Prefrontal cortex as a meta-reinforcement learning system. Nat. Neurosci. 21, 860–868 (2018).

34. Scellier, B. & Bengio, Y. Equilibrium propagation: bridging the gap between energy-based models and backpropagation. Front. Comput. Neurosci. 11, 24 (2017).

35. Whittington, J. C. R. & Bogacz, R. An approximation of the error backpropagation algorithm in a predictive coding network with local Hebbian synaptic plasticity. Neural Comput. 29, 1229–1262 (2017).

36. Lillicrap, T. P., Cownden, D., Tweed, D. B. & Akerman, C. J. Random synaptic feedback weights support error backpropagation for deep learning. Nat. Commun. 7, 13276 (2016).

37. Roelfsema, P. R. & van Ooyen, A. Attention-gated reinforcement learning of internal representations for classification. Neural Comput. 17, 2176–2214 (2005).

38. Pozzi, I., Bohté, S. & Roelfsema, P. A biologically plausible learning rule for deep learning in the brain. Preprint at arXiv https://arxiv.org/abs/1811.01768 (2018).

39. Körding, K. P. & König, P. Supervised and unsupervised learning with two sites of synaptic integration. J. Comput. Neurosci. 11, 207–215 (2001).

40. Marblestone, A. H., Wayne, G. & Kording, K. P. Toward an integration of deep learning and neuroscience. Front. Comput. Neurosci. 10, 94 (2016).

41. Raman, D. V., Rotondo, A. P. & O’Leary, T. Fundamental bounds on learning performance in neural circuits. Proc. Natl Acad. Sci. USA 116, 10537–10546 (2019).

42. Neyshabur, B., Li, Z., Bhojanapalli, S., LeCun, Y. & Srebro, N. The role of over-parametrization in generalization of neural networks. in International Conference on Learning Representations (ICLR) 2019 https://openreview.net/forum?id=BygfghAcYX (2019).

43. Wolpert, D. H. & Macready, W. G. No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1, 67–82 (1997).

44. Bengio, Y. & LeCun, Y. Scaling learning algorithms towards AI. in Large-Scale Kernel Machines (eds Bottou, L., Chapelle, O., DeCoste, D. & Weston, J.) chapter 14 (MIT Press, 2007).

45. Neyshabur, B., Tomioka, R. & Srebro, N. In search of the real inductive bias: on the role of implicit regularization in deep learning. Preprint at arXiv https://arxiv.org/abs/1412.6614 (2014).

46. Snell, J., Swersky, K. & Zemel, R. Prototypical networks for few-shot learning. Adv. Neural Inf. Proc. Sys. 30, 4077–4087 (2017).

47. Ravi, S. & Larochelle, H. Optimization as a model for few-shot learning. in International Conference on Learning Representations (ICLR) 2017 https://openreview.net/forum?id=rJY0-Kcll (2017).

48. Zador, A. M. A critique of pure learning and what artificial neural networks can learn from animal brains. Nat. Commun. 10, 3770 (2019).

49. Bellec, G., Salaj, D., Subramoney, A., Legenstein, R. & Maass, W. Long short-term memory and learning-to-learn in networks of spiking neurons. Adv. Neural Inf. Proc. Sys. 31, 787–797 (2018).

50. Huang, Y. & Rao, R. P. N. Predictive coding. Wiley Interdiscip. Rev. Cogn. Sci. 2, 580–593 (2011).

51. Williams, R. J. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8, 229–256 (1992).

52. Klyubin, A. S., Polani, D. & Nehaniv, C. L. Empowerment: a universal agent-centric measure of control. in 2005 IEEE Congress on Evolutionary Computation 128–135 (IEEE, 2005).

53. Salge, C., Glackin, C. & Polani, D. Empowerment–an introduction. in Guided Self-Organization: Inception (ed. Prokopenko, M.) 67–114 (Springer, 2014).

54. Newell, A. & Simon, H. A. GPS, a Program that Simulates Human Thought. https://apps.dtic.mil/docs/citations/AD0294731 (RAND Corp., 1961).

55. Nguyen, A., Yosinski, J. & Clune, J. Understanding neural networks via feature visualization: a survey. Preprint at arXiv https://arxiv.org/abs/1904.08939 (2019).

56. Kebschull, J. M. et al. High-throughput mapping of single-neuron projections by sequencing of barcoded RNA. Neuron 91, 975–987 (2016).

57. Kornfeld, J. & Denk, W. Progress and remaining challenges in high-throughput volume electron microscopy. Curr. Opin. Neurobiol. 50, 261–267 (2018).

58. Lillicrap, T. P. & Kording, K. P. What does it mean to understand a neural network? Preprint at arXiv https://arxiv.org/abs/1907.06374 (2019).

59. Olshausen, B. A. & Field, D. J. Natural image statistics and efficient coding. Network 7, 333–339 (1996).

60. Hyvärinen, A. & Oja, E. Simple neuron models for independent component analysis. Int. J. Neural Syst. 7, 671–687 (1996).

61. Oja, E. A simplified neuron model as a principal component analyzer. J. Math. Biol. 15, 267–273 (1982).

62. Intrator, N. & Cooper, L. N. Objective function formulation of the BCM theory of visual cortical plasticity: statistical connections, stability conditions. Neural Netw. 5, 3–17 (1992).

63. Fiser, A. et al. Experience-dependent spatial expectations in mouse visual cortex. Nat. Neurosci. 19, 1658–1664 (2016).

64. Schultz, W., Dayan, P. & Montague, P. R. A neural substrate of prediction and reward. Science 275, 1593–1599 (1997).

65. Momennejad, I. et al. The successor representation in human reinforcement learning. Nat. Hum. Behav. 1, 680–692 (2017).

66. Nayebi, A. et al. Task-driven convolutional recurrent models of the visual system. Adv. Neural Inf. Proc. Sys. 31, 5290–5301 (2018).

67. Schrimpf, M. et al. Brain-Score: which artificial neural network for object recognition is most brain-like? Preprint at bioRxiv https://doi.org/10.1101/407007 (2018).

68. Kepecs, A. & Fishell, G. Interneuron cell types are fit to function. Nature 505, 318–326 (2014).

69. Van Essen, D. C. & Anderson, C. H. Information processing strategies and pathways in the primate visual system. in Neural Networks: Foundations to Applications. An Introduction to Neural and Electronic Networks (eds Zornetzer, S. F., Davis, J. L., Lau, C. & McKenna, T.) 45–76 (Academic Press, 1995).

70. Lindsey, J., Ocko, S. A., Ganguli, S. & Deny, S. A unified theory of early visual representations from retina to cortex through anatomically constrained deep CNNs. in International Conference on Learning Representations (ICLR) Blind Submissions https://openreview.net/forum?id=S1xq3oR5tQ (2019).

71. Güçlü, U. & van Gerven, M. A. Deep neural networks reveal a gradient in the complexity of neural representations across the ventral stream. J. Neurosci. 35, 10005–10014 (2015).

72. Kwag, J. & Paulsen, O. The timing of external input controls the sign of plasticity at local synapses. Nat. Neurosci. 12, 1219–1221 (2009).

73. Bittner, K. C., Milstein, A. D., Grienberger, C., Romani, S. & Magee, J. C. Behavioral time scale synaptic plasticity underlies CA1 place fields. Science 357, 1033–1036 (2017).

74. Lacefield, C. O., Pnevmatikakis, E. A., Paninski, L. & Bruno, R. M. Reinforcement learning recruits somata and apical dendrites across layers of primary sensory cortex. Cell Rep. 26, 2000–2008.e2 (2019).

75. Williams, L. E. & Holtmaat, A. Higher-order thalamocortical inputs gate synaptic long-term potentiation via disinhibition. Neuron 101, 91–102.e4 (2019).

76. Yagishita, S. et al. A critical time window for dopamine actions on the structural plasticity of dendritic spines. Science 345, 1616–1620 (2014).

77. Lim, S. et al. Inferring learning rules from distributions of firing rates in cortical neurons. Nat. Neurosci. 18, 1804–1810 (2015).

78. Costa, R. P. et al. Synaptic transmission optimization predicts expression loci of long-term plasticity. Neuron 96, 177–189.e7 (2017).

79. Zolnik, T. A. et al. All-optical functional synaptic connectivity mapping in acute brain slices using the calcium integrator CaMPARI. J. Physiol. (Lond.) 595, 1465–1477 (2017).

80. Scott, S. H. Optimal feedback control and the neural basis of volitional motor control. Nat. Rev. Neurosci. 5, 532–546 (2004).

81. Krakauer, J. W., Ghazanfar, A. A., Gomez-Marin, A., MacIver, M. A. & Poeppel, D. Neuroscience needs behavior: correcting a reductionist bias. Neuron 93, 480–490 (2017).

82. Zylberberg, J., Murphy, J. T. & DeWeese, M. R. A sparse coding model with synaptically local plasticity and spiking neurons can account for the diverse shapes of V1 simple cell receptive fields. PLoS Comput. Biol. 7, e1002250 (2011).

83. Chalk, M., Tkačik, G. & Marre, O. Inferring the function performed by a recurrent neural network. Preprint at bioRxiv https://doi.org/10.1101/598086 (2019).

84. Cadieu, C. F. et al. Deep neural networks rival the representation of primate IT cortex for core visual object recognition. PLoS Comput. Biol. 10, e1003963 (2014).

85. Golub, M. D. et al. Learning by neural reassociation. Nat. Neurosci. 21, 607–616 (2018).

86. Fukushima, K. Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 36, 193–202 (1980).

87. Vogels, T. P., Rajan, K. & Abbott, L. F. Neural network dynamics. Annu. Rev. Neurosci. 28, 357–376 (2005).

88. Koren, V. & Denève, S. Computational account of spontaneous activity as a signature of predictive coding. PLoS Comput. Biol. 13, e1005355 (2017).

89. Advani, M. S. & Saxe, A. M. High-dimensional dynamics of generalization error in neural networks. Preprint at arXiv https://arxiv.org/abs/1710.03667 (2017).

90. Amit, Y. Deep learning with asymmetric connections and Hebbian updates. Front. Comput. Neurosci. 13, 18 (2019).

91. Lansdell, B. & Kording, K. Spiking allows neurons to estimate their causal effect. Preprint at bioRxiv https://doi.org/10.1101/253351 (2018).

92. Werfel, J., Xie, X. & Seung, H. S. Learning curves for stochastic gradient descent in linear feedforward networks. Adv. Neural Inf. Proc. Sys. 16, 1197–1204 (2004).

93. Samadi, A., Lillicrap, T. P. & Tweed, D. B. Deep learning with dynamic spiking neurons and fixed feedback weights. Neural Comput. 29, 578–602 (2017).

94. Akrout, M., Wilson, C., Humphreys, P. C., Lillicrap, T. & Tweed, D. Using weight mirrors to improve feedback alignment. Preprint at arXiv https://arxiv.org/abs/1904.05391 (2019).

95. Bartunov, S. et al. Assessing the scalability of biologically-motivated deep learning algorithms and architectures. Adv. Neural Inf. Proc. Sys. 31, 9368–9378 (2018).

96. MacKay, D. J. Information Theory, Inference and Learning Algorithms (Cambridge Univ. Press, 2003).

97. Goel, V., Weng, J. & Poupart, P. Unsupervised video object segmentation for deep reinforcement learning. Adv. Neural Inf. Proc. Sys. 31, 5683–5694 (2018).

98. LeCun, Y. & Bengio, Y. Convolutional networks for images, speech, and time series. in The Handbook of Brain Theory and Neural Networks (ed. Arbib, M. A.) 276–279 (MIT Press, 1995).

99. Chorowski, J. K., Bahdanau, D., Serdyuk, D., Cho, K. & Bengio, Y. Attention-based models for speech recognition. Adv. Neural Inf. Proc. Sys. 28, 577–585 (2015).