CHAPTER 1. INTR ODUCTION

so has the complexity of the tasks that they can solve. Go o dfellow et al. ( 2014d )

sho w ed that neural netw orks could learn to output an entire sequence of characters

transcrib ed from an image, rather than just identifying a single ob ject. Previously ,

it w as widely b elieved that this kind of learning required lab eling of the individual

elemen ts of the sequence ( Gülçehre and Bengio , 2013 ). Recurrent neural netw orks,

suc h as the LSTM sequence mo del mentioned ab ov e, are now used to mo del

relationships b et w een se quenc es and other se quenc es rather than just ﬁxed inputs.

This sequence-to-sequence learning seems to be on the cusp of rev olutionizing

another application: machine translation ( Sutskev er et al. , 2014 ; Bahdanau et al. ,

2015 ).

This trend of increasing complexit y has b een pushed to its logical conclusion

with the in tro duction of neural T uring machines ( Grav es et al. , 2014 ) that learn

to read from memory cells and write arbitrary conten t to memory cells. Suc h

neural net w orks can learn simple programs from examples of desired b ehavior. F or

example, they can learn to sort lists of num b ers giv en examples of scrambled and

sorted sequences. This self-programming technology is in its infancy , but in the

future it could in principle b e applied to nearly any task.

Another cro wning ac hiev emen t of deep learning is its extension to the domain of

reinforcemen t learning

. In the context of reinforcement learning, an autonomous

agen t must learn to p erform a task b y trial and error, without any guidance from

the h uman op erator. DeepMind demonstrated that a reinforcemen t learning system

based on deep learning is capable of learning to pla y Atari video games, reac hing

h uman-lev el p erformance on many tasks ( Mnih et al. , 2015 ). Deep learning has

also signiﬁcantly improv ed the p erformance of reinforcemen t learning for rob otics

( Finn et al. , 2015 ).

Man y of these applications of deep learning are highly proﬁtable. Deep learning

is now used by man y top technology companies, including Go ogle, Microsoft,

F aceb o ok, IBM, Baidu, Apple, Adobe, Netﬂix, NVIDIA, and NEC.

A dv ances in deep learning ha v e also dep ended hea vily on adv ances in softw are

infrastructure. Softw are libraries such as Theano ( Bergstra et al. , 2010 ; Bastien

et al. , 2012 ), PyLearn2 ( Go o dfellow et al. , 2013c ), T orc h ( Collob ert et al. , 2011b ),

DistBelief ( Dean et al. , 2012 ), Caﬀe ( Jia , 2013 ), MXNet ( Chen et al. , 2015 ), and

T ensorFlow ( Abadi et al. , 2015 ) hav e all supp orted imp ortan t research pro jects or

commercial pro ducts.

Deep learning has also made con tributions to other sciences. Mo dern conv olu-

tional net w orks for ob ject recognition provide a mo del of visual pro cessing that

neuroscien tists can study ( DiCarlo , 2013 ). Deep learning also pro vides useful to ols

for pro cessing massive amoun ts of data and making useful predictions in scien tiﬁc

25