Understanding, analyzing, and generating text with Python

"Learn both the theory and practical skills needed to go beyond merely understanding the inner workings of NLP, and start creating your own algorithms or models."
From the Foreword by Dr. Arwen Griffioen, Zendesk

Natural Language Processing in Action is your guide to creating machines that understand human language using the power of Python with its ecosystem of packages dedicated to NLP and AI.

Listen to this book in liveAudio! liveAudio integrates a professional voice recording with the book's text, graphics, code, and exercises in Manning's exclusive liveBook online reader. Use the text to search and navigate the audio, or download the audio-only recording for portable offline listening.

About the Technology Recent advances in deep learning empower applications to understand text and speech with remarkable accuracy. The result? Chatbots that can imitate real people, meaningful resume-to-job matches, superb predictive search, and automatically generated document summaries—all at a low cost. New techniques, along with accessible tools like Keras and TensorFlow, make professional-quality NLP easier than ever before.

About the book Natural Language Processing in Action is your guide to building machines that can read and interpret human language. In it, you’ll use readily available Python packages to capture the meaning in text and react accordingly. The book expands traditional NLP approaches to include neural networks, modern deep learning algorithms, and generative techniques as you tackle real-world problems like extracting dates and names, composing text, and answering free-form questions.
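For example, the date-extraction task mentioned above can begin with nothing more than Python's built-in re module. Here is a minimal sketch; the pattern and the sample sentence are invented for illustration and are not code from the book:

    import re

    # A deliberately simple pattern for dates like "June 15, 2019" or "15 June 2019".
    # Real pipelines (and the book) handle many more formats and ambiguities.
    MONTHS = r"(?:January|February|March|April|May|June|July|August|September|October|November|December)"
    DATE_PATTERN = re.compile(
        rf"{MONTHS}\s+\d{{1,2}},\s+\d{{4}}"  # "June 15, 2019"
        rf"|\d{{1,2}}\s+{MONTHS}\s+\d{{4}}"  # "15 June 2019"
    )

    text = "The conference ran from June 15, 2019 until 18 June 2019 in Portland."
    print(DATE_PATTERN.findall(text))  # ['June 15, 2019', '18 June 2019']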

Table of Contents

Acknowledgments

Part 1: Wordy machines

1 Packets of thought (NLP overview)
1.1 Natural language vs. programming language
1.2 The magic
1.2.1 Machines that converse
1.2.2 The math
1.3 Practical applications
1.4 Language through a computer’s "eyes"
1.4.1 The language of locks
1.4.2 Regular expressions
1.4.3 A simple chatbot
1.4.4 Another way
1.5 A brief overflight of hyperspace
1.6 Word order and grammar
1.7 A chatbot natural language pipeline
1.8 Processing in depth
1.9 Natural language IQ
1.10 Summary

2 Build your vocabulary (word tokenization)
2.1 Challenges (a preview of stemming)
2.2 Building your vocabulary with a tokenizer
2.2.1 Dot product
2.2.2 Measuring bag-of-words overlap
2.2.3 A token improvement
2.2.4 Extending your vocabulary with n-grams
2.2.5 Normalizing your vocabulary
2.3 Sentiment
2.3.1 VADER — A rule-based sentiment analyzer
2.3.2 Naive Bayes
2.4 Summary

3 Math with words (TF-IDF vectors)
3.1 Bag of words
3.2 Vectorizing
3.2.1 Vector spaces
3.3 Zipf’s Law
3.4 Topic modeling
3.4.1 Return of Zipf
3.4.2 Relevance ranking
3.4.4 Alternatives
3.4.5 Okapi BM25
3.4.6 What’s next
3.5 Summary

4 Finding meaning in word counts (semantic analysis)
4.1 From word counts to topic scores
4.1.1 TF-IDF vectors and lemmatization
4.1.2 Topic vectors
4.1.3 Thought experiment
4.1.4 An algorithm for scoring topics
4.1.5 An LDA classifier
4.2 Latent semantic analysis
4.2.1 Your thought experiment made real
4.3 Singular value decomposition
4.3.1 U — left singular vectors
4.3.2 S — singular values
4.3.3 V^T — right singular vectors
4.3.4 SVD matrix orientation
4.3.5 Truncating the topics
4.4 Principal component analysis
4.4.1 PCA on 3D vectors
4.4.2 Stop horsing around and get back to NLP
4.4.3 Using PCA for SMS message semantic analysis
4.4.4 Using truncated SVD for SMS message semantic analysis
4.4.5 How well does LSA work for spam classification?
4.5 Latent Dirichlet allocation (LDiA)
4.5.1 The LDiA idea
4.5.2 LDiA topic model for SMS messages
4.5.3 LDiA + LDA = spam classifier
4.5.4 A fairer comparison: 32 LDiA topics
4.6 Distance and similarity
4.7 Steering with feedback
4.7.1 Linear discriminant analysis
4.8 Topic vector power
4.8.1 Semantic search
4.8.2 Improvements
4.9 Summary

Part 2: Deeper learning (neural networks)

5 Baby steps with neural networks (perceptrons and backpropagation)
5.1 Neural networks, the ingredient list
5.1.1 Perceptron
5.1.2 A numerical perceptron
5.1.3 Detour through bias
5.1.4 Let’s go skiing — the error surface
5.1.5 Off the chair lift, onto the slope
5.1.6 Let’s shake things up a bit
5.1.7 Keras: Neural networks in Python
5.1.8 Onward and deepward
5.1.9 Normalization: Input with style
5.2 Summary

6 Reasoning with word vectors (Word2vec)
6.1 Semantic queries and analogies
6.1.1 Analogy questions
6.2 Word vectors
6.2.1 Vector-oriented reasoning
6.2.2 How to compute Word2vec representations
6.2.3 How to use the gensim.word2vec module
6.2.4 How to generate your own word vector representations
6.2.5 Word2vec vs. GloVe (Global Vectors)
6.2.6 fastText
6.2.7 Word2vec vs. LSA
6.2.8 Visualizing word relationships
6.2.9 Unnatural words
6.2.10 Document similarity with Doc2vec
6.3 Summary

7 Getting words in order with convolutional neural networks (CNNs)
7.1 Learning meaning
7.2 Toolkit
7.3 Convolutional neural nets
7.3.1 Building blocks
7.3.2 Step size
7.3.3 Filter composition
7.3.4 Padding
7.3.5 Learning
7.4 Narrow windows indeed
7.4.1 Implementation in Keras: Prepping the data
7.4.2 Convolutional neural network architecture
7.4.3 Pooling
7.4.4 Dropout
7.4.5 The cherry on the sundae
7.4.6 Let’s get to learning (training)
7.4.7 Using the model in a pipeline
7.4.8 Where do you go from here?
7.5 Summary

8 Loopy (recurrent) neural networks (RNNs)
8.1 Remembering with recurrent networks
8.1.1 Backpropagation through time
8.1.3 Recap
8.1.4 There’s always a catch
8.1.5 Recurrent neural net with Keras
8.2 Putting things together
8.3 Let’s get to learning our past selves
8.4 Hyperparameters
8.5 Predicting
8.5.1 Statefulness
8.5.2 Two-way street
8.5.3 What is this thing?
8.6 Summary

9 Improving retention with long short-term memory networks
9.1 LSTM
9.1.1 Backpropagation through time
9.1.2 Where does the rubber hit the road?
9.1.3 Dirty data
9.1.4 Back to the dirty data
9.1.5 Words are hard. Letters are easier.
9.1.6 My turn to chat
9.1.7 My turn to speak more clearly
9.1.8 Learned how to say, but not yet what
9.1.9 Other kinds of memory
9.1.10 Going deeper
9.2 Summary

10 Sequence-to-sequence models and attention
10.1 Encoder-decoder architecture
10.1.1 Decoding thought
10.1.2 Look familiar?
10.1.3 Sequence-to-sequence conversation
10.1.4 LSTM review
10.2 Assembling a sequence-to-sequence pipeline
10.2.1 Preparing your dataset for the sequence-to-sequence training
10.2.2 Sequence-to-sequence model in Keras
10.2.3 Sequence encoder
10.2.4 Thought decoder
10.2.5 Assembling the sequence-to-sequence network
10.3 Training the sequence-to-sequence network
10.3.1 Generate output sequences
10.4 Building a chatbot using sequence-to-sequence networks
10.4.1 Preparing the corpus for your training
10.4.2 Building your character dictionary
10.4.3 Generate one-hot encoded training sets
10.4.4 Train your sequence-to-sequence chatbot
10.4.5 Assemble the model for sequence generation
10.4.6 Predicting a sequence
10.4.7 Generating a response
10.4.8 Converse with your chatbot
10.5 Enhancements
10.5.1 Reduce training complexity with bucketing
10.5.2 Paying attention
10.6 In the real world
10.7 Summary

Part 3: Getting real (real-world NLP challenges)

11 Information extraction (named entity extraction and question answering)
11.1 Named entities and relations
11.1.1 A knowledge base
11.1.2 Information extraction
11.2 Regular patterns
11.2.1 Regular expressions
11.2.2 Information extraction as ML feature extraction
11.3 Information worth extracting
11.3.1 Extracting GPS locations
11.4 Extracting relationships (relations)
11.4.1 POS tagging
11.4.2 Entity name normalization
11.4.3 Relation normalization and extraction
11.4.4 Word patterns
11.4.5 Segmentation
11.4.6 Why won’t split('.!?') work?
11.4.7 Sentence segmentation with regular expressions
11.5 In the real world
11.6 Summary

12 Getting chatty (dialog engines)
12.1 Language skill
12.1.1 Modern approaches
12.1.2 A hybrid approach
12.2 Pattern-matching approach
12.2.1 A pattern-matching chatbot with AIML
12.2.2 A network view of pattern matching
12.3 Grounding
12.4 Retrieval (search)
12.4.1 The context challenge
12.4.2 Example retrieval-based chatbot
12.4.3 A search-based chatbot
12.5 Generative models
12.5.1 Chat about NLPIA
12.5.2 Pros and cons of each approach
12.6 Four-wheel drive
12.6.1 The Will to succeed
12.7 Design process
12.8 Trickery
12.8.1 Ask questions with predictable answers
12.8.2 Be entertaining
12.8.3 When all else fails, search
12.8.4 Being popular
12.8.5 Be a connector
12.8.6 Getting emotional
12.9 In the real world
12.10 Summary

13 Scaling up (optimization, parallelization, and batch processing)
13.1 Too much of a good thing (data)
13.2 Optimizing NLP algorithms
13.2.1 Indexing
13.2.2 Advanced indexing
13.2.3 Advanced indexing with Annoy
13.2.4 Why use approximate indexes at all?
13.2.5 An indexing workaround: Discretizing
13.3 Constant RAM algorithms
13.3.1 Gensim
13.3.2 Graph computing
13.4 Parallelizing your NLP computations
13.4.1 Training NLP models on GPUs
13.4.2 Renting vs. buying
13.4.3 GPU rental options
13.4.4 Tensor processing units
13.6 Gaining model insights with TensorBoard
13.6.1 How to visualize word embeddings
13.7 Summary

Appendixes

Appendix A
A.1 Anaconda3
A.2 Install NLPIA
A.3 IDE
A.4 Ubuntu package manager
A.5 Mac
A.5.1 A Mac package manager
A.5.2 Some packages
A.5.3 Tuneups
A.6 Windows
A.6.1 Get Virtual
A.7 NLPIA automagic

Appendix B: Playful Python and regular expressions
B.1 Working with strings
B.1.1 String types (str and bytes)
B.1.2 Templates in Python (.format())
B.2 Mapping in Python (dict and OrderedDict)
B.3 Regular expressions
B.3.1 | — "OR"
B.3.2 () — Groups
B.3.3 [] — Character classes
B.4 Style
B.5 Mastery

Appendix C: Vectors and matrices (linear algebra fundamentals)
C.1 Vectors
C.1.1 Distances

Appendix D
D.1 Data selection and avoiding bias
D.2 How fit is fit?
D.3 Knowing is half the battle
D.4 Cross-fit training
D.5 Holding your model back
D.5.1 Regularization
D.5.2 Dropout
D.5.3 Batch normalization
D.6 Imbalanced training sets
D.6.1 Oversampling
D.6.2 Undersampling
D.6.3 Augmenting your data
D.7 Performance metrics
D.7.1 Measuring classifier performance
D.7.2 Measuring regressor performance
D.8 Pro tips

Appendix E: Resources
E.1 Applications and project ideas
E.2 Courses and tutorials
E.4 Research papers and talks
E.4.1 Vector space models and semantic search
E.4.2 Finance
E.4.3 Question answering systems
E.4.4 Deep learning
E.4.5 LSTMs and RNNs
E.5 Competitions and awards
E.6 Datasets
E.7 Search engines
E.7.1 Search algorithms
E.7.2 Open source search engines
E.7.3 Open source full-text indexers
E.7.4 Manipulative search engines
E.7.5 Less manipulative search engines
E.7.6 Distributed search engines

Appendix F: Glossary
Acronyms
Terms

Appendix G: Setting up your AWS GPU
G.1 Steps to create your AWS GPU instance
G.1.1 Cost control

Appendix H: Locality sensitive hashing
H.1 High-dimensional vectors are different
H.1.1 Vector space indexes and hashes
H.1.2 High-dimensional thinking
H.2 High-dimensional indexing
H.2.1 Locality sensitive hashing
H.2.2 Approximate nearest neighbors
H.3 "Like" prediction

What's inside

Some sentences in this book were written by NLP! Can you guess which ones?

Working with Keras, TensorFlow, gensim, and scikit-learn (see the first sketch after this list)

Rule-based and data-based NLP (see the second sketch after this list)

Scalable pipelines
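For instance, the pretrained word vectors covered in chapter 6 support analogy arithmetic. Here is a minimal sketch using gensim's downloader API with the pretrained "glove-wiki-gigaword-50" vectors, chosen here for illustration and fetched on first use; it is not code taken from the book:

    import gensim.downloader as api

    # Load 50-dimensional GloVe word vectors (downloads ~66 MB on first use).
    wv = api.load("glove-wiki-gigaword-50")

    # The classic analogy: king - man + woman ≈ queen
    print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
    # Expect something like [('queen', 0.85)]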
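And here is a minimal sketch of the contrast between the two styles of NLP: a hand-written rule next to a classifier that learns the same distinction from labeled examples. The toy texts, labels, and the is_greeting helper are invented for illustration:

    import re

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Rule-based: a hand-written pattern decides whether a message is a greeting.
    def is_greeting(text):
        return bool(re.match(r"\s*(hi|hello|hey)\b", text, re.IGNORECASE))

    print(is_greeting("Hey, how are you?"))  # True

    # Data-based: a model learns the same distinction from labeled examples.
    texts = ["hi there", "hello friend", "hey you",
             "goodbye now", "see you later", "bye bye"]
    labels = [1, 1, 1, 0, 0, 0]  # 1 = greeting, 0 = farewell

    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(texts, labels)
    print(model.predict(["hello again", "bye for now"]))  # e.g. [1 0]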

About the reader This book requires a basic understanding of deep learning and intermediate Python skills.