Related skills
- Natural Language Processing
Associated roles
- AI engineer
- Data/ML engineer
- ML/AI architect
Contents
- preface
- acknowledgments
- about this book
  - Who should read this book
  - How this book is organized: A road map
  - About the code
  - liveBook discussion forum
  - Other online resources
- about the author
- about the cover illustration
- 1.1 A brief history of NLP
- 1.2 Typical tasks
  - 1.2.1 Information search
  - 1.2.2 Advanced information search: Asking the machine precise questions
  - 1.2.3 Conversational agents and intelligent virtual assistants
  - 1.2.4 Text prediction and language generation
  - 1.2.5 Spam filtering
  - 1.2.6 Machine translation
  - 1.2.7 Spell- and grammar checking
- Summary
- Solution to exercise 1.1
- 2.1 Introducing NLP in practice: Spam filtering
- 2.2 Understanding the task
  - 2.2.1 Step 1: Define the data and classes
  - 2.2.2 Step 2: Split the text into words
  - 2.2.3 Step 3: Extract and normalize the features
  - 2.2.4 Step 4: Train a classifier
  - 2.2.5 Step 5: Evaluate the classifier
- 2.3 Implementing your own spam filter
  - 2.3.1 Step 1: Define the data and classes
  - 2.3.2 Step 2: Split the text into words
  - 2.3.3 Step 3: Extract and normalize the features
  - 2.3.4 Step 4: Train the classifier
  - 2.3.5 Step 5: Evaluate your classifier
- 2.4 Deploying your spam filter in practice
- Summary
- Solutions to miscellaneous exercises
- 3.1 Understanding the task
  - 3.1.1 Data and data structures
  - 3.1.2 Boolean search algorithm
- 3.2 Processing the data further
  - 3.2.1 Preselecting the words that matter: Stopwords removal
  - 3.2.2 Matching forms of the same word: Morphological processing
- 3.3 Information weighing
  - 3.3.1 Weighing words with term frequency
  - 3.3.2 Weighing words with inverse document frequency
- 3.4 Practical use of the search algorithm
  - 3.4.1 Retrieval of the most similar documents
  - 3.4.2 Evaluation of the results
  - 3.4.3 Deploying search algorithm in practice
- Summary
- Solutions to miscellaneous exercises
- 4.1 Use cases
  - 4.1.1 Case 1
  - 4.1.2 Case 2
  - 4.1.3 Case 3
- 4.2 Understanding the task
- 4.3 Detecting word types with part-of-speech tagging
  - 4.3.1 Understanding word types
  - 4.3.2 Part-of-speech tagging with spaCy
- 4.4 Understanding sentence structure with syntactic parsing
  - 4.4.1 Why sentence structure is important
  - 4.4.2 Dependency parsing with spaCy
- 4.5 Building your own information extraction algorithm
- Summary
- Solutions to miscellaneous exercises
- 5.1 Understanding the task
  - 5.1.1 Case 1: Authorship attribution
  - 5.1.2 Case 2: User profiling
- 5.2 Machine-learning pipeline at first glance
  - 5.2.1 Original data
  - 5.2.2 Testing generalization behavior
  - 5.2.3 Setting up the benchmark
- 5.3 A closer look at the machine-learning pipeline
  - 5.3.1 Decision Trees classifier basics
  - 5.3.2 Evaluating which tree is better using node impurity
  - 5.3.3 Selection of the best split in Decision Trees
  - 5.3.4 Decision Trees on language data
- Summary
- Solutions to miscellaneous exercises
- 6.1 Another close look at the machine-learning pipeline
  - 6.1.1 Evaluating the performance of your classifier
  - 6.1.2 Further evaluation measures
- 6.2 Feature engineering for authorship attribution
  - 6.2.1 Word and sentence length statistics as features
  - 6.2.2 Counts of stopwords and proportion of stopwords as features
  - 6.2.3 Distributions of parts of speech as features
  - 6.2.4 Distribution of word suffixes as features
  - 6.2.5 Unique words as features
- 6.3 Practical use of authorship attribution and user profiling
- Summary
- 7.1 Use cases
- 7.2 Understanding your task
  - 7.2.1 Aggregating sentiment score with the help of a lexicon
  - 7.2.2 Learning to detect sentiment in a data-driven way
- 7.3 Setting up the pipeline: Data loading and analysis
  - 7.3.1 Data loading and preprocessing
  - 7.3.2 A closer look into the data
- 7.4 Aggregating sentiment scores with a sentiment lexicon
  - 7.4.1 Collecting sentiment scores from a lexicon
  - 7.4.2 Applying sentiment scores to detect review polarity
- Summary
- Solutions to exercises
- 8.1 Addressing multiple senses of a word with SentiWordNet
- 8.2 Addressing dependence on context with machine learning
  - 8.2.1 Data preparation
  - 8.2.2 Extracting features from text
  - 8.2.3 Scikit-learn’s machine-learning pipeline
  - 8.2.4 Full-scale evaluation with cross-validation
- 8.3 Varying the length of the sentiment-bearing features
- 8.4 Negation handling for sentiment analysis
- 8.5 Further practice
- Summary
- 9.1 Topic classification as a supervised machine-learning task
  - 9.1.1 Data
  - 9.1.2 Topic classification with Naïve Bayes
  - 9.1.3 Evaluation of the results
- 9.2 Topic discovery as an unsupervised machine-learning task
  - 9.2.1 Unsupervised ML approaches
  - 9.2.2 Clustering for topic discovery
  - 9.2.3 Evaluation of the topic clustering algorithm
- Summary
- Solutions to miscellaneous exercises
- 10.1 Topic modeling with latent Dirichlet allocation
  - 10.1.1 Exercise 10.1: Question 1 solution
  - 10.1.2 Exercise 10.1: Question 2 solution
  - 10.1.3 Estimating parameters for the LDA
  - 10.1.4 LDA as a generative model
- 10.2 Implementation of the topic modeling algorithm
  - 10.2.1 Loading the data
  - 10.2.2 Preprocessing the data
  - 10.2.3 Applying the LDA model
  - 10.2.4 Exploring the results
- Summary
- Solutions to miscellaneous exercises
- 11.1 Named entity recognition: Definitions and challenges
  - 11.1.1 Named entity types
  - 11.1.2 Challenges in named entity recognition
- 11.2 Named-entity recognition as a sequence labeling task
  - 11.2.1 The basics: BIO scheme
  - 11.2.2 What does it mean for a task to be sequential?
  - 11.2.3 Sequential solution for NER
- 11.3 Practical applications of NER
  - 11.3.1 Data loading and exploration
  - 11.3.2 Named entity types exploration with spaCy
  - 11.3.3 Information extraction revisited
  - 11.3.4 Named entities visualization
- Summary
- Conclusion
- Solutions to miscellaneous exercises
Overview
Hit the ground running with this in-depth introduction to the NLP skills and techniques that allow your computers to speak human.
In Getting Started with Natural Language Processing, you’ll learn about:
- Fundamental concepts and algorithms of NLP
- Useful Python libraries for NLP
- Building a search algorithm
- Extracting information from raw text
- Predicting sentiment of an input text
- Author profiling
- Topic labeling
- Named entity recognition
Getting Started with Natural Language Processing is an enjoyable and understandable guide that helps you engineer your first NLP algorithms. Your tutor is Dr. Ekaterina Kochmar, lecturer at the University of Bath, who has helped thousands of students take their first steps with NLP. Full of Python code and hands-on projects, each chapter provides a concrete example with practical techniques that you can put into practice right away. If you’re a beginner to NLP and want to upgrade your applications with functions and features like information extraction, user profiling, and automatic topic labeling, this is the book for you.
About the Technology
From smart speakers to customer service chatbots, apps that understand text and speech are everywhere. Natural language processing, or NLP, is the key to this powerful form of human/computer interaction. And a new generation of tools and techniques makes it easier than ever to get started with NLP!
About the Book
Getting Started with Natural Language Processing teaches you how to upgrade user-facing applications with text- and speech-based features. From the accessible explanations and hands-on examples in this book, you’ll learn how to apply NLP to sentiment analysis, user profiling, and much more. As you go, each new project builds on what you’ve previously learned, introducing new concepts and skills. Handy diagrams and intuitive Python code samples make it easy to get started, even if you have no background in machine learning!
What's Inside
- Fundamental concepts and algorithms of NLP
- Extracting information from raw text
- Useful Python libraries
- Topic labeling
- Building a search algorithm
About the Reader
You’ll need basic Python skills. No experience with NLP required.
About the Author
Ekaterina Kochmar is a lecturer at the Department of Computer Science of the University of Bath, where she is part of the AI research group.
Quotes
An accessible entry point. Learn key NLP concepts by building real-world projects.
- Samantha Berk, AdaptX
A well-written, pragmatic book.
- James Richard Woodruff, SAIC
The best NLP resource.
- Najeeb Arif, ThoughtWorks
Get started with NLP and understand its fundamentals.
- Walter Alexander Mata López, University of Colima
Makes a difficult subject easy to understand.
- Tanya Wilke, .NET Engineer