iOS’s Natural Language framework allows us to analyze language and to perform language-specific tasks like script identification, tokenization, lemmatization, part-of-speech tagging, and named entity recognition.

In this introductory tutorial, we will explore the framework’s capabilities by looking at four common and essential techniques:

🔹 Tokenization

🔹 Language Identification

🔹 Part-of-speech Tagging

🔹 Identifying People, Places, and Organizations in Text

1) Tokenization

Before we can actually perform natural language processing on a text, we need to apply some pre-processing to make the data easier for computers to work with. Usually, this means splitting the text into individual words and removing any punctuation marks.

Apple provides NLTokenizer to enumerate the words in a text, so there’s no need to manually split on the spaces between words. Also, some languages like Chinese and Japanese don’t use spaces to delimit words. Luckily, NLTokenizer handles these cases for you. For all supported languages, NLTokenizer can find the semantic units in a given text.
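
To make this concrete, here is a minimal sketch of word tokenization with NLTokenizer. The helper name `tokenizeWords(in:)` and the sample sentence are our own, chosen only for illustration. The tokenizer itself is configured with the `.word` unit and walked with `enumerateTokens(in:using:)`.

```swift
import NaturalLanguage

// Hypothetical helper (not part of the framework): collects the word
// tokens that NLTokenizer finds in the given text.
func tokenizeWords(in text: String) -> [String] {
    // Ask the tokenizer for word-level units.
    let tokenizer = NLTokenizer(unit: .word)
    tokenizer.string = text

    var words: [String] = []
    // The closure is called once per token with that token's range in `text`.
    tokenizer.enumerateTokens(in: text.startIndex..<text.endIndex) { range, _ in
        words.append(String(text[range]))
        return true // keep enumerating until the end of the text
    }
    return words
}

// Illustrative input; any string works here.
let sample = "Hello, world! How are you?"
print(tokenizeWords(in: sample))
```

Because the tokenizer reports only word ranges, punctuation and whitespace should not show up in the resulting array. Passing `.sentence` or `.paragraph` instead of `.word` would split the text at a coarser level.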