Natural language processing (NLP) plays a vital role in research on emerging technologies. It covers sentiment analysis, speech recognition, text classification, machine translation, and question answering, among other tasks. If you have watched any webinar or online talk by computer science pioneer Andrew Ng, you will have noticed that he regularly urges AI and ML enthusiasts to read research papers on emerging technologies. Research papers are a good way to learn about these subjects.

In this article, we list, in no particular order, ten technical papers on natural language processing (NLP) that one must read in 2020.

1| Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

About: In this paper, researchers from Carnegie Mellon University and Google Brain proposed a novel neural architecture known as Transformer-XL that enables learning dependency beyond a fixed length without disrupting temporal coherence. According to the researchers, Transformer-XL learns dependency that is 80% longer than RNNs and 450% longer than vanilla Transformers, achieves better performance on both short and long sequences, and is up to 1,800+ times faster than vanilla Transformers during evaluation.

Read the paper here.

2| Bridging The Gap Between Training & Inference For Neural Machine Translation

About: This paper is one of the top NLP papers from the premier conference of the Association for Computational Linguistics (ACL). It addresses error accumulation in neural machine translation, which arises because the model predicts from ground-truth context words during training but from its own previous predictions at inference. The researchers addressed this problem by sampling context words during training not only from the ground-truth sequence but also from the sequence predicted by the model, where the predicted sequence is selected with a sentence-level optimum. According to the researchers, this approach achieves significant improvements on multiple datasets.
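
The idea of mixing ground-truth and model-predicted context words can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `mix_context`, the fixed probability `p_truth`, and the toy sentences are all assumptions, and the paper's sentence-level selection of the predicted sequence is not reproduced here.

```python
import random


def mix_context(ground_truth, predicted, p_truth, rng=None):
    """Pick each context word from the ground truth with probability p_truth,
    otherwise from the model's predicted sequence (hypothetical helper)."""
    if len(ground_truth) != len(predicted):
        raise ValueError("sequences must have the same length")
    rng = rng or random.Random()
    return [
        gt if rng.random() < p_truth else pred
        for gt, pred in zip(ground_truth, predicted)
    ]


# Toy example: a reference translation and an imagined model output.
reference = ["the", "cat", "sat", "on", "the", "mat"]
model_output = ["the", "cat", "sits", "on", "a", "mat"]

mixed = mix_context(reference, model_output, p_truth=0.5, rng=random.Random(0))
print(mixed)
```

Setting `p_truth` to 1.0 recovers ordinary training on ground-truth context only, while lower values expose the model to its own predictions, which is the situation it faces at inference time.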

Read the paper here.

3| BERT: Pre-training Of Deep Bidirectional Transformers For Language Understanding

About: BERT by Google AI is one of the most popular language representation models. Several organisations, including Facebook, as well as academia have been researching NLP using this transformer model. BERT stands for Bidirectional Encoder Representations from Transformers and is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. The model obtained new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE score to 80.5% and MultiNLI accuracy to 86.7%.

Read the paper here.

4| A Neural Conversational Model

About: Tech giant Google released this paper. Conversational modeling is an essential task in natural language understanding and machine intelligence. In this paper, the researchers presented a simple approach to conversational modeling that uses the sequence-to-sequence framework and converses by predicting the next sentence given the previous sentence or sentences in a conversation. On a domain-specific IT helpdesk dataset, the model can find a solution to a technical problem via conversations. On a noisy open-domain movie transcript dataset, the model can perform simple forms of common sense reasoning.

Read the paper here.

5| Emotion-Cause Pair Extraction: A New Task To Emotion Analysis In Texts

About: Emotion cause extraction (ECE) is a task aimed at extracting the potential causes behind certain emotions in text. In this paper, researchers from China proposed a new task known as emotion-cause pair extraction (ECPE), which aims to extract the potential pairs of emotions and corresponding causes in a document. Experimental results on a benchmark emotion cause corpus demonstrate the feasibility of the ECPE task as well as the effectiveness of the proposed approach.

Read the paper here.

6| Improving Language Understanding By Generative Pre-Training

About: This paper, published by OpenAI, discusses natural language understanding and why scarce labeled data makes it challenging for discriminatively trained models to perform adequately. The researchers proposed a general task-agnostic model, built by generative pre-training of a language model on unlabeled text followed by discriminative fine-tuning on each task, and demonstrated its effectiveness on a wide range of natural language understanding benchmarks. It outperformed discriminatively trained models that use architectures specifically crafted for each task, significantly improving upon the state of the art in 9 out of the 12 tasks studied.

Read the paper here.

7| Neural Approaches To Conversational AI

About: This research paper by Microsoft Research surveys neural approaches to conversational AI that have been developed in the last few years. In this paper, the researchers grouped conversational systems into three categories, which are question answering agents, task-oriented dialogue agents, and chatbots. For each category, a review of state-of-the-art neural approaches is presented, drawing the connection between them and traditional approaches, as well as discussing the progress that has been made and challenges still being faced, using specific systems and models as case studies.

Read the paper here.

8| Language Models Are Unsupervised Multitask Learners

About: In this research paper, the authors demonstrated that language models begin to learn natural language processing tasks such as question answering, machine translation, reading comprehension, and summarization without any explicit supervision when trained on a new dataset of millions of web pages called WebText. According to the researchers, language modeling is at the core of this research, and the language models use a transformer-based architecture.

Read the paper here.

9| ALBERT: A Lite BERT For Self-supervised Learning Of Language Representations

About: In this paper, researchers at Google designed A Lite BERT (ALBERT), a modified version of the traditional BERT model. The model incorporates two parameter-reduction techniques, factorized embedding parameterization and cross-layer parameter sharing, to lift the major obstacles in scaling pre-trained models in NLP. According to the researchers, the model established new state-of-the-art results on the natural language understanding (NLU) benchmarks GLUE, RACE, and SQuAD 2.0.
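
A little arithmetic shows why these two techniques shrink the model. The sketch below is illustrative only: the function names are hypothetical, and the vocabulary size, embedding size, hidden size, layer count, and per-layer parameter count are assumed values chosen for the example, not figures taken from the paper.

```python
def embedding_params(vocab_size, hidden_size, embed_size=None):
    """Parameter count of the embedding block.

    Without factorization: one vocab_size x hidden_size matrix.
    With factorization: vocab_size x embed_size, then embed_size x hidden_size.
    """
    if embed_size is None:
        return vocab_size * hidden_size
    return vocab_size * embed_size + embed_size * hidden_size


def encoder_params(per_layer_params, num_layers, share_across_layers):
    """Parameter count of the encoder stack, with or without layer sharing."""
    if share_across_layers:
        return per_layer_params  # one set of weights reused by every layer
    return per_layer_params * num_layers


# Assumed, illustrative sizes.
V, H, E = 30_000, 768, 128
print(embedding_params(V, H))      # 23040000
print(embedding_params(V, H, E))   # 3938304

LAYERS, PER_LAYER = 12, 7_000_000  # hypothetical per-layer size
print(encoder_params(PER_LAYER, LAYERS, share_across_layers=False))  # 84000000
print(encoder_params(PER_LAYER, LAYERS, share_across_layers=True))   # 7000000
```

With these assumed sizes, factorizing the embeddings cuts that block to roughly a sixth of its original size, and sharing weights across layers divides the encoder's parameter count by the number of layers.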

Read the paper here.

10| XLNet: Generalized Autoregressive Pre-training For Language Understanding

About: In this paper, researchers from Carnegie Mellon University and the Google AI Brain Team proposed XLNet, a generalised autoregressive pre-training method that enables learning bidirectional contexts by maximising the expected likelihood over all permutations of the factorization order, and that overcomes the limitations of BERT thanks to its autoregressive formulation. According to the researchers, XLNet outperformed BERT on 20 tasks, including question answering, natural language inference, sentiment analysis, and document ranking.
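
To see how permuting the factorization order gives an autoregressive model access to context on both sides of a token, consider the toy enumeration below. It is a sketch under simplifying assumptions: the helper name `contexts_for_position` is hypothetical, and it enumerates every permutation of a three-token sequence, which is only feasible for very short inputs. It illustrates the idea rather than the paper's training procedure.

```python
from itertools import permutations


def contexts_for_position(seq_len, target):
    """For every factorization order, collect the set of positions that
    precede `target` in that order, i.e. the context used to predict it."""
    contexts = set()
    for order in permutations(range(seq_len)):
        idx = order.index(target)
        contexts.add(frozenset(order[:idx]))
    return contexts


tokens = ["New", "York", "City"]
for ctx in sorted(contexts_for_position(len(tokens), target=1), key=sorted):
    visible = [tokens[i] for i in sorted(ctx)]
    print(f"predict {tokens[1]!r} given {visible}")
```

Across the different orders, the middle token is predicted from no context, from the left word only, from the right word only, and from both, even though each individual order is processed autoregressively.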

Read the paper here.
