Chinese AI company iFLYTEK has once again topped the SQuAD2.0 challenge. The model “BERT + DAE + AoA,” submitted by HFL, the joint laboratory of iFLYTEK Research and Harbin Institute of Technology (HIT), outperformed humans on both the EM (exact match) and F1-score (fuzzy match) metrics to take first place on the SQuAD2.0 leaderboard.

SQuAD2.0 (Stanford Question Answering Dataset) is a widely recognized, top-level machine reading comprehension challenge in the field of cognitive intelligence. Built on Wikipedia articles, the dataset contains more than 100,000 questions. SQuAD2.0 uses two evaluation metrics, EM (exact match) and F1-score (fuzzy match), to compare models’ reading comprehension ability with human performance. Unlike other machine reading comprehension tasks, SQuAD2.0 requires models not only to answer questions from the accompanying passage, but also to recognize questions that the passage does not support and to abstain from answering them.
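
For readers who want to see what the two metrics measure, here is a minimal Python sketch. It assumes the commonly used conventions for SQuAD-style scoring: answers are lowercased and stripped of punctuation and English articles before comparison, F1 is computed over overlapping tokens, and an unanswerable question is represented by an empty string. The function names are illustrative rather than taken from any official tool, and the sketch scores against a single reference answer, whereas the leaderboard compares each prediction with several.

```python
import collections
import re
import string


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """Return 1 if the normalized strings are identical, else 0."""
    return int(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction, gold):
    """Token-level F1 between a predicted answer and a reference answer."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    # Assumption: an empty string means "no answer". Abstaining is scored
    # as correct only when the reference is also empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    common = collections.Counter(pred_tokens) & collections.Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a prediction of “in Paris, France” against the reference “Paris” earns no exact-match credit but receives partial F1 credit, while a model that answers a question whose reference is empty scores zero on both metrics.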

The HFL team’s model “BERT + DAE + AoA” combines BERT, Google’s industry-leading language representation model, with Attention-over-Attention (AoA), a neural network for reading comprehension. Another innovation of the iFLYTEK system is DAE (DA Enhanced), in which DA stands for both Data Augmentation and Domain Adaptation.

HFL Senior Researcher Yiming Cui says that although the BERT + DAE + AoA model has achieved state-of-the-art (SOTA) results on the current SQuAD2.0 leaderboard, it still has room for improvement on reading comprehension tasks. Because its accuracy on unanswerable questions is lower than its accuracy on answerable ones, iFLYTEK researchers are exploring ways to better predict, by analyzing the article content, whether a question can be answered.

The HFL joint laboratory was founded in 2014 by Harbin Institute of Technology and iFLYTEK. It mainly focuses on reading comprehension, automatic scoring, human-machine dialogue, speech recognition, and other forward-looking research topics on cognitive intelligence such as deep semantic understanding and autonomous learning evolution.