Wikipedia is becoming increasingly critical in helping people obtain information and knowledge. Its leading advantage is that users can not only access information but also modify it. This openness raises a challenging question: how can the quality of a Wikipedia article be measured? Existing approaches assess Wikipedia quality with statistical models or traditional machine learning algorithms, but their performance is not satisfactory. Moreover, most existing models fail to extract complete information from articles, which degrades their performance. In this article, we first survey related work and summarise a comprehensive feature framework. We then introduce state-of-the-art deep learning models and apply them to assess Wikipedia quality. Finally, we compare the deep learning models with traditional machine learning models to validate the effectiveness of the proposed model. The models are compared extensively in terms of their training and classification performance. In addition, the importance of each individual feature and of the different feature sets is analysed separately.
