
This dataset lists over 37 million Wikipedia articles in 55 languages, each with a quality score from WikiRank (https://wikirank.net). The quality scores are based on Wikipedia dumps from July 2018.





License



All files included in this dataset are released under CC0: https://creativecommons.org/publicdomain/zero/1.0/



Format

• page_id -- The identifier of the Wikipedia article (int), e.g. 4519301

• revision_id -- The Wikipedia revision of the article (int), e.g. 24284811

• page_name -- The title of the Wikipedia article (utf-8), e.g. General relativity

• wikirank_quality -- The WikiRank quality score of the article, on a scale from 0 to 100
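
The fields listed above could be read along the following lines. This is only a sketch under assumptions the description does not confirm: it assumes comma-separated files with a header row that uses the field names above, and that the quality score parses as a number. The ArticleScore type, the read_scores helper, and the sample score 87.5 are hypothetical and chosen for illustration; only the example page_id, revision_id and page_name values come from the field list.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class ArticleScore(NamedTuple):
    """One row of the dataset (hypothetical container type)."""
    page_id: int
    revision_id: int
    page_name: str
    wikirank_quality: float


def read_scores(handle: TextIO) -> Iterator[ArticleScore]:
    """Yield one ArticleScore per data row.

    Assumes a comma-separated file whose header row contains the four
    field names from the Format list; adjust if the real files differ.
    """
    for row in csv.DictReader(handle):
        yield ArticleScore(
            page_id=int(row["page_id"]),
            revision_id=int(row["revision_id"]),
            page_name=row["page_name"],
            wikirank_quality=float(row["wikirank_quality"]),
        )


# In-memory sample so the sketch runs without the real files.
# The score 87.5 is made up for illustration.
SAMPLE = (
    "page_id,revision_id,page_name,wikirank_quality\n"
    "4519301,24284811,General relativity,87.5\n"
)

rows = list(read_scores(io.StringIO(SAMPLE)))
for article in rows:
    print(article.page_id, article.page_name, article.wikirank_quality)
```

For a file on disk, the same helper can be given a handle opened with open(path, encoding="utf-8", newline=""), where path points to one of the downloaded files. Because rows are yielded one at a time, a large file does not have to be loaded into memory at once.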

