The Punjabi-language Wikisource is the fastest-growing Wikimedia project in the world. Rupika Sharma, a volunteer Wikimedia editor and community member, writes about one of the initiatives that has helped make this a reality.

Imagine growing up in a world where the greatest literary works in history never existed.

For speakers of many of the world’s languages, this can be their functional reality. Great works have either never been translated into their language, or were translated decades ago and now hide in old paper-bound volumes on dusty library shelves.

• • •

As an example of this problem, let’s take a look at the Punjabi language. Its speakers were separated by the 1947 partition of British India, and the language is today spoken by 120 million people in regions of Pakistan and India. I’m one of them. I grew up in northwest India and can still remember hearing about Chambe Diyan Kaliyan, a short story collection by Leo Tolstoy that was adapted into Punjabi by Abhai Singh. That particular book is frequently cited in the history of Punjabi literature as one of the first collections of short stories to be published in the language.

You’ll note, though, that I didn’t say I can remember reading it—I’ve never been able to track down one of the published books to read it for myself, nor have I been able to find anything but a bunch of pop-culture songs with similar titles when I search for it online in Punjabi. All of which is to say that when I was growing up, reading and learning from Tolstoy’s story was functionally impossible for Punjabi speakers.

Thankfully, times are changing. While there are still many barriers to surmount, the advent of the internet has made the fundamental problem of publishing and distributing translations far easier to solve. The Wikimedia community has an entire project devoted to this sort of thing: Wikisource.

Wikimedia community gathering in Patiala, Punjab, India, in 2018.

Wikisource brings the lost literature of long-forgotten times into the modern era for interested readers. It is a free e-library that provides freely licensed or public domain books at no cost, in different formats, for use for any purpose. It is one of thirteen collaborative knowledge projects operated by the Wikimedia Foundation, the largest of which is Wikipedia, and it is available in nearly seventy languages.

The Punjabi-language Wikisource was and is small compared to other language Wikisources. To grow this resource, I formed a partnership with a government library in the Indian city of Patiala to digitize public domain books. By making rare literary works accessible in languages that have little to no presence online, Wikisource serves ordinary readers, allowing them to freely browse these resources.

As a titled Wikimedian-in-residence at the library, I helped their staff scan a selection of important books. The collaboration brought forty-two public domain Punjabi-language works online, including a reprint of Chambe Diyan Kaliyan, the Tolstoy short story collection. But just making the scanned images available online isn’t enough; they are not easy to read and often rank low in search engines. Wikisource plays a crucial middleman role: it hosts the images and pairs them with searchable text versions, created and vetted by volunteers. The volunteers are helped in this process by Jay Prakash’s IndicOCR, a new tool that makes it easy to transcribe text in any Indic language to Wikisource. (It replaced an older Linux-based tool that could not be used on many devices.) In addition, Wikisource makes everything available in different file formats so that readers can download whatever works best on their device, whether it’s a computer, tablet, phone, or something else.

Finally, Wikisource also allows anyone to contribute, and so I helped organize an online contest, held from December 2018 to January 2019. Prizes and in-person training sessions brought around three dozen new volunteers to the project, including twenty-four who made more than fifty edits. Kuljit Singh Khuddi, a new volunteer who joined Punjabi Wikisource during the contest, says, “I am proud to be able to contribute to my mother tongue on Wikisource. Such contests help make my language known worldwide.”

The results were stark: the contest made the Punjabi Wikisource the fastest-growing Wikimedia project in the entire world in both content and editors. As of October of last year, the Punjabi Wikisource contained a bit over 1,200 pages. By January of this year, it had over 6,770 pages belonging to 200 different books. Moreover, over 6,000 of these pages had been proofread by volunteers.

• • •

The growth of the Punjabi Wikisource through the contest and other volunteer work is just a beginning. There are a number of opportunities for supporting the project with technical contributions and GLAM (galleries, libraries, archives, and museums) partnerships with different government organizations and institutions.

Moreover, it is just one of several expanding Wikisources in the region. The Wikisources for the Indic languages of Marathi, Kannada, and Assamese each more than doubled in size in the last year, and with every edit, their volunteers are bringing the sum of all knowledge into their own mother tongues.

Rupika Sharma, Wikimedia community member