You probably know about Mark Twain even if you haven’t read a book by the name of The Adventures of Tom Sawyer. But that ain’t no matter. The digitization of 19th-century newspapers means that scholars of Samuel Clemens (Twain’s real name) have discovered a whole series of newspaper articles that had been thought lost, in which he told the truth, mainly.

As with the British Library’s project, which is going through its collection of scanned Victorian-era newspapers looking for what passed for humor then, the new Clemens writings were discovered through the digitization of newspapers.

Digitization was like “opening up a big box of candy,” Bob Hirst, editor of the University of California, Berkeley’s Mark Twain Project Online (MTPO), told the Guardian. It meant that Twain’s articles could be tracked down in a way that was not possible when archives were all on microfilm, he said. About 110 new pieces have been discovered.

“Twain wrote some of the letters and stories at the San Francisco Chronicle when it was called the San Francisco Dramatic Chronicle, where his job included writing a 2,000-word dispatch every day and sending it off by stagecoach for publication in the Territorial Enterprise newspaper in Virginia City, Nevada,” writes the Guardian. At the time, 1865, he was 29 and debating his future career.

It was known that Clemens had written for the Territorial Enterprise—in fact, it was the first place he used the Twain pseudonym. But much of what he wrote for the paper was lost in a series of fires that burned back issues, Hirst told Steve Rubenstein, who writes for the present-day San Francisco Chronicle.

“Scholars rediscovered his San Francisco stories by combing back issues of other Western U.S. newspapers, many of which reprinted the Territorial Enterprise letters,” Rubenstein writes. “Some of the recently found letters, assembled bit by bit from newspaper archives and historical societies, were unsigned. Hirst says that a close examination of their style proves that Twain was the author.”

MTPO is digitizing its complete collection of Twain material, including books, letters, photographs, documents, and notes. Its ultimate purpose, according to the organization’s website, is to “produce a digital critical edition, fully annotated, of everything Mark Twain wrote.”

MTPO started the process of digitizing Clemens’ work as long ago as 2001, though at first each of the different types of material had a different repository. The organization then partnered with the California Digital Library to obtain the technical expertise it needed to make the materials available in a single place online, where they are both more accessible and more easily updated than in printed books.

“Dozens of previously unknown letters by Clemens come to light every year, so that print volumes that are meant to be comprehensive may be out of date almost as soon as they hit the bookstores,” MTPO writes. “Web publication, on the other hand, can make new documents accessible to the public as soon as they emerge.”

Online publication offers other advantages, according to the organization. “A digital critical edition allows for interactivity and an integrative reading experience that is unimaginable in a print edition,” it writes.

“Side-by-side digital presentation makes more visible the transcription process of a particular passage from the original manuscript or typescript. Critical editions often contain long lists of emendations and historical collations in the back of the book, referencing the text using page-and-line cues. Digital publication allows each emendation or collation to be hyperlinked to a specific location in the text or texts that can instantly display all the variant readings from one witness in place, providing superior access to the information.”

That said, over the next 18 months MTPO plans to collect all the newly found material into a book, according to the Washington Post.

UC Berkeley isn’t the only repository of scanned Clemens work. Other sources include:

Northern Illinois University, which provides a searchable and indexed digital library of some of his work, particularly his Mississippi novels and reminiscences;

The New York Public Library, which is digitizing manuscripts of A Connecticut Yankee in King Arthur’s Court and Following the Equator, as well as his letters with people such as Andrew Carnegie, William Dean Howells, and Theodore Roosevelt; and

The University of Missouri Press, which is raising money in an effort to digitize authors associated with Missouri, including Twain.

And what happened to Clemens after his sojourn in San Francisco? It turns out that he moved to Hawaii and took up the career for which he is best known, writes Rubenstein. “A life of writing novels, in which it was OK to make things up, suited him better than that of writing newspaper stories, in which doing so is not generally recommended,” he writes.

Don’t you love a happy ending?