Game creator Jordan Mechner wanted to teach the next generation. So the man behind the groundbreaking 1989 Apple II game Prince of Persia recently posted his original 6502 assembly source code to GitHub. But getting the code from decades-old floppy disks "covered with dust" was no simple task. Mechner employed the services of vintage computer expert Tony Diaz and digital archivist Jason Scott to extract the bits from the floppies and assemble them into a readable code file.

Without Diaz and Scott, Mechner's code could have been lost forever. The exact methods he used to create this landmark game would have become as obsolete as the 1970s-era technology it was played on.

But old source code isn't the only cultural artifact that requires specialized knowledge to preserve. As paper and dyes deteriorate, acetate degenerates, and the minute magnetic flux recorded in analog tape fades with the ages, how do we preserve cultural artifacts like photographs, music, and film? And what of more modern digitally created media? Images and video are shot directly in digital formats and stored on flash media. Music is recorded in 24-bit, 192kHz digital resolution onto massive hard drives. All these files exist in various codecs, formats, and file systems; on spinning magnetic platters or in solid state NAND flash. How do we preserve these files for future reference, study, and appreciation?

These are the questions asked and answered by "digital archivists" every day. Ars spoke to Robin Pike, a certified archivist currently serving as a Digital Collections Librarian at the University of Maryland, for insight into the process, and the motivation, that fuel her work.

Collections speak in a loud voice

Anthony Bannon, a director at the George Eastman House, wrote in the foreword to 500 Cameras that "collections manage time." Collecting, documenting, and managing a collection of objects (in Bannon's case, cameras, photographs, and other photographic ephemera) gives us insight into our history, and can lead us in new directions. Such collections bring value when they are organized, interpreted, and shared.

"But we must care for them, say their names, and notice what they consign," Bannon writes. "So we take responsibility for our collections with gratitude. The collector, whether individual or institutional, engages with the object to recognize the light of its value and hold the spark, to take on the responsibility of its meaning and make sense of it."

Pike brings such a personal sense of responsibility to her work as a digital archivist specializing in audiovisual materials. Growing up in a family of musicians and music lovers, Pike was exposed to "legacy" music formats including vinyl LPs, and even vintage glass 78 RPM records, from an early age. Homing in on classic Hollywood musicals, Pike performed as both an instrumentalist and vocalist in high school stage productions, and performed in bands and orchestras.

This love of music took Pike to Indiana University of Pennsylvania, where she majored in music education. To supplement her studies, Pike took a job as a student assistant in the university's music library, first shelving books and scores, and later doing preservation repairs. Pike helped process numerous collections that were donated to the library, including an obscure collection of books and other documents from a former marching band director who had a zeal for military band history.

"It wasn't so much the content that interested me, but that some of these books were nearly a century old," Pike told Ars.

Pike also took, as part of her music education curriculum, a class focused on capturing and disseminating digital music. Among its subjects: MIDI files, compressed audio such as MP3, and digital score formats.

"By the time I finished my undergrad degree, I knew I didn’t want to teach anymore; I wanted to go into librarianship," Pike said.

So Pike headed to the University of Pittsburgh to earn a master's degree in library and information science. Pitt is regarded as having one of the top three archivist programs in the country, and it also features a course called "Sound and Moving Image Archives."

"While I work with all formats, my passion is still audiovisual media," Pike explained.

Analog to digital converter

Aside from her lifelong love of music, Pike is attracted to the complexity of working with analog materials. "Analog materials are formed of layers of 'stuff,'" she told Ars. "Film exists of a base layer and an emulsion in which the dyes are suspended. Magnetic media, such as open reels, cassette tapes, and VHS tapes, are made of the base layer with the emulsion which includes magnetic particles that encode the analog (or sometimes digital) signal and a lubricant to make it glide over the tape player heads easily. LPs have a core material and a coating, in which the grooves are stamped.

"In every case," Pike continued, "these materials do not want to stick together forever, and it's only a matter of time before they flake apart, even under the best storage conditions. When magnetic and digital media starts to deteriorate, the encoding signal can become distorted, so you may not be able to retrieve what was originally recorded. Because the physical materials have an 'expiration date,' when their expected lifespan will end, the only way to preserve the materials is digitally."

Pike employs an impressive array of modern and vintage technology in her work. She prefers to use Macs, since OS X gives her the ability to run top GUI software for handling audio, video, and still images, while still giving her access to open source management tools like Fedora and DSpace. She also works with UMD's IT staff to arrange massive storage arrays to store archival master files on site. These files "can be around 100MB for images, 5GB for audio, and over 50GB for video files," Pike said. And each file gets two tape backups: one on campus, and one off campus.

Pike also has a collection of "legacy audiovisual players," including turntables, reel-to-reel tape decks, cassette players, VHS players, and CD and DVD players. But she also has at her disposal Laserdisc players, open-reel videotape players, 2” Quadruplex players, Sony Betamax players, film projectors for various film sizes, DAT players, and specialized systems to play wax and wire recordings.

Besides working on physical transfers—moving analog media into digital formats—Pike and her colleagues also deal with transferring massive amounts of digital data from legacy formats into newer ones in hopes they will remain accessible far into the future.

"One of the biggest challenges in the field of digital librarianship is simply trying to evolve as fast as technology," Pike said, "because we need to also keep up with emerging file formats and software systems to read those formats. We need to think of ways to preserve them and make them accessible either through emulation, or migration to a different format or system."

After transferring or reformatting media into digital archival masters, however, comes the rather inglorious but essential process of creating, storing, and indexing the critical metadata for each audiovisual artifact.

Modern Melvil Dewey

"Metadata is essentially information about the digital object which is frequently embedded in the header of the file code, akin to someone writing a description of a photograph on the back of the photograph," Pike said. "This metadata can help identify characteristics about the file without having to open the file, and is especially important for identifying the files before they are ingested into a repository or content management system."
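To make Pike's description concrete, here is a minimal sketch of reading technical metadata from a file's header. It assumes an uncompressed WAV file and Python's standard `wave` module; the article does not say which formats or tools Pike's team actually uses, and the helper names `make_sample_wav` and `read_technical_metadata` are hypothetical.

```python
import io
import wave


def make_sample_wav(seconds=1, rate=192_000, channels=2, sample_bytes=3):
    """Build a small silent WAV in memory as a stand-in for an archival master."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_bytes)  # 3 bytes per sample = 24-bit audio
        w.setframerate(rate)
        w.writeframes(bytes(seconds * rate * channels * sample_bytes))
    buf.seek(0)
    return buf


def read_technical_metadata(fileobj):
    """Report header fields only; the audio frames themselves are never read."""
    with wave.open(fileobj, "rb") as w:
        return {
            "channels": w.getnchannels(),
            "bit_depth": w.getsampwidth() * 8,
            "sample_rate_hz": w.getframerate(),
            "duration_s": w.getnframes() / w.getframerate(),
        }


metadata = read_technical_metadata(make_sample_wav())
print(metadata)
```

A WAV header carries only technical details such as these. Descriptive metadata (titles, creators, dates) usually lives in other embedded fields or in a separate catalog record, which this sketch does not cover.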

Once all the metadata is associated with a file and loaded into a repository (essentially a database), that file can be browsed or searched by students, faculty, researchers, and others.

Sometimes, though, copyright issues rear their ugly heads, and can limit accessibility.

"Copyright is a huge barrier for making any digitized materials accessible," Pike said. Some institutions focus solely on public domain material. Some institutions may own some of their copyrights, such as a university-run press. And there's always plenty of that material to work on, according to Pike.

Regardless, many institutions lack the resources to track down copyright holders, who are often difficult to locate, and to negotiate the licensing agreements needed to create digital archives and make them available to others.

"The libraries [at UMD] do approach copyright holders to create licensing agreements for materials that professors want to put on reserve or make available for classes when publishers are known," Pike said. "But, there are levels of accessibility that are created based on the copyright (or questionable copyright) and license agreements—whether materials are available on the digital collections website, available for download, available for streaming, or available for streaming to restricted IP addresses on campus or just within the libraries."
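The tiers Pike describes could be modeled roughly as follows. This is only an illustrative sketch, not UMD's actual system: the `Access` levels, the `may_stream` helper, and the address ranges (taken from blocks reserved for documentation) are all assumptions.

```python
from enum import Enum
from ipaddress import ip_address, ip_network

# Hypothetical ranges from reserved documentation blocks, not real UMD networks.
CAMPUS_NET = ip_network("198.51.100.0/24")
LIBRARY_NET = ip_network("198.51.100.0/28")  # a subset of the campus range


class Access(Enum):
    DOWNLOAD = "download"              # anyone may download or stream
    STREAM_PUBLIC = "stream"           # anyone may stream
    STREAM_CAMPUS = "stream-campus"    # streaming from campus addresses only
    STREAM_LIBRARY = "stream-library"  # streaming from library addresses only


def may_stream(level, client_ip):
    """Return True if a client at client_ip may stream an item at this level."""
    addr = ip_address(client_ip)
    if level in (Access.DOWNLOAD, Access.STREAM_PUBLIC):
        return True
    if level is Access.STREAM_CAMPUS:
        return addr in CAMPUS_NET
    return addr in LIBRARY_NET


print(may_stream(Access.STREAM_CAMPUS, "198.51.100.200"))
print(may_stream(Access.STREAM_LIBRARY, "198.51.100.200"))
```

In this sketch the access level would be set per item from its copyright status and license terms, so the same repository can serve public domain material openly while keeping licensed recordings restricted to the library.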

Digital stewards of our past, present, and future

If there is one thing we learned in talking with Pike, it's that digital archivists rarely suffer from a lack of gravitas. Much like the family member who collects, organizes, and identifies old family photos to preserve the family's heritage, digital archivists seek to do the same for all mankind.

"We are the custodians of what has been created and are enabling access—ideally free and unlimited—for the future," Pike said. "No matter what is created and where it is created, if it is important, some librarian, archivist, or records manager is capturing it and saving it for the future. In addition to saving the digital objects, we need to make them accessible so people can use and reuse the materials."

"We are the custodians of human history."
