
Archivists Aren’t Ready for the ‘Very Online’ Era

The challenge: how to catalog and derive meaning from so much digital clutter

The work of history starts with a negotiation. A public figure or their descendant—or, say, an activist group or a college club—works with an institution, such as a university library, to decide which of the figure’s papers, correspondence, photos, and other materials to donate. Archivists then organize these records for researchers, who, over subsequent years, physically flip through them. These tidbits are deeply valuable. They reveal crucial details about our most famous figures and important historical events. They’re the gas feeding the engine of our history books.

Over the past two decades, the volume of these donations has increased dramatically. When Donald Mennerich, a digital archivist at NYU, first started working in the field, 15 years ago, writers or activists or public figures would hand over boxes of letters, notes, photos, meeting minutes, and maybe a floppy disk or a “small computer that had a gigabyte hard drive,” he told me. Now, Mennerich said, “everyone has a terabyte of data on their laptop and a 4-terabyte hard drive”—about 4,000 times as much content—plus an email inbox with 10,000 messages or more.

Processing this digital bulk is a headache. At the British Library, when a laptop arrives, Callum McKean, the library’s lead digital curator, makes a master copy of the hard drive. Then archivists create a curated version that filters out sensitive information, just as they do for paper records. Various software tools promise to ease the work, for example by scanning an email inbox for potentially sensitive messages (bank-account details, doctor’s notes, unintended sexual disclosures), but the technology isn’t foolproof. Once, Mennerich was surprised to find that one such tool had not redacted the phone number of a celebrity. So archivists must still review files by hand, which has “created a huge bottleneck,” McKean told me.

Now many libraries possess emails that they don’t have the bandwidth to make accessible to researchers. The writer Ian McEwan’s emails, although technically part of his collection at the Harry Ransom Center, in Texas, have not been processed, because of “challenges in capacity,” a spokesperson told me. The archive of the poet Wendy Cope reportedly contains a trove of emails, but they are also not yet ready for the public and still need to undergo sensitivity review, McKean said. Recently, I visited NYU to examine the activist, artist, and onetime Andy Warhol acolyte Jeremy Ayers’s files, which include a collection of his emails and an archive of his Facebook account. The public description of the Ayers collection hinted at a labyrinth of insights into the late stage of his career, when he photographed scenes from Occupy Wall Street—the kind of deep look into an artist’s process and social calendar that would have been unthinkable a few decades ago. But my requests to view both his emails and his Facebook account were denied; an archivist had not yet reviewed the records for sensitivity. For now, until Ayers’s digital files are fully processed, which could take a while, the archive promises more access than it can deliver.