What Will We Archive in the Future?

For a Glimpse of Library Collections to Come, Check Out the Fascinating Project to Document Born-Digital Material on Global Health Events


Colorful AIDS education posters from the 1980s. Black-and-white photos of mid-20th-century anatomy lessons for midwives. Eighteenth-century instructions for the administration of patent medicines. While a paper archival collection in the U.S. National Library of Medicine might contain items like these—handwritten or typed journals, correspondence, educational materials, and official reports, some digitized many years after their creation—the next generation of health information lives online.

That’s why the NLM’s two-year-old Global Health Events collection archives born-digital material—webpages, blog posts, social media streams—published during outbreaks and other health crises. The archive includes items like blog posts from doctors in the field, tweets from the Centers for Disease Control and Prevention, and situation reports from international organizations. In the Global Health Events collection, you can read a cached post on the Doctors Without Borders blog, written by Liberian clinic staffer Amie Subah in February 2015, in which she describes the social stigma she faced after surviving Ebola; tweets posted to the hashtag #NepalEarthquake in May 2015; and the March of Dimes’ advice to parents concerned about Zika in February 2016.

The library’s paper collections are typically centered on people or organizations, and offer incidental perspective on any epidemics or outbreaks those people or organizations might have experienced in their lifetimes. (Christie Moffatt, the archivist in the library’s Digital Manuscripts Program who serves as the point person for the Global Health Events collecting project, cites the papers of former Surgeon General C. Everett Koop and the AIDS crisis of the 1980s as an example.) In making new collections of born-digital material, Moffatt and her fellow archivists can make the decision to center their collecting around an event or a theme instead. So the Global Health Events collection contains material on Ebola and Zika, as well as the 2015 earthquake in Nepal.

The web is huge, and there’s a lot of information out there. How do archivists decide when to start collecting links, and which links to save? The NLM team started collecting records for Ebola months after the initial epidemic began. The group learned from that experience and decided to start accumulating digital material whenever the World Health Organization declares a Public Health Emergency of International Concern. The library’s Disaster Information Management Research Center, which maintains pages with links to official information from international organizations and major authorized social media feeds, provides a starting point for the archivists to find official information. The archivists use these links to push outward and begin collecting items like blog posts written by practitioners working in the field—the digital equivalent of the paper journal that a 19th-century doctor might have kept during an outbreak. Using the Internet Archive’s service Archive-It, the team captures a link, and decides how often the software should return to re-crawl the page and save new versions. The result is a collection that’s been fairly selectively curated by humans, with a vision for what’s worth saving.

Captured official material, like pages of the WHO’s website, gives one perspective on the way information about an outbreak spreads, letting us see how organizations have chosen to word their warnings and advice to people visiting their websites or social media feeds looking for information. But what about rumors, innuendo, and false information, the fog that social media is so good at spreading during a breaking news event? I told Moffatt that, to me, that stuff was almost more interesting than the official record. She agreed, saying, “This is all just part of the story.” While being careful to make sure that people browsing the archive know that the information in a given saved link is not necessarily correct, the team makes a point of saving links that show how muddled facts can get when traveling online.

Moffatt pointed me to a couple of links the NLM team has saved that offer a peek at that kind of shaky information. In 2014, the Food and Drug Administration published a warning it had issued to an entrepreneur peddling a cure for Ebola; the NLM captured that warning, which contained details about Natural Solutions Foundation’s pitch to consumers. (“Nano Silver is the world’s only hope against Ebola and the other antibiotics/antiviral resistant pathogens.”) To illustrate public perception of the way Ebola news was spreading, the team grabbed an Oct. 7, 2014, New Yorker Borowitz Report column (“Man Infected With Ebola Misinformation Through Casual Contact With Cable News”). And the team saved a February 2016 Huffington Post debunking of a claim that Monsanto is responsible for Zika.

By setting Archive-It to periodically crawl hashtags like #Ebola and #EbolaResponse, the archivists hope to capture some of the picture of the way information spreads over time. In the future, researchers might use those saved tweets to carry out projects like one described in a 2014 paper, in which a team of researchers at Virginia Tech used tweets sent during the Ebola crisis to map the spread of disinformation, tracking rumors like “Ebola vaccine only works on white people” and “The new iPhone 6 is infecting people with Ebola.” In the decades to come, people interested in the way Zika news worked its way through social networks might tap the NLM’s archived tweets for similar studies.

One of the most interesting tasks the team faces is imagining what uses future researchers might make of this information, and adjusting what they collect accordingly. To some degree, this mission is impossible. “We see so many examples now of collections of digital materials that have uses you never imagined,” Moffatt told me, referring to a project that looks at old ships’ logs to study changing weather patterns. “We try to collect as broadly as possible and as many perspectives as possible.” Who knows what future historians of medicine and public health may need?