deborah ([personal profile] deborah) wrote 2005-01-13 07:07 pm

Permanent archival of online news sites

When I did my first master's degree, I learned a basic principle: all postbaccalaureate work should be publication quality. I've tried to stick to that, at least in papers with some creativity involved (as opposed to just writing to a set assignment; the student can't be expected to do graduate-level work with an undergraduate-level assignment).

When I began this independent study project, I was extremely excited about the idea of publication. My original goal was to make a publishable academic paper as well as a public product: a website which informed news consumers of the archival policies of given news agencies. Unfortunately, I don't think I made a publishable paper. I made a number of mistakes:
  • To start with, this is my first social science research paper. I have neither the format nor the style down pat, and I think I alternate between too casual and too aggressively academic, without ever finding a comfortable medium.
  • I failed to get all of the release forms I would need to get this information published.
  • I presented my survey badly, both in the e-mail and newsliblog requests, and in the lack of a click-through explanation and release form. As a result, I got scarcely any replies.
  • I'm sure I screwed up something else. Or many somethings else.
Nonetheless, I think my findings were important, and I'd like to share them. In a nutshell, I found that online news agencies don't make permanent archives or even limited snapshots of their layout and design, and in extreme cases, don't even make permanent archives of online-only content. The first part of this is a difficult problem. It is still unsolved how best to make snapshots of something as deep, multi-layered, and dynamic as a news website. But the second problem is merely a matter of culture (print newspapers have librarians while webmasters live in the present), and needs to be fixed before an important part of the historical record is lost for good. An important part of the historical record is already lost for good; can we reverse the trend?

Abstract:

In the digital age, nearly all print newspapers of a substantial size have a corresponding website which is used to publish news online, often on a more frequent update cycle than the print newspaper. In recent years news consumers who formerly depended on print newspapers increasingly turn to the print newspapers' corresponding websites for up-to-the-minute news which is updated constantly. Some news websites are built to parallel print news resources, with near-static pages and long-term information preservation policies similar to those of any newspaper. Other news websites have evolved from this print-like model toward a more dynamic or portal-like model, in which information is updated as necessary without permanent storage of the various incarnations of a site. This study examines the different archival methods used by print newspapers and their online sister publications.

This study investigated several news organizations with both print and online newspapers. The study was designed first to determine differences between the print and online newspapers: how much did the content and layout differ between the print and online versions? Secondly, the study was designed to determine the archival policies used by the print and online newspapers. Although the online medium differs from the print medium so substantially that identical archival procedures will not work for both, the study investigated whether the online medium was archived with parallel procedures which would serve purposes similar to those of the print archival. The study found substantial differences between print and online newspapers in both content and layout, and found that print newspapers are substantially more completely archived than online newspapers.

Online newspapers vary greatly in their differences from their sister print publications. Online newspapers have one thing in common, however: they do not have archival processes that parallel those used by their sister print publications. The print newspapers surveyed all have libraries which keep complete and permanent microfilm and digital archives of at least one edition of each day's newspaper. None of the online newspapers surveyed keeps snapshots of a substantial portion of the site layout throughout the day, and many of the online newspapers surveyed do not send online-only content to the print publication's library. Additionally, the permanent archival processes the online newspapers use rely on technology designed for business infrastructural needs such as legal requirements and backup storage, and not for library-oriented policies; there is no process or policy guarantee that these archives will remain permanent or unmodified. None of the newspapers surveyed employs a digital librarian either at the print newspaper's library or at the online newspaper.


Survey of the Archival Methods for Print and Web Newspapers [RTF]