real preservation
I've been getting increasingly concerned about what I see as a too-shallow view of sustainability in digital preservation. There's been a lot of lip service paid over the last few years to preservation, and I have certainly heard talks by grant-funding agencies in which they explained that they are now only funding grants which have sustainability written into the grant structure. Yet time and time again, I see soft money being awarded to projects for which the project administrators clearly have only the vaguest idea of what sustainability really means in a software environment.
I don't see this as anyone's fault, mind you. Software developers and IT folks aren't used to thinking of software projects in terms of Permanence. In the traditional software world, the only way something is going to be around forever is if it's going to be used all that time -- for example, a financial application which is in constant use needs to be constantly up. But archival digital preservation has a very different sense of permanence. For us, permanence might mean that you build a digital archival collection once, don't touch its content again for 10 years, but can still discover all of its preserved content at the end of those 10 years.
Meanwhile, in Internet time, a project which has been around for two years is clearly well past its prime and ready to be retired.
Repository managers are putting all of this great work into the repository layer* of preservation: handles and DOIs, PRESERV and PRONOM, JHOVE and audit trails and the RLG checklist. But meanwhile, all of these collections of digital objects -- many of them funded by limited-duration soft money -- are running on operating systems which will need to be upgraded and patched as time passes, on hardware which will need to be upgraded and repaired as time passes, on networks which require maintenance. Software requires sustenance and maintenance, and no project can succeed as perpetual preservation unless it takes into account that such maintenance requires skilled technical people in perpetuity. Real sustainability means commitment from and communication with the programmers and sysadmins. It requires that the techies understand an archivist's notion of "permanence", and that the librarians and archivists (and grant agencies) understand that a computer needs more than electricity to keep running -- it needs regular care and feeding.
(This, by the way, is one of the reasons I'm so excited by the OTW Archive of One's Own and the Transformative Works and Cultures journal. The individuals responsible for the archive and the journal *do* have a real understanding of and commitment to permanence down to the hardware and network provider level. Admittedly, it's a volunteer-run, donation-supported organization, so its sustainability is an open question. But it's a question the OTW Board is wholeheartedly investigating, because they understand its importance.)
*I'm somewhat tempted to make an archival model of preservation that follows the layered structure of the OSI model of network communication. Collection policy layer, Accession layer, Content layer, Descriptive Metadata layer, Preservation Metadata layer, Application layer, Operating System layer, Hardware layer. Then you could make sure any new preservation project has all of those checkboxes ticked. Sort of an uber-simplification of the RLG Checklist, in a nice, nerd-friendly format.
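For the nerds in the audience, here is a rough sketch of what that checkbox model might look like in Python. The layer names come straight from the list above; the function name `unticked_layers` and the example project are made up purely for illustration, and this is nowhere near a substitute for the RLG Checklist itself.

```python
from typing import Dict, List

# Layer names taken from the footnote, ordered from policy down to hardware.
PRESERVATION_LAYERS: List[str] = [
    "Collection policy",
    "Accession",
    "Content",
    "Descriptive metadata",
    "Preservation metadata",
    "Application",
    "Operating system",
    "Hardware",
]


def unticked_layers(plan: Dict[str, bool]) -> List[str]:
    """Return the layers a project plan has not addressed, in stack order.

    A layer that is missing from the plan counts as unticked.
    """
    return [layer for layer in PRESERVATION_LAYERS if not plan.get(layer, False)]


# Hypothetical grant-funded project: strong on the repository side,
# silent about who patches the OS or replaces the hardware.
example_plan = {
    "Collection policy": True,
    "Accession": True,
    "Content": True,
    "Descriptive metadata": True,
    "Preservation metadata": True,
    "Application": True,
}

print(unticked_layers(example_plan))
```

The point of returning the gaps in stack order is that the bottom of the list is exactly where soft-money projects tend to go quiet.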
no subject
... deep breath...
THEY NEED TO HONOR THAT COMMITMENT.
In other words, yes, a lot of people Just Don't Get It, and I am sad to say that that phenomenon isn't necessarily limited to techies, either.
no subject
no subject
Word. I am just now learning the pain of trying to ask software vendors if their system has provisions for exporting files for back-up purposes. "It only saves in proprietary formats" = bad answer. (I won't even start on the unsuccessful attempts to get the people whose files I'm supposed to be organizing onboard with records management software as well as document management software. "But this application does all sorts of things we don't need." Me: "Most applications that provide for version control have RM features. Scheduling is good! Scheduling would mean that four-year-old files of news clippings aren't cluttering up every corner of your shared drive!").
no subject
"But we backed up six months ago; we couldn't possibly replicate that."
Them: "No."
"So you wrote a tool that backs up but doesn't restore."
Them: "...Yes."
no subject
no subject
And it's all safe in there! All nice and recorded, and ready to hand over to opposing counsel in case of a lawsuit! And that's the real purpose of backup software, right? CYA for legal reasons, not for actual access of the files.
You have a whole filing cabinet for that. And besides, didn't you put it all on discs before you backed it up?
no subject
Nightmares. Twitchy screaming nightmares with slimy poisonous tentacles of DOOM.
Surrounded by lawyers who ask us for "searchable tifs." And who seem to think we keep databases of every project we've ever done, complete with the name of the case, the custodians involved, and cross-referenced with relevant legal precedents. (We, umm, printed the stuff on the disc. And then we gave the disc back. We kept a digital copy. It's called "_Disc [clientname] 09 15 07.")
Sometime soon, there's gonna be some huge industry-spanning lawsuit across a dozen cities or law firms or megacorps (at some level, those are interchangeable concepts) who are supposed to be keeping "digital archives" but what actually happened is "umm, our server backed up the hard drives, didn't it?"
no subject
I mean, yes, technically you could write an OCR tool that is tied into the tiff, and some cool technology relies on that, but gah.
And yes, totally on the lawsuit front. We're all so *disorganized* in our organization. And we're the professionals!
no subject
Nope, she insisted, the last place she'd sent this work to had returned "searchable tifs."
We handed her off to the sales rep and said "you speak business hype; find out what she actually wants before we start telling her what we think of her request."
no subject
(I work preservation and archiving at a university library. We're in the middle of building a digital archive of stuff related to the university, and eeeeek some of the ways stuff is being archived...)
So this post was a total moment of fandom + RL collision.
Thanks for the grin, and the wise words!
no subject
And yeah, these issues are hitting everyone now.
Thanks for stopping by!
no subject
Back in the '90s, as a commemoration of the 900th anniversary of the Domesday Book, the government had a bright idea. They would get schools to contribute to a modern Domesday book which would be sent out to every school. And so it was done.
Every school in the UK received a copy of the modern Domesday book on laserdisc, whether they had a laserdisc player or not. Our school has no idea what they did with their copy.
At least for the millennium they issued real books on paper.
no subject
I think we're doing a better job these days with understanding hardware and software format migration, but we still aren't understanding platform persistence. That is, the archivists understand microfilm and CDs and magtape et al. But the idea that *software* needs to change? Operating Systems? Never.