Along with Anne Sauer and Eliot Wilczek, I've just had a new paper published: "Archival Description in OAI-ORE", in the Journal of Digital Information, a free, green open access journal. This is a version of a paper which we presented last year at Open Repositories 2010, and, mercifully, it has been greatly improved since the draft I wrote while running a temperature of 102°.

This paper, by the way, is our attempt to COMPLETELY REVOLUTIONIZE ARCHIVES AND CHANGE THE LAWS OF PHYSICS. Sort of. Revolutionize archival description using new technology, anyway. Changing the laws of physics will have to wait until we get grant funding.
You know it's a good conference when you've sent your coworkers countless caffeine-fueled e-mail messages that read I AM SO BRILLIANT LOOK AT MY BRILLIANT IDEA or WE ARE SO STUPID WHY DIDN'T WE THINK OF THIS BRILLIANT THING THAT EVERYBODY ELSE IS DOING. Or when you've made a blog post illustrating academic repositories as toddlers playing with a toy truck. I strongly suspect my coworkers wish I would lay off the café con leche already.

I have this crazy WordPad document open full of notes and links, and I can't figure out which of them are e-mails to specific departments, which of them are notes for myself for further investigation, and which of them are totally awesome bloggable exciting links to share with YOU, my loyal readers.

trying not to ramble too much outside of cut: scientific workflows, datasets, faculty information systems

I see two overarching themes of the conference. The first is Interoperability Is the One True Religion. No silo-like repository can solve everybody's problems. We are interdisciplinary and inter-institution, and we won't solve any problems unless our resources and data can be used by other tools, other resources, other datasets, etc. The second theme I see is Duraspace Helps Those Who Help Themselves. This is open-source software, and we all need to pitch in, and everything is going to be perfect in a modular happy world where everyone writes the tools they want and shares them in an open source community.
Sustainability and Revenue Models for Online Academic Resources is a new report by Ithaka which "sets forth a systematic understanding of the mechanisms for pursuing sustainability in not-for-profit projects". They say some very smart things, including "Assuming that grant funding will always be available is not likely to lead to a successful sustainability plan." and "Project leaders need to adopt a more comprehensive definition of ‘sustainability’.... It is incorrect to assume that, once the initial digitisation effort is finished and content is up on the web, the costs of maintaining a resource will drop to zero or nearly zero." (Emphasis mine.) They say some other things which I don't exactly disagree with but which I think need to be carefully defined, such as "The value of a project is quantified by the benefits it creates for users". In an archives world, the value a project creates for users might be "long term preservation of rarely accessed materials to benefit the global scholarly community". (At Open Repositories 2008, I heard a lot of conversations and presentations where people assumed that digital resources which weren't being heavily used had no value. As an archivist, I say them nay -- much of what we are preserving we are preserving for the future.)

But in any case, I read the report thinking "that's just what I've been saying". I'm thrilled that major reports are coming out discussing these issues.
[Tagged as, among other things, otw, because even though I am dealing with these issues as a professional I think that The Organization for Transformative Works is very well-placed to be one of the few organizations prepared to confront operational preservation from the outset. After all, the OTW has to deal with one even more frightening aspect of operational preservation: it is an entirely volunteer-run organization which promises perpetual preservation. It takes a lot of planning and commitment to be prepared to follow through on a commitment like that. Luckily, the OTW has both.]

Introductory thoughts on Operational Preservation

I would love to get comments from the community on this, because I truly believe that this could be a very useful model for organizations designing digitization projects. I know I'm going to prompt my institution to follow this matrix for all new digitization efforts.

Problem Statement: When an archivist deposits material in a digital archive, he or she often assumes that the object is preserved in perpetuity, just as it would be were it a physical object. Depositors of digital material often have the same assumptions, as do institutional administrators. However, the software development and maintenance community does not assume permanence on the same scale on which archivists are accustomed to providing it. Moreover, administrators (and archivists) often have unrealistic assumptions about the labor and costs involved in daily operational maintenance to provide digital preservation, which are -- if not higher -- certainly different from the operational maintenance costs for providing physical preservation. Even worse, many digital preservation projects are funded by limited-duration soft money instead of out of an operational budget.

Or, in a nutshell, we need to remember that digital preservation has an ongoing operational cost which cannot be provided within the archive.

Operational Preservation: To that end, I am proposing this matrix for new preservation and archival projects to see if they have thought of the requirements necessary for permanent preservation.

Anything calling itself a digital preservation project has to be prepared, in perpetuity, to provide all items down the left-hand column for all of the items in the top row. Funding is really a redundant item -- by "Labor", I mean funding for staff to provide all of the work involved, and "Physical facility" is really something which can be provided by funding -- but the fact that digital preservation requires ongoing operational money is too important to ignore. By "Bureaucratic support" I mean policies and procedures in place which support the operational business of preservation at an organizational level.

Operational Preservation Matrix

Task | Labor | Physical facility | Bureaucratic support | Funding
Existence of the datastream in a file system or database | . | . | . | .
Object access via handle/doi/uri | . | . | . | .
Maintenance, repair, and upgrade of hardware (server, disk, etc.) | . | . | . | .
Maintenance, patching, and upgrade of operating system | . | . | . | .
(The following tasks are not as essential, but still very important)
Rolling forward file formats | . | . | . | .
Transferring data to more modern repository and software tools when appropriate | . | . | . | .
Modernizing user interface as appropriate | . | . | . | .

(Of course, traditional preservation of physical objects is also an ongoing operational cost. Physical objects require extensive physical facilities with narrow environmental limitations, they require re-housing and repair, they require maintenance and supervision. But these ongoing operational tasks can be performed by archivists with traditional skills. The technological operational tasks of a digital archive often can't be performed even by technologically-trained archivists, because the institution will have specific requirements about who is able to, say, maintain the network.)
All these papers will eventually be available in the Open Repositories 2008 conference repository. I'm linking to all of the placeholders; papers should be up soon.

This will be very limited liveblogging, because I'm typing in the conference and dictating between sessions, so I can't say much. Hopefully I'll get some good fodder for my upcoming sustainability post.


Repositories for Scientific Data, Peter Murray-Rust

Session 1 – Web 2.0

Adding Discovery to Scholarly Search: Enhancing Institutional Repositories with OpenID and Connotea, Ian Mulvany, David Kane

The margins of scholarship: repositories, Web 2.0 and scholarly practice, Richard Davis

Rich Tags: Cross-Repository Browsing, Daniel Smith, Joe Lambert, mc schraefel

Ow. I'm not doing this for the next session. I can blog at the breaks.
