deborah (deborah) wrote,
@ 2006-06-13 01:56 pm UTC
Entry tags: conferences, conferences: jcdl, digitization, interoperability, metadata, preservation, privacy
Opening Plenary on getting books online
Daniel Clancy from Google Print made an interesting point on paternalistic, nationalistic colonization of information. He warned against believing we know how to fix things for other countries. (Ironically, he also referred to languages written in non-Latin scripts, for the purposes of OCR, as "obscure languages". A slip of the tongue, I'm sure, but a tacky one.) I believe he was also the one who said "How can we determine whether or not these books are of preservation quality? There's no standards or agreement about what preservation means!" to which I can only respond, and I cannot believe these characters are passing through my laptop, O RLY?
David Ferriero from the NYPL pointed out that of the Google Print partners, only 49% of their holdings are in English.
Daniel Greenstein from the OAI said most of what I found interesting:
- It's too costly to determine the status of orphaned works and other post-1923 potentially public domain texts, and it's easier just to assume that everything post-1923 is not in the public domain.
- We shouldn't be thinking of digitizing as preservation, we should be thinking of it as collection management.
- Most of the funders of these projects are interested in works written in Latin scripts, so that's where the effort is spent.
This panel was quite interesting. It presented the results of the working group that's developing Pathways Core.
- Herbert Van de Sompel said we need to start thinking of repositories as active nodes contributing to a global network rather than local nodes passively waiting for external discovery.
- He pointed out that Dublin Core has been shown to be inadequate for most users (though, given that all the panelists stressed that they're developing full datastream/object interoperability standards rather than new metadata standards, and that the metadata fields required are subject-specific, he didn't say if they're addressing this issue).
- I see what they mean about the weakness of a URL as a means to transmit a (potentially-complex) object.
- The model should, in theory, allow selective data harvesting to build data mashups. Could be very useful in cross-field studies such as bioinformatics.
- Good questions included asking what this standard will do that differs from SRB, and asking why this effort will succeed now when similar efforts have always failed.
Jonathan Zittrain on privacy
Jonathan Zittrain is a fantastic presenter. He's such a good presenter that I can barely pay attention to what he's saying because I'm so caught up in his style, which is an interesting observation on pedagogy, now that I think of it. Notes I made:
- There *are* technological solutions to privacy and dissemination; see Omniva. Libraries are generally law-abiding, so the content industry should see us as best friends because we're good setups for DRM. I agree with major caveats, namely that DRM usually leads to licensing rather than purchasing of content, which I think is a financial pit for libraries (cf. the crisis in scholarly communication). I find that content-providers tend to do this in ways that violate LOCKSS, but he thinks it doesn't need to be that way.
- "Talk about protest -- Somebody destroys the Constitution." Well, if I needed a reminder of my priorities, my surge of nausea when he said that provided one.
- If we provide information via links, then we're allowing the content holder to retract or change the information.
- Under the DMCA, libraries can legally hack DRM to see if they want to buy something (though you can't traffic in the DMCA-violating tools which would let you do it).
- Collection development: If there's no real limit on what a digital library can store, then do you still want to discriminate? And libraries do -- "readers' advisory". But I disagree with the hypothesis; there is a real limit on storage. Disk space is cheap, but building a digital collection that's backed up, searchable, checksummed, supplied with good metadata, etc., is not. Also, he misquoted Obi-Wan and attributed the line to Leia, feh. But he points out that librarians claim a lot of value in readers' advisory, and part of it requires context.
- The net has eliminated privacy: Flickr, Riya, Google, Facebook, etc, combined with photo cells makes it trivial to know everything about anyone, anywhere, and to tell the world who you are and where.
- Collective judgement: Wikipedia voting, cyworld, the "we suck" hack of the H/Y game. Wikipedia editing ("Jimbo and his crew"). In the real world, personal transponders coming? Distributed but totally centralised.
- The Jyllands-Posten controversy is one of wp's best moments, as far as preventing the memory hole from winning.
- Are we worried about the memory hole as librarians?
- There will be times when different things are correct. Keep it; let it die; save but encrypt. Case by case.
- Just a thought: that it's illegal to announce game scores unless you're licensed to do so.
- On Brin, for good or ill: Privacy as we know it is dying, partly because of a generational divide. (You might lose your job, but the hiring manager will be from the new generation.)
- The default we have is not to trust people to use ought-to-be-private information well.
- We trust information systems a whole lot. Paul Vixie's spammer blackhole, Google redacting negative comments. And Wikipedia, of course. (All information systems have weaknesses, but the 'net puts them on steroids.)
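To put the storage objection from the collection development note in concrete terms: even the "checksummed" part is recurring work rather than a one-time cost, because someone has to keep re-verifying the files. Here's a minimal sketch of that kind of fixity check in Python. The manifest format (relative path mapped to a SHA-256 digest) and the function names are my own assumptions for illustration, not anything presented at the conference.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(manifest: dict[str, str], root: Path) -> dict[str, str]:
    """Compare each file under root with its recorded digest.

    manifest is a hypothetical mapping of relative path -> hex digest.
    Returns a status per path: "ok", "missing", or "changed".
    """
    report = {}
    for relative_path, recorded in manifest.items():
        target = root / relative_path
        if not target.is_file():
            report[relative_path] = "missing"
        elif sha256_of(target) != recorded:
            report[relative_path] = "changed"
        else:
            report[relative_path] = "ok"
    return report
```

Running `audit` on a schedule is the cheap part. Deciding what to do about each "missing" or "changed" entry (restore from backup, investigate, document) is where the staff time goes, which is the point of the objection.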
Joanne Kaczmarek on the RLG Audit checklist
I was quite disgruntled at the response to this paper. Admittedly she might not have been the best person to present to a highly-technical audience (since she couldn't answer their questions), but the audience response to being asked to consider end-users in design was downright hostile. People asked questions such as "how could untrained librarians be able to give reasonable assessments about what is a usable interface without working with trained computer scientists or psychologists?" Now leaving aside the ludicrousness of assuming that computer scientists know how to make usable interfaces (to quote Chasing Amy, "Bitch, you almost made me laugh."), and the assumption that librarians are untrained in usability, the presenter gave a sad hand-waving answer to the question, on the order of "oh, you're right, but we do what we can."
I wish she'd instead discussed that the RLG checklist was written by people who paid close attention to what usability actually means. Here are some sample excerpts from the checklist:
C1.1 Repository has a definition of its Designated Community/ies—who it is, what its
knowledge base is, what levels of service it expects, etc.
Examples of Designated Community definitions include:
- General English-reading public educated to high school and above, with access to a Web Browser (HTML 4.0 capable).
- For GIS data: GIS researchers—undergraduates and above—having an understanding of the concepts of Geographic data and having access to current (2005, USA) GIS tools/computer software, e.g., ArcInfo (2005).
- Astronomer (undergraduate and above) with access to FITS software such as FITSIO, familiar with astronomical spectrographic instruments....
Repository policies should clearly define access and delivery mechanisms available to its
Designated Communities. Repositories do not have to support any particular type of request;
they just need to state which types of request they can handle (online, batch, on-site, incidental,
programmed or repeated requests—either to be notified when new material of a given type
appears, or automatically receive copies of certain types of material).
In other words, the draft provides a pretty good starting point for assessing usability issues.
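To make those two asks concrete, here's a rough sketch of how a repository might record them as data: who its Designated Communities are, and which request types it says it can handle. Every class, field, and method name here is hypothetical (the checklist prescribes no format), and the request types are just my reading of the list in the excerpt.

```python
from dataclasses import dataclass, field
from enum import Enum


class RequestType(Enum):
    """Request types named in the excerpt (my reading of the list)."""
    ONLINE = "online"
    BATCH = "batch"
    ON_SITE = "on-site"
    INCIDENTAL = "incidental"
    PROGRAMMED_OR_REPEATED = "programmed or repeated"


@dataclass(frozen=True)
class DesignatedCommunity:
    """Who the community is, what it knows, and what tools it has."""
    who: str
    knowledge_base: str
    tools: tuple[str, ...] = ()


@dataclass
class RepositoryPolicy:
    """Hypothetical container for a repository's stated policy."""
    communities: list[DesignatedCommunity] = field(default_factory=list)
    supported_requests: set[RequestType] = field(default_factory=set)

    def gaps(self) -> list[str]:
        """List which of the two statements are still missing."""
        problems = []
        if not self.communities:
            problems.append("no Designated Community defined")
        if not self.supported_requests:
            problems.append("no supported request types stated")
        return problems


# Example based on the GIS definition quoted in the excerpt.
gis = DesignatedCommunity(
    who="GIS researchers, undergraduates and above",
    knowledge_base="concepts of geographic data",
    tools=("ArcInfo (2005)",),
)
policy = RepositoryPolicy(
    communities=[gis],
    supported_requests={RequestType.ONLINE, RequestType.BATCH},
)
```

Note that `gaps()` only checks that something has been stated, which matches the excerpt's wording: repositories don't have to support any particular type of request, they just need to say which ones they handle.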
This isn't every presentation that I liked, but most of the others I enjoyed were displays of clever software products, hardware display, or metadata tools (though I am fascinated by the project of "Exploring Erotics in Emily Dickinson’s Correspondence with Text Mining and Visual Interfaces") and I'm not sure how much there is to blog on them.
Oh, also, to my fellow presenters. If you are going to do a demo, get some capture software and make a video of yourself doing the demo. You're all either computer or library professionals, and should know better than to trust internet connections, computers, and A/V systems to work on demand. The demos that were pre-recorded went smoothly, and for many of the live demos we lost any real understanding of the software because you got hung up on the failing demo.