deborah | Entries tagged with metadata

Bryant Johnson is a records manager in the clerk's office of the District of Columbia's federal court by day, and a personal trainer to Ruth Bader Ginsburg and Elena Kagan by night.

"Hot Mess", children's author Zetta Elliott's post about trauma (particularly African-American trauma) in children's books, estrangement, privilege, and race in the industry.

I read a lot of science blogs, and I'm always excited to see archives show up in science blogs. Here's one we probably wouldn't have expected: "Wormholes in old books preserve the history of insects". An evolutionary biologist from Penn State has used insect holes in prints made from old wood blocks to study the spread of particular wood-boring beetles. The prints, rather than the blocks themselves, show an accurate timestamp of when the beetles emerged and where, because the texts usually record when and where they were printed. The power of old metadata, people!

In "Descartes Letter Found, Therefore It Is", I learned that a long-lost stolen letter of Descartes' has turned up in my alma mater's archives:

If old-fashioned larceny was responsible for the document’s loss, advanced digital technology can be credited for its rediscovery. Erik-Jan Bos, a philosophy scholar at Utrecht University in the Netherlands who is helping to edit a new edition of Descartes’s correspondence, said that during a late-night session browsing the Internet he noticed a reference to Descartes in a description of the manuscript collection at Haverford College in Pennsylvania. He contacted John Anderies, the head of special collections at Haverford, who sent him a scan of the letter.
...
Scholars have known of the letter’s existence for more than 300 years, but not its contents. Apparently the only person who had really studied it was a Haverford undergraduate who spent a semester writing a paper about the letter in 1979. (Mr. Bos called the paper “a truly fine piece of work.”)

Guys, this is awesome. This is why I do what I do! Putting collection guides online is a royal pain (ASK ME HOW I FEEL ABOUT THE EAD STANDARD), but this is the kind of story that makes it all worthwhile. Archival collections are full of hidden treasures the archivists themselves don't know about. It takes a dedicated scholar to find these lost and hidden (and rarely digitized) gems, and digital collection guides, followed up by e-reference, followed up by spot digitization, solved the puzzle.

Viva la Ford!

On a more somber note, from "Why diversity matters (the meritocracy business)":

Now, whenever I screen resumes, I ask the recruiter to black out any demographic information from the resume itself: name, age, gender, country of origin. The first time I did this experiment, I felt a strange feeling of vertigo while reading the resume. “Who is this guy?” I had a hard time forming a visual image, which made it harder to try and compare each candidate to the successful people I’d worked with in the past. It was an uncomfortable feeling, which instantly revealed just how much I’d been relying on surface qualities when screening resumes before – even when I thought I was being 100% meritocratic. And, much to my surprise (and embarrassment), the kinds of people I started phone-screening changed immediately.

All these papers will eventually be available in the Open Repositories 2008 conference repository. I'm linking to all of the placeholders; papers should be up soon.

This will be very limited liveblogging, because I'm typing in the conference and dictating between sessions, so I can't say much. Hopefully I'll get some good fodder for my upcoming sustainability post.

Keynote:

( Repositories for Scientific Data, Peter Murray-Rust )

Session 1 – Web 2.0

( Adding Discovery to Scholarly Search: Enhancing Institutional Repositories with OpenID and Connotea, Ian Mulvany, David Kane )

( The margins of scholarship: repositories, Web 2.0 and scholarly practice, Richard Davis )

( Rich Tags: Cross-Repository Browsing, Daniel Smith, Joe Lambert, mc schraefel )

Ow. I'm not doing this for the next session. I can blog at the breaks.

Just like the other times I've seen these gentlemen speak, I found this panel -- and Carl Lagoze in particular -- to be less provocative than frustrating. I never get the sense that Lagoze is listening to anyone in the room, even though he pays more lip service than any other panelist I've ever heard to saying 'talk to me, tell me what I'm doing wrong.' But he takes all questions about his process as attacks and perforce responds angrily and defensively. Lynch, on the other hand, is friendly, but only wants to communicate with people he knows. It's hard for me to get past the aggressive, defensive mode of communication to see potential value in the ORE effort, but I'm trying.

( Overview of OAI-ORE )

( JISC )

( OAI-ORE in chemistry and further discussion )

Ultimately I think they're solving a real but tiny problem, and I wish the effort were being spent on solving realler, bigger problems.

Okay, folks, I need your help. I am currently getting soaked in a brainstorm, and I'd like to get this idea down before I lose the details. But since this is a brainstorm, it might make no sense at all. Tell me if what I'm talking about is an incredibly stupid idea that will never work. Alternately, tell me if what I'm suggesting is ridiculously common, and everybody does it this way already, and how could I not have noticed?

The two-part problem:

1. As we investigate products for digital asset management in the library, it's extremely likely that no one product will solve all of our needs. We will perforce find ourselves with digital resources in a number of different products, and will either need to design a single front end or have to accept a certain amount of user confusion at not knowing which tool holds the resources they need.

2. It's entirely possible that a single asset might be simultaneously part of our institutional repository and yet necessary for our learning management software, or similarly dual-purposed. How do these assets get filed? In what product?

My idea: carefully design an institution-specific set of metadata fields for each purpose. One indicating institutional repository, for example, and another indicating learning management. Assign as many of these metadata fields as necessary to each asset, no matter what product the asset is stored in. Store the asset in a product which is best suited for that asset-type. Then, using some kind of harvesting (e.g. Z39.50, OAI), harvest the contents of the various products and repositories. Write an institution-specific search mechanism that knows how to search the harvested data for all, say, institutional repository items. Or for all items in the special collections.

This idea of course elides several major problems: designing the metadata; building what is effectively a small-scale federated search tool; deciding the appropriate product for the appropriate kind of asset; submitting assets into a multitude of products, possibly by non-librarian users such as faculty members and students. But is there any meat to this idea?
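To make the brainstorm a little more concrete, here is a minimal sketch of the harvest-then-filter idea, assuming Python 3.9 or later. Everything in it is hypothetical: the product names, the purpose tags, and the record layout are invented for illustration, and the `harvest` function merely stands in for a real OAI or Z39.50 harvest rather than speaking either protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """One harvested record. Every field name here is hypothetical."""
    identifier: str
    title: str
    product: str  # the product the asset is actually stored in
    purposes: set[str] = field(default_factory=set)  # institution-specific purpose tags


def harvest(products: dict[str, list[dict]]) -> list[Asset]:
    """Stand-in for a real harvest: flatten every product's records into one index."""
    index = []
    for product_name, records in products.items():
        for record in records:
            index.append(Asset(
                identifier=record["id"],
                title=record["title"],
                product=product_name,
                purposes=set(record.get("purposes", [])),
            ))
    return index


def search(index: list[Asset], purpose: str, keyword: str = "") -> list[Asset]:
    """Return assets tagged with `purpose`, optionally narrowed by a title keyword."""
    keyword = keyword.lower()
    return [a for a in index if purpose in a.purposes and keyword in a.title.lower()]


# Invented sample data: two products, one asset serving two purposes at once.
products = {
    "image_manager": [
        {"id": "img-1", "title": "Campus map, 1902", "purposes": ["special_collections"]},
    ],
    "document_store": [
        {"id": "doc-1", "title": "Faculty thesis on beetles",
         "purposes": ["institutional_repository", "learning_management"]},
        {"id": "doc-2", "title": "Course syllabus", "purposes": ["learning_management"]},
    ],
}

index = harvest(products)
ir_items = search(index, "institutional_repository")
lms_items = search(index, "learning_management")
print([a.identifier for a in ir_items])
print([a.identifier for a in lms_items])
```

The point of the sketch is that the dual-purpose asset lives in exactly one product but shows up in both searches, because the purpose tags travel with the metadata rather than with the storage location. It says nothing about the hard parts listed above, such as designing the tags or getting faculty to apply them.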

In my day job, I'm the local "metadata expert" -- or so they keep calling me, although I will continue to point out that they have a cataloging department, and just because it's got a fancy new computer-based word doesn't mean the catalogers aren't the real metadata experts. But my job entails constantly thinking about how users find information. What metadata fields will end-users want, or be able to use? What metadata fields are important only for technical services? What metadata is used technologically to control rights or object manipulation? Under what circumstances is it appropriate to ignore metadata altogether and just do fulltext keyword searches?

Now I'm volunteering at the Second Life Library. (Or I will be, once I get back in; I've been locked out since the security incident this weekend and I can't get anyone from tech support to call me back. Not a great sign, but I suppose they were hacked, and so they're probably overloaded.)

At the Second Life Library, the virtual space is arranged something like a real library. As avatars move around the space, they may see a shelf of science fiction books in the science-fiction room, of reference books in the reference section, or of Gutenberg Project classics arranged in no particular order. Some of these books are portals which will open up a page on your web browser, outside of Second Life. Others will hand you a set of note cards you can read in-world which contain the text of the book, and still others (more clever, but extremely clunky and difficult to use) appear as enormous larger-than-avatar books an avatar can actually read in-world. And how do the users find these books? Well, they wander around and browse, or ask a librarian.

In other words, a collection of electronic texts is made available through one portal (the library building), and in order to find them, the patron wanders around a virtual space, browsing. (In the long run, I think it would be a good idea for the library to provide a list at the front door of all of the electronic texts made available at the library, with either hyperlinks or teleports directly on the list. And now that I'm thinking about it, it would be truly awesome if that list in-world appeared to be an old-fashioned card catalog -- with direct keyword searching, of course, but still looking like a card catalog.)

Do you see what I'm getting at? The idea is that the traditional experience of walking around the library building -- even for those users who are so much into computer worlds that they spend their days in a virtual environment and would rather go to the Second Life Library than to their local library -- is in some cases preferable to the much simpler and faster direct-access search. In some ways, the look of the virtual space is the metadata: science-fiction books are behind that display of planets; reference materials are on the shelf by the reference desk.

All of us involved with the Second Life Library really hope it works out. But I will be really curious to see whether this model is currently only appealing because of its novelty. Maybe the experience of browsing through a physical space, looking for displays and book covers that catch the eye, is one that people really genuinely want.

Welcome to the William Gibson world.

( Carl Lagoze, on metadata aggregation and the NSDL experience )

Just a quick note: I have a big girly crush on Brewster Kahle, and he's not even here.

( Opening Plenary on getting books online )

( Interoperability panel )

( Jonathan Zittrain on privacy )

( Joanne Kaczmarek on the RLG Audit checklist )

This isn't every presentation that I liked, but most of the others I enjoyed were displays of clever software products, hardware display, or metadata tools (though I am fascinated by the project of "Exploring Erotics in Emily Dickinson’s Correspondence with Text Mining and Visual Interfaces") and I'm not sure how much there is to blog on them.

Oh, also, to my fellow presenters. If you are going to do a demo, get some capture software and make a video of yourself doing the demo. You're all either computer or library professionals, and should know better than to trust internet connections, computers, and A/V systems to work on demand. The demos that were pre-recorded went smoothly, and for many of the live demos we lost any real understanding of the software because you got hung up on the failing demo.

Dorothea wrote a scathing piece on some of the problems in electronic cataloguing that I was going to respond to, but I realised my response was more of a spinoff than a reply, so it'll be here instead.

Caveat: I've been absorbed in schoolwork. I have not been following the myriad projects combining technology and cataloging. It's entirely possible that the rant I'm about to make is Old News, years old. I know there are other DTDs out there I haven't investigated.

Note for the non-librarians: MARC -- Machine Readable Cataloguing -- is a 30-year-old format which encodes bibliographic information in a way that can be read by computer. The step in evolution before MARC was the card catalogue, so at the time it was a massive advance. But it can get a little fuggly.

When I took my Cataloging class, I decided to do my term project on MARCXML. MARC, in my techie's opinion, was a cute work-around from less technological days, but clearly outdated in this day and age. I was jazzed by the notion of using the powers of XML to do an intelligent and flexible encoding of cataloging data. I imagined something like this:

( MARCXML of my dreams )
That would break up each element into a completely machine parseable entity, ready for display in MARC format. Perfect! Handy, useful, easy to convert existing MARC records into XML and XML records back into MARC or any other format. Instead, here's what the Library of Congress schema actually calls for:

( MARCXML of sad reality )
What's wrong with this encoding? Let me count the ways. Firstly, the fact that MARC, while machine-readable, is not particularly human readable, is a side effect of technological limitations which are no longer in place. Once upon a time it made sense to name your machine-readable fields "100" with single-letter subfield codes. Now is not that time. For goodness' sake, it's a positive abuse of XML to have a number of data fields (all named "datafield") tagged with the MARC number. And you kept the meaningful spacing and punctuation. By the ghost of S. R. Ranganathan, people, this is not Fortran.

Give each field type (authority, title, etc) its own field, to start. It's readable, it's portable, and there's no reason not to.
Name the fields. Please. Please. Don't name your field "245". Name it "Title". Readable code is everything.
Under no circumstances should the meaningful spacing and punctuation exist. What on earth is the point of converting to XML if you're not going to take advantage of its power? A field for title and one for subtitle; a field for birth date and for death date. Use the tool you're in. You like the way MARC looks? Fine, write an XML to MARC converter and you can view the MARC to your heart's delight. But store your data in the extensible, human-readable, portable database. Please.
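For readers who haven't seen the two styles side by side, here is a rough sketch of the kind of conversion these three points argue for, written with Python's standard `xml.etree.ElementTree`. The `datafield`/`subfield` elements, the `tag` and `code` attributes, and the namespace follow the Library of Congress MARCXML schema; everything else is an assumption made for illustration: the sample record is invented, the readable element names in `FIELD_NAMES` are hypothetical and not part of any standard, and `strip_isbd` is a deliberately naive punctuation cleaner.

```python
import xml.etree.ElementTree as ET

MARC_NS = "{http://www.loc.gov/MARC21/slim}"

# Hypothetical mapping from (MARC tag, subfield code) to a readable element name.
FIELD_NAMES = {
    ("100", "a"): "author",
    ("100", "d"): "author_dates",
    ("245", "a"): "title",
    ("245", "b"): "subtitle",
}

# Invented sample record in the numbered-datafield style.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Dickinson, Emily,</subfield>
    <subfield code="d">1830-1886.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Selected letters :</subfield>
    <subfield code="b">a hypothetical example /</subfield>
  </datafield>
</record>"""


def strip_isbd(text):
    """Naively drop trailing punctuation that MARC stores inside the data.

    This would also eat a legitimate final period (for example after an
    initial), so a real converter would need field-aware rules.
    """
    return text.strip().rstrip(" :/,.;").strip()


def to_named_record(marcxml):
    """Rewrite numbered datafields as named elements, skipping unmapped ones."""
    source = ET.fromstring(marcxml)
    named = ET.Element("record")
    for datafield in source.iter(MARC_NS + "datafield"):
        tag = datafield.get("tag")
        for subfield in datafield.iter(MARC_NS + "subfield"):
            name = FIELD_NAMES.get((tag, subfield.get("code")))
            if name is None:
                continue  # this sketch silently ignores fields it has no name for
            ET.SubElement(named, name).text = strip_isbd(subfield.text or "")
    return named


record = to_named_record(SAMPLE)
print(ET.tostring(record, encoding="unicode"))
```

Splitting `author_dates` further into separate birth and death elements would be the obvious next step, and going the other way (named elements back to numbered MARC fields) is the same mapping read in reverse, plus the work of putting the punctuation back.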

Ach, this project made me cry.

As I approach the end of library school, I am overwhelmed with the projects I don't have time to even investigate.

Boatloads of open-source cataloguing projects
Metadata initiatives out the wazoo
Open Access initiatives

All important, exciting, and extremely interesting to me. Not to mention that I don't have time to look into all the personal projects that I'd like to work on: bar-code scanning and cataloging my own book collection; writing the database to catalog our music collection so we can easily write tools to generate playlists and archive mixes we've made (even if we still only own analog copies of the songs); creating a comprehensive database of reference sources with annotations which can be used to generate bookmark files or pathfinders.

With luck, having this space to talk about these issues will help me find focus.

Current Mood: sore hands. why am i typing?



Ramblings on Librarianship, Technology, and Academia

The Australasian Journal of Me


3 links: records management & exercise; children's lit & race; damaged archival documents & science

(1) digital discovery (2) meritocratic myth

Open Repositories 2008, part 1.

jcdl post 3: The OAI-ORE Effort: Progress, Challenges, Synergies

k-federated gets divorced!

No wetware in the stacks! (Cataloguing and the virtual environment)

what's a DL for?

JCDL detailed panel notes

xmlmarc ranting

Welcome to my semi-public life
