deborah ([personal profile] deborah) wrote 2010-07-27 03:48 pm

Sometimes the intersection of Libraries and Technology has a 10-car pileup

I'm currently trying to normalize and shift into comma-separated values files the disambiguated name lists created by four different students who don't work for my department, with whom I'm not allowed to communicate, and for whom I'm not allowed to create standard documentation. (Don't ask.) After title casing everything, my current (incomplete) Vim regular expression is: (screen reader users, be warned: you should skip this!)

:%s#\(<\([^>]*\)>\( \)\)*\(\(\((\)*\([^)]*\)\()\)*\) \([^{]*\)\)#\2,\7,\9,,,,,,,,,\2\3\7 \9;,MS165.001.010.00001
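
For anyone who would rather read that one piece at a time, here is a rough Python translation with each group labelled. It is only a sketch of one reading of the pattern: the sample line is invented (the students' actual format isn't shown here), the names `NAME_PATTERN`, `REPLACEMENT`, and `to_csv_row` are made up for illustration, and the replacement string, trailing identifier included, is copied as-is from the Vim command.

```python
import re

# Verbose-mode translation of the Vim pattern. Group numbers match the
# original, so \2, \7 and \9 mean the same thing in the replacement.
NAME_PATTERN = re.compile(
    r"""
    (<([^>]*)>(\ ))*   # 1: optional "<...> " prefix; 2: text inside <>; 3: the space
    (                  # 4: the whole name
      (                # 5: first name part, parentheses optional
        (\()*          # 6: optional opening parenthesis
        ([^)]*)        # 7: text up to a closing parenthesis (or as far as it can go)
        (\))*          # 8: optional closing parenthesis
      )
      \                # a literal space between the two name parts
      ([^{]*)          # 9: everything up to an opening brace
    )
    """,
    re.VERBOSE,
)

# Replacement copied from the Vim command: three name fields, a run of
# empty fields, the reassembled name, then the hard-coded identifier.
REPLACEMENT = r"\2,\7,\9,,,,,,,,,\2\3\7 \9;,MS165.001.010.00001"


def to_csv_row(line: str) -> str:
    """Rewrite the first match in a line, like :s without the g flag."""
    return NAME_PATTERN.sub(REPLACEMENT, line, count=1)


if __name__ == "__main__":
    # Invented sample input, not taken from the real files.
    print(to_csv_row("<Dr.> (John) Smith"))
```

Groups that match nothing come through as empty strings (Python 3.5 and later), so a line with no angle-bracketed prefix just gets an empty first field.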


Yes, this is what happens when the people dealing with metadata that need to be normalized are not being managed by professionals.

(I'm doing this in Vim instead of in Perl because each file is a little bit different, so every time I open one I'm doing some hand manipulation of the data and massaging the regular expression slightly to accommodate the fact that each of the students copes with variant names, titles, and unknown personal names or surnames differently.)

This is why we can't have nice things.

[personal profile] sanguinity 2010-07-27 08:43 pm (UTC)
:: with whom I'm not allowed to communicate, and for whom I'm not allowed to create standard documentation. (Don't ask.) ::

Hahahahahahaha!

And I know the explanation would be total crack for a systems nut like me. But see? I'm being good, and not asking. :-)

[personal profile] rantingnerd 2010-07-27 10:51 pm (UTC)
Owww. That hurt my brains. Owwww.

[personal profile] rantingnerd 2010-07-28 07:46 pm (UTC)
D'oh!

That really sounds dreadful.

[personal profile] rantingnerd 2010-07-27 10:52 pm (UTC)
If I were a LISP-head I'd ask if you had a 10-cdr pileup.

[personal profile] rantingnerd 2010-07-28 07:46 pm (UTC)
I am humbly honored, and very glad. :-)

[personal profile] libskrat 2010-07-28 02:12 am (UTC)
*laughing ruefully in recognition*

Did I tell you about the image metadata set with no image titles?

I didn't?

Yeah.

SOME PEOPLE ARE GOING TO THE SPECIAL HELL.

[personal profile] libskrat 2010-07-28 10:20 pm (UTC)
They gave me a "keywords" field which had (variously) something title-ish that wasn't in any way demarcated from a description of varying length (from zero to...), as well as EVERY FREAKING THING THAT WAS IN THE OTHER FIELDS.

SHOOT. ME. NOW.

[personal profile] jeshyr 2010-08-04 09:10 am (UTC)
Let me get this right, you people chose this profession?

Do they hide the details until you graduate and are hired, or were you both on the Really Good Drugs?

Ricky
(who might, under duress, admit that the vim regular expression ... or the fact that other people do stuff like that ... was really hot, in a geeky way)