<a href="http://www.oldbaileyonline.org/">The Proceedings of the Old Bailey</a>

The Proceedings of the Old Bailey (POB) is an ongoing digitization effort undertaken collectively by the Open University and the Universities of Hertfordshire and Sheffield. It has, in the past, received funding from the Arts and Humanities Research Council, the Big Lottery Fund, and the Economic and Social Research Council. Though POB provides no clear creation date, its first mentioned funding arrived in 2001 in the form of a grant from the Arts and Humanities Research Council. This suggests that POB has existed, or at least been in development, for eight years. This author mentions this seemingly unimportant detail only to set up a point: in eight years POB has accomplished something truly significant. According to POB, over 190,000 pages from the Old Bailey Proceedings and the Ordinary of Newgate's Accounts have been digitized in this time span. These digitized images represent all the extant editions of the Old Bailey Proceedings - from 1674 to 1913 - and the Ordinary of Newgate's Accounts - from 1679 to 1772. This is, in short, an extensive - even, perhaps, exhaustive - digitization effort whose main collection development policy is clearly to create as complete a collection of these documents as possible and to make them, according to POB, "fully searchable" and "free of charge for non-commercial use." From what this author has seen, POB appears to have succeeded on the second of these fronts.

POB does not explicitly state a metadata schema to which it adheres. One does have the option to view a record in XML markup, and this option provides a bit more clarity about the way each document and its accompanying record are structured. Yet, even here, it remains unclear whether POB subscribes to a particular metadata schema. Moreover, outside of each record's unique identifier, the metadata that does appear is mainly subject- or description-driven - e.g. Offense, Punishment, Verdict - and tends to resemble a MARC record's subject field in this respect. Unfortunately, where a MARC record's subject entry, if clicked, takes one to other items with the same entry, clicking on a subject entry in POB takes one, instead, to another page that explains all offenses, punishments, verdicts, etc. This author found this frustrating and, ultimately, unsatisfying, given that one can view all records with a punishment of imprisonment in an asylum, provided one runs a search for it. One cannot accomplish the same feat from the record level, and this is where, in this author's mind, POB fails to meet its aim of a "fully searchable" database: one has the ability to go top-down, but not the ability to travel in the opposite direction.
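
To make the missing record-level navigation concrete, here is a minimal sketch of the kind of lookup this author has in mind. It assumes nothing about how POB actually stores its data: the record identifiers, field names, and values below are invented for illustration, and the functions build_index and related_records are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: identifiers, field names, and values are invented
# for illustration and do not come from POB.
records = [
    {"id": "rec-001", "offense": "theft", "verdict": "guilty", "punishment": "transportation"},
    {"id": "rec-002", "offense": "theft", "verdict": "not guilty", "punishment": None},
    {"id": "rec-003", "offense": "forgery", "verdict": "guilty", "punishment": "transportation"},
]


def build_index(records, fields=("offense", "verdict", "punishment")):
    """Map each (field, value) pair to the ids of the records that carry it."""
    index = defaultdict(list)
    for record in records:
        for field in fields:
            value = record.get(field)
            if value is not None:
                index[(field, value)].append(record["id"])
    return index


def related_records(index, record, field):
    """Return ids of other records sharing this record's value for a field."""
    value = record.get(field)
    return [rid for rid in index.get((field, value), []) if rid != record["id"]]


index = build_index(records)
# Starting from one record, find the others with the same punishment.
print(related_records(index, records[0], "punishment"))
```

With an index like this, clicking a Punishment entry on a record could lead straight to the other records that share it, which is the bottom-up direction the site currently lacks.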

Despite the above shortcoming, POB is impressively explicit with respect to its digital objects. Microfilm was scanned at 400dpi and saved as TIFF files, which are being "preserved for archival purposes, and should eventually be accessible over the web once data transmission speeds improve." Initially, GIF files were derived from these TIFFs for web purposes, whereas the standard practice now appears to be creating JPEG derivatives from the TIFFs. Afterward, POB digitized the text in order to meet its goal of a fully searchable digital collection, and this is where the project becomes utterly fascinating. This process of digitizing text involved "double rekeying" certain parts of the collection, meaning that "the text is typed in twice, by two different typists, and then the two transcriptions are compared by computer." Other parts of the collection had their text digitized through a process in which the text was "manually keyed once and a second transcription was created using optical character recognition (OCR) software." In each case, computer-driven comparisons were used to locate any differences, which were then addressed manually. POB goes on to claim a 99% level of accuracy for its digitized text, and this author found no evidence to suggest otherwise. Both the aforementioned derivative images and this digitized text are accessible to a user in every record. The only real complaint this author can lodge here concerns the quality of the images. They appear to be no better than a photocopy in many cases. This may not, in truth, be POB's fault, as they make it clear early on that the images are derived from microfilm. Thus, they are merely making do with what they have and cannot be held too accountable for not having performed miracles. However, their images are a case study in the aesthetic losses caused by digitizing surrogates.
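
The comparison step described above can be sketched in a few lines. This is only an illustration of the general idea, not POB's actual software, which the site does not describe in detail: it uses Python's standard difflib module, the function name flag_disagreements is hypothetical, and the two sample transcriptions are made-up sentences rather than text from the Proceedings.

```python
import difflib


def flag_disagreements(keying_a, keying_b):
    """Return (word position, words in A, words in B) wherever two keyings differ."""
    words_a = keying_a.split()
    words_b = keying_b.split()
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    flagged = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            # Any replace, insert, or delete is a spot for a human to check.
            flagged.append((a_start, words_a[a_start:a_end], words_b[b_start:b_end]))
    return flagged


# Made-up sample transcriptions that differ by one word.
first = "the prisoner was indicted for stealing a silver watch"
second = "the prisoner was indicted for stealing a silver match"

for position, from_a, from_b in flag_disagreements(first, second):
    print(position, from_a, from_b)
```

The appeal of the approach is that two independent transcriptions are unlikely to contain the same error in the same place, so only the flagged spots need a human eye. The same comparison works whether the second transcription comes from a typist or from OCR.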

POB does not make any declarations on their site regarding an intended audience. They do, however, have what they call a "User Wiki". In this wiki, POB encourages user participation in the form of biographies, historical backgrounds, corrections, and teaching resources, among other things. From here, one comes away with the impression that POB wishes their digitization effort to be appreciated and used by an audience that includes but is not limited to scholarly niches. This author sees no reason why this could not be possible with the right exposure, as this is a fabulous resource that provides an expansive and fascinating look back into the minds and stories of the rightfully and wrongfully accused.

Survey of Digitization - Spring 2009

Saturday, April 11, 2009
