Tuesday, March 31, 2009
It's a bit of a stretch to say that this is a kind of intellectual repository or an online collection of the Classics put on the web so that access can be more widespread, but nonetheless Brian Collier's website Very Small Objects is entertaining, aesthetically pleasing to look at, and also contains a very creative "classification system."
The objects featured on this site are close-up photos of very small items found anywhere and everywhere. Collier accepts all objects submitted to him and names them based on a classification system that is largely a satirical version of scientific nomenclature, built from fragments of English words. He then photographs each object with measurements posted within the picture so the user can see how small the dimensions are. All the photos are stored in a "database." The criteria for photographing each submission are that it be of a certain size, carry a new name based on Collier's classification system, be non-living (Collier opposes the collection of live specimens), and still be somewhat recognizable as what it once was.
As for metadata or ways to search the collection, this is an area that sorely needs work. Each item in the database is classified by the first part of its name, and the collection progresses through the names in a list format with a little thumbnail picture for reference. For example, all the things that were never alive (i.e. rocks, plastic) start off with the name Neli, and if the object is part of a whole, Nelipart, and so on; all the Neliparts are grouped together. There is no search feature to easily access a particular specimen. As for metadata about each object, the only metadata made available on the site is the "name" and its dimensions.
The collection itself is pretty neat to look at. There are all kinds of itty-bitty things. During my examination of the site, I saw a fragment of a green M&M, electrical components, beachcombing loot, dried-up curls of a leaf, a watch gear; the list goes on. It's an eccentric collection based on a classification system. Collier exhibits his collection around the nation and in Europe.
As for the audience, I think it's geared towards people who are knowledgeable about classification systems and naming items and who get a kick out of emulating the scientific nomenclature system: namely information professionals and scientists, or just about anyone willing to take the time to peruse this very small object collection.
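To make the prefix grouping concrete, here is a minimal Python sketch of how a list ordered by name prefix could be put together. Only the prefixes Neli and Nelipart come from the site; the specimen names and the `group_by_prefix` helper are hypothetical and are not taken from Collier's database.

```python
from collections import defaultdict

def group_by_prefix(names, prefixes):
    """Group specimen names under the longest classification prefix they start with."""
    groups = defaultdict(list)
    for name in names:
        # Longest match wins, so a "Nelipart..." name is not filed under plain "Neli".
        candidates = [p for p in prefixes if name.startswith(p)]
        key = max(candidates, key=len, default=None)
        groups[key].append(name)
    return dict(groups)

# Hypothetical specimen names, invented for illustration only.
specimens = ["Nelipartgear", "Nelistone", "Nelipartwire"]
print(group_by_prefix(specimens, ["Neli", "Nelipart"]))
```

A lookup like this would also be the natural starting point for the search feature the site lacks, since a visitor could jump straight to one group instead of scrolling the whole list.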
The website is very easy to use. It has only two main menus: TV and Movie. For example, when users click the “TV” button, it shows the submenus Channels, Popular Episodes, Popular Clips, Popular Shows, and Browse. Channels lists TV shows of all types, such as Action and Adventure, Animation and Cartoons, Comedy, Drama, and Family. There are updated lists of the most popular and the highest rated videos, so users can keep up with what other people are watching. When looking for a specific show, users can search by name or look at an alphabetical list of all the shows in Hulu’s library. In addition, one of the biggest highlights of using the site is the video player. Users don’t need to download and install a special video player, because the website runs directly in the web browser using Flash Player technology. While watching, users can mouse over the player to bring up the usual controls. The “embed” function allows users to set in and out points, so they can embed just a selected chunk of a video clip on their blog. The “share” function allows users to send the link to their friends. The “timeline” function allows users to jump to any point in a movie. “Lower lights” dims the whitespace around the player to dark gray, and “Pop out” turns the video player into a pop-up. A full-screen mode is also available. Users can even leave comments under the player. For metadata, the website provides a thumbnail picture, name, time, and channel for each item.
All in all, Hulu.com is a video presentation site with a good variety of TV shows and movies to start. The user experience is good, and everything loads and plays very quickly. The audience for this collection includes all TV and movie lovers.
Books from the Past (http://www.booksfromthepast.org/) is a digital library based in Wales that provides “an on-line collection of books of national cultural interest which have long been out of print, and are unlikely to be reprinted by traditional means.” It is the result of collaboration between Culturenet Cymru and the Welsh Books Council. At first glance, one may think that the digital library will not offer anything exceptional, but after spending some time with it, it is quite apparent that Books from the Past is a very sophisticated and well-designed digital library.
Selection Decisions: This digital library has some of the most thorough and transparent collection documentation that I have encountered. Its “About” page reveals the step-by-step thought processes that Culturenet Cymru and the Welsh Books Council had in designing the collection, including the respective goals that each organization had from the project’s inception. The organizations were very thorough in considering standards that would make the collection broadly available, freely accessible, and sustainable over time. The organizations understand the value of collaboration and hope to invite similar institutions to join them in their partnership. They also understand the importance of content selection as the project increases in size, so the Welsh Books Council has identified at least 200 books of educational and cultural value to be eventually included in the collection.
The website’s “Help” page thoroughly describes almost everything a user would need to know about navigating the website and utilizing its features. It discusses how to select English or Welsh for the language of the website, how to choose a book, how to navigate through a book, how to search within a book, how to change the default search settings, how to search the entire collection, and how to download materials from the site.
Metadata: One of the most impressive features of the website is its discussion of metadata standards and other standards on the “About” page. The designers decided to use XML for metadata and encoded text, the Text Encoding Initiative (TEI) for text encoding, the Metadata Encoding and Transmission Standard (METS) for the structural metadata, and the Greenstone Archive Format (GAF) for XML-to-HTML transformations. The reasoning for each of these decisions is clearly stated on the website and provides a solid basis for the long-term preservation of the collection’s digital objects.
The collection’s metadata also allows for searchability within each book and across the entire digital collection. Users can search by book title, author, subject, publisher, date of publication, period, or language. Users can also browse books by image or by text. The designers have even considered unique challenges that arise when searching in Welsh, including the ability to include diacritics (“mark[s] added to a letter or symbol indicating a change in its usual pronunciation”) or mutations (“instances where the initial letter of a word changes depending on its grammatical context”).
Object Characteristics: I was impressed with the discussion on the “About” page under the heading “The case for PDF.” The designers of the digital library did not simply decide to include PDF images because it was easy and popular. They really thought through the pros and cons of using PDFs for the files and determined that PDFs would not fully meet their needs. Accordingly, they provided functionality that would allow users to download images in PDF, ASCII, or RTF files. They scanned the images at 300 dpi in 24-bit color, and they use TIFF for master file images and JPEG (SPIFF)/GIF for web display files. Their selection of the METS standard allowed them to tie the digital objects together for each book in a coherent manner. At this time, there are books in Welsh and in English, but there are no translations of either. This could be a potential area for enhancing the digital objects at some point in time.
Intended Audience: The intended audiences are primarily educators, scholars, and students, since the books were selected for their educational and cultural value. However, the collection could be of interest to anyone interested in rare books or Welsh culture.
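As a rough illustration of the diacritics half of that problem, a search can fold accented letters to their base letters before comparing, so that a query typed without a circumflex still finds a Welsh title. This is only a sketch using Python's standard `unicodedata` module; it is not a description of how Books from the Past implements its search, and the function names and sample title are my own assumptions. Mutations would need separate handling.

```python
import unicodedata

def fold_diacritics(text: str) -> str:
    """Return text with combining marks removed and case folded (e.g. 'Tŷ' -> 'ty')."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

def matches(query: str, title: str) -> bool:
    """Diacritic-insensitive, case-insensitive substring match."""
    return fold_diacritics(query) in fold_diacritics(title)

# Hypothetical title, used only to exercise the function.
print(matches("dwr", "Hanes y Dŵr"))
```

Folding both the query and the indexed text the same way keeps the comparison symmetric, at the cost of treating words that differ only by an accent as identical.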
The Mildred Wirt Benson Collection at the University of Iowa
Mildred Wirt Benson was the original Carolyn Keene, the pseudonym for all authors of the Nancy Drew mysteries. She attended the University of Iowa in the 1920s and was determined her whole life to be an incredible writer. UI collected what remained of her personal collection in the 1990s, shortly before she died in 2002. Sadly, Benson sold, threw away, or burned most of her documents and correspondence, but the library has the entire collection of her remaining writings. They are housed in an archive of UI's special Iowa Authors collection.
This collection is complete in its mission to digitize all of her published works, including articles published as a student journalist, her thesis, almost 100 short stories, and 135 books. Each is legible and searchable through OCR technology. The standard search function is very useful for browsing, but the advanced search could be used for scholarly research and even allows you to search across collections.
The metadata for each entry includes standard information such as title, creator, and publication information. It also includes archival collection information and rights management for the particular object (which is useful because some of this collection is under copyright and some isn't). They are careful to include information about the scanning process, including the name of the machine, the date of the scan, and whom to contact with questions or a request for the TIFF image.
There are useful links throughout the metadata so it is easy to browse through the different layers of the administrative details. On the UI Digital Library main page, they include links to tutorials for using the digital services provided, including a variety of 60-second tutorials.
The most interesting part of the collection to me is the Memory Book. Each page of the book is shown as a whole and in zoomed parts, which is useful since almost the entire scrapbook is made up of newspaper clippings. It also includes the loose items - everything from tickets to a piece of string. This scrapbook is the most personally revealing part of the collection as most of her notes and correspondence are gone; it shows how much pride Wirt took in her journalism career, what events were memorable, and awards she won for swimming.
Although the photographs of her late flying career were a close second for favorite...
Monday, March 30, 2009
The William Blake Archive is sponsored by the Library of Congress and supported by the Carolina Digital Library and Archives at UNC-Chapel Hill. Both clearly have very high digitization standards and access to amazing materials. This is certainly not the only Blake Archive but it seems to be much more extensive than most, including the New York Public Library's project which features three of Blake's works.
The digitization initiative includes Blake's illuminated books, commercial book illustrations, separate prints and prints in a series, drawings and paintings, and manuscripts and typographic works. The about page is the most complete information page I've seen on a digitization initiative. Viewers are given thorough information on the people actually involved in the program and details of the long-term plan. They began with the illuminated manuscripts in the early '90s and have been adding to the collection since. One could spend a couple hours just reading over the supplemental and background information provided for the project. You can even take a virtual tour of the collection to learn about all the features included.
The images themselves can be accessed in two main ways. You can select a medium and then a work through either the index or the navigator page (the navigator is actually a pop-out window). There is also a search feature, but users aren't able to enter their own search terms when searching for images. (There is, however, the option of searching the text using your own search terms.) For an image search, the site provides a list of available search terms and lets you select several of them to find images. The only real downside of this project is that there is no way to browse thumbnails.
Once an image is accessed, there are several ways to view it. One drop-down menu offers an image enlargement, illustration description, or textual transcription. The image enlargement opens another window and lets users zoom in once more to see more detail. The textual transcription gives the original text plus an English translation if necessary. The illustration description essentially provides the same information as the inote button. When a user selects inote, another window opens with the image split into several sections and a menu to select each of the sections. Once a section is selected, the user is given a description of the action and content of that frame. The info button below the image provides detailed metadata for the featured image, including the three levels of resolution available, information about the work's creation, and the source of the image. It even includes information about the scanner used for the images.
Although I first saw this project five years ago, it is one of the most exciting digital projects I've encountered. They provide more than sufficient data on the project and its implementation as well as giving detailed information on the images themselves. The only thing I think would improve the site would be thumbnail previews rather than textual lists of the items in a work.
The Freshwater and Marine Image Bank was established by the Fisheries-Oceanography Library at the University of Washington. Their goal was to create a comprehensive digital collection of historic fish pictures for the world to enjoy, and to maintain it well into the future (at least as long as people have the desire to look at fish, which, knowing people, will be quite a while). The collection consists of approximately 19,000 images captured from a wide variety of publications printed between 1735 and 1924, the idea being, of course, to focus on images that would be in the public domain. On to presentation and metadata!
Searching by keyword or browsing by predetermined classifications both bring you to the same thumbnail view screen of relevant images. Each image is listed along with its title, which unfortunately isn't standardized at all. Some titles are the scientific, Latin names of the flora and fauna depicted, some are descriptions of the image, and some are even written in the language of the image's country of origin! It can make things confusing at times. Clicking on the thumbnail brings up a larger version of the image with all of the metadata, which is extensive (date, author, publisher's information, copyright information, etc.). Much better than the thumbnail view, at any rate.
All in all, the site is easy to navigate despite the large size of the collection, and there's plenty to look at for the aspiring marine researcher. The public domain focus also makes it a great source for students working on presentations or projects. The images are all of high quality, and relevant metadata is provided. Overall, one of the least silly and best put together sites I've looked at so far.
Sunday, March 29, 2009
Users may access the collection either by searching (by keyword, by title keyword, or by composer) or by browsing (by title, by composer, by series, or only by entries with images).
Users may browse the collection by series volume. Because each volume reflects its owner’s taste and philosophical learning, descriptions are given at the volume level. All the sheet images are represented as thumbnails. Rich descriptive information accompanies the images of the sheet music, including composer, short title, full title, publisher, publisher number, pages, language, cover inscription, and series. The first two or three entries are always hyperlinked, and clicking a link brings up related search results.
The sheet images are printable for educational use.
The index page displays a number of thumbnail covers that provide a one-click view to available metadata that may include author, publisher, publication date, typeface, genre, designer, and art director. (Not every cover's page includes the same metadata.) Each of these categories is also an active link to cross-search for other covers sharing the same metadata. This page also provides a comment option that is moderated, an e-mail link for suggested edits to the page, and a link to Amazon. (Comments do include a date.) There is a drop-down search option that includes designers, titles, authors, art directors, photographers, illustrators, genres, publication date, publisher, typeface, and a "smart keyword search" box. The images are PNGs and JPEGs.
Unfortunately, covers can't be enlarged, nor is there any data on when the covers were added, when the metadata was edited, how the covers were scanned or digitized, what paper was used (e.g. texture, gloss), or where the images or books came from (e.g. bought, loaned - assuming they were digitized specifically for this site). There are no images of the back covers or information as to when the site was last updated. Apparently Mr. Pierrat is a cover designer, as a search using his name produced five covers (four of the five are for theological topics, which isn't a choice in the genre drop-down).
The sole mention of copyright is "All covers are the copyright of their respective publishers."
Value is added through additional information about and links to "Portfolio sites of book cover designers", "Great Sites on Book Cover Design", "Font Identification", "BCA Blog" (which was interesting), "Submit a Cover" and "Support the Archive" (which is a little misleading in that it is actually a request to support "Children's Relief International" - a health and education charity). I can appreciate the "Future Enhancements to the Archive" action list that includes items to do (such as tags and multiple images) and items done (such as RSS feed and Comments - each done item has a line through it). Additionally, the small black BCA icon on the far left of every page that redirects to the index and the "Steady Beta" seal are nice touches.
This is an extremely easy site to navigate, with good use of white space, text size, and image layout. It would appeal to the LibraryThing and Shelfari crowd, as well as those interested in publishing, art, advertising, and interesting fonts.
Saturday, March 28, 2009
Portraits of Scientists is a digitization project undertaken by the Wisconsin Historical Society (WHS) that pulls together 95 cartes-de-visite and 3 documents from an album of correspondence between famed Wisconsin scientist Increase Lapham (1811-1875) and his colleagues. The cartes-de-visite themselves, according to WHS, represent a veritable "who’s who of [the] 19th century scientific world" including such notable figures as Charles Lyell, Richard Owen, and Thomas H. Huxley. It is no surprise, then, that WHS chose to create a digital collection highlighting these cartes-de-visite. The collection principle, thus, seems rather straightforward: to digitize Lapham's album of cartes-de-visite sent to him by other famed scientists of his day. On this front, WHS has succeeded, as it appears they have, in fact, digitized all such cartes-de-visite.
The metadata fields provided by WHS for each image are pleasingly consistent from what this author observed. They include Title, Description, Image ID, Creation Date, Creator Name, Collection Name, Genre, Additional Information, and Subjects. The most interesting of the above fields are the Description and Additional Information fields. The Description field is used as a mini-biography of the individual photographed in the carte-de-visite, providing a list of the individual's accomplishments that proves quite helpful in contextualizing the individual's impact on the development of scientific pursuits. The Additional Information field is used to describe the information found on the back of the carte-de-visite while also providing explanatory notes on certain photographers and photographic processes. As stated previously, the uniformity with which these fields are employed is pleasing and refreshing given some of the other digital collections this author has observed. However, it should be said that there are certain fields missing here that would prove quite helpful. There are no fields for the creation date and creator of the digital object, nor are there any fields for characteristics of the digital object itself outside of the Image ID field. Inclusion of any of these fields might go a long way towards better informing an audience of the process undertaken by WHS to digitize these cartes-de-visite.
Object characteristics are another area where one could argue WHS falls somewhat short in this otherwise well-executed endeavor. This author, as hinted at just above, discovered no resource providing information on (1) how these physical objects were reformatted into digital objects and (2) how the digital objects were then processed, versioned, and stored. There is what appears to be a persistent unique identifier contained in the Image ID field. Beyond this, the most this author can say is that the enlarged images appear to be 600 ppi. Yet, it is the enlarged images that caused this author the most frustration. WHI offers no zoom capability. Consequently, there is a good deal of information a user is unable to access - signatures, captions, handwritten and personalized messages - except by referring to the Description and Additional Information fields. For this author, being able to access this information on the digital surrogate of the original object would be far preferable to being forced to refer to a transcription with no sure way of verifying the transcription's accuracy.
WHI never unequivocally states its intended audience. However, it does state at one point that "these photographs may be used as a primary resource in their own right to better understand Lapham's 19th Century world" with the obvious implication being that the intended audience for this collection is one of a scholarly persuasion. Thus, the intended audience could range from those seeking more context of Increase Lapham himself to those seeking to explore the arenas into which the cartes-de-visite market made for itself a niche. It is difficult for this author to judge WHI's effectiveness on this front. This author can, however, declare that he found this collection - despite its flaws - a fairly enjoyable and educational one without being himself engaged in a scholarly project related to anything found in this digital collection.
I am a bit of a sucker for projects like this. Even though eyewitness accounts are not always factually accurate, I find personal perspectives like those present in this collection absolutely fascinating. One stated goal of this project was to present a wide range of insights into questions and issues related to U.S. Cold War nuclear weapons programs. It is noted in the information about the project that the available interviews are not fully representative of all individuals or groups involved with or affected by activities at the test site; however, it appears that many important groups are represented. These groups include: national laboratory scientists and engineers; labor trades and support personnel; cabinet-level officials, military personnel and corporate executives; Native American tribal and spiritual leaders; peace activists and protesters; Nevada ranchers, families and communities downwind of the test site. It is also noted that the scope of the project was limited by available resources (i.e. time, money, student research interest, available participants), but there is hope that material will be added to the collection by future researchers. (Time will tell on that account.) A few interviews are not currently available for online viewing at the interviewee's request. Presumably, there are arrangements for release in the future.
The metadata available for the collection as a whole is brief but informative. Topics include student involvement, community involvement, using the metadata, using the transcripts, and acknowledgements. A timeline provides a nice overview of the chronology of significant events from the creation of the Manhattan Project in 1942 through the last U.S. test and the moratorium on testing in 1992. Metadata for individual objects was created using qualified Dublin Core and controlled vocabulary from various sources including the Library of Congress Thesaurus for Graphic Materials, Library of Congress Name Authority, the Getty Thesaurus of Geographic Names, and U.S. Nuclear Tests. Locating related content is easily accomplished with a single mouse click.
The intended audience for the collection appears to be scholars and the general public. Not only is it a resource for scholars interested in the topic of U.S. Cold War nuclear weapons programs to use, it is also one their research can contribute to. Because it is freely available on the Web, any person interested in the topic has access to the collection. The downside for everyone, though, is that not everything in the collection is digitized. One other omission is that, apparently, student papers were presented at conferences and the collection is being used for research projects, but the site offers no information about those papers or any published research that references the collection. Overall, this is not a bad example of using local and regional assets as a foundation for interesting and useful digital (and special) collections.
Friday, March 27, 2009
The Bathtub Art Museum, Portland, Oregon, began in 1993 and operates with a collection of over 300 postcards.
According to one notice, the site was last updated in January 2008, but it still lists a 2006 announcement seeking artists for 2007, and the home page shows 2003-2007. That’s a little confusing.
The postcard collection is interesting. The cards range from 1900 to recent years, and each includes a description with the artist, location, date, and other information when available. The most interesting are the postcards that were actually sent, where the message from the sender is shared.
Also included is a list of artists. Jonathan Liu is featured with his Etch-a-Sketch bathtub etching.
There are four galleries featured: Art of a bathtub Cake (featuring 9 cakes from the cake contest that celebrated the Museum’s 3rd birthday), More Peeping Toms (featuring 8 postcards from the 1920’s to the 1960’s), Full Tub (featuring 7 postcards), and Beauty and the Bath (featuring 8 postcards from 1900 to the 1950’s).
Each postcard can be enlarged, but not significantly enough to take in the smaller details.
Another portion of the web site centers on Bathtub News and Bathtub Blurbs with descriptions and photos. This section is pretty cool, and the news and blurbs are quite an interesting collection. There is also a spot for you to submit information if you have any contributions. You can also join the mailing list. There are also additional informational links.
MPTV was founded by photographer Sid Avery, who was best known for his candid photos of celebrities during Hollywood’s Golden Age. It now exists as an ever-growing archive of still images focused on “preserving the memory of some of the greatest celebrity legends of our time.” The site includes over 1 million photos from over 60 photographers and focuses on images of the red carpet, film legends, and retrospective prints. The site encourages users to register so that they can license images.
There is a drop down menu on the about MPTV page which allows you to browse photographers alphabetically. Another way to browse the image collection is to select legends, red carpet, or retrospective images and then type in a keyword. A menu will pop up asking you to select which of the three areas you’d like to search or if you’d like to search in all of them at once. After doing this you are able to browse the retrieved images. Each thumbnail has a unique number and description beneath it as well as the date it was taken and the name of the photographer. There is also a link underneath each thumbnail that if clicked on will add the image to your “lightbox.” This is how you would save a list of images you’d like to license if you are logged in to MPTV. No watermarks appear on the thumbnails of the images.
Selecting an image opens a new window that contains a larger, watermarked version of the photograph and its metadata. The caption information describes the film the image may be from, the person within it, the date, and the studio name. There’s also a keyword list, the image number, the max file size, dimensions, dpi, and specific release information.
The audience for this site would primarily be those looking for good quality, professional images of Hollywood legends and stars walking the red carpet. Since this site is focused on commercial interests, it’s likely the casual browser would visit to view the images but quickly become annoyed by the large watermarks. The legal information assures the browser that it is not wise to reproduce the images in any way, ever, if they haven’t purchased the licensing.
Thursday, March 26, 2009
Early American Bookbinding
The Redwood Library and Athenaeum, located in Newport, RI, displayed "Cover as Clue to Content" June 6 - August 20, 1999. This exhibit highlighted the changes in bookbinding during the 19th century, as well as publishers' attempts to use book covers as a means of advertising a book's content.
The exhibit is small, with only a handful of book covers serving as examples of the trends of the six decades highlighted.
The exhibit's home page credits the persons who created the online exhibit. The exhibit itself is entered by clicking on the six different decades listed on a picture of the Redwood Library.
Each decade has 3-4 book covers at the top of the page, as well as a bulleted list describing the characteristics of the cover designs. Following this list are the metadata for each book cover (author, title, year published, publisher, and the original owner, if known), as well as a brief description of the physical condition and characteristics of each cover.
The first decade, 1830-1840, also has a brief introduction to the exhibit, which outlines the context of bookbinding starting in the 18th century and the focus of the exhibit. The introduction is followed by a link for further information.
This link directs you to "Introduction to 19th-Century Bookbinding" by Russell J. De Simone, a short narrative that further describes bookbinding in the decades covered in the exhibit.
This exhibit has a very clean and easily navigated layout. The 1860 page, however, has a dead link for one of the book covers. Where the book cover should be, there is only a blank space, which, ironically enough, is still a hyperlink. A link that opens a blank page.
The University ALbUM is a digital collection of images taken at or related to the University of Maryland, dating from the late 1800s to the early 2000s. It came about as a result of a flood of requests for pictures for the University's 150th anniversary, and is only a small sampling of the materials within the University of Maryland Photo Archives.
Wednesday, March 25, 2009
The collection has clear collection policy principles. The Law Library has invested significant time and resources in creating this collection. It negotiated for over fifteen years to make the resources in the collection available for free to a general scholarly audience, and it has sought to secure the stability of the site by creating a $90,000 endowment to be "used only (a) to purchase a replacement server or (b) to transition the digitized material into a different format should JPEG images become obsolete." It has a clear vision to continue growing the site at a rate of 400,000 frames each year until it reaches about 8,000,000 images. It seeks to "serve as a model for low-cost, high-volume acquisition of digitized research material made accessible without charge to the public." It states that a broad range of historians is its target audience, as well as "those who find that personal, financial, familial, or physical limitations prevent them from accessing the material at the National Archives in Kew."
In one sense, the collection is heavily curated in that the website contains numerous articles relating to the materials by Robert C. Palmer, Cullen Professor of History and Law at the University of Houston and a helpful link to a tutorial on the U.K. National Archives site on "Palaeography: reading old handwriting 1500-1800." However, in most other senses, the collection is not very well curated, and the curators have made policy decisions that fall far short of the NISO standards. The default standard format for the objects is JPEG, which is a lossy standard. The cameras utilized are off-the-shelf cameras -- Canon Powershot S70 and Powershot S80 -- and the curators are very pleased with them. They state, "Even the acquisitions made with the S70 are sufficiently clear that it is not necessary to re-photograph them at a better resolution." Their rationale for using such cameras is purely based on cost. The website states, "The argument from the beginning has been that such projects should aim to produce images that are of sufficient quality to enable research, not perfect images: resources should be arranged to increase the overall possibilities for research." This reasoning is somewhat understandable, but it is unfortunate and largely self-defeating in the long run, particularly as the images deteriorate. Many images have extraneous materials in them, such as fingers or scanning tags, which detract from the attractiveness of the images.
Besides the articles by Professor Palmer, there is no metadata of any kind relating to the images. Apparently some of the image numbers correspond to published finding aids, but for the uninitiated, the numbers are largely meaningless, and the site does not have any other search engine. The browsing function provides row after row of JPEG thumbnail images. Users can download whole volumes to their computer, which is helpful, but there is no indication of the objects' origins, structure, authenticity, or developmental history actually tied to the digital objects.
The rationale and ideals of the Anglo-American Legal Tradition website are praiseworthy, but it is unfortunate that so much time and energy is being expended on a project that technologically seems to fall far short of its potential. By not providing metadata and better access to the materials, the collection will likely cease to be of interest to the very audiences it seeks to reach.
The site specifies that all the images are in 640x480 JPEG format, and that prints may be purchased. It is possible to copy the images, but there is no zoom function. The images are presented as thumbnails with eight metadata fields. These include the date, space program, mission name, film size used, a unique title, description, and a field named “subject terms” which acts like a keyword bank. There is also a unique photo identification number that is used if a visitor decides to order a print.
While I was searching the collection, I found that some of the photos most interesting to me were inaccessible, and one link led me to a page that claimed no file of that name existed even though I could see the thumbnail on my list of potential matches. I found that these missing images were random and could not find any common thread between them such as mission, date, or subject of the photograph. Even the metadata was absent. The frequently asked questions page did not address this problem either, which reinforced my early feeling that this site is maintained only minimally. The only irritating detail I discovered with the search by date option was the lack of results when using the earliest year provided, 1958. It seems that although this year is included the actual earliest year possible to use for a successful image search is 1958.
There are two separate ways to search the collection: a browse function and a search function. On the search page a visitor can search by date, with month, day, and year fields that must all be filled in for any results; by keyword; or by the unique photo ID number. One of the interesting details about these rather common search abilities is the inclusion of directions for best recall and precision. Here is the exact language from the identification number search:
“NOTE: NASA Photo IDs have several possible formats, but will always include both letters and numbers. Examples: AS06-02-1436, S60-02552, sts026-38-056, sts028(s)014. Fuzzy matching may be used if an exact match fails.”
There is a hyperlink for the term “fuzzy matching” that explains how to break the ID number down if the user does not have the complete number. I assume the program tries to find similar numbers through some algorithm designed for high recall rather than precision. The keyword search includes the following helpful hints:
“NOTE: All listed keywords must occur in a photo description for a photo to be returned as a match. Searches are case-insensitive. Wild card (pattern matching) operators are not supported. Symbols (*, &, @), single-character words and abbreviations (a, i.e.), and single-digit numbers (1, 2, 3) will be ignored.”
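The rules in that note can be sketched in a few lines of Python. This is only an illustration of the stated behavior (all keywords must match, case is ignored, no wild cards, and symbols, single characters, abbreviations, and single digits are dropped); it is not NASA's actual code, and the function names and the way abbreviations are detected are my own assumptions.

```python
import re

def keywords_from_query(query):
    """Return the keywords that would actually be used, per the search note."""
    kept = []
    for raw in query.lower().split():
        # Assumption: an "abbreviation" is a dotted token such as "i.e."
        if re.fullmatch(r"(?:[a-z]\.)+[a-z]?\.?", raw):
            continue
        # Symbols (*, &, @) are ignored, so strip everything but letters and digits.
        token = re.sub(r"[^a-z0-9]", "", raw)
        # Single-character words and single-digit numbers are ignored.
        if len(token) <= 1:
            continue
        kept.append(token)
    return kept

def matches(description, query):
    """True only if every remaining keyword occurs as a whole word in the description."""
    words = set(re.findall(r"[a-z0-9]+", description.lower()))
    keywords = keywords_from_query(query)
    return bool(keywords) and all(k in words for k in keywords)
```

Because every keyword has to appear, adding more words to a query narrows the results, and because wild cards are not supported, a query like "light*" is treated as the plain word "light" and will not match "lights".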
These searching notes and hints are something I have not come across in a digital collection before and are the main reason I decided to look at this collection. Although the curatorial presence is weak (there is no ‘contact us’ option unless the visitor is ordering prints), the site has an extensive frequently asked questions page that includes everything from how to order prints to the very specific “I saw in a magazine a composite picture of the entire Earth at night with city lights. Where can I get that picture?” The site seems very concerned with helping visitors find exactly what they are looking for, and it seems that its target audience is space exploration enthusiasts and journalists or other people affiliated with media publications. This guess comes from the inclusion of the unique photo ID numbers and the film type information in conjunction with the directions for successfully searching the collection.
The photographs are not very interesting as a digitization effort because they are so static, but the kinds of metadata developed make this an interesting collection to compare with other museum or fine art types of collections. It would be interesting to see if other scientific digital collections including photographs use similar metadata and searching hints or if this is unique to NASA and the JSC.
This online collection consists of “documents, images, maps, finding aids, interpretive and educational materials, and other media,” brought together with the goal of documenting the historical and cultural heritage of the state. In the “about” section specific details about the collection are revealed. For example, the selected materials were chosen based on their organizational roots: schools, colleges, historical societies, museums, archives and others. These organizations were chosen from around the state, with the aim of having a more centralized digital location for these cultural resources. For specific digital objects, no selection criteria have been made available. However, the digital collection is organized into four subject areas, which appear to be driving the collection’s thematic objectives: maps, aviation, football, and “highlights.” The individual contributing organizations can be searched.
The collection-level metadata can also be found on the “About” page. The digital project committee has made available its goals for the project, which include “to create an open, publicly accessible virtual collection relating to Iowa history and culture,” and “to establish basic criteria and standards based on best practices.” The management software is ContentDM. What’s wonderful about this collection is the additional information provided. In addition to the goals, the team also provides site visitors with access to the collection’s initial planning documents, including meeting minutes. There are also links to funding sources. Because of the collaborative nature of the project, multiple servers are used in sustaining the digital content.
Since the images come from throughout the state and from many different repositories, the audience for the collection may be very broad. It has the potential to serve research and colloquial uses.
1. Collection Principles
This project, from the University of Washington, brings together an impressive array of materials related to the Seattle Chapter of the Black Panther Party for Self-Defense (BPP). The project includes photographs, digital videos of member oral histories, 140 newspaper articles published in Seattle during the group's most active period, a PowerPoint on Seattle history and how the Seattle BPP chapter fits into it, and lots of scanned documents and photographs relating to the 1970 Congressional investigation of the BPP, focused primarily on the investigation covering the Seattle Panthers and the ensuing Congressional hearings on the findings. The site reminds me more of a project like In the Shadow of the Valley, and less of a traditional digital library in a Content DM kind of format. The collection principle is not explicitly stated, but as the project in an introductory essay claims to be the most "extensive online collection of materials for any chapter of the Black Panther Party, including the Oakland chapter," the principle seems to have been to digitize everything the institution had or could get its hands on that relates to the topic. There are many sources of material represented on the site: the University of Washington's Special Collection Department (for the 140 newspaper articles, the majority of which can be opened as PDF files of the scanned original texts, which was great to browse), private collections of various chapter members (especially for photographs), and some other institutions, such as the Washington State Archives for photographs and the federal government for materials related to the 1970 investigation. There is update information at the bottom of each page of the project, which states that it has not been updated since December 1, 2007. So, one detraction might be that there does not seem to be ongoing active curatorial work on the collection.
2. Object Characteristics:
There are a lot of high-quality digital objects available in this collection. Many of the photographs blow up really large and none have watermarks. The videos stream smoothly and are really interesting (several feature a member of the party who is of Japanese ancestry, which was unique to the Seattle BPP chapter). The PowerPoint is not particularly impressive looking, but it is informative. Some of the scans of newspaper articles are not great, but it appears that this may have been related to the quality of the originals, and all that I opened were easily readable. One complaint would be that some of the scans for the government documents are not very clear and the text came out somewhat muddy. But, when I used the zoom-in function on my PDF reader, I could still mostly make out what was written on all of the documents that I viewed.
If the collection really fails on any count, it is in metadata. There is a lot of background information available on the site about the Party, and Seattle, and the government and police responses to the Party's activities, but not a lot on individual objects. For instance, the "photos" page tells you what institution each picture comes from by sorting them under institution name titles, but no further information is given on where they are held in that institution, how one would get to look at the original, or even what size or print type they are, or names of people in them and the date that the photo was taken. There is no metadata other than background biographical information on the subject speaking for any of the videos, such as when they were shot, although there is one little line on the introductory page of the site saying that the project editor shot them. I would not know how to begin locating the government documents they offer on the site, which is a shame because they are interesting, and I have known a few people who specialize in researching government suppression of Black radical activism who would find that information really useful. But, there is only a line of text describing the document that one clicks on to open the document, and that description is all of the metadata that is offered. There are no stable, unique identifiers for any objects on the site.
This site, besides the metadata issues (which would usually bother me enough to dislike the whole project, but this one is so cool that it almost balances out) could be useful to a range of people. The writing used to describe the history of the Party and the Seattle Chapter is accessible enough that a high school teacher could use it for a lesson. The fact that so many of the pictures demonstrate how young chapter members were might also help make the topic accessible to high school students learning about the history of social justice activism or African American history topics. The site was created in conjunction with the Panther Legacy Committee, so it obviously also has a use there, for past Panthers or Panther supporters. Members of the general public who are simply interested in 1960s-1970s politics or African American activism would not find the site difficult to use or understand. Finally, scholars researching the Black Panther Party would find this site useful. I did a big research project on the Black Panther Party, so I know how limited primary source materials are for this topic, and this project offers a lot. The primary limitation that I can identify for the site's audience, besides the impediments to scholarly use that the lack of metadata could create, would be that it is geared, though not dogmatically, toward an audience sympathetic to the BPP story. The Panthers were controversial and are still demonized a lot in mainstream media (when they come up) as gun-crazy "nationalists." This site does a really good job, especially through the video interviews and the newspaper article database, of presenting an alternate view of the Party by showing how rooted the Seattle chapter was in community help programs such as providing free breakfast for school children, free sickle-cell anemia testing, and free clothing and shoe closets, as well as the more militant anti-police abuse activism that the BPP is better known for.
The complete database contains 798,810 images as of March 1, 2009. The Earth from Space collection is a small subset of those images - 2,613 images - with the most recent image, Mawson Peak, Heard Island, dated February 28, 2009. The Gateway portal links to the outdated Earth from Space collection homepage, but that is most likely for navigational purposes. It is much easier to discover resources within the collection from the collection homepage than it is utilizing the search feature for the complete database. The collection can be searched using the search pages or a clickable map.
Selection Decisions/Collection Principles
- No information available about collection development policy other than the statement that the pictures selected are deemed 'best of'
- The collection's FAQ page does not provide current or completely accurate information about collection characteristics, but current information is available from various links on the Gateway homepage
- New images are being added to the collection, but that is only discoverable by experimentally searching by dates from the collection's Technical Search page or by refining a search of the database from the Gateway Technical Search page
- Server statistics that record the number of downloads from the database as a whole are available, but not from the collection's homepage
- Images in the collection are publicly and freely available, even to those with low-bandwidth connections, and tips for viewing, printing, saving, and transforming the images are available from a link on the search results page
- Sustainability over time and interoperability have been considered as noted on the Database Content page
- Objects available on the Web exist in JPG formats of varying sizes and can be viewed or downloaded
- Most images acquired with digital cameras are transmitted from the shuttle or the space station via a ku-band communication system and archived in a proprietary Kodak format (KDK); however, if the ku-band was unavailable images were downlinked in JPG or TIF format
- Images originally captured on film have been digitized using various methods including the use of video technology, scanning by hand, or scanning in batch mode (no particular details available)
- Images in the Earth from Space collection have been color corrected by hand, but that is not typical for most images in the complete database
- Object names are the same as the file name but those names conform to the following naming scheme: mission-roll-frame_version.type where _version only exists if the version is not NULL
- Metadata specific to the digital image includes file name, file size in bytes, height and width in ppi, image editing (e.g. cropping or annotation), image availability (i.e. web-viewing or special request), purpose, and comments
- Camera information is available via a pop-up screen
- 25 other fields of metadata are also available - see the Database Fields page
- As noted in the Object Characteristics section, extensive metadata is available for each object
- Metadata about the collection itself is limited (see Collection Principle section)
- The Database Content page notes that metadata conforms to community standards and is created with current and future users and interoperability in mind
- Object metadata exists as unique records within the database
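The file naming scheme noted in the list above (mission-roll-frame_version.type, where _version only appears when the version is not NULL) can be sketched as a small pair of helper functions. This is a minimal sketch for illustration only: the function names are hypothetical, the version value "2" is made up, and the example identifier is borrowed from the NASA photo ID examples quoted elsewhere on this blog rather than from a real file in this database.

```python
import re

def image_file_name(mission, roll, frame, file_type, version=None):
    """Build a name of the form mission-roll-frame_version.type."""
    name = f"{mission}-{roll}-{frame}"
    # The _version part is only written when a version exists.
    if version is not None:
        name += f"_{version}"
    return f"{name}.{file_type}"

# Assumed pattern for reading a name back into its parts.
NAME_PATTERN = re.compile(
    r"(?P<mission>[^-]+)-(?P<roll>[^-]+)-(?P<frame>[^_.]+)"
    r"(?:_(?P<version>[^.]+))?\.(?P<type>\w+)"
)

def parse_file_name(name):
    """Return the parts of a file name as a dict, or None if it does not fit."""
    match = NAME_PATTERN.fullmatch(name)
    return match.groupdict() if match else None

print(image_file_name("sts026", "38", "056", "jpg"))
print(parse_file_name("sts026-38-056_2.jpg"))
```

A predictable scheme like this is what lets the object name double as the file name, since each part can be recovered from the name itself.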
To access the items, users can use a keyword search, or browse several categories in 5 main groups of subjects. Users can also browse items by selecting decades from drop down lists. Results of searches or browsing present the items in a list with thumbnails and basic descriptive metadata. As a nice aesthetic touch, the thumbnails are displayed in decorative, old-fashioned-looking frames. The list can be sorted by title or date and users can jump to other pages of results with drop down menus. From the list, one can click on the thumbnail for larger images, where multi-page items are displayed one page at a time. No further zoom controls are available for the items, but the JPEG can easily be downloaded. Also from the list, one finds a link to the expanded record for each item. The descriptive data at the expanded record page includes a short abstract, as well as more standard title, creator, subject, and rights information. A nice addition to the expanded records is the links to text transcriptions of the items and the Dublin Core record for the item.
Developers of this project put a lot of work into their collection for such a relatively narrow, regional focus. Perhaps the librarians involved have benefited from their proximity to Dublin. Other regional libraries and societies should be so lucky, but one factor missing from the information about the site is how the project was funded.
The University of Washington's library has several online collections, including one of historic photographs of Seattle, called the Seattle Collection. While they did not address how they chose a collection or even how they chose what parts of a collection they digitized, the University of Washington did a relatively good job of explaining the characteristics of the collection. They explained that the collection was digitized in 2000 by the cataloging staff. They provided the type of scanner (Microtek Scanmaker 9600L) and the file format they used (JPEG). They also said they manipulated some images in order to provide the clearest digital representation possible. Whether all the photos in a collection were digitized was noted with the collection, but the reasons for choosing these particular photographs over others in the collection were unclear.
Clicking on an image brings it up in a new window with a larger version of the picture along with extensive metadata. Fields include: title, photographer, date, notes, subjects (including Library of Congress Subject Headings), location depicted, digital collection, order number, ordering information, negative number, repository, repository collection, object type, physical description, and digital reproduction information. The metadata fields are consistent and full of information that would help a researcher. Many fields, such as the subject field, are also links to other items with that metadata entry. Searching the online collections is relatively easy, and users can choose to search through all the collections or to limit their search to one particular collection.
“The British Museum is the authoritative source of images depicting world culture and history including ceramics, sculpture, prints, drawings and paintings.”
British Museum Images is the on-line digital image website of the British Museum offering images of objects held in the British Museum’s collection. All of the images are rights-managed; users can search, buy, license and download 300 dpi JPEG digital images for educational use. 72 dpi JPEG un-watermarked digital images are accessible at no extra cost. The British Museum does not supply transparent or print format images for commercial use.
Currently, there are 9 collections on-line. Users can search the collection by keyword or image number, or browse it at random. All the images are represented by thumbnails with image numbers. Clicking a thumbnail opens a pop-up window with detailed descriptive metadata, including name, date, country of origin, and a brief history.
British Museum Images is just a simple image database. I don’t think its reputation is equal to that of the British Museum itself. However, for research or educational purposes, it seems to offer enough information. My concern is how such a database could be enriched so that it is not limited to simple searching and display of images.
Texas Treasures is a site run by the Texas State Library and Archives Commission and purports to be "an online exhibit of historical artifacts and documents." This is more or less an apt description, as the collection does contain historical pictures and documents in digital form. Furthermore, these objects are themed around a number of different exhibits such as "Indian Relations in Texas" or "Rangers and Outlaws." However, the site is far from comprehensive and seems to offer mere highlights of Texas history rather than much of real historical value.
I will say that the site has pretty clear collection principles. They are clearly going for historical highlights of general interest. Also, they chose objects to correspond to coherent topical exhibits also of general interest. The site makes it easy to get from one topic to another, and related topics and subtopics (e.g. The Comanche War under Texas Indian Relations) are grouped together in a logical way that corresponds to a natural work flow. The collection is also obviously curated, with a separate section for new exhibits. Also, the site provides a lot of background historical information and puts its objects into that context. My only concern with the collection is its relatively small number of actual objects. (Some pages are full of background info text without a single object!)
The objects themselves seem appropriately titled and described for public use. Some objects also have their location and id# from their physical home location listed. However, not all objects have that information, and that and the title appear to be the extent of metadata provided. The sore lack of metadata might also explain why the site does not have a search function.
Between the lack of metadata and the history-by-example approach, I gather that the Texas Treasures site's intended audience consists of members of the general public with amateurish interests in history and/or small children. (Note, however, that the site does not include lesson plans or other supplementary material that would indicate school children as a primary audience.)
If the Texas State Library and Archives Commission's goal was to publish a novelty site for the general public at which even quasi-serious historians would laugh, then they have succeeded admirably with Texas Treasures Online.
This collection by Yale University’s Beinecke Rare Book and Manuscript Library focuses on alchemical literature from Europe. Despite the title, I found objects from as early as 1400 and none from later than 1790. I couldn’t find specific information about the collection development policy, although this may be explained by the front-page announcement that: “Many of the items available here are drawn from the exhibition Book of Secrets: Alchemy and the European Imagination, 1500-2000, January 20 - April 18, 2009, at the Beinecke Library.” This may also answer the question of who the intended audience is (i.e. exhibit attendees).
The objects in the collection are really interesting and beautiful to me, with a variety of printing types represented. All scans I checked were 400ppi .jpg files with no apparent watermarks or copy protection, and the layout is simple and functional. The metadata is comprehensive, with notable fields such as “Physical Description” giving objective attributes of objects (though dimensions were missing for some objects) and wonderfully unexpected (if sometimes non-sequitur) information in the “Note” and “Summary/Description” fields of many objects. There is also info providing a clear path for finding the source objects in the collection.
The representation of objects in the collection is inconsistent. Some are scanned as full manuscripts, with covers and all pages represented, while others are scanned selectively, seemingly with a focus on pages with images. It would be nice to know more about their decision making processes. Although this collection currently exists as a supplement to an exhibit, it is not far from standing on its own as a digital library. With the Book of Secrets exhibition currently in progress, there is hope that future updates could add the information to help accomplish this.
Tuesday, March 24, 2009
Vivarium is a digital library created by the Hill Museum and Manuscript Library, which has a partnership with St. John's University and the College of Saint Benedict. Vivarium specializes in Christian literature from all over the world, from medieval Europe to ancient Ethiopian manuscripts. There are several collections contained within this digital library.
The Hill Museum and Manuscript Library has a strong tradition of collecting images of monastic manuscripts and books. They originally focused on western monastic manuscripts and then gradually expanded to Eastern European, African, and Indian Christian documents. Their collection guidelines specifically state that they find it a real benefit when there are photographic "back ups" of original manuscripts, which has led them to become a "library of libraries": museums around the world retain guardianship of the manuscripts while Hill collects images to store in its repository, allowing greater access to monastic manuscripts. With this in mind, Vivarium is pretty large and does not really have particular selection criteria to determine eligibility for digitization.
As for metadata, each item is described through the 15 Dublin Core elements; however, the metadata is not very consistent from collection to collection. For instance, the Illumination Collection's metadata is very detailed with lots of attributes, whereas the Syrian Collection just has the image and a title associated with each piece. I couldn't tell if this was because this particular collection is still a work in progress and more descriptive factors will be added later to aid search. When searching amongst the collections, the user has the option to search by date, author (if applicable), type, collection, or letter. The list goes on; the search features are pretty detailed.
The collections within Vivarium are pretty amazing. There are a lot of different types of monastic literature featured and the digitization is done in such a way that the entirety of a piece is represented. For instance, I browsed through an Ethiopian codex. The pictures were vibrant and pretty and you could "leaf" through the codex as if you were going through the book in the physical state. This is just one of the many many many objects featured in Vivarium. Another one of my favorites was the Illumination Collection, as you could search all the manuscripts by a particular letter in the alphabet.
As for the audience, I think that Vivarium is aimed at scholars. The Ethiopian collection was started, as Vivarium states, "to stimulate scholarship on Ethiopian literature." I also think that Vivarium was created as a way for Hill Museum and Manuscript Library to continue their objective to collect as many monastic manuscript images as possible and have a public space to showcase their collection.
Regardless, the lack of information and metadata within some of these collections irks me and I wish they would be more consistent with the metadata.
The Samuel J. May Anti-Slavery Collection is a Cornell digitization initiative which Cornell undertook in 1999 upon receiving a grant from NEH's Save America's Treasures program. Cornell utilized this grant to "catalog, conserve and digitize the published pamphlets [in] its Samuel J. May Anti-Slavery Collection, one of the nation's founding collections on the abolitionist movement in America" (Grant Project Description). The collection itself comprises some 10,000 pamphlets and leaflets collected by the Reverend Samuel J. May during his lifetime. This author was unable to locate a precise statistic relating to how many such pamphlets and leaflets have been digitized as of today. However, all signs point towards this initiative having been completed by Cornell.
The lack of metadata with this collection came as a surprise to this author. One could perhaps argue that this is a byproduct of the collection focusing on pamphlets and leaflets. Yet, even the most basic administrative and structural metadata remains absent. The only fields one regularly finds are seemingly only the most basic of fields - title, author, collection to which the item belongs, and a date field that is viewable when browsing record lists yet nowhere to be seen when looking at an individual record but on the digitized title page of the pamphlet/leaflet itself. Thus, this author discovered no information specified by Cornell with respect to a metadata schema and its implementation. That said, Cornell does state that it employed the University of Michigan's Digital Library eXtension Service (DLXS) in digitizing and making available this collection (Grant Project Description). DLXS does contain a piece of middleware termed 'Collection Manager' which "maintains all collection and group information in tables" (DLXS Metadata Databases). Perhaps Cornell used this piece of middleware when employing DLXS. Then again, perhaps they did not. They do not state such information anywhere that this author accessed.
With regards to the characteristics of the digital objects, this author again had difficulty discovering a wide range of specifics. Cornell does make it known that they used a Xerox DocuImage 620S flatbed scanner for scanning and that, after scanning, they then entered a post-scanning process which included quality control and tagging amongst, presumably, other actions (Grant Project Workflow). The digital objects themselves are viewable either as digitized images of the original object or as OCR-produced text. This is a nice option to have, yet the OCR-produced text is somewhat frustrating and inaccurate. One such example is the following comparison between a random sentence fragment read from a digitized image of the original object and the same sentence fragment read from the OCR-produced text.
Digitized image of original: "But at present I shall, so far as I can, ascertain from your pamphlet the specific complaints you make as to the 'emancipation proclamation' . . ."
OCR-produced text: "But at present I shall, so fhr as I can, ascertain from your pain- phiet the specific complaints you make as to the 'emancipa- tion proclamation,'. . . "
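To put a rough number on how far the OCR drifts from the page image, the three words that differ in the comparison above can be scored with Python's standard difflib module. This is only a sketch of one possible measurement, not anything Cornell reports: the similarity ratio is difflib's own metric, and the helper names are this author's invention.

```python
import re
from difflib import SequenceMatcher

# Word pairs taken from the comparison above: (page image, OCR text).
PAIRS = [
    ("far", "fhr"),
    ("pamphlet", "pain- phiet"),
    ("emancipation", "emancipa- tion"),
]

def similarity(original, ocr):
    """Ratio between 0 and 1; 1.0 means the two strings are identical."""
    return SequenceMatcher(None, original, ocr).ratio()

def rejoin_hyphenated(text):
    """Undo line-break hyphenation by removing a hyphen followed by whitespace."""
    return re.sub(r"-\s+", "", text)

for original, ocr in PAIRS:
    repaired = rejoin_hyphenated(ocr)
    print(original, "|", ocr, "|", round(similarity(original, ocr), 2), "|", repaired)
```

Rejoining the hyphenated line breaks recovers "emancipation" exactly, but "pain- phiet" stays wrong even after rejoining, which suggests that some of the errors are simple layout artifacts while others are genuine misreadings of the type.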
The images themselves allow for users to zoom in once while the OCR-produced text allows a user to search a specific pamphlet or the collection at large. Beyond this, this author can only say that images are downloadable as low-quality .tif images but also printable at what appears to be a medium level of quality.
One of the pieces of this collection that this author was pleased to uncover is the following quote that Cornell used in describing the history of the Samuel J. May collection and of Cornell's Civil War collection:
In 1874 the abolitionists William Lloyd Garrison, Wendell Phillips, and Gerrit Smith, wrote, signed, and circulated an appeal to their friends and supporters in America and Great Britain, urging that it was of "great importance that the literature of the Anti-Slavery movement...be preserved and handed down, that the purposes and the spirit, the methods and the aims of the Abolitionists should be clearly known and understood by future generations." (Collection Description)
Not only does this author find this quote enjoyable from an historical perspective, but this author also believes the underlying principle of Cornell's digitization attempts with respect to this collection is highlighted in this quote: to ensure that the lightnesses and the darknesses of the past never fade or disappear.