Tag Archives: digital collection

Crowdsourced digital history: Project South at Stanford — via NPR.org

Linton Weeks, in a post for NPR, describes a unique digital audio archive at Stanford on the civil rights movement: Project South. Here is part of Weeks’ description of the archive:

The Background: Exactly 50 years ago this year — in the summer of 1965 — a group of eight students filtered out into the Southern United States. Under the aegis of Stanford’s Institute of American History — and with help from campus radio station, KZSU — the young people gathered more than 300 hours of amazing audio recordings. They interviewed a lot of people — young and old, black and white — including members of the Mississippi Freedom Democratic Party, the National Association for the Advancement of Colored People, the Student Nonviolent Coordinating Committee and the Southern Christian Leadership Conference.

In the mountain of material: Audio appearances by Ralph Abernathy, Charles Evers, Fannie Lou Hamer and Hosea Williams. Andrew Young leads a singalong. The enterprising students captured the sounds of a Ku Klux Klan meeting and an address by Robert Shelton, a KKK imperial wizard.

Weeks invites readers to help him crowdsource the archive….

Historically speaking, I need your help.

Davis Houck, a communications professor at Florida State University, recently pointed me toward a little-explored archive at Stanford University called Project South.

It’s an intriguing trove — full of original source material. In fact, it’s so rich with historical moments, I need your help to sort it all out.

So I am asking anyone who is interested — historians professional and amateur — to do some research sleuthing. Let’s commit historical crowdsourcery.


Davis Houck, who wrote about the Project South archive for the Clarion-Ledger in Jackson, Miss., in 2014 tells NPR History Dept. that the relatively obscure archives “are just remarkable: from the highs of Dr. King’s oratory to Fannie Lou Hamer’s amazing testimony, to lots of singing, to a Klan rally! And I would underscore, and keep in mind this is somebody who writes about civil rights for a living, there’s simply nothing else even remotely like it.”

Read the rest of the NPR article here: http://goo.gl/QuRKzG

Dr. Martin Luther King Jr. in 1965. AP

Filed under Digital Humanities

SLATE’s The Vault: Five of 2014’s Most Compelling Digital History Exhibits and Archives

From Slate’s history blog, The Vault, Rebecca Onion features five digital collections and/or historical websites:

2014 brought us a wealth of new digital archives and document-rich historical websites to peruse. Here, in no particular order, are five of the best such sites I saw this year.

Follow the link to enjoy. She promises a link to five more sites tomorrow.

Historical documents online: Five best digital archives from 2014.

Filed under Digital Collections, Digital Humanities

Darwin Manuscripts Project – The American Museum of Natural History


Filed under Digital Humanities

The Shelley-Godwin Archive

The Shelley-Godwin Archive.

The Frankenstein manuscript comes alive in this digital archive!!

THIS is what the digital humanities is about, in my opinion. Big data is important, yes; but what really jazzes me is when materials that have been hidden away in museums and libraries make their public appearance in ways that are beautiful, useful, and open access!

Filed under Digital Collections, Digital Humanities, digital surrogate

Main Street Public Library Database – Ball State University

Main Street Public Library Database – Ball State University.

This is a database related to the What Middletown Read digital collection.


Filed under Digital Collections, Digital Humanities, Library science

The King James Bible Virtual Exhibit : The King James Bible

The King James Bible Virtual Exhibit : The King James Bible.

Here is an interesting DH project from Ohio State libraries about the King James Bible. It was developed as part of a pilot project, as the developer describes below:

The exhibits pilot innovation grant project was a partnership of three departments, Digital Content Services (formerly SRI), Rare Books and Manuscripts, and the Web Implementation Team (now Applications, Development and Support). The Preservation and Reformatting Department (Amy McCrory) and the Copyright Resources Center (Sandra Enimil) were also heavily involved. The grant was “to develop a new model for creating and delivering digital exhibits at the Libraries.” The project was developmental in scope, and the specific goals were to create a polished digital version of a physical exhibit, and to gather information about what would be required to develop an exhibits program in the Libraries.

The King James Bible exhibit, curated by Eric Johnson, is indeed a polished exhibit.  We learned a great deal from working on it, such as the need to create a glossary of terms as reference for all people on the project.  We also identified the strengths and weaknesses of the Omeka software for our environment. The research into what it would take to build a sustainable program took many forms.  We looked at existing digital exhibits at OSUL, as well as curator expectations for exhibit functionality, and the use of Omeka at other institutions.  We tracked information on the time it took to create the exhibit.

What’s next?  The report is done and has been given to the Executive Committee.  The suggestions in the report are just that – suggestions.  We were not charged to develop a program.  We applied for funding to explore the possibilities; the report is what we discovered.  It is also worth noting that the environment has changed since the report was written.  Most important, is that the Libraries have hired an Exhibits Coordinator.  However, many of you have expressed interest in our results.

Read Report Here (docx).



Filed under Content Management, Digital Collections, Digital Humanities, digital repository, Library science, Omeka

Big data meets the Bard – FT.com

Big data meets the Bard – FT.com.

Yet another article about the perhaps “diabolical” use of “Big Data” in the humanities. The article describes the author’s reactions to a Skype seminar from the Stanford Literary Lab. While I don’t think that “Big Data” will replace actually reading novels, I did cringe at this quote:

Ryan Heuser, 27-year-old associate director for research at the Literary Lab, tells me he can’t remember the last time he read a novel. “It was probably a few years ago and it was probably a sci-fi. But I don’t think I’ve read any fiction since I’ve been involved with the lab.”

But reading books and analyzing Big Data, as I’ve said before on this blog, are different and complementary tasks.

Filed under Digital Collections, Digital Humanities, liberal arts colleges

Unit 5 – Using Drupal for my digital collection

Discuss either a) which module you decided to try from assignment 2 and how it enhances your collection; include, if you like, any problems or tips related to installation; or
b) now that you have some experience, how you feel overall about the suitability of Drupal for your collection.

It is clear that Drupal, in the hands of a trained Drupal programmer, would be a powerful and customizable tool for managing my digital collection. It does not seem to be designed, however, for the type of content I would like to include: many large searchable text files (in PDF or other formats, especially files with specialized markup). By that I mean that the native content types don’t lend themselves to it (although I have not experimented with the “book” type). Of course there are many modules that add that type of functionality, and I saw several that seemed designed to make RDF-type relations between nodes. But I was too intimidated by all the dependencies to try to install such modules, and the help material was too technical for a casual Drupal user to understand.

I did find an apparently simple module that added some necessary functionality to my site, i.e., the ability to search attached text files. The module is called, appropriately, search_files. Here is a screenshot of the kind of output the module produces:

Sample of the results returned by the Drupal search module

Because this is a crucial function for my collection, I decided to install it, even though it requires several “helper applications” in Linux.

Helper Applications

In order to extract text, this module calls ‘helper apps’ such as cat and pdftotext. Drupal administrators can configure any helpers they like. Helper apps need to be installed on the server and need to be set up to print to stdout.

Most Linux distributions have the following helper apps available:

  • cat – generic text (txt) files
  • pdftotext – Adobe Acrobat (pdf) Documents
  • catdoc – Microsoft Word (doc) Documents
  • xls2csv – Microsoft Excel (xls) files
  • catppt – Microsoft Power Point (ppt) files
  • unrtf – Rich Text Format (rtf) files

For more information about helpers and how to configure them, see hints for Linux and Windows. It is also possible to configure helpers in a shared hosting environment.
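The helper idea above can be sketched outside Drupal. The following is a minimal Python illustration, not the module's own code: it maps file extensions to the helper commands in the list, checks which helpers are missing from the server's PATH, and captures whatever a helper prints to stdout. The names `HELPER_TEMPLATES`, `helper_command`, `missing_helpers`, and `extract_text` are made up for this sketch, and the extra arguments (`-` for pdftotext, `--text` for unrtf) are assumptions about how to make those tools write plain text to stdout.

```python
import shutil
import subprocess
from pathlib import Path

# Extension -> command template; "{file}" is replaced by the file's path.
# Hypothetical mapping based on the helper list above.
HELPER_TEMPLATES = {
    ".txt": ["cat", "{file}"],
    ".pdf": ["pdftotext", "{file}", "-"],   # "-" sends the text to stdout
    ".doc": ["catdoc", "{file}"],
    ".xls": ["xls2csv", "{file}"],
    ".ppt": ["catppt", "{file}"],
    ".rtf": ["unrtf", "--text", "{file}"],  # plain text instead of HTML
}


def helper_command(path):
    """Return the argument list for the helper that handles this file, or None."""
    template = HELPER_TEMPLATES.get(Path(path).suffix.lower())
    if template is None:
        return None
    return [part.replace("{file}", str(path)) for part in template]


def missing_helpers():
    """List the helper programs that are not installed (not found on PATH)."""
    programs = {template[0] for template in HELPER_TEMPLATES.values()}
    return sorted(name for name in programs if shutil.which(name) is None)


def extract_text(path):
    """Run the matching helper and return what it printed to stdout."""
    command = helper_command(path)
    if command is None:
        raise ValueError(f"no helper configured for {path}")
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


print("Helpers not installed:", missing_helpers())
```

Running a check like `missing_helpers()` first is one way to tell a missing helper application apart from a problem in the module itself.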

I assumed that my Linux installation might already have these applications available, although I could enable them separately if need be. So I downloaded and installed search_files-6.x-1.6.

I had no difficulty installing it or configuring it in Drupal. But it can’t search the pdf files I have attached, so I’m assuming I also need to install the helper applications in Linux.

UPDATE: as it turns out, this module worked in Drupal 5 but is broken in Drupal 6. Evidently it works in Drupal 7, so hopefully when I update my system I can get this working. Else I will need to find a different CMS, because this search functionality is crucial.

Filed under Content Management, Digital Collections, Digital Humanities, Drupal, Operating systems, SIRLS 675

Unit 4 – Drupal as a content management system – initial thoughts

This week, you might choose to comment on how suitable Drupal might be for your collection. Begin to develop some criteria you would use to judge how well an application such as Drupal meets the needs of your collection and its users. We will expand on this problem over the semester.

We have been reading about the need for humanities scholars to be able to use a digital collection with a degree of confidence about the nature and authority of the relations between objects, and about the need to keep the structure of those relations clear so that the information added is objective rather than subjective. What I would really like to make is a database, collection, or semantic web of all the texts (with attached full text) that George Eliot read or interacted with, with some degree of confidence recorded about how influential those texts were. One could argue that there is a sort of taxonomy to how much she interacted with a text, in ascending order from hearing it read aloud, to reading it in translation, to reading it herself in the original language, to reviewing it, to editing it, to translating it from another language into English. These are all types of relations with a text. One can also argue that reading it more than once, or attesting to its influence in letters or in research notebooks, is a measure of influence. I was reading about RDF, and that seems exactly the sort of inferential structure I want to be able to capture, starting with the simplest: what she read, with some sort of statement about her relation to the text, and a documentary page showing the authority for that relation. One can infer the direction of influence between texts according to who read what and when.
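The taxonomy of relations described above could be captured as an ordered type, so that "how much she interacted with a text" becomes something a program can compare. This is only a rough Python sketch, not RDF itself: the class names, the `strongest_relation` helper, and the "Person A" / "Text A" records are all invented for illustration, and the authority field holds a placeholder rather than a real citation.

```python
from dataclasses import dataclass
from enum import IntEnum


class Relation(IntEnum):
    """Ways of interacting with a text, in the ascending order given above."""
    HEARD_READ_ALOUD = 1
    READ_IN_TRANSLATION = 2
    READ_IN_ORIGINAL = 3
    REVIEWED = 4
    EDITED = 5
    TRANSLATED = 6


@dataclass(frozen=True)
class Statement:
    """One person-relation-text statement, plus the evidence that supports it."""
    person: str
    relation: Relation
    text: str
    authority: str  # where the claim is documented (placeholder values below)


def strongest_relation(statements, person, text):
    """Return the highest-ranked relation recorded between a person and a text."""
    ranks = [s.relation for s in statements
             if s.person == person and s.text == text]
    return max(ranks) if ranks else None


# Hypothetical records, for illustration only.
statements = [
    Statement("Person A", Relation.HEARD_READ_ALOUD, "Text A", "placeholder authority"),
    Statement("Person A", Relation.READ_IN_ORIGINAL, "Text A", "placeholder authority"),
]

print(strongest_relation(statements, "Person A", "Text A").name)
```

Because the relations are ordered, a query such as "everything she at least reviewed" reduces to a simple comparison like `s.relation >= Relation.REVIEWED`.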

Because eventually I would want this to be part of a larger database of “Literary interlocutors,” I’m having trouble figuring out if the key entity in this collection is texts or a person. The way I envision the normalized tables in a database would be a table of persons, a table of texts, and a table of links between the two, in the form of “GE read Rousseau’s Les Confessions, in French, in 1834, according to these authorities, and here is a link to that edition of Les Confessions in French (or perhaps a digital image), plus a searchable English translation.” I have been thinking that I needed to include all the standard metadata for each text in each entry, but that seems a waste of space. The new and useful information to be collected is the table of links, so all I really need to capture is what I have underlined; each underlined phrase is a field in my collection.
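The three-table layout described above might look something like the following, sketched here with Python's built-in sqlite3 module. The table and column names are my own guesses at what the fields would be, and the single row simply restates the sample sentence from the paragraph; the authority and edition link are left as placeholders.

```python
import sqlite3

# Hypothetical schema: persons, texts, and a table of links between the two.
SCHEMA = """
CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE texts   (id INTEGER PRIMARY KEY, author TEXT, title TEXT NOT NULL);
CREATE TABLE links (
    id          INTEGER PRIMARY KEY,
    person_id   INTEGER NOT NULL REFERENCES persons(id),
    text_id     INTEGER NOT NULL REFERENCES texts(id),
    relation    TEXT NOT NULL,   -- e.g. read, reviewed, translated
    language    TEXT,
    year        INTEGER,
    authority   TEXT,            -- documentary evidence for the link
    edition_url TEXT             -- link to the edition or a digital image
);
"""


def build_demo_db():
    """Create an in-memory database holding the sample link from the post."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO persons (id, name) VALUES (1, 'George Eliot')")
    conn.execute(
        "INSERT INTO texts (id, author, title) VALUES (1, 'Rousseau', 'Les Confessions')"
    )
    conn.execute(
        "INSERT INTO links (person_id, text_id, relation, language, year,"
        " authority, edition_url) VALUES (?, ?, ?, ?, ?, ?, ?)",
        (1, 1, "read", "French", 1834, "placeholder authority", None),
    )
    return conn


def describe_links(conn):
    """Join the three tables and render each link as a sentence."""
    rows = conn.execute(
        "SELECT p.name, l.relation, t.author, t.title, l.language, l.year"
        " FROM links l"
        " JOIN persons p ON p.id = l.person_id"
        " JOIN texts t ON t.id = l.text_id"
        " ORDER BY l.id"
    ).fetchall()
    return [
        f"{name} {relation} {author}'s {title}, in {language}, in {year}"
        for name, relation, author, title, language, year in rows
    ]


for sentence in describe_links(build_demo_db()):
    print(sentence)
```

Keeping the standard metadata in the texts table and only the relation details in the links table avoids repeating each text's metadata in every entry.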

Any content management system I use for my collection will need to be able to search and manage large attached text files in a variety of formats, to query the collection of these files with a full-text search, and have a faceted search that narrows the query results by type of relation, by subject, by language, by year, or type of text file. I also want to be able to widen the search if necessary, though, across subjects, dates, etc. The idea is to be able to use this collection to specify a group of texts to search, and to be able to document the relationships and direction of influence between them. I would love to be able to actually graph the connected nodes in some sort of network display and to assess the degree of influence.
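To make the faceted search and network ideas above a little more concrete, here is a small Python sketch that narrows a list of link records by any combination of facets, counts the values available for a facet, and builds a simple person-to-text graph. Everything in it is hypothetical: the function names, the field names, and the "Person A" / "Text 1" records are invented for illustration, and it does not attempt the full-text search part of the requirement.

```python
from collections import Counter, defaultdict

# Hypothetical link records, for illustration only.
LINKS = [
    {"person": "Person A", "text": "Text 1", "relation": "read in original",
     "language": "French", "year": 1850},
    {"person": "Person A", "text": "Text 2", "relation": "reviewed",
     "language": "German", "year": 1851},
    {"person": "Person B", "text": "Text 1", "relation": "read in translation",
     "language": "English", "year": 1852},
]


def narrow(links, **facets):
    """Keep only the links that match every requested facet value."""
    return [link for link in links
            if all(link.get(field) == value for field, value in facets.items())]


def facet_counts(links, field):
    """Count how many links fall under each value of one facet."""
    return Counter(link[field] for link in links)


def build_graph(links):
    """Connect each person to the texts they interacted with, and vice versa."""
    graph = defaultdict(set)
    for link in links:
        graph[link["person"]].add(link["text"])
        graph[link["text"]].add(link["person"])
    return graph


print(narrow(LINKS, person="Person A", language="French"))
print(facet_counts(LINKS, "language"))
print(sorted(build_graph(LINKS)["Text 1"]))
```

Widening the search is then just a matter of dropping a facet from the call to `narrow`, and the graph could be handed to a network display tool for drawing.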

Filed under Content Management, Drupal, SIRLS 675

UVa Library: Digital Initiatives – R&D – American Studies Grant

UVa Library: Digital Initiatives – R&D – American Studies Grant.

Here is the link to the proposed American Studies Information Community at the University of Virginia, which sounds very much like the kind of portal I would like to develop for George Eliot studies.

An Information Community is a group of scholars, students, researchers, librarians, information specialists and citizens from similar or dissimilar fields, whose common link is a shared information need. This information need can be oriented around a subject, a field, a methodology, or a data type. The information can include text, data, digitized media, images, and formal and informal scholarly exchanges of ideas. Information Communities exist as a medium for bringing people together and making them aware of opportunities and resources. Community is fostered by personal communication, shared interests, shared research materials, shared tools, and shared standards. Information Communities add value to information, and offer opportunities for using information in new and different ways. Activities of the community can include creation of web-based materials, development of portable tools for enhancing access to the materials, and managing of conferences and publications. Information Communities foster innovation and spark new areas of research, and usually result in a tangible body of knowledge for consumers.

Filed under Digital Collections, Digital Humanities, George Eliot, Library science, SIRLS 675