Helping scholars explore the construction and binding of hundreds of manuscripts
Building an interface to the Bibliotheca Philadelphiensis Project that is easy to use, deep, and spans multiple collections.
University of Pennsylvania Libraries has its own manuscript collection and houses data for several other manuscript collections, including those of the Philadelphia Museum of Art, Temple University, Bryn Mawr and the Free Library of Philadelphia. They wanted an interface that covered all of these collections, based on the work we did at The Walters. This project is that interface for the Bibliotheca Philadelphiensis Project, so the site is called BiblioPhilly.
UPENN Libraries also wanted a new kind of interface that would help visitors understand in more detail how the manuscripts were built. Understanding a manuscript's collation helps scholars and curators understand its construction, its provenance, and the changes made to it by booksellers and owners. Dot Porter has a much more thorough explanation on her blog, and another post about transformative works covers some of the functionality in her words.
Every project we build has a content and data strategy. The first thing we need to do is get all the data into a place where we can explore it, either through data interfaces or through temporary data explorers: mini proof-of-concept interfaces we build just for us and the project's stakeholders. Because the data comes from multiple institutions, this project meant working with a much larger data set.
Any set of data will have issues, inconsistencies, and uncommon or edge-case attributes. When we merge multiple data sets to work as one, these issues become more challenging. We already had a system to import TEI-based data, but we needed to rewrite it entirely to make the inconsistencies easier to find, and to support partial imports because of the sheer size of the data.
The data comes from a series of XML documents, all listed on an HTML page that we screen scrape. The XML data needs to be indexed and normalized, pulling common data points into separate tables, such as tags for regions, centuries, subjects, etc.
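To make that import step more concrete, here is a minimal sketch in Python. It is not the project's actual importer. The flat `region`, `century` and `subject` elements are simplified stand-ins for the real TEI markup, and the table names, function names and sample values are all assumptions made for illustration.

```python
from html.parser import HTMLParser
import sqlite3
import xml.etree.ElementTree as ET


class XmlLinkCollector(HTMLParser):
    """Collects href values ending in .xml from an HTML index page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.endswith(".xml"):
                self.links.append(href)


def collect_xml_links(index_html):
    """Return the XML document links found in the index page, in page order."""
    parser = XmlLinkCollector()
    parser.feed(index_html)
    return parser.links


def make_db():
    """Create a hypothetical in-memory schema with a shared tags table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE manuscripts (id INTEGER PRIMARY KEY, shelfmark TEXT);
        CREATE TABLE tags (id INTEGER PRIMARY KEY, kind TEXT, name TEXT,
                           UNIQUE (kind, name));
        CREATE TABLE manuscript_tags (manuscript_id INTEGER, tag_id INTEGER);
    """)
    return conn


def normalize_tag(value):
    # Collapse whitespace and case so " France " and "france" share one row.
    return " ".join(value.split()).lower()


def import_record(conn, shelfmark, xml_text):
    """Index one record and link it to shared, de-duplicated tag rows."""
    root = ET.fromstring(xml_text)
    cur = conn.execute(
        "INSERT INTO manuscripts (shelfmark) VALUES (?)", (shelfmark,))
    ms_id = cur.lastrowid
    # Assumed element names; real TEI is namespaced and more deeply nested.
    for kind in ("region", "century", "subject"):
        for el in root.iter(kind):
            name = normalize_tag(el.text or "")
            if not name:
                continue
            conn.execute(
                "INSERT OR IGNORE INTO tags (kind, name) VALUES (?, ?)",
                (kind, name))
            tag_id = conn.execute(
                "SELECT id FROM tags WHERE kind = ? AND name = ?",
                (kind, name)).fetchone()[0]
            conn.execute(
                "INSERT INTO manuscript_tags VALUES (?, ?)", (ms_id, tag_id))
```

Because each record is imported on its own, a loop over `collect_xml_links(...)` can stop and resume at any document, which is one simple way to approach partial imports.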
Quires and Binding
To build a system for exploring a manuscript's collation, we first had to learn collation, and how a manuscript is created and bound. We cut and bound multiple pages into quires and turned them into makeshift manuscripts. Manuscripts are usually built from multiple pages stitched together and folded at the seam, often 4 or 6 folios, and these groups are called quires. The quires are stacked and sewn into a common binding, and then a cover is added that also covers the binding.
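The quire structure described above can be sketched as a small data model. This is an illustration, not the data model behind the collation explorer. It assumes regular quires, where nested folded sheets mean the first leaf shares a sheet with the last, the second with the second-to-last, and so on. The `Leaf` class and `build_leaves` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    quire: int      # 1-based quire number
    position: int   # 1-based position inside the quire
    conjoin: int    # position of the leaf on the same folded sheet
    folio: int      # running leaf number across the whole manuscript


def build_leaves(quire_sizes):
    """Expand a list of quire sizes (leaves per quire) into Leaf records."""
    leaves = []
    folio = 1
    for quire, size in enumerate(quire_sizes, start=1):
        if size % 2:
            raise ValueError("a regular quire has an even number of leaves")
        for position in range(1, size + 1):
            # In a nested quire, leaf i shares its sheet with leaf size + 1 - i.
            leaves.append(Leaf(quire, position, size + 1 - position, folio))
            folio += 1
    return leaves
```

For example, `build_leaves([4, 6])` describes a two-quire manuscript in which the fifth leaf overall is the first leaf of the second quire and shares a sheet with that quire's sixth leaf.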
Because page numbering is confusing in manuscripts, and because this is a second numbering system, we manually numbered the pages and used these models to create early designs and prototypes of our collation explorer. As we were modeling the interface, we turned pages left and right and flipped and folded them, all while testing the interface with non-scholars to see how close we were to something people would understand.
The collation explorer needed to help people understand the issues a manuscript might have, such as pages ripped out, pages added, pages rearranged, etc. In a few cases people had dropped manuscripts centuries ago and tried to put them back together, and in some cases edits were made centuries after the manuscript was created. It was important to have a system that starts to explain this visually, so missing and added pages each get their own visual tells in the explorer.
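As a rough illustration of giving missing and added pages their own tells, the sketch below renders one quire as a line of text marks and lists leaves that have lost their partner. It is only a sketch under stated assumptions: the status names, the marks and the `describe_quire` function are invented here, and added leaves are assumed to be single leaves that sit outside the original sheet pairing.

```python
# Hypothetical text marks standing in for the explorer's visual tells.
MARKS = {"original": "|", "missing": "x", "added": "+"}


def describe_quire(statuses):
    """Summarize one quire given leaf statuses in physical order.

    Returns (diagram, present, unpaired):
      diagram  - one mark per leaf, e.g. "| x + | |"
      present  - number of leaves physically in the book today
      unpaired - 1-based positions of original leaves whose partner is missing
    """
    for status in statuses:
        if status not in MARKS:
            raise ValueError(f"unknown status: {status}")

    diagram = " ".join(MARKS[s] for s in statuses)
    present = sum(1 for s in statuses if s != "missing")

    # Leaves of the original structure; added leaves are left out of pairing.
    structural = [i for i, s in enumerate(statuses) if s != "added"]
    if len(structural) % 2:
        raise ValueError("original structure should have an even leaf count")

    unpaired = []
    for k, i in enumerate(structural):
        partner = structural[len(structural) - 1 - k]
        if statuses[i] == "original" and statuses[partner] == "missing":
            unpaired.append(i + 1)
    return diagram, present, unpaired
```

For a quire recorded as `["original", "missing", "added", "original", "original"]`, the second leaf has been removed, a leaf has been added after it, and the fourth leaf is reported as unpaired because the other half of its sheet is gone.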
A Beautiful Interface
We helped UPENN Libraries create a beautiful, easy-to-use interface to 3,500 unique manuscripts ranging from the 9th to the 20th century. Visitors can engage with manuscripts through a search and discovery system that covers manuscript type, culture, descriptive terms, geography and much more. A simple page-turning interface lets people interact naturally with a manuscript, with panels that cover the chapters, the illuminations and the collation explorer described above.