May 4, 2024
A great deal of work has, as often, gone on behind the scenes at Perseus during the semester. The most important task is the slow but steady transition to Perseus 6. There are several components to this work.
(1) Transitioning to Knowledge Futures’ PubPub publication platform and creating a Perseus Data Journal. We are working with Knowledge Futures to replace the HTML Perseus homepage and the WordPress-based Perseus Updates blog. The PubPub publication platform allows us to generate citable publications (including DOIs). PubPub supports standard features of academic writing (such as footnotes) and also makes it easy to add static images, videos, and interactive visualizations – crucial components for digital publication. This transition will in turn allow us to create a Perseus Data Journal and thus to begin documenting systematically both the materials that are already within Perseus and new content. The Journal of Open Humanities Data is an inspiration for this effort, but we will need to publish far more than we can expect this one venue to digest. We have also already begun to use PubPub as a medium for publishing coursework and theses and expect that more content will shift to PubPub. For those interested in how this might look, the Harvard Data Science Review provides one example.
(2) Integrating the Scaife Viewer and Beyond Translation: In January 2024, James Tauber took over the primary development of Perseus 6. When he began, we had two systems that had similar default interfaces but completely different backend architectures. The Scaife Viewer (https://scaife.perseus.org/) has used the CapiTainS suite of tools for the serving and processing of texts. Scaife uses TEI XML with citation data structured to be compatible with the Canonical Text Services data model. Scaife provided us with a sustainable framework for publishing new content and contains the most up-to-date version of Open Greek and Latin, a collaborative effort to expand the amount of Greek and Latin textual content available under an open license and in a consistent format. The Scaife Viewer also provided the framework for Brill’s Scholarly Editions. Almost all of Brill’s scholarly editions lie behind paywalls – the Literary History of Medicine, funded by the Wellcome Trust, is the one Brill edition currently available under a CC license. Readers can explore this to see Brill’s version of Scaife (and to observe how Scaife supports Classical Arabic). The additions to the Scaife code-base are, however, open source, and Brill supported the development of the tools needed to provide full support for critical editions (such as Ammianus Marcellinus Online), including editions of fragmentary authors (such as Jacoby Online). Readers can see these services applied to Christopher Marlowe and to the Alexandrian War (an account of events in Egypt during the Civil War between Caesar and Pompey).
Our collaborators Jacob Wegner and James Tauber, however, decided that they needed a more generic backend. First, we wanted to be able to publish a wide range of openly licensed content and we decided that we did not want to require that everything be CapiTainS compliant. Second, and more importantly, our goal in Beyond Translation was to integrate many different categories of annotation, and many of these classes of annotation were published in formats such as CSV and JSON. While we continued to use the Canonical Text Services data model as a way to integrate multiple streams of standoff annotation, Jacob Wegner and James Tauber developed the Aligned Text and Linguistic Annotation Server (ATLAS) to accommodate an open-ended set of annotation classes.
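To make the idea of standoff annotation concrete, here is a minimal Python sketch of how annotations arriving as CSV and as JSON might be grouped under the Canonical Text Services reference they point to. It is an illustration only and not ATLAS code: the column names, the JSON layout, the sample rows and the function names (parse_cts_urn, collect_annotations) are all assumptions made for this example. The only thing taken from the CTS model is the general shape of a CTS URN (urn:cts:NAMESPACE:WORK, optionally followed by a passage reference).

```python
import csv
import io
import json
from collections import defaultdict
from typing import NamedTuple


class CtsUrn(NamedTuple):
    namespace: str  # e.g. "greekLit"
    work: str       # dot-separated textgroup.work.version
    passage: str    # citation such as "1.1"; empty if the URN names a whole work


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a CTS URN of the form urn:cts:NAMESPACE:WORK[:PASSAGE]."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[:2] != ["urn", "cts"]:
        raise ValueError(f"not a CTS URN: {urn!r}")
    passage = parts[4] if len(parts) == 5 else ""
    return CtsUrn(parts[2], parts[3], passage)


# Hypothetical sample data. Neither layout is an actual ATLAS import format.
LEMMA_CSV = """urn,token,lemma
urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.1,μῆνιν,μῆνις
urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.1,ἄειδε,ἀείδω
"""

TRANSLATION_JSON = """[
  {"urn": "urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.1",
   "text": "Sing, goddess, of the wrath of Achilles (illustrative gloss)"}
]"""


def collect_annotations(lemma_csv: str, translation_json: str) -> dict:
    """Group two independent annotation streams by the URN they target."""
    by_urn = defaultdict(lambda: {"lemmas": [], "translations": []})
    for row in csv.DictReader(io.StringIO(lemma_csv)):
        parse_cts_urn(row["urn"])  # validate the reference before using it
        by_urn[row["urn"]]["lemmas"].append((row["token"], row["lemma"]))
    for item in json.loads(translation_json):
        parse_cts_urn(item["urn"])
        by_urn[item["urn"]]["translations"].append(item["text"])
    return dict(by_urn)


if __name__ == "__main__":
    for urn, layers in collect_annotations(LEMMA_CSV, TRANSLATION_JSON).items():
        print(urn, layers)
```

The point of the sketch is that the text itself is never modified: each annotation class lives in its own file and format, and the shared citation is the only thing that ties the layers together, which is what allows new annotation classes to be added later.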
Readers can get a sense of the kinds of data that we currently support and the formats that we have been using by
In the first half of 2024, James Tauber has unified the Scaife and Beyond Translation backends. This involved rewriting and refactoring earlier versions of code developed for Scaife and Beyond Translation. It also involved progress towards publishing guidelines for the formats that we would import into Perseus 6. In our earlier work, we had made a point of importing data from multiple sources that was structured in various ways. Our goal was to discover how others had encoded their data and to determine how well we could use available content. Our goal now is to provide documentation for the formats we will immediately support.
Until now, James Tauber and his colleagues have run the Scaife Viewer and Beyond Translation sites. The transfer to Tufts has been deferred for a variety of reasons as Tufts updated and modified its own computer services. We are now finally in the process of setting up Perseus 6 (the combined Scaife/Beyond Translation system) for the first time under Tufts’ IT infrastructure.
Once the combined Scaife/Beyond Translation system is operational at Tufts, the next step will be to begin providing initial documentation for how to structure data to be included in the new Perseus and how others could use the open source tools that Eldarion and now Signum University have created in their own digital libraries.
(3) Reintegrating Art and Archaeological Content into Perseus: Clifford Wulfman of Princeton University Library is working with us to update the Perseus Art and Archaeology collections. When we first developed Perseus in the 1980s, one of our major goals was to be able to integrate the material and the textual record. To do so, we needed not only to build a system that could manage textual and material data but also to create the digital images and metadata that we would use. In the late 1990s, however, the Web had emerged and many efforts (in particular, the German Arachne project) had begun to publish digitized images and metadata. We felt that we could focus upon the textual issues and so, under Perseus 4, we created a separate and relatively simple Art and Archaeology Artifact Browser. The metadata that we have collected and the images are quite general. We need to accomplish several goals.
First, we need to replace the artifact browsers with a more modern and sustainable system. We will be basing our work on the International Image Interoperability Framework (IIIF), which has enjoyed increasingly broad adoption among museums and libraries around the world. In this first stage, we will essentially replicate what we have now, but the transition to IIIF makes another development possible.
Second, once we have a IIIF-based system, we can import metadata from other collections and create a virtual Perseus Art and Archaeology collection that uses metadata and displays images from IIIF-based systems in the US and abroad.
Third, we need to revitalize images produced on 35mm film in the late 1980s and 1990s. For almost a decade, Maria Daniels worked as a full-time photographer for Perseus, creating tens of thousands of images for thousands of objects in dozens of museums in the US and Europe. When we began work in 1986, we were publishing our images on videodiscs and could have collected images much more cheaply as high-quality video. We invested in 35mm film, however, because we wanted to be able to digitize these images at a higher resolution in the future. We began work before the World Wide Web and we were publishing on videodisc and CD-ROM. When the Web appeared, we went back to museums to update our permissions in the 1990s. Many were then reluctant to allow us to distribute any but low-resolution images of the pictures that we had produced. We contacted museums again in summer 2023 to request more open rights but received almost no response. Harvard did, however, give us the rights that we need and so we will use images from the Harvard collections to dramatize what can be done with higher resolution images. The hope is then to have a compelling use case for museums to let their images be made available and to get the funding to do the retro-digitization of the photography.
Fourth, we need to begin reintegrating the visual record into Perseus. This effort must exploit multiple opportunities, including applications of AI such as text-to-image, the use of photogrammetry to convert 2D images into 3D representations of space and objects, and advances in Geographic Information Systems and visualizations.
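For readers unfamiliar with IIIF, the following Python sketch shows why a IIIF-based system makes it straightforward to display images held elsewhere: under the IIIF Image API, any region of an image at any size can be requested through a URL of the form {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. The server address and image identifier below are hypothetical placeholders, not Perseus or museum endpoints, and the helper names are assumptions made for this example.

```python
from urllib.parse import quote


def iiif_image_url(base: str, identifier: str, region: str = "full",
                   size: str = "max", rotation: str = "0",
                   quality: str = "default", fmt: str = "jpg") -> str:
    """Build a IIIF Image API request URL.

    Pattern: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    # The identifier is percent-encoded, including any slashes it contains.
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


def iiif_info_url(base: str, identifier: str) -> str:
    """URL of the info.json document that describes an image's technical properties."""
    return f"{base.rstrip('/')}/{quote(identifier, safe='')}/info.json"


if __name__ == "__main__":
    # Hypothetical server and identifier, for illustration only.
    base = "https://iiif.example.org/image"
    ident = "vases/1990.01.0001"
    print(iiif_image_url(base, ident))                # whole image, largest size offered
    print(iiif_image_url(base, ident, size="600,"))   # whole image scaled to 600px wide
    print(iiif_image_url(base, ident, region="pct:25,25,50,50"))  # central detail
    print(iiif_info_url(base, ident))
```

Because the same URL pattern works against any compliant server, a virtual collection only needs to store metadata and image identifiers; the images themselves can stay with the institutions that hold them.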
(4) Updating Perseus collections beyond Greco-Roman culture. At Perseus, we have expended considerable effort developing projects that go beyond the Greco-Roman world. We wanted to develop methods that are as general as possible. We developed collections on the History and Topography of London that included texts, images, maps and geographic data. We built collections for Old Norse and Arabic. We created a substantial curated collection on 19th-century US culture.
The new Perseus can certainly include the content that we have produced. The problem is that we may need to update the format for many of our older collections. In some cases, importing is remarkably smooth. To test our ability to publish a critical edition with textual notes, we drew upon a digital edition of Christopher Marlowe that Hilary Binda had produced in the late 1990s. Twenty-five years later, we were able to use the original TEI XML. Working with James Tauber, Cliff Wulfman will also begin work on updating the Perseus Renaissance Collections that he himself developed when he worked at Perseus in the early 2000s. The goal is to smooth the way towards updating and importing other collections, including older Perseus materials and the growing body of newer, openly licensed content.