On Saturday, November 9, Washington University in St. Louis hosted a lively (and successful?) THATCamp: THATCamp St. Louis, 2013. One of the first sessions combined two suggested topics: “Beginners in the Digital Humanities” and “Subject Librarian DH boot camp,” which I volunteered to co-facilitate with Twyla Gibson. I hope that discussing some of the broad outlines of digital humanities was helpful, but there was a lot to follow up on. The thin list here leads to pages and resources with pointers to many, many more resources for exploring DH.
Overview of DH
In the “Beginners” session, we discussed some of the broad outlines of digital humanities: the dispute over “hack vs. yack”—the practical creation of tools and resources (especially scholarly digital editions) versus broad theoretical considerations. Within the “hack” camp of DH is another division—not less contentious, but with slightly differing aims and perspectives: scholars involved with scholarly editing (and “digital projects” more broadly) on the one hand, and scholars interested in text mining, on the other. The following are good resources for anyone getting started in DH:
TEI, XML, encoding
Not all XML that might be relevant to faculty members, archivists, librarians, and others in cultural heritage organizations is necessarily DH, but some of it certainly deserves mention, such as EAD (Encoded Archival Description) for encoding finding aids, PREMIS for describing objects in a digital preservation environment, and many others! On the other hand, as prevalent as TEI (Text Encoding Initiative) is, it is also not the only XML standard relevant to DH, as others have grown out of it, such as the Charters Encoding Initiative (CEI) and EpiDoc, for Epigraphic Document Encoding.
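For anyone who has not seen TEI before, here is a minimal sketch of what encoded text looks like and how a script might read it. The letter is invented for illustration, and the helper name `tagged_names` is my own, not part of any TEI tool; only the TEI namespace and the element names (`persName`, `placeName`) come from the standard itself.

```python
import xml.etree.ElementTree as ET

# The TEI namespace is standard; the prefix "tei" is just a local label.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# An invented sample document, written only for this example.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample letter (invented)</title></titleStmt>
      <publicationStmt><p>Unpublished illustration.</p></publicationStmt>
      <sourceDesc><p>No source; written for this example.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p><persName>Mary</persName> wrote from <placeName>St. Louis</placeName>
      to <placeName>Chicago</placeName>.</p>
    </body>
  </text>
</TEI>"""


def tagged_names(tei_xml, element):
    """Return the text of every <element> in the TEI namespace, in document order."""
    root = ET.fromstring(tei_xml)
    return [el.text for el in root.iterfind(f".//tei:{element}", TEI_NS)]


places = tagged_names(SAMPLE, "placeName")
people = tagged_names(SAMPLE, "persName")
print(places, people)
```

The point of the encoding is that "St. Louis" is no longer just a string of characters: the markup says it is a place, so a program can pull out every place in a corpus without guessing.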
- Ted Underwood’s The Stone and the Shell blog on Where to start with text mining
- Steve Ramsay, Reading Machines: Toward an Algorithmic Criticism (theory, but also dealing with text mining)
- Drew Conway, Machine Learning for Hackers
- The Metadata Offer New Knowledge (MONK) project
Thanks to all who attended and who organized, especially the thoughtful and thorough Olin Library staff.
There’s always some kind of magic at a THATCamp, including the mystery that there will never be enough information to know in advance about all the good sessions, everyone will always miss some good ones, yet that uncertainty is part of what makes it possible for the great sessions to be great. All the possibilities were not figured out in advance.
What I especially like about THATCamp is that the very form is a discovery process. We don’t know what we are collectively best suited to talk about or do until we all show up. So much tacit knowledge can be discovered and shared when people from different institutions, job descriptions, experiences, disciplines, and side interests try to have conversations with each other that are low stakes, speculative, practical, and motivated in the best way.
May there be another THATCamp in the St. Louis region again before too long!
We want to thank everyone who participated in and led sessions at THATCampSTL. You all made it a great day!
In the last few years, Saint Louis University has generated some great tools and projects for research. At the least, I am willing to share T-PEN, our tool for transcription of digital images, and discuss/demo the forthcoming (April 2014) Tradamus tool, which seeks to be a modular but end-to-end solution for creating a digital edition.
Probably, I will also talk about vHMML, which is an online resource for learning coming out in Spring, and a few other projects I can’t commit to text, but cannot keep secret.
In a perfect conference, these demos will only serve to show what chasms are left in tool development as I seek to find the next great need and expand these tools into other fields.
Every project I have worked on has been in collaboration with at least two other institutions and began with a sturdy “This would be cool, if it were not impossible” conversation. I want to finish another one of those conversations, but I would be happy to start one.
I am pretty sure there are no technological hurdles left to crowdsourcing everything.
As digital editions and big data projects begin to allow deeper access to their processes, citing the contributions made by those from whom you have lifted already assembled datasets or important cataloguing conventions becomes very difficult, but can be glossed over without consequence in most applications. When these important micro-contributions come from hundreds of people across several disciplines and a range of credentials, the task becomes near impossible and much more important.
Massive sites like Wikipedia have developed conventions (in addition to their official flags, stubs, and citation formats) that discern between contributors who share knowledge on a topic, those who flit about correcting spelling and grammar, and those who seek out citations to flesh out incomplete articles. Is this folksy approach the future? Can and should the value attached to someone who applies professional polish to a scholarly article be different from the workhorse who dropped a mangle of data and conclusions into a public area? Is the artist who created the visualization that makes it all accessible simply an illustrator?
When I first began work in Digital Humanities, I was too ignorant to anticipate how often I would be told this as a developer. So often, it turns out, that I’d like to start an argument.
It seems that a digital humanist who is capable of programming prefers the command line, where she can break into anything she wants. If she is smart enough to research and program, goes the reasoning, she is clever enough to decide what customizations make the perfect tool for her research. On the other side, at an institution with the resources for great minds and strong technical support, the most erudite researcher can configure a task so precise that even an uninitiated programmer can run an appropriate analysis and return wonderful data or at least a helpful visualization to the (often digitally hypo-literate) taskmaster.
The pyramids and evolution have shown that if you throw enough bodies at something, it will get done, but a tool is something special. Every craft has a rich history of interplay between those who pushed the limits of possibility and the new designs that made sure that limit was ever-expanding.
Digital Humanists represent a very different audience from most web design or software projects – an opportunity that is often missed. Time taken to restrict bad data input causes conflict with data models that may be based on now incomplete or incompatible scholarly conventions. The interface is adjusted; the tool is improved. This loop creates a tool that generates better data and more completely describes the scholarly work while simultaneously creating (de facto) or reinforcing data standards. The result is well-composed knowledge that is completely portable, dissectable, criticizable, citable, and reusable.
This success is powered by the scholar, sharpened by the focus of the designer, and accelerated by the tool born from the interaction between them.
I am happy to share my experiences with people beginning projects. I am very interested in hearing from others who have completed reusable digital humanities tools and discussing what sorts of possibilities exist in disparate solutions applied to emerging problems.
Markup, I insist, was the necessary jolt to encourage machine-readable encoding. XML is a convenient vehicle to bridge relational tools and linked open data (LOD) (or any triple), but the weight limit of RDF/XML has been exceeded. The standards for annotation are necessary for interoperability and the exposure/discovery of LOD, but they are also a very useful way to work with offline, local, or private/siloed data.
I am able to share experience with OAC and the manuscript-focused children of OAC, SharedCanvas and IIIF. These standards were emerging as the transcription tool T-PEN was being completed. That timing allowed us to include features that were previously unplanned, and it filled me with healthy discontent at the tool's completeness. Our current project, a tool for the complete creation of digital editions (focusing on manuscripts), makes heavy use of these standards and is dangerously near spawning a few of its own.
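To make "annotation as triples" concrete, here is a rough sketch of the core Open Annotation shape: an annotation that links a body (what is said) to a target (what it is said about). The `oa:` namespace and the `hasBody`/`hasTarget` properties come from the Open Annotation model; every `example.org` URI, the region fragment, and the function names are made up for illustration and are not how T-PEN stores anything.

```python
OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def annotation_triples(anno_uri, body_uri, target_uri):
    """Describe one annotation as three (subject, predicate, object) triples."""
    return [
        (anno_uri, RDF_TYPE, OA + "Annotation"),
        (anno_uri, OA + "hasBody", body_uri),
        (anno_uri, OA + "hasTarget", target_uri),
    ]


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Hypothetical URIs: a transcription of one line, attached to a region of a page image.
triples = annotation_triples(
    "http://example.org/anno/1",
    "http://example.org/transcription/line-1",
    "http://example.org/canvas/folio-1r#xywh=100,200,400,50",
)
print(to_ntriples(triples))
```

Nothing here needs a server or a public endpoint, which is the sense in which the same model works for offline or siloed data.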
I would like to learn about other efforts in annotation, especially in fields outside of manuscripts. What already exists, what is in flux, and has this shift impacted the way you organize data?
At the very least, I would like to debate whether annotation is a fad or there is a real possibility that markup will get out of the way and we may be left with a single pristine artifact that takes the universe as its metadata.
It would be really cool to have a program that would semi-automate the process of geocoding textual data within a document in preparation for GIS analysis. Given the complexities of place names, such a program would require some user validation for each data item. As far as I know, there is no free software that does this. A really ambitious program would also be able to map the data and expose the strengths of relationships among places for any set of text documents. My particular interest in such a tool would involve the use of oral history transcripts. I realize this is a pie-in-the-sky proposal.
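To show the shape of what I have in mind, here is a very rough sketch, not an existing program. It assumes a tiny hand-made gazetteer (the `GAZETTEER` dictionary, with approximate coordinates for illustration only) and a `confirm` callback standing in for the user validation step; a real tool would need a full gazetteer and a much smarter way of spotting place names in a transcript.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-gazetteer: place name -> candidate (label, lat, lon) tuples.
# Coordinates are approximate and only for illustration.
GAZETTEER = {
    "St. Louis": [("St. Louis, Missouri", 38.63, -90.20)],
    "Springfield": [
        ("Springfield, Illinois", 39.80, -89.65),
        ("Springfield, Missouri", 37.21, -93.29),
    ],
}


@dataclass
class Match:
    name: str          # the place name as found in the text
    offset: int        # character position in the text
    candidates: list   # possible locations from the gazetteer
    ambiguous: bool    # True when more than one candidate exists


def find_places(text, gazetteer=GAZETTEER):
    """Find every gazetteer name in the text, in order of appearance."""
    matches = []
    for name, candidates in gazetteer.items():
        for hit in re.finditer(re.escape(name), text):
            matches.append(Match(name, hit.start(), candidates, len(candidates) > 1))
    return sorted(matches, key=lambda m: m.offset)


def geocode(text, confirm):
    """Ask `confirm` about every match; it returns a candidate or None to reject."""
    resolved = []
    for match in find_places(text):
        choice = confirm(match)
        if choice is not None:
            resolved.append((match.name, choice))
    return resolved


transcript = "We moved from Springfield to St. Louis in 1952."
# Stand-in for a human reviewer: always accept the last candidate offered.
result = geocode(transcript, confirm=lambda match: match.candidates[-1])
print(result)
```

In an interactive version, `confirm` would show the surrounding sentence and the candidate list to the user, which is where the real work (and the real value for oral history transcripts) would be.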
The concept of collaborative scholarship has generated tremendous excitement among digital humanists. Much of the discussion has been restricted to the transformative impact of social media on the process and output of academic research. This session will extend the dialogue by exploring possibilities for collaboration among academic scholars, students, and communities. What models exist for constructive collaboration of this type? What might such scholarship look like? What are the technological, logistical, and institutional obstacles? How might they be overcome? What are the implications for the traditional compartmentalization of academic life into the categories of research, teaching, and service?
The application of GIS mapping to humanities research has been recognized as a primary catalyst for what has been termed the “spatial turn” in humanities scholarship. Despite tremendous excitement about the power of visualization to expose heretofore hidden relationships and patterns, it remains unclear just what these methodological techniques have contributed in terms of either raising new questions or answering longstanding ones. This discussion will attempt to inventory some of the most interesting and ambitious efforts to employ spatial tools such as GIS mapping and 3D visualizations in humanities research and consider whether these spatial tools have the capacity to add significant value to humanities scholarship.
My particular interest is the field of urban history but I would be interested to learn how visualizations are employed for spatial analysis in other fields and disciplines within the humanities.