My long wait for Martin Paul Eve’s book on Open Access in the Humanities is apparently going to be a little longer – it seems that the OA copy isn’t up on Cambridge UP’s website yet, and physical copies are still in presale.  I’ve been waiting for months to see this come out, so the added agony is killing me.

But! In the meantime, I plan to check out another open access book of interest: The History Manifesto.  This looks like it’ll definitely provide an interesting perspective, even if there isn’t a chapter on digital economics.



As part of my research at the Huygens Institute in The Hague, I’ve been reading A LOT lately, and on the advice of a visiting faculty member I’ve started tracking all of my articles and books in my Zotero account.  You can view my library here:  I haven’t done much organization of the collection at this point, but I plan to go through and tag contents and break it up into thematic collections, so check back if you’re interested in digital preservation of digital humanities publications, copyright, piracy, or scholarly communication.


“A book will always have its role… But the opportunity is to use a technology built for discourse to create an unprecedented good for scholarship.”

- Cameron Neylon (PLoS), in discussing monograph publication and the move to OA as part of his closing keynote at Open Access monographs in the humanities and social sciences conference


This morning I posted a general meditation on the DH BeNeLux conference to the DiXiT blog, outlining some of the major themes and takeaways of the conference.  This leaves the choicer, more granular bits to be lolled about here on my personal blog.  So then, this post is meant to provide a deeper understanding of the conference, but is also really an avenue for me to work through a few concepts that have been sitting at the edge of my mind for the past week.

So first, to contextualize – the DH BeNeLux conference, in its first year, is a collaboration between various cultural heritage organizations and research centers in Belgium (Be), the Netherlands (Ne), and Luxembourg (Lux) and aims to foster a sense of local community within the larger context of the digital humanities.  While there seem to be a lot of meetup groups in various parts of the world, this multi-country collaboration, to me, felt fresh.  Because there were still researchers from very different cultures in attendance, the more international issues like conference language were broached; however, unlike a more international conference like ADHO’s annual DH, the use of English seemed a lot less justifiable as none of the organizing institutions utilize the language officially (to my knowledge).

The conference raised a lot of interesting points overall, and even the final panel discussion felt more like the beginning of a conversation rather than the end.  So with this fact in mind, here are a few discussion starters that I grasped from the various DH BeNeLux sessions I attended.

1. Digital Humanities is not digital humanities is not the humanities AS digital.


Whitacre’s Virtual Choir 3

This is a really important point  that was apparent in a lot of the breakout sessions, but truly laid bare by Cissie Fu.  A faculty member at Leiden University College in The Hague, Fu’s presentation exploring crowdsourcing in choral arrangements (specifically highlighting the Virtual Choir work of Eric Whitacre, which should be watched and rewatched in surround sound) was a resounding hit as part of the Day One session on Crowdsourcing; however, it was actually her comments in a few other presentations that really put forth the need for a deeper discussion of semantics.

During a presentation by Niels-Oliver Walkowski on the Preliminaries to a Digitally Carried Out Philosophy (as part of the generically-titled About DH session), Fu engaged Walkowski in an interesting discussion of his use of ‘digital’, in which she asserted that he was using the word predicatively, rather than attributively.  She brought up a similar point during the final panel discussion, though apropos of what, I don’t remember.  Her assertion, though a bit difficult to engage with outright, was something I wrote in my notebook as a point I wanted to return to.  Now that I’ve had time to roll it around in my brain (and refamiliarize myself with the grammatical terms – thanks, Google), I think it’s an important thing to talk about.  How do we as digital humanists (capital D capital H) differentiate ourselves from humanists working with the digital… or do we?

Further, what constitutes a digital humanities project?  Looking over some of the projects coming out of even the top digital humanities institutes around the world, I often can’t help but wonder “isn’t that just a digital archive?”  When I spoke to a traditional humanist about this issue, he asserted that a lot of digital humanities projects seem like “humanities with maps”, often with a questionable methodology.  While that’s just one scholar’s understanding of the field, it really points to the need to pin down a semantic understanding of ‘digital’ in this context.  Or not.  But at the very least we need to elucidate what ‘digital humanities’ means in the context of our own projects, so that we’re not all being lumped into the same, often wrong, box.

2. External factors need to be better addressed in terms of their impact on the field.

Albert Meroño-Peñuela presenting on the Short Title Catalogue, Netherlands during the Linked Open Data breakout.

One of the most interesting presentations I saw during the conference was that of Alastair Dunning, who discussed the Great 20th-Century Hole in digital humanities projects, asserting that copyright has, to a large extent, excluded more recent works from being studied in the field.  As a copyright nerd, this was one of those ah-ha moments, where something so obvious was laid out so succinctly that I had to wonder why I had never considered it before.  The data from Dunning’s study of some of the top DH research centers (which can be viewed in his presentation slides here) is very straightforward, and I’ll be interested to see what comes of it.

Another issue worth mentioning here is something that Max Kemman talks about in his post on the conference, and that is the fact that humanities data is often siloed in collections.  The heavily-attended Linked Data breakout on Friday provided some insight into how this is changing, but there are still issues (e.g., audience members asking how a project would map to other existing LOD projects like Pleiades, and presenters not having a plan in place).  Saskia Scheltjens, librarian at Universiteit Gent, did address this issue in the final panel by suggesting that libraries were the place where desiloization occurs, but this raises yet another discussion point…

3. Digital humanities ≠ digital preservation, and we need to figure that out.

A common question in most of the breakouts that I attended was “What are you planning to do with the data once [insert techie project name here] is complete?”, and a common answer was that other researchers will figure that out once the data is published.  While it’s one thing to leave interpretation up to the scholarly masses, it’s a whole different issue that many (though not all) digital humanists leave long-term preservation of digital projects up to other entities like libraries to solve at the end of the project cycle.  As a digital preservationist myself (and now as someone studying the long-term sustainability of digital scholarly editions, har har) I can feel my blood start to boil just now.  Preservation strategies need to be implemented at the beginning of a project and adapted as it goes, otherwise the library is simply going to become a place where DH projects go to die.

This is probably a good stopping point, before the entire post becomes a heated discussion of best practices in digital preservation.  Suffice it to say that the conference was amazing for generating great ideas and discussions, allowed researchers to share final and mid-cycle projects and receive feedback, and also the conference swag was highly notable.



DH BeNeLux, or Digital Humanities Belgium-Netherlands-Luxembourg in longform, is happening today at the KB Library in The Hague (the research institute where I work is on the fifth floor in the same building).  Follow along with the hashtag (#DHBenelux) and official Twitter account (@DHBenelux) if you can.  The program is amazingly full of some of the top digital literary and history scholars working on cool projects, so I can’t wait to hear what they have to say.

As a heads up, if you follow my own Twitter account (@theglobal_lib) I’ll be livetweeting this schedule (on UTC time):

Thursday, June 12

13:00-14:00 Keynote: Melissa Terras

14:00-15:00 About DH

Beyond patterns: Using digital methods to find and think about particularities - Hieke Huistra, Bram Mellink

Preliminaries to a Digitally Carried Out Philosophy - Niels-Oliver Walkowski

15:30-16:50 Crowdsourcing

How to best chain humans and machines together: From image identification to crowdsourcing to the social graph of European integration - Lars Wieneke and Marten Düring

Migration stories in a digital era. Participative and biographic collecting at the Red Star Line Museum Antwerp - Marie-Charlotte Le Bailly

Telematic Resonance in Digital Performance: Choral Crowdsourcing Considered - Cissie Fu

crowdsourced citizen science in the e-humanities - René Voorburg

Friday, June 13

09:15-10:15 Copyright

The great 20th-century hole: What the Digital Humanities miss - Alastair Dunning

Authorship and authenticity in the visions of Elisabeth of Schönau - Renske van Nie

Digital Humanities research and intellectual property rights: the case of The riddle of literary quality - Tjeerd Schiphof and Karina van Dalen-Oskam

10:45-11:45 Linked Data

Linking the STCN and Performing Big Data Queries in the Humanities - Wouter Beek, Rinke Hoekstra, Fernie Maas, Albert Meroño-Peñuela and Inger Leemans

Persistent identification: supporting digital humanities - Alina Saenko

Talk of Europe – Linking European Parliament Proceedings - Max Kemman and Astrid van Aggelen

12:00-12:45 Panel Session

12:45-13:00 Closing remarks

This entry was originally posted on January 13, 2014.  It outlines the institution-wide file format inventory that I undertook at Dumbarton Oaks as part of my work as a Library of Congress National Digital Stewardship Residency fellow.

In order to develop a better understanding of the holdings at Dumbarton Oaks as part of my NDSR project, I have been working on a file-level inventory that can hopefully be embedded in a digital preservation workflow process at DO in the future.

The benefits of an inventory are manifold, but these are a few that I highlighted in a recent presentation (all originally adapted from this DROID user guide):


The inventory basically tells us what we have, how much we have, where we have it, and most importantly, what user behaviors surround the creation and management of digital assets.  Keep these goals in mind as you are working, because undertaking a file-level inventory won’t be easy.  There really aren’t a lot of tools out there, and the ones that are there require a pretty solid base of technical knowledge.

The two tools I decided to try out were JHOVE2 and DROID.

The first was my main focus, as it integrates DROID, and therefore calls to the very robust PRONOM file registry.  On top of this, JHOVE2 includes validation of files, which is an added bonus when compared to DROID.

Drawbacks of JHOVE2, however, were pretty insurmountable in my project implementation.  They included the need to run now-outdated Java 6, and the lack of a GUI.


Command line, anyone?

The main problem that I ran up against with JHOVE2, however, wasn’t the actual implementation (all of the basic commands needed are outlined in the handbook, so even a relative novice can run it), but rather the reporting.  After going through all of the steps, the tool was spitting out a massive jumble of text that I was unable to make out.  I consulted the forums and our in-house IT specialist at Dumbarton Oaks, but by that point I had committed too much time to JHOVE2 and still couldn’t process the inventory reports.  So I decided, for the sake of moving the project forward, that I would go with DROID instead.

The most recent version, downloadable here, is a lot more accurate than older versions.  The install is incredibly easy (for Windows: download ZIP file, unzip, run BAT file, done).

The interface is also a whole lot prettier than JHOVE2:


But of course, there were still (mysterious) problems.


Beyond the occasional crash, the tool’s output is fairly readable, especially if you pre-read the user guide referenced above.

Here’s a small example of what the final reports look like:


DROID is helpful for identifying preservation issues, like the M4V format above.  The report also provides information like MIME type (to get a top-level idea of general types of media), date last modified (I found this really helpful for determining whether a drive was full of archival assets or everyday files), and file format and size.
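If you want to go a step past eyeballing the report, a short script can roll it up by format.  What follows is only a minimal sketch in Python, not part of my actual workflow: it assumes the report has been exported from DROID as a CSV, and that the export has columns named TYPE, SIZE, MIME_TYPE and FORMAT_NAME.  Check those names against the header row of your own export before relying on them.  The function name and the sample rows are invented purely for illustration.

```python
import csv
import io
from collections import defaultdict


def summarize_droid_report(csv_file):
    """Tally file count and total bytes per (format name, MIME type).

    Column names (TYPE, SIZE, MIME_TYPE, FORMAT_NAME) are assumptions
    about the CSV export; adjust them to match your own report.
    """
    totals = defaultdict(lambda: {"files": 0, "bytes": 0})
    for row in csv.DictReader(csv_file):
        # Folders can appear as rows too; keep only rows for actual files.
        if row.get("TYPE") == "Folder":
            continue
        key = (
            row.get("FORMAT_NAME") or "unidentified",
            row.get("MIME_TYPE") or "unknown",
        )
        totals[key]["files"] += 1
        totals[key]["bytes"] += int(row.get("SIZE") or 0)
    return dict(totals)


# Invented sample rows standing in for a real export.
SAMPLE = """\
TYPE,NAME,SIZE,MIME_TYPE,FORMAT_NAME
Folder,media,,,
File,lecture.m4v,1000,video/x-m4v,MPEG-4 Media File
File,talk.m4v,500,video/x-m4v,MPEG-4 Media File
File,notes.pdf,200,application/pdf,Acrobat PDF 1.4
File,mystery.dat,50,,
"""

summary = summarize_droid_report(io.StringIO(SAMPLE))

# Print the largest groups first, since those drive storage planning.
for (fmt, mime), stats in sorted(
    summary.items(), key=lambda item: item[1]["bytes"], reverse=True
):
    print(f"{fmt} ({mime}): {stats['files']} files, {stats['bytes']} bytes")
```

To run it against a real export, open the file with `open(path, newline="", encoding="utf-8")` and pass the file object in place of the sample.  Grouping unidentified files under their own label is a deliberate choice, since those are usually the ones that need a closer look.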

While I tried out these two tools, there are other possibilities to check out.  See this comprehensive list of digital preservation tools.  Some all-in-one preservation tools like Archivematica also integrate file inventorying, sometimes referred to as preservation planning.

After a few weeks of being down due to a social media shuffle, my blog is back up and running, powered by WordPress this time.

While I’m planning to migrate some of my best professional posts onto my new platform, you can access them in the meantime at  The shifting of personal SMs was initiated by the deactivation of my Facebook and subsequent need for something to fill the void.  Hence the proliferation of cat pictures in my most recent Tumbles, and the need for a new professional blog.  As I’ve been thinking of migrating to an SM that supports more long-form blogging anyways, the timing was perfect.

And so here we are.