Australian Society of Archivists 2016 conference #asalinks

Last week I participated in the 2016 conference of the Australian Society of Archivists, in Parramatta.

#ASALinks poster

I was very impressed by the programme and the discussion. I thought I’d jot down a few notes here about just a few of the presentations that were most closely related to my own work. The presentations were all recorded, and as the ASA’s YouTube channel is updated with the edited videos, I’ll update this post to include them.

It was my first time at an ASA conference; I’d been dragged into it by Asa Letourneau, from the Public Record Office Victoria, with whom over the last year I’d been developing a piece of software called “PROVisualizer”.

Asa and I gave a presentation on the PROVisualizer, talking about the history of the project from the early prototypes and models built at PROV, to the series of revisions of the product built in collaboration with me, and including the “Linked Data” infrastructure behind the visualization itself, and its prospects for further development and re-use.

You can access the PROVisualizer presentation as a PDF.

As always, I enjoyed Tim Sherratt’s contribution: a keynote on redaction by ASIO (secret police) censors in the National Archives, called Turning the Inside Out.

The black marks were of course inserted by the ASIO censors to obscure and hide information, but Tim showed how it’s possible to deconstruct the redactions’ role in the documents they obscure: to convert these voids, these absences, into positive signs in their own right, which can then be used to discover politically sensitive texts and to zoom in precisely on the context surrounding the censored details in each one. Also, the censors made a lot of their redaction marks into cute little pictures of people and sailing ships, which raised a few laughs.

In the morning of the first day of talks, I got a kick out of Chris Hurley’s talk “Access to Archives (& Other Records) in the Digital Age”. His rejection of silos and purely hierarchical data models, and his vision of openness to, and accommodation of, small players in the archives space both really resonated with me, and I was pleased to be able to chat with him over coffee later in the afternoon about the history of this idea and about how things like Linked Data and the ICA’s “Records in Context” data model can help to realize it.

In the afternoon of the first day I was particularly struck by Ross Spencer’s presentation about extracting metadata from full text resources. He spoke about using automation to identify the names of people, organisations, places, and so on, within the text of documents. For me this was particularly striking because I’d only just begun an almost identical project myself for the Australian Policy Online repository of policy documents. In fact it turned out we were using the same software (Apache Tika and the Stanford Named Entity Recognizer).
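The pipeline here has a simple shape: extract plain text from each document, then run a named-entity recognizer over it. Here’s a minimal stdlib-only sketch of that shape, with a naive capitalized-run heuristic standing in for the Stanford NER (which uses trained statistical models, not a regex), and with Tika’s text extraction assumed to have already happened:

```python
import re

def extract_entity_candidates(text):
    """Very naive stand-in for a real NER: treat runs of two or more
    capitalized words as candidate names of people, places, or
    organisations. Stanford NER does this properly, with trained models."""
    pattern = r"\b(?:[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+)\b"
    return re.findall(pattern, text)

text = ("The report was prepared for the Public Record Office Victoria "
        "by staff in North Melbourne.")
print(extract_entity_candidates(text))
```

In practice most of the interesting work is in filtering the candidates and linking them to authority records, which is exactly where the trained models earn their keep.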

On the second day I was particularly struck by a few papers that were very close to my own interests. Nicole Kearney, from Museum Victoria, talked about her work coordinating the Biodiversity Heritage Library Australia.

This presentation focused on getting value from the documentary heritage of museums, such as field notes and diaries from scientific expeditions, by using the Atlas of Living Australia’s DigiVol transcription platform to allow volunteers to transcribe the text from digital images, and then publishing the text and images online using the BHL publication platform. In between there was a slightly awkward step which involved Nicole converting the CSV format produced by DigiVol into a more standard format for the BHL. I’ve had an interest in text transcription going back to slightly before the time I joined the New Zealand Electronic Text Centre at Victoria University of Wellington; this would’ve been about 2003, which seems like ancient times now.
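That conversion step is the kind of thing a small script can take care of. Here’s a sketch of its general shape, using invented column names (these are not DigiVol’s actual headers, just stand-ins for illustration):

```python
import csv
import io

# Hypothetical DigiVol-style export: the column names here are
# invented for illustration, not DigiVol's actual headers.
digivol_csv = """\
taskID,transcription,volunteer
101,Collected three specimens near the river.,J. Smith
102,Weather fine; camp moved east.,A. Jones
"""

def convert(digivol_text):
    """Reshape a volunteer-transcription export into a simpler
    id/text format for downstream publication."""
    reader = csv.DictReader(io.StringIO(digivol_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["id", "text"])
    writer.writeheader()
    for row in reader:
        writer.writerow({"id": row["taskID"], "text": row["transcription"]})
    return out.getvalue()

print(convert(digivol_csv))
```

The real awkwardness, of course, is less in the mechanics than in agreeing on what the target format should be.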

After that I saw Val Love and Kirsty Cox talk about their journey in migrating the Alexander Turnbull Library’s collection data from TAPUHI to KE EMu. It was impressive, given the historical complexity of TAPUHI, the amount of data analysis required to make sense of its unique model, and the work needed to translate that into a more standard conceptual model and implement it in EMu. It’s an immense milestone for the Turnbull, and I hope it will lead in short order to the opening up of the collection metadata to greater reuse.

Finally I want to mention the talk “Missing Links: museum archives as evidence, context and content” from Mike Jones. This was another talk about breaking down barriers between collection management systems in museums: on the one hand, the museum’s collection of objects, and on the other, the institution’s archives. Of course those archives provide a great deal of context for the collection, but the reality is that the IT infrastructure and social organisation of these two systems are generally quite distinct and separate. Mike’s talk was about integrating cultural heritage knowledge across different organisational structures, domains of professional expertise, data models, and IT systems. I got a shout-out in one slide in the form of a reference to some experimental work I’d done with Museum Victoria’s web API, to convert it into a Linked Data service.

It’s my view that Linked Data technology offers a practical approach to resolving the complex data integration issues in cultural heritage: it is relatively easy to expose legacy systems, whatever they might be, in the form of Linked Data, and having done so, the task of integrating the exposed data is also rather straightforward (that’s what Linked Data was invented for, pretty much). To me the problem is how to sell this to an institution: you have to offer the institution itself a “win” for undertaking the work. If it’s just that they can award themselves 5 gold stars for public service, that’s not a great reason. You need to be able to deliver tangible value to museums themselves. This is where I think there’s a gap: in leveraging Linked Data to enhance exhibitions and also in-house collection management systems. If we can make it so that there’s value to institutions in both creating and consuming Linked Data, then we may be able to establish a virtuous circle to drive uptake of the technology, and see some progress in the integration of knowledge in the sector.
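To give a sense of how mechanically simple the “expose” half can be, here is a sketch of turning a flat legacy record into N-Triples that any Linked Data consumer can merge with triples from other institutions. The identifiers and predicate namespace are placeholders; a real service would use established vocabularies such as Dublin Core or the CIDOC CRM.

```python
def record_to_ntriples(record, base_uri):
    """Serialize a flat legacy record as N-Triples, minting a URI
    for the record from its local identifier. The predicate
    namespace here is a placeholder, not a real vocabulary."""
    subject = f"<{base_uri}{record['id']}>"
    triples = []
    for key, value in record.items():
        if key == "id":
            continue
        predicate = f"<http://example.org/vocab/{key}>"
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'{subject} {predicate} "{escaped}" .')
    return "\n".join(triples)

legacy = {"id": "VPRS-1234",
          "title": "Inward Correspondence",
          "agency": "Chief Secretary"}
print(record_to_ntriples(legacy, "http://example.org/records/"))
```

The integration step then falls out almost for free: once two institutions mint URIs and emit triples, merging their graphs is just concatenation plus owl:sameAs-style linking.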


Linked Open Data Visualisation at #GLAMVR16

On Thursday last week I flew to Perth, in Western Australia, to speak at an event at Curtin University on visualisation of cultural heritage. Erik Champion, Professor of Cultural Visualisation, who organised the event, had asked me to talk about digital heritage collections and Linked Open Data (“LOD”).

The one-day event was entitled “GLAM VR: talks on Digital heritage, scholarly making & experiential media”, and combined presentations and workshops on cultural heritage data (GLAM = Galleries, Libraries, Archives, and Museums) with advanced visualisation technology (VR = Virtual Reality).

The venue was the Curtin HIVE (Hub for Immersive Visualisation and eResearch): a really impressive visualisation facility at Curtin University, with huge screens and panoramic and 3D displays.

There were about 50 people in attendance, and over a dozen presenters covering a lot of different topics, though with common threads linking them together. I really enjoyed the experience, and learned a lot. I won’t go into detail about the other presentations here, but quite a few people were live-tweeting, and I’ve collected most of the Twitter stream from the day into a Storify story, which is well worth a read and following up.
Continue reading Linked Open Data Visualisation at #GLAMVR16

Visualizing Government Archives through Linked Data

Tonight I’m knocking back a gin and tonic to celebrate finishing a piece of software development for my client the Public Record Office Victoria; the archives of the government of the Australian state of Victoria.

The work, which will go live in a couple of weeks, was an update to a browser-based visualization tool which we first set up last year. In response to user testing, we made some changes to improve the visualization’s usability. It certainly looks a lot clearer than it did, and the addition of some online help makes it a bit more accessible for first-time users.

The visualization now looks like this (here showing the entire dataset, unfiltered, which is not actually that useful, though it is quite pretty):

Continue reading Visualizing Government Archives through Linked Data

Taking control of an uncontrolled vocabulary

A couple of days ago, Dan McCreary tweeted:

It reminded me of some work I had done a couple of years ago for a project which was at the time based on Linked Data, but which later switched away from that platform, leaving various bits of RDF-based work orphaned.

One particular piece which sprung to mind was a tool for dealing with vocabularies. Whether it’s useful for Dan’s talk I don’t know, but I thought I would dig it out and blog a little about it in case it’s of interest more generally to people working in Linked Open Data in Libraries, Archives and Museums (LODLAM).
Continue reading Taking control of an uncontrolled vocabulary

Bridging the conceptual gap: Museum Victoria’s collections API and the CIDOC Conceptual Reference Model

A Museum Victoria LOD graph about a teacup, shown using the LODLive visualizer.
This is the third in a series of posts about an experimental Linked Open Data (LOD) publication based on the web API of Museum Victoria.

The first post gave an introduction and overview of the architecture of the publication software, and the second dealt quite specifically with how names and identifiers work in the LOD publication software.

In this post I’ll cover how the publication software takes the data published by Museum Victoria’s API and reshapes it to fit a common conceptual model for museum data: the “Conceptual Reference Model” published by CIDOC, the documentation committee of the International Council of Museums. I’m not going to exhaustively describe the translation process (you can read the source code if you want the full story), but I’ll include examples to illustrate the typical issues that arise in such a translation.
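To give a flavour of the kind of mapping involved, here is a much-simplified sketch: an API item becomes a CRM E22 Man-Made Object, with its name as a label and its summary as a P3 note. The item fields are hypothetical stand-ins for the API’s real ones, and a proper mapping would model titles as E35 Title nodes rather than plain labels.

```python
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
RDFS_LABEL = "<http://www.w3.org/2000/01/rdf-schema#label>"

def item_to_crm(item, base_uri):
    """Map a (simplified, hypothetical) museum API item to CIDOC CRM
    triples: the item becomes an E22 Man-Made Object, its name a
    label, its summary a P3 note."""
    s = f"<{base_uri}{item['id']}>"
    def lit(v):
        return '"' + v.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return "\n".join([
        f"{s} {RDF_TYPE} <{CRM}E22_Man-Made_Object> .",
        f"{s} {RDFS_LABEL} {lit(item['objectName'])} .",
        f"{s} <{CRM}P3_has_note> {lit(item['objectSummary'])} .",
    ])

item = {"id": "items/12345",
        "objectName": "Teacup",
        "objectSummary": "A porcelain teacup."}
print(item_to_crm(item, "http://example.org/"))
```

The real translation has many more such decisions, each trading fidelity to the source data against fidelity to the CRM’s event-centric way of seeing the world.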

Continue reading Bridging the conceptual gap: Museum Victoria’s collections API and the CIDOC Conceptual Reference Model

Names in the Museum

My last blog post described an experimental Linked Open Data service I created, underpinned by Museum Victoria’s collection API. Mainly, I described the LOD service’s general framework, and explained how it worked in terms of data flow.

To recap briefly, the LOD service receives a request from a browser and in turn translates that request into one or more requests to the Museum Victoria API, interprets the result in terms of the CIDOC CRM, and returns the result to the browser. The LOD service does not have any data storage of its own; it’s purely an intermediary or proxy, like one of those real-time interpreters at the United Nations. I call this technique a “Linked Data proxy”.
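The proxy’s shape can be sketched in a few lines: parse the incoming request, translate it into an upstream API call, and reshape the JSON answer into RDF on the way back out. The URL patterns and field names below are invented for illustration, and the upstream HTTP call is stubbed out (a real proxy would use urllib or requests there):

```python
def fetch_upstream(api_url):
    """Stub for an HTTP GET against the upstream API; canned data
    stands in for the live service."""
    canned = {
        "https://api.example.org/items/42": {"id": "items/42",
                                             "objectName": "Teacup"},
    }
    return canned[api_url]

def proxy_request(local_path):
    """Translate a request to the LOD service into an upstream API
    call, then reshape the JSON answer into a triple. No storage:
    the proxy holds nothing between requests."""
    item_id = local_path.removeprefix("/resource/")
    data = fetch_upstream(f"https://api.example.org/{item_id}")
    subject = f"<http://lod.example.org/resource/{data['id']}>"
    return (f'{subject} <http://www.w3.org/2000/01/rdf-schema#label> '
            f'"{data["objectName"]}" .')

print(proxy_request("/resource/items/42"))
```

Statelessness is the point: because the proxy stores nothing, it can never fall out of sync with the museum’s own database.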

I have a couple more blog posts to write about the experience. In this post, I’m going to write about how the Linked Data proxy deals with the issue of naming the various things which the Museum’s database contains.

Continue reading Names in the Museum

Linked Open Data built from a custom web API

I’ve spent a bit of time just recently poking at the new Web API of Museum Victoria Collections, and making a Linked Open Data service based on their API.

I’m writing this up as an example of one way — a relatively easy way — to publish Linked Data off the back of some existing API. I hope that some other libraries, archives, and museums with their own API will adopt this approach and start publishing their data in a standard Linked Data style, so it can be linked up with the wider web of data.

Continue reading Linked Open Data built from a custom web API

Zotero, Web APIs, and data formats

I’ve been doing some work recently (for a couple of different clients) with Zotero, the popular reference management software. I’ve always been a big fan of the product. It has a number of great features, including the fact that it integrates with users’ browsers, and can read metadata out of web pages, PDF files, linked data, and a whole bunch of APIs.


One especially nice feature of Zotero is that you can use it to collaborate with a group of people on a shared library of data which is stored in the cloud and synchronized to the devices of the group members.
Continue reading Zotero, Web APIs, and data formats

Proxying: a trick to easily add features to existing websites and applications

At the start of last month I attended the LODLAM (Linked Open Data in Libraries, Archives and Museums) Summit in Sydney, in the lovely Mitchell Library of the State Library of New South Wales.

The Summit is organised as an “un-conference”. There is no pre-defined agenda; it’s organised by the participants themselves at the start of the day. This makes it a very participatory event: your brain is in top gear the whole time, and everything is so interesting that you end up feeling a bit stunned at the end of the day.

One of the features of the Summit was a series of very brief talks (“speedos”) on a variety of topics. At the last minute I decided I’d contribute a quick rant on a particular hobby-horse of mine: the value of using proxies to build web applications, Linked Open Data, and so on. Continue reading Proxying: a trick to easily add features to existing websites and applications

Old News for Twitter

Yesterday I finished a little development project to build a TwitterBot for New Zealand’s online newspaper archive Papers Past.

What’s a “TwitterBot”? It’s a software application that autonomously (robotically, hence “-bot”) sends tweets. There are a lot of TwitterBots tweeting about all kinds of things. Tim Sherratt has produced a few, including one called @TroveNewsBot which tweets links to articles from the Australian online newspaper archive of Trove, and this was a direct inspiration for my TwitterBot. Recently Hugh Rundle produced a TwitterBot called Aus GLAM Blog Bot that tweets links to blog posts by people blogging in the Australian GLAM (Galleries, Libraries, Archives and Museums) sector. People like me. I’m looking forward to seeing Hugh’s bot tweeting about my bot. Continue reading Old News for Twitter
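The core of such a bot is just picking an article and composing a tweet that fits the length limit; the actual posting step (which needs Twitter API credentials) is omitted in this sketch, and the article data is invented:

```python
import random

def compose_tweet(article, limit=280):
    """Build a tweet from an archived article's title, date, and URL,
    truncating the title with an ellipsis if it won't all fit."""
    suffix = f" ({article['date']}) {article['url']}"
    room = limit - len(suffix)
    title = article["title"]
    if len(title) > room:
        title = title[: room - 1] + "\u2026"
    return title + suffix

# Hypothetical article records; a real bot would query the archive's API.
articles = [
    {"title": "SHIPPING INTELLIGENCE.", "date": "1876-03-02",
     "url": "https://paperspast.example.org/article/123"},
]
print(compose_tweet(random.choice(articles)))
```

From there, scheduling the bot is just a cron job that picks an article and posts the composed text.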