I am really excited to have begun my latest project: a Linked Open Data service for online cultural heritage from New Zealand and Australia, and eventually, I hope, from our other neighbours. I have called the service “oceania.digital”.
The big idea of oceania.digital is to pull together threads from a number of different “cultural” data sources and weave them together into a single web of data which people can use to tell a huge number of stories.
There are a number of different aspects to the project, and a corresponding number of stages to go through:
- I need to gather the data together from a variety of sources. Both Trove and Digital NZ are doing this at the national level; I want to build on both of those data sources, and gradually add more and more.
- Having gathered data from my data sources, I need to transform the harvested data into an interoperable form, namely the World Wide Web Consortium’s “Resource Description Framework” (RDF). The metaphor I suggest is that of teasing out threads from the raw data, so that the threads from one dataset can later be interwoven with those from another. This is the vision of the Semantic Web.
- Having converted the data to RDF, I need to weave the threads together so that the data harvested from each source is explicitly linked to data from the others. This means identifying where the same things (people, places, etc.) are described in the different data sources, and explicitly equating or merging those things. This is related to what librarians call “Authority Control”.
- Finally, having produced a web of interconnected data, I need to make it practically useful to a wide range of people, not just Semantic Web nerds like me. I will need to build, curate, and inspire the development of new tools that will help end-users to tell stories using the RDF dataset. Most people won’t be programming with RDF themselves, and they won’t be excited by JSON-LD or SPARQL; they will need user-friendly software tools that allow them to summon up the data they need, with a minimum of technical geekery, and to use it to produce visualisations, links, images, maps, and timelines, which they can embed on their blogs and websites, in Facebook, Twitter, and other social media.
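To make the second and third stages a little more concrete, here is a minimal sketch in Python of what "teasing out threads" and "weaving them together" could look like. It is only an illustration, not the software I am actually running: the records, the `example.org` URIs, the field names (`id`, `name`, `born`), and the helper functions are all invented for the example. It uses only the standard library and writes N-Triples by hand; the matching rule (same normalised name and same birth year) is deliberately naive, and real authority control needs far more care than this.

```python
import re

# Well-known vocabulary terms: FOAF name, schema.org birthDate, OWL sameAs.
FOAF_NAME = "<http://xmlns.com/foaf/0.1/name>"
BIRTH_DATE = "<http://schema.org/birthDate>"
OWL_SAME_AS = "<http://www.w3.org/2002/07/owl#sameAs>"

# Hypothetical base URIs for two harvested sources.
AU = "https://example.org/au/person/"
NZ = "https://example.org/nz/person/"


def literal(text):
    """Quote a string as a plain N-Triples literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def record_to_triples(base_uri, record):
    """Stage 2: turn one harvested record (a dict) into RDF triples."""
    subject = f"<{base_uri}{record['id']}>"
    return [
        (subject, FOAF_NAME, literal(record["name"])),
        (subject, BIRTH_DATE, literal(record["born"])),
    ]


def match_key(record):
    """Naive identity key: name words, lower-cased and sorted, plus birth year."""
    name = re.sub(r"[^a-z ]", "", record["name"].lower())
    return (" ".join(sorted(name.split())), record["born"])


def link_records(base_a, records_a, base_b, records_b):
    """Stage 3: emit owl:sameAs triples where two sources describe one person."""
    index = {match_key(r): r for r in records_a}
    links = []
    for r in records_b:
        match = index.get(match_key(r))
        if match is not None:
            links.append(
                (f"<{base_a}{match['id']}>", OWL_SAME_AS, f"<{base_b}{r['id']}>")
            )
    return links


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples as N-Triples lines."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Invented sample records standing in for two harvested datasets.
au_people = [{"id": "p1", "name": "Example, Jane", "born": "1900"}]
nz_people = [
    {"id": "x9", "name": "Jane Example", "born": "1900"},
    {"id": "x10", "name": "John Sample", "born": "1910"},
]

triples = []
for rec in au_people:
    triples += record_to_triples(AU, rec)
for rec in nz_people:
    triples += record_to_triples(NZ, rec)
links = link_records(AU, au_people, NZ, nz_people)

print(to_ntriples(triples + links))
```

The point of the sketch is that each source keeps its own identifiers and descriptions, and the `owl:sameAs` links are added alongside them as extra triples, so a bad match can be withdrawn later without touching the harvested data.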
So far, I have set up a website, with some harvesting software and an RDF data store.
The first dataset I intend to process is “People Australia”: a collection of biographical data which is aggregated from a variety of Australian sources and published by the National Library of Australia. Hopefully, soon after, I will be able to add a related dataset from New Zealand.
Once I have some data available in RDF form, I will add some features to allow the data to be reused on other websites. Then I’ll go back, add more datasets from elsewhere, and repeat the process.
If you’d like to keep in touch with the project as it progresses, you can follow the @OceaniaDigital account on Twitter, or follow my blog.
If you think you’d like to contribute to the project in any way, please do get in touch, either via Twitter or email!