A tool for Web API harvesting

A medieval man harvesting metadata from a medieval Web API

As 2016 stumbles to an end, I’ve put in a few days’ work on my new project Oceania, which is to be a Linked Data service for cultural heritage in this part of the world. Part of this project involves harvesting data from cultural institutions which make their collections available via so-called “Web APIs”. There are some very standard ways to publish data, such as OAI-PMH, OpenSearch, SRU, and RSS, but many cultural heritage institutions instead offer custom-built APIs that each work in their own peculiar way, so you need to put some effort into learning each API and dealing with its specific requirements. So I’ve turned to the problem of how to deal with these APIs in the most generic way possible, and written a program that handles much of what is common to most Web APIs and can be easily configured to understand the specifics of a particular API.
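To make the idea of a generic but configurable harvester concrete, here is a minimal Python sketch. It is not the program described in this post; it only illustrates the general pattern of keeping the paging loop generic and pushing the API-specific details into configuration. The names `ApiConfig`, `harvest`, `page_param` and `records_key` are all hypothetical, and the sketch assumes an API that returns JSON and signals the end of the data with an empty page.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode


@dataclass
class ApiConfig:
    """Hypothetical per-API settings; none of these names come from a real API."""
    base_url: str
    page_param: str       # name of the query parameter that selects a page
    records_key: str      # key in the JSON response that holds the records
    first_page: int = 1   # some APIs count pages from 0, others from 1


def harvest(config: ApiConfig,
            fetch: Callable[[str], dict],
            max_pages: Optional[int] = None) -> Iterator[dict]:
    """Yield records page by page until the API returns an empty page.

    `fetch` takes a URL and returns the parsed JSON response, so the
    HTTP client (and any authentication) stays outside the generic loop.
    """
    page = config.first_page
    pages_read = 0
    while max_pages is None or pages_read < max_pages:
        url = f"{config.base_url}?{urlencode({config.page_param: page})}"
        records = fetch(url).get(config.records_key, [])
        if not records:
            break  # assumption: an empty page means there is nothing more to read
        yield from records
        page += 1
        pages_read += 1
```

Because `fetch` is passed in, the same loop can be tried out against a stub function that returns canned pages, and a different API would only need a different `ApiConfig`. Real APIs differ in more ways than this (cursors, rate limits, XML responses), which a fuller configuration would have to cover.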

Zotero, Web APIs, and data formats

I’ve been doing some work recently (for a couple of different clients) with Zotero, the popular reference management software. I’ve always been a big fan of the product. It has a number of great features, including the fact that it integrates with users’ browsers, and can read metadata out of web pages, PDF files, linked data, and a whole bunch of APIs.

One especially nice feature of Zotero is that you can use it to collaborate with a group of people on a shared library of data which is stored in the cloud and synchronized to the devices of the group members.

Beta release of XProc-Z web server framework

I have at last released a “final” version of my web server framework, XProc-Z, for testing. The last features I had wanted to include were:

  • The ability for the XProc code in the web server to read information from its environment, so that a generic XProc pipeline can be customized by setting configuration properties.
  • Full support for sending and receiving binary files (i.e. non-text files). XProc is really a language for processing XML, but I think it will be handy to be able to deal with binary files as well from time to time.
  • A few sample XProc pipelines, to demonstrate the capability of the platform.
