Public OAI-PMH repository for Papers Past

I have deployed a publicly available service to provide access in bulk to newspaper articles from Papers Past — the National Library of New Zealand’s online collection of historical newspapers — via the DigitalNZ API.

The service allows access to newspaper articles in bulk (up to a maximum of 5000 articles), using OAI-PMH harvesting software. To gain access to the collection, point your OAI-PMH harvester to the repository with this URI:

https://papers-past-oai-pmh.herokuapp.com/

If you’re looking for a good harvester, let me recommend jOAI.

Searching

You can restrict a harvest to the records that match a search by supplying your search query as an OAI-PMH set. For example, to search for “titokowaru”, specify search:titokowaru as the value of the OAI-PMH set parameter:

https://papers-past-oai-pmh.herokuapp.com/?verb=ListRecords&metadataPrefix=oai_dc&set=search:titokowaru
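
If you would rather build these request URLs in a script than type them by hand, here is a minimal Python sketch. The function name build_list_records_url and its parameters are my own inventions for illustration and are not part of the service. I have only used a single-word query here; how the service treats spaces or other special characters in a query is not something this example tries to show.

```python
from urllib.parse import urlencode

# Base URL of the repository, as given in this post.
BASE_URL = "https://papers-past-oai-pmh.herokuapp.com/"


def build_list_records_url(query, metadata_prefix="oai_dc", base_url=BASE_URL):
    """Return a ListRecords URL whose set parameter carries a search query.

    Hypothetical helper: the name and signature are illustrative only.
    """
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        # The search is expressed as an OAI-PMH set named "search:<query>".
        "set": f"search:{query}",
    }
    # safe=":" leaves the colon unescaped so the URL reads like the example above.
    return f"{base_url}?{urlencode(params, safe=':')}"


if __name__ == "__main__":
    print(build_list_records_url("titokowaru"))
```

Passing a different metadata_prefix (for instance "html") asks for the same set of records in another of the formats described below.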

Formats available

You can harvest records (i.e. articles) in one of three different formats:

  • html — this format returns the full text of the articles, and is likely to be the most useful format. Note that the text available through DigitalNZ has had punctuation and capitalization removed.
  • oai_dc — a simple metadata record.
  • digitalnz — straightforwardly based on DigitalNZ’s own metadata format.

To see the formats the repository advertises, send it a ListMetadataFormats request:

https://papers-past-oai-pmh.herokuapp.com/?verb=ListMetadataFormats
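
A harvesting script can read the format names out of that response. The sketch below uses Python's standard library and assumes the response follows the standard OAI-PMH 2.0 layout, in which each metadataFormat element contains a metadataPrefix. The function names are hypothetical, and SAMPLE_RESPONSE is a hand-written, abbreviated stand-in rather than a captured response from the service, so the real output may differ in order and detail.

```python
import xml.etree.ElementTree as ET
from typing import List, Union
from urllib.request import urlopen

BASE_URL = "https://papers-past-oai-pmh.herokuapp.com/"

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written, abbreviated example of a ListMetadataFormats response.
# This is an assumption for illustration, not real output from the service.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat><metadataPrefix>html</metadataPrefix></metadataFormat>
    <metadataFormat><metadataPrefix>oai_dc</metadataPrefix></metadataFormat>
    <metadataFormat><metadataPrefix>digitalnz</metadataPrefix></metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>
"""


def metadata_prefixes(response_xml: Union[str, bytes]) -> List[str]:
    """Extract every metadataPrefix value from a ListMetadataFormats response."""
    root = ET.fromstring(response_xml)
    path = ".//oai:metadataFormat/oai:metadataPrefix"
    return [el.text.strip() for el in root.iterfind(path, OAI_NS) if el.text]


def fetch_metadata_prefixes(base_url: str = BASE_URL) -> List[str]:
    """Ask the live repository for its formats (requires network access)."""
    with urlopen(f"{base_url}?verb=ListMetadataFormats") as response:
        return metadata_prefixes(response.read())


if __name__ == "__main__":
    # Parses the local sample only; call fetch_metadata_prefixes() to go online.
    print(metadata_prefixes(SAMPLE_RESPONSE))
```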

Happy harvesting!
