papers past – Conal Tuohy's blog

Old News for Twitter

Yesterday I finished a little development project to build a TwitterBot for New Zealand’s online newspaper archive Papers Past.

What’s a “TwitterBot”? It’s a software application that autonomously (robotically, hence “-bot”) sends tweets. There are a lot of TwitterBots tweeting about all kinds of things. Tim Sherratt has produced a few, including one called @TroveNewsBot which tweets links to articles from the Australian online newspaper archive of Trove, and this was a direct inspiration for my TwitterBot. Recently Hugh Rundle produced a TwitterBot called Aus GLAM Blog Bot that tweets links to blog posts by people blogging in the Australian GLAM (Galleries, Libraries, Archives and Museums) sector. People like me. I’m looking forward to seeing Hugh’s bot tweeting about my bot. Continue reading Old News for Twitter

Public OAI-PMH repository for Papers Past

I have deployed a publicly available service to provide access in bulk to newspaper articles from Papers Past — the National Library of New Zealand’s online collection of historical newspapers — via the DigitalNZ API.

The service allows access to newspaper articles in bulk (up to a maximum of 5000 articles), using OAI-PMH harvesting software. To gain access to the collection, point your OAI-PMH harvester to the repository with this URI:

https://papers-past-oai-pmh.herokuapp.com/ Continue reading Public OAI-PMH repository for Papers Past

How to download bulk newspaper articles from Papers Past

Anyone interested in New Zealand history should already know about the amazing Papers Past website provided by the National Library of New Zealand, where you can read search and browse millions of historical newspaper articles and advertisements from New Zealand.

You may also know about Digital New Zealand, which provides a central point for access to New Zealand’s digital culture.

This post is about using Digital NZ and Papers Past to get access, in bulk, to newspaper articles, in a form which is then open to being “crunched” with sophisticated “data mining” software, able to automatically discover patterns in the articles. In my earlier post How to download bulk newspaper articles from Trove, I wrote:

Some researchers want to go beyond merely using computers to help them find newspaper articles to read (what librarians call “resource discovery”); researchers also want to use computers to help them read those articles.

To use that kind of “data mining” software, you first need to have direct access to your data, and that can mean having to download thousands and thousands of articles. It’s not at all obvious how to do that, but in fact it can be made quite easy, as I will show.

First, though, a brisk stroll down memory lane…

Continue reading How to download bulk newspaper articles from Papers Past