What’s happening with our digitised newspapers?

The KB has about 10 million digitised newspaper pages, dating from 1650 to 1995. We negotiated rights to make these pages available for research, and they have been used in more and more research projects over the past years. We thought that many of these projects might be interested in knowing what others are doing, and we wanted to provide a networking opportunity for them to share their results. This is why we organised a newspapers symposium focusing on the digitised newspapers of the KB, which was a great success!

Prof. dr. Huub Wijfjes (RUG/UvA) showing word clouds used in his research.

On 24 March, we opened up our auditorium to 10 research projects that use our digitised newspapers. The presenters came from many areas of DH research: we had historians, linguists, computer programmers, a linked data specialist, and media researchers. In addition, we asked colleagues to present what they are working on with regard to our newspapers. This resulted in presentations about copyright, digitisation (OCR and future plans), a Wikipedia project, the Research Lab and Europeana Newspapers. All in all we had three blocks of presentations spanning the whole day, and an auditorium filled with interested researchers, journalists, librarians, students and colleagues.

The original idea of the day was to provide a networking opportunity for researchers who used our newspapers in their projects. While organising it, however, we noticed that not only those researchers were interested in hearing what others are doing: a wider group of people wanted to know what was possible with the data set. When we opened up the registration, we quickly outgrew the room we had booked and had to move to the auditorium. Ultimately, 145 people visited the symposium, and they were responsible for lively discussions after the presentations and during the breaks.

Discussions in between presentations.

The varied audience and speakers also meant different levels of expertise in Digital Humanities and in working with data, but this provided very valuable insights into working with the newspapers. The differences in skills between, for example, a historian and a programmer pose challenges when working together in research projects, but also provide a very valuable learning experience for both.

Working with new computational methods might be a bit scary for an inexperienced researcher, but can lead to very interesting results, according to Dr. Martijn Kleppe (EUR). But how does a researcher know whether these results are trustworthy, and whether the software actually does its job? Dr. Antske Fokkens (VU) explained that researchers need to be aware of what software can and cannot do and of how its results should be interpreted. Another common technical issue that researchers came across is a lack of computational capacity: humanities faculties often do not have the setup to process large amounts of data.

Dr. Antske Fokkens on the methods used at the Vrije Universiteit.

Besides these technical challenges, the material itself also poses problems, such as OCR issues, lack of access due to copyright, changes in the meaning of words, historical spelling and of course the ‘problem’ that we have not yet digitised every newspaper page in our collection. Working with this data and around these challenges is something that we can work on together. The KB works hard to improve the access to and usability of the newspapers, and with the KB Research Lab we can also provide support in the use of the data.

Martijn Kleppe listening to his colleague Laura Hollink

Because of this, many researchers have already used the newspapers in great research projects. One example is Polimedia, in which our newspapers are linked to the parliamentary papers and the radio bulletins, providing a search engine that shows how parliamentary debates are mentioned in the media. Another is the Dutch Ships and Sailors project, in which maritime data is linked together (and to our newspapers) to provide information about Dutch ships in the 18th and 19th centuries. Mentioning all success stories would result in a very long blog post, but all presentations (in Dutch) are available via our website (scroll down for the URLs).

We were very happy with the whole symposium, the great speakers and the wonderful audience and we hope to be able to organise something similar in the future. If you are interested in our newspapers, other data sets, or our Research Lab, feel free to contact us!
