Farewell; Work on Discerning Journalistic Styles continues!

At the end of December our current researcher-in-residence, Dr. Frank Harbers of the University of Groningen, ended his project ‘Discerning Journalistic Styles’. In this blog post he describes the outcomes and plans for the future.

It is January 2017, meaning my period as researcher-in-residence at the KB has come to an end. It also means that my project Discerning Journalistic Styles (DJS) has come to an end. It was a really nice and valuable experience and a fruitful project in which we (I couldn’t have done it without the expertise of KB programmer Juliette Lonij) have managed to create a classification tool that automatically determines the genre of news articles. You can try the tool yourself at: http://www.kbresearch.nl/genre. Just paste a Dutch news article in the text box, press the button below and the result will appear on the right side; simple as that!


Detecting broken ISO images: introducing Isolyzer

In my previous blog post I addressed the detection of broken audio files in an automated workflow for ripping audio CDs. For (data) CD-ROMs and DVDs that are imaged to an ISO image, a similar problem exists: how can we be reasonably sure that the created image is complete? In this blog post I will discuss some possible ways of doing this using existing tools, along with their limitations. I then introduce Isolyzer, a new tool that might be a useful addition to the existing methods.
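As a rough illustration of the kind of completeness check at stake here, the sketch below compares the size of an image file against the size declared in its ISO 9660 Primary Volume Descriptor. It is only a sketch under stated assumptions, and not necessarily how Isolyzer or any of the existing tools work: it assumes the Primary Volume Descriptor is the first descriptor at sector 16, it ignores other file systems (such as UDF or HFS) that may be present on a disc, and the function names are hypothetical.

```python
import os
import struct

# Assumption: the first 16 sectors of 2048 bytes are the system area, and the
# Primary Volume Descriptor (PVD) is the first volume descriptor after them.
PVD_OFFSET = 16 * 2048


def iso_expected_size(path):
    """Return the image size in bytes as declared in the ISO 9660 PVD."""
    with open(path, "rb") as f:
        f.seek(PVD_OFFSET)
        pvd = f.read(2048)
    # A PVD starts with type code 1 followed by the identifier "CD001".
    if len(pvd) < 2048 or pvd[0] != 1 or pvd[1:6] != b"CD001":
        raise ValueError("no ISO 9660 Primary Volume Descriptor found")
    # Volume space size (number of logical blocks), little-endian copy at offset 80.
    (blocks,) = struct.unpack_from("<I", pvd, 80)
    # Logical block size in bytes, little-endian copy at offset 128.
    (block_size,) = struct.unpack_from("<H", pvd, 128)
    return blocks * block_size


def iso_is_complete(path):
    """Return True if the file is at least as large as the PVD says it should be."""
    return os.path.getsize(path) >= iso_expected_size(path)
```

A file that is smaller than the declared size is almost certainly truncated. The reverse does not hold: an image of the expected size can still contain damaged sectors, so a check like this can only ever be one of several.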


Breaking WAVEs (and some FLACs)

At the KB we have a large collection of offline optical media. Most of these are CD-ROMs, but we also have a sizeable proportion of audio CDs. We’re currently in the process of designing a workflow for stabilising the contents of these materials using disk imaging. For audio CDs this involves ‘ripping’ the tracks to audio files. Since the workflow will be automated to a high degree, basic quality checks on the created audio files are needed. In particular, we want to be sure that the created audio files are complete, as it is possible that some hardware failure during the ripping process could result in truncated or otherwise incomplete files.

To get a better idea of which software tools are best suited to this task, I created a small dataset of audio files which I deliberately damaged. I subsequently ran each of these files through a set of candidate tools, and then checked which tools were able to detect the faulty files. The first half of this blog post focuses on the WAVE format; the second half covers the FLAC format (we haven’t yet decided which format to use).
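To make the idea of a completeness check concrete, here is a minimal sketch for the WAVE case. It is not one of the candidate tools from the post: it simply compares the size declared in the RIFF header with the actual file size, and the function name is hypothetical. It assumes a plain RIFF/WAVE file under 4 GB and treats any size mismatch as a problem.

```python
import os
import struct


def wave_is_complete(path):
    """Return True if the file size matches the size declared in the RIFF header."""
    with open(path, "rb") as f:
        header = f.read(12)
    if len(header) < 12 or header[0:4] != b"RIFF" or header[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    # Bytes 4-7 hold the little-endian size of everything after the first 8 bytes.
    (declared,) = struct.unpack("<I", header[4:8])
    return os.path.getsize(path) == declared + 8
```

A truncated file fails this test because the header still announces the original length. The check says nothing about damage inside the file that leaves its length unchanged, which is one reason to compare several tools against deliberately damaged files.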


Two Dutch DPC Preservation Awards: what is it all about?

Accompanied by traditional festival tunes of Scottish bagpipes, the finalists of the 2016 Digital Preservation Awards and their colleagues “celebrated digital preservation”, as William Kilbride called this event last week in London. In the audience, the proud Dutch attendees celebrated even more, as we won both the Award for Research and Innovation, sponsored by the Software Sustainability Institute, and the Award for Safeguarding the Digital Legacy, sponsored by The National Archives. The 17 international judges looked at 33 submissions from 10 different countries. What was the magic ingredient that helped the Netherlands submit three projects, two of which were judged worthy of a trophy?

With the help of Rijksmuseum digitization


NCDD study day: Een web van webarchieven (A web of web archives)

The Netherlands may be a small country, but we rank third worldwide in the number of issued domain names: more than 5 million. Over 14,000 of these are now collected and archived by the KB in our Web Collection. Yesterday the NCDD held a study day at the Instituut voor Beeld en Geluid (Netherlands Institute for Sound and Vision), titled Een web van web archieven (A web of web archives), to promote Dutch collaboration in building web collections.


DH Clinics – librarians unite!

You might have heard someone from @KBNLResearch mention DH Clinics, or a colleague at the libraries of the Vrije Universiteit or Universiteit Leiden, but what are they, why do we need them and who are they for?

The DH Clinics are our attempt to spread the DH word among our Dutch colleagues. We wanted to set up a community of librarians who were involved in DH, in order to learn from each other and discuss new methods and initiatives. However, we soon learned that a lot of academic libraries in the Netherlands were still thinking about DH and how to implement it in their organisations. That was in early 2015 and, luckily, a lot has happened since, but we believe a small push is still needed to speed everything along.


Tackling problems and making progress

Our current Researcher-in-Residence, Frank Harbers, is well under way with his project “Discerning Journalistic Styles. Exploring Automated Analysis of Journalism’s Modes of Expression”. In this blog post he gives an update on his project and its progress.

Frank Harbers

It has been several months since I wrote the first blog post about my work as researcher-in-residence, and the research project is in full swing by now. The first phase of the project, connecting the metadata from my own database to the historical newspaper data (and metadata) in Delpher, is finished, and we are fully immersed in the main part of the project: training a classifier to automatically determine the genre of historical newspaper articles.


20 Years of Digital Preservation


During the preparations for iPRES 2016 the Programme Committee discussed the fact that exactly 20 years ago the report Preserving Digital Information: Report of the Task Force on Archiving of Digital Information was published. This landmark report by the Commission on Preservation and Access and the Research Libraries Group appeared in May 1996. It presents a broad view of digital preservation and is often regarded as one of the first comprehensive reports on the topic.

It was interesting to read it again. What was the view on preservation 20 years ago, and how does it relate to the topics presented at iPRES 2016?


Abstracts of applications for Researcher-in-residence 2017

Below you will find the abstracts that were submitted and unfortunately not accepted for the 2017 run of the Researcher-in-residence programme. The abstracts are in alphabetical order. If your abstract is published here and you would like to have your name posted with it, please contact us and let us know. The accepted projects and their abstracts can be found here.

We want to thank all researchers for their interesting proposals, wish them all the best for 2017 and hope to see them again in a following year!


Our Researchers-in-residence 2017 will be….

Earlier this year we sent out our Call for Proposals for our Researcher-in-Residence Program 2017. This program offers early-career researchers a chance to work in the library with the Digital Humanities team and KB data. In return, we learn how researchers use the data of the KB. Together we will address their research question in a six-month project using the digital collections of the KB and computational techniques. The output of the project will be incorporated in the KB Research Lab and is ideally beneficial for a larger (scholarly) community.

This year, we received nine proposals that focused on a wide range of datasets and techniques. Last week, a group of seven leading Dutch Digital Humanities professors met at the KB to discuss each proposal thoroughly. Today we are excited to announce the names and projects of our two Researchers-in-Residence 2017!

