At the end of December, our current researcher-in-residence, Dr. Frank Harbers of Groningen University, concluded his project ‘Discerning Journalistic Styles’. In this blog post he describes the outcomes and his plans for the future.
It is January 2017, which means that my period as researcher-in-residence at the KB, and with it my project Discerning Journalistic Styles (DJS), has come to an end. It was a valuable experience and a fruitful project, in which we (I couldn’t have done it without the expertise of KB programmer Juliette Lonij) managed to create a classification tool that automatically determines the genre of news articles. You can try the tool yourself at: http://www.kbresearch.nl/genre. Just paste a Dutch news article into the text box, press the button below it, and the result will appear on the right; simple as that!
Currently the tool predicts the correct genre in 65% of cases. This might not seem that high at face value. However, we need to take into account 1) that genres are ideal types that never manifest themselves in their pure form, and that the boundaries between genres are fluid; 2) that genres are dynamic concepts that change over time; and 3) that genre is a typical example of a ‘latent content’ category, meaning that determining a genre involves a considerable amount of interpretation. It is therefore unsurprising that classifying genres manually is also difficult and that human coders regularly disagree on the correct genre of a text. In fact, it is not unusual for human coders to disagree 20 to 30% of the time on the right genre of a (historical) news article. With that in mind, 65% is a solid result, which is not to say that it doesn’t need to be improved.
In that sense the research has only just begun. Not only because in the coming period we will continue to present the results at conferences and in academic articles, but also because we are developing research projects that follow up on DJS. I therefore hope this won’t be the last time I visit The Hague to delve into the historical newspaper collection. If you are curious about the tool, please keep a close eye on the new Lab website of the KB, which will be launched soon. On that website we aim to give much more detailed information on the tool and its background.