Earlier this year we sent out the Call for Proposals for our Researcher-in-Residence Program 2017. This program offers early career researchers a chance to work in the library with the Digital Humanities team and KB data. In return, we learn how researchers use the data of the KB. Together we will address their research question in a six-month project using the digital collections of the KB and computational techniques. The output of the project will be incorporated in the KB Research Lab and is ideally beneficial for a larger (scholarly) community.
This year, we received nine proposals that focused on a wide range of datasets and techniques. Last week, a group of seven leading Dutch Digital Humanities professors met at the KB to discuss each proposal thoroughly. Today we are excited to announce the names and projects of our two Researchers-in-Residence 2017!
The first researcher is Melvin Wevers of Utrecht University, who will be focusing on advertisements in our Digital Newspapers. Please find his abstract below.
Combining Textual Content and Non-Textual Features of Digitized Newspaper Advertisements to Study Historical Developments in the Dutch Consumer Society
The KB’s digitized newspaper collection provides an important and exciting set of advertisements. After all, newspapers played a major role in the dissemination of advertisements. This project aims to analyze how advertisements in digitized newspapers can be used to study historical changes in the Dutch consumer society. Roland Marchand argues that advertisements provide an insight into the ideals and aspirations of past realities. Advertisements show the state of technology, the social functions of products, and provide information on the society in which a product was sold (Marchand, 1985). Academic work on advertisements often focuses on a specific symbolic connotation, such as gender or consumerism. This project aims to build on this scholarship.
In this project, I will develop computational methods to identify trends and breakpoints in newspaper advertisements that represent the Dutch consumer society. Schreurs contends that advertisements changed markedly in the Netherlands during the twentieth century (Schreurs, 2001). He claims that advertisements became more visual and gained prominence in media. I intend to use the KB’s researcher-in-residence fellowship to test this hypothesis in a quantifiable manner. In collaboration with the KB, I will develop computational methods to analyze three aspects of the advertisements in newspapers between 1850 and 1950. First, I will examine the position, size, and frequency of newspaper ads. These metadata are indicators of the prominence and cultural impact of advertisements. For instance, a large advertisement on the front page of a national newspaper has more impact than a small ad on page 8 of a local newspaper. After aggregating these metadata, we can analyze their temporal dynamics. Do changes over time in these aspects reveal characteristics of the Dutch advertising landscape? For instance, did advertisements increase in size and/or move to the front pages?
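To make the aggregation step concrete, here is a minimal sketch in Python. It is not the project's actual pipeline: the record layout (year, page number, ad area as a fraction of the page) and the sample values are hypothetical, chosen only to show how position, size, and frequency could be summarized per decade.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ad records: (year, page number, ad area as a fraction of the page).
# These values are invented for illustration and are not KB data.
ads = [
    (1895, 4, 0.05),
    (1898, 3, 0.10),
    (1925, 1, 0.25),
    (1928, 2, 0.15),
]

def summarize_by_decade(records):
    """Group ads by decade and report frequency, mean size, and front-page share."""
    by_decade = defaultdict(list)
    for year, page, area in records:
        by_decade[year // 10 * 10].append((page, area))

    summary = {}
    for decade, items in sorted(by_decade.items()):
        summary[decade] = {
            "count": len(items),
            "mean_area": mean(area for _, area in items),
            "front_page_share": sum(page == 1 for page, _ in items) / len(items),
        }
    return summary

summary = summarize_by_decade(ads)
for decade, stats in summary.items():
    print(decade, stats)
```

Comparing the per-decade values of `mean_area` and `front_page_share` is one simple way to ask whether ads grew larger or moved toward the front page over time.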
Secondly, I propose to develop methods to cluster ads by brand or product group using text mining techniques on the textual content of advertisements, which is available as OCR-ed text. This clustering can help to understand whether the trends found in the metadata are product-specific. The information derived from specific product groups can be used to test existing hypotheses posed in corporate histories or histories of the advertising industry. For instance, did the position and size of cigarette ads change over time?
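As a rough illustration of what clustering ads by their text could look like, the sketch below groups ads whose word counts are similar. The abstract does not name a specific algorithm, so the greedy cosine-similarity grouping, the `threshold` value, and the sample ad texts are all assumptions made for this example. Real OCR-ed Dutch text would also need noise handling and stop-word removal.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def cluster_ads(texts, threshold=0.3):
    """Greedy grouping: each ad joins the first cluster whose first member
    (the seed) is similar enough, otherwise it starts a new cluster."""
    vectors = [Counter(tokenize(t)) for t in texts]
    clusters = []  # each cluster is a list of indices into `texts`
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

# Invented sample ads, not taken from the KB collection.
ad_texts = [
    "Smoke fine Virginia cigarettes, mild tobacco",
    "Fresh cocoa powder for a rich chocolate drink",
    "Mild cigarettes of the finest tobacco",
    "Pure cocoa, the best chocolate drink for children",
]

clusters = cluster_ads(ad_texts)
print(clusters)
```

With these four sample texts, the two cigarette ads and the two cocoa ads end up in separate groups. Each group could then be joined with the position and size metadata to check whether a trend is product-specific.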
Thirdly, I will focus on the visual aspect of advertisements. For this aspect, I would like to examine whether computer vision techniques can be applied to advertisements. The precision of computer vision techniques is far from perfect, and therefore this last step would be mostly exploratory (Snoek et al., 2015). A large part of the meaning in advertisements was expressed in images. The extraction of advertisements from the corpus allows for the analysis of visual information in advertisements. Can computer vision be used to identify logos, objects, and people in ads? The dataset’s richness and size offer the KB the possibility to take important future steps in the field of computer vision.
Our second researcher-in-residence is Thomas Smits of Radboud University. Thomas will also focus on newspapers but will address the illustrations. Please find his abstract below.
Illustrations to Photographs: using computer vision to analyse news pictures in Dutch newspapers, 1860-1940
Most digital humanities projects are based on the analysis of text. However, in our increasingly visually orientated world, it has become clear that we should also devise ways to analyse visual material. In the last couple of years, the KB has made important steps in this emergent field: Delpher provides users with the opportunity to search for ‘images with caption’ in its database of digitized newspapers and the KBK-1M database, which holds all the images published in the KB’s digitized newspapers between 1923 and 1995, provides researchers with the opportunity to analyse the visual material of this collection in a viable way.
The proposed research will apply two computer vision techniques to sort the images of the KBK-1M database according to the way in which they were reproduced (engraving/half-tone) and shed new light on an important transitional phase in the history of the visual culture of the news. Several media historians suggest that around 1900 both illustrations and photographs were considered to be objective visual representations of the news. However, relying on case studies, they have been unable to pinpoint this period. By introducing a digital humanities approach to this question, the proposed project will describe and analyse this period. It consists of two phases, connected to two digital humanities components.
First of all, building on the PhoCon project of Elliott & Kleppe (2016), the KBK-1M database will be expanded to include the period 1860-1923. Using the power of SURFSara’s Cartesius supercomputer, the first phase will apply the technique of a recent project of Fyfe & Ge (2016) to the images in the expanded database. Fyfe and Ge analysed images in three Victorian illustrated newspapers by measuring their pixel ratio and entropy level. By juxtaposing these two so-called low-level features, images could be sorted according to the technique used for their reproduction (engraving/half-tone).
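The two low-level features can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Fyfe & Ge's implementation: "pixel ratio" is taken here to mean the share of dark pixels after thresholding, "entropy" the Shannon entropy of the grayscale values, and the two tiny images are invented stand-ins for a line engraving and a half-tone.

```python
import math
from collections import Counter

def dark_pixel_ratio(pixels, threshold=128):
    """Share of pixels darker than `threshold` (0 = black, 255 = white).
    Assumed reading of 'pixel ratio'; the original definition may differ."""
    flat = [value for row in pixels for value in row]
    return sum(value < threshold for value in flat) / len(flat)

def shannon_entropy(pixels):
    """Shannon entropy (in bits) of the distribution of grayscale values."""
    flat = [value for row in pixels for value in row]
    total = len(flat)
    counts = Counter(flat)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Toy 2x4 grayscale images (hypothetical): pure black and white lines
# versus a range of intermediate gray tones.
line_art = [
    [0, 255, 0, 255],
    [255, 0, 255, 0],
]
halftone_like = [
    [0, 64, 128, 192],
    [32, 96, 160, 224],
]

for name, image in [("line_art", line_art), ("halftone_like", halftone_like)]:
    print(name, dark_pixel_ratio(image), shannon_entropy(image))
```

In this toy case both images have the same share of dark pixels, but the image with many gray tones has a higher entropy. That is why combining the two features can separate images that a single feature would not.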
The second phase will explore how (a combination of) two open source applications (OpenCV/Caffe) can be used to fine-tune the recognition of engravings and photographs. Both programs can create so-called cascade classifiers that are able to detect faces, objects, or patterns on images in a large dataset by comparing them to a manually created training set. Based on the results of the application of Fyfe & Ge’s method, several cascade classifiers can be created that are able to detect specific patterns of different reproduction techniques.
First, the project will provide researchers with a new way to sort, discover patterns in, and make sense of the visual material contained in the KB’s collection of digitized newspapers. Second, by applying computer vision techniques to study an important development in the history of the visual culture of the news, it introduces a digital humanities approach to the relatively theoretical field of nineteenth-century visual culture studies.
Both projects will take place in 2017 and we will keep you updated on their progress on this blog. If you are curious about the work of our current researcher-in-residence Frank Harbers, please see his latest blog post.
If you have any questions about the projects or the programme, feel free to contact us via firstname.lastname@example.org. All other submitted abstracts will be posted in a separate blog post later this week.