KB Research

Research at the National Library of the Netherlands


Dataset KBK-1M containing 1.6 Million Newspaper Images available for researchers

Each year the KB invites two academics to come and work with us as researchers-in-residence: early-career researchers who work in the library with our Digital Humanities team and KB Data. Together we address their research questions in a six-month project using our digital collection and computational techniques. The output of the project will be incorporated in the KB Research Lab. Today we are happy to announce the output of the PhoCon project (‘Photos in and Out of Context’) by dr. Martijn Kleppe and dr. Desmond Elliott: the KBK-1M dataset, containing 1.6 million newspaper images.

Continue reading

Valid, but not accessible EPUB: crazy fixed layouts

EpubCheck is an invaluable tool for assessing the quality of EPUB files. Still, it is possible that EPUBs that are valid according to the format specification (and thus EpubCheck) are nevertheless inaccessible to some users. Some weeks ago a colleague sent me an EPUB 2 file that produced some really strange behaviour across a number of viewer applications. For a start, the text wouldn’t reflow properly after re-sizing the viewer window, and increasing the font size resulted in garbled text. Running the file through EpubCheck did return some validation errors, but none of these were related to the behaviour I was getting. Closer inspection revealed some very peculiar stylesheet and HTML use.

Continue reading

The future of EPUB? A first look at the EPUB 3.1 Editor’s draft


About a month ago the International Digital Publishing Forum, the standards body behind the EPUB format, published an Editor’s Draft of EPUB 3.1. This is meant to be the successor of the current 3.0.1 version. IDPF has set up a community review, which allows interested parties to comment on the draft. The proposed changes relative to EPUB 3.0.1 are summarised in this document. A note at the top states (emphasis added by me):

The EPUB working group has opted for a radical change approach to the addition and deletion of features in the 3.1 revision to move the standard aggressively forward with the overarching goals of alignment with the Open Web Platform and simplification of the core specifications.

As Gary McGath pointed out earlier, this is a pretty bold statement for what is essentially a minor version. The authors of the draft also mention that they expect it “will provoke strong reactions both for and against”, and that changes that raise “strong negative reactions” from the community “will be reviewed for future drafts”.

This blog post is an attempt to identify the main implications of the current draft for libraries and archives: to what degree would the proposed changes affect (long-term) accessibility? Since the current draft is particularly notable for its aggressive removal of various existing EPUB features, I will focus on these. These observations are all based on the 30 January 2016 draft of the changes document.

Continue reading

“Visible data, invisible infrastructure”: IDCC conference 2016

Only 12% of the data produced by research funded by the National Institutes of Health ends up in a ‘trusted repository’; the rest is lost, according to Barend Mons (professor of Biosemantics, LUMC), the keynote speaker at this 11th IDCC conference. Improving this situation is going more slowly than expected, but he does have a vision of what needs to get better. Data must be FAIR (Findable, Accessible, Interoperable, Re-usable) and, above all, machine readable. Why? To make better scientific discoveries faster. “Research as a social machine”: through continuous interaction between millions of computers and millions of researchers. Reuse of datasets is becoming ever more important, but making them comply with the FAIR principles requires well-trained “data stewards” who help researchers do so. Mons expects that 500,000 data stewards will be needed in Europe in the short term, and he is campaigning for this.

According to Mons, the scholarly article will lose its current central position in favour of datasets. Not everyone agreed, but from a collection perspective these developments are important. Are we collecting the right things, and do our activities match what is happening in the world?

Continue reading

Three Library Science Dissertations

Prof. dr. Frank Huysmans is extraordinary professor of library science at the University of Amsterdam. His chair is funded by the National Library of the Netherlands (KB). On his website warekennis.nl he blogs regularly and recently he discussed three Dutch dissertations on Library Science. We are happy to reblog his Dutch post below.

Three library science PhD defences in eight weeks

Sometimes nothing seems to happen for a year. Or even longer. PhD candidates toil on, working in silence on the Great Work. Then suddenly everything is finished, and within a short time you have three of these tomes to get through. That sounds like a chore, and it is, but it is also a feast.

Continue reading

From keyword search to concept mining – Keyword Generator as a tool for the study of historical news media

This blog post was written by dr. Pim Huijnen and research programmer Juliette Lonij. Pim worked as a Researcher-in-residence at the KB in the first half of 2015. The tool discussed below is available at https://github.com/jlonij/keyword_generator


Historical newspapers have traditionally been popular sources to study public mentalities and collective cultures within historical scholarship. At the same time, they have been known as notoriously time-consuming and complex to analyze. The recent digitization of newspapers and the use of computers to gain access to the growing mass of digital corpora of historical news media are altering the historian’s heuristic process in fundamental ways.

Continue reading

Abstracts received for Researcher-in-residence 2016

Below you will find all abstracts that were submitted but unfortunately not accepted for the 2016 run of the Researcher-in-residence programme. The abstracts are in alphabetical order. The accepted projects and their abstracts can be found here.

We want to thank all researchers for their interesting proposals, wish them all the best for 2016 and hope to see them again in a following year!

Continue reading

And now… the results of our Researcher-in-residence CfP!

Shortly before the Christmas break, we had a very interesting afternoon discussing the wonderful projects that were submitted following our Researcher-in-residence call for proposals. We received 11 projects with varying plans and end results, but could unfortunately only accept two. Luckily, we did not have to make this difficult choice alone, but were supported by a group of professors from all over the country who are involved with Digital Humanities research.

Continue reading

Jpylyzer 2015 round-up

Yesterday (7 December) we released version 1.16.0 of the jpylyzer tool, which is this year’s third release of the software (excluding bugfix releases). This blog post gives a brief overview of the main jpylyzer improvements that have been implemented over this year.

Continue reading

iPRES 2015 Chapel Hill


Audit, CD-ROMs, Emulation, Ingest, OAIS and Web: in alphabetical order, these were the most discussed topics at the annual iPRES 2015 conference, which took place last week in Chapel Hill, North Carolina. This is my personal impression, because of course many more topics came up in the talks, posters and workshops. After all, it is an annual reunion at which everyone tries to present their results and future plans.

Continue reading


© 2018 KB Research
