KB Research

Research at the National Library of the Netherlands

Tag: Digitisation days 2014

National Library of the Netherlands participates in Digitisation Days, Madrid, 19-20 May

On 19 and 20 May, the National Library of the Netherlands (KB) attended the Digitisation Days, held at the Biblioteca Nacional in Madrid. The conference was supported by the European Commission and organised by the Support Action Centre of Competence in Digitisation (Succeed) project and the IMPACT Centre of Competence (IMPACT CoC), in cooperation with the Biblioteca Nacional de España.

For the National Library, being a collection holder, the Succeed awards ceremony was one of the highlights of the conference, because it showed the application of technology to actual collections. The Succeed awards aim to recognise successful digitisation programmes in the field of historical texts, especially those using the latest technology.

Two prizes went to the Hill Museum and Manuscript Library and the Centre d’Études Supérieures de la Renaissance, while two Commendations of Merit were awarded to the London Metropolitan Archives/University College London and to Tecnilógica.

In her role as a member of the IMPACT CoC executive board, the KB’s Head of Research, Hildelies Balk, took part in the ceremony and awarded the Commendation of Merit to the London Metropolitan Archives/University College London for their Great Parchment Book project. A short video about the project is available at http://www.youtube.com/watch?v=WDD2cVT7PeU

Moreover, on 20 May the KB hosted a fruitful round-table workshop on the future of research and funding in digitisation and the possible roles of Centres of Competence. Some 30 librarians and researchers joined this workshop and discussed the following topics:

  • What research is needed to further the development of the Digital Library?
  • How can Centres of Competence assist your research or development?
  • In digitisation, are we ready to move the focus from quantity to quality?
  • What enrichments, e.g. in Named Entity Recognition, Linked Data services, or crowdsourcing for OCR correction, would be most beneficial for digitisation?
  • What’s your take on Labs and Virtual Research Environments?
  • What would you like to do in these types of research settings?
  • What do you expect to get out of them?

The preliminary outcomes of the workshop show that the main goal for institutions is to give users unrestricted access to data. During the workshop, the participants discussed the many-layered aspects of these three topics, i.e. ‘users’, ‘access’, and ‘data’. The participants also gave their views on the following questions in relation to these topics:

  • What stops us from making progress?
  • What helps us to make progress?
  • And what role could CoCs play in this?

The outcomes of the workshop have been documented and will be used as a starting point for the roadmap to further development of digitisation and the digital library, which will be produced within the Succeed project. This roadmap will serve to support the European Commission in preparing the 2014–2020 Work Programme for Research and Innovation.

 

‘We learn so much from each other’ – Hildelies Balk about the Digitisation Days (19-20 May)

The Digitisation Days will take place in Madrid on 19-20 May. What can you expect from them and why should you go? To get answers to these questions we interviewed Hildelies Balk of the National Library of the Netherlands (KB), who is also a member of the executive board of the organising institution, the IMPACT Centre of Competence (IMPACT CoC). – Interview and photo by Inge Angevaare (see below for Dutch version)

Hildelies Balk in the National Library’s Reading Rooms

The Digitisation Days will be of interest to …?

‘Anyone who is working with digitised historical texts. These are often difficult to use because the software cannot decipher damaged originals or illegible characters. For example:

example OCR historical text

‘The software used to ‘read’ this (Dutch) text produces the following result:

VVt Venetien den 1.Junij, Anno 1618.
DJgn i f paffato te S’ aö’Jifeert mo?üen/bah
.)etgi’uotbciraetail)i.r/JtmelchontDecht
te / sbnbe bele btr felbrr geiufttceert baer bnber
eeniglje jprant o^fen/bie ftcb .met beSpaenfcbeu
enbeeemgljen bifet Cbeiiupcen berbonbru befe

‘The Dutch National Library and many other libraries are striving to make these types of historical text more usable to researchers by enhancing the quality of the OCR (optical character recognition). Since 2008, we have been involved in European projects set up to improve the usability of OCR’d texts, preferably automatically. The IMPACT Centre of Competence and the Digitisation Days are unique in that they bring together three interest groups:

  • institutions with digitised collections (libraries, archives, museums)
  • researchers working on means to improve access to digitised text (image recognition, pattern recognition, language technology)
  • companies providing products and services in the field of digitisation and OCR.

‘Representatives of all of these groups will be taking part in the Digitisation Days, which offer participants a complete overview of the state of the art in document analysis, language technology and post-correction of OCR.’
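As an editorial aside: OCR quality of the kind discussed here is commonly expressed as a character error rate, the edit distance between the OCR output and a hand-checked transcription divided by the length of that transcription. The sketch below is only an illustration of that idea, not the evaluation software mentioned in the interview. The function names and the two sample strings are made up for the example and are not taken from the newspaper page shown above.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, ocr_output: str) -> float:
    """Edit distance divided by the length of the reference transcription."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, ocr_output) / len(reference)


if __name__ == "__main__":
    # Hypothetical strings, invented for illustration only.
    reference = "den 1. Junij"
    ocr_output = "ben 1. Jnnij"
    print(f"CER: {character_error_rate(reference, ocr_output):.3f}")
```

A lower value means fewer corrections are needed. Because the measure depends on a trusted reference transcription, it can only be computed for pages that someone has transcribed by hand.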

What are the most important benefits from the Centre of Competence and the Digitisation Days, in your opinion?

‘The IMPACT Centre of Competence assists heritage institutions in taking important decisions. We evaluate available tools and report about them. Evaluation software of good quality is available as well. We also provide institutions with guidance and advice in digitisation issues by answering questions such as: what would be the best tools and methods for this particular institution? What quality can you expect from a solution? And what will it cost?’

‘The Digitisation Days offer a perfect opportunity for heritage institutions to get together and share experience and knowledge on issues such as: how to embed digitisation in your institution? How to deal with providers? Also: how do we start up new projects? Where do we find funding? On the second day, those who are interested are invited to join a workshop on the topic of the research agenda for digitisation. What should be the focus for the coming years? Should we focus on quantity or quality? How can we help shape European plans and budgets?’

Now that you mention Europe: IMPACT, IMPACT Centre of Competence, SUCCEED – the announcement of the Digitisation Days is packed with acronyms. Can you give us a bit of help here?

‘IMPACT was the first European research project aimed at improving access to historical texts. It started in 2008, at the initiative of, among others, the Dutch KB. When the project ended, a number of IMPACT partners set up the IMPACT Centre of Competence to ensure that the project results would be supported and developed. The Centre is not a project, but a standing organisation.’

‘Succeed is another European project, and, by definition, temporary. Its objectives are in line with those of the IMPACT CoC, and the project involves some of the same partners. The aim is to raise awareness about the results of European projects related to the digital library and to stimulate implementation. Before the CoC, it was not uncommon for prototypes to be left as they were after completion of a project. Thus the investments did not pay off.’

Will you really turn theory into practice?

‘Yes, most definitely! It is our prime focus for the conference. This is why we instituted the Succeed awards which will be handed out during the Digitisation Days; the Succeed awards recognise the best implementations of innovative technologies. The board has recently announced the winners.’

What do you personally look forward to most during the Digitisation Days?

‘To meeting everybody, to bringing together all these different parties. Colleagues from other institutions, researchers – this is exactly the right kind of meeting for generating exciting ideas and solutions.’

‘We kunnen zoveel van elkaar leren’ – Hildelies Balk over de Digitisation Days (19-20 mei)

Op 19-20 mei worden in Madrid de Digitisation Days gehouden. Wat valt er te beleven en waarom zou je erheen gaan? We vroegen het Hildelies Balk van de Koninklijke Bibliotheek, die voorzitter is van het bestuur van de organisator, het IMPACT Centre of Competence (IMPACT CoC). – interview en foto Inge Angevaare

Hildelies Balk in de leeszalen van de KB

Voor wie zijn de Digitisation Days interessant?

‘Voor iedereen die te maken heeft met gedigitaliseerde, historische teksten. Die zijn vaak moeilijk bruikbaar omdat de leessoftware veel fouten maakt. Dat komt bij voorbeeld omdat het originele drukwerk zelf al slecht was, of omdat de drukletter slecht leesbaar is:

voorbeeld OCR historische tekst

‘De software die de plaatjes moet omzetten in leesbare tekst maakt daarvan:

VVt Venetien den 1.Junij, Anno 1618.
DJgn i f paffato te S’ aö’Jifeert mo?üen/bah
.)etgi’uotbciraetail)i.r/JtmelchontDecht
te / sbnbe bele btr felbrr geiufttceert baer bnber
eeniglje jprant o^fen/bie ftcb .met beSpaenfcbeu
enbeeemgljen bifet Cbeiiupcen berbonbru befe

‘De KB en andere bibliotheken willen dit soort teksten in bruikbare vorm aanbieden aan wetenschappers. Dus zoeken we al sinds 2008 in Europees verband naar methoden om de teksten te verbeteren, liefst automatisch. Het unieke aan het IMPACT Centre of Competence én van de Digitisation Days is dat daar drie belangengroepen bij elkaar komen die elkaar versterken:

  • instellingen met collecties die gedigitaliseerd zijn (bibliotheken, archieven, musea)
  • onderzoekers die methoden ontwikkelen om gedigitaliseerde tekst te verbeteren (beeldherkenning en -verbetering, patroonherkenning, taaltechnologie)
  • leveranciers van producten en diensten voor digitalisering en OCR (optical character recognition).

‘Door de aanwezigheid van al deze mensen krijgt de bezoeker in twee dagen tijd een compleet overzicht van wat er momenteel allemaal mogelijk is – op het gebied van documentanalyse, taaltechnologie en post-correctie van OCR.’

Wat zie jij als het grootste nut van het Centre of Competence en de Digitisation Days?

‘Het IMPACT Centre of Competence helpt erfgoedinstellingen belangrijke beslissingen te nemen. We evalueren bestaande tools en publiceren daarover. Er is zelfs heel goede evaluatiesoftware. En we leveren begeleiding; als een instelling wil gaan digitaliseren kunnen wij ze van advies dienen. Wat zijn de beste tools en methoden in hun specifieke geval? Wat voor kwaliteit mag je verwachten? Wat gaat het kosten?’

‘De Digitisation Days zijn een perfecte manier voor erfgoedinstellingen om elkaar te ontmoeten, uitgebreid ervaringen en kennis te delen. Bijvoorbeeld: Hoe ga je om met leveranciers? Hoe geef je digitalisering een plek in je organisatie? Maar ook: hoe zetten we nieuwe projecten op? Hoe vinden we geldstromen? Op de tweede dag is er een workshop waarin we met belangstellenden gaan praten over de onderzoeksagenda voor digitalisering. Waar moeten we de nadruk op leggen? Meer kwantiteit of meer kwaliteit? Hoe kunnen we de plannen en budgetten van Europa beïnvloeden?’

Nu je het over Europa hebt: IMPACT, IMPACT Centre of Competence, SUCCEED – de aankondiging van de Digitisation Days staat vol met afkortingen. Kun je een beetje orde scheppen in die chaos?

‘IMPACT was het eerste Europese onderzoeksproject voor verbetering van toegang tot historische teksten dat mede op initiatief van de KB in 2008 is gestart. Toen het project afgelopen was, hebben een aantal IMPACT-partners de handen ineengeslagen om ervoor te zorgen dat de resultaten van het project onderhouden en verder ontwikkeld zouden worden. Dat is het IMPACT Centre of Competence. Geen project, maar een staande organisatie.’

‘Succeed is weer een Europees project en dus tijdelijk. De doelstellingen liggen helemaal in lijn met het IMPACT CoC, en daarom zijn er deels dezelfde partners bij betrokken. Doel is om te zorgen dat eindresultaten van Europese projecten op het gebied van de digitale bibliotheek goed onder de aandacht worden gebracht zodat ze gebruikt gaan worden in de praktijk. In het verleden bleven prototypes nog wel eens op de plank liggen. Dat is zonde van de investering.’

Wordt de stap van theorie naar praktijk echt gezet?

‘Jazeker! Die willen we juist alle aandacht geven. Daarom reiken we tijdens de Digitisation Days de Succeed awards uit – prijzen voor de beste toepassingen van innovatieve oplossingen. De jury heeft onlangs de kandidaten en de winnaars bekend gemaakt.’

Waar verheug jijzelf je het meest op tijdens de Digitisation Days?

‘Op de ontmoeting, het bij elkaar brengen van al die belanghebbenden. Collega’s van andere instellingen, de onderzoekers – juist uit de ontmoeting komen vaak spannende ideeën en oplossingen voort.’

© 2018 KB Research
