This week, the annual DHBenelux conference will take place in Belval, Luxembourg. It will bring together practically all DH scholars from Belgium (BE), the Netherlands (NE) and Luxembourg (LUX). You can read the full program and all abstracts on the website. Two presentations are by members of our DH team (Steven Claeyssens & Martijn Kleppe) and one presentation is by our current researcher in residence (Puck Wildschut – Radboud University Nijmegen). Please find the first paragraphs of their abstracts below:
Each year the KB invites two academics to come and work with us as researchers in residence: early career researchers who work in the library with our Digital Humanities team and KB data. Together we address their research questions in a six-month project using our digital collection and computational techniques. The output of the project will be incorporated in the KB Research Lab. Today we are happy to announce the output of the PhoCon project (‘Photos in and Out of Context’) by dr. Martijn Kleppe and dr. Desmond Elliott: the KBK-1M Dataset, containing 1.6 million newspaper images.
Prof. dr. Frank Huysmans is extraordinary professor of library science at the University of Amsterdam. His chair is funded by the National Library of the Netherlands (KB). He blogs regularly on his website warekennis.nl, where he recently discussed three Dutch dissertations on library science. We are happy to reblog his post, originally written in Dutch, below.
Three library science doctorates in eight weeks
Sometimes nothing seems to happen for a year. Or even longer. PhD candidates toil on, working in silence on their Great Work. Then suddenly everything is finished and, within a short time, three of those tomes land on your plate. That sounds like a chore, and it is, but it is also a feast.
This blog is written by dr. Pim Huijnen and research programmer Juliette Lonij. Pim worked as a Researcher-in-residence at the KB in the first half of 2015. The tool that is discussed below is available at https://github.com/jlonij/keyword_generator.
Historical newspapers have traditionally been popular sources to study public mentalities and collective cultures within historical scholarship. At the same time, they have been known as notoriously time-consuming and complex to analyze. The recent digitization of newspapers and the use of computers to gain access to the growing mass of digital corpora of historical news media are altering the historian’s heuristic process in fundamental ways.
This blog post was written by dr. Martijn Kleppe and is reblogged from www.martijnkleppe.nl (17 April 2015). Since publication, some aspects of the research have changed. Martijn will soon write a more extensive, English-language blog post about this.
Since 1 April I have been a ‘researcher in residence’ for six months at the research department of the Koninklijke Bibliotheek, working on my project ‘FoCon – Foto’s in en uit context’ (Photos in and out of Context). It is a great opportunity because I get the room to explore the KB’s digital newspaper and magazine collection as well as its web archive, focusing mainly on the published visual material.
Author: Tineke Koster
As I am writing this, volunteers are rekeying our 17th-century newspaper articles. Optical character recognition of the gothic typeface in use at the time has yielded poor results, making this part of our digital collection nearly inaccessible for full-text search. The Meertens Institute, which has an excellent track record when it comes to crowdsourcing, has developed the editor (Dutch). Together with them we are working towards a full update of all newspaper issues from 1618 to 1700 that are available on our website Delpher.
Great news and, for some researchers, an eagerly awaited development. A bright future beckons in which our digital text corpus is 100% correct, just waiting to be mined for dynamic phenomena and paradigm shifts.
But we have to realize that without the proper precautions, correcting digital texts may also hinder researchers in their work. How so? These texts may have been used (browsed, mined, cited, etc.) by researchers in their earlier form. The improvement or enrichment may have consequences for the reproducibility of their research results.
For all researchers the need to reproduce research results is growing, with new guidelines due to new laws. There is also a specific group of researchers that need sustained access to older versions of digital text. The need is highest for research where the goal is to develop an algorithm and to assess its quality relative to previous versions of the same algorithm or to other algorithms. Without sustained access to older versions, these people cannot do their work.
Is it our role to provide this access? How the National Library of the Netherlands is thinking about this issue, I hope to explain in a later blogpost (soon!). Meanwhile, I would be very interested to hear your experiences. How is this subject discussed in your organization? Does your organization have a policy in place to deal with this?
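One way to picture what ‘sustained access to older versions’ could mean in practice is the minimal sketch below. It is an illustration only, not a description of how the KB or Delpher stores text: the `TextVersion` and `VersionStore` names, the `article_id` values and the record layout are all hypothetical. The idea is that a corrected text is added next to the OCR text instead of overwriting it, and that each version gets a SHA-256 fingerprint a researcher can cite.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class TextVersion:
    """One immutable snapshot of an article's text (hypothetical layout)."""
    article_id: str  # hypothetical identifier, not a real Delpher ID scheme
    source: str      # for example "ocr" or "rekeyed"
    text: str

    @property
    def fingerprint(self) -> str:
        # SHA-256 of the UTF-8 text: any correction yields a different value.
        return hashlib.sha256(self.text.encode("utf-8")).hexdigest()


class VersionStore:
    """Keeps every version of a text instead of overwriting the old one."""

    def __init__(self) -> None:
        self._versions: dict[tuple[str, str], TextVersion] = {}
        self._history: dict[str, list[str]] = {}

    def add(self, version: TextVersion) -> str:
        """Store a version and return the fingerprint to cite."""
        fp = version.fingerprint
        key = (version.article_id, fp)
        if key not in self._versions:
            self._versions[key] = version
            self._history.setdefault(version.article_id, []).append(fp)
        return fp

    def get(self, article_id: str, fingerprint: str) -> TextVersion:
        """Return exactly the version that was cited."""
        return self._versions[(article_id, fingerprint)]

    def history(self, article_id: str) -> list[str]:
        """Fingerprints of all stored versions, oldest first."""
        return list(self._history.get(article_id, []))


store = VersionStore()
ocr_fp = store.add(TextVersion("art-1", "ocr", "Tijdinghe uyt Amfterdam"))
new_fp = store.add(TextVersion("art-1", "rekeyed", "Tijdinghe uyt Amsterdam"))
```

A researcher who recorded `ocr_fp` alongside their results can still retrieve the uncorrected text with `store.get("art-1", ocr_fp)` after the rekeyed version has been added.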
Great answer by Dot Porter to the OCLC report: ‘What if we do, in fact, know best?: A Response to the OCLC Report on DH and Research Libraries’, on dh+lib.
Dot Porter’s response can be reinforced by a quote from DH ‘silverback’ Andrew Prescott from his influential essay An Electric Current of the Imagination: What the Digital Humanities Are and What They Might Become in http://journalofdigitalhumanities.org/1-2/:
“digital humanities […] has often developed from libraries and information services and it is frequently seen as a support service. One of the things that I am proudest of in my career is the way in which I have moved between being a curator, an academic, and a librarian. Museums, galleries, libraries, and archives are just as important to cultural health as universities. Indeed, I have found my time as a curator and librarian consistently far more intellectually exciting and challenging than being an academic.”
Watch KB director general Bas Savenije talk on This Week in Libraries about how the KB, a national library, will integrate its infrastructure with that of public libraries. Also good stuff on why Europe needs to work together in The European Library: ‘by working together on developing tools and services you can all share, you free up efforts in your library for other services you can offer your users!’
A drastic cut was made in the budget for the Connecting Europe Facility (CEF) from 9 billion to 1 billion euros. This will hit Europeana, the infrastructure supporting Europe’s free digital library, museum and archive, very hard. Europeana is now being asked to put the case for funding under the revised guidelines for CEF, which were issued 28 May 2013. Europeana will face severe competition for the available funding from other digital service infrastructure such as e-Justice, e-Health and Safer Internet. All good causes in their own right, but the wonderful digital culture infrastructure that has been built in the last decade will soon get squashed if we do not speak out now! So here goes:
Here is a summary of the three arguments for funding:
1. Europeana supports economic growth.
Some Impact Indicators:
- To date, 770 businesses, entrepreneurs, educational and cultural organisations are exploring ways of including Europeana information in their offerings (websites, apps, games etc.) through our API. See examples such as inventingeurope.eu and http://www.zenlan.com/collage/europeana.
- Digital heritage creates jobs – in Hungary, for example, over 1,000 graduates are now involved in digitising heritage that will feed in to Europeana. Historypin in the UK predicts it will double in size with the availability of more open digital cultural heritage.
2. Europeana connects Europe.
‘People often speak about closing the digital divide and opening up culture to new audiences but very few can claim such a big contribution to those efforts as Europeana’s shift to cultural commons.’ Neelie Kroes, Vice President of the Commission
3. Europeana makes Europe’s culture available for everyone.
In 2012, all 20m Europeana records were released under a Creative Commons Zero public domain dedication making them available for re-use both commercially and non-commercially. Europeana’s CC0 release is a ‘coup d’état’ that ‘will help to establish a precedent for other galleries, libraries, archives and museums to follow – which will in turn help to bring us that bit closer to a joined up digital commons of cultural content that everyone is free to use and enjoy.’ Jonathan Gray, Open Knowledge Foundation.
For those unaware of Europeana – here is what they do:
Europeana has been transformative in opening up data and access to cultural heritage and now leads the world in accessible digital culture that will fuel Europe’s digital economy. Through Europeana today, anyone can explore 27 million digitised objects including books, paintings, films and audio.
Europeana is a catalyst for change for cultural heritage
– Because they make cultural heritage accessible online.
– Because they have standardised the data of over 2,200 organisations, covering all European countries and 29 European languages.
– Because they provide creative industries and business start-ups with rich, interoperable material, complete with copyright information.
– And because they ensure that every citizen, whether young or old, privileged or deprived, can be a digital citizen.
So please support Europeana by tweeting, blogging, facebooking and whatever other media you like, using the hashtag #AllezCulture!
On 25 March 2013 the BL launched their Labs project. As KB Research is also setting up a Lab, we follow whatever happens at the BL in this area with keen interest.
The main objective for the BL in the Labs is to engage with users of the digital collections, says Aly Conteh, who heads the digital research and curator team at the BL: ‘Humanities researchers are now able to work with new types of resources, using new technologies, and the BL wants to understand what is required from us’. The scholarly landscape is in transformation and will continue to change. Libraries must change with it, not only in their services but also in the capabilities of their staff. As all curators in the BL are to be digital curators, a training program has been set up to take curators through a new digital scholarship curriculum, from text mining on large datasets to the use of social media. The BL aims to develop new ways of working with scholars – but first they need to know what it is these scholars want.
The Labs provide the following:
- A wiki space where scholars in the humanities and developers can meet
- Access to available collections
- Developer support for research in the digital collections
- Opportunities for developers to make tools or apps on the digital collections
- Hackathons and workshops
The Labs are kicked off with a competition for projects that explore the BL resources – there’s 3,000 GBP plus a summer residency at the BL for the researcher and/or developer with the best project idea. The BL is looking for cross-collection search/analysis and the use of novel techniques. ‘The best idea’, says recently appointed Labs Manager Mahendra Mahey, ‘is the one that also helps the BL learn how to support scholars and developers’.
The launch was a low-key affair, mainly testing the water with the digital humanities community. And a very sensible thing to do too – whatever you build without involving this very intelligent and discriminating crowd will not be used. There were thirty to forty people from organisations like the Open Knowledge Foundation, partner institutions like the BBC, and UK digital humanities groups at universities like King’s College, UCL and the University of Hertfordshire. We were shown examples of Digital Humanities projects, BL content, and tools and techniques for working with datasets.
A few lines of code
I expect all presentations to come online in the next few days, so I will not repeat them here. I just wish to finish with the Do’s and Don’ts learned from this launch:
- Involve users before, during and always in everything you do. Whatever you think of without them, you might as well not think of – it will not be used. Very wisely, the BL has formed an advisory board of partner institutions and leading figures in the digital humanities to help them shape the lab.
- There was feedback from the friendly but critical crowd on all details of the plan, and most of it was very relevant. The best one: on top of offering an overview of collections, make available to us a dataset of ten pages per collection, with available metadata, OCR etc. – so we can judge the quality of the material before proposing any research on it.
- Do not bother to develop too many (or any?) tools or services yourself. Tony Hirst of The Open University gave a dazzling overview of tools and techniques that are already out there – you just need a ‘few lines of code’ to connect these to your database with content you have picked up from the BL.
- To help researchers fit tools to the data, to write these ‘few lines of code’, make development capacity available in the lab for your users.
- Partner up with other content holders to foster cross-collection research.
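The ‘ten pages per collection’ suggestion above really does come down to a few lines of code. The sketch below is one possible way to draw such a preview sample, assuming each collection is simply a list of page identifiers. The `sample_pages` function, the collection names and the identifiers are made up for illustration and do not refer to any actual BL or KB dataset.

```python
import random


def sample_pages(collections, per_collection=10, seed=0):
    """Pick up to `per_collection` random pages from every collection.

    `collections` maps a collection name to a list of page identifiers.
    A fixed seed makes the preview sample reproducible.
    """
    rng = random.Random(seed)
    preview = {}
    for name, pages in collections.items():
        pages = list(pages)
        # Small collections are returned in full (in random order).
        k = min(per_collection, len(pages))
        preview[name] = rng.sample(pages, k)
    return preview


# Hypothetical collections with made-up page identifiers.
collections = {
    "newspapers": [f"news-{i:04d}" for i in range(250)],
    "maps": ["map-0001", "map-0002"],
}
preview = sample_pages(collections)
```

Fixing the seed means everyone who asks for the preview sees the same ten pages, which keeps discussions about the quality of the material comparable.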
It was an inspiring day in snowy London – cannot wait until we have something to show!