KB Research

Research at the National Library of the Netherlands

Month: March 2013

What Do Scholars Want? British Library Labs launched

On 25 March 2013 the BL launched their Labs project. As KB Research is also setting up a lab, we follow whatever happens at the BL in this area with keen interest.


The main objective of the BL Labs is to engage with users of the digital collections, says Aly Conteh, who heads the digital research and curation team at the BL: 'Humanities researchers are now able to work with new types of resources, using new technologies, and the BL wants to understand what is required from us.' The scholarly landscape is in transformation and will continue to change. Libraries must change with it, not only in their services but also in the capabilities of their staff. As all curators at the BL are to become digital curators, a training programme has been set up to take them through a new digital scholarship curriculum, from text mining on large datasets to the use of social media. The BL aims to develop new ways of working with scholars – but first they need to know what it is these scholars want.

The Labs provide the following:

  • A wiki space where scholars in the humanities and developers can meet
  • Access to available collections
  • Developer support for research in the digital collections
  • Opportunities for developers to make tools or apps on the digital collections
  • Hackathons and workshops

Competition

The Labs kicked off with a competition for projects that explore BL resources – there is £3,000 plus a summer residency at the BL for the researcher and/or developer with the best project idea. The BL is looking for cross-collection search/analysis and the use of novel techniques. 'The best idea', says recently appointed Labs Manager Mahendra Mahey, 'is the one that also helps the BL learn how to support scholars and developers'.

The launch was a low-key affair, mainly testing the water with the digital humanities community. And a very sensible thing to do too – whatever you build without involving this very intelligent and discriminating crowd will not be used. There were thirty to forty people from organisations like the Open Knowledge Foundation, partner institutions like the BBC, and UK digital humanities groups at universities like King's College London, UCL and the University of Hertfordshire. We were shown examples of digital humanities projects, BL content, and tools and techniques for working with datasets.

A few lines of code

All presentations will come online in the next few days, I expect, so I will not repeat them here. I just wish to finish with the do's and don'ts learned from this launch:

  1. Involve users before, during and always in everything you do. Whatever you think of without them, you might as well not think of – it will not be used. Very wisely, the BL has formed an advisory board of partner institutions and leading figures in the digital humanities to help shape the lab.
  2. There was feedback from the friendly but critical crowd on all details of the plan, and most of it was very relevant. The best suggestion: on top of offering an overview of collections, make available a dataset of ten pages per collection, with available metadata, OCR etc., so that researchers can judge the quality of the material before proposing any research on it.
  3. Do not bother to develop too many (or any?) tools or services yourself. Tony Hirst of The Open University gave a dazzling overview of tools and techniques that are already out there – you just need a 'few lines of code' to connect them to a database with content you have picked up from the BL.
  4. To help researchers fit tools to the data – to write these 'few lines of code' – make development capacity available in the lab for your users.
  5. Partner up with other content holders to foster cross-collection research.
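Those 'few lines of code' from point 3 can be surprisingly few indeed. As an illustration only (the file name is hypothetical, and this is not any tool shown at the launch), here is a minimal Python sketch that connects a downloaded plain-text collection sample to a standard-library word-frequency analysis:

```python
from collections import Counter
import re

def top_words(path, n=10):
    """Return the n most frequent words in a plain-text file."""
    with open(path, encoding="utf-8") as f:
        words = re.findall(r"[a-z']+", f.read().lower())
    return Counter(words).most_common(n)

# Hypothetical usage with a sample file downloaded from a digital collection:
# print(top_words("bl_sample_collection.txt"))
```

Anything beyond this – linking results back to catalogue records, say – is exactly where the lab's in-house development support (point 4) comes in.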

It was an inspiring day in snowy London – I cannot wait until we have something to show!

The speakers at launch were: @pmgooding @marcgalexander @MappingMetaphor @psychemedia @DigiPalProject @noeL_maS @pj_webster

More than text

Barbara Sierman, Marcel Ras

On 18 and 19 March an interesting conference was held in Hannover, with the theme Non-textual Information: Strategy and Innovation beyond Text. Several speakers addressed the fact that scholarly information nowadays is more than an article or a book. Jan Brase of DataCite called on libraries to no longer let their catalogue be a window on their own holdings, but a window with references to trusted providers of content held elsewhere. He also gave a nice definition of research data: 'Anything that is the foundation of further research is research data.' That definition suits our enormous collections of digitised material well. Todd Carpenter of NISO pointed out that supplemental files accompanying articles are generally on the rise, and mentioned as an example a biomedical journal in which supplementary material is added to 95% of the articles – a clear increase over just a few years. Incidentally, supplying supplementary material is not a new, digital phenomenon; it happened in the paper world as well.

NISO distinguishes three kinds of supplemental files: Integral Content, Additional Content and Related Content. (Note, he warned, that the publisher determines what counts as a supplemental file, not the form in which it appears.) This is described in detail in the NISO report NISO RP-15-2013, Recommended Practices for Online Supplemental Journal Article Materials – important to keep in mind when we define policies for the International e-Depot on what we want to preserve. An interesting quote from Todd Carpenter: 'it is expensive to care for metadata, but it is even more expensive not to care'.
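For our own policy work, the three NISO categories (Integral Content, Additional Content, Related Content) could be captured in something as simple as an enumeration plus a preservation rule. A minimal Python sketch – the policy mapping itself is a hypothetical example, not an e-Depot decision:

```python
from enum import Enum

class SupplementalType(Enum):
    """The three categories of supplemental files from NISO RP-15-2013."""
    INTEGRAL = "integral content"
    ADDITIONAL = "additional content"
    RELATED = "related content"

# Hypothetical preservation policy: archive integral and additional
# content with the article; related content is referenced, not stored.
ARCHIVED = {SupplementalType.INTEGRAL, SupplementalType.ADDITIONAL}

def should_archive(kind: SupplementalType) -> bool:
    return kind in ARCHIVED
```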

Leibniz

Jill Cousins gave an overview of the state of affairs regarding Europeana.

Guido Herman of STM/Thieme noted that of the 23,000 STM journals published annually, 90% are digital. That amounts to roughly 1.4 million articles per year. This number grows by about 3% annually, while the number of journal titles grows by 3.5% a year. Where in 1952 someone could still win a Nobel Prize on the basis of a two-page article with a single figure, that is impossible now, and supplemental files keep expanding, which does not always help findability and reuse. If facts are to lead to information and information to knowledge, the question is whether the development of knowledge is now actually better served. Besides making a plea for trusted repositories, he also wondered whether the authors of scholarly articles and data should not contribute more to the sustainability of the datasets.

Puneet Kishor gave us a short preview of version 4.0 of the Creative Commons licence, due within a few weeks. The main change compared with version 3 is that it is no longer the work itself that is licensed; instead, it is indicated which rights apply to the work. His example was an amateur video of people dancing to music: the music does not fall under the licensed rights, but the video itself does.

Olivier Koepler gave a demonstration of a new search method for research data: searching on statistical curves in a dataset, after which the results can be refined by discipline.
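Searching by curve shape can be sketched as nearest-neighbour matching on normalised curves. The following Python fragment is an illustrative assumption about how such a search might work, not Koepler's implementation, and the sample data are invented:

```python
import math

def normalise(curve):
    """Scale a curve to the 0..1 range so shape, not magnitude, is compared."""
    lo, hi = min(curve), max(curve)
    span = (hi - lo) or 1.0  # avoid division by zero for flat curves
    return [(v - lo) / span for v in curve]

def distance(a, b):
    """Euclidean distance between two equally sampled curves."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_match(query, dataset):
    """Return the id of the stored curve whose shape is closest to the query."""
    q = normalise(query)
    return min(dataset, key=lambda cid: distance(q, normalise(dataset[cid])))
```

A refinement by discipline, as demonstrated, would then simply filter the `dataset` dictionary before matching.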

Brian McMahon of the International Union of Crystallography (IUCr) made us shiver at the possibilities of present-day CIF files (which we also have in the e-Depot): from within the article you can call up an animation and play it in specific accompanying software. Using the data belonging to the article, different views of the animation can be chosen. The result is an 'enriched publication' of a high order – a level above a PDF with an accompanying image, and beyond what we have tried so far with our Enhanced Publications. Fortunately he has described in articles how the crystallographers approach this, but to provide access we will still have to master this knowledge ourselves. Incidentally, the IUCr also adds the full peer review process to the data they put online (and very probably delivers this information to the e-Depot as supplemental files as well). This allows researchers to follow the entire process of the article and its quality control.
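CIF itself is a plain-text format: named data blocks containing `_tag value` pairs, plus `loop_` tables and semicolon-delimited text fields, which is part of why such rich behaviour can be layered on top of it. A toy Python reader for the simple tag–value case, just to show the shape of the format – real CIF parsing needs far more care:

```python
def read_cif_tags(text):
    """Extract simple _tag value pairs from CIF text.

    Toy sketch only: real CIF files also contain loop_ tables and
    multi-line ;-delimited values, which a proper parser must handle.
    """
    tags = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("_"):
            parts = line.split(None, 1)
            if len(parts) == 2:
                tags[parts[0]] = parts[1]
    return tags

# Invented sample fragment in CIF style:
sample = """\
data_example
_cell_length_a 5.431
_cell_length_b 5.431
_symmetry_space_group_name_H-M 'F d -3 m'
"""
```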

Giving access to large amounts of data by means of data visualisations was shown by Microsoft (Rob Fatland), and from the Netherlands Remco Veltkamp told how research into patterns in music will make it possible to examine whether folk music has influenced later twentieth-century music. Unfortunately some talks were cancelled, among them Thomas Bär on the digital preservation of AV materials. The new European project DuraArk was launched by Jakob Beetz of Eindhoven University, though his introduction ran so long that the final slide on preservation was rushed through; still, one to keep an eye on – if only because the construction industry turned out to be an enormously inefficient sector that wants to improve through a preservation project!

All in all a very instructive conference, with plenty of food for thought and research.

Digital preservation – The cost of doing nothing

Author: Barbara Sierman
Originally posted on: http://digitalpreservation.nl/seeds/the-cost-of-doing-nothing


Lately there has been much debate on the fact that, over the years, the digital preservation community has managed to create a collection of more than a dozen cost models, making the confusion for everyone starting out in digital preservation even bigger. Maybe this is part of the way things go: everyone sees their own situation as something special with special needs. The solution? Tailoring an existing model or developing a new one. We can expect help from the recently started European project 4C, "The Collaboration to Clarify the Costs of Curation". In their introduction they state that "4C reminds us that the point of this investment [in digital preservation] is to realise a benefit". Less emphasis on the complexity of digital preservation, and more on the benefits.

Some people think that talking about digital preservation in terms of complexity and costs sounds more negative than thinking in terms of opportunities (or challenges) and benefits. But in both cases, you will need the same hard figures about the costs you incur as an organisation and the benefits that arise from them. The latter is not easy to establish, but the work of Neil Beagrie and his team shows that it is possible to measure the benefits.

If we had better figures on the benefits of preserving digital material, we would be in a better position to estimate what it would cost us if digital material were not preserved – if digital objects were left to die, intentionally or not. How much damage is done to society if crucial information is not preserved? Recently the concern was raised that some interesting websites, containing the research results of a project that lasted several years, might not be harvested and preserved in a digital archive. The consequence would be a tremendous loss for the community in the related research discipline. This is clearly an incentive for preservation!

I remember that when the Planets project was proposed, it was argued that the obsolescence of digital information in Europe, in case no action were taken to preserve it, could cost the community an astonishing 3 billion euro a year. I could not find a source for this assumption, only a reference to some articles. One of them described the amount of data created worldwide. The other described the costs for an organisation lacking proper tools to manage data (getting access, searching, not finding, etc.). It may be that the Planets assumption derived from this information was used as an illustration to make the case for digital preservation (the number of stories in the Atlas of Digital Damages does not prove the assumption).

But in essence, it is these kinds of figures (and their supporting evidence) that we need to have at hand: not only demonstrating the costs of digital preservation, but also demonstrating what it would cost society if we did not preserve things.

MOOCs in the Netherlands by Surf Academy

The SurfAcademy, a programme set up to encourage knowledge exchange between higher education institutions in the Netherlands, organised a seminar on MOOCs (Massive Open Online Courses) on 26 February. Several Dutch institutions have started MOOCs on various platforms and subjects, so the special interest group Open Educational Resources (OER) of Surf thought it was time to share experiences and open up the discussion for institutions that wish to jump on this fast-moving train.

As the National Library of the Netherlands, the Koninklijke Bibliotheek does not normally provide education, but we do work together with the Dutch universities (of applied sciences) and we are happy to share knowledge with our colleagues and users. Also, as one of the founding members of the IMPACT Centre of Competence in text digitisation, we were asked to think about how we can best share the knowledge gathered in the four-year research project IMPACT. Perhaps a MOOC would be a good idea?

The afternoon had an ambitious programme, filled with experiences and interesting observations. I thought the most interesting parts were the presentations of the universities currently working with MOOCs in the Netherlands: Leiden University, presented by Marja Verstelle; the University of Amsterdam, presented by Frank Benneker; and the Technical University Delft, presented by Willem van Valkenburg.

[slideshare id=16787746&w=427&h=356&sc=no]

It is interesting to see the different choices each institution made for its own implementation of a MOOC. Leiden chose to work with Coursera and TU Delft joined EdX, while Amsterdam built its own platform (forever in beta) in only two months and for just 20,000 euro with a private partner. Each has its own reasons for these choices, such as flexibility (Amsterdam), openness (Delft) or ease (Leiden). Amsterdam is the only university that has already started its MOOC, with great success (4,800 participants in the first week); Leiden plans to start in May 2013 and Delft follows in September.

Another interesting presentation was the one by Timo Kos, of both Khan Academy and Capgemini Consulting. He shared the results of two projects he did on OER, including MOOCs. He showed us that MOOCs are not a technical hype, because they use no new technologies, but merely combine existing ones for a new purpose. MOOCs can, however, be regarded as a disruptive innovation, although, as he said in the panel discussion at the end of the day, we do not have to fear that real-life universities will be pushed out by MOOCs.

[slideshare id=16790022&w=427&h=356&sc=no]

All in all, I thought it was a very instructive day with lots of food for thought. Most presentations are unfortunately in Dutch, but they can be found on the website of the Surf Academy, where you will also find the videos made during the seminar. The English presentations have been embedded or linked to in this post.

Some of the questions and insights I took home with me:

  • Leiden and Amsterdam chose to create shorter videos for their MOOCs, while Delft will record regular classes. When do you choose which approach?
  • Do you want to use a platform of your own or will you sign up with one of the existing ones? (Examples: Coursera, EdX, Udacity, canvas.net)
  • Coursera takes 80-90% of the money made in a MOOC and sells its users' data to third parties. (I have to say I did not fact-check this one!)
  • Do you want to get involved in the world of MOOCs as a non-top-50 university or even as a non-educational institute? The BL will do so, by joining FutureLearn.
  • PR of your MOOC is very important, especially if you use your own platform. However, getting a news item on the Dutch 8 o’clock news will probably mean one server is not enough for the first class.
  • The success of a MOOC also depends on the reputation of your institution.
  • Do students feel they are studying at an institute/university or at, say, Coursera?
  • Using a MOOC towards your own degree is possible when you take the exam in/with a certified testing centre, such as Pearson or ProctorU.
  • If you plan to go into online education, when do you consider it a MOOC and when is it simply an online course?

Trusted access to scholarly publications

In December 2012 the 3rd Cultural Heritage Online conference was held in Florence. The theme of the conference was "Trusted Digital Repositories and Trusted Professionals". At the conference a presentation was given on the KB international e-Depot, entitled "The international e-Depot to guarantee permanent access to scholarly publications".


The international e-Depot of the KB is the long-term archive for international academic literature for Dutch scholars, operating since 2003. This archival role is of importance because it enables us to guarantee permanent access to scholarly literature. National libraries have a depository role for national publications. The KB goes a step further and also preserves publications from international, academic publishers that do not have a clear country of origin. The next step for the KB is to position the international e-Depot as a European service, which guarantees permanent access to international, academic publications for the entire community of European researchers.

The trend towards e-only access for scholarly journals is continuing rapidly, and a growing number of journals are ‘born digital’ and have no printed counterpart. For researchers there is a huge benefit because they have online access to journal articles, anywhere, any time. The downside is an increasing dependency on digital access. Without permanent access to information, scholarly activities are no longer possible. But there is a danger that e-journals become “ephemeral” unless we take active steps to preserve the bits and bytes that increasingly represent our collective knowledge.

We are all familiar with examples of hardware and software becoming obsolete. On top of this threat of technical obsolescence there is the changing role of libraries. In the past, libraries assumed preservation responsibility for the material they collected, while publishers supplied the material libraries needed. These well-understood divisions of labour do not work in a digital environment, especially when dealing with e-journals.

Research and development in digital preservation have matured. Tools and services are being developed to help perform digital preservation activities. In addition, third-party organisations and archiving solutions have been established to help the academic community preserve publications and to advance research in sustainable ways. As permanent access to digital information is expensive, co-operation is essential, with each organisation having its own role and responsibility.

The KB has invested in order to take its place within the research infrastructure at European level and the international e-Depot serves as a trustworthy digital archive for scholarly information for the European research community.
