"In Depth" archive
"In Depth" is a collaborative section that provides edited coverage of Project deliverables; these are usually performed by national libraries and partners of The European Library.
The European Library office produces a bimonthly newsletter in English, to which you can subscribe using the form in the right-hand column. The newsletter informs partners, professionals and general readers about recent events and developments related to The European Library; it serves both as a source of news and as a platform for exchanging information among all partners.
To view the full content of the latest newsletter or of older issues, select the relevant newsletter on the following page.
RSL OPAC – 3 million records in 200 languages!
Natalia N. Kasparova (Head of the Catalogue System Board) and Mikhail E. Shvartsman (Head of the Computer Systems Research Department) tell us more about the Russian State Library (RSL) collections and their incredible holdings.
"The OPAC (Online Public Access Catalogue) of the RSL contains bibliographic information on rare and modern printed books, periodicals, maps, music, electronic resources and other documents – in total 3.8 million records," says Natalia. "The bibliographic records give access to roughly 3 million titles in more than 200 languages: Russian, of course, but also 80 languages spoken within the Russian Federation and 50 foreign languages. The material available in our OPAC covers a vast chronological period, from the XVI to the XXI century. The RSL electronic collections are available through the 'Electronic Library' option on the www.rsl.ru site and through The European Library Collections section."
Alongside the OPAC, the RSL provides access to several e-collections: "The electronic collections of the RSL consist mainly of books published in Russian in the USSR and in Russia. There is also a collection of 30,000 books issued during the XIX century and published in European languages," adds Mikhail.
"This collection is stored on CDs and we are planning to incorporate it into the Electronic Library. Our main electronic collections were set up via special projects, essentially financed by scientific or humanitarian foundations."
"Here is a short list of those collections:
- Memory of Russia – 19,000 pages; pioneering Russian books of the XV-XVI centuries in Old Slavonic;
- Meeting at the Frontiers – 20,000 pages; a joint RSL/Library of Congress project on the digitization of documents on the history of the development of Alaska and Siberia;
- Open Russian Electronic Library (OREL) – 15,000 books, maps, printed and sheet music, and open-access theses;
- Electronic Library of dissertations – 180,000 theses;
- Electronic library of limited access (for RSL visitors only) – 9,000 items."
"The RSL doesn’t have any national programme of digitization; we implement our own institutional programmes. Because of financial restrictions, our OCR (Optical Character Recognition) activity is not widespread. We use batch recognition instead. Double-layer PDF files are created: the first layer is the page image and the second is the recognised text. This text, although not visible to the end user, is used by the search tool."
Towards Multilingual "Search & Result"
One of The European Library’s objectives is to provide information in all partners’ native languages; this means having the portal interface translated into the languages of all full members. It also implies a more ambitious target: extending the multilingual capabilities of the portal, that is, understanding queries in the user's language and returning the appropriate search results in that same language. EDLProject Work Package 2 aims to integrate the outcomes of multilingual access research into The European Library portal.
The European Library language policy follows guidelines similar to those of Europa, the portal site of the European Union: "As far as possible, the aim is to provide the public with the information they are looking for in their own language." This means that the interface of the HOME and COLLECTIONS pages is translated into the languages of all Full Participants. The latest update of the portal interface integrates the Russian translation provided by the Russian State Library.
EDLProject Work Package 2 (WP2), "Extending the multilingual capacity of the network", goes further: it is developing the European Library network's localisation and multilingual capabilities to improve access for end-users, through multi-language interfaces and advanced search mechanisms implemented in a standardised way. WP2 is led by the National Library of Slovenia with input from the Swiss National Library. Since EDLProject is coming to an end, we asked Genevieve Clavel, in charge of National and International Cooperation for the Swiss National Library, and Maja Žumer, Associate Professor at the University of Ljubljana and part-time researcher for the National Library of Slovenia, to help us understand the complexity behind multilingual searching.
The European Library: "Could you explain what controlled vocabulary is?"
Genevieve Clavel and Maja Žumer: "A controlled vocabulary, contrary to freely assigned keywords, is a list of terms that are used in cataloguing and searching to describe the subjects of resources. The same list is also used when searching. Controlled vocabularies in general improve precision. Sometimes relationships (e.g. hierarchical, associative) between terms are specified, too. Most library users are familiar with subject heading lists, of which Library of Congress Subject Headings (LCSH) is a well-known example."
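The idea of a controlled vocabulary described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, the sample headings and their relationships are invented for the example and are not taken from LCSH or from any European Library system.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Term:
    """One authorised heading plus its relationships to other headings."""
    label: str
    broader: set[str] = field(default_factory=set)   # hierarchical (parent) headings
    related: set[str] = field(default_factory=set)   # associative headings
    variants: set[str] = field(default_factory=set)  # non-preferred entry terms


class ControlledVocabulary:
    """Hypothetical in-memory vocabulary shared by cataloguing and searching."""

    def __init__(self) -> None:
        self._terms: dict[str, Term] = {}
        self._entry_index: dict[str, str] = {}  # lower-cased entry term -> heading

    def add(self, term: Term) -> None:
        self._terms[term.label] = term
        for entry in {term.label, *term.variants}:
            self._entry_index[entry.lower()] = term.label

    def resolve(self, text: str) -> Optional[str]:
        """Map free wording to the authorised heading, or None if unknown."""
        return self._entry_index.get(text.strip().lower())

    def narrower(self, label: str) -> set[str]:
        """Headings that name `label` as one of their broader headings."""
        return {t.label for t in self._terms.values() if label in t.broader}


def build_sample_vocabulary() -> ControlledVocabulary:
    """Invented sample data, for illustration only."""
    vocab = ControlledVocabulary()
    vocab.add(Term("Cookery", variants={"Cooking"}))
    vocab.add(Term("Baking", broader={"Cookery"}, related={"Bread"}))
    return vocab
```

Because cataloguers and searchers both go through `resolve`, different wordings for the same subject end up on one heading, which is where the gain in precision comes from.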
The European Library: "What is Multilingual Information Access?"
Genevieve Clavel and Maja Žumer: "Multilingual access is frequently used to cover a variety of elements. At its lowest level, it may be used in reference to the user interface (display screens, user dialogue, help screens), whereas the actual access points themselves are, in fact, monolingual. In terms of work in EDLProject, Multilingual Information Access refers to two main areas: access to controlled vocabulary assigned in one language through the medium of another (e.g. entering terms in French that are linked to LCSH and thus enabling the French-speaking user to search in an English language controlled vocabulary); and translation of free-text search terms into one or more languages (either in bibliographic records or full text). Additionally, investigation is required into the feasibility of translating the results of a search into the user's chosen language."
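The first area mentioned in this answer, reaching a controlled vocabulary in one language through terms entered in another, can be sketched as a simple lookup. The link table, the records and the function name below are hypothetical and only stand in for real links between French entry terms and English-language headings.

```python
# Hypothetical links from French entry terms to English-language headings.
FRENCH_TO_HEADING = {
    "bibliothèques": "Libraries",
    "cartes": "Maps",
}

# Hypothetical records catalogued with English-language headings only.
RECORDS = [
    {"title": "Atlas of Europe", "subjects": ["Maps"]},
    {"title": "National libraries", "subjects": ["Libraries"]},
    {"title": "Library map collections", "subjects": ["Libraries", "Maps"]},
]


def search_by_french_term(term, links=FRENCH_TO_HEADING, records=RECORDS):
    """Return titles whose subject headings match the linked English heading."""
    heading = links.get(term.strip().lower())
    if heading is None:
        # No link exists for this French term, so nothing can be retrieved.
        return []
    return [r["title"] for r in records if heading in r["subjects"]]
```

In this sketch the user never sees the English heading; the French term is only an entry point into a vocabulary that stays monolingual.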
The European Library: "How feasible is it for The European Library portal to translate results into all Full Members’ languages?"
Genevieve Clavel and Maja Žumer: "Given the number of languages (over 30) it is unlikely that, for instance, a search would return translations from 29 languages into another language, for example German. It is more likely that the user will need to choose language pairs that may be implemented progressively in the same way that subject heading languages are linked in clusters."
The European Library: "When will the outcomes of EDLProject WP2 be integrated into The European Library?"
Genevieve Clavel and Maja Žumer: "In the earlier projects (TEL-ME-MOR and EDLProject) we have explored some of the options and tested some limited prototypes. We now have an overview and planned scenarios. Those will be implemented and tested as part of TELplus and the Europeana* interface."
(*Europeana is the test website of the European Digital Library.)
APPLICATION PROFILE IMPLEMENTATION IN ALEPH OAI-MODULES
The ALEPH system is commonly used by libraries to manage their collections and bibliographic records. OAI-PMH (the Open Archives Initiative Protocol for Metadata Harvesting) is a protocol that lets libraries expose the metadata of their collections for harvesting over the web. The Austrian National Library (ANL) is currently developing ALEPH-OAI "translation" software that will considerably ease partners’ compliance with The European Library Application Profile.
The European Library provides query results from national library collection records through three access methods: SRU, Z39.50 or, preferably, OAI-PMH. OAI-PMH defines a mechanism for data providers to expose their metadata in Dublin Core (DC), a simple and widely used metadata standard that grew out of an initiative to create a "digital library card catalogue" for the Web. Records from OAI collections are indexed directly in The European Library. In practice, if all collections were accessible via OAI, end-users would get the best, most relevant and quickest answers to their queries. Currently, only 33% of the collections within The European Library environment are OAI-PMH compatible.
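As a rough illustration of what harvesting Dublin Core over OAI-PMH involves, the sketch below builds a `ListRecords` request URL and parses a response in the standard `oai_dc` format. The base URL, the record identifier and the sample response are invented for the example; only the verb, the `metadataPrefix` and `set` parameters and the XML namespaces come from the OAI-PMH and Dublin Core specifications. It is not a description of how The European Library harvester works.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and the Dublin Core element set.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, set_spec=None):
    """Build a ListRecords request asking for unqualified Dublin Core."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if set_spec:
        params["set"] = set_spec  # optional: restrict to one collection
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Extract identifier and Dublin Core fields from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        fields = {}
        dc = rec.find("oai:metadata/oai_dc:dc", NS)
        if dc is not None:
            for element in dc:
                name = element.tag.split("}", 1)[1]  # drop the namespace part
                fields.setdefault(name, []).append((element.text or "").strip())
        records.append({"identifier": identifier, "fields": fields})
    return records


# Invented sample response, trimmed to the parts the parser reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample title</dc:title>
          <dc:language>de</dc:language>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""
```

A harvester would fetch `list_records_url("https://example.org/oai")` (a placeholder address), pass the body to `parse_records`, and index the resulting fields; a real one would also have to follow resumption tokens for large result sets, which this sketch leaves out.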
In order to increase the number of OAI-compatible records and help partners comply with The European Library Application Profile, the ANL is developing a system that will simplify the conversion between the OAI modules of ALEPH versions 16 and 18 and The European Library Application Profile. We asked Walter Zabel, Head of the IT department of the Austrian National Library (ANL), to explain the purpose of this initiative. "The initial objective is to set up an OAI interface that is flexible enough to easily provide additional tags to the Qualified Dublin Core schema (DC-Q). The target is of course to meet all of The European Library Application Profile requirements."
The project started during the first quarter of 2006; Walter acted as coordinator between the ANL library system vendor, the ANL system librarians, the application administrator and The European Library Office. "We decided to commission our system vendor for the development rather than develop the software ourselves. Since it is a complex piece of software, we expect fewer problems with future version changes this way. During 2006 and 2007 we received several alpha and beta versions for testing. We successfully finished the test phases with the DC-Q version in the course of last month," adds Walter. "We are currently setting up the OAI-provider software on our production system and will deliver data for 2 catalogues with 4 sub-collections by the end of the year."
These initial results allow positive projections: "The best case scenario will allow us to gradually extend the DC-Q schema as planned; consequently, the new schema will exactly meet the needs of The European Library. Data delivery in The European Library schema should be possible in the first quarter of 2008."
THE EUROPEAN LIBRARY LOG FILE ANALYSIS
The University of Padua and the Max Planck Institute for Informatics have been analysing The European Library log files throughout 2007.
Log file analysis allows The European Library to see how users interact with the portal: for example, where visitors enter the website and what actions follow this initial entry. Ultimately, log file analysis aims to define user profiles and usage patterns.
Both the University of Padua (Italy) and the Max Planck Institute for Informatics (Germany), which carry out the log file analysis, are part of the DELOS Network of Excellence on Digital Libraries; the main objective of DELOS is research through cooperation agreements with interested parties.
As part of EDLProject Work Package 1, Task 3 (Maximising usability and usage), the University of Padua focused on the identification and in-depth analysis of individual sessions (separating human users from crawlers), the users’ geographic provenance and referrer websites. The analysis of The European Library HTTP and action logs ran from October 2006 to April 2007. "We have decided to follow an innovative approach," underlines Professor Maristella Agosti from the University of Padua. "Instead of following the most common approach of analysing log data 'on the fly' with a specific target in mind, we decided to design a rich database application in order to collect and manage the log data. This log data derives from 3 different and non-homogeneous sources of The European Library: Web logs, action logs, and registered users' data. The main designer of the database application is Giorgio Di Nunzio, and Tullio Coppotelli contributed to the data analysis and assimilation." The output of this research was significant, since one of the main limitations of log analysis is the lack of information about the context in which the user operates: "The proposed methodology to analyze The European Library web logs turned out to be very effective and allowed the discovery of user behavior patterns through on-demand queries," adds Professor Agosti.
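The session identification step described above, including the separation of human users from crawlers, might look roughly like the following sketch. It is not the Padua team's method: the log entry layout, the user-agent hints and the 30-minute inactivity timeout are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Assumed heuristics: substrings that suggest an automated client, and the
# period of inactivity after which a new session is started.
CRAWLER_HINTS = ("bot", "crawler", "spider")
SESSION_TIMEOUT = timedelta(minutes=30)


def is_crawler(user_agent):
    """Very rough crawler check based on the user-agent string."""
    ua = user_agent.lower()
    return any(hint in ua for hint in CRAWLER_HINTS)


def sessionize(entries):
    """Group human requests into sessions, in order of first request.

    Each entry is a dict with hypothetical keys "ip", "user_agent" and "time".
    Requests from the same IP address and user agent belong to one session
    until the gap between two consecutive requests exceeds SESSION_TIMEOUT.
    """
    sessions = []
    latest = {}  # (ip, user_agent) -> most recent session for that client
    for entry in sorted(entries, key=lambda e: e["time"]):
        if is_crawler(entry["user_agent"]):
            continue  # crawler traffic is left out of the human sessions
        key = (entry["ip"], entry["user_agent"])
        current = latest.get(key)
        if current is None or entry["time"] - current[-1]["time"] > SESSION_TIMEOUT:
            current = []
            sessions.append(current)
            latest[key] = current
        current.append(entry)
    return sessions


# Invented sample entries using documentation-range IP addresses.
_T0 = datetime(2007, 1, 1, 10, 0)
SAMPLE_ENTRIES = [
    {"ip": "192.0.2.1", "user_agent": "Mozilla/5.0", "time": _T0},
    {"ip": "192.0.2.2", "user_agent": "ExampleBot/1.0", "time": _T0 + timedelta(minutes=1)},
    {"ip": "192.0.2.1", "user_agent": "Mozilla/5.0", "time": _T0 + timedelta(minutes=10)},
    {"ip": "192.0.2.1", "user_agent": "Mozilla/5.0", "time": _T0 + timedelta(hours=2)},
]
```

Storing the resulting sessions in a database, as the Padua team chose to do with the raw log data, is what makes later on-demand queries possible without re-reading the log files.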
EDLProject Work Package 1, Task 3 also involved the Max Planck Institute for Informatics, which analysed The European Library action logs (server log files) from December 2006 to May 2007 in combination with data from the registered users’ database, concentrating in particular on users’ queries. Julia Luxemburger, a member of the institute's Department for Databases and Information Systems headed by Research Director Gerhard Weikum, used statistical methods to derive a user-interaction model from the logs. "This model sheds light on the users' behaviour when refining queries and clicking on results from different collections," says Gerhard Weikum. "The predictive power of such a model could eventually lead to a smarter and automated way of selecting the best-suited collections on behalf of the user. Currently, the logs are not yet sufficiently large to generate statistically significant conclusions and make reasonably accurate predictions. As The European Library's user base continues to grow and more logging data becomes available, we expect to gain better predictions of user interactions and therefore give better support for satisfying the users' information needs."
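To give a feel for how click data could inform the choice of collections, here is a deliberately naive sketch. It simply counts past clicks per query term; it is a stand-in for illustration and not the statistical model developed at the institute. The log format, collection names and function names are all hypothetical.

```python
from collections import Counter, defaultdict

# Invented sample log: (query text, collection whose result was clicked).
SAMPLE_LOG = [
    ("old maps", "Maps"),
    ("maps russia", "Maps"),
    ("maps", "Images"),
    ("russia history", "Books"),
]


def click_counts(log):
    """Count, for each query term, how often each collection was clicked."""
    counts = defaultdict(Counter)
    for query, clicked_collection in log:
        for term in query.lower().split():
            counts[term][clicked_collection] += 1
    return counts


def rank_collections(query, counts):
    """Rank collections by the number of past clicks for the query's terms."""
    totals = Counter()
    for term in query.lower().split():
        totals.update(counts.get(term, {}))
    return [collection for collection, _ in totals.most_common()]
```

With only a handful of log entries such counts are dominated by chance, which mirrors the point made in the interview: predictions only become reasonably accurate once much more logging data is available.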
The collaboration between the University of Padua and the Max Planck Institute will continue within the TELplus project: the University of Padua will lead TELplus Work Package 5 (User personalisation services – log file analysis and use of annotations), and the Max Planck Institute for Informatics will lead Tasks 5.2 "Log Analysis" and 5.4 "Personalised Search" of that work package.