Polish Digital Libraries as a Philologists' Tool. Based on 666 Adjectives from the Digital Library of Wielkopolska

Jacek Tomaszczyk (University of Silesia, Katowice, Poland)

The Electronic Library

ISSN: 0264-0473

Article publication date: 4 October 2011



Tomaszczyk, J. (2011), "Polish Digital Libraries as a Philologists' Tool. Based on 666 Adjectives from the Digital Library of Wielkopolska", The Electronic Library, Vol. 29 No. 5, pp. 730-731. https://doi.org/10.1108/02640471111177198



Emerald Group Publishing Limited

Copyright © 2011, Emerald Group Publishing Limited

Is the digital library's function solely to facilitate access to literature? Providing access to digital texts may be the main purpose of digital libraries, but they also have great potential for linguistic analysis. The authors show that a digital collection, combined with language processing software, can be a powerful tool for philologists carrying out various kinds of linguistic research. The benefit of using electronically stored information is evident. Researchers have 24/7 access to tens of thousands of documents without cluttering their workplaces with paper or spending hours in libraries during fixed opening hours. It should also be pointed out that readers of printed collections might be refused access to some documents on preservation grounds, a restriction that digital copies avoid. Apart from accessibility, digital libraries offer users search facilities that go far beyond the traditional ways of working with printed materials.

The authors present a few examples of linguistic research using newspapers stored in Polish digital libraries. Newspapers and journals, which make up 30‐50% of all pages accessed in the digital libraries cited by the authors, are very valuable resources as they record varied language usage on a day‐to‐day basis. Because newspapers carry accurate publication dates, and currently most texts are processed by OCR software, it is easy and quick to establish the earliest known date when a word was printed. Complex research is enabled by the availability of attributes such as author's name, publication date, newspaper title (which also defines the territory), and text type (column, advertisement, etc.). These attributes allow researchers to observe different trends, such as changes in the frequency of use of a word or expression. Researchers can also expand search criteria, for example by adding the context in which lexical items occur. The question is how this research can be automated. Although automation is feasible, it requires dedicated software, which may be expensive and not cost‐effective if the scope of the study is narrow.
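The two simplest tasks mentioned above, finding the earliest printed occurrence of a word and tracking its frequency over time, can be sketched in a few lines of Python. This is only an illustration and not the software used in the book: the `Page` record, the function names and the sample texts are all hypothetical, and the sketch assumes that OCR text and publication dates have already been exported from a digital library. It matches exact word forms only, so for an inflected language such as Polish a real study would need lemmatisation first.

```python
import re
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Page:
    """One OCR-processed newspaper page (hypothetical record layout)."""
    published: date
    title: str
    text: str


TOKEN = re.compile(r"\w+")


def tokens(text: str) -> List[str]:
    """Split text into case-folded word tokens."""
    return [t.casefold() for t in TOKEN.findall(text)]


def earliest_attestation(pages: Iterable[Page], word: str) -> Optional[date]:
    """Return the earliest publication date on which `word` occurs, if any."""
    target = word.casefold()
    dates = [p.published for p in pages if target in tokens(p.text)]
    return min(dates) if dates else None


def yearly_frequency(pages: Iterable[Page], word: str) -> Dict[int, int]:
    """Count exact occurrences of `word` per publication year."""
    target = word.casefold()
    counts: Counter = Counter()
    for p in pages:
        counts[p.published.year] += tokens(p.text).count(target)
    return {year: n for year, n in sorted(counts.items()) if n}


# Invented sample pages, for illustration only.
PAGES = [
    Page(date(1905, 3, 14), "Gazeta A", "Nowy aparat telefoniczny w sklepie."),
    Page(date(1899, 11, 2), "Gazeta B", "Abonament telefoniczny i telefoniczny cennik."),
    Page(date(1899, 1, 5), "Gazeta A", "Wiadomości miejscowe."),
]

first_seen = earliest_attestation(PAGES, "telefoniczny")
per_year = yearly_frequency(PAGES, "telefoniczny")
```

Filtering `PAGES` by `title` before calling either function would restrict the search to one newspaper, and hence to one territory, in the way the attributes described above allow.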

What can be done to make digital libraries a more efficient tool for researchers? The authors present some proposals. They suggest cooperation between philologists, linguists and librarians so that researchers gain an influence on the materials collected by digital libraries. The authors are of the opinion that statutory regulations are needed to make digitisation a mandatory activity for the largest libraries. Endangered resources should be digitised as soon as possible; however, funds ought to be assigned not only for digitisation of the oldest documents, which relatively few readers use, but also for newer ones where copyright allows.

The main part of the book describes an exploration of the resources of the Digital Library of Wielkopolska. The result is a list of 666 adjectives which are not included in Słownik języka polskiego, the largest general Polish dictionary, which contains about 125,000 entries covering Polish vocabulary from the mid‐eighteenth to the mid‐twentieth century. The study shows that vocabulary chronologisation and theories on language evolution should be based not on the general dictionaries of a language, but on texts produced in the period being studied.
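At its core, a list of this kind is a comparison between words attested in a corpus and the headwords of a dictionary. The review does not describe how the authors built their list, so the following Python fragment is only a minimal sketch under stated assumptions: the function name is hypothetical, the corpus words are assumed to be already lemmatised and identified as adjectives, and the dictionary is assumed to be available as a plain list of headwords.

```python
from typing import Iterable, List


def missing_from_dictionary(corpus_lemmas: Iterable[str],
                            headwords: Iterable[str]) -> List[str]:
    """Return corpus lemmas that have no matching dictionary headword."""
    known = {h.casefold() for h in headwords}
    found = {lemma.casefold() for lemma in corpus_lemmas}
    return sorted(found - known)


# Invented inputs, for illustration only.
candidates = missing_from_dictionary(
    ["telefoniczny", "Rowerowy", "dobry"],
    ["dobry", "zły"],
)
```

Each candidate would still need manual checking, since OCR errors and spelling variants can also produce forms that are absent from a dictionary.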

The book shows the possibilities of using digital libraries and customised software to carry out research in different aspects of language use. It can be recommended to philologists, linguists, historians or library and information professionals who are interested in such issues as terminology development.
