Search results

1 – 10 of over 16000

Article

Publication date: 19 July 2013

Counting the uncountable: statistics for web archives

Clement Oury and Roswitha Poll

Downloads

1151

Abstract

Purpose

The purpose of this paper is to describe the aims and contents of the ISO Report ISO/TR 14873.

Design/methodology/approach

For more than a decade, libraries have been “collecting the web”. National libraries in particular select, collect and store publications and websites from their national domain, seeing this as a task similar to traditional legal deposit. Collection policies and collecting methods vary, making it difficult to compare the quantity and quality of the respective web archives.

Findings

In order to harmonize the evaluation of web archives, ISO TC 46 SC 8 has produced a Technical Report that standardizes the terminology and statistics and offers tested indicators for assessing the quality of web archiving.

Originality/value

This paper describes the soon-to-be-published ISO/TR 14873, a potentially vital guide for harmonizing web archive collection internationally.

Details

Performance Measurement and Metrics, vol. 14 no. 2

Type: Research Article

ISSN: 1467-8047

Article

Publication date: 2 October 2009

Web archiving in a Web 2.0 world

Edgar Crook

Downloads

4185

Abstract

Purpose

The purpose of this paper is to discuss the current state of web archiving in Australia, and how libraries are adapting their services in recognition of the expanding role that online material plays in their collections.

Design/methodology/approach

The National Library of Australia is the lead institution for digital archiving and preservation in Australia. Its PANDORA Archive has been the repository for archived web resources in Australia for over ten years and is a mature but continually developing system. The archival management system PANDAS, which underpins the Archive, is, as of 2007, in its third major revision. Other web archiving activities now also include annual Australian Domain Harvests and the use of Archive‐It, both of which are conducted in conjunction with the Internet Archive.

Findings

For many years it was considered that archiving could only ever completely capture a small, albeit representative, sample of the internet. Today the gap between what is available and what can be archived is decreasing. But as our archives and our archiving abilities increase, we are still confronted by new technologies and Web 2.0 applications.

Originality/value

Using the 2007 Federal Election as an example, in which a large number of interactive sites such as Kevin07, MySpace and YouTube were archived, the paper shows how Australian web archivers continue to adapt to and meet new challenges.

Details

The Electronic Library, vol. 27 no. 5

Type: Research Article

ISSN: 0264-0473

Article

Publication date: 20 January 2022

Database design of the Malaysia public figures web archive repository: a social and cultural heritage web collections

Farrah Diana Saiful Bahry, Noraizan Amran, Tesa Eranti Putri and Muhammad Idzwan Ramli

Downloads

790

Abstract

Purpose

The growth of emerging web technology and the demand for visual data from the World Wide Web (WWW) make information repositories vital. Proper database development will ensure that the repository manages web content effectively and aligns with web archive metadata standards. This paper aims to present the database design process for a web archive content repository, specifically one that maintains the social and cultural heritage value of Malaysian public figures as Mfigures.

Design/methodology/approach

The empirical process starts with a literature review and expert validation of the elements and scope of the research. A structured database design guideline, which is part of the database life cycle (DBLC), was then applied and combined with a step that compares and maps the conceptual model to the metadata standard relevant to web archive content. The paper focuses on the first three stages: the Database Initial Study; web archiving and metadata standard mapping; and conceptual design, which focuses on data modelling. The other two stages of database design, logical design and physical design, will be presented later.

Findings

The empirical process has produced an initial conceptual data model and a database structure that can serve as the basis of a web archiving repository. The data model has also been verified against metadata standards to ensure that the database structure implementation caters to the needs of web archiving repository features, especially web information discovery.

Research limitations/implications

Database design is the most effective way to develop good information architecture on the Net; nevertheless, some important fields were found to be absent from related tables, such as subject, language, coverage, rights, publisher and contributor. The MFigures database schema will be continuously improved for better scope and coverage of web archive content, to suit future information demands on the WWW.

Practical implications

The conceptual data model acts as a communication tool for the technical team in web application development. It can be revisited to suit a different database management system or the requirements of other information repositories with a similar scope.

Social implications

Mfigures was designed specifically for collecting Malaysian social and cultural heritage, which has rarely been done before, and it can benefit Malaysian society as a future reference for motivational role models and success stories.

Originality/value

The Mfigure conceptual data model was empirically designed and has gone through a proper validation process by industrial and academic experts.

Details

Collection and Curation, vol. 41 no. 4

Type: Research Article

ISSN: 2514-9326

Article

Publication date: 4 July 2016

Collecting and preserving the Ukraine conflict (2014-2015): a web archive at University of California, Berkeley

Liladhar R. Pendse

Downloads

1018

Abstract

Purpose

The purpose of this paper is to highlight web archiving as a tool for possible collection development in a research-level academic library. The paper highlights the web-archiving project that dealt with the contemporary Ukraine conflict. Currently, as the conflict in Ukraine drags on, the need for collecting and preserving information from various web-based resources with different ideological orientations acquires special importance. The demise of the Soviet Union in 1991 and the emergence of independent republics were heralded by some as a peaceful transition to “free-market” style economies. This transition was nevertheless nuanced and not seamless. Besides incomplete market liberalization and rent-seeking behaviors of different sorts, it was also accompanied by the almost ubiquitous use of and access to the internet and internet communication technologies. Now, 24 years later, the ongoing conflict in Ukraine also appears to be unfolding on the World Wide Web. With the Russian annexation of Crimea and its unification with the Russian Federation, the governmental and non-governmental websites of the Ukrainian Crimea suddenly came to represent a sort of “endangered archive”.

Design/methodology/approach

The main purpose of this project was to make the information contained in Ukrainian and Russian websites available to a wider body of scholars and students over a longer period of time in a web archive. The author does not take any ideological stance on the legal status of Crimea or on the ongoing conflict in Ukraine. There are currently several projects devoted to the preservation of these websites. This article also provides a survey of the landscape of these projects and highlights the ongoing web-archiving project entitled “the Ukraine Crisis: 2014-2015” at the UC Berkeley Library.

Findings

The UC Berkeley’s Ukraine Conflict Archive was made available to the public in March 2015, after enough materials had been archived. The initial purpose of the archive was to selectively harvest and archive those websites that were bound to either disappear or change significantly during the evolution of Crimea’s accession to Russia. However, in the aftermath of the Crimean conflict, the ensuing military conflict in Ukraine forced a reevaluation of the web-archiving strategy. The project was never envisioned to be a competing project to the Ukraine Conflict project. Instead, it was supposed to capture complementary data that could have been missed by other similar projects. This web archive has been made public to provide a glimpse of what was happening and what is happening in Ukraine.

Research limitations/implications

Now, 24 years later, the ongoing conflict in Ukraine also appears to be unfolding on the World Wide Web. With the Russian annexation of Crimea and its unification with the Russian Federation, the governmental and non-governmental websites of the Ukrainian Crimea suddenly came to represent a sort of “endangered archive”. The impetus for archiving the selected Ukrainian websites came from the changing geopolitical realities of Crimea. The daily changes to the websites and the loss of information contained within them are among the many problems faced by the users of these websites. In some cases, the likelihood of these websites disappearing is relatively high. This in turn was followed by the author’s desire to preserve information about daily life in Ukraine’s east in light of the unfolding violent armed conflict.

Originality/value

A close survey of currently published Library and Information Sciences articles on the Ukraine conflict found no articles dedicated to archiving the Crimean and Ukrainian situations.

Details

Collection Building, vol. 35 no. 3

Type: Research Article

ISSN: 0160-4953

Article

Publication date: 2 August 2019

A blended learning-based curriculum on Web archiving in the national Széchényi library

Márton Németh and László Drótos

Downloads

735

Abstract

Purpose

National Széchényi Library is introducing a new blended learning-based curriculum model on Web archiving for public collection professionals. The purpose of this paper is to describe this curriculum concept together with its international context.

Design/methodology/approach

A qualitative case study is offered. The curriculum concept applies the results of an international questionnaire of the International Internet Preservation Consortium. A detailed curriculum structure is presented together with a brief description of the major professional/methodological concepts. It is based on a constructive pedagogical approach. Based on the same general approach, some major methodological differences between the on-site and e-learning elements of curriculum design are also described.

Findings

There is a strong need to offer training in the Web archiving field to digital library professionals throughout Europe. A complex curriculum is needed that serves different target groups through various forms of course delivery. The course concept offers a solid base; however, the structure of the curriculum has to reflect the differing methodological requirements of on-site and e-learning environments. A main goal of the study is to describe the possibility of building up that kind of hybrid blended learning-based training structure. Training based on the described curriculum starts in April 2019. Sharing practical experience of training activities based on this course structure can initiate further discussion in the field of web archiving education in the future.

Research limitations/implications

This paper would like to initiate further discussion of methodological issues in developing education and training curricula on Web archiving in various European countries. These proposed discussions can be elaborated within the framework of the Training Working Group of the International Internet Preservation Consortium.

Practical implications

The main practical implications are to encourage other partner libraries, within the framework of the Training Working Group of the International Internet Preservation Consortium, to build up similar training programmes and to plan various collaborative activities in this field.

Social implications

The proposed curriculum aims to help librarians from both the research library and public library sectors acquire major skills and competences in the web archiving field. The course can be available to museum professionals and archivists […]. The main goal is to learn to build up small-scale web archiving projects in local, institutional environments in Hungary. It is necessary to preserve Web documents and other materials that reflect the life of the local society. The social impact of preserving local Web history can be overwhelming in the future.

Originality/value

Much untapped potential exists for librarians, archivists and museum professionals to plan and realize Web archiving projects in their own local institutional environments. This paper describes a new type of national model for helping them acquire the necessary skills and competences in this field. There is a significant gap in the description of education concepts in Web archiving.

Details

Digital Library Perspectives, vol. 35 no. 2

Type: Research Article

ISSN: 2059-5816

Article

Publication date: 1 September 2004

Archiving the Web: European experiences

Juha Hakala

Downloads

1453

Abstract

Preserving the published cultural heritage of a country is a major concern of any national library, and the challenge of archiving and preserving information published on the Web is great. A short history of Web archiving in Europe from the Swedish Kulturarw3 project to the Nordic Web Archive initiative is provided, together with a generic discussion on the technical challenges of and the solutions developed for Web harvesting and archiving. Experiences from Helsinki University Library in Finland in the use and co‐operative development of the NEDLIB (Networked European Deposit Library) harvester are given.

Details

Program, vol. 38 no. 3

Type: Research Article

ISSN: 0033-0337

Article

Publication date: 26 September 2008

Selective archiving of web resources: a study of processing costs

Mirna Willer, Tanja Buzina, Karolina Holub, Jasenka Zajec, Miroslav Milinović and Nebojša Topolščak

Downloads

2619

Abstract

Purpose

The purpose of this paper is to assess costs in the National and University Library of Croatia for processing Croatian web resources and the maintenance and development of the service, and to analyse the present organisation and workflow of their processing, and to propose improvements.

Design/methodology/approach

The assessment period was two months, during which the members of staff involved minutely monitored their tasks. The results were compared to the same exercise reported by the National Library of Australia and processing costs of cataloguing Croatian print publications.

Findings

The bottom‐up analysis of processing web resources shows that a balanced description of tasks and their distribution over staff members was established, and that the present workflow meets the requirements of efficient processing of web resources. As a general finding, approximately the same time was spent on archiving new items, as on the control and maintenance of the already archived ones due to the change of web resource properties, URL instability and the changes of technology. The comparative analysis showed: less time is spent on identification and selection and publishers' contacts on the part of the Croatian National Library compared to the Australian one; almost twice as much time was spent on gathering, quality assurance, and archiving instances in the Australian case than in the Croatian one; practically the same time was spent on cataloguing in both cases; and compared to cataloguing of print publications, significantly less time was spent on the print ones.

Originality/value

The paper is one of the two published articles on the in-depth analysis of the workflow and processing costs of managing and selectively archiving legal deposit copies of web resources in a national library. Its potential value lies in drawing the attention of library managers at institutions that deal with selective web archiving to the need to assess costs and services in view of the legal obligations of libraries for preserving national cultural web heritage and meeting present and future users' needs.

Details

Program, vol. 42 no. 4

Type: Research Article

ISSN: 0033-0337

Article

Publication date: 21 April 2022

Critical care for the early web: ethical digital methods for archived youth data

Katie Mackinnon

Downloads

369

Abstract

Purpose

This paper aims to provide a brief overview of the ethical challenges facing researchers engaging with web archival materials and demonstrates a framework and method for conducting research with historical web data created by young people.

Design/methodology/approach

This paper’s methodology is informed by the conceptual framing of data materials in research on the “right to be forgotten” (Crossen-White, 2015; GDPR, 2018; Tsesis, 2014), data afterlives (Agostinho, 2019; Stevenson and Gehl, 2019; Sutherland, 2017), indigenous data sovereignty and governance (Wemigwans, 2018) and feminist ethics of care (Cifor et al., 2019; Cowan, 2020; Franzke et al., 2020; Luka and Millette, 2018). It demonstrates a new method called an archive promenade, which builds on the walkthrough and scroll-back methods (Light et al., 2018; Robards and Lincoln, 2017).

Findings

The archive promenades demonstrate how individual attachments to digital traces vary and are often unpredictable, which necessitates further steps to ensure that privacy and data sovereignty are maintained through research with web archives.

Originality/value

This paper demonstrates how the archive promenade methodological intervention can lead to better practices of care with sensitive web materials and brings together previous work on ethical fabrications (Markham, 2012), speculation (Luka and Millette, 2018) and thick context (Marzullo et al., 2018), to yield new insights for research on the experiences of growing up online.

Details

Journal of Information, Communication and Ethics in Society, vol. 20 no. 3

Type: Research Article

ISSN: 1477-996X

Article

Publication date: 4 October 2021

Climate change and web archives: an Ibero-American study based on the Portuguese and Brazilian contexts

Moisés Rockembach and Anabela Serrano

Downloads

384

Abstract

Purpose

The purpose of this investigation is to analyze information on the web and its preservation as the digital heritage of events related to climate change and the environment in Portugal and Brazil, thus contributing to web preservation in the Ibero-American context.

Design/methodology/approach

A theoretical and applied investigation using mixed methods to collect and analyze qualitative and quantitative data from three sources: the Internet Archive and the public collection of Archive-It, the Portuguese web archive, and a selection from collections compiled by a research group (UFRGS) on web archiving and digital archiving in Brazil.

Findings

Web archive initiatives started in 1996; however, over the years collections have narrowed from nationally relevant themes to specialized thematic niches. The theme “climate change” has had an increasing impact on scientific and mainstream discussion in the 2000s, and by 2010 the over-arching theme became focused on digital preservation of web content, as demonstrated in this study. Failure to preserve data can lead to a rapid loss of climate change information, due to the inherent ephemerality of the web.

Originality/value

The paper demonstrates the relevance of preserving web content on climate change by showing what has been preserved to date and what will need to be preserved in the future.

Details

Records Management Journal, vol. 31 no. 3

Type: Research Article

ISSN: 0956-5698

Open Access

Article

Publication date: 18 April 2024

Online harvesting of municipality websites into trusted digital repository

Lungile Precious Luthuli and Mpho Ngoepe

Downloads

529

Abstract

Purpose

Municipalities, as the front lines of service delivery, use websites as one of the tools to communicate information to the public. While it is considered a record, many organisations, including municipalities, do not manage websites as such. This study aims to explore the archiving of websites as records in the municipalities of KwaZulu-Natal (KZN) Province in South Africa by using the web archiving life cycle model.

Design/methodology/approach

This study used mixed-methods research with an explanatory design, with quantitative data collected first through content analysis of websites and qualitative data collected through interviews. Researchers used multilevel sampling, first quantitatively analysing all available websites of the municipalities (52) in KZN, and then qualitatively selecting only records managers, information managers, web administrators, communication managers and website managers or designers from municipalities because of their understanding of and involvement with websites in some way.

Findings

This study established that some records on municipal websites are often in paper format in record-keeping systems, whereas others are born digital and are not captured in the systems. Municipalities lack a dedicated web online harvesting tool as well as an archiving policy or strategy to guide website archiving. Furthermore, municipalities placed a high reliance on service providers to keep their websites operational.

Research limitations/implications

It became clear during the interviews that most of the participants were unfamiliar with web archiving. As a result, only 12 of the 56 selected participants from the municipalities provided the required information in relation to the current study as others could not provide answers. Data for other participants were not analysed.

Originality/value

Due to a lack of infrastructure for ingesting digital records into archival custody, a framework for harvesting web content of value is proposed both internally in municipalities and externally to an archive repository.

Details

Collection and Curation, vol. 43 no. 3

Type: Research Article

ISSN: 2514-9326
