Search results
1 – 10 of over 49000
Le Vu Ho, Siu Cheung Hui and A.C.M. Fong
Abstract
The World Wide Web has become an important medium for disseminating scientific publications. To make their research works accessible to other researchers, most research institutions list their publications in an index page that sometimes includes links to online versions of the publications. As the index page is usually updated whenever new research papers are published, researchers need to check these index pages frequently in order to know of any new publications published in the targeted Web site or page. This manual publication monitoring process is tedious and time‐consuming. In this paper, a publication monitoring system, known as PubWatcher, is proposed to automatically track Web publications from user‐specified Web sites or pages. A publication extraction technique has been developed to extract publication information listed in the index pages of the monitored Web sites and pages.
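The monitoring idea described in this abstract can be illustrated with a minimal sketch. It assumes that publication entries have already been extracted from an index page as plain strings and simply compares two snapshots of that page. The function name `new_publications` and the sample titles are hypothetical, and this is not PubWatcher's actual implementation.

```python
def new_publications(previous, current):
    """Return entries in the current snapshot that were absent from the previous one.

    Entries are compared after whitespace and case normalisation. This is an
    assumption made for illustration; a real system would need more robust matching.
    """
    def normalise(entry):
        return " ".join(entry.split()).lower()

    seen = {normalise(e) for e in previous}
    return [e for e in current if normalise(e) not in seen]


# Hypothetical snapshots of one index page taken at two different times.
old_snapshot = ["A Survey of Web Mining", "Citation Indexing on the Web"]
new_snapshot = [
    "A Survey of  Web Mining",          # same entry, different spacing
    "Citation Indexing on the Web",
    "Tracking Web Publications",        # newly listed entry
]

added = new_publications(old_snapshot, new_snapshot)
print(added)
```

Comparing normalised strings keeps the sketch short; it would miss entries whose wording was edited between snapshots.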
A.C.M. Fong, S.C. Hui and H.L. Vu
Abstract
Research organisations and individual researchers increasingly choose to share their research findings by providing lists of their published works on the World Wide Web. To facilitate the exchange of ideas, the lists often include links to published papers in portable document format (PDF) or PostScript (PS) format. Generally, these publication Web sites are updated regularly to include new works. While manual monitoring of relevant Web sites is tedious, commercial search engines and information monitoring systems are ineffective in finding and tracking scholarly publications. Analyses the characteristics of publication index pages and describes effective automatic extraction techniques that the authors have developed. The authors’ techniques combine lexical and syntactic analyses with heuristics. The proposed techniques have been implemented and tested on more than 14,000 Web pages and achieved consistently high success rates of around 90 percent.
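One very simple heuristic in the spirit of this abstract is to collect the links on an index page that point to PDF or PostScript files. The sketch below uses only Python's standard `html.parser` module; the class name `PublicationLinkParser`, the suffix list, and the sample page are assumptions for illustration and do not reproduce the authors' lexical and syntactic analyses.

```python
from html.parser import HTMLParser


class PublicationLinkParser(HTMLParser):
    """Collect href values of anchors that point to PDF or PostScript files."""

    # Assumed set of file suffixes treated as publication links.
    SUFFIXES = (".pdf", ".ps", ".ps.gz")

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Match the suffix case-insensitively, but keep the original href.
        if href and href.lower().endswith(self.SUFFIXES):
            self.links.append(href)


# Hypothetical fragment of a publication index page.
sample_page = """
<ul>
  <li>Paper one <a href="papers/one.pdf">[PDF]</a></li>
  <li>Paper two <a href="papers/two.PS">[PS]</a></li>
  <li><a href="index.html">Home</a></li>
</ul>
"""

parser = PublicationLinkParser()
parser.feed(sample_page)
print(parser.links)
```

A suffix test alone says nothing about titles, authors, or venues, which is where the heuristics mentioned in the abstract would come in.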
Abstract
Many scientific publications are now available on the World Wide Web for researchers to share research findings. However, they tend to be poorly organised, making the search of relevant publications difficult and time‐consuming. Most existing search engines are ineffective in searching these publications, as they do not index Web publications that normally appear in PDF (portable document format) or PostScript formats. Proposes a Web citation‐based retrieval system, known as PubSearch, for the retrieval of Web publications. PubSearch indexes Web publications based on citation indices and stores them into a Web Citation Database. The Web Citation Database is then mined to support publication retrieval. Apart from supporting the traditional cited reference search, PubSearch also provides document clustering search and author clustering search. Document clustering groups related publications into clusters, while author clustering categorizes authors into different research areas based on author co‐citation analysis.
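Author co-citation analysis starts from counting how often two authors are cited together by the same paper. The sketch below shows only that counting step, under the assumption that each citing paper is represented by a list of cited author names. The function name `cocitation_counts` and the sample data are hypothetical, and the clustering that PubSearch performs on top of such counts is not shown.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count, for each pair of authors, how many citing papers cite both."""
    counts = Counter()
    for cited_authors in reference_lists:
        # De-duplicate within a paper and sort so each pair has one canonical key.
        for pair in combinations(sorted(set(cited_authors)), 2):
            counts[pair] += 1
    return counts


# Hypothetical data: each inner list holds the authors cited by one paper.
citing_papers = [
    ["A", "B", "C"],
    ["B", "C"],
    ["A", "B"],
]

counts = cocitation_counts(citing_papers)
print(counts[("A", "B")], counts[("B", "C")], counts[("A", "C")])
```

Pairs with higher counts are cited together more often, which is the signal a clustering step could use to group authors into research areas.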
Deborah R. Hollis and Margaret M. Jobe
Abstract
With the aid of seed money from a federal grant, librarians at the University of Colorado at Boulder (CU Boulder) developed an online statistical abstract called Colorado by the Numbers (CBN). The last print version of the Colorado Statistical Abstract was published in 1987. CBN provides updated socio‐economic data about the state and its counties on the Web. Librarians have gone beyond the acquisition and maintenance of traditional printed information sources to producing tailor made resources that meet the information needs of their local community. The CBN design and management model is discussed.
Nasrine Olson, Jan Michael Nolin and Gustaf Nelhans
Abstract
Purpose
The purpose of this paper is to investigate concepts that are used in depicting future visions of society, as afforded by technology, to map the extent of their use, examine the level of their dominance in different research areas and geographic boundaries, identify potential overlaps, analyse their longitudinal growth, and examine whether any of the identified concepts has assumed an overarching position.
Design/methodology/approach
In total, 14 concepts, each of which is used to depict visions of future information infrastructures, were identified. More than 20,000 scholarly documents related to 11 of these concepts (those with 20 or more documents) are analysed by various qualitative/quantitative methods.
Findings
The concepts most referred to are semantic web and ubiquitous computing (all years), and internet of things (Year 2013). Publications on some newer concepts (e.g. digital living, real world internet) are minimal. There are variations in the extent of use and preferred concepts based on geographic and disciplinary boundaries. The overlap in the use of these terms is minimal and none of these terms has assumed an overarching umbrella position.
Research limitations/implications
This study is limited to scholarly publications; it would be relevant to also study the pattern of usage in governmental communications and policy documents.
Social implications
By mapping multiplicity of concepts and the dispersion of discussions, the authors highlight the need for, and facilitate, a broader discussion of related social and societal implications.
Originality/value
This paper is the first to present a collective of these related concepts and map the pattern of their occurrence and growth.
Abstract
Purpose
The purpose of this paper is to discuss the current state of web archiving in Australia, and how libraries are adapting their services in recognition of the expanding role that online material plays in their collections.
Design/methodology/approach
The National Library of Australia is the lead institution for digital archiving and preservation in Australia. Its PANDORA Archive has been the repository for archived web resources in Australia for over ten years and is a mature but continually developing system. The archival management system PANDAS, which underpins the Archive, is, as of 2007, in its third major revision. Other web archiving activities now also include annual Australian Domain Harvests and the use of Archive‐It, both of which are conducted in conjunction with the Internet Archive.
Findings
For many years it was considered that archiving could only ever completely capture a small, albeit representative, sample of the internet. Today the gap between what is available and what can be archived is decreasing. But as our archives and our archiving abilities increase, we are still confronted by new technologies and Web 2.0 applications.
Originality/value
Using as an example the 2007 Federal Election, in which a large number of interactive sites such as Kevin07, MySpace and YouTube were archived, the paper shows how Australian web archivers continue to adapt to and meet new challenges.
Tomas C. Almind and Peter Ingwersen
Abstract
This article introduces the application of informetric methods to the World Wide Web (WWW), also called Webometrics. A case study presents a workable method for general informetric analyses of the WWW. In detail, the paper describes a number of specific informetric analysis parameters. As a case study the Danish proportion of the WWW is compared to those of other Nordic countries. The methodological approach is comparable with common bibliometric analyses of the ISI citation databases. Among other results the analyses demonstrate that Denmark would seem to fall seriously behind the other Nordic countries with respect to visibility on the Net and compared to its position in scientific databases.
Abstract
Purpose
More knowledge about open access (OA) scholarly publishing on the web would be helpful for citation data mining and the development of web‐based citation indexes. Hence, the main purpose of this study is to identify common characteristics of open access publishing, which may therefore enable us to measure different aspects of e‐research on the web.
Design/methodology/approach
In the current study, five characteristics of 545 OA citing sources targeting OA research articles in four science and four social science disciplines were manually identified, including file format, hyperlinking, internet domain, language and publication year.
Findings
About 60 per cent of the OA citing sources targeting research papers were in PDF format, 30 per cent were from academic domains ending in edu and ac and 70 per cent of the citations were not hyperlinked. Moreover, 16 per cent of the OA citing sources targeting studied papers in the eight selected disciplines were in non‐English languages. Additional analyses revealed significant disciplinary differences in some studied characteristics across science and the social sciences.
Originality/value
The OA web citation network was dominated by PDF format files and non‐hyperlinked citations. This knowledge of characteristics shaping the OA citation network gives a better understanding about their potential uses for open access scholarly research.
Adrian Cunningham and Margaret Phillips
Abstract
Purpose
To review the challenges associated with ensuring the capture and preservation of and long‐term access to government records and publications in the digital age and to describe how libraries and archives in Australia are responding to the challenge.
Design/methodology/approach
Literature‐ and case‐study‐based conceptual analysis of what makes government online information so vulnerable and initiatives at the National Library of Australia and the National Archives of Australia.
Findings
Democracy, governance, consultation and participation all depend on the availability of authentic and reliable information. Government agencies as well as educational and research institutions are producing increasingly large volumes of information in digital formats only. While Australia has done more than most countries to date to address the need to identify, collect, store and preserve government publications and public records in digital formats, large amounts of information are still at risk of loss.
Research limitations/implications
Focuses on circumstances and initiatives in the Australian Government.
Practical implications
Librarians and archivists need to become more proactive in influencing the behaviour of government agencies to ensure that important evidence of democratic governance is created and managed in ways that facilitate their accessibility and long‐term preservation.
Originality/value
Emphasises the vital role that information management agencies such as libraries and archives have to play in supporting transparent and accountable governance in the digital age, and explores innovative strategies for ensuring the long‐term preservation of this important documentary heritage material for the use of future generations.
Abstract
Purpose
The purpose of this paper is to gain knowledge about the status and characteristics of current web citations in articles published by Iranian researchers in the Science Citation Index (SCI). Besides investigating the growth in the presence of web resources in publications, the paper examines the accessibility and decay of web resources. Furthermore, the author examines the information provided at the URLs to determine whether the content cited by the authors matches the information found at the URLs.
Design/methodology/approach
The author used the survey research method. All documents by Iranian chemistry researchers recorded in the SCI database during 2006‐2009 were identified and then transferred to an Excel spreadsheet. After a one‐by‐one examination, 46,762 web citations were extracted from a total of 10,333 documents and were then analyzed, with the aid of two research assistants, over two months (November and December of 2010), as specified in the research objectives. The citations were categorized into nine groups based on the responses returned when the URLs were entered in the Internet Explorer browser.
Findings
The results showed that 46,762 citations (20 percent) of the total 187,823 available citations in the articles were web citations. The proportion of web citations increased from 9 percent in 2006 to 39 percent in 2009. The average number of web citations per article is 4.52. The most widely cited top level domains in URLs are .org and .edu with, respectively, 31 percent and 23 percent; compared to other domains, they show a greater tendency for stability. The highest percentage of inactive URLs was found to be associated with the .gov top level domain. Ultimately, 40,954 web citations were accessible, of which 79 percent allowed easy and long‐term access to the information the authors intended in the URLs. The decay rate for citations reveals an annual 5.2 percent increase. Long‐term inaccessibility of the authors' intended information was shown to result mostly from URLs that returned the 404 error and from URLs whose information had been updated. A half‐life of about eight years was estimated for Iran's chemistry publications, which is rather promising in comparison with other fields of study.
Originality/value
The paper offers a quantitative analysis of the state of web citations application among chemistry researchers in Iran and voices concerns related to web citations in the publications in this field. The results of this study may be useful for providers of web contents, authors and editors in the field of chemistry publications.