Search results
1–10 of over 3,000 results

Marta Ortiz-de-Urbina-Criado, Alberto Abella and Diego García-Luna
Abstract
Purpose
This paper aims to highlight the importance of open data and the role that knowledge management and open innovation can play in its identification and use. Open data has great potential to create social and economic value, but its main problem is that it is often not easily reusable. The aim of this paper is to propose a unique identifier for open data-sets that would facilitate search and access to them and help to reduce heterogeneity in the publication of data in open data portals.
Design/methodology/approach
Considering a model of the impact process of open data reuse and based on the digital object identifier system, this paper develops a proposal of a unique identifier for open data-sets called Open Data-set Identifier (OpenDatId).
Findings
This paper presents some examples of the application and advantages of OpenDatId. For example, users can easily consult the available content catalogues, search the data in an automated way and examine the content for reuse. It is also possible to find out where this data comes from, solving the problems caused by the increasingly frequent federation of data in open data portals and enabling the creation of additional services based on open data.
Originality/value
From an integrated perspective of knowledge management and open innovation, this paper presents a new unique identifier for open data-sets (OpenDatId) and a new concept for data-set, the FAIR Open Data-sets.
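The abstract does not reproduce the OpenDatId syntax itself. As a purely illustrative sketch, a DOI-style data-set identifier might pair a registrant (portal) prefix with a portal-assigned suffix; the field names and the example identifier below are assumptions, not the paper's actual scheme:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OpenDatId:
    """Hypothetical DOI-style identifier for an open data set.

    The prefix names the registering portal; the suffix names the
    data set within that portal. Illustrative only: the actual
    OpenDatId syntax proposed in the paper may differ.
    """
    prefix: str   # e.g. a portal registrant code
    suffix: str   # e.g. a data-set code within that portal

    def __str__(self) -> str:
        return f"{self.prefix}/{self.suffix}"

    @classmethod
    def parse(cls, text: str) -> "OpenDatId":
        prefix, _, suffix = text.partition("/")
        if not prefix or not suffix:
            raise ValueError(f"not a valid identifier: {text!r}")
        return cls(prefix, suffix)

# Hypothetical example: a portal-level prefix plus a data-set suffix.
odid = OpenDatId.parse("portal.madrid/air-quality-2023")
assert str(odid) == "portal.madrid/air-quality-2023"
```

A stable, parseable identifier of this shape is what would let catalogues be consulted and searched in the automated way the Findings describe, and would let a federated portal record which source portal a data set came from.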
Abstract
The management of intellectual content in a digital environment (Internet) requires the existence of persistent, reliable unique identifiers for each distinguishable piece of content, and associated services activated by these identifiers to manage access and other rights. The digital object identifier (DOI) is a major initiative from the content industries which is now being implemented widely. The DOI is a unique identifier of any piece of intellectual content (in any form), together with a system for using that identifier to locate digital services (on the Internet) associated with that content. This paper describes as separate strands the approach of the technology and the content communities, and how these have been brought together in the DOI initial implementation (as a reliable location tool) and future implementations of other services. The DOI has strong support from many quarters, and is funded by a not‐for‐profit independent foundation.
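The DOI syntax is standardized: every DOI is a prefix beginning `10.` (identifying the registrant) followed by `/` and a registrant-assigned suffix, and the public doi.org proxy resolves a DOI over the web to the current location of the identified content. A minimal sketch of validating that shape and building the resolver URL:

```python
def doi_resolver_url(doi: str) -> str:
    """Build the public doi.org proxy URL that resolves a DOI to the
    current network location of the identified content."""
    prefix, _, suffix = doi.partition("/")
    if not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return f"https://doi.org/{doi}"

# A well-known example DOI.
assert doi_resolver_url("10.1000/182") == "https://doi.org/10.1000/182"
```

Because the identifier is opaque and the resolution step is indirect, the content can move to a new location and only the resolver's record needs updating; this indirection is what makes the DOI a reliable location tool.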
John A. Pandiani, Steven M. Banks and Monica M. Simon
Abstract
The relationship between employment services and employment outcomes has been the subject of research for a number of years (Bond et al., 2001; Drake et al., 1996). More recently, the competitive employment of service recipients has become an important indicator of community mental health program and service system performance. The National Association of State Mental Health Program Directors’ President’s Task Force on Performance Measures, for instance, recognized the importance of monitoring employment rates for adults with serious mental illness: “For payers, this is the payoff…Monitoring this outcome for populations with mental illness…is critical. This was considered a critical outcome to track.” For similar reasons, the new federal Performance Partnership (Block) Grant program (Federal Register, 2002) requires annual reporting by all states of employment rates for recipients of publicly funded mental health services.
Michael Day, Rachel Heery and Andy Powell
Abstract
This paper reviews BIBLINK, an EC funded project that is attempting to create links between national bibliographic agencies and the publishers of electronic resources. The project focuses on the flow of information, primarily in the form of metadata, between publishers and national libraries. The paper argues that in the digital information environment, the role of national bibliographic agencies will become increasingly dependent upon the generation of electronic links between publishers and other agents in the bibliographic chain. Related work carried out by the Library of Congress with regard to its Electronic CIP Program is described. The core of the paper outlines studies produced by the BIBLINK project as background to the production of a demonstrator that will attempt to establish some of these links. This research includes studies of metadata formats in use and an investigation of the potential for format conversion, including an outline of the BIBLINK Core metadata elements and comments on their potential conversion into UNIMARC. BIBLINK studies on digital identifiers and authentication are also outlined.
Abstract
Purpose
The purpose of this paper is to improve privacy in healthcare datasets that hold sensitive information. Preventing privacy disclosure and providing relevant information to legitimate users are inherently conflicting goals. At the same time, the swift evolution of big data has brought considerable convenience to everyday life, with propagation and information sharing as its two main facets. Despite several research works on these aspects, as data grows incrementally, the likelihood of privacy leakage expands substantially alongside the benefits big data affords. Safeguarding data privacy in such a complicated environment has therefore become a major challenge.
Design/methodology/approach
In this study, a method called deep restricted additive homomorphic ElGamal privacy preservation (DR-AHEPP) to preserve the privacy of data even in case of incremental data is proposed. An entropy-based differential privacy quasi identification and DR-AHEPP algorithms are designed, respectively, for obtaining privacy-preserved minimum falsified quasi-identifier set and computationally efficient privacy-preserved data.
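The abstract does not detail the DR-AHEPP construction itself. What can be illustrated is the homomorphic property of textbook ElGamal that such schemes build on: multiplying two ciphertexts yields a ciphertext of the product of the plaintexts. A minimal, insecure-by-design demo (tiny parameters, no padding; the paper's actual additive, deep-learning-augmented variant is not reproduced here):

```python
import random

# Textbook ElGamal over Z_p*, shown only to illustrate the
# multiplicative homomorphism. Demo-sized prime; NOT secure.
p = 0xFFFFFFFFFFFFFFC5   # the prime 2**64 - 59
g = 5

def keygen():
    x = random.randrange(2, p - 1)        # private key
    return x, pow(g, x, p)                # (private x, public h = g^x)

def encrypt(h, m):
    k = random.randrange(2, p - 1)        # fresh ephemeral randomness
    return pow(g, k, p), (m * pow(h, k, p)) % p

def decrypt(x, ct):
    c1, c2 = ct
    return (c2 * pow(pow(c1, x, p), -1, p)) % p

x, h = keygen()
a, b = encrypt(h, 6), encrypt(h, 7)
# Component-wise product of ciphertexts decrypts to the product 6 * 7.
prod = ((a[0] * b[0]) % p, (a[1] * b[1]) % p)
assert decrypt(x, prod) == 42
```

An additive variant of ElGamal encrypts `g**m` instead of `m`, so that multiplying ciphertexts adds the exponents; that is the kind of additive homomorphism the method's name refers to.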
Findings
Analysis results using the Diabetes 130-US hospitals dataset illustrate that the proposed DR-AHEPP method preserves privacy on incremental data more effectively than existing methods. A comparative analysis against state-of-the-art works is carried out with the objective of minimizing information loss, false positive rate and execution time while achieving higher accuracy.
Originality/value
The paper demonstrates better performance on the Diabetes 130-US hospitals dataset, achieving high accuracy with low information loss and false positive rate. The results show that the proposed method increases accuracy by 4% and reduces the false positive rate and information loss by 25% and 35%, respectively, compared to state-of-the-art works.
José L. Navarro‐Galindo and José Samos
Abstract
Purpose
Nowadays, the use of WCMS (web content management systems) is widespread. The conversion of this infrastructure into its semantic equivalent (semantic WCMS) is a critical issue, as this enables the benefits of the semantic web to be extended. The purpose of this paper is to present FLERSA (Flexible Range Semantic Annotation), a tool for flexible range semantic annotation.
Design/methodology/approach
FLERSA is presented as a user‐centred annotation tool for Web content expressed in natural language. The tool has been built in order to illustrate how a WCMS called Joomla! can be converted into its semantic equivalent.
Findings
The development of the tool shows that it is possible to build a semantic WCMS through a combination of semantic components and other resources such as ontologies and emerging technologies, including XML, RDF, RDFa and OWL.
Practical implications
The paper provides a starting‐point for further research in which the principles and techniques of the FLERSA tool can be applied to any WCMS.
Originality/value
The tool allows both manual and automatic semantic annotations, as well as providing enhanced search capabilities. For manual annotation, a new flexible range markup technique is used, based on the RDFa standard, to support the evolution of annotated Web documents more effectively than XPointer. For automatic annotation, a hybrid approach based on machine learning techniques (Vector‐Space Model + n‐grams) is used to determine the concepts that the content of a Web document deals with (from an ontology which provides a taxonomy), based on previous annotations that are used as a training corpus.
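The automatic-annotation approach the abstract describes can be sketched in miniature: represent texts as word n-gram count vectors and assign a new document the concept of its most similar annotated example. This is a hedged illustration of the vector-space + n-gram idea only; FLERSA's actual features, term weighting and training pipeline are not detailed in the abstract:

```python
from collections import Counter
from math import sqrt

def ngrams(text: str, n: int = 2) -> Counter:
    """Word n-gram counts of a text (here bigrams by default)."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(doc: str, training: dict) -> str:
    """Assign the concept whose annotated example text is most
    similar to the document in the n-gram vector space."""
    vec = ngrams(doc)
    return max(training, key=lambda concept: cosine(vec, ngrams(training[concept])))

training = {"sports": "the team won the match",
            "cooking": "boil the pasta in salted water"}
assert classify("the team lost the match", training) == "sports"
```

In the tool described, the training corpus would be previous manual annotations and the candidate concepts would come from the ontology's taxonomy rather than a hand-written dictionary as here.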
Joachim Schopfel, Stéphane Chaudiron, Bernard Jacquemin, Hélène Prost, Marta Severo and Florence Thiault
Abstract
Purpose
Print theses and dissertations have regularly been submitted together with complementary material, such as maps, tables, speech samples, photos or videos, in various formats and on different supports. In the digital environment of open repositories and open data, this complementary material could become a rich source of research results and data sets for reuse and other exploitation. The paper aims to discuss these issues.
Design/methodology/approach
After introducing electronic theses and dissertations (ETD) into the context of eScience, the paper investigates some aspects that impact the availability and openness of data sets and other supplemental files related to ETD (system architecture, metadata and data retrieval, legal aspects).
Findings
These items are part of the so-called “small data” of eScience, with a wide range of contents and formats. Their heterogeneity and their link to ETD need specific approaches to data curation and management, with specific metadata and identifiers and with specific services, workflows and systems. One size may not fit all, but it seems appropriate to separate text and data files. Regarding copyright and licensing, data sets must be evaluated carefully but should not be processed and disseminated under the same conditions as the related PhD theses. Some examples are presented.
Research limitations/implications
The paper concludes with recommendations for further investigation and development to foster open access to research results produced along with PhD theses.
Originality/value
ETDs are an important part of the content of open repositories. Yet, their potential as a gateway to underlying research results has not really been explored so far.
D. Diane Beale and Michael F. Lynch
Abstract
Ayers’ recent suggestions for a Universal Standard Book Number, logically generated from a catalogue entry, and therefore applicable retrospectively to bibliographic files, have been implemented and tested on two one‐year cumulations of BNB MARC files. The proportion of unique entries provided by the USBN was found to be about 91%. Revisions to the coding tables were made on the basis of a detailed analysis of the results and of determinations of the frequencies of characters in the data elements used. These resulted in improvements to the method, giving an increase in the proportion of unique entries to approximately 96%.
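Ayers' actual coding tables are not reproduced in the abstract, but the general idea admits a crude sketch: derive a fixed-length key from the catalogue entry's data elements, then measure what proportion of a bibliographic file the key identifies uniquely. The key-construction rule below is a hypothetical stand-in, not Ayers' method:

```python
from collections import Counter

def usbn_key(title: str, author: str, year: str) -> str:
    """Crude stand-in for a coding-table key: selected characters
    from the main data elements, padded to a fixed length.
    (Hypothetical rule; Ayers' coding tables are not given here.)"""
    pick = lambda s, n: "".join(c for c in s.upper() if c.isalnum())[:n].ljust(n, "0")
    return pick(title, 6) + pick(author, 4) + pick(year, 4)

def unique_proportion(records) -> float:
    """Fraction of records whose generated key occurs exactly once,
    i.e. the uniqueness measure reported in the study."""
    counts = Counter(usbn_key(*r) for r in records)
    return sum(1 for r in records if counts[usbn_key(*r)] == 1) / len(records)

records = [
    ("A Tale of Two Cities", "Dickens", "1859"),
    ("Bleak House", "Dickens", "1853"),
    ("A Tale of Two Cities", "Dickens", "1859"),  # duplicate entry
]
assert unique_proportion(records) == 1 / 3
```

Tuning which characters the tables select, as the study did after analysing character frequencies in the data elements, is what raised the uniqueness proportion from about 91% to about 96%.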
Carly C. Dearborn, Amy J. Barton and Neal A. Harmeyer
Abstract
Purpose
The purpose of this case study is to discuss the creation of robust preservation functionality within PURR (the Purdue University Research Repository). The study seeks to discuss the customization of the HUBzero platform, composition of digital preservation policies, and the creation of a novel, machine-actionable metadata model for PURR's unique digital content. Additionally, the study will trace the implementation of the Open Archival Information System (OAIS) model and track PURR's progress towards Trustworthy Digital Repository certification.
Design/methodology/approach
This case study discusses the use of the Center for Research Libraries Trusted Repository Audit Checklist (TRAC) certification process and ISO 16363 as a rubric to build an OAIS institutional repository for the publication, preservation, and description of unique datasets.
Findings
ISO 16363 continues to serve as a rubric, barometer and set of goals for PURR as development continues. To become a trustworthy repository, the PURR project team has consistently worked to build a robust, secure, and long-term home for collaborative research. In order to fulfill its mandate, the project team constructed policies, strategies, and activities designed to guide a systematic digital preservation environment. PURR expects to undertake the full ISO 16363 audit process at a future date in expectation of being certified as a Trustworthy Digital Repository. Through its efforts in digital preservation, the Purdue University Research Repository expects to better serve Purdue researchers and their collaborators, and to move scholarly research efforts forward worldwide.
Originality/value
PURR is a customized instance of HUBzero®, an open source software platform that supports scientific discovery, learning, and collaboration. HUBzero was a research project funded by the United States National Science Foundation (NSF) and is a product of the Network for Computational Nanotechnology (NCN), a multi-university initiative of eight member institutions. PURR is only one instance of HUBzero customization; versions have been implemented in many disciplines nationwide. PURR maintains the core functionality of HUBzero, but has been modified to publish datasets and to support their preservation. Long-term access to published data is an essential component of PURR services and Purdue University Libraries' mission. Preservation in PURR is not only vital to the Purdue University research community, but to the larger digital preservation issues surrounding dynamic datasets and their long-term usability.