Search results
1 – 10 of over 10,000
A number of techniques have been studied for the automatic assignment of controlled subject headings and classifications from free indexing. These techniques involve the automatic…
Abstract
A number of techniques have been studied for the automatic assignment of controlled subject headings and classifications from free indexing. These techniques involve the automatic manipulation and truncation of the free‐index phrases assigned to a document and the use of a manually‐constructed thesaurus and automatically‐generated dictionaries together with statistical ranking and weighting methods. These are based on the use of a statistically‐generated ‘adhesion coefficient’ which reflects the degree of association between the free‐indexing terms, the controlled subject headings, and the classifications. By the analysis of a large sample of manually‐indexed documents the system generates dictionaries of free‐language and controlled‐language terms together with their associated classifications and adhesion coefficients. Having learnt from the manually‐indexed documents the system uses these dictionaries in the subsequent automatic classification procedure. The accuracy and cost‐effectiveness of the automatically‐assigned subject headings and classifications has been compared with that of the manual system. The results were encouraging and the costs comparable to those of a manual system.
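The abstract does not give the exact formula for the adhesion coefficient; as a rough illustration only, one can treat it as the proportion of documents carrying a free-indexing term that were manually assigned a given controlled heading. The training sample, terms and headings below are invented:

```python
from collections import defaultdict

# Hypothetical manually indexed sample: each document pairs
# free-indexing terms with controlled subject headings.
documents = [
    ({"laser", "optics"}, {"Physics"}),
    ({"laser", "surgery"}, {"Medicine"}),
    ({"optics", "lens"}, {"Physics"}),
]

# Dictionary generation: count term frequencies and term-heading
# co-occurrences across the manually indexed documents.
pair_count = defaultdict(int)
term_count = defaultdict(int)
for terms, headings in documents:
    for t in terms:
        term_count[t] += 1
        for h in headings:
            pair_count[(t, h)] += 1

def adhesion(term, heading):
    # Assumed association measure: share of documents containing
    # `term` that were assigned `heading`.
    return pair_count[(term, heading)] / term_count[term]

def rank_headings(free_terms):
    # Score each heading by summing adhesion coefficients over the
    # free-indexing terms of the new document, then rank.
    scores = defaultdict(float)
    for (term, heading) in list(pair_count):
        if term in free_terms:
            scores[heading] += adhesion(term, heading)
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(rank_headings({"laser", "optics"}))
```

In this sketch "Physics" outranks "Medicine" for a document free-indexed with "laser" and "optics", mirroring how the learnt dictionaries drive the subsequent automatic assignment.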
Jeong‐Hyen Kim and Kyung‐Ho Lee
This paper reports on the design of a knowledge base for an automatic classification in the library science field, by using the facet classification principles of colon…
Abstract
This paper reports on the design of a knowledge base for automatic classification in the library science field, using the facet classification principles of Colon Classification (CC). By designing and constructing a knowledge base capable of supporting automatic classification, and by inputting the titles or keywords of volumes into the computer, it aims to generate class numbers automatically through automatic subject recognition and the processing of title keywords with the facet combination method of CC. In particular, the knowledge base for classification was designed on the globe-and-cylinder principle, which makes automatic classification possible.
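As a loose sketch of the facet combination idea described above, the following toy example maps title keywords to facet notations and joins them with CC-style connecting symbols. The facet table, notations and separators are simplified, invented stand-ins, not actual CC schedules (CC main class 2, Library Science, is used as the base):

```python
# Hypothetical keyword-to-facet lookup (not a real CC schedule).
FACET_TABLE = {
    "cataloguing": ("Energy", "55"),
    "books": ("Matter", "12"),
    "libraries": ("Personality", "2"),
}

# Simplified connecting symbols for the facet sequence.
SEPARATOR = {"Personality": "", "Matter": ";", "Energy": ":"}
ORDER = ["Personality", "Matter", "Energy"]

def class_number(title_keywords, base="2"):
    # base "2" stands for Library Science among CC's main classes.
    facets = {}
    for kw in title_keywords:
        if kw in FACET_TABLE:
            facet, notation = FACET_TABLE[kw]
            facets[facet] = notation
    # Combine recognized facets in citation order.
    number = base
    for facet in ORDER:
        if facet in facets:
            number += SEPARATOR[facet] + facets[facet]
    return number

print(class_number(["cataloguing", "books"]))
```

The point is only the mechanism: subject recognition picks facets out of the title keywords, and the facet combination step concatenates their notations into a class number.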
Details
Keywords
This article reviews the state of the art in automatic indexing, that is, automatic techniques for analysing and characterising documents, for manipulating their descriptions in…
Abstract
This article reviews the state of the art in automatic indexing, that is, automatic techniques for analysing and characterising documents, for manipulating their descriptions in searching, and for generating the index language used for these purposes. It concentrates on the literature from 1968 to 1973. Section I defines the topic and its context. Sections II and III consider work in syntax and semantics respectively in detail. Section IV comments on ‘indirect’ indexing. Section V briefly surveys operating mechanized systems. In Section VI major experiments in automatic indexing are reviewed, and Section VII attempts an overall conclusion on the current state of automatic indexing techniques.
C.F. Cheung, W.B. Lee and Y. Wang
Unstructured knowledge management (UKM) has become indispensable for supporting knowledge work. However, unstructured knowledge is difficult to share, organize and acquire…
Abstract
Purpose
Unstructured knowledge management (UKM) has become indispensable for supporting knowledge work. However, unstructured knowledge is difficult to share, organize and acquire. This paper presents the development and implementation of a multi‐facet taxonomy system (MTS) for the effective management of unstructured knowledge.
Design/methodology/approach
Multi‐facet taxonomy is a multi‐dimensional taxonomy which allows the classification of knowledge assets under multiple concepts at any level of abstraction. The MTS is based on five components: a multi‐dimensional taxonomy structure, a thesaurus model, an automatic classification mechanism, intelligent searching, and self‐maintenance of the taxonomy. Artificial intelligence (AI) and natural language processing (NLP) technologies are used in the development of the MTS.
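The MTS's internals are not detailed in the abstract; the core multi-dimensional idea, filing one knowledge asset under several taxonomy dimensions at once, each at some level of a facet hierarchy, can be sketched as follows (the dimensions, paths and asset names are invented examples):

```python
# In-memory registry of knowledge assets and their facet paths.
assets = {}

def classify_asset(asset_id, facets):
    # `facets` maps a taxonomy dimension to a path down that
    # dimension's hierarchy, so one asset sits under several
    # concepts simultaneously.
    assets[asset_id] = facets

def find(dimension, node):
    # Return assets filed at or below `node` in `dimension`.
    return [a for a, f in assets.items()
            if dimension in f and node in f[dimension]]

classify_asset("report-17", {
    "topic":   ["engineering", "maintenance"],
    "process": ["quality", "audit"],
    "region":  ["asia"],
})
classify_asset("memo-03", {
    "topic":   ["engineering", "design"],
    "process": ["quality"],
})

print(find("topic", "engineering"))
print(find("process", "audit"))
```

A single-dimension taxonomy would force each asset into one branch; here the same report is retrievable through topic, process or region, which is the inadequacy of conventional taxonomies that the MTS addresses.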
Findings
With the successful development of the MTS, the accuracy of categorization of unstructured knowledge is significantly improved. It also allows an organization to capture the valuable tacit knowledge embedded in the unstructured knowledge assets. This helps an organization to explore business opportunities for continuous business improvement.
Practical implications
The implementation of the MTS system not only dramatically reduces the human effort, time and cost for UKM but also allows an organization to capture valuable knowledge embedded in unstructured knowledge assets.
Originality/value
As the knowledge work and task become more complex and are dynamically changing with time and involve multiple concepts, the MTS addresses the inadequacy of conventional single dimensional taxonomy for managing unstructured knowledge. The self‐maintenance capability of the MTS ensures that the taxonomy is up‐to‐date and new knowledge is classified automatically for better knowledge sharing and acquisition.
Details
Keywords
This study aims to statistically investigate the place of the eurozone countries in the framework of the international economy and particularly within the most advanced non…
Abstract
Purpose
This study aims, first, to statistically investigate the place of the eurozone countries in the framework of the international economy, and particularly relative to the most advanced non‐euro‐currency countries; second, it attempts to explain any discrepancies between the performance of the eurozone and that of the most advanced non‐eurozone countries by the weaknesses of some eurozone members. Discriminant analysis was chosen as the investigation tool because it is as unbiased a technique as possible. Every discriminant analysis, of course, requires classification criteria; the criteria adopted in this study lead to more or less the same conclusions.
Design/methodology/approach
Most econometric studies prefer popular techniques employing classical regression, while methods of multivariate statistics and non‐linear regression occupy a minor place as statistical tools. Numerous multivariate data call for multivariate techniques, which, at the cost of losing information and detail, allow for a better perception of the data structure. Therefore, a great part of the statistical analysis focuses on multivariate techniques and non‐linear regression.
Findings
The study concludes that, despite the present budget and debt crisis hitting some major and minor eurozone members, the “real” economy of the eurozone possesses a first‐class place in the world economy, in both relative and absolute terms. During the course of the study, effort was made to balance the tools of investigation against the fertility of the results, in particular to approach questions such as: what is the present condition of the eurozone? How solid are the predictions of an impending collapse of the euro currency and the eurozone? The study closes with an overview of the member countries' transition to the eurozone and their economic status at the end of 2011, which tries to soften fears for the eurozone's future.
Originality/value
This study tries to analyze the position of the eurozone countries from an arithmetic/objective perspective, ignoring as much as possible the (geo)political and national interests of the principal countries involved, in an effort to check the solidity of the fears. Not all parameters of the economy can enter the study; the author has chosen a few variables which, in their opinion, reflect the overall performance of an economy. Parameters relating to financial aspects have nowadays become largely autonomous and call for a separate inquiry. The study seeks to add to econometric studies carried out by national and international institutions and universities, mainly with respect to its statistical techniques and its treatment of the eurozone as a single entity versus the rest of the developed non‐eurozone world. Indeed, the study tries to defend the eurozone with objective data against a multitude of gloomy predictions, raised by several world partners, about the performance and the future of the eurozone.
Details
Keywords
Chiara Alzetta, Felice Dell'Orletta, Alessio Miaschi, Elena Prat and Giulia Venturi
The authors’ goal is to investigate variations in the writing style of book reviews published on different social reading platforms and referring to books of different genres…
Abstract
Purpose
The authors’ goal is to investigate variations in the writing style of book reviews published on different social reading platforms and referring to books of different genres, which enables acquiring insights into communication strategies adopted by readers to share their reading experiences.
Design/methodology/approach
The authors propose a corpus-based study focused on the analysis of A Good Review, a novel corpus of online book reviews written in Italian, posted on Amazon and Goodreads and covering six literary fiction genres. The authors rely on stylometric analysis to explore the linguistic properties and lexicon of the reviews, and conduct automatic classification experiments, using multiple approaches and feature configurations, to predict either the review's platform or the literary genre.
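The paper's actual models and features are not reproduced here; as a minimal stand-in for the platform-prediction experiment, a bag-of-words naive Bayes classifier over invented English review snippets (the corpus itself is Italian) might look like:

```python
from collections import Counter
import math

# Invented toy training reviews labelled by platform.
train = [
    ("loved this novel great characters", "goodreads"),
    ("fast shipping good price nice book", "amazon"),
    ("beautiful prose memorable characters", "goodreads"),
    ("arrived on time good price", "amazon"),
]

label_counts = Counter(lbl for _, lbl in train)
word_counts = {lbl: Counter() for lbl in label_counts}
for text, lbl in train:
    word_counts[lbl].update(text.split())
vocab = {w for text, _ in train for w in text.split()}

def predict(text):
    # Naive Bayes with add-one smoothing over the bag of words.
    best, best_lp = None, -math.inf
    for lbl in label_counts:
        lp = math.log(label_counts[lbl] / len(train))
        total = sum(word_counts[lbl].values()) + len(vocab)
        for w in text.split():
            lp += math.log((word_counts[lbl][w] + 1) / total)
        if lp > best_lp:
            best, best_lp = lbl, lp
    return best

print(predict("great characters and prose"))
```

The paper's experiments additionally use syntactic features, which this lexical-only sketch omits; it only illustrates the shape of a platform classifier.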
Findings
The analysis of user-generated reviews demonstrates that language varies considerably across reading platforms, but much less across book genres. The classification experiments revealed that features modelling the syntactic structure of the sentence are reliable proxies for discerning Amazon from Goodreads reviews, whereas lexical information plays a greater role in automatically discriminating the genre.
Originality/value
The high availability of cultural products makes information services necessary to help users navigate these resources and acquire information from unstructured data. This study contributes to a better understanding of the linguistic characteristics of user-generated book reviews, which can support the development of linguistically-informed recommendation services. Additionally, the authors release a novel corpus of online book reviews meant to support the reproducibility and advancements of the research.
Details
Keywords
Metin Sabuncu and Hakan Özdemir
This study aims to identify leather type and authenticity through optical coherence tomography.
Abstract
Purpose
This study aims to identify leather type and authenticity through optical coherence tomography.
Design/methodology/approach
Optical coherence tomography images taken from genuine and faux leather samples were used to create an image dataset, and automated machine learning algorithms were also used to distinguish leather types.
Findings
The optical coherence tomography scan produces a different image depending on the leather type. This information was used to determine the leather type correctly with optical coherence tomography and automated machine learning algorithms. Notably, the system also recognized whether the leather was genuine or synthetic. This demonstrates that optical coherence tomography and automated machine learning can be used to distinguish leather types and to determine whether leather is genuine.
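The paper's automated machine learning pipeline is not specified in the abstract; as a toy stand-in for the classification step, the sketch below applies a nearest-centroid rule to invented two-dimensional feature vectors imagined as extracted from OCT images (e.g. layer depth and scattering intensity, both hypothetical here):

```python
# Invented feature vectors for labelled training scans.
samples = {
    "genuine": [(0.82, 0.31), (0.78, 0.35), (0.85, 0.29)],
    "faux":    [(0.40, 0.70), (0.45, 0.66), (0.38, 0.74)],
}

def centroid(points):
    # Mean of the feature vectors for one class.
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(2))

centroids = {lbl: centroid(pts) for lbl, pts in samples.items()}

def classify(features):
    # Assign the label whose class centroid is nearest (Euclidean).
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lbl: dist2(features, centroids[lbl]))

print(classify((0.80, 0.33)))
```

Any real system would learn from the OCT image dataset directly; the sketch only illustrates how distinct imaging signatures per leather type make the genuine/faux decision separable.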
Originality/value
For the first time to the best of the authors' knowledge, spectral-domain optical coherence tomography and automated machine learning algorithms were applied to identify leather authenticity in a noncontact and non-invasive manner. Since this model runs online, it can readily be employed in automated quality monitoring systems in the leather industry. With recent technological progress, optical coherence tomography combined with automated machine learning algorithms will be used more frequently in automatic authentication and identification systems.
Details
Keywords
To provide an integrated perspective on similarities and differences between approaches to automated classification in different research communities (machine learning…
Abstract
Purpose
To provide an integrated perspective on similarities and differences between approaches to automated classification in different research communities (machine learning, information retrieval and library science), and to point to problems with the approaches and with automated classification as such.
Design/methodology/approach
A range of works dealing with automated classification of full‐text web documents are discussed. Explorations of individual approaches are given in the following sections: special features (description, differences, evaluation), application and characteristics of web pages.
Findings
Provides major similarities and differences between the three approaches: document pre‐processing and utilization of web‐specific document characteristics are common to all the approaches; major differences lie in the algorithms applied, and in whether the vector space model and controlled vocabularies are employed. Problems of automated classification are recognized.
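The vector space model mentioned above represents documents and categories as term-weight vectors compared by cosine similarity; a minimal sketch, with an assumed term order and invented weights, shows the mechanism:

```python
import math

def cosine(a, b):
    # Cosine of the angle between two term-weight vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Assumed term order: ["classification", "web", "library", "sport"].
doc = [0.9, 0.7, 0.2, 0.0]
category_profiles = {
    "information science": [0.8, 0.5, 0.9, 0.0],
    "sports":              [0.1, 0.3, 0.0, 0.9],
}

# Assign the document to the most similar category profile.
best = max(category_profiles, key=lambda c: cosine(doc, category_profiles[c]))
print(best)
```

Approaches that instead rely on controlled vocabularies map document terms onto an authority list rather than comparing weight vectors, which is one of the major differences noted above.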
Research limitations/implications
The paper does not attempt to provide an exhaustive bibliography of related resources.
Practical implications
As an integrated overview of approaches from different research communities with application examples, it is very useful for students in library and information science and computer science, as well as for practitioners. Researchers from one community gain information on how similar tasks are conducted in other communities.
Originality/value
To the author's knowledge, no review paper on automated text classification has attempted to discuss more than one community's approach from an integrated perspective.
Details
Keywords
Erik Bergström, Fredrik Karlsson and Rose-Mharie Åhlfeldt
The purpose of this paper is to develop a method for information classification. The proposed method draws on established standards, such as the ISO/IEC 27002 and information…
Abstract
Purpose
The purpose of this paper is to develop a method for information classification. The proposed method draws on established standards, such as the ISO/IEC 27002 and information classification practices. The long-term goal of the method is to decrease the subjective judgement in the implementation of information classification in organisations, which can lead to information security breaches because the information is under- or over-classified.
Design/methodology/approach
The results are based on a design science research approach, implemented as five iterations spanning the years 2013 to 2019.
Findings
The paper presents a method for information classification and the design principles underpinning the method. The empirical demonstration shows that senior and novice information security managers perceive the method as a useful tool for classifying information assets in an organisation.
Research limitations/implications
Existing research has provided only limited advice on how to approach information classification in organisations systematically. The method presented in this paper can act as a starting point for further research in this area, aiming at decreasing subjectivity in the information classification process. Additional research is needed to fully validate the proposed method for information classification and its potential to reduce subjective judgement.
Practical implications
The research contributes to practice by offering a method for information classification. It provides a hands-on tool for implementing an information classification process. In addition, this research shows that it is possible to devise a method to support information classification. This is important because, even if an organisation chooses not to adopt the proposed method, the very fact that this method has proved useful should encourage any similar endeavour.
Originality/value
The proposed method offers a detailed and well-elaborated tool for information classification. The method is generic and adaptable, depending on organisational needs.
Details
Keywords
Basma Abd El-Rahiem, Ahmed Sedik, Ghada M. El Banby, Hani M. Ibrahem, Mohamed Amin, Oh-Young Song, Ashraf A. M. Khalaf and Fathi E. Abd El-Samie
The objective of this paper is to perform infrared (IR) face recognition efficiently with convolutional neural networks (CNNs). The proposed model in this paper has several…
Abstract
Purpose
The objective of this paper is to perform infrared (IR) face recognition efficiently with convolutional neural networks (CNNs). The proposed model in this paper has several advantages such as the automatic feature extraction using convolutional and pooling layers and the ability to distinguish between faces without visual details.
Design/methodology/approach
A model which comprises five convolutional layers in addition to five max-pooling layers is introduced for the recognition of IR faces.
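The abstract specifies only the layer counts; assuming a 128x128 IR input, 3x3 convolutions with padding 1 and stride 1, and 2x2 max-pooling with stride 2 (all assumptions, not stated in the paper), the spatial dimensions through the five stages can be traced as:

```python
def conv_out(size, kernel=3, padding=1, stride=1):
    # Standard convolution output-size formula.
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    # Max-pooling output-size formula.
    return (size - kernel) // stride + 1

size = 128
for layer in range(1, 6):
    size = conv_out(size)   # convolution preserves size (padding 1)
    size = pool_out(size)   # pooling halves each dimension
    print(f"after conv{layer}+pool{layer}: {size}x{size}")
```

Under these assumptions the feature maps shrink 128 → 64 → 32 → 16 → 8 → 4 per side, showing how the pooling layers progressively condense the automatically extracted features before classification.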
Findings
The experimental results and analysis reveal high recognition rates of IR faces with the proposed model.
Originality/value
A designed CNN model is presented for IR face recognition. Both the feature extraction and classification tasks are incorporated into this model. The problems of low contrast and absence of details in IR images are overcome with the proposed model. The recognition accuracy reaches 100% in experiments on the Terravic Facial IR Database (TFIRDB).
Details