Search results
1 – 10 of over 75,000 results
Abstract
Purpose
To provide an integrated perspective on the similarities and differences between approaches to automated classification in different research communities (machine learning, information retrieval and library science), and to point to problems with these approaches and with automated classification as such.
Design/methodology/approach
A range of works dealing with automated classification of full‐text web documents is discussed. Individual approaches are explored in the following sections: special features (description, differences, evaluation), applications and characteristics of web pages.
Findings
The paper identifies major similarities and differences between the three approaches: document pre‐processing and the use of web‐specific document characteristics are common to all of them, while the major differences lie in the algorithms applied and in whether the vector space model and controlled vocabularies are employed. Problems of automated classification as such are also recognized.
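The vector space model mentioned in the findings can be illustrated with a minimal sketch: documents become term-weight vectors, and similarity is the cosine of the angle between them. The toy documents and the raw term-frequency weighting are illustrative assumptions, not drawn from the reviewed works.

```python
import math
from collections import Counter

def vectorize(text):
    # Raw term frequencies as vector components (a deliberate simplification;
    # real systems typically use tf-idf weighting).
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(u[t] * v[t] for t in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

a = vectorize("web page classification")
b = vectorize("classification of web documents")
print(round(cosine(a, b), 3))  # shared terms "web" and "classification" → 0.577
```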
Research limitations/implications
The paper does not attempt to provide an exhaustive bibliography of related resources.
Practical implications
As an integrated overview of approaches from different research communities, with application examples, the paper is very useful for students in library and information science and computer science, as well as for practitioners. Researchers from one community gain insight into how similar tasks are conducted in other communities.
Originality/value
To the author's knowledge, no review paper on automated text classification has attempted to discuss more than one community's approach from an integrated perspective.
Lars Witell and Martin Löfgren
Abstract
Purpose
The purpose of this paper is to investigate whether the different approaches to the classification of quality attributes deliver consistent results.
Design/methodology/approach
The theory of attractive quality rests on a solid theoretical foundation and a methodological approach to classifying quality attributes. Recently, various authors have suggested alternative approaches to the traditional five‐level Kano questionnaire, including a three‐level Kano questionnaire, direct classification, and a dual‐importance grid. The investigation includes four such approaches and enables comparisons from both a methodological and an output perspective. The approaches are described, analyzed, and discussed in the context of an empirical study in which 430 respondents rated the performance of an e‐service.
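The five-level Kano questionnaire referred to above classifies an attribute by looking up each respondent's (functional, dysfunctional) answer pair in the standard Kano evaluation table and aggregating across respondents. A minimal sketch, with only an abbreviated table (a full table covers all 25 answer pairs):

```python
from collections import Counter

# Abbreviated Kano evaluation table: (functional answer, dysfunctional answer)
# -> category. A = attractive, M = must-be, O = one-dimensional, I = indifferent.
# Pairs missing from this illustrative subset default to I here.
EVAL_TABLE = {
    ("like", "dislike"): "O",
    ("like", "neutral"): "A",
    ("neutral", "dislike"): "M",
    ("neutral", "neutral"): "I",
}

def classify_attribute(answer_pairs):
    # Majority vote over the per-respondent categories.
    votes = Counter(EVAL_TABLE.get(pair, "I") for pair in answer_pairs)
    return votes.most_common(1)[0][0]

answers = [("like", "dislike"), ("like", "dislike"), ("neutral", "dislike")]
print(classify_attribute(answers))  # → O (one-dimensional wins the vote)
```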
Findings
The classification of quality attributes is found to depend on the approach that is used. The development of new ways to classify quality attributes should follow rigid procedures in order to provide reliable and consistent results.
Originality/value
This is the first attempt to compare alternative approaches to classifying quality attributes. For managers, the results provide guidance on which approach to choose, based on the strengths and weaknesses of the different approaches.
Fuzan Chen, Harris Wu, Runliang Dou and Minqiang Li
Abstract
Purpose
The purpose of this paper is to build a compact and accurate classifier for high-dimensional classification.
Design/methodology/approach
A classification approach based on class-dependent feature subspaces (CFS) is proposed. A CFS is a class-dependent integration of a support vector machine (SVM) classifier and its associated discriminative features. For each class, the authors' genetic algorithm (GA)-based approach evolves the best subset of discriminative features and the SVM classifier simultaneously. To guarantee convergence and efficiency, the authors customize the GA in terms of encoding strategy, fitness evaluation, and genetic operators.
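A rough, hypothetical sketch of the encoding idea: each chromosome is a binary feature mask, and selection, crossover, and mutation search for a compact, discriminative subset. The toy fitness function stands in for the SVM cross-validation accuracy the authors actually use; all names and parameters are assumptions.

```python
import random

random.seed(0)

N_FEATURES = 10
RELEVANT = {1, 3, 7}  # pretend only these features discriminate the class

def fitness(mask):
    # Toy stand-in for per-class SVM accuracy: reward masks that keep the
    # relevant features while staying compact.
    chosen = {i for i, bit in enumerate(mask) if bit}
    return len(chosen & RELEVANT) - 0.1 * len(chosen)

def evolve(pop_size=20, generations=40):
    pop = [[random.randint(0, 1) for _ in range(N_FEATURES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]            # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, N_FEATURES)  # one-point crossover
            child = a[:cut] + b[cut:]
            i = random.randrange(N_FEATURES)       # point mutation
            child[i] = 1 - child[i]
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
print(sorted(i for i, bit in enumerate(best) if bit))
```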
Findings
Experimental studies demonstrated that the proposed CFS-based approach is superior to other state-of-the-art classification algorithms on UCI data sets in terms of both concise interpretation and predictive power for high-dimensional data.
Research limitations/implications
UCI data sets rather than real industrial data are used to evaluate the proposed approach. In addition, only single-label classification is addressed in the study.
Practical implications
The proposed method not only constructs an accurate classification model but also obtains a compact combination of discriminative features. It helps business decision makers gain a concise understanding of high-dimensional data.
Originality/value
The authors propose a compact and effective classification approach for high-dimensional data. Instead of using the same feature subset for all classes, the proposed CFS-based approach obtains the optimal subset of discriminative features and an SVM classifier for each class. The approach enhances both interpretability and predictive power for high-dimensional data.
Erik Bergström, Fredrik Karlsson and Rose-Mharie Åhlfeldt
Abstract
Purpose
The purpose of this paper is to develop a method for information classification. The proposed method draws on established standards, such as the ISO/IEC 27002 and information classification practices. The long-term goal of the method is to decrease the subjective judgement in the implementation of information classification in organisations, which can lead to information security breaches because the information is under- or over-classified.
Design/methodology/approach
The results are based on a design science research approach, implemented as five iterations spanning the years 2013 to 2019.
Findings
The paper presents a method for information classification and the design principles underpinning the method. The empirical demonstration shows that senior and novice information security managers perceive the method as a useful tool for classifying information assets in an organisation.
Research limitations/implications
Existing research has provided only limited advice on how to approach information classification in organisations systematically. The method presented in this paper can act as a starting point for further research in this area, aiming at decreasing subjectivity in the information classification process. Additional research is needed to fully validate the proposed method and its potential to reduce subjective judgement.
Practical implications
The research contributes to practice by offering a method for information classification, providing a hands-on tool for implementing an information classification process. In addition, it shows that it is possible to devise such a method; this is important because, even if an organisation chooses not to adopt the proposed method, the fact that the method has proved useful should encourage similar endeavours.
Originality/value
The proposed method offers a detailed and well-elaborated tool for information classification. The method is generic and adaptable, depending on organisational needs.
Ismail Hmeidi, Mahmoud Al-Ayyoub, Nizar A. Mahyoub and Mohammed A. Shehab
Abstract
Purpose
Multi-label text classification (MTC) is one of the most recent research trends in the data mining and information retrieval domains, owing among other things to the rapid growth of online data and the increasing tendency of internet users to assign multiple labels/tags to describe documents, emails, posts, etc. The dimensionality of the label space makes MTC more difficult and challenging than traditional single-label text classification (TC). Because MTC is a natural extension of TC, several ways have been proposed to benefit from the rich TC literature through what are called problem transformation (PT) methods. Basically, PT methods transform multi-label data into single-label data suitable for traditional single-label classification algorithms. Another approach is to design novel classification algorithms customized for MTC. Over the past decade, several works have appeared on both approaches, focusing mainly on the English language. This work aims to present an elaborate study of the MTC of Arabic articles.
Design/methodology/approach
This paper presents a novel lexicon-based method for MTC, where the keywords that are most associated with each label are extracted from the training data along with a threshold that can later be used to determine whether each test document belongs to a certain label.
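The described lexicon-based method might be sketched as follows: for each label, keep the words most associated with that label in the training data, then tag a test document with every label whose lexicon overlap clears a threshold. The tiny corpus, the `top_k` frequency heuristic, and the threshold value are invented for illustration; the paper's keyword extraction is more refined.

```python
from collections import Counter

# Tiny labeled corpus; each document carries one or more labels.
train = [
    ("economy market trade growth", {"economy"}),
    ("football match goal team", {"sports"}),
    ("market team sponsorship deal", {"economy", "sports"}),
]

def build_lexicons(corpus, top_k=3):
    # For each label, keep the top_k most frequent words in its documents
    # (a crude association measure standing in for the paper's extraction).
    lexicons = {}
    for label in {l for _, labels in corpus for l in labels}:
        counts = Counter()
        for text, labels in corpus:
            if label in labels:
                counts.update(text.split())
        lexicons[label] = {w for w, _ in counts.most_common(top_k)}
    return lexicons

def classify(text, lexicons, threshold=0.2):
    # Assign every label whose lexicon overlap clears the threshold.
    words = set(text.split())
    return {label for label, lex in lexicons.items()
            if lex and len(words & lex) / len(lex) >= threshold}

lexicons = build_lexicons(train)
print(classify("goal team market", lexicons))  # both labels fire
```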
Findings
The experiments show that the presented approach outperforms the currently available approaches. Specifically, the best accuracy obtained from existing approaches is only 18 per cent, whereas the presented lexicon-based approach reaches an accuracy of 31 per cent.
Originality/value
Although some existing tools can be customized to address the MTC problem for Arabic text, their accuracy is very low when applied to Arabic articles. This paper presents a novel lexicon-based method for MTC that outperforms the currently available approaches in the experiments.
Vinod Kumar, Zillur Rahman and A. A. Kazmi
Abstract
Purpose
This paper aims to review the literature on stakeholder identification and classification related to sustainability marketing from 1998 to 2012 and provides a generalized approach to stakeholder identification and classification in the field of sustainability marketing.
Design/methodology/approach
Beginning with brief introductions to the key concepts, the research discusses landmark studies on the subject in detail. The review then identifies and selects relevant research papers from various online databases. Finally, 60 research papers are found suitable for the review and are examined to theoretically analyze the stakeholder identification and classification schemes used in the sustainability marketing literature.
Findings
This study identifies trends of growth in stakeholder identification and classification literature. In addition, there are two major findings. First, stakeholder identification can be done with the help of previous studies, with support from managers or via a combination of both. Second, future research can adopt generic stakeholder classification schemes or relative classification schemes based on dimensions of sustainability to classify stakeholders in relation to sustainability marketing. In relative stakeholder classification, regulatory stakeholders may be considered separately.
Research limitations/implications
While the literature review may be incomplete, as it uses only a title-based advanced search, researchers and practitioners can still benefit from this simplified approach to manage stakeholders.
Originality/value
The study introduces a generalized approach to stakeholder identification and classification related to sustainability marketing and provides a bibliography from 1998 to 2012 that can be used by academics and managers.
Michael John Khoo, Jae-wook Ahn, Ceri Binding, Hilary Jane Jones, Xia Lin, Diana Massam and Douglas Tudhope
Abstract
Purpose
The purpose of this paper is to describe a new approach to a well-known problem for digital libraries, how to search across multiple unrelated libraries with a single query.
Design/methodology/approach
The approach involves creating new Dewey Decimal Classification terms and numbers from existing Dublin Core records. In total, 263,550 records were harvested from three digital libraries. Weighted key terms were extracted from the title, description and subject fields of each record. Ranked DDC classes were automatically generated from these key terms by considering DDC hierarchies via a series of filtering and aggregation stages. A mean reciprocal ranking evaluation compared a sample of 49 generated classes against DDC classes created by a trained librarian for the same records.
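The hierarchy-aware aggregation step can be sketched in miniature: weighted key terms vote for DDC classes, and each vote also credits the class's broader ancestors (shorter DDC prefixes), approximating the paper's filtering-and-aggregation stages. The term-to-class lexicon below is invented for illustration.

```python
from collections import defaultdict

TERM_TO_DDC = {      # hypothetical term-to-class mapping
    "algebra": "512",
    "geometry": "516",
    "calculus": "515",
}

def rank_classes(weighted_terms):
    scores = defaultdict(float)
    for term, weight in weighted_terms.items():
        ddc = TERM_TO_DDC.get(term)
        if ddc is None:
            continue  # terms outside the lexicon cast no vote
        # Credit the full class and every broader ancestor (prefix),
        # so lower-level matches aggregate to broader parents.
        for depth in range(1, len(ddc) + 1):
            scores[ddc[:depth]] += weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranked = rank_classes({"algebra": 2.0, "geometry": 1.0, "poetry": 0.5})
print(ranked)  # broad class "51" outranks either specific class
```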
Findings
The best results combined weighted key terms from the title, description and subject fields. Performance declines with increased specificity of DDC level. The results compare favorably with similar studies.
Research limitations/implications
The metadata harvest required manual intervention and the evaluation was resource intensive. Future research will look at evaluation methodologies that take account of issues of consistency and ecological validity.
Practical implications
The method does not require training data and is easily scalable. The pipeline can be customized for individual use cases, for example to enhance recall or precision.
Social implications
The approach can provide centralized access to information from multiple domains currently provided by individual digital libraries.
Originality/value
The approach addresses metadata normalization in the context of web resources. The automatic classification approach accounts for matches within hierarchies, aggregating lower level matches to broader parents and thus approximates the practices of a human cataloger.
Peter J. Wild, Matt D. Giess and Chris A. McMahon
Abstract
Purpose
The purpose of this paper is to highlight the difficulty of applying faceted classification outside of library contexts and also to indicate that faceted approaches are poorly expressed to non‐experts.
Design/methodology/approach
The faceted approach is being applied outside of its “home” community, with mixed results. The paper's approach is based in part on an examination of a broad base of literature and in part on the results of, and reflections on, a case study applying faceted notions to “real world” engineering documentation.
Findings
The paper identifies a number of pragmatic and theoretical issues, namely: differing interpretations of the facet notion; confusion between facet analysis and faceted classification; lack of methodological guidance; the use of simplistic domains as exemplars; description versus analysis; the assumption that facet recognition is unproblematic; and whether the process is purely top‐down or bottom‐up.
Research limitations/implications
Facet analysis is not inherently associated with a particular epistemology; greater guidance about the derivation of facets is needed; and greater realism is needed when teaching faceted approaches.
Practical implications
Experiences of applying faceted classifications are presented that can be drawn upon to guide future work in the area.
Originality/value
No previous work has reflected on the actual empirical experience used to create a faceted description, especially with reference to engineering documents.
Chedia Dhaoui, Cynthia M. Webster and Lay Peng Tan
Abstract
Purpose
With the soaring volumes of brand-related social media conversations, digital marketers have extensive opportunities to track and analyse consumers’ feelings and opinions about brands, products or services embedded within consumer-generated content (CGC). These “Big Data” opportunities render manual approaches to sentiment analysis impractical and raise the need to develop automated tools to analyse consumer sentiment expressed in text format. This paper aims to evaluate and compare the performance of two prominent approaches to automated sentiment analysis applied to CGC on social media and explores the benefits of combining them.
Design/methodology/approach
A sample of 850 consumer comments from 83 Facebook brand pages is used to test and compare lexicon-based and machine learning approaches to sentiment analysis, as well as their combination, using the LIWC2015 lexicon and the RTextTools machine learning package.
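A minimal sketch of such a combination, assuming a tiny word-list lexicon and a stubbed stand-in for the trained classifier (not LIWC2015 or RTextTools): when the two methods disagree, the non-neutral call wins.

```python
# Illustrative word lists; real lexicons are far larger.
POSITIVE = {"great", "love", "excellent"}
NEGATIVE = {"bad", "hate", "awful"}

def lexicon_sentiment(text):
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def ml_sentiment(text):
    # Stand-in for a trained classifier's prediction.
    return "positive" if "recommend" in text.lower() else "neutral"

def combined_sentiment(text):
    lex, ml = lexicon_sentiment(text), ml_sentiment(text)
    if lex == ml:
        return lex
    # Disagreement: prefer whichever method made a non-neutral call.
    return lex if lex != "neutral" else ml

print(combined_sentiment("I love it and recommend it"))  # → positive
```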
Findings
Results show the two approaches are similar in accuracy, both achieving higher accuracy when classifying positive sentiment than negative sentiment. However, they differ substantially in their classification ensembles. The combined approach demonstrates significantly improved performance in classifying positive sentiment.
Research limitations/implications
Further research is required to improve the accuracy of negative sentiment classification. The combined approach needs to be applied to other kinds of CGCs on social media such as tweets.
Practical implications
The findings inform decision-making around which sentiment analysis approaches (or a combination thereof) is best to analyse CGC on social media.
Originality/value
This study combines two sentiment analysis approaches and demonstrates significantly improved performance.
S. P. Sarmah and U. C. Moharana
Abstract
Purpose
The purpose of this paper is to present a fuzzy-rule-based model that classifies spare parts inventories considering multiple criteria, for better management of maintenance activities and to avoid production-down situations.
Design/methodology/approach
A fuzzy-rule-based approach for multi-criteria decision making is used to classify the spare parts inventories. The total cost is computed for each group under suitable inventory policies and compared with that of other existing models.
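A hedged sketch of how a fuzzy rule base might map two criteria to the A/B/C classes: each rule fires with the minimum of its antecedent memberships, and the strongest rule wins. The membership shapes, rule base, and criteria are invented for illustration, not taken from the paper.

```python
def high(x):
    # Membership in "high" on a 0-1 scale, ramping up from 0.5 to 1.0.
    return max(0.0, min(1.0, (x - 0.5) / 0.5))

def low(x):
    return 1.0 - high(x)

def classify_part(criticality, cost):
    # Each rule fires with the min of its antecedent memberships.
    rule_a = min(high(criticality), high(cost))   # IF both high THEN class A
    rule_c = min(low(criticality), low(cost))     # IF both low  THEN class C
    rule_b = 1.0 - max(rule_a, rule_c)            # otherwise lean toward B
    winner = max((rule_a, "A"), (rule_b, "B"), (rule_c, "C"))
    return winner[1]

print(classify_part(0.9, 0.8))  # → A
```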
Findings
The fuzzy-rule-based multi-criteria classification model provides better results than aggregate scoring and traditional ABC classification. The model also offers inventory management experts the flexibility to provide their subjective inputs.
Practical implications
The web-based model developed in this paper can be implemented in industries such as manufacturing, chemical plants, and mining, which deal with large numbers of spares. The method classifies spares into three categories, A, B and C, considering multiple criteria and the relationships among those criteria. The framework is flexible enough for decision makers to add criteria and to modify the fuzzy rule base at any point of time, and the model can be easily integrated into any customized Enterprise Resource Planning application.
Originality/value
The value of this paper lies in applying a fuzzy-rule-based approach to the multi-criteria inventory classification of spare parts; such a rule-based approach considering multiple criteria is uncommon in spare parts classification. A total cost comparison shows that the proposed fuzzy-rule-based approach performs better than traditional ABC classification and yields almost the same cost as the aggregate scoring model. Hence, the method is valid and adds new value to spare parts classification for better management decisions.