Search results

1 – 10 of 957
Article
Publication date: 24 June 2024

Yanxinwen Li, Ziming Xie, Buqing Cao and Hua Lou

Abstract

Purpose

With the introduction of graph structure learning into service classification, more accurate graph structures can significantly improve the precision of service classification. However, existing graph structure learning methods tend to rely on a single information source when attempting to eliminate noise in the original graph structure and lack consideration for the graph generation mechanism. To address this problem, this paper aims to propose a graph structure estimation neural network-based service classification (GSESC) model.

Design/methodology/approach

First, this method uses the local smoothing properties of graph convolutional networks (GCN) and combines them with the stochastic block model to serve as the graph generation mechanism. Next, it constructs a series of observation sets reflecting the intrinsic structure of the service from different perspectives to minimize biases introduced by a single information source. Subsequently, it integrates the observation model with the structural model to calculate the posterior distribution of the graph structure. Finally, it jointly optimizes GCN and the graph estimation process to obtain the optimal graph.
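The "local smoothing" property mentioned above can be illustrated with a single propagation step using the symmetric normalization commonly associated with GCNs. This is a minimal sketch and not the GSESC implementation: the learned weight matrix, the nonlinearity, the stochastic block model and the observation sets are all omitted, and the function names and toy graph are illustrative assumptions.

```python
import math

def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square 0/1 adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own signal.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def smooth(adj, features):
    """One propagation step: mix each node's features with its neighbours'."""
    norm = normalized_adjacency(adj)
    n, dim = len(features), len(features[0])
    return [[sum(norm[i][k] * features[k][c] for k in range(n))
             for c in range(dim)]
            for i in range(n)]

# Hypothetical toy graph: a path 0 - 1 - 2, with a signal only on node 0.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
features = [[1.0], [0.0], [0.0]]
smoothed = smooth(adj, features)
```

After one step the signal on node 0 has spread to its direct neighbour (node 1) but not to node 2, which is why the quality of the graph structure matters for the downstream classifier.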

Findings

The authors conducted a series of experiments on the API data set and compared the GSESC model with six baseline methods. The experimental results demonstrate the effectiveness of the GSESC model in service classification.

Originality/value

This paper argues that the data set used for service classification exhibits a strong community structure. In response to this, the paper innovatively applies a graph-based learning model that considers the underlying generation mechanism of the graph to the field of service classification and achieves good results.

Details

International Journal of Web Information Systems, vol. 20 no. 4
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 3 September 2024

Biplab Bhattacharjee, Kavya Unni and Maheshwar Pratap

Abstract

Purpose

Product returns are a major challenge for e-businesses as they involve huge logistical and operational costs. Therefore, it becomes crucial to predict returns in advance. This study aims to evaluate different types of classifiers for predicting the chance of product return and to further optimize the best-performing model.

Design/methodology/approach

An e-commerce data set with categorical attributes has been used for this study. Chi-square-based feature selection provides a reduced feature set, which is used as input for model building. Predictive models are built using individual classifiers, ensemble models and deep neural networks. For performance evaluation, a 75:25 train/test split and 10-fold cross-validation are used. To improve the predictive performance of the best-performing classifier, hyperparameter tuning is performed using different optimization methods such as random search, grid search, the Bayesian approach and evolutionary models (genetic algorithm, differential evolution and particle swarm optimization).
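As a rough illustration of chi-square scoring for a categorical attribute, the sketch below computes the Pearson chi-square statistic from a feature/label contingency table. The abstract does not say how the study implemented its selection step, so the function name and the toy data are assumptions made purely for illustration.

```python
from collections import Counter

def chi_square(feature_values, labels):
    """Pearson chi-square statistic between one categorical feature and the label."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for l, l_count in label_totals.items():
            # Expected cell count if feature and label were independent.
            expected = f_count * l_count / n
            stat += (observed[(f, l)] - expected) ** 2 / expected
    return stat

# Hypothetical toy data: 1 = returned, 0 = kept.
labels = [1, 1, 0, 0]
informative = ["a", "a", "b", "b"]    # lines up perfectly with the label
uninformative = ["a", "b", "a", "b"]  # independent of the label

score_informative = chi_square(informative, labels)
score_uninformative = chi_square(uninformative, labels)
```

Features with a higher statistic are more strongly associated with the return label, so a selection step would keep the top-scoring attributes.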

Findings

A comparison of F1-scores revealed that the Bayesian approach outperformed all other optimization approaches in terms of accuracy. The predictability of the Bayesian-optimized model is further compared with that of other classifiers using experimental analysis. The Bayesian-optimized XGBoost model possessed superior performance, with accuracies of 77.80% and 70.35% for holdout and 10-fold cross-validation methods, respectively.
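Because the findings mention both F1-scores and accuracy, it may help to see how the two metrics differ. The following is a minimal sketch with made-up labels, not data from the study; on an imbalanced sample, accuracy can look high while F1 exposes the missed positive class.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical labels: 1 = order returned, 0 = order kept.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
```

Here nine of ten predictions are correct, yet half of the actual returns are missed, so F1 is noticeably lower than accuracy.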

Research limitations/implications

Given the anonymized data, the effects of individual attributes on outcomes could not be investigated in detail. The Bayesian-optimized predictive model may be used in decision support systems, enabling real-time prediction of returns and the implementation of preventive measures.

Originality/value

There are very few reported studies on predicting the chance of order return in e-businesses. To the best of the authors’ knowledge, this study is the first to compare different optimization methods and classifiers, demonstrating the superiority of the Bayesian-optimized XGBoost classification model for returns prediction.

Details

Journal of Systems and Information Technology, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1328-7265

Open Access
Article
Publication date: 2 April 2024

Koraljka Golub, Osma Suominen, Ahmed Taiye Mohammed, Harriet Aagaard and Olof Osterman

Abstract

Purpose

In order to estimate the value of semi-automated subject indexing in operative library catalogues, the study aimed to investigate five different automated implementations of an open source software package on a large set of Swedish union catalogue metadata records, with Dewey Decimal Classification (DDC) as the target classification system. It also aimed to contribute to the body of research on aboutness and related challenges in automated subject indexing and evaluation.

Design/methodology/approach

On a sample of over 230,000 records with close to 12,000 distinct DDC classes, the open source tool Annif, developed by the National Library of Finland, was applied in the following implementations: lexical algorithm, support vector classifier, fastText, Omikuji Bonsai and an ensemble approach combining the former four. A qualitative study involving two senior catalogue librarians and three students of library and information studies was also conducted to investigate the value and inter-rater agreement of automatically assigned classes, on a sample of 60 records.

Findings

The best results were achieved using the ensemble approach that achieved 66.82% accuracy on the three-digit DDC classification task. The qualitative study confirmed earlier studies reporting low inter-rater agreement but also pointed to the potential value of automatically assigned classes as additional access points in information retrieval.
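One plausible reading of "accuracy on the three-digit DDC classification task" is that predicted and gold class numbers are compared after truncation to their first three digits. The sketch below follows that assumption; the abstract does not describe the evaluation procedure in detail, and the helper names and sample class numbers are hypothetical.

```python
def three_digit(ddc):
    """Truncate a DDC number such as '025.431' to its three-digit class '025'."""
    return ddc.split(".")[0][:3]

def three_digit_accuracy(gold, predicted):
    """Share of records whose predicted class matches the gold class at three digits."""
    hits = sum(three_digit(g) == three_digit(p) for g, p in zip(gold, predicted))
    return hits / len(gold)

# Hypothetical class numbers, not taken from the study's data.
gold = ["025.431", "839.7", "005.133"]
predicted = ["025.04", "839.738", "006.3"]

score = three_digit_accuracy(gold, predicted)
```

Under this reading, the first two predictions count as correct even though their full notations differ, while the third is counted as wrong.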

Originality/value

The paper presents an extensive study of automated classification in an operative library catalogue, accompanied by a qualitative study of automated classes. It demonstrates the value of applying semi-automated indexing in operative information retrieval systems.

Article
Publication date: 18 August 2023

Gaurav Sarin, Pradeep Kumar and M. Mukund

Abstract

Purpose

Text classification is a widely accepted and adopted technique in organizations to mine and analyze unstructured and semi-structured data. With the advancement of computing technology, deep learning has become more popular among academicians and professionals for mining and analytical operations. In this work, the authors study the research carried out in the field of text classification using deep learning techniques to identify gaps and opportunities for research.

Design/methodology/approach

The authors adopted a bibliometric approach in conjunction with visualization techniques to uncover new insights and findings. The authors collected two decades of data from the Scopus database to perform this study. The authors also discuss business applications of deep learning techniques for text classification.

Findings

The study provides an overview of various publication sources in the field of text classification and deep learning together. It also presents a list of prominent authors working in this field and their countries, as well as a list of the most cited articles by citation count and country of research. Various visualization techniques such as word cloud, network diagram and thematic map were used to identify the collaboration network.

Originality/value

The study helped to identify research gaps, which is an original contribution to the body of literature. To the best of the authors' knowledge, an in-depth study of text classification and deep learning together has not previously been performed. The study offers value to scholars and professionals by pointing them to research opportunities in this area.

Details

Benchmarking: An International Journal, vol. 31 no. 8
Type: Research Article
ISSN: 1463-5771

Article
Publication date: 23 May 2022

Nedra Ibrahim, Anja Habacha Chaibi and Henda Ben Ghézala

Abstract

Purpose

Given the magnitude of the literature, a researcher must be selective of research papers and publications in general. In other words, only papers that meet strict standards of academic integrity and adhere to reliable and credible sources should be referenced. The purpose of this paper is to approach this issue from the prism of scientometrics according to the following research questions: Is it necessary to judge the quality of scientific production? How do we evaluate scientific production? What are the tools to be used in evaluation?

Design/methodology/approach

This paper presents a comparative study of scientometric evaluation practices and tools. A systematic literature review is conducted based on articles published in the field of scientometrics between 1951 and 2022. To analyze the data, the authors performed three kinds of analysis: usage analysis, based on classifying and comparing the different scientific evaluation practices; type and level analysis, based on classifying scientometric indicators according to their types and application levels; and similarity analysis, based on studying the correlation between quantitative metrics to identify similarities between them.

Findings

This comparative study classifies scientific evaluation practices into externalist and internalist approaches. The authors categorized the different quantitative metrics according to their types (impact, production and composite indicators), their levels of application (micro, meso and macro) and their use (internalist and externalist). Moreover, the similarity analysis revealed a high correlation between several scientometric indicators such as author h-index, author publications, citations and journal citations.
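To make the similarity analysis more concrete, the sketch below computes an author h-index from citation counts and a Pearson correlation between two indicators. The h-index definition is standard; the choice of Pearson correlation, the function names and the toy numbers are assumptions, since the abstract does not state which correlation measure or data the authors used.

```python
import math

def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical citation counts per paper for one author.
author_h = h_index([10, 8, 5, 4, 3])

# Hypothetical indicator values for three authors: publications vs. citations.
publications = [1, 2, 3]
citations = [2, 4, 6]
similarity = pearson(publications, citations)
```

Two indicators with a correlation close to 1 carry largely the same information, which is the basis for the authors' point about avoiding the overuse of similar measures.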

Originality/value

The value of this study lies in identifying the strengths and weaknesses of research groups and guiding their actions. This evaluation contributes to the advancement of scientific research and to the motivation of researchers. Moreover, this paper can serve as an in-depth guide to help new researchers select appropriate measurements to evaluate scientific production. The selection of evaluation measures is made according to their types, usage and levels of application. Furthermore, the analysis shows the similarity between the different indicators, which can limit the overuse of similar measures.

Details

VINE Journal of Information and Knowledge Management Systems, vol. 54 no. 5
Type: Research Article
ISSN: 2059-5891

Article
Publication date: 13 August 2024

Samia Nawaz Yousafzai, Hooria Shahbaz, Armughan Ali, Amreen Qamar, Inzamam Mashood Nasir, Sara Tehsin and Robertas Damaševičius

Abstract

Purpose

The objective is to develop a more effective model that simplifies and accelerates the news classification process using advanced text mining and deep learning (DL) techniques. A distributed framework utilizing Bidirectional Encoder Representations from Transformers (BERT) was developed to classify news headlines. This approach leverages various text mining and DL techniques on a distributed infrastructure, aiming to offer an alternative to traditional news classification methods.

Design/methodology/approach

This study focuses on the classification of distinct types of news by analyzing tweets from various news channels. It addresses the limitations of using benchmark datasets for news classification, which often result in models that are impractical for real-world applications.

Findings

The framework’s effectiveness was evaluated on a newly proposed dataset and two additional benchmark datasets from the Kaggle repository, assessing the performance of each text mining and classification method across these datasets. The results of this study demonstrate that the proposed strategy significantly outperforms other approaches in terms of accuracy and execution time. This indicates that the distributed framework, coupled with the use of BERT for text analysis, provides a robust solution for analyzing large volumes of data efficiently. The findings also highlight the value of the newly released corpus for further research in news classification and emotion classification, suggesting its potential to facilitate advancements in these areas.

Originality/value

This research introduces an innovative distributed framework for news classification that addresses the shortcomings of models trained on benchmark datasets. By utilizing cutting-edge techniques and a novel dataset, the study offers significant improvements in accuracy and processing speed. The release of the corpus represents a valuable contribution to the field, enabling further exploration into news and emotion classification. This work sets a new standard for the analysis of news data, offering practical implications for the development of more effective and efficient news classification systems.

Details

International Journal of Intelligent Computing and Cybernetics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 30 July 2024

Md. Rifat Mahmud

Abstract

Purpose

This paper aims to explore the role of artificial intelligence (AI) in automating library cataloging and classification processes, exploring current applications, challenges and future possibilities. It aims to provide insights into how AI technologies are reshaping traditional library practices and their implications for the future of information organization and access.

Design/methodology/approach

The paper presents a comprehensive review, analyzing recent research and developments in AI applications for library cataloging and classification. It covers traditional methods, relevant AI technologies, implementation challenges, impacts on library workflows and future directions.

Findings

AI technologies, particularly machine learning and natural language processing, offer significant potential for enhancing efficiency, consistency and depth in metadata creation and classification. However, implementation challenges include data quality issues, integration with legacy systems and the need for new skill sets among library professionals. The impact on library workflows is profound, necessitating a reimagining of traditional librarian responsibilities. Future developments promise more advanced capabilities in personalized discovery, adaptive classification schemes and predictive collection development.

Originality/value

This paper provides a holistic overview of AI’s impact on library cataloging and classification, synthesizing current research and future trends. It highlights the delicate balance required in leveraging AI to enhance library services while upholding core library values. The paper emphasizes the need for ongoing critical engagement with these technologies to shape the future of library services in the AI era.

Details

Library Hi Tech News, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0741-9058

Article
Publication date: 26 July 2024

Callum McDonald, Allen Edward Foster and Pauline Rafferty

Abstract

Purpose

Genre is a valuable access point for popular music collections; however, the blurring of genre boundaries combined with changing listening habits and new forms of classification have brought genre’s importance into question. The playlist is now a common means of classification on music streaming platforms. Recent commentary suggests that context is now a preferred access point. This exploratory study offers an examination of genre’s role in playlists.

Design/methodology/approach

A mixed-methods study investigates, using Spotify, whether genre retains relevance amidst the rise in popularity of playlist-based music classification. Sample size is noted as a limitation of the study.

Findings

Qualitative coding of user and editorial playlist names revealed less than 20% of codes applied were genre-based. However, when non-genre themes were differentiated, genre themes ranked as one of the most prevalent. Context-based themes were most common, though genre was readily combined with other descriptive themes, highlighting its utility. Quantitative analysis of genre tags showed playlists with context-based themes demonstrated higher genre homogeneity than those using generic themes, indicating playlists were named on a genre-by-proxy basis.

Originality/value

The study suggests that genre continues to play an integral role in a field where an eclectic variety of descriptive themes has emerged, although its role may have changed. Context-based themes are central to the way users organise music, though such terms can often serve as containers for music collections sharing distinct generic and musicological similarities.

Details

Journal of Documentation, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 17 September 2024

Muddesar Iqbal, Sohail Sarwar, Muhammad Safyan and Moustafa Nasralla

Abstract

Purpose

The purpose of this study is to present a systematic and comprehensive review of personalized, adaptive and semantic e-learning systems.

Design/methodology/approach

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines have been used to gain thorough insight into the aspects of e-learning that complement e-learning pedagogies and processes. Aspects of e-learning systems such as personalization and adaptivity, e-learning and semantics, learner profiling and learner categorization, which are useful for intelligent content recommendations for learners, have been reviewed comprehensively.

Findings

The adoption of semantic Web based technologies would complement the learner’s performance in terms of learning outcomes.

Research limitations/implications

The evaluation of the proposed framework depends upon the yearly batch of learners and recording is a cumbersome/tedious process.

Social implications

E-learning systems may have a diverse and positive impact on society, including democratized learning and inclusivity regardless of socio-economic or geographic status.

Originality/value

A preliminary framework of an ontology-based e-learning system has been proposed at a modular level of granularity for implementation, along with evaluation metrics followed by a future roadmap.

Details

International Journal of Web Information Systems, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 3 September 2024

Siqi Liu and Junzhi Jia

Abstract

Purpose

This study explores diverse knowledge organization systems and metadata schemes in linked data, aiming to promote vocabulary usability and high-quality linked data creation within the LIS field.

Design/methodology/approach

We used content analysis to select 77 articles from 13 library and information science journals around our research theme. We identified four dimensions: vocabulary participation, reuse, functions, and naming variations in linked data.

Findings

The vocabulary comprises seven main categories and their corresponding 126 vocabularies, which participate in linked data in single, two, and multiple dimensions. These vocabularies are used in the eight LIS subfields. Reusing vocabularies has become integral to linked data publishing, with six categories and their corresponding 66 vocabularies being reused. Ontologies are the most engaged and widely reused category of vocabulary in linked data practice. The mutual support among the three major categories and seven subfunctions of vocabulary promotes the sustainable development of linked data. Under a combination of factors, terminology name changes and cross-usage between “vocabulary” and “ontology” have emerged.

Research limitations/implications

This study has limitations. Although 77 articles on the topic of vocabularies applied in linked data were analyzed and presented with quantitative statistics and visualizations, the exploration of the topic tends to be a practical activity, with limited presence in scholarly articles. Moreover, this study’s analysis of the practical applications of linked data is relatively limited, and the sample literature focused on articles published in English, which may have affected the diversity and inclusiveness of the research sample.

Practical implications

Practically, this study does not confine the application of content analysis solely to the traditional exploration of knowledge organization topics, development trends, or course content. Instead, it integrates the dual perspectives of linked data and vocabularies, employing content analysis to analyze and objectively reveal the application issues of vocabularies in linked data. The conclusions can provide specific guidelines for future applications of vocabularies in the LIS subfields and contribute to promoting interoperability of vocabularies.

Social implications

This research explores the relationship between linked data and vocabularies, highlighting the diverse manifestations and challenges of vocabularies in linked data. It provides theoretical references for the construction and further development of vocabularies considering technologies such as linked data, drawing attention to the potential and existing issues associated with linked open data vocabularies.

Originality/value

This study extends the application of content analysis to exploring vocabularies, especially knowledge organization systems and metadata schemes in LIS linked data, highlighting the mutually beneficial interactions between linked data and vocabularies. It provides guidance for future vocabulary applications in the LIS field and offers insights into vocabulary construction and the healthy development of linked data ecosystems in the era of information technology.

Details

Online Information Review, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1468-4527
