Search results
1 – 10 of 78
Md. Nurul Islam, Guangwei Hu, Murtaza Ashiq and Shakil Ahmad
Abstract
Purpose
This bibliometric study aims to analyze the latest trends and patterns of big data applications in librarianship from 2000 to 2022. By conducting a comprehensive examination of the existing literature, it seeks to provide valuable insights into the emerging field of big data in librarianship and its potential impact on the future of libraries.
Design/methodology/approach
This study employed a rigorous four-stage process of identification, screening, eligibility and inclusion to filter and select the most relevant documents for analysis. The Scopus database was utilized to retrieve pertinent data related to big data applications in librarianship. The dataset comprised 430 documents, including journal articles, conference papers, book chapters, reviews and books. Through bibliometric analysis, the study examined the effectiveness of different publication types and identified the main topics and themes within the field.
Findings
The study found that the field of big data in librarianship is growing rapidly, with a significant increase in publications and citations over the past few years. China is the leading country in terms of publication output, followed by the United States of America. The most influential journals in the field are Library Hi Tech and the ACM International Conference Proceeding Series. The top authors in the field are Minami T, Wu J, Fox EA and Giles CL. The most common keywords in the literature are big data, librarianship, data mining, information retrieval, machine learning and webometrics.
Originality/value
This bibliometric study contributes to the existing body of literature by comprehensively analyzing the latest trends and patterns in big data applications within librarianship. It offers a systematic approach to understanding the state of the field and highlights the unique contributions made by various types of publications. The study’s findings and insights contribute to the originality of this research, providing a foundation for further exploration and advancement in the field of big data in librarianship.
Evagelos Varthis and Marios Poulos
Abstract
Purpose
This study aims to present metaGraphos, a crowdsourcing system that aids in the transcription and semantic enhancement of scanned documents by using a pool of volunteers or people willing to participate in exchange for a financial reward.
Design/methodology/approach
metaGraphos can be used in circumstances where optical character recognition fails to produce satisfactory results, where semantic tagging or the assignment of thematic headings to texts is considered necessary, or where ground-truth data has to be collected in raw form.
Findings
The system automatically provides a Web-based interface comprising a static HTML page and JavaScript code that displays the scanned images of the document, coupled with the corresponding incomplete texts side by side, allowing users to correct or complete the texts in parallel.
Social implications
By assisting the parallel transcription and the semantic enhancement of difficult scanned documents, the system further reveals the hidden cultural wealth and aids in knowledge dissemination, a fact that contributes significantly to the academic-scientific dialog and feedback.
Originality/value
Individual researchers, libraries and organizations in general may benefit from the system because it is a cost-effective, practical and simple-to-set-up client–server architecture that provides a reliable way to transcribe texts or revise transcriptions on a large scale.
Yaolin Zhou, Zhaoyang Zhang, Xiaoyu Wang, Quanzheng Sheng and Rongying Zhao
Abstract
Purpose
The digitalization of archival management has rapidly developed with the maturation of digital technology. With data's exponential growth, archival resources have transitioned from single modalities, such as text, images, audio and video, to integrated multimodal forms. This paper identifies key trends, gaps and areas of focus in the field. Furthermore, it proposes a theoretical organizational framework based on deep learning to address the challenges of managing archives in the era of big data.
Design/methodology/approach
Via a comprehensive systematic literature review, the authors investigate the field of multimodal archive resource organization and the application of deep learning techniques in archive organization. A systematic search and filtering process is conducted to identify relevant articles, which are then summarized, discussed and analyzed to provide a comprehensive understanding of existing literature.
Findings
The authors' findings reveal that most research on multimodal archive resources predominantly focuses on aspects related to storage, management and retrieval. Furthermore, the utilization of deep learning techniques in image archive retrieval is increasing, highlighting their potential for enhancing image archive organization practices; however, practical research and implementation remain scarce. The review also underscores gaps in the literature, emphasizing the need for more practical case studies and the application of theoretical concepts in real-world scenarios. In response to these insights, the authors' study proposes an innovative deep learning-based organizational framework. This proposed framework is designed to navigate the complexities inherent in managing multimodal archive resources, representing a significant stride toward more efficient and effective archival practices.
Originality/value
This study comprehensively reviews the existing literature on multimodal archive resources organization. Additionally, a theoretical organizational framework based on deep learning is proposed, offering a novel perspective and solution for further advancements in the field. These insights contribute theoretically and practically, providing valuable knowledge for researchers, practitioners and archivists involved in organizing multimodal archive resources.
Debasis Majhi and Bhaskar Mukherjee
Abstract
Purpose
The purpose of this study is to identify the research fronts by analysing highly cited core papers adjusted with the age of a paper in library and information science (LIS) where natural language processing (NLP) is being applied significantly.
Design/methodology/approach
By mining international databases, 3,087 core papers that received at least 5% of the total citations have been identified. By calculating the mean age of these core papers and the total citations they received, a CPT (citation/publication/time) value was calculated for all 20 fronts to understand how much relative attention a front has received among peers over time. One theme article has been finally identified from each of these 20 fronts.
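The CPT metric described in the abstract can be sketched as follows; the exact formula (total citations per core paper per year of mean age) is an assumption based on the abstract's wording, and the numbers are purely illustrative:

```python
def cpt(total_citations, n_core_papers, mean_age_years):
    """Citation/publication/time: average citations per core paper per
    year of mean age (formula assumed from the abstract's description)."""
    return total_citations / n_core_papers / mean_age_years

# e.g. a hypothetical front whose 50 core papers drew 4,000 citations
# over a mean paper age of 5 years
print(cpt(4000, 50, 5))
```

A front with a higher CPT is one whose core papers accumulate citations faster relative to their age, which is how the study ranks attention across its 20 fronts.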
Findings
Bidirectional encoder representations from transformers, with a CPT value of 1.608, followed by sentiment analysis, with a CPT of 1.292, received the highest attention in NLP research. Columbia University New York is the top institution, the Journal of the American Medical Informatics Association the top journal, the USA followed by the People's Republic of China the top countries, and Xu, H. (University of Texas) the top author across these fronts. It is identified that NLP applications boost the performance of digital libraries and automated library systems in the digital environment.
Practical implications
Any of the research fronts identified in the findings of this paper may be used as a base by researchers who intend to perform extensive research on NLP.
Originality/value
To the best of the authors’ knowledge, the methodology adopted in this paper is the first of its kind in which a meta-analysis approach has been used to understand the research fronts in a subfield such as NLP within a broad domain such as LIS.
Abdelhak Belhi, Abdelaziz Bouras, Abdulaziz Khalid Al-Ali and Sebti Foufou
Abstract
Purpose
Digital tools have been used to document cultural heritage with high-quality imaging and metadata. However, some of the historical assets are totally or partially unlabeled and some are physically damaged, which decreases their attractiveness and induces loss of value. This paper introduces a new framework that aims at tackling the cultural data enrichment challenge using machine learning.
Design/methodology/approach
This framework focuses on the automatic annotation and metadata completion through new deep learning classification and annotation methods. It also addresses issues related to physically damaged heritage objects through a new image reconstruction approach based on supervised and unsupervised learning.
Findings
The authors evaluate their approaches on a data set of cultural objects collected from various cultural institutions around the world. For the annotation and classification part of this study, the authors proposed and implemented a hierarchical multimodal classifier that improves the quality of annotation and increases the accuracy of the model, thanks to the introduction of multitask multimodal learning. Regarding cultural data visual reconstruction, the proposed clustering-based method, which combines supervised and unsupervised learning, is found to yield better-quality completion than existing inpainting frameworks.
Originality/value
This research work is original in the sense that it proposes new approaches for cultural data enrichment, and to the authors’ knowledge, none of the existing enrichment approaches focus on providing an integrated framework based on machine learning to solve current challenges in cultural heritage. These challenges, identified by the authors, relate to metadata annotation and visual reconstruction.
Giovanna Aracri, Antonietta Folino and Stefano Silvestri
Abstract
Purpose
The purpose of this paper is to propose a methodology for the enrichment and tailoring of a knowledge organization system (KOS), in order to support the information extraction (IE) task for the analysis of documents in the tourism domain. In particular, the KOS is used to develop a named entity recognition (NER) system.
Design/methodology/approach
A method to improve and customize an available thesaurus by leveraging documents related to tourism in Italy is first presented. Then, the obtained thesaurus is used to create an annotated NER corpus, exploiting distant supervision, deep learning and light human supervision.
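A distant-supervision annotation step of the kind described, where thesaurus entries are projected onto raw text as entity labels, might look like this minimal sketch; the thesaurus terms and labels are invented for illustration and are not from the paper's actual KOS:

```python
def distant_annotate(text, thesaurus):
    """Label every occurrence of a thesaurus term in the text as a named
    entity span (start, end, label) — a minimal distant-supervision sketch."""
    spans = []
    lower = text.lower()
    for term, label in thesaurus.items():
        start = lower.find(term.lower())
        while start != -1:
            spans.append((start, start + len(term), label))
            start = lower.find(term.lower(), start + 1)
    return sorted(spans)

# Hypothetical tourism-domain thesaurus entries
thesaurus = {"Colosseum": "ATTRACTION", "Rome": "PLACE"}
print(distant_annotate("The Colosseum in Rome draws millions.", thesaurus))
```

Spans produced this way are noisy, which is why the paper pairs distant supervision with light human supervision to correct the automatic labels.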
Findings
The study shows that a customized KOS can effectively support IE tasks when applied to documents belonging to the same domains and types used for its construction. Moreover, the proposed methodology is very useful for supporting and easing the annotation task, allowing a corpus to be annotated with a fraction of the effort required for manual annotation.
Originality/value
The paper explores an alternative use of a KOS, proposing an innovative NER corpus annotation methodology. Moreover, the KOS and the annotated NER data set will be made publicly available.
Miroslav Despotovic, David Koch, Eric Stumpe, Wolfgang A. Brunauer and Matthias Zeppelzauer
Abstract
Purpose
In this study the authors aim to outline new ways of information extraction for automated valuation models, which in turn would help to increase transparency in valuation procedures and thus contribute to more reliable statements about the value of real estate.
Design/methodology/approach
The authors hypothesize that empirical error in the interpretation and qualitative assessment of visual content can be minimized by collating the assessments of multiple individuals and through the use of repeated trials. Motivated by this problem, the authors developed an experimental approach for semi-automatic extraction of qualitative real estate metadata based on comparative judgments and deep learning. The authors evaluate the feasibility of their approach with the help of hedonic models.
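Collating repeated pairwise judgments can be illustrated with a simple win-rate aggregation; this is a stand-in for a full comparative-judgment model such as Bradley–Terry, not the authors' actual method, and the image labels are hypothetical:

```python
from collections import defaultdict

def collate_judgments(pairwise):
    """Aggregate repeated pairwise 'which image looks higher quality?'
    judgments into a per-image win rate in [0, 1]."""
    wins, games = defaultdict(int), defaultdict(int)
    for winner, loser in pairwise:
        wins[winner] += 1
        games[winner] += 1
        games[loser] += 1
    return {img: wins[img] / games[img] for img in games}

# Four hypothetical judgments over three interior images a, b, c
judgments = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "b")]
print(collate_judgments(judgments))
```

Pooling many such judgments from multiple raters is what smooths out individual interpretation error before the scores feed into a price model.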
Findings
The results show that the collated assessments of qualitative features of interior images have a notable effect on the price models and thus offer potential for further research within this paradigm.
Originality/value
To the best of the authors’ knowledge, this is the first approach that combines and collates the subjective ratings of visual features and deep learning for real estate use cases.
Julián Monsalve-Pulido, Jose Aguilar, Edwin Montoya and Camilo Salazar
Abstract
This article proposes an architecture of an intelligent and autonomous recommendation system to be applied to any virtual learning environment, with the objective of efficiently recommending digital resources. The paper presents the architectural details of the intelligent and autonomous dimensions of the recommendation system. The paper describes a hybrid recommendation model that orchestrates and manages the available information and the specific recommendation needs, in order to determine the recommendation algorithms to be used. The hybrid model allows the integration of the approaches based on collaborative filter, content or knowledge. In the architecture, information is extracted from four sources: the context, the students, the course and the digital resources, identifying variables, such as individual learning styles, socioeconomic information, connection characteristics, location, etc. Tests were carried out for the creation of an academic course, in order to analyse the intelligent and autonomous capabilities of the architecture.
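The orchestration step of such a hybrid model, choosing among collaborative-filtering, content-based and knowledge-based approaches according to the information available, might be sketched as follows; the thresholds and dispatch rules are illustrative assumptions, not the paper's actual logic:

```python
def choose_recommender(n_ratings, has_learner_profile, has_domain_rules):
    """Pick a recommendation approach from the information available
    (all thresholds and the fallback are hypothetical)."""
    if n_ratings >= 20:
        # Enough interaction data for collaborative filtering
        return "collaborative-filtering"
    if has_learner_profile:
        # Match digital resources against learner features instead
        return "content-based"
    if has_domain_rules:
        return "knowledge-based"
    return "popularity-fallback"

print(choose_recommender(n_ratings=3, has_learner_profile=True,
                         has_domain_rules=False))
```

The point of such a dispatcher is that a cold-start learner (few ratings) still receives recommendations from the content- or knowledge-based branches.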
Ziming Zeng, Tingting Li, Jingjing Sun, Shouqiang Sun and Yu Zhang
Abstract
Purpose
The proliferation of bots in social networks has profoundly affected the interactions of legitimate users. Detecting and rejecting these unwelcome bots has become part of the collective Internet agenda. Unfortunately, as bot creators use more sophisticated approaches to avoid being discovered, it has become increasingly difficult to distinguish social bots from legitimate users. Therefore, this paper proposes a novel social bot detection mechanism to adapt to new and different kinds of bots.
Design/methodology/approach
This paper proposes a research framework to enhance the generalization of social bot detection along two dimensions: feature extraction and detection approaches. First, 36 features are extracted from four views for social bot detection. Then, this paper analyzes the feature contribution in different kinds of social bots, and the features with stronger generalization are identified. Finally, this paper introduces outlier detection approaches to enhance the detection of ever-changing social bots.
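A minimal outlier-detection sketch in the spirit of this approach profiles legitimate-user features and flags accounts that deviate strongly; the features, z-score method and threshold are illustrative assumptions, not the paper's actual detectors:

```python
import statistics

def fit_profile(feature_rows):
    """Learn per-feature mean and stdev from legitimate-user vectors
    (e.g. posts per day, reply ratio); stdev of 0 is clamped to 1.0."""
    cols = list(zip(*feature_rows))
    return [(statistics.mean(c), statistics.pstdev(c) or 1.0) for c in cols]

def is_outlier(profile, row, z_threshold=3.0):
    """Flag an account as a likely bot if any feature deviates strongly."""
    return any(abs(x - mu) / sd > z_threshold
               for x, (mu, sd) in zip(row, profile))

# Hypothetical [posts_per_day, reply_ratio] vectors for known humans
humans = [[12, 0.8], [15, 0.9], [11, 0.85], [14, 0.75], [13, 0.82]]
profile = fit_profile(humans)
print(is_outlier(profile, [13, 0.8]))    # a typical account
print(is_outlier(profile, [500, 0.01]))  # extreme posting rate
```

Because the detector is trained only on legitimate behaviour, it can flag new kinds of bots without having seen labeled examples of them, which is the generalization advantage the paper claims over binary classifiers.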
Findings
The experimental results show that the more important features can be more effectively generalized to different social bot detection tasks. Compared with the traditional binary-class classifier, the proposed outlier detection approaches can better adapt to the ever-changing social bots with a performance of 89.23 per cent measured using the F1 score.
Originality/value
Based on the visual interpretation of the feature contribution, the features with stronger generalization in different detection tasks are found. The outlier detection approaches are first introduced to enhance the detection of ever-changing social bots.
Xiancheng Ou, Yuting Chen, Siwei Zhou and Jiandong Shi
Abstract
Purpose
With the continuous growth of online education, the quality issue of online educational videos has become increasingly prominent, causing students in online learning to face the dilemma of knowledge confusion. The existing mechanisms for controlling the quality of online educational videos suffer from subjectivity and low timeliness. Monitoring the quality of online educational videos involves analyzing metadata features and log data, which is an important aspect. With the development of artificial intelligence technology, deep learning techniques with strong predictive capabilities can provide new methods for predicting the quality of online educational videos, effectively overcoming the shortcomings of existing methods. The purpose of this study is to find a deep neural network that can model the dynamic and static features of the video itself, as well as the relationships between videos, to achieve dynamic monitoring of the quality of online educational videos.
Design/methodology/approach
The quality of a video cannot be directly measured. According to previous research, the authors use engagement to represent the level of video quality. Engagement is the normalized participation time, which represents the degree to which learners tend to participate in the video. Based on existing public data sets, this study designs an online educational video engagement prediction model based on dynamic graph neural networks (DGNNs). The model is trained based on the video’s static features and dynamic features generated after its release by constructing dynamic graph data. The model includes a spatiotemporal feature extraction layer composed of DGNNs, which can effectively extract the time and space features contained in the video's dynamic graph data. The trained model is used to predict the engagement level of learners with the video on day T after its release, thereby achieving dynamic monitoring of video quality.
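The engagement definition above, normalized participation time, can be sketched as follows; averaging per-learner watch times and capping at 1.0 are assumptions about the normalization, not the data set's exact formula:

```python
def engagement(watch_times, video_length):
    """Normalized participation time: mean watch time over video length,
    capped at 1.0 (re-watching could otherwise push the ratio above 1)."""
    mean_watch = sum(watch_times) / len(watch_times)
    return min(mean_watch / video_length, 1.0)

# Three hypothetical learners watched 60s, 120s and 300s of a 300s video
print(engagement([60, 120, 300], 300))
```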
Findings
Models with spatiotemporal feature extraction layers consisting of four types of DGNNs can accurately predict the engagement level of online educational videos. Of these, the model using the temporal graph convolutional neural network has the smallest prediction error. In dynamic graph construction, using cosine similarity and Euclidean distance functions with reasonable threshold settings can construct a structurally appropriate dynamic graph. In the training of this model, the amount of historical time series data used will affect the model’s predictive performance. The more historical time series data used, the smaller the prediction error of the trained model.
Research limitations/implications
A limitation of this study is that not all video data in the data set was used to construct the dynamic graph due to memory constraints. In addition, the DGNNs used in the spatiotemporal feature extraction layer are relatively conventional.
Originality/value
In this study, the authors propose an online educational video engagement prediction model based on DGNNs, which can achieve the dynamic monitoring of video quality. The model can be applied as part of a video quality monitoring mechanism for various online educational resource platforms.