Search results (1–10 of 88)

Zhengfa Yang, Qian Liu, Baowen Sun and Xin Zhao
Abstract
Purpose
This paper aims to provide an accessible entry point for researchers who are new to Community Question Answering (CQA) expert recommendation, and to help those already working on the issue extend their understanding through future research.
Design/methodology/approach
In this paper, keywords such as “CQA”, “Social Question Answering”, “expert recommendation”, “question routing” and “expert finding” are used to search major digital libraries. The final sample includes a list of 83 relevant articles authored in academia as well as industry that have been published from January 1, 2008 to March 1, 2019.
Findings
This study proposes a comprehensive framework to categorize extant studies into three broad areas of CQA expert recommendation research: understanding profile modeling, recommendation approaches and recommendation system impacts.
Originality/value
This paper focuses on discussing and sorting out the key research issues in these three research genres. Finally, it identifies conflicting and contradictory results and research gaps in the existing literature, and puts forward urgent topics for future research.
Haosen Liu, Youwei Wang, Xiabing Zhou, Zhengzheng Lou and Yangdong Ye
Abstract
Purpose
Railway signal equipment failure diagnosis is vital to keeping the railway system operating safely. One of the main difficulties in signal equipment failure diagnosis is the uncertainty of the causality between an accident's consequence and its cause. The traditional approach to this problem is based on Bayesian networks, which require a rigid independence assumption and prior probability knowledge while ignoring the semantic relationships involved in causality analysis. This paper aims to handle the uncertainty of causality in signal equipment failure diagnosis in a new way that emphasises mining semantic relationships.
Design/methodology/approach
This study proposes a deterministic failure diagnosis (DFD) model based on a question answering system to implement railway signal equipment failure diagnosis. It includes a failure diagnosis module and a deterministic diagnosis module. In the failure diagnosis module, the question answering system is exploited to recognise the causes of failure consequences. The question answering system is composed of multi-layer neural networks: the lower layers extract position and part-of-speech features of the text data, the higher layers acquire contextual and interactive features through Bi-LSTM and Match-LSTM, respectively, and the proposed enhanced boundary unit then generates the candidate failure cause set. In the second module, the study ranks the candidate failure causes with a semantic matching mechanism (SMM), choosing the candidate with the highest semantic matching degree as the deterministic failure cause.
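The deterministic-diagnosis step described above can be sketched very simply: rank candidate failure causes by a semantic matching score and keep the top-1 match. The token-overlap score below is only a stand-in for the paper's semantic matching mechanism (SMM); the function names and example data are illustrative.

```python
# Rank candidate failure causes and pick the best semantic match.
# Token overlap (Jaccard) is a toy stand-in for the paper's SMM.

def match_score(consequence: str, cause: str) -> float:
    """Toy semantic matching degree: Jaccard overlap of word sets."""
    a, b = set(consequence.split()), set(cause.split())
    return len(a & b) / len(a | b) if a | b else 0.0

def deterministic_cause(consequence: str, candidates: list[str]) -> str:
    """Pick the candidate cause with the highest matching degree."""
    return max(candidates, key=lambda c: match_score(consequence, c))

candidates = [
    "track circuit relay contact failure",
    "signal lamp filament burned out",
    "power supply voltage fluctuation",
]
top = deterministic_cause("signal lamp does not light filament suspected", candidates)
print(top)  # the lamp-related cause wins on token overlap
```

In the actual model, the matching degree would come from learned sentence representations rather than word overlap, but the top-1 selection step is the same.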
Findings
Experiments on a real data set of railway signal equipment maintenance records show that the proposed DFD model can implement deterministic diagnosis of railway signal equipment failures. Compared with numerous existing methods, the model achieves the state of the art in natural-language semantic understanding for the railway signal equipment diagnosis domain.
Originality/value
This is the first time a question answering system has been used to perform signal equipment failure diagnosis, which makes failure diagnosis more intelligent than before. The EMU enables the DFD model to understand natural semantics in long-sequence contexts, and the SMM then allows the DFD model to obtain the certain failure cause in the failure diagnosis of railway signal equipment.
Sofia Baroncini, Bruno Sartini, Marieke Van Erp, Francesca Tomasi and Aldo Gangemi
Abstract
Purpose
In the last few years, the amount of Linked Open Data (LOD) describing artworks, in general-purpose or domain-specific Knowledge Graphs (KGs), has been gradually increasing. This provides (art-)historians and Cultural Heritage professionals with a wealth of information to explore. Specifically, structured data about iconographical and iconological (icon) aspects, i.e. information about the subjects, concepts and meanings of artworks, are extremely valuable for state-of-the-art computational tools, e.g. content recognition through computer vision. Nevertheless, a data quality evaluation for art domains, fundamental for data reuse, is still missing. The purpose of this study is to fill this gap with an overview of art-historical data quality in current KGs, with a focus on icon aspects.
Design/methodology/approach
This study’s analyses are based on established KG evaluation methodologies, adapted to the domain by addressing requirements from art historians’ theories. The authors first select several KGs according to Semantic Web principles. Then, the authors evaluate (1) their structures’ suitability to describe icon information through quantitative and qualitative assessment and (2) their content, qualitatively assessed in terms of correctness and completeness.
Findings
This study’s results reveal several issues on the current expression of icon information in KGs. The content evaluation shows that these domain-specific statements are generally correct but often not complete. The incompleteness is confirmed by the structure evaluation, which highlights the unsuitability of the KG schemas to describe icon information with the required granularity.
Originality/value
The main contribution of this work is an overview of the actual landscape of the icon information expressed in LOD. Therefore, it is valuable to cultural institutions by providing them a first domain-specific data quality evaluation. Since this study’s results suggest that the selected domain information is underrepresented in Semantic Web datasets, the authors highlight the need for the creation and fostering of such information to provide a more thorough art-historical dimension to LOD.
Maria Giovanna Confetto and Claudia Covucci
Abstract
Purpose
For companies that intend to respond to the needs of modern conscious consumers, a great competitive advantage lies in the ability to incorporate sustainability messages in marketing communications. The aim of this paper is to address this important priority in the web context by building a semantic algorithm that allows content managers to evaluate the quality of sustainability web content for search engines, considering current semantic web developments.
Design/methodology/approach
Following the Design Science (DS) methodological approach, the study develops the algorithm as an artefact capable of solving a practical problem and improving the content management process.
Findings
The algorithm considers multiple evaluation factors, grouped into three parameters: completeness, clarity and consistency. An applicability test of the algorithm was conducted on a sample of pages from the Google sustainability blog to highlight the correspondence between the established evaluation factors and those actually used by Google.
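A three-parameter content score of this kind can be sketched as follows. The specific factor checks and thresholds below are hypothetical illustrations, not the authors' actual evaluation rules: completeness is proxied by coverage of an expected sustainability vocabulary, clarity by average sentence length, and consistency by how many sentences use that vocabulary.

```python
# Toy three-parameter evaluation (completeness, clarity, consistency).
# All heuristics and the 20-word clarity threshold are illustrative.
import re

def evaluate_content(text: str, sustainability_terms: set[str]) -> dict:
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    # Completeness: share of expected sustainability terms that appear.
    completeness = len(sustainability_terms & tokens) / len(sustainability_terms)
    # Clarity: shorter average sentence length scores higher (capped at 1).
    sentences = [s for s in text.split(".") if s.strip()]
    avg_len = sum(len(s.split()) for s in sentences) / len(sentences)
    clarity = min(1.0, 20 / avg_len)
    # Consistency: fraction of sentences using sustainability vocabulary.
    hits = sum(
        bool(sustainability_terms & set(re.findall(r"[a-z]+", s.lower())))
        for s in sentences
    )
    consistency = hits / len(sentences)
    return {"completeness": completeness, "clarity": clarity, "consistency": consistency}

terms = {"sustainability", "recycling", "emissions"}
scores = evaluate_content(
    "Our sustainability plan cuts emissions. Recycling targets rise yearly.", terms
)
print(scores)
```

The paper's algorithm would replace these heuristics with its semantically grounded factors, but the three-parameter grouping is the same.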
Practical implications
Studying content marketing for sustainability communication constitutes a new field of research that offers exciting opportunities. Writing sustainability content effectively is a fundamental step in triggering stakeholder engagement mechanisms online. It could be a positive social engineering technique in the hands of marketers, enabling web users to pursue sustainable development in their choices.
Originality/value
This is the first study to create a theoretical connection between digital content marketing and sustainability communication, focussing especially on aspects of search engine optimization (SEO). The "Sustainability-contents SEO" algorithm is the first operational software tool, with a regulatory nature, able to analyse web content, detecting the terms of the sustainability language and measuring compliance with SEO requirements.
Bufei Xing, Haonan Yin, Zhijun Yan and Jiachen Wang
Abstract
Purpose
The purpose of this paper is to propose a new approach to retrieve similar questions in online health communities to improve the efficiency of health information retrieval and sharing.
Design/methodology/approach
This paper proposes a hybrid approach that combines domain knowledge similarity and topic similarity to retrieve similar questions in online health communities. The domain knowledge similarity evaluates the domain distance between different questions, and the topic similarity measures questions' relationships based on the extracted latent topics.
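The hybrid idea can be sketched as a weighted combination of two similarity scores. Here, entity overlap stands in for the paper's domain knowledge similarity, cosine similarity over toy topic distributions stands in for the topic similarity, and the 0.5 weight is illustrative, not the authors' parameter.

```python
# Hybrid question similarity: weighted mix of a domain (entity) similarity
# and a topic similarity. All example data and the weight are illustrative.
import math

def entity_similarity(ents_a: set[str], ents_b: set[str]) -> float:
    """Domain-knowledge stand-in: Jaccard overlap of named entities."""
    union = ents_a | ents_b
    return len(ents_a & ents_b) / len(union) if union else 0.0

def topic_similarity(t_a: list[float], t_b: list[float]) -> float:
    """Cosine similarity of topic distributions."""
    dot = sum(x * y for x, y in zip(t_a, t_b))
    norm = math.sqrt(sum(x * x for x in t_a)) * math.sqrt(sum(y * y for y in t_b))
    return dot / norm if norm else 0.0

def hybrid_similarity(ents_a, ents_b, t_a, t_b, alpha: float = 0.5) -> float:
    return alpha * entity_similarity(ents_a, ents_b) + (1 - alpha) * topic_similarity(t_a, t_b)

q1 = ({"diabetes", "insulin"}, [0.8, 0.2])
q2 = ({"diabetes", "diet"}, [0.7, 0.3])
score = hybrid_similarity(q1[0], q2[0], q1[1], q2[1])
print(round(score, 3))
```

Because the entity term rewards shared medical concepts even when the wording differs, this combination addresses exactly the word-mismatch problem the paper highlights.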
Findings
The experiment results show that the proposed method outperforms the baseline methods.
Originality/value
This method overcomes the problem of word mismatch and considers the named entities included in questions, which most existing studies do not.
Daniel Hofer, Markus Jäger, Aya Khaled Youssef Sayed Mohamed and Josef Küng
Abstract
Purpose
Log files are a crucial piece of information for aiding computer security experts in their work. The time domain is especially important because, in most cases, timestamps are the only linking points between events, caused by attackers, faulty systems or simple errors, and their corresponding entries in log files. With the idea of storing and analyzing this log information in graph databases, we need a suitable model to store and connect timestamps and their events. This paper aims to find and evaluate different approaches to storing timestamps in graph databases, along with their individual benefits and drawbacks.
Design/methodology/approach
We analyse three different approaches to representing and storing timestamp information in graph databases. To check the models, we set up four typical questions that are important for log file analysis and tested them against each model. During the evaluation, we used performance and other properties as metrics for how suitable each model is for representing the log files' timestamp information. In the last part, we try to improve one promising-looking model.
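Two of the modeling choices such a study typically weighs can be illustrated in miniature: storing the timestamp as a plain property on each event node versus as a dedicated time node that events link to. The in-memory "graph" below is a hypothetical stand-in for a real graph database such as Neo4j; only the structural contrast matters.

```python
# Two toy timestamp models for event graphs, each answering the same
# range query ("which events fall in this time window?").

# Model A: timestamp stored as a property of each event node.
events_a = [
    {"id": 1, "msg": "login failed", "ts": 1700000100},
    {"id": 2, "msg": "login ok", "ts": 1700000200},
]

def range_query_a(events, start, end):
    return [e["id"] for e in events if start <= e["ts"] <= end]

# Model B: timestamps as dedicated nodes; events link to them via edges.
time_nodes = {1700000100: "t1", 1700000200: "t2"}
edges = [(1, "t1"), (2, "t2")]  # (event id, time node id)

def range_query_b(time_nodes, edges, start, end):
    valid = {tid for ts, tid in time_nodes.items() if start <= ts <= end}
    return [eid for eid, tid in edges if tid in valid]

print(range_query_a(events_a, 1700000000, 1700000150))
print(range_query_b(time_nodes, edges, 1700000000, 1700000150))
```

Model A needs only a property filter, while Model B requires an extra traversal through the time nodes, which mirrors the paper's conclusion that the model with the fewest graph-specific concepts yields the simplest queries.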
Findings
We come to the conclusion that the simplest model, the one using the fewest graph database-specific concepts, is also the one yielding the simplest and fastest queries.
Research limitations/implications
Limitations of this research are that only one graph database was studied and that improvements to the query engine might change future results.
Originality/value
In the study, we addressed the issue of storing timestamps in graph databases in a meaningful, practical and efficient way. The results can be used as a pattern for similar scenarios and applications.
Abstract
Purpose
As a relatively new computing paradigm, crowdsourcing has gained enormous attention in the recent decade. Its compliance with Web 2.0 principles also presents unprecedented opportunities to empower the related services and mechanisms by leveraging humans' intelligence and problem-solving abilities. Given the pivotal role of search engines on the Web and in the information community, this paper aims to investigate the advantages and challenges of incorporating people, as intelligent agents, into search engines' workflow.
Design/methodology/approach
To emphasize the role of the human in computational processes, some specific and related areas are studied. Then, through studying the current trends in the field of crowd-powered search engines and analyzing the actual needs and requirements, the perspectives and challenges are discussed.
Findings
As the research on this topic is still in its infancy, it is believed that this study can be considered as a roadmap for future works in the field. In this regard, current status and development trends are delineated through providing a general overview of the literature. Moreover, several recommendations for extending the applicability and efficiency of next generation of crowd-powered search engines are presented. In fact, becoming aware of different aspects and challenges of constructing search engines of this kind can shed light on the way of developing working systems with respect to essential considerations.
Originality/value
The present study aimed to portray the big picture of crowd-powered search engines and their possible challenges and issues. As one of the early works providing a comprehensive report on different aspects of the topic, it can be regarded as a reference point.
Tim Gorichanaz, Jonathan Furner, Lai Ma, David Bawden, Lyn Robinson, Dominic Dixon, Ken Herold, Sille Obelitz Søe, Betsy Van der Veer Martens and Luciano Floridi
Abstract
Purpose
The purpose of this paper is to review and discuss Luciano Floridi’s 2019 book The Logic of Information: A Theory of Philosophy as Conceptual Design, the latest instalment in his philosophy of information (PI) tetralogy, particularly with respect to its implications for library and information studies (LIS).
Design/methodology/approach
Nine scholars with research interests in philosophy and LIS read and responded to the book, raising critical and heuristic questions in the spirit of scholarly dialogue. Floridi responded to these questions.
Findings
Floridi’s PI, including this latest publication, is of interest to LIS scholars, and much insight can be gained by exploring this connection. It seems also that LIS has the potential to contribute to PI’s further development in some respects.
Research limitations/implications
Floridi’s PI work is technical philosophy that many LIS scholars lack the training or patience to engage with, yet doing so is rewarding. This suggests a role for translational work between philosophy and LIS.
Originality/value
The book symposium format, not yet seen in LIS, provides a forum for sustained, multifaceted and generative dialogue around ideas.
Martin Nečaský, Petr Škoda, David Bernhauer, Jakub Klímek and Tomáš Skopal
Abstract
Purpose
Semantic retrieval and discovery of datasets published as open data remains a challenging task. The datasets inherently originate in the globally distributed web jungle, lacking the luxury of centralized database administration, database schemes, shared attributes, vocabulary, structure and semantics. The existing dataset catalogs provide basic search functionality relying on keyword search in brief, incomplete or misleading textual metadata attached to the datasets. The search results are thus often insufficient. However, there exist many ways of improving the dataset discovery by employing content-based retrieval, machine learning tools, third-party (external) knowledge bases, countless feature extraction methods and description models and so forth.
Design/methodology/approach
In this paper, the authors propose a modular framework for rapid experimentation with methods for similarity-based dataset discovery. The framework consists of an extensible catalog of components prepared to form custom pipelines for dataset representation and discovery.
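The component-pipeline idea can be sketched schematically: dataset discovery becomes a pipeline assembled from interchangeable representation and similarity components. The component names, the bag-of-words representation and the toy catalog below are illustrative, not part of the authors' framework.

```python
# A minimal component pipeline for similarity-based dataset discovery:
# swap `represent` or `similarity` to form a different pipeline.
from typing import Callable

def tokenize(metadata: str) -> set[str]:
    """Representation component: bag of lowercase words."""
    return set(metadata.lower().split())

def jaccard(a: set[str], b: set[str]) -> float:
    """Similarity component: Jaccard overlap."""
    return len(a & b) / len(a | b) if a | b else 0.0

def make_pipeline(represent: Callable, similarity: Callable):
    """Compose components into a dataset-discovery pipeline."""
    def discover(query: str, catalog: dict[str, str], k: int = 3):
        q = represent(query)
        ranked = sorted(
            catalog,
            key=lambda name: similarity(q, represent(catalog[name])),
            reverse=True,
        )
        return ranked[:k]
    return discover

catalog = {
    "air-quality": "hourly air quality measurements by city",
    "budget": "municipal budget spending records",
}
pipeline = make_pipeline(tokenize, jaccard)
print(pipeline("city air pollution measurements", catalog, k=1))
```

In the real framework, the representation component might be an embedding model or an external knowledge base lookup; the point is that components are pluggable, so experiments stay comparable.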
Findings
The study presents several proof-of-concept pipelines, including an experimental evaluation, which showcase the usage of the framework.
Originality/value
To the best of the authors’ knowledge, there is no similar formal framework for experimentation with various similarity methods in the context of dataset discovery. The framework has the ambition to establish a platform for reproducible and comparable research in the area of dataset discovery. The prototype implementation of the framework is available on GitHub.
Bachriah Fatwa Dhini, Abba Suganda Girsang, Unggul Utan Sufandi and Heny Kurniawati
Abstract
Purpose
The authors constructed an automatic essay scoring (AES) model for a discussion forum, where the results were compared with scores given by human evaluators. This research proposes essay scoring conducted through two parameters, semantic and keyword similarity, using pre-trained SentenceTransformers models that produce the strongest vector embeddings. These models are combined to optimize the approach and increase accuracy.
Design/methodology/approach
The development of the model in the study is divided into seven stages: (1) data collection, (2) pre-processing data, (3) selected pre-trained SentenceTransformers model, (4) semantic similarity (sentence pair), (5) keyword similarity, (6) calculate final score and (7) evaluating model.
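Stage (6), the final-score calculation, can be sketched as a weighted mix of a semantic similarity and a keyword similarity against rubric keywords. The toy embedding vectors, the example answer and the 0.7 weight below are illustrative; in the paper, the semantic part comes from SentenceTransformers embeddings.

```python
# Final AES score: weighted combination of semantic similarity (cosine
# over sentence embeddings) and rubric-keyword similarity. Illustrative
# vectors and weight; not the authors' trained model.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def keyword_similarity(answer: str, rubric_keywords: set[str]) -> float:
    found = {k for k in rubric_keywords if k in answer.lower()}
    return len(found) / len(rubric_keywords)

def final_score(emb_answer, emb_reference, answer, rubric_keywords, w=0.7):
    semantic = cosine(emb_answer, emb_reference)
    keywords = keyword_similarity(answer, rubric_keywords)
    return w * semantic + (1 - w) * keywords

score = final_score(
    [0.9, 0.1], [0.8, 0.2],                   # toy sentence embeddings
    "photosynthesis converts light energy",    # student answer
    {"photosynthesis", "light"},               # rubric keywords
)
print(round(score, 3))
```

Combining the two signals this way lets a semantically adequate answer that misses rubric terms, or a keyword-rich but off-topic answer, be penalised appropriately, which matches the paper's motivation for mixing the parameters.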
Findings
The multilingual paraphrase-multilingual-MiniLM-L12-v2 and distilbert-base-multilingual-cased-v1 models obtained the highest scores in a comparison of 11 pre-trained multilingual SentenceTransformers models on Indonesian data (Dhini and Girsang, 2023). Both multilingual models were adopted in this study. The combination of the two parameters is obtained by comparing the keyword extraction responses with the rubric keywords. Based on the experimental results, the proposed combination increases the evaluation score by 0.2.
Originality/value
This study uses discussion forum data from the general biology course in online learning at the open university for the 2020.2 and 2021.2 semesters. Forum discussion scoring is still manual. In this research, the authors created a model that automatically scores discussion forum posts, which are essays, based on the lecturer's answers as well as rubrics.