Search results

1 – 10 of over 1000
Open Access
Article
Publication date: 31 July 2023

Daniel Šandor and Marina Bagić Babac

Abstract

Purpose

Sarcasm is a linguistic expression that usually carries the opposite meaning of what is being said by words, thus making it difficult for machines to discover the actual meaning. It is mainly distinguished by the inflection with which it is spoken, with an undercurrent of irony, and is largely dependent on context, which makes it a difficult task for computational analysis. Moreover, sarcasm expresses negative sentiments using positive words, allowing it to easily confuse sentiment analysis models. This paper aims to demonstrate the task of sarcasm detection using the approach of machine and deep learning.

Design/methodology/approach

For the purpose of sarcasm detection, machine and deep learning models were used on a data set consisting of 1.3 million social media comments, including both sarcastic and non-sarcastic comments. The data set was pre-processed using natural language processing methods, and additional features were extracted and analysed. Several machine learning models, including logistic regression, ridge regression, linear support vector and support vector machines, along with two deep learning models based on bidirectional long short-term memory and one bidirectional encoder representations from transformers (BERT)-based model, were implemented, evaluated and compared.

Findings

The performance of machine and deep learning models was compared in the task of sarcasm detection, and possible ways of improvement were discussed. Deep learning models showed more promise, performance-wise, for this type of task. Specifically, a state-of-the-art model in natural language processing, namely, a BERT-based model, outperformed the other machine and deep learning models.

Originality/value

This study compared the performance of the various machine and deep learning models in the task of sarcasm detection using the data set of 1.3 million comments from social media.

Details

Information Discovery and Delivery, vol. 52 no. 2
Type: Research Article
ISSN: 2398-6247

Open Access
Article
Publication date: 24 June 2021

Bo Wang, Guanwei Wang, Youwei Wang, Zhengzheng Lou, Shizhe Hu and Yangdong Ye

Abstract

Purpose

Vehicle fault diagnosis is a key factor in ensuring the safe and efficient operation of the railway system. Due to the numerous vehicle categories and different fault mechanisms, there is an unbalanced fault category problem. Most of the current methods to solve this problem have complex algorithm structures, low efficiency and require prior knowledge. This study aims to propose a new method which has a simple structure and does not require any prior knowledge to achieve a fast diagnosis of unbalanced vehicle faults.

Design/methodology/approach

This study proposes a novel K-means method with feature learning, called feature learning K-means-improved cluster-centers selection (FKM-ICS), which comprises the ICS and the FKM. Specifically, the ICS defines a cluster-centers approximation to select the initial cluster centers. The FKM uses an improved term frequency-inverse document frequency to measure and adjust the feature word weights in each cluster, retains the top τ feature words with the highest weight in each cluster and performs the clustering process again. With the FKM-ICS method, clustering performance for unbalanced vehicle fault diagnosis can be significantly enhanced.
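The feature-word selection step can be illustrated with a minimal sketch. It assumes plain TF-IDF computed over clusters treated as pseudo-documents, not the improved weighting defined in the paper, and the function name `top_tau_feature_words` and the toy fault texts are hypothetical.

```python
import math
from collections import Counter

def top_tau_feature_words(clusters, tau):
    """Keep the tau highest-weighted words per cluster.

    clusters: list of clusters, each a list of tokenised texts.
    Assumption: standard TF-IDF over clusters, not the paper's improved variant.
    """
    # Merge the texts of each cluster into one pseudo-document.
    merged = [[word for text in cluster for word in text] for cluster in clusters]
    n_clusters = len(merged)
    # Cluster frequency: in how many clusters each word occurs.
    cf = Counter(word for doc in merged for word in set(doc))
    selected = []
    for doc in merged:
        if not doc:
            selected.append([])
            continue
        tf = Counter(doc)
        scores = {
            word: (count / len(doc)) * math.log(n_clusters / cf[word])
            for word, count in tf.items()
        }
        # Sort by descending weight; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        selected.append(ranked[:tau])
    return selected

# Hypothetical toy clusters of tokenised fault texts.
toy_clusters = [
    [["brake", "pad", "worn"], ["brake", "leak", "detected"]],
    [["door", "sensor", "fault"], ["door", "jam", "detected"]],
]
print(top_tau_feature_words(toy_clusters, tau=1))
```

Words shared by every cluster (here "detected") receive zero weight, so they drop out before the clustering is repeated on the reduced vocabulary.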

Findings

This study finds that the FKM-ICS can achieve a fast diagnosis of vehicle faults on the vehicle fault text (VFT) data set collected from a railway station in 2017. The experimental results on VFT indicate that the proposed method outperforms several state-of-the-art methods.

Originality/value

This is the first effort to address the vehicle fault diagnostic problem and the proposed method performs effectively and efficiently. The ICS enables the FKM-ICS method to exclude the effect of outliers and to cope with the noisy data contained in the fault text data, which effectively enhances the stability of the method. The FKM enhances the distribution of feature words that discriminate between different fault categories and reduces the number of feature words, making the FKM-ICS method cluster faster and better for unbalanced vehicle fault diagnosis.

Details

Smart and Resilient Transportation, vol. 3 no. 2
Type: Research Article
ISSN: 2632-0487

Open Access
Article
Publication date: 13 October 2022

Linzi Wang, Qiudan Li, Jingjun David Xu and Minjie Yuan

Abstract

Purpose

Mining user-concerned actionable and interpretable hot topics will help management departments fully grasp the latest events and make timely decisions. Existing topic models primarily integrate word embedding and matrix decomposition, which only generates keyword-based hot topics with weak interpretability, making it difficult to meet the specific needs of users. Mining phrase-based hot topics with syntactic dependency structure has been proven to model structure information effectively. A key challenge lies in the effective integration of the above information into the hot topic mining process.

Design/methodology/approach

This paper proposes the nonnegative matrix factorization (NMF)-based hot topic mining method, semantics syntax-assisted hot topic model (SSAHM), which combines semantic association and syntactic dependency structure. First, a semantic–syntactic component association matrix is constructed. Then, the matrix is used as a constraint condition to be incorporated into the block coordinate descent (BCD)-based matrix decomposition process. Finally, a hot topic information-driven phrase extraction algorithm is applied to describe hot topics.
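As a rough illustration of the matrix decomposition underlying this kind of method, the sketch below factorizes a small nonnegative matrix V into W and H. It assumes plain NMF with the classic multiplicative update rules, not the constrained BCD-based procedure of SSAHM, and it omits the semantic–syntactic association matrix entirely. The function names and the toy matrix are hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Plain NMF via multiplicative updates (an assumption, not the paper's BCD)."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    # Strictly positive random initialisation keeps all factors nonnegative.
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(m)]
    return w, h

# Hypothetical term-document counts with two visible topic blocks.
toy_v = [[3, 2, 0, 0], [2, 3, 0, 0], [0, 0, 4, 1], [0, 0, 1, 4]]
w, h = nmf(toy_v, k=2)
print(frobenius_error(toy_v, w, h))
```

Each column of W can be read as a topic over terms and each column of H as the topic mixture of one document; a constrained variant would add penalty terms to these updates.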

Findings

The efficacy of the developed model is demonstrated on two real-world datasets, and the effects of dependency structure information on different topics are compared. The qualitative examples further explain the application of the method in real scenarios.

Originality/value

Most prior research focuses on keyword-based hot topics. Thus, the literature is advanced by mining phrase-based hot topics with syntactic dependency structure, which can effectively analyze the semantics. The development of a syntactic dependency structure that combines word order and part-of-speech (POS) is a step forward, as word order and POS are used only separately in the prior literature. Ignoring this synergy may miss important information, such as grammatical structure coherence and logical relations between syntactic components.

Details

Journal of Electronic Business & Digital Economics, vol. 1 no. 1/2
Type: Research Article
ISSN: 2754-4214

Open Access
Article
Publication date: 10 July 2023

Réka Tamássy, Zsuzsanna Géring, Gábor Király, Réka Plugor and Márton Rakovics

Abstract

Purpose

This study aims to investigate how highly ranked business schools portray ideal students in terms of their attributes and their agency. Understanding how these higher education institutions (HEIs) discursively construct their present and prospective students also sheds light on the institutions’ self-representation, the portrayal of the student–institution relationship and eventually the discursive construction of higher education’s (HE) role.

Design/methodology/approach

To understand this dynamic interrelationship, this study uses a mixed methodological textual analysis, first quantitatively identifying different modes of language use and then qualitatively analysing them.

Findings

With this approach, this study identified six language use groups. While the portrayal of the business schools and that of the students are always co-constructed, these groups differ in the extent of student and organisational agency displayed as well as the role and purpose of the institution. Business schools are always active agents in these discourses, but their roles and the students’ agency vary greatly across these six groups.

Practical implications

These findings can help practitioners determine how students are currently portrayed in their organisational texts, how their peers and competitors talk and where they want to position themselves in relation to them.

Originality/value

Previous studies discussed the ideal HE students from the perspective of the students or their educators. Other analyses on HE discourse focused on HEIs’ discursive construction and social role. This study, however, unveils how the highly ranked business schools in their external organisational communication discursively construct their ideals and expectations for both their students and the general public.

Details

Journal of International Education in Business, vol. 17 no. 1
Type: Research Article
ISSN: 2046-469X

Open Access
Article
Publication date: 8 December 2020

Matjaž Kragelj and Mirjana Kljajić Borštnar

Abstract

Purpose

The purpose of this study is to develop a model for automated classification of old digitised texts to the Universal Decimal Classification (UDC), using machine-learning methods.

Design/methodology/approach

The general research approach is inherent to design science research, in which the problem of UDC assignment of the old, digitised texts is addressed by developing a machine-learning classification model. A corpus of 70,000 scholarly texts, fully bibliographically processed by librarians, was used to train and test the model, which was used for classification of old texts on a corpus of 200,000 items. Human experts evaluated the performance of the model.

Findings

Results suggest that machine-learning models can correctly assign the UDC at some level for almost any scholarly text. Furthermore, the model can be recommended for the UDC assignment of older texts. Ten librarians corroborated this on 150 randomly selected texts.

Research limitations/implications

The main limitations of this study were unavailability of labelled older texts and the limited availability of librarians.

Practical implications

The classification model can provide a recommendation to the librarians during their classification work; furthermore, it can be implemented as an add-on to full-text search in the library databases.

Social implications

The proposed methodology supports librarians by recommending UDC classifiers, thus saving time in their daily work. By automatically classifying older texts, digital libraries can provide a better user experience by enabling structured searches. These contribute to making knowledge more widely available and useable.

Originality/value

These findings contribute to the field of automated classification of bibliographical information with the usage of full texts, especially in cases in which the texts are old, unstructured and in which archaic language and vocabulary are used.

Details

Journal of Documentation, vol. 77 no. 3
Type: Research Article
ISSN: 0022-0418

Open Access
Article
Publication date: 26 April 2018

Reijo Savolainen

Abstract

Purpose

The purpose of this paper is to clarify the conceptual issues of information behaviour research by reviewing the approaches to information interaction in the context of information seeking and retrieval (IS&R).

Design/methodology/approach

The study uses conceptual analysis focussing on four pioneering models for interactive IS&R proposed by Belkin, Ingwersen, and Ingwersen and Järvelin.

Findings

A main characteristic of models for information interaction is the tripartite setting identifying information resources accessible through information systems, intermediary/interface and user. Dialogue is a fundamental constituent of information interaction. Early models proposed by Belkin and Ingwersen focussed on the dialogue occurring in user-intermediary interaction, while more recent frameworks developed by Ingwersen and Järvelin devote more attention to dialogue constitutive of user-information system interaction.

Research limitations/implications

As the study focusses on four models developed within the period of 1984-2005, the findings cannot be generalised to depict the phenomena of information interaction as a whole. Further research is needed to model the specific features of information interaction occurring in the networked information environments in particular.

Originality/value

The study pioneers by providing an in-depth analysis of the ways in which pioneering researchers have conceptualised the phenomena of interaction in the context of IS&R. The findings contribute to the elaboration of the conceptual space of information behaviour research.

Open Access
Article
Publication date: 29 June 2018

Jonathan Simões Freitas, Jéssica Castilho Andrade Ferreira, André Azevedo Rennó Campos, Júlio Cézar Fonseca de Melo, Lin Chih Cheng and Carlos Alberto Gonçalves

Abstract

Purpose

This paper aims to map the creation and evolution of centering resonance analysis (CRA). This method was an innovative approach developed to conduct textual content analysis in a semi-automatic, theory-informed and analytically rigorous way. Nevertheless, despite its robust procedures to analyze documents and interviews, CRA is still broadly unknown and scarcely used in management research.

Design/methodology/approach

To track CRA’s development, the roadmapping approach was properly adapted. The traditional time-based multi-layered map format was customized to depict, graphically, the results obtained from a systematic literature review of the main CRA publications.

Findings

In total, 19 papers were reviewed, from the method’s introduction in 2002 to its last tracked methodological development. In all, 26 types of CRA analysis were identified and grouped in five categories. The most innovative procedures in each group were discussed and exemplified. Finally, a CRA methodological roadmap was presented, including a layered typology of the publications, in terms of their focus and innovativeness; the number of analyses conducted in each publication; references for further CRA development; a segmentation and description of the main publication periods; main turning points; citation-based relationships; and four possible future scenarios for CRA as a method.

Originality/value

This paper offers a unique and comprehensive review of CRA’s development, favoring its broader use in management research. In addition, it develops an adapted version of the roadmapping approach, customized for mapping methodological innovations over time.

Details

RAUSP Management Journal, vol. 53 no. 3
Type: Research Article
ISSN: 2531-0488

Open Access
Article
Publication date: 11 October 2023

Bachriah Fatwa Dhini, Abba Suganda Girsang, Unggul Utan Sufandi and Heny Kurniawati

Abstract

Purpose

The authors constructed an automatic essay scoring (AES) model in a discussion forum where the result was compared with scores given by human evaluators. This research proposes essay scoring, which is conducted through two parameters, semantic and keyword similarities, using a SentenceTransformers pre-trained model that can construct the highest vector embedding. Combining the two is intended to optimize the model and increase its accuracy.

Design/methodology/approach

The development of the model in the study is divided into seven stages: (1) collecting data, (2) pre-processing the data, (3) selecting a pre-trained SentenceTransformers model, (4) computing semantic similarity (sentence pair), (5) computing keyword similarity, (6) calculating the final score and (7) evaluating the model.
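Stages (4) to (6) can be sketched in a few lines. The sketch assumes that sentence embeddings have already been produced by some pre-trained model and are passed in as plain lists of floats. The abstract does not state the combining formula, so the equal-weight average, the function names and the 0 to 100 scale are all hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors (stage 4)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def keyword_similarity(answer_keywords, rubric_keywords):
    """Share of rubric keywords found in the answer (stage 5, assumed form)."""
    rubric = {k.lower() for k in rubric_keywords}
    if not rubric:
        return 0.0
    answer = {k.lower() for k in answer_keywords}
    return len(answer & rubric) / len(rubric)

def final_score(semantic, keyword, semantic_weight=0.5, max_score=100):
    """Weighted combination of both similarities (stage 6, assumed formula)."""
    combined = semantic_weight * semantic + (1 - semantic_weight) * keyword
    # Clamp to [0, 1] because cosine similarity can be negative.
    return max(0.0, min(1.0, combined)) * max_score

# Hypothetical inputs: identical embeddings, half of the rubric keywords matched.
semantic = cosine_similarity([1.0, 0.0], [1.0, 0.0])
keyword = keyword_similarity(["Cell", "DNA"], ["cell", "dna", "enzyme", "protein"])
print(final_score(semantic, keyword))
```

Keeping the two similarities separate until the last step makes it easy to tune `semantic_weight` against human scores during evaluation.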

Findings

The paraphrase-multilingual-MiniLM-L12-v2 and distilbert-base-multilingual-cased-v1 models got the highest scores in a comparison of 11 pre-trained multilingual SentenceTransformers models on Indonesian data (Dhini and Girsang, 2023). Both multilingual models were adopted in this study. A combination of the two parameters is obtained by comparing the keywords extracted from the responses with the rubric keywords. Based on the experimental results, the proposed combination can increase the evaluation results by 0.2.

Originality/value

This study uses discussion forum data from the general biology course in online learning at the open university for the 2020.2 and 2021.2 semesters. Forum discussion ratings are still manual. In this study, the authors created a model that automatically scores discussion forum essays based on the lecturer's answers and rubrics.

Details

Asian Association of Open Universities Journal, vol. 18 no. 3
Type: Research Article
ISSN: 1858-3431

Open Access
Article
Publication date: 6 September 2021

Gerd Hübscher, Verena Geist, Dagmar Auer, Nicole Hübscher and Josef Küng

Abstract

Purpose

Knowledge- and communication-intensive domains still long for a better support of creativity that considers legal requirements, compliance rules and administrative tasks as well, because current systems focus either on knowledge representation or business process management. The purpose of this paper is to discuss our model of integrated knowledge and business process representation and its presentation to users.

Design/methodology/approach

The authors follow a design science approach in the environment of patent prosecution, which is characterized by a highly standardized, legally prescribed process and individual knowledge work. Thus, the research is based on knowledge work, BPM, graph-based knowledge representation and user interface design. The authors iteratively designed and built a model and a prototype. To evaluate the approach, the authors used analytical proof of concept, real-world test scenarios and case studies in real-world settings, where the authors conducted observations and open interviews.

Findings

The authors designed a model and implemented a prototype for evolving and storing static and dynamic aspects of knowledge. The proposed solution leverages the flexibility of a graph-based model to enable not only open, continuously developing user-centered processes but also pre-defined ones. The authors further propose a user interface concept that helps users benefit from the richness of the model while providing sufficient guidance.

Originality/value

The balanced integration of the data and task perspectives distinguishes the model significantly from other approaches such as BPM or knowledge graphs. The authors further provide a sophisticated user interface design, which allows the users to effectively and efficiently use the graph-based knowledge representation in their daily work.

Details

International Journal of Web Information Systems, vol. 17 no. 6
Type: Research Article
ISSN: 1744-0084

Open Access
Article
Publication date: 14 July 2020

Yuning Zhao, Xinxue Zhou and Tianmei Wang

Abstract

Purpose

Following Hovland’s persuasion theory, this paper aims to develop a conceptual model and analyze characteristics of online political deliberation behavior from three aspects (i.e. information, situation and manager). Based on the whole interactive process of online political deliberation, this paper aims to reveal the key points that affect the response effect of the government from the persuasive perspective of online political consultation.

Design/methodology/approach

Based on more than 40,000 netizens’ posts and government responses on a Chinese political platform from 2011 to the first half of 2019, this paper used text analysis and machine learning methods to extract measurement variables of online political deliberation characteristics and an econometric analysis method to conduct empirical research.

Findings

The results showed that the textual information, political environment and identity of the political objects affect the effectiveness of government response. Furthermore, for different position categories of political officials, the length of political texts, topic categories and emotional tendencies have different effects on the response effectiveness. Additionally, the effect of political time on the effectiveness of response differs.

Originality/value

The findings will help ascertain the characteristics of online political deliberation behavior that affect how effective government response is and provide a theoretical basis for why the public should express their political concerns.

Details

International Journal of Crowd Science, vol. 4 no. 3
Type: Research Article
ISSN: 2398-7294
