Search results

1 – 10 of 18
Article
Publication date: 16 August 2022

Jung Ran Park, Erik Poole and Jiexun Li

Abstract

Purpose

The purpose of this study is to explore linguistic stylometric patterns encompassing lexical, syntactic, structural, sentiment and politeness features that are found in librarians’ responses to user queries.

Design/methodology/approach

A total of 462 online texts/transcripts comprising librarians’ answers to users’ questions, drawn from the Internet Public Library, were examined. A Principal Component Analysis, a data reduction technique, was conducted on the texts and transcripts. The analysis reveals three principal components that predominate in librarians’ answers: stylometric richness, stylometric brevity and interpersonal support.
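
To make the data-reduction step concrete, here is a minimal sketch (not the authors’ code) of running a Principal Component Analysis over a hypothetical table of stylometric features with scikit-learn; the file name and feature columns are assumptions for illustration, and the component loadings are what would let one label components such as “stylometric richness” or “stylometric brevity”.

```python
# Minimal PCA sketch over hypothetical stylometric features (illustrative only).
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical input: one row per librarian answer, columns are stylometric measures.
df = pd.read_csv("librarian_answers_features.csv")       # assumed file name
features = ["avg_word_length", "type_token_ratio", "sentence_count",
            "politeness_markers", "positive_sentiment"]   # assumed feature names

X = StandardScaler().fit_transform(df[features])  # standardise features before PCA
pca = PCA(n_components=3)                         # three components, as in the study
scores = pca.fit_transform(X)                     # per-answer component scores

print(pca.explained_variance_ratio_)                     # variance captured by each component
print(pd.DataFrame(pca.components_, columns=features))   # loadings used to interpret components
```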

Findings

The results of the study have important implications in digital information services because stylometric features such as lexical richness, structural clarity and interpersonal support may interplay with the degree of complexity of user queries, the (a)synchronous communication mode, application of information service guideline and manuals and overall characteristics and quality of a given digital information service. Such interplay may bring forth a direct impact on user perceptions and satisfaction regarding interaction with librarians and the information service received through the computer-mediated communication channel.

Originality/value

To the best of the authors’ knowledge, the stylometric features encompassing lexical, syntactic, structural, sentiment and politeness using Principal Component Analysis have not been explored in digital information/reference services. Thus, there is an emergent need to explore more fully how linguistic stylometric features interplay with the types of user queries, the asynchronous online communication mode, application of information service guidelines and the quality of a particular digital information service.

Details

Global Knowledge, Memory and Communication, vol. 73 no. 3
Type: Research Article
ISSN: 2514-9342

Keywords

Book part
Publication date: 28 March 2024

Margarethe Born Steinberger-Elias

Abstract

In times of crisis, such as the Covid-19 global pandemic, journalists who write about biomedical information must aim strategically to be clearly and easily understood by everyone. In this study, we assume that journalistic discourse could benefit from language redundancy to improve the clarity and simplicity needed for science popularization. The concept of language redundancy is discussed theoretically with the support of discourse analysis and information theory. The methodology adopted is a corpus-based qualitative approach. Two corpus samples of Brazilian Portuguese (BP) texts on Covid-19 were collected: one with texts from Pesquisa FAPESP, a monthly digital science magazine aimed at students and researchers for the dissemination of scientific information, and the other with popular-language texts from the news portal G1 (Rede Globo) aimed at unspecified and/or non-specialized readers. The materials were filtered with two descriptors: “vaccine” and “test.” Preliminary analysis of examples from these materials revealed two categories of redundancy: paraphrastic and polysemic. Paraphrastic redundancy is based on the concomitant reformulation of words, sentences, text excerpts or even larger units. Polysemic redundancy does not easily show material evidence but is based on cognitively predictable semantic associations in socio-cultural domains. Both kinds of redundancy contribute, each in its own way, to improving text readability for science popularization in Brazil.
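
As a rough illustration of the corpus-filtering step only (not the author’s procedure), the sketch below selects text files that mention either descriptor; the folder names and the Portuguese descriptor forms are assumptions.

```python
# Illustrative filtering of two hypothetical BP corpora by the study's two descriptors.
from pathlib import Path

DESCRIPTORS = ("vacina", "teste")  # assumed Portuguese forms of "vaccine" and "test"

def matching_texts(folder):
    """Return the files in a folder whose text mentions at least one descriptor."""
    hits = []
    for path in Path(folder).glob("*.txt"):
        text = path.read_text(encoding="utf-8").lower()
        if any(term in text for term in DESCRIPTORS):
            hits.append(path)
    return hits

fapesp_sample = matching_texts("corpus_pesquisa_fapesp")  # assumed folder names
g1_sample = matching_texts("corpus_g1")
print(len(fapesp_sample), len(g1_sample))
```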

Details

Geo Spaces of Communication Research
Type: Book
ISBN: 978-1-80071-606-3

Keywords

Book part
Publication date: 28 March 2024

Julien Figeac, Nathalie Paton, Angelina Peralva, Arthur Coelho Bezerra, Héloïse Prévost, Pierre Ratinaud and Tristan Salord

Abstract

Based on a lexical analysis of publications on 529 Facebook pages, published between 2013 and 2017, this research explores how Brazilian left-wing activist groups participate on Facebook to coordinate their opposition and engage in social struggles. This chapter shows how activist groups set up two main digital network repertoires of action when mobilizing on Facebook. First, in direct connection with major political events, the platform is used as a media arena to challenge governments’ political actions and second, it is employed as a tool to coordinate mobilization, whether these mobilizations are demonstrations on the street or at cultural events, such as at a music concert. These repertoires of action exemplify ways in which contemporary Brazilian activism is carried out at the intersection of online and offline engagements. While participants engage through these two repertoires, this network of activists is held together over time through a more mundane type of event, pertaining to the repertoire of action allowing the organization of mobilization. Stepping aside from opposition and struggles brought to the streets, the organization of cultural activities, such as concerts and exhibitions, punctuates the everyday exchanges in activists’ communications. Talk about cultural events and their related social agendas structures activist networks on a medium-term basis and creates the conditions for the coordination of (future) social movements, in that they offer the opportunities to stay in contact, in addition to taking part in occasional gatherings, between more highly visible social protests.

Details

Geo Spaces of Communication Research
Type: Book
ISBN: 978-1-80071-606-3

Keywords

Article
Publication date: 20 December 2022

Javaid Ahmad Wani and Shabir Ahmad Ganaie

Abstract

Purpose

The current study aims to map the scientific output of grey literature (GL) through bibliometric approaches.

Design/methodology/approach

The source for data extraction is a comprehensive “indexing and abstracting” database, Web of Science (WoS). A lexical title search was applied to obtain the corpus of the study – a total of 4,599 articles were extracted for data analysis and visualisation. The data were then analysed using the analytical tools RStudio and VOSviewer.
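
As a sketch of how such an overview can be assembled from a Web of Science export (the authors’ actual workflow used RStudio and VOSviewer), the snippet below tallies publications per year and the most frequent sources with pandas; the file name and column layout follow a typical tab-delimited WoS export and are assumptions.

```python
# Hedged sketch: summarising a tab-delimited Web of Science export with pandas.
import pandas as pd

# Assumed export with "PY" (publication year) and "SO" (source title) columns.
records = pd.read_csv("wos_grey_literature_export.txt", sep="\t")

per_year = records["PY"].value_counts().sort_index()   # publication growth over time
top_sources = records["SO"].value_counts().head(10)    # most productive sources

print(per_year)
print(top_sources)
# Share of output in the most productive phase (assumes integer years in "PY").
print(f"2018-2021 share: {per_year.loc[2018:2021].sum() / len(records):.0%}")
```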

Findings

The findings showed that publications have grown substantially over the period studied. The most productive phase (2018–2021) accounted for 47% of the articles. The most prominent sources were PLOS One and NeuroImage. The highest numbers of papers were contributed by Haddaway and Kumar. The most relevant countries were the USA and the UK.

Practical implications

The study is useful for researchers interested in the GL research domain. The study helps to understand the evolution of the GL to provide research support further in this area.

Originality/value

The present study provides a new orientation to the scholarly output of the GL. The study is rigorous and all-inclusive based on analytical operations like the research networks, collaboration and visualisation. To the best of the authors' knowledge, this manuscript is original, and no similar works have been found with the research objectives included here.

Details

Library Hi Tech, vol. 42 no. 1
Type: Research Article
ISSN: 0737-8831

Keywords

Open Access
Article
Publication date: 31 July 2023

Daniel Šandor and Marina Bagić Babac

Abstract

Purpose

Sarcasm is a linguistic expression that usually carries the opposite meaning of what is said literally, making it difficult for machines to discover the actual meaning. It is mainly distinguished by the inflection with which it is spoken, with an undercurrent of irony, and is largely dependent on context, which makes it a difficult task for computational analysis. Moreover, sarcasm expresses negative sentiments using positive words, allowing it to easily confuse sentiment analysis models. This paper aims to demonstrate the task of sarcasm detection using machine learning and deep learning approaches.

Design/methodology/approach

For the purpose of sarcasm detection, machine and deep learning models were used on a data set consisting of 1.3 million social media comments, including both sarcastic and non-sarcastic comments. The data set was pre-processed using natural language processing methods, and additional features were extracted and analysed. Several machine learning models, including logistic regression, ridge regression, linear support vector and support vector machines, along with two deep learning models based on bidirectional long short-term memory and one bidirectional encoder representations from transformers (BERT)-based model, were implemented, evaluated and compared.
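
As a minimal baseline in the spirit of the machine learning models listed above (not the authors’ implementation), the sketch below trains a TF-IDF plus logistic regression classifier with scikit-learn; the file name and column names are assumptions.

```python
# Hedged baseline sketch: TF-IDF features + logistic regression for sarcasm detection.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Assumed input: a CSV with a "comment" text column and a binary "sarcastic" label.
data = pd.read_csv("sarcasm_comments.csv")
X_train, X_test, y_train, y_test = train_test_split(
    data["comment"], data["sarcastic"], test_size=0.2, random_state=42)

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=5),  # unigram and bigram features
    LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

print(classification_report(y_test, model.predict(X_test)))
```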

Findings

The performance of machine and deep learning models was compared on the task of sarcasm detection, and possible ways of improvement were discussed. Deep learning models showed more promise, performance-wise, for this type of task. Specifically, a state-of-the-art model in natural language processing, namely a BERT-based model, outperformed the other machine and deep learning models.

Originality/value

This study compared the performance of the various machine and deep learning models in the task of sarcasm detection using the data set of 1.3 million comments from social media.

Details

Information Discovery and Delivery, vol. 52 no. 2
Type: Research Article
ISSN: 2398-6247

Keywords

Article
Publication date: 12 February 2024

Júlia Quintino Sant’Ana, Linda Jessica De Montreuil Carmona and Giancarlo Gomes

Abstract

Purpose

This study aims to answer the following research question: What are the opportunities for future research concerning the Frugal Innovation (FI) phenomenon? To address this, the authors propose a novel approach to reviewing the literature on the topic, with a view to synthesising scholars’ recommendations for subsequent studies. They also advocate that it is time to contribute to the establishment of the FI field by mapping the future of this approach.

Design/methodology/approach

The authors conducted a systematic literature review (SLR) to connect past and future research on FI. After the screening process of the documents extracted from multiple databases, they performed a bibliometric analysis to provide an overview of the field. Furthermore, lexical analysis and descending hierarchical analysis were carried out with the IRAMUTEQ software to identify the clusters of opportunities for future research on FI.
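
IRAMUTEQ’s descending hierarchical analysis is not available as a Python library; purely as a rough analogue of grouping documents by their vocabulary, the sketch below clusters a handful of placeholder abstracts by TF-IDF similarity with agglomerative clustering.

```python
# Rough analogue only (not IRAMUTEQ): clustering abstracts by lexical similarity.
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import TfidfVectorizer

abstracts = [
    "frugal innovation in emerging markets and resource constraints",   # placeholder texts
    "low-cost product development for resource-constrained consumers",
    "sustainability and environmental impact of frugal solutions",
    "sustainable business models and social impact in frugal contexts",
]

X = TfidfVectorizer(stop_words="english").fit_transform(abstracts).toarray()
labels = AgglomerativeClustering(n_clusters=2).fit_predict(X)  # hierarchical grouping
for label, text in sorted(zip(labels, abstracts)):
    print(label, text)
```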

Findings

This research not only demonstrates the current state of the art of FI literature but also identifies a research agenda with six categories of opportunities for further studies on the topic: frugal consumer behaviour; establishment of the field; sustainable impact; approaches to different contexts; implementation processes; and challenges for value creation.

Originality/value

The FI phenomenon is receiving increasing attention from scholars in the management field due to its socioeconomic and managerial implications, especially after the Covid-19 outbreak. Therefore, the findings benefit scholars striving to expand the scope of FI research, as well as entrepreneurs, managers and organisations aiming to enhance their social responsibility to reduce their environmental impact.

Details

International Journal of Innovation Science, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1757-2223

Keywords

Article
Publication date: 5 May 2023

Ying Yu and Jing Ma

Abstract

Purpose

Tender documents, an essential data source for internet-based logistics tendering platforms, incorporate massive amounts of fine-grained data, ranging from information on the tenderee to shipping locations and shipping items. Automated information extraction in this area is, however, under-researched, making the extraction process time- and effort-consuming. For Chinese logistics tender entities in particular, existing named entity recognition (NER) solutions are mostly unsuitable, as the documents involve domain-specific terminologies and possess different semantic features.

Design/methodology/approach

To tackle this problem, a novel lattice long short-term memory (LSTM) model, combining a variant contextual feature representation and a conditional random field (CRF) layer, is proposed in this paper for identifying valuable entities from logistics tender documents. Instead of traditional word embeddings, the proposed model uses the pretrained Bidirectional Encoder Representations from Transformers (BERT) model as input to augment the contextual feature representation. Subsequently, with the Lattice-LSTM model, character and word information is effectively utilized to avoid segmentation errors.
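
The lattice word paths and the CRF layer make the full model more involved; purely as a simplified sketch, the snippet below shows frozen BERT contextual features feeding a plain BiLSTM token tagger in PyTorch (no lattice, no CRF), with an assumed label set for tenderees, locations and items.

```python
# Simplified sketch: frozen BERT embeddings feeding a BiLSTM tagger (no lattice, no CRF).
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

LABELS = ["O", "B-TENDEREE", "I-TENDEREE", "B-LOC", "I-LOC", "B-ITEM", "I-ITEM"]  # assumed tags

class BertBiLSTMTagger(nn.Module):
    def __init__(self, bert_name="bert-base-chinese", hidden=256):
        super().__init__()
        self.bert = AutoModel.from_pretrained(bert_name)
        for p in self.bert.parameters():        # freeze BERT; only the tagger head is trained
            p.requires_grad = False
        self.lstm = nn.LSTM(self.bert.config.hidden_size, hidden,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, len(LABELS))

    def forward(self, input_ids, attention_mask):
        states = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        lstm_out, _ = self.lstm(states)
        return self.classifier(lstm_out)        # per-character emission scores

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
batch = tokenizer(["某物流公司公开招标运输服务"], return_tensors="pt")
logits = BertBiLSTMTagger()(batch["input_ids"], batch["attention_mask"])
print(logits.shape)  # (batch, tokens, number of labels)
```

A CRF layer would normally replace the plain linear head so that label transition constraints are modelled, as the abstract describes.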

Findings

The proposed model is then verified on the Chinese logistics tender named entity corpus. The results suggest that the proposed model outperforms other mainstream NER models on the logistics tender corpus. The proposed model underpins the automatic extraction of logistics tender information, enabling logistics companies to perceive ever-changing market trends and make far-sighted logistics decisions.

Originality/value

(1) A practical model for logistics tender NER is proposed in the manuscript. By employing and fine-tuning BERT for the downstream task with a small amount of data, the experimental results show that the model performs better than other existing models. This is the first study, to the best of the authors' knowledge, to extract named entities from Chinese logistics tender documents. (2) A real logistics tender corpus for practical use is constructed, and a program based on the model for online processing of real logistics tender documents is developed in this work. The authors believe that the model will help logistics companies convert unstructured documents into structured data and further perceive ever-changing market trends to make far-sighted logistics decisions.

Details

Data Technologies and Applications, vol. 58 no. 1
Type: Research Article
ISSN: 2514-9288

Keywords

Article
Publication date: 7 July 2023

Wuyan Liang and Xiaolong Xu

Abstract

Purpose

In the COVID-19 era, sign language (SL) translation has gained attention in online learning, where it evaluates the physical gestures of each student and bridges the communication gap between people with dysphonia and hearing people. The purpose of this paper is to address the alignment between SL sequences and natural language sequences with high translation performance.

Design/methodology/approach

SL can be characterized as joint/bone location information in two-dimensional space over time, forming skeleton sequences. To encode joint, bone and their motion information, we propose a multistream hierarchy network (MHN) along with a vocab prediction network (VPN) and a joint network (JN) with the recurrent neural network transducer. The JN is used to concatenate the sequences encoded by the MHN and VPN and learn their sequence alignments.
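
As an illustration of the joint-network idea only (not the authors’ MHN/VPN architecture), the PyTorch sketch below concatenates an encoder stream and a label-predictor stream over the full time-by-label grid, as a recurrent neural network transducer joint network does; all dimensions are assumptions.

```python
# Illustrative joint network fusing two encoded streams (dimensions are assumed).
import torch
import torch.nn as nn

class JointNetwork(nn.Module):
    """Concatenates skeleton-encoder and label-predictor features, then scores tokens."""
    def __init__(self, enc_dim=512, pred_dim=256, vocab_size=3000):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(enc_dim + pred_dim, 512),
            nn.Tanh(),
            nn.Linear(512, vocab_size))

    def forward(self, enc_feats, pred_feats):
        # enc_feats: (batch, T, enc_dim) from the skeleton encoder (MHN stand-in)
        # pred_feats: (batch, U, pred_dim) from the label predictor (VPN stand-in)
        T, U = enc_feats.size(1), pred_feats.size(1)
        enc = enc_feats.unsqueeze(2).expand(-1, -1, U, -1)    # (batch, T, U, enc_dim)
        pred = pred_feats.unsqueeze(1).expand(-1, T, -1, -1)  # (batch, T, U, pred_dim)
        return self.proj(torch.cat([enc, pred], dim=-1))      # (batch, T, U, vocab)

joint = JointNetwork()
scores = joint(torch.randn(2, 30, 512), torch.randn(2, 8, 256))
print(scores.shape)  # torch.Size([2, 30, 8, 3000])
```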

Findings

We verify the effectiveness of the proposed approach and provide experimental results on three large-scale datasets, which show that translation accuracy is 94.96, 54.52 and 92.88 per cent, respectively, and that inference is 18 and 1.7 times faster than the listen-attend-spell network (LAS) and the visual hierarchy to lexical sequence network (H2SNet), respectively.

Originality/value

In this paper, we propose a novel framework that can fuse multimodal input (i.e. joint, bone and their motion streams) and align the input streams with natural language. Moreover, the provided framework is improved by the different properties of the MHN, VPN and JN. Experimental results on the three datasets demonstrate that our approach outperforms state-of-the-art methods in terms of translation accuracy and speed.

Details

Data Technologies and Applications, vol. 58 no. 2
Type: Research Article
ISSN: 2514-9288

Keywords

Article
Publication date: 3 October 2023

Anna Sokolova, Polina Lobanova and Ilya Kuzminov

Abstract

Purpose

The purpose of the paper is to present an integrated methodology for identifying trends in a particular subject area based on a combination of advanced text mining and expert methods. The authors aim to test it in an area of clinical psychology and psychotherapy in 2010–2019.

Design/methodology/approach

The authors demonstrate a way of applying text mining and the Word2Vec model to identify hot topics (HT) and emerging trends (ET) in clinical psychology and psychotherapy. The analysis of 11.3 million scientific publications in the Microsoft Academic Graph database revealed the most rapidly growing clinical psychology and psychotherapy terms – those with the largest increase in the number of publications, reflecting real or potential trends.
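
A minimal sketch of the two ingredients (not the authors’ pipeline): training a Word2Vec model with gensim to surface terms close to a seed term, and ranking terms by growth in yearly publication counts; the tokenised corpus and the counts below are placeholders.

```python
# Hedged sketch: Word2Vec neighbours plus a simple publication-growth ranking.
from gensim.models import Word2Vec

# Placeholder tokenised abstracts; the real study drew on millions of publications.
sentences = [
    ["cognitive", "behavioural", "therapy", "depression"],
    ["mindfulness", "based", "intervention", "anxiety", "depression"],
    ["transcranial", "stimulation", "treatment", "depression"],
]
w2v = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50)
print(w2v.wv.most_similar("depression", topn=3))  # semantically related terms

# Placeholder yearly publication counts per term; growth flags hot or emerging topics.
counts = {"mindfulness": {2010: 120, 2019: 540}, "psychoanalysis": {2010: 300, 2019: 310}}
growth = {term: c[2019] / c[2010] for term, c in counts.items()}
print(sorted(growth.items(), key=lambda kv: kv[1], reverse=True))
```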

Findings

The proposed approach allows one to identify HT and ET for the six thematic clusters related to mental disorders, symptoms, pharmacology, psychotherapy, treatment techniques and important psychological skills.

Practical implications

The developed methodology allows one to see the broad picture of the most dynamic research areas in the field of clinical psychology and psychotherapy in 2010–2019. For clinicians, who are often overwhelmed by practical work, this map of current research can help identify areas worthy of further attention to improve the effectiveness of their clinical work. The methodology might be applied to identify trends in any other subject area, taking its specificity into account.

Originality/value

The paper demonstrates the value of an advanced text-mining approach for understanding trends in a subject area. To the best of the authors’ knowledge, this is the first time that text mining and the Word2Vec model have been applied to identifying trends in the field of clinical psychology and psychotherapy.

Details

foresight, vol. 26 no. 1
Type: Research Article
ISSN: 1463-6689

Keywords

Article
Publication date: 6 February 2024

Somayeh Tamjid, Fatemeh Nooshinfard, Molouk Sadat Hosseini Beheshti, Nadjla Hariri and Fahimeh Babalhavaeji

Abstract

Purpose

The purpose of this study is to develop a domain-independent, cost-effective, time-saving and semi-automated ontology generation framework that can extract taxonomic concepts from an unstructured text corpus. In the human disease domain, ontologies have been found to be extremely useful for managing the diversity of technical expressions in favour of information retrieval objectives. The boundaries of these domains are expanding so fast that it is essential to continuously develop new ontologies or upgrade available ones.

Design/methodology/approach

This paper proposes a semi-automated approach that extracts entities/relations via text mining of scientific publications. A code named text mining-based ontology (TmbOnt) is generated to assist a user in capturing, processing and establishing ontology elements. This code takes a pile of unstructured text files as input and projects them into high-value entities and relations as output. As a semi-automated approach, a user supervises the process, filters meaningful predecessor/successor phrases and finalizes the desired ontology-taxonomy. To verify the practical capabilities of the scheme, a case study was performed to derive a glaucoma ontology-taxonomy. For this purpose, text files containing 10,000 records were collected from PubMed.
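
To illustrate the kind of predecessor/successor phrase extraction a user might supervise (a generic “X is a Y” pattern sketch, not the TmbOnt code), the snippet below pulls candidate taxonomic pairs from PubMed-style sentences.

```python
# Generic sketch of candidate taxonomic ("is-a") pair extraction; not the TmbOnt code.
import re

ISA_PATTERN = re.compile(
    r"([A-Z][\w\-]*(?:\s+[\w\-]+){0,3})\s+is\s+an?\s+((?:[\w\-]+\s+){0,3}[\w\-]+)")

records = [  # placeholder PubMed-style sentences
    "Open-angle glaucoma is a chronic optic neuropathy with progressive field loss.",
    "Elevated intraocular pressure is a major risk factor for glaucoma.",
]

candidates = []
for text in records:
    for child, parent in ISA_PATTERN.findall(text):
        candidates.append((child.strip(), parent.strip()))  # predecessor/successor pair

# A user would review and filter these pairs before adding them to the taxonomy.
for pair in candidates:
    print(pair)
```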

Findings

The proposed approach processed over 3.8 million tokenized terms from those records and yielded the resulting glaucoma ontology-taxonomy. Compared with well-known disease ontologies, the TmbOnt-derived taxonomy demonstrated a 60%–100% coverage ratio against medical thesauri and ontology-taxonomies such as the Human Disease Ontology, Medical Subject Headings and the National Cancer Institute Thesaurus, with an average of 70% additional terms recommended for ontology development.

Originality/value

According to the literature, the proposed scheme demonstrated novel capability in expanding the ontology-taxonomy structure with a semi-automated text mining approach, aiming for future fully-automated approaches.

Details

The Electronic Library, vol. 42 no. 2
Type: Research Article
ISSN: 0264-0473

Keywords
