Search results

1 – 10 of 17
Article
Publication date: 16 August 2022

Jung Ran Park, Erik Poole and Jiexun Li

Abstract

Purpose

The purpose of this study is to explore linguistic stylometric patterns encompassing lexical, syntactic, structural, sentiment and politeness features that are found in librarians’ responses to user queries.

Design/methodology/approach

A total of 462 online texts/transcripts comprising librarians’ answers to users’ questions, drawn from the Internet Public Library, were examined. A Principal Component Analysis, a data-reduction technique, was conducted on the texts and transcripts. The analysis reveals three principal components that predominate in librarians’ answers: stylometric richness, stylometric brevity and interpersonal support.
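
As an illustration only (not the authors’ code), the minimal Python sketch below runs a three-component PCA over a hypothetical table of stylometric features with scikit-learn; the file name and column layout are assumptions.

```python
# Minimal sketch, assuming a hypothetical features.csv in which each row is
# one librarian answer and each column a stylometric measure (lexical,
# syntactic, structural, sentiment, politeness).
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

features = pd.read_csv("features.csv")        # hypothetical feature table
X = StandardScaler().fit_transform(features)  # standardize before PCA

pca = PCA(n_components=3)                     # the study reports three components
scores = pca.fit_transform(X)
print(pca.explained_variance_ratio_)          # variance captured per component

# Loadings indicate which stylometric features drive each component
loadings = pd.DataFrame(pca.components_.T, index=features.columns,
                        columns=["PC1", "PC2", "PC3"])
print(loadings)
```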

Findings

The results of the study have important implications for digital information services because stylometric features such as lexical richness, structural clarity and interpersonal support may interplay with the degree of complexity of user queries, the (a)synchronous communication mode, the application of information service guidelines and manuals and the overall characteristics and quality of a given digital information service. Such interplay may have a direct impact on user perceptions of, and satisfaction with, the interaction with librarians and the information service received through the computer-mediated communication channel.

Originality/value

To the best of the authors’ knowledge, stylometric features encompassing lexical, syntactic, structural, sentiment and politeness dimensions have not been explored in digital information/reference services using Principal Component Analysis. Thus, there is an emergent need to explore more fully how linguistic stylometric features interplay with the types of user queries, the asynchronous online communication mode, the application of information service guidelines and the quality of a particular digital information service.

Details

Global Knowledge, Memory and Communication, vol. 73 no. 3
Type: Research Article
ISSN: 2514-9342

Book part
Publication date: 28 March 2024

Margarethe Born Steinberger-Elias

Abstract

In times of crisis, such as the Covid-19 global pandemic, journalists who write about biomedical information must write with the strategic aim of being clearly and easily understood by everyone. In this study, we assume that journalistic discourse could benefit from language redundancy to improve clarity and simplicity in the service of science popularization. The concept of language redundancy is discussed theoretically with the support of discourse analysis and information theory. The methodology adopted is a corpus-based qualitative approach. Two corpus samples of Brazilian Portuguese (BP) texts on Covid-19 were collected: one with texts from Pesquisa FAPESP, a monthly digital science magazine aimed at students and researchers and dedicated to disseminating scientific information, and the other with popular-language texts from the news portal G1 (Rede Globo) aimed at unspecified and/or non-specialized readers. The materials were filtered with two descriptors: “vaccine” and “test.” Preliminary analysis of examples from these materials revealed two categories of redundancy: paraphrastic and polysemic. Paraphrastic redundancy is based on concomitant language reformulation of words, sentences, text excerpts or even larger units. Polysemic redundancy does not easily show material evidence, but is based on cognitively predictable semantic association in socio-cultural domains. Both kinds of redundancy contribute, each in its own way, to improving text readability for science popularization in Brazil.
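
As a hedged illustration of the descriptor-filtering step, the Python sketch below selects corpus files containing either descriptor; the directory layout is hypothetical and the Portuguese forms of the chapter’s descriptors are assumed.

```python
# Minimal sketch; corpus directory and file layout are hypothetical, and the
# Portuguese forms of the chapter's descriptors ("vaccine"/"test") are assumed.
from pathlib import Path

DESCRIPTORS = ("vacina", "teste")  # assumed Portuguese search terms

def matching_texts(corpus_dir):
    """Yield paths of corpus files containing any descriptor."""
    for path in Path(corpus_dir).glob("*.txt"):
        if any(term in path.read_text(encoding="utf-8").lower()
               for term in DESCRIPTORS):
            yield path

for path in matching_texts("pesquisa_fapesp"):  # hypothetical directory
    print(path.name)
```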

Details

Geo Spaces of Communication Research
Type: Book
ISBN: 978-1-80071-606-3

Book part
Publication date: 28 March 2024

Julien Figeac, Nathalie Paton, Angelina Peralva, Arthur Coelho Bezerra, Héloïse Prévost, Pierre Ratinaud and Tristan Salord

Abstract

Based on a lexical analysis of publications on 529 Facebook pages, published between 2013 and 2017, this research explores how Brazilian left-wing activist groups participate on Facebook to coordinate their opposition and engage in social struggles. This chapter shows how activist groups set up two main digital network repertoires of action when mobilizing on Facebook. First, in direct connection with major political events, the platform is used as a media arena to challenge governments’ political actions; second, it is employed as a tool to coordinate mobilization, whether these mobilizations are street demonstrations or cultural events such as music concerts. These repertoires of action exemplify ways in which contemporary Brazilian activism is carried out at the intersection of online and offline engagements. While participants engage through these two repertoires, this network of activists is held together over time through a more mundane type of event, pertaining to the repertoire of action that organizes mobilization. Stepping aside from opposition and struggles brought to the streets, the organization of cultural activities, such as concerts and exhibitions, punctuates the everyday exchanges in activists’ communications. Talk about cultural events and their related social agendas structures activist networks on a medium-term basis and creates the conditions for coordinating (future) social movements, in that it offers opportunities to stay in contact, and to take part in occasional gatherings, between more highly visible social protests.

Details

Geo Spaces of Communication Research
Type: Book
ISBN: 978-1-80071-606-3

Article
Publication date: 20 December 2022

Javaid Ahmad Wani and Shabir Ahmad Ganaie

Abstract

Purpose

The current study aims to map the scientific output of grey literature (GL) through bibliometric approaches.

Design/methodology/approach

The source for data extraction is a comprehensive “indexing and abstracting” database, Web of Science (WoS). A lexical title search was applied to obtain the corpus of the study – a total of 4,599 articles were extracted for data analysis and visualisation. The data were then analysed using the analytical tools RStudio and VOSviewer.
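
For orientation, the minimal Python sketch below reproduces the descriptive core of such an analysis, counting publications per year from a hypothetical Web of Science tab-delimited export; the authors worked in RStudio and VOSviewer, so this is an assumption-laden stand-in, with “PY” being the WoS publication-year field tag.

```python
# Minimal sketch; the export file name is hypothetical.
import pandas as pd

records = pd.read_csv("wos_export.txt", sep="\t")     # hypothetical WoS export
per_year = records["PY"].value_counts().sort_index()  # publications per year
print(per_year)

# Share of output in the most productive phase the study reports (2018-2021)
share = per_year.loc[2018:2021].sum() / per_year.sum()
print(f"2018-2021 share: {share:.0%}")
```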

Findings

The findings showed that publication output has grown substantially over the period studied. The most productive phase (2018–2021) accounted for 47% of the articles. The prominent sources were PLOS One and NeuroImage. The highest numbers of papers were contributed by Haddaway and Kumar. The most relevant countries were the USA and the UK.

Practical implications

The study is useful for researchers interested in the GL research domain. It helps readers understand the evolution of GL and provides further research support in this area.

Originality/value

The present study provides a new orientation to the scholarly output on GL. The study is rigorous and comprehensive, based on analytical operations such as research-network analysis, collaboration analysis and visualisation. To the best of the authors' knowledge, this manuscript is original, and no similar works with the research objectives included here have been found.

Details

Library Hi Tech, vol. 42 no. 1
Type: Research Article
ISSN: 0737-8831

Open Access
Article
Publication date: 31 July 2023

Daniel Šandor and Marina Bagić Babac

Abstract

Purpose

Sarcasm is a linguistic expression that usually carries the opposite meaning of what is literally said, making it difficult for machines to discover the actual meaning. It is mainly distinguished by the inflection with which it is spoken, with an undercurrent of irony, and is largely dependent on context, which makes it a difficult task for computational analysis. Moreover, sarcasm expresses negative sentiments using positive words, allowing it to easily confuse sentiment analysis models. This paper aims to demonstrate the task of sarcasm detection using machine learning and deep learning approaches.

Design/methodology/approach

For the purpose of sarcasm detection, machine and deep learning models were used on a data set consisting of 1.3 million social media comments, including both sarcastic and non-sarcastic comments. The data set was pre-processed using natural language processing methods, and additional features were extracted and analysed. Several machine learning models, including logistic regression, ridge regression, linear support vector and support vector machines, along with two deep learning models based on bidirectional long short-term memory and one bidirectional encoder representations from transformers (BERT)-based model, were implemented, evaluated and compared.
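
As a sketch of one baseline from this model line-up (not the paper’s pipeline), the Python example below trains logistic regression over TF-IDF features; the CSV layout is hypothetical and the paper’s feature extraction is richer.

```python
# Minimal sketch; sarcasm_comments.csv with columns "comment" and "label"
# (1 = sarcastic, 0 = not) is a hypothetical stand-in for the 1.3M-comment set.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

data = pd.read_csv("sarcasm_comments.csv")
X_train, X_test, y_train, y_test = train_test_split(
    data["comment"], data["label"], test_size=0.2, random_state=42)

vec = TfidfVectorizer(ngram_range=(1, 2), min_df=5)  # unigrams and bigrams
clf = LogisticRegression(max_iter=1000)
clf.fit(vec.fit_transform(X_train), y_train)

print(classification_report(y_test, clf.predict(vec.transform(X_test))))
```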

Findings

The performance of machine and deep learning models was compared in the task of sarcasm detection, and possible ways of improvement were discussed. Deep learning models showed more promise, performance-wise, for this type of task. Specifically, a state-of-the-art model in natural language processing, namely, a BERT-based model, outperformed the other machine and deep learning models.

Originality/value

This study compared the performance of various machine and deep learning models in the task of sarcasm detection using a data set of 1.3 million comments from social media.

Details

Information Discovery and Delivery, vol. 52 no. 2
Type: Research Article
ISSN: 2398-6247

Article
Publication date: 7 July 2023

Wuyan Liang and Xiaolong Xu

Abstract

Purpose

In the COVID-19 era, sign language (SL) translation has gained attention in online learning, where it evaluates the physical gestures of each student and bridges the communication gap between people with dysphonia and hearing people. The purpose of this paper is to achieve accurate alignment between SL sequences and natural language sequences with high translation performance.

Design/methodology/approach

SL can be characterized as joint/bone location information in two-dimensional space over time, forming skeleton sequences. To encode joint, bone and their motion information, we propose a multistream hierarchy network (MHN) along with a vocab prediction network (VPN) and a joint network (JN) with the recurrent neural network transducer. The JN is used to concatenate the sequences encoded by the MHN and VPN and learn their sequence alignments.
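
A highly simplified PyTorch skeleton of the fusion idea follows; it is not the paper’s MHN/VPN/JN implementation. Three stream encoders are concatenated and projected to vocabulary logits; all sizes, layer choices and names are assumptions, and the recurrent neural network transducer loss is omitted.

```python
# Minimal sketch: encode joint, bone and motion streams separately, then fuse.
import torch
import torch.nn as nn

class StreamEncoder(nn.Module):
    def __init__(self, in_dim, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(in_dim, hidden, batch_first=True)

    def forward(self, x):                  # x: (batch, time, in_dim)
        out, _ = self.rnn(x)
        return out                         # (batch, time, hidden)

class FusionTranslator(nn.Module):
    def __init__(self, in_dim=50, hidden=128, vocab=1000):
        super().__init__()
        self.joint_enc = StreamEncoder(in_dim, hidden)
        self.bone_enc = StreamEncoder(in_dim, hidden)
        self.motion_enc = StreamEncoder(in_dim, hidden)
        self.fuse = nn.Linear(3 * hidden, vocab)   # fuses the three streams

    def forward(self, joints, bones, motion):
        fused = torch.cat([self.joint_enc(joints),
                           self.bone_enc(bones),
                           self.motion_enc(motion)], dim=-1)
        return self.fuse(fused)            # per-frame vocabulary logits

model = FusionTranslator()
logits = model(torch.randn(2, 30, 50), torch.randn(2, 30, 50),
               torch.randn(2, 30, 50))
print(logits.shape)                        # torch.Size([2, 30, 1000])
```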

Findings

We verify the effectiveness of the proposed approach and provide experimental results on three large-scale datasets, which show that translation accuracy is 94.96, 54.52 and 92.88 per cent, respectively, and that inference is 18 and 1.7 times faster than the listen-attend-spell network (LAS) and the visual hierarchy to lexical sequence network (H2SNet), respectively.

Originality/value

In this paper, we propose a novel framework that can fuse multimodal input (i.e. joint, bone and their motion streams) and align the input streams with natural language. Moreover, the framework is improved by the different properties of the MHN, VPN and JN. Experimental results on the three datasets demonstrate that our approach outperforms state-of-the-art methods in terms of translation accuracy and speed.

Details

Data Technologies and Applications, vol. 58 no. 2
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 3 October 2023

Anna Sokolova, Polina Lobanova and Ilya Kuzminov

Abstract

Purpose

The purpose of the paper is to present an integrated methodology for identifying trends in a particular subject area based on a combination of advanced text mining and expert methods. The authors aim to test it in an area of clinical psychology and psychotherapy in 2010–2019.

Design/methodology/approach

The authors demonstrate a way of applying text mining and the Word2Vec model to identify hot topics (HT) and emerging trends (ET) in clinical psychology and psychotherapy. The analysis of 11.3 million scientific publications in the Microsoft Academic Graph database revealed the most rapidly growing clinical psychology and psychotherapy terms – those with the largest increase in the number of publications – reflecting real or potential trends.
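
As a hedged illustration of the Word2Vec step, the Python sketch below trains a model with gensim and expands a seed term into related candidates; the corpus file and seed term are hypothetical, and the expert-review stage is omitted.

```python
# Minimal sketch; abstracts.txt (one pre-tokenized abstract per line) is a
# hypothetical stand-in for the 11.3M Microsoft Academic Graph records.
from gensim.models import Word2Vec

with open("abstracts.txt", encoding="utf-8") as fh:
    sentences = [line.split() for line in fh]

model = Word2Vec(sentences, vector_size=100, window=5, min_count=5, workers=4)

# Expand a seed term into related candidate terms for one thematic cluster
for term, score in model.wv.most_similar("psychotherapy", topn=10):
    print(f"{term}\t{score:.2f}")
```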

Findings

The proposed approach allows one to identify HT and ET for the six thematic clusters related to mental disorders, symptoms, pharmacology, psychotherapy, treatment techniques and important psychological skills.

Practical implications

The developed methodology allows one to see the broad picture of the most dynamic research areas in the field of clinical psychology and psychotherapy in 2010–2019. For clinicians, who are often overwhelmed by practical work, this map of current research can help identify the areas worthy of further attention to improve the effectiveness of their clinical work. The methodology might also be applied to identify trends in any other subject area, taking its specificity into account.

Originality/value

The paper demonstrates the value of the advanced text-mining approach for understanding trends in a subject area. To the best of the authors’ knowledge, for the first time, text-mining and the Word2Vec model have been applied to identifying trends in the field of clinical psychology and psychotherapy.

Details

foresight, vol. 26 no. 1
Type: Research Article
ISSN: 1463-6689

Open Access
Article
Publication date: 21 May 2024

Jonathan David Schöps and Philipp Jaufenthaler

Abstract

Purpose

Large-scale text-based data increasingly poses methodological challenges due to its size, scope and nature, requiring sophisticated methods for managing, visualizing, analyzing and interpreting such data. This paper aims to propose semantic network analysis (SemNA) as one possible solution to these challenges, showcasing its potential for consumer and marketing researchers through three application areas in phygital contexts.
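
For readers unfamiliar with the method, the minimal Python sketch below shows the core construction commonly behind SemNA (not the authors’ pipeline): words become nodes and within-document co-occurrence becomes weighted edges, here on toy documents with networkx.

```python
# Minimal sketch with toy tokenized documents; real SemNA studies add
# preprocessing, filtering and interpretive analysis on top of this.
import itertools
import networkx as nx

docs = [["phygital", "retail", "experience"],
        ["phygital", "brand", "experience"],
        ["brand", "community", "retail"]]

G = nx.Graph()
for tokens in docs:
    for a, b in itertools.combinations(sorted(set(tokens)), 2):
        w = G.get_edge_data(a, b, {"weight": 0})["weight"]
        G.add_edge(a, b, weight=w + 1)    # accumulate co-occurrence counts

# Central terms anchor the interpretation of the semantic network
print(nx.degree_centrality(G))
```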

Design/methodology/approach

This paper outlines three general application areas for SemNA in phygital contexts and presents specific use cases, data collection methodologies, analyses, findings and discussions for each application area.

Findings

The paper uncovers three application areas and use cases where SemNA holds promise for providing valuable insights and driving further adoption of the method: (1) Investigating phygital experiences and consumption phenomena; (2) Exploring phygital consumer and market discourse, trends and practices; and (3) Capturing phygital social constructs.

Research limitations/implications

The limitations section highlights the specific challenges of the qualitative, interpretivist approach to SemNA, along with general methodological constraints.

Practical implications

Practical implications highlight SemNA as a pragmatic tool for managers to analyze and visualize company-/brand-related data, supporting strategic decision-making in physical, digital and phygital spaces.

Originality/value

This paper contributes to the expanding body of computational, tool-based methods by providing an overview of application areas for the qualitative, interpretivist approach to SemNA in consumer and marketing research. It emphasizes the diversity of research contexts and data in which the boundaries between physical and digital spaces have become increasingly blurred, with physical and digital elements closely integrated – a phenomenon known as phygital.

Details

Qualitative Market Research: An International Journal, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1352-2752

Article
Publication date: 6 February 2024

Somayeh Tamjid, Fatemeh Nooshinfard, Molouk Sadat Hosseini Beheshti, Nadjla Hariri and Fahimeh Babalhavaeji

Abstract

Purpose

The purpose of this study is to develop a domain-independent, cost-effective, time-saving and semi-automated ontology generation framework that can extract taxonomic concepts from an unstructured text corpus. In the human disease domain, ontologies have proved extremely useful for managing the diversity of technical expressions in support of information retrieval objectives. The boundaries of these domains are expanding so fast that it is essential to continuously develop new ontologies or upgrade available ones.

Design/methodology/approach

This paper proposes a semi-automated approach that extracts entities/relations via text mining of scientific publications. Code named text mining-based ontology (TmbOnt) is generated to assist a user in capturing, processing and establishing ontology elements. This code takes a collection of unstructured text files as input and projects them into high-value entities and relations as output. As a semi-automated approach, a user supervises the process, filters meaningful predecessor/successor phrases and finalizes the demanded ontology-taxonomy. To verify the practical capabilities of the scheme, a case study was performed to derive a glaucoma ontology-taxonomy. For this purpose, text files containing 10,000 records were collected from PubMed.
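
A minimal Python sketch of pattern-based taxonomic extraction in the spirit of the predecessor/successor filtering described above follows; TmbOnt itself is not public code, so the regex, the “such as” pattern and the toy text are all illustrative.

```python
# Minimal sketch: extract candidate is-a pairs from "X such as Y" phrasing;
# real pipelines use tokenization and noun-phrase chunking instead of a regex.
import re

PATTERN = re.compile(r"((?:[\w-]+\s){0,2}[\w-]+)\s+such as\s+([\w-]+)")

text = ("Chronic eye diseases such as glaucoma appear in many abstracts. "
        "Treatments such as trabeculectomy lower intraocular pressure.")

for broader, narrower in PATTERN.findall(text):
    # Candidate pair for the user to accept or reject (the semi-automated step)
    print(f"{narrower}  -is_a->  {broader}")
```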

Findings

The proposed approach processed over 3.8 million tokenized terms from those records and yielded the resultant glaucoma ontology-taxonomy. The TmbOnt-derived taxonomy demonstrated a 60%–100% coverage ratio against well-known medical thesauruses and ontology taxonomies, such as the Human Disease Ontology, Medical Subject Headings and the National Cancer Institute Thesaurus, with an average of 70% additional terms recommended for ontology development.

Originality/value

According to the literature, the proposed scheme demonstrated novel capability in expanding the ontology-taxonomy structure with a semi-automated text mining approach, aiming for future fully-automated approaches.

Details

The Electronic Library, vol. 42 no. 2
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 22 February 2024

Yuzhuo Wang, Chengzhi Zhang, Min Song, Seongdeok Kim, Youngsoo Ko and Juhee Lee

Abstract

Purpose

In the era of artificial intelligence (AI), algorithms have gained unprecedented importance. Scientific studies have shown that algorithms are frequently mentioned in papers, making mention frequency a classical indicator of their popularity and influence. However, contemporary methods for evaluating influence tend to focus solely on individual algorithms, disregarding the collective impact that results from the interconnectedness of these algorithms and that can reveal their roles and importance within algorithm clusters in a new way. This paper aims to build the co-occurrence network of algorithms in the natural language processing field based on the full-text content of academic papers and to analyze the academic influence of algorithms within the group based on the features of the network.

Design/methodology/approach

We use deep learning models to extract algorithm entities from articles and construct the whole, cumulative and annual co-occurrence networks. We first analyze the characteristics of algorithm networks and then use various centrality metrics to obtain the score and ranking of group influence for each algorithm in the whole domain and each year. Finally, we analyze the influence evolution of different representative algorithms.
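
As a hedged illustration of the centrality step (toy mention lists, not the paper’s extracted entities), the Python sketch below builds a per-paper co-occurrence graph of algorithm mentions and compares three centrality metrics with networkx.

```python
# Minimal sketch; each inner list stands for the algorithms mentioned in
# one paper, a stand-in for the deep-learning-extracted entities.
import itertools
import networkx as nx

papers = [["BERT", "LSTM", "CRF"],
          ["BERT", "Transformer"],
          ["LSTM", "CRF"],
          ["Transformer", "BERT", "LSTM"]]

G = nx.Graph()
for mentions in papers:
    for a, b in itertools.combinations(sorted(set(mentions)), 2):
        w = G.get_edge_data(a, b, {"weight": 0})["weight"]
        G.add_edge(a, b, weight=w + 1)   # accumulate co-mention counts

# Different centralities capture popularity, control and central position
for name, scores in [("degree", nx.degree_centrality(G)),
                     ("betweenness", nx.betweenness_centrality(G)),
                     ("closeness", nx.closeness_centrality(G))]:
    top = max(scores, key=scores.get)
    print(f"{name}: top algorithm = {top}")
```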

Findings

The results indicate that algorithm networks share the characteristics of complex networks, with tight connections between nodes developing over approximately four decades. Algorithms that are classic, high-performing and appear at the junctions of different eras can possess high popularity, control, a central position and balanced influence in the network. As an algorithm's sway within the group gradually diminishes, it typically loses its core position first, followed by a dwindling association with other algorithms.

Originality/value

To the best of the authors’ knowledge, this paper is the first large-scale analysis of algorithm networks. The extensive temporal coverage, spanning over four decades of academic publications, ensures the depth and integrity of the network. Our results serve as a cornerstone for constructing multifaceted networks interlinking algorithms, scholars and tasks, facilitating future exploration of their scientific roles and semantic relations.

Details

Aslib Journal of Information Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2050-3806
