Search results

1 – 10 of 57
Article
Publication date: 1 November 2023

Juan Yang, Zhenkun Li and Xu Du

Abstract

Purpose

Although numerous signal modalities are available for emotion recognition, audio and visual modalities are the most common and predominant forms for human beings to express their emotional states in daily communication. Achieving automatic and accurate audiovisual emotion recognition is therefore important for developing engaging and empathetic human–computer interaction environments. However, two major challenges exist in the field of audiovisual emotion recognition: (1) how to effectively capture representations of each single modality and eliminate redundant features and (2) how to efficiently integrate information from these two modalities to generate discriminative representations.

Design/methodology/approach

A novel key-frame extraction-based attention fusion network (KE-AFN) is proposed for audiovisual emotion recognition. KE-AFN attempts to integrate key-frame extraction with multimodal interaction and fusion to enhance audiovisual representations and reduce redundant computation, filling the research gaps of existing approaches. Specifically, the local maximum–based content analysis is designed to extract key-frames from videos for the purpose of eliminating data redundancy. Two modules, including “Multi-head Attention-based Intra-modality Interaction Module” and “Multi-head Attention-based Cross-modality Interaction Module”, are proposed to mine and capture intra- and cross-modality interactions for further reducing data redundancy and producing more powerful multimodal representations.
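The idea of picking key-frames at local maxima of a content-change signal can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the content measure, so the mean absolute pixel difference, the `window` parameter and the toy frames below are all assumptions made for illustration.

```python
from typing import List, Sequence


def frame_differences(frames: Sequence[Sequence[float]]) -> List[float]:
    """Mean absolute pixel difference between each frame and its predecessor."""
    diffs = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(frames, frames[1:]):
        diffs.append(sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur))
    return diffs


def local_maxima(scores: Sequence[float], window: int = 1) -> List[int]:
    """Indices whose score is strictly greater than every neighbour within `window`."""
    keys = []
    for i, s in enumerate(scores):
        lo, hi = max(0, i - window), min(len(scores), i + window + 1)
        neighbours = [scores[j] for j in range(lo, hi) if j != i]
        if neighbours and all(s > n for n in neighbours):
            keys.append(i)
    return keys


# Toy "video": six 4-pixel grayscale frames (hypothetical data).
frames = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [8, 8, 8, 8],  # abrupt content change
    [8, 8, 8, 9],
    [8, 8, 9, 9],
    [0, 0, 0, 0],  # abrupt content change
]
scores = frame_differences(frames)
key_frames = local_maxima(scores)
print(key_frames)
```

In this toy example only the two frames with abrupt content changes are kept, so the near-duplicate frames between them would not be passed on to later processing.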

Findings

Extensive experiments on two benchmark datasets (i.e. RAVDESS and CMU-MOSEI) demonstrate the effectiveness and rationality of KE-AFN. Specifically, (1) KE-AFN is superior to state-of-the-art baselines for audiovisual emotion recognition. (2) Exploring the supplementary and complementary information of different modalities can provide more emotional clues for better emotion recognition. (3) The proposed key-frame extraction strategy can improve accuracy by more than 2.79 per cent. (4) Both exploring intra- and cross-modality interactions and employing attention-based audiovisual fusion can lead to better prediction performance.

Originality/value

The proposed KE-AFN can support the development of engaging and empathetic human–computer interaction environments.

Article
Publication date: 6 March 2024

Xiaohui Li, Dongfang Fan, Yi Deng, Yu Lei and Owen Omalley

Abstract

Purpose

This study aims to offer a comprehensive exploration of the potential and challenges associated with sensor fusion-based virtual reality (VR) applications in the context of enhanced physical training. The main objective is to identify key advancements in sensor fusion technology, evaluate its application in VR systems and understand its impact on physical training.

Design/methodology/approach

The research begins by providing context to the physical training environment in today’s technology-driven world, followed by an in-depth overview of VR. This overview includes a concise discussion of the advancements in sensor fusion technology and its application in VR systems for physical training. A systematic review of the literature then follows, examining VR’s application in various facets of physical training: from exercise, skill development and technique enhancement to injury prevention, rehabilitation and psychological preparation.

Findings

Sensor fusion-based VR presents tangible advantages in the sphere of physical training, offering immersive experiences that could redefine traditional training methodologies. While the advantages are evident in domains such as exercise optimization, skill acquisition and mental preparation, challenges persist. The current research suggests there is a need for further studies to address these limitations to fully harness VR’s potential in physical training.

Originality/value

The integration of sensor fusion technology with VR in the domain of physical training remains a rapidly evolving field. Highlighting the advancements and challenges, this review makes a significant contribution by addressing gaps in knowledge and offering directions for future research.

Details

Robotic Intelligence and Automation, vol. 44 no. 1
Type: Research Article
ISSN: 2754-6969

Article
Publication date: 7 July 2023

Wuyan Liang and Xiaolong Xu

Abstract

Purpose

In the COVID-19 era, sign language (SL) translation has gained attention in online learning, where it evaluates the physical gestures of each student and bridges the communication gap between people with dysphonia and hearing people. The purpose of this paper is to align SL sequences with natural language sequences while achieving high translation performance.

Design/methodology/approach

SL can be characterized as joint/bone location information in two-dimensional space over time, forming skeleton sequences. To encode joint, bone and their motion information, we propose a multistream hierarchy network (MHN) along with a vocab prediction network (VPN) and a joint network (JN) with the recurrent neural network transducer. The JN is used to concatenate the sequences encoded by the MHN and VPN and learn their sequence alignments.
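The joint, bone and motion streams mentioned above can be derived from a 2D skeleton sequence in a few lines. The sketch below follows a convention common in skeleton-based recognition (a bone is the vector from a parent joint to its child joint, and motion is the frame-to-frame difference); the abstract does not confirm that the paper uses exactly these definitions, and the three-joint chain `BONES` and the coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Frame = List[Point]  # one (x, y) pair per joint

# Hypothetical 3-joint chain: shoulder(0) -> elbow(1) -> wrist(2).
BONES = [(0, 1), (1, 2)]  # (parent, child) joint indices


def bone_stream(seq: List[Frame]) -> List[List[Point]]:
    """Bone vectors per frame: child joint position minus parent joint position."""
    return [[(f[c][0] - f[p][0], f[c][1] - f[p][1]) for p, c in BONES] for f in seq]


def motion_stream(seq: List[List[Point]]) -> List[List[Point]]:
    """Frame-to-frame displacement of each element (joint or bone)."""
    return [
        [(b[0] - a[0], b[1] - a[1]) for a, b in zip(prev, cur)]
        for prev, cur in zip(seq, seq[1:])
    ]


# Two frames of joint coordinates (hypothetical data): the wrist moves.
joints = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)],
]
bones = bone_stream(joints)
joint_motion = motion_stream(joints)
bone_motion = motion_stream(bones)
print(bones, joint_motion, bone_motion)
```

Each of the resulting sequences could then feed a separate encoder branch, which is one plausible reading of "multistream" here.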

Findings

We verify the effectiveness of the proposed approach on three large-scale datasets. Translation accuracy is 94.96, 54.52 and 92.88 per cent, and inference is 18 and 1.7 times faster than the listen-attend-spell network (LAS) and the visual hierarchy to lexical sequence network (H2SNet), respectively.

Originality/value

In this paper, we propose a novel framework that can fuse multimodal input (i.e. joint, bone and their motion stream) and align input streams with natural language. Moreover, the provided framework is improved by the different properties of MHN, VPN and JN. Experimental results on the three datasets demonstrate that our approaches outperform the state-of-the-art methods in terms of translation accuracy and speed.

Details

Data Technologies and Applications, vol. 58 no. 2
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 15 February 2024

Xinyu Liu, Kun Ma, Ke Ji, Zhenxiang Chen and Bo Yang

Abstract

Purpose

Propaganda is a prevalent technique used in social media to intentionally express opinions or actions with the aim of manipulating or deceiving users. Existing methods for propaganda detection primarily focus on capturing language features within its content. However, these methods tend to overlook the information presented within the external news environment from which propaganda news originated and spread. This news environment reflects recent mainstream media opinions and public attention and contains language characteristics of non-propaganda news. Therefore, the authors have proposed a graph-based multi-information integration network with an external news environment (abbreviated as G-MINE) for propaganda detection.

Design/methodology/approach

G-MINE is proposed to comprise four parts: textual information extraction module, external news environment perception module, multi-information integration module and classifier. Specifically, the external news environment perception module and multi-information integration module extract and integrate the popularity and novelty into the textual information and capture the high-order complementary information between them.

Findings

G-MINE achieves state-of-the-art performance on the TSHP-17, Qprop and PTC data sets, with accuracies of 98.24%, 90.59% and 97.44%, respectively.

Originality/value

An external news environment perception module is proposed to capture the popularity and novelty information, and a multi-information integration module is proposed to effectively fuse them with the textual information.

Details

International Journal of Web Information Systems, vol. 20 no. 2
Type: Research Article
ISSN: 1744-0084

Open Access
Article
Publication date: 4 April 2024

Yanmin Zhou, Zheng Yan, Ye Yang, Zhipeng Wang, Ping Lu, Philip F. Yuan and Bin He

Abstract

Purpose

Vision, audition, olfaction, touch and taste are five important senses that humans use to interact with the real world. As robots face increasingly complex environments, a sensing system with various types of sensors is essential for intelligent robots. To mimic human-like abilities, sensors similar to human perception capabilities are indispensable. However, most research has concentrated only on analyzing the literature on single-modal sensors and their robotics applications.

Design/methodology/approach

This study presents a systematic review of five bioinspired senses, including a brief introduction to multimodal sensing applications and a forecast of current trends and future directions of the field, which may offer lasting insights.

Findings

This review shows that bioinspired sensors can enable robots to better understand the environment, and multiple sensor combinations can support the robot’s ability to behave intelligently.

Originality/value

The review starts with a brief survey of the biological sensing mechanisms of the five senses, followed by their bioinspired electronic counterparts. Their applications in robots are then reviewed as another emphasis, covering the main application scopes of localization and navigation, object identification, dexterous manipulation, compliant interaction and so on. Finally, the trends, difficulties and challenges of this research are discussed to help guide future research on intelligent robot sensors.

Details

Robotic Intelligence and Automation, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2754-6969

Article
Publication date: 8 March 2024

Sarah Jerasa and Sarah K. Burriss

Abstract

Purpose

Artificial intelligence (AI) has become increasingly important and influential in reading and writing. The influx of social media digital spaces, like TikTok, has also shifted the ways multimodal composition takes place alongside AI. This study aims to argue that within spaces like TikTok, human composers must attend to the ways they write for, with and against the AI-powered algorithm.

Design/methodology/approach

Data collection was drawn from a larger study on #BookTok (the TikTok subcommunity for readers) that included semi-structured interviews including watching and reflecting on a TikTok they created. The authors grounded this study in critical posthumanist literacies to analyze and open code five #BookTok content creators’ interview transcripts. Using axial coding, authors collaboratively determined three overarching and entangled themes: writing for, with and against.

Findings

Findings highlight the nuanced ways #BookTokers consider the AI algorithm in their compositional choices, namely, in how they want to disseminate their videos to a larger audience or a more niche-focused community. Throughout the interviews, participants revealed how the AI algorithm was situated differently as audience member, co-author and censor.

Originality/value

This study is grounded in critical posthumanist literacies and explores composition as a joint accomplishment between humans and machines. The authors argued that it is necessary to expand our human-centered notions of what it means to write for an audience, to co-author and to resist censorship or gatekeeping.

Details

English Teaching: Practice & Critique, vol. 23 no. 1
Type: Research Article
ISSN: 1175-8708

Article
Publication date: 15 April 2024

Xiaona Wang, Jiahao Chen and Hong Qiao

Abstract

Purpose

Limited by the types of sensors, the state information available for musculoskeletal robots with highly redundant, nonlinear muscles is often incomplete, which creates a bottleneck for control. The aim of this paper is to design a method to improve the motion performance of musculoskeletal robots in partially observable scenarios, and to leverage ontology knowledge to enhance the algorithm’s adaptability to musculoskeletal robots that have undergone changes.

Design/methodology/approach

A memory and attention-based reinforcement learning method is proposed for musculoskeletal robots with prior knowledge of muscle synergies. First, to deal with partially observed states available to musculoskeletal robots, a memory and attention-based network architecture is proposed for inferring more sufficient and intrinsic states. Second, inspired by muscle synergy hypothesis in neuroscience, prior knowledge of a musculoskeletal robot’s muscle synergies is embedded in network structure and reward shaping.

Findings

Based on systematic validation, it is found that the proposed method demonstrates superiority over the traditional twin delayed deep deterministic policy gradients (TD3) algorithm. A musculoskeletal robot with highly redundant, nonlinear muscles is adopted to implement goal-directed tasks. In the case of 21-dimensional states, the learning efficiency and accuracy are significantly improved compared with the traditional TD3 algorithm; in the case of 13-dimensional states without velocities and information from the end effector, the traditional TD3 is unable to complete the reaching tasks, while the proposed method breaks through this bottleneck problem.

Originality/value

In this paper, a novel memory and attention-based reinforcement learning method with prior knowledge of muscle synergies is proposed for musculoskeletal robots to deal with partially observable scenarios. Compared with the existing methods, the proposed method effectively improves the performance. Furthermore, this paper promotes the fusion of neuroscience and robotics.

Details

Robotic Intelligence and Automation, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2754-6969

Article
Publication date: 2 April 2024

Paulo Alberto Sampaio Santos, Breno Cortez and Michele Tereza Marques Carvalho

Abstract

Purpose

The present study aimed to integrate Geographic Information Systems (GIS) and Building Information Modeling (BIM) in conjunction with multicriteria decision-making (MCDM) to enhance infrastructure investment planning.

Design/methodology/approach

This analysis combines GIS databases with BIM simulations for a novel highway project. Around 150 potential alternatives were simulated and narrowed to 25 more effective routes, of which 3 options underwent in-depth analysis using the PROMETHEE method for decision-making, based on environmental, cost and safety criteria, allowing for comprehensive cross-perspective comparisons.
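To make the ranking step concrete, the sketch below computes PROMETHEE II net outranking flows for three alternatives. The abstract does not say which PROMETHEE variant, preference functions or weights were used, so the "usual" (strict) preference function, the weights and all route scores here are assumptions chosen purely for illustration.

```python
from typing import Dict, List

# Hypothetical scores for three routes; not data from the study.
# Criteria order: environmental impact, cost, safety.
routes: Dict[str, List[float]] = {
    "A": [3.0, 120.0, 7.0],
    "B": [5.0, 100.0, 8.0],
    "C": [4.0, 150.0, 6.0],
}
weights = [0.4, 0.3, 0.3]        # assumed criterion weights, summing to 1
maximize = [False, False, True]  # impact and cost are minimized, safety maximized


def preference(a: List[float], b: List[float]) -> float:
    """Total weight of criteria on which a strictly beats b ("usual" criterion)."""
    total = 0.0
    for w, x, y, is_max in zip(weights, a, b, maximize):
        d = (x - y) if is_max else (y - x)
        if d > 0:
            total += w
    return total


def net_flows(alts: Dict[str, List[float]]) -> Dict[str, float]:
    """Net outranking flow: positive (leaving) flow minus negative (entering) flow."""
    n = len(alts)
    flows = {}
    for name, a in alts.items():
        others = [b for other, b in alts.items() if other != name]
        plus = sum(preference(a, b) for b in others) / (n - 1)
        minus = sum(preference(b, a) for b in others) / (n - 1)
        flows[name] = plus - minus
    return flows


flows = net_flows(routes)
ranking = sorted(flows, key=flows.get, reverse=True)
print(ranking, flows)
```

With these made-up inputs, route A ranks first because it wins on the most heavily weighted criterion against both rivals. Changing the weights changes the ranking, which is the kind of customizable parameter such a framework exposes.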

Findings

The proposed comprehensive framework was validated through a case study, demonstrating its adaptability with customizable parameters. It aids decision-making, cost estimation, environmental impact analysis and outcome prediction. Considering these critical factors, this study holds the potential to advance new techniques for assessing and planning railways, power lines, gas and water.

Research limitations/implications

The study acknowledges limitations in GIS data quality, particularly in underdeveloped areas or regions with limited technology access. It also overlooks other pertinent variables, like social, economic, political and cultural issues. Thus, conclusions from these simulations may not entirely represent reality or diverse potential scenarios.

Practical implications

The proposed method automates decision-making, reducing subjectivity, aids in selecting effective alternatives and considers environmental criteria to mitigate negative impacts. Additionally, it minimizes costs and risks while demonstrating adaptability for assessing diverse infrastructures.

Originality/value

By integrating GIS and BIM data to support an MCDM workflow, this study proposes to fill the existing research gap in decision-making prioritization and mitigate subjective biases.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 28 February 2024

Victoria Pennington, Emily Howell, Rebecca Kaminski, Nicole Ferguson-Sams, Mihaela Gazioglu, Kavita Mittapalli, Amlan Banerjee and Mikel Cole

Abstract

Purpose

Computer-assisted language learning (CALL) can create participatory cultures by removing barriers to access materials, encouraging student modes of expression, differentiating student interactions through digital environments and increasing learner autonomy. Participatory cultures require competencies or new media literacy (NML) skills to be successful in a digital world. However, professional development (PD) often lacks training on CALL and its implementation to develop such skills. The purpose of this study is to describe teachers’ use of digital tools for multilingual learners through a relevant theoretical perspective.

Design/methodology/approach

This design-based research study examines 30 in-service teachers in South Carolina, a destination state for Latinx immigrants, drawing on data collected over three semesters of PD: interviews and instructional logs. The researchers address the question: How are teachers using digital tools to advance NML for multilingual learners (MLs)?

Findings

The authors analyzed current elementary teachers’ use of digital tools for language learning and NML purposes. Three themes are discussed: NMLs and digital literacy boundaries, digital tools for MLs and literacy teaching for MLs and NML skills.

Originality/value

Teacher PD often needs more specificity regarding the intersection of MLs and digital literacy. The authors contribute to the literature on needed elementary teaching practices for MLs, the integration of NML and how these practices may be addressed through PD.

Details

Journal for Multicultural Education, vol. 18 no. 1/2
Type: Research Article
ISSN: 2053-535X

Article
Publication date: 27 March 2024

Jyoti Mudkanna Gavhane and Reena Pagare

Abstract

Purpose

The purpose of this study was to analyze the importance of artificial intelligence (AI) in education and its emphasis on assessment and adversity quotient (AQ).

Design/methodology/approach

The study utilizes a systematic literature review of over 141 journal papers and psychometric tests to evaluate AQ. Thematic analysis of quantitative and qualitative studies explores domains of AI in education.

Findings

Results suggest that assessing the AQ of students with the help of AI techniques is necessary. Education is a vital tool to develop and improve natural intelligence, and this survey presents the discourse use of AI techniques and behavioral strategies in the education sector of the recent era. The study proposes a conceptual framework of AQ with the help of assessment style for higher education undergraduates.

Originality/value

Research on AQ evaluation in the Indian context is still emerging, presenting a potential avenue for future research. Investigating the relationship between AQ and academic performance among Indian students is a crucial area of research. This can provide insights into the role of AQ in academic motivation, persistence and success in different academic disciplines and levels of education. AQ evaluation offers valuable insights into how individuals deal with and overcome challenges. The findings of this study have implications for higher education institutions to prepare for future challenges and better equip students with necessary skills for success. The reviewed papers on AI for education open research opportunities in the fields of psychometrics, educational assessment and the evaluation of AQ.

Details

Education + Training, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0040-0912
