Search results

1 – 10 of 549
Article
Publication date: 1 November 2023

Juan Yang, Zhenkun Li and Xu Du

Abstract

Purpose

Although numerous signal modalities are available for emotion recognition, audio and visual modalities are the most common and predominant forms for human beings to express their emotional states in daily communication. Therefore, achieving automatic and accurate audiovisual emotion recognition is important for developing engaging and empathetic human–computer interaction environments. However, two major challenges exist in the field of audiovisual emotion recognition: (1) how to effectively capture representations of each single modality and eliminate redundant features and (2) how to efficiently integrate information from these two modalities to generate discriminative representations.

Design/methodology/approach

A novel key-frame extraction-based attention fusion network (KE-AFN) is proposed for audiovisual emotion recognition. KE-AFN attempts to integrate key-frame extraction with multimodal interaction and fusion to enhance audiovisual representations and reduce redundant computation, filling research gaps left by existing approaches. Specifically, a local maximum–based content analysis is designed to extract key-frames from videos in order to eliminate data redundancy. Two modules, the “Multi-head Attention-based Intra-modality Interaction Module” and the “Multi-head Attention-based Cross-modality Interaction Module”, are proposed to mine and capture intra- and cross-modality interactions, further reducing data redundancy and producing more powerful multimodal representations.
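The abstract does not specify how the local maximum–based content analysis scores frames, so the following is only a minimal sketch of the general idea under stated assumptions: frames are hypothetical flattened grayscale pixel lists, the content score is the mean absolute difference between consecutive frames, and a key-frame is any frame whose score is a strict local maximum. The function names are illustrative and are not taken from KE-AFN.

```python
def frame_difference_scores(frames):
    """Mean absolute pixel difference between each pair of consecutive frames.

    `frames` is a list of equal-length lists of pixel intensities
    (an assumed representation, not the one used in the paper).
    scores[i] measures the change from frame i to frame i + 1.
    """
    scores = []
    for prev, curr in zip(frames, frames[1:]):
        diff = sum(abs(a - b) for a, b in zip(prev, curr)) / len(curr)
        scores.append(diff)
    return scores


def extract_keyframes(frames):
    """Return indices of frames whose change score is a strict local maximum."""
    scores = frame_difference_scores(frames)
    keyframes = []
    for i in range(1, len(scores) - 1):
        if scores[i] > scores[i - 1] and scores[i] > scores[i + 1]:
            # scores[i] describes the transition into frame i + 1
            keyframes.append(i + 1)
    return keyframes


if __name__ == "__main__":
    # Hypothetical 2-pixel "video" with two abrupt content changes.
    video = [[0, 0], [0, 0], [10, 10], [10, 10], [10, 10], [0, 0], [0, 0]]
    print(extract_keyframes(video))
```

Selecting only local maxima keeps the frames where content changes most sharply and discards the near-identical frames in between, which is one plausible way to reduce the redundancy the abstract refers to.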

Findings

Extensive experiments on two benchmark datasets (i.e. RAVDESS and CMU-MOSEI) demonstrate the effectiveness and rationality of KE-AFN. Specifically, (1) KE-AFN is superior to state-of-the-art baselines for audiovisual emotion recognition. (2) Exploring the supplementary and complementary information of different modalities can provide more emotional clues for better emotion recognition. (3) The proposed key-frame extraction strategy can enhance the performance by more than 2.79 per cent on accuracy. (4) Both exploring intra- and cross-modality interactions and employing attention-based audiovisual fusion can lead to better prediction performance.

Originality/value

The proposed KE-AFN can support the development of engaging and empathetic human–computer interaction environments.

Article
Publication date: 14 November 2023

Brajesh Mishra and Avanish Kumar

Abstract

Purpose

Globally, governance has shifted from a positivist to a regulatory-centric approach, necessitating accurate contouring of the regulatory governance framework. The study proposes a novel approach to unravel the regulatory governance framework in the context of the Indian electronics industry – extendable to other sectors in India and other emerging economies.

Design/methodology/approach

The research objective has been operationalized through document analysis and thematic analysis of semi-structured interview transcripts in three steps: (1) arrive at parameters of the regulatory governance framework, (2) identify instruments against each parameter and (3) characterize parameters in terms of dominant instruments and their underlying modalities. The authors have adopted a set of 6 Cs modalities (control, communications, competition, consensus, code and collaboration) and regulatory space theory to analyze the existing mix of modalities in the dominant instruments.

Findings

In summary, the study has (1) identified eight macro and twenty micro regulatory governance parameters, (2) mapped regulatory governance parameters with instruments and institutions and (3) revealed the top two dominant modalities for each regulatory governance parameter.

Practical implications

The existing modality characteristics of regulatory governance parameters can be used by manufacturers, investors and other stakeholders to make a realistic assessment of regulatory governance and reduce regulatory risk and regulatory burden.

Originality/value

The multidimensional use of parameters, instruments and modalities broadens the understanding of the existing regulatory governance framework and may assist the regulators in optimizing it to meet market requirements.

Details

International Journal of Emerging Markets, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1746-8809

Article
Publication date: 27 April 2023

Elizabeth A. Cudney, Somer Anderson, Robbie Beane, Sandra Furterer, Lakshmy Mohandas and Chad Laux

Abstract

Purpose

Teaching effectiveness is essential to student learning, engagement and success. This study aims to identify the perceived teaching effectiveness attributes from the student’s perspective through a pilot study.

Design/methodology/approach

A comprehensive literature review identified 6 demographic and 25 teaching effectiveness characteristics. The Kano model was used to gather and analyze the student’s voices. The research validated the survey instrument using Cronbach’s alpha to ensure internal consistency and Chi-square goodness of fit to test the data distribution. Differences in response patterns were analyzed using Fisher’s exact test. Furthermore, the magnitude of the effect between the teaching effectiveness attributes was determined using Cramer’s V test.
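Two of the statistics named above have simple closed forms, sketched here with the standard textbook definitions: Cronbach's alpha for internal consistency and Cramér's V as an effect size derived from the chi-square statistic of a contingency table. The function names and input layouts are illustrative assumptions, and nothing here reproduces the study's data or its exact analysis pipeline.

```python
import math
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts.

    V = sqrt(chi2 / (n * (min(rows, cols) - 1)))
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (min(len(table), len(table[0])) - 1)))


if __name__ == "__main__":
    # Hypothetical survey responses (3 respondents, 2 items) and a 2x2 table.
    print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))
    print(cramers_v([[10, 0], [0, 10]]))
```

Alpha approaches 1 when items move together across respondents, and V ranges from 0 (no association) to 1 (perfect association), which is why it can serve as a magnitude measure alongside a significance test such as Fisher's exact test.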

Findings

This study determined that students perceived 19 attributes as one-dimensional, 3 as indifferent, 2 as attractive and 1 as one-dimensional and attractive. The analysis found differences in response patterns concerning readings and materials, grading rubrics to set assignment expectations and group/teamwork on projects.

Research limitations/implications

As a pilot study, the sample size was small. Additional research should validate the survey using a larger sample. While the study results are specific to the college surveyed, other educators can use the methodology to identify the attributes important to their students.

Practical implications

Categorizing attributes based on the student’s voice enables instructors to focus on attributes that will improve the learning experience.

Originality/value

This research provides a comprehensive methodology for identifying critical teaching effectiveness attributes from the student’s perspective.

Details

Quality Assurance in Education, vol. 31 no. 3
Type: Research Article
ISSN: 0968-4883

Article
Publication date: 19 January 2023

Jinha Lee and Heejin Lim

Abstract

Purpose

This study aims to investigate the effects of two visual design principles, repetition and compositional lines, in a food image on purchase intention in the context of a mobile food delivery app and test the effect of crossmodal correspondences between vision and taste as a processing mechanism.

Design/methodology/approach

In this study, two experiments were conducted using burgers and iced tea as stimuli.

Findings

The results demonstrate that repetition of an identical food product increases visual appeal for both burgers and iced tea. However, the optimal level of repetition was different between the two products. The findings show that different compositional lines generate different levels of visual appeal and the effects of compositional lines vary between burgers and iced tea. The results also validate the serial mediation effects of vision and taste between design principles and purchase intention.

Originality/value

The findings of this study add substantially to the understanding of visual information processing in food retailing by demonstrating how design principles such as repetition and compositional lines facilitate crossmodal responses between vision and taste and influence purchase decisions in a mobile platform. Also this study provides guidance as to how food retailers use design principles (e.g. repetition and compositional lines) for different products effectively when the food retailers develop visual digital content for a mobile app.

Details

International Journal of Retail & Distribution Management, vol. 51 no. 8
Type: Research Article
ISSN: 0959-0552

Article
Publication date: 13 November 2023

Sheuli Paul

Abstract

Purpose

This paper presents a survey of research into interactive robotic systems for the purpose of identifying the state of the art capabilities as well as the extant gaps in this emerging field. Communication is multimodal. Multimodality is a representation of many modes chosen from rhetorical aspects for its communication potentials. The author seeks to define the available automation capabilities in communication using multimodalities that will support a proposed Interactive Robot System (IRS) as an AI mounted robotic platform to advance the speed and quality of military operational and tactical decision making.

Design/methodology/approach

This review will begin by presenting key developments in the robotic interaction field with the objective of identifying essential technological developments that set conditions for robotic platforms to function autonomously. After surveying the key aspects in Human Robot Interaction (HRI), Unmanned Autonomous System (UAS), visualization, Virtual Environment (VE) and prediction, the paper then proceeds to describe the gaps in the application areas that will require extension and integration to enable the prototyping of the IRS. A brief examination of other work in HRI-related fields concludes with a recapitulation of the IRS challenge that will set conditions for future success.

Findings

Using insights from a balanced cross section of sources from the government, academic and commercial entities that contribute to HRI, a multimodal IRS in military communication is introduced. A multimodal IRS (MIRS) in military communication has yet to be deployed.

Research limitations/implications

A multimodal robotic interface for the MIRS is an interdisciplinary endeavour. It is not realistic to expect any one person to command all the expert knowledge and skills needed to design and develop such a multimodal interactive robotic interface. In this brief preliminary survey, the author has discussed extant AI, robotics, NLP, CV, VDM and VE applications that are directly related to multimodal interaction. Each mode of this multimodal communication is an active research area. Multimodal human/military robot communication is the ultimate goal of this research.

Practical implications

A multimodal autonomous robot in military communication using speech, images, gestures, VST and VE has yet to be deployed. Autonomous multimodal communication is expected to open wider possibilities for all armed forces. Given the density of the land domain, the army is in a position to exploit the opportunities for human–machine teaming (HMT) exposure. Naval and air forces will adopt platform-specific suites for specially selected operators to integrate with and leverage this emerging technology. Possessing a flexible means of communication that readily adapts to virtual training will enhance planning and mission rehearsals tremendously.

Social implications

A multimodal communication system based on interaction, perception, cognition and visualization is still missing. Options to communicate, express and convey information in an HMT setting, with multiple options, suggestions and recommendations, will certainly enhance military communication, strength, engagement, security, cognition and perception, as well as the ability to act confidently for a successful mission.

Originality/value

The objective is to develop a multimodal autonomous interactive robot for military communications. This survey reports the state of the art, what exists and what is missing, what can be done and the possibilities of extension that support the military in maintaining effective communication using multimodalities. Progress is ongoing in separate areas, such as machine-enabled speech, image recognition, tracking, visualizations for situational awareness and virtual environments. At this time, there is no integrated approach to multimodal human–robot interaction that offers flexible and agile communication. The report briefly introduces the research proposal on a multimodal interactive robot in military communication.

Article
Publication date: 12 September 2023

Wei Shi, Jing Zhang and Shaoyi He

Abstract

Purpose

With the rapid development of short videos in China, the public has become accustomed to using short videos to express their opinions. This paper aims to solve problems such as how to represent the features of different modalities and achieve effective cross-modal feature fusion when analyzing the multi-modal sentiment of Chinese short videos (CSVs).

Design/methodology/approach

This paper proposes a sentiment analysis model, MSCNN-CPL-CAFF, which uses a multi-scale convolutional neural network and a cross attention fusion mechanism to analyze CSVs. Audio-visual and textual data of CSVs themed on “COVID-19, catering industry” are first collected from the CSV platform Douyin, and a comparative analysis is then conducted against advanced baseline models.

Findings

Samples with weak negative and neutral sentiment are the most numerous, while samples with positive and weak positive sentiment are relatively few, accounting for only about 11% of the total samples. The MSCNN-CPL-CAFF model achieves Acc-2, Acc-3 and F1 scores of 85.01%, 74.16% and 84.84%, respectively, outperforming the best baseline methods in accuracy while achieving competitive computation speed.

Practical implications

This research offers some implications regarding the impact of COVID-19 on the catering industry in China by focusing on the multi-modal sentiment of CSVs. The methodology can be used to analyze the opinions of the general public on social media platforms and to categorize them accordingly.

Originality/value

This paper presents a novel deep-learning multimodal sentiment analysis model, which provides a new perspective for public opinion research on the short video platform.

Details

Kybernetes, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 26 May 2023

Tseng-Lung Huang, Henry F.L. Chung and Xiang Chen

Abstract

Purpose

The purpose of this study is to clarify the role of various levels of modality richness [text-visual, audiovisual and augmented reality interactive technology (ARIT)] on vivid memories (visual sensory detailed, emotionally intense, first-person perspective and coherent) and exploratory behavior. It also seeks to clarify which level of modality richness is more appropriate for online retailers to use in creating a virtual reality simulation experience, filling a significant gap in the sensory interactive marketing paradigm.

Design/methodology/approach

A task-based laboratory study was conducted to provide users with private try-on space. A total of 429 valid questionnaires were collected, and partial least squares path modeling was adopted to test hypotheses.

Findings

The results indicate that various levels of modality richness (text-visual, audiovisual and ARIT) positively affect vivid memories (visual sensory detailed, emotionally intense, first-person perspective and coherent), and vivid memories successfully induce exploratory behavior.

Practical implications

The study results could also provide retailers and brands with clear guidance in designing and creating simulation experience services and choosing the best way to present products. With the results of this research, retailers will also be better able to grasp the critical points of introducing innovative technology into the service experience and then create the benefits of digital economic growth.

Originality/value

This study explores which digital interactive technology is more appropriate for online retailers to use in creating a virtual reality shopping experience, filling a significant gap in the sensory interactive marketing paradigm. Exploring the antecedents of vivid memories in a digital sensory interactive experience contributes to the body schema literature and the script theory. We draw from construal level theory (CLT) to clarify how various levels of modality richness drive differences in sensory simulation schema, going beyond the limited findings of previous studies, which used CLT to interpret psychological distance.

Details

Journal of Research in Interactive Marketing, vol. 17 no. 6
Type: Research Article
ISSN: 2040-7122

Article
Publication date: 9 January 2024

Patrick Clements and Aidan Turkington

Abstract

Purpose

This study aims to explore medical students’ attitudes to electroconvulsive therapy (ECT). The authors sought to determine correlates of baseline attitudes to ECT and whether specific forms of ECT teaching improved attitudes to ECT during students’ psychiatry placement.

Design/methodology/approach

At the beginning of their placement, fourth-year medical students completed a questionnaire capturing background information and baseline attitudes. A second questionnaire, in the second half of the placement, recorded educational and clinical experience gained on ECT during placement, in addition to attitudes at this timepoint. The authors measured attitude using a five-point Likert scale and defined a positive shift in attitude as an improvement of ≥ 1 point between the two time points.
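The shift criterion defined above can be written as a short calculation. This is a minimal sketch that assumes higher Likert scores indicate a more positive attitude and that each student's two responses are paired by position; the function name and the sample scores in the usage section are hypothetical and are not the study's data.

```python
def positive_shift_rate(baseline, follow_up):
    """Fraction of students whose attitude score improved by at least 1 point.

    `baseline` and `follow_up` are equal-length lists of five-point Likert
    scores (1 to 5), paired by student. Assumes higher means more positive.
    """
    shifted = [after - before >= 1 for before, after in zip(baseline, follow_up)]
    return sum(shifted) / len(shifted)


if __name__ == "__main__":
    # Hypothetical paired scores for four students.
    timepoint_1 = [3, 4, 2, 5]
    timepoint_2 = [4, 4, 4, 5]
    print(positive_shift_rate(timepoint_1, timepoint_2))
```

Note that under this definition a student who already scored 5 at baseline cannot register a positive shift, which is worth keeping in mind when comparing shift rates across groups with different baseline attitudes.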

Findings

At Timepoint 1, 66% reported a positive attitude to ECT. This was associated with having attended a lecture and with having read a professional article on ECT at some time before the psychiatry placement. Attitudes significantly improved during the placement (66% vs 95% positive). Students who attended a lecture on ECT were more likely to have a positive shift in attitude, as were students who experienced three or more teaching modalities.

Practical implications

Personal, social and medical problems arise from treatment-resistant psychiatric disorders. ECT is a safe and effective treatment for such disorders.

Originality/value

It is hoped that this study will contribute to the development of medical education, so that lectures on ECT, and three or more teaching modalities, are incorporated into the undergraduate medical curriculum.

Details

The Journal of Mental Health Training, Education and Practice, vol. 19 no. 1
Type: Research Article
ISSN: 1755-6228

Book part
Publication date: 12 December 2023

Floris de Krijger

Abstract

A growing body of research finds that gig economy platforms use gamification to enhance managerial control. Focusing on technologically mediated forms of gamification, this literature reveals how platforms mobilize gig workers’ work effort by making the labour process resemble a game. This chapter contends that this tech-centric scholarship fails to fully capture the historical continuities between contemporary and much older occurrences of game-playing at work. Informed by interviews and participatory observations at two food delivery platforms in Amsterdam, I document how these platforms’ piece wage system gives rise to a workplace dynamic in which severely underpaid delivery couriers continuously employ game strategies to maximize their gig income. Reminiscent of observations from the early shop floor ethnographies of the manufacturing industry, I show that the game of gig income maximization operates as an indirect modality of control by (re)aligning the interests of couriers with the interests of capital and by individualizing and depoliticizing couriers’ overall low wage level. I argue that the new, algorithmic technologies expand and intensify the much older forms of gamified control by infusing the organizational activities of shift and task allocation with the logic of the piece wage game and by increasing the possibilities for interaction, direct feedback and immersion. My study contributes to the literature on gamification in the gig economy by interweaving it with the classic observations derived from the manufacturing industry and by developing a conceptualization of gamification in which both capital and labour exercise agency.

Details

Ethnographies of Work
Type: Book
ISBN: 978-1-83753-949-9

Article
Publication date: 21 August 2023

Zengxin Kang, Jing Cui and Zhongyi Chu

Abstract

Purpose

Accurate segmentation of artificial assembly action is the basis of autonomous industrial assembly robots. This paper aims to study the precise segmentation method of manual assembly action.

Design/methodology/approach

In this paper, a temporal-spatial-contact features segmentation system (TSCFSS) for recognizing and segmenting manual assembly actions is proposed. The system consists of three stages: spatial feature extraction, contact force feature extraction and action segmentation in the temporal dimension. In the spatial feature extraction stage, a vectors assembly graph (VAG) is proposed to precisely describe the motion state of the objects and the relative position between objects in an RGB-D video frame. Graph networks are then used to extract the spatial features from the VAG. In the contact feature extraction stage, a sliding window is used to cut the contact force features between hands and tools/parts corresponding to the video frame. Finally, in the action segmentation stage, the spatial and contact features are concatenated as the input of temporal convolution networks for action recognition and segmentation. The experiments were conducted on a new manual assembly data set containing RGB-D video and contact force.
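The sliding-window and concatenation steps can be illustrated with a small sketch. The abstract does not state the window size, the force sampling rate or which statistics are taken per window, so this example assumes one force sample per video frame, a symmetric window and mean and maximum force as the window summary. All names are hypothetical, and the graph network and temporal convolution stages are not shown.

```python
def window_contact_features(force, centre, half_width):
    """Summarise the force samples in a window centred on one frame.

    Returns [mean, max] of the window; the window is clipped at the
    sequence boundaries. The choice of statistics is an assumption.
    """
    start = max(0, centre - half_width)
    end = min(len(force), centre + half_width + 1)
    window = force[start:end]
    return [sum(window) / len(window), max(window)]


def fuse_features(spatial, force, half_width=1):
    """Concatenate each frame's spatial feature vector with its contact features.

    `spatial` is a list of per-frame feature lists (standing in for the
    output of a graph network) and `force` is one force value per frame.
    """
    return [
        spatial_vec + window_contact_features(force, t, half_width)
        for t, spatial_vec in enumerate(spatial)
    ]


if __name__ == "__main__":
    # Hypothetical 3-frame sequence with 1-dimensional spatial features.
    spatial = [[1.0], [2.0], [3.0]]
    force = [0.0, 3.0, 6.0]
    for row in fuse_features(spatial, force):
        print(row)
```

The fused per-frame vectors would then form the input sequence to a temporal model; concatenation is the simplest fusion choice and keeps both feature sources aligned frame by frame.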

Findings

In the experiments, the TSCFSS is used to recognize 11 kinds of assembly actions in demonstrations and outperforms the other comparative action identification methods.

Originality/value

A novel system for the precise segmentation of manual assembly actions, which fuses temporal features, spatial features and contact force features, has been proposed. The VAG, a symbolic knowledge representation for describing the assembly scene state, is proposed, making action segmentation more convenient. A data set with RGB-D video and contact force is specifically tailored for researching manual assembly actions.

Details

Robotic Intelligence and Automation, vol. 43 no. 5
Type: Research Article
ISSN: 2754-6969
