Search results

1 – 10 of over 1000
Article
Publication date: 17 April 2020

Rajasekhar B, Kamaraju M and Sumalatha V

Abstract

Purpose

Nowadays, speech emotion recognition (SER) has emerged as a main research topic in various fields, including human–computer interaction and speech processing. Generally, it focuses on using machine learning models to predict the exact emotional state from speech. Advanced SER applications have been successful in affective computing and human–computer interaction, which is becoming a main component of the next generation of computer systems. This is because a natural human–machine interface could provide automatic services, which require a good appreciation of the user's emotional state.

Design/methodology/approach

This paper implements a new SER model that incorporates both gender and emotion recognition. Certain features are extracted and subjected to emotion classification. For this, the paper uses a deep belief network (DBN) model, whose weights are optimized with the mean update of particle swarm and whale optimization (MUPW) algorithm.
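
As a rough stand-in for the classification stage (not the authors' implementation), the sketch below stacks a restricted Boltzmann machine with a logistic-regression readout in scikit-learn, a common shallow approximation of a DBN; the layer sizes are assumptions, and the MUPW weight optimization is not reproduced:

```python
# A rough DBN stand-in (assumption, not the authors' code): one RBM layer
# feeding a logistic-regression classifier, in scikit-learn.
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline

dbn_like = Pipeline([
    ("rbm", BernoulliRBM(n_components=64, learning_rate=0.05)),  # assumed size
    ("clf", LogisticRegression(max_iter=1000)),
])
# dbn_like.fit(speech_features, emotion_labels)  # hypothetical arrays
```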

Findings

Through the performance analysis, it is observed that the developed method attains a high accuracy rate (in the best case) when compared to other methods: it is 1.02% superior to the whale optimization algorithm (WOA), 0.32% better than firefly (FF), 23.45% superior to particle swarm optimization (PSO) and 23.41% superior to the genetic algorithm (GA). In the worst case, MUPW is 15.63%, 15.98%, 16.06% and 16.03% superior in accuracy to WOA, FF, PSO and GA, respectively. In the mean case, the performance of MUPW is also high: it is 16.67%, 10.38%, 22.30% and 22.47% better than the existing WOA, FF, PSO and GA methods, respectively.

Originality/value

This paper presents a new model for SER that supports both gender and emotion recognition. A DBN is used for classification, and this is the first work to use the MUPW algorithm to find the optimal weights of the DBN model.

Details

Data Technologies and Applications, vol. 54 no. 3
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 29 September 2020

Stefano Bromuri, Alexander P. Henkel, Deniz Iren and Visara Urovi

Abstract

Purpose

A vast body of literature has documented the negative consequences of stress on employee performance and well-being. These deleterious effects are particularly pronounced for service agents, who need to constantly endure and manage customer emotions. The purpose of this paper is to introduce and describe a deep learning model that predicts service agent stress in real time from emotion patterns in voice-to-voice service interactions.

Design/methodology/approach

A deep learning model was developed to identify emotion patterns in call center interactions, based on 363 recorded service interactions subdivided into 27,889 manually expert-labeled three-second audio snippets. In a second step, the deep learning model was deployed in a call center for one month to be further trained on data collected from 40 service agents in another 4,672 service interactions.
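
A minimal sketch of this two-stage setup is given below: a waveform is cut into three-second snippets for per-snippet emotion classification, and snippet-level emotion probabilities are pooled into a binary stress prediction. The sample rate, the column layout of the probabilities and the pooling rule are all assumptions for illustration, not details from the paper.

```python
# A minimal sketch (not the authors' pipeline): cut a mono waveform into
# 3-second snippets, then pool per-snippet emotion probabilities into a
# binary stress call. Sample rate and pooling rule are assumptions.
import numpy as np

SR = 16_000               # assumed sample rate (Hz)
SNIPPET = 3 * SR          # 3-second snippet, as in the paper

def snippets(waveform: np.ndarray) -> np.ndarray:
    """Split a 1-D waveform into non-overlapping 3 s snippets."""
    n = len(waveform) // SNIPPET
    return waveform[: n * SNIPPET].reshape(n, SNIPPET)

def predict_stress(emotion_probs: np.ndarray, threshold: float = 0.5) -> bool:
    """Toy pooling: flag stress when the mean probability mass on the
    (assumed) negative-emotion columns 0-1 exceeds the threshold."""
    return bool(emotion_probs[:, :2].sum(axis=1).mean() > threshold)
```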

Findings

The deep learning emotion classifier reached a balanced accuracy of 68% in predicting discrete emotions in service interactions. Integrated into a binary classification model, it was able to predict service agent stress with a balanced accuracy of 80%.
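
For reference, balanced accuracy, as reported here, is the mean of per-class recall, which guards against inflated scores on imbalanced classes; a quick check with scikit-learn on toy labels (the labels are illustrative):

```python
# Balanced accuracy is the mean of per-class recall; toy labels shown.
from sklearn.metrics import balanced_accuracy_score

y_true = [0, 0, 0, 1, 1]
y_pred = [0, 0, 1, 1, 1]
print(balanced_accuracy_score(y_true, y_pred))  # (2/3 + 2/2) / 2 ≈ 0.83
```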

Practical implications

Service managers can benefit from employing the deep learning model to continuously and unobtrusively monitor the stress level of their service agents with numerous practical applications, including real-time early warning systems for service agents, customized training and automatically linking stress to customer-related outcomes.

Originality/value

The present study is the first to document an artificial intelligence (AI)-based model that is able to identify emotions in natural (i.e. nonstaged) interactions. It is further a pioneer in developing a smart emotion-based stress measure for service agents. Finally, the study contributes to the literature on the role of emotions in service interactions and employee stress.

Article
Publication date: 5 August 2014

Theodoros Anagnostopoulos and Christos Skourlas

Abstract

Purpose

The purpose of this paper is to understand the emotional state of a human being by capturing the speech utterances used during common conversation. Human beings, besides being thinking creatures, are also sentimental and emotional organisms. Six universal basic emotions plus a neutral one are considered: happiness, surprise, fear, sadness, anger, disgust and neutral.

Design/methodology/approach

It is shown that, given enough acoustic evidence, the emotional state of a person can be classified by an ensemble majority-voting classifier. The proposed ensemble classifier is constructed over three base classifiers: k-nearest neighbors, C4.5 and a support vector machine (SVM) with a polynomial kernel.
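
A minimal sketch of such an ensemble with scikit-learn is shown below. C4.5 itself is not available in scikit-learn, so an entropy-based decision tree stands in for it, and the neighbor count and kernel degree are assumptions:

```python
# A minimal scikit-learn version of the described majority-voting ensemble.
from sklearn.ensemble import VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

ensemble = VotingClassifier(
    estimators=[
        ("knn", KNeighborsClassifier(n_neighbors=5)),          # assumed k
        ("c45", DecisionTreeClassifier(criterion="entropy")),  # C4.5 stand-in
        ("svm", SVC(kernel="poly", degree=3)),                 # polynomial kernel
    ],
    voting="hard",  # majority vote over predicted labels
)
# ensemble.fit(X_train, y_train); ensemble.predict(X_test)  # hypothetical data
```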

Findings

The proposed ensemble classifier achieves better performance than each base classifier. It is compared with two other ensemble classifiers: one-against-all (OAA) multiclass SVM with radial basis function kernels and OAA multiclass SVM with hybrid kernels. The proposed ensemble classifier achieves better performance than the other two ensemble classifiers.

Originality/value

The current paper performs emotion classification with an ensemble majority-voting classifier that combines three base classifiers of low computational complexity. The base classifiers stem from different theoretical backgrounds to avoid bias and redundancy, which gives the proposed ensemble classifier the ability to generalize in the emotion domain space.

Details

Journal of Systems and Information Technology, vol. 16 no. 3
Type: Research Article
ISSN: 1328-7265

Article
Publication date: 25 January 2011

William H. Bommer, Bryan J. Pesta and Susan F. Storrud‐Barnes

Abstract

Purpose

This paper aims to explore and test the relationship between emotion recognition skill and assessment center performance after controlling for both general mental ability (GMA) and conscientiousness. It also seeks to test whether participant sex or race moderated these relationships.

Design/methodology/approach

Using independent observers as raters, the paper tested 528 business students participating in a managerial assessment center while they performed four distinct activities: an in‐basket task; a team meeting for an executive hiring decision; a team meeting to discuss customer service initiatives; and an individual speech.

Findings

Emotion recognition predicted assessment center performance uniquely over both GMA and conscientiousness, but results varied by race. Females were better at emotion recognition overall, but sex was neither related to assessment center performance nor did it moderate the relationship between performance and emotion recognition. The paper also found that GMA moderated the emotion recognition/assessment performance link: emotion recognition was important to performance only for people with low levels of GMA.
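
A generic sketch of this kind of moderation test on synthetic data follows; it is an illustrative moderated regression with an emotion recognition × GMA interaction term, not the authors' analysis, and all variable names and coefficients are assumptions.

```python
# Generic moderated regression on synthetic data (not the authors' analysis):
# performance ~ emotion recognition (er) * general mental ability (gma);
# the er:gma coefficient captures the moderation effect described above.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({"er": rng.normal(size=200), "gma": rng.normal(size=200)})
# Synthetic outcome with a negative interaction: er matters less at high gma.
df["perf"] = 0.3 * df.er + 0.5 * df.gma - 0.2 * df.er * df.gma + rng.normal(size=200)

model = smf.ols("perf ~ er * gma", data=df).fit()
print(model.params)  # includes the er:gma interaction term
```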

Practical implications

The results seem to contradict those who argue that E‐IQ is an unqualified predictor of performance. Emotion recognition is not uniformly valuable; instead, it appears to benefit some groups more than others.

Originality/value

The paper clarifies the emotional intelligence literature by providing further support for the predictive validity of emotion recognition in performance contexts and by separating out how emotion recognition benefits certain population groups more than others.

Details

Journal of Managerial Psychology, vol. 26 no. 1
Type: Research Article
ISSN: 0268-3946

Article
Publication date: 1 November 2023

Juan Yang, Zhenkun Li and Xu Du

Abstract

Purpose

Although numerous signal modalities are available for emotion recognition, audio and visual modalities are the most common and predominant forms for human beings to express their emotional states in daily communication. Therefore, achieving automatic and accurate audiovisual emotion recognition is significantly important for developing an engaging and empathetic human–computer interaction environment. However, two major challenges exist in the field: (1) how to effectively capture representations of each single modality and eliminate redundant features and (2) how to efficiently integrate information from the two modalities to generate discriminative representations.

Design/methodology/approach

A novel key-frame extraction-based attention fusion network (KE-AFN) is proposed for audiovisual emotion recognition. KE-AFN integrates key-frame extraction with multimodal interaction and fusion to enhance audiovisual representations and reduce redundant computation, filling the research gaps of existing approaches. Specifically, a local maximum-based content analysis is designed to extract key-frames from videos in order to eliminate data redundancy. Two modules, a Multi-head Attention-based Intra-modality Interaction Module and a Multi-head Attention-based Cross-modality Interaction Module, are proposed to mine and capture intra- and cross-modality interactions, further reducing data redundancy and producing more powerful multimodal representations.
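
The sketch below illustrates the cross-modality direction of such attention in PyTorch: audio features attend over visual key-frame features. It is a generic sketch, not the authors' module, and the dimensions, head count and sequence lengths are assumptions.

```python
# A minimal sketch (not the authors' code) of cross-modality multi-head
# attention: audio features attend over visual key-frame features.
import torch
import torch.nn as nn

embed_dim, num_heads = 256, 4  # assumed sizes
cross_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

audio = torch.randn(8, 50, embed_dim)   # (batch, audio steps, dim)
visual = torch.randn(8, 30, embed_dim)  # (batch, key-frames, dim)

# Audio queries attend to visual keys/values, yielding audio features
# enriched with visual context; the symmetric direction works the same way.
fused, _ = cross_attn(query=audio, key=visual, value=visual)
```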

Findings

Extensive experiments on two benchmark datasets (i.e. RAVDESS and CMU-MOSEI) demonstrate the effectiveness and rationality of KE-AFN. Specifically, (1) KE-AFN is superior to state-of-the-art baselines for audiovisual emotion recognition. (2) Exploring the supplementary and complementary information of different modalities can provide more emotional clues for better emotion recognition. (3) The proposed key-frame extraction strategy can enhance performance by more than 2.79 per cent in accuracy. (4) Both exploring intra- and cross-modality interactions and employing attention-based audiovisual fusion lead to better prediction performance.

Originality/value

The proposed KE-AFN can support the development of an engaging and empathetic human–computer interaction environment.

Article
Publication date: 4 September 2019

Björn Schuller

Abstract

Purpose

Uncertainty is an under-respected issue when it comes to the automatic assessment of human emotion by machines. The purpose of this paper is to highlight existing approaches to measuring such uncertainty and to identify further research needs.

Design/methodology/approach

The discussion is based on a literature review.

Findings

Technical solutions for measuring uncertainty in automatic emotion recognition (AER) exist but need to be extended to cover a range of so far underrepresented sources of uncertainty. These then need to be integrated into systems available to general users.
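
One common way to expose a classifier's uncertainty is sketched below: the Shannon entropy of the predicted emotion distribution. This is an illustrative choice on my part; the paper itself surveys a broader range of uncertainty sources and measures.

```python
# Predictive entropy of an emotion posterior (illustrative measure).
import numpy as np

def predictive_entropy(probs: np.ndarray) -> float:
    """Entropy (in nats) of a probability vector; higher = more uncertain."""
    p = np.clip(probs, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

print(predictive_entropy(np.array([0.25, 0.25, 0.25, 0.25])))  # ln 4 ≈ 1.386
print(predictive_entropy(np.array([0.97, 0.01, 0.01, 0.01])))  # ≈ 0.17, confident
```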

Research limitations/implications

Not all sources of uncertainty in AER, including emotion representation and annotation, can be touched upon in this communication.

Practical implications

AER systems shall be enhanced with more meaningful and complete information on the uncertainty underlying their estimates. Limitations of their applicability should be communicated to users.

Social implications

Users of automatic emotion recognition technology will become aware of its limitations, potentially leading to fairer usage in crucial application contexts.

Originality/value

There is no previous discussion that includes the technical viewpoint on extended uncertainty measurement in automatic emotion recognition.

Details

Journal of Information, Communication and Ethics in Society, vol. 17 no. 3
Type: Research Article
ISSN: 1477-996X

Article
Publication date: 25 September 2019

Fatima Zohra Ennaji, Abdelaziz El Fazziki, Hasna El Alaoui El Abdallaoui, Djamal Benslimane and Mohamed Sadgal

Abstract

Purpose

The purpose of this paper is to bring together textual and multimedia opinions, since the use of social data has become the new trend for gathering the product reputation traded in social media. Integrating a product reputation process into a company's strategy brings several benefits, such as supporting decision-making on the current and next generations of a product by understanding customers' needs. However, image-centric sentiment analysis has received much less attention than text-based sentiment detection.

Design/methodology/approach

In this work, the authors propose a multimedia content-based product reputation framework that helps detect opinions in social media. The analysis of a given publication is thus made by combining its textual and multimedia parts.
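
As a toy illustration of this combination, the snippet below fuses a textual sentiment score with a multimedia one via a weighted average; the scoring scale and weighting are assumptions for illustration, not the authors' framework.

```python
# Toy fusion of a textual and a multimedia sentiment score (illustrative
# assumption, not the authors' framework); scores live in [-1, 1].
def fuse_opinion(text_score: float, media_score: float, w_text: float = 0.5) -> float:
    """Weighted average of the two modality scores."""
    return w_text * text_score + (1.0 - w_text) * media_score

print(fuse_opinion(0.8, -0.2))  # 0.3: mildly positive overall
```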

Findings

To test the effectiveness of the proposed framework, a case study based on YouTube videos was carried out, as this medium brings together image, audio and video processing at the same time.

Originality/value

The key novelty is the inclusion of multimedia content alongside textual content, with the goal of gathering opinions about a given product. The multimedia analysis brings together facial sentiment detection, printed-text analysis, opinion detection from speech and textual opinion analysis.

Details

International Journal of Web Information Systems, vol. 16 no. 1
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 20 June 2016

Yuan Wei and Jing Zhao

Abstract

Purpose

This paper aims to deal with the problem of designing robot behaviors (mainly for robotic arms) to express emotions. The authors study the effects of behaviors of their humanoid robot NAO on emotion expression in human–robot interaction (HRI).

Design/methodology/approach

A method to design robot behavior through movement primitives is proposed. Then, a novel dimensional affective model is built. Finally, the concept of action semantics is adopted to combine the robot behaviors with emotion expression.
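
A toy sketch of how a dimensional affective model might drive a movement primitive follows; the valence-arousal parameterization and the specific mappings are illustrative assumptions, not the authors' model.

```python
# Toy mapping (an assumption, not the authors' model) from a point in a
# valence-arousal affect space to parameters of a movement primitive.
from dataclasses import dataclass

@dataclass
class MotionParams:
    speed: float      # relative playback speed of the primitive
    amplitude: float  # relative size of the gesture

def affect_to_motion(valence: float, arousal: float) -> MotionParams:
    """valence, arousal in [-1, 1]; here higher arousal speeds the motion
    up and higher valence widens it."""
    return MotionParams(speed=1.0 + 0.5 * arousal,
                        amplitude=1.0 + 0.3 * valence)

print(affect_to_motion(0.7, 0.9))  # excited/happy: fast, expansive motion
```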

Findings

To evaluate this combination, the authors assessed positive (excited and happy) and negative (frightened and sad) emotional patterns on 20 subjects divided into two groups (according to whether they were familiar with robots). The results show that recognition of the different emotion patterns did not differ between the two groups and that the subjects could recognize the robot behaviors with emotions.

Practical implications

Using affective models to guide robots' behavior or express their intentions is highly beneficial in human–robot interaction. The authors envision several applications of emotional motion: improving efficiency in HRI, directing people during disasters, achieving better understanding with human partners and helping people perform their tasks better.

Originality/value

This paper presents a method to design robot behaviors with emotion expression. A similar methodology can be used for other parts (legs, torso, head and so on) of humanoid robots, or for non-humanoid robots such as industrial robots.

Details

Industrial Robot: An International Journal, vol. 43 no. 4
Type: Research Article
ISSN: 0143-991X

Article
Publication date: 12 November 2021

G. Merlin Linda, N.V.S. Sree Rathna Lakshmi, N. Senthil Murugan, Rajendra Prasad Mahapatra, V. Muthukumaran and M. Sivaram

Abstract

Purpose

The paper aims to introduce an intelligent recognition system for viewpoint variations of gait and speech. It proposes a convolutional neural network-based capsule network (CNN-CapsNet) model and outlines the performance of the system in recognizing gait and speech variations. The proposed intelligent system mainly focuses on the relative spatial hierarchies between gait features in the entities of the image, which are lost to translational invariance in sub-sampling, as well as on speech variations.

Design/methodology/approach

The proposed CNN-CapsNet automatically learns feature representations based on a CNN and uses capsule vectors as neurons to encode the spatial information of an image, maintaining equivariance to changes in viewpoint. The proposed study resolves the discrepancies caused by cofactors in gait recognition between viewpoints based on the CNN-CapsNet model.
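
As background to the capsule-vector idea, the sketch below implements the generic "squash" nonlinearity used in CapsNet-style models, under which a capsule vector's length encodes detection confidence and its orientation encodes pose; this standard form is an assumption, not necessarily the authors' exact formulation.

```python
# The generic CapsNet 'squash' nonlinearity (standard form, assumed here):
# shrinks a capsule vector's length into [0, 1) to act as a detection
# probability while preserving its orientation, which encodes pose.
import numpy as np

def squash(v: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    norm_sq = np.sum(v ** 2, axis=-1, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * v / np.sqrt(norm_sq + eps)

print(np.linalg.norm(squash(np.array([3.0, 4.0]))))  # 25/26 ≈ 0.96
```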

Findings

This research work provides signal recognition, biometric gait recognition and sound/speech analysis. Empirical evaluations are conducted in three scenarios, namely fixed-view, cross-view and multi-view conditions. The main parameters for gait recognition are speed, change of clothes, subjects walking while carrying an object and intensity of light.

Research limitations/implications

The proposed CNN-CapsNet has some limitations when detecting walking targets in surveillance videos using multimodal fusion approaches with hardware sensor devices.

Practical implications

This research work extends to detecting walking targets in surveillance videos, considering multimodal fusion approaches that use hardware sensor devices. It can also act as a prerequisite tool to analyze, identify, detect and verify malware practices.

Originality/value

The proposed research work performs better for the recognition of gait and speech when compared with other techniques.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 15 no. 3
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 1 March 1991

Holley R. Lange, George Philip, Bradley C. Watson, John Kountz, Samuel T. Waters and George Doddington

Abstract

A real potential exists for library use of voice technologies: as aids to the disabled or illiterate library user, as front‐ends for general library help systems, in online systems for commands or control words, and in many of the hands‐busy‐eyes‐busy activities that are common in libraries. Initially, these applications would be small, limited processes that would not require the more fluent human‐machine communication that we might hope for in the future. Voice technologies will depend on and benefit from new computer systems, advances in artificial intelligence and expert systems to facilitate their use and enable them to better circumvent present input and output problems. These voice systems will gradually assume more importance, improving access to information and complementing existing systems, but they will not likely revolutionize or dominate human‐machine communications or library services in the near future.

Details

Library Hi Tech, vol. 9 no. 3
Type: Research Article
ISSN: 0737-8831
