Search results

1 – 10 of 673
Article
Publication date: 26 April 2022

Ebenhaeser Otto Janse van Rensburg, Reinhardt A. Botha and Rossouw von Solms

Abstract

Purpose

Authenticating an individual through voice can prove convenient, as nothing needs to be stored and a voice cannot easily be stolen. However, if an individual is authenticating under duress, the coerced attempt must be acknowledged and appropriate warnings issued. Furthermore, as duress may entail multiple combinations of emotions, the current f-score evaluation does not accommodate cases where multiple selected samples are of similar importance. Thus, this study aims to demonstrate an approach to identifying duress within a voice-based authentication system.

Design/methodology/approach

The value that a classifier presents is often measured using an f-score. However, the f-score does not effectively portray that value when multiple classes could be grouped as one. The f-score also provides no information when several classes are frequently mistaken for one another. Therefore, the proposed approach uses the confusion matrix, aggregates the selected classes into another matrix and calculates a more precise representation of the selected classifier’s value. The utility of the proposed approach is demonstrated through multiple tests, conducted as follows. The value of the initial tests is presented by an f-score, which does not account for the individual emotions. This shortcoming is then remedied with further tests, which include a confusion matrix. Final tests are then conducted that aggregate selected emotions within the confusion matrix to present a more precise utility value.
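The aggregation step described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the example counts and the choice to group "fearful" and "angry" are all assumptions made for the example, and the macro-averaged F1 used here is only one possible utility value.

```python
def aggregate_confusion(matrix, groups):
    """Collapse a square confusion matrix by merging grouped classes.

    matrix[i][j] is the number of samples of true class i predicted as class j.
    groups is a list of lists of original class indices; each inner list
    becomes a single class in the aggregated matrix.
    """
    k = len(groups)
    out = [[0] * k for _ in range(k)]
    for gi, rows in enumerate(groups):
        for gj, cols in enumerate(groups):
            out[gi][gj] = sum(matrix[r][c] for r in rows for c in cols)
    return out


def macro_f1(matrix):
    """Unweighted mean of the per-class F1 scores of a confusion matrix."""
    n = len(matrix)
    scores = []
    for i in range(n):
        tp = matrix[i][i]
        fp = sum(matrix[r][i] for r in range(n)) - tp
        fn = sum(matrix[i]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


# Hypothetical counts for four classes: calm, happy, fearful, angry.
cm = [
    [8, 2, 0, 0],
    [1, 9, 0, 0],
    [0, 0, 5, 5],
    [0, 0, 4, 6],
]
# Illustrative grouping only: merge fearful and angry into one class.
grouped = aggregate_confusion(cm, [[0], [1], [2, 3]])
print(macro_f1(cm), macro_f1(grouped))
```

In this made-up example the two merged classes are confused only with each other, so the aggregated matrix scores higher than the original, which is the effect the approach is meant to expose.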

Findings

Two tests within the set of experiments achieved an f-score difference of 1%, indicating that the two tests provided similar value. The confusion matrix used to calculate the f-score indicated that some emotions, all of which could be considered closely related, are often confused. Although the f-score can represent an accuracy value, the value of these tests is not accurately portrayed when often-confused emotions are not considered. Deciding which approach to take based on the f-score did not prove beneficial, as it did not address the confused emotions. When the confusion matrices of these two tests were aggregated based on selected emotions, the newly calculated utility value showed a difference of 4%, indicating that the two tests may not provide similar value as previously indicated.

Research limitations/implications

This approach’s performance is dependent on the data presented to it. If the classifier is presented with incomplete or degraded data, the results obtained from the classifier will reflect that. Additionally, the grouping of emotions is not based on psychological evidence, and this was purely done to demonstrate the implementation of an aggregated confusion matrix.

Originality/value

The f-score offers a value that represents the classifiers’ ability to classify a class correctly. This paper demonstrates that aggregating a confusion matrix could provide more value than a single f-score in the context of classifying an emotion that could consist of a combination of emotions. This approach can similarly be applied to different combinations of classifiers for the desired effect of extracting a more accurate performance value that a selected classifier presents.

Details

Information & Computer Security, vol. 30 no. 5
Type: Research Article
ISSN: 2056-4961

Keywords

Mel frequency cepstral coefficient, emotion detection, confusion matrix, multi-layer perceptron, Ryerson audio-visual database of emotional speech and song (RAVDESS), voice authentication

Article
Publication date: 17 January 2022

Syed Haroon Abdul Gafoor and Padma Theagarajan

Abstract

Purpose

Conventional diagnostic techniques may be prone to subjectivity, since they depend on the assessment of motions that are often too subtle for the human eye and hence hard to classify, potentially resulting in misdiagnosis. Meanwhile, early nonmotor signs of Parkinson’s disease (PD) can be mild and may be due to a variety of other conditions. As a result, these signs are usually ignored, making early PD diagnosis difficult. Machine learning approaches for distinguishing PD from healthy controls or from individuals with similar medical symptoms (such as movement disorders or other Parkinsonian syndromes) have been introduced to solve these problems and to enhance the diagnostic and assessment processes of PD.

Design/methodology/approach

Medical observations and evaluation of medical symptoms, including characterization of a wide range of motor indications, are commonly used to diagnose PD. As the quantity of data being processed has grown over the last five years, feature selection has become a prerequisite before any classification. This study introduces a feature selection method based on the score-based artificial fish swarm algorithm (SAFSA) to overcome this issue.

Findings

This study improves the accuracy of PD identification by reducing the number of selected vocal features while using the most recent and largest publicly accessible database. Feature subset selection in PD detection techniques starts by eliminating features that are irrelevant or redundant. The chosen feature subset should provide the best performance according to a small number of objective functions.

Research limitations/implications

In many situations, this is a nondeterministic polynomial-time hard (NP-hard) problem. This method enhances the PD detection rate by selecting the most essential features from the database. To begin, the data set's dimensionality is reduced using the Singular Value Decomposition dimensionality reduction technique. Next, Biogeography-Based Optimization (BBO) is used for feature selection; the weight value is a vital parameter for finding the best features in PD classification.

Originality/value

PD classification is done using ensemble learning classification approaches such as a hybrid classifier of fuzzy K-nearest neighbor, kernel support vector machines, fuzzy convolutional neural network and random forest. The suggested classifiers are trained using data from the UCI ML repository, and their results are verified using leave-one-person-out cross validation. The measures employed to assess classifier efficiency include accuracy, F-measure and Matthews correlation coefficient.
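As a rough illustration of the evaluation protocol named above, the sketch below shows how leave-one-person-out folds can be formed and how the Matthews correlation coefficient is computed from binary confusion counts. The function names and the toy person identifiers are assumptions for the example; the classifiers themselves are not reproduced here.

```python
import math


def leave_one_person_out(person_ids):
    """Yield (held_out_person, train_indices, test_indices), one fold per person.

    All recordings from the held-out person go to the test set, so no speaker
    appears in both the training and the test data of a fold.
    """
    for person in sorted(set(person_ids)):
        test = [i for i, p in enumerate(person_ids) if p == person]
        train = [i for i, p in enumerate(person_ids) if p != person]
        yield person, train, test


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for binary confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Return 0.0 by convention when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical data set: six recordings from three people.
person_ids = ["a", "a", "b", "c", "c", "c"]
folds = list(leave_one_person_out(person_ids))
print(folds[0])
print(matthews_corrcoef(tp=5, tn=5, fp=0, fn=0))
```

Splitting by person rather than by recording matters for voice data, because several recordings of the same speaker in both sets would make the reported scores optimistic.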

Details

International Journal of Intelligent Computing and Cybernetics, vol. 15 no. 4
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 29 September 2020

Stefano Bromuri, Alexander P. Henkel, Deniz Iren and Visara Urovi

Abstract

Purpose

A vast body of literature has documented the negative consequences of stress on employee performance and well-being. These deleterious effects are particularly pronounced for service agents who need to constantly endure and manage customer emotions. The purpose of this paper is to introduce and describe a deep learning model to predict in real-time service agent stress from emotion patterns in voice-to-voice service interactions.

Design/methodology/approach

A deep learning model was developed to identify emotion patterns in call center interactions based on 363 recorded service interactions, subdivided into 27,889 manually expert-labeled three-second audio snippets. In a second step, the deep learning model was deployed in a call center for a period of one month to be further trained by the data collected from 40 service agents in another 4,672 service interactions.

Findings

The deep learning emotion classifier reached a balanced accuracy of 68% in predicting discrete emotions in service interactions. Integrated into a binary classification model, it was able to predict service agent stress with a balanced accuracy of 80%.
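For readers unfamiliar with the metric, balanced accuracy is the mean of the per-class recalls, which keeps a majority class from dominating the score. The sketch below is a minimal illustration; the labels and predictions are invented and do not come from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        idx = [i for i, t in enumerate(y_true) if t == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced example: a model that always predicts "calm".
y_true = ["stress"] * 2 + ["calm"] * 8
y_pred = ["calm"] * 10
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy, balanced_accuracy(y_true, y_pred))
```

Here plain accuracy looks respectable because most samples are "calm", while balanced accuracy drops to chance level because the "stress" class is never detected.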

Practical implications

Service managers can benefit from employing the deep learning model to continuously and unobtrusively monitor the stress level of their service agents with numerous practical applications, including real-time early warning systems for service agents, customized training and automatically linking stress to customer-related outcomes.

Originality/value

The present study is the first to document an artificial intelligence (AI)-based model that is able to identify emotions in natural (i.e. nonstaged) interactions. It is further a pioneer in developing a smart emotion-based stress measure for service agents. Finally, the study contributes to the literature on the role of emotions in service interactions and employee stress.

Article
Publication date: 1 April 1996

K.S. Chin, K.V. Patri, K.F. Pun, W.H. Yeung, L.T. Poon and K.K. Poon

Abstract

Describes the design and early implementation of a team‐based quality improvement campaign in progress within the Department of Manufacturing Engineering at City University of Hong Kong to improve the performance of its laboratory services. Midway into a two‐year implementation plan aimed at fostering a worker‐oriented group culture to achieve the quality targets, the “Quality Campaign in Manufacturing Engineering Laboratories” (QCIMEL) discusses, via the working committee, the rationale of the programme design and structure, and reports progress at the one‐year mark.

Details

The TQM Magazine, vol. 8 no. 2
Type: Research Article
ISSN: 0954-478X

Article
Publication date: 7 June 2013

Mariah Strella P. Indrinal, Ranyel Bryan L. Maliwanag and Marynyriene I. Silvestre

Abstract

Purpose

The purpose of this paper is to introduce VoxGrid, a mobile voice verification system intended for improving the security of the username‐password authentication scheme.

Design/methodology/approach

The system incorporates text‐dependent speaker verification via mobile devices, providing a three‐factor authentication scheme for granting authorised access to certain websites or applications. The same speech recognition engine used by Google Voice Search is utilised to provide a voice‐to‐text feature. Feature extraction is executed on the mobile device using Mel Frequency Cepstral Coefficients, and the resulting features, rather than raw voice data, are transmitted to the server to reduce network load. All verification tasks are performed on a centralised server to minimise computing requirements on mobile platforms, with the actual voice verification carried out using Vector Quantisation.
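The server-side decision can be pictured with a small vector quantisation sketch. This is an assumption-laden illustration, not VoxGrid's code: the codebook is taken as given (in practice it would be learned from enrolment features, for example by K-means), the two-dimensional vectors stand in for real MFCC frames, and the threshold value is arbitrary.

```python
import math


def distortion(features, codebook):
    """Average Euclidean distance from each feature vector to its nearest codeword."""
    total = 0.0
    for vec in features:
        total += min(math.dist(vec, code) for code in codebook)
    return total / len(features)


def verify(features, codebook, threshold):
    """Accept the claimed identity when the average distortion is small enough."""
    return distortion(features, codebook) <= threshold


# Hypothetical enrolled codebook and two incoming feature sets.
codebook = [(0.0, 0.0), (10.0, 10.0)]
genuine = [(0.0, 1.0), (10.0, 9.0)]
impostor = [(5.0, 5.0), (4.0, 6.0)]

print(verify(genuine, codebook, threshold=2.0))
print(verify(impostor, codebook, threshold=2.0))
```

Only the compact feature vectors need to reach the server for this comparison, which is consistent with the reduced network load described above.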

Findings

The initial results have indicated that VoxGrid is capable of providing an additional level of security on user authentications at a low cost and without using extra security tokens other than one's voice with a good enough performance given the limited resources available during testing.

Originality/value

Speaker verification experiments have been conducted in the past, but we believe this is the first time they have been carried out on mobile devices with a client‐server architecture using K‐Means Clustering and Vector Quantisation. Future improvements in performance and testing could result in a more secure mobile computing environment.

Details

Information Management & Computer Security, vol. 21 no. 2
Type: Research Article
ISSN: 0968-5227

Article
Publication date: 3 October 2019

Dinesh Kumar D.S. and P.V. Rao

Abstract

Purpose

The purpose of this paper is to incorporate a multimodal biometric system, which plays a major role in improving accuracy and reducing the FAR and FRR performance metrics. Biometrics plays a major role in several areas, including military applications, because of the robustness of the system. Speech and face data are considered key elements that are commonly used for multimodal biometric applications, as they can be acquired simultaneously from a camera and a microphone.

Design/methodology/approach

In this proposed work, the Viola‒Jones algorithm is used for face detection, and Local Binary Pattern texture operators perform a thresholding operation to extract the features of the face. Mel-frequency cepstral coefficients are used to extract features from the voice data, and a median filter is used to remove noise. A KNN classifier is used for the fusion of face and voice. The proposed method produces better results in noisy environments with better accuracy. In this proposed method, 120 face and voice samples from the database are used for training and testing, with simulations run in MATLAB showing improved recognition performance and accuracy.

Findings

The algorithms perform better for both face and voice recognition. The outcome of this work provides better accuracy up to 98 per cent with reduced FAR of 0.5 per cent and FRR of 0.75 per cent.
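The two error rates reported above are the false acceptance rate (FAR, the share of impostor attempts that are accepted) and the false rejection rate (FRR, the share of genuine attempts that are rejected). The sketch below shows how both follow from match scores and a decision threshold; the scores, the threshold and the function name are invented for illustration and are unrelated to the paper's data.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for a system that accepts when score >= threshold."""
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr


# Hypothetical match scores in the range 0 to 1.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.6, 0.3]

far, frr = far_frr(genuine, impostor, threshold=0.5)
print(far, frr)
```

Raising the threshold lowers FAR at the cost of a higher FRR, so the two figures are usually reported together for a fixed operating point.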

Originality/value

The algorithms perform better for both face and voice recognition. The outcome of this work provides better accuracy up to 98 per cent with reduced FAR of 0.5 per cent and FRR of 0.75 per cent.

Details

International Journal of Intelligent Unmanned Systems, vol. 8 no. 1
Type: Research Article
ISSN: 2049-6427

Article
Publication date: 24 March 2023

Haoning Pu, Zhan Wen, Xiulan Sun, Lemei Han, Yanhe Na, Hantao Liu and Wenzao Li

Abstract

Purpose

The purpose of this paper is to provide a high-accuracy fault diagnosis method for water pumps with a lower time cost. Water pumps are widely used in industrial equipment and their fault diagnosis is gaining increasing attention. Considering the time-consuming empirical mode decomposition (EMD) method and the more efficient classification provided by the convolutional neural network (CNN) method, a novel classification method based on incomplete empirical mode decomposition (IEMD) and dual-input dual-channel convolutional neural network (DDCNN) composite data is proposed and applied to the fault diagnosis of water pumps.

Design/methodology/approach

This paper proposes a data preprocessing method using IEMD combined with mel-frequency cepstrum coefficients (MFCC) and a DDCNN neural network model. First, the sound signal is decomposed by IEMD to obtain numerous intrinsic mode functions (IMFs) and a residual (RES). MFCC features are then extracted from several IMFs and one RES. Finally, the obtained features are split into two channels (one for the IMFs and one for the RES) and input into the DDCNN.

Findings

The Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection (MIMII dataset) is used to verify the practicability of the method. Experimental results show that decomposition into an IMF is optimal when taking into account the real-time performance and accuracy of the diagnosis. Compared with EMD, the method saves 51.52% of data preprocessing time, 67.25% of network training time and 63.7% of test time, while also improving accuracy.

Research limitations/implications

This method can achieve higher accuracy in fault diagnosis with a shorter time cost. Therefore, the fault diagnosis of equipment based on the sound signal in the factory has certain feasibility and research importance.

Originality/value

This paper provides a feasible method for mechanical fault diagnosis based on sound signals in industrial applications.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 16 no. 3
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 29 November 2022

Claudia Barile, Caterina Casavola, Giovanni Pappalettera and Vimalathithan Paramsamy Kannan

Abstract

Purpose

The acousto-ultrasonic approach is used for propagating stress waves through different configurations of CORTEN steel specimens. The propagated waves are recorded and analysed by piezoelectric sensors. The purpose of the study is to investigate the characteristics of the CORTEN steel by analysing the propagated waves.

Design/methodology/approach

The study investigates the attenuation in acoustic wave propagation due to corrosion formation in CORTEN steel specimens and trains a neural network model to classify the attenuated acoustic waves automatically.

Findings

Due to the corrosion formation in CORTEN steel specimens, attenuation is observed in amplitude, energy, counts and duration of the propagated waves. When the waves are analysed in their time-frequency characteristics, attenuation is observed in their frequency and spectral energy.

Originality/value

The corrosion formation in CORTEN steel can automatically be analysed by using the acousto-ultrasonic approach and the trained deep learning neural network.

Details

International Journal of Structural Integrity, vol. 14 no. 1
Type: Research Article
ISSN: 1757-9864

Article
Publication date: 16 November 2010

Alistair Brandon‐Jones and Rhian Silvestro

Abstract

Purpose

This paper aims to build upon the debate in the service quality literature regarding both the theoretical and practical effectiveness of expectations data in the measurement of internal service quality (ISQ). Gap‐based and perceptions‐only approaches to measuring ISQ are tested and their respective benefits and limitations evaluated.

Design/methodology/approach

The internal service context used in this study is the provision of e‐procurement software, training, and user support in four organisations. The two approaches are evaluated in terms of reliability and validity, as well as pragmatic aspects of survey administration.

Findings

The various tests carried out indicate that both the gap‐measure and perceptions‐only measure are reliable and valid, the latter being the marginally higher performer. Both approaches were found to have benefits and limitations, and so the empirical study, combined with contributions from the literature, generates some understanding of the internal service context in which the two approaches might be appropriate.

Research limitations/implications

The survey was based on an internal e‐procurement service; as such, the variables and dimensions selected to measure ISQ in this context inevitably limit the scope of the research.

Practical implications

For operations managers, the paper clarifies the basis on which they might choose between the two approaches to ISQ measurement.

Originality/value

This study is the first to directly test and compare the relative merits of these two approaches to ISQ measurement. The paper also offers insights as to the operational contexts in which each approach might be appropriate.

Details

International Journal of Operations & Production Management, vol. 30 no. 12
Type: Research Article
ISSN: 0144-3577

Article
Publication date: 26 July 2021

Dhanalakshmi M., Nagarajan T. and Vijayalakshmi P.

Abstract

Purpose

Dysarthria is a neuromotor speech disorder caused by neuromuscular disturbances that affect one or more articulators, resulting in unintelligible speech. Though inter-phoneme articulatory variations are well captured by formant frequency-based acoustic features, these variations are expected to be much higher for dysarthric speakers than for normal speakers. These substantial variations can be well captured by placing sensors in appropriate articulatory positions. This study focuses on determining a set of articulatory sensors and parameters in order to assess articulatory dysfunctions in dysarthric speech.

Design/methodology/approach

The current work aims to determine the significant sensors and associated parameters using motion path and correlation analyses on the TORGO database of dysarthric speech. Among eight informative sensor channels and six parameters per channel in positional data, sensors such as tongue middle, back and tip, lower and upper lips and the parameters (y, z, φ) are found to contribute significantly toward capturing the articulatory information. Acoustic and positional data analyses are performed to validate these identified significant sensors. Furthermore, a convolutional neural network-based classifier is developed for both phone- and word-level classification of dysarthric speech using acoustic and positional data.

Findings

The average phone error rate is observed to be lower, by up to 15.54%, for positional data when compared with acoustic-only data. Further, word-level classification using a combination of both acoustic and positional information is performed to show that the positional data acquired using the significant sensors boosts classification performance even for severe dysarthric speakers.

Originality/value

The proposed work shows that the significant sensors and parameters can be used to assess dysfunctions in dysarthric speech effectively. The articulatory sensor data enables better assessment than the acoustic data, even for severe dysarthric speakers.
