Search results

1 – 10 of over 3000
Article
Publication date: 9 August 2021

Hrishikesh B Vanjari and Mahesh T Kolte

Abstract

Purpose

Speech is the primary means of communication for humans. A properly functioning auditory system is needed for accurate cognition of speech. Compressed sensing (CS) is a method for simultaneous compression and sampling of a given signal. It is a novel method increasingly being used in many speech processing applications. The paper aims to use a compressive sensing algorithm in hearing aid applications to reduce surrounding noise.

Design/methodology/approach

In this work, the authors propose a machine learning algorithm for improving the performance of compressive sensing using a neural network.

Findings

The proposed solution reduces the signal reconstruction time by about 21.62% and the root mean square error by 43% compared with the default L2 norm minimization used in CS reconstruction. This work proposes an adaptive neural network-based algorithm that enhances compressive sensing so that it can reconstruct the signal in comparatively less time and with minimal distortion.
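The abstract names L2 norm minimization as the baseline for CS reconstruction but gives no formulas. As a minimal sketch, assuming the baseline is the minimum-L2-norm solution of y = Phi x, the example below reconstructs a short sparse signal from two measurements. The matrix `phi`, the test signal and the helper names are hypothetical and chosen only so the arithmetic is easy to follow; this is not the authors' implementation.

```python
def solve(a, b):
    """Solve the square system a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = s / m[r][r]
    return z


def min_l2_reconstruct(phi, y):
    """Minimum-L2-norm solution x = Phi^T (Phi Phi^T)^{-1} y (assumed baseline)."""
    rows, cols = len(phi), len(phi[0])
    gram = [[sum(phi[i][k] * phi[j][k] for k in range(cols)) for j in range(rows)]
            for i in range(rows)]
    z = solve(gram, y)
    return [sum(phi[i][j] * z[i] for i in range(rows)) for j in range(cols)]


# Hypothetical 2x4 measurement matrix and a sparse 4-sample signal.
phi = [[1, 0, 1, 0],
       [0, 1, 0, 1]]
x_true = [3, 0, 0, 0]
y = [sum(p * x for p, x in zip(row, x_true)) for row in phi]  # measurements: [3, 0]

x_hat = min_l2_reconstruct(phi, y)
y_hat = [sum(p * x for p, x in zip(row, x_hat)) for row in phi]
```

In this toy case the estimate matches the measurements exactly but spreads the energy of the single nonzero sample over two positions. That is one reason plain L2 minimization is usually treated as a baseline to improve on.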

Research limitations/implications

The use of compressive sensing for speech enhancement in a hearing aid is limited due to the delay in the reconstruction of the signal.

Practical implications

In many digital applications, the acquired raw signals are compressed to a smaller size so that they can be stored and transmitted efficiently. In this process, even unnecessary signals are acquired and compressed, leading to inefficiency.

Social implications

Hearing loss is the most common sensory deficit in humans today. Worldwide, it is the second leading cause of “Years lived with Disability”, the first being depression. A recent study by the World Health Organization estimates that nearly 450 million people in the world have been disabled by hearing loss, and the prevalence of hearing impairment in India is around 6.3% (63 million people suffering from significant auditory loss).

Originality/value

The objective is to reduce the time taken for CS reconstruction with minimal degradation to the reconstructed signal. The solution must also adapt to different signal characteristics and to the presence of different types of noise.

Details

World Journal of Engineering, vol. 19 no. 2
Type: Research Article
ISSN: 1708-5284

Article
Publication date: 4 September 2009

Michael Schuricht, Zachary Davis, Michael Hu, Shreyas Prasad, Peter M. Melliar‐Smith and Louise E. Moser

Abstract

Purpose

Mobile handheld devices, such as cellular phones and personal digital assistants, are inherently small and lack an intuitive and natural user interface. Speech recognition and synthesis technology can be used in mobile handheld devices to improve the user experience. The purpose of this paper is to describe a prototype system that supports multiple speech‐enabled applications in a mobile handheld device.

Design/methodology/approach

The main component of the system, the Program Manager, coordinates and controls the speech‐enabled applications. Human speech requests to, and responses from, these applications are processed in the mobile handheld device, to achieve the goal of human‐like interactions between the human and the device. In addition to speech, the system also supports graphics and text, i.e., multimodal input and output, for greater usability, flexibility, adaptivity, accuracy, and robustness. The paper presents a qualitative and quantitative evaluation of the prototype system. The Program Manager is currently designed to handle the specific speech‐enabled applications that we developed.

Findings

The paper determines that many human interactions involve not single applications but multiple applications working together in possibly unanticipated ways.

Research limitations/implications

Future work includes generalization of the Program Manager so that it supports arbitrary applications and the addition of new applications dynamically. Future work also includes deployment of the Program Manager and the applications on cellular phones running the Android Platform or the Openmoko Framework.

Originality/value

This paper presents a first step towards a future human interface for mobile handheld devices and for speech‐enabled applications operating on those devices.

Details

International Journal of Pervasive Computing and Communications, vol. 5 no. 3
Type: Research Article
ISSN: 1742-7371

Article
Publication date: 17 March 2016

Yuma Sandoval, Victor H. Diaz-Ramirez and Vitaly Kober

Abstract

Purpose

The purpose of the present work is to design robust estimators for speech enhancement by incorporating the calculation of rank-order statistics and locally-adaptive neighborhoods. The proposed estimators are able to increase the speech quality of a noisy signal, to better preserve speech intelligibility, and to introduce fewer artifacts compared with known speech enhancement estimators.

Design/methodology/approach

We design a novel speech enhancement algorithm based on rank-order statistics and local adaptive signal processing to improve the accuracy of existing speech enhancement estimators, in terms of speech quality, intelligibility, and introduction of artificial artifacts.

Findings

We found that the proposed estimators for speech enhancement adapt better to the nonstationary characteristics of speech and noise processes than known speech enhancement estimators. The proposed algorithm increases speech quality, better preserves speech intelligibility, and introduces fewer artifacts compared with known speech enhancement estimators.

Research limitations/implications

The proposed approach for speech enhancement is a locally-adaptive signal processing performed for each element of a noisy speech signal. Thus, the main limitation of the proposed approach is an increase of computational complexity compared with that of nonadaptive conventional techniques.
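The abstract does not define its estimators. To illustrate what a rank-order statistic computed for each element of a signal looks like, and why the per-sample cost grows, here is a minimal sketch that assumes the simplest case: a median over a fixed sliding window. The function name and the window size are hypothetical. The authors' neighborhoods are locally adaptive, which this sketch does not attempt to reproduce.

```python
from statistics import median


def local_median_filter(signal, half_width=2):
    """Replace each sample with the median (a rank-order statistic) of its neighborhood."""
    n = len(signal)
    out = []
    for i in range(n):
        lo = max(0, i - half_width)        # clip the window at the signal edges
        hi = min(n, i + half_width + 1)
        out.append(median(signal[lo:hi]))  # one ranking per sample
    return out


# A slowly rising signal with one impulsive outlier at index 3.
noisy = [0.0, 0.1, 0.2, 5.0, 0.4, 0.5, 0.6]
smoothed = local_median_filter(noisy, half_width=1)
```

Each output sample requires ranking its own window, so the work scales with both the signal length and the neighborhood size. This is consistent with the computational cost noted above.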

Practical implications

To perform real-time speech enhancement with the proposed approach, it is recommended to use a digital system with a fast processor. Another option is to use a parallel architecture such as an FPGA.

Originality/value

We propose a novel locally-adaptive algorithm for robust speech enhancement that incorporates the calculation of rank-order statistics and locally-adaptive neighborhoods. The proposed algorithm is able to adjust itself in response to changes in the statistical properties of ambient noise.

Details

COMPEL - The international journal for computation and mathematics in electrical and electronic engineering, vol. 35 no. 3
Type: Research Article
ISSN: 0332-1649

Open Access
Article
Publication date: 21 June 2022

Abhishek Das and Mihir Narayan Mohanty

Abstract

Purpose

Timely and accurate detection of cancer can save the life of the person affected. According to the World Health Organization (WHO), breast cancer has the highest incidence among all cancers, whereas it ranks fifth in mortality. Out of many image processing techniques, certain works have focused on convolutional neural networks (CNNs) for processing these images. However, deep learning models remain to be explored thoroughly.

Design/methodology/approach

In this work, multivariate statistics-based kernel principal component analysis (KPCA) is used to extract essential features. KPCA also helps denoise the data. These features are processed through a heterogeneous ensemble model that consists of three base models: a recurrent neural network (RNN), a long short-term memory (LSTM) network and a gated recurrent unit (GRU). The outcomes of these base learners are fed to a fuzzy adaptive resonance theory mapping (ARTMAP) model for decision making. Nodes are added to the $F_2^a$ layer only if the winning criteria are fulfilled, which makes the ARTMAP model more robust.

Findings

The proposed model is verified using a breast histopathology image dataset publicly available on Kaggle. The model achieves 99.36% training accuracy and 98.72% validation accuracy. It applies data processing at every stage: image denoising to reduce data redundancy, and ensemble learning to obtain better results than single models. The final classification uses a fuzzy ARTMAP model that controls the number of nodes depending on performance, which makes the classification robust and accurate.

Research limitations/implications

Research in the field of medical applications is an ongoing effort. More advanced algorithms are being developed for better classification. There is still scope to design models with better performance, practicability and cost efficiency in the future. The ensemble models may also be chosen with different combinations and characteristics. The proposed model may be verified on signals instead of images. Experimental analysis shows the improved performance of the proposed model. This method needs to be verified using practical models. A practical implementation will also be carried out to assess its real-time performance and cost efficiency.

Originality/value

The proposed model is utilized for denoising and to reduce the data redundancy so that the feature selection is done using KPCA. Training and classification are performed using heterogeneous ensemble model designed using RNN, LSTM and GRU as base classifiers to provide higher results than that of single models. Use of adaptive fuzzy mapping model makes the final classification accurate. The effectiveness of combining these methods to a single model is analyzed in this work.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Article
Publication date: 1 March 1991

Leslie Rosen

Abstract

A variety of enabling technologies such as synthetic speech, print enlargement on CRT screens, braille printers and displays, and communications technology has made library operations at the American Foundation for the Blind accessible to persons who are blind or visually impaired. INMAGIC software, a versatile database management system, has automated many library functions and has been integrated with other adaptive technologies. In addition to other applications, INMAGIC is used to update and create bibliographies and accession lists in inkprint, large print, or braille formats (with tape cassette versions available on request). Sidebars discuss the Xerox/Kurzweil Personal Reader (KPR); closed circuit television (CCTV); computers with speech; large print enhancements; Inmagic, Inc.—the company; and, in some depth, the functionality of INMAGIC.

Details

Library Hi Tech, vol. 9 no. 3
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 3 October 2016

Yeou-Jiunn Chen and Jiunn-Liang Wu

Abstract

Purpose

Articulation errors substantially reduce speech intelligibility and the ease of spoken communication. Moreover, the articulation learning process that speech-language pathologists must provide is time consuming and expensive. The purpose of this paper, to facilitate the articulation learning process, is to develop a computer-aided articulation learning system to help subjects with articulation disorders.

Design/methodology/approach

Facial animations, including lip and tongue animations, are used to convey the manner and place of articulation to the subject. This process improves the effectiveness of articulation learning. An interactive learning system is implemented through pronunciation confusion networks (PCNs) and automatic speech recognition (ASR), which are applied to identify mispronunciations.

Findings

Speech and facial animations are effective for assisting subjects in imitating sounds and developing articulatory ability. PCNs and ASR can be used to automatically identify mispronunciations.

Research limitations/implications

Future research will evaluate the clinical performance of this approach to articulation learning.

Practical implications

The experimental results of this study indicate that it is feasible to implement a computer-aided articulation learning system clinically for learning articulation.

Originality/value

This study developed a computer-aided articulation learning system to facilitate improving speech production ability in subjects with articulation disorders.

Details

Engineering Computations, vol. 33 no. 7
Type: Research Article
ISSN: 0264-4401

Content available
Article
Publication date: 13 November 2023

Sheuli Paul

Abstract

Purpose

This paper presents a survey of research into interactive robotic systems for the purpose of identifying the state of the art capabilities as well as the extant gaps in this emerging field. Communication is multimodal. Multimodality is a representation of many modes chosen from rhetorical aspects for its communication potentials. The author seeks to define the available automation capabilities in communication using multimodalities that will support a proposed Interactive Robot System (IRS) as an AI mounted robotic platform to advance the speed and quality of military operational and tactical decision making.

Design/methodology/approach

This review will begin by presenting key developments in the robotic interaction field with the objective of identifying essential technological developments that set conditions for robotic platforms to function autonomously. After surveying the key aspects in Human Robot Interaction (HRI), Unmanned Autonomous System (UAS), visualization, Virtual Environment (VE) and prediction, the paper then proceeds to describe the gaps in the application areas that will require extension and integration to enable the prototyping of the IRS. A brief examination of other work in HRI-related fields concludes with a recapitulation of the IRS challenge that will set conditions for future success.

Findings

Using insights from a balanced cross section of government, academic, and commercial sources that contribute to HRI, a multimodal IRS in military communication is introduced. Multimodal IRS (MIRS) in military communication has yet to be deployed.

Research limitations/implications

A multimodal robotic interface for the MIRS is an interdisciplinary endeavour. It is not realistic to expect one person to comprehend all the expert knowledge and skills needed to design and develop such a multimodal interactive robotic interface. In this brief preliminary survey, the author has discussed extant AI, robotics, NLP, CV, VDM, and VE applications that are directly related to multimodal interaction. Each mode of this multimodal communication is an active research area. Multimodal human/military robot communication is the ultimate goal of this research.

Practical implications

A multimodal autonomous robot in military communication using speech, images, gestures, VST and VE has yet to be deployed. Autonomous multimodal communication is expected to open wider possibilities for all armed forces. Given the density of the land domain, the army is in a position to exploit the opportunities for human–machine teaming (HMT) exposure. Naval and air forces will adopt platform specific suites for specially selected operators to integrate with and leverage this emerging technology. The possession of a flexible communications means that readily adapts to virtual training will enhance planning and mission rehearsals tremendously.

Social implications

A multimodal communication system based on interaction, perception, cognition and visualization is still missing. Options to communicate, express and convey information in an HMT setting with multiple options, suggestions and recommendations will certainly enhance military communication, strength, engagement, security, cognition, perception as well as the ability to act confidently for a successful mission.

Originality/value

The objective is to develop a multimodal autonomous interactive robot for military communications. This survey reports the state of the art, what exists and what is missing, what can be done and the possibilities of extension that support the military in maintaining effective communication using multimodalities. Separate progress is ongoing in areas such as machine-enabled speech, image recognition, tracking, visualizations for situational awareness, and virtual environments. At this time, there is no integrated approach for multimodal human robot interaction that proposes a flexible and agile communication. The report briefly introduces the research proposal on a multimodal interactive robot in military communication.

Article
Publication date: 29 March 2011

Halim Sayoud, Siham Ouamour and Salah Khennouf

Abstract

Purpose

The purpose of this paper is two‐fold. First, to deal with the problem of audio speaker localization and second, to deal with the problem of mobile camera control. The task of speaker localization consists of determining the position of the active speaker and the task of camera control consists of orienting a mobile camera towards that active speaker. These steps represent the main task of speaker tracking, which is the global purpose of the research work.

Design/methodology/approach

In this approach, two‐channel‐based estimation of the speaker position is achieved by comparing the signals received by two cardioid microphones, which are placed one against the other and separated by a fixed distance. The localization technique presented in this paper is inspired by the human ears, which act as two different sound observation points, enabling humans to estimate the direction of the speaking person with good precision. Concerning the camera control part, the authors have designed an automatic system for generating the command signals and controlling the rotation of the mobile camera by a stepper motor.
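The abstract does not say which property of the two microphone signals is compared. One common way to turn a two-channel recording into a direction is to estimate the time difference of arrival by cross-correlation and convert it to an angle. The sketch below assumes that technique, a far-field source and hypothetical parameter values (sample rate, microphone spacing). It is not the authors' method, which may rely on level differences between the cardioid microphones instead.

```python
import math


def estimate_delay(left, right, max_lag):
    """Return the lag in samples that maximizes the cross-correlation.

    A positive lag means the right channel receives the sound later than the left.
    """
    best_lag, best_score = 0, float("-inf")
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_angle(lag, sample_rate, mic_distance, speed_of_sound=343.0):
    """Convert a delay in samples to a bearing in degrees (far-field assumption)."""
    tau = lag / sample_rate
    s = max(-1.0, min(1.0, speed_of_sound * tau / mic_distance))  # clamp for asin
    return math.degrees(math.asin(s))


# Synthetic example: the same pulse reaches the right microphone 3 samples later.
left = [0.0] * 32
right = [0.0] * 32
left[10] = 1.0
right[13] = 1.0

lag = estimate_delay(left, right, max_lag=8)
angle = delay_to_angle(lag, sample_rate=16000, mic_distance=0.2)
```

An angle obtained this way could then be quantized into stepper motor steps to rotate the camera, although the abstract gives no details of that mapping.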

Findings

The off‐line experiments of speaker tracking by camera were carried out in a small meeting room without echo cancelation. Results show good performance of the proposed localization methods and correct tracking by the camera.

Practical implications

This new technique can be used for the automatic supervision of smart rooms.

Originality/value

The work described in this paper is original, since it uses only two microphones for the speaker localization.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 4 no. 1
Type: Research Article
ISSN: 1756-378X

Content available
Article
Publication date: 1 December 2003

Jon Rigelsford

Details

Sensor Review, vol. 23 no. 4
Type: Research Article
ISSN: 0260-2288

Article
Publication date: 29 September 2023

Ata Jahangir Moshayedi, Nafiz Md Imtiaz Uddin, Xiaohong Zhang and Mehran Emadi Andani

Abstract

Purpose

This paper aims to explore and review the potential of robotic rehabilitation as a treatment approach for Alzheimer’s disease (AD) and its impact on the health and quality of life of AD patients.

Design/methodology/approach

The present discourse endeavors to provide a comprehensive overview of extant scholarly inquiries that have examined the salience of inhibitory mechanisms vis-à-vis robotic interventions and their impact on patients with AD. Specifically, this review aims to explicate the contemporary state of affairs in this realm by furnishing a detailed explication of ongoing research endeavors. With the objective of elucidating the significance of inhibitory processes in robotic therapies for individuals with AD, this analysis offers a critical appraisal of extant literature that probes the intersection of cognitive mechanisms and assistive technologies. Through a meticulous analysis of diverse scholarly contributions, this review advances a nuanced understanding of the intricate interplay between inhibitory processes and robotic interventions in the context of AD.

Findings

According to the review papers, it appears that implementing robot-assisted rehabilitation can serve as a pragmatic and effective solution for enhancing the well-being and overall quality of life of patients and families engaged with AD. Besides, this new feature in the robotic area is anticipated to have a critical role in the success of this innovative approach.

Research limitations/implications

Due to the nascent nature of this cutting-edge technology and the constrained configuration of the mechanized entity in question, further protracted analysis is imperative to ascertain the advantages and drawbacks of robotic rehabilitation vis-à-vis individuals afflicted with Alzheimer’s ailment.

Social implications

The potential for robots to serve as indispensable assets in the provision of care for individuals afflicted with AD is significant; however, their efficacy and appropriateness for utilization by caregivers of AD patients must be subjected to further rigorous scrutiny.

Originality/value

This paper reviews the current robotic method and compares the current state of the art for the AD patient.

Details

Robotic Intelligence and Automation, vol. 43 no. 6
Type: Research Article
ISSN: 2754-6969
