Search results

1–10 of over 11,000
Article
Publication date: 9 September 2013

Sunhee Kim, Yumi Hwang, Daejin Shin, Chang-Yeal Yang, Seung-Yeun Lee, Jin Kim, Byunggoo Kong, Jio Chung, Namhyun Cho, Ji-Hwan Kim and Minhwa Chung


Abstract

Purpose

This paper describes the development process of a mobile Voice User Interface (VUI) for Korean users with dysarthria with currently available speech recognition technology by conducting systematic user needs analysis and applying usability testing feedback to prototype system designs.

Design/methodology/approach

Four usability surveys are conducted during the development of the prototype system. Based on the two surveys on user needs and on user experiences with existing VUI systems at the prototype design stage, the target platforms and target applications are determined. Furthermore, a set of basic words is selected by the prospective users, which enables the system to be not only custom designed for dysarthric speakers but also individualized for each user. Reflecting users' requests concerning general VUI usage and their UI design preferences, gathered through evaluation of the initial prototype, we develop the final prototype: an individualized voice keyboard for mobile devices based on an isolated word recognition engine with word prediction.
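As an illustration of the word-prediction component mentioned above, the following is a minimal sketch of a prefix-based predictor over an individualized word set. The class name, frequency-based ranking and `confirm` update are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of a word predictor over a per-user basic word set,
# ranked by usage frequency and queried by typed/recognized prefix.

class WordPredictor:
    def __init__(self, user_words):
        # user_words: {word: usage_count}, the individualized basic word set
        self.user_words = dict(user_words)

    def predict(self, prefix, k=3):
        # Return the k most frequently used words starting with the prefix.
        matches = [w for w in self.user_words if w.startswith(prefix)]
        return sorted(matches, key=lambda w: -self.user_words[w])[:k]

    def confirm(self, word):
        # Reinforce a word each time the user selects it, so predictions
        # adapt to the individual speaker over time.
        self.user_words[word] = self.user_words.get(word, 0) + 1

predictor = WordPredictor({"hello": 5, "help": 9, "home": 2})
print(predictor.predict("he"))  # → ['help', 'hello']
```

The `confirm` step is one plausible way to realize the individualization the abstract describes: the ranking drifts toward each user's actual vocabulary.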

Findings

The results of this paper show that target user participation in system development is effective for improving the usability of, and satisfaction with, the system, as the system is developed with the ideas and feedback obtained from different prospective users at each development stage.

Originality/value

We have developed an automatic speech recognition-based mobile VUI system not only custom designed for dysarthric speakers but also individualized for each user, focussing on the usability aspect through four usability surveys. This voice keyboard system has the potential to be an assistive and alternative input method for people with speech impairment, including mild to moderate dysarthria, and people with physical disabilities.

Article
Publication date: 4 September 2009

Michael Schuricht, Zachary Davis, Michael Hu, Shreyas Prasad, Peter M. Melliar‐Smith and Louise E. Moser


Abstract

Purpose

Mobile handheld devices, such as cellular phones and personal digital assistants, are inherently small and lack an intuitive and natural user interface. Speech recognition and synthesis technology can be used in mobile handheld devices to improve the user experience. The purpose of this paper is to describe a prototype system that supports multiple speech‐enabled applications in a mobile handheld device.

Design/methodology/approach

The main component of the system, the Program Manager, coordinates and controls the speech‐enabled applications. Human speech requests to, and responses from, these applications are processed in the mobile handheld device, to achieve the goal of human‐like interactions between the human and the device. In addition to speech, the system also supports graphics and text, i.e., multimodal input and output, for greater usability, flexibility, adaptivity, accuracy, and robustness. The paper presents a qualitative and quantitative evaluation of the prototype system. The Program Manager is currently designed to handle the specific speech‐enabled applications that we developed.
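The coordination role described above can be sketched as a simple keyword-based dispatcher. This is a hypothetical illustration of the Program Manager idea, not the paper's actual design; the registration API and the keyword-matching rule are assumptions.

```python
# Illustrative sketch (not the authors' code) of a Program Manager that
# routes recognized utterances to registered speech-enabled applications.

class ProgramManager:
    def __init__(self):
        self.apps = {}  # keyword -> handler function

    def register(self, keyword, handler):
        self.apps[keyword] = handler

    def dispatch(self, utterance):
        # Route the utterance to every application whose keyword it mentions,
        # so a single request can involve multiple applications together.
        words = utterance.lower().split()
        responses = [h(utterance) for kw, h in self.apps.items() if kw in words]
        return responses or ["Sorry, no application can handle that."]

pm = ProgramManager()
pm.register("calendar", lambda u: "Calendar: next meeting at 3 pm")
pm.register("email", lambda u: "Email: 2 unread messages")
print(pm.dispatch("check my email and calendar"))
```

Note how one utterance can fan out to several applications, echoing the finding that interactions often involve multiple applications working together.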

Findings

The paper determines that many human interactions involve not single applications but multiple applications working together in possibly unanticipated ways.

Research limitations/implications

Future work includes generalization of the Program Manager so that it supports arbitrary applications and the addition of new applications dynamically. Future work also includes deployment of the Program Manager and the applications on cellular phones running the Android Platform or the Openmoko Framework.

Originality/value

This paper presents a first step towards a future human interface for mobile handheld devices and for speech‐enabled applications operating on those devices.

Details

International Journal of Pervasive Computing and Communications, vol. 5 no. 3
Type: Research Article
ISSN: 1742-7371


Article
Publication date: 1 March 1991

Holley R. Lange, George Philip, Bradley C. Watson, John Kountz, Samuel T. Waters and George Doddington



Abstract

A real potential exists for library use of voice technologies: as aids to the disabled or illiterate library user, as front‐ends for general library help systems, in online systems for commands or control words, and in many of the hands‐busy‐eyes‐busy activities that are common in libraries. Initially, these applications would be small, limited processes that would not require the more fluent human‐machine communication that we might hope for in the future. Voice technologies will depend on and benefit from new computer systems, advances in artificial intelligence and expert systems to facilitate their use and enable them to better circumvent present input and output problems. These voice systems will gradually assume more importance, improving access to information and complementing existing systems, but they will not likely revolutionize or dominate human‐machine communications or library services in the near future.

Details

Library Hi Tech, vol. 9 no. 3
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 1 April 1983


Abstract

Papers and articles on automatic speech recognition appear in many different journals. Research on the nature of speech is prominent in the Journal of the Acoustical Society of America, and for research on algorithms for speech recognition the IEEE Proceedings on Acoustics, Speech and Signal Processing can be recommended.

Details

Sensor Review, vol. 3 no. 4
Type: Research Article
ISSN: 0260-2288

Article
Publication date: 27 June 2008

Soo‐Young Suk and Hyun‐Yeol Chung


Abstract

Purpose

The purpose of this paper is to describe a speech and character combined recognition engine (SCCRE) developed for working on personal digital assistants (PDAs) or on mobile devices. Also, the architecture of a distributed recognition system for providing a more convenient user interface is discussed.

Design/methodology/approach

In SCCRE, feature extraction for speech and for character input is carried out separately, but recognition is performed in a single engine. The client recognition engine employs a continuous hidden Markov model (CHMM) structure with a variable-parameter topology, which minimizes the number of model parameters and reduces recognition time. The model also adopts the proposed successive state and mixture splitting (SSMS) method for generating context-independent models. SSMS optimizes the number of mixtures through splitting in the mixture domain and the number of states through splitting in the time domain.
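As a rough illustration of the mixture-splitting half of SSMS, the sketch below splits the heaviest Gaussian component of a one-dimensional mixture into two components with perturbed means. This is a generic splitting heuristic; the function name, the `eps` perturbation and the heaviest-component criterion are assumptions, not the authors' exact algorithm.

```python
# Toy sketch of Gaussian mixture splitting: replace the heaviest component
# with two components whose means are nudged apart along the std deviation.

def split_largest_mixture(means, variances, weights, eps=0.2):
    # Pick the component with the largest weight.
    i = max(range(len(weights)), key=lambda k: weights[k])
    sd = variances[i] ** 0.5
    # Replace component i with two components; each inherits the variance
    # and half the weight, so the mixture still sums to 1.
    means = means[:i] + [means[i] - eps * sd, means[i] + eps * sd] + means[i + 1:]
    variances = variances[:i] + [variances[i], variances[i]] + variances[i + 1:]
    weights = weights[:i] + [weights[i] / 2, weights[i] / 2] + weights[i + 1:]
    return means, variances, weights

m, v, w = split_largest_mixture([0.0, 4.0], [1.0, 1.0], [0.7, 0.3])
print(m)  # → [-0.2, 0.2, 4.0]
```

In SSMS proper, such splits would be applied successively in both the mixture domain and the time (state) domain, retraining after each split, which this sketch omits.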

Findings

The recognition results show that, when applied to speech recognition for mobile devices, the developed engine can reduce the total number of Gaussians by up to 40 per cent compared with fixed-parameter models at the same recognition performance. SSMS can reduce the memory required for models to 65 per cent, and that required for processing to 82 per cent, of the original. Moreover, recognition time decreases by 17 per cent with the SSMS model while maintaining the recognition rate.

Originality/value

The proposed system will be very useful for many on‐line multimodal interfaces such as PDAs and mobile applications.

Details

International Journal of Pervasive Computing and Communications, vol. 4 no. 2
Type: Research Article
ISSN: 1742-7371


Open Access
Article
Publication date: 30 November 2023

H.A. Dimuthu Maduranga Arachchi and G. Dinesh Samarasinghe



Abstract

Purpose

This study aims to examine the influence of the derived attributes of embedded artificial intelligence-mobile smart speech recognition (AI-MSSR) technology, namely perceived usefulness, perceived ease of use (PEOU) and perceived enjoyment (PE) on consumer purchase intention (PI) through the chain relationships of attitudes to AI and consumer smart experience, with the moderating effect of consumer innovativeness and Generation (Gen) X and Gen Y in fashion retail.

Design/methodology/approach

The study employed a quantitative survey strategy, drawing a sample of 836 respondents from Sri Lanka and India representing Gen X and Gen Y. The data analysis was carried out using partial least squares structural equation modelling (PLS-SEM) in SmartPLS.

Findings

The findings show a positive relationship between the perceived attributes of MSSR and consumer PI via attitudes towards AI (AAI) and smart consumer experiences. In addition, consumer innovativeness and Generations X and Y have a moderating impact on the aforementioned relationship. The theoretical and managerial implications of the study are discussed with a note on the research limitations and further research directions.

Practical implications

To multiply the effects of embedded AI-MSSR and consumer PI in fashion retail marketing, managers can develop strategies that strengthen the links between awareness, knowledge of the derived attributes of embedded AI-MSSR and PI by encouraging innovative consumers, especially Gen Y consumers, to engage with embedded AI-MSSR.

Originality/value

This study advances the literature on embedded AI-MSSR and consumer PI in fashion retail marketing by providing an integrated view of the technology acceptance model (TAM), the diffusion of innovation (DOI) theory and the generational cohort perspective in predicting PI.

Details

European Journal of Management Studies, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2183-4172


Article
Publication date: 1 April 1983


Abstract

Speech recognition machines currently on the market are all built upon the same research foundation. The most important milestones on the road to present‐day systems are reviewed in this article based largely on an interview with Dr Roger Moore of the Royal Signals and Radar Establishment.

Details

Sensor Review, vol. 3 no. 4
Type: Research Article
ISSN: 0260-2288

Article
Publication date: 1 January 1992

B.J. Garner, C.L. Forrester and D. Lukose


Abstract

The concept of a knowledge interface for library users is developed as an extension of intelligent knowledge‐base system (IKBS) concepts. Contemporary directions in intelligent decision support, particularly in the role of search intermediaries, are then examined to identify the significance of intelligent intermediaries as a solution to unstructured decision support requirements of library users. A DISCOURSE SCRIPT is given to illustrate one form of intelligent intermediary.

Details

Library Hi Tech, vol. 10 no. 1/2
Type: Research Article
ISSN: 0737-8831

Book part
Publication date: 13 June 2013

Li Xiao, Hye-jin Kim and Min Ding


Abstract

Purpose

The advancement of multimedia technology has spurred the use of multimedia in business practice. The adoption of audio and visual data will accelerate as marketing scholars become more aware of the value of audio and visual data and the technologies required to reveal insights into marketing problems. This chapter aims to introduce marketing scholars to this field of research.

Design/methodology/approach

This chapter reviews the current technology in audio and visual data analysis and discusses rewarding research opportunities in marketing using these data.

Findings

Compared with traditional data such as survey and scanner data, audio and visual data provide richer information and are easier to collect. Given these advantages, together with data availability, feasibility of storage and increasing computational power, we believe that these data will contribute to better marketing practices with the help of marketing scholars in the near future.

Practical implications

The adoption of audio and visual data in marketing practice will help practitioners gain better insights into marketing problems and thus make better decisions.

Value/originality

This chapter makes a first attempt in the marketing literature to review the current technology in audio and visual data analysis and proposes promising applications of such technology. We hope it will inspire scholars to utilize audio and visual data in marketing research.

Details

Review of Marketing Research
Type: Book
ISBN: 978-1-78190-761-0


Article
Publication date: 17 April 2020

Rajasekhar B, Kamaraju M and Sumalatha V


Abstract

Purpose

Nowadays, speech emotion recognition (SER) has emerged as a major research topic in various fields, including human–computer interaction and speech processing. Generally, it focuses on utilizing machine learning models to predict the exact emotional state from speech. Advanced SER applications have been successful in affective computing and human–computer interaction and are becoming a core component of the next generation of computer systems, because a natural human–machine interface can provide automatic services, which require a better appreciation of the user's emotional state.

Design/methodology/approach

This paper implements a new SER model that incorporates both gender and emotion recognition. Certain features are extracted and subjected to emotion classification. For this, the paper uses a deep belief network (DBN) model.
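To make the pipeline concrete, here is a minimal sketch of feature extraction followed by emotion classification. A nearest-centroid classifier stands in for the paper's DBN with optimized weights, which is not reproduced here, and all feature choices and names are illustrative assumptions.

```python
# Minimal sketch of an SER pipeline: extract features from a speech sample,
# then classify the emotion. A nearest-centroid classifier is a stand-in for
# the paper's DBN; the toy features are not the paper's feature set.

def extract_features(samples):
    # Toy features: mean amplitude and a crude energy measure. A real system
    # would use spectral features such as MFCCs.
    n = len(samples)
    mean = sum(samples) / n
    energy = sum(x * x for x in samples) / n
    return (mean, energy)

def classify(features, centroids):
    # centroids: {emotion: feature tuple}; pick the closest emotion.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda e: dist(features, centroids[e]))

centroids = {"angry": (0.0, 0.9), "neutral": (0.0, 0.1)}
print(classify(extract_features([0.8, -0.9, 1.0, -1.0]), centroids))  # → angry
```

A gender-recognition branch, as in the paper, would add a second classifier over the same extracted features.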

Findings

Through the performance analysis, it is observed that the developed method attains a high accuracy rate (in the best case) compared with other methods: it is 1.02% superior to the whale optimization algorithm (WOA), 0.32% better than firefly (FF), 23.45% superior to particle swarm optimization (PSO) and 23.41% superior to the genetic algorithm (GA). In the worst case, the mean update of particle swarm and whale optimization (MUPW) is 15.63%, 15.98%, 16.06% and 16.03% superior in accuracy to WOA, FF, PSO and GA, respectively. In the mean case, the performance of MUPW remains high: it is 16.67%, 10.38%, 22.30% and 22.47% better than WOA, FF, PSO and GA, respectively.

Originality/value

This paper presents a new model for SER that performs both gender and emotion recognition. A DBN is used for classification, and its weights are optimized with the MUPW algorithm; this is the first work to use MUPW for finding the optimal weights of a DBN model.

Details

Data Technologies and Applications, vol. 54 no. 3
Type: Research Article
ISSN: 2514-9288

