Search results

1 – 10 of over 1000
Article
Publication date: 1 May 2006

Mike Wald

Abstract

Lectures can be digitally recorded and replayed to provide multimedia revision material for students who attended the class and a substitute learning experience for students unable to attend. Deaf and hard of hearing people can find it difficult to follow speech through hearing alone or to take notes while they are lip-reading or watching a sign-language interpreter. Notetakers can only summarise what is being said, while qualified sign-language interpreters with a good understanding of the relevant higher education subject content are in very scarce supply. Synchronising the speech with text captions can ensure deaf students are not disadvantaged and assist all learners in searching for relevant specific parts of the multimedia recording by means of the synchronised text. Real-time stenography transcription is not normally available in UK higher education because of the shortage of stenographers wishing to work in universities. Captions are time-consuming and expensive to create by hand, and while Automatic Speech Recognition can be used to provide real-time captioning directly from lecturers' speech in classrooms, it has proved difficult to obtain accuracy comparable to stenography. This paper describes the development of a system that enables editors to correct errors in the captions as they are created by Automatic Speech Recognition.
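As a rough, hedged illustration of the synchronisation idea described above (not the authors' system; the Word structure, timings and thresholds below are invented), word-level timestamps from an ASR engine can be grouped into caption segments whose text is searchable, letting a learner jump to the matching point in the recording:

    # Illustrative sketch only: group hypothetical word-level ASR output
    # into caption segments, then search the caption text to find where
    # in the recording a phrase was spoken.
    from dataclasses import dataclass

    @dataclass
    class Word:
        text: str
        start: float   # seconds into the recording
        end: float

    def build_captions(words, max_gap=1.0, max_words=12):
        """Group consecutive recognised words into caption segments."""
        captions, current = [], []
        for w in words:
            if current and (w.start - current[-1].end > max_gap
                            or len(current) >= max_words):
                captions.append(current)
                current = []
            current.append(w)
        if current:
            captions.append(current)
        return [{"start": seg[0].start, "end": seg[-1].end,
                 "text": " ".join(w.text for w in seg)} for seg in captions]

    def find_in_recording(captions, query):
        """Return start times of captions whose text contains the query."""
        q = query.lower()
        return [c["start"] for c in captions if q in c["text"].lower()]

    words = [Word("speech", 12.0, 12.4), Word("recognition", 12.4, 13.1),
             Word("errors", 14.8, 15.3)]
    print(find_in_recording(build_captions(words), "recognition"))  # [12.0]

An editor-correction step of the kind the paper describes would amount to updating the text field of a caption while keeping its timestamps, so the link between the transcript and the recording is preserved.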

Details

Interactive Technology and Smart Education, vol. 3 no. 2
Type: Research Article
ISSN: 1741-5659

Article
Publication date: 1 April 1983

Abstract

Papers and articles on automatic speech recognition appear in many different journals. Research on the nature of speech is prominent in the Journal of the Acoustical Society of America, and for research on algorithms for speech recognition, the IEEE Proceedings on Acoustics, Speech and Signal Processing can be recommended.

Details

Sensor Review, vol. 3 no. 4
Type: Research Article
ISSN: 0260-2288

Article
Publication date: 9 September 2013

Sunhee Kim, Yumi Hwang, Daejin Shin, Chang-Yeal Yang, Seung-Yeun Lee, Jin Kim, Byunggoo Kong, Jio Chung, Namhyun Cho, Ji-Hwan Kim and Minhwa Chung

Abstract

Purpose

This paper describes the development process of a mobile Voice User Interface (VUI) for Korean users with dysarthria, using currently available speech recognition technology, by conducting systematic user-needs analysis and applying usability-testing feedback to prototype system designs.

Design/methodology/approach

Four usability surveys are conducted for the development of the prototype system. Based on the two surveys on user needs and on user experiences with existing VUI systems at the prototype design stage, the target platforms and target applications are determined. Furthermore, a set of basic words is selected by the prospective users, which enables the system to be not only custom designed for dysarthric speakers but also individualized for each user. Reflecting users' requests about general VUI usage and their UI design preferences, gathered through evaluation of the initial prototype, we develop the final prototype: an individualized voice keyboard for mobile devices based on an isolated-word recognition engine with word prediction.
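A minimal sketch of what the individualisation and word-prediction ideas might look like in code (purely illustrative: the vocabulary, phrases and class below are invented, and a simple membership check stands in for the isolated-word recognition engine):

    # Illustrative sketch only: a per-user word list plus a very simple
    # next-word predictor built from the user's own phrase history.
    from collections import defaultdict

    class WordPredictor:
        """Suggest likely next words from previously entered phrases."""
        def __init__(self):
            self.bigrams = defaultdict(lambda: defaultdict(int))

        def train(self, phrases):
            for phrase in phrases:
                tokens = phrase.split()
                for prev, nxt in zip(tokens, tokens[1:]):
                    self.bigrams[prev][nxt] += 1

        def suggest(self, prev_word, k=3):
            counts = self.bigrams.get(prev_word, {})
            return sorted(counts, key=counts.get, reverse=True)[:k]

    # Each user selects their own basic word set (individualisation).
    user_vocabulary = {"water", "help", "phone", "call", "mum"}

    predictor = WordPredictor()
    predictor.train(["call mum", "call help", "phone mum"])

    recognised = "call"   # stand-in for the isolated-word recogniser output
    if recognised in user_vocabulary:
        print(predictor.suggest(recognised))   # e.g. ['mum', 'help']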

Findings

The results of this paper show that target user participation in system development is effective for improving usability and satisfaction of the system, as the system is developed considering various ideas and feedback obtained in each development stage from different prospective users.

Originality/value

We have developed an automatic speech recognition-based mobile VUI system not only custom designed for dysarthric speakers but also individualized for each user, focussing on the usability aspect through four usability surveys. This voice keyboard system has the potential to be an assistive and alternative input method for people with speech impairment, including mild to moderate dysarthria, and people with physical disabilities.

Article
Publication date: 1 January 1992

B.J. Garner, C.L. Forrester and D. Lukose

Abstract

The concept of a knowledge interface for library users is developed as an extension of intelligent knowledge-based system (IKBS) concepts. Contemporary directions in intelligent decision support, particularly in the role of search intermediaries, are then examined to identify the significance of intelligent intermediaries as a solution to unstructured decision support requirements of library users. A DISCOURSE SCRIPT is given to illustrate one form of intelligent intermediary.

Details

Library Hi Tech, vol. 10 no. 1/2
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 1 April 1983

Abstract

Speech recognition machines currently on the market are all built upon the same research foundation. The most important milestones on the road to present‐day systems are reviewed in this article based largely on an interview with Dr Roger Moore of the Royal Signals and Radar Establishment.

Details

Sensor Review, vol. 3 no. 4
Type: Research Article
ISSN: 0260-2288

Article
Publication date: 16 April 2020

Rajasekhar B, Kamaraju M and Sumalatha V

Abstract

Purpose

Nowadays, speech emotion recognition (SER) has emerged as a major research topic in various fields, including human–computer interaction and speech processing. Generally, it focuses on using machine learning models to predict the exact emotional state of a speaker from speech. Advanced SER applications have been successful in affective computing and human–computer interaction and are becoming a key component of next-generation computer systems, because a natural human–machine interface can provide automatic services that require a good appreciation of the user's emotional state.

Design/methodology/approach

This paper implements a new SER model that incorporates both gender and emotion recognition. Certain features are extracted and used for the classification of emotions. For this, the paper uses a deep belief network (DBN) model.
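As a very rough, hedged sketch of this kind of pipeline (not the authors' implementation: a scikit-learn multilayer perceptron stands in for the DBN, and random numbers stand in for real acoustic features and emotion labels):

    # Illustrative sketch only: feature vectors -> emotion classes.
    # An MLP substitutes for the paper's deep belief network, and the
    # data are synthetic rather than extracted speech features.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 40))      # 200 utterances, 40-dim features
    y = rng.integers(0, 4, size=200)    # 4 emotion classes (toy labels)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500,
                        random_state=0)
    clf.fit(X_tr, y_tr)
    print("toy accuracy:", clf.score(X_te, y_te))

In the paper itself, the network weights are tuned with the MUPW algorithm mentioned under Originality/value; that optimiser is not shown here.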

Findings

The performance analysis shows that the developed method attains a high accuracy rate compared to other methods. In the best case, it is 1.02% superior to the whale optimization algorithm (WOA), 0.32% better than firefly (FF), 23.45% superior to particle swarm optimization (PSO) and 23.41% superior to the genetic algorithm (GA). In the worst case, the mean update of particle swarm and whale optimization (MUPW) is 15.63%, 15.98%, 16.06% and 16.03% superior in accuracy to WOA, FF, PSO and GA, respectively. In the mean case, MUPW performs 16.67%, 10.38%, 22.30% and 22.47% better than WOA, FF, PSO and GA, respectively.

Originality/value

This paper presents a new model for SER that addresses both gender and emotion recognition. A DBN is used for classification, and this is the first work to use the MUPW algorithm to find the optimal weights of the DBN model.

Details

Data Technologies and Applications, vol. 54 no. 3
Type: Research Article
ISSN: 2514-9288

Book part
Publication date: 13 June 2013

Li Xiao, Hye-jin Kim and Min Ding

Abstract

Purpose

The advancement of multimedia technology has spurred the use of multimedia in business practice. The adoption of audio and visual data will accelerate as marketing scholars become more aware of the value of audio and visual data and of the technologies required to reveal insights into marketing problems. This chapter aims to introduce marketing scholars to this field of research.

Design/methodology/approach

This chapter reviews the current technology in audio and visual data analysis and discusses rewarding research opportunities in marketing using these data.

Findings

Compared with traditional data such as survey and scanner data, audio and visual data provide richer information and are easier to collect. Given this superiority, along with data availability, feasibility of storage and increasing computational power, we believe that these data will contribute to better marketing practices with the help of marketing scholars in the near future.

Practical implications

The adoption of audio and visual data in marketing practice will help practitioners gain better insights into marketing problems and thus make better decisions.

Value/originality

This chapter makes the first attempt in the marketing literature to review the current technology in audio and visual data analysis and proposes promising applications of such technology. We hope it will inspire scholars to utilize audio and visual data in marketing research.

Details

Review of Marketing Research
Type: Book
ISBN: 978-1-78190-761-0

Article
Publication date: 1 March 1991

Holley R. Lange, George Philip, Bradley C. Watson, John Kountz, Samuel T. Waters and George Doddington

Abstract

A real potential exists for library use of voice technologies: as aids to the disabled or illiterate library user, as front‐ends for general library help systems, in online systems for commands or control words, and in many of the hands‐busy‐eyes‐busy activities that are common in libraries. Initially, these applications would be small, limited processes that would not require the more fluent human‐machine communication that we might hope for in the future. Voice technologies will depend on and benefit from new computer systems, advances in artificial intelligence and expert systems to facilitate their use and enable them to better circumvent present input and output problems. These voice systems will gradually assume more importance, improving access to information and complementing existing systems, but they will not likely revolutionize or dominate human‐machine communications or library services in the near future.

Details

Library Hi Tech, vol. 9 no. 3
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 12 August 2019

Björn Schuller

Abstract

Purpose

Uncertainty is an under-respected issue when it comes to the automatic assessment of human emotion by machines. The purpose of this paper is to highlight the existing approaches towards such measurement of uncertainty and to identify further research needs.

Design/methodology/approach

The discussion is based on a literature review.

Findings

Technical solutions towards measurement of uncertainty in automatic emotion recognition (AER) exist but need to be extended to respect a range of so far underrepresented sources of uncertainty. These then need to be integrated into systems available to general users.
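As a hedged illustration of what a concrete uncertainty measure can look like (the entropy of a classifier's output distribution is a standard choice in classification generally, not necessarily one the paper singles out; the probabilities below are made up):

    # Illustrative sketch only: higher entropy = a less certain prediction.
    import math

    def predictive_entropy(probs):
        """Shannon entropy (in bits) of a class-probability distribution."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    confident = [0.90, 0.05, 0.03, 0.02]   # strongly favours one emotion
    uncertain = [0.30, 0.28, 0.22, 0.20]   # close to guessing

    print(round(predictive_entropy(confident), 2))   # ~0.62 bits
    print(round(predictive_entropy(uncertain), 2))   # ~1.98 bits

Reporting such a score alongside the predicted emotion is one way systems could expose their uncertainty to general users, as the findings suggest.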

Research limitations/implications

Not all sources of uncertainty in automatic emotion recognition (AER), including emotion representation and annotation, can be touched upon in this communication.

Practical implications

AER systems should be enhanced to provide more meaningful and complete information on the uncertainty underlying their estimates. Limitations of their applicability should be communicated to users.

Social implications

Users of automatic emotion recognition technology will become aware of its limitations, potentially leading to fairer usage in crucial application contexts.

Originality/value

There is no previous discussion of extended uncertainty measurement in automatic emotion recognition that includes the technical viewpoint.

Details

Journal of Information, Communication and Ethics in Society, vol. 17 no. 3
Type: Research Article
ISSN: 1477-996X

Article
Publication date: 21 September 2015

Foad Hamidi, Melanie Baljko, Connie Ecomomopoulos, Nigel J. Livingston and Leonhard G. Spalteholz

Abstract

Purpose

The purpose of this paper is to describe the development and evaluation of CanSpeak, an open-source speech interface for users with dysarthria of speech. The interface can be customized by each user to map a small number of words they can speak clearly to commands in the computer system, thereby adding a new modality to their interaction.
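A minimal sketch of the word-to-command mapping idea (purely illustrative and not the CanSpeak code; the word list and command names are invented, and the recogniser output is simulated with plain strings):

    # Illustrative sketch only: dispatch a recognised word through a
    # user's personal mapping from clearly pronounceable words to commands.
    user_commands = {
        "go":   "press_enter",
        "up":   "scroll_up",
        "down": "scroll_down",
        "stop": "escape",
    }

    def dispatch(recognised_word, commands):
        """Translate one recognised word into a command name, if mapped."""
        action = commands.get(recognised_word.lower())
        return action if action else "ignored (not in this user's mapping)"

    print(dispatch("Down", user_commands))    # -> scroll_down
    print(dispatch("hello", user_commands))   # -> ignored (...)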

Design/methodology/approach

The interface was developed in two phases: in the first phase, the authors used participatory design to engage the users and their community in the customization of the system, and in the second phase, they used a more focussed co-design methodology, during which a user of the system became a co-designer by directly making new design decisions about the system.

Findings

The study showed that it is important to include assistive technology users and their community in the design and customization of technology. Participation led to increased engagement and adoption, and also provided new ideas that were rooted in the experience of the user.

Originality/value

The co-design phase of the project provided an opportunity for the researchers to work closely with a user of their system and include her in design decisions. The study showed that employing co-design can reveal new insights into the design domain and incorporate them into the design in ways that might not happen otherwise.

Details

Journal of Assistive Technologies, vol. 9 no. 3
Type: Research Article
ISSN: 1754-9450
