Search results

1 – 10 of 153
Article
Publication date: 20 August 2021

Megan Burfoot, Amirhosein Ghaffarianhoseini, Nicola Naismith and Ali GhaffarianHoseini

Informed by acoustic design standards, built environments are designed with single reverberation times (RTs), a trade-off between the long and short RTs needed for different space…


Abstract

Purpose

Informed by acoustic design standards, built environments are designed with a single reverberation time (RT), a trade-off between the long and short RTs needed for different space functions. A range of RTs should be achievable in spaces to optimise acoustic comfort in different aural situations. This paper introduces a novel concept: intelligent passive room acoustic technology (IPRAT), which achieves real-time room acoustic optimisation through the integration of passive variable acoustic technology (PVAT) and acoustic scene classification (ASC). ASC can intelligently identify changing aural situations, and PVAT can physically vary the RT.
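
The trade-off can be made concrete with Sabine's classic formula, RT60 = 0.161·V/A: raising a room's total absorption A shortens the RT, which is the physical lever a PVAT system adjusts. The sketch below is a generic textbook illustration, not the paper's method; the room dimensions and absorption coefficients are invented for the example.

```python
def sabine_rt(volume_m3, surface_areas_m2, absorption_coeffs):
    # Sabine's formula: RT60 = 0.161 * V / A, where A is the total
    # absorption (sum of each surface area times its coefficient).
    total_absorption = sum(s * a for s, a in zip(surface_areas_m2, absorption_coeffs))
    return 0.161 * volume_m3 / total_absorption

# Hypothetical 200 m^3 classroom: exposing absorptive panels (as a PVAT
# system might) raises the coefficients and shortens the RT.
rt_panels_closed = sabine_rt(200, [120, 60], [0.05, 0.10])  # hard surfaces
rt_panels_open = sabine_rt(200, [120, 60], [0.40, 0.60])    # panels exposed
```

A variable-RT space is, in this simplified view, just a room whose effective absorption coefficients can be switched between states like these.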

Design/methodology/approach

A qualitative best-evidence synthesis method is used to review the available literature on PVAT and ASC.

Findings

First, it is highlighted that dynamic spaces should be designed with varying RTs. The review then exposes a gap: the lack of a means of intelligently adjusting RT according to changing building function. A solution is found in IPRAT, which integrates PVAT and ASC to uniquely fill this gap in the literature.

Originality/value

The development, functionality, benefits and challenges of IPRAT offer a holistic understanding of the state of the art, and a use case example is provided. Going forward, it is concluded that IPRAT can be prototyped and its impact on acoustic comfort quantified.

Details

Smart and Sustainable Built Environment, vol. 12 no. 1
Type: Research Article
ISSN: 2046-6099


Article
Publication date: 30 August 2022

Megan Burfoot, Nicola Naismith, Ali GhaffarianHoseini and Amirhosein Ghaffarianhoseini

Informed by acoustic design standards, built environments are designed with single reverberation times (RTs), a trade-off between the long and short RTs needed for different space…

Abstract

Purpose

Informed by acoustic design standards, built environments are designed with a single reverberation time (RT), a trade-off between the long and short RTs needed for different space functions. The novel intelligent passive room acoustic technology (IPRAT) has the potential to revolutionise room acoustics; it is therefore imperative to analyse and quantify its effect. IPRAT achieves real-time room acoustic improvement by integrating passive variable acoustic technology (PVAT) and acoustic scene classification (ASC). This paper aims to compare IPRAT simulation results with the AS/NZS 2107:2016 Australian/New Zealand recommended design acoustic standards.

Design/methodology/approach

In this paper, 20 classroom environments are virtually configured for simulation by combining 5 classrooms with 4 aural situations typical of New Zealand classrooms. The acoustic parameters RT, sound clarity (C50) and sound strength (G) are analysed in the simulation. These parameters can be used to determine the effects of improved acoustics on both teacher vocal relief and student comprehension. IPRAT was assumed to vary RT and was represented in the simulation by six different absorption coefficient spectra.

Findings

The optimised acoustic parameters were derived from relationships between C50, RT and G. These relationships and optimal RTs contribute a unique database to the literature. IPRAT's advantages were discerned from a comparison of "current," "attainable" and "optimised" acoustic parameters.
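
As a rough illustration of why C50 and RT are related at all, consider the standard diffuse-field approximation (textbook acoustics, not the paper's simulated database): for an ideal exponential decay e^(-13.8·t/RT), the early-to-late energy ratio gives C50 directly from RT.

```python
import math

def c50_diffuse(rt_seconds):
    # Diffuse-field estimate of sound clarity C50 (dB) from RT60:
    # with decay e^(-13.8 * t / RT), the 0-50 ms vs late energy ratio
    # works out to C50 = 10 * log10(e^(0.69 / RT) - 1).
    return 10 * math.log10(math.exp(0.69 / rt_seconds) - 1)

# Shorter RT -> higher clarity, which is why speech-oriented aural
# situations call for short RTs while music tolerates longer ones.
clarity_short_rt = c50_diffuse(0.5)  # positive: speech-friendly
clarity_long_rt = c50_diffuse(2.0)   # negative: reverberant
```

Real rooms deviate from this ideal, which is why the paper derives its C50/RT/G relationships from simulation rather than this formula.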

Originality/value

By quantifying the effect of IPRAT, it is understood that IPRAT has the potential to satisfy the key recommendations of professional industry standards (for New Zealand, namely the AS/NZS 2107:2016 recommended design acoustic standards).

Details

Smart and Sustainable Built Environment, vol. 12 no. 5
Type: Research Article
ISSN: 2046-6099


Article
Publication date: 26 March 2021

Hima Bindu Valiveti, Anil Kumar B., Lakshmi Chaitanya Duggineni, Swetha Namburu and Swaraja Kuraparthi

Road accidents, an inadvertent mishap, can be detected automatically and alerts sent instantly through the collaboration of image processing techniques and on-road video surveillance…

Abstract

Purpose

Road accidents, an inadvertent mishap, can be detected automatically and alerts sent instantly through the collaboration of image processing techniques and on-road video surveillance systems. However, relying exclusively on visual information, especially under adverse conditions such as night time, dark areas and unfavourable weather (snowfall, rain and fog) that result in faint visibility, leads to uncertainty. The main goal of the proposed work is certainty of accident occurrence.

Design/methodology/approach

The authors of this work propose a method for detecting road accidents by analyzing audio signals to identify hazardous situations such as tire skidding and car crashes. The motive of this project is to build a simple and complete audio event detection system using signal feature extraction methods to improve its detection accuracy. The experimental analysis is carried out on a publicly available real-time dataset consisting of audio samples such as car crashes and tire skidding. The temporal features of the recorded audio signal (Energy, Volume, Zero Crossing Rate (ZCR)), the spectral features (Spectral Centroid, Spectral Spread, Spectral Roll-off factor, Spectral Flux) and the psychoacoustic features (Energy Sub-Bands ratio and Gammatonegram) are computed. The extracted features are pre-processed, then trained and tested using Support Vector Machine (SVM) and K-nearest neighbour (KNN) classification algorithms for exact prediction of accident occurrence over various SNR ranges. The combination of the Gammatonegram with the temporal and spectral features of the audio signal proves to be superior to existing detection techniques.
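
A rough sketch of the feature-then-classify pipeline the abstract describes (not the authors' implementation: the synthetic signals, the two-feature vector and the tiny KNN below are all invented for illustration):

```python
import numpy as np

def zero_crossing_rate(x):
    # Fraction of consecutive sample pairs whose sign changes.
    return float(np.mean(np.abs(np.diff(np.sign(x))) > 0))

def spectral_centroid(x, sr):
    # Magnitude-weighted mean frequency of the spectrum.
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1.0 / sr)
    return float(np.sum(freqs * mag) / np.sum(mag))

def knn_predict(train_X, train_y, x, k=3):
    # Majority vote among the k nearest training feature vectors.
    order = np.argsort(np.linalg.norm(train_X - x, axis=1))
    labels, counts = np.unique(train_y[order[:k]], return_counts=True)
    return labels[np.argmax(counts)]

rng = np.random.default_rng(0)
sr, n = 8000, 2048
t = np.arange(n) / sr
tone = lambda: np.sin(2 * np.pi * 200.0 * t) + 0.01 * rng.standard_normal(n)
noise = lambda: rng.standard_normal(n)  # broadband, crash-like

def feats(x):
    # Two toy features standing in for the paper's richer feature set.
    return np.array([zero_crossing_rate(x), spectral_centroid(x, sr) / 1000.0])

train_X = np.vstack([feats(tone()) for _ in range(5)] + [feats(noise()) for _ in range(5)])
train_y = np.array(["tonal"] * 5 + ["broadband"] * 5)
pred = knn_predict(train_X, train_y, feats(noise()))
```

The real system would add the remaining temporal, spectral and Gammatonegram features and evaluate over controlled SNR ranges.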

Findings

The temporal, spectral and psychoacoustic features and the Gammatonegram of the recorded audio signal are extracted. A high-level vector is generated based on the centroid, and the extracted features are classified with the help of machine learning algorithms such as SVM, KNN and decision trees (DT). The audio samples collected have varied SNR ranges, and the accuracy of the classification algorithms is thoroughly tested.

Practical implications

Denoising of the audio samples for perfect feature extraction was a tedious chore.

Originality/value

The existing literature cites extraction of temporal and spectral features followed by the application of classification algorithms. For perfect classification, the authors have chosen to construct a high-level vector from all four extracted feature sets: temporal, spectral, psychoacoustic and Gammatonegram. The classification algorithms are employed on samples collected at varied SNR ranges.

Details

International Journal of Pervasive Computing and Communications, vol. 17 no. 3
Type: Research Article
ISSN: 1742-7371


Article
Publication date: 14 August 2017

Padmavati Shrivastava, K.K. Bhoyar and A.S. Zadgaonkar

The purpose of this paper is to build a classification system which mimics the perceptual ability of human vision, in gathering knowledge about the structure, content and the…

Abstract

Purpose

The purpose of this paper is to build a classification system which mimics the perceptual ability of human vision in gathering knowledge about the structure, content and surrounding environment of a real-world natural scene accurately and at a quick glance. This paper proposes a set of novel features to determine the gist of a given scene based on dominant color, dominant direction, openness and roughness.

Design/methodology/approach

The classification system is designed at two levels. At the first level, a set of low-level features is extracted for each semantic feature. At the second level, the extracted features are subjected to feature evaluation based on inter-class and intra-class distances. The most discriminating features are retained and used for training the support vector machine (SVM) classifier on two different data sets.
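
The second-level evaluation can be sketched as a between-class versus within-class scatter ratio (an assumed formulation, since the abstract does not give the exact distance criterion; the feature names below are invented for illustration):

```python
import numpy as np

def discrimination_ratio(values, labels):
    # Between-class scatter over within-class scatter for one feature:
    # a high ratio means the feature separates the classes well.
    classes = np.unique(labels)
    grand_mean = values.mean()
    between = sum((values[labels == c].mean() - grand_mean) ** 2 for c in classes)
    within = sum(values[labels == c].var() for c in classes)
    return between / (within + 1e-12)

labels = np.array(["coast"] * 4 + ["forest"] * 4)
openness = np.array([0.9, 0.8, 0.85, 0.95, 0.1, 0.2, 0.15, 0.05])  # separates well
roughness = np.array([0.5, 0.4, 0.6, 0.5, 0.5, 0.6, 0.4, 0.5])     # does not
keep_openness = discrimination_ratio(openness, labels) > discrimination_ratio(roughness, labels)
```

Features scoring above a threshold on such a ratio would be retained for the SVM; the rest would be discarded to reduce dimensionality.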

Findings

The accuracy of the proposed system has been evaluated on two data sets: the well-known Oliva-Torralba data set and a customized image data set comprising high-resolution images of natural landscapes. Experimentation on these two data sets with the proposed novel feature set and SVM classifier provided 92.68 percent average classification accuracy using a ten-fold cross-validation approach. The proposed features efficiently represent visual information and are therefore capable of narrowing the semantic gap between low-level image representation and high-level human perception.

Originality/value

The method presented in this paper represents a new approach for extracting low-level features of reduced dimensionality that is able to model human perception for the task of scene classification. The methods of mapping primitive features to high-level features are intuitive to the user and are capable of reducing the semantic gap. The proposed feature evaluation technique is general and can be applied across any domain.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 10 no. 3
Type: Research Article
ISSN: 1756-378X


Article
Publication date: 26 April 2022

Ebenhaeser Otto Janse van Rensburg, Reinhardt A. Botha and Rossouw von Solms

Authenticating an individual through voice can prove convenient, as nothing needs to be stored and a voice cannot easily be stolen. However, if an individual is authenticating under…

Abstract

Purpose

Authenticating an individual through voice can prove convenient, as nothing needs to be stored and a voice cannot easily be stolen. However, if an individual is authenticating under duress, the coerced attempt must be acknowledged and appropriate warnings issued. Furthermore, as duress may entail multiple combinations of emotions, the current f-score evaluation does not accommodate the fact that multiple selected samples possess similar levels of importance. Thus, this study aims to demonstrate an approach to identifying duress within a voice-based authentication system.

Design/methodology/approach

Measuring the value that a classifier presents is often done using an f-score. However, the f-score does not effectively portray the proposed value when multiple classes could be grouped as one, nor does it provide any information when numerous classes are often incorrectly identified as one another. Therefore, the proposed approach takes the confusion matrix, aggregates the selected classes into another matrix and calculates a more precise representation of the selected classifier's value. The utility of the proposed approach is demonstrated through multiple tests, conducted as follows. The value of the initial tests is presented by an f-score, which does not value the individual emotions. The lack of value is then remedied with further tests, which include a confusion matrix. Final tests are then conducted that aggregate selected emotions within the confusion matrix to present a more precise utility value.
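
The aggregation step can be sketched as follows (a minimal illustration with an invented four-emotion confusion matrix, not the authors' RAVDESS results):

```python
import numpy as np

def aggregate(cm, groups):
    # Merge rows and columns of a confusion matrix so each group of
    # related classes is scored as a single class.
    agg = np.zeros((len(groups), len(groups)), dtype=int)
    for i, gi in enumerate(groups):
        for j, gj in enumerate(groups):
            agg[i, j] = cm[np.ix_(gi, gj)].sum()
    return agg

def f1_for_class(cm, c):
    # Per-class f-score from a confusion matrix (rows = true, cols = predicted).
    tp = cm[c, c]
    precision = tp / cm[:, c].sum()
    recall = tp / cm[c, :].sum()
    return 2 * precision * recall / (precision + recall)

# Classes: angry, fearful, happy, calm. "Angry" and "fearful" are only
# ever confused with each other, so grouping them as one "duress" class
# removes the apparent errors.
cm = np.array([[6, 4, 0, 0],
               [4, 6, 0, 0],
               [0, 0, 10, 0],
               [0, 0, 0, 10]])
f1_separate = f1_for_class(cm, 0)                                # angry alone
f1_grouped = f1_for_class(aggregate(cm, [[0, 1], [2], [3]]), 0)  # duress group
```

The jump from a mediocre per-class f-score to a perfect grouped one is exactly the effect the paper exploits: confusions inside a group of related emotions should not count against a duress detector.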

Findings

Two tests within the set of experiments achieved an f-score difference of 1%, indicating that the two tests provided similar value. The confusion matrix used to calculate the f-score indicated that some emotions, which could all be considered closely related, are often confused. Although the f-score can represent an accuracy value, the value of these tests is not accurately portrayed when often-confused emotions are not considered. Deciding which approach to take based on the f-score did not prove beneficial, as it did not address the confused emotions. When aggregating the confusion matrices of these two tests based on selected emotions, the newly calculated utility value demonstrated a difference of 4%, indicating that the two tests may not provide similar value as previously indicated.

Research limitations/implications

This approach’s performance is dependent on the data presented to it. If the classifier is presented with incomplete or degraded data, the results obtained from the classifier will reflect that. Additionally, the grouping of emotions is not based on psychological evidence, and this was purely done to demonstrate the implementation of an aggregated confusion matrix.

Originality/value

The f-score offers a value that represents the classifiers’ ability to classify a class correctly. This paper demonstrates that aggregating a confusion matrix could provide more value than a single f-score in the context of classifying an emotion that could consist of a combination of emotions. This approach can similarly be applied to different combinations of classifiers for the desired effect of extracting a more accurate performance value that a selected classifier presents.

Details

Information & Computer Security, vol. 30 no. 5
Type: Research Article
ISSN: 2056-4961


Article
Publication date: 8 February 2018

Naif Adel Haddad, Leen Adeeb Fakhoury and Talal S. Akasheh

Ancient theatres and odea are one of the most significant and creative socio-cultural edutainment centres of human history that are still in use. They stood and served as huge…

Abstract

Purpose

Ancient theatres and odea are among the most significant and creative socio-cultural edutainment centres of human history that are still in use. They stood and served as huge multi-functional structures for social, religious, propaganda and political meetings. Meanwhile, ancient theatre sites have an intrinsic value for all people and, as a vital basis for cultural diversity and social and economic development, should continue to be a source of information for future generations. Thus, all places with ancient theatre heritage should be assessed for their potential risk from any anthropogenic or natural process. The paper aims to discuss these issues.

Design/methodology/approach

The main paper’s objective is to discuss mainly the anthropogenic and technical risks, vulnerability and impact issues on the ancient classical theatres. While elaborating on relevant recent studies, where the authors were involved in ERATO and ATHENA European projects for ancient theatres and odea, this paper provides a brief overview of the main aspects of the anthropogenic qualitative risks and related issues for selected classical antiquity theatres. Some relevant cases are critically presented and investigated in order to examine and clarify the main risk mitigation issues as an essential prerequisite for theatre heritage preservation and its interface with heritage reuse.

Findings

Theatre risk mitigation is an ongoing and challenging task. Through preventive conservation, the management of anthropogenic qualitative risks to theatres can provide a framework for decision-making. The related guidelines and recommendations that provide a systematic approach for sustainable management and planning, mainly in relation to "ancient theatre compatible use" and "theatre technical risks", are analysed and presented. This is based on identification, classification and assessment of the causes and contributing factors of theatre risk and their mitigation.

Originality/value

The paper also suggests a new methodological approach for anthropogenic qualitative risk assessment and mitigation management in theatres, and develops recommendations that provide a systematic approach for theatre site managers and heritage experts to understand, assess and mitigate risks due mainly to anthropogenic and technical threats.

Details

Journal of Cultural Heritage Management and Sustainable Development, vol. 8 no. 3
Type: Research Article
ISSN: 2044-1266


Article
Publication date: 1 February 2005

Andreas Zimmermann and Andreas Lorenz

The paper deals with the design and creation of an intelligent user interface augmenting the user experience in everyday environments, by providing an immersive audio environment…

Abstract

The paper deals with the design and creation of an intelligent user interface that augments the user experience in everyday environments by providing an immersive audio environment. We highlight the potential of augmenting the visual real environment in a personalized way, thanks to context modeling techniques. The LISTEN project, a system for an immersive audio-augmented environment applied in the art exhibition domain, provides an example of modeling and personalization methods affecting the audio interface in terms of content and organization. In addition, the different evolution steps of the system and the outcomes of the accompanying user tests are reported here.

Details

International Journal of Pervasive Computing and Communications, vol. 1 no. 1
Type: Research Article
ISSN: 1742-7371


Article
Publication date: 15 June 2021

Runyu Chen

Micro-video platforms have gained attention in recent years and have also become an important new channel for merchants to advertise their products. Since little research has…

Abstract

Purpose

Micro-video platforms have gained attention in recent years and have also become an important new channel for merchants to advertise their products. Since little research has studied micro-video advertising, this paper aims to fill the research gap by exploring the determinants of micro-video advertising clicks. We build a micro-video advertising click prediction model and demonstrate the effectiveness, in the prediction task, of multimodal information extracted from the advertisement producers, the commodities being sold and the micro-video contents.

Design/methodology/approach

A multimodal analysis framework was developed based on real-world micro-video advertisement datasets. To better capture the relations between different modalities, we adopt a cooperative learning model to predict advertising clicks.

Findings

The experimental results show that the features extracted from different data sources improve the prediction performance. Furthermore, the combination of different modal features (visual, acoustic, textual and numerical) is also worth studying. The proposed cooperative learning model significantly outperforms classical baseline models, which demonstrates that the relations between modalities are also important in advertising micro-video generation.

Originality/value

To the best of our knowledge, this is the first study analysing micro-video advertising effects. With the help of our advertising click prediction model, advertisement producers (merchants or their partners) can benefit from generating more effective micro-video advertisements. Furthermore, micro-video platforms can apply our prediction results to optimise their advertisement allocation algorithm and better manage network traffic. This research can be of great help for more effective development of the micro-video advertisement industry.

Details

Internet Research, vol. 32 no. 2
Type: Research Article
ISSN: 1066-2243


Article
Publication date: 17 April 2020

Rajasekhar B, Kamaraju M and Sumalatha V

Nowadays, speech emotion recognition (SER) has emerged as a main research topic in various fields, including human–computer interaction as well as speech processing…

Abstract

Purpose

Nowadays, speech emotion recognition (SER) has emerged as a main research topic in various fields, including human–computer interaction as well as speech processing. Generally, it focuses on utilizing machine learning models to predict the exact emotional status from speech. Advanced SER applications have proved successful in affective computing and human–computer interaction, making SER a main component of the next generation of computer systems. This is because a natural human–machine interface could grant automatic service provisions, which require a better appreciation of the user's emotional state.

Design/methodology/approach

This paper implements a new SER model that incorporates both gender and emotion recognition. Certain features are extracted and subjected to emotion classification. For this, the paper uses a deep belief network (DBN) model.

Findings

Through the performance analysis, it is observed that the developed method attains a high accuracy rate (in the best case) compared to other methods: it is 1.02% superior to the whale optimization algorithm (WOA), 0.32% better than firefly (FF), 23.45% superior to particle swarm optimization (PSO) and 23.41% superior to the genetic algorithm (GA). In the worst case, the mean update of particle swarm and whale optimization (MUPW) is 15.63%, 15.98%, 16.06% and 16.03% superior in accuracy to WOA, FF, PSO and GA, respectively. In the mean case, the performance of MUPW is high: 16.67%, 10.38%, 22.30% and 22.47% better than the existing WOA, FF, PSO and GA methods, respectively.

Originality/value

This paper presents a new model for SER that handles both gender and emotion recognition. For classification, a DBN is used, and this is the first work to use the MUPW algorithm to find the optimal weights of the DBN model.

Details

Data Technologies and Applications, vol. 54 no. 3
Type: Research Article
ISSN: 2514-9288


Article
Publication date: 12 October 2021

A. Reyana, Sandeep Kautish, A.S. Vibith and S.B. Goyal

In a traffic monitoring system, moving vehicles are detected by fitting static cameras in traffic scenarios. Background subtraction, a commonly used method…

Abstract

Purpose

In a traffic monitoring system, moving vehicles are detected by fitting static cameras in traffic scenarios. Background subtraction, a commonly used method, separates moving objects in the foreground from the background. The method applies a Gaussian mixture model, which can easily be contaminated by slow-moving or momentarily stopped vehicles.

Design/methodology/approach

This paper proposes the Enhanced Gaussian Mixture Model to overcome this issue, efficiently detecting vehicles in complex traffic scenarios.
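
A minimal stand-in for the idea (a single Gaussian per pixel rather than a full mixture, since the paper's exact enhancement is not specified in the abstract): selectively updating only background pixels keeps a briefly stopped vehicle from being absorbed into the background model, which is the contamination the paper targets.

```python
import numpy as np

class RunningGaussianBackground:
    """Single-Gaussian-per-pixel background model (a simplified stand-in
    for a GMM-based subtractor, for illustration only)."""

    def __init__(self, first_frame, alpha=0.05, k=2.5):
        self.mean = first_frame.astype(float)
        self.var = np.full(first_frame.shape, 15.0 ** 2)  # assumed initial variance
        self.alpha, self.k = alpha, k

    def apply(self, frame):
        frame = frame.astype(float)
        d2 = (frame - self.mean) ** 2
        foreground = d2 > (self.k ** 2) * self.var
        # Selective update: only background pixels adapt, so a vehicle
        # that slows or stops briefly is not learned into the model.
        bg = ~foreground
        self.mean[bg] += self.alpha * (frame - self.mean)[bg]
        self.var[bg] += self.alpha * (d2 - self.var)[bg]
        return foreground

# Demo: a static grey background, then a frame with a bright "vehicle".
bg_frame = np.full((8, 8), 50.0)
model = RunningGaussianBackground(bg_frame)
frame = bg_frame.copy()
frame[2:4, 2:4] = 200.0
mask = model.apply(frame)
```

A production subtractor would keep several Gaussians per pixel and match each incoming value against all of them, but the selective-update principle is the same.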

Findings

The model was evaluated with experiments conducted on real-world on-road traffic videos. The evidence indicates that the proposed model outperforms the existing Gaussian mixture model (GMM), achieving an accuracy of 0.9759, and avoids contamination by slow-moving or momentarily stopped vehicles.

Originality/value

The proposed method effectively combines, tracks and classifies traffic vehicles, resolving the contamination problem caused by slow-moving or momentarily stopped vehicles.

Details

International Journal of Intelligent Unmanned Systems, vol. 11 no. 1
Type: Research Article
ISSN: 2049-6427

