Search results
1 – 10 of over 5000 results
Padmavati Shrivastava, K.K. Bhoyar and A.S. Zadgaonkar
Abstract
Purpose
The purpose of this paper is to build a classification system which mimics the perceptual ability of human vision in gathering knowledge about the structure, content and surrounding environment of a real-world natural scene, accurately and at a quick glance. This paper proposes a set of novel features to determine the gist of a given scene based on dominant color, dominant direction, openness and roughness features.
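One of the named gist features can be sketched concretely. The snippet below estimates a dominant-color feature as the modal hue bin of an image patch; the bin count and the hue-histogram formulation are illustrative assumptions, not the paper's exact definition.

```python
import numpy as np

def dominant_hue_bin(hues, n_bins=8):
    """Index of the most populated hue bin (hues given in degrees, [0, 360))."""
    hist, _ = np.histogram(hues, bins=n_bins, range=(0.0, 360.0))
    return int(np.argmax(hist))

# A toy "sky" patch: hues clustered around blue (about 220 degrees).
sky = np.random.default_rng(0).normal(220.0, 10.0, size=1000) % 360.0
print(dominant_hue_bin(sky))  # bin 4 spans 180-225 degrees
```

The bin index then becomes one entry of the low-level feature vector handed to the classifier.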
Design/methodology/approach
The classification system is designed at two different levels. At the first level, a set of low-level features is extracted for each semantic feature. At the second level, the extracted features are subjected to a process of feature evaluation based on inter-class and intra-class distances. The most discriminating features are retained and used to train the support vector machine (SVM) classifier on two different data sets.
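The inter-class/intra-class evaluation step can be illustrated with a Fisher-style ratio; this generic score is a stand-in, not necessarily the authors' exact criterion.

```python
import numpy as np

def discriminability(X, y):
    """Per-feature ratio of between-class to within-class spread."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    overall = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    return between / (within + 1e-12)

# Feature 0 separates the two classes; feature 1 is pure noise.
rng = np.random.default_rng(1)
X = np.column_stack([np.r_[rng.normal(0, .5, 50), rng.normal(3, .5, 50)],
                     rng.normal(0, 1, 100)])
y = np.r_[np.zeros(50), np.ones(50)]
scores = discriminability(X, y)
print(scores.argmax())  # feature 0 is the most discriminating
```

Features whose score falls below a chosen cut-off would be dropped before training.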
Findings
Accuracy of the proposed system has been evaluated on two data sets: the well-known Oliva-Torralba data set and a customized data set comprising high-resolution images of natural landscapes. Experimentation on these two data sets with the proposed novel feature set and SVM classifier provided 92.68 percent average classification accuracy, using a ten-fold cross-validation approach. The set of proposed features efficiently represents visual information and is therefore capable of narrowing the semantic gap between low-level image representation and high-level human perception.
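The ten-fold protocol amounts to the following index bookkeeping: split the data into ten folds, hold each out once, and average the per-fold accuracy. The split below is illustrative and omits the SVM itself.

```python
def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k roughly equal folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

splits = list(k_fold_indices(100, 10))
print(len(splits), len(splits[0][1]))  # 10 folds, 10 test items each
```

Each sample lands in exactly one test fold, so the averaged accuracy uses every sample once.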
Originality/value
The method presented in this paper represents a new approach for extracting low-level features of reduced dimensionality that is able to model human perception for the task of scene classification. The methods of mapping primitive features to high-level features are intuitive to the user and are capable of reducing the semantic gap. The proposed feature evaluation technique is general and can be applied across any domain.
Megan Burfoot, Amirhosein Ghaffarianhoseini, Nicola Naismith and Ali GhaffarianHoseini
Abstract
Purpose
Informed by acoustic design standards, built environments are designed with single reverberation times (RTs), a trade-off between the long and short RTs needed for different space functions. A range of RTs should be achievable in spaces to optimise acoustic comfort in different aural situations. This paper introduces a novel concept, intelligent passive room acoustic technology (IPRAT), which achieves real-time room acoustic optimisation through the integration of passive variable acoustic technology (PVAT) and acoustic scene classification (ASC). ASC can intelligently identify changing aural situations, and PVAT can physically vary the RT.
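For context, the reverberation time that IPRAT would vary is commonly estimated with Sabine's formula, RT60 = 0.161 · V / A, where V is room volume in cubic metres and A is total absorption in metric sabins; PVAT effectively changes A. The room figures below are invented for illustration.

```python
def sabine_rt60(volume_m3, absorption_m2):
    """Sabine estimate of the 60 dB reverberation time, in seconds."""
    return 0.161 * volume_m3 / absorption_m2

# A 200 m^3 room: halving the exposed absorption doubles the RT.
print(round(sabine_rt60(200.0, 40.0), 3))  # 0.805 s
print(round(sabine_rt60(200.0, 20.0), 3))  # 1.61 s
```

This inverse relationship is what lets a passive technology trade absorption area for RT in real time.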
Design/methodology/approach
A qualitative best-evidence synthesis method is used to review the available literature on PVAT and ASC.
Findings
First, it is highlighted that dynamic spaces should be designed with varying RTs. The review then exposes a gap: no existing approach intelligently adjusts RT according to changing building function. A solution is found in IPRAT, which integrates PVAT and ASC to uniquely fill this gap in the literature.
Originality/value
The development, functionality, benefits and challenges of IPRAT offer a holistic understanding of the state of the art, and a use-case example is provided. Going forward, it is concluded that IPRAT can be prototyped and its impact on acoustic comfort quantified.
Abstract
Purpose
Content-based image retrieval (CBIR) technologies offer many advantages over purely text-based image search. However, one of the drawbacks associated with CBIR is the increased computational cost arising from tasks such as image processing, feature extraction, image classification, and object detection and recognition. Consequently, CBIR systems have suffered from a lack of scalability, which has greatly hampered their adoption for real-world public and commercial image search. At the same time, paradigms for large-scale heterogeneous distributed computing such as grid computing, cloud computing and utility-based computing are gaining traction as a way of providing more scalable and efficient solutions to large-scale computing tasks.
Design/methodology/approach
This paper presents an approach in which a large distributed processing grid has been used to apply a range of CBIR methods to a substantial number of images. By massively distributing the required computational task across thousands of grid nodes, very high throughput has been achieved at relatively low overhead.
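The distribution pattern described, independent per-image jobs farmed out to many workers, can be sketched locally with a thread pool; the grid middleware and the CBIR analysis itself are replaced by stand-ins here.

```python
from concurrent.futures import ThreadPoolExecutor

def analyse(image_id):
    """Stand-in for per-image CBIR analysis; returns an (id, label) pair."""
    return image_id, f"label-{image_id % 3}"

# Jobs are independent, so throughput scales with worker count; the real
# system used thousands of grid nodes rather than local threads.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = dict(pool.map(analyse, range(1000)))

print(len(results))  # 1000 images processed
```

Because no result depends on another, the only shared bottlenecks are storage and job submission, matching the two-server setup described in the findings.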
Findings
This has allowed the authors to analyse and index about 25 million high-resolution images so far, while using just two servers for storage and job submission. The CBIR system was developed by Imense Ltd and is based on automated analysis and recognition of image content using a semantic ontology. It features a range of image-processing and analysis modules, including image segmentation, region classification, scene analysis, object detection and face recognition.
Originality/value
In the case of content-based image analysis, the primary performance criterion is the overall throughput achieved by the system, that is, the number of images that can be processed in a given time frame, irrespective of the time taken to process any single image. As such, grid processing has great potential for massively parallel content-based image retrieval and other tasks with similar performance requirements.
Kun Zhang, Hanqin Qiu, Jingyue Wang, Chunlin Li, Jinyi Zhang and Dora Dongzhi Chen
Abstract
Purpose
This paper aims to answer four research questions: Where do tourists gaze at a destination? What do tourists gaze at? How does the tourist gaze differ? And why does it differ, with reference to relevant theory?
Design/methodology/approach
Using a computer vision approach, this study produced a series of maps reflecting where and at what tourists gaze, and compared differences in visual perception among Asian, European and North American tourists in Hong Kong.
Findings
The findings confirm that the “tourist gaze” is influenced by geographical and cultural conditions. The conclusions provide three types of implications for destination management strategies and advocate strong engagement with computer vision technology.
Originality/value
In theory, this study proves that the “tourist gaze” is influenced by geographical and cultural conditions. The study’s methodological contribution lies in applying advanced visual content analysis to big data relevant to the tourist gaze. Practically, findings that could not be obtained via previous questionnaire surveys will serve as a reference for tourism recommendations and precision marketing. In addition, the study offers a means to explore tourists’ perceptions of destinations and to understand the attractiveness of destinations to tourists.
Design/methodology/approach (translated from Chinese)
Using a computer vision deep learning model to recognise the content of tourist photographs, the study compared differences in visual perception across different spatial scenes in Hong Kong among Asian, European and North American tourists. In addition, ArcGIS software was used to visualise differences in the locations and content of the tourist gaze.
Purpose (translated from Chinese)
This study addresses four research sub-questions:
(1) Where do tourists gaze?
(2) What do tourists gaze at?
(3) How does the content of the tourist gaze differ?
(4) Why does the tourist gaze differ?
Findings (translated from Chinese)
Tourists’ “gaze” at a destination differs, and these differences are manifested in dimensions such as choice of location and content preference. The results also show that computer vision technology has promising application potential in tourism research.
Originality/value (translated from Chinese)
Theoretically, this study corroborates the theory that the “tourist gaze” is influenced by geographical and cultural conditions. Technically, it explores the application of visual analysis technology to the tourist gaze, providing a new perspective for assessing destination perception. Practically, the conclusions provide a reference for precision marketing of tourist destinations.
Details
Keywords
- Visual content analysis
- Computer vision technology
- Spatial distribution
- Geo-tagged photos
- Deep learning model
- Cultural convention
- Visual perception
Ashley N. Hewitt, Eric Beauregard and Jonghan Sea
Abstract
Purpose
Early classification systems of fire setting have suffered from several limitations, including a lack of empirical validation and a focus mainly on the offender’s motivation behind this type of crime. More recent research shows that examining crime scene behaviors may be a more fruitful approach to helping solve fire-setting offenses. The purpose of this study is to advance current scholarship by developing a new typology of fire setting based on the combination of offender motive and crime scene behaviors.
Design/methodology/approach
Latent class analyses were used with a sample of 134 fire setters, who committed 275 arsons, from the Korean National Police Agency to identify distinct fire-setter motivations and crime scene contexts. Chi-square and cross-tabulation analyses were then conducted to determine whether crime scene behaviors were associated with distinct offender motives and vice versa. Lastly, to improve the external validity of each latent class, chi-square analyses were performed using variables related to the fire setters’ criminal history, sociodemographic characteristics and arson classification.
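The chi-square step can be reproduced from first principles on a toy cross-tabulation of motive class against crime-scene class; the counts below are invented, not the study's data.

```python
def chi_square(table):
    """Pearson chi-square statistic for a 2D contingency table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    stat = 0.0
    for i, r in enumerate(table):
        for j, obs in enumerate(r):
            expected = rows[i] * cols[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat

# Motive (rows) vs crime-scene class (columns), hypothetical counts.
table = [[30, 5], [4, 28]]
print(round(chi_square(table), 2))
```

A statistic this far above the df = 1 critical value (10.83 at p < .001) would indicate the association between motive and crime-scene class that the study reports.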
Findings
Five motive subtypes were identified as well as five distinct crime scene contexts in which serial fire setting occurs. A significant association among these classes suggests that it is possible to infer fire setters’ motive from crime scene behavior and vice versa.
Originality/value
This comprehensive typology of fire setters has potential for profiling of unknown offenders as well as for suspect prioritization in police investigations.
Yuye Wang, Guofeng Zhang and Xiaoguang Hu
Abstract
Purpose
Infrared simulation plays an important role in small, affordable unmanned aerial vehicles. Its main goal is to obtain the infrared image of a specific target. An infrared physical model is established through theoretical research so that the temperature field is available; the infrared image of a specific target can then be simulated properly, taking the atmospheric state and the effect of the infrared imaging system into account. In recent years, some research has been done in this field, but infrared simulation of large scenes remains a key unsolved problem. This paper proposes a classification method based on texture blending, which effectively solves the problem of classifying large numbers of images and increases the frame rate of large infrared scene rendering. The paper aims to discuss these issues.
Design/methodology/approach
The Mosart Atmospheric Tool (MAT) is first used in an offline process to calculate sun radiance, skyshine radiance, path radiance and the temperatures of different materials. A shader in OGRE then performs the final calculation to obtain the simulation result while keeping a high frame rate. To this end, the authors convert the data in the MAT file into textures that the shader can easily handle. In the shader, radiance is indexed by material, vertex normal, eye and sun information. After adding the effect of the infrared imaging system, the final radiance distribution is obtained, and the infrared scene is produced by converting radiance to grayscale.
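The final conversion step, radiance to grayscale, can be sketched as a linear normalisation to 8-bit values; the radiance field below is synthetic rather than MAT-derived, and the real pipeline performs this mapping in the shader.

```python
import numpy as np

def radiance_to_gray(radiance):
    """Linearly map a radiance field to an 8-bit grayscale image."""
    r = np.asarray(radiance, dtype=float)
    lo, hi = r.min(), r.max()
    return np.uint8(np.round(255.0 * (r - lo) / (hi - lo)))

radiance = np.array([[1.0, 2.0],
                     [3.0, 5.0]])
print(radiance_to_gray(radiance))
```

Min-max normalisation is only one possible transfer function; a real imaging-system model would also fold in sensor response and noise.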
Findings
In the fragment shader, pseudo-infrared textures are used to look up the temperature, from which an object's own radiance and the related radiance terms are calculated.
Research limitations/implications
The radiance is converted into a grayscale image while accounting for the effect of the infrared imaging system.
Originality/value
Simulation results show that a high frame rate can be reached while preserving fidelity.
Abstract
Purpose
Single-shot multi-category clothing recognition and retrieval play a crucial role in online searching and offline settlement scenarios. Existing clothing recognition methods based on RGBD clothing images often suffer from high-dimensional feature representations, leading to compromised performance and efficiency.
Design/methodology/approach
To address this issue, this paper proposes a novel method called Manifold Embedded Discriminative Feature Selection (MEDFS) to select global and local features, thereby reducing the dimensionality of the feature representation and improving performance. Specifically, by combining three global features and three local features, a low-dimensional embedding is constructed to capture the correlations between features and categories. The MEDFS method designs an optimization framework utilizing manifold mapping and sparse regularization to achieve feature selection. The optimization objective is solved using an alternating iterative strategy, ensuring convergence.
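As a loose analogue of selecting discriminative features through a linear map (not the MEDFS objective itself, which adds manifold mapping and sparse regularization solved by alternating iteration), one can rank features by the row norms of a regularised least-squares mapping from features to one-hot labels:

```python
import numpy as np

def rank_features(X, y, ridge=1e-3):
    """Features sorted by the l2 norm of their rows in the fitted map W."""
    X = np.asarray(X, dtype=float)
    Y = np.eye(int(y.max()) + 1)[y]  # one-hot label matrix
    W = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)
    return np.argsort(-np.linalg.norm(W, axis=1))

rng = np.random.default_rng(2)
y = np.repeat([0, 1], 50)
informative = y + rng.normal(0.0, 0.1, 100)   # tracks the label
noise = rng.normal(0.0, 1.0, (100, 3))        # irrelevant features
X = np.column_stack([informative, noise])
print(rank_features(X, y)[0])  # feature 0 is ranked first
```

Features with near-zero rows contribute little to the label scores, so dropping them reduces dimensionality in the spirit of the method described.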
Findings
Empirical studies conducted on a publicly available RGBD clothing image dataset demonstrate that the proposed MEDFS method achieves highly competitive clothing classification performance while maintaining efficiency in clothing recognition and retrieval.
Originality/value
This paper introduces a novel approach for multi-category clothing recognition and retrieval, incorporating the selection of global and local features. The proposed method holds potential for practical applications in real-world clothing scenarios.
Huaxiang Song, Chai Wei and Zhou Yong
Abstract
Purpose
The paper aims to tackle the classification of remote sensing images (RSIs), which presents a significant challenge for computer algorithms due to the inherent characteristics of clustered ground objects and noisy backgrounds. Recent research typically leverages larger models to achieve advanced performance. However, remote sensing operating environments commonly cannot provide unconstrained computational and storage resources, so lightweight algorithms with exceptional generalization capabilities are required.
Design/methodology/approach
This study introduces an efficient knowledge distillation (KD) method to build a lightweight yet precise convolutional neural network (CNN) classifier. This method also aims to substantially decrease the training time expenses commonly linked with traditional KD techniques. This approach entails extensive alterations to both the model training framework and the distillation process, each tailored to the unique characteristics of RSIs. In particular, this study establishes a robust ensemble teacher by independently training two CNN models using a customized, efficient training algorithm. Following this, this study modifies a KD loss function to mitigate the suppression of non-target category predictions, which are essential for capturing the inter- and intra-similarity of RSIs.
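The conventional logit-based KD loss that such methods start from is the temperature-softened KL divergence between teacher and student distributions, sketched below with invented logits; the paper's modified loss, which avoids suppressing non-target predictions, is not reproduced here.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, T=4.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(np.asarray(teacher_logits, dtype=float) / T)
    q = softmax(np.asarray(student_logits, dtype=float) / T)
    return float(T * T * np.sum(p * np.log(p / q)))

teacher = [5.0, 1.0, 0.5]
print(kd_loss(teacher, teacher))       # 0.0: identical predictions
print(kd_loss([0.5, 1.0, 5.0], teacher) > 0)  # mismatch is penalised
```

A higher temperature T spreads probability mass onto non-target classes, which is precisely where the inter- and intra-similarity signal of RSIs lives.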
Findings
This study validated the student model, termed KD-enhanced network (KDE-Net), obtained through the KD process on three benchmark RSI data sets. The KDE-Net surpasses 42 other state-of-the-art methods in the literature published from 2020 to 2023. Compared to the top-ranked method’s performance on the challenging NWPU45 data set, KDE-Net demonstrated a noticeable 0.4% increase in overall accuracy with a significant 88% reduction in parameters. Meanwhile, this study’s reformed KD framework significantly enhances the knowledge transfer speed by at least three times.
Originality/value
This study illustrates that the logit-based KD technique can effectively develop lightweight CNN classifiers for RSI classification without substantial sacrifices in computation and storage costs. Compared to neural architecture search or other methods aiming to provide lightweight solutions, this study’s KDE-Net, based on the inherent characteristics of RSIs, is currently more efficient in constructing accurate yet lightweight classifiers for RSI classification.
Falah Alsaqre and Osama Almathkour
Abstract
Classifying moving objects in video sequences has been extensively studied, yet it remains an open problem. In this paper, we propose to solve the moving object classification problem via an extended version of two-dimensional principal component analysis (2DPCA), named category-wise 2DPCA (CW2DPCA). A key component of CW2DPCA is to independently construct optimal projection matrices from object-specific training datasets and produce category-wise feature spaces, in which each feature space uniquely captures the invariant characteristics of the underlying intra-category samples. Consequently, CW2DPCA both enables early separation among the different object categories and extracts effective discriminative features for representing training datasets and test object samples in the classification model, which is a nearest-neighbor classifier. For ease of exposition, we consider human/vehicle classification, although the proposed CW2DPCA-based classification framework can easily be generalized to handle multiple object categories. The experimental results prove the effectiveness of CW2DPCA features in discriminating between humans and vehicles in two publicly available video datasets.
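The category-wise idea can be sketched on toy data: build one 2DPCA projection per category from that category's images, then classify a test image by its nearest neighbour across the category-wise feature spaces. The image sizes and patterns below are invented stand-ins for the human/vehicle samples.

```python
import numpy as np

def projection(images, k=2):
    """Top-k eigenvectors of the 2DPCA image scatter matrix."""
    A = np.asarray(images, dtype=float)
    mean = A.mean(axis=0)
    G = sum((a - mean).T @ (a - mean) for a in A) / len(A)
    vals, vecs = np.linalg.eigh(G)
    return vecs[:, np.argsort(vals)[::-1][:k]]  # columns are projection axes

def classify(test, train_by_cat, k=2):
    """Nearest neighbour over the category-wise 2DPCA feature spaces."""
    best, label = np.inf, None
    for cat, imgs in train_by_cat.items():
        P = projection(imgs, k)
        d = min(np.linalg.norm(test @ P - np.asarray(a, float) @ P)
                for a in imgs)
        if d < best:
            best, label = d, cat
    return label

rng = np.random.default_rng(3)
humans = [rng.normal(0.0, 0.1, (8, 8)) for _ in range(5)]
vehicles = [5 * np.eye(8) + rng.normal(0.0, 0.1, (8, 8)) for _ in range(5)]
train = {"human": humans, "vehicle": vehicles}
test_v = 5 * np.eye(8) + rng.normal(0.0, 0.1, (8, 8))
test_h = rng.normal(0.0, 0.1, (8, 8))
print(classify(test_v, train), classify(test_h, train))
```

Because each projection is fit only on its own category, a test image from another category projects far from that category's training features, which is the early-separation effect described above.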
Abstract
Purpose
Image classification is becoming a supporting technology in several image-processing tasks. Because of the rich semantic information contained in images, it is common for an image to have several labels or tags. This paper aims to develop a novel multi-label classification approach with superior performance.
Design/methodology/approach
Many multi-label classification problems share two main characteristics: label correlations and label imbalance. However, most current methods either model only the label relationships or handle only the imbalance problem with traditional single-label methods. In this paper, the multi-label classification problem is regarded as an unbalanced multi-task learning problem. The multi-task least-squares support vector machine (MTLS-SVM) is generalized for this problem and renamed multi-label LS-SVM (ML2S-SVM).
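A deliberately simplified baseline conveys the multi-task framing: one regularised least-squares scoring vector per label, fitted jointly on shared features with +1/-1 label coding. The shared-subspace coupling that distinguishes ML2S-SVM is omitted, and the data are synthetic.

```python
import numpy as np

def fit_multilabel(X, Y, ridge=1.0):
    """One ridge least-squares scoring vector per label, fitted jointly."""
    X = np.asarray(X, dtype=float)
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]),
                           X.T @ np.asarray(Y, dtype=float))

def predict(X, W):
    """Sign of the linear scores, with labels coded as +1 / -1."""
    return np.where(np.asarray(X, dtype=float) @ W >= 0, 1, -1)

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 5))
Y = np.where(X[:, :2] > 0, 1, -1)   # two labels, each tied to one feature
W = fit_multilabel(X, Y)
acc = float((predict(X, W) == Y).mean())
print(acc)
```

Coupling the per-label weight vectors through a shared subspace, as ML2S-SVM does, is what lets correlated labels borrow strength from one another.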
Findings
Experimental results on the emotions, scene, yeast and bibtex data sets indicate that ML2S-SVM is competitive with state-of-the-art methods in terms of Hamming loss and instance-based F1 score. The values of the resulting parameters largely influence the performance of ML2S-SVM, so users need to identify proper parameters in advance.
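Both reported measures are easy to state precisely; the toy label matrices below (rows = instances, columns = labels) are invented for illustration.

```python
import numpy as np

def hamming_loss(Y_true, Y_pred):
    """Fraction of instance-label pairs predicted incorrectly."""
    Y_true, Y_pred = np.asarray(Y_true), np.asarray(Y_pred)
    return float((Y_true != Y_pred).mean())

def instance_f1(Y_true, Y_pred):
    """Mean per-instance F1 over binary label vectors."""
    Y_true, Y_pred = np.asarray(Y_true), np.asarray(Y_pred)
    tp = (Y_true & Y_pred).sum(axis=1)
    denom = Y_true.sum(axis=1) + Y_pred.sum(axis=1)
    return float(np.mean(2 * tp / np.maximum(denom, 1)))

Y_true = [[1, 0, 1], [0, 1, 0]]
Y_pred = [[1, 0, 0], [0, 1, 0]]
print(hamming_loss(Y_true, Y_pred))  # 1 wrong pair of 6
print(instance_f1(Y_true, Y_pred))
```

Lower Hamming loss and higher instance-based F1 are better, which is the sense in which the competitiveness claim above should be read.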
Originality/value
On the basis of MTLS-SVM, a novel multi-label classification approach, ML2S-SVM, is put forward. This method not only overcomes the imbalance problem but also explicitly models arbitrary-order correlations among labels by allowing multiple labels to share a subspace. In addition, the approach has a wide range of applications; it is not limited to image classification.