Search results
1 – 10 of over 5000 results
Padmavati Shrivastava, K.K. Bhoyar and A.S. Zadgaonkar
Abstract
Purpose
The purpose of this paper is to build a classification system which mimics the perceptual ability of human vision in gathering knowledge about the structure, content and surrounding environment of a real-world natural scene, accurately and at a quick glance. This paper proposes a set of novel features to determine the gist of a given scene based on dominant color, dominant direction, openness and roughness features.
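One of the named gist features can be sketched concretely. The snippet below estimates a dominant-color feature as the modal hue bin of an image patch; the bin count and the hue-histogram formulation are illustrative assumptions, not the paper's exact definition.

```python
import numpy as np

def dominant_hue_bin(hues, n_bins=8):
    """Index of the most populated hue bin (hues given in degrees, [0, 360))."""
    hist, _ = np.histogram(hues, bins=n_bins, range=(0.0, 360.0))
    return int(np.argmax(hist))

# A toy "sky" patch: hues clustered around blue (about 220 degrees).
sky = np.random.default_rng(0).normal(220.0, 10.0, size=1000) % 360.0
print(dominant_hue_bin(sky))  # bin 4 spans 180-225 degrees
```

The bin index then becomes one entry of the low-level feature vector handed to the classifier.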
Design/methodology/approach
The classification system is designed at two different levels. At the first level, a set of low-level features is extracted for each semantic feature. At the second level, the extracted features are subjected to a process of feature evaluation based on inter-class and intra-class distances. The most discriminating features are retained and used to train the support vector machine (SVM) classifier on two different data sets.
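The inter-class/intra-class evaluation step can be illustrated with a Fisher-style ratio; this generic score is a stand-in, not necessarily the authors' exact criterion.

```python
import numpy as np

def discriminability(X, y):
    """Per-feature ratio of between-class to within-class spread."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    overall = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    return between / (within + 1e-12)

# Feature 0 separates the two classes; feature 1 is pure noise.
rng = np.random.default_rng(1)
X = np.column_stack([np.r_[rng.normal(0, .5, 50), rng.normal(3, .5, 50)],
                     rng.normal(0, 1, 100)])
y = np.r_[np.zeros(50), np.ones(50)]
scores = discriminability(X, y)
print(scores.argmax())  # feature 0 is the most discriminating
```

Features whose score falls below a chosen cut-off would be dropped before training.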
Findings
Accuracy of the proposed system has been evaluated on two data sets: the well-known Oliva-Torralba data set and a customized data set comprising high-resolution images of natural landscapes. Experimentation on these two data sets with the proposed novel feature set and SVM classifier provided 92.68 percent average classification accuracy, using a ten-fold cross-validation approach. The set of proposed features efficiently represents visual information and is therefore capable of narrowing the semantic gap between low-level image representation and high-level human perception.
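The ten-fold protocol amounts to the following index bookkeeping: split the data into ten folds, hold each out once, and average the per-fold accuracy. The split below is illustrative and omits the SVM itself.

```python
def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k roughly equal folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

splits = list(k_fold_indices(100, 10))
print(len(splits), len(splits[0][1]))  # 10 folds, 10 test items each
```

Each sample lands in exactly one test fold, so the averaged accuracy uses every sample once.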
Originality/value
The method presented in this paper represents a new approach for extracting low-level features of reduced dimensionality that is able to model human perception for the task of scene classification. The methods of mapping primitive features to high-level features are intuitive to the user and are capable of reducing the semantic gap. The proposed feature evaluation technique is general and can be applied across any domain.
Megan Burfoot, Amirhosein Ghaffarianhoseini, Nicola Naismith and Ali GhaffarianHoseini
Abstract
Purpose
Informed by acoustic design standards, built environments are designed with single reverberation times (RTs), a trade-off between the long and short RTs needed for different space functions. A range of RTs should be achievable in spaces to optimise acoustic comfort in different aural situations. This paper introduces a novel concept, intelligent passive room acoustic technology (IPRAT), which achieves real-time room acoustic optimisation through the integration of passive variable acoustic technology (PVAT) and acoustic scene classification (ASC). ASC can intelligently identify changing aural situations, and PVAT can physically vary the RT.
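For context, the reverberation time that IPRAT would vary is commonly estimated with Sabine's formula, RT60 = 0.161 · V / A, where V is room volume in cubic metres and A is total absorption in metric sabins; PVAT effectively changes A. The room figures below are invented for illustration.

```python
def sabine_rt60(volume_m3, absorption_m2):
    """Sabine estimate of the 60 dB reverberation time, in seconds."""
    return 0.161 * volume_m3 / absorption_m2

# A 200 m^3 room: halving the exposed absorption doubles the RT.
print(round(sabine_rt60(200.0, 40.0), 3))  # 0.805 s
print(round(sabine_rt60(200.0, 20.0), 3))  # 1.61 s
```

This inverse relationship is what lets a passive technology trade absorption area for RT in real time.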
Design/methodology/approach
A qualitative best-evidence synthesis method is used to review the available literature on PVAT and ASC.
Findings
First, it is highlighted that dynamic spaces should be designed with varying RTs. The review then exposes a gap: no existing approach intelligently adjusts RT according to changing building function. A solution is found in IPRAT, which integrates PVAT and ASC to uniquely fill this gap in the literature.
Originality/value
The development, functionality, benefits and challenges of IPRAT offer a holistic understanding of the state of the art, and a use-case example is provided. Going forward, it is concluded that IPRAT can be prototyped and its impact on acoustic comfort quantified.
Abstract
Purpose
Content-based image retrieval (CBIR) technologies offer many advantages over purely text-based image search. However, one of the drawbacks associated with CBIR is the increased computational cost arising from tasks such as image processing, feature extraction, image classification, and object detection and recognition. Consequently, CBIR systems have suffered from a lack of scalability, which has greatly hampered their adoption for real-world public and commercial image search. At the same time, paradigms for large-scale heterogeneous distributed computing such as grid computing, cloud computing and utility-based computing are gaining traction as a way of providing more scalable and efficient solutions to large-scale computing tasks.
Design/methodology/approach
This paper presents an approach in which a large distributed processing grid has been used to apply a range of CBIR methods to a substantial number of images. By massively distributing the required computational task across thousands of grid nodes, very high throughput has been achieved at relatively low overhead.
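The distribution pattern described, independent per-image jobs farmed out to many workers, can be sketched locally with a thread pool; the grid middleware and the CBIR analysis itself are replaced by stand-ins here.

```python
from concurrent.futures import ThreadPoolExecutor

def analyse(image_id):
    """Stand-in for per-image CBIR analysis; returns an (id, label) pair."""
    return image_id, f"label-{image_id % 3}"

# Jobs are independent, so throughput scales with worker count; the real
# system used thousands of grid nodes rather than local threads.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = dict(pool.map(analyse, range(1000)))

print(len(results))  # 1000 images processed
```

Because no result depends on another, the only shared bottlenecks are storage and job submission, matching the two-server setup described in the findings.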
Findings
This has allowed the authors to analyse and index about 25 million high-resolution images so far, while using just two servers for storage and job submission. The CBIR system was developed by Imense Ltd and is based on automated analysis and recognition of image content using a semantic ontology. It features a range of image-processing and analysis modules, including image segmentation, region classification, scene analysis, object detection and face recognition.
Originality/value
In the case of content-based image analysis, the primary performance criterion is the overall throughput achieved by the system, that is, the number of images that can be processed in a given time frame, irrespective of the time taken to process any single image. As such, grid processing has great potential for massively parallel content-based image retrieval and other tasks with similar performance requirements.
Kun Zhang, Hanqin Qiu, Jingyue Wang, Chunlin Li, Jinyi Zhang and Dora Dongzhi Chen
Abstract
Purpose
This paper aims to answer four research questions: Where do tourists gaze at a destination? What do tourists gaze at? How does the tourist gaze differ? And why does it differ, with reference to relevant theory?
Design/methodology/approach
Using a computer vision approach, this study produced a series of maps reflecting where and at what tourists gaze, and compared differences in visual perception among Asian, European and North American tourists in Hong Kong.
Findings
The findings confirm that the “tourist gaze” is influenced by geographical and cultural conditions. The conclusions provide three types of implications for destination management strategies and advocate strong engagement with computer vision technology.
Originality/value
In theory, this study proves that the “tourist gaze” is influenced by geographical and cultural conditions. The study’s methodological contribution lies in applying advanced visual content analysis to big data relevant to the tourist gaze. Practically, findings that could not be obtained via previous questionnaire surveys will serve as a reference for tourism recommendations and precision marketing. In addition, the study offers a means to explore tourists’ perceptions of destinations and to understand the attractiveness of destinations to tourists.
Design/methodology/approach (translated from Chinese)
Using a computer vision deep learning model to recognise the content of tourist photographs, the study compared differences in visual perception across different spatial scenes in Hong Kong among Asian, European and North American tourists. In addition, ArcGIS software was used to visualise differences in the locations and content of the tourist gaze.
Purpose (translated from Chinese)
This study addresses four research sub-questions:
(1) Where do tourists gaze?
(2) What do tourists gaze at?
(3) How does the content of the tourist gaze differ?
(4) Why does the tourist gaze differ?
Findings (translated from Chinese)
Tourists’ “gaze” at a destination differs, and these differences are manifested in dimensions such as choice of location and content preference. The results also show that computer vision technology has promising application potential in tourism research.
Originality/value (translated from Chinese)
Theoretically, this study corroborates the theory that the “tourist gaze” is influenced by geographical and cultural conditions. Technically, it explores the application of visual analysis technology to the tourist gaze, providing a new perspective for assessing destination perception. Practically, the conclusions provide a reference for precision marketing of tourist destinations.
Details
Keywords
- Visual content analysis
- Computer vision technology
- Spatial distribution
- Geo-tagged photos
- Deep learning model
- Cultural convention
- Visual perception
Ashley N. Hewitt, Eric Beauregard and Jonghan Sea
Abstract
Purpose
Early classification systems of fire setting have suffered from several limitations, including a lack of empirical validation and a focus mainly on the offender’s motivation behind this type of crime. More recent research shows that examining crime scene behaviors may be a more fruitful approach to helping solve fire-setting offenses. The purpose of this study is to advance current scholarship by developing a new typology of fire setting based on the combination of offender motive and crime scene behaviors.
Design/methodology/approach
Latent class analyses were used with a sample of 134 fire setters, who committed 275 arsons, from the Korean National Police Agency to identify distinct fire-setter motivations and crime scene contexts. Chi-square and cross-tabulation analyses were then conducted to determine whether crime scene behaviors were associated with distinct offender motives and vice versa. Lastly, to improve the external validity of each latent class, chi-square analyses were performed using variables related to the fire setters’ criminal history, sociodemographic characteristics and arson classification.
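The chi-square step can be reproduced from first principles on a toy cross-tabulation of motive class against crime-scene class; the counts below are invented, not the study's data.

```python
def chi_square(table):
    """Pearson chi-square statistic for a 2D contingency table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    stat = 0.0
    for i, r in enumerate(table):
        for j, obs in enumerate(r):
            expected = rows[i] * cols[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat

# Motive (rows) vs crime-scene class (columns), hypothetical counts.
table = [[30, 5], [4, 28]]
print(round(chi_square(table), 2))
```

A statistic this far above the df = 1 critical value (10.83 at p < .001) would indicate the association between motive and crime-scene class that the study reports.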
Findings
Five motive subtypes were identified as well as five distinct crime scene contexts in which serial fire setting occurs. A significant association among these classes suggests that it is possible to infer fire setters’ motive from crime scene behavior and vice versa.
Originality/value
This comprehensive typology of fire setters has potential for profiling of unknown offenders as well as for suspect prioritization in police investigations.
Yuye Wang, Guofeng Zhang and Xiaoguang Hu
Abstract
Purpose
Infrared simulation plays an important role in small, affordable unmanned aerial vehicles. Its main goal is to obtain the infrared image of a specific target. An infrared physical model is established through theoretical research so that the temperature field is available; the infrared image of a specific target can then be simulated properly, taking the atmospheric state and the effect of the infrared imaging system into account. In recent years, some research has been done in this field, but infrared simulation of large scenes remains a key unsolved problem. This paper proposes a classification method based on texture blending, which effectively solves the problem of classifying large numbers of images and increases the frame rate of large infrared scene rendering. The paper aims to discuss these issues.
Design/methodology/approach
The Mosart Atmospheric Tool (MAT) is first used in an offline process to calculate sun radiance, skyshine radiance, path radiance and the temperatures of different materials. A shader in OGRE then performs the final calculation to obtain the simulation result while keeping a high frame rate. To this end, the authors convert the data in the MAT file into textures that the shader can easily handle. In the shader, radiance is indexed by material, vertex normal, eye and sun information. After adding the effect of the infrared imaging system, the final radiance distribution is obtained, and the infrared scene is produced by converting radiance to grayscale.
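The final conversion step, radiance to grayscale, can be sketched as a linear normalisation to 8-bit values; the radiance field below is synthetic rather than MAT-derived, and the real pipeline performs this mapping in the shader.

```python
import numpy as np

def radiance_to_gray(radiance):
    """Linearly map a radiance field to an 8-bit grayscale image."""
    r = np.asarray(radiance, dtype=float)
    lo, hi = r.min(), r.max()
    return np.uint8(np.round(255.0 * (r - lo) / (hi - lo)))

radiance = np.array([[1.0, 2.0],
                     [3.0, 5.0]])
print(radiance_to_gray(radiance))
```

Min-max normalisation is only one possible transfer function; a real imaging-system model would also fold in sensor response and noise.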
Findings
In the fragment shader, pseudo-infrared textures are used to look up the temperature, from which an object's own radiance and the related radiance terms are calculated.
Research limitations/implications
The radiance is converted into a grayscale image while accounting for the effect of the infrared imaging system.
Originality/value
Simulation results show that a high frame rate can be reached while preserving fidelity.
Abstract
Purpose
Single-shot multi-category clothing recognition and retrieval play a crucial role in online searching and offline settlement scenarios. Existing clothing recognition methods based on RGBD clothing images often suffer from high-dimensional feature representations, leading to compromised performance and efficiency.
Design/methodology/approach
To address this issue, this paper proposes a novel method called Manifold Embedded Discriminative Feature Selection (MEDFS) to select global and local features, thereby reducing the dimensionality of the feature representation and improving performance. Specifically, by combining three global features and three local features, a low-dimensional embedding is constructed to capture the correlations between features and categories. The MEDFS method designs an optimization framework utilizing manifold mapping and sparse regularization to achieve feature selection. The optimization objective is solved using an alternating iterative strategy, ensuring convergence.
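As a loose analogue of selecting discriminative features through a linear map (not the MEDFS objective itself, which adds manifold mapping and sparse regularization solved by alternating iteration), one can rank features by the row norms of a regularised least-squares mapping from features to one-hot labels:

```python
import numpy as np

def rank_features(X, y, ridge=1e-3):
    """Features sorted by the l2 norm of their rows in the fitted map W."""
    X = np.asarray(X, dtype=float)
    Y = np.eye(int(y.max()) + 1)[y]  # one-hot label matrix
    W = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)
    return np.argsort(-np.linalg.norm(W, axis=1))

rng = np.random.default_rng(2)
y = np.repeat([0, 1], 50)
informative = y + rng.normal(0.0, 0.1, 100)   # tracks the label
noise = rng.normal(0.0, 1.0, (100, 3))        # irrelevant features
X = np.column_stack([informative, noise])
print(rank_features(X, y)[0])  # feature 0 is ranked first
```

Features with near-zero rows contribute little to the label scores, so dropping them reduces dimensionality in the spirit of the method described.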
Findings
Empirical studies conducted on a publicly available RGBD clothing image dataset demonstrate that the proposed MEDFS method achieves highly competitive clothing classification performance while maintaining efficiency in clothing recognition and retrieval.
Originality/value
This paper introduces a novel approach for multi-category clothing recognition and retrieval, incorporating the selection of global and local features. The proposed method holds potential for practical applications in real-world clothing scenarios.
Huaxiang Song, Chai Wei and Zhou Yong
Abstract
Purpose
The paper aims to tackle the classification of remote sensing images (RSIs), which presents a significant challenge for computer algorithms due to the inherent characteristics of clustered ground objects and noisy backgrounds. Recent research typically leverages larger models to achieve advanced performance. However, remote sensing operating environments commonly cannot provide unconstrained computational and storage resources, so lightweight algorithms with exceptional generalization capabilities are required.
Design/methodology/approach
This study introduces an efficient knowledge distillation (KD) method to build a lightweight yet precise convolutional neural network (CNN) classifier. This method also aims to substantially decrease the training time expenses commonly linked with traditional KD techniques. This approach entails extensive alterations to both the model training framework and the distillation process, each tailored to the unique characteristics of RSIs. In particular, this study establishes a robust ensemble teacher by independently training two CNN models using a customized, efficient training algorithm. Following this, this study modifies a KD loss function to mitigate the suppression of non-target category predictions, which are essential for capturing the inter- and intra-similarity of RSIs.
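The conventional logit-based KD loss that such methods start from is the temperature-softened KL divergence between teacher and student distributions, sketched below with invented logits; the paper's modified loss, which avoids suppressing non-target predictions, is not reproduced here.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, T=4.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(np.asarray(teacher_logits, dtype=float) / T)
    q = softmax(np.asarray(student_logits, dtype=float) / T)
    return float(T * T * np.sum(p * np.log(p / q)))

teacher = [5.0, 1.0, 0.5]
print(kd_loss(teacher, teacher))       # 0.0: identical predictions
print(kd_loss([0.5, 1.0, 5.0], teacher) > 0)  # mismatch is penalised
```

A higher temperature T spreads probability mass onto non-target classes, which is precisely where the inter- and intra-similarity signal of RSIs lives.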
Findings
This study validated the student model, termed KD-enhanced network (KDE-Net), obtained through the KD process on three benchmark RSI data sets. The KDE-Net surpasses 42 other state-of-the-art methods in the literature published from 2020 to 2023. Compared to the top-ranked method’s performance on the challenging NWPU45 data set, KDE-Net demonstrated a noticeable 0.4% increase in overall accuracy with a significant 88% reduction in parameters. Meanwhile, this study’s reformed KD framework significantly enhances the knowledge transfer speed by at least three times.
Originality/value
This study illustrates that the logit-based KD technique can effectively develop lightweight CNN classifiers for RSI classification without substantial sacrifices in computation and storage costs. Compared to neural architecture search or other methods aiming to provide lightweight solutions, this study’s KDE-Net, based on the inherent characteristics of RSIs, is currently more efficient in constructing accurate yet lightweight classifiers for RSI classification.
Falah Alsaqre and Osama Almathkour
Abstract
Classifying moving objects in video sequences has been extensively studied, yet it remains an open problem. In this paper, we propose to solve the moving object classification problem via an extended version of two-dimensional principal component analysis (2DPCA), named category-wise 2DPCA (CW2DPCA). A key component of CW2DPCA is to independently construct optimal projection matrices from object-specific training datasets and produce category-wise feature spaces, in which each feature space uniquely captures the invariant characteristics of the underlying intra-category samples. Consequently, CW2DPCA both enables early separation among the different object categories and extracts effective discriminative features for representing training datasets and test object samples in the classification model, which is a nearest-neighbor classifier. For ease of exposition, we consider human/vehicle classification, although the proposed CW2DPCA-based classification framework can easily be generalized to handle multiple object categories. The experimental results prove the effectiveness of CW2DPCA features in discriminating between humans and vehicles in two publicly available video datasets.
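The category-wise idea can be sketched on toy data: build one 2DPCA projection per category from that category's images, then classify a test image by its nearest neighbour across the category-wise feature spaces. The image sizes and patterns below are invented stand-ins for the human/vehicle samples.

```python
import numpy as np

def projection(images, k=2):
    """Top-k eigenvectors of the 2DPCA image scatter matrix."""
    A = np.asarray(images, dtype=float)
    mean = A.mean(axis=0)
    G = sum((a - mean).T @ (a - mean) for a in A) / len(A)
    vals, vecs = np.linalg.eigh(G)
    return vecs[:, np.argsort(vals)[::-1][:k]]  # columns are projection axes

def classify(test, train_by_cat, k=2):
    """Nearest neighbour over the category-wise 2DPCA feature spaces."""
    best, label = np.inf, None
    for cat, imgs in train_by_cat.items():
        P = projection(imgs, k)
        d = min(np.linalg.norm(test @ P - np.asarray(a, float) @ P)
                for a in imgs)
        if d < best:
            best, label = d, cat
    return label

rng = np.random.default_rng(3)
humans = [rng.normal(0.0, 0.1, (8, 8)) for _ in range(5)]
vehicles = [5 * np.eye(8) + rng.normal(0.0, 0.1, (8, 8)) for _ in range(5)]
train = {"human": humans, "vehicle": vehicles}
test_v = 5 * np.eye(8) + rng.normal(0.0, 0.1, (8, 8))
test_h = rng.normal(0.0, 0.1, (8, 8))
print(classify(test_v, train), classify(test_h, train))
```

Because each projection is fit only on its own category, a test image from another category projects far from that category's training features, which is the early-separation effect described above.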
Abstract
Purpose
Image classification is becoming a supporting technology in several image-processing tasks. Because of the rich semantic information contained in images, it is common for an image to have several labels or tags. This paper aims to develop a novel multi-label classification approach with superior performance.
Design/methodology/approach
Many multi-label classification problems share two main characteristics: label correlations and label imbalance. However, most current methods either model only the label relationships or handle only the imbalance problem with traditional single-label methods. In this paper, the multi-label classification problem is regarded as an unbalanced multi-task learning problem. The multi-task least-squares support vector machine (MTLS-SVM) is generalized for this problem and renamed multi-label LS-SVM (ML2S-SVM).
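A deliberately simplified baseline conveys the multi-task framing: one regularised least-squares scoring vector per label, fitted jointly on shared features with +1/-1 label coding. The shared-subspace coupling that distinguishes ML2S-SVM is omitted, and the data are synthetic.

```python
import numpy as np

def fit_multilabel(X, Y, ridge=1.0):
    """One ridge least-squares scoring vector per label, fitted jointly."""
    X = np.asarray(X, dtype=float)
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]),
                           X.T @ np.asarray(Y, dtype=float))

def predict(X, W):
    """Sign of the linear scores, with labels coded as +1 / -1."""
    return np.where(np.asarray(X, dtype=float) @ W >= 0, 1, -1)

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 5))
Y = np.where(X[:, :2] > 0, 1, -1)   # two labels, each tied to one feature
W = fit_multilabel(X, Y)
acc = float((predict(X, W) == Y).mean())
print(acc)
```

Coupling the per-label weight vectors through a shared subspace, as ML2S-SVM does, is what lets correlated labels borrow strength from one another.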
Findings
Experimental results on the emotions, scene, yeast and bibtex data sets indicate that ML2S-SVM is competitive with state-of-the-art methods in terms of Hamming loss and instance-based F1 score. The values of the resulting parameters largely influence the performance of ML2S-SVM, so users need to identify proper parameters in advance.
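Both reported measures are easy to state precisely; the toy label matrices below (rows = instances, columns = labels) are invented for illustration.

```python
import numpy as np

def hamming_loss(Y_true, Y_pred):
    """Fraction of instance-label pairs predicted incorrectly."""
    Y_true, Y_pred = np.asarray(Y_true), np.asarray(Y_pred)
    return float((Y_true != Y_pred).mean())

def instance_f1(Y_true, Y_pred):
    """Mean per-instance F1 over binary label vectors."""
    Y_true, Y_pred = np.asarray(Y_true), np.asarray(Y_pred)
    tp = (Y_true & Y_pred).sum(axis=1)
    denom = Y_true.sum(axis=1) + Y_pred.sum(axis=1)
    return float(np.mean(2 * tp / np.maximum(denom, 1)))

Y_true = [[1, 0, 1], [0, 1, 0]]
Y_pred = [[1, 0, 0], [0, 1, 0]]
print(hamming_loss(Y_true, Y_pred))  # 1 wrong pair of 6
print(instance_f1(Y_true, Y_pred))
```

Lower Hamming loss and higher instance-based F1 are better, which is the sense in which the competitiveness claim above should be read.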
Originality/value
On the basis of MTLS-SVM, a novel multi-label classification approach, ML2S-SVM, is put forward. This method not only overcomes the imbalance problem but also explicitly models arbitrary-order correlations among labels by allowing multiple labels to share a subspace. In addition, the approach has a wide range of applications; it is not limited to image classification.