Search results

1 – 10 of over 7000
Article
Publication date: 18 January 2013

Chen Guodong, Zeyang Xia, Rongchuan Sun, Zhenhua Wang and Lining Sun

Abstract

Purpose

Detecting objects in images and videos is a difficult task that has challenged the field of computer vision. Most of the algorithms for object detection are sensitive to background clutter and occlusion, and cannot localize the edge of the object. An object's shape is typically the most discriminative cue for its recognition by humans. The purpose of this paper is to introduce a model‐based object detection method which uses only shape‐fragment features.

Design/methodology/approach

The object shape model is learned from a small set of training images, and all object models are composed of shape fragments. The object model is represented at multiple scales.

Findings

The major contributions of this paper are the application of a learned shape-fragment-based model for object detection in complex environments and a novel two-stage object detection framework.

Originality/value

The results presented in this paper are competitive with other state‐of‐the‐art object detection methods.

Article
Publication date: 20 April 2023

Vishva Payghode, Ayush Goyal, Anupama Bhan, Sailesh Suryanarayan Iyer and Ashwani Kumar Dubey

Abstract

Purpose

This paper aims to implement and extend the You Only Look Once (YOLO) algorithm for the detection of objects and activities. The advantage of YOLO is that it runs a neural network only once to detect the objects in an image, which is why it is powerful and fast. Cameras are found at many different crossroads and locations, but processing the video feed through an object detection algorithm allows what is captured to be determined and tracked. Video surveillance has many applications, such as car tracking and the tracking of people for crime prevention. This paper provides an exhaustive comparison between the existing methods and the proposed method, which is found to have the highest object detection accuracy.

Design/methodology/approach

The goal of this research is to develop a deep learning framework to automate the task of analyzing video footage through object detection in images. This framework processes video feed or image frames from CCTV, webcam or a DroidCam, which allows the camera in a mobile phone to be used as a webcam for a laptop. The object detection algorithm, with its model trained on a large data set of images, is able to load each image given as input, process it and determine the categories of the matching objects that it finds. As a proof of concept, this research demonstrates the algorithm on images of several different objects. This research implements and extends the YOLO algorithm for the detection of objects and activities. The advantage of YOLO is that it runs a neural network only once to detect the objects in an image, which is why it is powerful and fast. Cameras are found at many different crossroads and locations, but processing the video feed through an object detection algorithm allows what is captured to be determined and tracked. For video surveillance with traffic cameras, this has many applications, such as car tracking and person tracking for crime prevention. In this research, the implemented algorithm with the proposed methodology is compared against several prior methods from the literature and was found to have the highest accuracy for both object detection and activity recognition.
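As a rough illustration of the post-processing step that single-pass detectors such as YOLO rely on, the sketch below implements greedy non-maximum suppression over candidate boxes in NumPy. The `(x1, y1, x2, y2)` box format, the 0.5 IoU threshold and all names are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one (x1, y1, x2, y2) box and an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def non_max_suppression(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: repeatedly keep the highest-scoring box, drop overlapping boxes."""
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        order = rest[iou(boxes[best], boxes[rest]) < iou_thresh]
    return keep
```

For example, two heavily overlapping detections of one car collapse to the single higher-scoring box, while a distant detection is kept.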

Findings

The results indicate that the proposed deep learning–based model can be implemented in real time for object detection and activity recognition. The added features of car crash detection, fall detection and social distancing detection can be used to implement a real-time video surveillance system that can help save lives and protect people. Such a system could be installed at street and traffic cameras and in CCTV systems. When the system detects a car crash or a fatal human or pedestrian fall with injury, it can be programmed to send automatic messages to the nearest police, emergency and fire stations. When it detects a social distancing violation, it can be programmed to inform the local authorities or sound an alarm with a warning message alerting the public to maintain their distance and avoid spreading aerosol particles that may transmit viruses, including the COVID-19 virus.

Originality/value

This paper proposes an improved and augmented version of the YOLOv3 model that has been extended to perform activity recognition, such as car crash detection, human fall detection and social distancing detection. The proposed model is based on a deep learning convolutional neural network used to detect objects in images and is trained on the widely used, publicly available Common Objects in Context data set. Being an extension of YOLO, the proposed model can be implemented for real-time object and activity recognition. It achieved higher accuracies for both large-scale and all-scale object detection and outperformed all the other methods compared in extending and augmenting object detection to activity recognition, yielding the highest accuracy for car crash detection, fall detection and social distancing detection.

Details

International Journal of Web Information Systems, vol. 19 no. 3/4
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 4 September 2017

Stephan Mühlbacher-Karrer, Juliana Padilha Leitzke, Lisa-Marie Faller and Hubert Zangl

Abstract

Purpose

This paper aims to investigate the usability of the non-iterative monotonicity approach for electrical capacitance tomography (ECT)-based object detection. This is of particular importance with respect to object detection in robotic applications.

Design/methodology/approach

With respect to the detection problem, the authors propose a precomputed threshold value for the exclusion test to speed up the algorithm. Furthermore, they show that the use of an inhomogeneous split-up strategy of the region of interest (ROI) improves the performance of the object detection.

Findings

The proposed split-up strategy enables the use of the monotonicity approach for robotic applications, where the spatial placement of the electrodes is constrained to a planar geometry. Additionally, owing to the improvements in the exclusion tests, the selection of subregions in the ROI allows for avoiding self-detection. Furthermore, the computational costs of the algorithm are reduced owing to the use of a predefined threshold, while the detection capabilities are not significantly influenced.

Originality/value

The presented simulation results show that the adapted split-up strategies for the ROI significantly improve the detection performance compared to the traditional ROI split-up strategy. Thus, the monotonicity approach becomes applicable to ECT-based object detection in applications where only a reduced number of electrodes with constrained spatial placement can be used, such as in robotics.

Details

COMPEL - The international journal for computation and mathematics in electrical and electronic engineering, vol. 36 no. 5
Type: Research Article
ISSN: 0332-1649

Article
Publication date: 21 November 2022

Aslan Ahmet Haykir and Ilkay Oksuz

Abstract

Purpose

Data quality and data resolution are essential for computer vision tasks like medical image processing, object detection and pattern recognition. Super-resolution is a way to increase image resolution, and super-resolved images contain more information compared to their low-resolution counterparts. The purpose of this study is to analyze the effects of previously trained super-resolution models on object detection for aerial images.

Design/methodology/approach

Two different models were trained using the Super-Resolution Generative Adversarial Network (SRGAN) architecture on two aerial image data sets, the xView and the Dataset for Object deTection in Aerial images (DOTA). This study uses these models to increase the resolution of aerial images for improving object detection performance. This study analyzes the effects of the model with the best perceptual index (PI) and the model with the best RMSE on object detection in detail.
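The RMSE criterion used to select one of the two models can be computed directly from an image pair; the sketch below shows RMSE and the related PSNR for a high-resolution reference and a super-resolved output. This is a generic illustration, not code from the study, and the perceptual index (which combines no-reference quality scores) is not reproduced here.

```python
import numpy as np

def rmse(hr, sr):
    """Root-mean-square error between a reference image and a super-resolved image."""
    return float(np.sqrt(np.mean((hr.astype(np.float64) - sr.astype(np.float64)) ** 2)))

def psnr(hr, sr, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    e = rmse(hr, sr)
    return float("inf") if e == 0 else 20.0 * np.log10(max_val / e)
```

A model selected for low RMSE maximizes this pixel-wise fidelity, which, as the findings note, does not have to coincide with the best perceptual quality.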

Findings

Super-resolution increases the object detection quality as expected. However, the super-resolution model with the better perceptual quality achieves lower mean average precision than the model with the better RMSE. This means that the model with the better PI is more meaningful to human perception but less useful to computer vision.

Originality/value

The contributions of the authors to the literature are threefold. First, they provide a wide analysis of SRGAN results for aerial image super-resolution on the task of object detection. Second, they compare the super-resolution models with the best PI and the best RMSE to showcase, for the first time in the literature, the differences in object detection performance as a downstream task. Finally, they use a transfer learning approach for super-resolution to improve the performance of object detection.

Details

Information Discovery and Delivery, vol. 51 no. 4
Type: Research Article
ISSN: 2398-6247

Article
Publication date: 21 May 2021

Chang Liu, Samad M.E. Sepasgozar, Sara Shirowzhan and Gelareh Mohammadi

Abstract

Purpose

The practice of artificial intelligence (AI) is increasingly being promoted by technology developers. However, its adoption rate is still reported as low in the construction industry due to a lack of expertise and the limited reliable applications for AI technology. Hence, this paper aims to present the detailed outcome of experimentations evaluating the applicability and the performance of AI object detection algorithms for construction modular object detection.

Design/methodology/approach

This paper provides a thorough evaluation of two deep learning algorithms for object detection: the faster region-based convolutional neural network (faster RCNN) and the single shot multi-box detector (SSD). Two types of metrics are also presented: first, the average recall and mean average precision by image pixels; second, the recall and precision by counting. To conduct the experiments with the selected algorithms, four infrastructure and building construction sites were chosen to collect the required data, comprising a total of 990 images of three common modular objects: modular panels, safety barricades and site fences.
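The counting-based metrics reduce to simple ratios of true-positive, false-positive and false-negative detection counts. A minimal sketch, with the function name and zero-division guards being our own illustrative choices:

```python
def precision_recall(tp, fp, fn):
    """Counting-based precision and recall from detection counts.

    tp: correctly detected objects; fp: spurious detections; fn: missed objects.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

For instance, 8 correct detections with 2 false alarms and 4 misses gives precision 0.8 and recall 2/3.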

Findings

The results of the comprehensive evaluation of the algorithms show that the performance of faster RCNN and SSD depends on the context in which detection occurs. Indeed, surrounding objects and the backgrounds of the objects affect the level of accuracy obtained from the AI analysis and may particularly affect precision and recall. The analysis of loss lines shows that the loss lines for selected objects depend on both their geometry and the image background. The results on selected objects show that faster RCNN offers higher accuracy than SSD for detecting the selected objects.

Research limitations/implications

The results show that modular object detection is crucial in construction for the achievement of the required information for project quality and safety objectives. The detection process can significantly improve monitoring object installation progress in an accurate and machine-based manner avoiding human errors. The results of this paper are limited to three construction sites, but future investigations can cover more tasks or objects from different construction sites in a fully automated manner.

Originality/value

This paper’s originality lies in offering new AI applications in modular construction, using a large first-hand data set collected from three construction sites. Furthermore, the paper presents the scientific evaluation results of implementing recent object detection algorithms across a set of extended metrics using the original training and validation data sets to improve the generalisability of the experimentation. This paper also provides the practitioners and scholars with a workflow on AI applications in the modular context and the first-hand referencing data.

Article
Publication date: 25 January 2023

Hui Xu, Junjie Zhang, Hui Sun, Miao Qi and Jun Kong

Abstract

Purpose

Attention is one of the most important factors affecting the academic performance of students. Effectively analyzing students' attention in class can promote teachers' precise teaching and students' personalized learning. To intelligently analyze students' attention in the classroom from the first-person perspective, this paper proposes a fusion model based on gaze tracking and object detection. In particular, the proposed attention analysis model does not depend on any smart equipment.

Design/methodology/approach

Given a first-person view video of students' learning, the authors first estimate the gazing point by using the deep space–time neural network. Second, single shot multi-box detector and fast segmentation convolutional neural network are comparatively adopted to accurately detect the objects in the video. Third, they predict the gazing objects by combining the results of gazing point estimation and object detection. Finally, the personalized attention of students is analyzed based on the predicted gazing objects and the measurable eye movement criteria.
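The third step, combining gazing point estimation with object detection, amounts to checking which detected bounding box contains the estimated gaze point. Below is a minimal sketch under assumed data shapes (string labels with `(x1, y1, x2, y2)` boxes); breaking ties toward the smallest containing box is our own illustrative choice, not necessarily the paper's rule.

```python
def gazed_object(gaze_point, detections):
    """Return the label of the detected box containing the gaze point.

    detections: list of (label, (x1, y1, x2, y2)) pairs.
    Ties go to the smallest containing box; None if no box contains the point.
    """
    gx, gy = gaze_point
    hits = []
    for label, (x1, y1, x2, y2) in detections:
        if x1 <= gx <= x2 and y1 <= gy <= y2:
            hits.append(((x2 - x1) * (y2 - y1), label))  # (area, label)
    return min(hits)[1] if hits else None
```

Accumulating these per-frame predictions over time yields the gazing trajectory from which attention is analyzed.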

Findings

A large number of experiments were carried out on a public database and on a new dataset built in a real classroom. The experimental results show that the proposed model not only accurately tracks the students' gazing trajectory and effectively analyzes the attention fluctuations of individual students and the class as a whole but also provides a valuable reference for evaluating students' learning process.

Originality/value

The contributions of this paper can be summarized as follows. The analysis of students' attention plays an important role in improving teaching quality and student achievement, yet there is little research on how to analyze students' attention automatically and intelligently. To alleviate this problem, this paper focuses on analyzing students' attention through gaze tracking and object detection in classroom teaching, which is significant for practical application in the field of education. The authors propose an effective intelligent fusion model based on deep neural networks, mainly comprising a gazing point module and an object detection module, to analyze students' attention in classroom teaching without relying on any smart wearable device. They introduce an attention mechanism into the gazing point module to improve gazing point detection and perform comparison experiments on the public dataset to show that the module achieves better performance. They associate the eye movement criteria with visual gaze to obtain quantifiable objective data for students' attention analysis, which can provide a valuable basis for evaluating the learning process of students, provide useful learning information for both parents and teachers and support the development of individualized teaching. They built a new database containing first-person view videos of 11 subjects in a real classroom and use it to evaluate the effectiveness and feasibility of the proposed model.

Details

Data Technologies and Applications, vol. 57 no. 5
Type: Research Article
ISSN: 2514-9288

Open Access
Article
Publication date: 25 March 2021

Bartłomiej Kulecki, Kamil Młodzikowski, Rafał Staszak and Dominik Belter

Abstract

Purpose

The purpose of this paper is to propose and evaluate the method for grasping a defined set of objects in an unstructured environment. To this end, the authors propose the method of integrating convolutional neural network (CNN)-based object detection and the category-free grasping method. The considered scenario is related to mobile manipulating platforms that move freely between workstations and manipulate defined objects. In this application, the robot is not positioned with respect to the table and manipulated objects. The robot detects objects in the environment and uses grasping methods to determine the reference pose of the gripper.

Design/methodology/approach

The authors implemented the whole pipeline which includes object detection, grasp planning and motion execution on the real robot. The selected grasping method uses raw depth images to find the configuration of the gripper. The authors compared the proposed approach with a representative grasping method that uses a 3D point cloud as an input to determine the grasp for the robotic arm equipped with a two-fingered gripper. To measure and compare the efficiency of these methods, the authors measured the success rate in various scenarios. Additionally, they evaluated the accuracy of object detection and pose estimation modules.

Findings

The performed experiments revealed that the CNN-based object detection and the category-free grasping methods can be integrated to obtain a system that allows grasping defined objects in an unstructured environment. The authors also identified the specific limitations of neural-based and point cloud-based methods and show how the identified properties influence the performance of the whole system.

Research limitations/implications

The authors identified the limitations of the proposed methods and the improvements are envisioned as part of future research.

Practical implications

The evaluation of the grasping and object detection methods on the mobile manipulating robot may be useful for all researchers working on the autonomy of similar platforms in various applications.

Social implications

The proposed method increases the autonomy of robots in small-industry applications involving repetitive tasks in noisy and potentially risky environments, reducing the human workload in these types of environments.

Originality/value

The main contribution of this research is the integration of the state-of-the-art methods for grasping objects with object detection methods and evaluation of the whole system on the industrial robot. Moreover, the properties of each subsystem are identified and measured.

Details

Industrial Robot: the international journal of robotics research and application, vol. 48 no. 5
Type: Research Article
ISSN: 0143-991X

Open Access
Article
Publication date: 1 October 2018

Xunjia Zheng, Bin Huang, Daiheng Ni and Qing Xu

Abstract

Purpose

The purpose of this paper is to accurately capture the risks which are caused by each road user in time.

Design/methodology/approach

The authors proposed a novel risk assessment approach based on a multi-sensor fusion algorithm in the real traffic environment. First, they proposed a novel detection-level fusion approach for multi-object perception in dense traffic environments based on evidence theory. This approach integrated four states of track life into a generic fusion framework to improve the performance of multi-object perception, accurately obtaining information on object type, position and velocity. They then conducted several experiments in real dense traffic environments on highways and urban roads, which enabled them to propose a novel road traffic risk modeling approach based on the dynamic analysis of vehicles in a variety of driving scenarios. By analyzing the generation process of traffic risks between vehicles and the road environment, the equivalent forces of vehicle–vehicle and vehicle–road interaction were presented and theoretically calculated. The predicted steering angle and trajectory were considered in determining the traffic risk influence area.
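The evidence-theoretic core of a detection-level fusion framework like this is Dempster's rule of combination. The sketch below combines two mass functions defined over hypothesis sets; it shows only the basic combination step, not the paper's full framework with track-life states, and the vehicle-class hypotheses in the usage are invented for illustration.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions (dicts of frozenset -> mass) by Dempster's rule.

    Masses of intersecting focal elements multiply and accumulate; mass assigned
    to contradictory (empty) intersections is removed by normalization.
    """
    combined = {}
    conflict = 0.0
    for a, w1 in m1.items():
        for b, w2 in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + w1 * w2
            else:
                conflict += w1 * w2
    k = 1.0 - conflict  # normalization constant; assumes sources are not fully conflicting
    return {s: w / k for s, w in combined.items()}
```

For example, one sensor mostly believing "car" and another split between "car" and "truck" yields a combined belief concentrated on "car" after the conflicting mass is normalized away.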

Findings

The results of multi-object perception in the experiments showed that the proposed fusion approach achieved low rates of false and missed tracking, and the road traffic risk was described as a field of equivalent force. The results extend the understanding of traffic risk, supporting the claim that the traffic risk in front of and behind the vehicle can be perceived in advance.

Originality/value

This approach integrated four states of track life into a generic fusion framework to improve the performance of multi-object perception. The information of object type, position and velocity was used to reduce erroneous data association between tracks and detections. Then, the authors conducted several experiments in real dense traffic environment on highways and urban roads, which enabled them to propose a novel road traffic risk modeling approach based on the dynamic analysis of vehicles in a variety of driving scenarios. By analyzing the generation process of traffic risks between vehicles and the road environment, the equivalent forces of vehicle–vehicle and vehicle–road were presented and theoretically calculated.

Details

Journal of Intelligent and Connected Vehicles, vol. 1 no. 2
Type: Research Article
ISSN: 2399-9802

Open Access
Article
Publication date: 29 July 2020

T. Mahalingam and M. Subramoniam

Abstract

Surveillance is an emerging concept in current technology, as it plays a vital role in monitoring activities around the world, and identifying and tracking moving objects by means of computer vision techniques is a major part of surveillance. Moving object detection in video analysis is the initial step for various computer vision applications. The main drawback of existing object tracking methods is that they are time-consuming when the video contains a high volume of information, and certain issues arise in choosing the optimum tracking technique for this huge volume of data. The situation becomes worse when the tracked object changes orientation over time, and it is also difficult to track multiple objects at the same time. To overcome these issues, we propose an effective method for object detection and movement tracking: a robust video object detection and tracking technique divided into three phases, namely a detection phase, a tracking phase and an evaluation phase. The detection phase comprises foreground segmentation and noise reduction: a Mixture of Adaptive Gaussians (MoAG) model is proposed to achieve efficient foreground segmentation, and a fuzzy morphological filter is implemented to remove the noise present in the foreground-segmented frames. Moving object tracking is achieved by blob detection, which comes under the tracking phase. Finally, the evaluation phase comprises feature extraction and classification: texture-based and quality-based features are extracted from the processed frames and given to the classifier, for which we use J48, i.e. a decision-tree-based classifier. The performance of the proposed technique is analyzed against the existing techniques k-NN and MLP in terms of precision, recall, f-measure and ROC.
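As a simplified stand-in for the MoAG foreground segmentation described above, the sketch below keeps a single running Gaussian per pixel and flags pixels that deviate from it; the learning rate and threshold are illustrative assumptions, and the real MoAG model maintains several adaptive Gaussians per pixel rather than one.

```python
import numpy as np

def update_background(mean, var, frame, alpha=0.05, k=2.5):
    """One step of a single-Gaussian-per-pixel background model.

    A pixel is foreground when it lies more than k standard deviations from
    the background mean; the mean and variance then adapt at rate alpha.
    """
    diff = frame - mean
    foreground = diff ** 2 > (k ** 2) * var   # boolean mask of moving pixels
    mean = mean + alpha * diff                # drift mean toward the new frame
    var = (1 - alpha) * var + alpha * diff ** 2
    return foreground, mean, var
```

Run frame by frame, the mask feeds the noise-reduction and blob-detection stages described in the abstract.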

Details

Applied Computing and Informatics, vol. 17 no. 1
Type: Research Article
ISSN: 2634-1964

Article
Publication date: 14 November 2016

Anan Banharnsakun and Supannee Tanathong

Abstract

Purpose

Developing algorithms for the automated detection and tracking of multiple objects is one challenge in the field of object tracking. In a traffic video monitoring system especially, vehicle detection is an essential and challenging task. In previous studies, many vehicle detection methods have been presented. These approaches mostly used either motion information or characteristic information to detect vehicles. Although these methods are effective in detecting vehicles, their detection accuracy still needs to be improved. Moreover, the headlights and windshields, which are used as vehicle features for detection in these methods, are easily obscured in some traffic conditions. The paper aims to discuss these issues.

Design/methodology/approach

First, each frame is captured from a video sequence and background subtraction is performed using the Mixture-of-Gaussians background model. Next, the Shi-Tomasi corner detection method is employed to extract feature points from the objects of interest in each foreground scene, and a hierarchical clustering approach is then applied to cluster them and form feature blocks. These feature blocks are used to track the moving objects frame by frame.
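The clustering step, grouping nearby corner points into feature blocks, can be sketched as single-linkage grouping with a distance cut-off. The version below uses a small union-find over pairwise distances rather than a library call; the 20-pixel threshold is an illustrative assumption, not a parameter from the paper.

```python
import numpy as np

def cluster_feature_points(points, max_dist=20.0):
    """Single-linkage grouping: points closer than max_dist share a cluster label.

    points: (n, 2) array of corner coordinates. Returns one root label per point.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Pairwise Euclidean distances between all corner points.
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    for i in range(n):
        for j in range(i + 1, n):
            if d[i, j] <= max_dist:
                parent[find(i)] = find(j)  # merge the two clusters
    return [find(i) for i in range(n)]
```

Each resulting cluster can then be wrapped in a bounding box to form the feature block that is tracked frame by frame.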

Findings

Using the proposed method, vehicles can be detected in both day-time and night-time scenarios with a 95 percent accuracy rate, and the method can cope with irrelevant movement (waving trees) that has to be treated as background. In addition, the proposed method is able to deal with different vehicle shapes such as cars, vans and motorcycles.

Originality/value

This paper presents a hierarchical clustering of features approach for tracking multiple vehicles in traffic environments, improving detection and tracking capability in cases where vehicle features are obscured by traffic conditions.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 9 no. 4
Type: Research Article
ISSN: 1756-378X
