Search results
11 – 20 of over 23,000
Satya Sujith and G. Sasikala
Abstract
Purpose
Object detection models have gained considerable popularity as they aid in many applications, such as monitoring and video surveillance. Object detection through video tracking faces many challenges, as most real-time video streams are degraded by environmental factors.
Design/methodology/approach
This research develops a system for crowd tracking and crowd behaviour recognition using a hybrid tracking model. The input to the proposed crowd tracking system is high-density crowd video containing hundreds of people. The first step is to detect humans through visual recognition algorithms. Here, a priori knowledge of the location point is given as input to the visual recognition algorithm, which identifies humans through constraints defined within a Minimum Bounding Rectangle (MBR). The spatial tracking model then tracks the path of each human object's movement in the video frame; tracking is carried out by extracting color histogram and texture features. In addition, a temporal tracking model based on a NARX neural network is applied to detect the location of moving objects. Once a person's path is tracked, the behaviour of every human object is identified using the Optimal Support Vector Machine (OSVM), newly developed by combining SVM with an optimization algorithm, namely MBSO. The proposed MBSO algorithm is developed through the integration of the existing BSA and MBO techniques.
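The abstract names color histogram features as one basis of the spatial tracking but does not spell out the matching rule. A minimal, hypothetical sketch of histogram-based patch matching (the bin count and intersection similarity are illustrative choices, not taken from the paper):

```python
import numpy as np

def color_histogram(patch, bins=8):
    """L1-normalized per-channel color histogram of an RGB patch."""
    hists = [np.histogram(patch[..., c], bins=bins, range=(0, 256))[0]
             for c in range(3)]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1.0 for identical normalized histograms."""
    return float(np.minimum(h1, h2).sum())

def best_match(target_hist, candidate_patches):
    """Index of the candidate patch whose histogram best matches the target."""
    sims = [histogram_intersection(target_hist, color_histogram(p))
            for p in candidate_patches]
    return int(np.argmax(sims))
```

In a tracker of this kind, candidate patches would be sampled around the previous location in the next frame and the best-matching patch taken as the new position.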
Findings
The Tracking in High Crowd Density dataset is used for object tracking. The proposed OSVM classifier attained improved performance, with an accuracy of 0.95.
Originality/value
This paper presents a hybrid high-density video tracking model and a behaviour recognition model. The proposed hybrid tracking model tracks the path of an object in the video through temporal tracking and spatial tracking. The extracted features train the proposed OSVM classifier using weights selected by the proposed MBSO algorithm, which can be regarded as a modified version of the BSO algorithm.
Nitha Thomas, Joshin John Mathew and Alex James
Abstract
Purpose
The real-time generation of feature descriptors for object recognition is a challenging problem. The purpose of this paper is to provide a hardware-friendly framework to generate sparse features useful for key feature point selection, feature extraction and descriptor construction. The inspiration is drawn from the feature formation processes of the human brain, taking into account the sparse, modular and hierarchical processing of visual information.
Design/methodology/approach
A sparse set of neurons, referred to as active neurons, determines the feature points necessary for high-level vision applications such as object recognition. A psycho-physical mechanism of human low-level vision relates edge detection to noticeable local spatial stimuli represented by this set of active neurons. A cognitive memory cell array-based implementation of low-level vision is proposed. The memory cell's application to edge detection realizes human-vision-inspired feature selection, leading to feature vector construction for high-level vision applications.
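The neuron threshold logic itself is not specified in the abstract. As a loose illustration only, an edge pixel can be modelled as a unit that fires when its local intensity gradient exceeds a threshold:

```python
import numpy as np

def edge_map(img, threshold=30):
    """Binary edge map: a pixel 'fires' when the magnitude of its local
    intensity gradient exceeds the threshold. This is a crude stand-in for
    the neuron threshold logic, not the cognitive memory cell implementation."""
    img = img.astype(float)
    gx = np.abs(np.diff(img, axis=1, prepend=img[:, :1]))  # horizontal gradient
    gy = np.abs(np.diff(img, axis=0, prepend=img[:1, :]))  # vertical gradient
    return (np.maximum(gx, gy) > threshold).astype(np.uint8)
```

The set of firing pixels plays the role of the sparse active-neuron set from which feature points would be selected.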
Findings
The true parallel architecture and faster response of cognitive circuits avoid time-costly and redundant feature extraction steps. Validation of the proposed feature vector for high-level computer vision applications is demonstrated on standard object recognition databases. Comparison against existing state-of-the-art object recognition features and methods shows accuracies of 97, 95 and 69 percent on the Columbia Object Image Library-100, ALOI and PASCAL VOC 2007 databases, an increase over benchmark methods of 5, 3 and 10 percent, respectively.
Originality/value
A hardware-friendly, low-level sparse edge feature processing system is proposed for recognizing objects. The edge features are developed based on the threshold logic of neurons, and the sparse selection of features applies modular and hierarchical processing inspired by the human neural system.
Mario Peña‐Cabrera, Ismael Lopez‐Juarez, Reyes Rios‐Cabrera and Jorge Corona‐Castuera
Abstract
Purpose
To present a novel methodology for online recognition and classification of pieces in robotic assembly tasks and its application within an intelligent manufacturing cell.
Design/methodology/approach
The performance of industrial robots working in unstructured environments can be improved using visual perception and learning techniques. Object recognition is accomplished using an artificial neural network (ANN) architecture that receives a descriptive vector called CFD&POSE as input. Experiments were carried out within a manufacturing cell using assembly parts.
Findings
This vector represents an innovative methodology for the classification and identification of pieces in robotic tasks, obtaining fast recognition and pose estimation information in real time. The vector compresses 3D object data from assembly parts; it is invariant to scale, rotation and orientation, and it supports a wide range of illumination levels.
Research limitations/implications
The approach provides vision guidance in assembly tasks. Current work addresses the use of ANNs for assembly and object recognition separately; future work is oriented towards using the same neural controller for all the different sensorial modes.
Practical implications
Intelligent manufacturing cells developed with multimodal sensor capabilities might use this methodology for future industrial applications, including robotic fixtureless assembly. The approach, in combination with the fast learning capability of ART networks, is suitable for industrial robot applications, as demonstrated through experimental results.
Originality/value
This paper introduces a novel method that uses collections of 2D images to obtain fast feature data – the "current frame descriptor vector" – of an object, using image projections and canonical-form geometry grouping for invariant object recognition.
Abstract
A great number of object recognition systems for industry have been designed during the last 10–15 years.[1,2] Most of them use TV-cameras as data sources.[3] The main advantage of such systems is their high resolution. At the same time, many manufacturing processes do not need high-resolution sensors but require high reliability of recognition while working in environments with poor optical conditions.
Li Xiao, Hye-jin Kim and Min Ding
Abstract
Purpose
The advancement of multimedia technology has spurred the use of multimedia in business practice. The adoption of audio and visual data will accelerate as marketing scholars become more aware of the value of these data and of the technologies required to reveal insights into marketing problems. This chapter aims to introduce marketing scholars to this field of research.
Design/methodology/approach
This chapter reviews the current technology in audio and visual data analysis and discusses rewarding research opportunities in marketing using these data.
Findings
Compared with traditional data such as survey and scanner data, audio and visual data provide richer information and are easier to collect. Given this superiority, together with data availability, feasibility of storage and increasing computational power, we believe that these data will contribute to better marketing practices with the help of marketing scholars in the near future.
Practical implications
The adoption of audio and visual data in marketing practice will help practitioners gain better insights into marketing problems and thus make better decisions.
Value/originality
This chapter makes the first attempt in the marketing literature to review the current technology in audio and visual data analysis and proposes promising applications of such technology. We hope it will inspire scholars to utilize audio and visual data in marketing research.
Shixin Zhang, Jianhua Shan, Fuchun Sun, Bin Fang and Yiyong Yang
Abstract
Purpose
The purpose of this paper is to present a novel tactile sensor and a visual-tactile recognition framework to reduce the uncertainty of the visual recognition of transparent objects.
Design/methodology/approach
A multitask learning model is used to recognize intuitive appearance attributes other than texture in the visual mode. The tactile mode adopts a novel vision-based tactile sensor with a level-regional feature extraction network (LRFE-Net) recognition framework to acquire high-resolution texture information and temperature information. Finally, the attribute results of the two modes are integrated according to integration rules.
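The integration rules themselves are not given in the abstract. One plausible sketch (the attribute names and the override rule are assumptions, not the authors' rules) lets the tactile mode override vision for the attributes it senses directly:

```python
# Hypothetical fusion rule: tactile readings override visual ones for the
# attributes the tactile sensor measures directly; vision supplies the rest.
TACTILE_ATTRIBUTES = {"texture", "temperature"}

def fuse_attributes(visual: dict, tactile: dict) -> dict:
    """Merge per-attribute predictions from the two modes into one result."""
    fused = dict(visual)
    for attr, value in tactile.items():
        if attr in TACTILE_ATTRIBUTES:
            fused[attr] = value
    return fused
```

For a transparent object, vision would supply style, handle and transparency, while texture and temperature would come from the tactile readings.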
Findings
The recognition accuracy of attributes, such as style, handle, transparency and temperature, is near 100%, and the texture recognition accuracy is 98.75%. The experimental results demonstrate that the proposed framework with a vision-based tactile sensor can improve attribute recognition.
Originality/value
Transparency and visual differences make the texture of transparent glass hard to recognize. Vision-based tactile sensors can improve the texture recognition effect and acquire additional attributes. Integrating visual and tactile information is beneficial to acquiring complete attribute features.
Chen Guodong, Zeyang Xia, Rongchuan Sun, Zhenhua Wang and Lining Sun
Abstract
Purpose
Detecting objects in images and videos is a difficult task that has challenged the field of computer vision. Most object detection algorithms are sensitive to background clutter and occlusion and cannot localize the edges of objects. An object's shape is typically the most discriminative cue for its recognition by humans. The purpose of this paper is to introduce a model-based object detection method that uses only shape-fragment features.
Design/methodology/approach
The object shape model is learned from a small set of training images, and all object models are composed of shape fragments. The object model is represented at multiple scales.
Findings
The major contributions of this paper are the application of a learned shape-fragment-based model for object detection in complex environments and a novel two-stage object detection framework.
Originality/value
The results presented in this paper are competitive with other state‐of‐the‐art object detection methods.
Fei Yan, Ke Wang, Jizhong Xiao and Ruifeng Li
Abstract
Purpose
The most prominent example of a scan matching algorithm is the Iterative Closest Point (ICP) algorithm. However, the ICP algorithm and its variants depend excessively on the initial pose estimate between two scans. The purpose of this paper is to propose a scan matching algorithm that is robust to large initial pose errors.
Design/methodology/approach
The environments are represented by flat units and upright units. The upright units are clustered to represent objects that the robot cannot cross over. Each object cluster is further discretized to generate a layered model consisting of cross-section ellipses. The layered model provides simplified features that allow an object recognition algorithm to discriminate among common objects in outdoor environments. A layered model graph is constructed with the recognized objects as nodes. Based on the similarity of sub-graphs in each scan, the layered model graph-based matching algorithm generates initial pose estimates and uses ICP to refine the scan matching results.
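The refinement stage is standard ICP. One iteration (nearest-neighbour matching followed by a closed-form rigid alignment via SVD) can be sketched as follows; this is generic ICP, not the authors' layered-model pipeline:

```python
import numpy as np

def icp_step(src, dst):
    """One ICP iteration: match each source point to its nearest destination
    point, then apply the least-squares rigid transform (Kabsch/SVD)."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    matched = dst[d2.argmin(axis=1)]          # brute-force nearest neighbours
    sc, mc = src.mean(0), matched.mean(0)
    H = (src - sc).T @ (matched - mc)         # cross-covariance of centred sets
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                  # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mc - R @ sc
    return src @ R.T + t                      # transformed source points
```

Iterating this step until the residual stops shrinking gives the classic ICP loop; the paper's contribution is the graph-based initial estimate that keeps this loop out of bad local minima.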
Findings
Experimental results indicate that the proposed algorithm can deal with bad initial pose estimates and increase the processing speed. Its computation time is short enough for real-time implementation in robotic applications in outdoor environments.
Originality/value
This paper proposes a bio-inspired scan matching algorithm for mobile robots based on layered model graph in outdoor environments.
Komal Ghafoor, Tauqir Ahmad, Muhammad Aslam and Samyan Wahla
Abstract
Purpose
Assistive technology has been developed to support visually impaired individuals in their social interactions. Specifically designed to enhance communication skills, facilitate social engagement and improve overall quality of life, conversational assistive technologies include speech recognition APIs, text-to-speech APIs and various communication tools that enable real-time interaction. Using natural language processing (NLP) and machine learning algorithms, the technology analyzes spoken language and provides appropriate responses, offering an immersive experience through voice commands, audio feedback and vibration alerts.
Design/methodology/approach
These technologies have demonstrated their ability to promote self-confidence and self-reliance in visually impaired individuals during social interactions. Moreover, they promise to improve social competence and foster better relationships. In short, conversational assistive technology stands as a promising tool that empowers visually impaired individuals and elevates the quality of their social engagement.
Findings
The main benefit of assistive communication technology is that it helps visually impaired people overcome communication barriers in social contexts. This technology helps them communicate effectively with acquaintances, family, co-workers and even strangers in public places. By enabling smoother and more natural communication, it reduces feelings of isolation and increases overall quality of life.
Originality/value
Research findings include successful activity recognition, aligned with the activities on which the VGG-16 model was trained, such as hugging, shaking hands, talking, walking and waving. The originality of this study lies in its approach to addressing the challenges faced by visually impaired individuals in their social interactions through modern technology. The research adds to the body of knowledge on assistive technologies, which contribute to the empowerment and social inclusion of visually impaired individuals.
Xianwei Liu, Juan Luis Nicolau, Rob Law and Chunhong Li
Abstract
Purpose
This study aims to provide a critical reflection of the application of image recognition techniques in visual information mining in hospitality and tourism.
Design/methodology/approach
This study begins by reviewing the progress of image recognition and the advantages of convolutional neural network-based image recognition models. Next, this study explains and exemplifies the mechanisms and functions of two relevant image recognition applications: object recognition and facial recognition. This study concludes by providing theoretical and practical implications and potential directions for future research.
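The core mechanism of the CNN-based models discussed is the stacked convolution that turns pixels into feature maps. A minimal sketch of one "valid" convolution (strictly, cross-correlation, as in most deep learning libraries) with an illustrative edge-detecting kernel:

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2D cross-correlation: the building block that CNN-based image
    recognition models stack (with nonlinearities) to produce feature maps."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out

# A vertical-edge kernel responds strongly at left-to-right intensity boundaries.
edge_kernel = np.array([[1.0, 0.0, -1.0],
                        [1.0, 0.0, -1.0],
                        [1.0, 0.0, -1.0]])
```

In a trained network the kernels are learned rather than hand-designed, and hundreds of them run in parallel per layer; this is what lets such models scale visual information mining far beyond manual coding.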
Findings
This study presents different potential applications and compares image recognition with traditional manual methods; the main findings of this critical reflection revolve around the feasibility of the described techniques.
Practical implications
Knowledge on how to extract valuable visual information from large-scale user-generated photos to infer the online behavior of consumers and service providers and its influence on purchase decisions and firm performance is crucial to business practices in hospitality and tourism.
Originality/value
Visual information plays a crucial role in online travel agencies and peer-to-peer accommodation platforms, for both sellers and buyers. However, extant studies relied heavily on traditional manual identification with small samples and subjective judgment. With the development of deep learning and computer vision techniques, current studies are able to extract various types of visual information from large-scale datasets with high accuracy and efficiency. To the best of the authors' knowledge, this study is the first to offer an outlook on image recognition techniques for mining visual information in hospitality and tourism.