Search results
1 – 10 of over 33,000
Yan Kan, Hao Li, Zhengtao Chen, Changjiang Sun, Hao Wang and Joachim Seidelmann
Abstract
Purpose
This paper aims to propose a stable and precise recognition and pose estimation method to deal with the difficulties that industrial parts often present, such as incomplete point cloud data due to surface reflections, lack of color texture features and limited availability of effective three-dimensional geometric information. These challenges lead to less-than-ideal performance of existing object recognition and pose estimation methods based on two-dimensional images or three-dimensional point cloud features.
Design/methodology/approach
In this paper, an image-guided depth map completion method is proposed to improve the algorithm's adaptability to noise and incomplete point cloud scenes. Furthermore, this paper also proposes a pose estimation method based on contour feature matching.
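As a hedged illustration of the image-guided completion idea (a simplified joint-bilateral-style fill, not the paper's algorithm), each hole pixel in the depth map can be filled with an average of valid neighbouring depths weighted by colour similarity in a guide image, so that filled depths follow image edges:

```python
import numpy as np

def guided_fill(depth, guide, radius=2, sigma_c=10.0):
    """Fill zero-valued (hole) depth pixels using a grayscale guide image.

    Valid neighbours within `radius` are averaged with weights based on
    colour similarity, so filled depths respect edges in the guide.
    """
    h, w = depth.shape
    out = depth.astype(float).copy()
    for y, x in zip(*np.where(depth == 0)):
        ys = slice(max(0, y - radius), min(h, y + radius + 1))
        xs = slice(max(0, x - radius), min(w, x + radius + 1))
        d = depth[ys, xs].astype(float)
        g = guide[ys, xs].astype(float)
        valid = d > 0
        if not valid.any():
            continue  # no usable neighbours; leave the hole
        wgt = np.exp(-((g - float(guide[y, x])) ** 2) / (2 * sigma_c ** 2))
        wgt = wgt * valid
        out[y, x] = (wgt * d).sum() / wgt.sum()
    return out

# Hole at the centre of a two-region scene: the guide steers the fill
depth = np.array([[1.0, 1.0, 5.0], [1.0, 0.0, 5.0], [1.0, 1.0, 5.0]])
guide = np.array([[10, 10, 200], [10, 10, 200], [10, 10, 200]], float)
filled = guided_fill(depth, guide)
print(round(filled[1, 1], 2))  # ~1.0, following the dark region
```

The colour weighting is what makes the fill "image-guided": a plain neighbourhood average would blend the 1.0 m and 5.0 m surfaces across the depth edge.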
Findings
Through experimental testing on real-world and virtual scene datasets, it has been verified that the image-guided depth map completion method estimates depth values for hole pixels in the depth map with higher accuracy. The proposed pose estimation method was applied in pose estimation experiments on various parts: the average recognition accuracy was 88.17% in real-world scenes and reached 95% in virtual scenes.
Originality/value
The proposed recognition and pose estimation method can stably and precisely deal with the difficulties that industrial parts present and improve the algorithm's adaptability to noise and incomplete point cloud scenes.
Abstract
Purpose
This paper aims to propose a new solution for real-time 3D perception with a monocular camera. Most industrial robot solutions use active sensors to acquire 3D structure information, which limits their applications to indoor scenarios. Using only a monocular camera, some state-of-the-art methods provide up-to-scale 3D structure information, but the scale of the corresponding objects remains uncertain.
Design/methodology/approach
First, high-accuracy, scale-informed camera poses and a sparse 3D map are obtained by leveraging ORB-SLAM and a marker. Second, for each frame captured by the camera, a specially designed depth estimation pipeline computes the corresponding 3D structure, called a depth map, in real time. Finally, each depth map is integrated into a volumetric scene model. A feedback module lets users visualize the intermediate scene surface in real time.
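The marker-based scale recovery described above can be sketched as follows (the function name and marker size are illustrative assumptions, not the paper's implementation): the ratio between the marker's known physical edge length and its edge length in the up-to-scale SLAM map gives a global scale factor applied to every map point:

```python
import numpy as np

def recover_scale(map_points, marker_corners_est, marker_edge_m):
    """Rescale an up-to-scale SLAM map using a marker of known size.

    map_points:         (N, 3) map points in arbitrary SLAM units
    marker_corners_est: (4, 3) estimated marker corners, same units
    marker_edge_m:      true physical edge length of the marker, metres
    """
    # Mean estimated edge length over the marker's four sides
    est_edges = np.linalg.norm(
        marker_corners_est - np.roll(marker_corners_est, -1, axis=0), axis=1)
    scale = marker_edge_m / est_edges.mean()
    return scale, map_points * scale

# Toy example: a 0.10 m marker reconstructed with edge length 2.0 units
corners = np.array([[0, 0, 0], [2, 0, 0], [2, 2, 0], [0, 2, 0]], float)
pts = np.array([[4.0, 0.0, 0.0]])
scale, metric_pts = recover_scale(pts, corners, 0.10)
print(scale)             # 0.05
print(metric_pts[0, 0])  # 0.2 (metres)
```

In a full system the same factor would also rescale the camera translations, so the volumetric model is metrically consistent.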
Findings
The system provides robust tracking performance and compelling results. The implementation runs at nearly 25 Hz on a mainstream laptop using parallel computation.
Originality/value
A new solution for 3D perception using only a monocular camera is proposed by leveraging the ORB-SLAM system. Results are visually comparable to active-sensor systems such as ElasticFusion in small scenes. The system is both efficient and easy to implement, and the algorithms and specific configurations involved are introduced in detail.
Xiaojun Wu, Peng Li, Jinghui Zhou and Yunhui Liu
Abstract
Purpose
Scattered parts are laid randomly during the manufacturing process and are difficult to recognize and manipulate. This study aims to grasp scattered parts with a manipulator using a camera and a learning method.
Design/methodology/approach
In this paper, a cascaded convolutional neural network (CNN) method for robotic grasping based on monocular vision and a small data set of scattered parts is proposed. The method consists of three steps: object detection, monocular depth estimation and keypoint estimation. In the first stage, an improved object detection network locates the candidate parts. The second stage uses a neural network structure, with a corresponding training method, to infer depth estimates from high-resolution input images. The keypoint estimation in the third step is expressed as a cumulative multi-scale prediction from a network operating on the red-green-blue-depth (RGB-D) map produced by the object detection and depth estimation stages. Finally, a grasping strategy is studied to achieve successful and continuous grasping. In the experiments, different workpieces are used to validate the proposed method. The best grasping success rate is more than 80%.
Findings
By using the CNN-based method to extract the keypoints of the scattered parts and calculating the probability of a successful grasp, the success rate is increased.
Practical implications
This method and robotic systems can be used in picking and placing of most industrial automatic manufacturing or assembly processes.
Originality/value
Unlike standard parts, scattered parts are laid randomly and are difficult for the robot to recognize and grasp. This study uses a cascaded CNN to extract the keypoints of the scattered parts, which are also labeled with the probability of a successful grasp. Experiments demonstrate the grasping of these scattered parts.
Yuan Tian, Tao Guan and Cheng Wang
Abstract
Purpose
To make an augmented image realistic, the virtual objects should be correctly occluded by foreground objects. The purpose of this paper is to propose a new approach that resolves occlusion problems in augmented reality (AR). The main interest is that it can automatically obtain the proper spatial relationship between virtual and real objects in real time.
Design/methodology/approach
The approach is divided into two steps: off-line disparity map construction and on-line occlusion handling. In the off-line stage, the disparity map of the real scene is constructed using a global stereo matching method, and the disparities are then refined with the fast mean shift method. Since objects at different positions have different depth values, the real object that occludes the virtual object can be identified by its depth value. In the on-line stage, the contour of the identified object is tracked using a real-time object tracking method that combines feature tracking with a minimum s-t cut. The augmented image with correct occlusions is produced by redrawing all the tracked object pixels onto the augmented image.
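The depth-based occlusion test at the core of this kind of AR compositing can be sketched as a per-pixel comparison (a simplified illustration, not the authors' implementation): wherever the real scene is closer to the camera than the virtual object, the real pixel is kept, so the virtual object appears correctly occluded.

```python
import numpy as np

def composite_with_occlusion(real_rgb, real_depth, virt_rgb, virt_depth):
    """Blend a rendered virtual layer over a real frame using depth.

    Pixels where the real surface is nearer than the virtual surface
    keep the real colour; elsewhere the virtual colour is drawn.
    """
    occluded = real_depth < virt_depth  # real object is in front
    return np.where(occluded[..., None], real_rgb, virt_rgb)

# 2x2 toy frame: virtual object at depth 1.0, real wall at 0.5 on the left
real_rgb = np.zeros((2, 2, 3), np.uint8)      # black real scene
virt_rgb = np.full((2, 2, 3), 255, np.uint8)  # white virtual object
real_depth = np.array([[0.5, 2.0], [0.5, 2.0]])
virt_depth = np.full((2, 2), 1.0)
out = composite_with_occlusion(real_rgb, real_depth, virt_rgb, virt_depth)
print(out[0, 0, 0], out[0, 1, 0])  # 0 255 (left column occluded)
```

The paper's contribution is obtaining a reliable per-object depth ordering from a refined disparity map and contour tracking rather than from a dense live depth sensor; the final compositing step reduces to a comparison like this one.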
Findings
Compared with existing methods, the proposed approach automatically resolves the occlusion problem in real time. The effectiveness of the method is demonstrated with several experimental results.
Originality/value
This paper makes three contributions. First, a novel framework, different from previously proposed methods, is presented to handle the occlusion problem in AR. Its main procedure is to identify the occluding real object, track it, and redraw its pixels on the composed image; it is much easier to implement and achieves satisfactory results. Second, the disparity map is used to automatically obtain the contour of the occluding real object. To extract this contour precisely, the mean shift method refines the disparity map, and the occluding real object is then extracted automatically by comparing depth values. Third, the tracking method combining feature tracking with a minimum s-t cut ensures that the occlusion problem is handled in real time.
Abstract
Purpose
This paper aims to quickly obtain an accurate and complete dense three-dimensional map of an indoor environment at lower cost, which can be used directly in navigation.
Design/methodology/approach
This paper proposes an improved ORB-SLAM2 dense map optimization algorithm consisting of three parts: ORB feature extraction based on an improved FAST-12, feature point extraction with progressive sample consensus (PROSAC) and the dense ORB-SLAM2 mapping algorithm. The dense ORB-SLAM2 algorithm adds a loop-closure optimization thread and threads for dense point cloud map and octree map construction. A dense map is computationally expensive and occupies a large amount of memory, so the proposed method improves efficiency: voxel filtering reduces memory while preserving the density of the map, and the map is then stored in octree format to reduce memory further.
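The voxel filtering step can be sketched in a few lines (an illustrative stand-in for the voxel-grid filter typically applied to dense SLAM maps, not the paper's code): points are quantized to a voxel grid, and each occupied voxel keeps the centroid of its points, cutting memory while preserving coverage.

```python
import numpy as np

def voxel_filter(points, voxel_size):
    """Downsample an (N, 3) point cloud to one centroid per voxel."""
    # Integer voxel index of every point
    idx = np.floor(points / voxel_size).astype(np.int64)
    # Group points by voxel and average each group
    uniq, inverse = np.unique(idx, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    out = np.zeros((len(uniq), 3))
    np.add.at(out, inverse, points)           # sum points per voxel
    counts = np.bincount(inverse)
    return out / counts[:, None]              # centroid per voxel

# Four points in two 1 m voxels collapse to two centroids
pts = np.array([[0.1, 0.1, 0.1], [0.2, 0.2, 0.2],
                [1.1, 0.0, 0.0], [1.3, 0.0, 0.0]])
filtered = voxel_filter(pts, 1.0)
print(len(filtered))  # 2
```

Storing the surviving points in an octree then lets occupancy queries for navigation share structure between neighbouring voxels, which is where the further memory saving comes from.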
Findings
The improved ORB-SLAM2 algorithm is compared with the original ORB-SLAM2 algorithm, and the experimental results show that the map produced by the improved ORB-SLAM2 can be used directly in navigation, with higher accuracy, shorter tracking time and a smaller memory footprint.
Originality/value
The improved ORB-SLAM2 algorithm can obtain a dense environment map, which ensures the integrity of the data. Comparisons of FAST-12 against improved FAST-12 and of RANSAC against PROSAC show that the improved FAST-12 and PROSAC both make feature point extraction faster and more accurate. Voxel filtering keeps storage memory and computation cost low, and the octree map constructed from the dense map can be used directly in navigation.
S.M. Cotter and B.G. Batchelor
A depth map module, working with structured light, produces real‐time depth map pictures of three‐dimensional objects.
Zengrui Zheng, Kainan Su, Shifeng Lin, Zhiquan Fu and Chenguang Yang
Abstract
Purpose
Visual simultaneous localization and mapping (SLAM) has limitations such as sensitivity to lighting changes and lower measurement accuracy. The effective fusion of information from multiple modalities to address these limitations has emerged as a key research focus. This study aims to provide a comprehensive review of the development of vision-based SLAM (including visual SLAM) for navigation and pose estimation, with a specific focus on techniques for integrating multiple modalities.
Design/methodology/approach
This paper first introduces the mathematical models and framework development of visual SLAM. It then presents various methods for improving accuracy in visual SLAM by fusing different spatial and semantic features, and examines research advances in vision-based SLAM with respect to multi-sensor fusion in both loosely coupled and tightly coupled approaches. Finally, it analyzes the limitations of current vision-based SLAM and offers predictions for future advances.
Findings
The combination of vision-based SLAM and deep learning has significant potential for development. There are advantages and disadvantages to both loosely coupled and tightly coupled approaches in multi-sensor fusion, and the most suitable algorithm should be chosen based on the specific application scenario. In the future, vision-based SLAM is evolving toward better addressing challenges such as resource-limited platforms and long-term mapping.
Originality/value
This review introduces the development of vision-based SLAM and focuses on the advancements in multimodal fusion. It allows readers to quickly understand the progress and current status of research in this field.
Rahsidi Sabri Muda, Ainul Bahiah Mohd Khidzir and Mohamad Faiq Md Amin
Abstract
Dams are constructed for many purposes, such as power generation, irrigation, water supply and flood control. However, dams can also pose risks to the public, and the situation can be disastrous if a dam failure occurs. The study area, Bertam Valley, is located downstream of a hydroelectric dam, the Sultan Abu Bakar Dam in Cameron Highlands. The key objectives of the study are to determine the potential risk area downstream and to assess the flooding impact, in terms of damage to buildings and infrastructure, of a dam break event. An ArcGIS application and the output of two-dimensional flood modelling were used as an integrated approach to analyse the impact of a dam break flood by creating a flood severity grid analysis. The results show that the estimated inundated area is about 0.28 km2 and that almost 197 buildings are potentially affected. They also show that in the event of a dam break, a huge volume of impounded water would flood the downstream areas, threatening the population and environment along its path. The findings can assist local authorities and emergency responders in formulating emergency procedures to protect people during an emergency.
Quentin Kevin Gautier, Thomas G. Garrison, Ferrill Rushton, Nicholas Bouck, Eric Lo, Peter Tueller, Curt Schurgers and Ryan Kastner
Abstract
Purpose
Digital documentation techniques of tunneling excavations at archaeological sites are becoming more common. These methods, such as photogrammetry and LiDAR (Light Detection and Ranging), are able to create precise three-dimensional models of excavations to complement traditional forms of documentation with millimeter to centimeter accuracy. However, these techniques require either expensive pieces of equipment or a long processing time that can be prohibitive during short field seasons in remote areas. This article aims to determine the effectiveness of various low-cost sensors and real-time algorithms to create digital scans of archaeological excavations.
Design/methodology/approach
The authors used a class of algorithms called SLAM (Simultaneous Localization and Mapping) along with depth-sensing cameras. While these algorithms have largely improved over recent years, the accuracy of the results still depends on the scanning conditions. The authors developed a prototype of a scanning device and collected 3D data at a Maya archaeological site and refined the instrument in a system of natural caves. This article presents an analysis of the resulting 3D models to determine the effectiveness of the various sensors and algorithms employed.
Findings
While not as accurate as commercial LiDAR systems, the prototype presented, employing a time-of-flight depth sensor and using a feature-based SLAM algorithm, is a rapid and effective way to document archaeological contexts at a fraction of the cost.
Practical implications
The proposed system is easy to deploy, provides real-time results and would be particularly useful in salvage operations as well as in high-risk areas where cultural heritage is threatened.
Originality/value
This article compares many different low-cost scanning solutions for underground excavations, along with presenting a prototype that can be easily replicated for documentation purposes.
Abstract
This paper explores a different approach to evaluating the merits of specific technical components of computer-based learning (CBL) applications. A traditional double-blind experimental study was implemented in a new context. A computer-based Clinical Decision Simulator (CDS) system incorporating an intelligent agent was designed and implemented. It was compared with an otherwise identical system with no agent, and with a group of students not using CBL systems. The results suggested that, although no improvement in measurable learning outcomes could be conclusively demonstrated, there was some evidence that students using the intelligent agent system had more positive learning experiences and a deeper conceptualisation of the issues. This suggests that a comparative multi-method experimental evaluation strategy, although complex and not without its shortcomings, may provide a more comprehensive analysis of students' learning experiences and a useful picture of their perceptions of CBL tools. This novel approach may be of particular relevance where justification of a specific technological aspect of an e-learning application is required. The value of developing and using an experimental strategy to evaluate a specific technological aspect of a CBL application is discussed.