A cascaded CNN-based method for monocular vision robotic grasping

Xiaojun Wu (School of Mechanical Engineering and Automation, Harbin Institute of Technology Shenzhen, Shenzhen, China)
Peng Li (School of Mechanical Engineering and Automation, Harbin Institute of Technology Shenzhen, Shenzhen, China)
Jinghui Zhou (School of Mechanical Engineering and Automation, Harbin Institute of Technology Shenzhen, Shenzhen, China)
Yunhui Liu (Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Hong Kong, China)

Industrial Robot

ISSN: 0143-991X

Article publication date: 15 February 2022

Issue publication date: 1 June 2022

Abstract

Purpose

Scattered parts are laid out randomly during the manufacturing process and are difficult to recognize and manipulate. This study aims to grasp such scattered parts using a manipulator equipped with a camera and a learning-based method.

Design/methodology/approach

This paper proposes a cascaded convolutional neural network (CNN) method for robotic grasping of scattered parts, based on monocular vision and a small data set. The method has three stages: object detection, monocular depth estimation and keypoint estimation. In the first stage, an improved object detection network locates the candidate parts. In the second stage, a neural network structure and a corresponding training method are used to learn from and run inference on high-resolution input images to obtain a depth estimate. In the third stage, keypoint estimation is expressed as a cumulative form of multi-scale predictions from a network that takes a red-green-blue-depth (RGBD) map built from the object detection and depth estimation outputs. Finally, a grasping strategy is studied to achieve successful and continuous grasping. In the experiments, different workpieces are used to validate the proposed method; the best grasping success rate exceeds 80%.
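The data flow between the three stages can be illustrated with a minimal sketch. It is not the authors' implementation: the three networks are stand-in callables, and every name here (`Keypoint`, `crop_rgbd`, `select_grasp`, the box format) is hypothetical. It also assumes that depth is estimated once for the whole image and that each keypoint carries a grasp score, which the abstract does not specify in detail.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical data types, chosen for illustration only.
Pixel = Tuple[float, float, float]
RGBImage = List[List[Pixel]]        # H x W grid of (r, g, b)
DepthMap = List[List[float]]        # H x W grid of depth values
RGBDImage = List[List[Tuple[float, float, float, float]]]
Box = Tuple[int, int, int, int]     # (x0, y0, x1, y1), x1 and y1 exclusive


@dataclass
class Keypoint:
    x: int        # column
    y: int        # row
    score: float  # estimated likelihood of a successful grasp


def crop_rgbd(rgb: RGBImage, depth: DepthMap, box: Box) -> RGBDImage:
    """Stack colour and depth into 4-channel pixels inside one detection box."""
    x0, y0, x1, y1 = box
    return [
        [(*rgb[y][x], depth[y][x]) for x in range(x0, x1)]
        for y in range(y0, y1)
    ]


def select_grasp(
    rgb: RGBImage,
    detect: Callable[[RGBImage], List[Box]],
    estimate_depth: Callable[[RGBImage], DepthMap],
    estimate_keypoints: Callable[[RGBDImage], List[Keypoint]],
) -> Optional[Keypoint]:
    """Run the cascade and return the best-scoring keypoint, or None."""
    boxes = detect(rgb)            # stage 1: candidate parts
    depth = estimate_depth(rgb)    # stage 2: monocular depth (assumed full-image)
    best: Optional[Keypoint] = None
    for box in boxes:
        patch = crop_rgbd(rgb, depth, box)
        for kp in estimate_keypoints(patch):   # stage 3: keypoints per part
            # Keypoints are assumed to be in patch coordinates; shift them
            # back into full-image coordinates before comparing scores.
            full = Keypoint(kp.x + box[0], kp.y + box[1], kp.score)
            if best is None or full.score > best.score:
                best = full
    return best
```

Passing the stages in as callables keeps the cascade structure visible without committing to any particular detector, depth network or keypoint network.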

Findings

Using the CNN-based method to extract the keypoints of the scattered parts and to calculate the likelihood of a successful grasp increases the grasping success rate.

Practical implications

This method and the robotic system can be used for pick-and-place tasks in most automated industrial manufacturing or assembly processes.

Originality/value

Unlike standard parts, scattered parts are laid out randomly and are difficult for a robot to recognize and grasp. This study uses a cascaded CNN to extract the keypoints of the scattered parts, each of which is also labeled with its likelihood of a successful grasp. Experiments are conducted to demonstrate the grasping of these scattered parts.

Acknowledgements

This paper forms part of a special section “Dexterous Manipulation”, guest edited by Bin Fang, Qiang Li, Fei Chen and Weiwei Wan.

This work was supported in part by the National Natural Science Foundation of China (52175459), in part by the Basic Research Key Project of Shenzhen Science and Technology Plan (JCYJ20180507183456108, JCYJ20200109113416531), and by the Key Technical Projects Shenzhen Science and Technology Plan (JSGG20200701095007013).

Citation

Wu, X., Li, P., Zhou, J. and Liu, Y. (2022), "A cascaded CNN-based method for monocular vision robotic grasping", Industrial Robot, Vol. 49 No. 4, pp. 645-657. https://doi.org/10.1108/IR-10-2021-0236

Publisher

Emerald Publishing Limited

Copyright © 2022, Emerald Publishing Limited
