Search results

1 – 10 of 160
Article
Publication date: 9 September 2024

Weixing Wang, Yixia Chen and Mingwei Lin

Abstract

Purpose

Building on the strong feature representation ability of convolutional neural networks (CNNs), numerous object detection methods for remote sensing (RS) have been proposed. However, due to large variations in object scale and the omission of relevant relationships between objects, object detection in RS still faces great challenges. Most object detection methods fail to take the difficulty of detecting small and medium-sized objects, or global context, into account. Moreover, inference time and model compactness are also major pain points in the field of RS.

Design/methodology/approach

To alleviate the aforementioned problems, this study proposes a novel method for object detection in RS, which is called lightweight object detection with a multi-receptive field and long-range dependency in RS images (MFLD). The multi-receptive field extraction (MRFE) and long-range dependency information extraction (LDIE) modules are put forward.

Findings

To address the variability of objects in RS, MRFE effectively expands the receptive field through a combination of atrous separable convolutions with different dilation rates. To compensate for the shortcomings of CNNs in extracting global information, LDIE is designed to capture the relationships between objects. Extensive experiments on public RS image datasets demonstrate that the MFLD method surpasses state-of-the-art methods. Notably, on the NWPU VHR-10 dataset, MFLD achieves 94.6% mean average precision with a model volume of only 4.08 M.
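As an illustration of the dilation idea behind MRFE, a minimal single-channel sketch (the kernel size and dilation rates are illustrative, not the paper's configuration) shows how parallel branches with different dilation rates enlarge the receptive field without adding parameters:

```python
def effective_kernel(k, d):
    # Receptive field of a k x k kernel with dilation rate d.
    return k + (k - 1) * (d - 1)

def dilated_conv2d(x, w, d):
    # 'Valid' 2D cross-correlation with dilation d on a single channel.
    k = len(w)
    ke = effective_kernel(k, d)
    out = []
    for i in range(len(x) - ke + 1):
        row = []
        for j in range(len(x[0]) - ke + 1):
            s = 0.0
            for a in range(k):
                for b in range(k):
                    s += x[i + a * d][j + b * d] * w[a][b]
            row.append(s)
        out.append(row)
    return out

x = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
w = [[1.0] * 3 for _ in range(3)]
# Parallel branches with different dilation rates see different receptive fields.
print(effective_kernel(3, 1), effective_kernel(3, 2))  # 3 5
```

A 3 x 3 kernel with dilation 2 covers a 5 x 5 region at the same parameter cost, which is the mechanism MRFE combines across branches.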

Originality/value

This paper proposes MFLD, a lightweight object detection method with a multi-receptive field and long-range dependency for RS images.

Details

International Journal of Intelligent Computing and Cybernetics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 13 August 2024

Yan Kan, Hao Li, Zhengtao Chen, Changjiang Sun, Hao Wang and Joachim Seidelmann

Abstract

Purpose

This paper aims to propose a stable and precise recognition and pose estimation method to deal with the difficulties that industrial parts often present, such as incomplete point cloud data due to surface reflections, lack of color texture features and limited availability of effective three-dimensional geometric information. These challenges lead to less-than-ideal performance of existing object recognition and pose estimation methods based on two-dimensional images or three-dimensional point cloud features.

Design/methodology/approach

In this paper, an image-guided depth map completion method is proposed to improve the algorithm's adaptability to noise and incomplete point cloud scenes. Furthermore, this paper also proposes a pose estimation method based on contour feature matching.
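A minimal sketch of image-guided depth completion, assuming a joint-bilateral-style weighting (the function and parameters here are illustrative, not the authors' algorithm): a missing depth value is estimated from valid neighbours, weighted by intensity similarity in the guiding image.

```python
import math

def fill_hole(depth, image, y, x, radius=1, sigma=10.0):
    # Estimate a missing depth value from valid neighbours, weighting each
    # neighbour by intensity similarity in the guiding image.
    num = den = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ny, nx = y + dy, x + dx
            if 0 <= ny < len(depth) and 0 <= nx < len(depth[0]):
                d = depth[ny][nx]
                if d is not None:
                    w = math.exp(-((image[ny][nx] - image[y][x]) ** 2)
                                 / (2 * sigma ** 2))
                    num += w * d
                    den += w
    return num / den if den else None

depth = [[2.0, 2.0, 4.0],
         [2.0, None, 4.0],
         [2.0, 2.0, 4.0]]
image = [[10, 10, 90],
         [10, 10, 90],
         [10, 10, 90]]
# The bright right column is dissimilar to the hole pixel, so its depths
# barely contribute; the estimate follows the similar left region.
print(round(fill_hole(depth, image, 1, 1), 2))  # 2.0
```

The guidance image steers the estimate toward neighbours that look similar, which is why reflective-surface holes can be filled more accurately than with plain interpolation.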

Findings

Through experimental testing on real-world and virtual scene datasets, it has been verified that the image-guided depth map completion method exhibits higher accuracy in estimating depth values for hole pixels in the depth map. The proposed pose estimation method was applied to pose estimation experiments on various parts: the average recognition accuracy was 88.17% in real-world scenes and 95% in virtual scenes.

Originality/value

The proposed recognition and pose estimation method can stably and precisely deal with the difficulties that industrial parts present and improve the algorithm's adaptability to noise and incomplete point cloud scenes.

Details

Robotic Intelligence and Automation, vol. 44 no. 5
Type: Research Article
ISSN: 2754-6969

Open Access
Article
Publication date: 26 April 2024

Adela Sobotkova, Ross Deans Kristensen-McLachlan, Orla Mallon and Shawn Adrian Ross

Abstract

Purpose

This paper provides practical advice for archaeologists and heritage specialists wishing to use ML approaches to identify archaeological features in high-resolution satellite imagery (or other remotely sensed data sources). We seek to balance the disproportionately optimistic literature related to the application of ML to archaeological prospection through a discussion of limitations, challenges and other difficulties. We further seek to raise awareness among researchers of the time, effort, expertise and resources necessary to implement ML successfully, so that they can make an informed choice between ML and manual inspection approaches.

Design/methodology/approach

Automated object detection has been the holy grail of archaeological remote sensing for the last two decades. Machine learning (ML) models have proven able to detect uniform features across a consistent background, but more variegated imagery remains a challenge. We set out to detect burial mounds in satellite imagery from a diverse landscape in Central Bulgaria using a pre-trained Convolutional Neural Network (CNN) plus additional but low-touch training to improve performance. Training was accomplished using MOUND/NOT MOUND cutouts, and the model assessed arbitrary tiles of the same size from the image. Results were assessed using field data.

Findings

Validation of results against field data showed that self-reported success rates were misleadingly high, and that the model was misidentifying most features. Setting an identification threshold at 60% probability, and noting that we used an approach where the CNN assessed tiles of a fixed size, tile-based false negative rates were 95–96%, false positive rates were 87–95% of tagged tiles, while true positives were only 5–13%. Counterintuitively, the model provided with training data selected for highly visible mounds (rather than all mounds) performed worse. Development of the model, meanwhile, required approximately 135 person-hours of work.
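The reported tile-based rates follow from simple confusion-matrix arithmetic; the counts below are hypothetical, chosen only to fall inside the reported ranges:

```python
def tile_rates(tp, fp, fn):
    # False-negative rate over actual mound tiles, and the false-positive
    # share among tiles the model tagged as mounds.
    fnr = fn / (fn + tp)
    fp_share = fp / (fp + tp)
    return fnr, fp_share

# Hypothetical tile counts, illustrative only.
fnr, fp_share = tile_rates(tp=5, fp=87, fn=95)
print(round(fnr, 2), round(fp_share, 2))  # 0.95 0.95
```

With true positives this scarce, a model can still report high raw accuracy over all tiles simply because most tiles contain no mound, which is why the authors' external field validation mattered.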

Research limitations/implications

Our attempt to deploy a pre-trained CNN demonstrates the limitations of this approach when it is used to detect varied features of different sizes within a heterogeneous landscape that contains confounding natural and modern features, such as roads, forests and field boundaries. The model has detected incidental features rather than the mounds themselves, making external validation with field data an essential part of CNN workflows. Correcting the model would require refining the training data as well as adopting different approaches to model choice and execution, raising the computational requirements beyond the level of most cultural heritage practitioners.

Practical implications

Improving the pre-trained model’s performance would require considerable time and resources, on top of the time already invested. The degree of manual intervention required – particularly around the subsetting and annotation of training data – is so significant that it raises the question of whether it would be more efficient to identify all of the mounds manually, either through brute-force inspection by experts or by crowdsourcing the analysis to trained – or even untrained – volunteers. Researchers and heritage specialists seeking efficient methods for extracting features from remotely sensed data should weigh the costs and benefits of ML versus manual approaches carefully.

Social implications

Our literature review indicates that the use of artificial intelligence (AI) and ML approaches to archaeological prospection has grown exponentially in the past decade, approaching adoption levels associated with “crossing the chasm” from innovators and early adopters to the majority of researchers. The literature itself, however, is overwhelmingly positive, reflecting some combination of publication bias and a rhetoric of unconditional success. This paper presents the failure of a good-faith attempt to utilise these approaches as a counterbalance and cautionary tale to potential adopters of the technology. Early-majority adopters may find ML difficult to implement effectively in real-life scenarios.

Originality/value

Unlike many high-profile reports from well-funded projects, our paper represents a serious but modestly resourced attempt to apply an ML approach to archaeological remote sensing, using techniques like transfer learning that are promoted as solutions to time and cost problems associated with, e.g. annotating and manipulating training data. While the majority of articles uncritically promote ML, or only discuss how challenges were overcome, our paper investigates how – despite reasonable self-reported scores – the model failed to locate the target features when compared to field data. We also present time, expertise and resourcing requirements, a rarity in ML-for-archaeology publications.

Details

Journal of Documentation, vol. 80 no. 5
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 29 August 2023

Åsne Stige, Efpraxia D. Zamani, Patrick Mikalef and Yuzhen Zhu

Abstract

Purpose

The aim of this article is to map the use of AI in the user experience (UX) design process. Disrupting the UX process by introducing novel digital tools such as artificial intelligence (AI) has the potential to improve efficiency and accuracy, while creating more innovative and creative solutions. Thus, understanding how AI can be leveraged for UX has important research and practical implications.

Design/methodology/approach

This article builds on a systematic literature review approach and aims to understand how AI is used in UX design today, as well as uncover some prominent themes for future research. Through a process of selection and filtering, 46 research articles are analysed, with findings synthesized based on a user-centred design and development process.

Findings

The authors’ analysis shows how AI is leveraged at key stages of the UX design process: understanding the context of use, uncovering user requirements, aiding solution design, evaluating designs and assisting the development of solutions. The authors also highlight, through illustrative examples, the ways in which AI is changing the UX design process.

Originality/value

While there is increased interest in the use of AI in organizations, there is still limited work on how AI can be introduced into processes that depend heavily on human creativity and input. Thus, the authors show the ways in which AI can enhance such activities and assume tasks that have been typically performed by humans.

Details

Information Technology & People, vol. 37 no. 6
Type: Research Article
ISSN: 0959-3845

Article
Publication date: 2 September 2024

Li Shaochen, Zhenyu Liu, Yu Huang, Daxin Liu, Guifang Duan and Jianrong Tan

Abstract

Purpose

Assembly action recognition plays an important role in assembly process monitoring and human-robot collaborative assembly. Previous works overlook the interaction relationship between hands and operated objects and lack the modeling of subtle hand motions, which leads to a decline in accuracy for fine-grained action recognition. This paper aims to model the hand-object interactions and hand movements to realize high-accuracy assembly action recognition.

Design/methodology/approach

In this paper, a novel multi-stream hand-object interaction network (MHOINet) is proposed for assembly action recognition. To learn the hand-object interaction relationship in assembly sequence, an interaction modeling network (IMN) comprising both geometric and visual modeling is exploited in the interaction stream. The former captures the spatial location relation of hand and interacted parts/tools according to their detected bounding boxes, and the latter focuses on mining the visual context of hand and object at pixel level through a position attention model. To model the hand movements, a temporal enhancement module (TEM) with multiple convolution kernels is developed in the hand stream, which captures the temporal dependences of hand sequences in short and long ranges. Finally, assembly action prediction is accomplished by merging the outputs of different streams through a weighted score-level fusion. A robotic arm component assembly dataset is created to evaluate the effectiveness of the proposed method.
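The weighted score-level fusion that merges the stream outputs can be sketched as follows; the class scores and weights are illustrative, not the trained model's values:

```python
def fuse_scores(stream_scores, weights):
    # Weighted sum of per-class scores from the individual streams.
    n_classes = len(stream_scores[0])
    fused = [0.0] * n_classes
    for scores, w in zip(stream_scores, weights):
        for c in range(n_classes):
            fused[c] += w * scores[c]
    return fused

interaction_stream = [0.7, 0.2, 0.1]   # per-class scores (illustrative)
hand_stream = [0.5, 0.4, 0.1]
fused = fuse_scores([interaction_stream, hand_stream], weights=[0.6, 0.4])
pred = fused.index(max(fused))
print(pred)  # predicted action class 0
```

Score-level fusion keeps each stream's classifier independent and only combines their confidences, which makes per-stream weights easy to tune on a validation set.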

Findings

The method achieves recognition accuracies of 97.31% and 95.32% for coarse and fine assembly actions, respectively, outperforming comparative methods. Experiments on human-robot collaboration show that the method can be applied to industrial production.

Originality/value

The authors propose a novel framework for assembly action recognition, which simultaneously leverages the features of hands, objects and hand-object interactions. The TEM enhances the representation of hand dynamics and facilitates the recognition of assembly actions with various time spans. The IMN learns semantic information from hand-object interactions, which is significant for distinguishing fine assembly actions.

Details

Robotic Intelligence and Automation, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2754-6969

Article
Publication date: 13 September 2024

Dohyeong Kim, Jaehun Yang, Doyeop Lee, Dongmin Lee, Farzad Rahimian and Chansik Park

Abstract

Purpose

Computer vision (CV) offers a promising approach to transforming the conventional in-person inspection practices prevalent within the construction industry. However, the reliance on centralized systems in current CV-based inspections introduces a vulnerability to potential data manipulation. Unreliable inspection records make it challenging for safety managers to make timely decisions to ensure safety compliance. To address this issue, this paper proposes a blockchain (BC) and CV-based framework to enhance safety inspections at construction sites.

Design/methodology/approach

This study adopted a BC-enhanced CV approach. By leveraging CV and BC, safety conditions are automatically identified from site images and can be reliably recorded as safety inspection data through the BC network. Additionally, by using this data, smart contracts coordinate inspection tasks, assign responsibilities and verify safety performance, managing the entire safety inspection process remotely.
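The tamper-evidence property that BC brings to inspection records can be sketched with a minimal hash chain; this is a stand-in for a real BC network, and the record fields are illustrative:

```python
import hashlib
import json

def append_record(chain, record):
    # Each entry commits to the previous entry's hash, so editing an old
    # inspection record invalidates every later link (tamper evidence).
    prev = chain[-1]["hash"] if chain else "0" * 64
    payload = json.dumps({"prev": prev, "record": record}, sort_keys=True)
    digest = hashlib.sha256(payload.encode()).hexdigest()
    chain.append({"prev": prev, "record": record, "hash": digest})

def verify(chain):
    # Recompute every hash from the chained payloads.
    prev = "0" * 64
    for entry in chain:
        payload = json.dumps({"prev": prev, "record": entry["record"]},
                             sort_keys=True)
        if hashlib.sha256(payload.encode()).hexdigest() != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

chain = []
append_record(chain, {"site": "A-3", "finding": "missing guardrail"})
append_record(chain, {"site": "A-3", "finding": "resolved"})
print(verify(chain))                           # True
chain[0]["record"]["finding"] = "no issues"    # retroactive tampering
print(verify(chain))                           # False
```

A real BC network adds distributed consensus on top of this chaining, which is what prevents a single party from quietly rewriting the whole chain.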

Findings

A case study confirms the framework’s applicability and efficacy in facilitating remote and reliable safety inspections. The proposed framework is envisaged to greatly improve current safety inspection practices and, in doing so, contribute to reduced accidents and injuries in the construction industry.

Originality/value

This study provides novel and practical guidance for integrating CV and BC in construction safety inspection. It fulfills an identified need to study how to leverage CV-based inspection results for remotely managing the safety inspection process using BC. This work not only takes a significant step towards data-driven decision-making in the safety inspection process, but also paves the way for future studies aiming to develop tamper-proof data management systems for industrial inspections and audits.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 17 September 2024

Yanbiao Zou and Jianhui Yang

Abstract

Purpose

This paper aims to propose a lightweight, high-accuracy object detection model designed to enhance seam tracking quality under strong-arc and splash conditions while simultaneously reducing computational costs.

Design/methodology/approach

The lightweight model is constructed based on the Single Shot Multibox Detector (SSD). First, a neural architecture search method based on meta-learning and a genetic algorithm is introduced to optimize the pruning strategy, reducing human intervention and improving efficiency. Additionally, the Alternating Direction Method of Multipliers (ADMM) is used to perform structural pruning on SSD, effectively compressing the model with minimal loss of accuracy.
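ADMM-based structural pruning itself is more involved, but the structural idea (removing whole filters rather than individual weights) can be sketched with a simple magnitude-based ranking; this is an illustration, not the paper's ADMM formulation:

```python
def prune_channels(filters, keep_ratio):
    # Rank filters by L1 norm and keep the top fraction; dropping whole
    # filters (structural pruning) shrinks the layer's actual compute.
    norms = [(sum(abs(v) for v in f), i) for i, f in enumerate(filters)]
    norms.sort(reverse=True)
    keep = max(1, int(len(filters) * keep_ratio))
    return sorted(i for _, i in norms[:keep])

# Four hypothetical filters, flattened to weight lists.
filters = [[0.1, -0.1], [1.0, 2.0], [0.5, 0.5], [3.0, -1.0]]
print(prune_channels(filters, 0.5))  # [1, 3]
```

Unlike unstructured sparsity, the surviving filter indices define a genuinely smaller dense layer, which is what delivers inference-time savings on ordinary hardware.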

Findings

Compared to state-of-the-art models, this method better balances feature extraction accuracy and inference speed. Furthermore, seam tracking experiments on a welding robot experimental platform demonstrate that the proposed method exhibits excellent accuracy and robustness in practical applications.

Originality/value

This paper presents an innovative approach that combines ADMM structural pruning and meta-learning-based neural architecture search to significantly enhance the efficiency and performance of the SSD network. This method reduces computational cost while ensuring high detection accuracy, providing a reliable solution for welding robot laser vision systems in practical applications.

Details

Industrial Robot: the international journal of robotics research and application, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0143-991X

Article
Publication date: 12 July 2024

Peng Guo, Weiyong Si and Chenguang Yang

Abstract

Purpose

The purpose of this paper is to enhance the performance of robots in peg-in-hole assembly tasks, enabling them to swiftly and robustly accomplish the task. It also focuses on the robot’s ability to generalize across assemblies with different hole sizes.

Design/methodology/approach

Human behavior in peg-in-hole assembly serves as inspiration: a person first visually locates the hole and then continuously adjusts the peg pose based on force/torque feedback during insertion. This paper proposes a novel framework that integrates visual servoing with adjustment based on force/torque feedback. The authors use a deep neural network (DNN) and image processing techniques to determine the pose of the hole; an incremental learning approach based on a broad learning system (BLS) then simulates human learning ability, continuously reducing the number of adjustments required during insertion.
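The force/torque-driven adjustment stage can be caricatured as a proportional correction loop in simulation; the gain, tolerance and initial offset below are invented for illustration, and a real system would read an F/T sensor instead:

```python
def adjust_insertion(offset, ft_gain=0.5, tol=0.05, max_steps=20):
    # Each iteration mimics one force/torque-guided correction: the peg is
    # nudged against the sensed lateral offset until it fits the tolerance.
    steps = 0
    while abs(offset) > tol and steps < max_steps:
        offset -= ft_gain * offset  # proportional correction from "F/T" reading
        steps += 1
    return steps, offset

steps, residual = adjust_insertion(offset=0.4)
print(steps)  # 3 corrections before the peg seats
```

In this toy model, a better initial pose estimate (a smaller starting offset) or a learned gain directly cuts the number of corrections, which is the role the BLS plays in the authors' framework.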

Findings

The authors conducted experiments on visual servoing, force/torque-based adjustment and the proposed framework. Visual servoing inferred the pixel position and orientation of the target hole in only about 0.12 s, and the robot achieved peg insertion with one to three adjustments based on force/torque feedback. The success rate for peg-in-hole assembly using the proposed framework was 100%. These results prove the effectiveness of the proposed framework.

Originality/value

This paper proposes a framework for peg-in-hole assembly that combines visual servoing with adjustment based on force/torque feedback. The assembly tasks are accomplished using a DNN, image processing and a BLS. To the best of the authors’ knowledge, no similar methods exist in prior work.

Details

Robotic Intelligence and Automation, vol. 44 no. 5
Type: Research Article
ISSN: 2754-6969

Article
Publication date: 19 July 2023

Ruochen Zeng, Jonathan J.S. Shi, Chao Wang and Tao Lu

Abstract

Purpose

As laser scanning technology becomes readily available and affordable, there is an increasing demand for using point cloud data collected from a laser scanner to create as-built building information modeling (BIM) models for quality assessment, schedule control and energy performance within construction projects. To enhance as-built modeling efficiency, this study explores an integrated system, called Auto-Scan-To-BIM (ASTB), which aims to automatically generate a complete Industry Foundation Classes (IFC) model consisting of the 3D building elements of a given building from its point cloud, without requiring additional modeling tools.

Design/methodology/approach

ASTB has been developed with three function modules. Taking the scanned point data as input, Module 1 is built on the basis of the widely used region segmentation methodology and expanded with enhanced plane boundary line detection methods and corner recalibration algorithms. Then, Module 2 is developed with a domain knowledge-based heuristic method to analyze the features of the recognized planes, to associate them with corresponding building elements and to create BIM models. Based on the spatial relationships between these building elements, Module 3 generates a complete IFC model for the entire project compatible with any BIM software.
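The element-association step of Module 2 might look roughly like the following sketch, where the thresholds, elevations and element names are hypothetical, not the paper's domain rules:

```python
def classify_plane(normal, height, floor_z=0.0, tol=0.2):
    # Hypothetical domain heuristic: a roughly horizontal plane near the
    # known floor elevation is a floor, other horizontal planes are
    # ceilings, and roughly vertical planes are walls.
    nx, ny, nz = normal
    if abs(nz) > 0.9:  # near-horizontal plane
        return "floor" if abs(height - floor_z) < tol else "ceiling"
    return "wall"

print(classify_plane((0, 0, 1), 0.0))   # floor
print(classify_plane((1, 0, 0), 1.3))   # wall
```

A production system would additionally use plane extent, adjacency and openings to separate windows and doors from their host walls before emitting IFC entities.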

Findings

A case study validated ASTB in an application involving five common types of building elements (walls, floors, ceilings, windows and doors).

Originality/value

First, an integrated system, ASTB, is developed to generate a BIM model from scanned point cloud data without using additional modeling tools. Second, an enhanced plane boundary line detection method and a corner recalibration algorithm are developed in ASTB, achieving high accuracy in obtaining the true surface planes. Finally, the research contributes a module that automatically converts the identified building elements into IFC format based on the geometry and spatial relationships of each plane.

Details

Engineering, Construction and Architectural Management, vol. 31 no. 9
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 10 September 2024

Dan Feng, Zhenyu Yin, Xiaohui Wang, Feiqing Zhang and Zisong Wang

Abstract

Purpose

Traditional visual simultaneous localization and mapping (SLAM) systems are primarily based on the assumption that the environment is static, which makes them struggle with the interference caused by dynamic objects in complex industrial production environments. This paper aims to improve the stability of visual SLAM in complex dynamic environments through semantic segmentation and its optimization.

Design/methodology/approach

This paper proposes a real-time visual SLAM system for complex dynamic environments based on YOLOv5s semantic segmentation, named YLS-SLAM. The system combines semantic segmentation results and the boundary semantic enhancement algorithm. By recognizing and completing the semantic masks of dynamic objects from coarse to fine, it effectively eliminates the interference of dynamic feature points on the pose estimation and enhances the retention and extraction of prominent features in the background, thereby achieving stable operation of the system in complex dynamic environments.
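Removing dynamic feature points with a semantic mask reduces, in essence, to filtering keypoints against a per-pixel mask, as in this minimal sketch (the coordinates and mask are illustrative):

```python
def filter_keypoints(keypoints, dynamic_mask):
    # Keep only feature points that fall outside the dynamic-object mask,
    # so pose estimation sees static background features only.
    return [(x, y) for (x, y) in keypoints if not dynamic_mask[y][x]]

dynamic_mask = [[0, 0, 1],   # 1 marks pixels covered by a moving object
                [0, 0, 1],
                [0, 0, 0]]
kps = [(0, 0), (2, 0), (2, 1), (1, 2)]
print(filter_keypoints(kps, dynamic_mask))  # [(0, 0), (1, 2)]
```

The boundary enhancement the authors describe matters precisely here: a mask that stops short of an object's edge lets dynamic edge features leak through this filter and corrupt the pose estimate.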

Findings

Experiments on the Technische Universität München and Bonn datasets show that, under monocular and RGB-Depth modes, the localization accuracy of YLS-SLAM is significantly better than that of existing advanced dynamic SLAM methods, effectively improving the robustness of visual SLAM. The authors also conducted tests with a monocular camera in a real industrial production environment, successfully validating the system's effectiveness and application potential in complex dynamic environments.

Originality/value

This paper combines semantic segmentation with a boundary semantic enhancement algorithm to achieve precise removal of dynamic objects and their edges while ensuring the system's real-time performance, offering significant application value.

Details

Industrial Robot: the international journal of robotics research and application, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0143-991X
