Search results

1 – 10 of over 9000
Article
Publication date: 22 November 2023

Weiwen Mu, Wenbai Chen, Huaidong Zhou, Naijun Liu, Haobin Shi and Jingchen Li


Abstract

Purpose

This paper aims to solve the problem of the low assembly success rate of 3C assembly lines designed with classical control algorithms, which suffer from inevitable random disturbances and other factors. By incorporating intelligent algorithms into the assembly line, the assembly process can be extended to uncertain assembly scenarios.

Design/methodology/approach

This work proposes a reinforcement learning framework based on digital twins. First, the authors used Unity3D to build a simulation environment that matches the real scene and achieved data synchronization between the real and simulation environments through the Robot Operating System (ROS). Then, the authors trained the reinforcement learning model in the simulation environment. Finally, by creating a digital twin environment, the authors transferred the skills learned in simulation to the real environment and achieved stable algorithm deployment in real-world scenarios.
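As a rough sketch of this train-in-the-twin, deploy-on-hardware workflow (not the authors' code: the environment, policy and reward below are hypothetical placeholders, and the Unity3D/ROS synchronization layer is abstracted away):

```python
import numpy as np

class SimTwinEnv:
    """Stand-in for the Unity3D digital-twin environment (gym-style API)."""
    def reset(self):
        self.t = 0
        return np.zeros(4)
    def step(self, action):
        self.t += 1
        obs = np.random.randn(4)           # placeholder assembly dynamics
        reward = -float(np.abs(action))    # placeholder assembly-error penalty
        return obs, reward, self.t >= 50, {}

class LinearPolicy:
    """Minimal learnable policy; the paper's image-encoder network is omitted."""
    def __init__(self, dim=4, lr=1e-3):
        self.w, self.lr = np.zeros(dim), lr
    def act(self, obs):
        return float(self.w @ obs) + 0.1 * np.random.randn()  # exploration noise
    def update(self, obs, action, reward):
        self.w += self.lr * reward * action * obs             # crude policy nudge

env, policy = SimTwinEnv(), LinearPolicy()
for _ in range(200):                       # train entirely in the twin first
    obs, done = env.reset(), False
    while not done:
        a = policy.act(obs)
        obs, r, done, _ = env.step(a)
        policy.update(obs, a, r)
# Deployment would then stream policy.act(state) to the physical robot over
# ROS, with the twin kept in sync so the sim-trained policy transfers directly.
```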

Findings

In this work, the authors completed the transfer of skill-learning algorithms from virtual to real environments by establishing a digital twin environment. On the one hand, the experiments demonstrate the advancement of the algorithm and the feasibility of applying digital twins to reinforcement learning transfer. On the other hand, the experimental results provide a reference for applying digital twins in 3C assembly scenarios.

Originality/value

In this work, the authors designed a new encoder structure in the simulation environment to encode image information, which improved the model's perception of the environment. At the same time, the authors combined a fixed strategy with a reinforcement learning strategy to learn skills, which improved the convergence rate and stability of skill learning. Finally, the authors transferred the learned skills to the physical platform through digital twin technology and realized safe operation of the flexible printed circuit assembly task.

Details

Industrial Robot: the international journal of robotics research and application, vol. 51 no. 1
Type: Research Article
ISSN: 0143-991X


Article
Publication date: 1 May 2020

Qihang Wu, Daifeng Li, Lu Huang and Biyun Ye


Abstract

Purpose

Entity relation extraction is an important research direction for obtaining structured information. However, most current methods determine the relations between entities in a given sentence in a stepwise manner and seldom consider entities and relations within a unified framework. Joint learning is an optimal solution that combines relations and entities. This paper aims to optimize a hierarchical reinforcement learning framework and provide an efficient model for entity relation extraction.

Design/methodology/approach

This paper builds on the hierarchical reinforcement learning framework of joint learning and combines the model with BERT, a state-of-the-art language representation model, to optimize the word embedding and encoding process. In addition, this paper adjusts some punctuation marks to make the data set more standardized and introduces positional information to improve the performance of the model.
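As a hedged illustration of the encoding step described above (BERT embeddings augmented with positional features), the following sketch uses the Hugging Face transformers library; the model checkpoint and the 32-dimensional position embedding are illustrative assumptions, not the authors' exact configuration:

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
bert = BertModel.from_pretrained("bert-base-cased")

sentence = "Steve Jobs founded Apple in Cupertino."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    hidden = bert(**inputs).last_hidden_state        # (1, seq_len, 768)

# Append a learnable positional feature per token, one reading of the paper's
# "positional information"; the 32-dim size is an illustrative choice.
seq_len = hidden.size(1)
pos_embed = torch.nn.Embedding(512, 32)
positions = torch.arange(seq_len).unsqueeze(0)
encoded = torch.cat([hidden, pos_embed(positions)], dim=-1)  # (1, seq_len, 800)
# `encoded` would feed the hierarchical RL agent that jointly tags entities
# and selects relations.
```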

Findings

Experiments show that the model proposed in this paper outperforms the baseline model by 13%, achieving an F1 score of 0.742 on the NYT10 data set. This model can effectively extract entities and relations from large-scale unstructured text and can be applied to the fields of multi-domain information retrieval, intelligent understanding and intelligent interaction.

Originality/value

The research provides an efficient solution for researchers in different domains to make use of artificial intelligence (AI) technologies to process their unstructured text more accurately.

Details

Information Discovery and Delivery, vol. 48 no. 3
Type: Research Article
ISSN: 2398-6247


Article
Publication date: 8 February 2021

Jiajun Xu, Linsen Xu, Gaoxin Cheng, Jia Shi, Jinfu Liu, Xingcan Liang and Shengyao Fan


Abstract

Purpose

This paper aims to propose a bilateral robotic system for lower extremity hemiparesis rehabilitation. Hemiplegic patients can complete rehabilitation exercises voluntarily with the assistance of the robot. Reinforcement learning is included in the robot control system, efficiently enhancing the muscle activation of the impaired limbs (ILs) while ensuring the patients' safety.

Design/methodology/approach

A bilateral leader–follower robotic system is constructed for lower extremity hemiparesis rehabilitation, where the leader robot interacts with the healthy limb (HL) and the follower robot is worn on the IL. The therapeutic training is transferred from the HL to the IL with the assistance of the robot, and the IL follows the motion trajectory prescribed by the HL, which is known as mirror therapy. Model reference adaptive impedance control is used for the leader robot, and a reinforcement learning controller is designed for the follower robot. The reinforcement learning aims to increase the muscle activation of the IL while ensuring that its motion can be mastered by the HL for safety. An asynchronous algorithm is designed by improving experience replay to run in parallel on multiple robotic platforms and reduce learning time.
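A minimal sketch of the leader-side control idea described above: an impedance law that keeps the healthy-limb robot compliant while the impaired limb mirrors its trajectory. The gains and the one-tick loop are illustrative assumptions; the paper's model reference adaptive scheme and the IL-side RL controller are not reproduced here:

```python
# Illustrative leader-side impedance law for a bilateral mirror-therapy loop.
K, B = 80.0, 12.0          # hypothetical stiffness / damping gains

def leader_impedance_torque(x_ref, x, dx_ref, dx):
    """Impedance law: drive the healthy-limb robot toward the reference
    trajectory while staying compliant to the patient's own effort."""
    return K * (x_ref - x) + B * (dx_ref - dx)

def follower_target(x_leader):
    """Mirror therapy: the impaired limb tracks the healthy limb's trajectory
    (identity here; sign flips depend on the mirrored joint)."""
    return x_leader

# One control tick of the bilateral loop
x_leader, dx_leader = 0.30, 0.05       # measured HL joint position / velocity
x_ref, dx_ref = 0.35, 0.04             # reference-model trajectory
tau = leader_impedance_torque(x_ref, x_leader, dx_ref, dx_leader)
x_il_cmd = follower_target(x_leader)   # setpoint passed to the IL's RL controller
print(f"leader torque {tau:.2f} N·m -> follower setpoint {x_il_cmd:.2f} rad")
```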

Findings

Clinical tests show that lower extremity hemiplegic patients can rehabilitate efficiently using the robotic system. Moreover, the proposed scheme outperforms other state-of-the-art methods in tracking performance, muscle activation, learning efficiency and rehabilitation efficacy.

Originality/value

Using the proposed robotic system, lower extremity hemiplegic patients with different movement abilities can obtain better rehabilitation efficacy.

Details

Industrial Robot: the international journal of robotics research and application, vol. 48 no. 3
Type: Research Article
ISSN: 0143-991X


Article
Publication date: 12 April 2024

Youwei Li and Jian Qu


Abstract

Purpose

The purpose of this research is to achieve multi-task autonomous driving by adjusting the network architecture of the model. Meanwhile, after achieving multi-task autonomous driving, the authors found that the trained neural network model performs poorly in untrained scenarios. Therefore, the authors proposed to improve the transfer efficiency of the model for new scenarios through transfer learning.

Design/methodology/approach

First, the authors achieved multi-task autonomous driving by training a model combining a convolutional neural network with differently structured long short-term memory (LSTM) layers. Second, the authors achieved fast transfer of neural network models to new scenarios through cross-model transfer learning. Finally, the authors combined data collection and data labeling to improve the efficiency of deep learning. Furthermore, the authors verified that the model has good robustness through light and shadow tests.
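A hedged sketch of a CNN-plus-bidirectional-LSTM multi-task network in the spirit of the model the findings below describe; the layer sizes and the three task heads (steering, speed, sign class) are illustrative assumptions rather than the authors' architecture:

```python
import torch
import torch.nn as nn

class CnnBiLstmDriver(nn.Module):
    def __init__(self, n_signs=2):
        super().__init__()
        self.cnn = nn.Sequential(                   # per-frame feature extractor
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten())  # -> 32*4*4 = 512 features
        self.lstm = nn.LSTM(512, 128, batch_first=True, bidirectional=True)
        self.steer = nn.Linear(256, 1)              # road tracking
        self.speed = nn.Linear(256, 1)              # accelerate / decelerate
        self.sign = nn.Linear(256, n_signs)         # left / right sign recognition

    def forward(self, frames):                      # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        out, _ = self.lstm(feats)
        last = out[:, -1]                           # summary of the clip
        return self.steer(last), self.speed(last), self.sign(last)

model = CnnBiLstmDriver()
clip = torch.randn(2, 8, 3, 64, 64)                # 2 clips of 8 camera frames
steer, speed, sign_logits = model(clip)
```

One plausible reading of the cross-model transfer step is to load such a trained CNN backbone into a new scenario's model before fine-tuning.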

Findings

This research achieved road tracking, real-time acceleration–deceleration, obstacle avoidance and left/right sign recognition. The model proposed by the authors (UniBiCLSTM) outperforms existing models tested with model cars in terms of autonomous driving performance. Furthermore, the CMTL-UniBiCL-RL model trained by the authors through cross-model transfer learning improves the efficiency of model adaptation to new scenarios. Meanwhile, this research proposed an automatic data annotation method, which can save a quarter of the time required for deep learning.

Originality/value

This research provides novel solutions for achieving multi-task autonomous driving and for transferring neural network models to new scenarios. The experiment was conducted with a single camera, an embedded chip and a scale model car, which is expected to simplify the hardware for autonomous driving.

Details

Data Technologies and Applications, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9288


Article
Publication date: 16 October 2018

Ke Xu, Fengge Wu and Junsuo Zhao


Abstract

Purpose

Deep reinforcement learning has been developing rapidly and has shown its power to solve difficult problems such as robotics and the game of Go. Meanwhile, satellite attitude control systems still use classical control techniques such as proportional-integral-derivative and sliding mode control as their major solutions, facing problems with adaptability and automation.

Design/methodology/approach

In this paper, an approach based on deep reinforcement learning is proposed to increase the adaptability and autonomy of satellite control systems. It is a model-based algorithm that can find solutions in fewer learning episodes than model-free algorithms.

Findings

Simulation experiments show that where classical control fails, this approach can find a solution and reach the target within hundreds of rounds of exploration and learning.

Originality/value

This approach is a non-gradient method that uses heuristic search to optimize the policy and avoid local optima. Compared with classical control techniques, this approach does not need prior knowledge of the satellite or its orbit, can adapt to different kinds of situations by learning from data and can adapt to different kinds of satellites and different tasks through transfer learning.
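As a toy illustration of gradient-free policy search of this kind (not the authors' algorithm): the single-axis attitude model, the gain parameterization and the greedy hill-climbing rule below are all illustrative stand-ins:

```python
import numpy as np

def rollout(gains, steps=200, dt=0.1):
    """Toy single-axis attitude dynamics: torque = -kp*angle - kd*rate."""
    kp, kd = gains
    theta, omega, cost = 0.5, 0.0, 0.0         # initial attitude error (rad)
    for _ in range(steps):
        torque = -kp * theta - kd * omega
        omega += torque * dt                   # unit inertia
        theta += omega * dt
        cost += theta**2 * dt                  # accumulated pointing error
    return cost

rng = np.random.default_rng(0)
best = np.array([0.1, 0.1]); best_cost = rollout(best)
for _ in range(300):                           # heuristic neighborhood search
    cand = best + rng.normal(scale=0.2, size=2)
    c = rollout(cand)
    if c < best_cost:                          # greedy accept; random restarts
        best, best_cost = cand, c              # would further guard local optima
print(best, best_cost)
```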

Details

Industrial Robot: the international journal of robotics research and application, vol. 46 no. 3
Type: Research Article
ISSN: 0143-991X


Article
Publication date: 23 August 2019

Minghui Zhao, Xian Guo, Xuebo Zhang, Yongchun Fang and Yongsheng Ou


Abstract

Purpose

This paper aims to automatically plan assembly sequences for complex products and to improve assembly efficiency.

Design/methodology/approach

An assembly sequence planning system for workpieces (ASPW) based on deep reinforcement learning (DRL) is proposed in this paper. However, applying DRL to this problem poses enormous challenges due to sparse rewards and the lack of a training environment. To overcome these challenges, a novel ASPW-DQN algorithm is proposed and a training platform is built.
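A minimal sketch of the two ingredients named for ASPW-DQN, curriculum learning and parameter transfer, assuming a DQN-style Q-network; the network sizes, the assembly-state encoding and the single Bellman update are illustrative, not the authors' implementation:

```python
import torch
import torch.nn as nn

def make_qnet(n_states=32, n_actions=10):
    return nn.Sequential(nn.Linear(n_states, 64), nn.ReLU(),
                         nn.Linear(64, n_actions))  # Q(s, a) per assembly action

easy_q = make_qnet()          # trained first on a simpler sub-assembly task ...
hard_q = make_qnet()
hard_q.load_state_dict(easy_q.state_dict())   # ... then transferred (curriculum)

# One Bellman update on the harder task
opt = torch.optim.Adam(hard_q.parameters(), lr=1e-3)
s, s_next = torch.randn(1, 32), torch.randn(1, 32)   # placeholder state encoding
a, r, gamma = 3, -1.0, 0.99                          # sparse-reward step
target = r + gamma * hard_q(s_next).max().detach()
loss = (hard_q(s)[0, a] - target) ** 2
opt.zero_grad(); loss.backward(); opt.step()
```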

Findings

The system achieves good decision-making results and yields a generalized model suitable for other assembly problems. The experiments conducted in Gazebo show good results and the great potential of this approach.

Originality/value

The proposed ASPW-DQN combines curriculum learning and parameter transfer, which avoids the explosive growth of assembly relations and improves system efficiency. It is combined with the realistic physics simulation engine Gazebo to provide the required training environment. Additionally, owing to the deep neural networks, the results can be easily applied to other similar tasks.

Details

Assembly Automation, vol. 40 no. 1
Type: Research Article
ISSN: 0144-5154


Article
Publication date: 16 January 2024

Ji Fang, Vincent C.S. Lee and Haiyan Wang


Abstract

Purpose

This paper explores an optimal service resource management strategy, a continuous challenge for health information services seeking to enhance service performance, optimise service resource utilisation and deliver interactive health information services.

Design/methodology/approach

An adaptive optimal service resource management strategy was developed considering a value co-creation model in health information services, with a focus on collaboration and interaction with users. The deep reinforcement learning algorithm was embedded in the Internet of Things (IoT)-based health information service system (I-HISS) to allocate service resources by controlling service provision and service adaptation based on user engagement behaviour. Simulation experiments were conducted to evaluate the significance of the proposed algorithm under different user reactions to the health information service.
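One plausible reading of this control loop, sketched as tabular Q-learning over engagement states (not the paper's algorithm: the states, actions, rewards and simulated user below are invented placeholders):

```python
import numpy as np

n_engagement_levels, n_actions = 3, 2   # {low, mid, high} x {basic, rich service}
Q = np.zeros((n_engagement_levels, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1
rng = np.random.default_rng(1)

def simulate_user(state, action):
    """Placeholder user reaction: richer service helps engaged users more,
    but consumes more service resources (the negative cost term)."""
    reward = state * action * 1.0 - 0.3 * action
    next_state = min(2, max(0, state + (1 if reward > 0 else -1)))
    return reward, next_state

state = 1
for _ in range(5000):
    action = rng.integers(n_actions) if rng.random() < eps else int(Q[state].argmax())
    reward, nxt = simulate_user(state, action)
    Q[state, action] += alpha * (reward + gamma * Q[nxt].max() - Q[state, action])
    state = nxt
print(Q)   # learned service-provision policy per engagement level
```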

Findings

The results indicate that the proposed service resource management strategy, considering user co-creation in the service delivery process, improved both the service provider's business revenue and users' individual benefits.

Practical implications

The findings may facilitate the design and implementation of health information services that can achieve a high user service experience with low service operation costs.

Originality/value

This study is amongst the first to propose a service resource management model in I-HISS, considering the value co-creation of the user in the service-dominant logic. The novel artificial intelligence algorithm is developed using the deep reinforcement learning method to learn the adaptive service resource management strategy. The results emphasise user engagement in the health information service process.

Details

Industrial Management & Data Systems, vol. 124 no. 3
Type: Research Article
ISSN: 0263-5577


Article
Publication date: 13 March 2024

Rong Jiang, Bin He, Zhipeng Wang, Xu Cheng, Hongrui Sang and Yanmin Zhou


Abstract

Purpose

Compared with traditional methods relying on manual teaching or system modeling, data-driven learning methods, such as deep reinforcement learning and imitation learning, show more promising potential to cope with the challenges brought by increasingly complex tasks and environments, and they have become a hot research topic in the field of robot skill learning. However, the contradiction between the difficulty of collecting robot–environment interaction data and the low data efficiency causes all these methods to face a serious data dilemma, which has become one of the key issues restricting their development. Therefore, this paper aims to comprehensively sort out and analyze the causes of and solutions to the data dilemma in robot skill learning.

Design/methodology/approach

First, this review analyzes the causes of the data dilemma based on the classification and comparison of data-driven methods for robot skill learning. Then, the existing methods used to solve the data dilemma are introduced in detail. Finally, this review discusses the remaining open challenges and promising research topics for solving the data dilemma in the future.

Findings

This review shows that simulation–reality combination, state representation learning and knowledge sharing are crucial for overcoming the data dilemma of robot skill learning.

Originality/value

To the best of the authors’ knowledge, there are no surveys that systematically and comprehensively sort out and analyze the data dilemma in robot skill learning in the existing literature. It is hoped that this review can be helpful to better address the data dilemma in robot skill learning in the future.

Details

Robotic Intelligence and Automation, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2754-6969


Article
Publication date: 8 September 2022

Amir Hosein Keyhanipour and Farhad Oroumchian


Abstract

Purpose

User feedback inferred from the user's search-time behavior could improve learning to rank (L2R) algorithms. Click models (CMs) present probabilistic frameworks for describing and predicting the user's clicks during search sessions. Most of these CMs are based on common assumptions such as Attractiveness, Examination and User Satisfaction. CMs usually consider Attractiveness and Examination as pre- and post-estimators of the actual relevance. They also assume that User Satisfaction is a function of the actual relevance. This paper extends the authors' previous work by building a reinforcement learning (RL) model to predict the relevance. The Attractiveness, Examination and User Satisfaction are estimated using a limited number of features of the benchmark data sets, and they are then incorporated into the construction of an RL agent. The proposed RL model learns to predict the relevance label of documents with respect to a given query more effectively than the baseline RL models for those data sets.

Design/methodology/approach

In this paper, User Satisfaction is used as an indication of the relevance level of a query to a document. User Satisfaction itself is estimated through Attractiveness and Examination, and in turn, Attractiveness and Examination are calculated by the random forest algorithm. In this process, only a small subset of top information retrieval (IR) features are used, selected based on their mean average precision and normalized discounted cumulative gain values. Based on the authors' observations, the multiplication of the Attractiveness and Examination values of a given query–document pair closely approximates the User Satisfaction and hence the relevance level. In addition, an RL model is designed in such a way that the current state of the RL agent is determined by discretization of the estimated Attractiveness and Examination values. In this way, each query–document pair is mapped into a specific state based on its Attractiveness and Examination values. Then, based on the reward function, the RL agent tries to choose an action (relevance label) that maximizes the received reward in its current state. Using temporal difference (TD) learning algorithms, such as Q-learning and SARSA, the learning agent gradually learns to identify an appropriate relevance label in each state. The reward used by the RL agent depends on the difference between the User Satisfaction and the selected action: the smaller the difference, the higher the reward.
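A minimal sketch of this formulation as described: states are discretized (Attractiveness, Examination) bins, actions are relevance labels, and the reward grows as the chosen label approaches User Satisfaction ≈ Attractiveness × Examination. The bin counts, label range and toy data are illustrative, and the one-step update below is a contextual-bandit simplification of the paper's Q-learning/SARSA setting:

```python
import numpy as np

BINS, LABELS = 5, 3                       # 5x5 A/E grid, relevance labels {0, 1, 2}
Q = np.zeros((BINS, BINS, LABELS))
alpha, eps = 0.1, 0.1
rng = np.random.default_rng(42)

def to_state(attr, exam):
    """Discretize the estimated Attractiveness / Examination into a grid cell."""
    return min(int(attr * BINS), BINS - 1), min(int(exam * BINS), BINS - 1)

for _ in range(20000):
    attr, exam = rng.random(), rng.random()     # stand-ins for the RF estimates
    satisfaction = attr * exam                  # the paper's A*E approximation
    i, j = to_state(attr, exam)
    a = rng.integers(LABELS) if rng.random() < eps else int(Q[i, j].argmax())
    reward = -abs(satisfaction * (LABELS - 1) - a)   # closer label, higher reward
    Q[i, j, a] += alpha * (reward - Q[i, j, a])      # one-step value update

print(Q.argmax(axis=-1))   # learned relevance label per (A, E) cell
```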

Findings

Experimental results on the MSLR-WEB10K and WCL2R benchmark data sets demonstrate that the proposed algorithm, named SeaRank, outperforms baseline algorithms. The improvement is more noticeable in top-ranked results, which usually receive more attention from users.

Originality/value

This research provides a mapping from IR features to the CM features and thereafter utilizes these newly generated features to build an RL model. This RL model is proposed with the definition of the states, actions and reward function. By applying TD learning algorithms, such as Q-learning and SARSA, within several learning episodes, the RL agent learns how to choose the most appropriate relevance label for a given query–document pair.

Details

Data Technologies and Applications, vol. 57 no. 4
Type: Research Article
ISSN: 2514-9288


Article
Publication date: 15 January 2021

Chiara Giachino, Luigi Bollani, Alessandro Bonadonna and Marco Bertetti


Abstract

Purpose

The aim of the paper is to test and demonstrate the potential benefits of applying reinforcement learning instead of traditional methods to optimize the content of a company's mobile application to best help travellers find their ideal flights. To this end, two approaches were considered and compared via simulation: standard randomized experiments (A/B testing) and multi-armed bandits.

Design/methodology/approach

The two approaches to optimizing the content of the mobile application and, consequently, increasing flight conversions are simulated as applied by Skyscanner, using R software.
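A hedged sketch of the comparison described above: a fixed 50/50 A/B split versus an epsilon-greedy multi-armed bandit that shifts traffic toward the winning variant. The paper's simulations were run in R; this Python version with invented conversion rates is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)
p = np.array([0.030, 0.045])          # hypothetical conversion rates per variant
N = 20000                             # simulated app sessions

# A/B test: split traffic evenly for the whole test window
ab_conv = sum(rng.random() < p[i % 2] for i in range(N))

# Epsilon-greedy bandit: exploit the current best arm, explore occasionally
eps, wins, plays = 0.1, np.zeros(2), np.ones(2)
bandit_conv = 0
for _ in range(N):
    arm = rng.integers(2) if rng.random() < eps else int((wins / plays).argmax())
    reward = rng.random() < p[arm]
    wins[arm] += reward; plays[arm] += 1; bandit_conv += reward

print(f"A/B conversions: {ab_conv}, bandit conversions: {bandit_conv}")
```

The bandit typically converts more sessions over the same window because it reallocates traffic during the experiment rather than waiting for it to end.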

Findings

The first result concerns the comparison between the two approaches – A/B testing and multi-armed bandits – to identify which one achieves better results for the company. The second is the experience and suggestions gained in applying the two approaches, which are useful for other industries and companies.

Research limitations/implications

The case study demonstrated, via simulation, the potential benefits of applying reinforcement learning in a company. Finally, the multi-armed bandit was implemented in the company, but the period of the available data was limited, and due to its strategic relevance, the company cannot show all the findings.

Practical implications

The right algorithm can change according to the situation and industry, but it would bring great benefits to the company's ability to surface content that is more relevant to users and help improve the experience for travellers. The study shows how to manage complexity and data to achieve good results.

Originality/value

The paper describes the approach used by a leading European company operating in the travel sector to understand how to adapt reinforcement learning to its strategic goals. It presents a real case study and the simulation of the application of A/B testing and multi-armed bandits at Skyscanner; moreover, it highlights practical suggestions useful to other companies.

Details

Industrial Management & Data Systems, vol. 121 no. 6
Type: Research Article
ISSN: 0263-5577

