Search results
1 – 10 of 258
Abstract
Purpose
The purpose of this paper is to provide an overview of the theoretical background and applications of inverse reinforcement learning (IRL).
Design/methodology/approach
Reinforcement learning (RL) techniques provide a powerful solution for sequential decision-making problems under uncertainty. RL uses an agent equipped with a reward function to find a policy through interactions with a dynamic environment. However, one major assumption of existing RL algorithms is that the reward function, the most succinct representation of the designer's intention, needs to be provided beforehand. In practice, the reward function can be very hard to specify and exhausting to tune for large and complex problems. This inspired the development of IRL, an extension of RL that tackles the problem directly by learning the reward function from expert demonstrations. In this paper, the original IRL algorithms and their close variants, as well as recent advances, are reviewed and compared.
Findings
This paper can serve as an introductory guide to the fundamental theory and developments of IRL, as well as its applications.
Originality/value
This paper surveys the theories and applications of IRL, the latest development of RL; such a survey has not been done so far.
Hongbo Gao, Guanya Shi, Kelong Wang, Guotao Xie and Yuchao Liu
Abstract
Purpose
Over the past decades, there has been significant research effort dedicated to the development of autonomous vehicles. The decision-making system, which is responsible for driving safety, is one of the most important technologies for autonomous vehicles. The purpose of this study is to use a reinforcement learning method, combined with car-following data from a driving simulator, to obtain an explainable car-following algorithm and establish an anthropomorphic car-following model.
Design/methodology/approach
This paper proposes a car-following method based on reinforcement learning for autonomous vehicle decision-making. An approximator is used to approximate the value function after the state space, action space and state transition relationship have been determined. A gradient descent method is used to solve for the parameters.
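The approach described above (a parametric approximator for the value function, fitted by gradient descent) can be illustrated with a minimal sketch. This is not the authors' algorithm: the toy dynamics, the quadratic feature, the reward and the constants `GAMMA` and `ALPHA` are all assumptions chosen so that the result can be checked by hand, and the update shown is standard semi-gradient TD(0) with a linear approximator.

```python
import random

GAMMA = 0.9  # discount factor (assumed for illustration)
ALPHA = 0.5  # gradient step size (assumed for illustration)


def step(gap_error):
    """Hypothetical toy dynamics: a fixed policy halves the gap error each step.

    The reward penalises the squared deviation from the desired gap.
    """
    reward = -gap_error ** 2
    next_gap_error = 0.5 * gap_error
    return reward, next_gap_error


def feature(gap_error):
    """Single hand-picked feature, so the approximator is V(e) = w * e**2."""
    return gap_error ** 2


def fit_value_weight(num_updates=5000, seed=0):
    """Fit the weight w with semi-gradient TD(0) updates on sampled states."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(num_updates):
        e = rng.uniform(-1.0, 1.0)          # sample a gap error
        reward, e_next = step(e)
        # TD error: bootstrapped target minus current estimate
        td_error = reward + GAMMA * w * feature(e_next) - w * feature(e)
        # Gradient of V(e) with respect to w is feature(e)
        w += ALPHA * td_error * feature(e)
    return w


if __name__ == "__main__":
    print(fit_value_weight())
```

For these toy dynamics the exact value function is V(e) = -e² / (1 - 0.25·γ), so the learned weight should approach -1/0.775; a real car-following model would use a richer state (for example gap, relative speed and ego speed) and a learned policy.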
Findings
Car-following behavior for certain driving styles is first achieved in simulations of step conditions. These initial results indicate that the reinforcement learning system adapts well to car following and that it offers a degree of explainability and stability, based on the explicit calculation of R.
Originality/value
The simulation results show that the car-following method based on reinforcement learning for autonomous vehicle decision-making achieves reliable car-following decisions and has the advantages of simple samples, a small amount of data, a simple algorithm and good robustness.
Nazli Turan, Miroslav Dudik, Geoff Gordon and Laurie R. Weingart
Abstract
Purpose – The purpose of this chapter is to introduce new methods to behavioral research on group negotiation.
Design/methodology/approach – We describe three techniques from the field of Machine Learning and discuss their possible application to modeling dynamic processes in group negotiation: Markov Models, Hidden Markov Models, and Inverse Reinforcement Learning. Although negotiation research has employed Markov modeling in the past, the latter two methods are even more novel and cutting-edge. They provide the opportunity for researchers to build more comprehensive models and to use data more efficiently. To demonstrate their potential, we take scenarios from group negotiation research and discuss how these methods could hypothetically be applied to them. We conclude with suggestions for researchers interested in pursuing this line of work.
Originality/value – This chapter introduces methods that have been successfully used in other fields and discusses how these methods can be used in behavioral negotiation research. This chapter can be a valuable guide to researchers who would like to pursue computational modeling of group negotiation.
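As a rough illustration of the simplest of the three techniques, a first-order Markov model of negotiation behavior can be estimated by counting transitions between coded behaviors and normalising each row. The behavior codes (`"offer"`, `"reject"`, `"accept"`) and the function name `estimate_transitions` are hypothetical and are not taken from the chapter.

```python
from collections import Counter, defaultdict


def estimate_transitions(sequence):
    """Maximum-likelihood transition probabilities of a first-order Markov chain.

    `sequence` is a list of coded behaviors in the order they occurred.
    Returns {state: {next_state: probability}}.
    """
    counts = defaultdict(Counter)
    # Count how often each behavior is immediately followed by each other one
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    # Normalise each row of counts into probabilities
    return {
        state: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for state, row in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical coded transcript of one negotiation session
    coded = ["offer", "reject", "offer", "accept"]
    print(estimate_transitions(coded))
```

A Hidden Markov Model would add unobserved states behind these observed codes, and inverse reinforcement learning would go further by inferring the reward that makes the observed behavior sequence appear rational.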
Qun Shi, Wangda Ying, Lei Lv and Jiajun Xie
Abstract
Purpose
This paper aims to present an intelligent motion attitude control algorithm that addresses the poor precision of motion-manipulation control and the motion-balance problems of humanoid robots. To address the scarcity of physical training samples and low efficiency, this paper proposes offline pre-training of the attitude controller, using the identification model as a priori knowledge for online training in the real physical environment.
Design/methodology/approach
Deep reinforcement learning (DRL) of continuous motion and continuous state space is applied to the motion attitude control of humanoid robots, and an intelligent attitude controller for robot motion is constructed. Based on a stability analysis of the training and control processes, stability constraints for both processes are established, and the correctness of these constraints is demonstrated experimentally.
Findings
Compared with the proportional-integral-derivative (PID) controller, the PID + MPC controller and the MPC + DOB controller in the humanoid robot environment-transition walking experiment, the standard deviation of the tracking error of the robot's upper-body pitch attitude trajectory under the intelligent attitude controller is reduced by 60.37 per cent, 44.17 per cent and 26.58 per cent, respectively.
Originality/value
Using an intelligent motion attitude control algorithm to deal with the strongly coupled nonlinear problem of biped robot walking can simplify the control process. Offline pre-training of the attitude controller, using the identification model as a priori knowledge for online training in the real physical environment, compensates for the scarcity of physical training samples and low efficiency. The results demonstrate the motion-manipulation control precision and motion balance achieved for humanoid robots and provide some inspiration for applying DRL to walking attitude control in biped robots.
Zoltan Dobra and Krishna S. Dhir
Abstract
Purpose
Recent years have seen a technological change, Industry 4.0, in the manufacturing industry. Human–robot cooperation, a new application, is increasing and facilitating collaboration without fences, cages or any kind of separation. The purpose of the paper is to review mainstream academic publications to evaluate the current status of human–robot cooperation and identify potential areas of further research.
Design/methodology/approach
A systematic literature review is offered that searches, appraises, synthesizes and analyses relevant works.
Findings
The authors report the prevailing status of human–robot collaboration, human factors, complexity/programming, safety, collision avoidance, instructing the robot system and other aspects of human–robot collaboration.
Practical implications
This paper identifies new directions and potential research in the practice of human–robot collaboration, such as measuring the degree of collaboration, integrating human–robot cooperation into teamwork theories, effective functional relocation of the robot and product design for human–robot collaboration.
Originality/value
This paper will be useful for three cohorts of readers, namely, manufacturers who require a baseline for the development and deployment of robots; users of robots seeking manufacturing advantage; and researchers looking for new directions for further exploration of human–machine collaboration.
Jinxin Liu, Hui Xiong, Tinghan Wang, Heye Huang, Zhihua Zhong and Yugong Luo
Abstract
Purpose
For autonomous vehicles, trajectory prediction of surrounding vehicles is beneficial to improving the situational awareness of dynamic and stochastic traffic environments, which is a crucial and indispensable element to realize highly automated driving.
Design/methodology/approach
In this paper, the overall framework consists of two parts: first, a novel driver characteristic and intention estimation (DCIE) model is built to indicate the higher-level information of the vehicle using its low-level motion variables; then, according to the estimation results of the DCIE model, a classified Gaussian process model is established for probabilistic vehicle trajectory prediction under different motion patterns.
Findings
The whole method is then applied and analyzed in highway lane-change scenarios, with the model parameters learned from a public naturalistic driving data set. Compared with other traditional methods, the proposed approach proves superior, as demonstrated by higher accuracy over long prediction horizons and a more reasonable description of uncertainty.
Originality/value
This hierarchical approach is proposed to make accurate trajectory predictions in both the short term and the long term; it can also deal with the uncertainties caused by the perception system or indeterminate vehicle behaviors.
N. Rezzoug and P. Gorce
Abstract
In this paper, a biocybernetic method is proposed for learning hand grasping posture definition with little knowledge about the task. The developed model is composed of two stages. The first is dedicated to learning the inverse kinematics of the fingers, in order to locally define a single finger posture given its desired fingertip position. This motor function is fulfilled by a modular neural network architecture, called the Fingers Configuration Neural Network (FCNN), that tackles the discontinuity problem of the inverse kinematics function. Following the concept of direct associative learning, a second neural model is used to search the space of hand configurations and optimize the configuration according to an evaluative function based on the results of the FCNN. Simulation results show good learning of grasping posture determination for various object types, with different numbers of fingers involved and different contact configurations.
Adolfo Perrusquía, Wen Yu and Alberto Soria
Abstract
Purpose
The position/force control of the robot needs the parameters of the impedance model and generates the desired position from the contact force in the environment. When the environment is unknown, learning algorithms are needed to estimate both the desired force and the parameters of the impedance model.
Design/methodology/approach
In this paper, the authors use reinforcement learning to learn only the desired force, then they use proportional-integral-derivative admittance control to generate the desired position. The results of the experiment are presented to verify their approach.
Findings
The position error is minimized without knowing the environment or the impedance parameters. Another advantage of this simplified position/force control is that the transformation of the Cartesian space to the joint space by inverse kinematics is avoided by the feedback control mechanism. The stability of the closed-loop system is proven.
Originality/value
The position error is minimized without knowing the environment or the impedance parameters. The stability of the closed-loop system is proven.