Search results

1 – 10 of over 1000
Open Access
Article
Publication date: 25 January 2024

Atef Gharbi

The purpose of the paper is to propose and demonstrate a novel approach for addressing the challenges of path planning and obstacle avoidance in the context of mobile robots (MR)…

Abstract

Purpose

The purpose of the paper is to propose and demonstrate a novel approach to the challenges of path planning and obstacle avoidance for mobile robots (MR). The specific objectives outlined in the paper are: introducing a new methodology that combines Q-learning with dynamic rewards to improve the efficiency of path planning and obstacle avoidance; enhancing the navigation of MR through unfamiliar environments by reducing blind exploration and accelerating convergence to optimal solutions; and demonstrating through simulation results that the proposed method, dynamic reward-enhanced Q-learning (DRQL), outperforms existing approaches, converging to an optimal action strategy more efficiently, in less time, with fewer exploration steps and higher average rewards.

Design/methodology/approach

The design adopted in this paper to achieve its purposes involves the following key components: (1) Combination of Q-learning and dynamic reward: the paper’s design integrates Q-learning, a popular reinforcement learning technique, with dynamic reward mechanisms. This combination forms the foundation of the approach. Q-learning is used to learn and update the robot’s action-value function, while dynamic rewards are introduced to guide the robot’s actions effectively. (2) Data accumulation during navigation: when a MR navigates through an unfamiliar environment, it accumulates experience data. This data collection is a crucial part of the design, as it enables the robot to learn from its interactions with the environment. (3) Dynamic reward integration: dynamic reward mechanisms are integrated into the Q-learning process. These mechanisms provide feedback to the robot based on its actions, guiding it to make decisions that lead to better outcomes. Dynamic rewards help reduce blind exploration, which can be time-consuming and inefficient and promote faster convergence to optimal solutions. (4) Simulation-based evaluation: to assess the effectiveness of the proposed approach, the design includes a simulation-based evaluation. This evaluation uses simulated environments and scenarios to test the performance of the DRQL method. (5) Performance metrics: the design incorporates performance metrics to measure the success of the approach. These metrics likely include measures of convergence speed, exploration efficiency, the number of steps taken and the average rewards obtained during the robot’s navigation.
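The abstract stops short of pseudocode. As a purely illustrative sketch (the grid world, the distance-based shaping function and all hyperparameters below are assumptions, not the paper's actual design), a dynamic-reward Q-learning loop might look like:

```python
import random

def dynamic_reward(state, goal):
    """Hypothetical dynamic reward: feedback grows less negative as the
    robot nears the goal (the paper's exact shaping function is not given)."""
    return -(abs(state[0] - goal[0]) + abs(state[1] - goal[1]))

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # four grid moves

def step(state, action, size):
    """Apply a move, clamping the robot inside the grid."""
    dx, dy = ACTIONS[action]
    return (min(max(state[0] + dx, 0), size - 1),
            min(max(state[1] + dy, 0), size - 1))

def train_drql(goal=(4, 4), size=5, episodes=800, alpha=0.5, gamma=0.9, eps=0.2):
    """Tabular Q-learning driven by the dynamic reward above."""
    rng = random.Random(0)
    Q = {}
    for _ in range(episodes):
        s = (0, 0)
        for _ in range(50):
            if s == goal:
                break
            if rng.random() < eps:
                a = rng.randrange(4)  # epsilon-greedy exploration
            else:
                a = max(range(4), key=lambda i: Q.get((s, i), 0.0))
            s2 = step(s, a, size)
            r = 10.0 if s2 == goal else dynamic_reward(s2, goal)
            best_next = max(Q.get((s2, i), 0.0) for i in range(4))
            old = Q.get((s, a), 0.0)
            Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
            s = s2
    return Q

def greedy_path(Q, goal=(4, 4), size=5, limit=20):
    """Roll out the learned greedy policy from the start cell."""
    s, path = (0, 0), [(0, 0)]
    while s != goal and len(path) <= limit:
        s = step(s, max(range(4), key=lambda i: Q.get((s, i), 0.0)), size)
        path.append(s)
    return path
```

The shaping term replaces a sparse goal-only reward with dense feedback at every step, which is the mechanism the abstract credits for reducing blind exploration.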

Findings

The findings of the paper can be summarized as follows: (1) Efficient path planning and obstacle avoidance: the paper’s proposed approach, DRQL, leads to more efficient path planning and obstacle avoidance for MR. This is achieved through the combination of Q-learning and dynamic reward mechanisms, which guide the robot’s actions effectively. (2) Faster convergence to optimal solutions: DRQL accelerates the convergence of the MR to optimal action strategies. Dynamic rewards help reduce the need for blind exploration, which typically consumes time, resulting in quicker attainment of optimal solutions. (3) Reduced exploration time: the integration of dynamic reward mechanisms significantly reduces the time required for exploration during navigation. This reduction in exploration time contributes to more efficient and quicker path planning. (4) Improved path exploration: the results from the simulations indicate that the DRQL method leads to improved path exploration in unknown environments. The robot takes fewer steps to reach its destination, which is a crucial indicator of efficiency. (5) Higher average rewards: the paper’s findings reveal that MR using DRQL receive higher average rewards during their navigation. This suggests that the proposed approach results in better decision-making and more successful navigation.

Originality/value

The paper’s originality stems from its unique combination of Q-learning and dynamic rewards, its focus on efficiency and speed in MR navigation and its ability to enhance path exploration and average rewards. These original contributions have the potential to advance the field of mobile robotics by addressing critical challenges in path planning and obstacle avoidance.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Article
Publication date: 12 April 2024

Youwei Li and Jian Qu

The purpose of this research is to achieve multi-task autonomous driving by adjusting the network architecture of the model. Meanwhile, after achieving multi-task autonomous…

Abstract

Purpose

The purpose of this research is to achieve multi-task autonomous driving by adjusting the network architecture of the model. However, after achieving multi-task autonomous driving, the authors found that the trained neural network model performs poorly in untrained scenarios. Therefore, the authors proposed improving the transfer efficiency of the model to new scenarios through transfer learning.

Design/methodology/approach

First, the authors achieved multi-task autonomous driving by training a model combining a convolutional neural network with differently structured long short-term memory (LSTM) layers. Second, the authors achieved fast transfer of neural network models to new scenarios through cross-model transfer learning. Finally, the authors combined data collection and data labeling to improve the efficiency of deep learning. Furthermore, the authors verified that the model has good robustness through light and shadow tests.
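The cross-model transfer step can be pictured as copying the shared backbone of a trained network into a fresh one before fine-tuning on the new scenario. The sketch below is illustrative only; the layer names and dict-of-lists parameter format are assumptions, not the authors' actual UniBiCLSTM implementation:

```python
def transfer_backbone(source, target, shared_prefixes=("conv",)):
    """Copy parameters whose names match a shared-backbone prefix from a
    trained source model into a fresh target model, leaving task-specific
    LSTM/head layers to be retrained on the new scenario."""
    transferred = []
    for name, params in source.items():
        if name in target and name.startswith(shared_prefixes):
            target[name] = list(params)  # copy values rather than alias
            transferred.append(name)
    return transferred

# Hypothetical parameter dictionaries standing in for real weight tensors.
trained = {"conv1": [0.3, -0.1], "conv2": [0.7], "lstm": [0.2], "head": [0.5]}
fresh = {"conv1": [0.0, 0.0], "conv2": [0.0], "lstm": [0.0], "head": [0.0]}
moved = transfer_backbone(trained, fresh)  # copies conv1 and conv2 only
```

Only the convolutional layers carry over; the recurrent and output layers start from scratch, which is what makes adaptation to a new scenario cheaper than full retraining.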

Findings

This research achieved road tracking, real-time acceleration–deceleration, obstacle avoidance and left/right sign recognition. The model proposed by the authors (UniBiCLSTM) outperforms the existing models tested with model cars in terms of autonomous driving performance. Furthermore, the CMTL-UniBiCL-RL model trained by the authors through cross-model transfer learning improves the efficiency of model adaptation to new scenarios. Meanwhile, this research proposed an automatic data annotation method, which can save a quarter of the time required for deep learning.

Originality/value

This research provided novel solutions in the achievement of multi-task autonomous driving and neural network model scenario for transfer learning. The experiment was achieved on a single camera with an embedded chip and a scale model car, which is expected to simplify the hardware for autonomous driving.

Details

Data Technologies and Applications, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 1 April 2024

Tao Pang, Wenwen Xiao, Yilin Liu, Tao Wang, Jie Liu and Mingke Gao

This paper aims to study the agent learning from expert demonstration data while incorporating reinforcement learning (RL), which enables the agent to break through the…

Abstract

Purpose

This paper aims to study the agent learning from expert demonstration data while incorporating reinforcement learning (RL), which enables the agent to break through the limitations of expert demonstration data and reduces the dimensionality of the agent’s exploration space to speed up the training convergence rate.

Design/methodology/approach

First, a decay weight function is set in the objective function of the agent’s training to combine the two types of methods, so that both RL and imitation learning (IL) guide the agent’s behavior when the policy is updated. Second, this study designs a coupling utilization method for the demonstration trajectories and the training experience, so that samples from both sources can be combined during the agent’s learning process, improving the data utilization rate and the agent’s learning speed.
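A minimal sketch of a decay-weighted IL+RL objective and a coupled demonstration/experience batch sampler follows. The exponential schedule, batch mechanics and parameter names are assumptions, since the abstract does not specify them:

```python
import math
import random

def decay_weight(step, tau=1000.0):
    """Hypothetical exponential schedule: near 1 early in training
    (imitation dominates), near 0 late (reinforcement dominates)."""
    return math.exp(-step / tau)

def combined_objective(il_loss, rl_loss, step, tau=1000.0):
    """Blend the imitation-learning and RL losses with the decay weight."""
    w = decay_weight(step, tau)
    return w * il_loss + (1.0 - w) * rl_loss

def sample_coupled_batch(demos, experience, step, batch=8, tau=1000.0, seed=0):
    """Couple demonstration and self-collected samples: the demonstration
    fraction of each batch shrinks on the same decay schedule."""
    rng = random.Random(seed + step)
    k = min(round(decay_weight(step, tau) * batch), len(demos))
    return rng.sample(demos, k) + rng.choices(experience, k=batch - k)
```

Early batches are dominated by expert demonstrations; as training proceeds, the agent's own experience takes over, letting it move beyond the limits of the demonstration data.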

Findings

The method is superior to other algorithms in terms of convergence speed and decision stability, avoids training reward values from scratch and breaks through the restrictions imposed by the demonstration data.

Originality/value

The agent can adapt to dynamic scenes through exploration and trial-and-error mechanisms based on the experience of the demonstration trajectories. The demonstration data set used in IL and the experience samples obtained during RL are coupled to improve data utilization efficiency and the generalization ability of the agent.

Details

International Journal of Web Information Systems, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 15 April 2024

Xiaona Wang, Jiahao Chen and Hong Qiao

Limited by the types of sensors, the state information available for musculoskeletal robots with highly redundant, nonlinear muscles is often incomplete, which makes the control…

Abstract

Purpose

Limited by the types of sensors available, the state information for musculoskeletal robots with highly redundant, nonlinear muscles is often incomplete, which creates a bottleneck for control. The aim of this paper is to design a method that improves the motion performance of musculoskeletal robots in partially observable scenarios and leverages ontology knowledge to enhance the algorithm’s adaptability to musculoskeletal robots that have undergone changes.

Design/methodology/approach

A memory and attention-based reinforcement learning method is proposed for musculoskeletal robots with prior knowledge of muscle synergies. First, to deal with the partially observed states available to musculoskeletal robots, a memory and attention-based network architecture is proposed for inferring more sufficient and intrinsic states. Second, inspired by the muscle synergy hypothesis in neuroscience, prior knowledge of a musculoskeletal robot’s muscle synergies is embedded in the network structure and the reward shaping.
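As a toy illustration of the memory-and-attention idea (not the paper's actual network), soft attention over a buffer of past partial observations can produce an inferred state summary:

```python
import math

def attend(memory, query):
    """Soft attention over a memory of past partial observations: score
    each stored observation against the current one, softmax the scores
    and return the attention-weighted summary as the inferred state."""
    scores = [sum(q * m for q, m in zip(query, mem)) for mem in memory]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(query)
    context = [sum(w * mem[i] for w, mem in zip(weights, memory))
               for i in range(dim)]
    return weights, context
```

Observations most similar to the current one dominate the summary, so information missing from the instantaneous sensor reading (for example, velocities) can be recovered from the history.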

Findings

Based on systematic validation, it is found that the proposed method demonstrates superiority over the traditional twin delayed deep deterministic policy gradients (TD3) algorithm. A musculoskeletal robot with highly redundant, nonlinear muscles is adopted to implement goal-directed tasks. In the case of 21-dimensional states, the learning efficiency and accuracy are significantly improved compared with the traditional TD3 algorithm; in the case of 13-dimensional states without velocities and information from the end effector, the traditional TD3 is unable to complete the reaching tasks, while the proposed method breaks through this bottleneck problem.

Originality/value

In this paper, a novel memory and attention-based reinforcement learning method with prior knowledge of muscle synergies is proposed for musculoskeletal robots to deal with partially observable scenarios. Compared with the existing methods, the proposed method effectively improves the performance. Furthermore, this paper promotes the fusion of neuroscience and robotics.

Details

Robotic Intelligence and Automation, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2754-6969

Article
Publication date: 19 March 2024

Mingke Gao, Zhenyu Zhang, Jinyuan Zhang, Shihao Tang, Han Zhang and Tao Pang

Because of the various advantages of reinforcement learning (RL), this study uses RL to train unmanned aerial vehicles to perform two tasks: target search and…

Abstract

Purpose

Because of the various advantages of reinforcement learning (RL), this study uses RL to train unmanned aerial vehicles to perform two tasks: target search and cooperative obstacle avoidance.

Design/methodology/approach

This study draws inspiration from the recurrent state-space model and recurrent models (RPM) to propose a simpler yet highly effective model called the unmanned aerial vehicles prediction model (UAVPM). The main objective is to assist in training the UAV representation model with a recurrent neural network, using the soft actor-critic algorithm.

Findings

This study proposes a generalized actor-critic framework consisting of three modules: representation, policy and value. This architecture serves as the foundation for training the UAVPM, which is designed to aid in training the recurrent representation using the transition model, reward recovery model and observation recovery model. Unlike traditional approaches reliant solely on reward signals, RPM incorporates temporal information and allows the inclusion of extra knowledge or information from virtual training environments. This study designs UAV target search and UAV cooperative obstacle avoidance tasks, and the algorithm outperforms the baselines in both environments.
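The three-module split can be sketched as follows. The moving-average representation and linear policy/value heads are deliberate simplifications of the recurrent networks the paper trains, and serve to illustrate that the auxiliary recovery heads are needed only during training, not at inference time:

```python
class Representation:
    """Toy recurrent summary: an exponential moving average over past
    observations stands in for the learned recurrent representation."""
    def __init__(self, dim, decay=0.9):
        self.h = [0.0] * dim
        self.decay = decay

    def observe(self, obs):
        self.h = [self.decay * h + (1.0 - self.decay) * o
                  for h, o in zip(self.h, obs)]
        return list(self.h)

def policy(weights, latent):
    """Linear policy head over the latent state; picks one of two actions.
    At inference only representation and policy run; the UAVPM-style
    transition/reward/observation recovery heads would contribute
    auxiliary training losses only."""
    score = sum(w * x for w, x in zip(weights, latent))
    return 1 if score > 0 else 0

def value(weights, latent):
    """Linear value head used by the critic during training."""
    return sum(w * x for w, x in zip(weights, latent))
```

Because the auxiliary heads never run at inference, privileged "cheating" signals from the simulator can shape the representation during training without needing to exist on the real vehicle.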

Originality/value

It is important to note that UAVPM does not play a role in the inference phase. This means that the representation model and policy remain independent of UAVPM. Consequently, this study can introduce additional “cheating” information from virtual training environments to guide the UAV representation without concerns about its real-world existence. By leveraging historical information more effectively, this study enhances UAVs’ decision-making abilities, thus improving the performance of both tasks at hand.

Details

International Journal of Web Information Systems, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 3 January 2024

Miao Ye, Lin Qiang Huang, Xiao Li Wang, Yong Wang, Qiu Xiang Jiang and Hong Bing Qiu

A cross-domain intelligent software-defined network (SDN) routing method based on a proposed multiagent deep reinforcement learning (MDRL) method is developed.

Abstract

Purpose

A cross-domain intelligent software-defined network (SDN) routing method based on a proposed multiagent deep reinforcement learning (MDRL) method is developed.

Design/methodology/approach

First, the network is divided into multiple subdomains managed by multiple local controllers, and the state information of each subdomain is flexibly obtained by the designed SDN multithreaded network measurement mechanism. Then, a cooperative communication module is designed to realize message transmission and message synchronization between the root and local controllers, and socket technology is used to ensure the reliability and stability of message transmission between multiple controllers to acquire global network state information in real time. Finally, after the optimal intradomain and interdomain routing paths are adaptively generated by the agents in the root and local controllers, a network traffic state prediction mechanism is designed to improve awareness of the cross-domain intelligent routing method and enable the generation of the optimal routing paths in the global network in real time.

Findings

Experimental results show that the proposed cross-domain intelligent routing method can significantly improve the network throughput and reduce the network delay and packet loss rate compared to those of the Dijkstra and open shortest path first (OSPF) routing methods.
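For reference, the Dijkstra baseline in this comparison computes shortest paths from static link costs, with no awareness of current traffic state. A minimal sketch (the paper's actual topology and cost metric are not given):

```python
import heapq

def dijkstra(graph, src):
    """Single-source shortest paths; graph maps each node to a list of
    (neighbor, link_cost) pairs. Returns the cost to every reachable node."""
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist
```

Because the link costs here are fixed, this baseline cannot react to congestion, which is the gap the learned, traffic-aware routing method targets.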

Originality/value

Message transmission and message synchronization for multicontroller interdomain routing in SDN have long adaptation times and slow convergence speeds, and traditional interdomain routing methods suffer from shortcomings such as cumbersome configuration and inflexible acquisition of network state information. These drawbacks make it difficult to obtain global network state information, so optimal routing decisions cannot be made in real time, degrading network performance. To address these problems, this paper proposes the MDRL-based cross-domain intelligent SDN routing method described above, which the experiments show significantly improves network throughput and reduces network delay and packet loss rate relative to the Dijkstra and OSPF routing methods.

Details

International Journal of Intelligent Computing and Cybernetics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 29 December 2023

Hao Chen and Shuangkang Hao

Addressing the significant differences between referral programs and traditional promotional marketing, this paper aims to investigate and examine the impact of how reward-related…

Abstract

Purpose

Addressing the significant differences between referral programs and traditional promotional marketing, this paper aims to examine the impact of how reward-related information is presented within referral programs and how that presentation interacts with reward size and reward allocation.

Design/methodology/approach

This study adopts framing effect and equity theory to build the relationship between reward presentation, reward size and reward allocation. Then, two scenario-based experimental studies are designed and conducted on Amazon Mechanical Turk.

Findings

The results show that reward presentation has no direct impact on referral likelihood; rather, the effect depends on reward size. As reward size increases, the presentation that maximizes referral likelihood gradually shifts from percentage form to dollar form, with perceived size mediating the interaction effect on referral likelihood. Further, adding information about reward allocation also indicates the differing impacts of equity and inequity on the above findings.

Originality/value

The study contributes to the literature by introducing reward presentation and emphasizes its impact on individual’s behavior decisions in the context of referral programs. This study extends and broadens the scope and effectiveness of the framing effect on traditional promotional marketing strategies, while also bridging the gap in the literature by examining the combined role of information about rewards.

Details

Asia Pacific Journal of Marketing and Logistics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1355-5855

Article
Publication date: 9 February 2024

Tachia Chin, T.C.E. Cheng, Chenhao Wang and Lei Huang

Aiming to resolve cross-cultural paradoxes in combining artificial intelligence (AI) with human intelligence (HI) for international humanitarian logistics, this paper aims to…

Abstract

Purpose

Aiming to resolve cross-cultural paradoxes in combining artificial intelligence (AI) with human intelligence (HI) for international humanitarian logistics, this paper adopts an unorthodox Yin–Yang dialectic approach to address how AI–HI interactions can be interpreted as a sophisticated cross-cultural knowledge creation (KC) system that enables more effective decision-making for providing humanitarian relief across borders.

Design/methodology/approach

This paper is conceptual and pragmatic in nature, and its structure follows the requirements of a real impact study.

Findings

Based on experimental information and logical reasoning, the authors first identify three critical cross-cultural challenges in AI–HI collaboration: paradoxes of building a cross-cultural KC system, paradoxes of integrating AI and HI in moral judgement and paradoxes of processing moral-related information with emotions in AI–HI collaboration. Then, applying the Yin–Yang dialectic to interpret Klir’s (1993) epistemological frame, the authors propose an unconventional stratified system of cross-cultural KC for understanding integrative AI–HI decision-making for humanitarian logistics across cultures.

Practical implications

This paper not only aids in deeply understanding the complex issues stemming from human emotions and cultural cognitions in the context of cross-border humanitarian logistics but also equips culturally diverse stakeholders to effectively navigate these challenges and their potential ramifications. It enhances the decision-making process and optimizes the synergy between AI and HI for cross-cultural humanitarian logistics.

Originality/value

The originality lies in the use of a cognitive methodology of the Yin–Yang dialectic to metaphorize the dynamic genesis of integrative AI-HI KC for international humanitarian logistics. Based on system science and knowledge management, this paper applies game theory, multi-objective optimization and Markov decision process to operationalize the conceptual framework in the context of cross-cultural humanitarian logistics.

Details

Journal of Knowledge Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1367-3270

Article
Publication date: 13 March 2024

Rong Jiang, Bin He, Zhipeng Wang, Xu Cheng, Hongrui Sang and Yanmin Zhou

Compared with traditional methods relying on manual teaching or system modeling, data-driven learning methods, such as deep reinforcement learning and imitation learning, show…

Abstract

Purpose

Compared with traditional methods relying on manual teaching or system modeling, data-driven learning methods, such as deep reinforcement learning and imitation learning, show more promising potential to cope with the challenges brought by increasingly complex tasks and environments, and have become a hot research topic in the field of robot skill learning. However, the contradiction between the difficulty of collecting robot–environment interaction data and the low data efficiency of these methods confronts all of them with a serious data dilemma, which has become one of the key issues restricting their development. Therefore, this paper aims to comprehensively sort out and analyze the causes of, and solutions for, the data dilemma in robot skill learning.

Design/methodology/approach

First, this review analyzes the causes of the data dilemma based on a classification and comparison of data-driven methods for robot skill learning. Then, the existing methods used to solve the data dilemma are introduced in detail. Finally, this review discusses the remaining open challenges and promising research topics for solving the data dilemma in the future.

Findings

This review shows that simulation–reality combination, state representation learning and knowledge sharing are crucial for overcoming the data dilemma of robot skill learning.

Originality/value

To the best of the authors’ knowledge, no existing survey systematically and comprehensively sorts out and analyzes the data dilemma in robot skill learning. It is hoped that this review will help to better address the data dilemma in robot skill learning in the future.

Details

Robotic Intelligence and Automation, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2754-6969

Article
Publication date: 5 December 2023

Andrea Sestino, Alessandro Bernardo, Cristian Rizzo and Stefano Bresciani

Gamification unlocks unprecedented opportunities in healthcare, wellness and lifestyle context. In this scenario, by leveraging on such an approach, information technologies now…

Abstract

Purpose

Gamification unlocks unprecedented opportunities in the healthcare, wellness and lifestyle context. In this scenario, information technologies have enabled gamification-based mobile applications primarily employed in health and wellness contexts, focusing on areas such as disease prevention, self-management, medication adherence and telehealth programs. The synergistic integration of gamification-based methodologies with digital tools (e.g. the Internet of Things and mobile applications) in the realm of digital therapeutics (DTx) has thus unveiled powerful approaches and paradigms, yielding innovative applications that, by harnessing sensors and software-based systems, transform healthcare maintenance, wellness and lifestyle into an engaging pursuit, like a game. This paper explores the factors influencing individuals’ intention to autonomously utilize mobile gamification-based apps for self-care and wellness maintenance.

Design/methodology/approach

Through an explorative research design, an experiment was conducted among a sample of 376 participants regarding the use of a fictitious gamification-based DTx solution consisting of a mobile app named “Health’n’Fit”.

Findings

Findings from an experiment conducted with a sample of 460 participants shed light on the possible antecedents and consequences of gamification. Results of the SEM model indicate that customization (CU), trust (TR), mobility (MO) and social value (SV) are the main determinants of the playful experience, though to different extents. Moreover, gamification positively impacts attitudes and, in turn, perceived usefulness, intention to use and behavioral intentions.

Practical implications

This paper offers a dual-pronged approach that holds practical significance in the realm of healthcare innovation. First, the authors delve into the antecedents shaping individuals' intention to engage with gamification-based DTx, unraveling the factors that influence user adoption. Beyond this, the authors extend their focus to the realm of healthcare service design. By harnessing the potential of gamification and technology, the authors illuminate pathways to conceptualize and create novel healthcare services. This work not only identifies the building blocks of user engagement but also serves as a guide to innovatively craft healthcare solutions that leverage this amalgamation of technology and gamification, contributing to the evolution of modern healthcare paradigms.

Social implications

In a social context, the paper introduces pioneering technological synergies that merge gamification and DTx to enhance individuals' health and wellness maintenance. By proposing innovative combinations, the authors present novel avenues for promoting healthier lifestyles and behavior change. This not only underscores the potential of technology to positively impact individuals but also highlights the significance of aligning technological advancements with societal well-being. As the research advocates for these innovative solutions, it reinforces the importance of collaborative technological and marketing endeavors, ultimately contributing to the betterment of society as a whole.

Originality/value

This is the first paper to explore the combined effect of gamification and DTx, shedding light on the antecedents of individuals’ intention to use such combined technologies.

Details

European Journal of Innovation Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1460-1060
