Search results

1 – 10 of 150
Article
Publication date: 15 April 2024

Xiaona Wang, Jiahao Chen and Hong Qiao

Limited by the types of sensors, the state information available for musculoskeletal robots with highly redundant, nonlinear muscles is often incomplete, which makes the control…

Abstract

Purpose

Limited by the types of sensors available, the state information for musculoskeletal robots with highly redundant, nonlinear muscles is often incomplete, which creates a bottleneck problem for control. The aim of this paper is to design a method that improves the motion performance of musculoskeletal robots in partially observable scenarios, and to leverage ontology knowledge to enhance the algorithm’s adaptability to musculoskeletal robots that have undergone changes.

Design/methodology/approach

A memory and attention-based reinforcement learning method is proposed for musculoskeletal robots with prior knowledge of muscle synergies. First, to deal with the partially observed states available to musculoskeletal robots, a memory and attention-based network architecture is proposed for inferring more sufficient and intrinsic states. Second, inspired by the muscle synergy hypothesis in neuroscience, prior knowledge of a musculoskeletal robot’s muscle synergies is embedded in the network structure and reward shaping.
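The abstract does not specify the network architecture. As a minimal sketch of the general idea, a softmax attention readout over a memory buffer of past partial observations could look like the following (all names, dimensions and projections are illustrative, not the authors' design):

```python
import numpy as np

def attention_readout(memory, query, w_k, w_v):
    """Attend over a buffer of past observations to infer a richer state.

    memory : (T, d_obs) array of the T most recent partial observations
    query  : (d_k,) query vector, e.g. derived from the current observation
    w_k    : (d_obs, d_k) key projection;  w_v : (d_obs, d_v) value projection
    """
    keys = memory @ w_k                            # (T, d_k)
    values = memory @ w_v                          # (T, d_v)
    scores = keys @ query / np.sqrt(query.size)    # scaled dot-product scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                       # softmax over time steps
    return weights @ values                        # (d_v,) inferred state summary

rng = np.random.default_rng(0)
memory = rng.normal(size=(8, 6))                   # 8 past 6-D partial observations
query = rng.normal(size=4)
state = attention_readout(memory, query,
                          rng.normal(size=(6, 4)), rng.normal(size=(6, 3)))
```

The point is only that a weighted summary of the observation history can stand in for unmeasured state components; the paper's actual memory module is not described here.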

Findings

Based on systematic validation, it is found that the proposed method demonstrates superiority over the traditional twin delayed deep deterministic policy gradients (TD3) algorithm. A musculoskeletal robot with highly redundant, nonlinear muscles is adopted to implement goal-directed tasks. In the case of 21-dimensional states, the learning efficiency and accuracy are significantly improved compared with the traditional TD3 algorithm; in the case of 13-dimensional states without velocities and information from the end effector, the traditional TD3 is unable to complete the reaching tasks, while the proposed method breaks through this bottleneck problem.

Originality/value

In this paper, a novel memory and attention-based reinforcement learning method with prior knowledge of muscle synergies is proposed for musculoskeletal robots to deal with partially observable scenarios. Compared with the existing methods, the proposed method effectively improves the performance. Furthermore, this paper promotes the fusion of neuroscience and robotics.

Details

Robotic Intelligence and Automation, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2754-6969

Keywords

Article
Publication date: 11 November 2019

Jayashree Jagdale and Emmanuel M.

Sentiment analysis is a subfield of data mining that is widely used for studying users’ opinions by analyzing their feedback on Web platforms. It plays an…

Abstract

Purpose

Sentiment analysis is a subfield of data mining that is widely used for studying users’ opinions by analyzing their feedback on Web platforms. It plays an important role in the daily decision-making process, and every decision has a great impact on daily life. Various techniques, including machine learning algorithms, have been proposed for sentiment analysis, but they remain inefficient at extracting sentiment features from the given text. Despite improvements in sentiment analysis approaches, several problems persist that make the analysis inefficient and inaccurate. This paper aims to develop a sentiment analysis scheme for movie reviews by proposing a novel classifier.

Design/methodology/approach

For the analysis, the movie reviews are collected and subjected to pre-processing. From the pre-processed reviews, a total of nine sentiment-related features are extracted and provided to the proposed exponential salp swarm algorithm-based actor-critic neural network (ESSA-ACNN) classifier for sentiment classification. The ESSA algorithm is developed by integrating the exponentially weighted moving average (EWMA) and SSA for selecting the optimal weights of the ACNN. Finally, the proposed classifier classifies the reviews into the positive or negative class.
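The abstract does not give the exact ESSA formulation. The sketch below shows one plausible way an EWMA term could be folded into a standard salp swarm iteration, with the leader steered toward an EWMA-smoothed best position; every parameter name and the smoothing placement are assumptions, not the paper's method:

```python
import numpy as np

def essa_step(positions, food, lb, ub, iteration, max_iter, prev_food, alpha=0.3):
    """One illustrative ESSA iteration over a (n_salps, dim) population."""
    rng = np.random.default_rng(iteration)
    # EWMA smoothing of the best-so-far ("food") position - the 'exponential' part
    smoothed = alpha * food + (1 - alpha) * prev_food
    # standard SSA exploration coefficient, decaying over iterations
    c1 = 2 * np.exp(-(4 * iteration / max_iter) ** 2)
    new = positions.copy()
    # leader salp moves around the smoothed food source
    c2 = rng.random(positions.shape[1])
    c3 = rng.random(positions.shape[1])
    step = c1 * ((ub - lb) * c2 + lb)
    new[0] = np.where(c3 >= 0.5, smoothed + step, smoothed - step)
    # follower salps: midpoint of self and predecessor
    for i in range(1, len(positions)):
        new[i] = 0.5 * (positions[i] + new[i - 1])
    return np.clip(new, lb, ub), smoothed
```

In the paper, the fitness being minimized would be the ACNN's classification error as a function of its weights.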

Findings

The performance of the ESSA-ACNN classifier is analyzed on the reviews in the movie review database. From the simulation results, it is evident that the proposed ESSA-ACNN classifier outperforms existing works, achieving 0.7417, 0.8807 and 0.8119 for sensitivity, specificity and accuracy, respectively.
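For reference, the three reported metrics follow the standard confusion-matrix definitions (the counts below are invented for the demonstration):

```python
def classification_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # true-positive rate
    specificity = tn / (tn + fp)                 # true-negative rate
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # overall correctness
    return sensitivity, specificity, accuracy

sens, spec, acc = classification_metrics(tp=8, tn=9, fp=1, fn=2)
```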

Originality/value

The proposed classifier is applicable to real-world problems, such as business and political activities.

Details

VINE Journal of Information and Knowledge Management Systems, vol. 49 no. 4
Type: Research Article
ISSN: 2059-5891

Keywords

Article
Publication date: 23 December 2019

Mahua Bhowmik and P. Malathi

Cognitive radio (CR) plays a very important role in enabling spectral efficiency in wireless communication networks, where the secondary user (SU) opportunistically accesses the spectrum licensed to primary…

Abstract

Purpose

Cognitive radio (CR) plays a very important role in enabling spectral efficiency in wireless communication networks, where the secondary user (SU) opportunistically accesses the spectrum licensed to primary users (PUs). The purpose of this paper is to develop a prediction model for spectrum sensing in CR.

Design/methodology/approach

This paper proposes a hybrid prediction model, called the krill-herd whale optimization-based actor-critic neural network and hidden Markov model (KHWO-ACNN-HMM). The spectral bands are determined optimally using the proposed hybrid prediction model for allocating the spectrum bands to the PUs. For better sensing, eigenvalue-based cooperative sensing is used in the CR. Finally, a hybrid model is designed by hybridizing KHWO-ACNN and HMM to enhance sensing accuracy. The predicted results of KHWO-ACNN and HMM are combined by a fusion model, for which weighted entropy fusion is employed to determine the free spectrum available in CRs.
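The abstract names weighted entropy fusion but not its formula. One common reading, sketched below under that assumption, weights each model's predictive distribution inversely to its entropy, so the more confident predictor dominates (function names and the weighting form are illustrative):

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of a discrete distribution, in nats."""
    p = np.clip(p, eps, 1.0)
    return -np.sum(p * np.log(p))

def weighted_entropy_fusion(p_a, p_b):
    """Fuse two predictive distributions over channel states (e.g. free/busy),
    weighting each model inversely to the entropy of its output."""
    w_a = 1.0 / (1.0 + entropy(p_a))
    w_b = 1.0 / (1.0 + entropy(p_b))
    fused = (w_a * np.asarray(p_a) + w_b * np.asarray(p_b)) / (w_a + w_b)
    return fused / fused.sum()

# the confident predictor ([0.9, 0.1]) pulls the fused result toward itself
fused = weighted_entropy_fusion([0.9, 0.1], [0.5, 0.5])
```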

Findings

The performance of the prediction model is evaluated using metrics such as probability of detection, probability of false alarm, throughput and sensing time. The proposed spectrum sensing method achieves a maximum probability of detection of 0.9696, a minimum probability of false alarm of 0.78, a minimum throughput of 0.0303 and a maximum sensing time of 650.08 s.

Research implications

The proposed method is useful in various applications, including authentication applications, wireless medical networks and so on.

Originality/value

A hybrid prediction model is introduced for energy-efficient spectrum sensing in CR, and the performance of the proposed model is evaluated against existing models. The proposed hybrid model outperforms the other techniques.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 15 no. 2
Type: Research Article
ISSN: 1756-378X

Keywords

Article
Publication date: 27 October 2022

Haifeng Huang, Xiaoyang Wu, Tingting Wang, Yongbin Sun and Qiang Fu

This paper aims to study the application of reinforcement learning (RL) in the control of an output-constrained flapping-wing micro aerial vehicle (FWMAV) with system uncertainty.

Abstract

Purpose

This paper aims to study the application of reinforcement learning (RL) in the control of an output-constrained flapping-wing micro aerial vehicle (FWMAV) with system uncertainty.

Design/methodology/approach

A six-degrees-of-freedom hummingbird model is used without consideration of the inertial effects of the wings. An RL algorithm based on the actor-critic framework is applied, which consists of an actor network with an unknown policy gradient and a critic network with an unknown value function. Considering the good performance of neural networks (NNs) in fitting nonlinearities and their optimality characteristics, an actor-critic NN optimization algorithm is designed, in which the actor and critic NNs are used to generate a policy and approximate the cost functions, respectively. In addition, to ensure the safe and stable flight of the FWMAV, a barrier Lyapunov function is used to keep the flight states constrained within predefined regions. Based on Lyapunov stability theory, the stability of the system is analyzed, and finally, the feasibility of RL in the control of an FWMAV is verified through simulation.
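The specific barrier Lyapunov function used is not stated in the abstract. A widely used log-type form, shown here as an illustration only, stays finite while the tracking error is inside the constraint and grows without bound as the error approaches the boundary:

```python
import numpy as np

def log_blf(e, kb):
    """Symmetric log-type barrier Lyapunov function V(e) = 0.5*ln(kb^2/(kb^2-e^2)).

    Finite while |e| < kb; diverges as the tracking error e approaches
    the constraint boundary kb, which is what keeps states in the region.
    """
    assert abs(e) < kb, "state outside the constrained region"
    return 0.5 * np.log(kb**2 / (kb**2 - e**2))
```

Designing the control law so that the time derivative of this V is negative definite forces the error to stay strictly inside (-kb, kb), which is the output-constraint mechanism the abstract refers to.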

Findings

The proposed RL control scheme works well in ensuring the trajectory tracking of the FWMAV in the presence of output constraint and system uncertainty.

Originality/value

A novel RL algorithm based on the actor-critic framework is applied to the control of a FWMAV with system uncertainty. For the stable and safe flight of the FWMAV, the output constraint problem is considered and solved by barrier Lyapunov function-based control.

Details

Assembly Automation, vol. 42 no. 6
Type: Research Article
ISSN: 0144-5154

Keywords

Article
Publication date: 19 March 2024

Mingke Gao, Zhenyu Zhang, Jinyuan Zhang, Shihao Tang, Han Zhang and Tao Pang

Because of the various advantages of reinforcement learning (RL), this study uses RL to train unmanned aerial vehicles to perform two tasks: target search and…

Abstract

Purpose

Because of the various advantages of reinforcement learning (RL), this study uses RL to train unmanned aerial vehicles to perform two tasks: target search and cooperative obstacle avoidance.

Design/methodology/approach

This study draws inspiration from the recurrent state-space model and recurrent models (RPM) to propose a simpler yet highly effective model called the unmanned aerial vehicles prediction model (UAVPM). The main objective is to assist in training the UAV representation model with a recurrent neural network, using the soft actor-critic algorithm.

Findings

This study proposes a generalized actor-critic framework consisting of three modules: representation, policy and value. This architecture serves as the foundation for training UAVPM, which is designed to aid in training the recurrent representation using the transition model, reward recovery model and observation recovery model. Unlike traditional approaches that rely solely on reward signals, RPM incorporates temporal information. In addition, it allows the inclusion of extra knowledge or information from virtual training environments. This study designs UAV target search and UAV cooperative obstacle avoidance tasks, and the algorithm outperforms baselines in both environments.
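The abstract lists three auxiliary models (transition, reward recovery, observation recovery) without giving the training objective. A common way to combine such models, sketched here purely as an assumption, is a weighted sum of squared-error terms over the latent rollout (all names and weights are invented):

```python
import numpy as np

def uavpm_style_loss(z_next_pred, z_next, r_pred, r, o_pred, o,
                     w_trans=1.0, w_rew=1.0, w_obs=1.0):
    """Illustrative training signal for a recurrent representation:
    sum of transition, reward-recovery and observation-recovery errors."""
    l_trans = np.mean((np.asarray(z_next_pred) - np.asarray(z_next)) ** 2)  # latent transition
    l_rew = np.mean((np.asarray(r_pred) - np.asarray(r)) ** 2)              # reward recovery
    l_obs = np.mean((np.asarray(o_pred) - np.asarray(o)) ** 2)              # observation recovery
    return w_trans * l_trans + w_rew * l_rew + w_obs * l_obs
```

Because these losses touch only the representation, they can be dropped at inference time, consistent with the abstract's remark that UAVPM plays no role in the inference phase.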

Originality/value

It is important to note that UAVPM does not play a role in the inference phase. This means that the representation model and policy remain independent of UAVPM. Consequently, this study can introduce additional “cheating” information from virtual training environments to guide the UAV representation without concerns about its real-world existence. By leveraging historical information more effectively, this study enhances UAVs’ decision-making abilities, thus improving the performance of both tasks at hand.

Details

International Journal of Web Information Systems, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1744-0084

Keywords

Article
Publication date: 16 April 2020

Qiaoling Zhou

English original movies play an important role in English learning and communication. To find desired movies among a large number of English original movies…

Abstract

Purpose

English original movies play an important role in English learning and communication. To find desired movies among a large number of English original movies and reviews, this paper proposes an improved deep reinforcement learning algorithm for movie recommendation. Although conventional movie recommendation algorithms have solved the problem of information overload, they still have limitations in the case of cold start and sparse data.

Design/methodology/approach

To solve the aforementioned problems of conventional movie recommendation algorithms, this paper proposes a recommendation algorithm based on deep reinforcement learning, which uses the deep deterministic policy gradient (DDPG) algorithm to address the cold-start and sparse-data problems and uses Item2vec to transform the discrete action space into a continuous one. Meanwhile, a reward function combining cosine distance and Euclidean distance is proposed to ensure that the neural network does not converge to a local optimum prematurely.
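The abstract states the reward mixes cosine and Euclidean distance but not how. One natural combination, shown here as a hedged sketch (the mixing weight and sign convention are assumptions), rewards directional agreement while penalizing distance between the policy's continuous action and the chosen item's embedding:

```python
import numpy as np

def hybrid_reward(action_vec, item_vec, alpha=0.5):
    """Reward mixing cosine similarity (direction) with a Euclidean
    distance penalty (magnitude) between the DDPG action vector and
    an item embedding, e.g. one produced by Item2vec."""
    cos = np.dot(action_vec, item_vec) / (
        np.linalg.norm(action_vec) * np.linalg.norm(item_vec))
    euc = np.linalg.norm(action_vec - item_vec)
    return alpha * cos - (1 - alpha) * euc

v = np.array([1.0, 2.0])
r_match = hybrid_reward(v, v)   # perfect match: cos = 1, distance = 0
```

Using two distance notions gives the gradient signal both an angular and a magnitude component, which is one plausible reading of why it avoids premature convergence.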

Findings

To verify the feasibility and validity of the proposed algorithm, it is compared with state-of-the-art algorithms in terms of RMSE, recall rate and accuracy on the MovieLens English original movie dataset. The experimental results show that the proposed algorithm is superior to the conventional algorithms across all indicators.

Originality/value

When the proposed algorithm is applied to recommend English original movies, the DDPG policy produces better recommendation results and alleviates the impact of cold start and sparse data.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 13 no. 1
Type: Research Article
ISSN: 1756-378X

Keywords

Article
Publication date: 31 May 2013

Chao Guo, Huai‐Ning Wu, Biao Luo and Lei Guo

The air‐breathing hypersonic vehicle (AHV) includes intricate inherent coupling between the propulsion system and the airframe dynamics, which results in an intractable nonlinear…

Abstract

Purpose

The air‐breathing hypersonic vehicle (AHV) includes intricate inherent coupling between the propulsion system and the airframe dynamics, which results in an intractable nonlinear system for the controller design. The purpose of this paper is to propose an H∞ control method for AHV based on the online simultaneous policy update algorithm (SPUA).

Design/methodology/approach

Initially, the H∞ state feedback control problem of the AHV is converted to the problem of solving the Hamilton-Jacobi-Isaacs (HJI) equation, which is notoriously difficult to solve both numerically and analytically. To overcome this difficulty, the online SPUA is introduced to solve the HJI equation without requiring accurate knowledge of the internal system dynamics. Subsequently, the online SPUA is implemented on the basis of an actor-critic structure, in which a neural network (NN) is employed for approximating the cost function and a least-squares method is used to calculate the NN weight parameters.
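The abstract says NN weights are computed by least squares. For a critic that is linear in its features, that step reduces to an ordinary least-squares fit; the basis and the synthetic data below are invented for the demonstration and are not the paper's model:

```python
import numpy as np

def fit_critic_weights(states, targets, basis):
    """Least-squares fit of value-function weights w so that
    basis(s) @ w approximates the target cost at each sampled state."""
    phi = np.array([basis(s) for s in states])        # (N, k) feature matrix
    w, *_ = np.linalg.lstsq(phi, targets, rcond=None)
    return w

# toy quadratic basis for a scalar state: V(s) ~ w0*s^2 + w1
basis = lambda s: np.array([s**2, 1.0])
states = np.linspace(-2.0, 2.0, 21)
targets = 3.0 * states**2 + 0.5                       # synthetic cost values
w = fit_critic_weights(states, targets, basis)
```

In a policy-update scheme like SPUA, this fit would be repeated each iteration with targets produced by the current policy and disturbance estimates.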

Findings

Simulation study on the AHV demonstrates the effectiveness of the proposed H∞ control method.

Originality/value

The paper presents an interesting method for the H∞ state feedback control design problem of the AHV based on online SPUA.

Article
Publication date: 20 November 2009

Takashi Kuremoto, Masanao Obayashi and Kunikazu Kobayashi

The purpose of this paper is to present a neuro‐fuzzy system with a reinforcement learning algorithm (RL) for adaptive swarm behaviors acquisition. The basic idea is that each…

Abstract

Purpose

The purpose of this paper is to present a neuro‐fuzzy system with a reinforcement learning (RL) algorithm for adaptive swarm behavior acquisition. The basic idea is that each individual (agent) has the same internal model and the same learning procedure, and adaptive behaviors are acquired only through reward or punishment from the environment. The formation of the swarm is also designed by RL, e.g. the temporal difference (TD) error learning algorithm, and it may yield a faster exploration procedure compared with the case of individual learning.

Design/methodology/approach

The internal model of each individual comprises an input-state classification part realized by a fuzzy net, and an optimal behavior learning network that adopts an RL methodology known as the actor-critic method. The membership functions and fuzzy rules in the fuzzy net are adaptively formed online according to the changes in environment states observed during the agent's behavior trials. The weights of the connections between the fuzzy net and the actor, which provides a stochastic policy for action selection, and the critic, which provides an evaluation of state transitions, are modified by the TD error.
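The TD-error-driven weight modification described above can be sketched for the linear-approximation case; the feature map, learning rates and policy-gradient function below are illustrative stand-ins for the paper's fuzzy-net outputs:

```python
import numpy as np

def td_actor_critic_update(v, theta, s, a, r, s_next, phi, grad_log_pi,
                           gamma=0.99, alpha_v=0.1, alpha_pi=0.01):
    """One TD(0) actor-critic step with linear function approximation.

    v     : critic weights, so the value estimate is phi(s) @ v
    theta : actor (policy) parameters
    Both parameter vectors move along the same scalar TD error delta.
    """
    delta = r + gamma * (phi(s_next) @ v) - (phi(s) @ v)   # TD error
    v = v + alpha_v * delta * phi(s)                       # critic update
    theta = theta + alpha_pi * delta * grad_log_pi(s, a)   # actor update
    return v, theta, delta
```

In the paper's setting, phi would be the fuzzy net's rule activations, so one TD error simultaneously tunes the actor and critic connection weights, as the abstract describes.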

Findings

Simulation experiments of the proposed system on several goal-directed navigation problems are carried out, and the results show that swarms are successfully formed and that optimized routes are found faster by swarm learning than by individual learning.

Originality/value

Two techniques, i.e. fuzzy identification system and RL algorithm, are fused into an internal model of the individuals for swarm formation and adaptive behavior acquisition. The proposed model may be applied to multi‐agent systems, swarm robotics, metaheuristic optimization, and so on.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 2 no. 4
Type: Research Article
ISSN: 1756-378X

Keywords

Article
Publication date: 18 December 2023

Volodymyr Novykov, Christopher Bilson, Adrian Gepp, Geoff Harris and Bruce James Vanstone

Machine learning (ML), and deep learning in particular, is gaining traction across a myriad of real-life applications. Portfolio management is no exception. This paper provides a…

Abstract

Purpose

Machine learning (ML), and deep learning in particular, is gaining traction across a myriad of real-life applications. Portfolio management is no exception. This paper provides a systematic literature review of deep learning applications for portfolio management. The findings are likely to be valuable for industry practitioners and researchers alike, experimenting with novel portfolio management approaches and furthering investment management practice.

Design/methodology/approach

This review follows the guidance and methodology of Linnenluecke et al. (2020), Massaro et al. (2016) and Fisch and Block (2018) to first identify relevant literature based on an appropriately developed search phrase, filter the resultant set of publications and present descriptive and analytical findings of the research itself and its metadata.

Findings

The authors find a strong dominance of reinforcement learning algorithms applied to the field, given their through-time portfolio management capabilities. Other well-known deep learning models, such as the convolutional neural network (CNN) and the recurrent neural network (RNN) and its derivatives, have been shown to be well-suited for time-series forecasting. Most recently, the number of papers published in the field has been increasing, potentially driven by computational advances, hardware accessibility and data availability. The review shows several promising applications and identifies future research opportunities, including better balance on the risk-reward spectrum, novel ways to reduce data dimensionality and pre-process the inputs, stronger focus on direct weights generation, novel deep learning architectures and consistent data choices.

Originality/value

Several systematic reviews have been conducted with a broader focus of ML applications in finance. However, to the best of the authors’ knowledge, this is the first review to focus on deep learning architectures and their applications in the investment portfolio management problem. The review also presents a novel universal taxonomy of models used.

Details

Journal of Accounting Literature, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0737-4607

Keywords

Article
Publication date: 1 August 2002

Pawan Budhwar, Andy Crane, Annette Davies, Rick Delbridge, Tim Edwards, Mahmoud Ezzamel, Lloyd Harris, Emmanuel Ogbonna and Robyn Thomas

Wonders whether companies actually have employees’ best interests at heart across physical, mental and spiritual spheres. Posits that most organizations ignore their workforce …


Abstract

Wonders whether companies actually have employees’ best interests at heart across physical, mental and spiritual spheres. Posits that most organizations ignore their workforce – not even, in many cases, describing workers as assets! Describes many studies to back up this claim in this work, based on the 2002 Employment Research Unit Annual Conference in Cardiff, Wales.

Details

Management Research News, vol. 25 no. 8/9/10
Type: Research Article
ISSN: 0140-9174

Keywords
