Search results

1 – 10 of 102
Article
Publication date: 8 September 2022

Amir Hosein Keyhanipour and Farhad Oroumchian

Abstract

Purpose

User feedback inferred from the user's search-time behavior can improve learning to rank (L2R) algorithms. Click models (CMs) provide probabilistic frameworks for describing and predicting a user's clicks during search sessions. Most CMs are built on common assumptions such as Attractiveness, Examination and User Satisfaction. CMs usually treat Attractiveness and Examination as pre- and post-estimators of the actual relevance, and they assume that User Satisfaction is a function of the actual relevance. This paper extends the authors' previous work by building a reinforcement learning (RL) model to predict relevance. Attractiveness, Examination and User Satisfaction are estimated from a small number of features of the benchmark data sets and are then incorporated into the construction of an RL agent. The proposed RL model learns to predict the relevance label of a document with respect to a given query more effectively than the baseline RL models on those data sets.

Design/methodology/approach

In this paper, User Satisfaction is used as an indication of the relevance level of a query to a document. User Satisfaction itself is estimated through Attractiveness and Examination, which are in turn calculated by the random forest algorithm. In this process, only a small subset of top information retrieval (IR) features is used, selected on the basis of their mean average precision and normalized discounted cumulative gain values. Based on the authors' observations, the product of the Attractiveness and Examination values of a given query–document pair closely approximates the User Satisfaction and hence the relevance level. In addition, an RL model is designed in which the current state of the RL agent is determined by discretizing the estimated Attractiveness and Examination values, so that each query–document pair is mapped to a specific state. Guided by the reward function, the RL agent then tries to choose the action (relevance label) that maximizes the reward received in its current state. Using temporal difference (TD) learning algorithms such as Q-learning and SARSA, the learning agent gradually learns to identify an appropriate relevance label in each state. The reward given to the RL agent is proportional to the difference between the User Satisfaction and the selected action.
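For illustration only, the following is a minimal tabular Q-learning sketch of the state–action–reward design described above. The bin count, label range, learning rate and the exact reward scaling are assumptions made for this example, not values taken from the paper.

```python
import numpy as np

# Assumed for illustration: 10 bins per click-model feature, 5 relevance
# labels (0-4), and a reward that grows as the chosen label approaches the
# satisfaction estimate (Attractiveness * Examination).
N_BINS, N_LABELS = 10, 5
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = np.zeros((N_BINS, N_BINS, N_LABELS))

def state(attractiveness, examination):
    """Discretize the (Attractiveness, Examination) pair in [0, 1] to a cell."""
    a = min(int(attractiveness * N_BINS), N_BINS - 1)
    e = min(int(examination * N_BINS), N_BINS - 1)
    return a, e

def reward(attractiveness, examination, label):
    """Negative gap between the label and the scaled satisfaction estimate."""
    satisfaction = attractiveness * examination
    return -abs(satisfaction * (N_LABELS - 1) - label)

def q_learning_step(attr, exam, next_attr, next_exam):
    """Pick a relevance label epsilon-greedily, then apply one Q-learning update."""
    s = state(attr, exam)
    if np.random.rand() < EPSILON:
        label = np.random.randint(N_LABELS)
    else:
        label = int(np.argmax(Q[s]))
    r = reward(attr, exam, label)
    s_next = state(next_attr, next_exam)
    Q[s][label] += ALPHA * (r + GAMMA * Q[s_next].max() - Q[s][label])
    return label
```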

Findings

Experimental results on the MSLR-WEB10K and WCL2R benchmark data sets demonstrate that the proposed algorithm, named SeaRank, outperforms the baseline algorithms. The improvement is more noticeable in top-ranked results, which usually receive more attention from users.

Originality/value

This research provides a mapping from IR features to CM features and then uses these newly generated features to build an RL model, defined by its states, actions and reward function. By applying TD learning algorithms such as Q-learning and SARSA over several learning episodes, the RL agent learns to choose the most appropriate relevance label for a given query–document pair.

Details

Data Technologies and Applications, vol. 57 no. 4
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 28 March 2008

Daniel Lockery and James F. Peters

Abstract

Purpose

The purpose of this paper is to report upon research into developing a biologically inspired target‐tracking system (TTS) capable of acquiring quality images of a known target type for a robotic inspection application.

Design/methodology/approach

The design of the TTS hearkens back to the work on adaptive learning by Oliver Selfridge and Chris J.C.H. Watkins and to Zdzislaw Pawlak's 1980s work on the classification of objects, recast here as an approximation space-based form of feedback during learning. Also during the 1980s, Ewa Orlowska called attention to the importance of approximation spaces as a formal counterpart of perception. Orlowska's insight has been important in working toward a new form of adaptive learning useful in controlling the behaviour of machines to accomplish system goals. The adaptive learning algorithms presented in this paper are strictly temporal difference methods, including Q-learning, SARSA and the actor-critic method. Learning itself is episodic: during each episode, the equivalent of a Tinbergen-like ethogram is constructed, and this ethogram provides the basis for building an approximation space at the end of the episode. The combination of episodic ethograms and approximation spaces provides a highly effective means of feedback for guiding learning over the lifetime of a robotic system such as the TTS reported in this paper.
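As context for the temporal difference methods named above, here is a minimal tabular actor-critic update; the table sizes and step sizes are illustrative assumptions, and the paper's ethogram-based approximation-space feedback is not reproduced here.

```python
import numpy as np

# Illustrative sizes; in the paper, states would derive from episodic ethograms.
N_STATES, N_ACTIONS = 16, 4
ALPHA_V, ALPHA_PI, GAMMA = 0.1, 0.05, 0.95

V = np.zeros(N_STATES)                    # critic: state-value estimates
prefs = np.zeros((N_STATES, N_ACTIONS))   # actor: action preferences

def policy(s):
    """Softmax over the actor's preferences for state s."""
    p = np.exp(prefs[s] - prefs[s].max())
    return p / p.sum()

def act(s):
    """Sample an action from the current policy."""
    return int(np.random.choice(N_ACTIONS, p=policy(s)))

def actor_critic_update(s, a, r, s_next):
    """One temporal-difference update for both actor and critic."""
    td_error = r + GAMMA * V[s_next] - V[s]
    V[s] += ALPHA_V * td_error            # critic moves toward the TD target
    prefs[s, a] += ALPHA_PI * td_error    # actor reinforces positive-TD actions
```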

Findings

It was discovered that even though the adaptive learning methods were computationally more expensive than the classical algorithm implementations, they proved to be more effective in a number of cases, especially in noisy environments.

Originality/value

The novelty of this work is the introduction of an approach to adaptive learning carried out within the framework of ethology-based approximation spaces, which provide performance feedback during the learning process.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 1 no. 1
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 1 December 2022

Shaodong Li, Xiaogang Yuan and Hongjian Yu

Abstract

Purpose

This study aims to realize natural and effort-saving motion behavior and improve effectiveness for different operators in human–robot force cooperation.

Design/methodology/approach

The parameters of the admittance model are identified by deep deterministic policy gradient (DDPG) to realize human–robot force cooperation for different operators. The movement coupling problem of the hybrid robot is solved by realizing position and pose drags. In the DDPG formulation, a minimum-jerk trajectory is selected as the reward objective function, and variable prioritized experience replay is applied to balance exploration and exploitation.
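As a point of reference, the following is a minimal one-dimensional admittance law of the kind such a scheme would tune. The virtual mass and damping values, the sampling period and the absence of a stiffness term are assumptions for this sketch, not parameters from the paper.

```python
# One-dimensional admittance model: M * x_dd + B * x_d = F_human.
# M and B stand in for the parameters a DDPG agent would identify;
# the values here are placeholders.
M, B = 2.0, 15.0    # virtual mass [kg] and damping [N*s/m] (assumed)
DT = 0.001          # control period [s] (assumed)

def admittance_step(force, velocity, position):
    """Map the measured interaction force to a compliant motion reference."""
    acceleration = (force - B * velocity) / M
    velocity += acceleration * DT
    position += velocity * DT
    return velocity, position
```

A smaller virtual mass and damping make the robot easier to drag but more sensitive to force noise; identifying such parameters per operator is the gap the DDPG stage is meant to fill.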

Findings

A series of simulations validates the superiority and stability of DDPG. Furthermore, three sets of experiments involving the mass parameter, the damping parameter and DDPG were carried out; they validate the effect of DDPG in a real environment and show that it can meet the cooperation demands of different operators.

Originality/value

DDPG is applied to admittance model identification to realize human–robot force cooperation for different operators, and a minimum-jerk trajectory is introduced into the reward objective to meet the requirements of free human-arm movement. The algorithm proposed in this paper could be further extended to other operation tasks.

Details

Industrial Robot: the international journal of robotics research and application, vol. 50 no. 2
Type: Research Article
ISSN: 0143-991X

Article
Publication date: 7 May 2019

Adolfo Perrusquía, Wen Yu and Alberto Soria

Abstract

Purpose

Position/force control of a robot requires the parameters of the impedance model and generates the desired position from the contact force with the environment. When the environment is unknown, learning algorithms are needed to estimate both the desired force and the parameters of the impedance model.

Design/methodology/approach

In this paper, the authors use reinforcement learning to learn only the desired force, and then use proportional-integral-derivative (PID) admittance control to generate the desired position. Experimental results are presented to verify the approach.
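A hedged sketch of this simplified scheme: an RL stage supplies the desired force, and a PID admittance law converts the force-tracking error into a position reference. The gains, sampling period and one-dimensional setting are assumptions for illustration.

```python
class PIDAdmittance:
    """Turn the force-tracking error into a Cartesian position reference.

    f_desired would come from the RL stage; the gains are illustrative.
    """

    def __init__(self, kp=0.002, ki=0.0005, kd=0.0001, dt=0.001):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, f_desired, f_measured, x_current):
        error = f_desired - f_measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        # Correct the reference in Cartesian space; in the paper's scheme the
        # feedback controller avoids an explicit inverse-kinematics step.
        return x_current + (self.kp * error
                            + self.ki * self.integral
                            + self.kd * derivative)
```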

Findings

The position error is minimized without knowing the environment or the impedance parameters. Another advantage of this simplified position/force control is that the feedback control mechanism avoids the inverse-kinematics transformation from Cartesian space to joint space. The stability of the closed-loop system is proven.

Originality/value

The position error is minimized without knowing the environment or the impedance parameters. The stability of the closed-loop system is proven.

Details

Industrial Robot: the international journal of robotics research and application, vol. 46 no. 2
Type: Research Article
ISSN: 0143-991X

Article
Publication date: 7 November 2016

Amir Hosein Keyhanipour, Behzad Moshiri, Maryam Piroozmand, Farhad Oroumchian and Ali Moeini

Abstract

Purpose

Learning to rank algorithms inherently face many challenges, the most important being the high dimensionality of the training data, the dynamic nature of Web information resources and the lack of click-through data. High dimensionality of the training data affects the effectiveness and efficiency of learning algorithms. Moreover, most learning to rank benchmark data sets do not include click-through data, a very rich source of information about users' search behavior when dealing with ranked lists of search results. To address these limitations, this paper aims to introduce a novel learning to rank algorithm that uses a set of complex click-through features in a reinforcement learning (RL) model. These features are calculated from the existing click-through information in the data set, or even from data sets without any explicit click-through information.

Design/methodology/approach

The proposed ranking algorithm (QRC-Rank) applies RL techniques to a set of calculated click-through features. QRC-Rank is a two-step process. In the first step, the Transformation phase, a compact benchmark data set is created that contains a set of click-through features. These features are calculated from the original click-through information available in the data set and constitute a compact representation of that information. To find the most effective click-through features, a number of scenarios are investigated. The second phase is Model-Generation, in which an RL model is built to rank the documents. This model is created by applying temporal difference learning methods such as Q-learning and SARSA.
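The two TD methods named above differ only in their bootstrap target, as the following minimal sketch shows; the table sizes and step size are illustrative, not the paper's settings.

```python
import numpy as np

N_STATES, N_ACTIONS = 64, 5   # illustrative sizes
ALPHA, GAMMA = 0.1, 0.9
Q = np.zeros((N_STATES, N_ACTIONS))

def q_learning_update(s, a, r, s_next):
    """Off-policy: bootstrap from the greedy action in the next state."""
    Q[s, a] += ALPHA * (r + GAMMA * Q[s_next].max() - Q[s, a])

def sarsa_update(s, a, r, s_next, a_next):
    """On-policy: bootstrap from the action actually taken next."""
    Q[s, a] += ALPHA * (r + GAMMA * Q[s_next, a_next] - Q[s, a])
```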

Findings

The proposed learning to rank method, QRC-Rank, is evaluated on the WCL2R and LETOR4.0 data sets. Experimental results demonstrate that QRC-Rank outperforms state-of-the-art learning to rank methods such as SVMRank, RankBoost, ListNet and AdaRank on the precision and normalized discounted cumulative gain evaluation criteria. The use of click-through features calculated from the training data set is a major contributor to the performance of the system.

Originality/value

This paper demonstrates the viability of the proposed features, which provide a compact representation of the click-through data in a learning to rank application. These compact click-through features are calculated from the original features of the learning to rank benchmark data set. In addition, a Markov decision process model is proposed for the learning to rank problem using RL, including the sets of states and actions, the rewarding strategy and the transition function.

Details

International Journal of Web Information Systems, vol. 12 no. 4
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 25 March 2022

Fatemeh Yazdani, Mehdi Khashei and Seyed Reza Hejazi

Abstract

Purpose

This paper aims to detect the most profitable, i.e. optimal, turning points (TPs) from the history of a time series using a binary integer programming (BIP) model. The TP prediction problem is one of the most popular yet challenging topics in financial planning: predicting profitable TPs earns profit by offering the opportunity to buy low and sell high. TPs detected from the history of the time series are used as the prediction model's input and, according to the literature, the profitability of the predicted TPs depends on the profitability of the detected TPs. Research on improving the profitability of detection methods has therefore never stopped. Nevertheless, to the best of our knowledge, none of the existing methods can detect the optimal TPs.

Design/methodology/approach

The objective function of the model maximizes the profit of adopting all the trading strategies. The decision variables represent whether or not to detect each breakpoint as a TP. The assumptions of the model are as follows: short-selling is possible; the time value of money is not considered; and detecting consecutive buying (selling) TPs is not possible.
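For illustration, here is a small binary integer program in the spirit of that description, written with the open-source PuLP library. The price series and cost value are made up, and the paper's exact objective and constraints may differ; the no-consecutive-TPs rule is encoded here by bounding the net open position.

```python
from pulp import LpProblem, LpMaximize, LpVariable, lpSum

prices = [10.0, 12.5, 9.0, 14.0, 11.0, 16.0]  # hypothetical price series
cost = 0.1                                    # transaction cost (assumed form)
T = len(prices)

prob = LpProblem("optimal_turning_points", LpMaximize)
buy = [LpVariable(f"buy_{t}", cat="Binary") for t in range(T)]
sell = [LpVariable(f"sell_{t}", cat="Binary") for t in range(T)]

# Profit of all trades (short selling allowed) minus transaction costs.
prob += lpSum(prices[t] * (sell[t] - buy[t]) - cost * (buy[t] + sell[t])
              for t in range(T))

for t in range(T):
    prob += buy[t] + sell[t] <= 1  # a point is at most one kind of TP
    # No consecutive TPs of the same kind: the net open position after any
    # prefix stays between -1 (short) and +1 (long).
    prob += lpSum(buy[u] - sell[u] for u in range(t + 1)) <= 1
    prob += lpSum(buy[u] - sell[u] for u in range(t + 1)) >= -1
prob += lpSum(buy) == lpSum(sell)  # every opened position is closed

prob.solve()
tps = [(t, "buy" if buy[t].value() else "sell")
       for t in range(T) if buy[t].value() or sell[t].value()]
```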

Findings

Empirical results on 20 data sets from the Shanghai Stock Exchange indicate that the model detects the optimal TPs.

Originality/value

The proposed model, in contrast to the other methods, can detect the optimal TPs. Additionally, unlike the other methods, it requires the transaction cost as its only input parameter, which reduces the calculations the process requires.

Details

Journal of Modelling in Management, vol. 18 no. 5
Type: Research Article
ISSN: 1746-5664

Article
Publication date: 28 March 2019

Raed S. Alsawaier

Abstract

Purpose

The purpose of this paper is to examine the research design of several publications on the study of gamification and to propose a mixed-method research design for creating a holistic understanding of the gamification phenomenon. It presents an argument for combining qualitative and quantitative data sources through a mixed-method design, treating both as equally important in illuminating all aspects of the research problem.

Design/methodology/approach

The paper covers a number of methodological themes relevant to the study of gamification: research design trends in the study of gamification, the importance of mixed-method design in its study, methodological challenges, and conclusions and recommendations.

Findings

The majority of studies on gamification before 2015 are either quantitative or described as mixed-method yet overly focused on quantitative data sources. Between 2015 and 2017, however, there is a tendency to adopt a mixed-method design.

Research limitations/implications

The study does not examine all research on the topic of gamification but relies on 56 empirical studies, published between 2009 and 2015, reviewed by Hamari, Koivisto and Sarsa (2014) and by Seaborn and Fels (2015).

Originality/value

The author believes this to be one of the few studies of its kind to propose a methodological design for the study of gamification as a pedagogical tool.

Details

The International Journal of Information and Learning Technology, vol. 36 no. 5
Type: Research Article
ISSN: 2056-4880

Book part
Publication date: 9 August 2017

Milou Habraken and Tanya Bondarouk

Abstract

Purpose

This chapter aims to encourage and guide smart industry HRM-related research by addressing upcoming challenges developed using a job design lens.

Methodology/approach

The challenges are constructed from an overview of the existing body of work on job design and a description of smart industry.

Research implications

The challenges are meant as an indication of the issues that arise within job design due to smart industry and, in so doing, suggest directions for future research in this specific field. Additionally, through laying out challenges for this particular example, the chapter encourages scholars to consider the possible impact of smart industry within other HRM areas.

Details

Electronic HRM in the Smart Era
Type: Book
ISBN: 978-1-78714-315-9

Details

The Future of Recruitment
Type: Book
ISBN: 978-1-83867-562-2

Article
Publication date: 13 January 2022

Zheng Fang and Xifeng Liang

Abstract

Purpose

The results of obstacle avoidance path planning for a manipulator using the artificial potential field (APF) method contain a large number of path nodes, which reduces the efficiency of the manipulator. This paper aims to propose a new intelligent obstacle avoidance path planning method for picking robots that improves their efficiency.

Design/methodology/approach

To improve the efficiency of the robot, this paper proposes a new intelligent obstacle avoidance path planning method for picking robots. The method presents a snake-tongue algorithm based on a slope-type potential field and combines it with a genetic algorithm (GA) and reinforcement learning (RL) to reduce the path length and the number of path nodes in the planning results.
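For context, a single step of classical APF gradient descent looks as follows; the slope-type potential and the snake-tongue search are the paper's own contributions and are not reproduced here, and the gains and influence range are illustrative.

```python
import numpy as np

K_ATT, K_REP, RHO0, STEP = 1.0, 0.5, 1.0, 0.05  # illustrative gains

def apf_step(q, goal, obstacles):
    """One gradient step on the classical attractive + repulsive field."""
    force = K_ATT * (goal - q)                  # attractive pull toward goal
    for obs in obstacles:
        diff = q - obs
        rho = np.linalg.norm(diff)
        if rho < RHO0:                          # repulsion only inside range
            force += K_REP * (1 / rho - 1 / RHO0) / rho**2 * (diff / rho)
    norm = np.linalg.norm(force)
    return q if norm < 1e-9 else q + STEP * force / norm
```

Local minima of this field are one source of the long, node-heavy paths the paper targets; in the proposed method the GA strengthens the path search and RL prunes the resulting nodes.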

Findings

Simulation experiments were conducted with a tomato-string picking manipulator. The results showed that, after the APF method was combined with GA and RL, the path length was reduced from 4.1 to 2.979 m, the number of nodes was reduced from 31 to 3 and the working time of the robot was reduced from 87.35 to 37.12 s.

Originality/value

This paper proposes a new improved APF method and combines it with GA and RL. The experimental results show that the new intelligent obstacle avoidance path planning method is beneficial for improving the efficiency of the robotic arm.

Graphical abstract

Figure 1. According to the principles of bionics, we propose a new path search method, the snake-tongue algorithm, based on a slope-type potential field. A genetic algorithm strengthens the path-searching ability of the artificial potential field method, so that it can complete the search in a variety of complex obstacle distributions with shorter results, and reinforcement learning reduces the number of path nodes, which improves the efficiency of the robot's work. The use of the genetic algorithm and reinforcement learning lays the foundation for intelligent control.

Details

Industrial Robot: the international journal of robotics research and application, vol. 49 no. 5
Type: Research Article
ISSN: 0143-991X
