Search results
Hongfang Zhou, Xiqian Wang and Yao Zhang
Abstract
Feature selection is an essential step in data mining. Its core is to analyze and quantify the relevancy and redundancy between the features and the classes. The CFR feature selection method rarely considers which feature to choose when two or more features receive the same value under the evaluation criterion. To address this problem, the standard deviation is employed to adjust the relative importance of relevancy and redundancy. Based on this idea, a novel feature selection method named Feature Selection Based on Weighted Conditional Mutual Information (WCFR) is introduced. Experimental results on ten datasets show that the proposed method achieves higher classification accuracy.
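The abstract does not state the WCFR criterion itself, so the following is only a minimal sketch of the two quantities that such methods build on: empirical mutual information and conditional mutual information for discrete features. The function names are hypothetical, and the standard-deviation weighting described above is deliberately not reproduced, since its exact form is not given here.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Empirical I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(x)
    joint = Counter(zip(x, y))
    px = Counter(x)
    py = Counter(y)
    # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


def conditional_mutual_information(x, y, z):
    """Empirical I(X;Y|Z) = sum over z of p(z) * I(X;Y | Z=z), in bits."""
    n = len(z)
    total = 0.0
    for value, count in Counter(z).items():
        idx = [i for i in range(n) if z[i] == value]
        total += (count / n) * mutual_information(
            [x[i] for i in idx], [y[i] for i in idx]
        )
    return total
```

In this reading, `mutual_information(feature, labels)` would measure relevancy, while `conditional_mutual_information(candidate, labels, selected)` would measure what a candidate feature still tells us about the class once an already selected feature is known.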
Kai Zheng, Xianjun Yang, Yilei Wang, Yingjie Wu and Xianghan Zheng
Abstract
Purpose
The purpose of this paper is to alleviate the problem of poor robustness and over-fitting caused by large-scale data in collaborative filtering recommendation algorithms.
Design/methodology/approach
Interpreting user behavior from the probabilistic perspective of hidden variables helps to improve robustness and reduce over-fitting. Constructing a recommendation network by variational inference can effectively handle the complex distribution calculations in the probabilistic recommendation model. Based on this analysis, this paper uses a variational auto-encoder to construct a generative network that reconstructs user-rating data, addressing the poor robustness and over-fitting caused by large-scale data. Meanwhile, to counter the KL-vanishing problem that arises in deep learning models based on variational inference, this paper optimizes the model with the KL annealing and Free Bits methods.
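The two remedies for KL vanishing named above can be sketched as follows. This is an illustrative sketch only: the linear warm-up schedule, the per-dimension Free Bits threshold, the diagonal Gaussian posterior with a standard normal prior, and all function and parameter names are assumptions, not details taken from the paper.

```python
from math import exp


def kl_annealing_weight(step, warmup_steps, max_weight=1.0):
    """Linearly ramp the KL weight from 0 to max_weight over warmup_steps."""
    if warmup_steps <= 0:
        return max_weight
    return max_weight * min(1.0, step / warmup_steps)


def gaussian_kl_per_dim(mu, logvar):
    """KL(N(mu, exp(logvar)) || N(0, 1)) for each latent dimension."""
    return [0.5 * (m * m + exp(lv) - lv - 1.0) for m, lv in zip(mu, logvar)]


def free_bits_kl(kl_per_dim, free_bits):
    """Clamp each dimension's KL from below at free_bits, then sum.

    Dimensions whose KL is already under the threshold contribute a constant,
    so the optimizer gains nothing by pushing them further toward zero.
    """
    return sum(max(kl, free_bits) for kl in kl_per_dim)


def vae_loss(recon_loss, mu, logvar, step, warmup_steps=1000, free_bits=0.0):
    """Reconstruction loss plus an annealed, optionally floored, KL term."""
    kl = free_bits_kl(gaussian_kl_per_dim(mu, logvar), free_bits)
    return recon_loss + kl_annealing_weight(step, warmup_steps) * kl
```

Setting `free_bits=0.0` gives plain KL annealing, and setting `warmup_steps=0` gives plain Free Bits, so the same sketch covers either method used on its own.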
Findings
The basic model improves considerably once the KL annealing or Free Bits method is used to address KL vanishing. The proposed models clearly perform worse than competing methods on small data sets such as MovieLens 1M. By contrast, they perform better on large data sets such as MovieLens 10M and MovieLens 20M.
Originality/value
This paper presents the use of a variational inference model for collaborative filtering recommendation and introduces the KL annealing and Free Bits methods to improve the performance of the basic model. Because training with variational inference represents the hidden vector as a probability distribution, the problem of poor robustness and over-fitting is alleviated. When the amount of data in an actual application scenario is relatively large, the probability distribution fitted to the actual data can better represent the user and the item. Therefore, using variational inference for collaborative filtering recommendation is of practical value.