Multiple classifier systems have been used widely in computing, communications, and informatics. Combining multiple classifier systems (MCS) has been shown to outperform a single classifier system. It has been demonstrated that improvement in ensemble performance depends on either the diversity among or the performance of individual systems. A variety of diversity measures and ensemble methods have been proposed and studied. However, it remains a challenging problem to estimate the ensemble performance in terms of the performance of and the diversity among individual systems. The purpose of this paper is to study the general problem of estimating ensemble performance for various combination methods using the concept of a performance distribution pattern (PDP).
In particular, the paper establishes upper and lower bounds for three cases: majority voting ensemble performance in terms of the disagreement diversity measure Dis, weighted majority voting performance in terms of weighted average performance and weighted disagreement diversity, and plurality voting ensemble performance in terms of the entropy diversity measure D.
Bounds for these three cases are shown to be tight using the PDP for the input set.
As a consequence of the authors' previous results on diversity equivalence, the results for majority voting ensemble performance can be extended to several other diversity measures. Moreover, for majority voting, the paper shows that when the average individual system performance P is sufficiently large, the ensemble performance Pm resulting from a maximum (information-theoretic) entropy PDP is an increasing function of the disagreement diversity Dis. Eight experiments using data sets from various application domains are conducted to demonstrate the complexity, richness, and diversity of the problem of estimating ensemble performance.
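To make the three quantities in this abstract concrete, the following minimal sketch computes an average individual performance P, a pairwise disagreement diversity Dis, and a majority voting performance Pm from a toy "oracle" matrix (1 = classifier correct on that sample, 0 = incorrect). It assumes the commonly used pairwise disagreement definition and treats the ensemble as correct when more than half of the classifiers are correct; the paper's exact definitions and its bounds are not reproduced here, and all function names and data are hypothetical.

```python
from itertools import combinations


def average_performance(oracle):
    """Mean accuracy of the individual classifiers (P)."""
    return sum(sum(row) / len(row) for row in oracle) / len(oracle)


def disagreement(oracle):
    """Average pairwise disagreement (Dis): for each pair of classifiers,
    the fraction of samples on which exactly one of the two is correct.
    Assumed definition; the paper may normalise differently."""
    pairs = list(combinations(oracle, 2))
    total = 0.0
    for a, b in pairs:
        total += sum(x != y for x, y in zip(a, b)) / len(a)
    return total / len(pairs)


def majority_vote_performance(oracle):
    """Fraction of samples on which a strict majority of classifiers
    is correct (Pm), under the oracle-output assumption."""
    n_classifiers = len(oracle)
    n_samples = len(oracle[0])
    wins = 0
    for j in range(n_samples):
        correct_votes = sum(row[j] for row in oracle)
        if 2 * correct_votes > n_classifiers:
            wins += 1
    return wins / n_samples


# Toy data: 3 classifiers (rows) evaluated on 4 samples (columns).
oracle = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 0],
]
P = average_performance(oracle)
Dis = disagreement(oracle)
Pm = majority_vote_performance(oracle)
print(P, Dis, Pm)
```

On this toy matrix each classifier is correct on half of the samples (P = 0.5), every pair disagrees on half of them (Dis = 0.5), and the majority vote is correct on three of four samples (Pm = 0.75), which illustrates how errors spread across different samples can lift the ensemble above the individual average.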
Telecommunication (telecom) fraud is one of the most common crimes and causes the greatest financial losses. To effectively eradicate fraud groups, the key fraudsters must be identified and captured. One strategy is to analyze the fraud interaction network using social network analysis. However, the underlying structures of fraud networks are different from those of common social networks, which makes traditional indicators such as centrality not directly applicable. Recently, a new line of research called deep random walk has emerged. These methods utilize random walks to explore local information and then apply deep learning algorithms to learn the representative feature vectors. Although effective for many types of networks, random walk is used for discovering local structural equivalence and does not consider the global properties of nodes.
The authors proposed a new method to combine the merits of deep random walk and social network analysis, which is called centrality-guided deep random walk. By using the centrality of nodes as edge weights, the authors’ biased random walks implicitly consider the global importance of nodes and can thus find key fraudster roles more accurately. To evaluate the authors’ algorithm, a real telecom fraud data set with around 562 fraudsters was built, which is the largest telecom fraud network to date.
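The idea of biasing a random walk by node centrality can be sketched as follows. This is only an illustration under stated assumptions: it uses degree centrality, and it picks the next node with probability proportional to that neighbour's centrality. The abstract does not specify which centrality index or weighting rule the authors use, the graph and function names are hypothetical, and the subsequent embedding step that learns feature vectors from the walks is omitted.

```python
import random


def degree_centrality(graph):
    """Degree centrality: number of neighbours divided by (n - 1).
    Assumed choice of centrality for illustration only."""
    n = len(graph)
    return {node: len(nbrs) / max(n - 1, 1) for node, nbrs in graph.items()}


def centrality_guided_walk(graph, centrality, start, length, rng):
    """Random walk in which the next node is drawn from the current
    node's neighbours with probability proportional to its centrality."""
    walk = [start]
    while len(walk) < length:
        neighbors = sorted(graph[walk[-1]])  # sorted for reproducibility
        if not neighbors:
            break  # dead end: stop the walk early
        weights = [centrality[n] for n in neighbors]
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Hypothetical undirected toy network: node -> set of neighbours.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a", "d"},
    "d": {"a", "c"},
}
centrality = degree_centrality(graph)
walk = centrality_guided_walk(graph, centrality, "b", 6, random.Random(0))
print(walk)
```

Because highly central nodes receive larger weights, walks of this kind visit them more often than an unbiased walk would, which is one way global importance can enter an otherwise local exploration.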
The authors’ proposed method achieved better results than traditional centrality indices and various deep random walk algorithms and successfully identified key roles in a fraud network.
The study used co-offending and flight records to construct the criminal network. Further interpersonal relationships among fraudsters, such as friendships and family ties, can be included in future work.
This paper proposed a novel algorithm, centrality-guided deep random walk, and applied it to a new telecom fraud data set. Experimental results show that the authors’ method can successfully identify the key roles in a fraud group and outperforms other baseline methods. To the best of the authors’ knowledge, this is the largest analysis of a telecom fraud network to date.