Search results
1 – 10 of 970
Abstract
All the modified Nearest Neighbour methods of pattern classification [2–6] developed to reduce the amount of computer storage and time needed for the implementation of a NN classifier require prohibitively costly data preprocessing, which involves detailed examination of the points neighbouring the elements of the reference set. In this paper a method for determining the k-nearest neighbours to a given point is described. The method uses the computationally efficient city block distance to select candidate points for the set of k-nearest neighbours. In this way the preprocessing time is considerably reduced.
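A hedged sketch of the candidate-selection idea follows: rank points by the cheap city-block (L1) distance, keep a shortlist, then rank the shortlist by Euclidean distance. All function names and data are illustrative, and with a small shortlist this sketch is approximate (L1 and L2 orderings can differ), whereas the paper's selection is designed to retain the true k-nearest neighbours.

```python
import numpy as np

def knn_city_block_prefilter(points, query, k, n_candidates):
    """Shortlist n_candidates points by city-block (L1) distance,
    then return the indices of the k nearest by Euclidean distance."""
    l1 = np.abs(points - query).sum(axis=1)          # cheap: no squares, no sqrt
    cand = np.argsort(l1)[:n_candidates]             # candidate set
    l2 = ((points[cand] - query) ** 2).sum(axis=1)   # squared L2 suffices for ranking
    return cand[np.argsort(l2)[:k]]

rng = np.random.default_rng(0)
pts = rng.random((1000, 2))
q = np.array([0.5, 0.5])
neighbours = knn_city_block_prefilter(pts, q, k=5, n_candidates=50)
```

Only the shortlist is examined with the costlier metric, which is where the preprocessing savings come from.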
Abstract
Purpose
With the ever-increasing volume of text data on the internet, it is important that documents are classified into manageable and easy-to-understand categories. This paper proposes the use of binary k-nearest neighbour (BKNN) for text categorization.
Design/methodology/approach
The paper describes the traditional k-nearest neighbor (KNN) classifier, introduces BKNN and outlines experimental results.
Findings
The experimental results indicate that BKNN requires much less CPU time than KNN, without loss of classification performance.
Originality/value
The paper demonstrates how BKNN can be an efficient and effective algorithm for text categorization.
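The paper's exact BKNN formulation is not reproduced in this abstract; as a rough illustration of k-NN over binary term-presence vectors, one might store each document as a bitset and measure Hamming distance with a popcount of the XOR. All names and the toy data below are assumptions.

```python
def hamming(a, b):
    return bin(a ^ b).count("1")   # popcount of XOR = Hamming distance

def bknn_predict(train, labels, query, k):
    """Majority vote among the k training documents nearest in Hamming distance."""
    order = sorted(range(len(train)), key=lambda i: hamming(train[i], query))[:k]
    votes = {}
    for i in order:
        votes[labels[i]] = votes.get(labels[i], 0) + 1
    return max(votes, key=votes.get)

docs   = [0b1110, 0b1100, 0b0011, 0b0111]   # toy term-presence bitsets
labels = ["sports", "sports", "politics", "politics"]
```

Bitwise operations on packed vectors are one plausible source of the CPU-time savings the findings report.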
Dominique Guégan and Patrick Rakotomarolahy
Abstract
Purpose – The purpose of this chapter is twofold: to forecast gross domestic product (GDP) using a nonparametric method, known as the multivariate k-nearest neighbors method, and to provide asymptotic properties for this method.
Methodology/approach – We consider monthly and quarterly macroeconomic variables and, to match the quarterly GDP, we estimate the missing monthly economic variables using the multivariate k-nearest neighbors method and parametric vector autoregressive (VAR) modeling. Then, linking these monthly macroeconomic variables through the use of bridge equations, we can produce nowcasts and forecasts of GDP.
Findings – Using the multivariate k-nearest neighbors method, we provide a forecast of the euro area monthly economic indicator and quarterly GDP that is better than that obtained with a competing linear VAR model. We also provide the asymptotic normality of this k-nearest neighbors regression estimator for dependent time series, yielding a confidence interval for point forecasts in time series.
Originality/value of chapter – We provide a new theoretical result for a nonparametric method and propose a novel methodology for forecasting using macroeconomic data.
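The nearest-neighbour forecasting idea can be illustrated with a minimal univariate sketch: find the k historical patterns closest to the current one and average their successors. The chapter's method is multivariate and uses bridge equations; the window length, names and data below are assumptions.

```python
import numpy as np

def knn_forecast(series, window, k):
    """Forecast the next value as the mean of the successors of the k
    historical length-`window` patterns nearest to the current pattern."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = np.array([series[i + window] for i in range(len(series) - window)])
    pattern = series[-window:]
    nearest = np.argsort(np.linalg.norm(X - pattern, axis=1))[:k]
    return y[nearest].mean()
```

On a perfectly periodic toy series, the method reproduces the next value exactly; on real macroeconomic data it acts as a local, assumption-light regression.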
Abstract
Purpose
This study aims to propose and evaluate a searching scheme for a bichromatic reverse k-nearest neighbor (BRkNN) query that has objects and queries in spatial networks. In this proposed scheme, the author searches for the BRkNN of the query using an influence zone for each object with a network Voronoi diagram (NVD).
Design/methodology/approach
The author analyzes and evaluates the performance of the proposed searching scheme.
Findings
The contribution of this paper is confirmation that the proposed searching scheme achieves a shorter processing time than a conventional linear search.
Research limitations/implications
A future direction of this study will involve making a searching scheme that reduces the processing time when objects move automatically on spatial networks.
Practical implications
In BRkNN, consider two groups of convenience stores, Groups A and B, operating in a given region. The author can effectively use RkNN with k = 1 (RNN) to site a new store, considering the Euclidean and road distances among stores and the location relationship between Groups A and B.
Originality/value
In the proposed searching scheme, the author searches for the BRkNN of the query for each object with an NVD using the influence zone, which is the region where an object in the spatial network recognizes the nearest neighbor for the query.
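For intuition only, a naive BRkNN check in Euclidean space is sketched below: an object belongs to the answer set if the new query site would rank among its k nearest sites. The paper's network-distance and NVD/influence-zone machinery, which is what makes the search efficient, is omitted; names and data are illustrative.

```python
import numpy as np

def brknn(objects, sites, q, k):
    """Indices of objects for which the new site q would rank among
    their k nearest sites (strictly closer than the current k-th)."""
    result = []
    for i, o in enumerate(objects):
        d_sites = np.sort(np.linalg.norm(sites - o, axis=1))
        kth = d_sites[min(k, len(d_sites)) - 1]
        if np.linalg.norm(q - o) < kth:
            result.append(i)
    return result

objects = np.array([[0.0, 0.0], [10.0, 0.0]])   # e.g. customers (Group A)
sites   = np.array([[1.0, 0.0], [20.0, 0.0]])   # e.g. existing stores (Group B)
```

This linear scan is exactly the baseline the paper's scheme is reported to beat.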
Wei Zhang, Xianghong Hua, Kegen Yu, Weining Qiu, Shoujian Zhang and Xiaoxing He
Abstract
Purpose
This paper aims to introduce the weighted squared Euclidean distance between points in signal space to improve the performance of Wi-Fi indoor positioning. Nowadays, received signal strength-based Wi-Fi indoor positioning, a low-cost indoor positioning approach, has attracted significant attention from both academia and industry.
Design/methodology/approach
The local principal gradient direction is introduced and used to define the weighting function, and an averaging algorithm based on the k-means algorithm is used to estimate the local principal gradient direction of each access point. Then, correlation distance is used in the new method to find the k nearest calibration points. The weighted squared Euclidean distance between each nearest calibration point and the target point is calculated and used to estimate the position of the target point.
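A minimal sketch of the matching step follows, assuming simple per-access-point weights in place of the paper's gradient-direction-based weighting function; all names and values are illustrative.

```python
import numpy as np

def estimate_position(fingerprints, positions, rss, weights, k):
    """Rank calibration points by weighted squared Euclidean distance in RSS
    space, then average the k nearest positions with inverse-distance weights."""
    d2 = (weights * (fingerprints - rss) ** 2).sum(axis=1)
    nearest = np.argsort(d2)[:k]
    w = 1.0 / (d2[nearest] + 1e-9)          # avoid division by zero on exact match
    return (positions[nearest] * w[:, None]).sum(axis=0) / w.sum()

fingerprints = np.array([[-50.0, -60.0], [-70.0, -80.0], [-60.0, -70.0]])  # dBm per AP
positions    = np.array([[0.0, 0.0], [10.0, 10.0], [5.0, 5.0]])            # metres
```

The weights let unreliable access points contribute less to the distance, which is the intuition behind weighting by signal-space geometry.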
Findings
Experiments are conducted and the results indicate that the proposed Wi-Fi indoor positioning approach considerably outperforms the weighted k nearest neighbor method. The new method also outperforms support vector regression and extreme learning machine algorithms in the absence of sufficient fingerprints.
Research limitations/implications
The weighted k nearest neighbor approach, the support vector regression algorithm and the extreme learning machine algorithm are the three classic strategies for location determination using Wi-Fi fingerprinting. However, weighted k nearest neighbor suffers from dramatic performance degradation in the presence of multipath signal attenuation and environmental changes. More fingerprints are required for the support vector regression algorithm to ensure the desired performance, and labeling Wi-Fi fingerprints is labor-intensive. The performance of the extreme learning machine algorithm may not be stable.
Practical implications
The new weighted squared Euclidean distance-based Wi-Fi indoor positioning strategy can improve the performance of Wi-Fi indoor positioning system.
Social implications
The received signal strength-based Wi-Fi indoor positioning system can substitute for the global positioning system, which does not work indoors. This effective and low-cost positioning approach is promising for many indoor location-based services.
Originality/value
A novel Wi-Fi indoor positioning strategy based on the weighted squared Euclidean distance is proposed in this paper to improve the performance of Wi-Fi indoor positioning, and the local principal gradient direction is introduced and used to define the weighting function.
Abstract
Purpose
Cracks on surfaces are often identified as one of the early indications of damage and possible future catastrophic structural failure. Thus, the detection of cracks is vital for timely inspection, health diagnosis and maintenance of infrastructure. However, conventional visual inspection-based methods are criticized for being subjective, greatly affected by the inspector's expertise, labor-intensive and time-consuming.
Design/methodology/approach
This paper proposes a novel self-adaptive method for automated, semantic crack detection and recognition in various infrastructures using computer vision technologies. The developed method is built on three main models structured to circumvent the shortcomings of visual inspection in the detection of cracks in walls, pavement and decks. The first model deploys a modified visual geometry group network (VGG19) for the extraction of global contextual and local deep-learning features, in an attempt to alleviate the drawbacks of hand-crafted features. The second model is conceptualized as the integration of the K-nearest neighbors (KNN) algorithm and the differential evolution (DE) algorithm for automated optimization of its structure. The third model is designated for validating the developed method through an extensive four-layer performance evaluation and statistical comparisons.
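The KNN+DE pairing in the second model can be sketched, under heavy assumptions, as differential evolution searching for a good neighbourhood size k on a toy dataset. The VGG19 feature-extraction stage is omitted, and every name, setting and data value below is illustrative rather than the paper's.

```python
import numpy as np

def knn_accuracy(Xtr, ytr, Xva, yva, k):
    """Validation accuracy of a plain majority-vote KNN classifier."""
    correct = 0
    for x, y in zip(Xva, yva):
        nearest = np.argsort(np.linalg.norm(Xtr - x, axis=1))[:k]
        correct += int(np.bincount(ytr[nearest]).argmax() == y)
    return correct / len(yva)

def de_tune_k(Xtr, ytr, Xva, yva, k_max=15, pop=8, gens=10, F=0.8, CR=0.9, seed=0):
    """Differential evolution over a real-coded k, decoded by rounding."""
    rng = np.random.default_rng(seed)
    P = rng.uniform(1, k_max, size=pop)
    fit = [knn_accuracy(Xtr, ytr, Xva, yva, int(round(p))) for p in P]
    for _ in range(gens):
        for i in range(pop):
            a, b, c = rng.choice([p for j, p in enumerate(P) if j != i], 3, replace=False)
            trial = P[i] if rng.random() > CR else float(np.clip(a + F * (b - c), 1, k_max))
            f = knn_accuracy(Xtr, ytr, Xva, yva, int(round(trial)))
            if f >= fit[i]:                       # greedy DE selection
                P[i], fit[i] = trial, f
    best = int(np.argmax(fit))
    return int(round(P[best])), fit[best]

rng = np.random.default_rng(1)
Xtr = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
ytr = np.array([0] * 20 + [1] * 20)
Xva = np.vstack([rng.normal(0, 0.3, (10, 2)), rng.normal(5, 0.3, (10, 2))])
yva = np.array([0] * 10 + [1] * 10)
best_k, best_acc = de_tune_k(Xtr, ytr, Xva, yva)
```

DE's mutation-crossover-selection loop needs only fitness evaluations, which is why it suits hyperparameter search where no gradient exists.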
Findings
It was observed that the developed method significantly outperformed other crack detection and recognition models. For instance, the developed wall crack detection method accomplished an overall accuracy, F-measure, Kappa coefficient, area under the curve, balanced accuracy, Matthews correlation coefficient and Youden's index of 99.62%, 99.16%, 0.998, 0.998, 99.17%, 0.989 and 0.983, respectively.
Originality/value
The literature lacks an efficient method that can address crack detection and recognition across an ensemble of infrastructures. Furthermore, there is an absence of systematic and detailed comparisons between crack detection and recognition models.
Teddy Mantoro, Akeem Olowolayemo, Sunday O. Olatunji, Media A. Ayu and Abu Osman Md Tap
Abstract
Purpose
Prediction accuracies are usually affected by the techniques and devices used as well as the algorithms applied. This work aims to devise better positioning accuracy based on location fingerprinting, taking advantage of two important mobile fingerprints, namely signal strength (SS) and signal quality (SQ), and subsequently building a model based on the extreme learning machine (ELM), a new learning algorithm for single-hidden-layer neural networks.
Design/methodology/approach
Prediction approaches to location determination based on historical data have attracted a lot of attention in recent studies, the reason being that they offer the convenience of using previously accumulated location data to determine locations with predictive algorithms. There have been various approaches to location positioning that aim to further improve the accuracy of mobile user location determination. This work examines location determination techniques by attempting to determine the location of mobile users using SS and SQ history data and modeling the locations with the ELM algorithm. The empirical results show that the proposed model based on the ELM algorithm noticeably outperforms k-Nearest Neighbor approaches.
Findings
Wi-Fi SS contributes more to the accuracy of user location prediction than Wi-Fi SQ. Moreover, the new ELM-based framework has been compared with the k-Nearest Neighbor approach, and the results show that the proposed model based on the extreme learning algorithm outperforms the k-Nearest Neighbor approach.
Originality/value
A new computational intelligence modeling scheme based on the ELM has been investigated, developed and implemented as an efficient and more accurate predictive solution for determining the position of mobile users based on location fingerprint data (SS and SQ).
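A minimal ELM sketch follows, assuming a tanh hidden layer whose weights are random and never trained, with only the output layer solved by least squares; the fingerprint inputs (SS, SQ) are replaced by a toy regression target, and all names are illustrative.

```python
import numpy as np

def elm_fit(X, y, n_hidden, seed=0):
    """Random, untrained hidden layer; output weights by least squares."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # fixed random input weights
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                        # hidden activations
    beta = np.linalg.pinv(H) @ y                  # the only "trained" parameters
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (200, 2))
y = X[:, 0] + X[:, 1]                             # toy target standing in for position
W, b, beta = elm_fit(X, y, n_hidden=50)
pred = elm_predict(X, W, b, beta)
```

Because training reduces to one pseudoinverse, ELM fits far faster than iteratively trained networks, which is part of its appeal for fingerprint models.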
Farshid Abdi, Kaveh Khalili-Damghani and Shaghayegh Abolmakarem
Abstract
Purpose
The customer insurance coverage sales plan problem, in which loyal customers are recognized and offered special plans, is an essential problem facing insurance companies. Loyal customers who have enough potential to renew their insurance contracts at the end of the contract term should be persuaded to repurchase or renew their contracts. The aim of this paper is to propose a three-stage data-mining approach to recognize high-potential loyal insurance customers and to predict/plan special insurance coverage sales.
Design/methodology/approach
The first stage addresses data cleansing. In the second stage, several filter and wrapper methods are implemented to select proper features. In the third stage, K-nearest neighbor algorithm is used to cluster the customers. The approach aims to select a compact feature subset with the maximal prediction capability. The proposed approach can detect the customers who are more likely to buy a specific insurance coverage at the end of a contract term.
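The second stage's wrapper selection might look like the following greedy sketch, scored by leave-one-out 1-NN accuracy; the scorer, stopping rule and toy data are assumptions for illustration, not the paper's actual filters and wrappers.

```python
import numpy as np

def loo_1nn_accuracy(X, y):
    """Leave-one-out accuracy of a 1-NN classifier (the wrapper's scorer)."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)             # a point may not vote for itself
    return float((y[D.argmin(axis=1)] == y).mean())

def forward_select(X, y, n_features):
    """Greedily add the feature giving the largest wrapper-accuracy gain."""
    chosen = []
    while len(chosen) < n_features:
        scores = {f: loo_1nn_accuracy(X[:, chosen + [f]], y)
                  for f in range(X.shape[1]) if f not in chosen}
        chosen.append(max(scores, key=scores.get))
    return chosen

rng = np.random.default_rng(0)
informative = np.where(rng.random(60) < 0.5, -1.0, 1.0)
y = (informative > 0).astype(int)
X = np.column_stack([informative + 0.05 * rng.normal(size=60),
                     rng.normal(size=60)])   # second feature is pure noise
```

A wrapper evaluates feature subsets with the downstream model itself, which is slower than a filter but aligned with the "maximal prediction capability" goal stated above.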
Findings
The proposed approach has been applied in a real case study of an insurance company in Iran. On the basis of the findings, the proposed approach is capable of recognizing customer clusters and planning suitable insurance coverage sales plans for loyal customers with a proper accuracy level. Therefore, the proposed approach can be useful for the insurance company, helping it to identify potential clients. Consequently, insurance managers can consider appropriate marketing tactics and appropriate allocation of the insurance company's resources to their high-potential loyal customers and prevent their switching to competitors.
Originality/value
Despite the importance of recognizing high-potential loyal insurance customers, little research has been done in this area. In this paper, data-mining techniques were developed for the prediction of special insurance coverage sales on the basis of customers' characteristics. The method allows the insurance company to prioritize its customers and focus attention on high-potential loyal customers. Using the outputs of the proposed approach, insurance companies can offer the most productive/economic insurance coverage contracts to their customers. The approach proposed by this study can be customized and used in other service companies.
Samira Khodabandehlou, S. Alireza Hashemi Golpayegani and Mahmoud Zivari Rahman
Abstract
Purpose
Improving the performance of recommender systems (RSs) has always been a major challenge in the area of e-commerce because the systems face issues such as cold start, sparsity, scalability and interest drift that affect their performance. Despite the efforts made to solve these problems, there is still no RS that can solve or reduce all the problems simultaneously. Therefore, the purpose of this study is to provide an effective and comprehensive RS to solve or reduce all of the above issues, using a combination of basic customer information and big data techniques.
Design/methodology/approach
The most important steps in the proposed RS are: (1) collecting demographic and behavioral data of customers from an e-clothing store; (2) assessing customer personality traits; (3) creating a new user-item matrix based on customer/user interest; (4) calculating the similarity between customers with efficient k-nearest neighbor (EKNN) algorithm based on locality-sensitive hashing (LSH) approach and (5) defining a new similarity function based on a combination of personality traits, demographic characteristics and time-based purchasing behavior that are the key incentives for customers' purchases.
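Step (4)'s LSH shortlist can be sketched with random-hyperplane (cosine) LSH: users whose rating vectors fall on the same side of every random hyperplane land in one bucket, and only bucket-mates get an exact similarity pass. The user-item matrix, parameters and names below are toy assumptions, not the paper's EKNN details.

```python
import numpy as np

def lsh_buckets(X, n_planes, seed=0):
    """Hash each row to a sign-pattern bucket under random hyperplanes."""
    rng = np.random.default_rng(seed)
    planes = rng.normal(size=(X.shape[1], n_planes))
    codes = X @ planes > 0                     # one sign bit per hyperplane
    keys = [tuple(row) for row in codes]
    buckets = {}
    for i, key in enumerate(keys):
        buckets.setdefault(key, []).append(i)
    return buckets, keys

def lsh_neighbors(X, i, buckets, keys):
    """Candidates sharing user i's bucket, ranked by exact cosine similarity."""
    cands = [j for j in buckets[keys[i]] if j != i]
    norm = lambda v: np.linalg.norm(v) + 1e-12
    return sorted(cands, key=lambda j: -(X[i] @ X[j]) / (norm(X[i]) * norm(X[j])))

users = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 3.0]])  # toy user-item rows
buckets, keys = lsh_buckets(users, n_planes=8)
```

Hashing makes neighbour search sublinear in the number of users, which is how LSH addresses the scalability issue named above.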
Findings
The proposed method was compared with different baselines (matrix factorization and ensemble). The results showed that the proposed method in terms of all evaluation measures led to a significant improvement in traditional collaborative filtering (CF) performance, and with a significant difference (more than 40%), performed better than all baselines. According to the results, we find that our proposed method, which uses a combination of personality information and demographics, as well as tracking the recent interests and needs of the customer with the LSH approach, helps to improve the effectiveness of the recommendations more than the baselines. This is due to the fact that this method, which uses the above information in conjunction with the LSH technique, is more effective and more accurate in solving problems of cold start, scalability, sparsity and interest drift.
Research limitations/implications
The research data were limited to only one e-clothing store.
Practical implications
In order to achieve an accurate and real-time RS in e-commerce, it is essential to combine customer information with efficient techniques. In this regard, according to the results of the research, the use of personality traits and demographic characteristics leads to a more accurate knowledge of customers' interests and thus better identification of similar customers. Therefore, this information should be considered as a solution to reduce the problems of cold start and sparsity. Also, a better judgment can be made about customers' interests by considering their recent purchases; therefore, in order to solve the problem of interest drift, different weights should be assigned to purchases and launch times of products/items at different times (the more recent, the more weight). Finally, the LSH technique is used to increase RS scalability in e-commerce. In total, a combination of personality traits, demographics and customer purchasing behavior over time with the LSH technique should be used to achieve an ideal RS. Using the RS proposed in this research, it is possible to create a comfortable and enjoyable shopping experience for customers by providing real-time recommendations that match customers' preferences, which can increase the profitability of e-shops.
Originality/value
In this study, by considering a combination of personality traits, demographic characteristics and time-based purchasing behavior of customers along with the LSH technique, we were able for the first time to simultaneously solve the basic problems of CF, namely cold start, scalability, sparsity and interest drift, which led to a decrease in significant errors of recommendations and an increase in the accuracy of CF. The average error of the recommendations provided to users based on the proposed model is only about 13%, and the accuracy and compliance of these recommendations with the interests of customers is about 92%. In addition, a 40% difference between the accuracy of the proposed method and the traditional CF method has been observed. This level of accuracy in RSs is very significant and special, which is certainly welcomed by e-business owners. This is also a new scientific finding that is very useful for programmers, users and researchers. In general, the main contributions of this research are: 1) proposing an accurate RS using personality traits, demographic characteristics and time-based purchasing behavior; 2) proposing an effective and comprehensive RS for a “clothing” online store; 3) improving the RS performance by solving the cold start issue using personality traits and demographic characteristics; 4) improving the scalability issue in RS through efficient k-nearest neighbors; 5) mitigating the sparsity issue by using personality traits and demographic characteristics and also by densifying the user-item matrix and 6) improving the RS accuracy by solving the interest drift issue through developing a time-based user-item matrix.
Jyoti Godara, Rajni Aron and Mohammad Shabaz
Abstract
Purpose
Sentiment analysis has attracted nascent interest over the past decade in the field of social media analytics. With major advances in the volume, rationality and veracity of social networking data, the misunderstanding, uncertainty and inaccuracy within the data have multiplied. In textual data, locating sarcasm is a challenging task: it is a different way of expressing sentiments, in which people write or say something different from what they actually intend. Researchers are therefore interested in developing techniques for detecting sarcasm in texts to boost the performance of sentiment analysis. This paper aims to overview sentiment analysis, sarcasm and related work on sarcasm detection. Further, this paper helps health-care professionals make decisions based on patients' sentiments.
Design/methodology/approach
This paper compares the performance of five different classifiers – support vector machine, naïve Bayes, decision tree, AdaBoost and K-nearest neighbour – on the Twitter data set.
Findings
This paper observed that naïve Bayes performed best, with the highest accuracy of 61.18%, and the decision tree performed worst, with an accuracy of 54.27%. The measured accuracies of AdaBoost, K-nearest neighbour and support vector machine were 56.13%, 54.81% and 59.55%, respectively.
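The comparison protocol itself can be sketched generically: train several classifiers on one split and rank them by held-out accuracy. The two toy stand-in classifiers and the data below are assumptions for illustration, not the paper's five models or its Twitter data.

```python
import numpy as np

def one_nn(Xtr, ytr, Xte):
    """Predict each test point's label from its single nearest training point."""
    D = np.linalg.norm(Xte[:, None, :] - Xtr[None, :, :], axis=2)
    return ytr[D.argmin(axis=1)]

def majority_class(Xtr, ytr, Xte):
    """Baseline: always predict the most frequent training label."""
    return np.full(len(Xte), np.bincount(ytr).argmax())

def compare(classifiers, Xtr, ytr, Xte, yte):
    """Accuracy of each classifier on the same held-out split, best first."""
    scores = {name: float((clf(Xtr, ytr, Xte) == yte).mean())
              for name, clf in classifiers.items()}
    return dict(sorted(scores.items(), key=lambda kv: -kv[1]))

rng = np.random.default_rng(0)
Xtr = np.vstack([rng.normal(0, 0.5, (15, 2)), rng.normal(4, 0.5, (15, 2))])
ytr = np.array([0] * 15 + [1] * 15)
Xte = np.vstack([rng.normal(0, 0.5, (5, 2)), rng.normal(4, 0.5, (5, 2))])
yte = np.array([0] * 5 + [1] * 5)
scores = compare({"1-NN": one_nn, "majority": majority_class}, Xtr, ytr, Xte, yte)
```

Holding the split fixed across models is what makes accuracy figures such as those above directly comparable.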
Originality/value
This research work is original.