Search results

1 – 10 of over 10000
Book part
Publication date: 10 April 2019

Shu Yang and Jae Kwang Kim

Abstract

Nearest neighbor imputation has a long tradition for handling item nonresponse in survey sampling. In this article, we study the asymptotic properties of the nearest neighbor imputation estimator for general population parameters, including population means, proportions and quantiles. For variance estimation, we propose novel replication variance estimation, which is asymptotically valid and straightforward to implement. The main idea is to construct replicates of the estimator directly based on its asymptotically linear terms, instead of individual records of variables. The simulation results show that nearest neighbor imputation and the proposed variance estimation provide valid inferences for general population parameters.
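
As an illustration of the basic idea only, the following is a minimal sketch of nearest neighbor imputation for item nonresponse. It assumes a fully observed auxiliary variable `x`, a study variable `y` with missing values coded as NaN, and Euclidean distance for matching; the function name `nn_impute` and the toy data are hypothetical, and the replication variance estimator proposed in the chapter is not shown.

```python
import numpy as np

def nn_impute(x, y):
    """Replace each missing y (NaN) with the y of the closest respondent in x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float).copy()
    if x.ndim == 1:
        x = x[:, None]                      # treat a 1-D x as one covariate
    missing = np.isnan(y)
    donors = np.flatnonzero(~missing)       # units that responded
    for i in np.flatnonzero(missing):
        dist = np.linalg.norm(x[donors] - x[i], axis=1)
        y[i] = y[donors[np.argmin(dist)]]   # copy the nearest donor's value
    return y

# Toy data (assumed): units 1 and 3 did not respond.
x = [1.0, 2.2, 3.0, 10.0]
y = [5.0, np.nan, 7.0, np.nan]
imputed = nn_impute(x, y)
estimated_mean = imputed.mean()             # imputed estimator of the mean
```

The imputed data set can then be used to estimate means, proportions or quantiles in the usual way.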

Details

The Econometrics of Complex Survey Data
Type: Book
ISBN: 978-1-78756-726-9

Article
Publication date: 1 April 1978

JOSEF KITTLER

Abstract

All the modified Nearest Neighbour methods of pattern classification [2–6] developed to reduce the amount of computer storage and time needed for the implementation of a NN classifier require prohibitively costly data preprocessing which involves detailed examination of the neighbouring points to the elements of the reference set. In this paper a method for determining k-nearest neighbours to a given point is described. The method uses the computationally efficient city block distance to select candidate points for the set of k-nearest neighbours. In this way the preprocessing time is considerably reduced.
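
The candidate-selection idea can be sketched as follows. This is not the algorithm from the paper, whose details are not given in the abstract; it is one way to use city block distance as a prefilter, assuming Euclidean distance for the final ranking. It relies on the inequality d2 <= d1 <= sqrt(n) * d2 in n dimensions, and the function name `knn_cityblock_prefilter` is hypothetical.

```python
import numpy as np

def knn_cityblock_prefilter(points, query, k):
    """Exact Euclidean k-NN, using city block distance to shortlist candidates."""
    points = np.asarray(points, dtype=float)
    query = np.asarray(query, dtype=float)
    n_dims = points.shape[1]

    # Cheap pass: city block (L1) distance needs no multiplications.
    d1 = np.abs(points - query).sum(axis=1)
    radius = np.partition(d1, k - 1)[k - 1]      # k-th smallest L1 distance

    # At least k points have L2 <= L1 <= radius, so every true neighbour has
    # L2 <= radius and therefore L1 <= sqrt(n_dims) * radius.
    candidates = np.flatnonzero(d1 <= np.sqrt(n_dims) * radius)

    # Expensive pass: Euclidean distance on the shortlist only.
    d2 = np.linalg.norm(points[candidates] - query, axis=1)
    return candidates[np.argsort(d2, kind="stable")[:k]]

rng = np.random.default_rng(0)
reference_set = rng.normal(size=(200, 3))
neighbours = knn_cityblock_prefilter(reference_set, np.zeros(3), k=5)
```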

Details

Kybernetes, vol. 7 no. 4
Type: Research Article
ISSN: 0368-492X

Book part
Publication date: 31 December 2010

Dominique Guégan and Patrick Rakotomarolahy

Abstract

Purpose – The purpose of this chapter is twofold: to forecast gross domestic product (GDP) using a nonparametric method, known as the multivariate k-nearest neighbors method, and to provide asymptotic properties for this method.

Methodology/approach – We consider monthly and quarterly macroeconomic variables. To match the quarterly GDP, we estimate the missing monthly economic variables using the multivariate k-nearest neighbors method and parametric vector autoregressive (VAR) modeling. Then, by linking these monthly macroeconomic variables through bridge equations, we produce nowcasts and forecasts of GDP.

Findings – Using the multivariate k-nearest neighbors method, we provide a forecast of the euro area monthly economic indicator and quarterly GDP that is better than the one obtained with competing linear VAR modeling. We also establish the asymptotic normality of this k-nearest neighbors regression estimator for dependent time series, as well as a confidence interval for point forecasts in time series.

Originality/value of chapter – We provide a new theoretical result for a nonparametric method and propose a novel methodology for forecasting using macroeconomic data.
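
For readers unfamiliar with k-nearest neighbors forecasting, a minimal univariate sketch follows. It is a simplification rather than the chapter's multivariate method: it assumes a single series, an embedding of `m` lagged values, Euclidean distance, and an unweighted average of the successors of the `k` closest past patterns. The function name `knn_forecast` and the toy series are hypothetical.

```python
import numpy as np

def knn_forecast(series, m, k):
    """One-step-ahead forecast from the k past windows closest to the last one."""
    series = np.asarray(series, dtype=float)
    # Each row is a window of m consecutive values; successors[i] follows row i.
    windows = np.array([series[i:i + m] for i in range(len(series) - m)])
    successors = series[m:]
    latest = series[-m:]                         # pattern to be matched
    dist = np.linalg.norm(windows - latest, axis=1)
    nearest = np.argsort(dist)[:k]
    return successors[nearest].mean()            # average of what came next

# Toy periodic series (assumed): after the pattern (2, 3) the next value is 1.
forecast = knn_forecast([1, 2, 3, 1, 2, 3, 1, 2, 3], m=2, k=2)
```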

Details

Nonlinear Modeling of Economic and Financial Time-Series
Type: Book
ISBN: 978-0-85724-489-5

Article
Publication date: 1 August 2005

Songbo Tan

Abstract

Purpose

With the ever-increasing volume of text data available via the internet, it is important that documents are classified into manageable and easy-to-understand categories. This paper proposes the use of binary k-nearest neighbour (BKNN) for text categorization.

Design/methodology/approach

The paper describes the traditional k-nearest neighbor (KNN) classifier, introduces BKNN and outlines experimental results.

Findings

The experimental results indicate that BKNN requires much less CPU time than KNN, without loss of classification performance.

Originality/value

The paper demonstrates how BKNN can be an efficient and effective algorithm for text categorization. It proposes the use of binary k-nearest neighbor (BKNN) for text categorization.

Details

Online Information Review, vol. 29 no. 4
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 3 April 2017

Yusuke Gotoh and Chiori Okubo

Abstract

Purpose

This study aims to propose and evaluate a searching scheme for a bichromatic reverse k-nearest neighbor (BRkNN) that has objects and queries in spatial networks. In this proposed scheme, the authors search for the BRkNN of the query using an influence zone for each object with a network Voronoi diagram (NVD).

Design/methodology/approach

The authors analyze and evaluate the performance of the proposed searching scheme.

Findings

The contribution of this paper is that it confirmed that the proposed searching scheme gives shorter processing time than the conventional linear search.

Research limitations/implications

A future direction of this study will involve making a searching scheme that reduces the processing time when objects move automatically on spatial networks.

Practical implications

As a BRkNN example, consider several convenience stores belonging to two groups, A and B, that operate in a given region. The authors can use RNN (RkNN with k = 1) effectively to site a new store, considering the Euclidean and road distances among stores and the locational relationship between Groups A and B.

Originality/value

In the proposed searching scheme, the authors search for the BRkNN of the query for each object with an NVD using the influence zone, which is the region where an object in the spatial network recognizes the nearest neighbor for the query.

Details

International Journal of Pervasive Computing and Communications, vol. 13 no. 1
Type: Research Article
ISSN: 1742-7371

Article
Publication date: 1 August 1999

William McCluskey and Sarabjot Anand

Abstract

Hybrid systems as the next generation of intelligent applications within the field of mass appraisal and valuation are investigated. Motivated by the obvious limitations of paradigms that are being used in isolation or as stand‐alone techniques such as multiple regression analysis, artificial neural networks and expert systems. Clearly, there are distinct advantages in integrating two or more information processing systems that would address some of the discrete problems of individual techniques. Examines first, the strategic development of mass appraisal approaches which have traditionally been based on “stand‐alone” techniques; second, the potential application of an intelligent hybrid system. Highlights possible solutions by investigating various hybrid systems that may be developed incorporating a nearest neighbour algorithm (k‐NN). The enhancements are aimed at two major deficiencies in traditional distance metrics; user dependence for attribute weights and biases in the distance metric towards matching categorical variables in the retrieval of neighbours. Solutions include statistical techniques: mean, coefficient of variation and significant mean. Data mining paradigms based on a loosely coupled neural network or alternatively a tight coupling with genetic algorithms are used to discover attribute weights. The hybrid architectures developed are applied to a property data set and their performance measured based on their predictive value as well as perspicuity. Concludes by considering the application and the relevance of these techniques within the field of computer assisted mass appraisal.

Details

Journal of Property Investment & Finance, vol. 17 no. 3
Type: Research Article
ISSN: 1463-578X

Article
Publication date: 23 May 2018

Wei Zhang, Xianghong Hua, Kegen Yu, Weining Qiu, Shoujian Zhang and Xiaoxing He

Abstract

Purpose

This paper aims to introduce the weighted squared Euclidean distance between points in signal space to improve the performance of Wi-Fi indoor positioning. Received signal strength-based Wi-Fi indoor positioning, a low-cost indoor positioning approach, has attracted significant attention from both academia and industry.

Design/methodology/approach

The local principal gradient direction is introduced and used to define the weighting function and an average algorithm based on k-means algorithm is used to estimate the local principal gradient direction of each access point. Then, correlation distance is used in the new method to find the k nearest calibration points. The weighted squared Euclidean distance between the nearest calibration point and target point is calculated and used to estimate the position of target point.
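
To make the role of a weighted squared Euclidean distance concrete, here is a minimal sketch of fingerprint-based positioning. It is not the method proposed in the paper: the per-access-point `weights` are simply supplied by the caller (the paper derives its weighting from the local principal gradient direction, which is not reproduced here), the correlation-distance preselection step is omitted, and the position is an inverse-distance weighted average of the k closest calibration points. All names and data are hypothetical.

```python
import numpy as np

def wknn_position(fingerprints, positions, rss, weights, k=3, eps=1e-9):
    """Estimate a position from the k calibration points closest in signal space."""
    fingerprints = np.asarray(fingerprints, dtype=float)  # (points, access points)
    positions = np.asarray(positions, dtype=float)        # (points, 2)
    rss = np.asarray(rss, dtype=float)
    weights = np.asarray(weights, dtype=float)            # one weight per access point

    # Weighted squared Euclidean distance between the target and each fingerprint.
    dist = (weights * (fingerprints - rss) ** 2).sum(axis=1)
    nearest = np.argsort(dist)[:k]

    # Closer calibration points contribute more; eps avoids division by zero.
    contrib = 1.0 / (dist[nearest] + eps)
    return (contrib[:, None] * positions[nearest]).sum(axis=0) / contrib.sum()

# Toy calibration data (assumed): RSS in dBm from two access points.
fingerprints = [[-40, -70], [-60, -50], [-80, -45]]
positions = [[0, 0], [5, 0], [10, 0]]
estimate = wknn_position(fingerprints, positions, rss=[-60, -50],
                         weights=[1.0, 1.0], k=2)
```

With equal weights this reduces to the ordinary weighted k nearest neighbor baseline that the paper compares against.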

Findings

Experiments are conducted and the results indicate that the proposed Wi-Fi indoor positioning approach considerably outperforms the weighted k nearest neighbor method. The new method also outperforms support vector regression and extreme learning machine algorithms in the absence of sufficient fingerprints.

Research limitations/implications

Weighted k nearest neighbor approach, support vector regression algorithm and extreme learning machine algorithm are the three classic strategies for location determination using Wi-Fi fingerprinting. However, weighted k nearest neighbor suffers from dramatic performance degradation in the presence of multipath signal attenuation and environmental changes. More fingerprints are required for support vector regression algorithm to ensure the desirable performance; and labeling Wi-Fi fingerprints is labor-intensive. The performance of extreme learning machine algorithm may not be stable.

Practical implications

The new weighted squared Euclidean distance-based Wi-Fi indoor positioning strategy can improve the performance of Wi-Fi indoor positioning systems.

Social implications

An effective received signal strength-based Wi-Fi indoor positioning system can substitute for the global positioning system, which does not work indoors. This effective and low-cost positioning approach would be promising for many indoor location-based services.

Originality/value

A novel Wi-Fi indoor positioning strategy based on the weighted squared Euclidean distance is proposed in this paper to improve the performance of the Wi-Fi indoor positioning, and the local principal gradient direction is introduced and used to define the weighting function.

Article
Publication date: 1 March 2013

Song Zhang, Cong Li, Li Ma and Qi Li

Abstract

Purpose

The purpose of this paper is to introduce an improved nearest-neighbor collaborative filtering algorithm based on rough set theory to alleviate the sparsity problem of collaborative filtering. The new algorithm is then evaluated experimentally.

Design/methodology/approach

The nearest-neighbor algorithm is the earliest proposed and the main collaborative filtering recommendation algorithm, and its recommendation quality is seriously affected by the sparsity of user ratings. By using rough set theory, the nearest-neighbor collaborative filtering algorithm can be improved for sparse data. The union of user rating items is used as the basis for computing similarity among users, and a rating prediction method based on rough set theory is then proposed to estimate missing values in the union of user rating items, thereby decreasing sparsity.

Findings

The sparsity problem of collaborative filtering can be alleviated by using the union of user rating items and estimating missing values based on rough set theory. The experimental results show that the new algorithm can efficiently improve recommendation quality of collaborative filtering.

Originality/value

The union of user rating items was used as the basis for computing similarity among users. A rating prediction method based on rough set theory, with an assistant method, was proposed to complete the missing values in the union of user rating items. An orthogonal list was used to store the user-item ratings matrix.

Details

COMPEL - The international journal for computation and mathematics in electrical and electronic engineering, vol. 32 no. 2
Type: Research Article
ISSN: 0332-1649

Article
Publication date: 15 January 2018

Wei Lu, Heng Ding and Jiepu Jiang

Abstract

Purpose

The purpose of this paper is to utilize document expansion techniques for improving image representation and retrieval. This paper proposes a concise framework for tag-based image retrieval (TBIR).

Design/methodology/approach

The proposed approach includes three core components: a strategy of selecting expansion (similar) images from the whole corpus (e.g. cluster-based or nearest neighbor-based); a technique for assessing image similarity, which is adopted for selecting expansion images (text, image, or mixed); and a model for matching the expanded image representation with the search query (merging or separate).

Findings

The results show that applying the proposed method yields significant improvements in effectiveness; the method performs better at the top of the ranking and greatly improves some topics that score zero in the baseline. Moreover, the nearest neighbor-based expansion strategy outperforms the cluster-based expansion strategy, using image features for selecting expansion images is better than using text features in most cases, and the separate method for calculating the augmented probability P(q|RD) is able to erase the negative influence of erroneous images in RD.

Research limitations/implications

Although these methods outperform only at the top of the ranking rather than across the entire ranked list, TBIR on mobile platforms can still benefit from this approach.

Originality/value

Unlike former studies addressing the sparsity, vocabulary mismatch, and tag relatedness in TBIR individually, the approach proposed by this paper addresses all these issues with a single document expansion framework. It is a comprehensive investigation of document expansion techniques in TBIR.

Details

Aslib Journal of Information Management, vol. 70 no. 1
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 14 March 2016

Gebeyehu Belay Gebremeskel, Chai Yi, Zhongshi He and Dawit Haile

Abstract

Purpose

Among the growing number of data mining (DM) techniques, outlier detection has gained importance in many applications and has attracted much attention in recent times. In the past, research papers on outlier detection in safety care treated it as searching for needles in a haystack. However, outliers are not always erroneous. Therefore, the purpose of this paper is to investigate the role of outliers in healthcare services in general and in patient safety care in particular.

Design/methodology/approach

The approach is a combined DM technique (clustering and nearest neighbor) for outlier detection, which provides a clear understanding and meaningful insights for visualizing data behavior in healthcare safety. The outcomes, or the implicit knowledge, are vital to a proper clinical decision-making process. The method is semantically important, and the novel treatment of patients' events and situations is shown to play a significant role in patient care safety and medication.

Findings

The paper discusses a novel, integrated methodology that can be applied to the analysis of different biological data. Integrated DM techniques are discussed as a way to optimize performance in health and medical science. The integrated outlier detection method can be extended to search for valuable information and implicit knowledge based on selected patient factors. On this basis, outliers are detected as clusters and point events, and novel ideas are proposed to strengthen clinical services with regard to customer satisfaction. The work can also serve as a baseline for further strategic development and research in healthcare.

Research limitations/implications

This paper focussed mainly on outlier detection. It did not address outlier isolation, which is essential for investigating why an outlier occurred, or the communication of how to mitigate it. The research can therefore be extended to the hierarchy of patient problems.

Originality/value

DM is a dynamic and successful gateway for discovering useful knowledge to enhance healthcare performance and patient safety. Outlier detection based on clinical data is a basic task in achieving healthcare strategy. In this paper, therefore, the authors focussed on combined DM techniques for a deep analysis of clinical data, which support an optimal level of clinical decision making. Proper clinical decisions can be obtained through attribute selection, which is important for identifying the influential factors or parameters of healthcare services. Using integrated clustering and nearest neighbor techniques therefore gives a more acceptable search for outliers in such complex data, which could be fundamental to further situational analysis of healthcare and patient safety.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 9 no. 1
Type: Research Article
ISSN: 1756-378X
