Search results

1 – 10 of 833
Article
Publication date: 23 May 2018

Wei Zhang, Xianghong Hua, Kegen Yu, Weining Qiu, Shoujian Zhang and Xiaoxing He

This paper aims to introduce the weighted squared Euclidean distance between points in signal space, to improve the performance of the Wi-Fi indoor positioning. Nowadays, the…

Abstract

Purpose

This paper aims to introduce the weighted squared Euclidean distance between points in signal space, to improve the performance of the Wi-Fi indoor positioning. Nowadays, the received signal strength-based Wi-Fi indoor positioning, a low-cost indoor positioning approach, has attracted a significant attention from both academia and industry.

Design/methodology/approach

The local principal gradient direction is introduced and used to define the weighting function and an average algorithm based on k-means algorithm is used to estimate the local principal gradient direction of each access point. Then, correlation distance is used in the new method to find the k nearest calibration points. The weighted squared Euclidean distance between the nearest calibration point and target point is calculated and used to estimate the position of target point.

Findings

Experiments are conducted and the results indicate that the proposed Wi-Fi indoor positioning approach considerably outperforms the weighted k nearest neighbor method. The new method also outperforms support vector regression and extreme learning machine algorithms in the absence of sufficient fingerprints.

Research limitations/implications

Weighted k nearest neighbor approach, support vector regression algorithm and extreme learning machine algorithm are the three classic strategies for location determination using Wi-Fi fingerprinting. However, weighted k nearest neighbor suffers from dramatic performance degradation in the presence of multipath signal attenuation and environmental changes. More fingerprints are required for support vector regression algorithm to ensure the desirable performance; and labeling Wi-Fi fingerprints is labor-intensive. The performance of extreme learning machine algorithm may not be stable.

Practical implications

The new weighted squared Euclidean distance-based Wi-Fi indoor positioning strategy can improve the performance of Wi-Fi indoor positioning system.

Social implications

The received signal strength-based effective Wi-Fi indoor positioning system can substitute for global positioning system that does not work indoors. This effective and low-cost positioning approach would be promising for many indoor-based location services.

Originality/value

A novel Wi-Fi indoor positioning strategy based on the weighted squared Euclidean distance is proposed in this paper to improve the performance of the Wi-Fi indoor positioning, and the local principal gradient direction is introduced and used to define the weighting function.

Article
Publication date: 15 December 2017

Farshid Abdi, Kaveh Khalili-Damghani and Shaghayegh Abolmakarem

Customer insurance coverage sales plan problem, in which the loyal customers are recognized and offered some special plans, is an essential problem facing insurance companies. On…

Abstract

Purpose

Customer insurance coverage sales plan problem, in which the loyal customers are recognized and offered some special plans, is an essential problem facing insurance companies. On the other hand, the loyal customers who have enough potential to renew their insurance contracts at the end of the contract term should be persuaded to repurchase or renew their contracts. The aim of this paper is to propose a three-stage data-mining approach to recognize high-potential loyal insurance customers and to predict/plan special insurance coverage sales.

Design/methodology/approach

The first stage addresses data cleansing. In the second stage, several filter and wrapper methods are implemented to select proper features. In the third stage, K-nearest neighbor algorithm is used to cluster the customers. The approach aims to select a compact feature subset with the maximal prediction capability. The proposed approach can detect the customers who are more likely to buy a specific insurance coverage at the end of a contract term.

Findings

The proposed approach has been applied in a real case study of insurance company in Iran. On the basis of the findings, the proposed approach is capable of recognizing the customer clusters and planning a suitable insurance coverage sales plans for loyal customers with proper accuracy level. Therefore, the proposed approach can be useful for the insurance company which helps them to identify their potential clients. Consequently, insurance managers can consider appropriate marketing tactics and appropriate resource allocation of the insurance company to their high-potential loyal customers and prevent switching them to competitors.

Originality/value

Despite the importance of recognizing high-potential loyal insurance customers, little study has been done in this area. In this paper, data-mining techniques were developed for the prediction of special insurance coverage sales on the basis of customers’ characteristics. The method allows the insurance company to prioritize their customers and focus their attention on high-potential loyal customers. Using the outputs of the proposed approach, the insurance companies can offer the most productive/economic insurance coverage contracts to their customers. The approach proposed by this study be customized and may be used in other service companies.

Article
Publication date: 13 January 2021

Manish Sinha and Divyank Srivastava

With the current pandemic situation, the world is shifting to online buying and therefore the purpose of this study is to understand how the industry can improve sales based on…

Abstract

Purpose

With the current pandemic situation, the world is shifting to online buying and therefore the purpose of this study is to understand how the industry can improve sales based on the product recommendations shown on their online platforms.

Design/methodology/approach

This paper has studied content-based filtering using decision trees algorithm and collaborative filtering using K-nearest neighbour algorithm and measured their impact on sales of product of different genres on e-commerce websites and if their recommendation causes a difference in sales.This paper has conducted a field experiment to analyse the customer frequency, change in sales caused by different algorithms and also tried analysing the change in buying preferences of customers in post-pandemic situation and how this paper can improve on the search results by incorporating them in the already used algorithms.

Findings

This study indicates that different algorithms cause differences in sales and score over each other depending upon the category of the product sold. It also suggests that post-Covid, the buying frequency and the preferences of consumers have changed significantly.

Research limitations/implications

The study is limited to existing users of these sites, it also requires the sites to have a huge database of active users and products. Also, the preferences and likings of Indian subcontinent might not generally apply everywhere else.

Originality/value

This study enables better insight into consumer behaviour, thus enabling the data scientists to design better algorithms and help the companies improve their product sales.

Details

International Journal of Innovation Science, vol. 13 no. 2
Type: Research Article
ISSN: 1757-2223

Keywords

Article
Publication date: 10 November 2020

Samira Khodabandehlou, S. Alireza Hashemi Golpayegani and Mahmoud Zivari Rahman

Improving the performance of recommender systems (RSs) has always been a major challenge in the area of e-commerce because the systems face issues such as cold start, sparsity…

Abstract

Purpose

Improving the performance of recommender systems (RSs) has always been a major challenge in the area of e-commerce because the systems face issues such as cold start, sparsity, scalability and interest drift that affect their performance. Despite the efforts made to solve these problems, there is still no RS that can solve or reduce all the problems simultaneously. Therefore, the purpose of this study is to provide an effective and comprehensive RS to solve or reduce all of the above issues, which uses a combination of basic customer information as well as big data techniques.

Design/methodology/approach

The most important steps in the proposed RS are: (1) collecting demographic and behavioral data of customers from an e-clothing store; (2) assessing customer personality traits; (3) creating a new user-item matrix based on customer/user interest; (4) calculating the similarity between customers with efficient k-nearest neighbor (EKNN) algorithm based on locality-sensitive hashing (LSH) approach and (5) defining a new similarity function based on a combination of personality traits, demographic characteristics and time-based purchasing behavior that are the key incentives for customers' purchases.

Findings

The proposed method was compared with different baselines (matrix factorization and ensemble). The results showed that the proposed method in terms of all evaluation measures led to a significant improvement in traditional collaborative filtering (CF) performance, and with a significant difference (more than 40%), performed better than all baselines. According to the results, we find that our proposed method, which uses a combination of personality information and demographics, as well as tracking the recent interests and needs of the customer with the LSH approach, helps to improve the effectiveness of the recommendations more than the baselines. This is due to the fact that this method, which uses the above information in conjunction with the LSH technique, is more effective and more accurate in solving problems of cold start, scalability, sparsity and interest drift.

Research limitations/implications

The research data were limited to only one e-clothing store.

Practical implications

In order to achieve an accurate and real-time RS in e-commerce, it is essential to use a combination of customer information with efficient techniques. In this regard, according to the results of the research, the use of personality traits and demographic characteristics lead to a more accurate knowledge of customers' interests and thus better identification of similar customers. Therefore, this information should be considered as a solution to reduce the problems of cold start and sparsity. Also, a better judgment can be made about customers' interests by considering their recent purchases; therefore, in order to solve the problems of interest drifts, different weights should be assigned to purchases and launch time of products/items at different times (the more recent, the more weight). Finally, the LSH technique is used to increase the RS scalability in e-commerce. In total, a combination of personality traits, demographics and customer purchasing behavior over time with the LSH technique should be used to achieve an ideal RS. Using the RS proposed in this research, it is possible to create a comfortable and enjoyable shopping experience for customers by providing real-time recommendations that match customers' preferences and can result in an increase in the profitability of e-shops.

Originality/value

In this study, by considering a combination of personality traits, demographic characteristics and time-based purchasing behavior of customers along with the LSH technique, we were able for the first time to simultaneously solve the basic problems of CF, namely cold start, scalability, sparsity and interest drift, which led to a decrease in significant errors of recommendations and an increase in the accuracy of CF. The average errors of the recommendations provided to users based on the proposed model is only about 13%, and the accuracy and compliance of these recommendations with the interests of customers is about 92%. In addition, a 40% difference between the accuracy of the proposed method and the traditional CF method has been observed. This level of accuracy in RSs is very significant and special, which is certainly welcomed by e-business owners. This is also a new scientific finding that is very useful for programmers, users and researchers. In general, the main contributions of this research are: 1) proposing an accurate RS using personality traits, demographic characteristics and time-based purchasing behavior; 2) proposing an effective and comprehensive RS for a “clothing” online store; 3) improving the RS performance by solving the cold start issue using personality traits and demographic characteristics; 4) improving the scalability issue in RS through efficient k-nearest neighbors; 5) Mitigating the sparsity issue by using personality traits and demographic characteristics and also by densifying the user-item matrix and 6) improving the RS accuracy by solving the interest drift issue through developing a time-based user-item matrix.

Article
Publication date: 11 May 2020

Bojan Bozic, Andre Rios and Sarah Jane Delany

This paper aims to investigate the methods for the prediction of tags on a textual corpus that describes diverse data sets based on short messages; as an example, the authors…

Abstract

Purpose

This paper aims to investigate the methods for the prediction of tags on a textual corpus that describes diverse data sets based on short messages; as an example, the authors demonstrate the usage of methods based on hotel staff inputs in a ticketing system as well as the publicly available StackOverflow corpus. The aim is to improve the tagging process and find the most suitable method for suggesting tags for a new text entry.

Design/methodology/approach

The paper consists of two parts: exploration of existing sample data, which includes statistical analysis and visualisation of the data to provide an overview, and evaluation of tag prediction approaches. The authors have included different approaches from different research fields to cover a broad spectrum of possible solutions. As a result, the authors have tested a machine learning model for multi-label classification (using gradient boosting), a statistical approach (using frequency heuristics) and three similarity-based classification approaches (nearest centroid, k-nearest neighbours (k-NN) and naive Bayes). The experiment that compares the approaches uses recall to measure the quality of results. Finally, the authors provide a recommendation of the modelling approach that produces the best accuracy in terms of tag prediction on the sample data.

Findings

The authors have calculated the performance of each method against the test data set by measuring recall. The authors show recall for each method with different features (except for frequency heuristics, which does not provide the option to add additional features) for the dmbook pro and StackOverflow data sets. k-NN clearly provides the best recall. As k-NN turned out to provide the best results, the authors have performed further experiments with values of k from 1–10. This helped us to observe the impact of the number of neighbours used on the performance and to identify the best value for k.

Originality/value

The value and originality of the paper are given by extensive experiments with several methods from different domains. The authors have used probabilistic methods, such as naive Bayes, statistical methods, such as frequency heuristics, and similarity approaches, such as k-NN. Furthermore, the authors have produced results on an industrial-scale data set that has been provided by a company and used directly in their project, as well as a community-based data set with a large amount of data and dimensionality. The study results can be used to select a model based on diverse corpora for a specific use case, taking into account advantages and disadvantages when applying the model to your data.

Details

International Journal of Web Information Systems, vol. 16 no. 2
Type: Research Article
ISSN: 1744-0084

Keywords

Open Access
Article
Publication date: 21 June 2023

Sudhaman Parthasarathy and S.T. Padmapriya

Algorithm bias refers to repetitive computer program errors that give some users more weight than others. The aim of this article is to provide a deeper insight of algorithm bias…

Abstract

Purpose

Algorithm bias refers to repetitive computer program errors that give some users more weight than others. The aim of this article is to provide a deeper insight of algorithm bias in AI-enabled ERP software customization. Although algorithmic bias in machine learning models has uneven, unfair and unjust impacts, research on it is mostly anecdotal and scattered.

Design/methodology/approach

As guided by the previous research (Akter et al., 2022), this study presents the possible design bias (model, data and method) one may experience with enterprise resource planning (ERP) software customization algorithm. This study then presents the artificial intelligence (AI) version of ERP customization algorithm using k-nearest neighbours algorithm.

Findings

This study illustrates the possible bias when the prioritized requirements customization estimation (PRCE) algorithm available in the ERP literature is executed without any AI. Then, the authors present their newly developed AI version of the PRCE algorithm that uses ML techniques. The authors then discuss its adjoining algorithmic bias with an illustration. Further, the authors also draw a roadmap for managing algorithmic bias during ERP customization in practice.

Originality/value

To the best of the authors’ knowledge, no prior research has attempted to understand the algorithmic bias that occurs during the execution of the ERP customization algorithm (with or without AI).

Details

Journal of Ethics in Entrepreneurship and Technology, vol. 3 no. 2
Type: Research Article
ISSN: 2633-7436

Keywords

Open Access
Article
Publication date: 17 May 2022

M'hamed Bilal Abidine, Mourad Oussalah, Belkacem Fergani and Hakim Lounis

Mobile phone-based human activity recognition (HAR) consists of inferring user’s activity type from the analysis of the inertial mobile sensor data. This paper aims to mainly…

Abstract

Purpose

Mobile phone-based human activity recognition (HAR) consists of inferring user’s activity type from the analysis of the inertial mobile sensor data. This paper aims to mainly introduce a new classification approach called adaptive k-nearest neighbors (AKNN) for intelligent HAR using smartphone inertial sensors with a potential real-time implementation on smartphone platform.

Design/methodology/approach

The proposed method puts forward several modification on AKNN baseline by using kernel discriminant analysis for feature reduction and hybridizing weighted support vector machines and KNN to tackle imbalanced class data set.

Findings

Extensive experiments on a five large scale daily activity recognition data set have been performed to demonstrate the effectiveness of the method in terms of error rate, recall, precision, F1-score and computational/memory resources, with several comparison with state-of-the art methods and other hybridization modes. The results showed that the proposed method can achieve more than 50% improvement in error rate metric and up to 5.6% in F1-score. The training phase is also shown to be reduced by a factor of six compared to baseline, which provides solid assets for smartphone implementation.

Practical implications

This work builds a bridge to already growing work in machine learning related to learning with small data set. Besides, the availability of systems that are able to perform on flight activity recognition on smartphone will have a significant impact in the field of pervasive health care, supporting a variety of practical applications such as elderly care, ambient assisted living and remote monitoring.

Originality/value

The purpose of this study is to build and test an accurate offline model by using only a compact training data that can reduce the computational and memory complexity of the system. This provides grounds for developing new innovative hybridization modes in the context of daily activity recognition and smartphone-based implementation. This study demonstrates that the new AKNN is able to classify the data without any training step because it does not use any model for fitting and only uses memory resources to store the corresponding support vectors.

Details

Sensor Review, vol. 42 no. 4
Type: Research Article
ISSN: 0260-2288

Keywords

Open Access
Article
Publication date: 1 April 2021

Arunit Maity, P. Prakasam and Sarthak Bhargava

Due to the continuous and rapid evolution of telecommunication equipment, the demand for more efficient and noise-robust detection of dual-tone multi-frequency (DTMF) signals is…

1277

Abstract

Purpose

Due to the continuous and rapid evolution of telecommunication equipment, the demand for more efficient and noise-robust detection of dual-tone multi-frequency (DTMF) signals is most significant.

Design/methodology/approach

A novel machine learning-based approach to detect DTMF tones affected by noise, frequency and time variations by employing the k-nearest neighbour (KNN) algorithm is proposed. The features required for training the proposed KNN classifier are extracted using Goertzel's algorithm that estimates the absolute discrete Fourier transform (DFT) coefficient values for the fundamental DTMF frequencies with or without considering their second harmonic frequencies. The proposed KNN classifier model is configured in four different manners which differ in being trained with or without augmented data, as well as, with or without the inclusion of second harmonic frequency DFT coefficient values as features.

Findings

It is found that the model which is trained using the augmented data set and additionally includes the absolute DFT values of the second harmonic frequency values for the eight fundamental DTMF frequencies as the features, achieved the best performance with a macro classification F1 score of 0.980835, a five-fold stratified cross-validation accuracy of 98.47% and test data set detection accuracy of 98.1053%.

Originality/value

The generated DTMF signal has been classified and detected using the proposed KNN classifier which utilizes the DFT coefficient along with second harmonic frequencies for better classification. Additionally, the proposed KNN classifier has been compared with existing models to ascertain its superiority and proclaim its state-of-the-art performance.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Keywords

Article
Publication date: 28 February 2023

Meltem Aksoy, Seda Yanık and Mehmet Fatih Amasyali

When a large number of project proposals are evaluated to allocate available funds, grouping them based on their similarities is beneficial. Current approaches to group proposals…

Abstract

Purpose

When a large number of project proposals are evaluated to allocate available funds, grouping them based on their similarities is beneficial. Current approaches to group proposals are primarily based on manual matching of similar topics, discipline areas and keywords declared by project applicants. When the number of proposals increases, this task becomes complex and requires excessive time. This paper aims to demonstrate how to effectively use the rich information in the titles and abstracts of Turkish project proposals to group them automatically.

Design/methodology/approach

This study proposes a model that effectively groups Turkish project proposals by combining word embedding, clustering and classification techniques. The proposed model uses FastText, BERT and term frequency/inverse document frequency (TF/IDF) word-embedding techniques to extract terms from the titles and abstracts of project proposals in Turkish. The extracted terms were grouped using both the clustering and classification techniques. Natural groups contained within the corpus were discovered using k-means, k-means++, k-medoids and agglomerative clustering algorithms. Additionally, this study employs classification approaches to predict the target class for each document in the corpus. To classify project proposals, various classifiers, including k-nearest neighbors (KNN), support vector machines (SVM), artificial neural networks (ANN), classification and regression trees (CART) and random forest (RF), are used. Empirical experiments were conducted to validate the effectiveness of the proposed method by using real data from the Istanbul Development Agency.

Findings

The results show that the generated word embeddings can effectively represent proposal texts as vectors, and can be used as inputs for clustering or classification algorithms. Using clustering algorithms, the document corpus is divided into five groups. In addition, the results demonstrate that the proposals can easily be categorized into predefined categories using classification algorithms. SVM-Linear achieved the highest prediction accuracy (89.2%) with the FastText word embedding method. A comparison of manual grouping with automatic classification and clustering results revealed that both classification and clustering techniques have a high success rate.

Research limitations/implications

The proposed model automatically benefits from the rich information in project proposals and significantly reduces numerous time-consuming tasks that managers must perform manually. Thus, it eliminates the drawbacks of the current manual methods and yields significantly more accurate results. In the future, additional experiments should be conducted to validate the proposed method using data from other funding organizations.

Originality/value

This study presents the application of word embedding methods to effectively use the rich information in the titles and abstracts of Turkish project proposals. Existing research studies focus on the automatic grouping of proposals; traditional frequency-based word embedding methods are used for feature extraction methods to represent project proposals. Unlike previous research, this study employs two outperforming neural network-based textual feature extraction techniques to obtain terms representing the proposals: BERT as a contextual word embedding method and FastText as a static word embedding method. Moreover, to the best of our knowledge, there has been no research conducted on the grouping of project proposals in Turkish.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 16 no. 3
Type: Research Article
ISSN: 1756-378X

Keywords

Article
Publication date: 26 September 2022

Tulsi Pawan Fowdur and Lavesh Babooram

The purpose of this paper is geared towards the capture and analysis of network traffic using an array ofmachine learning (ML) and deep learning (DL) techniques to classify…

52

Abstract

Purpose

The purpose of this paper is geared towards the capture and analysis of network traffic using an array ofmachine learning (ML) and deep learning (DL) techniques to classify network traffic into different classes and predict network traffic parameters.

Design/methodology/approach

The classifier models include k-nearest neighbour (KNN), multilayer perceptron (MLP) and support vector machine (SVM), while the regression models studied are multiple linear regression (MLR) as well as MLP. The analytics were performed on both a local server and a servlet hosted on the international business machines cloud. Moreover, the local server could aggregate data from multiple devices on the network and perform collaborative ML to predict network parameters. With optimised hyperparameters, analytical models were incorporated in the cloud hosted Java servlets that operate on a client–server basis where the back-end communicates with Cloudant databases.

Findings

Regarding classification, it was found that KNN performs significantly better than MLP and SVM with a comparative precision gain of approximately 7%, when classifying both Wi-Fi and long term evolution (LTE) traffic.

Originality/value

Collaborative regression models using traffic collected from two devices were experimented and resulted in an increased average accuracy of 0.50% for all variables, with a multivariate MLP model.

Details

International Journal of Pervasive Computing and Communications, vol. 19 no. 5
Type: Research Article
ISSN: 1742-7371

Keywords

1 – 10 of 833