Search results

1 – 10 of over 6000
Article
Publication date: 4 August 2021

Archana Yashodip Chaudhari and Preeti Mulay

Abstract

Purpose

To reduce electricity consumption in our homes, a first step is to make the user aware of it. Reading a meter once a month is not enough; real-time meter reading is required. A smart electricity meter (SEM) can provide quick and exact meter readings in real time at regular intervals. SEMs generate a considerable amount of household electricity consumption data incrementally. Such data contains embedded load patterns and hidden information from which consumer behavior can be extracted and learned. The load patterns extracted by data clustering should be updated because consumer behavior may change over time. The purpose of this study is to update the clustering results based on the old data rather than to re-cluster all of the data from scratch.

Design/methodology/approach

This paper proposes an incremental clustering with nearness factor (ICNF) algorithm to update load patterns without overall daily load curve clustering.
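The abstract does not define the nearness factor, so the following is only a minimal sketch of the general idea of incremental clustering, not the ICNF algorithm itself. It assumes a plain Euclidean distance and a hypothetical `threshold` parameter: each new load curve joins the nearest existing cluster if it is close enough, otherwise it opens a new cluster, and earlier points are never revisited.

```python
import math

def incremental_cluster(points, threshold, clusters=None):
    """Assign each new point to the nearest centroid, or open a new cluster.

    `clusters` is a list of {"centroid": [...], "count": int} dicts carried
    over from earlier batches. `threshold` is a hypothetical distance cutoff
    standing in for whatever closeness criterion the real method uses.
    """
    clusters = [] if clusters is None else clusters
    for p in points:
        best, best_dist = None, math.inf
        for c in clusters:
            d = math.dist(p, c["centroid"])
            if d < best_dist:
                best, best_dist = c, d
        if best is not None and best_dist <= threshold:
            # Running-mean update: old points are not needed again.
            best["count"] += 1
            n = best["count"]
            best["centroid"] = [m + (x - m) / n for m, x in zip(best["centroid"], p)]
        else:
            clusters.append({"centroid": list(p), "count": 1})
    return clusters

# Toy daily load curves (three readings each), invented for illustration.
batch1 = [(1.0, 1.0, 1.0), (1.2, 0.8, 1.0), (5.0, 5.0, 5.0)]
batch2 = [(5.2, 4.8, 5.0), (9.0, 9.0, 9.0)]

state = incremental_cluster(batch1, threshold=1.0)
# The second batch updates the existing clusters instead of re-clustering batch1.
state = incremental_cluster(batch2, threshold=1.0, clusters=state)
```

Only the centroids and counts are kept between batches, which is what makes the update cheap compared with clustering all daily load curves again.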

Findings

Extensive experiments are conducted on real-world SEM data from the Irish Social Science Data Archive (Ireland) data set. The results are evaluated by both accuracy measures and clustering validity indices, which indicate that the proposed method is useful for exploiting the enormous amount of smart meter data to understand customers’ electricity consumption behaviors.

Originality/value

ICNF can deliver efficient analysis of electricity consumption patterns to end consumers via SEMs.

Article
Publication date: 5 September 2016

Runhai Jiao, Shaolong Liu, Wu Wen and Biying Lin

Abstract

Purpose

The large volume of big data makes traditional clustering algorithms, which are usually designed for the entire data set, impractical. This paper focuses on incremental clustering, which divides the data into a series of data chunks so that only a small amount of data needs to be clustered at a time. Few studies on incremental clustering algorithms address the problems of optimizing cluster center initialization for each data chunk and selecting multiple passing points for each cluster.

Design/methodology/approach

Optimizing the initial cluster centers improves the quality of the clustering results for each data chunk, which in turn enhances the quality of the final clustering results. Moreover, selecting multiple passing points allows more accurate information to be passed down, further improving the final clustering results. A method is proposed to solve these two problems and is applied in the proposed algorithm, which is based on the streaming kernel fuzzy c-means (stKFCM) algorithm.

Findings

Experimental results show that the proposed algorithm achieves higher accuracy and better performance than the stKFCM algorithm.

Originality/value

This paper addresses the problem of improving the performance of incremental clustering through optimizing cluster center initialization and selecting multiple passing points. The paper analyzes the performance of the proposed scheme and demonstrates its effectiveness.

Details

Kybernetes, vol. 45 no. 8
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 17 October 2008

Rui Xu and Donald C. Wunsch

Abstract

Purpose

The purpose of this paper is to provide a review of the issues related to cluster analysis, one of the most important and primitive activities of human beings, and of the advances made in recent years.

Design/methodology/approach

The paper investigates the clustering algorithms rooted in machine learning, computer science, statistics, and computational intelligence.

Findings

The paper reviews the basic issues of cluster analysis and discusses the recent advances of clustering algorithms in scalability, robustness, visualization, irregular cluster shape detection, and so on.

Originality/value

The paper presents a comprehensive and systematic survey of cluster analysis and emphasizes its recent efforts in order to meet the challenges caused by the glut of complicated data from a wide variety of communities.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 1 no. 4
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 19 November 2018

Gian Luca Casali, Mirko Perano, Angelo Presenza and Tindara Abbate

Abstract

Purpose

The aim of this paper is to analyze the relationships between distribution strategies and the level of innovation propensity in the winemaking industry. It intends to identify the existence of patterns around the way wineries innovate and the way distribution channels are used. These determinants can support or constrain wineries’ behaviors in their strategic choices related to distribution channels.

Design/methodology/approach

The sample comprised 191 Italian small- to medium-sized enterprises in the wine industry. First, a two-step cluster analysis was used to identify patterns in the level of innovation propensity and differences in distribution channel strategies. Second, the research question was tested using multinomial logit regression.

Findings

Five clusters of innovation propensity were identified, varying from “no propensity to innovate” to “propensity for radical innovation”, and three clusters of distribution channel strategies were found. A significant negative relationship between innovation propensity and distribution channel strategies was revealed. This means that the greater the propensity to innovate, the smaller the need for a wholesale distribution option.

Research limitations/implications

As with most research, there are limitations to this study. First, the sample is from only one country. A second limitation is the sample size (191 Italian firms). A sample including large firms can be used to further validate the findings. Linked to the sample, another possible limitation is that all respondents were small- and medium-sized enterprises from a single industry.

Practical implications

This study contributes to the current innovation research by showing the existence of a negative relationship between innovation propensity and the choice of distribution channel in the wine industry. This knowledge is precious to entrepreneurs and managers in the wine sector, allowing them to better consider not only the type of strategies related to distribution channels but also the importance of building the firm’s propensity to innovate into the strategic decision-making process. Furthermore, the paper provides an opportunity for practitioners to reflect upon the fact that changing the distribution channel is more than just changing the outlet for their product; it might also require a revision in their innovation propensity to better facilitate the process.

Social implications

There are also social implications, in particular an advantage for consumers. The major advantage is that consumers are now aware that the level of innovation propensity in the wine industry is directly linked to the type of distribution channel adopted. Wineries with low innovation propensity are most likely to adopt a wholesale distribution strategy, while the more innovative wineries adopt the wine expert and direct distribution channels.

Originality/value

For the first time, a cluster analysis approach was used to review different typologies of Italian wineries based on their propensity toward innovation and subsequent distribution strategies. This study further explains the direct relationship between innovation propensity and the strategic choice between long and short distribution channels.

Details

International Journal of Wine Business Research, vol. 30 no. 4
Type: Research Article
ISSN: 1751-1062

Article
Publication date: 3 April 2009

Maria Soledad Pera and Yiu‐Kai Ng

Abstract

Purpose

Tens of thousands of news articles are posted online each day, covering topics from politics to science to current events. To better cope with this overwhelming volume of information, RSS (news) feeds are used to categorize newly posted articles. Nonetheless, most RSS users must filter through many articles within the same or different RSS feeds to locate articles pertaining to their particular interests. Due to the large number of news articles in individual RSS feeds, there is a need for further organizing articles to aid users in locating non‐redundant, informative, and related articles of interest quickly. This paper aims to address these issues.

Design/methodology/approach

The paper presents a novel approach which uses the word‐correlation factors in a fuzzy set information retrieval model to: filter out redundant news articles from RSS feeds; shed less‐informative articles from the non‐redundant ones; and cluster the remaining informative articles according to the fuzzy equivalence classes on the news articles.
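The abstract does not spell out how the fuzzy equivalence classes are built. As a hedged illustration, the sketch below uses the textbook construction: take a reflexive, symmetric fuzzy similarity relation, compute its max-min transitive closure, and cut it at a level `alpha`. The similarity values and the `alpha` threshold are invented for the example and are not the paper's word-correlation factors.

```python
def maxmin_closure(rel):
    """Max-min transitive closure of a reflexive, symmetric fuzzy relation."""
    n = len(rel)
    closure = [row[:] for row in rel]
    while True:
        # One max-min composition step, keeping the larger of old and new degrees.
        composed = [
            [max(closure[i][j],
                 max(min(closure[i][k], closure[k][j]) for k in range(n)))
             for j in range(n)]
            for i in range(n)
        ]
        if composed == closure:
            return closure
        closure = composed

def alpha_cut_classes(closure, alpha):
    """Group items whose closure similarity is at least `alpha`."""
    classes, seen = [], set()
    for i in range(len(closure)):
        if i in seen:
            continue
        members = [j for j in range(len(closure)) if closure[i][j] >= alpha]
        seen.update(members)
        classes.append(members)
    return classes

# Hypothetical pairwise similarities between four articles.
similarity = [
    [1.0, 0.8, 0.1, 0.2],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.7],
    [0.2, 0.1, 0.7, 1.0],
]
classes = alpha_cut_classes(maxmin_closure(similarity), alpha=0.6)
```

Because the closure is transitive, the cut at any `alpha` is an ordinary equivalence relation, so the groups never overlap. Raising `alpha` gives smaller, tighter clusters.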

Findings

The clustering approach requires little overhead or computational costs, and experimental results have shown that it outperforms other existing, well‐known clustering approaches.

Research limitations/implications

The clustering approach as proposed in this paper applies only to RSS news articles; however, it can be extended to other application domains.

Originality/value

The developed clustering tool is highly efficient and effective in filtering and classifying RSS news articles and does not employ any labor‐intensive user‐feedback strategy. Therefore, it can be implemented in real‐world RSS feeds to aid users in locating RSS news articles of interest.

Details

International Journal of Web Information Systems, vol. 5 no. 1
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 26 June 2019

Mamta Kayest and Sanjay Kumar Jain

Abstract

Purpose

Document retrieval has become a hot research topic over the past few years and has received increasing attention for browsing and synthesizing information from different documents. The purpose of this paper is to develop an effective document retrieval method, which focuses on reducing the time needed for the navigator to evoke the whole document based on contents, themes and concepts of documents.

Design/methodology/approach

This paper introduces an incremental learning approach for text categorization using a Monarch Butterfly optimization–FireFly optimization based Neural Network (MB–FF based NN). Initially, feature extraction is carried out on the pre-processed data using Term Frequency–Inverse Document Frequency (TF–IDF) and holoentropy to find the keywords of the document. Then, cluster-based indexing is performed using the MB–FF algorithm, and finally, document retrieval is done by a matching process using the modified Bhattacharyya distance measure. In the MB–FF based NN, the weights of the NN are chosen using the MB–FF algorithm.
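To make the TF–IDF step concrete, here is a minimal sketch under stated assumptions: it uses one common variant (term frequency as count divided by document length, inverse document frequency as log(N / df)), assumes non-empty tokenised documents, and leaves out the holoentropy weighting and the MB–FF optimization, whose details the abstract does not give. The toy documents are invented.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenised, non-empty document.

    tf = count / document length, idf = log(N / document frequency).
    Other TF-IDF variants (smoothing, log-scaled tf) are equally common.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

docs = [
    "cluster index retrieval cluster".split(),
    "neural network retrieval".split(),
    "firefly optimization network".split(),
]
weights = tf_idf(docs)
# Highest-weighted term per document, usable as a candidate keyword.
top_terms = [max(w, key=w.get) for w in weights]
```

Terms that occur in several documents, such as `retrieval` and `network` here, receive lower weights than terms confined to a single document, which is why the top-weighted terms serve as keyword candidates.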

Findings

The effectiveness of the proposed MB–FF based NN is demonstrated by a precision of 0.8769, a recall of 0.7957, an F-measure of 0.8143 and an accuracy of 0.7815.

Originality/value

The experimental results show that the proposed MB–FF based NN is useful to companies that have a large workforce across the country.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 12 no. 3
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 15 June 2020

Nils Grashof, Alexander Kopka, Colin Wessendorf and Dirk Fornahl

Abstract

Purpose

This paper aims to show the interaction effects between clusters and cluster-specific attributes and the industrial internet of things (IoT) knowledge of a firm on the innovativeness of firms. Cluster theory and the concept of key enabling technologies are linked to test their effect on a firm’s incremental and radical knowledge generation.

Design/methodology/approach

The paper takes a quantitative approach at the firm level. By combining several data sources (e.g. ORBIS, PATSTAT and the German subsidy catalogue), it relies on a unique database encompassing 8,347 firms in Germany. Ordinary least squares (OLS) regression techniques are used for data analysis.

Findings

Industrial IoT is an important driver of radical patents, mediated positively by firm size. For incremental knowledge, a substitution effect occurs between a cluster and IoT effects, which is bigger for larger firms and dependent on cluster attributes and firms’ outside connections.

Research limitations/implications

The paper opens up new research paths considering long-term disruptive effects of the industrial IoT compared to short-term effects on the innovativeness of firms within clusters. Additionally, it enables further research enriching the discussion about cluster attributes and how these affect ongoing processes.

Practical implications

Linking cluster theory and policy with Industry 4.0 raises awareness for being considerate in terms of funding and scrutinising one-size-fits-all approaches.

Originality/value

Connecting the concepts of a cluster and advanced manufacturing technologies as a proxy for industrial IoT, with a specific focus on both radical and incremental innovations, is a new approach, especially in taking into account the interaction effects between cluster attributes and the influence of industrial IoT on the innovativeness of firms.

Details

Competitiveness Review: An International Business Journal, vol. 31 no. 1
Type: Research Article
ISSN: 1059-5422

Article
Publication date: 19 June 2019

Prafulla Bafna, Dhanya Pramod, Shailaja Shrwaikar and Atiya Hassan

Abstract

Purpose

Document management is growing in importance in proportion to the growth of unstructured data, and its applications range from process benchmarking to customer relationship management and so on. The purpose of this paper is to improve two important components of document management, namely keyword extraction and document clustering. This is achieved through knowledge extraction by updating the phrase document matrix. The objective is to manage documents by extending the phrase document matrix and to achieve refined clusters. The study achieves consistency in cluster quality in spite of the increasing size of the data set. Domain independence of the proposed method is tested and compared with other methods.

Design/methodology/approach

In this paper, a synset-based phrase document matrix construction method is proposed in which semantically similar phrases are grouped to mitigate the curse of dimensionality. A large collection of documents to be processed includes some documents that are closely related to the topic of interest, known as model documents, and also documents that deviate from the topic of interest. These non-relevant documents may affect the cluster quality. The first step in knowledge extraction from unstructured textual data is converting it into structured form, either as a term frequency-inverse document frequency matrix or as a phrase document matrix. Once the data is in structured form, a range of mining algorithms from classification to clustering can be applied.

Findings

In the enhanced approach, the model documents are used to extract key phrases with synset groups, whereas the other documents participate in the construction of the feature matrix. It gives a better feature vector representation and improved cluster quality.

Research limitations/implications

Various applications that require managing of unstructured documents can use this approach by specifically incorporating the domain knowledge with a thesaurus.

Practical implications

Experiment pertaining to the academic domain is presented that categorizes research papers according to the context and topic, and this will help academicians to organize and build knowledge in a better way. The grouping and feature extraction for resume data can facilitate the candidate selection process.

Social implications

Applications like knowledge management, clustering of search engine results, and different recommender systems such as hotel recommenders and task recommenders will benefit from this study. Hence, the study contributes to improving document management in business domains or areas of interest of its users from various strata of society.

Originality/value

The study proposes an improvement to the document management approach that can be applied in various domains. The efficacy of the proposed approach and its enhancement is validated on three different data sets of well-articulated documents: biographies, resumes and research papers. These results can be used for benchmarking further work carried out in these areas.

Details

Benchmarking: An International Journal, vol. 26 no. 6
Type: Research Article
ISSN: 1463-5771

Article
Publication date: 26 October 2020

Jie Zhu, Jing Yang, Shaoning Di, Jiazhu Zheng and Leying Zhang

Abstract

Purpose

The spatial and non-spatial attributes are the two important characteristics of a spatial point, which belong to the two different attribute domains in many Geographic Information Systems applications. The dual clustering algorithms take into account both spatial and non-spatial attributes, where a cluster has not only high proximity in spatial domain but also high similarity in non-spatial domain. In a geographical dataset, traditional dual spatial clustering algorithms discover homogeneous spatially adjacent clusters suffering from the between-cluster inhomogeneity where those spatial points are described in non-spatial domain. To overcome this limitation, a novel dual-domain clustering algorithm (DDCA) is proposed by considering both spatial proximity and attribute similarity with the presence of inhomogeneity.

Design/methodology/approach

In this algorithm, Delaunay triangulation with edge length constraints is first employed to construct spatial proximity relationships amongst objects. Then, a clustering strategy based on statistical change detection is designed to obtain clusters with similar attributes.

Findings

The effectiveness and practicability of the proposed algorithm are illustrated by experiments on both simulated datasets and real spatial events. It is found that the proposed algorithm can adaptively and accurately detect clusters with spatial proximity and similar non-spatial attributes under the consideration of inhomogeneity.

Originality/value

Traditional dual spatial clustering algorithms discover homogeneous spatially adjacent clusters suffering from the between-cluster inhomogeneity where those spatial points are described in non-spatial domain. The research here is a contribution to developing a dual spatial clustering method considering both spatial proximity and attribute similarity with the presence of inhomogeneity. The detection of these clusters is useful to understand the local patterns of geographical phenomena, such as land use classification, spatial patterns research and big geo-data analysis.

Details

Data Technologies and Applications, vol. 54 no. 5
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 28 August 2009

Vassiliki A. Koutsonikola, Sophia G. Petridou, Athena I. Vakali and Georgios I. Papadimitriou

Abstract

Purpose

Web users' clustering is an important mining task since it contributes to identifying usage patterns, a beneficial task for a wide range of applications that rely on the web. The purpose of this paper is to examine the usage of Kullback‐Leibler (KL) divergence, an information theoretic distance, as an alternative option for measuring distances in web users clustering.

Design/methodology/approach

KL‐divergence is compared with other well‐known distance measures and clustering results are evaluated using a criterion function, validity indices, and graphical representations. Furthermore, the impact of noise (i.e. occasional or mistaken page visits) is evaluated, since it is imperative to assess whether a clustering process exhibits tolerance in noisy environments such as the web.
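As a small illustration of the distance measure being compared, the sketch below computes the KL divergence between users' page-visit distributions. The visit counts, the additive smoothing and the helper names are assumptions made for the example; the abstract does not describe how the authors turn visits into distributions.

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) in nats for two discrete distributions of equal length.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def visit_distribution(counts, smoothing=1.0):
    """Turn raw page-visit counts into probabilities with additive smoothing.

    Smoothing keeps every probability positive so the divergence stays finite.
    """
    total = sum(counts) + smoothing * len(counts)
    return [(c + smoothing) / total for c in counts]

# Hypothetical visit counts over three pages for three users.
user_a = visit_distribution([8, 1, 1])
user_b = visit_distribution([7, 2, 1])
user_c = visit_distribution([1, 1, 8])

d_ab = kl_divergence(user_a, user_b)  # similar browsing habits, small value
d_ac = kl_divergence(user_a, user_c)  # different habits, larger value
```

KL divergence is not symmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ. A clustering procedure that needs a symmetric measure would have to symmetrize it, for example by averaging the two directions.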

Findings

The proposed KL clustering approach performs similarly to other distance measures under both synthetic and real data workloads. Moreover, when extra noise is imposed on real data, the approach shows less deterioration than most of the other conventional distance measures.

Practical implications

The experimental results show that a probabilistic measure such as KL‐divergence is quite efficient in noisy environments and thus constitutes a good alternative for the web users clustering problem.

Originality/value

This work is inspired by the usage of divergence in clustering of biological data and it is introduced by the authors in the area of web clustering. According to the experimental results presented in this paper, KL‐divergence can be considered as a good alternative for measuring distances in noisy environments such as the web.

Details

International Journal of Web Information Systems, vol. 5 no. 3
Type: Research Article
ISSN: 1744-0084
