Search results

1 – 10 of over 47000
Article
Publication date: 28 January 2014

Swarnalatha Purushotham and Balakrishna Tripathy

Abstract

Purpose

The purpose of this paper is to provide a way to analyze satellite images using various clustering algorithms and refined bitplane methods, together with other supporting techniques, to prove the superiority of rough intuitionistic fuzzy c-means (RIFCM).

Design/methodology/approach

A comparative study was carried out comparing RIFCM with other related algorithms for their suitability in analyzing satellite images, together with supporting techniques that segment the images for further processing in service of societal problems. Four satellite images were selected, showing hills, freshwater, a freshwater valley and drought.

Findings

The proposed algorithm, RIFCM with refined bitplane, was found to be superior to other clustering techniques combined with other supporting methods. The comparison was made by applying four metrics (Otsu (Max-Min), PSNR and RMSE (40%-60%-Min-Max), histogram analysis (Max-Max), DB index and D index (Max-Min)). It showed that the RIFCM algorithm with refined bitplane yielded robust results with efficient performance, with reductions in the metrics and in the time complexity of depth computation of satellite images for further processing.
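Two of the metrics named above, RMSE and PSNR, have standard definitions that can be sketched briefly. The following is a minimal illustration only, assuming 8-bit grayscale images stored as nested lists; the function names and the sample pixel values are hypothetical, and the abstract does not say how the authors computed or thresholded these metrics.

```python
import math

def rmse(reference, processed):
    """Root-mean-square error between two equally sized grayscale images."""
    ref = [pixel for row in reference for pixel in row]
    out = [pixel for row in processed for pixel in row]
    if len(ref) != len(out) or not ref:
        raise ValueError("images must be non-empty and the same size")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ref, out)) / len(ref))

def psnr(reference, processed, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = rmse(reference, processed)
    if error == 0:
        return math.inf
    return 20 * math.log10(max_value / error)

# Hypothetical 2x2 images (illustrative values, not data from the paper).
original = [[52, 55], [61, 59]]
segmented = [[50, 55], [61, 63]]
print(rmse(original, segmented), psnr(original, segmented))
```

A lower RMSE and a higher PSNR both indicate that the processed image stays closer to the reference.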

Practical implications

Better clustering of satellite images of land, hills, freshwater, freshwater valleys, drought, etc. is an achievement.

Originality/value

The existing system extends the novel framework to provide a more explicit way to analyze an image by removing distortions with refined bitplane slicing using the proposed algorithm of rough intuitionistic fuzzy c-means to show the superiority of RIFCM.

Article
Publication date: 9 May 2016

Chao-Lung Yang and Thi Phuong Quyen Nguyen

Abstract

Purpose

Class-based storage has been studied extensively and proved to be an efficient storage policy. However, little of the literature has addressed how to cluster stuck items for class-based storage. The purpose of this paper is to develop a constrained clustering method integrated with principal component analysis (PCA) to meet the need to cluster stored items while taking practical storage constraints into account.

Design/methodology/approach

To account for item characteristics and the associated storage restrictions, must-link and cannot-link constraints were constructed to meet the storage requirements. The cube-per-order index (COI), which has been used for location assignment in class-based warehouses, was analyzed by PCA. The proposed constrained clustering method uses the principal component loadings as item sub-group features to identify the COI distribution of item sub-groups. The clustering results are then used to allocate storage with a heuristic assignment model based on COI.
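As a rough illustration of COI-based location assignment, the sketch below uses the conventional definition of COI (required storage space divided by order frequency) and the textbook rule of placing low-COI items closest to the pick-up point. It is not the paper's heuristic assignment model, which the abstract does not describe; the function names, the one-slot-per-item simplification and the sample data are all assumptions.

```python
def cube_per_order_index(storage_space, order_frequency):
    """COI: required storage space divided by how often the item is ordered."""
    return storage_space / order_frequency

def assign_by_coi(items, location_distances):
    """Give the lowest-COI items the slots nearest the pick-up point.

    items: {item_name: (storage_space, order_frequency)}
    location_distances: {slot_name: travel distance from the pick-up point}
    Assumes one slot per item (a simplification for illustration).
    """
    ranked_items = sorted(items, key=lambda name: cube_per_order_index(*items[name]))
    ranked_slots = sorted(location_distances, key=location_distances.get)
    return dict(zip(ranked_items, ranked_slots))

# Hypothetical items and slots.
items = {"bolt": (2.0, 40), "pallet": (30.0, 10), "tape": (1.0, 5)}
slots = {"A": 5.0, "B": 12.0, "C": 20.0}
assignment = assign_by_coi(items, slots)
print(assignment)
```

Small, frequently ordered items end up nearest the pick-up point, which is the intuition behind using COI for class-based storage.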

Findings

The clustering results showed that the proposed method provided better compactness among item clusters. The simulation results also showed that the new location assignment produced by the proposed method improved retrieval efficiency by 33 percent.

Practical implications

When the number of items in a warehouse is very large, human intervention to reveal storage constraints becomes impossible. The developed method can readily be applied to the problem regardless of the size of the data.

Originality/value

The case study demonstrated an example of a practical location assignment problem with constraints. This paper also sheds light on developing a data clustering method that can be applied directly to practical data analysis problems.

Details

Industrial Management & Data Systems, vol. 116 no. 4
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 23 November 2010

Yongzheng Zhang, Evangelos Milios and Nur Zincir‐Heywood

Abstract

Purpose

Summarization of an entire web site with diverse content may lead to a summary heavily biased towards the site's dominant topics. The purpose of this paper is to present a novel topic‐based framework to address this problem.

Design/methodology/approach

A two‐stage framework is proposed. The first stage identifies the main topics covered in a web site via clustering and the second stage summarizes each topic separately. The proposed system is evaluated by a user study and compared with the single‐topic summarization approach.

Findings

The user study demonstrates that the clustering‐summarization approach statistically significantly outperforms the plain summarization approach in the multi‐topic web site summarization task. Text‐based clustering based on selecting features with high variance over web pages is reliable; outgoing links are useful if a rich set of cross links is available.

Research limitations/implications

More sophisticated clustering methods than those used in this study are worth investigating. The proposed method should be tested on web content that is less structured than organizational web sites, for example blogs.

Practical implications

The proposed summarization framework can be applied to the effective organization of search engine results and faceted or topical browsing of large web sites.

Originality/value

Several key components are integrated for web site summarization for the first time, including feature selection and link analysis, key phrase and key sentence extraction. Insight into the contributions of links and content to topic‐based summarization was gained. A classification approach is used to minimize the number of parameters.

Details

International Journal of Web Information Systems, vol. 6 no. 4
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 28 February 2023

Meltem Aksoy, Seda Yanık and Mehmet Fatih Amasyali

Abstract

Purpose

When a large number of project proposals are evaluated to allocate available funds, grouping them based on their similarities is beneficial. Current approaches to group proposals are primarily based on manual matching of similar topics, discipline areas and keywords declared by project applicants. When the number of proposals increases, this task becomes complex and requires excessive time. This paper aims to demonstrate how to effectively use the rich information in the titles and abstracts of Turkish project proposals to group them automatically.

Design/methodology/approach

This study proposes a model that effectively groups Turkish project proposals by combining word embedding, clustering and classification techniques. The proposed model uses FastText, BERT and term frequency/inverse document frequency (TF/IDF) word-embedding techniques to extract terms from the titles and abstracts of project proposals in Turkish. The extracted terms were grouped using both the clustering and classification techniques. Natural groups contained within the corpus were discovered using k-means, k-means++, k-medoids and agglomerative clustering algorithms. Additionally, this study employs classification approaches to predict the target class for each document in the corpus. To classify project proposals, various classifiers, including k-nearest neighbors (KNN), support vector machines (SVM), artificial neural networks (ANN), classification and regression trees (CART) and random forest (RF), are used. Empirical experiments were conducted to validate the effectiveness of the proposed method by using real data from the Istanbul Development Agency.
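The idea of turning proposal texts into vectors that a clustering or classification algorithm can consume can be sketched with TF-IDF alone. This is a minimal standard-library illustration, not the authors' pipeline: it assumes whitespace tokenization, uses invented English sample texts rather than Turkish proposals, and leaves out FastText, BERT and the clustering and classification steps. All names are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(documents):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is all zeros)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Invented sample titles, for illustration only.
proposals = [
    "solar energy storage project",
    "solar panel energy project",
    "tourism marketing project",
]
vectors = tfidf_vectors(proposals)
print(cosine_similarity(vectors[0], vectors[1]), cosine_similarity(vectors[0], vectors[2]))
```

Because "project" occurs in every sample text, its weight is zero, so the two energy-related texts are similar to each other while the tourism text shares no weighted terms with them.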

Findings

The results show that the generated word embeddings can effectively represent proposal texts as vectors, and can be used as inputs for clustering or classification algorithms. Using clustering algorithms, the document corpus is divided into five groups. In addition, the results demonstrate that the proposals can easily be categorized into predefined categories using classification algorithms. SVM-Linear achieved the highest prediction accuracy (89.2%) with the FastText word embedding method. A comparison of manual grouping with automatic classification and clustering results revealed that both classification and clustering techniques have a high success rate.

Research limitations/implications

The proposed model automatically benefits from the rich information in project proposals and significantly reduces numerous time-consuming tasks that managers must perform manually. Thus, it eliminates the drawbacks of the current manual methods and yields significantly more accurate results. In the future, additional experiments should be conducted to validate the proposed method using data from other funding organizations.

Originality/value

This study presents the application of word embedding methods to effectively use the rich information in the titles and abstracts of Turkish project proposals. In existing studies on the automatic grouping of proposals, traditional frequency-based word embedding methods are used for feature extraction to represent project proposals. Unlike previous research, this study employs two high-performing neural network-based textual feature extraction techniques to obtain terms representing the proposals: BERT as a contextual word embedding method and FastText as a static word embedding method. Moreover, to the best of our knowledge, no research has been conducted on the grouping of project proposals in Turkish.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 16 no. 3
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 9 April 2018

Guijun Wang and Guoying Zhang

Abstract

Purpose

This paper aims to overcome the defect that the traditional clustering method is excessively dependent on initial clustering radius and also provide new technical measures for detecting the component content of lubricating oil based on the fuzzy neural system model.

Design/methodology/approach

Following the layered structure of the fuzzy neural system model for the given sample data pairs, the new clustering method is implemented, and a detection method for the selected oil samples is derived from the fuzzy system model. Applying this method, the composition contents of 30 kinds of lubricating oil samples are checked and compared with the actual composition contents of the samples.

Findings

The detection of 21 mineral elements in 30 oil samples shows that four mineral elements, Zn, P, Ca and Mg, have the largest contribution rate to the lubricating oil and can be regarded as the main factors for classifying lubricating oil. The results show that the fuzzy system established on the basis of sample data clustering performs better in detecting lubricant component content.

Originality/value

Although many methods currently exist for detecting the components of lubricating oil, none detects them through a clustering method based on sample data pairs. A new nearest clustering method is proposed in this paper, and it can be used more effectively to detect the content of lubricating oil.

Details

Industrial Lubrication and Tribology, vol. 70 no. 3
Type: Research Article
ISSN: 0036-8792

Article
Publication date: 15 May 2019

Ahmad Ali Abin

Abstract

Purpose

Constrained clustering is an important recent development in clustering literature. The goal of an algorithm in constrained clustering research is to improve the quality of clustering by making use of background knowledge. The purpose of this paper is to suggest a new perspective for constrained clustering, by finding an effective transformation of data into target space on the reference of background knowledge given in the form of pairwise must- and cannot-link constraints.
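Pairwise must-link and cannot-link constraints can be made concrete with a small check of how many constraints a given clustering breaks. The sketch below is only an illustration of the constraint format, with hypothetical names and data; it is not the non-linear transformation method the paper proposes.

```python
def count_violations(labels, must_link, cannot_link):
    """Count pairwise constraints broken by a cluster assignment.

    labels: cluster label of each point, indexed by point id
    must_link: pairs (i, j) that should share a cluster
    cannot_link: pairs (i, j) that should be in different clusters
    """
    broken_must = sum(1 for i, j in must_link if labels[i] != labels[j])
    broken_cannot = sum(1 for i, j in cannot_link if labels[i] == labels[j])
    return broken_must + broken_cannot

# Hypothetical assignment of four points to two clusters.
labels = [0, 0, 1, 1]
must_link = [(0, 1), (1, 2)]
cannot_link = [(0, 3), (2, 3)]
print(count_violations(labels, must_link, cannot_link))
```

A count like this is one simple way to compare candidate clusterings against the background knowledge encoded in the constraints.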

Design/methodology/approach

Most existing methods in constrained clustering are limited to learning a distance metric or kernel matrix from the background knowledge while looking for a transformation of the data into the target space. Unlike previous efforts, the author presents a non-linear method for constrained clustering, whose basic idea is to use a different non-linear function for each dimension in the target space.

Findings

The outcome of the paper is a novel non-linear method for constrained clustering which uses different non-linear functions for each dimension in target space. The proposed method for a particular case is formulated and explained for quadratic functions. To reduce the number of optimization parameters, the proposed method is modified to relax the quadratic function and approximate it by a factorized version that is easier to solve. Experimental results on synthetic and real-world data demonstrate the efficacy of the proposed method.

Originality/value

This study proposes a new direction to the problem of constrained clustering by learning a non-linear transformation of data into target space without using kernel functions. This work will assist researchers to start development of new methods based on the proposed framework which will potentially provide them with new research topics.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 12 no. 2
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 1 March 1984

ALAN GRIFFITHS, LESLEY A. ROBINSON and PETER WILLETT

Abstract

This paper considers the classifications produced by application of the single linkage, complete linkage, group average and Ward clustering methods to the Keen and Cranfield document test collections. Experiments were carried out to study the structure of the hierarchies produced by the different methods, the extent to which the methods distort the input similarity matrices during the generation of a classification, and the retrieval effectiveness obtainable in cluster based retrieval. The results would suggest that the single linkage method, which has been used extensively in previous work on document clustering, is not the most effective procedure of those tested, although it should be emphasized that the experiments have used only small document test collections.
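The linkage methods compared in this abstract differ in how they measure the distance between two clusters. The sketch below shows single linkage, complete linkage and group average on points with Euclidean distance; Ward's method is omitted, and the paper itself works from document similarity matrices, so the distance measure, function names and sample points here are assumptions for illustration.

```python
import math
from itertools import product

def pairwise_distances(cluster_a, cluster_b):
    """Euclidean distances between every cross-cluster pair of points."""
    return [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]

def single_linkage(cluster_a, cluster_b):
    """Distance between the closest pair of points."""
    return min(pairwise_distances(cluster_a, cluster_b))

def complete_linkage(cluster_a, cluster_b):
    """Distance between the farthest pair of points."""
    return max(pairwise_distances(cluster_a, cluster_b))

def group_average(cluster_a, cluster_b):
    """Mean distance over all cross-cluster pairs."""
    distances = pairwise_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)

# Hypothetical 2-D points.
left = [(0, 0), (0, 1)]
right = [(3, 0), (4, 0)]
print(single_linkage(left, right), group_average(left, right), complete_linkage(left, right))
```

An agglomerative algorithm repeatedly merges the two clusters with the smallest such distance, so the choice of linkage determines the shape of the resulting hierarchy.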

Details

Journal of Documentation, vol. 40 no. 3
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 8 January 2018

Peng Li and Cuiping Wei

Abstract

Purpose

In multi-criteria decision-making with interval grey number information, decision makers usually take a risk to rank or choose some very similar alternatives. Additionally, when evaluating only one alternative, decision makers can only obtain a specific value using traditional decision-making methods and may find it hard to cluster the alternatives to the “correct class” because these methods lack predetermined reference points. To overcome this problem, this paper aims to propose a two-stage grey decision-making method.

Design/methodology/approach

First, a new type of clustering method for interval grey numbers is designed by proposing a new possibility function for grey numbers. Based on this clustering method, a new grey clustering evaluation model for interval grey numbers is proposed by which decision makers can obtain the grade rating information of each alternative. Then, according to the grey clustering evaluation model, a new two-stage decision-making method is introduced to solve the problem that some alternatives are very similar by designing a grey comprehensive decision coefficient of alternatives.

Findings

The authors propose a new grey clustering evaluation model to deal with interval grey numbers. They design a new model to obtain the membership degree for interval grey numbers and then propose a new grey clustering evaluation model, which can evaluate a single alternative against predefined grey classes. Then, using the grey comprehensive decision coefficient, a two-stage grey evaluation decision-making method is put forward to solve the problem that some alternatives are very close and hard to distinguish.

Originality/value

A new grey clustering evaluation model is proposed, which can evaluate a single alternative against predefined grey classes. A two-stage grey evaluation decision-making method is given to solve the problem that some alternatives are very close and hard to distinguish.

Details

Kybernetes, vol. 47 no. 4
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 30 April 2021

Faruk Bulut, Melike Bektaş and Abdullah Yavuz

Abstract

Purpose

In this study, supervision and control of the possible problems among people over a large area with a limited number of drone cameras and security staff is established.

Design/methodology/approach

These drones, namely unmanned aerial vehicles (UAVs), will be adaptively and automatically distributed over the crowds by the proposed system to control and track the communities. Since crowds are mobile, the drone clusters will be continuously re-organized according to the densities and distributions of people. An adaptive and dynamic distribution and routing mechanism of UAV fleets for crowds is implemented to control a specific given region. Nine popular clustering algorithms have been used and tested in the presented mechanism to gain better performance.

Findings

The aggregated model achieved better clustering performance than any single clustering method over five different test cases of human crowd distributions.

Originality/value

This study has three basic components. The first one is to divide the human crowds into clusters. The second one is to determine an optimum route of UAVs over clusters. The last one is to direct the most appropriate security personnel to the events that occurred.

Details

International Journal of Intelligent Unmanned Systems, vol. 12 no. 1
Type: Research Article
ISSN: 2049-6427

Article
Publication date: 19 June 2017

Khai Tan Huynh, Tho Thanh Quan and Thang Hoai Bui

Abstract

Purpose

Service-oriented architecture is an emerging software architecture, in which web services (WSs) play a crucial role. In this architecture, WS composition and verification are required when handling complex service requirements from users. When the number of WSs becomes very large in practice, the complexity of composition and verification is correspondingly high. In this paper, the authors aim to propose a logic-based clustering approach to solve this problem by separating the original repository of WSs into clusters. Moreover, they also propose a so-called quality-controlled clustering approach to ensure the quality of the generated clusters within a reasonable execution time.

Design/methodology/approach

The approach represents WSs as logical formulas on which the authors conduct the clustering task. They also combine two of the most popular clustering approaches, hierarchical agglomerative clustering (HAC) and k-means, to ensure the quality of the generated clusters.

Findings

This logic-based clustering approach significantly increases the performance of WS composition and verification. Furthermore, the logic-based approach helps to maintain the soundness and completeness of the composition solution. Finally, the quality-controlled strategy can ensure the quality of the generated clusters with low time complexity.

Research limitations/implications

The work discussed in this paper is implemented only as a research tool known as WSCOVER. More work is needed to make it a practical and usable system for real-life applications.

Originality/value

In this paper, the authors propose a logic-based paradigm to represent and cluster WSs. Moreover, they also propose a quality-controlled clustering approach that combines and takes advantage of two of the most popular clustering approaches, HAC and k-means.
