Search results

1 – 10 of over 4000
Article
Publication date: 6 February 2017

Aytug Onan

Abstract

Purpose

The immense quantity of available unstructured text documents serves as one of the largest sources of information. Text classification can be an essential task for many purposes in information retrieval, such as document organization, text filtering and sentiment analysis. Ensemble learning has been extensively studied to construct efficient text classification schemes with higher predictive performance and generalization ability. The purpose of this paper is to provide diversity among the classification algorithms of the ensemble, which is a key issue in ensemble design.

Design/methodology/approach

An ensemble scheme based on hybrid supervised clustering is presented for text classification. In the presented scheme, supervised hybrid clustering, which is based on the cuckoo search algorithm and k-means, is introduced to partition the data samples of each class into clusters so that training subsets with higher diversity can be provided. Each classifier is trained on the diversified training subsets, and the predictions of the individual classifiers are combined by the majority voting rule. The predictive performance of the proposed classifier ensemble is compared to conventional classification algorithms (such as Naïve Bayes, logistic regression, support vector machines and the C4.5 algorithm) and ensemble learning methods (such as AdaBoost, bagging and random subspace) using 11 text benchmarks.
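As a rough illustration of the two steps described above (per-class partitioning and majority voting), the following is a minimal sketch and not the authors' implementation. It assumes plain k-means in place of the hybrid cuckoo search/k-means clustering, and the function names `kmeans`, `partition_by_class` and `majority_vote` are hypothetical.

```python
from collections import Counter, defaultdict


def kmeans(points, k, iters=20):
    """Plain k-means on tuples of floats (assumption: stands in for the
    hybrid cuckoo search/k-means of the paper). Centroids start at the
    first k points, so each class needs at least k samples."""
    centroids = [tuple(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: mean of the members; an empty cluster keeps its centroid.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def partition_by_class(samples, labels, k):
    """Cluster the samples of each class separately: {label: [cluster, ...]}."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    result = {}
    for y, xs in by_class.items():
        assign = kmeans(xs, k)
        result[y] = [[x for x, a in zip(xs, assign) if a == c] for c in range(k)]
    return result


def majority_vote(predictions):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]
```

How the per-class clusters are recombined into training subsets is not stated in the abstract, so that step is left out of the sketch.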

Findings

The experimental results indicate that the presented classifier ensemble outperforms the conventional classification algorithms and ensemble learning methods for text classification.

Originality/value

The presented ensemble scheme is the first to use supervised clustering to obtain a diverse ensemble for text classification.

Details

Kybernetes, vol. 46 no. 2
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 9 May 2016

Chao-Lung Yang and Thi Phuong Quyen Nguyen

Abstract

Purpose

Class-based storage has been studied extensively and proved to be an efficient storage policy. However, few studies have addressed how to cluster stuck items for class-based storage. The purpose of this paper is to develop a constrained clustering method integrated with principal component analysis (PCA) to meet the need of clustering stored items while taking practical storage constraints into account.

Design/methodology/approach

To account for item characteristics and the associated storage restrictions, must-link and cannot-link constraints were constructed to meet the storage requirements. The cube-per-order index (COI), which has been used for location assignment in class-based warehouses, was analyzed by PCA. The proposed constrained clustering method uses the principal component loadings as item sub-group features to identify the COI distribution of item sub-groups. The clustering results are then used to allocate storage with a heuristic assignment model based on COI.
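The two building blocks named above can be sketched as follows. This is an illustrative sketch only: it uses the usual definition of COI as required storage space divided by order frequency, it omits the PCA step, and the helper names and sample items are hypothetical.

```python
def cube_per_order_index(space, orders):
    """COI: required storage space divided by the number of orders."""
    return space / orders


def rank_by_coi(items):
    """Sort item names by ascending COI. Low-COI items are conventionally
    placed closest to the input/output point.
    `items` maps name -> (space, orders)."""
    return sorted(items, key=lambda name: cube_per_order_index(*items[name]))


def violations(assignment, must_link, cannot_link):
    """List the pairwise constraints broken by a cluster assignment.
    `assignment` maps item name -> cluster id."""
    broken = []
    for a, b in must_link:
        if assignment[a] != assignment[b]:
            broken.append(("must-link", a, b))
    for a, b in cannot_link:
        if assignment[a] == assignment[b]:
            broken.append(("cannot-link", a, b))
    return broken
```

A constrained clustering method would search for an assignment for which `violations` returns an empty list; the check itself is independent of how that search is carried out.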

Findings

The clustering results showed that the proposed method provided better compactness among item clusters. The simulation results also showed that the new location assignment produced by the proposed method improved retrieval efficiency by 33 percent.

Practical implications

When the number of items in a warehouse is very large, identifying storage constraints through human intervention becomes impossible. The developed method can be applied to the problem regardless of the size of the data.

Originality/value

The case study demonstrated an example of a practical location assignment problem with constraints. This paper also sheds light on developing a data clustering method that can be applied directly to practical data analysis problems.

Details

Industrial Management & Data Systems, vol. 116 no. 4
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 2 September 2014

Manish Gupta, B. Chandra and M.P. Gupta

Abstract

Purpose

The purpose of this paper is to introduce the architecture of an Intelligent Decision Support System to fulfill the emerging responsibilities of law enforcement agencies.

Design/methodology/approach

The proposed Intelligent Police System (IPS) is designed to meet the emerging requirements and provide information at all levels of decision making by introducing a multi-level structure of user interface and crime analysis model. The proposed framework of IPS is based on data mining and performance measurement techniques to extract useful information like crime hot spots, predict crime trends and rank police administration units on the basis of crime prevention measures.

Findings

IPS has been implemented on actual Indian crime data provided by the National Crime Records Bureau (NCRB), which illustrates the effectiveness and usefulness of the proposed system. By analyzing crime data and sharing the resulting information, IPS can play a vital role in improving outcomes in crime investigation, criminal detection and other major areas of police functioning.

Research limitations/implications

Research on intelligent police information systems can be enhanced with important additional features, including a web-based management system, a geographical information system, mobile ad hoc network technology, etc.

Practical implications

IPS can easily be applied to any police system in the world and can be equally useful to any law enforcement agency in carrying out homeland security effectively.

Originality/value

The research reported in this manuscript is the outcome of a research project funded by NCRB. This paper is the first attempt to build a framework of IPS for the Indian police, who deal with a volume and rate of crime unmatched by any other police force in the world.

Details

Journal of Enterprise Information Management, vol. 27 no. 5
Type: Research Article
ISSN: 1741-0398

Article
Publication date: 10 April 2017

Evgeniy Kutsenko, Ekaterina Islankina and Vasily Abashkin

Abstract

Purpose

This paper aims at assessing the impacts of the national cluster policy, cluster age, cluster development benchmarks of neighbouring regions and the cumulative level of regional innovative capacity on the quantity and quality of cluster initiatives in Russia.

Design/methodology/approach

Hypothesis testing was carried out by a series of calculations comparing the qualitative and quantitative characteristics of cluster initiatives; the number of new cluster initiatives to the number of neighbouring regions where cluster initiatives had begun to develop earlier; and the ranks of regions within the Russian regional innovation scoreboard to the quantity and quality characteristics of cluster initiatives therein.

Findings

The results of the study empirically confirm that the national cluster policy significantly influenced the emergence and advancement of cluster initiatives in Russia. The proximity to the regions, having previously launched cluster support programmes, also had an impact on the emergence of new cluster initiatives. The cluster initiatives’ age had an ambiguous effect on their performance. Finally, the level of regional innovative capacity was correlated only with the number of cluster initiatives localised therein.

Practical implications

The findings show that along with the direct effects of the national cluster policy for the government-supported clusters, there are positive externalities, e.g. the emergence of new cluster initiatives throughout the country.

Originality/value

The research database of 277 cluster initiatives has been drawn up as a part of the first national cluster mapping and covers almost a decade of clustering activity in Russia. The study analyses not only the cluster initiatives supported by the federal government but also those developed independently.

Article
Publication date: 21 April 2020

Mohammed Anouar Naoui, Brahim Lejdel, Mouloud Ayad, Abdelfattah Amamra and Okba Kazar

Abstract

Purpose

The purpose of this paper is to propose a distributed deep learning architecture for smart cities in big data systems.

Design/methodology/approach

We propose a multilayer architecture to describe distributed deep learning for smart cities in big data systems. The components of our system are the Smart city layer, the big data layer and the deep learning layer. The Smart city layer is responsible for the Smart city components, its Internet of things, sensors and effectors, and their integration in the system. The big data layer concerns data characteristics and their distribution over the system. The deep learning layer is the model of our system and is responsible for data analysis.

Findings

We apply our proposed architecture to a Smart environment and to Smart energy. For the Smart environment, we study toluene forecasting in the Madrid Smart city. For Smart energy, we study wind energy forecasting in Australia. Our proposed architecture can reduce execution time and improve the deep learning model, such as Long Short-Term Memory.

Research limitations/implications

This research needs the application of other deep learning models, such as convolutional neural networks and autoencoders.

Practical implications

The findings of the research will be helpful in Smart city architecture. They can provide a clear view into a Smart city, data storage and data analysis. Toluene forecasting in a Smart environment can help decision-makers ensure environmental safety. The Smart energy component of our proposed model can give a clear prediction of power generation.

Originality/value

The findings of this study are expected to contribute valuable information to decision-makers for a better understanding of the key to Smart city architecture and its relation to data storage, processing and data analysis.

Details

Smart and Sustainable Built Environment, vol. 10 no. 1
Type: Research Article
ISSN: 2046-6099

Article
Publication date: 4 October 2018

Maha Al-Yahya

Abstract

Purpose

In the context of information retrieval, text genre is as important as its content, and knowledge of the text genre enhances search engine features by providing customized retrieval. The purpose of this study is to explore and evaluate the use of stylometric analysis, a quantitative analysis of the linguistic features of text, to support the task of automated text genre detection for Classical Arabic text.

Design/methodology/approach

Unsupervised clustering and supervised classification were applied to the King Saud University Corpus of Classical Arabic texts (KSUCCA) using the most frequent words (MFWs) in the corpus as stylometric features. Four popular distance measures established in stylometric research are evaluated for the genre detection task.
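The MFW feature extraction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: tokenization is plain whitespace splitting, and because the abstract does not name the four distance measures, Manhattan distance is used here purely as an example. The function names are hypothetical.

```python
from collections import Counter


def most_frequent_words(texts, n):
    """Return the n most frequent words across the whole corpus."""
    counts = Counter(word for text in texts for word in text.split())
    return [word for word, _ in counts.most_common(n)]


def mfw_profile(text, mfws):
    """Relative frequency of each MFW in one (non-empty) text."""
    words = text.split()
    counts = Counter(words)
    return [counts[w] / len(words) for w in mfws]


def manhattan(p, q):
    """Manhattan distance between two profiles (illustrative choice only)."""
    return sum(abs(a - b) for a, b in zip(p, q))
```

Pairwise distances between such profiles can then feed either a clustering routine or a nearest-neighbour style classifier.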

Findings

The results of the experiments show that stylometry-based genre clustering and classification align well with human-defined genre. The evidence suggests that genre style signals exist for Classical Arabic and can be used to support the task of automated genre detection.

Originality/value

This work targets the task of genre detection in Classical Arabic text using stylometric features, an approach that has previously been applied only to Arabic authorship attribution. The study also provides a comparison of four distance measures used in stylometric analysis, using clustering and classification on the KSUCCA, a corpus of over 50 million words of Classical Arabic.

Details

The Electronic Library, vol. 36 no. 5
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 28 September 2017

Holly Ellingwood, Karla Emeno, Craig Bennell, Adelle Forth, David Kosson and Robert D. Hare

Abstract

Purpose

The purpose of this paper is to examine the structure of juvenile psychopathy, as measured by the Psychopathy Checklist: Youth Version (PCL: YV).

Design/methodology/approach

Using a sample of 2,042 male youths from the USA, Canada, and the UK, the study was a conceptual replication of Bishopp and Hare’s (2008) multidimensional scaling (MDS) analysis of adult male offenders assessed with the Psychopathy Checklist-Revised.

Findings

The scaling analyses generally replicated those obtained by Bishopp and Hare, providing support for a multidimensional, four-factor model of juvenile psychopathy similar to that obtained with adults. However, a small number of items fell outside their predicted regions. Slight differences in the structure of juvenile psychopathy were found for incarcerated and supervised samples of youth, with the four-factor model breaking down slightly for the supervised sample. Item misplacements may indicate that certain items on the PCL: YV are being misinterpreted, reflect different dimensions for different samples, or cannot be reliably measured. Future research should examine these possibilities, with special attention being paid to supervised samples.

Originality/value

To the authors’ knowledge, this is one of the first known attempts to use MDS analysis to examine the psychopathy structures that emerge for male juvenile offenders. The greater nuances afforded by using MDS offer a more comprehensive understanding of psychopathy between incarcerated and supervised youth using the PCL: YV.

Details

Journal of Criminal Psychology, vol. 7 no. 4
Type: Research Article
ISSN: 2009-3829

Article
Publication date: 1 March 2000

Christian Bauer and Arno Scharl

Abstract

Describes an approach to automatically classify and evaluate publicly accessible World Wide Web sites. The suggested methodology is equally valuable for analyzing content and hypertext structures of commercial, educational and non‐profit organizations. Outlines a research methodology for model building and validation and defines the most relevant attributes of such a process. A set of operational criteria for classifying Web sites is developed. The introduced software tool supports the automated gathering of these parameters, and thereby assures the necessary “critical mass” of empirical data. Based on the preprocessed information, a multi‐methodological approach is chosen that comprises statistical clustering, textual analysis, supervised and non‐supervised neural networks and manual classification for validation purposes.

Details

Internet Research, vol. 10 no. 1
Type: Research Article
ISSN: 1066-2243

Article
Publication date: 7 February 2023

Riju Bhattacharya, Naresh Kumar Nagwani and Sarsij Tripathi

Abstract

Purpose

A community demonstrates the unique qualities and relationships between its members that distinguish it from other communities within a network. Network analysis relies heavily on community detection. Alongside the traditional spectral clustering and statistical inference methods, deep learning techniques for community detection have grown in popularity due to their ease of processing high-dimensional network data. Graph convolutional neural networks (GCNNs) have received much attention recently and have developed into a promising and widely used method for directly detecting communities on graphs. Inspired by the promising results of graph convolutional networks (GCNs) in analyzing graph-structured data, a novel community graph convolutional network (CommunityGCN) is proposed as a semi-supervised node classification model and compared with recent baseline methods: graph attention network (GAT), a GCN-based technique for unsupervised community detection, and Markov random fields combined with graph convolutional network (MRFasGCN).

Design/methodology/approach

This work presents the method for identifying communities that combines the notion of node classification via message passing with the architecture of a semi-supervised graph neural network. Six benchmark datasets, namely, Cora, CiteSeer, ACM, Karate, IMDB and Facebook, have been used in the experimentation.

Findings

In the first set of experiments, the scaled normalized average matrix of all neighbors' features, including those of the node itself, was obtained, followed by the weighted average matrix of low-dimensional nodes. In the second set of experiments, the average weighted matrix was forwarded to a two-layer GCN, and the activation function for predicting the node class was applied. The results demonstrate that node classification with GCN can improve the performance of identifying communities on graph datasets.
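The neighbor-averaging step described above can be illustrated with a small sketch. It is not the CommunityGCN implementation: the abstract does not specify the normalization, so simple row normalization (a mean over each node's neighbors plus the node itself) is assumed here, and a single layer with ReLU stands in for the two-layer network. The function names are hypothetical.

```python
def mean_aggregate(adj, features):
    """Average each node's feature vector with those of its neighbors.
    `adj` is a 0/1 adjacency matrix; the self-loop is added implicitly."""
    n = len(adj)
    dims = len(features[0])
    out = []
    for i in range(n):
        neigh = [j for j in range(n) if adj[i][j] or j == i]
        out.append([sum(features[j][d] for j in neigh) / len(neigh)
                    for d in range(dims)])
    return out


def gcn_layer(adj, features, weights):
    """One graph-convolution layer: aggregate, apply weights, then ReLU.
    `weights` has shape (input dims) x (output dims)."""
    agg = mean_aggregate(adj, features)
    return [
        [max(0.0, sum(row[d] * weights[d][o] for d in range(len(weights))))
         for o in range(len(weights[0]))]
        for row in agg
    ]
```

Stacking a second `gcn_layer` call and replacing the final ReLU with a softmax would give class scores per node; the symmetric normalization commonly used with GCNs is an alternative to the row normalization assumed here.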

Originality/value

The experiment reveals that the CommunityGCN approach has given better results with accuracy, normalized mutual information, F1 and modularity scores of 91.26, 79.9, 92.58 and 70.5 per cent, respectively, for detecting communities in the graph network, which is much greater than the range of 55.7–87.07 per cent reported in previous literature. Thus, it has been concluded that the GCN with node classification models has improved the accuracy.

Details

Data Technologies and Applications, vol. 57 no. 4
Type: Research Article
ISSN: 2514-9288
