Search results

1 – 10 of over 1000
Article
Publication date: 19 October 2010

Ashish Kathuria, Bernard J. Jansen, Carolyn Hafernik and Amanda Spink

Abstract

Purpose

Web search engines are frequently used by people to locate information on the Internet. However, not all queries have an informational goal. Instead of information, some people may be looking for specific web sites or may wish to conduct transactions with web services. This paper aims to focus on automatically classifying the different user intents behind web queries.

Design/methodology/approach

For the research reported in this paper, 130,000 web search engine queries are categorized as informational, navigational, or transactional using a k‐means clustering approach based on a variety of query traits.
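The clustering step described above can be sketched with a plain Lloyd's-algorithm k-means. This is a minimal illustration, not the paper's implementation: the three query traits used here (term count, presence of a domain suffix, presence of a transactional term) and the choice of k = 3 are invented for brevity — the paper clusters on a larger set of traits and reports eight clusters.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's-algorithm k-means on numeric feature vectors."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest centroid (squared Euclidean)
            nearest = min(range(k),
                          key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(p, centroids[c])))
            clusters[nearest].append(p)
        # move each centroid to the mean of its cluster (keep it if empty)
        centroids = [[sum(col) / len(cl) for col in zip(*cl)] if cl else centroids[i]
                     for i, cl in enumerate(clusters)]
    return centroids, clusters

# Hypothetical query traits: (number of terms, has a domain suffix,
# contains a transactional term such as "buy" or "download").
queries = [(2, 0, 0), (3, 0, 0), (4, 0, 0), (1, 1, 0),
           (2, 1, 0), (2, 0, 1), (3, 0, 1)]
centroids, clusters = kmeans(queries, k=3)
```

Each resulting cluster would then be labeled informational, navigational, or transactional by inspecting its dominant traits.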

Findings

The research findings show that web queries fall into eight clusters: six primarily informational, one primarily navigational, and one primarily transactional. More than 75 percent of web queries are informational in nature, with roughly 12 percent each navigational and transactional.

Research limitations/implications

This study provides an important contribution to web search literature because it provides information about the goals of searchers and a method for automatically classifying the intents of the user queries. Automatic classification of user intent can lead to improved web search engines by tailoring results to specific user needs.

Practical implications

The paper discusses how web search engines can use automatically classified user queries to provide more targeted and relevant results by implementing the real-time classification method presented in this research.

Originality/value

This research investigates a new application of a method for automatically classifying the intent of user queries. There has been limited research to date on automatically classifying the user intent of web queries, even though the pay-off for web search engines could be substantial.

Details

Internet Research, vol. 20 no. 5
Type: Research Article
ISSN: 1066-2243

Open Access
Article
Publication date: 22 November 2022

Kedong Yin, Yun Cao, Shiwei Zhou and Xinman Lv

Abstract

Purpose

The purposes of this research are to study the theory and method of multi-attribute index system design and establish a set of systematic, standardized, scientific index systems for the design optimization and inspection process. The research may form the basis for a rational, comprehensive evaluation and provide the most effective way of improving the quality of management decision-making. It is of practical significance to improve the rationality and reliability of the index system and provide standardized, scientific reference standards and theoretical guidance for the design and construction of the index system.

Design/methodology/approach

Using modern methods such as complex networks and machine learning, a system for the quality diagnosis of index data and the classification and stratification of index systems is designed. This guarantees the quality of the index data, realizes the scientific classification and stratification of the index system, reduces the subjectivity and randomness of the design of the index system, enhances its objectivity and rationality and lays a solid foundation for the optimal design of the index system.

Findings

Based on ideas from statistics, system theory, machine learning and data mining, the present research focuses on "data quality diagnosis" and "index classification and stratification" and clarifies the classification standards and data-quality characteristics of index data. A data-quality diagnosis system of "data review – data cleaning – data conversion – data inspection" is established. Using decision trees, interpretive structural models, cluster analysis, k-means clustering and other methods, a classification and stratification method system for indicators is designed to reduce the redundancy of indicator data and improve the quality of the data used. Finally, the scientific and standardized classification and hierarchical design of the index system can be realized.
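The four-stage "data review – data cleaning – data conversion – data inspection" pipeline can be sketched as a chain of functions. The specific rules below (mean imputation, min-max normalisation) are illustrative assumptions, not the methods used in the paper:

```python
def review(values):
    """Data review: flag obviously invalid entries (non-numeric or None)."""
    return [v if isinstance(v, (int, float)) else None for v in values]

def clean(values):
    """Data cleaning: impute flagged entries with the series mean."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]

def convert(values):
    """Data conversion: min-max normalise to [0, 1] so that
    mixed-unit indicators become comparable."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def inspect(values):
    """Data inspection: final check that everything is in range."""
    assert all(0.0 <= v <= 1.0 for v in values)
    return values

# Toy indicator series with a missing and a garbled entry.
raw = [3.0, None, 4.5, "n/a", 6.0]
processed = inspect(convert(clean(review(raw))))
```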

Originality/value

The innovative contributions and research value of the paper are reflected in three aspects. First, a method system for index data quality diagnosis is designed, and multi-source data fusion technology is adopted to ensure the quality of multi-source, heterogeneous and mixed-frequency data of the index system. The second is to design a systematic quality-inspection process for missing data based on the systematic thinking of the whole and the individual. Aiming at the accuracy, reliability, and feasibility of the patched data, a quality-inspection method of patched data based on inversion thought and a unified representation method of data fusion based on a tensor model are proposed. The third is to use the modern method of unsupervised learning to classify and stratify the index system, which reduces the subjectivity and randomness of the design of the index system and enhances its objectivity and rationality.

Details

Marine Economics and Management, vol. 5 no. 2
Type: Research Article
ISSN: 2516-158X

Article
Publication date: 10 June 2014

Diana Janeth Lancheros-Cuesta, Angela Carrillo-Ramos and Jaime A. Pavlich-Mariscal

Abstract

Purpose

This article aims to propose an adaptation algorithm that combines the analytical hierarchy process (AHP), a rule-based system and a k-means clustering algorithm. Informatics tools are very useful for enhancing the learning process in the classroom, but their large variety requires advanced decision-making techniques to select parameters, such as student profiles and preferences, and to adjust content and information display according to the specific characteristics and needs of students. The algorithm is part of Kamachiy–Idukay (KI), a platform that offers adaptive educational services to students with learning difficulties or disabilities.

Design and Methodology

The design and implementation of the adaptation algorithm comprises the following phases: utilization of the AHP to determine the most important student parameters to take into account in the adaptation process, such as preferences, learning styles, performance in language, attention and memory aspects, and disabilities; design of the first part of the adaptation algorithm, based on a rule-based system; design of the second part, based on k-means clustering; integration of the adaptation algorithm into KI; and validation of the approach in a primary school in Bogotá (Colombia).
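The AHP step that ranks student parameters can be sketched as follows: the priority weights are the normalized principal eigenvector of a pairwise comparison matrix, approximated here by power iteration. The comparison values and the three criteria named in the comments are invented for illustration:

```python
def ahp_weights(M, iters=100):
    """Approximate the principal eigenvector of a pairwise comparison
    matrix by power iteration; normalized, it gives the AHP priorities."""
    n = len(M)
    w = [1.0 / n] * n
    for _ in range(iters):
        w = [sum(M[i][j] * w[j] for j in range(n)) for i in range(n)]
        s = sum(w)
        w = [x / s for x in w]  # renormalize so the weights sum to 1
    return w

# Hypothetical comparisons among three student parameters:
# learning style vs. preferences vs. attention/memory performance.
# M[i][j] says how much more important criterion i is than criterion j.
M = [[1,   3,   5],
     [1/3, 1,   2],
     [1/5, 1/2, 1]]
w = ahp_weights(M)
```

The largest weight then identifies the parameter that dominates the adaptation decisions.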

Approach

The main approach is the application of computational techniques, namely, rule-based systems and k-means clustering, plus an AHP prioritization at design time to yield a system to support the teaching–learning process for students with disabilities or learning difficulties.

Findings

The algorithm found several groups of students with specific learning difficulties that required adapted activities. The algorithm also prioritized activities according to learning style and preferences. The results of the application of this system in a real classroom yielded positive results.

Limitations of the research

The algorithm performs adaptation for students with mild disabilities or learning difficulties (language, attention and memory). The algorithm does not address severe disabilities that could greatly affect cognitive abilities.

Contributions

The main contribution of this paper is an adaptation algorithm with the following distinctive characteristics: it is designed using the AHP, which ensures a proper prioritization of the student characteristics in the adaptation process, and it utilizes a rule-based system to identify different adaptation scenarios and k-means clustering to group students with similar adaptation requirements.

Details

International Journal of Web Information Systems, vol. 10 no. 2
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 13 August 2018

Habiba Abdessalem and Saloua Benammou

Abstract

Purpose

The purpose of this paper is to apply the wavelet thresholding technique to analyze economic socio-political situations in Tunisia using textual data sets. The technique is used to remove noise from the contingency table. A comparative study of correspondence analysis and classification results (using the k-means algorithm) is carried out before and after denoising.

Design/methodology/approach

The textual data set is collected from an electronic newspaper that covers current economic news about Tunisia. Both the hard- and soft-thresholding techniques are applied, based on various Daubechies wavelets with different vanishing moments.
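The two thresholding rules can be sketched as below. This is an illustrative toy, not the paper's method: a single-level Haar transform stands in for the Daubechies wavelets, and the data row is invented.

```python
def haar_step(x):
    """One level of the orthonormal Haar transform of an even-length list."""
    s = 2 ** 0.5
    approx = [(a + b) / s for a, b in zip(x[::2], x[1::2])]
    detail = [(a - b) / s for a, b in zip(x[::2], x[1::2])]
    return approx, detail

def inverse_haar_step(approx, detail):
    s = 2 ** 0.5
    x = []
    for a, d in zip(approx, detail):
        x += [(a + d) / s, (a - d) / s]
    return x

def hard_threshold(coeffs, t):
    """Hard rule: zero out small coefficients, keep the rest unchanged."""
    return [c if abs(c) > t else 0.0 for c in coeffs]

def soft_threshold(coeffs, t):
    """Soft rule: zero out small coefficients, shrink the rest toward 0 by t."""
    return [(abs(c) - t) * (1 if c > 0 else -1) if abs(c) > t else 0.0
            for c in coeffs]

# Denoise a toy row of a contingency table: shrink small detail
# coefficients, keep the approximation, and reconstruct.
row = [12.0, 11.0, 30.0, 31.0, 5.0, 6.0, 20.0, 19.0]
approx, detail = haar_step(row)
denoised = inverse_haar_step(approx, soft_threshold(detail, t=1.0))
```

With this threshold all detail coefficients fall below t, so each pair is replaced by its average — small word-count fluctuations are smoothed away while the large-scale profile of the row survives.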

Findings

The results demonstrate the effectiveness of the wavelet denoising method in textual data analysis. On one hand, the technique reduces the loss of information generated by correspondence analysis, ensures a better quality of representation of the factorial plane, reduces the need for lemmatization in textual analysis and improves the results of classification by the k-means algorithm. On the other hand, the proximities revealed by the factorial visualization reflect the economic situation of Tunisia during the studied period, showing a mainly stable situation before the revolution and a deteriorated one after it.

Originality/value

This is the first study to analyze economic socio-political relations using textual data. The originality of the paper also lies in the joint use of correspondence analysis and wavelet thresholding in textual data analysis.

Details

Journal of Economic Studies, vol. 45 no. 3
Type: Research Article
ISSN: 0144-3585

Article
Publication date: 31 May 2011

S. Thirunavukkarasu, B.P.C. Rao, G.K. Sharma, Viswa Chaithanya, C. Babu Rao, T. Jayakumar, Baldev Raj, Aravinda Pai, T.K. Mitra and Pandurang Jadhav

Abstract

Purpose

The purpose of this paper is the development of a non-destructive methodology for the detection of arc strike, spatter and fusion types of welding defects, which may form on steam generator (SG) tubes that are in close proximity to the circumferential shell welds. Such defects, especially fusion-type defects, are detrimental to the structural integrity of the SG.

Design/methodology/approach

This paper presents a new methodology for non-destructive detection of arc strike, spatter and fusion types of welding defects. The methodology uses remote field eddy current (RFEC) and ultrasonic non-destructive techniques together with K-means clustering.

Findings

Distinctly different RFEC signals have been observed for the three types of defects, and this information has been effectively utilized for automated identification of weld fusion, which produces two back-wall echoes in ultrasonic A-scan signals. The methodology can readily distinguish fusion-type defects from arc strike and spatter defects.

Originality/value

The methodology is unique as there is no standard guideline for non‐destructive evaluation of peripheral tubes after shell welding to detect arc strike, spatter and fusion type of welding defects.

Details

International Journal of Structural Integrity, vol. 2 no. 2
Type: Research Article
ISSN: 1757-9864

Article
Publication date: 6 February 2017

Aytug Onan

Abstract

Purpose

The immense quantity of available unstructured text documents serves as one of the largest sources of information. Text classification can be an essential task for many purposes in information retrieval, such as document organization, text filtering and sentiment analysis. Ensemble learning has been extensively studied as a way to construct text classification schemes with higher predictive performance and generalization ability. The purpose of this paper is to provide diversity among the classification algorithms of an ensemble, which is a key issue in ensemble design.

Design/methodology/approach

An ensemble scheme based on hybrid supervised clustering is presented for text classification. In the presented scheme, supervised hybrid clustering, based on the cuckoo search algorithm and k-means, is introduced to partition the data samples of each class into clusters so that training subsets with higher diversity can be provided. Each classifier is trained on the diversified training subsets, and the predictions of the individual classifiers are combined by the majority voting rule. The predictive performance of the proposed classifier ensemble is compared to conventional classification algorithms (such as Naïve Bayes, logistic regression, support vector machines and the C4.5 algorithm) and ensemble learning methods (such as AdaBoost, bagging and random subspace) using 11 text benchmarks.
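The final combination step, the majority voting rule, can be sketched as follows. The class labels and per-classifier predictions are invented for illustration; the base classifiers themselves (each trained on a differently clustered subset) are assumed to exist:

```python
from collections import Counter

def majority_vote(predictions_per_classifier):
    """Combine per-classifier label predictions by the majority voting rule.
    predictions_per_classifier: one inner list of labels per classifier,
    aligned over the same test documents."""
    combined = []
    for votes in zip(*predictions_per_classifier):
        # most_common(1) returns the label with the most votes
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Three hypothetical base classifiers voting on four test documents.
preds = [
    ["sports", "politics", "sports", "tech"],
    ["sports", "politics", "tech", "tech"],
    ["politics", "politics", "sports", "sports"],
]
print(majority_vote(preds))  # ['sports', 'politics', 'sports', 'tech']
```

Diversity matters precisely here: the vote only corrects individual errors when the base classifiers do not all make the same mistakes.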

Findings

The experimental results indicate that the presented classifier ensemble outperforms the conventional classification algorithms and ensemble learning methods for text classification.

Originality/value

The presented ensemble scheme is the first to use supervised clustering to obtain a diverse ensemble for text classification.

Details

Kybernetes, vol. 46 no. 2
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 5 December 2022

Nejib Fattam, Tarik Saikouk, Ahmed Hamdi, Alan Win and Ismail Badraoui

Abstract

Purpose

This paper aims to elaborate on current research on fourth party logistics “4PL” by offering a taxonomy that provides a deeper understanding of 4PL service offerings, thus drawing clear frontiers between existing 4PL business models.

Design/methodology/approach

The authors collected data using semi-structured interviews conducted with 60 logistics executives working in 44 “4PL” providers located in France. Using automatic analysis of textual data, the authors combined spatial visualisation, clustering analysis and hierarchical descending classification to generate the taxonomy.

Findings

Two key dimensions emerged, allowing the authors to clearly identify and distinguish four 4PL business models: the level of reliance on interpersonal relationships and the level of involvement in 4PL service offering. As a result, 4PL providers fall under one of the following business models in the taxonomy: (1) The Metronome, (2) The Architect, (3) The Nostalgic and (4) The Minimalist.

Research limitations/implications

The study focuses on investigating 4PL providers located in France; thus, future studies should explore the classification of 4PL business models across different cultural contexts and social structures.

Practical implications

The findings offer valuable managerial insights for logistics executives and clients of 4PL to better orient their needs, the negotiations and the contracting process with 4PLs.

Originality/value

Using lexicometric analysis, the authors develop a taxonomy of 4PL service providers based on empirical evidence from logistics executives. The work addresses the existing confusion regarding the conceptualisation of 4PL firms relative to other types of logistics providers, and the role of formal and informal interpersonal relationships in logistical intermediation.

Details

The International Journal of Logistics Management, vol. 34 no. 6
Type: Research Article
ISSN: 0957-4093

Article
Publication date: 7 November 2019

Andika Rachman and R.M. Chandima Ratnayake

Abstract

Purpose

Corrosion loop development is an integral part of the risk-based inspection (RBI) methodology. The corrosion loop approach allows a group of piping to be analyzed simultaneously, reducing non-value-adding activities by eliminating repetitive degradation-mechanism assessment for piping with similar operational and design characteristics. However, corrosion loop development requires a rigorous process that consumes a considerable amount of engineering man-hours. Moreover, it is a type of knowledge-intensive work that involves engineering judgement and intuition, causing the output to have high variability. The purpose of this paper is to reduce both the time required for and the output variability of the corrosion loop development process by utilizing machine learning and the group technology method.

Design/methodology/approach

To achieve the research objectives, k-means clustering and a non-hierarchical classification model are utilized to construct an algorithm that enables automation and a more effective and efficient corrosion loop development process. A case study demonstrates the functionality and performance of the corrosion loop development algorithm on an actual piping data set.

Findings

The results show that corrosion loops generated by the algorithm have lower variability and higher coherence than corrosion loops produced by manual work. Additionally, the utilization of the algorithm simplifies the corrosion loop development workflow, which potentially reduces the amount of time required to complete the development. The application of corrosion loop development algorithm is expected to generate a “leaner” overall RBI assessment process.

Research limitations/implications

Although the algorithm allows part of the corrosion loop development workflow to be automated, it is still necessary to incorporate the engineer's expertise, experience and intuition into the algorithm outputs in order to capture tacit knowledge and refine the insights generated by the algorithm.

Practical implications

This study shows that the advancement of Big Data analytics and artificial intelligence can promote the substitution of machines for human labor in highly complex tasks requiring high qualifications and cognitive skills, including in the inspection and maintenance management area.

Originality/value

This paper discusses a novel way of developing a corrosion loop. Corrosion loop development is an integral part of the RBI methodology, but it has received less attention among scholars in inspection- and maintenance-related subjects.

Details

Journal of Quality in Maintenance Engineering, vol. 26 no. 3
Type: Research Article
ISSN: 1355-2511

Article
Publication date: 22 March 2013

Chih‐Fong Tsai, Ya‐Han Hu, Chia‐Sheng Hung and Yu‐Feng Hsu

Abstract

Purpose

Customer lifetime value (CLV) has received increasing attention in database marketing, as enterprises can retain valuable customers through their correct identification. In the literature, many data mining and machine learning techniques have been applied to develop CLV models. Specifically, hybrid techniques have shown their superiority over single techniques. However, it is unknown which hybrid model performs best in customer value prediction. Therefore, the purpose of this paper is to compare the two types of commonly used hybrid models, built by the classification + classification and clustering + classification hybrid approaches, respectively, in terms of customer value prediction.

Design/methodology/approach

To construct a hybrid model, multiple techniques are usually combined in a two-stage manner, in which the first stage, based on either clustering or classification techniques, pre-processes the data. The output of the first stage (i.e. the processed data) is then used to construct the second-stage classifier as the prediction model. Specifically, decision trees, logistic regression and neural networks are used as the classification techniques, and k-means and self-organizing maps as the clustering techniques, yielding six different hybrid models.
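The clustering + classification variant can be sketched as below. Everything here is an illustrative assumption: the customer features are invented, k-means plays the first stage (its cluster id is appended as a feature), and a 1-nearest-neighbour rule stands in for the paper's second-stage classifiers (decision trees, logistic regression, neural networks):

```python
import random

def dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=15, seed=1):
    """Stage-1 clustering: plain Lloyd's-algorithm k-means."""
    rng = random.Random(seed)
    cents = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda c: dist(p, cents[c]))].append(p)
        cents = [[sum(col) / len(g) for col in zip(*g)] if g else cents[i]
                 for i, g in enumerate(groups)]
    return cents

def hybrid_fit_predict(train_x, train_y, test_x, k=2):
    """Stage 1 segments the customers and appends the cluster id as an
    extra feature; stage 2 (1-nearest-neighbour here) predicts on the
    augmented data."""
    cents = kmeans(train_x, k)
    seg = lambda p: min(range(len(cents)), key=lambda c: dist(p, cents[c]))
    aug_train = [list(p) + [seg(p)] for p in train_x]
    preds = []
    for q in test_x:
        aq = list(q) + [seg(q)]
        j = min(range(len(aug_train)), key=lambda i: dist(aq, aug_train[i]))
        preds.append(train_y[j])
    return preds

# Toy customer records: (recency, monetary value) with a value label.
train_x = [[1, 90], [2, 80], [1, 85], [9, 10], [8, 15], [9, 5]]
train_y = ["valuable", "valuable", "valuable", "churn", "churn", "churn"]
print(hybrid_fit_predict(train_x, train_y, [[2, 88], [9, 12]]))
```

The classification + classification variant differs only in the first stage, which is a classifier whose outputs (rather than cluster ids) feed the second stage.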

Findings

The experimental results over a real case dataset show that the classification + classification hybrid approach performs best. In particular, combining two stages of decision trees provides the highest accuracy (99.73 percent) and the lowest Type I/II error rates (0.22 percent/0.43 percent).

Originality/value

The contribution of this paper is to demonstrate that hybrid machine learning techniques perform better than single ones. In addition, this paper allows us to find out which hybrid technique performs best in terms of CLV prediction.

Details

Kybernetes, vol. 42 no. 3
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 28 August 2019

Pengpeng Cheng, Daoling Chen and Jianping Wang

Abstract

Purpose

The purpose of this paper is to improve the prediction accuracy of the body shape prediction model and provide some reference value for the design of underwear.

Design/methodology/approach

The body size data of 250 male youths are measured to analyze the shape of the lower body, yielding a total of 56 measurement items, which are clustered by GA-BP-K-means, K-means, the optimal segmentation method for ordered samples, wavelet coefficient analysis, regression analysis and the Naive Bayes algorithm. Finally, a test sample of males of unknown body shape is clustered to verify the superiority of GA-BP-K-means.

Findings

This paper presents the key factors for body shape clustering, and the experimental results show that the GA-BP neural network model is faster and more precise than the other prediction models.

Originality/value

The key factors in body shape clustering are clarified. At the same time, the GA-BP-K-means algorithm can promote the popularization and application of prediction models in body shape clustering.

Details

International Journal of Clothing Science and Technology, vol. 32 no. 2
Type: Research Article
ISSN: 0955-6222
