Search results

1 – 10 of over 2000
Article
Publication date: 12 October 2012

Tao Wang

Abstract

Purpose

Mining sequential patterns in large databases has become an important data mining task with broad applications, such as business analysis, web mining, security, and bio‐sequences analysis. The purpose of this paper is to propose the notion of condensed frequent sequential pattern base (SP base) with guaranteed maximal error bound.

Design/methodology/approach

A subset of frequent sequential patterns is computed, and then used to approximate the supports of arbitrary frequent sequential patterns with guaranteed maximal error bound, because in many applications it is sufficient to generate only frequent sequential patterns with support frequency in close‐enough approximation instead of in full precision.
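To illustrate how a stored subset of patterns can bound the support of a pattern outside that subset, here is a minimal sketch. It is not the paper's SP-base algorithm. It relies only on the standard anti-monotone property of support: a sub-pattern is at least as frequent as any of its super-patterns. The names `support_bounds`, `base` and `n_sequences` are hypothetical, and patterns are simplified to sequences of single items.

```python
from typing import Dict, Sequence, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(small: Sequence[str], big: Sequence[str]) -> bool:
    """Return True if `small` occurs in `big` in order (gaps allowed)."""
    it = iter(big)
    return all(item in it for item in small)


def support_bounds(
    pattern: Pattern, base: Dict[Pattern, int], n_sequences: int
) -> Tuple[int, int]:
    """Bound the support of `pattern` using only the stored patterns.

    Upper bound: the least frequent stored sub-pattern of `pattern`.
    Lower bound: the most frequent stored super-pattern of `pattern`.
    """
    upper = min(
        (s for p, s in base.items() if is_subsequence(p, pattern)),
        default=n_sequences,  # no stored sub-pattern: fall back to database size
    )
    lower = max(
        (s for p, s in base.items() if is_subsequence(pattern, p)),
        default=0,  # no stored super-pattern: nothing is known from below
    )
    return lower, upper


# Hypothetical stored base: pattern -> support count.
base = {("a",): 10, ("a", "b"): 7, ("a", "b", "c"): 6}
bounds = support_bounds(("a", "c"), base, n_sequences=12)
print(bounds)
```

The width of the returned interval plays the role of the approximation error in this sketch; a condensed base in the paper's sense would be chosen so that this error stays within a guaranteed maximum.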

Findings

The concept of condensed frequent SP base is introduced, and an efficient algorithm for mining condensed SP bases is developed.

Research limitations/implications

A condensed frequent SP base can significantly reduce the set of sequential patterns that need to be mined, stored, and analyzed, while providing guaranteed error bound for frequencies of sequential patterns not in the base.

Practical implications

A much smaller base of patterns can help users to comprehend the mining results. Computing a much smaller pattern base also leads to better efficiency.

Originality/value

The paper shows that, by adopting a novel pruning technique, the algorithm outperforms previous work by an order of magnitude.

Details

Kybernetes, vol. 41 no. 9
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 6 August 2019

U.C. Moharana, S.P. Sarmah and Pradeep Kumar Rathore

Abstract

Purpose

The purpose of this paper is to suggest a framework for extracting the sequential patterns of maintenance activities and related spare parts information from historical records of maintenance data with pre-defined support or threshold values.

Design/methodology/approach

A data mining approach has been adopted for predicting the maintenance activity along with spare parts. It starts with a collection of spare parts and maintenance data, and then the development of sequential patterns followed by formation of frequent spare part groups, and finally, integration of sequential maintenance activities with the associated spare parts.

Findings

This study suggests a framework for extracting the sequential patterns of maintenance activities from historical records of maintenance data with pre-defined support or threshold values. A rule-based approach is proposed in this paper to predict the occurrence of next maintenance activity along with the information of spare parts consumption for that maintenance activity.
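A rule-based prediction of the next maintenance activity could look roughly like the sketch below. It is an illustrative simplification, not the authors' framework: it counts only direct "activity, then next activity" transitions, keeps those that reach a minimum support count, and predicts the most confident successor. The activity names and the functions `mine_next_activity_rules` and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict


def mine_next_activity_rules(histories, min_support):
    """Build rules current -> {next: confidence} from activity histories."""
    pair_counts = Counter()      # how often `current` is directly followed by `nxt`
    current_counts = Counter()   # how often `current` has any successor
    for history in histories:
        for current, nxt in zip(history, history[1:]):
            pair_counts[(current, nxt)] += 1
            current_counts[current] += 1

    rules = defaultdict(dict)
    for (current, nxt), count in pair_counts.items():
        if count >= min_support:  # pre-defined support threshold
            rules[current][nxt] = count / current_counts[current]
    return dict(rules)


def predict_next(rules, current):
    """Return the successor with the highest confidence, or None."""
    candidates = rules.get(current)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Hypothetical maintenance histories, one list per machine.
histories = [
    ["inspect", "lubricate", "replace_belt"],
    ["inspect", "lubricate"],
    ["inspect", "replace_belt"],
]
rules = mine_next_activity_rules(histories, min_support=2)
print(predict_next(rules, "inspect"))
```

Attaching the frequently co-consumed spare parts to each predicted activity would be a further lookup step on top of these rules.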

Research limitations/implications

The presented model can be extended to analyze failure maintenance activities and to perform root cause analysis, which can give maintenance managers more valuable suggestions for taking corrective action before the next failure occurs. In addition, the timestamp information, which is ignored in this study, can be used to prioritize maintenance activities.

Practical implications

The proposed model has high potential for industrial applications and is validated through a case study. The study suggests that the model gives a better approach for selecting spare parts based on their similarity or correlation, considering their actual occurrence during maintenance activities. In addition, the clustering of spare parts helps maintenance managers learn about the dependencies among spares, both for group stocking and for maintaining parts availability during maintenance activities.

Originality/value

This study uses data mining to find dependent spare-part itemsets in the company's database and develops a model of the associated spare parts required for the subsequent maintenance activity.

Details

Journal of Manufacturing Technology Management, vol. 30 no. 7
Type: Research Article
ISSN: 1741-038X

Article
Publication date: 19 October 2015

Elham Akhondzadeh-Noughabi and Amir Albadvi

Abstract

Purpose

The purpose of this paper is to detect different behavioral groups and the dominant patterns of customer shifts between segments of different values over time.

Design/methodology/approach

A new hybrid methodology is presented based on clustering techniques and mining top-k and distinguishing sequential rules. This methodology is implemented on the data of 14,772 subscribers of a mobile phone operator in Tehran, the capital of Iran. The main data include the call detail records and event detail records data that was acquired from the IT department of this operator.

Findings

Seven different behavioral groups of customer shifts were identified. These groups and the corresponding top-k rules represent the dominant patterns of customer behavior. The results also explain the relation of customer switching behavior and segment instability, which is an open problem.

Practical implications

The findings can be helpful to improve marketing strategies and decision making and for prediction purposes. The obtained rules are relatively easy to interpret and use; this can strengthen the practicality of results.

Originality/value

A new hybrid methodology is proposed that systematically extracts the dominant patterns of customer shifts. This paper also offers a new definition and framework for discovering distinguishing sequential rules. Compared with Markov chain models, this study captures customer switching behavior at different levels of value through interpretable sequential rules. This is the first study that uses sequential and distinguishing rules in this domain.

Details

Management Decision, vol. 53 no. 9
Type: Research Article
ISSN: 0025-1747

Article
Publication date: 1 November 2005

Yue‐Shi Lee, Show‐Jane Yen and Min‐Chi Hsieh

Abstract

Web mining applies data mining techniques to large amounts of web data in order to improve web services. Web traversal pattern mining discovers most users’ access patterns from web logs. This information can provide navigation suggestions for web users so that appropriate actions can be taken. However, web data grows rapidly, and some of it may become outdated. User behavior may change as new data is inserted into the web logs and old data is deleted from them. In addition, it is difficult to select an ideal minimum support threshold during the mining process to find interesting rules; even experienced experts cannot determine the appropriate minimum support in advance. The minimum support must therefore be adjusted repeatedly until satisfactory mining results are found. The essence of incremental or interactive data mining is that previous mining results can be reused to avoid unnecessary processing when the minimum support changes or the web logs are updated. In this paper, we propose efficient incremental and interactive data mining algorithms to discover web traversal patterns and to make the mining results satisfy users’ requirements. The experimental results show that our algorithms are more efficient than the other approaches.
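The idea of reusing earlier results when the logs or the threshold change can be sketched as follows. This is a deliberately small illustration, not the algorithms proposed in the paper: it keeps running counts of page-to-page traversals, updates them when a session is added or deleted, and answers a new minimum-support query by filtering the stored counts instead of rescanning the logs. The class name `TraversalCounts` and the page names are hypothetical.

```python
from collections import Counter


class TraversalCounts:
    """Running counts of direct page-to-page traversals across sessions."""

    def __init__(self):
        self.sessions = 0
        self.counts = Counter()

    @staticmethod
    def _pairs(session):
        # Each distinct (page, next_page) pair counts once per session.
        return set(zip(session, session[1:]))

    def add_session(self, session):
        """Incremental update when new web data is inserted."""
        self.sessions += 1
        self.counts.update(self._pairs(session))

    def remove_session(self, session):
        """Incremental update when old web data is deleted."""
        self.sessions -= 1
        self.counts.subtract(self._pairs(session))

    def frequent(self, min_support):
        """Interactive query: filter stored counts, no rescan of the logs."""
        threshold = min_support * self.sessions
        return {
            pair: n for pair, n in self.counts.items()
            if n > 0 and n >= threshold
        }


logs = TraversalCounts()
logs.add_session(["home", "news", "sports"])
logs.add_session(["home", "news"])
logs.add_session(["home", "sports"])

strict = logs.frequent(0.5)    # only pairs seen in at least half the sessions
relaxed = logs.frequent(0.3)   # threshold lowered, same stored counts reused
print(strict, relaxed)
```

Longer traversal patterns need more bookkeeping than pairwise counts, but the same principle applies: store enough from previous runs that a changed threshold or an updated log touches only the affected counts.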

Details

International Journal of Web Information Systems, vol. 1 no. 4
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 25 February 2020

Wolfram Höpken, Marcel Müller, Matthias Fuchs and Maria Lexhagen

Abstract

Purpose

The purpose of this study is to analyse the suitability of photo-sharing platforms, such as Flickr, to extract relevant knowledge on tourists’ spatial movement and point of interest (POI) visitation behaviour and compare the most prominent clustering approaches to identify POIs in various application scenarios.

Design/methodology/approach

The study, first, extracts photo metadata from Flickr, such as upload time, location and user. Then, photo uploads are assigned to latent POIs by density-based spatial clustering of applications with noise (DBSCAN) and k-means clustering algorithms. Finally, association rule analysis (FP-growth algorithm) and sequential pattern mining (generalised sequential pattern algorithm) are used to identify tourists’ behavioural patterns.

Findings

The approach has been demonstrated for the city of Munich, extracting 13,545 photos for the year 2015. POIs, identified by DBSCAN and k-means clustering, could be meaningfully assigned to well-known POIs. By doing so, both techniques show specific advantages for different usage scenarios. Association rule analysis revealed strong rules (support: 1.0-4.6 per cent; lift: 1.4-32.1 per cent), and sequential pattern mining identified relevant frequent visitation sequences (support: 0.6-1.7 per cent).
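For readers unfamiliar with the support and lift figures quoted above, the sketch below computes them for a single rule using the standard definitions: support is the share of transactions containing both sides of the rule, confidence is that count divided by the count of transactions containing the antecedent, and lift is confidence divided by the share of transactions containing the consequent. The data is hypothetical (one set of visited POIs per tourist) and the function `rule_metrics` is not taken from the study.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= set(t))         # contain antecedent
    n_c = sum(1 for t in transactions if c <= set(t))         # contain consequent
    n_ac = sum(1 for t in transactions if (a | c) <= set(t))  # contain both

    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical visits: each set lists the POIs one tourist photographed.
visits = [
    {"Marienplatz", "Olympiapark"},
    {"Marienplatz"},
    {"Olympiapark"},
    {"Marienplatz", "Olympiapark"},
]
support, confidence, lift = rule_metrics(visits, {"Marienplatz"}, {"Olympiapark"})
print(support, confidence, lift)
```

A lift above 1 indicates that the two POIs are visited together more often than independence would predict; a lift below 1, as in this toy data, indicates the opposite.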

Research limitations/implications

As a theoretical contribution, this study comparatively analyses the suitability of different clustering techniques for identifying POIs from photo upload data. The identified POIs serve as input to association rule analysis and sequential pattern mining, which are alternative but also complementary techniques for analysing tourists’ spatial behaviour.

Practical implications

From a practical perspective, the study highlights that big data sources, such as Flickr, show the potential to effectively substitute traditional data sources for analysing tourists’ spatial behaviour and movement patterns within a destination. In particular, the approach offers the advantage of being fully automatic and executable in a real-time environment.

Originality/value

The study presents an approach to identify POIs by clustering photo uploads on social media platforms and to analyse tourists’ spatial behaviour by association rule analysis and sequential pattern mining. The study gains novel insights into the suitability of different clustering techniques to identify POIs in different application scenarios.

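Abstract (Chinese version)

Purpose

This paper analyses the suitability of the photo-sharing platform Flickr for extracting information on tourists’ spatial movement and point of interest (POI) visitation behaviour, and compares the most prominent clustering approaches for identifying POIs in different situations.

Design/methodology/approach

The paper first extracts photo metadata from Flickr, such as upload time, location and user. Next, it uses the DBSCAN and k-means clustering algorithms to assign photo uploads to latent POIs. Finally, it applies association rule analysis (FP-growth algorithm) and sequential pattern mining (GSP algorithm) to identify tourists’ behavioural patterns.

Findings

Taking the city of Munich as a sample, the paper extracts 13,545 photos for the year 2015. The POIs identified by DBSCAN and k-means clustering were assigned to well-known POIs, which demonstrates the respective advantages of the two techniques for different uses. Association rule analysis revealed strong rules (support: 1%-4.6%; lift: 1.4%-32.1%), and sequential pattern mining identified relevant frequent visitation sequences (support: 0.6%-1.7%).

Research limitations/implications

The theoretical contribution of this paper is a comparative analysis of different clustering techniques for identifying POIs from photo data, and a demonstration that association rule analysis and sequential pattern mining each have their own strengths and complement each other as techniques for analysing tourists’ spatial behaviour.

Practical implications

The practical contribution of this paper is to highlight that big data sources, such as Flickr, have the potential to effectively replace traditional data sources for analysing tourists’ spatial behaviour and movement patterns within a destination. In particular, the approach offers the advantage of being automatic and executable in real time.

Originality/value

The paper presents an approach that identifies POIs by clustering photos uploaded to social media, and that analyses tourists’ spatial behaviour through association rule analysis and sequential pattern mining. It offers novel insights into the suitability of different clustering techniques for identifying POIs in different application scenarios.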

Article
Publication date: 13 July 2015

Gebeyehu Belay Gebremeskel, Chai Yi, Chengliang Wang and Zhongshi He

Abstract

Purpose

Behavioral pattern mining for intelligent systems, such as SmEs sensor data, is vitally important in many applications and for performance optimization. Sensor pattern mining (SPM) is also a dynamic and active research topic as smart technologies become pervasive and ubiquitous in improving human life. However, exploring and mining patterns in large-scale sensor data in order to detect abnormal behavior is challenging. The paper aims to discuss these issues.

Design/methodology/approach

Sensor data are complex and multivariate: which data the sensors capture, how precise the data are, and what properties are recorded or measured are all important research issues. The authors therefore propose a Sequential Data Mining (SDM) approach to explore pattern behaviors and detect abnormal patterns for smart space fault diagnosis and performance optimization in the intelligent world. Sensor data types, modeling, descriptions and SPM techniques are discussed in depth using real sensor data sets.

Findings

The main outcome of the paper is a novel idea of how the SDM technique scales up to sensor data pattern mining. The paper analyzes the approach and the technical aspects of sensor data patterns, and finally segments the detected pattern behaviors into normal and abnormal patterns.

Originality/value

The paper focuses on sensor data behavioral patterns for fault diagnosis and performance optimization. It offers another way of extracting knowledge from anomalies in sensor data (observation records), which is relevant to many intelligent systems applications, with benefits for safety and security, efficiency, and other real-world concerns.

Details

Industrial Management & Data Systems, vol. 115 no. 6
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 31 October 2018

Güzin Özdağoğlu, Gülin Zeynep Öztaş and Mehmet Çağliyangil

Abstract

Purpose

Learning management systems (LMS) provide detailed information about the processes through event-logs. Process and related data-mining approaches can reveal valuable information from these files to help teachers and executives to monitor and manage their online learning processes. In this regard, the purpose of this paper is to present an overview of the current direction of the literature on educational data mining, and an application framework to analyze the educational data provided by the Moodle LMS.

Design/methodology/approach

The paper presents a framework to provide a decision support through the approaches existing in process and data-mining fields for analyzing the event-log data gathered from LMS platforms. In this framework, latent class analysis (LCA) and sequential pattern mining approaches were used to understand the general patterns; heuristic and fuzzy approaches were performed for process mining to obtain the workflows and statistics; finally, social-network analysis was conducted to discover the collaborations.

Findings

The analyses conducted in the study give clues about the process performance of the course during a semester by indicating exceptional situations, clarifying the activity flows, revealing the main process flow and exposing the students’ interactions. The findings also show that running preliminary data analyses before the process mining steps helps to understand the general pattern and expose the irregular ones.

Originality/value

The study highlights the benefits of analyzing event-log files of LMSs to improve the quality of online educational processes through a case study based on Moodle event-logs. The application framework covers preliminary analyses such as LCA before the use of process mining algorithms to reveal the exceptional situations.

Details

Business Process Management Journal, vol. 25 no. 5
Type: Research Article
ISSN: 1463-7154

Article
Publication date: 13 March 2017

Li Yu, Zaifang Zhang and Jin Shen

Abstract

Purpose

In the initial stage of product design, product portfolio identification (PPI) aims to translate customer needs (CNs) into product specifications (PSs). This is an essential task, since understanding what customers really want is at the center of product design. However, design information is incomplete and design knowledge is minimal during this stage. Furthermore, PPI is often a confusing and frustrating task, especially when customer preferences are changing rapidly. To facilitate the task, the purpose of this paper is to capture the time-sensitive mapping relationship between CNs and PSs.

Design/methodology/approach

This paper proposes a design sequential pattern mining model to uncover implicit but valuable knowledge from chronological transaction records. First, CNs and PSs from these records are transformed and connected according to the transaction time. Second, procedures such as litemset generation, data transformation and pattern mining are conducted based on the AprioriAll algorithm. Third, the uncovered patterns are modified and applied by engineers.
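The pattern mining step can be pictured with the level-wise sketch below. It is a simplified Apriori-style miner over sequences of single items, written for illustration only; the full AprioriAll algorithm additionally works on itemsets and includes the litemset and transformation phases named above. The function names and the desktop-computer records are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of sequences in the database that contain the pattern."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))


def mine_frequent_sequences(database, min_support):
    """Level-wise search: grow frequent patterns one item at a time."""
    if min_support < 1:
        raise ValueError("min_support must be at least 1")

    items = sorted({item for seq in database for item in seq})
    frequent = {}
    level = [(item,) for item in items]  # candidate patterns of length 1
    while level:
        survivors = []
        for candidate in level:
            count = support(candidate, database)
            if count >= min_support:
                frequent[candidate] = count
                survivors.append(candidate)
        # Only frequent patterns are extended (Apriori pruning), and only
        # with items that are frequent on their own.
        frequent_items = [p[0] for p in frequent if len(p) == 1]
        level = [p + (item,) for p in survivors for item in frequent_items]
    return frequent


# Hypothetical chronological records: components chosen over time per customer.
database = [
    ["cpu", "ram", "ssd"],
    ["cpu", "ssd"],
    ["ram", "ssd"],
    ["cpu", "ram"],
]
patterns = mine_frequent_sequences(database, min_support=2)
print(patterns)
```

With a minimum support of 2, the pattern ("cpu", "ram") is kept while ("cpu", "ram", "ssd") is not, because only one record contains all three in that order.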

Findings

Using the retrieved patterns, engineers can keep up with the dynamics of customer preferences with regard to different PSs.

Research limitations/implications

Computational experiments on a case study of customization of desktop computers show that the proposed method is capable of extracting useful sequential patterns from a design database.

Originality/value

Considering the timestamps of the transactions, a sequential pattern mining-based method is proposed to extract valuable patterns. These patterns can help engineers identify market trends and the correlation among PSs.

Details

Industrial Management & Data Systems, vol. 117 no. 2
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 11 April 2008

Yang Ouyang and Miaoliang Zhu

Abstract

Purpose

This paper aims to explore the feasibility of using web‐mining technology on learning object (LO) usage information to discover the LO relation pattern and provide valuable recommendations on related learning resources.

Design/methodology/approach

This paper proposes three kinds of learning object relation patterns and gives a specific definition of each pattern based on analysing the learners' usage data stored in the learning object repository. These relation patterns can be used to make effective recommendations to learners.

Findings

LO usage data indicate the potential relation patterns between LOs. By using web‐mining technology on the usage data, it is possible to discover valuable relation patterns.

Originality/value

The authors propose a set of LO relation patterns and indicate how they are closely related to users' learning behaviour.

Details

Online Information Review, vol. 32 no. 2
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 9 November 2020

Saleha Noor, Yi Guo, Syed Hamad Hassan Shah, Philippe Fournier-Viger and M. Saqib Nawaz

Abstract

Purpose

The novel Coronavirus (COVID-19) pandemic, which started in late December 2019, has spread to more than 200 countries. As no vaccine is yet available for this pandemic, government and health agencies are taking draconian steps to contain it. This pandemic is also trending on social media, particularly on Twitter. The purpose of this study is to explore and analyze the general public reactions to the COVID-19 outbreak on Twitter.

Design/methodology/approach

This study conducts a thematic analysis of COVID-19 tweets through VOSviewer to examine people’s reactions related to the COVID-19 outbreak in the world. Moreover, sequential pattern mining (SPM) techniques are used to find frequent words/patterns and their relationship in tweets.

Findings

Seven clusters (themes) were found through VOSviewer. Cluster 1 (green): public sentiments about COVID-19 in the USA. Cluster 2 (red): public sentiments about COVID-19 in Italy and Iran and a vaccine. Cluster 3 (purple): public sentiments about doomsday and science credibility. Cluster 4 (blue): public sentiments about COVID-19 in India. Cluster 5 (yellow): public sentiments about COVID-19’s emergence. Cluster 6 (light blue): public sentiments about COVID-19 in the Philippines. Cluster 7 (orange): public sentiments about the COVID-19 US Intelligence Report. The most frequent words/patterns discovered with SPM were “COVID-19,” “Coronavirus” and “Chinese virus,” and the most frequent and high-confidence sequential rules were related to “Coronavirus, testing, lockdown, China and Wuhan.”

Research limitations/implications

The methodology can be used to analyze the opinions/thoughts of the general public on Twitter and to categorize them accordingly. Moreover, the categories (generated by VOSviewer) can be correlated with the results obtained with pattern mining techniques.

Social implications

This study has a significant socio-economic impact as Twitter offers content posting and sharing to billions of users worldwide.

Originality/value

To the best of the authors’ knowledge, this may be the first study to carry out a thematic analysis of COVID-19 tweets at a glance and to mine the tweets with SPM in order to investigate how people reacted to the COVID-19 outbreak on Twitter.

Details

Kybernetes, vol. 50 no. 5
Type: Research Article
ISSN: 0368-492X
