Search results

1 – 10 of 41
Article
Publication date: 21 November 2023

Hua Pan and Rong Liu

Abstract

Purpose

This paper has two aims. First, it seeks to better understand residents' differentiated power consumption behaviors and to identify household characteristic labels from the perspective of electricity stability. Second, it addresses the lack of causal relationships in existing research on the association between residential electricity consumption behavior and basic household information.

Design/methodology/approach

First, the density-based spatial clustering of applications with noise method is used to extract the typical daily load curve of residents. Second, the degree of electricity consumption stability is described from three perspectives: daily minimum load rate, daily load rate and daily load fluctuation rate, and is evaluated comprehensively using the entropy weight method. Finally, residential customer labels are constructed from sociological characteristics, residential characteristics and energy use attitudes, and the enhanced FP-growth algorithm is employed to investigate any potential links between each factor and the stability of electricity consumption.
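The stability evaluation described above can be sketched as follows. This is a minimal illustration of the standard entropy weight method, not the authors' implementation. The three indicator formulas (minimum over peak, mean over peak, and coefficient of variation for the fluctuation rate), the min-max normalization and the function names are all assumptions, since the abstract does not define them.

```python
import math
from statistics import mean, pstdev

def stability_indicators(load):
    """Three indicators for one daily load curve (formulas are assumptions)."""
    peak = max(load)
    avg = mean(load)
    return [
        min(load) / peak,    # daily minimum load rate
        avg / peak,          # daily load rate
        pstdev(load) / avg,  # daily load fluctuation rate (coefficient of variation)
    ]

def min_max_normalize(matrix, higher_is_better):
    """Scale each column to [0, 1]; cost-type columns are inverted."""
    columns = []
    for column, benefit in zip(zip(*matrix), higher_is_better):
        lo, hi = min(column), max(column)
        span = (hi - lo) or 1.0  # avoid dividing by zero for constant columns
        columns.append([(v - lo) / span if benefit else (hi - v) / span
                        for v in column])
    return [list(row) for row in zip(*columns)]

def entropy_weights(matrix):
    """Entropy weights for the columns of a normalized matrix (needs >= 2 rows)."""
    n = len(matrix)
    divergences = []
    for column in zip(*matrix):
        total = sum(column)
        if total == 0:
            divergences.append(0.0)  # column carries no information
            continue
        shares = [v / total for v in column if v > 0]  # 0 * log(0) treated as 0
        entropy = -sum(p * math.log(p) for p in shares) / math.log(n)
        divergences.append(1.0 - entropy)
    norm = sum(divergences)
    if norm == 0:
        return [1.0 / len(divergences)] * len(divergences)
    return [d / norm for d in divergences]

def stability_scores(load_curves):
    """Return (weights, composite stability score per household)."""
    indicators = [stability_indicators(curve) for curve in load_curves]
    # Higher load rates mean more stable use; higher fluctuation means less.
    normalized = min_max_normalize(indicators, [True, True, False])
    weights = entropy_weights(normalized)
    scores = [sum(w * v for w, v in zip(weights, row)) for row in normalized]
    return weights, scores
```

Indicators that vary more across households receive larger weights, so the composite score is driven by the indicators that best separate stable from unstable users.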

Findings

Compared with the original FP-growth algorithm, the improved algorithm can mine rules containing specific attribute labels, which improves mining efficiency. In terms of factors influencing electricity stability, characteristics such as a large number of family members, being well employed, having children in the household and living in a newer dwelling may all lead to poorer electricity stability, whereas residents' attitudes toward energy use and dwelling type are not significantly associated with electricity stability.
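The idea of mining only rules that contain a specific attribute label can be illustrated with a small sketch. It uses brute-force enumeration rather than FP-growth, so it shows the constraint, not the authors' improved algorithm. The function name, the label strings and the thresholds are hypothetical.

```python
from itertools import combinations

def rules_for_label(transactions, target, min_support, min_confidence, max_len=3):
    """Enumerate rules `antecedent -> target` meeting both thresholds.

    Brute-force stand-in for a constrained miner: only itemsets that
    contain `target` are ever evaluated.
    """
    sets = [frozenset(t) for t in transactions]
    n = len(sets)
    items = sorted({i for t in sets for i in t} - {target})
    rules = []
    for size in range(1, max_len):  # antecedent size; full itemset is size + 1
        for antecedent in combinations(items, size):
            a = frozenset(antecedent)
            count_a = sum(1 for t in sets if a <= t)
            if count_a == 0:
                continue
            count_both = sum(1 for t in sets if a <= t and target in t)
            support = count_both / n
            confidence = count_both / count_a
            if support >= min_support and confidence >= min_confidence:
                rules.append((antecedent, target, support, confidence))
    return rules

# Toy household labels, invented purely for illustration.
households = [
    {"large_family", "has_children", "low_stability"},
    {"large_family", "has_children", "low_stability"},
    {"large_family", "low_stability"},
    {"small_family", "high_stability"},
    {"small_family", "has_children", "low_stability"},
    {"small_family", "high_stability"},
]
rules = rules_for_label(households, "low_stability", 0.3, 0.8)
```

Restricting the search to itemsets that include the target label avoids generating rules that would be discarded afterwards, which is one plausible source of the efficiency gain the abstract reports.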

Originality/value

This paper aims to uncover household socioeconomic traits that influence the stability of home electricity use and to shed light on the intricate connections between them. First, the characteristics of residential users' power consumption are refined from the perspective of electricity stability, and the entropy weight method is used to comprehensively evaluate the stability of electricity usage. Second, the labels of residential users' household characteristics are screened and organized. Finally, the improved FP-growth algorithm is used to mine the household characteristic labels that are strongly associated with electricity consumption stability.

Highlights

  1. The stability of electricity consumption is important to the stable operation of the grid.

  2. An improved FP-growth algorithm is employed to explore the influencing factors.

  3. The improved algorithm enables the mining of rules containing specific attribute labels.

  4. Residents' attitudes toward energy use are largely unrelated to the stability of electricity use.

Details

Management of Environmental Quality: An International Journal, vol. 35 no. 3
Type: Research Article
ISSN: 1477-7835

Article
Publication date: 5 May 2021

Shanshan Wang, Jiahui Xu, Youli Feng, Meiling Peng and Kaijie Ma

Abstract

Purpose

First, this study aims to overcome the problem of traditional association rule mining relying almost entirely on expert experience to set the relevant interest measures. Second, it addresses the problem of four types of rules being present in the database at the same time. Traditional association algorithms can mine only one or two types of rules and cannot fully explore the knowledge in the database during decision-making for library recommendation.

Design/methodology/approach

The authors proposed a Markov logic network method to reconstruct association rule-mining tasks for library recommendation and compared the method proposed in this paper to traditional Apriori, FP-Growth, Inverse, Sporadic and UserBasedCF algorithms on two history library data sets and the Chess and Accident data sets.

Findings

The method used in this project had two major advantages. First, the authors were able to mine four types of rules in an integrated manner without having to set interest measures. In addition, because it represents the relevance of mining in the network, decision-makers can use network visualization tools to fully understand the results of mining in library recommendation and data sets from other fields.

Research limitations/implications

The time cost of the project is still high for large data sets. The authors will solve this problem by mapping books, items, or attributes to higher granularity to reduce the computational complexity in the future.

Originality/value

The authors believed that knowledge of complex real-world problems can be well captured from a network perspective. This study can help researchers to avoid setting interest metrics and to comprehensively extract frequent, rare, positive, and negative rules in an integrated manner.

Details

Information Discovery and Delivery, vol. 50 no. 1
Type: Research Article
ISSN: 2398-6247

Article
Publication date: 4 May 2020

Fatma Altuntas and Mehmet Şahin Gök

Abstract

Purpose

The purpose of this paper is to analyze the wind energy technologies using the social network analysis based on patent information. Analysis of patent documents with social network analysis is used to identify the most influential and connected technologies in the field of wind energy.

Design/methodology/approach

In the literature, patent data are often used to evaluate technologies. Patents related to wind energy technologies are obtained from the United States Patent and Trademark Office database, and the relationships among sub-technologies are analyzed based on Cooperative Patent Classification (CPC) codes. The results of the two-phase algorithm for mining high average-utility itemsets, a utility mining algorithm from data mining, are used to find associations among wind energy technologies for social network analysis.
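A rough sketch of high average-utility itemset mining follows, assuming the usual definition: the average utility of an itemset in a transaction is the sum of its items' utilities divided by the itemset size, summed over all transactions that contain the itemset. The two phases are imitated with a per-itemset upper-bound check followed by the exact computation, which is simpler than the level-wise candidate generation of the published algorithm. The utility values assigned to CPC codes are invented, because the abstract does not say how utilities were defined.

```python
from itertools import combinations

def high_average_utility_itemsets(transactions, threshold, max_len=3):
    """transactions: list of dicts mapping item -> utility in that transaction."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, max_len + 1):
        for itemset in combinations(items, size):
            covering = [t for t in transactions if all(i in t for i in itemset)]
            # Phase 1: upper bound from each covering transaction's largest
            # utility; an itemset below the threshold here cannot qualify.
            upper_bound = sum(max(t.values()) for t in covering)
            if upper_bound < threshold:
                continue
            # Phase 2: exact average utility of the surviving candidate.
            average_utility = sum(sum(t[i] for i in itemset) / size
                                  for t in covering)
            if average_utility >= threshold:
                result[itemset] = average_utility
    return result

# Hypothetical patents: CPC code -> illustrative utility value.
patents = [
    {"Y02E": 4, "F03D": 6},
    {"Y02E": 2, "F03D": 4, "F05B": 2},
    {"F05B": 3, "H02K": 1},
]
found = high_average_utility_itemsets(patents, threshold=6)
```

Itemsets that survive can then serve as edges in a co-occurrence network of codes, which is the kind of input a social network analysis needs.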

Findings

The results of this study show that it is very important to focus on wind motors and technologies related to energy conversion or management systems reducing greenhouse gas emissions. The results of this study imply that Y02E, F03D and F05B CPC codes are the most influential CPC codes based on social network analysis.

Originality/value

Analysis of patent documents with social network analysis for technology evaluation is extremely limited in the literature. There is no research related to the analysis of patent documents with social network analysis, in particular CPC codes, for wind energy technology. This paper fills this gap in the literature. This study explores technologies related to wind energy and identifies the most influential wind energy technologies in practice. It also extracts useful information and knowledge to identify core CPC class(es) in the field of wind energy technology.

Details

Kybernetes, vol. 50 no. 5
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 30 April 2021

Shengyu Guo, Yujia Zhao, Yuqiu Luoren, Kongzheng Liang and Bing Tang

Abstract

Purpose

Knowledge discovery related to unsafe behaviors promotes the performance of accident prevention in construction. Although numerous studies on accident causation models have discussed the correlations of unsafe behaviors with various factors (e.g., unsafe conditions), limited research explores correlations between unsafe behaviors within accidents. The purpose of this paper is to mine strong association rules of unsafe behaviors from historical accidents to clarify this kind of tacit knowledge.

Design/methodology/approach

A case study was adopted as the research approach, in which accident records from building and urban railway construction in China were selected as data resources. The groups of unsafe behaviors extracted from accident records were expressed by the definitions of unsafe behaviors from safety regulations and operating procedures. Frequent Pattern (FP)-Growth algorithm was used for association rule mining, and the critical correlations between unsafe behaviors were represented by the effective strong rules.
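For readers unfamiliar with FP-Growth, the following compact sketch shows the core mechanism: build a prefix tree of frequency-ordered transactions, then recurse on each item's conditional pattern base. It is a teaching version under simplifying assumptions (no single-path shortcut, absolute counts instead of relative support), and the behavior names in the sample data are hypothetical rather than taken from the accident records.

```python
from collections import defaultdict

class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}

def build_tree(weighted_transactions, min_count):
    """Build an FP-tree; returns (header table, frequent item counts)."""
    freq = defaultdict(int)
    for items, weight in weighted_transactions:
        for item in set(items):
            freq[item] += weight
    freq = {i: c for i, c in freq.items() if c >= min_count}
    root = Node(None, None)
    header = defaultdict(list)  # item -> all tree nodes holding that item
    for items, weight in weighted_transactions:
        # Keep frequent items only, most frequent first (ties broken by name).
        ordered = sorted((i for i in set(items) if i in freq),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += weight
            node = child
    return header, freq

def fp_growth(weighted_transactions, min_count, suffix=()):
    """Return {sorted itemset tuple: support count}."""
    header, freq = build_tree(weighted_transactions, min_count)
    patterns = {}
    for item in sorted(freq, key=lambda i: (freq[i], i)):
        patterns[tuple(sorted(suffix + (item,)))] = freq[item]
        # Conditional pattern base: prefix paths leading to each node of `item`.
        conditional = []
        for node in header[item]:
            path, parent = [], node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        patterns.update(fp_growth(conditional, min_count, suffix + (item,)))
    return patterns

def mine(transactions, min_count):
    return fp_growth([(list(t), 1) for t in transactions], min_count)

# Hypothetical groups of unsafe behaviors, one list per accident record.
accidents = [
    ["no_helmet", "no_harness", "unsafe_scaffold"],
    ["no_helmet", "no_harness"],
    ["no_helmet", "unsafe_scaffold"],
    ["no_harness", "unsafe_scaffold"],
    ["no_helmet", "no_harness", "unsafe_scaffold"],
]
frequent = mine(accidents, min_count=3)
```

Strong rules are then derived from the frequent itemsets by applying confidence (and often lift) thresholds; that step is omitted here.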

Findings

The findings identify and distinguish correlations between unsafe behaviors within construction accidents. In building construction, workers and managers should pay attention to preventing unsafe behaviors related to personal protective equipment and machines and equipment. In urban railway construction, workers should especially avoid unsafe behaviors of inadequately dealing with environmental factors.

Practical implications

Tacit knowledge is transferred to explicit knowledge as the critical correlations between unsafe behaviors within accidents are determined by the effective strong rules. Additionally, the findings provide practice guidance for safety management, to collaboratively control unsafe behaviors with strong correlations.

Originality/value

This study contributes to the body of safety knowledge in construction and provides a further understanding of how construction accidents are caused by multiple unsafe behaviors.

Details

Engineering, Construction and Architectural Management, vol. 29 no. 4
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 4 June 2020

Alhanouf Abdulrahman Saleh Alsuwailem and Abdul Khader Jilani Saudagar

Abstract

Purpose

This paper aims to understand and document the state of the art in the anti-money laundering (AML) systems literature.

Design/methodology/approach

A systematic literature review (SLR) is performed using the Saudi Digital Library. The outputs published as conference proceedings, workshop proceedings, journal articles and books were all considered. The final sample size after omitting out-of-scope selections was 27 documents, which mainly span from 2015 to 2020.

Findings

The sample is discussed based on a categorization, which demarcates solutions, machine learning, data sources, evaluation methods, implementation tools, sampling techniques and regions of study.

Originality/value

This SLR could serve as a useful basis for researchers and salient decision-makers, who are seeking to understand the nature and extent of the currently available research into AML systems.

Details

Journal of Money Laundering Control, vol. 23 no. 4
Type: Research Article
ISSN: 1368-5201

Open Access
Article
Publication date: 9 December 2019

Zhiwen Pan, Jiangtian Li, Yiqiang Chen, Jesus Pacheco, Lianjun Dai and Jun Zhang

Abstract

Purpose

The General Social Survey (GSS) is a government-funded survey that examines the socio-economic status, quality of life and structure of contemporary society. The GSS data set is regarded as one of the authoritative sources for government and organization practitioners making data-driven policies. Previous analytic approaches for the GSS data set were designed by combining expert knowledge and simple statistics. By utilizing emerging data mining algorithms, we propose a comprehensive data management and data mining approach for GSS data sets.

Design/methodology/approach

The approach is designed to operate in two phases: a data management phase, which improves the quality of GSS data by performing attribute pre-processing and filter-based attribute selection; and a data mining phase, which extracts hidden knowledge from the data set through prediction analysis, classification analysis, association analysis and clustering analysis.

Findings

According to the experimental evaluation results, the paper has the following findings: performing attribute selection on the GSS data set can increase the performance of both classification analysis and clustering analysis; all the data mining analyses can effectively extract hidden knowledge from the GSS data set; and the knowledge generated by different data mining analyses can, to some extent, cross-validate each other.

Originality/value

By leveraging the power of data mining techniques, the proposed approach can explore knowledge in a fine-grained manner with minimum human interference. Experiments on Chinese General Social Survey data set are conducted at the end to evaluate the performance of our approach.

Details

International Journal of Crowd Science, vol. 3 no. 3
Type: Research Article
ISSN: 2398-7294

Open Access
Article
Publication date: 13 November 2018

Zhiwen Pan, Wen Ji, Yiqiang Chen, Lianjun Dai and Jun Zhang

Abstract

Purpose

The disability datasets are the datasets that contain the information of disabled populations. By analyzing these datasets, professionals who work with disabled populations can have a better understanding of the inherent characteristics of the disabled populations, so that working plans and policies, which can effectively help the disabled populations, can be made accordingly.

Design/methodology/approach

In this paper, the authors proposed a big data management and analytic approach for disability datasets.

Findings

By using a set of data mining algorithms, the proposed approach can provide the following services. The data management scheme in the approach can improve the quality of disability data by estimating missing attribute values and detecting anomalous and low-quality data instances. The data mining scheme in the approach can explore useful patterns that reflect the correlations, associations and interactions among the disability data attributes. Experiments based on a real-world dataset are conducted at the end to demonstrate the effectiveness of the approach.
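The abstract does not name the algorithms used in the data management scheme, so the sketch below is only one simple way such a step might look: median imputation for missing numeric values, followed by outlier flagging with the modified z-score based on the median absolute deviation. The function names, the 3.5 cutoff and the sample values are assumptions.

```python
from statistics import median

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def flag_outliers(values, cutoff=3.5):
    """Flag values whose modified z-score exceeds `cutoff`."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return [False] * len(values)  # no spread, nothing to flag
    return [0.6745 * abs(v - center) / mad > cutoff for v in values]

# Hypothetical attribute column with one missing and one implausible entry.
ages = [34, None, 41, 38, 250, 36]
cleaned = impute_median(ages)
flags = flag_outliers(cleaned)
```

Median-based statistics are used here because a single extreme record would distort a mean and standard deviation computed on the same column.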

Originality/value

The proposed approach can enable data-driven decision-making for professionals who work with disabled populations.

Details

International Journal of Crowd Science, vol. 2 no. 2
Type: Research Article
ISSN: 2398-7294

Article
Publication date: 26 June 2023

Argaw Gurmu, M. Reza Hosseini, Mehrdad Arashpour and Wellia Lioeng

Abstract

Purpose

Building defects are becoming recurrent phenomena in most high-rise buildings. However, little research exists on the analysis of defects in high-rise buildings based on data from real-life projects. This study aims to develop dashboards and models for revealing the most common locations of defects, understanding associations among defects and predicting the rectification periods.

Design/methodology/approach

In total, 15,484 defect reports comprising qualitative and quantitative data were obtained from a company that provides consulting services for the construction industry in Victoria, Australia. Data mining methods were applied using a wide range of Python libraries including NumPy, Pandas, Natural Language Toolkit, SpaCy and Regular Expression, alongside association rule mining (ARM) and simulations.

Findings

Findings reveal that defects in multi-storey buildings often occur on lower levels, rather than on higher levels. Joinery defects were found to be the most recurrent problem on ground floors. The ARM outcomes show that the occurrence of one type of defect can be taken as an indication for the existence of other types of defects. For instance, in laundry, the chance of occurrence of plumbing and joinery defects, where paint defects are observed, is 88%. The stochastic model built for door defects showed that there is a 60% chance that defects on doors can be rectified within 60 days.
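A statement such as "a 60% chance of rectification within 60 days" can be read as an estimated probability from a stochastic model. The abstract does not describe the model, so the sketch below assumes the simplest possible one: Monte Carlo resampling from a list of observed rectification durations. The function name and the sample durations are hypothetical.

```python
import random

def prob_rectified_within(durations, horizon_days, n_sims=10_000, seed=0):
    """Estimate P(duration <= horizon_days) by resampling observed durations."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    hits = sum(1 for _ in range(n_sims)
               if rng.choice(durations) <= horizon_days)
    return hits / n_sims

# Invented rectification periods (days) for door defects.
door_durations = [12, 30, 45, 58, 61, 75, 90, 20, 55, 120]
estimate = prob_rectified_within(door_durations, horizon_days=60)
```

With enough simulations the estimate approaches the empirical share of durations at or below the horizon; a fitted distribution could replace the resampling step if extrapolation beyond the observed range is needed.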

Originality/value

The dashboards provide original insight and novel ideas regarding the frequency of defects in various positions in multi-storey buildings. The stochastic models can provide a reliable point of reference for property managers, occupants and sub-contractors for taking measures to avoid reoccurring defects; so too, findings provide estimations of possible rectification periods for various types of defects.

Details

Construction Innovation, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1471-4175

Article
Publication date: 25 February 2020

Wolfram Höpken, Marcel Müller, Matthias Fuchs and Maria Lexhagen

Abstract

Purpose

The purpose of this study is to analyse the suitability of photo-sharing platforms, such as Flickr, to extract relevant knowledge on tourists’ spatial movement and point of interest (POI) visitation behaviour and compare the most prominent clustering approaches to identify POIs in various application scenarios.

Design/methodology/approach

The study, first, extracts photo metadata from Flickr, such as upload time, location and user. Then, photo uploads are assigned to latent POIs by density-based spatial clustering of applications with noise (DBSCAN) and k-means clustering algorithms. Finally, association rule analysis (FP-growth algorithm) and sequential pattern mining (generalised sequential pattern algorithm) are used to identify tourists’ behavioural patterns.
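The clustering step can be sketched with a small from-scratch DBSCAN. It assumes photo locations have already been projected to planar coordinates so that Euclidean distance is meaningful; raw latitude and longitude would need a geodesic distance instead. The parameter values and sample points are hypothetical, and the quadratic neighbor search is only suitable for small inputs.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels

# Two dense groups of photo locations plus one isolated upload.
photos = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10),
          (50, 50)]
labels = dbscan(photos, eps=1.5, min_pts=3)
```

Unlike k-means, this procedure needs no preset number of clusters and leaves isolated uploads unassigned, which is one reason the two techniques suit different scenarios.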

Findings

The approach has been demonstrated for the city of Munich, extracting 13,545 photos for the year 2015. POIs, identified by DBSCAN and k-means clustering, could be meaningfully assigned to well-known POIs. By doing so, both techniques show specific advantages for different usage scenarios. Association rule analysis revealed strong rules (support: 1.0-4.6 per cent; lift: 1.4-32.1 per cent), and sequential pattern mining identified relevant frequent visitation sequences (support: 0.6-1.7 per cent).
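The support and lift figures quoted above follow the usual association rule definitions, which the short sketch below computes for a single rule. The visit data are invented for illustration and are not the study's data. Note that lift is conventionally a dimensionless ratio, with values above 1 indicating that the two itemsets occur together more often than independence would predict.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= set(t))
    n_c = sum(1 for t in transactions if c <= set(t))
    n_ac = sum(1 for t in transactions if (a | c) <= set(t))
    if n_a == 0 or n_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = n_ac / n               # share of visitors matching the whole rule
    confidence = n_ac / n_a          # P(consequent | antecedent)
    lift = confidence / (n_c / n)    # confidence relative to the base rate
    return support, confidence, lift

# Hypothetical POI sets, one per visitor.
visits = [
    {"Marienplatz", "Frauenkirche"},
    {"Marienplatz", "Frauenkirche", "Olympiapark"},
    {"Olympiapark"},
    {"Marienplatz"},
    {"Englischer Garten"},
]
support, confidence, lift = rule_metrics(visits, {"Marienplatz"}, {"Frauenkirche"})
```

Low support with high lift, as reported in the findings, is typical when many POIs each attract a small share of visitors but certain pairs are visited together far more often than chance.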

Research limitations/implications

As a theoretic contribution, this study comparatively analyses the suitability of different clustering techniques to appropriately identify POIs based on photo upload data as an input to association rule analysis and sequential pattern mining as an alternative but also complementary techniques to analyse tourists’ spatial behaviour.

Practical implications

From a practical perspective, the study highlights that big data sources, such as Flickr, show the potential to effectively substitute traditional data sources for analysing tourists’ spatial behaviour and movement patterns within a destination. Especially, the approach offers the advantage of being fully automatic and executable in a real-time environment.

Originality/value

The study presents an approach to identify POIs by clustering photo uploads on social media platforms and to analyse tourists’ spatial behaviour by association rule analysis and sequential pattern mining. The study gains novel insights into the suitability of different clustering techniques to identify POIs in different application scenarios.

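Abstract (translated from Chinese)

Purpose

This paper analyses the suitability of the photo-sharing platform Flickr for extracting information on tourists' spatial movement and point of interest (POI) visitation behaviour, and compares the most prominent clustering approaches for identifying POIs in different scenarios.

Design/methodology/approach

The paper first extracts photo metadata from Flickr, such as upload time, location and user. It then uses the DBSCAN and k-means clustering algorithms to assign photo uploads to latent POIs. Finally, it applies association rule analysis (FP-growth algorithm) and sequential pattern mining (GSP algorithm) to identify tourists' behavioural patterns.

Findings

Taking the city of Munich as the sample, the paper extracts 13,545 photos for the year 2015. POIs identified by DBSCAN and k-means clustering were assigned to well-known POIs, which shows the respective advantages of the two techniques for different uses. Association rule analysis revealed strong associations (support: 1%-4.6%; lift: 1.4%-32.1%), and sequential pattern mining identified relevant frequent visitation sequences (support: 0.6%-1.7%).

Research limitations/implications

The theoretical contribution of this paper is a comparative analysis, based on photo data, of how well different clustering techniques identify POIs, together with a demonstration that association rule analysis and sequential pattern mining are alternative yet complementary techniques for analysing tourists' spatial behaviour.

Practical implications

In practical terms, the paper highlights that big data sources such as Flickr have the potential to effectively substitute traditional data sources for analysing tourists' spatial behaviour and movement patterns within a destination. In particular, the approach offers the advantage of being automatic and executable in real time.

Originality/value

The paper presents an approach that identifies POIs by clustering photo uploads on social media and analyses tourists' spatial behaviour through association rule analysis and sequential pattern mining. It offers original insights into the suitability of different clustering techniques for identifying POIs in different application scenarios.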

Article
Publication date: 25 April 2022

A. Deiva Ganesh and P. Kalpana

Abstract

Purpose

The global COVID-19 pandemic has revealed the need to transform the supply chain (SC) to be more resilient against unprecedented events. Identifying and assessing risk factors is the most significant phase in supply chain risk management (SCRM). Earlier risk quantification methods complicate timely decision-making because they cannot provide early warning. The paper proposes a model for analyzing social media data to understand potential SC risk factors in real time.

Design/methodology/approach

This paper exploits the potential of text mining, one of the most popular artificial intelligence (AI)-based data analytics approaches, for extracting information from social media. The model retrieves information from online SC forums using the Twitter streaming API.

Findings

The potential risk factors that disrupt SC performance are obtained from recent data by text-mining analyses. The outcomes carry valuable insights into some contemporary SC issues caused by the pandemic during 2021. The most frequent risk factors are also analyzed using rule mining techniques.

Originality/value

This study presents the significant role of Twitter in real-time risk identification from online SC platforms like “Supply Chain Dive”, “Supply Chain Brain” and “Supply Chain Digest”. The results indicate the significant role of data analytics in achieving accurate decision-making. Future research will extend to represent a digital twin for identifying potential risks through social media analytics, assessing risk propagation and obtaining mitigation strategies.

Details

Industrial Management & Data Systems, vol. 122 no. 5
Type: Research Article
ISSN: 0263-5577
