Search results

1 – 10 of over 42000
Article
Publication date: 22 May 2020

Yuanxin Ouyang, Hongbo Zhang, Wenge Rong, Xiang Li and Zhang Xiong

Abstract

Purpose

The purpose of this paper is to propose an attention alignment method for opinion mining of massive open online course (MOOC) comments. Opinion mining is essential for MOOC applications. In this study, the authors analyze some of the attention heads of bidirectional encoder representations from transformers (BERT) and explore how to use these attention heads to extract opinions from MOOC comments.

Design/methodology/approach

The proposed approach is based on an attention alignment mechanism and has three stages: first, extracting the original opinions from MOOC comments with dependency parsing; second, constructing frequent sets and using them to prune the opinions; and third, pruning the opinions further and discovering new opinions with the attention alignment mechanism.

Findings

The experiments on the MOOC comments data sets suggest that the opinion mining approach based on an attention alignment mechanism can obtain a better F1 score. Moreover, the attention alignment mechanism can discover some of the opinions filtered incorrectly by the frequent sets, which means the attention alignment mechanism can overcome the shortcomings of dependency analysis and frequent sets.

Originality/value

To take full advantage of pretrained language models, the authors propose an attention alignment method for opinion mining and combine this method with dependency analysis and frequent sets to improve the effectiveness. Furthermore, the authors conduct extensive experiments on different combinations of methods. The results show that the attention alignment method can effectively overcome the shortcomings of dependency analysis and frequent sets.

Details

Information Discovery and Delivery, vol. 50 no. 1
Type: Research Article
ISSN: 2398-6247

Article
Publication date: 13 March 2017

Yan Guo, Minxi Wang and Xin Li

Abstract

Purpose

The purpose of this paper is to make mobile e-commerce shopping more convenient and to avoid information overload by means of a mobile e-commerce recommendation system that uses an improved Apriori algorithm.

Design/methodology/approach

Based on the characteristics of mobile e-commerce, an improved Apriori algorithm is proposed and applied to the recommendation system. By improving data mining efficiency, the paper aims to make the products recommended to consumers more valuable. Finally, a Taobao online dress shop is used as an example to demonstrate the effectiveness of the improved Apriori algorithm in the mobile e-commerce recommendation system.
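
For readers unfamiliar with the baseline, the following is a minimal sketch of the classic Apriori procedure (level-wise counting, join and prune). It is not the improved variant the paper proposes, which the abstract does not describe. The function name `apriori`, the `min_support` threshold (an absolute count) and the sample baskets are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for every frequent itemset.

    Classic textbook Apriori; min_support is an absolute transaction count.
    """
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item that occurs anywhere
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate itemset
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical shopping baskets, for illustration only
baskets = [
    {"dress", "belt"},
    {"dress", "belt", "scarf"},
    {"dress", "scarf"},
    {"belt"},
]
frequent_sets = apriori(baskets, min_support=2)
```

With these sample baskets, the pair `{"belt", "scarf"}` appears only once and is dropped, so the three-item candidate is pruned without being counted. That pruning is the property that improved variants typically try to exploit more aggressively.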

Findings

The results of the experimental study show that the mobile e-commerce recommendation system based on an improved Apriori algorithm increases the efficiency of data mining, reconciling real-time performance with recommendation accuracy.

Originality/value

The improved Apriori algorithm is applied in the mobile e-commerce recommendation system to address the limited visual interface of mobile terminals and the mass of data that is continuously generated. The proposed recommendation system provides greater prediction accuracy than conventional data mining systems.

Details

Industrial Management & Data Systems, vol. 117 no. 2
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 20 June 2018

Ramzi A. Haraty and Rouba Nasrallah

Abstract

Purpose

The purpose of this paper is to propose a new model to enhance the auto-indexing of Arabic texts. The model extracts new relevant words by relating the words chosen by previous classical methods to new words using data mining rules.

Design/methodology/approach

The proposed model uses an association rule algorithm to extract frequent sets containing related items. These sets capture relationships between words in the texts to be indexed and words from texts that belong to the same category. The extracted word associations are represented as sets of words that frequently appear together.

Findings

The proposed methodology shows significant enhancement in terms of accuracy, efficiency and reliability when compared to previous works.

Research limitations/implications

The stemming algorithm can be further enhanced: Arabic has many grammatical rules, and the more of them are integrated into the stemming algorithm, the better the stemming will be. The stop-list can also be enhanced by adding more words that should not be taken into consideration by the indexing mechanism. Numbers should be added to the list as well, and a thesaurus system should be used, because it links different phrases or words with the same meaning, which improves the indexing mechanism. The authors also invite researchers to add more pre-requisite texts to obtain better results.

Originality/value

In this paper, the authors present a full text-based auto-indexing method for Arabic text documents. The auto-indexing method extracts new relevant words by using data mining rules, which has not been investigated before. The method uses an association rule mining algorithm for extracting frequent sets containing related items to extract relationships between words in the texts to be indexed with words from texts that belong to the same category. The benefits of the method are demonstrated using empirical work involving several Arabic texts.

Details

Library Hi Tech, vol. 37 no. 1
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 14 August 2017

Neha Verma and Jatinder Singh

Abstract

Purpose

The purpose of this paper is to explore various limitations of conventional mining systems in extracting useful buying patterns from retail transactional databases flooded with Big Data. The key objective is to assist retail business owners to better understand the purchase needs of their customers and hence to attract customers to physical retail stores away from competitor e-commerce websites.

Design/methodology/approach

This paper employs a systematic and category-based review of relevant literature to explore the challenges posed by Big Data for the retail industry, followed by a discussion and implementation of the association between MapReduce-based Apriori association mining and a Hadoop-based intelligent cloud architecture.

Findings

The findings reveal that conventional mining algorithms have not evolved to support Big Data analysis as required by modern retail businesses. They require substantial resources such as memory and computational engines. This study aims to develop the MR-Apriori algorithm in the form of an IRM tool to address these issues efficiently.

Research limitations/implications

The paper suggests that much research remains to be done in market basket analysis if the full potential of cloud-based Big Data frameworks is to be realized.

Originality/value

This research arms retail business owners with an innovative IRM tool for easily extracting comprehensive knowledge of customers' useful buying patterns to increase profits. The study experimentally verifies the effectiveness of the proposed algorithm.

Details

Industrial Management & Data Systems, vol. 117 no. 7
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 4 April 2008

C.I. Ezeife, Jingyu Dong and A.K. Aggarwal

Abstract

Purpose

The purpose of this paper is to propose a web intrusion detection system (IDS), SensorWebIDS, which applies data mining and anomaly and misuse intrusion detection in a web environment.

Design/methodology/approach

SensorWebIDS has three main components: the network sensor for extracting parameters from real‐time network traffic, the log digger for extracting parameters from web log files and the audit engine for analyzing all web request parameters for intrusion detection. To combat web intrusions such as buffer‐overflow attacks, SensorWebIDS uses an algorithm based on the empirical rule from standard deviation (δ) theory, under which 99.7 percent of data lie within 3δ of the mean, to calculate the maximum plausible value length of input parameters. An association rule mining technique is employed to mine frequent parameter lists and their sequential order to identify intrusions.
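
The empirical rule mentioned above can be sketched as a simple length check. This is an illustrative reading of the abstract, not the SensorWebIDS implementation: the function names, the use of the population standard deviation and the sample lengths are all assumptions.

```python
from statistics import mean, pstdev


def max_expected_length(observed_lengths):
    """Upper bound on parameter length: mean plus three standard deviations.

    Assumption: population standard deviation over clean training data.
    """
    mu = mean(observed_lengths)
    delta = pstdev(observed_lengths)  # written as δ in the abstract
    return mu + 3 * delta


def is_suspicious(value, observed_lengths):
    """Flag a parameter value whose length exceeds the 3δ bound."""
    return len(value) > max_expected_length(observed_lengths)


# Hypothetical lengths of one request parameter seen in clean traffic
training_lengths = [8, 10, 12, 10, 10]
flagged = is_suspicious("A" * 200, training_lengths)
```

Under the empirical rule, roughly 99.7 percent of normally distributed lengths fall below this bound, so a value far above it, such as an oversized buffer-overflow payload, is treated as anomalous. Real parameter lengths are not necessarily normally distributed, so the bound is a heuristic.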

Findings

Experiments show that the proposed system has a higher detection rate than SNORT and mod security for classes of web intrusions such as cross‐site scripting, SQL‐Injection, session hijacking, cookie poisoning, denial of service, buffer overflow and probe attacks.

Research limitations/implications

Future work may extend the system to detect intrusions implanted with hacking tools and not through straight HTTP requests or intrusions embedded in non‐basic resources like multimedia files and others, track illegal web users with their prior web‐access sequences, implement minimum and maximum values for integer data, and automate the process of pre‐processing training data so that it is clean and free of intrusion for accurate detection results.

Practical implications

Web service security, as a branch of network security, is becoming more important as more business and social activities are moved online to the web.

Originality/value

Existing network IDSs are not directly applicable to web intrusion detection, because they mostly operate at the lower (network/transport) levels of the network model, while web services run at the higher (application) level. The proposed SensorWebIDS detects XSS and SQL‐Injection attacks through signatures, while other types of attacks are detected using association rule mining and statistics to compute the order of frequent parameter lists and their maximum value lengths.

Details

International Journal of Web Information Systems, vol. 4 no. 1
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 5 July 2023

Yuxiang Shan, Qin Ren, Gang Yu, Tiantian Li and Bin Cao

Abstract

Purpose

Internet marketing underground industry users are people who use technical means to simulate a large number of real consumer behaviors in order to obtain marketing activity rewards illegally, which increases costs for enterprises and reduces the effect of marketing. This paper therefore aims to construct a user risk assessment model that identifies potential underground industry users, in order to protect the interests of real consumers and reduce the marketing costs of enterprises.

Design/methodology/approach

Feature extraction is based on two aspects. The first is based on traditional statistical characteristics: the density-based spatial clustering of applications with noise (DBSCAN) method is used to obtain user-dense regions, and each receiving address is assigned a risk level according to the total number of users in its region, so that high-quality address information can be extracted. The second is based on the time period during which users participate in activities: frequent item set mining is used to find multiple users with similar operations within the same time period, and a behavior flow chart is extracted according to user participation, so that the model can mine the deep relationship between participation behavior and underground industry users.
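
The second aspect can be illustrated with a small sketch that treats each time window as a transaction whose items are the users active in it, then keeps the user pairs that co-occur in enough windows. This is a simplified instance of frequent item set mining restricted to pairs, under assumptions of our own: the function name, the fixed-width windows, the `min_windows` threshold and the sample events are hypothetical, and the abstract does not specify the authors' actual procedure.

```python
from collections import Counter
from itertools import combinations


def frequent_user_pairs(events, window_seconds, min_windows):
    """Find user pairs that are active together in at least min_windows windows.

    events: iterable of (user_id, unix_timestamp) tuples (hypothetical format).
    """
    # Bucket users by fixed-width time window; each window acts as a transaction
    windows = {}
    for user, timestamp in events:
        windows.setdefault(timestamp // window_seconds, set()).add(user)
    # Count in how many windows each pair of users appears together
    pair_counts = Counter()
    for users in windows.values():
        for pair in combinations(sorted(users), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_windows}


# Hypothetical activity log: u1 and u2 repeatedly act within the same minute
events = [
    ("u1", 0), ("u2", 5), ("u3", 8),
    ("u1", 65), ("u2", 70),
    ("u3", 130),
    ("u1", 185), ("u2", 190),
]
pairs = frequent_user_pairs(events, window_seconds=60, min_windows=3)
```

Users that repeatedly show up in the same windows are candidates for coordinated, scripted behavior. Extending the sketch to larger groups would require a full frequent item set miner rather than pair counting.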

Findings

Features are extracted from a real underground industry user data set using the proposed method. The features are verified experimentally with different models, including random forest, a fully connected network, SVM and XGBoost, and the proposed method is comprehensively evaluated. Experimental results show that, in the best case, the method improves the F1-score of traditional models by 55.37%.

Originality/value

This paper investigates the relative importance of users' static information and dynamic behavior characteristics in predicting underground industry users, and whether the absence of features from these categories affects the prediction results. The investigation identifies the features that improve the accuracy of predicting underground industry users and can aid further research on this subject.

Details

International Journal of Web Information Systems, vol. 19 no. 2
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 19 May 2020

Praveen Kumar Gopagoni and Mohan Rao S K

Abstract

Purpose

Association rule mining generates the patterns and correlations from the database, which requires large scanning time, and the cost of computation associated with the generation of the rules is quite high. On the other hand, the candidate rules generated using the traditional association rules mining face a huge challenge in terms of time and space, and the process is lengthy. In order to tackle the issues of the existing methods and to render the privacy rules, the paper proposes the grid-based privacy association rule mining.

Design/methodology/approach

The primary intention of the research is to design and develop a distributed elephant herding optimization (EHO) for grid-based privacy association rule mining from the database. The proposed method generates rules in two steps. In the first step, the rules are generated using the Apriori algorithm, an effective association rule mining algorithm. In general, the extraction of association rules from the input database is based on confidence and support; in the proposed model, these are replaced with probability-based confidence and holo-entropy. In the second step, the generated rules are passed to the grid-based privacy rule mining, which produces privacy-dependent rules based on a novel optimization algorithm and grid-based fitness. The novel optimization algorithm is developed by integrating the distributed concept into the EHO algorithm.
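
For context, the standard support and confidence measures that the proposed model replaces can be sketched as follows. This shows only the conventional definitions; the probability-based confidence and holo-entropy measures are not defined in the abstract and are not reproduced here. The function names and sample transactions are assumptions.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= frozenset(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0


# Hypothetical transactions, for illustration only
transactions = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"b", "c"}]
rule_support = support(transactions, {"a", "b"})
rule_confidence = confidence(transactions, {"a"}, {"b"})
```

In standard Apriori-based rule generation, a rule is kept only if both values exceed user-chosen thresholds.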

Findings

The method is evaluated on databases taken from the Frequent Itemset Mining Dataset Repository, namely the retail, chess, T10I4D100K and T40I10D100K databases, to demonstrate the effectiveness of the distributed grid-based privacy association rule mining. The proposed method outperforms the existing methods by offering a higher degree of privacy and utility. Moreover, the distributed nature of the association rule mining facilitates parallel processing and generates the privacy rules without much computational burden. The rate of hiding capacity, the rate of information preservation and the rate of false rules generated for the proposed method are 0.4468, 0.4488 and 0.0654, respectively, which is better than the existing rule mining methods.

Originality/value

Data mining is performed in a distributed manner through grids that subdivide the input data. The rules are framed using Apriori-based association mining, a modification of the standard Apriori algorithm in which holo-entropy and probability-based confidence replace support and confidence. The mined rules do not assure privacy, and hence grid-based privacy rules are employed that use adaptive elephant herding optimization (AEHO) to generate the privacy rules. The AEHO introduces an adaptive nature into the standard EHO, which renders the global optimal solution.

Details

Data Technologies and Applications, vol. 54 no. 3
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 12 June 2019

Hu Qiao, Qingyun Wu, Songlin Yu, Jiang Du and Ying Xiang

Abstract

Purpose

The purpose of this paper is to propose a three-dimensional (3D) assembly model retrieval method based on assembling semantic information to address semantic mismatches, poor accuracy and low efficiency in existing 3D assembly model retrieval methods.

Design/methodology/approach

The paper proposes an assembly model retrieval method. First, assembly information retrieval is performed, and 3D models that conform to the design intention of the assembly are found by retrieving the code. On this basis, because there are conjugate subgraphs between attributed adjacency graphs (AAG) that have an assembly relationship, the assembly model geometric retrieval is translated into a problem of finding AAGs with a conjugate subgraph. Finally, the frequent subgraph mining method is used to retrieve AAGs with conjugate subgraphs.

Findings

The method improved the efficiency and accuracy of assembly model retrieval.

Practical implications

The examples illustrate the specific retrieval process and verify the feasibility and reasonability of the assembly model retrieval method in practical applications.

Originality/value

The assembly model retrieval method in the paper is an original method. Compared with other methods, good results were obtained.

Details

Assembly Automation, vol. 39 no. 4
Type: Research Article
ISSN: 0144-5154

Article
Publication date: 6 March 2017

Ruoyu Liang, Wei Guo and Deqing Yang

Abstract

Purpose

Analyzing the sentiment orientation of each product aspect/feature might be sufficient to assist the customer to make purchase/usage decisions, but such level of information obtained by sentiment analysis is not detailed enough to assist the company in making product improvement or design decisions. Therefore, this paper aims to propose a novel method to extract more detailed information of the product.

Design/methodology/approach

This paper uses a set of trivial lexical-Part-of-Speech patterns to prepare a candidate corpus and then adopts a topic model to find the optimal number of topics and obtain the word distributions in each topic. Finally, by combining a priori analysis with compactness rules, the authors identify the expected strong rules in each topic, which make up the final problems.

Findings

Experimental results on a real-life data set from the Xiaomi forum showed that the proposed method can extract product problems effectively. The authors also analyzed the experimental errors, which suggested directions for future research.

Originality/value

This paper proposed a novel method to obtain information of product problems in detail, which will be useful to assist companies to improve their product performance.

Details

Kybernetes, vol. 46 no. 3
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 11 March 2014

Sandra Maria Correia Loureiro, Francisco J. Miranda and Michael Breazeale

Abstract

Purpose

The aim of this study is to determine whether the cumulative effects of satisfaction, trust, and perceived value may, under certain conditions, provide more explanatory power for customer loyalty intentions than the often studied and more elusive customer delight. Herzberg's two-factor theory is used to explain why the frequent nature of grocery shopping, a primarily utilitarian experience, might introduce considerations that have not yet been addressed in the study of delight.

Design/methodology/approach

A survey is administered to a quota sample of Portuguese supermarket shoppers via phone, using a CATI system.

Findings

Research findings suggest that perceived value, trust, and satisfaction have a greater impact on behavioural outcomes than customer delight in the grocery shopping setting. In such a setting, cognitive drivers may be even more important for customers who are primarily concerned with hygiene factors (rather than motivators).

Research limitations/implications

Retailers are encouraged to focus on the more mundane factors that influence consumers' perceptions of value and trust rather than trying to invest in the substantial resources required to continually delight consumers. Future research may explore other determinants of loyalty intentions and test the extended model in different service sectors, cultural contexts and countries.

Originality/value

This study applies Oliver et al.'s consumer delight model in a utilitarian, frequent-use setting, finding previously undiscovered limitations to its validity.

Details

Journal of Service Management, vol. 25 no. 1
Type: Research Article
ISSN: 1757-5818
