Search results
1 – 10 of over 14000
Stefan Strohmeier and Franca Piazza
Abstract
Numerous research questions in e-HRM research are directly related to the usage of diverse information systems by HR professionals, line managers, employees, and/or applicants. Since they are regularly based on Internet technologies, information systems in e-HRM automatically store detailed usage data in log files of web servers. Subsumed as “web mining,” such data are frequently used as inputs for innovative data analysis in e-commerce practice. Though also promising in empirical e-HRM research, web mining is neither discussed nor applied in this area at present. Our chapter therefore aims at a methodological evaluation of web mining as an e-HRM research approach. After introducing web mining as a possible approach in e-HRM research, we examine its applicability by discussing available data, feasible methods, coverable topics, and confirmable privacy. Subsequently, we classify the approach methodologically by examining major issues. Our evaluation reveals that “web mining” constitutes a promising additional research approach that enables research to answer numerous relevant questions related to the actual usage of information systems in e-HRM.
Yue‐Shi Lee, Show‐Jane Yen and Min‐Chi Hsieh
Abstract
Web mining applies data mining techniques to large amounts of web data in order to improve web services. Web traversal pattern mining discovers most of the users’ access patterns from web logs. This information can provide navigation suggestions for web users so that appropriate actions can be taken. However, web data grows rapidly over a short time, and some of it may become outdated. User behavior may change as new web data is inserted into, and old web data is deleted from, the web logs. In addition, it is difficult to select an ideal minimum support threshold during the mining process in order to find the interesting rules. Even experienced experts cannot determine the appropriate minimum support in advance, so the threshold must be adjusted repeatedly until satisfactory mining results are found. The essence of incremental or interactive data mining is that previous mining results can be reused to avoid unnecessary processing when the minimum support is changed or the web logs are updated. In this paper, we propose efficient incremental and interactive data mining algorithms to discover web traversal patterns and make the mining results satisfy the users’ requirements. The experimental results show that our algorithms are more efficient than the other approaches.
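The idea of reusing earlier results when the minimum support changes or the logs are updated can be illustrated with a minimal sketch. This is not the authors' algorithm: it simply keeps per-pattern session counts so that a new threshold or a new batch of sessions does not force a full rescan. The function names, the `max_len` limit, and the restriction to contiguous page sequences are all assumptions made for illustration.

```python
from collections import Counter

def count_patterns(sessions, max_len=3):
    """Count, for each contiguous page sequence, how many sessions contain it."""
    counts = Counter()
    for session in sessions:
        seen = set()
        for n in range(1, max_len + 1):
            for i in range(len(session) - n + 1):
                seen.add(tuple(session[i:i + n]))
        counts.update(seen)  # a session counts at most once per pattern
    return counts

def frequent_patterns(counts, n_sessions, min_support):
    """Filter stored counts by a relative minimum support threshold."""
    return {p: c / n_sessions
            for p, c in counts.items()
            if c / n_sessions >= min_support}

# Hypothetical sessions, each a list of visited pages in order.
sessions = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["B", "C"],
    ["A", "B", "C", "D"],
]
counts = count_patterns(sessions)

# Interactive step: try two thresholds against the same stored counts.
high = frequent_patterns(counts, len(sessions), 0.75)
low = frequent_patterns(counts, len(sessions), 0.5)

# Incremental step: only the newly logged session is scanned.
new_sessions = [["A", "B"]]
counts.update(count_patterns(new_sessions))
n_total = len(sessions) + len(new_sessions)
updated = frequent_patterns(counts, n_total, 0.75)
```

Storing every count trades memory for the ability to change the threshold cheaply. Sessions removed from the log could be handled the same way with `counts.subtract(count_patterns(old_sessions))`.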
Qingyu Zhang and Richard S. Segall
Abstract
Purpose
The purpose of this paper is to review and compare selected software for data mining, text mining (TM), and web mining that are not available as free open‐source software.
Design/methodology/approach
The selected software packages are compared on their common and unique features. The data mining packages are SAS® Enterprise Miner™, Megaputer PolyAnalyst® 5.0, NeuralWare Predict®, and BioDiscovery GeneSight®. The TM packages are CompareSuite, SAS® Text Miner, TextAnalyst, VisualText, Megaputer PolyAnalyst® 5.0, and WordStat. The web mining packages are Megaputer PolyAnalyst®, SPSS Clementine®, ClickTracks, and QL2.
Findings
This paper discusses and compares the existing features, characteristics, and algorithms of selected software for data mining, TM, and web mining, respectively. These packages are also applied to available data sets.
Research limitations/implications
The review is limited to selected software and data sets rather than covering the entire range of either. It could be used as a framework for comparing other data, text, and web mining software.
Practical implications
This paper can be helpful for an organization or individual when choosing proper software to meet their mining needs.
Originality/value
Each software package selected for this research has its own unique characteristics, properties, and algorithms. No other paper compares these selected packages both visually and descriptively across all three types of mining: data, text, and web.
Richard S. Segall and Qingyu Zhang
Abstract
Purpose
The purpose of this paper is to illustrate the usefulness and results of applying web mining as extensions of data mining.
Design/methodology/approach
Web mining is performed by applying three selected software packages to databases of customer survey data, marketing campaign data, and web site usage data. The three packages are PolyAnalyst® of Megaputer Intelligence, Inc., SPSS Clementine®, and ClickTracks by Web Analytics.
Findings
This paper discusses and compares the web mining technologies used by the selected software as applied to text, web, and click stream data.
Research limitations/implications
The limitations include the availability of databases and software to perform the web mining. The implications include that this methodology can be extended to other databases.
Practical implications
The methodology used in this paper could be representative of that used for managers to manage their relationships with customers, their marketing campaigns, and their web site activities.
Originality/value
PolyAnalyst is applied to analyze text data of actual written hotel comments. SPSS Clementine is applied to customer web data collected in response to several different marketing campaigns, including age, gender, and income. ClickTracks is applied to click‐stream data for Bob's Fruit web site to generate click fraud report, search report with revenues, pay‐per‐click, and search keywords for all visitors.
Tim France, Dave Yen, Jyun‐Cheng Wang and Chia‐Ming Chang
Abstract
In recent years, the World Wide Web (WWW) has become incredibly popular in homes and offices alike. Consumers need to search for relevant information to help solve purchasing problems on various Web sites. Although there is no question that great numbers of WWW users will continue using search engines for information retrieval, consumers still hesitate before making a final decision, often because only rough and limited information about the products is made available. Consequently, consumers need the help of data mining in order to help them make informed decisions. Herein we propose a new approach to integrating a search engine with data mining in an effort to help support customer‐oriented information search action. This approach also illustrates how to reduce the consumer’s information search perplexity.
Abstract
Purpose
This study aims to introduce an application of web‐based data mining that integrates online data collection and data mining in selling strategies for online auctions. This study seeks to illustrate the process of spider online data collection from eBay and the application of the classification and regression tree (CART) in constructing effective selling strategies.
Design/methodology/approach
After developing a prototype of web‐based data mining, the four steps of spider online data collection and CART data mining are shown. A business dataset from eBay is collected, and the application to derive effective selling strategies for online auctions is used.
Findings
In the web‐based data‐mining application the spiders can effectively and efficiently collect online auction data from the internet, and the CART model provides sellers with effective selling strategies. By using expected auction prices with the classification and regression trees, sellers can integrate their two primary goals, i.e. auction success and anticipated prices, in their selling strategies for online auctions.
Practical implications
This study provides sellers with a useful tool to construct effective selling strategies by taking advantage of web‐based data mining. These effective selling strategies will help improve their online auction performance.
Originality/value
This study contributes to the literature by providing an innovative tool for collecting online data and for constructing effective selling strategies, which are important for the growth of electronic marketplaces.
Rong Gu, Miaoliang Zhu, Liying Zhao and Ningning Zhang
Abstract
Purpose
Behaviour in virtual learning environments (VLE), including travel, gaze, manipulation, gesture and conversation, offers considerable information about the user's implicit interest. The purpose of this study is to find an approach for user interest mining via behaviour analysis in a VLE.
Design/methodology/approach
According to research in psychology, any interaction in a VLE has implications for the user's implicit interest. In order to mine a user's implicit interest, an explicit interaction‐interest model needs to be established. This paper presents findings from the concept classification of behaviour in a VLE. Based on this classification, the paper proposes a hierarchical interaction model. In this model the relation between interaction and user interest can be described and used to improve system performance.
Findings
In the experimental prototype the authors found that the user's implicit interest could be mined through the stages of web mining, i.e. capturing the user's original gesture signal, data pre‐processing, pattern discovery, interaction goal and interest mining. The mined interest information can be used to update the state of local interest, reducing network traffic and improving system performance.
Originality/value
This is an original study using behaviour analysis for interest mining in e‐learning. Research on interest mining in e‐learning focused on content mining or search engine and usage mining in web courses. The paper provides valuable clues regarding user interest mining in a VLE, in which the context is different from usual web courses. The research output can be implemented widely, including online learning, and especially in the VLE.
Guillermo Navarro‐Arribas and Vicenç Torra
Abstract
Purpose
The purpose of this paper is to anonymize web server log files used in e‐commerce web mining processes.
Design/methodology/approach
The paper has applied statistical disclosure control (SDC) techniques to achieve its goal. More precisely, it has introduced the micro‐aggregation of web access logs.
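As a rough illustration of micro‐aggregation in general, and not the authors' method for web access logs, the sketch below masks a single numeric attribute. It sorts the values, partitions them into groups of at least k records, and replaces each value with its group mean, so that every published value is shared by at least k records. The function name and the choice of a per-session duration attribute are hypothetical.

```python
def microaggregate(values, k):
    """Replace each value by the mean of its group of at least k rank-adjacent values."""
    if k < 1 or len(values) < k:
        raise ValueError("need k >= 1 and at least k values")
    # Indices of the records, ordered by attribute value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(order)
    result = [0.0] * n
    start = 0
    while start < n:
        end = start + k
        # Fold a tail of fewer than k records into the current group.
        if n - end < k:
            end = n
        group = order[start:end]
        group_mean = sum(values[i] for i in group) / len(group)
        for i in group:
            result[i] = group_mean
        start = end
    return result

# Hypothetical session durations in seconds, one per log record.
durations = [12, 300, 15, 280, 14, 310, 900]
masked = microaggregate(durations, k=3)
```

Larger values of k lower the disclosure risk but move each masked value further from the original, which is the privacy and utility trade‐off that SDC techniques have to balance.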
Findings
The experiments show that the proposed technique provides good results in general, but it is especially outstanding when dealing with relatively small websites.
Research limitations/implications
As in all SDC techniques, there is always a trade‐off between privacy and utility or, in other words, between disclosure risk and information loss. The proposal bears this issue in mind, providing k‐anonymity while preserving acceptable information accuracy.
Practical implications
Web server logs are valuable information used nowadays for user profiling and general data‐mining analysis of a website in e‐commerce and e‐services. This proposal allows anonymizing such logs, so they can be safely outsourced to other companies for marketing purposes, stored for further analysis, or made publicly available, without risking customer privacy.
Originality/value
Current solutions to the problem presented here are scarce and poor. They are normally limited to removing sensitive information from the query strings of URLs. Moreover, to the authors' knowledge, SDC techniques have never been applied to the anonymization of web logs.
C.I. Ezeife, Jingyu Dong and A.K. Aggarwal
Abstract
Purpose
The purpose of this paper is to propose a web intrusion detection system (IDS), SensorWebIDS, which applies data mining, anomaly and misuse intrusion detection on web environment.
Design/methodology/approach
SensorWebIDS has three main components: the network sensor for extracting parameters from real‐time network traffic, the log digger for extracting parameters from web log files, and the audit engine for analyzing all web request parameters for intrusion detection. To combat web intrusions such as buffer‐overflow attacks, SensorWebIDS uses an algorithm based on the empirical rule from standard deviation (δ) theory, under which 99.7 percent of data lie within 3δ of the mean, to calculate the maximum plausible value length of input parameters. Association rule mining is employed to mine the frequent parameter lists and their sequential order in order to identify intrusions.
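The empirical‐rule check on parameter lengths can be sketched as follows. This is an illustrative reading of the description above, not SensorWebIDS code: the function names are hypothetical, the training values are invented, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev

def max_length_threshold(training_values):
    """Upper bound on value length: mean + 3 standard deviations of observed lengths."""
    lengths = [len(v) for v in training_values]
    return mean(lengths) + 3 * pstdev(lengths)

def is_suspicious(value, threshold):
    """Flag a parameter value whose length exceeds the learned bound."""
    return len(value) > threshold

# Hypothetical clean training values for one request parameter.
training = ["alice", "bob", "charlie", "dave", "eve", "mallory"]
threshold = max_length_threshold(training)

overflow_attempt = "A" * 200                             # oversized input
flagged = is_suspicious(overflow_attempt, threshold)     # True
normal = is_suspicious("frank", threshold)               # False
```

A bound of this kind is only as good as its training data. If the training values already contain oversized inputs, the threshold rises and real attacks can slip under it.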
Findings
Experiments show that the proposed system has a higher detection rate than SNORT and ModSecurity for classes of web intrusions such as cross‐site scripting, SQL‐Injection, session hijacking, cookie poisoning, denial of service, buffer overflow, and probe attacks.
Research limitations/implications
Future work may extend the system to detect intrusions implanted with hacking tools and not through straight HTTP requests or intrusions embedded in non‐basic resources like multimedia files and others, track illegal web users with their prior web‐access sequences, implement minimum and maximum values for integer data, and automate the process of pre‐processing training data so that it is clean and free of intrusion for accurate detection results.
Practical implications
Web service security, as a branch of network security, is becoming more important as more business and social activities are moved online to the web.
Originality/value
Existing network IDSs are not directly applicable to web intrusion detection, because they mostly operate at the lower (network/transport) levels of the network model, while web services run at the higher (application) level. The proposed SensorWebIDS detects XSS and SQL‐Injection attacks through signatures, while other types of attacks are detected using association rule mining and statistics to compute the frequent parameter list order and the maximum value lengths.
Baoyao Zhou, Siu Cheung Hui and Alvis C. M. Fong
Abstract
With the explosive growth of information available on the World Wide Web, it has become much more difficult to access relevant information from the Web. One possible approach to solving this problem is web personalization. In this paper, we propose a novel WUL (Web Usage Lattice) based mining approach for mining association access pattern rules for personalized web recommendations. The proposed approach aims to mine a reduced set of effective association pattern rules in order to enhance the online performance of web recommendations. We have incorporated the proposed approach into a personalized web recommender system known as AWARS. The performance of the proposed approach is evaluated in terms of efficiency and quality. In the efficiency evaluation, we measure the number of generated rules and the runtime for online recommendations. In the quality evaluation, we measure the quality of the recommendation service based on precision, satisfaction and applicability. This paper discusses the proposed WUL‐based mining approach and compares its performance with that of Apriori‐based algorithms.
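To make the notion of association access pattern rules concrete, the sketch below mines simple page‐to‐page rules from sessions and uses them for recommendations. It is neither the WUL‐based approach nor the Apriori‐based baseline from the paper: it only covers rules with one page on each side, and the function names, sessions, and thresholds are hypothetical.

```python
from collections import Counter
from itertools import permutations

def pair_rules(sessions, min_support, min_confidence):
    """Mine rules a -> b with support and confidence above the given thresholds."""
    n = len(sessions)
    page_counts = Counter()
    pair_counts = Counter()
    for session in sessions:
        pages = set(session)  # each session counts once per page and per pair
        page_counts.update(pages)
        pair_counts.update(permutations(sorted(pages), 2))
    rules = {}
    for (a, b), c in pair_counts.items():
        support = c / n                 # share of sessions containing both pages
        confidence = c / page_counts[a] # share of sessions with a that also have b
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = (support, confidence)
    return rules

def recommend(rules, current_page, top_n=3):
    """Return pages implied by the current page, highest confidence first."""
    candidates = [(conf, b) for (a, b), (_, conf) in rules.items()
                  if a == current_page]
    ranked = sorted(candidates, key=lambda t: (-t[0], t[1]))
    return [b for _, b in ranked[:top_n]]

# Hypothetical browsing sessions.
sessions = [
    ["home", "products", "cart"],
    ["home", "products"],
    ["home", "about"],
    ["products", "cart"],
]
rules = pair_rules(sessions, min_support=0.5, min_confidence=0.6)
suggestions = recommend(rules, "cart")
```

Raising either threshold shrinks the rule set, which speeds up online lookup at the cost of fewer pages for which any recommendation is available.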