Search results
Gi Woong Yun, Jay Ford, Robert P. Hawkins, Suzanne Pingree, Fiona McTavish, David Gustafson and Haile Berhe
Abstract
Purpose
This paper seeks to discuss measurement units by comparing internet use with traditional media use, and to understand internet use from the perspective of traditional media use.
Design/methodology/approach
Benefits and shortcomings of two log file types will be carefully and exhaustively examined. Client‐side and server‐side log files will be analyzed and compared with proposed units of analysis.
Findings
Server‐side session time calculation was remarkably reliable and valid, based on its high correlation with the client‐side time calculation. The analysis revealed that server‐side log file session time measurement is more promising than researchers previously speculated.
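The server‐side session time measurement discussed above can be sketched in a few lines: group one visitor's request timestamps into sessions separated by an idle timeout and sum each session's duration. This is a minimal illustration, not the paper's actual scheme; the 30‐minute timeout and the function name are assumptions.

```python
from datetime import datetime, timedelta

def session_times(timestamps, timeout_minutes=30):
    """Group a visitor's request timestamps into sessions using an
    idle timeout, and return each session's duration in seconds.
    (Illustrative sketch; the 30-minute cutoff is an assumption.)"""
    ts = sorted(timestamps)
    timeout = timedelta(minutes=timeout_minutes)
    durations, start, prev = [], ts[0], ts[0]
    for t in ts[1:]:
        if t - prev > timeout:          # gap too long: close the session
            durations.append((prev - start).total_seconds())
            start = t
        prev = t
    durations.append((prev - start).total_seconds())
    return durations

# Example: three requests close together, then one after a long gap
hits = [datetime(2005, 9, 25, 10, 0),
        datetime(2005, 9, 25, 10, 5),
        datetime(2005, 9, 25, 10, 12),
        datetime(2005, 9, 25, 14, 0)]
print(session_times(hits))  # [720.0, 0.0]
```

Correlating such server‐side durations with client‐side timings, as the study does, is then a matter of running the same grouping over both log types.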
Practical implications
The ability to identify each individual user and the low incidence of caching problems were strong advantages for the analysis. These web design implementations and the web log data analysis scheme are recommended for future web log analysis research.
Originality/value
This paper examined the validity of client‐side and server‐side web log data. As a result of triangulating the two datasets, research designs and analysis schemes could be proposed and recommended.
Abstract
Most electronic journals are now Web‐based. This paper introduces the method of WWW server log file analysis and its application to evaluating electronic journals services and in monitoring their usage. Following a short description on the method and its possible application, the main results of a study of WWW server log file analysis of the electronic journal “Review of Information Science” will be presented and discussed. Finally, several concluding remarks will be given.
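WWW server log file analysis of the kind applied to the "Review of Information Science" starts from parsing raw access-log lines. A minimal sketch for the Common Log Format follows; real server logs vary, and the regular expression here is an assumption covering only the standard fields.

```python
import re

# Common Log Format: host ident authuser [date] "request" status bytes
CLF = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<date>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)')

def parse_line(line):
    """Return the fields of one Common Log Format line, or None."""
    m = CLF.match(line)
    return m.groupdict() if m else None

entry = parse_line('192.0.2.1 - - [25/Sep/2005:10:00:00 +0000] '
                   '"GET /review/article1.html HTTP/1.0" 200 2326')
print(entry["path"], entry["status"])  # /review/article1.html 200
```

Usage statistics for an electronic journal (most-requested articles, requests per day, and so on) are then aggregations over these parsed records.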
Abstract
As has been described elsewhere, web log files are a useful source of information about visitor site use, navigation behaviour, and, to some extent, demographics. But log files can also reveal the existence of both web pages and search engine queries that are sources of new visitors. This study extracts such information from a single web log file and uses it to illustrate its value, not only to the site owner but also to those interested in investigating the online behaviour of web users.
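The search-engine queries that bring new visitors live in the referrer field of each log record. A sketch of extracting them might look like the following; the parameter names checked (`q`, `p`, `query`) are common conventions, not an exhaustive list, and each engine must really be handled individually.

```python
from urllib.parse import urlparse, parse_qs

def referrer_query(referrer):
    """Return (domain, query terms) if the referrer looks like a
    search-engine results page, else (domain, None)."""
    parts = urlparse(referrer)
    params = parse_qs(parts.query)
    # 'q', 'p' and 'query' are common parameter names (an assumption;
    # each search engine uses its own)
    for key in ("q", "p", "query"):
        if key in params:
            return parts.netloc, params[key][0]
    return parts.netloc, None

print(referrer_query("http://www.google.com/search?q=web+log+analysis"))
# ('www.google.com', 'web log analysis')
```

Counting the non-search referrers in the same loop yields the web pages that act as sources of new visitors.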
David Nicholas, Paul Huntington, Peter Williams, Nat Lievesley, Tom Dobrowolski and Richard Withey
Abstract
There is a general dearth of trustworthy information on who is using the web and how they use it. Such information is of vital concern to web managers and their advertisers yet the systems for delivering such data, where in place, generally cannot supply accurate enough data. Nor have web managers the expertise or time to evaluate the enormous amounts of information that are generated by web sites. The article, based on the experience of evaluating The Times web server access logs, describes the methodological problems that lie at the heart of web log analysis, evaluates a range of use measures (visits, page impressions, hits) and provides some advice on what analyses are worth conducting.
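The three use measures the article evaluates (hits, page impressions, visits) can be computed from parsed log records. The sketch below is illustrative only: the file-extension filter for page impressions and the 30‐minute visit timeout are common conventions, not the article's definitions.

```python
def use_measures(records, timeout=1800):
    """Compute (hits, page impressions, visits) from parsed log
    records given as (host, epoch_seconds, path) tuples.
    Illustrative definitions; real analyses must choose their own."""
    hits = len(records)
    # Page impressions: exclude embedded objects such as images and styles
    pages = [r for r in records
             if not r[2].endswith((".gif", ".jpg", ".css"))]
    impressions = len(pages)
    # Visits: a new visit starts when a host reappears after a long gap
    visits, last_seen = 0, {}
    for host, ts, _ in sorted(records, key=lambda r: r[1]):
        if host not in last_seen or ts - last_seen[host] > timeout:
            visits += 1
        last_seen[host] = ts
    return hits, impressions, visits

log = [("a", 0, "/index.html"), ("a", 10, "/logo.gif"),
       ("a", 5000, "/news.html"), ("b", 20, "/index.html")]
print(use_measures(log))  # (4, 3, 3)
```

The gap between these three numbers for the same log is exactly the methodological problem the article describes: each measure tells a different story about "use".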
C.I. Ezeife, Jingyu Dong and A.K. Aggarwal
Abstract
Purpose
The purpose of this paper is to propose a web intrusion detection system (IDS), SensorWebIDS, which applies data mining, anomaly and misuse intrusion detection on web environment.
Design/methodology/approach
SensorWebIDS has three main components: the network sensor for extracting parameters from real‐time network traffic, the log digger for extracting parameters from web log files and the audit engine for analyzing all web request parameters for intrusion detection. To combat web intrusions such as the buffer overflow attack, SensorWebIDS utilizes an algorithm based on the standard deviation (σ) empirical rule, that 99.7 percent of data lie within 3σ of the mean, to calculate the possible maximum value length of input parameters. An association rule mining technique is employed for mining frequent parameter lists and their sequential order to identify intrusions.
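The empirical-rule threshold described above is simple to state in code: learn the mean and standard deviation of a parameter's value lengths from clean training data, and flag anything longer than mean + 3σ. This is a generic sketch of the statistical idea, not SensorWebIDS itself.

```python
from statistics import mean, stdev

def max_length_threshold(observed_lengths):
    """Empirical-rule threshold: ~99.7% of normally distributed
    lengths fall within mean + 3*sigma, so inputs longer than this
    are flagged as possible buffer-overflow attempts.
    (Generic sketch of the statistic, not the paper's system.)"""
    mu = mean(observed_lengths)
    sigma = stdev(observed_lengths)
    return mu + 3 * sigma

# Lengths of a form parameter's values in clean training traffic
lengths = [8, 10, 9, 11, 10, 9, 12, 10]
limit = max_length_threshold(lengths)
print(limit > max(lengths))  # normal inputs pass; a 200-char payload would not
```

The assumption of (roughly) normally distributed lengths is what makes the 99.7 percent figure meaningful; heavily skewed parameters would need a different rule.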
Findings
Experiments show that the proposed system has a higher detection rate for web intrusions than SNORT and mod_security for classes of web intrusions such as cross‐site scripting, SQL injection, session hijacking, cookie poisoning, denial of service, buffer overflow, and probe attacks.
Research limitations/implications
Future work may extend the system to detect intrusions implanted with hacking tools and not through straight HTTP requests or intrusions embedded in non‐basic resources like multimedia files and others, track illegal web users with their prior web‐access sequences, implement minimum and maximum values for integer data, and automate the process of pre‐processing training data so that it is clean and free of intrusion for accurate detection results.
Practical implications
Web service security, as a branch of network security, is becoming more important as more business and social activities are moved online to the web.
Originality/value
Existing network IDSs are not directly applicable to web intrusion detection, because they mostly sit at the lower (network/transport) levels of the network model, while web services run at the higher (application) level. The proposed SensorWebIDS detects XSS and SQL injection attacks through signatures, while other types of attacks are detected using association rule mining and statistics to compute frequent parameter list orders and their maximum value lengths.
Alesia Zuccala, Mike Thelwall, Charles Oppenheim and Rajveen Dhiensa
Abstract
Purpose
The purpose of this paper is to explore the use of LexiURL as a Web intelligence tool for collecting and analysing links to digital libraries, focusing specifically on the National electronic Library for Health (NeLH).
Design/methodology/approach
The Web intelligence techniques in this study are a combination of link analysis (web structure mining), web server log file analysis (web usage mining), and text analysis (web content mining), utilizing the power of commercial search engines and drawing upon the information science fields of bibliometrics and webometrics. LexiURL is a computer program designed to calculate summary statistics for lists of links or URLs. Its output is a series of standard reports, for example listing and counting all of the different domain names in the data.
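The kind of summary report described for LexiURL (listing and counting the distinct domain names in a list of links) can be approximated in a few lines. This is a toy reimplementation of the idea, not LexiURL's actual output format.

```python
from collections import Counter
from urllib.parse import urlparse

def domain_report(urls):
    """Count the distinct domain names in a list of link URLs --
    the kind of summary statistic a link-analysis report lists."""
    return Counter(urlparse(u).netloc for u in urls)

# Hypothetical inlinks to a digital library (illustrative URLs only)
links = ["http://www.nhs.uk/a", "http://example.edu/lib",
         "http://example.edu/med", "http://gov.example.org/x"]
print(domain_report(links).most_common(1))  # [('example.edu', 2)]
```

Cross-tabulating such domain counts against the referring domains found in the server's log files is what lets the study distinguish actual users from potential ones.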
Findings
Link data, when analysed together with user transaction log files (i.e. Web referring domains), can provide insights into who is using a digital library and when, and who could be using the digital library if they are "surfing" a particular part of the Web; in this case any site that is linked to or colinked with the NeLH. This study found that the NeLH was embedded in a multifaceted Web context, including many governmental, educational, commercial and organisational sites, with the most interesting being sites from the .edu domain, representing American universities. Not many links directed to the NeLH were followed on September 25, 2005 (the date of the log file analysis and link extraction analysis), which means that users who access the digital library have been arriving at the site via only a few select links, bookmarks and search engine searches, or non‐electronic sources.
Originality/value
A number of studies concerning digital library users have been carried out using log file analysis as a research tool. Log files focus on real‐time user transactions; while LexiURL can be used to extract links and colinks associated with a digital library's growing Web network. This Web network is not recognized often enough, and can be a useful indication of where potential users are surfing, even if they have not yet specifically visited the NeLH site.
Khaled A. Mohamed and Ahmed Hassan
Abstract
Purpose
This paper aims to examine the behaviour of the Egyptian scholars while accessing electronic resources through two federated search tools. The main purpose of this article is to provide guidance for federated search tool technicians and support teams about user issues, including the need for training.
Design/methodology/approach
Log files were exploited to examine the behaviour of users of information retrieval systems. This study examined two log files extracted from federated search tools available to the Egyptian scholars' community for accessing electronic resources. A data mining approach was implemented to investigate user behaviour through deep analysis of these logs.
Findings
Results show that: none of the available tools provide error messages for dummy queries; most of the Egyptian scholars had short queries; Boolean operators are not used in about 50 per cent of the queries; federated search tools do not provide techniques for query reformation; the optimal days for system maintenance are the non‐weekend vacations; and early morning is the best time for maintenance.
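Findings such as "most queries were short" and "Boolean operators appear in about 50 per cent of queries" come from simple descriptive statistics over the query log. A sketch of those two statistics, assuming whitespace-tokenised queries and uppercase operators:

```python
def query_stats(queries):
    """Mean query length in terms, and the share of queries using
    Boolean operators. (Assumes whitespace tokens and uppercase
    AND/OR/NOT; real logs need more careful normalisation.)"""
    terms = [len(q.split()) for q in queries]
    boolean = sum(1 for q in queries
                  if any(op in q.split() for op in ("AND", "OR", "NOT")))
    return sum(terms) / len(terms), boolean / len(queries)

qs = ["heart disease", "cancer AND treatment", "flu", "x OR y"]
print(query_stats(qs))  # (2.25, 0.5)
```

The maintenance-window findings fall out of the same log by bucketing query timestamps by weekday and hour instead.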
Practical implications
Understanding user trends when utilising federated search tools helps to maximise their value. The study shows that more attention should be given to search capabilities through ongoing training and awareness, in order to maximise the benefit from the available resources and tools.
Originality/value
The hypothetical value of the federated search tools has not been previously examined and analysed to understand user trends.
Majdi A. Maabreh, Mohammed N. Al‐Kabi and Izzat M. Alsmadi
Abstract
Purpose
This study is an attempt to develop an automatic identification method for Arabic web queries and divide them into several query types using data mining. In addition, it seeks to evaluate the impact of the academic environment on using the internet.
Design/methodology/approach
The web log files were collected from one of the higher institute's servers over a one‐month period. A special program was designed and implemented to extract web search queries from these files and also to automatically classify Arabic queries into three query types (i.e. Navigational, Transactional, and Informational queries) based on predefined specifications for each type.
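A rule-based classifier over query terms, as described above, can be sketched as follows. The keyword lists here are illustrative assumptions standing in for the paper's "predefined specifications" (and use English stand-ins rather than Arabic terms); the three labels match the paper's taxonomy.

```python
def classify_query(query):
    """Rule-of-thumb classifier for the three query types:
    navigational (seeking a site), transactional (seeking to do or
    obtain something), informational (everything else).
    The keyword lists are illustrative assumptions, not the
    paper's actual specifications."""
    q = query.lower()
    if any(t in q for t in ("www.", ".com", ".org", "homepage")):
        return "navigational"
    if any(t in q for t in ("download", "buy", "free", "mp3")):
        return "transactional"
    return "informational"

print([classify_query(q) for q in
       ("www.ju.edu.jo", "download acrobat", "history of amman")])
# ['navigational', 'transactional', 'informational']
```

The paper's reported 80.2–80.6 per cent accuracy suggests term-based rules of roughly this shape go a long way, while the residual fifth of queries resists surface classification.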
Findings
The results indicate that students are slowly and gradually using the internet for more relevant academic purposes. Tests showed that it is possible to automatically classify Arabic queries based on query terms, with 80.6 per cent to 80.2 per cent accuracy for the two phases of the test respectively. In their future strategies, Jordanian universities should apply methods to encourage university students to use the internet for academic purposes. Web search engines in general and Arabic search engines in particular may benefit from the proposed classification method in order to improve the effectiveness and relevancy of their results in accordance with users' needs.
Originality/value
Studying internet web logs has been the subject of many papers. However, the particular domain and the specific focus of this research distinguish it from the others.
Stefan Strohmeier and Franca Piazza
Abstract
Numerous research questions in e-HRM research are directly related to the usage of diverse information systems by HR professionals, line managers, employees, and/or applicants. Since they are regularly based on Internet technologies, information systems in e-HRM automatically store detailed usage data in log files of web servers. Subsumed as “web mining,” such data are frequently used as inputs for innovative data analysis in e-commerce practice. Though also promising in empirical e-HRM research, web mining is neither discussed nor applied in this area at present. Our chapter therefore aims at a methodological evaluation of web mining as an e-HRM research approach. After introducing web mining as a possible approach in e-HRM research, we examine its applicability by discussing available data, feasible methods, coverable topics, and confirmable privacy. Subsequently, we classify the approach methodologically by examining major issues. Our evaluation reveals that “web mining” constitutes a promising additional research approach that enables research to answer numerous relevant questions related to the actual usage of information systems in e-HRM.
Abstract
Purpose
The purpose of this article is to alert researchers to software for web tracking of information seeking behaviour, and to offer a list of criteria that will make it easier to select software. A selection of research projects based on web tracking as well as the benefits and disadvantages of web tracking are also explored.
Design/methodology/approach
An overview of the literature, including clarification of key concepts, a brief overview of studies of web information seeking behaviour based on web tracking, identification of software used, as well as the strengths and short‐comings noted for web tracking is used as a background to the identification of criteria for the selection of web tracking software.
Findings
Web tracking can offer very valuable information for the development of websites, portals, digital libraries, etc. It, however, needs to be supplemented by qualitative studies, and researchers need to ensure that the tracking software will collect the data required.
Research limitations/implications
The criteria are not applied to any software in particular.
Practical implications
The criteria can be used by researchers working on web usage and web information seeking behaviour to select suitable tracking software.
Originality/value
Although there are many reports on the use of web tracking (also reported in this article), nothing could be traced on criteria for the evaluation of web tracking software.