Search results

1 – 10 of over 2000
Article
Publication date: 1 October 2006

Gi Woong Yun, Jay Ford, Robert P. Hawkins, Suzanne Pingree, Fiona McTavish, David Gustafson and Haile Berhe

Abstract

Purpose

This paper seeks to discuss measurement units by comparing internet use with traditional media use, and to understand internet use from a traditional media use perspective.

Design/methodology/approach

The benefits and shortcomings of the two log file types are examined in detail: client-side and server-side log files are analyzed and compared against the proposed units of analysis.

Findings

Server-side session time calculation proved remarkably reliable and valid, showing a high correlation with the client-side time calculation. The analysis revealed that server-side log file session time measurement is more promising than researchers had previously speculated.
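
As a rough illustration of the kind of server-side measurement being validated here, the sketch below computes per-user session time from request timestamps in a server log. The 30-minute inactivity timeout and the (user, timestamp) record layout are illustrative assumptions, not the authors' actual measurement scheme.

    # Hypothetical sketch: estimating session time from server-side request timestamps.
    # The 30-minute inactivity timeout and the (user_id, datetime) record layout are
    # assumptions for illustration, not the measurement scheme used in the paper.
    from collections import defaultdict
    from datetime import datetime, timedelta

    TIMEOUT = timedelta(minutes=30)

    def session_times(requests):
        """requests: iterable of (user_id, datetime) pairs taken from a server log."""
        by_user = defaultdict(list)
        for user, ts in requests:
            by_user[user].append(ts)

        durations = defaultdict(timedelta)
        for user, stamps in by_user.items():
            stamps.sort()
            start = prev = stamps[0]
            for ts in stamps[1:]:
                if ts - prev > TIMEOUT:          # long gap: close the current session
                    durations[user] += prev - start
                    start = ts
                prev = ts
            durations[user] += prev - start      # close the final session
        return dict(durations)

    example = [
        ("u1", datetime(2006, 10, 1, 9, 0)),
        ("u1", datetime(2006, 10, 1, 9, 12)),
        ("u1", datetime(2006, 10, 1, 11, 0)),    # new session after a long gap
    ]
    print(session_times(example))                # {'u1': datetime.timedelta(seconds=720)}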

Practical implications

The ability to identify each individual user and a low incidence of caching problems were strong advantages for the analysis. These web design implementations and the web log data analysis scheme are recommended for future web log analysis research.

Originality/value

This paper examined the validity of client-side and server-side web log data. As a result of triangulating the two datasets, research designs and analysis schemes could be proposed and recommended.

Details

Internet Research, vol. 16 no. 5
Type: Research Article
ISSN: 1066-2243

Article
Publication date: 4 April 2008

C.I. Ezeife, Jingyu Dong and A.K. Aggarwal

Abstract

Purpose

The purpose of this paper is to propose a web intrusion detection system (IDS), SensorWebIDS, which applies data mining and both anomaly and misuse intrusion detection in a web environment.

Design/methodology/approach

SensorWebIDS has three main components: the network sensor, which extracts parameters from real-time network traffic; the log digger, which extracts parameters from web log files; and the audit engine, which analyzes all web request parameters for intrusion detection. To combat web intrusions such as buffer-overflow attacks, SensorWebIDS uses the empirical rule of standard deviation (δ) theory, that 99.7 percent of data lie within 3δ of the mean, to calculate the maximum plausible value length of input parameters. An association rule mining technique is employed to mine frequent parameter lists and their sequential order in order to identify intrusions.
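
The length threshold implied by this empirical rule can be sketched in a few lines; the sample data and parameter handling below are illustrative assumptions, since the abstract does not describe SensorWebIDS's internals.

    # Hypothetical sketch of the 3-standard-deviation length threshold described above.
    # The training data is made up; SensorWebIDS's own parameter extraction and storage
    # are not described in the abstract.
    import statistics

    def max_length_threshold(observed_lengths):
        """Upper bound on a parameter's value length: mean + 3 * standard deviation.
        By the empirical rule, roughly 99.7% of legitimate values fall below this bound."""
        mean = statistics.mean(observed_lengths)
        stdev = statistics.pstdev(observed_lengths)
        return mean + 3 * stdev

    # Lengths of one form parameter observed in clean training traffic (illustrative).
    training_lengths = [8, 10, 9, 12, 11, 10, 9, 8, 10, 11]
    limit = max_length_threshold(training_lengths)

    suspect_value = "A" * 300              # e.g. a buffer-overflow style payload
    if len(suspect_value) > limit:
        print("possible buffer-overflow attempt: parameter length", len(suspect_value))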

Findings

Experiments show that the proposed system has a higher detection rate than SNORT and ModSecurity for classes of web intrusions such as cross-site scripting, SQL injection, session hijacking, cookie poisoning, denial of service, buffer overflow, and probe attacks.

Research limitations/implications

Future work may extend the system to detect intrusions launched with hacking tools rather than through plain HTTP requests, or embedded in non-basic resources such as multimedia files; track illegal web users through their prior web-access sequences; implement minimum and maximum values for integer data; and automate the pre-processing of training data so that it is clean and free of intrusions, giving more accurate detection results.

Practical implications

Web service security, as a branch of network security, is becoming more important as more business and social activities are moved online to the web.

Originality/value

Existing network IDSs are not directly applicable to web intrusion detection because they mostly operate at the lower (network/transport) levels of the network model, while web services run at the higher (application) level. The proposed SensorWebIDS detects XSS and SQL injection attacks through signatures, while other types of attacks are detected using association rule mining and statistics to compute frequent parameter lists, their order, and their maximum value lengths.

Details

International Journal of Web Information Systems, vol. 4 no. 1
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 13 April 2012

Ka I. Pun, Yain Whar Si and Kin Chan Pau

Abstract

Purpose

Intensive traffic often occurs in web-enabled business processes hosted by travel industry and government portals. An extreme case is a flash crowd situation, in which the number of web users spikes within a short time due to unexpected events such as political unrest or extreme weather. As a result, the servers hosting these business processes can no longer handle the overwhelming volume of service requests. To alleviate this problem, process engineers usually analyze audit trail data collected from the application server and reengineer their business processes to withstand unexpected surges in visitors. However, such analysis reveals the performance of the application server only from an internal perspective. This paper aims to investigate this issue.

Design/methodology/approach

This paper proposes an approach for analyzing key performance indicators of traffic intensive web‐enabled business processes from audit trail data, web server logs, and stress testing logs.

Findings

The key performance indicators identified in the study's approach can be used to understand the behavior of traffic intensive web‐enabled business processes and the underlying factors that affect the stability of the web server.

Originality/value

The proposed analysis provides an internal as well as an external view of performance. Moreover, the calculated key performance indicators can be used by process engineers for locating potential bottlenecks, reengineering business processes, and implementing contingency measures for traffic-intensive situations.

Details

Business Process Management Journal, vol. 18 no. 2
Type: Research Article
ISSN: 1463-7154

Article
Publication date: 1 February 1998

Zhongdong Zhang

Abstract

Most electronic journals are now web-based. This paper introduces the method of WWW server log file analysis and its application to evaluating electronic journal services and monitoring their usage. Following a short description of the method and its possible applications, the main results of a WWW server log file analysis of the electronic journal “Review of Information Science” are presented and discussed. Finally, several concluding remarks are given.

Details

VINE, vol. 28 no. 2
Type: Research Article
ISSN: 0305-5728

Article
Publication date: 14 March 2008

S.E. Kruck, Faye Teer and William A. Christian

Abstract

Purpose

The purpose of this paper is to describe a new software tool that graphically depicts analysis of visitor traffic. This new tool is the graph‐based server log analysis program (GSLAP).

Design/methodology/approach

Discovering hidden and meaningful information about web users' usage patterns is critical to optimizing the web server. The authors designed and developed GSLAP. This paper presents an example of GSLAP in the context of an analysis of the website of a small fictitious company, along with a review of current literature that supports graphical display of data as a cognitive aid to understanding it.
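
The abstract does not specify what GSLAP's graphs encode, so the sketch below shows only one common graph-based view of a server log, a directed graph whose edges count page-to-page transitions within each visitor's request sequence; this is an assumption for illustration, not GSLAP's actual design.

    # Illustrative only: the abstract does not describe GSLAP's internals.
    # One common graph-based view of a server log is a directed graph whose edges
    # count transitions between pages within each visitor's request sequence.
    from collections import Counter

    def transition_graph(log_entries):
        """log_entries: iterable of (visitor_id, requested_page) in time order."""
        last_page = {}
        edges = Counter()
        for visitor, page in log_entries:
            if visitor in last_page:
                edges[(last_page[visitor], page)] += 1
            last_page[visitor] = page
        return edges

    log = [("v1", "/"), ("v1", "/products"), ("v2", "/"),
           ("v1", "/contact"), ("v2", "/products")]
    for (src, dst), weight in transition_graph(log).items():
        print(src, "->", dst, "x", weight)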

Findings

GSLAP is shown to provide a visual server log analysis that is a great improvement over the textual server log.

Research limitations/implications

The benefits of the output from GSLAP are compared with the typical textual output.

Originality/value

The paper describes a software tool that aids the analysis of usage patterns in web traffic.

Details

Industrial Management & Data Systems, vol. 108 no. 2
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 1 September 2005

Jimmy Ghaphery

Abstract

Purpose

To study the use of “Quick Links”, a common navigational element, in the context of an academic library website.

Design/methodology/approach

Transaction log files and web server logs are analyzed over a four‐year period to detect patterns in Quick Link usage.

Findings

Provides information about which Quick Links have been used over time, as well as the relationship of Quick Link usage to the rest of the library website. Finds that Quick Link usage is generally prevalent, tilted toward a few of the choices, and drawn largely from the library homepage as the referral source.

Research limitations/implications

Log analysis does not include IP referral data, which limits the ability to distinguish patterns of use by specific locations, including service desks, off-campus, and in-house library usage.

Practical implications

This paper is useful for website usability in terms of design decisions and log analysis.

Originality/value

This paper targets a specific website usability issue over time.

Details

OCLC Systems & Services: International digital library perspectives, vol. 21 no. 3
Type: Research Article
ISSN: 1065-075X

Article
Publication date: 1 June 1999

David Nicholas, Paul Huntington, Peter Williams, Nat Lievesley, Tom Dobrowolski and Richard Withey

Abstract

There is a general dearth of trustworthy information on who is using the web and how they use it. Such information is of vital concern to web managers and their advertisers, yet the systems for delivering such data, where in place, generally cannot supply sufficiently accurate data. Nor do web managers have the expertise or time to evaluate the enormous amounts of information that web sites generate. The article, based on the experience of evaluating The Times web server access logs, describes the methodological problems that lie at the heart of web log analysis, evaluates a range of use measures (visits, page impressions, hits) and provides some advice on which analyses are worth conducting.
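
A minimal sketch of the three use measures named above, computed from a server access log. The record layout, the page-impression filter, and the 30-minute visit timeout are illustrative assumptions, not the article's own definitions.

    # Sketch only: hits, page impressions and visits from a server access log.
    # The record layout, the page filter and the visit timeout are assumptions.
    from datetime import datetime, timedelta

    VISIT_TIMEOUT = timedelta(minutes=30)
    PAGE_SUFFIXES = (".html", ".htm", "/")       # requests counted as page impressions

    def use_measures(entries):
        """entries: iterable of (client_ip, datetime, requested_path) in time order."""
        hits = page_impressions = visits = 0
        last_seen = {}
        for ip, ts, path in entries:
            hits += 1                                     # every request is a hit
            if path.endswith(PAGE_SUFFIXES):
                page_impressions += 1                     # pages, not images or scripts
            if ip not in last_seen or ts - last_seen[ip] > VISIT_TIMEOUT:
                visits += 1                               # a new visit for this client
            last_seen[ip] = ts
        return hits, page_impressions, visits

    log = [
        ("10.0.0.1", datetime(1999, 6, 1, 8, 0), "/index.html"),
        ("10.0.0.1", datetime(1999, 6, 1, 8, 0), "/logo.gif"),
        ("10.0.0.1", datetime(1999, 6, 1, 9, 0), "/news/"),
    ]
    print(use_measures(log))                              # (3, 2, 2)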

Details

Aslib Proceedings, vol. 51 no. 5
Type: Research Article
ISSN: 0001-253X

Article
Publication date: 1 December 2005

Hamid R. Jamali, David Nicholas and Paul Huntington

Abstract

Purpose

To provide a review of the log analysis studies of use and users of scholarly electronic journals.

Design/methodology/approach

The advantages and limitations of log analysis are described, and past studies of e-journal use and users that applied this methodology are critiqued. The results of these studies are briefly compared with some survey studies. The aspects of online journal use and user studies that log analysis can investigate well, and those about which it cannot disclose enough information, are highlighted.

Findings

The review indicates that although there is debate about the reliability of log analysis results, the methodology has great potential for studying the use of online journals and their users' information-seeking behaviour.

Originality/value

This paper highlights the strengths and weaknesses of log analysis for studying digital journals and raises a couple of questions to be investigated by further studies.

Details

Aslib Proceedings, vol. 57 no. 6
Type: Research Article
ISSN: 0001-253X

Article
Publication date: 8 June 2010

Guillermo Navarro‐Arribas and Vicenç Torra

Abstract

Purpose

The purpose of this paper is to anonymize web server log files used in e‐commerce web mining processes.

Design/methodology/approach

The paper applies statistical disclosure control (SDC) techniques to achieve this goal; more precisely, it introduces the micro-aggregation of web access logs.
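
As a rough illustration of the underlying SDC operation, the sketch below performs univariate micro-aggregation on a single numeric log attribute: records are grouped into clusters of at least k and each value is replaced by its cluster mean, so any released value is shared by at least k records. This single-attribute case is an assumption for illustration; the paper's micro-aggregation of full web access logs is more involved.

    # Sketch of univariate micro-aggregation, the basic SDC building block mentioned
    # above. Applying it to one numeric attribute (a per-user request count here)
    # is an illustrative simplification of the paper's method for web access logs.
    def microaggregate(values, k=3):
        """Replace each value by the mean of its size-k (or larger) group."""
        order = sorted(range(len(values)), key=lambda i: values[i])
        anonymized = [0.0] * len(values)
        i = 0
        while i < len(order):
            # the last group absorbs the remainder so every group has >= k members
            group = order[i:] if len(order) - i < 2 * k else order[i:i + k]
            mean = sum(values[j] for j in group) / len(group)
            for j in group:
                anonymized[j] = mean
            i += len(group)
        return anonymized

    requests_per_user = [3, 41, 5, 44, 4, 39, 120]
    print(microaggregate(requests_per_user, k=3))
    # [4.0, 61.0, 4.0, 61.0, 4.0, 61.0, 61.0]: each released value is shared by at least 3 users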

Findings

The experiments show that the proposed technique provides good results in general, but it is especially outstanding when dealing with relatively small websites.

Research limitations/implications

As in all SDC techniques, there is always a trade-off between privacy and utility or, in other words, between disclosure risk and information loss. This proposal bears that issue in mind, providing k-anonymity while preserving acceptable information accuracy.

Practical implications

Web server logs are valuable information used nowadays for user profiling and general data‐mining analysis of a website in e‐commerce and e‐services. This proposal allows anonymizing such logs, so they can be safely outsourced to other companies for marketing purposes, stored for further analysis, or made publicly available, without risking customer privacy.

Originality/value

Current solutions to the problem presented here are scarce and rather limited; they are normally reduced to eliminating sensitive information from the query strings of URLs. Moreover, to the authors' knowledge, SDC techniques have not previously been applied to the anonymization of web logs.

Details

Internet Research, vol. 20 no. 3
Type: Research Article
ISSN: 1066-2243

Article
Publication date: 31 July 2007

Alesia Zuccala, Mike Thelwall, Charles Oppenheim and Rajveen Dhiensa

Abstract

Purpose

The purpose of this paper is to explore the use of LexiURL as a Web intelligence tool for collecting and analysing links to digital libraries, focusing specifically on the National electronic Library for Health (NeLH).

Design/methodology/approach

The Web intelligence techniques in this study are a combination of link analysis (web structure mining), web server log file analysis (web usage mining), and text analysis (web content mining), utilizing the power of commercial search engines and drawing upon the information science fields of bibliometrics and webometrics. LexiURL is a computer program designed to calculate summary statistics for lists of links or URLs. Its output is a series of standard reports, for example listing and counting all of the different domain names in the data.
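
The kind of standard report described here can be illustrated with a short, hypothetical sketch that counts the domain names appearing in a list of links; it is not LexiURL's actual code, and the sample URLs are made up.

    # Illustrative sketch of a LexiURL-style summary report: counting the distinct
    # domain names in a list of collected links. Not LexiURL itself; sample URLs invented.
    from collections import Counter
    from urllib.parse import urlparse

    def domain_report(urls):
        """Count how many of the collected links point at each domain name."""
        return Counter(urlparse(u).netloc.lower() for u in urls)

    links = [
        "http://www.example.edu/library/guide.html",
        "http://www.example.edu/courses/",
        "http://www.nhs.uk/some/page",
    ]
    for domain, count in domain_report(links).most_common():
        print(domain, count)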

Findings

Link data, when analysed together with user transaction log files (i.e. Web referring domains), can provide insights into who is using a digital library and when, and who could be using the digital library if they are “surfing” a particular part of the Web; in this case any site that links to or is colinked with the NeLH. This study found that the NeLH was embedded in a multifaceted Web context, including many governmental, educational, commercial and organisational sites, with the most interesting being sites from the .edu domain, representing American universities. Not many links directed to the NeLH were followed on September 25, 2005 (the date of the log file analysis and link extraction analysis), which means that users who access the digital library have been arriving at the site via only a few select links, bookmarks and search engine searches, or non-electronic sources.

Originality/value

A number of studies concerning digital library users have been carried out using log file analysis as a research tool. Log files focus on real-time user transactions, while LexiURL can be used to extract links and colinks associated with a digital library's growing Web network. This Web network is not recognized often enough, and can be a useful indication of where potential users are surfing, even if they have not yet specifically visited the NeLH site.
