Search results
1 – 10 of over 2000

Gi Woong Yun, Jay Ford, Robert P. Hawkins, Suzanne Pingree, Fiona McTavish, David Gustafson and Haile Berhe
Abstract
Purpose
This paper seeks to discuss measurement units by comparing the internet use and the traditional media use, and to understand internet use from the traditional media use perspective.
Design/methodology/approach
Benefits and shortcomings of two log file types will be carefully and exhaustively examined. Client‐side and server‐side log files will be analyzed and compared with proposed units of analysis.
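As a rough illustration of the unit being compared, a server-side session time can be estimated from request timestamps. This is a minimal sketch under common assumptions (a single identified user, a 30-minute idle cutoff), not the authors' actual procedure:

```python
# Illustrative sketch, not the authors' method: estimate server-side
# "session time" by summing gaps between consecutive requests, skipping
# long gaps that likely mark idle periods rather than active reading.
from datetime import datetime, timedelta

IDLE_CUTOFF = timedelta(minutes=30)  # assumed timeout, a common convention

def session_time(timestamps):
    """timestamps: time-sorted request times for one identified user."""
    total = timedelta()
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap <= IDLE_CUTOFF:   # count only gaps within the idle cutoff
            total += gap
    return total

visits = [
    datetime(2000, 1, 1, 10, 0),
    datetime(2000, 1, 1, 10, 5),
    datetime(2000, 1, 1, 10, 20),
    datetime(2000, 1, 1, 12, 0),   # 100-minute gap, treated as idle
]
print(session_time(visits))
```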
Findings
Server-side session time calculation was remarkably reliable and valid, based on its high correlation with the client-side time calculation. The analysis revealed that server-side log file session time measurement is more promising than researchers had previously speculated.
Practical implications
The ability to identify each individual user and a low incidence of caching problems were strong advantages for the analysis. These web design implementations and the web log data analysis scheme are recommended for future web log analysis research.
Originality/value
This paper examined the validity of client-side and server-side web log data. Based on the triangulation of the two datasets, research designs and analysis schemes could be recommended.
C.I. Ezeife, Jingyu Dong and A.K. Aggarwal
Abstract
Purpose
The purpose of this paper is to propose a web intrusion detection system (IDS), SensorWebIDS, which applies data mining, anomaly and misuse intrusion detection on web environment.
Design/methodology/approach
SensorWebIDS has three main components: the network sensor for extracting parameters from real-time network traffic, the log digger for extracting parameters from web log files, and the audit engine for analyzing all web request parameters for intrusion detection. To combat web intrusions such as buffer-overflow attacks, SensorWebIDS utilizes an algorithm based on the empirical rule of the standard deviation (σ), that 99.7 percent of data lie within 3σ of the mean, to calculate the possible maximum value length of input parameters. An association rule mining technique is employed to mine frequent parameter lists and their sequential order in order to identify intrusions.
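The length threshold derived from the empirical rule can be sketched as follows; the function name and sample lengths are illustrative, not taken from SensorWebIDS:

```python
# Illustrative sketch (not the SensorWebIDS code): derive a maximum
# allowed length for a web request parameter from the empirical rule,
# under which ~99.7% of normally distributed values fall within
# 3 standard deviations of the mean.
from statistics import mean, pstdev

def max_value_length(observed_lengths):
    """Return mean + 3*sigma as a length threshold; inputs longer than
    this are flagged as possible buffer-overflow attempts."""
    mu = mean(observed_lengths)
    sigma = pstdev(observed_lengths)
    return mu + 3 * sigma

# Hypothetical lengths of a "username" parameter in clean training traffic
lengths = [8, 10, 9, 12, 7, 11, 10, 9]
print(max_value_length(lengths))
```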
Findings
Experiments show that the proposed system has a higher detection rate than SNORT and ModSecurity for classes of web intrusions such as cross-site scripting, SQL injection, session hijacking, cookie poisoning, denial of service, buffer overflow, and probe attacks.
Research limitations/implications
Future work may extend the system to detect intrusions implanted with hacking tools and not through straight HTTP requests or intrusions embedded in non‐basic resources like multimedia files and others, track illegal web users with their prior web‐access sequences, implement minimum and maximum values for integer data, and automate the process of pre‐processing training data so that it is clean and free of intrusion for accurate detection results.
Practical implications
Web service security, as a branch of network security, is becoming more important as more business and social activities are moved online to the web.
Originality/value
Existing network IDSs are not directly applicable to web intrusion detection, because they mostly sit at the lower (network/transport) layers of the network model, while web services run at the higher (application) layer. The proposed SensorWebIDS detects XSS and SQL injection attacks through signatures, while other types of attacks are detected using association rule mining and statistics to compute frequent parameter list orders and their maximum value lengths.
Ka I. Pun, Yain Whar Si and Kin Chan Pau
Abstract
Purpose
Intensive traffic often occurs in web-enabled business processes hosted by travel industry and government portals. An extreme case of intensive traffic is a flash crowd situation, when the number of web users spikes within a short time due to unexpected events such as political unrest or extreme weather. As a result, the servers hosting these business processes can no longer handle the overwhelming volume of service requests. To alleviate this problem, process engineers usually analyze audit trail data collected from the application server and reengineer their business processes to withstand an unexpected surge in visitors. However, such analysis reveals the performance of the application server only from an internal perspective. This paper aims to investigate this issue.
Design/methodology/approach
This paper proposes an approach for analyzing key performance indicators of traffic intensive web‐enabled business processes from audit trail data, web server logs, and stress testing logs.
Findings
The key performance indicators identified in the study's approach can be used to understand the behavior of traffic intensive web‐enabled business processes and the underlying factors that affect the stability of the web server.
Originality/value
The proposed analysis also provides an internal as well as an external view of the performance. Moreover, the calculated key performance indicators can be used by the process engineers for locating potential bottlenecks, reengineering business processes, and implementing contingency measures for traffic intensive situations.
Abstract
Most electronic journals are now Web‐based. This paper introduces the method of WWW server log file analysis and its application to evaluating electronic journals services and in monitoring their usage. Following a short description on the method and its possible application, the main results of a study of WWW server log file analysis of the electronic journal “Review of Information Science” will be presented and discussed. Finally, several concluding remarks will be given.
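The kind of server log file analysis described can be sketched as a small parser over Apache-style common log lines; the pattern and sample entries below are illustrative assumptions, not data from the study:

```python
# Illustrative sketch of WWW server log file analysis: parse lines in the
# common Apache log format and tally successful requests per URL path.
# The sample lines and journal paths are hypothetical.
import re
from collections import Counter

LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

def count_article_requests(lines):
    """Count successful (2xx) requests per requested path."""
    hits = Counter()
    for line in lines:
        m = LOG_PATTERN.match(line)
        if m and m.group("status").startswith("2"):
            hits[m.group("path")] += 1
    return hits

sample = [
    '10.0.0.1 - - [01/Jan/2000:10:00:00 +0000] "GET /review/issue1.html HTTP/1.0" 200 1024',
    '10.0.0.2 - - [01/Jan/2000:10:05:00 +0000] "GET /review/issue1.html HTTP/1.0" 200 1024',
    '10.0.0.1 - - [01/Jan/2000:10:06:00 +0000] "GET /missing.html HTTP/1.0" 404 512',
]
print(count_article_requests(sample))
```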
S.E. Kruck, Faye Teer and William A. Christian
The purpose of this paper is to describe a new software tool that graphically depicts analysis of visitor traffic. This new tool is the graph‐based server log analysis program…
Abstract
Purpose
The purpose of this paper is to describe a new software tool that graphically depicts analysis of visitor traffic. This new tool is the graph‐based server log analysis program (GSLAP).
Design/methodology/approach
Discovering hidden and meaningful information about web users' patterns of usage is critical to optimization of the web server. The authors designed and developed GSLAP. Presented in this paper is an example of GSLAP in the context of an analysis of the web site of a small fictitious company. Also included is an explanation of current literature that supports graphical display of data as a cognitive aid to understanding data.
Findings
GSLAP is shown to provide a visual server log analysis that is a great improvement on the textual server log.
Research limitations/implications
The benefits of the output from GSLAP are compared with the typical textual output.
Originality/value
The paper describes a software tool that helps the analysis of usage patterns of web traffic.
Abstract
Purpose
To study the use of “Quick Links”, a common navigational element, in the context of an academic library website.
Design/methodology/approach
Transaction log files and web server logs are analyzed over a four‐year period to detect patterns in Quick Link usage.
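A tally of Quick Link usage by referring page might be sketched as follows; the link targets and log records are hypothetical, not the study's data:

```python
# Illustrative sketch (not the study's actual analysis): count Quick Link
# clicks by referring page from simplified (requested_path, referrer) records.
from collections import Counter

QUICK_LINKS = {"/catalog", "/databases", "/hours"}  # hypothetical targets

def quicklink_referrers(records):
    """records: iterable of (requested_path, referrer_path) pairs."""
    return Counter(ref for path, ref in records if path in QUICK_LINKS)

log = [
    ("/catalog", "/"),           # clicked from the homepage
    ("/databases", "/"),
    ("/catalog", "/services"),
    ("/about", "/"),             # not a Quick Link -> ignored
]
print(quicklink_referrers(log))
```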
Findings
Provides information about what Quick Links have been used over time, as well as the relationship of Quick Link usage to the rest of the library website. Finds generally that Quick Link usage is prevalent, tilted toward a few of the choices, and is drawn largely from the library homepage as referral source.
Research limitations/implications
Log analysis does not include IP referral data, which limits the ability to determine different patterns of use by specific locations including services desks, off‐campus, and in‐house library usage.
Practical implications
This paper is useful for website usability in terms of design decisions and log analysis.
Originality/value
This paper targets a specific website usability issue over time.
David Nicholas, Paul Huntington, Peter Williams, Nat Lievesley, Tom Dobrowolski and Richard Withey
Abstract
There is a general dearth of trustworthy information on who is using the web and how they use it. Such information is of vital concern to web managers and their advertisers yet the systems for delivering such data, where in place, generally cannot supply accurate enough data. Nor have web managers the expertise or time to evaluate the enormous amounts of information that are generated by web sites. The article, based on the experience of evaluating The Times web server access logs, describes the methodological problems that lie at the heart of web log analysis, evaluates a range of use measures (visits, page impressions, hits) and provides some advice on what analyses are worth conducting.
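The three use measures evaluated (hits, page impressions, visits) can be illustrated on simplified request records; the 30-minute session timeout and the page-suffix rule are common conventions assumed here, not taken from the article:

```python
# Illustrative sketch of three web-use measures: hits (all requests),
# page impressions (requests for pages rather than embedded objects),
# and visits (sessions per host, split on a 30-minute idle gap).
from datetime import datetime, timedelta

PAGE_SUFFIXES = (".html", "/")        # assumed rule for "page" requests
SESSION_GAP = timedelta(minutes=30)   # assumed session timeout

def use_measures(requests):
    """requests: list of (host, datetime, path), time-sorted per host."""
    hits = len(requests)
    pages = sum(1 for _, _, p in requests if p.endswith(PAGE_SUFFIXES))
    visits, last_seen = 0, {}
    for host, ts, _ in requests:
        if host not in last_seen or ts - last_seen[host] > SESSION_GAP:
            visits += 1               # first request, or idle gap exceeded
        last_seen[host] = ts
    return {"hits": hits, "page_impressions": pages, "visits": visits}

reqs = [
    ("a", datetime(2000, 1, 1, 10, 0), "/index.html"),
    ("a", datetime(2000, 1, 1, 10, 1), "/logo.gif"),   # embedded object
    ("a", datetime(2000, 1, 1, 11, 0), "/news.html"),  # gap > 30 min: new visit
    ("b", datetime(2000, 1, 1, 10, 5), "/"),
]
print(use_measures(reqs))
```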
Hamid R. Jamali, David Nicholas and Paul Huntington
Abstract
Purpose
To provide a review of the log analysis studies of use and users of scholarly electronic journals.
Design/methodology/approach
The advantages and limitations of log analysis are described, and past studies of e-journals' use and users that applied this methodology are critiqued. The results of these studies are briefly compared with some survey studies. Those aspects of online journals' use and users that log analysis can investigate well, and those about which it cannot disclose enough information, are highlighted.
Findings
The review indicates that although there is a debate about reliability of the results of log analysis, this methodology has great potential for studying online journals' use and their users' information seeking behaviour.
Originality/value
This paper highlights the strengths and weaknesses of log analysis for studying digital journals and raises a couple of questions to be investigated by further studies.
Guillermo Navarro‐Arribas and Vicenç Torra
Abstract
Purpose
The purpose of this paper is to anonymize web server log files used in e‐commerce web mining processes.
Design/methodology/approach
The paper has applied statistical disclosure control (SDC) techniques to achieve its goal. More precisely, it has introduced the micro‐aggregation of web access logs.
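Micro-aggregation can be illustrated with a toy univariate sketch (not the authors' algorithm): values are sorted, partitioned into groups of at least k, and each value is replaced by its group mean, so every released value is shared by at least k records:

```python
# Toy univariate micro-aggregation sketch (not the paper's algorithm):
# each released value is the mean of its group of >= k records, giving a
# k-anonymity-style guarantee at the cost of some information loss.
from statistics import mean

def microaggregate(values, k=3):
    ordered = sorted(values)
    out, i = [], 0
    while i < len(ordered):
        # the last group absorbs the remainder so no group is smaller than k
        if len(ordered) - i < 2 * k:
            group = ordered[i:]
        else:
            group = ordered[i:i + k]
        out.extend([mean(group)] * len(group))
        i += len(group)
    return out

print(microaggregate([1, 2, 3, 10, 11, 12, 50], k=3))
```

Real SDC methods choose groups to minimize within-group variance; sorting first is the simplest univariate approximation of that idea.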
Findings
The experiments show that the proposed technique provides good results in general, but it is especially outstanding when dealing with relatively small websites.
Research limitations/implications
As in all SDC techniques, there is always a trade-off between privacy and utility or, in other words, between disclosure risk and information loss. This proposal bears this issue in mind, providing k-anonymity while preserving acceptable information accuracy.
Practical implications
Web server logs are valuable information used nowadays for user profiling and general data‐mining analysis of a website in e‐commerce and e‐services. This proposal allows anonymizing such logs, so they can be safely outsourced to other companies for marketing purposes, stored for further analysis, or made publicly available, without risking customer privacy.
Originality/value
Current solutions to the problem presented here are poor and scarce, normally reduced to the elimination of sensitive information from the query strings of URLs. Moreover, to the authors' knowledge, SDC techniques have never before been applied to the anonymization of web logs.
Alesia Zuccala, Mike Thelwall, Charles Oppenheim and Rajveen Dhiensa
Abstract
Purpose
The purpose of this paper is to explore the use of LexiURL as a Web intelligence tool for collecting and analysing links to digital libraries, focusing specifically on the National electronic Library for Health (NeLH).
Design/methodology/approach
The Web intelligence techniques in this study are a combination of link analysis (web structure mining), web server log file analysis (web usage mining), and text analysis (web content mining), utilizing the power of commercial search engines and drawing upon the information science fields of bibliometrics and webometrics. LexiURL is a computer program designed to calculate summary statistics for lists of links or URLs. Its output is a series of standard reports, for example listing and counting all of the different domain names in the data.
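A summary statistic of the kind LexiURL produces, counting the different domain names in a list of links, can be sketched as follows (illustrative Python, not LexiURL itself; the sample URLs are assumptions):

```python
# Illustrative sketch of a LexiURL-style summary report (not LexiURL
# itself): count how often each domain name appears in a list of links.
from collections import Counter
from urllib.parse import urlparse

def domain_counts(urls):
    """Tally domain names (netloc), case-insensitively."""
    return Counter(urlparse(u).netloc.lower() for u in urls)

links = [
    "http://www.nelh.nhs.uk/",
    "http://example.edu/library",    # hypothetical .edu linking site
    "http://EXAMPLE.edu/catalog",
]
print(domain_counts(links))
```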
Findings
Link data, when analysed together with user transaction log files (i.e. Web referring domains), can provide insights into who is using a digital library and when, and who could be using the digital library if they are "surfing" a particular part of the Web; in this case, any site that is linked to or colinked with the NeLH. This study found that the NeLH was embedded in a multifaceted Web context, including many governmental, educational, commercial and organisational sites, the most interesting being sites from the .edu domain, representing American universities. Not many links directed to the NeLH were followed on September 25, 2005 (the date of the log file analysis and link extraction analysis), which means that users who access the digital library have been arriving at the site via only a few select links, bookmarks, search engine searches, or non-electronic sources.
Originality/value
A number of studies concerning digital library users have been carried out using log file analysis as a research tool. Log files focus on real-time user transactions, while LexiURL can be used to extract links and colinks associated with a digital library's growing Web network. This Web network is not recognized often enough, and can be a useful indication of where potential users are surfing, even if they have not yet visited the NeLH site.