Search results

1 – 10 of 45

Schubert Foo, Siu Cheung Hui, Hong Koon Lim and Li Hui

Abstract

Asian languages such as Japanese, Korean and, in particular, Chinese are beginning to gain popularity in the information retrieval (IR) domain. The quality of IR systems has traditionally been judged by the system’s retrieval effectiveness which, in turn, is commonly measured by data recall and data precision. This paper proposes and describes a process for generating an automatic Chinese thesaurus that can be used to provide related terms to a user’s queries to enhance retrieval effectiveness. In the absence of existing automatic Chinese thesauri, techniques used in English thesaurus generation have been evaluated and adapted to generate a Chinese equivalent. The automatic thesaurus is generated by computing the co‐occurrence values between domain‐specific terms found in a document collection. These co‐occurrence values are in turn derived from the term and document frequencies of the terms. A set of experiments was subsequently carried out on a document test set to evaluate the applicability of the thesaurus. Results obtained from these experiments confirmed that such an automatically generated thesaurus is able to improve the retrieval effectiveness of a Chinese IR system.
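
The abstract does not give the co‐occurrence formula, so the following is only a minimal sketch of the general idea, assuming documents that are already segmented into terms. The weighting (shared term frequency, normalised by the source term's total frequency and scaled by a document‐frequency rarity factor) and the names `cooccurrence_weights` and `related_terms` are illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_weights(docs):
    """Return asymmetric weights {(src, dst): value} for co-occurring terms.

    Assumed formula (not taken from the paper):
    weight(src -> dst) = shared_tf(src, dst) / total_tf(src) * log(1 + N / df(dst))
    """
    n_docs = len(docs)
    term_freqs = [Counter(doc) for doc in docs]
    total_tf = Counter()   # term frequency summed over the collection
    doc_freq = Counter()   # number of documents containing each term
    overlap = Counter()    # sum over documents of min(tf_a, tf_b)
    for tf in term_freqs:
        total_tf.update(tf)
        doc_freq.update(tf.keys())
        for a, b in combinations(sorted(tf), 2):
            overlap[(a, b)] += min(tf[a], tf[b])

    weights = {}
    for (a, b), shared in overlap.items():
        for src, dst in ((a, b), (b, a)):
            rarity = math.log(1 + n_docs / doc_freq[dst])
            weights[(src, dst)] = shared / total_tf[src] * rarity
    return weights


def related_terms(term, weights, top_n=3):
    """Suggest the terms most strongly associated with `term`."""
    scored = [(dst, w) for (src, dst), w in weights.items() if src == term]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [dst for dst, _ in scored[:top_n]]


# Toy collection; real input would be segmented Chinese text.
docs = [
    ["retrieval", "thesaurus", "thesaurus"],
    ["retrieval", "thesaurus"],
    ["retrieval", "network"],
]
weights = cooccurrence_weights(docs)
print(related_terms("retrieval", weights))
```

A query term can then be expanded with the top‐ranked related terms before it is submitted to the retrieval system.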

Details

Library Review, vol. 49 no. 5
Type: Research Article
ISSN: 0024-2535

Le Vu Ho, Siu Cheung Hui and A.C.M. Fong

Abstract

The World Wide Web has become an important medium for disseminating scientific publications. To make their research accessible to other researchers, most research institutions list their publications in an index page that sometimes includes links to online versions of the publications. As the index page is usually updated whenever new research papers are published, researchers need to check these index pages frequently to learn of any new publications on the targeted Web site or page. This manual publication monitoring process is tedious and time‐consuming. In this paper, a publication monitoring system, known as PubWatcher, is proposed to automatically track Web publications from user‐specified Web sites or pages. A publication extraction technique has been developed to extract publication information listed in the index pages of the monitored Web sites and pages.

Details

The Electronic Library, vol. 21 no. 2
Type: Research Article
ISSN: 0264-0473

Bing Tan, Schubert Foo and Siu Cheung Hui

Abstract

The dynamic nature of information content on the Web has posed a serious problem to users who constantly need to keep track of the latest updates on specific information. Traditional search engines enable users to retrieve potentially relevant Web information, but they do not track and monitor Web pages based on users’ interests. On the other hand, Web information monitoring systems are designed specifically to help users track and monitor Web information. However, to make Web monitoring effective, it is necessary to identify and understand typical Web page update characteristics so that useful monitoring features and functions can be designed and built into these systems. In this study, a total of 105 Web pages from the Internet were collected and monitored over a one‐month period. These pages were selected from seven domains under Yahoo!’s directories. The analysis results are presented according to Web site domains, Web page types, Web page attributes and change frequency. Based on this study, different functions and features for a Web monitoring system are identified. These features have been incorporated into a Web monitoring system, WebMon, that has been developed at the School of Computer Engineering, Nanyang Technological University, Singapore.
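
How the study measured change frequency is not stated in the abstract. As a rough illustration only, one common way to detect that a page has changed is to compare content digests of successive snapshots; the helper names `change_count` and `change_frequency` below are hypothetical and are not part of WebMon.

```python
import hashlib


def change_count(snapshots):
    """Count how many consecutive snapshots of a page differ in content."""
    digests = [hashlib.sha256(text.encode("utf-8")).hexdigest() for text in snapshots]
    return sum(1 for prev, cur in zip(digests, digests[1:]) if prev != cur)


def change_frequency(snapshots, days):
    """Average number of detected changes per day over the monitoring period."""
    return change_count(snapshots) / days


# Five snapshots of one page taken over four days (toy data).
snapshots = ["v1", "v1", "v2", "v2", "v3"]
print(change_count(snapshots), change_frequency(snapshots, days=4))
```

A digest comparison only says that something changed; identifying which page attributes changed would need a finer comparison of the page content.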

Details

Online Information Review, vol. 25 no. 1
Type: Research Article
ISSN: 1468-4527

Yulan He and Siu Cheung Hui

Abstract

Many scientific publications are now available on the World Wide Web for researchers to share research findings. However, they tend to be poorly organised, making the search for relevant publications difficult and time‐consuming. Most existing search engines are ineffective in searching these publications, as they do not index Web publications that normally appear in PDF (portable document format) or PostScript formats. Proposes a Web citation‐based retrieval system, known as PubSearch, for the retrieval of Web publications. PubSearch indexes Web publications based on citation indices and stores them in a Web Citation Database. The Web Citation Database is then mined to support publication retrieval. Apart from supporting the traditional cited reference search, PubSearch also provides document clustering search and author clustering search. Document clustering groups related publications into clusters, while author clustering categorizes authors into different research areas based on author co‐citation analysis.
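
Author co‐citation analysis starts from counts of how often two authors are cited together by the same paper. The sketch below shows only that counting step, under the assumption that each citing paper is available as a list of cited author names; the function name and the placeholder authors are illustrative, and the clustering that PubSearch builds on top of such counts is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count, for each author pair, the number of papers citing both authors."""
    counts = Counter()
    for cited_authors in reference_lists:
        # Each author counts once per citing paper, whatever the number of citations.
        for a, b in combinations(sorted(set(cited_authors)), 2):
            counts[(a, b)] += 1
    return counts


# Each inner list holds the authors cited by one paper (placeholder names).
reference_lists = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author C", "Author D"],
]
counts = cocitation_counts(reference_lists)
print(counts[("Author A", "Author B")])
```

Pairs with high counts are candidates for the same research area; a clustering step would typically operate on the full matrix of these counts.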

Details

Library Hi Tech, vol. 19 no. 3
Type: Research Article
ISSN: 0737-8831

Baoyao Zhou, Siu Cheung Hui and Alvis C. M. Fong

Abstract

With the explosive growth of information available on the World Wide Web, it has become much more difficult to access relevant information from the Web. One possible approach to solving this problem is web personalization. In this paper, we propose a novel WUL (Web Usage Lattice) based mining approach for mining association access pattern rules for personalized web recommendations. The proposed approach aims to mine a reduced set of effective association pattern rules for enhancing the online performance of web recommendations. We have incorporated the proposed approach into a personalized web recommender system known as AWARS. The performance of the proposed approach is evaluated in terms of efficiency and quality. In the efficiency evaluation, we measure the number of generated rules and the runtime for online recommendations. In the quality evaluation, we measure the quality of the recommendation service based on precision, satisfaction and applicability. This paper will discuss the proposed WUL‐based mining approach, and give the performance of the proposed approach in comparison with the Apriori‐based algorithms.
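
For orientation, the sketch below mines plain page‐to‐page association rules from access sessions using support and confidence thresholds. It is a generic baseline written for illustration, not the WUL‐based approach (which the abstract does not detail) and not a full Apriori implementation; the function names, thresholds and page names are assumptions.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(sessions, min_support=0.5, min_confidence=0.6):
    """Return {(x, y): confidence} for rules "visited x -> recommend y"."""
    n_sessions = len(sessions)
    page_count = Counter()
    pair_count = Counter()
    for session in sessions:
        pages = sorted(set(session))
        page_count.update(pages)
        pair_count.update(combinations(pages, 2))

    rules = {}
    for (a, b), together in pair_count.items():
        if together / n_sessions < min_support:
            continue  # the pair is too rare to be a reliable pattern
        for x, y in ((a, b), (b, a)):
            confidence = together / page_count[x]
            if confidence >= min_confidence:
                rules[(x, y)] = confidence
    return rules


def recommend(page, rules):
    """Pages to suggest after `page`, strongest rule first."""
    candidates = [(y, conf) for (x, y), conf in rules.items() if x == page]
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return [y for y, _ in candidates]


sessions = [
    ["home", "papers", "people"],
    ["home", "papers"],
    ["home", "people"],
    ["papers", "contact"],
]
rules = mine_pair_rules(sessions)
print(recommend("home", rules))
```

The number of rules kept and the time taken by `recommend` correspond loosely to the two efficiency measures named in the abstract.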

Details

International Journal of Web Information Systems, vol. 1 no. 3
Type: Research Article
ISSN: 1744-0084

Schubert Foo, Siu Cheung Hui and See Wai Yip

Abstract

The Internet environment, with its packet‐switched network and lack of resource reservation mechanisms, has made the delivery of low bit‐rate real‐time communication services particularly difficult and challenging. The high potential transmission delay and data packet loss under varying network conditions will lead to unpleasant and unintelligible audio and jerky video play‐out. The Internet TCP/IP protocol suite can be extended with new mechanisms in an attempt to tackle such problems. In this research, an integrated transmission mechanism that incorporates a number of existing techniques to enhance the quality and deliver “acceptable” real‐time services is proposed. These techniques include the use of data compression, data buffering, dynamic rate control, packet loss replacement, silence deletion and a virtual video play‐out mechanism. The proposed transmission mechanism is designed as a generic communication system so that it can be used in different systems and conditions. This approach has been successfully implemented and demonstrated using three separate systems: the Internet Phone, WebVideo and a video‐conferencing tool.
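
Of the techniques listed, silence deletion is the simplest to illustrate. The following is a minimal sketch assuming 16‐bit PCM samples held in a Python list and a fixed average‐amplitude threshold; the frame size, the threshold and the function name are assumptions chosen for the example and are not taken from the paper.

```python
def delete_silence(samples, frame_size=160, threshold=500.0):
    """Drop frames whose average absolute amplitude is below `threshold`.

    Only the frames that are kept would need to be compressed and transmitted.
    """
    kept = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        level = sum(abs(s) for s in frame) / len(frame)
        if level >= threshold:
            kept.extend(frame)
    return kept


# One silent frame, one loud frame, then a short quiet tail (toy signal).
signal = [0] * 160 + [1000] * 160 + [10] * 80
print(len(delete_silence(signal)))
```

A practical system would also need to tell the receiver how much silence was removed so that play‐out timing can be restored.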

Details

Internet Research, vol. 9 no. 3
Type: Research Article
ISSN: 1066-2243

Schubert Foo, Siu Cheung Hui, See Wai Yip and Yulan He

Abstract

Knowledge of the Internet Protocol (IP) address is essential for connection establishment in certain classes of synchronous distributed applications, such as Internet telephony and video‐conferencing systems. A problem of dynamic IP addressing arises when the connection to the Internet is through an Internet service provider, since the IP address is dynamically allocated only at connection time. Proposes and draws a contrast between a number of generic methods that can be classified as online and offline methods for the resolution of dynamic IP addressing. Online methods, which include the World Wide Web, exchange server and the dynamic Domain Name System, are only effective when both the caller and recipient are logged on to the Internet. On the other hand, offline methods, which include electronic mailing and directory service look‐up, provide an additional means to allow the caller to leave messages when the recipient is not logged on to the Internet. Of these methods, the dynamic Domain Name System and directory service look‐up appear to be the best for resolving dynamic IP addressing.

Details

Internet Research, vol. 7 no. 3
Type: Research Article
ISSN: 1066-2243

Schubert Foo and Siu Cheung Hui

Abstract

An Internet telephony system functions like a conventional telephone to support real‐time voice communication over the Internet. Numerous proprietary systems have surfaced since its introduction, driven by its main attraction of allowing transcontinental telephone calls to be made at the price of local telephone calls. Currently, these systems are in their infancy, with no support for standards to allow interoperability among systems. A proper framework to allow evaluation or comparison between systems is also lacking. This paper proposes such a framework that utilises a feature and functionality appraisal together with both quantitative and qualitative assessment techniques to allow a systematic evaluation of Internet telephony systems to take place. These techniques include voice process evaluation through signal reproduction, Diagnostic Rhyme Test, Diagnostic Acceptability Measure, Degradation Category Rating, and Free Conversation Test. This framework has been successfully demonstrated and utilised for the evaluation of five Internet telephony systems.

Details

Internet Research, vol. 8 no. 1
Type: Research Article
ISSN: 1066-2243

Tho Thanh Quan, Xuan H. Luong, Thanh C. Nguyen and Hui Siu Cheung

Abstract

Purpose

Most digital libraries (DL) are now available online. They also provide the Z39.50 standard protocol which allows computer-based systems to effectively retrieve information stored in the DLs. The major difficulty lies in inconsistency between database schemas of multiple DLs. The purpose of this paper is to present a system known as Argumentation-based Digital Library Search (ADLSearch), which facilitates information retrieval across multiple DLs.

Design/methodology/approach

The proposed approach is based on argumentation theory for schema matching reconciliation from multiple schema matching algorithms. In addition, a distributed architecture is proposed for the ADLSearch system for information retrieval from multiple DLs.

Findings

Initial performance results are promising. First, schema matching can improve the retrieval performance on DLs, as compared to the baseline technique. Subsequently, argumentation-based retrieval can yield better matching accuracy and retrieval efficiency than individual schema matching algorithms.

Research limitations/implications

The work discussed in this paper has been implemented as a prototype supporting scholarly retrieval from about 800 DLs around the world. However, due to the complexity of the argumentation algorithm, the process of adding new DLs to the system cannot be performed in real time.

Originality/value

In this paper, an argumentation-based approach is proposed for reconciling the conflicts from multiple schema matching algorithms in the context of information retrieval from multiple DLs. Moreover, the proposed approach can also be applied to similar applications which require automatic mapping from multiple database schemas.

Details

Online Information Review, vol. 39 no. 1
Type: Research Article
ISSN: 1468-4527

Haichao Dong, Siu Cheung Hui and Yulan He

Abstract

Purpose

The purpose of this research is to study the characteristics of chat messages by analysing a collection of 33,121 sample messages gathered from 1,700 conversation sessions between 72 pairs of MSN Messenger users over a four‐month period from June to September 2005. The primary objective of chat message characterization is to understand the properties of chat messages for effective message analysis, such as message topic detection.

Design/methodology/approach

From the study on chat message characteristics, an indicative term‐based categorization approach for chat topic detection is proposed. In the proposed approach, different techniques such as sessionalisation of chat messages and extraction of features from icon texts and URLs are incorporated for message pre‐processing. Naïve Bayes, Associative Classification, and Support Vector Machine are employed as classifiers for categorizing topics from chat sessions.

Findings

The indicative term‐based approach is superior to the traditional document frequency‐based approach for feature selection in chat topic categorization.

Originality/value

This paper studies the characteristics of chat messages and proposes an indicative term‐based categorization approach for chat topic detection.

Details

Online Information Review, vol. 30 no. 5
Type: Research Article
ISSN: 1468-4527
