Search results

1 – 10 of over 3000
Article
Publication date: 23 November 2010

Towards flexible and lightweight integration of web applications by end‐user programming

Hao Han and Takehiro Tokuda

Abstract

Purpose

The purpose of this paper is to present a method to realize the flexible and lightweight integration of general web applications.

Design/methodology/approach

Information extraction and functionality emulation methods are proposed to realize web information integration for general web applications. All processes of web information searching, submitting and extraction run on the client side through end‐user programming, like a real web service.

Findings

The implementation shows that the required programming techniques are within the abilities of general web users and do not require writing much code.

Originality/value

A Java‐based class package was developed for web information searching/submitting/extraction, which users can integrate easily with the general web applications.

Details

International Journal of Web Information Systems, vol. 6 no. 4
Type: Research Article
DOI: https://doi.org/10.1108/17440081011090257
ISSN: 1744-0084

Keywords

  • Internet
  • Computer applications
  • Programming
  • Information retrieval

Article
Publication date: 1 February 2002

Effective techniques for automatic extraction of Web publications

A.C.M. Fong, S.C. Hui and H.L. Vu

Abstract

Research organisations and individual researchers increasingly choose to share their research findings by providing lists of their published works on the World Wide Web. To facilitate the exchange of ideas, the lists often include links to published papers in portable document format (PDF) or Postscript (PS) format. Generally, these publication Web sites are updated regularly to include new works. While manual monitoring of relevant Web sites is tedious, commercial search engines and information monitoring systems are ineffective in finding and tracking scholarly publications. This paper analyses the characteristics of publication index pages and describes effective automatic extraction techniques that the authors have developed. The authors’ techniques combine lexical and syntactic analyses with heuristics. The proposed techniques have been implemented and tested on more than 14,000 Web pages and achieved consistently high success rates of around 90 percent.

Details

Online Information Review, vol. 26 no. 1
Type: Research Article
DOI: https://doi.org/10.1108/14684520210418347
ISSN: 1468-4527

Keywords

  • Internet
  • Research
  • Electronic publishing
  • Content analysis

Book part
Publication date: 10 February 2012

Chapter 3 Local Web Search Examined

Dirk Ahlers

Abstract

Purpose — To provide a theoretical background for understanding current local search engines as an aspect of specialized search, along with their data sources and the technologies they use.

Design/methodology/approach — Selected local search engines are examined and compared with respect to their use of geographic information retrieval (GIR) technologies, data sources, available entity information, processing, and interfaces. An introduction to the field of GIR is given and its use in the selected systems is discussed.

Findings — All selected commercial local search engines utilize GIR technology in varying degrees for information preparation and presentation. It is also starting to be used in regular Web search. However, major differences can be found between the different search engines.

Research limitations/implications — This study is not exhaustive and only uses informal comparisons without definitive ranking. Due to the unavailability of hard data, informed guesses were made based on available public interfaces and literature.

Practical implications — A source of background information for understanding the results of local search engines, their provenance, and their potential.

Originality/value — An overview of GIR technology in the context of commercial search engines integrates research efforts and commercial systems and helps to understand both sides better.

Details

Web Search Engine Research
Type: Book
DOI: https://doi.org/10.1108/S1876-0562(2012)002012a005
ISBN: 978-1-78052-636-2

Keywords

  • Local search
  • Web search
  • geospatial search
  • geographic information retrieval
  • location-based services

Article
Publication date: 21 September 2012

Smart combination of web measures for solving semantic similarity problems

Jorge Martinez‐Gil and José F. Aldana‐Montes

Abstract

Purpose

Semantic similarity measures are very important in many computer‐related fields. Previous work on applications such as data integration, query expansion, tag refactoring and text clustering has relied on semantic similarity measures. Despite the usefulness of these measures in such applications, measuring the similarity between two text expressions remains a key challenge. This paper aims to address this issue.

Design/methodology/approach

In this article, the authors propose an optimization environment to improve existing techniques that use the notion of co‐occurrence and the information available on the web to measure similarity between terms.

Findings

The experimental results using the Miller and Charles and Gracia and Mena benchmark datasets show that the proposed approach is able to outperform classic probabilistic web‐based algorithms by a wide margin.

Originality/value

This paper presents two main contributions. The authors propose a novel technique that beats classic probabilistic techniques for measuring semantic similarity between terms. This new technique consists of using not only a search engine for computing web page counts, but a smart combination of several popular web search engines. The approach is evaluated on the Miller and Charles and Gracia and Mena benchmark datasets and compared with existing probabilistic web extraction techniques.

Details

Online Information Review, vol. 36 no. 5
Type: Research Article
DOI: https://doi.org/10.1108/14684521211276000
ISSN: 1468-4527

Keywords

  • Similarity measures
  • Web intelligence
  • Web search engines
  • Information integration
  • Information searches
  • Internet

Article
Publication date: 31 July 2007

Web intelligence analyses of digital libraries: A case study of the National electronic Library for Health (NeLH)

Alesia Zuccala, Mike Thelwall, Charles Oppenheim and Rajveen Dhiensa

Abstract

Purpose

The purpose of this paper is to explore the use of LexiURL as a Web intelligence tool for collecting and analysing links to digital libraries, focusing specifically on the National electronic Library for Health (NeLH).

Design/methodology/approach

The Web intelligence techniques in this study are a combination of link analysis (web structure mining), web server log file analysis (web usage mining), and text analysis (web content mining), utilizing the power of commercial search engines and drawing upon the information science fields of bibliometrics and webometrics. LexiURL is a computer program designed to calculate summary statistics for lists of links or URLs. Its output is a series of standard reports, for example listing and counting all of the different domain names in the data.
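
The kind of report described above (listing and counting the different domain names in a list of links) can be illustrated with a minimal Python sketch. This is not LexiURL's code or output format; the function name `count_domains` and the sample URLs are hypothetical and chosen only for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def count_domains(urls):
    """Count how often each host name appears in a list of URLs."""
    hosts = []
    for url in urls:
        # hostname is None when the string has no network location
        host = urlparse(url).hostname
        if host:
            hosts.append(host)
    return Counter(hosts)


# Hypothetical link list, for illustration only.
links = [
    "http://www.example.edu/library/health.html",
    "http://www.example.edu/links.html",
    "https://portal.example.org/resources",
]

# Print each host with its link count, most frequent first.
for host, n in count_domains(links).most_common():
    print(host, n)
```

Grouping by full host name is one possible choice; a real analysis might instead group by top-level domain or registered domain.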

Findings

Link data, when analysed together with user transaction log files (i.e. Web referring domains) can provide insights into who is using a digital library and when, and who could be using the digital library if they are “surfing” a particular part of the Web; in this case any site that is linked to or colinked with the NeLH. This study found that the NeLH was embedded in a multifaceted Web context, including many governmental, educational, commercial and organisational sites, with the most interesting being sites from the .edu domain, representing American Universities. Not many links directed to the NeLH were followed on September 25, 2005 (the date of the log file analysis and link extraction analysis), which means that users who access the digital library have been arriving at the site via only a few select links, bookmarks and search engine searches, or non‐electronic sources.

Originality/value

A number of studies concerning digital library users have been carried out using log file analysis as a research tool. Log files focus on real‐time user transactions; while LexiURL can be used to extract links and colinks associated with a digital library's growing Web network. This Web network is not recognized often enough, and can be a useful indication of where potential users are surfing, even if they have not yet specifically visited the NeLH site.

Details

Journal of Documentation, vol. 63 no. 4
Type: Research Article
DOI: https://doi.org/10.1108/00220410710759011
ISSN: 0022-0418

Keywords

  • Digital libraries
  • Worldwide web
  • Search engines
  • Generation and dissemination of information
  • Transmission control protocol/internet protocol
  • Communication technologies

Book part
Publication date: 13 December 2017

E-Business Platform Information Search Services

Qiongwei Ye and Baojun Ma

Abstract

Internet + and Electronic Business in China is a comprehensive resource that provides insight and analysis into E-commerce in China and how it has revolutionized and continues to revolutionize business and society. Split into four distinct sections, the book first lays out the theoretical foundations and fundamental concepts of E-Business before moving on to look at internet+ innovation models and their applications in different industries such as agriculture, finance and commerce. The book then provides a comprehensive analysis of E-business platforms and their applications in China before finishing with four comprehensive case studies of major E-business projects, providing readers with successful examples of implementing E-Business entrepreneurship projects.

Internet + and Electronic Business in China is a comprehensive resource that provides insights and analysis into how E-commerce has revolutionized and continues to revolutionize business and society in China.

Details

Internet+ and Electronic Business in China: Innovation and Applications
Type: Book
DOI: https://doi.org/10.1108/978-1-78743-115-720171012
ISBN: 978-1-78743-115-7

Article
Publication date: 7 August 2009

Structure‐preserving and query‐biased document summarisation for web searching

F. Canan Pembe and Tunga Güngör

Abstract

Purpose

The purpose of this paper is to develop a new summarisation approach, namely structure‐preserving and query‐biased summarisation, to improve the effectiveness of web searching. During web searching, one aid for users is the document summaries provided in the search results. However, the summaries provided by current search engines have limitations in directing users to relevant documents.

Design/methodology/approach

The proposed system consists of two stages: document structure analysis and summarisation. In the first stage, a rule‐based approach is used to identify the sectional hierarchies of web documents. In the second stage, query‐biased summaries are created, making use of document structure both in the summarisation process and in the output summaries.

Findings

In structural processing, about 70 per cent accuracy in identifying document sectional hierarchies is obtained. The summarisation method is tested on a task‐based evaluation method using English and Turkish document collections. The results show that the proposed method is a significant improvement over both unstructured query‐biased summaries and Google snippets in terms of f‐measure.
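
For readers unfamiliar with the metric, the f‐measure is the weighted harmonic mean of precision and recall. The sketch below uses the standard definition; the abstract does not state which weighting the authors used, so the balanced default (`beta=1.0`) is an assumption, and the function name is illustrative.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F-beta).

    beta=1.0 gives the balanced F1 score; this default is an assumption,
    not something stated in the abstract.
    """
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when both are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

Because it is a harmonic mean, the score stays low unless both precision and recall are reasonably high.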

Practical implications

The proposed summarisation system can be incorporated into search engines. The structural processing technique also has applications in other information systems, such as browsing, outlining and indexing documents.

Originality/value

In the literature on summarisation, the effects of query‐biased techniques and document structure are considered in only a few works and are researched separately. The research reported here differs from traditional approaches by combining these two aspects in a coherent framework. The work is also the first automatic summarisation study for Turkish targeting web search.

Details

Online Information Review, vol. 33 no. 4
Type: Research Article
DOI: https://doi.org/10.1108/14684520910985684
ISSN: 1468-4527

Keywords

  • Data structures
  • Document delivery
  • Markup languages
  • Search engines
  • Worldwide web

Article
Publication date: 1 June 2005

Indexing the invisible web: a survey

Yanbo Ru and Ellis Horowitz

Abstract

Purpose

The existence and continued growth of the invisible web creates a major challenge for search engines that are attempting to organize all of the material on the web into a form that is easily retrieved by all users. The purpose of this paper is to identify the challenges and problems underlying existing work in this area.

Design/methodology/approach

A discussion based on a short survey of prior work, including automated discovery of invisible web site search interfaces, automated classification of invisible web sites, label assignment and form filling, information extraction from the resulting pages, learning the query language of the search interface, building content summary for an invisible web site, selecting proper databases, integrating invisible web‐search interfaces, and assessing the performance of an invisible web site.

Findings

Existing technologies and tools for indexing the invisible web follow one of two strategies: indexing the web site interface or examining a portion of the contents of an invisible web site and indexing the results.

Originality/value

The paper is of value to those involved with information management.

Details

Online Information Review, vol. 29 no. 3
Type: Research Article
DOI: https://doi.org/10.1108/14684520510607579
ISSN: 1468-4527

Keywords

  • Worldwide web
  • Search engines
  • Information retrieval
  • Indexing

Article
Publication date: 11 April 2008

Extracting inter‐firm networks from the World Wide Web using a general‐purpose search engine

Yingzi Jin, Mitsuru Ishizuka and Yutaka Matsuo

Abstract

Purpose – Social relations play an important role in a real community. Interaction patterns reveal relations among actors (such as persons, groups, firms), which can be merged to produce valuable information such as a network structure. This paper aims to present a new approach to extract inter‐firm networks from the web for further analysis.

Design/methodology/approach – In this study, relations between a pair of firms are extracted using a search engine and text processing. Because names of firms co‐appear coincidentally on the web, an advanced algorithm is proposed, which is characterised by the addition of keywords (“relation keywords”) to a query. The relation keywords are obtained from the web using a Jaccard coefficient.

Findings – As an application, a network of 60 firms in Japan, including IT, communication, broadcasting, and electronics firms, is extracted from the web, and comprehensive evaluations of this approach are shown. The alliance and lawsuit relations are easily obtainable from the web using the algorithm. By adding relation keywords to named pairs of firms as a query, it is possible to collect target pages from the top of web pages more precisely than by only using the named pairs as a query.

Practical implications – This study proposes a new approach for extracting inter‐firm networks from the web. The obtained network is useful in several ways. It is possible to find a cluster of firms and characterise a firm by its cluster. Business experts often make such inferences based on firm relations and firm groups. For that reason the firm network might enhance inferential abilities on the business domain. Also we might use obtained networks to recommend business partners based on structural advantages. The authors' intuition is that extracting a social network might provide information that is only recognisable from the network point of view. For example, the centrality of each firm is identified only after generating a social network.

Originality/value – This study is a first attempt to extract inter‐firm networks from the web using a search engine. The approach is also applicable to other actors, such as famous persons, organisations or other multiple relational entities.
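
As background for the Jaccard coefficient mentioned above, the following Python sketch shows one common way to estimate it from search engine page counts. The abstract does not say exactly how the authors compute it, so the function name, its parameters and the example counts are all hypothetical.

```python
def jaccard_from_hits(hits_a, hits_b, hits_both):
    """Estimate the Jaccard coefficient |A ∩ B| / |A ∪ B| from page counts.

    hits_a, hits_b: pages matching each query on its own (hypothetical inputs)
    hits_both: pages matching both queries together
    """
    union = hits_a + hits_b - hits_both
    if union <= 0:
        return 0.0  # no pages at all, so no measurable overlap
    return hits_both / union


# Hypothetical counts: 1200 pages for one query, 800 for the other,
# and 200 pages matching both.
score = jaccard_from_hits(1200, 800, 200)
print(round(score, 3))
```

A higher score means the two queries co‐occur on a larger share of the pages that mention either of them, which is why the coefficient can be used to rank candidate keywords.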

Details

Online Information Review, vol. 32 no. 2
Type: Research Article
DOI: https://doi.org/10.1108/14684520810879827
ISSN: 1468-4527

Keywords

  • Worldwide web
  • Social networks
  • Information retrieval

Article
Publication date: 10 December 2018

Managing mining project documentation using human language technology

Aleksandra Tomašević, Ranka Stanković, Miloš Utvić, Ivan Obradović and Božo Kolonja

Abstract

Purpose

This paper aims to develop a system that enables efficient management and exploitation of electronic documentation related to mining projects, with information retrieval and information extraction (IE) features, using various language resources and natural language processing.

Design/methodology/approach

The system is designed to integrate textual, lexical, semantic and terminological resources, enabling advanced document search and extraction of information. These resources are integrated with a set of Web services and applications, for different user profiles and use-cases.

Findings

The use of the system is illustrated by examples demonstrating keyword search supported by Web query expansion services, search based on regular expressions, corpus search based on local grammars, followed by extraction of information based on this search and finally, search with lexical masks using domain and semantic markers.

Originality/value

The presented system is the first software solution for implementation of human language technology in management of documentation from the mining engineering domain, but it is also applicable to other engineering and non-engineering domains. The system is independent of the type of alphabet (Cyrillic and Latin), which makes it applicable to other languages of the Balkan region related to Serbian, and its support for morphological dictionaries can be applied in most morphologically complex languages, such as Slavic languages. Significant search improvements and the efficiency of IE are based on semantic networks and terminology dictionaries, with the support of local grammars.

Details

The Electronic Library, vol. 36 no. 6
Type: Research Article
DOI: https://doi.org/10.1108/EL-11-2017-0239
ISSN: 0264-0473

Keywords

  • Digital libraries
  • Information retrieval
  • Data mining
  • Human language technologies
  • Project documentation

© 2021 Emerald Publishing Limited
