Search results

1 – 10 of over 1000
Article
Publication date: 11 December 2020

Lei Lei, Yaochen Deng and Dilin Liu

Abstract

Purpose

Examining research topics in a specific area such as accounting is important to both novice and veteran researchers. The present study aims to identify the research topics in the area of accounting and to investigate research trends by finding the hot and cold topics among those identified.

Design/methodology/approach

A new dependency-based method focusing on noun phrases, which efficiently extracts research topics from a large set of library data, was proposed. An AR(1) autoregressive model was used to identify topics that have received significantly more or less attention from the researchers. The data used in the study included a total of 4,182 abstracts published in six leading (or premier) accounting journals from 2000 to May 2019.
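The abstract above does not give the exact AR(1) specification or the significance test the authors used. As a minimal sketch only, assuming a hypothetical yearly frequency series for one topic, an AR(1) model of the form y_t = c + phi * y_(t-1) + e_t can be fitted by ordinary least squares as follows. The function name `fit_ar1` and the sample series are illustrative assumptions, not taken from the paper.

```python
from statistics import mean


def fit_ar1(series):
    """Least-squares fit of y_t = c + phi * y_{t-1} + e_t; returns (c, phi).

    Illustrative helper (hypothetical name), not the authors' implementation.
    """
    if len(series) < 3:
        raise ValueError("need at least three observations")
    lagged = series[:-1]   # y_{t-1}
    current = series[1:]   # y_t
    mean_lag, mean_cur = mean(lagged), mean(current)
    sxx = sum((x - mean_lag) ** 2 for x in lagged)
    if sxx == 0:
        raise ValueError("lagged series is constant; phi is not identified")
    sxy = sum((x - mean_lag) * (y - mean_cur) for x, y in zip(lagged, current))
    phi = sxy / sxx
    c = mean_cur - phi * mean_lag
    return c, phi


# Hypothetical topic frequencies generated exactly by y_t = 1 + 0.5 * y_{t-1}
topic_share = [0.0, 1.0, 1.5, 1.75, 1.875]
c, phi = fit_ar1(topic_share)
```

A full analysis would add a test statistic on the fitted coefficients; which test the study applied is not stated in the abstract, so it is left out here.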

Findings

The study identified 48 important research topics across the examined period as well as eight hot topics and one cold topic from the 48 topics.

Originality/value

The research topics identified with the dependency-based method are similar to those found with latent Dirichlet allocation (LDA) topic modelling. In addition, the method seems highly efficient, and the results are easier to interpret. Last, the research topics and trends found in the study provide a reference for researchers in the area of accounting.

Details

Library Hi Tech, vol. 41 no. 2
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 3 December 2018

Cong-Phuoc Phan, Hong-Quang Nguyen and Tan-Tai Nguyen

Abstract

Purpose

Large collections of patent documents disclosing novel, non-obvious technologies are publicly available and beneficial to academia and industries. To exploit their potential fully, searching these patent documents has become an increasingly important topic. Although much research has processed large collections, few studies have attempted to integrate both patent classifications and specifications for analyzing user queries. Consequently, queries are often insufficiently analyzed to improve the accuracy of search results. This paper aims to address this limitation by exploiting semantic relationships between patent contents and their classification.

Design/methodology/approach

The contributions are fourfold. First, the authors enhance similarity measurement between two short sentences and make it 20 per cent more accurate. Second, the Graph-embedded Tree ontology is enriched by integrating both patent documents and the classification scheme. Third, the ontology does not rely on rule-based methods or text matching; instead, a heuristic meaning comparison is applied to extract semantic relationships between concepts. Finally, the patent search approach uses the ontology effectively with the results sorted based on their most common order.

Findings

The experiment on searching for 600 patent documents in the field of logistics yields a 15 per cent improvement in F-measure compared with traditional approaches.

Research limitations/implications

The research, however, still requires improvement: the terms and phrases extracted as nouns and noun phrases make less sense in some respects and thus might not result in high accuracy. The large collection of extracted relationships could be further optimized for conciseness. In addition, parallel processing such as Map-Reduce could be used to improve search processing performance.

Practical implications

The experimental results could be used by scientists and technologists to search for novel, non-obvious technologies in patents.

Social implications

High-quality patent search results will reduce patent infringement.

Originality/value

The proposed ontology is semantically enriched by integrating both patent documents and their classification. This ontology facilitates the analysis of the user queries for enhancing the accuracy of the patent search results.

Details

International Journal of Web Information Systems, vol. 15 no. 3
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 20 June 2008

Thanh Nguyen and Tuoi Phan

Abstract

Purpose

The purpose of this paper is to propose a hybrid ontology‐based solution to expand users' queries.

Design/methodology/approach

The solution covers ontology development and ontology‐based query expansion. The first task is to develop an ontology (named OMP) that relates to the key‐properties and key‐members of objects described by words/terms of the English vocabulary. Its training methodology is also hybrid, combining a rule‐based solution with proposed patterns and a statistical solution for selecting the best candidates from the TREC English corpus. The second task is to propose mechanisms not only to look for related results in the OMP ontology to complete and expand the user's entered query/noun phrase, but also to expand the search process by linking the OMP ontology to the indexes of an information retrieval system. In particular, both tasks are based on our proposal of four kinds of semantic relationship between words.

Findings

Several semantic relationships among words in the vocabulary have been introduced and are currently used in WordNet to represent the system of semantic networks. In addition, our analysis of words in the English vocabulary found that, in some cases, there are kinds of semantic dependency between parts of a noun phrase, and that these can be represented in noun phrase grammar syntax. This affects not only the proposed approach to developing the OMP ontology (identifying four kinds of semantic relationship and organizing its structure around core element types such as object, key‐member and key‐property), but also the ontology training mechanism and the query expansion solutions, which add extended corresponding words (based on those relationships) to the original query.

Research limitations/implications

In the initial iteration, the approach is applied to English queries only, with a limited size of the OMP ontology and a dependency on grammar rules in creating patterns to extract data from the corpus. For future research, applications to other languages (Vietnamese, Chinese …) with a sharp focus on improving ontology training quality/quantity and query expansion precision are the primary targets.

Practical implications

The developed OMP ontology can be shared to support other applications, such as semantic data extraction or semantic information retrieval, in other research.

Originality/value

This paper presents an approach to ontology‐based query expansion and theoretical definitions of semantic relationships among words. In particular, these kinds of relationship can be used to develop a useful semantic network system.

Details

International Journal of Web Information Systems, vol. 4 no. 2
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 1 March 1992

BRIAN VICKERY and ALINA VICKERY

Abstract

The paper describes techniques developed by Tome Associates to process natural language queries into search statements suitable for transmission to online text database systems. The problems discussed include word identification, the handling of unknown words, the contents and structure of system dictionaries, the use of semantic categories and classification, disambiguation of multi‐meaning words, stemming and truncation, noun compounds and indications of relationship between search terms.

Details

Journal of Documentation, vol. 48 no. 3
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 6 July 2021

Shwe Sin Phyo

Abstract

Purpose

With the wealth of information available on the World Wide Web, it is difficult for anyone, from a general user to a researcher, to easily fulfill their information need. The main challenge is to categorize documents systematically while also taking into account more valuable data such as semantic information. The purpose of this paper is to develop a concept-based search system that leverages external knowledge resources as background knowledge to obtain accurate and meaningful search results efficiently.

Design/methodology/approach

The paper introduces an approach based on formal concept analysis (FCA) with semantic information to support document management in information retrieval (IR). To describe the semantic information of the documents, the system uses the popular knowledge resources WordNet and Wikipedia. By using FCA, the system creates a concept lattice as the concept hierarchy of the documents and proposes a navigation algorithm for retrieving the hierarchy based on the user query.
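To make the FCA step concrete, the sketch below enumerates the formal concepts (extent, intent pairs) of a tiny document/term context by brute force. It is an illustration under stated assumptions: the function name `formal_concepts`, the sample documents and their terms are hypothetical, the enumeration is exponential in the number of objects, and nothing here reproduces the paper's navigation algorithm or its use of WordNet and Wikipedia.

```python
from itertools import combinations


def formal_concepts(context):
    """Return all formal concepts of a binary context.

    context maps each object (e.g. a document id) to its set of attributes
    (e.g. index terms). Each concept is an (extent, intent) pair of frozensets.
    Brute force over object subsets: suitable only for very small contexts.
    """
    objects = list(context)
    all_attrs = set().union(*context.values()) if context else set()
    concepts = set()
    for size in range(len(objects) + 1):
        for group in combinations(objects, size):
            # Intent: attributes shared by every object in the group
            intent = set(all_attrs)
            for obj in group:
                intent &= context[obj]
            # Extent: every object that carries all attributes of the intent
            extent = frozenset(o for o in objects if intent <= context[o])
            concepts.add((extent, frozenset(intent)))
    return concepts


# Hypothetical three-document context
docs = {
    "d1": {"ontology", "search"},
    "d2": {"search"},
    "d3": {"ontology"},
}
lattice_nodes = formal_concepts(docs)
```

Ordering these concepts by extent inclusion gives the concept lattice; a practical system would use a dedicated algorithm rather than subset enumeration.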

Findings

The semantic information of the documents is based on two popular external knowledge resources; the authors find that this deals more efficiently with the semantic mismatch problems of user needs.

Originality/value

The navigation algorithm proposed in this research is applied to the scientific articles of the National Science Foundation (NSF). The proposed system can enhance the integration and exploration of the scientific articles for the advancement of the Scientific and Engineering Research Community.

Details

Data Technologies and Applications, vol. 56 no. 1
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 28 June 2022

Chedi Bechikh Ali, Hatem Haddad and Yahya Slimani

Abstract

Purpose

A number of approaches and algorithms have been proposed over the years as a basis for automatic indexing. Many of these approaches suffer from poor precision at low recall. The choice of indexing units has a great impact on search system effectiveness. The authors go beyond simple term indexing to propose a framework for multi-word terms (MWT) filtering and indexing.

Design/methodology/approach

In this paper, the authors rely on ranking MWT to filter them, keeping the most effective ones for the indexing process. The proposed model is based on filtering MWT according to their ability to capture the document topic and distinguish between different documents from the same collection. The authors rely on the hypothesis that the best MWT are those that achieve the greatest association degree. The experiments are carried out with English and French data sets.
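The abstract does not name the association measures that were tested. As one common example, and purely as an assumption, the sketch below ranks two-word candidates by pointwise mutual information (PMI), so that the pairs with the greatest association degree come first. The function name `rank_bigrams_by_pmi`, the `min_count` threshold and the sample sentence are illustrative.

```python
import math
from collections import Counter


def rank_bigrams_by_pmi(tokens, min_count=2):
    """Rank adjacent word pairs by PMI, highest first.

    PMI is used here as an assumed stand-in for the unnamed association
    measures; pairs seen fewer than min_count times are filtered out.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (first, second), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_bi
        p_first = unigrams[first] / n_uni
        p_second = unigrams[second] / n_uni
        scores[(first, second)] = math.log2(p_pair / (p_first * p_second))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical token stream
text = "information retrieval system uses information retrieval and the system uses the index"
ranked = rank_bigrams_by_pmi(text.split())
```

Only the top-ranked candidates would then be kept as indexing units; the cut-off is a tuning choice that the abstract does not specify.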

Findings

The results indicate that this approach achieved precision enhancements at low recall, and it performed better than more advanced models based on term dependencies.

Originality/value

The study uses and tests different association measures to select the MWT that best describe the documents, in order to enhance precision among the first retrieved documents.

Article
Publication date: 21 June 2011

Yi‐ling Lin, Peter Brusilovsky and Daqing He

Abstract

Purpose

The goal of the research is to explore whether the use of higher‐level semantic features can help us to build better self‐organising map (SOM) representation as measured from a human‐centred perspective. The authors also explore an automatic evaluation method that utilises human expert knowledge encapsulated in the structure of traditional textbooks to determine map representation quality.

Design/methodology/approach

Two types of document representations involving semantic features have been explored – i.e. using only one individual semantic feature, and mixing a semantic feature with keywords. Experiments were conducted to investigate the impact of semantic representation quality on the map. The experiments were performed on data collections from a single book corpus and a multiple book corpus.

Findings

Combining keywords with certain semantic features achieves significant improvement of representation quality over the keywords‐only approach in a relatively homogeneous single book corpus. Changing the ratios in combining different features also affects the performance. While semantic mixtures can work well in a single book corpus, they lose their advantages over keywords in the multiple book corpus. This raises a concern about whether the semantic representations in the multiple book corpus are homogeneous and coherent enough for applying semantic features. The terminology issue among textbooks affects the ability of the SOM to generate a high quality map for heterogeneous collections.

Originality/value

The authors explored the use of higher‐level document representation features for the development of better quality SOM. In addition the authors have piloted a specific method for evaluating the SOM quality based on the organisation of information content in the map.

Details

Online Information Review, vol. 35 no. 3
Type: Research Article
ISSN: 1468-4527

Details

Machine Translation and Global Research: Towards Improved Machine Translation Literacy in the Scholarly Community
Type: Book
ISBN: 978-1-78756-721-4

Article
Publication date: 1 January 1993

Ankie Visschedijk and Forbes Gibb

Abstract

This article reviews some of the more unconventional text retrieval systems, emphasising those which have been commercialised. These sophisticated systems improve on conventional retrieval by using either innovative software or hardware to increase retrieval speed or functionality, precision or recall. The software systems reviewed are: AIDA, CLARIT, Metamorph, SIMPR, STATUS/IQ, TCS, TINA and TOPIC. The hardware systems reviewed are: CAFS‐ISP, the Connection Machine, GESCAN, HSTS, MPP, TEXTRACT, TRW‐FDF and URSA.

Details

Online and CD-Rom Review, vol. 17 no. 1
Type: Research Article
ISSN: 1353-2642

Article
Publication date: 19 September 2020

Juite Wang and Chih-Chi Hsu

Abstract

Purpose

Smart manufacturing can lead to disruptive changes in production technologies and business models in the manufacturing industry. This paper aims to identify technological topics in smart manufacturing by using patent data, investigating technological trends and exploring potential opportunities.

Design/methodology/approach

The latent Dirichlet allocation (LDA) topic modeling technique was used to extract latent technological topics, and the generalized linear mixed model (GLMM) was used to analyze the relative emergence levels of the topics. Topic value and topic competitive analyses were developed to evaluate each topic's potential value and identify technological positions of competing firms, respectively.

Findings

A total of 14 topics were extracted from the collected patent data, and several fast-growth and high-value topics were identified, such as smart connection, cyber-physical systems (CPSs), manufacturing data analytics and powder bed fusion additive manufacturing. Several leading firms apply a broad R&D emphasis across a variety of technological topics, while others focus on a few technological topics.

Practical implications

The developed methodology can help firms identify important technological topics in smart manufacturing for making their R&D investment decisions. Firms can select appropriate technology strategies depending on the topic's emergence position in the topic strategy matrix.

Originality/value

Previous research studies have not analyzed the maturity levels of technological topics. The topic-based patent analytics approach can complement previous studies. In addition, this study provides a multi-valuation framework for exploring technological opportunities, thus providing valuable information that supports a more robust understanding of the technology landscape of smart manufacturing.

Details

Journal of Manufacturing Technology Management, vol. 32 no. 1
Type: Research Article
ISSN: 1741-038X
