Web-based methodology for extracting technology words in Chinese process patents
International Journal of Web Information Systems
ISSN: 1744-0084
Article publication date: 11 August 2020
Issue publication date: 8 October 2020
Abstract
Purpose
A technology/function matrix is constructed to analyze the patents in a target domain, and extracting technology words is an important part of that construction. The method proposed in this paper addresses the low efficiency of traditional technology word extraction from Chinese process patents.
Design/methodology/approach
The authors propose a method for extracting technology words from Chinese process patents, based on an improved term frequency–inverse document frequency (TF-IDF) algorithm, to help technicians obtain the technology words in a target domain. To reflect the characteristics of technology words in Chinese process patents, the TF value of each candidate technology word is divided into four parts, and the corpus used to calculate its IDF value is selected accordingly.
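The idea of a multi-part TF value combined with an IDF value from a chosen corpus can be illustrated with a minimal sketch. The abstract does not say what the four parts are, how they are weighted, or which corpus is used. The sketch therefore assumes, purely for illustration, that the four parts are four patent sections (title, abstract, claims, description) with hypothetical weights, that the text is already segmented into words, and that a smoothed IDF is computed over a small background corpus. The names `SECTION_WEIGHTS`, `weighted_tf`, `idf` and `rank_candidates` are invented here and are not taken from the paper.

```python
import math

# Hypothetical weights for the four TF parts. The abstract does not specify
# the parts or their weights; these values are illustrative only.
SECTION_WEIGHTS = {"title": 0.4, "abstract": 0.3, "claims": 0.2, "description": 0.1}


def weighted_tf(patent, term):
    """Weighted sum of the term's relative frequency in each section.

    `patent` maps a section name to a list of already-segmented tokens.
    """
    score = 0.0
    for section, weight in SECTION_WEIGHTS.items():
        tokens = patent.get(section, [])
        if tokens:
            score += weight * tokens.count(term) / len(tokens)
    return score


def idf(term, corpus):
    """Smoothed IDF over the selected corpus (each document is a set of tokens)."""
    doc_freq = sum(1 for doc in corpus if term in doc)
    return math.log((1 + len(corpus)) / (1 + doc_freq)) + 1.0


def rank_candidates(patent, candidates, corpus):
    """Return (term, score) pairs sorted by descending TF-IDF score."""
    scores = {t: weighted_tf(patent, t) * idf(t, corpus) for t in candidates}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (invented): one segmented path-planning patent and a small
# background corpus used only for the IDF calculation.
patent = {
    "title": ["路径", "规划", "方法"],
    "abstract": ["一种", "路径", "规划", "方法", "包括", "栅格", "地图"],
    "claims": ["方法", "包括", "栅格", "地图", "路径"],
    "description": ["本", "发明", "涉及", "路径", "规划"],
}
corpus = [
    {"一种", "方法", "包括"},
    {"方法", "装置", "发明"},
    {"路径", "方法", "包括"},
    {"一种", "装置", "方法"},
]

ranked = rank_candidates(patent, ["路径", "方法"], corpus)
for term, score in ranked:
    print(term, round(score, 4))
```

In this toy data, "方法" (method) appears in every background document, so its IDF is the minimum value and the domain word "路径" (path) ranks above it even though their TF values are similar. A real pipeline would also need Chinese word segmentation and candidate filtering, which are outside this sketch.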
Findings
A test on Chinese process patents in the path planning domain shows that the method is feasible and practical. It can help users quickly and accurately obtain the technology words of Chinese process patents in a target domain.
Practical implications
As the number of patents on network-based patent information platforms grows, the analysis of massive collections of Chinese process patents has become a research focus. The method proposed in this paper can help users extract technology words from such collections for patent analysis.
Originality/value
This paper aims to improve the efficiency of extracting technology words from Chinese process patents. The authors hope that the proposed method can reduce the labor and time costs of this extraction.
Keywords
Citation
Yang, Y. and Ren, G. (2020), "Web-based methodology for extracting technology words in Chinese process patents", International Journal of Web Information Systems, Vol. 16 No. 3, pp. 315-329. https://doi.org/10.1108/IJWIS-06-2020-0033
Publisher
Emerald Publishing Limited
Copyright © 2020, Emerald Publishing Limited