A Chinese nested named entity recognition approach using sequence labeling

Maojian Chen (School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China; Shunde Innovation School, University of Science and Technology Beijing, Foshan, China and Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing, China)
Xiong Luo (School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China; Shunde Innovation School, University of Science and Technology Beijing, Foshan, China and Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing, China)
Hailun Shen (Ouyeel Co., Ltd, Shanghai, China)
Ziyang Huang (Ouyeel Co., Ltd, Shanghai, China)
Qiaojuan Peng (School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China)
Yuqi Yuan (School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China)

International Journal of Web Information Systems

ISSN: 1744-0084

Article publication date: 4 July 2023

Issue publication date: 12 July 2023

Abstract

Purpose

This study introduces an approach that uses a multi-layer decoder to identify Chinese nested entities across different nesting depths. To reduce the need for manual tuning, an optimization algorithm adjusts the decoder according to the depth of the nested entities present in the data set. With this approach, the study reports strong performance in recognizing Chinese nested entities.
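To make the idea of nesting depth concrete, the following is a minimal sketch of how nested entity spans could be split into one flat label sequence per depth. The BIO tagging scheme, the outermost-first layer order and the helper name `layered_bio` are assumptions made for illustration; the abstract does not specify the labeling scheme the authors use.

```python
from typing import List, Tuple

# Hypothetical span format: (start, end_exclusive, entity_type)
Span = Tuple[int, int, str]

def layered_bio(n_tokens: int, spans: List[Span]) -> List[List[str]]:
    """Split nested spans into non-overlapping layers and tag each with BIO."""
    # Assumption: longer (outer) spans are placed first, so they land in layer 0.
    ordered = sorted(spans, key=lambda s: (-(s[1] - s[0]), s[0]))
    layers: List[List[Span]] = []
    for span in ordered:
        for layer in layers:
            # A span fits in a layer if it overlaps nothing already placed there.
            if all(span[1] <= o[0] or o[1] <= span[0] for o in layer):
                layer.append(span)
                break
        else:
            layers.append([span])  # no layer had room: open a deeper one

    tagged: List[List[str]] = []
    for layer in layers:
        tags = ["O"] * n_tokens
        for start, end, etype in layer:
            tags[start] = f"B-{etype}"
            for i in range(start + 1, end):
                tags[i] = f"I-{etype}"
        tagged.append(tags)
    return tagged

# Illustrative input: the 4-character string "北京大学" as an ORG (chars 0-3)
# that contains the nested LOC "北京" (chars 0-1).
tags = layered_bio(4, [(0, 4, "ORG"), (0, 2, "LOC")])
depth = len(tags)  # number of label layers needed for this sentence
```

Under this sketch, the number of label sequences produced for a sentence equals its nesting depth, which is the quantity a multi-layer decoder would need to cover.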

Design/methodology/approach

This study provides a framework for Chinese nested named entity recognition (NER) based on sequence labeling. As in existing approaches, the framework uses a pre-trained model as the backbone to extract semantic features from the text. A decoder comprising multiple conditional random field (CRF) layers then learns the associations between labels at different granularities. To minimize manual intervention, the Jaya algorithm is used to optimize the number of CRF layers. Experimental results validate the effectiveness of the proposed approach, demonstrating its superior performance on both Chinese nested NER and flat NER tasks.
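The layer-count search can be illustrated with a minimal sketch of the standard Jaya update rule applied to a single integer parameter. Everything specific here is an assumption: the function name `jaya_search`, the population size, the rounding of candidates to integers and the toy objective `toy_score`, which merely stands in for a validation score obtained by training a model with `k` CRF layers. It is not the authors' implementation.

```python
import random
from typing import Callable

def jaya_search(score: Callable[[int], float], low: int, high: int,
                pop_size: int = 6, iterations: int = 20, seed: int = 0) -> int:
    """Maximize score(k) over integers k in [low, high] with a Jaya-style update."""
    rng = random.Random(seed)
    pop = [float(rng.randint(low, high)) for _ in range(pop_size)]
    fit = [score(round(x)) for x in pop]
    for _ in range(iterations):
        best = pop[fit.index(max(fit))]
        worst = pop[fit.index(min(fit))]
        for i, x in enumerate(pop):
            r1, r2 = rng.random(), rng.random()
            # Jaya update: move toward the best candidate and away from the worst.
            cand = x + r1 * (best - abs(x)) - r2 * (worst - abs(x))
            cand = min(max(cand, low), high)  # keep the candidate inside the bounds
            cand_fit = score(round(cand))
            if cand_fit > fit[i]:  # greedy acceptance: keep only improvements
                pop[i], fit[i] = cand, cand_fit
    return round(pop[fit.index(max(fit))])

def toy_score(k: int) -> float:
    # Hypothetical stand-in for a validation score; peaks at k = 3 by construction.
    return -float((k - 3) ** 2)

n_layers = jaya_search(toy_score, low=1, high=8)
```

Jaya has no algorithm-specific hyperparameters beyond population size and iteration count, which fits the stated goal of reducing manual intervention. In a real setting each call to `score` would require training and evaluating a model, so the population and iteration budget would need to be kept small.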

Findings

The experiments show that the proposed method achieves a 4.32% improvement in nested NER performance on the People’s Daily corpus compared with existing models.

Originality/value

This study explores a Chinese NER method based on the sequence labeling paradigm that recognizes complex Chinese nested entities with high accuracy.

Acknowledgements

This work was supported in part by the Beijing Natural Science Foundation under grants 19L2029 and M21032, in part by the National Natural Science Foundation of China under grants 62271045 and U1836106, and in part by the Scientific and Technological Innovation Foundation of Foshan under grants BK21BF001 and BK20BF010.

Declarations

The People’s Daily, MSRA, OntoNotes 4.0 and Weibo data sets analyzed during the current study are available from the following public resources:

www.ling.lancs.ac.uk/corplang/pdcorpus/pdcorpus.html.

http://sighan.cs.uchicago.edu/bakeoff2006/.

https://catalog.ldc.upenn.edu/LDC2011T03.

https://github.com/hltcoe/golden-horse.

The steel industry data that support the findings of this study are available from the enterprise, but restrictions apply to their availability: the data were used under license for the current study and so are not publicly available. The data are, however, available from the authors upon reasonable request and with the permission of the enterprise.

Conflict of interest: The authors declare no conflict of interest.

Citation

Chen, M., Luo, X., Shen, H., Huang, Z., Peng, Q. and Yuan, Y. (2023), "A Chinese nested named entity recognition approach using sequence labeling", International Journal of Web Information Systems, Vol. 19 No. 1, pp. 42-60. https://doi.org/10.1108/IJWIS-04-2023-0070

Publisher

Emerald Publishing Limited

Copyright © 2023, Emerald Publishing Limited