Low-resource multi-granularity academic function recognition based on multiple prompt knowledge

Jiawei Liu (School of Information Management, Wuhan University, Wuhan, China)
Zi Xiong (School of Information Management, Wuhan University, Wuhan, China)
Yi Jiang (School of Information Management, Wuhan University, Wuhan, China)
Yongqiang Ma (School of Information Management, Wuhan University, Wuhan, China)
Wei Lu (School of Information Management, Wuhan University, Wuhan, China)
Yong Huang (School of Information Management, Wuhan University, Wuhan, China)
Qikai Cheng (School of Information Management, Wuhan University, Wuhan, China)

The Electronic Library

ISSN: 0264-0473

Article publication date: 22 August 2024

Issue publication date: 31 October 2024

Abstract

Purpose

Fine-tuning pre-trained language models (PLMs), e.g. SciBERT, generally requires large amounts of annotated data to achieve state-of-the-art performance on a range of NLP tasks in the scientific domain. However, obtaining fine-tuning data for scientific NLP tasks remains challenging and expensive. In this paper, the authors propose mix prompt tuning (MPT), a semi-supervised method that aims to reduce the dependence on annotated data and improve performance on multi-granularity academic function recognition tasks.

Design/methodology/approach

Specifically, the proposed method provides multi-perspective representations by combining manually designed prompt templates with automatically learned continuous prompt templates to help the given academic function recognition task take full advantage of knowledge in PLMs. Based on these prompt templates and the fine-tuned PLM, a large number of pseudo labels are assigned to the unlabelled examples. Finally, the authors further fine-tune the PLM using the pseudo training set. The authors evaluate the method on three academic function recognition tasks of different granularity including the citation function, the abstract sentence function and the keyword function, with data sets from the computer science domain and the biomedical domain.
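The pseudo-labelling step described above can be illustrated with a minimal sketch. The abstract does not state how predictions from the different prompt templates are combined or how pseudo labels are filtered, so the averaging rule, the confidence threshold and the function names `average_probs` and `assign_pseudo_labels` are all assumptions made for illustration, not the authors' implementation. The input is assumed to be a list of per-template class probability vectors for each unlabelled example.

```python
from typing import List, Sequence, Tuple


def average_probs(per_prompt_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average the class probability vectors produced by several prompt templates.

    Assumption: simple unweighted averaging; the paper may combine prompts differently.
    """
    n_prompts = len(per_prompt_probs)
    n_classes = len(per_prompt_probs[0])
    return [
        sum(probs[c] for probs in per_prompt_probs) / n_prompts
        for c in range(n_classes)
    ]


def assign_pseudo_labels(
    unlabelled_probs: Sequence[Sequence[Sequence[float]]],
    threshold: float = 0.8,
) -> List[Tuple[int, int]]:
    """Return (example_index, label) pairs for confidently predicted examples.

    Assumption: an example receives a pseudo label only when the averaged
    probability of its top class reaches `threshold` (a hypothetical value).
    """
    pseudo_labels = []
    for idx, per_prompt in enumerate(unlabelled_probs):
        mean = average_probs(per_prompt)
        label = max(range(len(mean)), key=mean.__getitem__)
        if mean[label] >= threshold:
            pseudo_labels.append((idx, label))
    return pseudo_labels


# Toy input: three unlabelled examples, two prompt templates, two classes.
example_probs = [
    [[0.9, 0.1], [0.8, 0.2]],  # both templates agree on class 0
    [[0.6, 0.4], [0.4, 0.6]],  # templates disagree, so no pseudo label
    [[0.1, 0.9], [0.2, 0.8]],  # both templates agree on class 1
]
pseudo_training_set = assign_pseudo_labels(example_probs, threshold=0.8)
```

In this sketch the resulting pairs would then serve as the pseudo training set for the further fine-tuning round; examples on which the templates disagree are simply left unlabelled.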

Findings

Extensive experiments demonstrate the effectiveness of the method and statistically significant improvements over strong baselines. In particular, it achieves an average increase of 5% in Macro-F1 score compared with fine-tuning, and 6% in Macro-F1 score compared with other semi-supervised methods under low-resource settings.
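For readers unfamiliar with the metric, Macro-F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The sketch below computes it from scratch on made-up labels; the function name `macro_f1` and the toy data are illustrative assumptions and are unrelated to the paper's data sets or results.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy three-class example with one misclassified item.
gold = [0, 0, 1, 1, 2, 2]
predicted = [0, 0, 1, 2, 2, 2]
score = macro_f1(gold, predicted)
```

Here the per-class F1 scores are 1.0, 2/3 and 0.8, and Macro-F1 is their plain average.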

Originality/value

MPT is a general method that can be easily applied to other low-resource scientific classification tasks.

Acknowledgements

This work is supported by the National Science and Technology Major Project (2023ZD0121502) and the National Natural Science Foundation of China (72174157).

Citation

Liu, J., Xiong, Z., Jiang, Y., Ma, Y., Lu, W., Huang, Y. and Cheng, Q. (2024), "Low-resource multi-granularity academic function recognition based on multiple prompt knowledge", The Electronic Library, Vol. 42 No. 6, pp. 879-904. https://doi.org/10.1108/EL-01-2024-0022

Publisher

Emerald Publishing Limited

Copyright © 2024, Emerald Publishing Limited
