A novel multi-layer feature fusion-based BERT-CNN for sentence representation learning and classification

Khaled Hamed Alyoubi (Department of Information Systems, King Abdulaziz University, Jeddah, Saudi Arabia)
Fahd Saleh Alotaibi (Department of Information Systems, King Abdulaziz University, Jeddah, Saudi Arabia)
Akhil Kumar (SCOPE, Vellore Institute of Technology, Chennai Campus, Chennai, India)
Vishal Gupta (Department of CSE, UIET, Panjab University, Chandigarh, India)
Akashdeep Sharma (Department of CSE, UIET, Panjab University, Chandigarh, India)

Robotic Intelligence and Automation

ISSN: 2754-6969

Article publication date: 2 November 2023

Issue publication date: 17 November 2023

Abstract

Purpose

The purpose of this paper is to describe a new approach to sentence representation learning leading to text classification using Bidirectional Encoder Representations from Transformers (BERT) embeddings. This work proposes a novel BERT-convolutional neural network (CNN)-based model for sentence representation learning and text classification. The proposed model can be used by industries working on text similarity scoring, sentiment analysis and opinion mining.

Design/methodology/approach

The approach is based on using the BERT model to provide distinct features from its transformer encoder layers to CNNs to achieve multi-layer feature fusion. The feature vectors from the last three BERT encoder layers are passed to three separate CNN layers to generate a rich feature representation that can be used for extracting the keywords in the sentences. For sentence representation learning and text classification, the proposed model is trained and tested on the Stanford Sentiment Treebank-2 (SST-2) data set for sentiment analysis and the Quora Question Pairs (QQP) data set for sentence classification. To obtain benchmark results, a selective training approach is applied with the proposed model.
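
As a rough illustration of this fusion scheme, the minimal PyTorch sketch below passes the outputs of the last three BERT encoder layers through three separate 1-D convolutions, max-pools each branch and concatenates the results before classification. The kernel size, filter count and pooling choice are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of multi-layer BERT-CNN feature fusion (illustrative, not the
# authors' exact architecture): last three encoder layers -> three CNN branches
# -> global max pooling -> concatenation -> linear classifier.
import torch
import torch.nn as nn
from transformers import BertModel


class BertCnnFusion(nn.Module):
    def __init__(self, num_classes: int = 2, num_filters: int = 128, kernel_size: int = 3):
        super().__init__()
        # output_hidden_states=True exposes every encoder layer's output
        self.bert = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
        hidden = self.bert.config.hidden_size  # 768 for bert-base
        # One 1-D convolution per fused BERT layer (the last three encoder layers)
        self.convs = nn.ModuleList(
            [nn.Conv1d(hidden, num_filters, kernel_size, padding=1) for _ in range(3)]
        )
        self.classifier = nn.Linear(3 * num_filters, num_classes)

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        # hidden_states is a tuple: (embeddings, layer_1, ..., layer_12)
        last_three = outputs.hidden_states[-3:]
        pooled = []
        for conv, layer_out in zip(self.convs, last_three):
            # Conv1d expects (batch, channels, seq_len)
            feat = torch.relu(conv(layer_out.transpose(1, 2)))
            pooled.append(torch.max(feat, dim=2).values)  # global max pooling per branch
        fused = torch.cat(pooled, dim=1)  # multi-layer feature fusion
        return self.classifier(fused)
```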

Findings

On the SST-2 data set, the proposed model achieved an accuracy of 92.90%, whereas, on the QQP data set, it achieved an accuracy of 91.51%. The results for other evaluation metrics such as precision, recall and F1 score are similarly strong. The proposed model outperforms the original BERT model by 1.17%–1.2% on the SST-2 and QQP data sets.

Originality/value

The novelty of the proposed model lies in the multi-layer fusion of features from the last three BERT layers with CNN layers, and in the selective training approach based on gated pruning used to achieve benchmark results.
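
The abstract does not detail the gated pruning mechanism, so the following is only a hypothetical sketch of one way a selective training gate could work: a learnable sigmoid gate per fused branch, with branches whose gate collapses below a threshold zeroed out so that only the selected features continue to train. The class name, gating form and threshold value are assumptions, not the paper's exact method.

```python
# Hypothetical gate for "selective training with gated pruning" (assumption only).
import torch
import torch.nn as nn


class GatedBranchSelector(nn.Module):
    def __init__(self, num_branches: int = 3, threshold: float = 0.1):
        super().__init__()
        # One learnable gate logit per fused feature branch
        self.gate_logits = nn.Parameter(torch.zeros(num_branches))
        self.threshold = threshold

    def forward(self, branch_feats):
        # branch_feats: list of (batch, feat_dim) tensors, one per CNN branch
        gates = torch.sigmoid(self.gate_logits)
        # Prune branches whose gate has collapsed below the threshold
        mask = (gates > self.threshold).float()
        gated = [g * m * f for g, m, f in zip(gates, mask, branch_feats)]
        return torch.cat(gated, dim=1)
```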

Acknowledgements

This work is funded by the Ministry of Education, Kingdom of Saudi Arabia, King Abdulaziz University, Jeddah, Saudi Arabia, under grant number GRANT_FSALOTAIBI_140622_P001. The authors thank the funding agency for its support.

Competing interests: The authors declare that they have no known conflicts of interest.

Availability of code and material: The data sets used to carry out this work are available in public repositories. The code of the proposed model can be obtained from the corresponding author upon reasonable request. SST-2 data set: www.huggingface.co/datasets/sst2; QQP data set: www.huggingface.co/datasets/quora

Since acceptance of this article, the following author has updated their affiliation: Akhil Kumar is at Bennett University, Greater Noida, India.

Citation

Alyoubi, K.H., Alotaibi, F.S., Kumar, A., Gupta, V. and Sharma, A. (2023), "A novel multi-layer feature fusion-based BERT-CNN for sentence representation learning and classification", Robotic Intelligence and Automation, Vol. 43 No. 6, pp. 704-715. https://doi.org/10.1108/RIA-04-2023-0047

Publisher

Emerald Publishing Limited

Copyright © 2023, Emerald Publishing Limited
