ABEE: automated bio entity extraction from biomedical text documents

Ashutosh Kumar (Department of Computer Science and Engineering, National Institute of Technology Raipur, Raipur, India)
Aakanksha Sharaff (Department of Computer Science and Engineering, National Institute of Technology Raipur, Raipur, India)

Data Technologies and Applications

ISSN: 2514-9288

Article publication date: 25 January 2023

Issue publication date: 25 April 2023

Abstract

Purpose

The purpose of this study was to design a multitask learning model that extracts biomedical entities from biomedical texts without ambiguity.

Design/methodology/approach

The proposed automated bio entity extraction (ABEE) model is a multitask learning model built by combining single-task learning models. Each single-task model was trained using Bidirectional Encoder Representations from Transformers (BERT). The models' outputs were then combined to identify a variety of entities in biomedical text.
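
The abstract does not say how the single-task outputs are combined, so the following is only a minimal sketch of one plausible combination step, not the authors' method. It assumes each single-task tagger returns character-offset spans with a confidence score, and that overlapping spans are resolved by keeping the higher-scoring one. The names `Span` and `merge_spans`, and the scores in the example, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One predicted entity (hypothetical structure, not from the paper)."""
    start: int    # inclusive character offset
    end: int      # exclusive character offset
    label: str    # e.g. "GENE", "CHEMICAL", "DISEASE"
    score: float  # model confidence, assumed comparable across taggers


def merge_spans(per_task_spans):
    """Combine spans from several single-task taggers.

    Assumed rule: visit candidates from highest to lowest score and keep
    a span only if it does not overlap a span that was already kept.
    """
    candidates = sorted(
        (span for spans in per_task_spans for span in spans),
        key=lambda s: (-s.score, s.start, s.end),
    )
    kept = []
    for cand in candidates:
        # Two half-open intervals are disjoint when one ends before the other starts.
        if all(cand.end <= k.start or cand.start >= k.end for k in kept):
            kept.append(cand)
    return sorted(kept, key=lambda s: s.start)


# Illustrative outputs of three single-task taggers (made-up values).
gene_spans = [Span(0, 5, "GENE", 0.91)]
chemical_spans = [Span(0, 5, "CHEMICAL", 0.62), Span(20, 27, "CHEMICAL", 0.88)]
disease_spans = [Span(40, 46, "DISEASE", 0.95)]

merged = merge_spans([gene_spans, chemical_spans, disease_spans])
for span in merged:
    print(span.start, span.end, span.label)
```

In this sketch the span at offsets 0 to 5 is claimed by both the gene and chemical taggers, and only the higher-scoring label survives, which is one simple way to give each mention a single, unambiguous type.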

Findings

The proposed ABEE model targets unique gene/protein, chemical and disease entities in biomedical text. These findings matter for biomedical research such as drug discovery and clinical trials. The research helps reduce both the effort required of researchers and the cost of discovering new drugs and treatments.

Research limitations/implications

The authors report no limitations of the model as such, but the research team plans to test it on gigabytes of data and to build a knowledge graph so that researchers can easily estimate the entities of similar groups.

Practical implications

In practical terms, the ABEE model will be helpful in various natural language processing tasks. In information extraction (IE), it plays an important role in biomedical named entity recognition and biomedical relation extraction. It also supports information retrieval tasks such as literature-based knowledge discovery.

Social implications

During the COVID-19 pandemic, demand for this type of work increased because of the rise in clinical trials at that time. Had this type of research been introduced earlier, it would have reduced the time and effort required for new drug discoveries in this area.

Originality/value

In this work, we propose a novel multitask learning model capable of extracting biomedical entities from biomedical text without ambiguity. The proposed model achieved state-of-the-art performance in terms of precision, recall and F1 score.
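
For readers unfamiliar with the three metrics named above, here is a minimal sketch of how precision, recall and F1 are commonly computed for entity extraction. It assumes exact-match scoring, where a prediction counts as correct only if its offsets and label both match a gold entity; the abstract does not state which matching criterion the authors used. The function name `precision_recall_f1` and the example entities are hypothetical.

```python
def precision_recall_f1(gold, predicted):
    """Exact-match entity-level scores.

    `gold` and `predicted` are collections of (start, end, label) tuples.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: two of three predictions match the gold annotations.
gold_entities = {(0, 5, "GENE"), (20, 27, "CHEMICAL"), (40, 46, "DISEASE")}
predicted_entities = {(0, 5, "GENE"), (20, 27, "CHEMICAL"), (50, 55, "DISEASE")}

p, r, f1 = precision_recall_f1(gold_entities, predicted_entities)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```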

Acknowledgements

The authors gratefully acknowledge the Department of Computer Science and Engineering of the National Institute of Technology Raipur for providing infrastructure and facilities necessary for this work.

Funding: This research is not funded by any financial institution.

Authors' contributions: A.K. and A.S. conceived and designed the ABEE model. A.K. developed ABEE. A.K. and A.S. conducted the experiments and analyzed the results. A.S., as the supervisor of A.K., guided this research work. All authors read the final manuscript carefully and approved it.

Availability of data: All the corpora are openly licensed and available at https://github.com/cambridgeltl/MTL-Bioinformatics-2016/tree/master/data and https://github.com/SKumarAshutosh/ABEE.

Declaration of competing interests: The authors declare that they have no competing interests.

Citation

Kumar, A. and Sharaff, A. (2023), "ABEE: automated bio entity extraction from biomedical text documents", Data Technologies and Applications, Vol. 57 No. 2, pp. 222-244. https://doi.org/10.1108/DTA-04-2022-0151

Publisher: Emerald Publishing Limited

Copyright © 2023, Emerald Publishing Limited
