The purpose of this study was to design a multitask learning model that extracts biomedical entities from biomedical texts without ambiguity.
In the proposed automated bio entity extraction (ABEE) model, a multitask learning model is built by combining single-task learning models. Bidirectional Encoder Representations from Transformers (BERT) was used to train the single-task learning models. The models' outputs were then combined so that a variety of entities can be found in biomedical text.
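The combination step described above can be illustrated with a minimal sketch. The source does not specify how the single-task outputs are merged, so everything here is an assumption for illustration: each single-task tagger is assumed to emit one BIO tag per token, and the helper names `decode_bio` and `combine_outputs`, the example sentence and its hand-written tags are all hypothetical rather than taken from ABEE.

```python
from typing import Dict, List, Tuple

# (start_token, end_token_exclusive, entity_type); a hypothetical span format
Span = Tuple[int, int, str]


def decode_bio(tags: List[str], entity_type: str) -> List[Span]:
    """Turn one single-task tagger's BIO tags into typed token spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            # A new entity begins; close any entity that was still open.
            if start is not None:
                spans.append((start, i, entity_type))
            start = i
        elif tag == "I":
            # Tolerate an "I" with no preceding "B" by opening a span here.
            if start is None:
                start = i
        else:
            # "O" (or anything else) closes an open entity.
            if start is not None:
                spans.append((start, i, entity_type))
                start = None
    if start is not None:
        spans.append((start, len(tags), entity_type))
    return spans


def combine_outputs(per_task_tags: Dict[str, List[str]]) -> List[Span]:
    """Merge the spans of all single-task taggers into one sorted list."""
    spans: List[Span] = []
    for entity_type, tags in per_task_tags.items():
        spans.extend(decode_bio(tags, entity_type))
    return sorted(spans)


# Hand-written tags for illustration only; these are not model predictions.
tokens = ["Aspirin", "inhibits", "COX-2", "in", "colorectal", "cancer"]
per_task_tags = {
    "chemical": ["B", "O", "O", "O", "O", "O"],
    "gene": ["O", "O", "B", "O", "O", "O"],
    "disease": ["O", "O", "O", "O", "B", "I"],
}
combined = combine_outputs(per_task_tags)
print(combined)
# [(0, 1, 'chemical'), (2, 3, 'gene'), (4, 6, 'disease')]
```

This sketch simply takes the union of the per-task spans; it does not resolve cases where two taggers claim overlapping tokens, which a real system would need to handle.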
The proposed ABEE model targets unique gene/protein, chemical and disease entities in biomedical text. These findings are particularly important for biomedical research such as drug discovery and clinical trials. This research helps reduce not only the effort of researchers but also the cost of new drug discoveries and new treatments.
No specific limitations of the model were identified; however, the research team plans to test the model on gigabytes of data and to establish a knowledge graph so that researchers can easily estimate the entities of similar groups.
In terms of practical implications, the ABEE model will be helpful in various natural language processing tasks. In information extraction (IE), it plays an important role in biomedical named entity recognition and biomedical relation extraction; it is also useful in information retrieval tasks such as literature-based knowledge discovery.
During the COVID-19 pandemic, demand for this type of work increased because of the rise in clinical trials at that time. Had this type of research been introduced earlier, it would have reduced the time and effort required for new drug discoveries in this area.
In this work, we propose a novel multitask learning model that is capable of extracting biomedical entities from biomedical text without ambiguity. The proposed model achieved state-of-the-art performance in terms of precision, recall and F1 score.
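For readers unfamiliar with the three metrics named above, the following is a minimal sketch of how they are commonly computed for entity extraction. It assumes exact-match scoring, where a predicted entity counts as correct only if its boundaries and type both match a gold entity; the source does not state which matching criterion was used. The function name `precision_recall_f1` and the example spans are hypothetical.

```python
from typing import Set, Tuple

# (start_token, end_token_exclusive, entity_type); a hypothetical span format
Span = Tuple[int, int, str]


def precision_recall_f1(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Entity-level precision, recall and F1 under exact-match scoring."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented example data, not results from the paper.
gold = {(0, 1, "chemical"), (2, 3, "gene"), (4, 6, "disease")}
pred = {(0, 1, "chemical"), (2, 3, "gene"), (4, 5, "disease"), (3, 4, "gene")}
precision, recall, f1 = precision_recall_f1(gold, pred)
```

Here two of the four predictions match gold entities exactly, so precision is 2/4 and recall is 2/3; the truncated disease span counts as both a false positive and a missed entity.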
The authors gratefully acknowledge the Department of Computer Science and Engineering of the National Institute of Technology Raipur for providing infrastructure and facilities necessary for this work.
Funding: This research is not funded by any financial institution.
Authors' contributions: A.K. and A.S. hypothesized and designed the idea of the ABEE model. A.K. developed ABEE. A.K. and A.S. conducted the experiments and analyzed the results. A.S., as the supervisor of A.K., guided this research work. All authors read the final manuscript carefully and approved it.
Availability of data: All the corpora are openly licensed and available at https://github.com/cambridgeltl/MTL-Bioinformatics-2016/tree/master/data and https://github.com/SKumarAshutosh/ABEE.
Declaration of competing interests: The authors declare that they have no competing interests.
Kumar, A. and Sharaff, A. (2022), "ABEE: automated bio entity extraction from biomedical text documents", Data Technologies and Applications, Vol. ahead-of-print No. ahead-of-print. https://doi.org/10.1108/DTA-04-2022-0151
Emerald Publishing Limited
Copyright © 2022, Emerald Publishing Limited