Search results
1 – 10 of 10
Shreyesh Doppalapudi, Tingyan Wang and Robin Qiu
Abstract
Purpose
Clinical notes typically contain medical jargon and specialized words and phrases that are complicated and technical to most people, which is one of the most challenging obstacles in health information dissemination from healthcare providers to consumers. The authors aim to investigate how to leverage machine learning techniques to transform clinical notes of interest into understandable expressions.
Design/methodology/approach
The authors propose a natural language processing pipeline that is capable of extracting relevant information from long unstructured clinical notes and simplifying lexicons by replacing medical jargon and technical terms. In particular, the authors develop an unsupervised keywords matching method to extract relevant information from clinical notes. To automatically evaluate the completeness of the extracted information, the authors perform a multi-label classification task on the relevant texts. To simplify lexicons in the relevant text, the authors identify complex words using a sequence labeler and leverage transformer models to generate candidate words for substitution. The authors validate the proposed pipeline using 58,167 discharge summaries from critical care services.
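The abstract does not spell out how the unsupervised keyword matching works; a minimal sketch of the general idea, with an invented keyword list and threshold, might look like this:

```python
# Minimal sketch of unsupervised keyword matching for extracting relevant
# sentences from a clinical note. The keyword set, threshold and note text
# are illustrative assumptions, not the paper's actual method or data.

def extract_relevant(note: str, keywords: set[str], min_hits: int = 1) -> list[str]:
    """Return sentences containing at least `min_hits` keywords."""
    relevant = []
    for sentence in note.split("."):
        tokens = {t.strip(",;:").lower() for t in sentence.split()}
        if len(tokens & keywords) >= min_hits:
            relevant.append(sentence.strip())
    return relevant

keywords = {"hypertension", "metoprolol", "discharge"}
note = ("Patient admitted with chest pain. History of hypertension. "
        "Started on metoprolol. Family visited daily.")
print(extract_relevant(note, keywords))
# → ['History of hypertension', 'Started on metoprolol']
```

The actual pipeline would additionally run the multi-label classifier over the extracted text to check completeness, which this toy rule-based step omits.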
Findings
The results show that the proposed pipeline can identify relevant information with high completeness and simplify complex expressions in clinical notes so that the converted notes have a high level of readability but a low degree of meaning change.
Social implications
The proposed pipeline can help healthcare consumers well understand their medical information and therefore strengthen communications between healthcare providers and consumers for better care.
Originality/value
An innovative pipeline approach is developed to address the health literacy problem confronted by healthcare providers and consumers in the ongoing digital transformation process in the healthcare industry.
Ashutosh Kumar and Aakanksha Sharaff
Abstract
Purpose
The purpose of this study was to design a multitask learning model so that biomedical entities can be extracted without having any ambiguity from biomedical texts.
Design/methodology/approach
In the proposed automated bio entity extraction (ABEE) model, a multitask learning model has been introduced as a combination of single-task learning models. The model uses Bidirectional Encoder Representations from Transformers (BERT) to train each single-task learning model and then combines the models' outputs so that the variety of entities in biomedical text can be identified.
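The combination step can be illustrated schematically: one tagger per entity type, with their span predictions merged into a single output. The dictionary-based "taggers" below are stand-ins for the trained BERT models, and all entity names are invented examples:

```python
# Toy sketch of combining per-type single-task taggers, in the spirit of
# the ABEE model. Real taggers would be fine-tuned BERT models; here each
# "tagger" is just a vocabulary set, for illustration only.

def combine_taggers(text, taggers):
    """Merge span predictions from per-type taggers, ordered by position."""
    entities = []
    for etype, vocab in taggers.items():
        for term in vocab:
            start = text.lower().find(term)
            if start != -1:
                entities.append((term, etype, start))
    return sorted(entities, key=lambda e: e[2])

taggers = {
    "gene/protein": {"brca1"},
    "chemical": {"tamoxifen"},
    "disease": {"breast cancer"},
}
text = "BRCA1 mutations raise breast cancer risk; tamoxifen is prescribed."
print(combine_taggers(text, taggers))
```

In the real model, overlapping predictions from different taggers would also need disambiguation, which is the ambiguity problem the paper targets.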
Findings
The proposed ABEE model targets unique gene/protein, chemical and disease entities in biomedical text. The findings matter for biomedical research tasks such as drug discovery and clinical trials. This research helps reduce not only the effort of the researcher but also the cost of new drug discoveries and new treatments.
Research limitations/implications
As such, there are no limitations with the model, but the research team plans to test it with gigabytes of data and to establish a knowledge graph so that researchers can easily estimate the entities of similar groups.
Practical implications
As far as practical implications are concerned, the ABEE model will be helpful in various natural language processing tasks: in information extraction (IE) it plays an important role in biomedical named entity recognition and biomedical relation extraction, and it also supports information retrieval tasks such as literature-based knowledge discovery.
Social implications
During the COVID-19 pandemic, demand for this type of work increased because of the rise in clinical trials at that time. Had this type of research been introduced earlier, it would have reduced the time and effort needed for new drug discoveries in this area.
Originality/value
In this work, the authors propose a novel multitask learning model that is capable of extracting biomedical entities from biomedical text without any ambiguity. The proposed model achieved state-of-the-art performance in terms of precision, recall and F1 score.
Yi-Hung Liu and Sheng-Fong Chen
Abstract
Purpose
Whether automatically generated summaries of health social media can assist users in appropriately managing their diseases and ensuring better communication with health professionals becomes an important issue. This paper aims to develop a novel deep learning-based summarization approach for obtaining the most informative summaries from online patient reviews accurately and effectively.
Design/methodology/approach
This paper proposes a framework to generate summaries that integrates a domain-specific pre-trained embedding model and a deep neural extractive summary approach by considering content features, text sentiment, review influence and readability features. Representative health-related summaries were identified, and user judgements were analysed.
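The weighting of content, sentiment, influence and readability features can be sketched as a simple scoring scheme; the feature values and weights below are illustrative placeholders, not the paper's learned parameters:

```python
# Minimal sketch of feature-weighted extractive summarization in the
# spirit of the proposed framework. All feature scores and weights are
# invented for illustration; the real system learns them from data.

def score_sentence(features: dict, weights: dict) -> float:
    """Weighted sum over content, sentiment, influence and readability."""
    return sum(weights[k] * features[k] for k in weights)

def summarize(sentences, weights, k=1):
    """Pick the top-k sentences by weighted feature score."""
    ranked = sorted(sentences,
                    key=lambda s: score_sentence(s["features"], weights),
                    reverse=True)
    return [s["text"] for s in ranked[:k]]

weights = {"content": 0.4, "sentiment": 0.2, "influence": 0.2, "readability": 0.2}
sentences = [
    {"text": "The drug eased my symptoms within a week.",
     "features": {"content": 0.9, "sentiment": 0.8, "influence": 0.7, "readability": 0.6}},
    {"text": "Thanks for reading my post.",
     "features": {"content": 0.1, "sentiment": 0.5, "influence": 0.2, "readability": 0.9}},
]
print(summarize(sentences, weights))
# → ['The drug eased my symptoms within a week.']
```

Dropping any one feature from the weighted sum changes the ranking, which mirrors the paper's finding that omitting adopted features degrades summarization quality.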
Findings
Experimental results on three real-world health forum data sets indicate that scoring sentences without incorporating all the adopted features degrades summarization performance. The proposed summarizer significantly outperformed the comparison baseline. User judgements collected through a questionnaire provide realistic and concrete evidence of the features that most strongly influence patient forum review summaries.
Originality/value
This study contributes to health analytics and management literature by exploring users’ expressions and opinions through the health deep learning summarization model. The research also developed an innovative mindset to design summarization weighting methods from user-created content on health topics.
Khaled Hamed Alyoubi, Fahd Saleh Alotaibi, Akhil Kumar, Vishal Gupta and Akashdeep Sharma
Abstract
Purpose
The purpose of this paper is to describe a new approach to sentence representation learning leading to text classification using Bidirectional Encoder Representations from Transformers (BERT) embeddings. This work proposes a novel BERT-convolutional neural network (CNN)-based model for sentence representation learning and text classification. The proposed model can be used by industries that work in the area of classification of similarity scores between the texts and sentiments and opinion analysis.
Design/methodology/approach
The approach developed is based on the use of the BERT model to provide distinct features from its transformer encoder layers to the CNNs to achieve multi-layer feature fusion. To achieve multi-layer feature fusion, the distinct feature vectors of the last three layers of the BERT are passed to three separate CNN layers to generate a rich feature representation that can be used for extracting the keywords in the sentences. For sentence representation learning and text classification, the proposed model is trained and tested on the Stanford Sentiment Treebank-2 (SST-2) data set for sentiment analysis and the Quora Question Pair (QQP) data set for sentence classification. To obtain benchmark results, a selective training approach has been applied with the proposed model.
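The multi-layer fusion idea can be illustrated with a toy branch per encoder layer: each of the "last three layers" (random stand-in matrices here) feeds its own single-filter convolution, and the pooled outputs are concatenated. Shapes and weights are assumptions for illustration, not the trained model's:

```python
# Toy sketch of multi-layer feature fusion: one CNN branch per BERT
# encoder layer, pooled outputs concatenated. Real CNN layers use many
# trained filters; this single random filter per branch only shows shapes.
import random

random.seed(0)

def conv_maxpool(layer, kernel):
    """1-D convolution over the token axis, then max-pool to one feature."""
    k = len(kernel)
    feats = []
    for i in range(len(layer) - k + 1):
        window = layer[i:i + k]
        feats.append(sum(w * f for row, krow in zip(window, kernel)
                         for w, f in zip(row, krow)))
    return max(feats)

def fuse(layers, kernels):
    """One CNN branch per encoder layer; concatenate pooled features."""
    return [conv_maxpool(l, k) for l, k in zip(layers, kernels)]

def rand_matrix(rows, cols):
    return [[random.random() for _ in range(cols)] for _ in range(rows)]

layers = [rand_matrix(8, 16) for _ in range(3)]   # last three layer outputs
kernels = [rand_matrix(3, 16) for _ in range(3)]  # one filter per branch
fused = fuse(layers, kernels)
print(len(fused))  # 3 pooled features, one per layer
```

The fused vector would then feed the classification head; the paper's selective training with gated pruning is beyond this sketch.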
Findings
On the SST-2 data set, the proposed model achieved an accuracy of 92.90%, whereas on the QQP data set it achieved an accuracy of 91.51%. The model also performs strongly on other evaluation metrics such as precision, recall and F1 score. The results of the proposed model are 1.17%–1.2% better than those of the original BERT model on the SST-2 and QQP data sets.
Originality/value
The novelty of the proposed model lies in the multi-layer feature fusion between the last three layers of the BERT model with CNN layers and the selective training approach based on gated pruning to achieve benchmark results.
Jinxiang Zeng, Shujin Cao, Yijin Chen, Pei Pan and Yafang Cai
Abstract
Purpose
This study analyzed the interdisciplinary characteristics of Chinese research studies in library and information science (LIS) measured by knowledge elements extracted through the Lexicon-LSTM model.
Design/methodology/approach
Eight research themes were selected for the experiment, and a large-scale (N = 11,625) dataset of research papers from the China National Knowledge Infrastructure (CNKI) database was constructed and complemented with multiple corpora. Knowledge elements were extracted with a Lexicon-LSTM model, and a subject knowledge graph was constructed to support the searching and classification of knowledge elements. An interdisciplinary-weighted average citation index space was built to measure interdisciplinary characteristics and contributions based on knowledge elements.
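The abstract mentions a diversity indicator without defining it; Shannon entropy over the discipline distribution is a common choice for such indicators and is used below purely as an assumption to show how a diversity score behaves:

```python
import math

# Illustrative diversity indicator over a discipline-count distribution.
# Shannon entropy is assumed here as a stand-in for the paper's actual
# interdisciplinary diversity metric; the counts are invented examples.

def shannon_diversity(counts):
    """Shannon entropy of a discipline-count distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c)

balanced = {"LIS": 10, "CS": 10, "Medicine": 10}
skewed = {"LIS": 28, "CS": 1, "Medicine": 1}
print(shannon_diversity(balanced) > shannon_diversity(skewed))  # True
```

A citation profile spread evenly across disciplines scores higher than one concentrated in a single field, matching the intuition behind the rising diversity trend the paper reports.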
Findings
The empirical research shows that the Lexicon-LSTM model is superior in the accuracy of extracting knowledge elements. In the field of LIS, the interdisciplinary diversity indicator showed an upward trend from 2011 to 2021, while the disciplinary balance and difference indicators showed a downward trend. The knowledge elements of theory and methodology could be used to detect and measure interdisciplinary characteristics and contributions.
Originality/value
The extraction of knowledge elements facilitates the discovery of semantic information embedded in academic papers. Knowledge elements proved feasible for measuring interdisciplinary characteristics and exploring changes over time, which helps provide an overview of the state of the art and future development trends of interdisciplinarity in LIS research themes.
Tongyang Zhang, Fang Tan, Chao Yu, Jiexun Wu and Jian Xu
Abstract
Purpose
Proper topic selection is an essential prerequisite for the success of research. To study this, this article proposes an important factor in topic selection, topic popularity, to examine the relationship between topic selection and team performance.
Design/methodology/approach
The authors adopt extracted entities of the gene/protein type, used as proxies for topics, to track the development of topic popularity. A decision tree model classifies the ascending and descending phases of entity popularity based on the temporal trend of entity occurrence frequency. By comparing various dimensions of team performance – academic performance, research funding, the relationship between performance and funding and the corresponding author's influence at different phases of topic popularity – the relationship between the selected phase of topic popularity and the academic performance of research teams can be explored.
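The phase-labelling step can be sketched with the simplest possible trend rule; the paper uses a decision tree over temporal features, so the least-squares slope below is only a stand-in, and the frequency series are invented:

```python
# Sketch of labelling the phase of topic (entity) popularity from its
# occurrence-frequency time series. A stand-in least-squares trend slope
# replaces the paper's decision tree; the series are invented examples.

def phase(freqs):
    """Label a frequency series 'ascending' or 'descending' by its slope."""
    n = len(freqs)
    xbar, ybar = (n - 1) / 2, sum(freqs) / n
    slope = sum((i - xbar) * (f - ybar) for i, f in enumerate(freqs))
    return "ascending" if slope > 0 else "descending"

print(phase([3, 5, 8, 13, 21]))  # ascending
print(phase([21, 13, 8, 5, 3]))  # descending
```

Team performance metrics would then be compared between teams publishing in the ascending versus descending phase of their chosen topics.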
Findings
First, topic popularity can affect team performance in terms of academic productivity and the academic influence of the team's research work. Second, topic popularity can affect the quantity and amount of research funding received by teams. Third, topic popularity can affect the promotion effect of funding on team performance. Fourth, topic popularity can affect the influence of the corresponding author on team performance.
Originality/value
This is a new attempt at team-oriented analysis of the relationship between topic selection and academic performance. By clarifying the relationships amongst topic popularity, team performance and research funding, the study helps researchers and policy makers make well-reasoned decisions on topic selection.
Huyen Nguyen, Haihua Chen, Jiangping Chen, Kate Kargozari and Junhua Ding
Abstract
Purpose
This study aims to evaluate a method of building a biomedical knowledge graph (KG).
Design/methodology/approach
This research first constructs a COVID-19 KG on the COVID-19 Open Research Data Set, covering information over six categories (i.e. disease, drug, gene, species, therapy and symptom). The construction used open-source tools to extract entities, relations and triples. Then, the COVID-19 KG is evaluated on three data-quality dimensions: correctness, relatedness and comprehensiveness, using a semiautomatic approach. Finally, this study assesses the application of the KG by building a question answering (Q&A) system. Five queries regarding COVID-19 genomes, symptoms, transmissions and therapeutics were submitted to the system and the results were analyzed.
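The triple-based KG and lookup-style Q&A can be sketched minimally; the entity and relation names below are invented examples, not actual CORD-19 extraction output:

```python
# Minimal sketch of a triple store and a lookup-style Q&A over it, in
# the spirit of the COVID-19 KG evaluation. All triples are invented
# examples; the real KG is extracted with open-source NLP tools.

class TripleStore:
    def __init__(self):
        self.triples = set()

    def add(self, head, relation, tail):
        self.triples.add((head, relation, tail))

    def query(self, head=None, relation=None, tail=None):
        """Return triples matching the non-None slots."""
        return [t for t in self.triples
                if (head is None or t[0] == head)
                and (relation is None or t[1] == relation)
                and (tail is None or t[2] == tail)]

kg = TripleStore()
kg.add("COVID-19", "has_symptom", "fever")
kg.add("COVID-19", "has_symptom", "cough")
kg.add("remdesivir", "treats", "COVID-19")
print(sorted(kg.query(head="COVID-19", relation="has_symptom")))
```

Comprehensiveness in this framing corresponds to how many true triples the store contains, which is why the larger-scale KGs answered more of the five test queries correctly.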
Findings
With current extraction tools, the quality of the KG is moderate and difficult to improve unless more effort is made to improve the tools for entity extraction, relation extraction and others. This study finds that comprehensiveness and relatedness positively correlate with data size. Furthermore, the results indicate that Q&A systems built on the larger-scale KGs perform better than those built on smaller ones for most queries, demonstrating the importance of relatedness and comprehensiveness in ensuring the usefulness of the KG.
Originality/value
The KG construction process, data-quality-based and application-based evaluations discussed in this paper provide valuable references for KG researchers and practitioners to build high-quality domain-specific knowledge discovery systems.
Yi-Hung Liu, Sheng-Fong Chen and Dan-Wei (Marian) Wen
Abstract
Purpose
Online medical repositories provide a platform for users to share information and dynamically access abundant electronic health data. It is important to determine whether case report information can assist the general public in appropriately managing their diseases. Therefore, this paper aims to introduce a novel deep learning-based method that allows non-professionals to make inquiries using ordinary vocabulary, retrieving the most relevant case reports for accurate and effective health information.
Design/methodology/approach
The dataset of case reports was collected from both the patient-generated research network and the digital medical journal repository. To enhance the accuracy of obtaining relevant case reports, the authors propose a retrieval approach that combines BERT and BiLSTM methods. The authors identified representative health-related case reports and analyzed the retrieval performance, as well as user judgments.
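The final ranking step of such a retrieval approach can be sketched as cosine similarity between a query embedding and case-report embeddings; the toy three-dimensional vectors below stand in for the BERT/BiLSTM representations the paper actually learns:

```python
import math

# Sketch of ranking case reports by cosine similarity to a query vector.
# The vectors are invented stand-ins for learned BERT/BiLSTM embeddings;
# the report ids are hypothetical.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank(query_vec, reports):
    """Return report ids sorted by similarity to the query, best first."""
    return sorted(reports,
                  key=lambda rid: cosine(query_vec, reports[rid]),
                  reverse=True)

reports = {"case_17": [0.9, 0.1, 0.0], "case_42": [0.1, 0.9, 0.2]}
query = [0.8, 0.2, 0.1]
print(rank(query, reports))  # → ['case_17', 'case_42']
```

In the full system, user feedback and the weighting features mentioned in the findings would adjust this ranking rather than relying on similarity alone.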
Findings
This study aims to provide the necessary functionalities to deliver relevant health case reports based on input from ordinary terms. The proposed framework includes features for health management, user feedback acquisition and ranking by weights to obtain the most pertinent case reports.
Originality/value
This study contributes to health information systems by analyzing patients' experiences and treatments with the case report retrieval model. The results of this study can provide immense benefit to the general public who intend to find treatment decisions and experiences from relevant case reports.