Biomedical Named-Entity Recognition by Hierarchically Fusing BioBERT Representations and Deep Contextual-Level Word-Embedding

Naseem, Usman, Musial, Katarzyna, Eklund, Peter, and Prasad, Mukesh (2020) Biomedical Named-Entity Recognition by Hierarchically Fusing BioBERT Representations and Deep Contextual-Level Word-Embedding. In: Proceedings of the 2020 International Joint Conference on Neural Networks. From: IJCNN: 2020 International Joint Conference on Neural Networks, 19-24 July 2020, Glasgow, UK.


View at Publisher Website: http://doi.org/10.1109/IJCNN48605.2020.9...


Abstract

Text mining in the biomedical domain is increasingly important as the volume of biomedical documents grows. Thanks to advances in natural language processing (NLP), extracting valuable information from the biomedical literature is gaining popularity among researchers, and deep learning has enabled the development of effective biomedical text mining models. However, directly applying advances in NLP to biomedical sources often yields unsatisfactory results because word distributions drift between general-domain corpora and specific biomedical corpora, and this drift introduces linguistic ambiguities. To overcome these challenges, this paper presents a novel method for biomedical named-entity recognition (BioNER) that hierarchically fuses representations from BioBERT, which is trained on biomedical corpora, with deep contextual-level word embeddings to handle the linguistic challenges within biomedical literature. The proposed text representation is then fed to an attention-based bidirectional long short-term memory (BiLSTM) network with a conditional random field (CRF) layer for the BioNER task. The experimental analysis shows that our proposed end-to-end methodology outperforms existing state-of-the-art methods for the BioNER task.
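
As a rough illustration of the two ends of the pipeline the abstract describes, the sketch below shows one way per-token representations could be fused and how a linear-chain CRF picks a tag sequence. It is not the authors' implementation. The abstract does not say how the hierarchical fusion is performed, so plain concatenation is used here as a stand-in assumption. The attention-based BiLSTM is omitted, and the emission scores it would produce are passed in directly. The function names `fuse` and `viterbi_decode` and the tag set are hypothetical.

```python
from typing import Dict, List, Tuple


def fuse(bert_vecs: List[List[float]], ctx_vecs: List[List[float]]) -> List[List[float]]:
    """Fuse two per-token representations by concatenation.

    Assumption: concatenation stands in for the paper's hierarchical
    fusion, whose details are not given in the abstract.
    """
    if len(bert_vecs) != len(ctx_vecs):
        raise ValueError("both inputs must cover the same tokens")
    return [b + c for b, c in zip(bert_vecs, ctx_vecs)]


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][tag] is the score of `tag` at position t (in a full model
    this would come from the BiLSTM); transitions[(prev, cur)] is the score
    of moving from `prev` to `cur`, defaulting to 0.0 when absent.
    """
    # best[t][tag]: best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            scores[cur] = best[-1][prev] + transitions.get((prev, cur), 0.0) + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

With tags `["O", "B", "I"]` and a strongly negative score on the `("O", "I")` transition, the decoder prefers a sequence such as B, I, O over the invalid O, I, O, even when the first token's emission score slightly favours O. This is the usual motivation for placing a CRF layer on top of per-token scores.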

Item ID: 79242
Item Type: Conference Item (Research - E1)
ISBN: 978-1-7281-6926-2
Copyright Information: © 2020 IEEE
Date Deposited: 06 Jul 2023 02:41
FoR Codes: 46 INFORMATION AND COMPUTING SCIENCES > 4602 Artificial intelligence > 460208 Natural language processing @ 100%
SEO Codes: 22 INFORMATION AND COMMUNICATION SERVICES > 2204 Information systems, technologies and services > 220403 Artificial intelligence @ 100%