Chaturvedi, Jaya and Mascio, Aurelie and Velupillai, Sumithra U. and Roberts, Angus (2021) Development of a Lexicon for Pain. Frontiers in Digital Health, 3. ISSN 2673-253X
pubmed-zip/versions/1/package-entries/fdgth-03-778305/fdgth-03-778305.pdf - Published Version
Abstract
Pain has been an area of growing interest over the past decade and is known to be associated with mental health issues. Because descriptions of pain in text are often ambiguous, pain presents a unique natural language processing (NLP) challenge. Understanding how pain is described in text, and using this knowledge to improve NLP tasks, would be of substantial clinical importance. Little work has previously been done in this space. For this reason, and in order to develop an English lexicon for use in NLP applications, an exploration of pain concepts within free text was conducted. The exploratory text sources included two hospital databases, a social media platform (Twitter), and an online community (Reddit). This exploration helped select appropriate sources and inform the construction of a pain lexicon. The terms within the final lexicon were derived from three sources: literature, ontologies, and word embedding models. The lexicon was validated by two clinicians and compared to an existing 26-term pain sub-ontology and to MeSH (Medical Subject Headings) terms. The final validated lexicon consists of 382 terms. It will be used in downstream NLP tasks to help select appropriate pain-related documents from electronic health record (EHR) databases, and to pre-annotate these terms to support the development of an NLP application that classifies mentions of pain within the documents. The lexicon and the code used to generate the embedding models have been made publicly available.
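The two downstream uses named in the abstract, selecting pain-related documents and pre-annotating lexicon terms, can be illustrated with a minimal sketch. This is not the authors' published code: the function names (`build_pattern`, `pre_annotate`, `select_documents`) are hypothetical, the matching strategy (case-insensitive whole-word matching) is an assumption, and the four terms are placeholders rather than entries from the actual 382-term lexicon.

```python
import re

# Hypothetical placeholder terms; NOT taken from the published lexicon.
EXAMPLE_TERMS = ["pain", "ache", "sore throat", "headache"]


def build_pattern(terms):
    """Compile one case-insensitive, whole-word pattern from lexicon terms."""
    # Longest terms first, so multi-word terms are tried before shorter ones.
    ordered = sorted({t.lower() for t in terms}, key=len, reverse=True)
    alternation = "|".join(re.escape(t) for t in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def pre_annotate(text, pattern):
    """Return (start, end, matched_text) spans for each lexicon hit."""
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]


def select_documents(docs, pattern):
    """Keep only documents that contain at least one lexicon term."""
    return [d for d in docs if pattern.search(d)]


pattern = build_pattern(EXAMPLE_TERMS)
note = "Patient reports chest pain and a mild headache."
spans = pre_annotate(note, pattern)
kept = select_documents(["No complaints today.", "Dull ache in left knee."], pattern)
```

Whole-word matching keeps "ache" from firing inside "headache", which is one simple way to limit spurious hits. Deciding whether a matched mention is clinically relevant is left to the downstream classifier the abstract describes.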
| Item Type: | Article |
|---|---|
| Subjects: | STM Library > Multidisciplinary |
| Depositing User: | Managing Editor |
| Date Deposited: | 21 Nov 2022 04:35 |
| Last Modified: | 04 May 2024 04:22 |
| URI: | http://open.journal4submit.com/id/eprint/284 |