EDBase: Generating a Lexicon Base for Eating Disorders Via Social Media

Anwar, Tarique, Fuller-Tyszkiewicz, Matthew, Jarman, Hannah K, Abuhassan, Mohammad, Shatte, Adrian, Team, WIRED, and Sukunesan, Suku (2022) EDBase: Generating a Lexicon Base for Eating Disorders Via Social Media. IEEE Journal of Biomedical and Health Informatics, 26 (12). pp. 6116-6125.


View at Publisher Website: https://doi.org/10.1109/JBHI.2022.321115...
 


Abstract

Eating disorders (EDs) are characterised by abnormal eating habits and obsessive thoughts about food, weight, shape, and body image. EDs are experienced by a significant portion of the population. Social media has been identified as a possible source of influence for EDs, and there is growing evidence of a large volume of ED-related discussion on social media platforms such as Twitter. With this growing trend, automatic content analysis for EDs is becoming increasingly important. To date, no comprehensive benchmark ED lexicon exists to identify ED-related conversations, which would in turn facilitate these content analysis tasks. In this paper, we propose a novel method for generating a lexicon base for ED language, called EDBase. The method starts by collecting over 3.7 million ED-focused tweets. To semantically represent potential ED terminology in a vector space, an ED word embedding model (EDModel) is trained. We then develop a novel multi-seeded hierarchical density-based algorithm with contrasting corpora for ED lexicon expansion. The proposed lexicon expansion algorithm queries the EDModel to expand the seed terms into a comprehensive lexicon base. Our EDBase consists of a (further expandable) list of 3794 high-quality ED terms, each quantified by an ED score and linked to its parent term. The proposed method significantly outperforms all existing alternative baseline methods and models by over 25% in terms of precision and 1500 in terms of true positives. This research is expected to be impactful in the health data science and healthcare community.
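
The general idea of expanding seed terms through a word embedding model, with each new term scored and linked to a parent term, can be illustrated with a minimal Python sketch. This is not the paper's multi-seeded hierarchical density-based algorithm and it does not model contrasting corpora; it swaps in a plain cosine-similarity threshold over breadth-first expansion. The function names, the threshold value, the toy two-dimensional vectors, and the use of cosine similarity to the parent as a stand-in for the ED score are all assumptions made for illustration.

```python
import math
from collections import deque


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_lexicon(embeddings, seeds, threshold=0.85):
    """Breadth-first expansion from seed terms (hypothetical simplification).

    A candidate word joins the lexicon when its similarity to an already
    accepted term reaches the threshold; that term is recorded as its parent
    and the similarity as a stand-in score.
    """
    lexicon = {s: {"parent": None, "score": 1.0} for s in seeds if s in embeddings}
    queue = deque(lexicon)
    while queue:
        parent = queue.popleft()
        for word, vec in embeddings.items():
            if word in lexicon:
                continue
            sim = cosine(embeddings[parent], vec)
            if sim >= threshold:
                lexicon[word] = {"parent": parent, "score": sim}
                queue.append(word)
    return lexicon


# Toy vectors, invented for this example; a real model would be trained on tweets.
TOY_EMBEDDINGS = {
    "anorexia": [1.0, 0.0],
    "bulimia": [0.9, 0.3],
    "purging": [0.7, 0.7],
    "football": [0.0, 1.0],
}

result = expand_lexicon(TOY_EMBEDDINGS, ["anorexia"])
for term, info in result.items():
    print(term, info["parent"], round(info["score"], 3))
```

In this toy run, "purging" is not close enough to the seed itself but is reached through "bulimia", which shows how parent links can form a hierarchy. It also shows why unconstrained chaining can drift away from the seed topic, a risk that a more careful method would need to control.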

Item ID: 81617
Item Type: Article (Research - C1)
ISSN: 2168-2208
Keywords: Social networking (online); Data mining; Artificial intelligence; Mental disorders; Dictionaries; Semantics; Terminology; Eating disorders; Lexicon base; Mental health; Social media mining
Copyright Information: © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
Date Deposited: 21 Apr 2024 23:52
FoR Codes: 46 INFORMATION AND COMPUTING SCIENCES > 4602 Artificial intelligence > 460208 Natural language processing @ 80%
42 HEALTH SCIENCES > 4202 Epidemiology > 420299 Epidemiology not elsewhere classified @ 20%
SEO Codes: 20 HEALTH > 2005 Specific population health (excl. Indigenous health) > 200599 Specific population health (excl. Indigenous health) not elsewhere classified @ 30%
22 INFORMATION AND COMMUNICATION SERVICES > 2204 Information systems, technologies and services > 220403 Artificial intelligence @ 70%
