Incorporating Embedding to Topic Modeling for More Effective Short Text Analysis

Rashid, Junaid, Kim, Jungeun, and Naseem, Usman (2023) Incorporating Embedding to Topic Modeling for More Effective Short Text Analysis. In: Proceedings of the ACM Web Conference 2023. pp. 73-76. From: WWW 2023: The ACM Web Conference 2023: Companion of The World Wide Web Conference, April 30 - May 4 2023, Austin, TX, USA.


View at Publisher Website: https://doi.org/10.1145/3543873.3587316
 

Abstract

With the growing abundance of short text content on websites, analyzing and comprehending these short texts has become a crucial task. Topic modeling is a widely used technique for analyzing short text documents and uncovering the underlying topics. However, traditional topic models face difficulties in accurately extracting topics from short texts because of their limited content and sparse nature. To address these issues, we propose an Embedding-based topic modeling (EmTM) approach that incorporates word embedding and hierarchical clustering to identify significant topics. Experimental results demonstrate the effectiveness of EmTM on two datasets comprising web short texts, Snippet and News. The results indicate that EmTM outperforms baseline topic models in both classification accuracy and topic coherence.
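
The abstract names word embedding and hierarchical clustering as the two ingredients of EmTM but does not give the algorithm. The following is only a minimal sketch of that general idea, not the authors' method: it groups words into topic-like clusters by average-linkage agglomerative clustering over cosine distance. The 2-d vectors in `TOY_EMBEDDINGS`, the choice of linkage, and the names `cosine_distance`, `average_linkage`, and `cluster_words` are all assumptions made for illustration.

```python
import math

# Hypothetical 2-d "word embeddings"; real embeddings would come from a
# pretrained model. These values are invented and are not from the paper.
TOY_EMBEDDINGS = {
    "goal": (0.9, 0.1),
    "match": (0.8, 0.2),
    "team": (0.85, 0.15),
    "stock": (0.1, 0.9),
    "market": (0.2, 0.8),
    "bank": (0.15, 0.85),
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def average_linkage(c1, c2, emb):
    """Mean pairwise cosine distance between the words of two clusters."""
    dists = [cosine_distance(emb[a], emb[b]) for a in c1 for b in c2]
    return sum(dists) / len(dists)


def cluster_words(emb, num_topics):
    """Agglomerative clustering: merge the closest pair of clusters until
    only num_topics clusters remain (assumes 1 <= num_topics <= len(emb))."""
    clusters = [[w] for w in sorted(emb)]
    while len(clusters) > num_topics:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], emb)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


topics = cluster_words(TOY_EMBEDDINGS, num_topics=2)
for words in topics:
    print(words)
```

Clustering words in embedding space, rather than relying on word co-occurrence counts alone, is one plausible way to work around the sparsity of short texts that the abstract describes; how EmTM actually combines the two steps is specified only in the full paper.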

Item ID: 79221
Item Type: Conference Item (Research - E1)
ISBN: 978-1-4503-9416-1
Copyright Information: © 2023 Copyright held by the owner/author(s).
Date Deposited: 08 Aug 2023 00:14
FoR Codes: 46 INFORMATION AND COMPUTING SCIENCES > 4602 Artificial intelligence > 460208 Natural language processing @ 100%
SEO Codes: 22 INFORMATION AND COMMUNICATION SERVICES > 2204 Information systems, technologies and services > 220403 Artificial intelligence @ 100%