Domain Knowledge-Enhanced Contrastive Learning for Industry Classification of Enterprises

Wen, Kang, Guo, Zhaoxia, Huang, Tao, and Guo, Feng (2024) Domain Knowledge-Enhanced Contrastive Learning for Industry Classification of Enterprises. In: Proceedings of the 4th IEEE International Conference on Software Engineering and Artificial Intelligence. pp. 210-214. From: SEAI 2024: IEEE 4th International Conference on Software Engineering and Artificial Intelligence, 21-23 June 2024, Xiamen, China.

PDF (Published Version)
Restricted to Repository staff only

View at Publisher Website: https://doi.org/10.1109/SEAI62072.2024.1...


Abstract

Various online platforms provide enterprise data and services for e-commerce and foreign trade companies, and accurate industry classification of enterprise data is the basis for effective data services on these platforms. There is therefore a pressing need to obtain accurate industry classifications automatically. This paper develops a new contrastive learning framework for industry classification of enterprises, which integrates industry domain knowledge with a RoBERTa pretrained model. The effectiveness of the proposed model is demonstrated by extensive experiments on a real enterprise data set. The experimental results show that the model outperforms the baseline and the RoBERTa model with a simple contrastive learning method in terms of accuracy, F1 score, recall and precision.

Item ID: 86882
Item Type: Conference Item (Research - E1)
ISBN: 9798350374346
Keywords: Contrastive Learning, Domain Knowledge, Industry classification, RoBERTa, Text Classification
Copyright Information: © 2024 IEEE
Date Deposited: 12 Nov 2025 01:59
FoR Codes: 46 INFORMATION AND COMPUTING SCIENCES > 4602 Artificial intelligence > 460299 Artificial intelligence not elsewhere classified @ 50%
46 INFORMATION AND COMPUTING SCIENCES > 4612 Software engineering > 461299 Software engineering not elsewhere classified @ 50%
SEO Codes: 22 INFORMATION AND COMMUNICATION SERVICES > 2204 Information systems, technologies and services > 220403 Artificial intelligence @ 100%