Domain Knowledge-Enhanced Contrastive Learning for Industry Classification of Enterprises
Wen, Kang, Guo, Zhaoxia, Huang, Tao, and Guo, Feng (2024) Domain Knowledge-Enhanced Contrastive Learning for Industry Classification of Enterprises. In: Proceedings of the 4th IEEE International Conference on Software Engineering and Artificial Intelligence. pp. 210-214. From: SEAI 2024: IEEE 4th International Conference on Software Engineering and Artificial Intelligence, 21-23 June 2024, Xiamen, China.
PDF (Published Version): restricted to repository staff only
Abstract
Various online platforms provide enterprise data and services for e-commerce and foreign trade companies, and accurate industry classification of enterprise data is the basis for effective data services. However, there is a pressing need to obtain accurate industry classifications automatically. This paper develops a new contrastive learning framework for industry classification of enterprises, which integrates industry domain knowledge with a pre-trained RoBERTa model. The effectiveness of the proposed model is demonstrated by extensive experiments on a real enterprise data set. The experimental results show that the model is superior to the baseline and to the RoBERTa model with a simple contrastive learning method in terms of accuracy, F1 score, recall and precision.
| Item ID: | 86882 |
|---|---|
| Item Type: | Conference Item (Research - E1) |
| ISBN: | 9798350374346 |
| Keywords: | Contrastive Learning, Domain Knowledge, Industry classification, RoBERTa, Text Classification |
| Copyright Information: | © 2024 IEEE |
| Date Deposited: | 12 Nov 2025 01:59 |
| FoR Codes: | 46 INFORMATION AND COMPUTING SCIENCES > 4602 Artificial intelligence > 460299 Artificial intelligence not elsewhere classified @ 50%; 46 INFORMATION AND COMPUTING SCIENCES > 4612 Software engineering > 461299 Software engineering not elsewhere classified @ 50% |
| SEO Codes: | 22 INFORMATION AND COMMUNICATION SERVICES > 2204 Information systems, technologies and services > 220403 Artificial intelligence @ 100% |
