Arabic text detection: a survey of recent progress challenges and opportunities
Muaad, Abdullah Y., Raza, Shaina, Naseem, Usman, and Davanagere, Hanumanthappa J. Jayappa (2023) Arabic text detection: a survey of recent progress challenges and opportunities. Applied Intelligence, 53 (24). pp. 29845-29862.
Abstract
The Arabic language plays a crucial role in the world, having become the sixth official language of the United Nations (UN). In the last ten years, the volume of Arabic text has grown rapidly, which requires algorithms to be more effective and efficient at representing Arabic Text (AT), detecting patterns, and classifying text into the right class. Many algorithms are available for English text, but the same is not true for Arabic because of the complexity of its morphology and the diversity of Arabic dialects. This study surveys research in the field of Arabic Text Detection (ATD) published from 2017 to 2023. The survey is conducted in two parts. First, we survey eleven topics related to ATD. Second, we survey the three stages of ATD, namely pre-processing, representation, and detection. We explore all available datasets and open-source resources related to AT. The reviewed research reveals that many topics remain of interest. Furthermore, based on our observation, deep learning-based methods yield better results because they capture both the context and semantics of the language. However, they are also slower than traditional representations. Thus, hybrid models seem to be a promising way forward. Finally, we highlight new directions and discuss the open challenges and opportunities, which can assist researchers in identifying future work.
Item ID: | 81083 |
---|---|
Item Type: | Article (Research - C1) |
ISSN: | 1573-7497 |
Keywords: | Arabic language, Natural language processing, Text detection, Text pre-processing, Text representation |
Copyright Information: | © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023 |
Date Deposited: | 29 Feb 2024 05:30 |
FoR Codes: | 46 INFORMATION AND COMPUTING SCIENCES > 4602 Artificial intelligence > 460208 Natural language processing @ 100% |
SEO Codes: | 22 INFORMATION AND COMMUNICATION SERVICES > 2204 Information systems, technologies and services > 220408 Information systems @ 100% |