Bridging NLP Gaps: Methodology to Create A Novel Dataset for Named Entity Recognition in Algerian Dialectal Arabic

BELBEKRI, Adel; Bouarroudj, Wissem; Benchikha, Fouzia

Conference: National Conference on Artificial Intelligence and Its Applications (NCAIA 2024)

URI: http://depot.umc.edu.dz/handle/123456789/14532

Date: 25/10/2024

Abstract:

This paper introduces a methodology for creating a dataset designed to address the challenges of Named Entity Recognition (NER) in Algerian Dialectal Arabic (ADA). While significant advances have been made in NER for Modern Standard Arabic (MSA), dialectal varieties such as ADA remain largely underrepresented in natural language processing (NLP) resources. To bridge this gap, we propose a systematic method for collecting and annotating ADA texts. The dataset will be curated from informal sources, including social media platforms, where ADA is predominantly used.
