Summary of the paper

Title Automatic Identification of Maghreb Dialects Using a Dictionary-Based Approach
Authors Houda SAADANE, Hosni Seffih, Christian Fluhr, Khalid Choukri and Nasredine SEMMAR
Abstract Automatic identification of Arabic dialects in a text is a difficult task, especially for Maghreb languages and when they are written in Arabic or Latin characters (Arabizi). These texts are characterized by the use of code-switching between the Modern Standard Arabic (MSA) and the Arabic Dialect (AD) in the texts written in Arabic, or between Arabizi and foreign languages for those written in Latin. This paper presents the specific resources and tools we have developed for this purpose, with a focus on the transliteration of Arabizi into Arabic (using the dedicated tools for Arabic dialects). A dictionary-based approach to detect the dialectal origin of a text is described, it exhibits satisfactory results.
Topics Language Identification, Corpus (Creation, Annotation, Etc.), Lexicon, Lexical Database
Full paper Automatic Identification of Maghreb Dialects Using a Dictionary-Based Approach
Bibtex @InProceedings{SAADANE18.580,
  author = {Houda SAADANE and Hosni Seffih and Christian Fluhr and Khalid Choukri and Nasredine SEMMAR},
  title = "{Automatic Identification of Maghreb Dialects Using a Dictionary-Based Approach}",
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {May 7-12, 2018},
  address = {Miyazaki, Japan},
  editor = {Nicoletta Calzolari (Conference chair) and Khalid Choukri and Christopher Cieri and Thierry Declerck and Sara Goggi and Koiti Hasida and Hitoshi Isahara and Bente Maegaard and Joseph Mariani and Hélène Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis and Takenobu Tokunaga},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {979-10-95546-00-9},
  language = {english}
  }
Powered by ELDA © 2018 ELDA/ELRA