In this paper, we present the annotation pipeline and the guidelines we wrote as part of an effort to create a manually annotated Arabic author profiling dataset from various social media sources. We summarize our general and dialect-specific guidelines for each of the 12 Arabic dialects. We also present the annotation framework, the quality control process, and the issues and challenges encountered during the annotation phase, especially those related to the peculiarities of the Arabic variety used in social media.
@InProceedings{ZAGHOUANI18.5,
  author    = {Wajdi Zaghouani and Anis Charfi},
  title     = {Guidelines and Annotation Framework for Arabic Author Profiling},
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year      = {2018},
  month     = {may},
  date      = {7-12},
  location  = {Miyazaki, Japan},
  editor    = {Hend Al-Khalifa and Walid Magdy and Kareem Darwish and Tamer Elsayed},
  note      = {Editor affiliations: Hend Al-Khalifa (King Saud University, KSA); Walid Magdy (University of Edinburgh, UK); Kareem Darwish (Qatar Computing Research Institute, Qatar); Tamer Elsayed (Qatar University, Qatar)},
  publisher = {European Language Resources Association (ELRA)},
  address   = {Paris, France},
  isbn      = {979-10-95546-25-2},
  language  = {english}
}