Title |
Using Stem-Templates to Improve Arabic POS and Gender/Number Tagging |
Authors |
Kareem Darwish, Ahmed Abdelali and Hamdy Mubarak |
Abstract |
This paper presents an end-to-end automatic processing system for Arabic. The system performs correction of common spelling errors pertaining to different forms of alef, ta marbouta and ha, and alef maqsoura and ya; context-sensitive word segmentation into underlying clitics; POS tagging; and gender and number tagging of nouns and adjectives. We introduce the use of stem templates as a feature to improve POS tagging by 0.5% and to help ascertain the gender and number of nouns and adjectives. For gender and number tagging, we report accuracies that are significantly higher on previously unseen words compared to a state-of-the-art system. |
Topics |
Part-of-Speech Tagging |
Full paper |
Using Stem-Templates to Improve Arabic POS and Gender/Number Tagging |
Bibtex |
@InProceedings{DARWISH14.335,
  author = {Kareem Darwish and Ahmed Abdelali and Hamdy Mubarak},
  title = {Using Stem-Templates to Improve Arabic POS and Gender/Number Tagging},
  booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)},
  year = {2014},
  month = {may},
  date = {26-31},
  address = {Reykjavik, Iceland},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-8-4},
  language = {english}
} |