Summary of the paper

Title Experimental Fast-Tracking of Morphological Analysers for Nguni Languages
Authors Sonja Bosch, Laurette Pretorius, Kholisa Podile and Axel Fleisch
Abstract The development of natural language processing (NLP) components is resource-intensive and therefore justifies exploring ways of reducing development time and effort when building NLP components. This paper addresses the experimental fast-tracking of the development of finite-state morphological analysers for Xhosa, Swati and (Southern) Ndebele by using an existing morphological analyser prototype for Zulu. The research question is whether fast-tracking is feasible across the language boundaries between these closely related varieties. The objective is a thorough assessment of recognition rates yielded by the Zulu morphological analyser for the three related languages. The strategy is to use techniques comprising several cycles of the following steps: applying the analyser to corpus data from all languages, identifying failures, and implementing the respective changes in the analyser. Tests show that the high degree of shared typological properties and formal similarities among the Nguni varieties warrants a modular fast-tracking approach. Word forms recognized by the Zulu analyser were mostly adequately interpreted. Therefore, the focus lies on providing adaptations based on failure output analysis for each language. As a result, the development of analysers for Xhosa, Swati and Ndebele is considerably faster than the creation of the Zulu prototype. The paper concludes with comments on the feasibility of the experiment, and the results of the evaluation.
Language Multiple languages
Topics Morphology, Multilinguality, Other
Full paper Experimental Fast-Tracking of Morphological Analysers for Nguni Languages
Bibtex @InProceedings{BOSCH08.643,
  author = {Sonja Bosch and Laurette Pretorius and Kholisa Podile and Axel Fleisch},
  title = {Experimental Fast-Tracking of Morphological Analysers for Nguni Languages},
  booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
  year = {2008},
  month = {may},
  date = {28-30},
  address = {Marrakech, Morocco},
  editor = {{Nicoletta Calzolari (Conference Chair)} and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-4-0},
  note = {http://www.lrec-conf.org/proceedings/lrec2008/},
  language = {english}
  }
