Title |
Turkish Treebank as a Gold Standard for Morphological Disambiguation and Its Influence on Parsing |
Authors |
Ozlem Cetinoglu |
Abstract |
So far, predicted scenarios for Turkish dependency parsing have used a morphological disambiguator that is trained on the data distributed with the tool (Sak et al., 2008). Although models trained on this data have high accuracy scores on the test and development data of the same set, the accuracy drops drastically when the model is used in the preprocessing of Turkish Treebank parsing experiments. We propose to use the Turkish Treebank (Oflazer et al., 2003) as a morphological resource to overcome this problem, and we convert the treebank to the morphological disambiguator's format. The experimental results show that we achieve improvements in disambiguating the Turkish Treebank and that the results also carry over to parsing. With the help of better morphological analysis, we present the best labelled dependency parsing scores to date on Turkish. |
Topics |
Parsing, Corpus (Creation, Annotation, etc.) |
Full paper |
Turkish Treebank as a Gold Standard for Morphological Disambiguation and Its Influence on Parsing |
Bibtex |
@InProceedings{CETINOGLU14.1073,
  author = {Ozlem Cetinoglu},
  title = {Turkish Treebank as a Gold Standard for Morphological Disambiguation and Its Influence on Parsing},
  booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)},
  year = {2014},
  month = {may},
  date = {26-31},
  address = {Reykjavik, Iceland},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-8-4},
  language = {english}
} |