Title |
Rapid Deployment of Phrase Structure Parsing for Related Languages: A Case Study of Insular Scandinavian |
Authors |
Anton Karl Ingason, Hrafn Loftsson, Eiríkur Rögnvaldsson, Einar Freyr Sigurðsson and Joel C. Wallenberg |
Abstract |
This paper presents ongoing work that aims to improve machine parsing of Faroese using a combination of Faroese and Icelandic training data. We show that even when only a relatively small parsed corpus is available for one language, here 53,000 words of Faroese, results improve when phrase structure information is added from a closely related language with similar syntax. Our experiment uses the Berkeley parser. Adding the Icelandic data, with no other change to the experimental setup, raises the F-measure for Faroese from 75.44% to 78.05% and part-of-speech tagging accuracy from 88.86% to 90.40%. |
Topics |
Corpus (Creation, Annotation, etc.), Grammar and Syntax |
Full paper |
Rapid Deployment of Phrase Structure Parsing for Related Languages: A Case Study of Insular Scandinavian |
Bibtex |
@InProceedings{INGASON14.855,
  author = {Anton Karl Ingason and Hrafn Loftsson and Eiríkur Rögnvaldsson and Einar Freyr Sigurðsson and Joel C. Wallenberg},
  title = {Rapid Deployment of Phrase Structure Parsing for Related Languages: A Case Study of Insular Scandinavian},
  booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)},
  year = {2014},
  month = {may},
  date = {26-31},
  address = {Reykjavik, Iceland},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-8-4},
  language = {english}
} |