Title |
Enriching ODIN |
Authors |
Fei Xia, William Lewis, Michael Wayne Goodman, Joshua Crowgey and Emily M. Bender |
Abstract |
In this paper, we describe the expansion of the ODIN resource, a database containing many thousands of instances of Interlinear Glossed Text (IGT) for over a thousand languages harvested from scholarly linguistic papers posted to the Web. A database containing a large number of instances of IGT, which are effectively richly annotated and heuristically aligned bitexts, provides a unique resource for bootstrapping NLP tools for resource-poor languages. To make the data in ODIN more readily consumable by tool developers and NLP researchers, we propose a new XML format for IGT, called Xigt. We call the updated release ODIN-II. |
Topics |
Endangered Languages, Tools, Systems, Applications |
Full paper |
Enriching ODIN |
Bibtex |
@InProceedings{XIA14.1072,
  author = {Fei Xia and William Lewis and Michael Wayne Goodman and Joshua Crowgey and Emily M. Bender},
  title = {Enriching ODIN},
  booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)},
  year = {2014},
  month = {may},
  date = {26-31},
  address = {Reykjavik, Iceland},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-8-4},
  language = {english}
} |