Summary of the paper

Title Annotating Sumerian: A LLOD-enhanced Workflow for Cuneiform Corpora
Authors Christian Chiarcos, Ilya Khait, Émilie Pagé-Perron, Niko Schenk and Lucas Reckling
Abstract Assyriology, the discipline that studies cuneiform sources and their context, has enormous potential for the application of computational linguistics theory and method on account of the significant quantity of transcribed texts that are available in digital form but that remain as yet largely unexploited. As part of the XXX project, we aim to bring together corpus data, lexical data, linguistic annotations and object metadata in order to contribute to resolving data processing and integration challenges in the field of Assyriology as a whole, as well as for related fields of research such as linguistics and history. Data sparsity presents a challenge to our goal of the automated translation of the Ur III administrative texts. To mitigate this situation we have undertaken to annotate the whole corpus. To this end we have developed an annotation pipeline to facilitate the annotation of our gold corpus. This toolset can be re-employed to annotate any Sumerian text and will be integrated into the Cuneiform Digital Library Initiative (<https://cdli.ucla.edu>) infrastructure. To share these new data, we have also mapped our data to existing LOD and LLOD ontologies and vocabularies. This article provides details on the processing of Sumerian linguistic data using our pipeline, from raw transliterations to rich and structured data in the form of (L)LOD. We describe the morphological and syntactic annotation, with a particular focus on the publication of our datasets as LOD. This application of LLOD in Assyriology is unique and involves the concept of a LLOD edition of a linguistically annotated corpus of Sumerian, as well as linking with lexical resources, repositories of annotation terminology, and finally the museum collections in which the artifacts bearing these inscribed texts are kept.
Full paper Annotating Sumerian: A LLOD-enhanced Workflow for Cuneiform Corpora
Bibtex @InProceedings{CHIARCOS18.12,
  author = {Christian Chiarcos and Ilya Khait and Émilie Pagé-Perron and Niko Schenk and Lucas Reckling},
  title = {Annotating Sumerian: A LLOD-enhanced Workflow for Cuneiform Corpora},
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {may},
  date = {7-12},
  location = {Miyazaki, Japan},
  editor = {John P. McCrae and Christian Chiarcos and Thierry Declerck and Jorge Gracia and Bettina Klimek},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {979-10-95546-19-1},
  language = {english}
  }