Summary of the paper

Title: Designing a Re-Usable and Embeddable Corpus Search Library
Authors: Thomas Krause, Ulf Leser, Anke Lüdeling and Stephan Druskat
Abstract: This paper describes a fundamental re-design and extension of the existing general multi-layer corpus search tool ANNIS, which simplifies its re-use in other tools. This embeddable corpus search library is called graphANNIS and uses annotation graphs as its internal data model. It has a modular design, where each graph component can be implemented by a so-called graph storage and allows efficient reachability queries on each graph component. We show that using different implementations for different types of graphs is much more efficient than relying on a single strategy. Our approach unites the interoperable data model of a directed graph with adaptable and efficient implementations. We argue that graphANNIS can be a valuable building block for applications that need to embed some kind of search functionality on linguistically annotated corpora. Examples are annotation editors that need a search component to support agile corpus creation. The adaptability of graphANNIS, and its ability to support new kinds of annotation structures efficiently, could make such a re-use easier to achieve.
Topics: Corpus Tools
Full paper: Designing a Re-Usable and Embeddable Corpus Search Library
Bibtex:
@InProceedings{KRAUSE18.12,
  author = {Thomas Krause and Ulf Leser and Anke Lüdeling and Stephan Druskat},
  title = {Designing a Re-Usable and Embeddable Corpus Search Library},
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {may},
  date = {7-12},
  location = {Miyazaki, Japan},
  editor = {Piotr Banski and Marc Kupietz and Adrien Barbaresi and Hanno Biber and Evelyn Breiteneder and Simon Clematide and Andreas Witt},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {979-10-95546-14-6},
  language = {english}
  }