Summary of the paper

Title Building Literary Corpora for Computational Literary Analysis - A Prototype to Bridge the Gap between CL and DH
Authors Andrew Frank and Christine IVANOVIC
Abstract The design of LitText follows the traditional research approach in digital humanities (DH): collecting texts for critical reading and underlining parts of interest. Texts in multiple languages are prepared with a minimal markup language and processed by NLP services. The result is converted to RDF (a.k.a. semantic-web or linked-data) triples. Additional data available as linked data on the web (e.g. Wikipedia data) can be added. The DH researcher can then harvest the corpus with SPARQL queries. The approach is demonstrated with the construction of a 20-million-word corpus from English, German, Spanish, French, and Italian texts, and with an example query that identifies texts in which animals behave like humans, as is the case in fables.
Topics Linked Data, Corpus (Creation, Annotation, Etc.), Other
Bibtex @InProceedings{FRANK18.371,
  author = {Andrew Frank and Christine IVANOVIC},
  title = "{Building Literary Corpora for Computational Literary Analysis - A Prototype to Bridge the Gap between CL and DH}",
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {May 7-12, 2018},
  address = {Miyazaki, Japan},
  editor = {Nicoletta Calzolari (Conference chair) and Khalid Choukri and Christopher Cieri and Thierry Declerck and Sara Goggi and Koiti Hasida and Hitoshi Isahara and Bente Maegaard and Joseph Mariani and Hélène Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis and Takenobu Tokunaga},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {979-10-95546-00-9},
  language = {english}
  }
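
The pipeline outlined in the abstract (NLP output stored as triples, enriched with external linked data, then queried for animals that behave like humans) can be illustrated with a minimal sketch. This is not the LitText implementation: the triples are plain Python tuples rather than an RDF store, the query is a Python function rather than SPARQL, and every identifier (`lit:hasToken`, `nlp:lemma`, `nlp:subjectOf`, `ext:Animal`, the sample texts, and the verb list) is a hypothetical name chosen for illustration only.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Toy data. All prefixes and predicate names are assumptions, not the
# vocabulary used by the paper.
TRIPLES = [
    # NLP output for a fable-like text: "fox" is the subject of "say".
    Triple("text:fable1", "lit:hasToken", "tok:1"),
    Triple("tok:1", "nlp:lemma", "fox"),
    Triple("tok:1", "nlp:subjectOf", "tok:2"),
    Triple("tok:2", "nlp:lemma", "say"),
    # NLP output for another text: "woman" is the subject of "say".
    Triple("text:novel1", "lit:hasToken", "tok:3"),
    Triple("tok:3", "nlp:lemma", "woman"),
    Triple("tok:3", "nlp:subjectOf", "tok:4"),
    Triple("tok:4", "nlp:lemma", "say"),
    # Stand-in for external linked data that classifies lemmas.
    Triple("fox", "rdf:type", "ext:Animal"),
    Triple("woman", "rdf:type", "ext:Person"),
]

# Hypothetical list of verbs treated as typically human actions.
HUMAN_VERBS = {"say", "speak", "think"}


def objects(triples, subject, predicate):
    """Return all objects o such that (subject, predicate, o) is in triples."""
    return {t.obj for t in triples if t.subject == subject and t.predicate == predicate}


def texts_with_humanlike_animals(triples, human_verbs):
    """Find texts in which an animal lemma is the subject of a human verb."""
    hits = set()
    for t in triples:
        if t.predicate != "lit:hasToken":
            continue
        text, token = t.subject, t.obj
        lemmas = objects(triples, token, "nlp:lemma")
        # Join against the external classification data.
        is_animal = any(
            "ext:Animal" in objects(triples, lemma, "rdf:type") for lemma in lemmas
        )
        if not is_animal:
            continue
        for verb_token in objects(triples, token, "nlp:subjectOf"):
            if objects(triples, verb_token, "nlp:lemma") & human_verbs:
                hits.add(text)
    return sorted(hits)


print(texts_with_humanlike_animals(TRIPLES, HUMAN_VERBS))  # ['text:fable1']
```

In an actual RDF setting, the same join (text to token, token to lemma, lemma to external class, token to governed verb) would presumably be written as a single SPARQL graph pattern; the nested lookups here only mimic that pattern matching on a handful of triples.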