Summary of the paper

Title The BNC Parsed with RASP4UIMA
Authors Øistein E. Andersen, Julien Nioche, Ted Briscoe and John Carroll
Abstract We have integrated the RASP system with the UIMA framework (RASP4UIMA) and used this to parse the XML-encoded version of the British National Corpus (BNC). All original annotation is preserved, and parsing information, mainly in the form of grammatical relations, is added in an XML format. A few specific adaptations of the system to give better results with the BNC are discussed briefly. The RASP4UIMA system is publicly available and can be used to parse other corpora or document collections, and the final parsed version of the BNC will be deposited with the Oxford Text Archive.
Language
Topics Corpus (creation, annotation, etc.), Parsing Systems, LR Infrastructures and Architectures
Full paper The BNC Parsed with RASP4UIMA
Slides The BNC Parsed with RASP4UIMA
Bibtex @InProceedings{ANDERSEN08.218,
  author = {Øistein E. Andersen and Julien Nioche and Ted Briscoe and John Carroll},
  title = {The BNC Parsed with RASP4UIMA},
  booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
  year = {2008},
  month = {may},
  date = {28-30},
  address = {Marrakech, Morocco},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-4-0},
  note = {http://www.lrec-conf.org/proceedings/lrec2008/},
  language = {english}
  }
