Summary of the paper

Title: The Denoised Web Treebank: Evaluating Dependency Parsing under Noisy Input Conditions
Authors: Joachim Daiber and Rob van der Goot
Abstract: We introduce the Denoised Web Treebank: a treebank including a normalization layer and a corresponding evaluation metric for dependency parsing of noisy text, such as Tweets. This benchmark enables the evaluation of parser robustness as well as text normalization methods, including normalization as machine translation and unsupervised lexical normalization, directly on syntactic trees. Experiments show that text normalization together with a combination of domain-specific and generic part-of-speech taggers can lead to a significant improvement in parsing accuracy on this test set.
Topics: Parsing, Part-of-Speech Tagging, Social Media Processing
Full paper: The Denoised Web Treebank: Evaluating Dependency Parsing under Noisy Input Conditions
Bibtex:
@InProceedings{DAIBER16.86,
  author = {Joachim Daiber and Rob van der Goot},
  title = {The Denoised Web Treebank: Evaluating Dependency Parsing under Noisy Input Conditions},
  booktitle = {Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)},
  year = {2016},
  month = {may},
  date = {23-28},
  location = {Portorož, Slovenia},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Sara Goggi and Marko Grobelnik and Bente Maegaard and Joseph Mariani and Helene Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {978-2-9517408-9-1},
  language = {english}
 }