LREC 2012 Proceedings

Summary of the paper

Title	French and German Corpora for Audience-based Text Type Classification
Authors	Amalia Todirascu, Sebastian Pado, Jennifer Krisch, Max Kisselew and Ulrich Heid
Abstract	This paper presents some of the results of the CLASSYN project, which investigated the classification of text according to audience-related text types. We describe the design principles and the properties of the French and German linguistically annotated corpora that we have created. We report on the tools used to collect the data and on the quality of the syntactic annotation. The CLASSYN corpora comprise two text collections for investigating general text type differences between scientific and popular science text in the two domains of medicine and computer science.
Topics	Corpus (creation, annotation, etc.), Document Classification, Text categorisation, Multilinguality
Full paper	French and German Corpora for Audience-based Text Type Classification
Bibtex	@InProceedings{TODIRASCU12.518, author = {Amalia Todirascu and Sebastian Pado and Jennifer Krisch and Max Kisselew and Ulrich Heid}, title = {French and German Corpora for Audience-based Text Type Classification}, booktitle = {Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)}, year = {2012}, month = {may}, date = {23-25}, address = {Istanbul, Turkey}, editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Mehmet Uğur Doğan and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis}, publisher = {European Language Resources Association (ELRA)}, isbn = {978-2-9517408-7-7}, language = {english} }