Summary of the paper

Title Towards the National Corpus of Polish
Authors Adam Przepiórkowski, Rafał L. Górski, Barbara Lewandowska-Tomaszczyk and Marek Łaziński
Abstract This paper presents a new corpus project that aims to build a national corpus of Polish. Three things distinguish it from a typical YACP (Yet Another Corpus Project): 1) all four partners in the project have previously constructed corpora of Polish, sometimes in a spirit of collaboration and at other times in a spirit of competition; 2) the partners bring varying areas of expertise and experience to the project, so a synergy effect is anticipated; 3) the corpus will be built with an eye to specific applications in various fields, including lexicography (the corpus will be the empirical basis of a new large general dictionary of Polish) and natural language processing (a number of NLP tools will be constructed within the project).
Language Single language
Topics LR national/international projects, organizational/policy issues, Corpus (creation, annotation, etc.), LR Infrastructures and Architectures
Full paper Towards the National Corpus of Polish
Slides -
Bibtex @InProceedings{PRZEPIRKOWSKI08.211,
  author = {Adam Przepiórkowski and Rafał L. Górski and Barbara Lewandowska-Tomaszczyk and Marek Łaziński},
  title = {Towards the National Corpus of Polish},
  booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
  year = {2008},
  month = {may},
  date = {28-30},
  address = {Marrakech, Morocco},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-4-0},
  note = {http://www.lrec-conf.org/proceedings/lrec2008/},
  language = {english}
  }
