Summary of the paper

Title The TextPro Tool Suite
Authors Emanuele Pianta, Christian Girardi and Roberto Zanoli
Abstract We present TextPro, a suite of modular Natural Language Processing (NLP) tools for analysis of Italian and English texts. The suite has been designed so as to integrate and reuse state of the art NLP components developed by researchers at FBK. The current version of the tool suite provides functions ranging from tokenization to chunking and Named Entity Recognition (NER). The system’s architecture is organized as a pipeline of processors wherein each stage accepts data from an initial input or from an output of a previous stage, executes a specific task, and sends the resulting data to the next stage, or to the output of the pipeline. TextPro performed the best on the task of Italian NER and Italian PoS Tagging at EVALITA 2007. When tested on a number of other standard English benchmarks, TextPro confirms that it performs as state of the art system. Distributions for Linux, Solaris and Windows are available, for both research and commercial purposes. A web-service version of the system is under development.
Language Multiple languages
Topics Tools, systems, applications, Information Extraction, Information Retrieval, Tagging
Full paper The TextPro Tool Suite
Slides -
Bibtex @InProceedings{PIANTA08.645,
  author = {Emanuele Pianta, Christian Girardi and Roberto Zanoli},
  title = {The TextPro Tool Suite},
  booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
  year = {2008},
  month = {may},
  date = {28-30},
  address = {Marrakech, Morocco},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-4-0},
  note = {http://www.lrec-conf.org/proceedings/lrec2008/},
  language = {english}
  }

Powered by ELDA © 2008 ELDA/ELRA