Title |
T2K^2: a System for Automatically Extracting and Organizing Knowledge from Texts |
Authors |
Felice Dell'Orletta, Giulia Venturi, Andrea Cimino and Simonetta Montemagni |
Abstract |
In this paper, we present T2K^2, a suite of tools for automatically extracting domain-specific knowledge from collections of Italian and English texts. T2K^2 (Text-To-Knowledge v2) relies on a battery of tools for Natural Language Processing (NLP), statistical text analysis and machine learning, which are dynamically integrated to provide an accurate and incremental representation of the content of vast repositories of unstructured documents. Extracted knowledge ranges from domain-specific entities and named entities to the relations connecting them, and can be used for indexing document collections with respect to different information types. T2K^2 also includes linguistic profiling functionalities aimed at supporting the user in constructing the acquisition corpus, e.g., in selecting texts belonging to the same genre or characterized by the same degree of specialization, or in monitoring the added value of newly inserted documents. T2K^2 is a web application, accessible from any browser through a personal account, which has been tested in a wide range of domains. |
Topics |
Information Extraction, Information Retrieval, MultiWord Expressions & Collocations |
Full paper |
T2K^2: a System for Automatically Extracting and Organizing Knowledge from Texts |
Bibtex |
@InProceedings{DELLORLETTA14.590,
  author    = {Felice Dell'Orletta and Giulia Venturi and Andrea Cimino and Simonetta Montemagni},
  title     = {T2K^2: a System for Automatically Extracting and Organizing Knowledge from Texts},
  booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)},
  year      = {2014},
  month     = {may},
  date      = {26-31},
  address   = {Reykjavik, Iceland},
  editor    = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn      = {978-2-9517408-8-4},
  language  = {english}
} |