SUMMARY: Session O3-EW Authoring Tools, Information Extraction and Retrieval
Title | A Methodology and Tool for Representing Language Resources for Information Extraction |
---|---|
Authors | J. Iria, F. Ciravegna |
Abstract | In recent years there has been growing interest in clarifying the process of Information Extraction (IE) from documents, particularly when coupled with Machine Learning. We believe that a fundamental step forward in clarifying the IE process would be the ability to perform comparative evaluations of different representations. However, this is difficult because the way information is represented is usually too tightly coupled with the algorithm at the implementation level, making it impossible to vary the representation while keeping the algorithm constant. A further motivation behind our work is to reduce the complexity of designing, developing and testing IE systems. The major contribution of this work is a methodology and a software infrastructure for representing language resources independently of the algorithm, mainly for Information Extraction but with application in other fields; we are currently evaluating its use for ontology learning and document classification. |
Keywords | |
Full paper | A Methodology and Tool for Representing Language Resources for Information Extraction |