Title | Automatic Keyword Extraction from Spoken Text. A Comparison of two Lexical Resources: the EDR and WordNet |
Author(s) |
Lonneke van der Plas (1), Vincenzo Pallotta (2), Martin Rajman (2), Hatem Ghorbel (2)
(1) Rijksuniversiteit Groningen, Informatiekunde, vdplas@let.rug.nl; (2) Faculty of Information and Computer Science, Swiss Federal Institute of Technology - Lausanne, IN F Ecublens 1015 Lausanne, Switzerland, {Vincenzo.Pallotta, Martin.Rajman, Giovanni.Coray}@epfl.ch |
Session | O45-STW |
Abstract | Lexical resources such as WordNet and the EDR electronic dictionary (EDR) have been used in several NLP tasks. WordNet has been used far more often than the EDR, probably in part because the EDR is not freely available. We have used both resources on the same task in order to make a comparison possible. The task is the automatic assignment of keywords to multi-party dialogue episodes (i.e. thematically coherent stretches of spoken text). We show that the use of lexical resources in such a task results in slightly higher performance than the use of a purely statistical method. |
Keyword(s) | Keyword Extraction, Spoken Dialog Corpora, Information Retrieval, Concept Dictionaries |
Language(s) | English |
Full Paper | 670.pdf |