LREC 2000 2nd International Conference on Language Resources & Evaluation
 


Title Modern Greek Corpus Taxonomy
Authors Mikros George (Institute for Language and Speech Processing, Epidavrou & Artemidos 6, 151 25 Maroussi, Greece, gmikros@ilsp.gr)
Carayannis George (Institute for Language and Speech Processing, Epidavrou & Artemidos 6, 151 25 Maroussi, Greece, gcara@ilsp.gr)
Keywords Corpus Analysis, Discriminant Function Analysis, Language Variation, Statistical Linguistics, Stylistic Analysis, Text Categorization
Session Session WO3 - Corpus Categorisation
Full Paper 351.ps, 351.pdf
Abstract The aim of this paper is to explore how different kinds of linguistic variables can be used to discriminate text type in 240 preclassified press texts. Owing to its past diglossic status, Modern Greek (MG) exhibits extensive variation in written texts across all linguistic levels, and this variation can be exploited in text categorization tasks. The research presented uses Discriminant Function Analysis (DFA) as a text categorization method and explores how different variable groups contribute to text type discrimination.
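
As a rough illustration of how a discriminant function separates text types, the sketch below fits a two-class Fisher linear discriminant in plain Python. It is not the authors' implementation: the abstract does not specify the variables, the number of text types, or the software used. The two features, the labels `type_a` and `type_b`, and all numeric values are hypothetical toy data invented for this example.

```python
# Minimal two-class Fisher linear discriminant (illustrative sketch only).
# ASSUMPTION: two hypothetical features per text (e.g. normalized frequencies
# of two linguistic variables); the values below are invented toy numbers,
# not data from the paper.

TYPE_A = [(5.0, 1.0), (6.0, 2.0), (5.5, 1.5), (6.5, 1.0)]
TYPE_B = [(2.0, 4.0), (1.5, 5.0), (2.5, 4.5), (1.0, 4.0)]


def mean_vector(rows):
    """Column-wise mean of a list of 2-feature rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def within_scatter(groups, means):
    """Pooled within-class scatter matrix (2 x 2)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in zip(groups, means):
        for r in rows:
            d = (r[0] - m[0], r[1] - m[1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    return s


def fit_fisher(class_a, class_b):
    """Return weights w and cut-off c of the discriminant function w.x."""
    ma, mb = mean_vector(class_a), mean_vector(class_b)
    s = within_scatter([class_a, class_b], [ma, mb])
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    # w = S_w^{-1} (mean_a - mean_b), using the closed-form 2 x 2 inverse
    diff = (ma[0] - mb[0], ma[1] - mb[1])
    w = [
        (s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
        (-s[1][0] * diff[0] + s[0][0] * diff[1]) / det,
    ]
    # Cut-off: midpoint between the two projected class means
    c = 0.5 * (w[0] * (ma[0] + mb[0]) + w[1] * (ma[1] + mb[1]))
    return w, c


def classify(x, w, c):
    """Assign a text to a type by comparing its discriminant score to c."""
    score = w[0] * x[0] + w[1] * x[1]
    return "type_a" if score > c else "type_b"


if __name__ == "__main__":
    w, c = fit_fisher(TYPE_A, TYPE_B)
    print("weights:", w, "cut-off:", c)
    print(classify((6.0, 1.0), w, c))
```

The relative size of the weights indicates how strongly each variable contributes to the separation, which is the kind of question the abstract raises about variable groups. A study with more than two text types and many variables would need the multi-class form of the analysis, which this sketch does not cover.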