Title |
Modern Greek Corpus Taxonomy |
Authors |
Mikros George (Institute for Language and Speech Processing, Epidavrou & Artemidos 6, 151 25 Maroussi, Greece, gmikros@ilsp.gr); Carayannis George (Institute for Language and Speech Processing, Epidavrou & Artemidos 6, 151 25 Maroussi, Greece, gcara@ilsp.gr) |
Keywords |
Corpus Analysis, Discriminant Function Analysis, Language Variation, Statistical Linguistics, Stylistic Analysis, Text Categorization |
Session |
Session WO3 - Corpus Categorisation |
Full Paper |
351.ps, 351.pdf |
Abstract |
The aim of this paper is to explore how different kinds of linguistic variables can be used to discriminate text type in 240 preclassified press texts. Owing to its past diglossic status, Modern Greek (MG) exhibits extensive variation in written texts across all linguistic levels, and this variation can be exploited in text categorization tasks. The research presented uses Discriminant Function Analysis (DFA) as a text categorization method and explores how different variable groups contribute to text type discrimination. |
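As a rough illustration of how a discriminant function separates text types, the sketch below fits a two-group Fisher linear discriminant, the simplest case of DFA, in plain Python. It is not the authors' implementation. The two features (mean word length, share of function words), the class labels, and all numeric values are invented for the example and do not come from the paper.

```python
# Minimal two-group Fisher linear discriminant (simplest case of DFA).
# ASSUMPTION: features, labels, and values are hypothetical toy data,
# not the variables or the 240 press texts used in the paper.

def mean_vector(rows):
    """Column-wise mean of a list of 2-feature rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, mu):
    """2x2 within-class scatter: sum of outer products of deviations."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit_fisher(class_a, class_b):
    """Return weights w and cut-off c of the discriminant w.x > c."""
    mu_a, mu_b = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter(class_a, mu_a), scatter(class_b, mu_b)
    # Pooled within-class scatter matrix.
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]]
    # w = Sw^{-1} (mu_a - mu_b)
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Cut-off: projection of the midpoint between the two class means.
    mid = [(mu_a[0] + mu_b[0]) / 2, (mu_a[1] + mu_b[1]) / 2]
    c = w[0] * mid[0] + w[1] * mid[1]
    return w, c

def predict(w, c, x, label_a="news", label_b="opinion"):
    """Assign x to label_a if its discriminant score exceeds the cut-off."""
    score = w[0] * x[0] + w[1] * x[1]
    return label_a if score > c else label_b

# Hypothetical rows: (mean word length, share of function words).
news = [(5.1, 0.30), (5.3, 0.28), (5.0, 0.33), (5.4, 0.27)]
opinion = [(4.4, 0.45), (4.6, 0.42), (4.3, 0.48), (4.7, 0.40)]

w, c = fit_fisher(news, opinion)
predictions = [predict(w, c, x) for x in news + opinion]
```

Comparing how well the groups separate when different sets of features are supplied is one simple way to gauge how much each variable group contributes to the discrimination.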