SUMMARY : Session P10-S
Title | Multilevel corpus analysis: generating and querying an AGset of spoken Italian (SpIt-MDb). |
---|---|
Authors | R. Savy, F. Cutugno, C. Crocco |
Abstract | In this paper we present an application of the Annotation Graph Toolkit (AGTK) to a corpus of spoken Italian annotated at many different linguistic levels. The work consists of two parts: a) the presentation of AG-SpIt, a toolkit for corpus data management that we developed according to the AGTK proposals; b) the presentation of the corpus structure, together with some examples and results of cross-level linguistic analyses obtained by querying the database (SpIt-MDb). As this work is still ongoing, the results must be considered preliminary: they serve as a ‘demo’ illustrating the potential of the tool and the advantages it offers for validating linguistic theories and annotation systems. Currently, SpIt-MDb is a linguistic resource under development; it represents one of the first attempts to create an Italian corpus labelled at various linguistic levels (from the acoustic/sub-phonetic to the textual/pragmatic) in which the interrelations among levels can be queried. |
Keywords | Spoken Italian corpus; multilevel database; cross-level queries |
Full paper | Multilevel corpus analysis: generating and querying an AGset of spoken Italian (SpIt-MDb). |