ABSTRACT
Full utilisation of information available in speech databases has not
always been feasible due to the differing standards and formats
employed. In addition, the diversity introduced by multilingual
material has made it even more difficult to analyse speech databases
within a single computing environment.
In this paper we briefly present the QuickSig object-oriented signal processing system [1], a modern tool for DSP-related studies. It lets speech scientists work in a flexible and motivating environment where signals, filters, spectrograms, etc., are all modelled as objects. Seamlessly integrated into QuickSig is an object-oriented database [2] that allows signals, along with their features and relations, to be stored persistently between sessions in a manner that is transparent to the user. A multilingual phonetic representational system [3] exists within the same environment and allows speech from different databases (e.g., different languages and phonetic alphabets) to be modelled generically. Relations between speech units such as sentences, words, phones, etc., are defined explicitly, forming a phonetic object structure for each utterance. The user can easily formulate complex pattern matching searches that traverse the phonetic structures and return the desired contexts. These speech events can then be used in actual applications.
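As an illustration only, the idea of a phonetic object structure with a context search over it might be sketched as follows in Python. This is not QuickSig code: the class `Unit`, the functions `build_utterance` and `find_contexts`, and the simplification that each letter of a word stands for one phone are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional, Tuple


@dataclass(eq=False)
class Unit:
    """Hypothetical node in a phonetic structure: a sentence, word, or phone."""
    kind: str                                    # "sentence", "word", or "phone"
    label: str                                   # orthography or phonetic symbol
    parent: Optional["Unit"] = field(default=None, repr=False)
    children: List["Unit"] = field(default_factory=list)

    def add(self, child: "Unit") -> "Unit":
        """Attach a child unit and record the explicit parent relation."""
        child.parent = self
        self.children.append(child)
        return child

    def walk(self) -> Iterator["Unit"]:
        """Depth-first traversal of this unit and everything below it."""
        yield self
        for child in self.children:
            yield from child.walk()


def build_utterance(words: List[str]) -> Unit:
    """Build sentence -> word -> phone, assuming one letter per phone."""
    sentence = Unit("sentence", " ".join(words))
    for w in words:
        word = sentence.add(Unit("word", w))
        for symbol in w:
            word.add(Unit("phone", symbol))
    return sentence


Predicate = Callable[[Unit], bool]
Context = Tuple[Optional[Unit], Unit, Optional[Unit]]


def find_contexts(root: Unit, target: Predicate,
                  left: Optional[Predicate] = None,
                  right: Optional[Predicate] = None) -> List[Context]:
    """Return (previous, matched, next) phone triples satisfying the predicates."""
    seq = [u for u in root.walk() if u.kind == "phone"]
    hits: List[Context] = []
    for i, phone in enumerate(seq):
        if not target(phone):
            continue
        prev = seq[i - 1] if i > 0 else None
        nxt = seq[i + 1] if i + 1 < len(seq) else None
        if left is not None and (prev is None or not left(prev)):
            continue
        if right is not None and (nxt is None or not right(nxt)):
            continue
        hits.append((prev, phone, nxt))
    return hits


# Example: every /a/ that is immediately followed by /l/.
utterance = build_utterance(["kala", "talo"])
hits = find_contexts(utterance,
                     target=lambda u: u.label == "a",
                     right=lambda u: u.label == "l")
for prev, phone, nxt in hits:
    print(phone.parent.label, prev.label, phone.label, nxt.label)
```

Because each phone keeps a link to its parent word, a match can be traced back to the word and sentence it came from, which is the kind of explicit relation the phonetic structure is meant to provide.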
The remainder of the paper presents some of the applications that have been developed on this platform, using Finnish and Estonian databases as the source speech material. These include speech synthesis [4,5], recognition [6], and speaker verification/identification [7].
References: