Around the world, there is a wide range of traditional data manually collected for different scientific purposes. A small portion of this data has been digitised, but much of it remains less usable due to a lack of rich semantic models that would enable humans and machines to understand, interpret and use it. This paper presents ongoing work to build a semantic model to enrich and publish traditional data collection questionnaires in particular, and the historical data collection of the Bavarian Dialects in Austria in general. The use of cultural and linguistic concepts identified in the questionnaire questions allows for cultural exploration of the non-standard data (answers) of the collection. The approach focuses on capturing the semantics of the questionnaire dataset using domain analysis and schema analysis. This involves analysing the overall data collection process (domain analysis) and analysing the various schemas used at different stages (schema analysis). By starting with modelling the data collection method, the focus is placed on the questionnaires as a gateway to understanding, interlinking and publishing the datasets. A semantic model that describes the semantic structure of the main entities, such as questionnaires, questions and answers, and their relationships is presented.
@InProceedings{ABGAZ18.4,
  author    = {Yalemisew Abgaz and Amelie Dorn and Barbara Piringer and Eveline Wandl-Vogt},
  title     = {A semantic Model for Traditional Data Collection Questionnaires enabling Cultural Analysis},
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year      = {2018},
  month     = {may},
  date      = {7-12},
  location  = {Miyazaki, Japan},
  editor    = {John P. McCrae and Christian Chiarcos and Thierry Declerck and Jorge Gracia and Bettina Klimek},
  publisher = {European Language Resources Association (ELRA)},
  address   = {Paris, France},
  isbn      = {979-10-95546-19-1},
  language  = {english}
}