Summary of the paper

Title Strategies and Challenges for Crowdsourcing Regional Dialect Perception Data for Swiss German and Swiss French
Authors Jean-Philippe Goldman, Simon Clematide, Mathieu Avanzi and Raphaël Tandler
Abstract Following the dynamics of several recent crowdsourcing projects with the aim of collecting linguistic data, this paper focuses on such a project in the field of Swiss German dialects and Swiss French accents. The main scientific goal of the data collected is to understand people’s perception of dialects and accents, and to provide a resource for future computational systems such as automatic dialect recognition. A gamified crowdsourcing platform was set up and launched for both main locales of Switzerland: “din dialäkt” (‘your dialect’) for Swiss German dialects and “ton accent” (‘your accent’) for Swiss French. The main activity for the participant is to localize preselected audio samples by clicking on a map of Switzerland. The media showed great interest in the two platforms, and many reports appeared in newspapers, on television and on radio, which increased the public’s awareness of the project and thus also the traffic on the page. At this point of the project, 7,500 registered users (besides 30,000 anonymous visitors) have provided 470,000 localizations. By connecting users’ results in this localization task to their socio-demographic information, a quantitative analysis of the localization data can reveal which factors play a role in their performance. Preliminary results showed that age and childhood residence influence how well dialects/accents are recognized. Nevertheless, quantity does not ensure quality when it comes to data. Crowdsourcing such linguistic data revealed traps to avoid, such as scammers or participants’ quick loss of motivation, which causes them to click randomly. Such obstacles need to be taken into account when assessing the reliability of the data, and they require a number of preliminary steps before the data can be analyzed.
Topics Crowdsourcing, Other
Bibtex @InProceedings{GOLDMAN18.920,
  author = {Jean-Philippe Goldman and Simon Clematide and Mathieu Avanzi and Raphaël Tandler},
  title = "{Strategies and Challenges for Crowdsourcing Regional Dialect Perception Data for Swiss German and Swiss French}",
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {May 7-12, 2018},
  address = {Miyazaki, Japan},
  editor = {Nicoletta Calzolari (Conference chair) and Khalid Choukri and Christopher Cieri and Thierry Declerck and Sara Goggi and Koiti Hasida and Hitoshi Isahara and Bente Maegaard and Joseph Mariani and Hélène Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis and Takenobu Tokunaga},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {979-10-95546-00-9},
  language = {english}
  }