Friday, June 24, 2022 |
| 14:00–15:00 Opening |
| Keynote talk |
| 15:00–16:00 Speech |
15:00–15:15 | Unsupervised Word Segmentation from Discrete Speech Units in Low-Resource Settings
Marcely Zanon Boito, Bolaji Yusuf, Lucas Ondel, Aline Villavicencio and Laurent Besacier |
15:15–15:30 | An Open Source Web Reader for Under-Resourced Languages
Judy Fong, Þorsteinn Daði Gunnarsson, Sunneva Þorsteinsdóttir, Gunnar Thor Örnólfsson and Jon Gudnason |
15:30–15:45 | Text-to-Speech for Under-Resourced Languages: Phoneme Mapping and Source Language Selection in Transfer Learning
Phat Do, Matt Coler, Jelske Dijkstra and Esther Klabbers |
15:45–16:00 | ReadAlong Studio: Practical Zero-Shot Text-Speech Alignment for Indigenous Language Audiobooks
Patrick Littell, Eric Joanis, Aidan Pine, Marc Tessier, David Huggins Daines and Delasie Torkornoo |
| 16:00–16:30 Coffee break |
| 16:30–17:45 Data |
16:30–16:45 | Corpus Creation for Sentiment Analysis in Code-Mixed Tulu Text
Asha Hegde, Mudoor Devadas Anusha, Sharal Coelho, Hosahalli Lakshmaiah Shashirekha and Bharathi Raja Chakravarthi |
16:45–17:00 | Crowd-sourcing for Less-resourced Languages: Lingua Libre for Polish
Mathilde Hutin and Marc Allassonnière-Tang |
17:00–17:15 | Tupían Language Ressources: Data, Tools, Analyses
Lorena Martín Rodríguez, Tatiana Merzhevich, Wellington Silva, Tiago Tresoldi, Carolina Aragon and Fabrício F. Gerardi |
17:15–17:30 | Quality versus Quantity: Building Catalan-English MT Resources
Ona de Gibert Bonet, Ksenia Kharitonova, Blanca Calvo Figueras, Jordi Armengol-Estapé and Maite Melero |
17:30–17:45 | A Sentiment Corpus for South African Under-Resourced Languages in a Multilingual Context
Ronny Mabokela and Tim Schlippe |
Saturday, June 25, 2022 |
| 9:00–10:00 MT4All |
| CUNI Submission to MT4All Shared Task
Ivana Kvapilíková and Ondrej Bojar |
| 10:00–10:30 General |
10:00–10:15 | Resource: Indicators on the Presence of Languages in Internet
Daniel Pimienta |
10:15–10:30 | Language Technologies for Low Resource Languages: Sociolinguistic and Multilingual Insights
A. Seza Doğruöz and Sunayana Sitaram |
| 10:30–11:00 Coffee break |
| 11:00–12:45 NLP |
11:00–11:15 | Sentiment Analysis for Hausa: Classifying Students’ Comments
Ochilbek Rakhmanov and Tim Schlippe |
11:15–11:30 | Nepali Encoder Transformers: An Analysis of Auto Encoding Transformer Language Models for Nepali Text Classification
Utsav Maskey, Manish Bhatta, Shiva Bhatt, Sanket Dhungel and Bal Krishna Bal |
11:30–11:45 | CoSwID, a Code Switching Identification Method Suitable for Under-Resourced Languages
Laurent Kevers |
11:45–12:00 | A Neural Network Approach to Create Minangkabau-Indonesia Bilingual Dictionary
Kartika Resiandi, Yohei Murakami and Arbi Haza Nasution |
12:00–12:15 | Machine Translation from Standard German to Alemannic Dialects
Louisa Lambrecht, Felix Schneider and Alexander Waibel |
12:15–12:30 | Question Answering Classification for Amharic Social Media Community Based Questions
Tadesse Destaw, Seid Muhie Yimam, Abinew Ayele and Chris Biemann |
12:30–12:45 | Automatic Detection of Morphological Processes in the Yorùbá Language
Tunde Adegbola |
| 12:45–14:00 Lunch break |
| 14:00–14:50 Joint SIGUL2022-MWE Poster session |
| Evaluating Unsupervised Approaches to Morphological Segmentation for Wolastoqey
Diego Bear and Paul Cook |
| Baseline English and Maltese-English Classification Models for Subjectivity Detection, Sentiment Analysis, Emotion Analysis, Sarcasm Detection, and Irony Detection
Keith Cortis and Brian Davis |
| Building Open-source Speech Technology for Low-resource Minority Languages with Sámi as an Example – Tools, Methods and Experiments
Katri Hiovain-Asikainen and Sjur Moshagen |
| Investigating the Quality of Static Anchor Embeddings from Transformers for Under-Resourced Languages
Pranaydeep Singh, Orphee De Clercq and Els Lefever |
| Introducing YakuToolkit. Yakut Treebank and Morphological Analyzer.
Tatiana Merzhevich and Fabrício Ferraz Gerardi |
| A Language Model for Spell Checking of Educational Texts in Kurdish (Sorani)
Roshna Abdulrahman and Hossein Hassani |
| SimRelUz: Similarity and Relatedness Scores as a Semantic Evaluation Dataset for Uzbek Language
Ulugbek Salaev, Elmurod Kuriyozov and Carlos Gómez-Rodríguez |
| ENRICH4ALL: A First Luxembourgish BERT Model for a Multilingual Chatbot
Dimitra Anastasiou |
| 14:50–15:40 Joint SIGUL2022-MWE Keynote speech |
| 15:40–16:00 Joint SIGUL2022-MWE Common Discussion |
| 16:00–16:30 Coffee break |
| 16:30–17:30 Panel discussion |
| 17:30–17:50 General discussion |
| 17:50–18:00 Closing |