Last LoResMT (2024) Program
Thursday, August 15, 2024 (GMT-07:00)
09:00 - 09:10 Opening remarks by Chao-Hong Liu and Atul Kr. Ojha (on behalf of Workshop Chairs)
09:10 - 10:05 Invited talk 1: Kevin Duh, Johns Hopkins University (USA)
Chair: John P. McCrae
Title - Hyperparameter Optimization for Low-Resource Machine Translation
Abstract: Neural Machine Translation models are full of hyperparameters. To obtain a good model, one must carefully experiment with hyperparameters such as the number of layers, the number of hidden nodes, the type of non-linearity, the learning rate, and the dropout parameter, to name just a few. He will discuss general hyperparameter optimization algorithms (including those based on evolutionary strategies, Bayesian techniques, and bandit learning) that can automate this laborious process. Further, he will argue that hyperparameter optimization is especially valuable for low-resource settings, where commonly used hyperparameters are often suboptimal and small data sizes afford larger search spaces.
Finally, he will discuss benchmarks and datasets for evaluating hyperparameter optimization algorithms in practice.
10:05 - 10:30 Session 1: Booster Presentations
Chair: Nathaniel Oco
10:05-10:07 KpopMT: Translation Dataset with Terminology for Kpop Fandom- JiWoo Kim, Yunsu Kim and JinYeong Bak
10:07-10:09 HeSum: a Novel Dataset for Abstractive Text Summarization in Hebrew- Itai Mondshine, Tzuf Paz-Argaman, Asaf Achi Mordechai and Reut Tsarfaty
10:09-10:11 Challenges in Urdu Machine Translation- Abdul Basit, Abdul Hameed Azeemi and Agha Ali Raza
10:11-10:13 Low-Resource Cross-Lingual Summarization through Few-Shot Learning with Large Language Models- Gyutae Park, Seojin Hwang and Hwanhee Lee
10:13-10:15 Enhancing Low-Resource NMT with a Multilingual Encoder and Knowledge Distillation: A Case Study- Aniruddha Roy, Pretam Ray, Ayush Maheshwari, Sudeshna Sarkar and Pawan Goyal
10:15-10:17 Rule-Based, Neural and LLM Back-Translation: Comparative Insights from a Variant of Ladin- Samuel Frontull and Georg Moser
10:17-10:19 AGE: Amharic, Ge’ez and English Parallel Dataset- Henok Biadglign Ademtew and Mikiyas Girma Birbo
10:19-10:21 Adopting Ensemble Learning for Cross-lingual Classification of Crisis-related Text On Social Media- Shareefa Ahmed Al Amer, Mark G. Lee and Phillip Smith
10:21-10:23 Rosetta Balcanica: Deriving a "Gold Standard" Neural Machine Translation (NMT) Parallel Dataset from High-Fidelity Resources for Western Balkan Languages- Edmon Begoli, Maria Mahbub and Sudarshan Srinivasan
10:23-10:25 Irish-based Large Language Model with Extreme Low-Resource Settings in Machine Translation- Khanh-Tung Tran, Barry O’Sullivan and Hoang D. Nguyen
10:35 - 11:00 COFFEE/TEA BREAK
11:00 - 12:30 Session 2: Scientific Research Papers
Chair: Chao-Hong Liu
11:00-11:18 Tuning LLMs with Contrastive Alignment Instructions for Machine Translation in Unseen, Low-resource Languages - Zhuoyuan Mao and Yen Yu
11:18-11:36 Linguistically Informed Transformers for Text to American Sign Language Translation - Abhishek Bharadwaj Varanasi, Manjira Sinha and Tirthankar Dasgupta
11:36-11:54 Leveraging Mandarin as a Pivot Language for Low-Resource Machine Translation from Cantonese to English - King Yiu Suen, Rudolf Chow and Albert Y.S. Lam
11:54-12:12 Enhancing Turkish Word Segmentation: A Focus on Borrowed Words and Invalid Morpheme - Soheila Behrooznia, Ebrahim Ansari and Zdenek Zabokrtsky
12:12-12:30 Super donors and super recipients: Studying cross-lingual transfer between high-resource and low-resource languages - Vitaly Protasov, Elisei Stakovskii, Ekaterina Voloshina, Tatiana Shavrina and Alexander Panchenko
12:30 - 14:00 LUNCH
14:00 - 15:00 Invited talk 2: Loïc Barrault, Meta AI
Chair: Ekaterina Vylomova
Title - TBD
Abstract: TBD
15:00 - 16:00 Session 3: Poster Session
Chair: Valentin Malykh
KpopMT: Translation Dataset with Terminology for Kpop Fandom- JiWoo Kim, Yunsu Kim and JinYeong Bak
HeSum: a Novel Dataset for Abstractive Text Summarization in Hebrew- Itai Mondshine, Tzuf Paz-Argaman, Asaf Achi Mordechai and Reut Tsarfaty
Challenges in Urdu Machine Translation- Abdul Basit, Abdul Hameed Azeemi and Agha Ali Raza
Low-Resource Cross-Lingual Summarization through Few-Shot Learning with Large Language Models- Gyutae Park, Seojin Hwang and Hwanhee Lee
Enhancing Low-Resource NMT with a Multilingual Encoder and Knowledge Distillation: A Case Study- Aniruddha Roy, Pretam Ray, Ayush Maheshwari, Sudeshna Sarkar and Pawan Goyal
Rule-Based, Neural and LLM Back-Translation: Comparative Insights from a Variant of Ladin- Samuel Frontull and Georg Moser
AGE: Amharic, Ge’ez and English Parallel Dataset- Henok Biadglign Ademtew and Mikiyas Girma Birbo
Adopting Ensemble Learning for Cross-lingual Classification of Crisis-related Text On Social Media- Shareefa Ahmed Al Amer, Mark G. Lee and Phillip Smith
Rosetta Balcanica: Deriving a "Gold Standard" Neural Machine Translation (NMT) Parallel Dataset from High-Fidelity Resources for Western Balkan Languages- Edmon Begoli, Maria Mahbub and Sudarshan Srinivasan
Irish-based Large Language Model with Extreme Low-Resource Settings in Machine Translation- Khanh-Tung Tran, Barry O’Sullivan and Hoang D. Nguyen
15:30 - 16:00 COFFEE/TEA BREAK
16:00 - 17:30 Session 4: Scientific Research Papers
Chair: Atul Kr. Ojha
16:00-16:18 Tokenisation in Machine Translation Does Matter: The impact of different tokenisation approaches for Maltese - Kurt Abela, Kurt Micallef, Marc Tanti and Claudia Borg
16:18-16:36 Machine Translation Through Cultural Texts: Can Verses and Prose Help Low-Resource Indigenous Models? - Antoine Cadotte, Nathalie André and Fatiha Sadat
16:36-16:54 Learning-From-Mistakes Prompting for Indigenous Language Translation - You Cheng Liao, Chen-Jui Yu, Chi-Yi Lin, He-Feng Yun, Yen-Hsiang Wang, Hsiao-Min Li and Yao-Chung Fan
16:54-17:12 Finetuning End-to-End Models for Estonian Conversational Spoken Language Translation - Tiia Sildam, Andra Velve and Tanel Alumäe
17:12-17:30 Benchmarking of Low-Resource Machine Translation Systems - Ana Alexandra Morim Da Silva, Nikit Srivastava, Tatiana Moteu Ngoli, Michael Röder, Diego Moussallem and Axel-Cyrille Ngonga Ngomo
17:30 - 17:40 Closing remarks by Atul Kr. Ojha and Chao-Hong Liu (on behalf of Workshop Chairs)