Paper Digest: Recent Papers on Machine Translation
The Paper Digest Team extracted all recent Machine Translation related papers on our radar and generated highlight sentences for them. The results are sorted by relevance and date. In addition to this ‘static’ page, we also provide a real-time version of this article, which has broader coverage and is continuously updated to include the most recent work on this topic.
Since 2018, Paper Digest has built a foundation of data spanning decades of conferences, journals, and research topics. The platform features a daily digest service that sifts through tens of thousands of new papers, clinical trials, news articles, and community posts, filtering the noise to highlight what matters most to specific interests. Beyond daily updates, dozens of built-in research tools streamline the academic workflow, supporting efficient reading and writing, comprehensive literature reviews, and automated research report generation.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: Recent Papers on Machine Translation
| # | Paper | Author(s) | Source | Date |
|---|---|---|---|---|
| 1 | A Hybrid MT5-Based Machine Translation System for Kanuri–English Educational Translation in A Low-Resource Setting Highlight: This paper presents a hybrid AI-based Kanuri–English machine translation system designed for primary school educational use. | Muhammad Usman Dallah; Mohammad Suaib; Jameel Ahmad; | International Journal of Latest Technology in Engineering … | 2026-06-15 |
| 2 | An Interpretable Machine Learning Framework for Classifying Human and Machine Translations Across Genres Highlight: We present an interpretable machine-learning framework that classifies Chinese-to-English human, Google-Translate, and ChatGPT outputs across news, novel, and technology genres using a dataset of 450 texts. | Lingxi Fan; Hongyang Du; Gan Huang; | Frontiers in Artificial Intelligence | 2026-06-15 |
| 3 | AI Tools in Translation: Bridging Literalism and Cultural Fluency in English-Arabic Idioms – A Comparative Analysis of ChatGPT and DeepSeek Highlight: The study employs two theoretical frameworks from translation studies (Nida’s dynamic equivalence as well as Vinay and Darbelnet’s translation strategies) alongside principles of neural machine translation (NMT). | Shaimaa Mohamed Helal; | World Journal of English Language | 2026-06-11 |
| 4 | PiDA: Phonetically-Informed Data Augmentation for Robust Vietnamese Speech Translation Highlight: We confirm that most ASR substitution errors arise from phonetic confusions rather than random noise, and that these phonetic errors significantly degrade ST quality. Motivated by this finding, we propose Phonetically-Informed Data Augmentation (PiDA), which generates ASR-like corruptions by substituting words with phonetically similar alternatives using phonetic word embeddings. | GIANG SON NGUYEN et al. | arxiv-cs.CL | 2026-06-11 |
| 5 | Bridging Past and Future: Emotive Lexicon in Qutty Bilik and Its Role in AI, NLP, and Cross-Cultural Communication Highlight: The findings demonstrate that while semantic meaning is largely preserved, the intensity and cultural resonance of emotions often shift in translation. Beyond its contribution to historical linguistics and translation studies, this research offers practical insights for sentiment analysis, machine translation, and intercultural communication, emphasizing the relevance of historical texts to contemporary language and AI research. | DANA OSPANOVA et al. | World Journal of English Language | 2026-06-11 |
| 6 | Back-Translation and Unsupervised Domain Adaptation for Machine Translation of Arabic Dialects Highlight: We develop and explore two methods to improve machine translation of low-resource Arabic dialects in a practical setting: back-translation to address the data scarcity problem, and unsupervised domain adaptation to utilize unlabeled data and leverage lexical similarities among different dialects. | Aya Salim; Willie Brink; | ACM Transactions on Asian and Low-Resource Language … | 2026-06-10 |
| 7 | From Tool Use to Pedagogical Integration: Mapping AI-Assisted Translation Learning Behaviors Among Vietnamese EFL Translation Students Highlight: Fourth-year students reported higher levels of AI use than third-year students, suggesting increased technology integration as students progress through translator training. Qualitative findings further revealed a developmental shift from AI as a translation tool toward AI as a learning partner. Based on these findings, the study proposes the AI Translation Learning Behavior Framework (AITLBF), which conceptualizes the progression from AI awareness and adoption to critical post-editing competence and human-AI collaborative translation. | Nguyen Thi Viet Phuong; | Language, Education and Culture Research | 2026-06-09 |
| 8 | Development of A Machine Translation System Using Federated Learning Architecture for Low Resource Indian Languages Highlight: This paper presents work on federated learning (FL) based machine translation considering three different language pairs, i.e., English to Hindi, English to Assamese, and English to Bodo. | Parvez Aziz Boruah; Hiren Kumar Deva Sarma; Shikhar Kumar Sarma; | Discover Artificial Intelligence | 2026-06-07 |
| 9 | Assessing Neural Machine Translation in Speech: Problems and Solutions in AI-Powered Translations Highlight: The theoretical novelty of this study lies in integrating Nord’s text-typological translation problem framework with Molina and Albir’s translation technique model to evaluate NMT output, an approach rarely applied to spoken oratory texts. Empirically, the study provides a fine-grained error analysis of a ChatGPT-powered NMT system on formal political speech, quantifying problem types and mapping them to specific post-editing strategies. | Ahmad Zaki Munibi; | Journal of Computing Innovations and Emerging Technologies | 2026-06-05 |
| 10 | A Comparative Study on The English Translation of The Sight of Father’s Back from The Perspective of Human-Machine Translation Highlight: This paper takes Zhang Peiji’s human translation (HT), alongside machine translations produced by DeepL (NMT) and DeepSeek (LLM), as research objects. | Jinjing Han; | Literature Language and Cultural Studies | 2026-06-04 |
| 11 | Better Literary Translation: A Multi-Aspect Data Generation and LLM Training Approach Highlight: We present a multi-aspect iterative refinement framework that generates high-quality translation references and preference data through specialized LLM translators, each targeting a distinct quality dimension. | Zhihao Lin; Ziqi Zhu; Hao Huang; Guanghui Wang; Peiyang He; | arxiv-cs.CL | 2026-06-04 |
| 12 | English-to-Prakrit Machine Translation Via Multilingual Transfer Learning Highlight: We adapt the multilingual model by mapping Prakrit to the Hindi language tag (hin_Deva) without modifying the tokenizer, vocabulary, or architecture. | Om Choksi; Smit Kareliya; Shrikant Malviya; Pruthwik Mishra; | arxiv-cs.CL | 2026-06-04 |
| 13 | Multilingual Coreference Resolution Via Cycle-Consistent Machine Translation Highlight: While the task is well-studied in English, comparatively less attention is dedicated to coreference resolution in other languages, especially low-resource ones. To mitigate this gap, we propose a novel coreference resolution pipeline that harnesses machine translation (MT) from English to a target low-resource language, to generate or expand training data. | Adriana-Valentina Costache; Eduard Poesina; Silviu-Florin Gheorghe; Paul Irofti; Radu Tudor Ionescu; | arxiv-cs.CL | 2026-06-03 |
| 14 | ComplexityMT: Benchmarking The Interaction Between Text Complexity and Machine Translation Highlight: We introduce ComplexityMT, a new challenge for assessing how text complexity and machine translation interact with and influence each other, using the Common European Framework of Reference for Languages (CEFR) levels as the measure of text complexity. | JOSEPH MARVIN IMPERIAL et al. | arxiv-cs.CL | 2026-06-03 |
| 15 | Research on The Optimization of English Neural Machine Translation System That Combines Hierarchical Attention and Dynamic Vocabulary Generation | Huanping Zhu; Ya Dong; Mengting Zhao; | Discover Artificial Intelligence | 2026-06-02 |
| 16 | AlignAtt4LLM: Fast AlignAtt for Decoder-Only LLMs at IWSLT 2026 Simultaneous Speech Translation Task Highlight: We describe AlignAtt4LLM, an IWSLT 2026 simultaneous speech translation system for English to German, Italian, and Chinese. | Quentin Fuxa; Dominik Macháček; | arxiv-cs.CL | 2026-06-02 |
| 17 | Context-Aware Neural Machine Translation For English–Hindi Sentences Highlight: This paper proposes a Context-Aware Neural Machine Translation (CANMT) framework for English–Hindi translation that integrates inter-sentential and intra-sentential context to enhance translation quality. | Anshul Pandey; Suman Kumar Mishra; Digesh Pandey; Shan-e-Fatima; | International Journal of Drug Delivery Technology | 2026-06-02 |
| 18 | Evaluative Voice and Interpersonal Shifts in ChatGPT-Translated Feedback for EFL Writing Highlight: This study investigates how ChatGPT’s machine translation from English to Arabic affects the realization of interpersonal meaning and learner engagement in academic feedback for EFL learners. | Asma Alshehri; | Theory and Practice in Language Studies | 2026-06-01 |
| 19 | From Binary to Bilingual: How The National Weather Service Is Using Artificial Intelligence to Develop A Comprehensive Translation Program Highlight: This article outlines the foundation of an automated translation tool for NWS products, powered by artificial intelligence. | JOSEPH E. TRUJILLO-FALCÓN et al. | Artificial Intelligence for the Earth Systems | 2026-06-01 |
| 20 | “Intelegi Româneşte?” A Recipe for Romanian Vision-Language Models Highlight: We present a systematic study of building a language-specific VLM for Romanian, covering the full pipeline from data construction to architectural choices. | Mihai Masala; Marius Leordeanu; Mihai Dascalu; Traian Rebedea; | arxiv-cs.CL | 2026-05-29 |
| 21 | Comparative Evaluation of Machine Translation Systems on Images with Text Highlight: This work presents a comparative evaluation of machine translation systems applied to images containing textual information, a task that lies at the intersection of computer vision and natural language processing. | Blai Puchol; Sergio Gómez González; Miguel Domingo; Francisco Casacuberta; | arxiv-cs.CL | 2026-05-28 |
| 22 | LEXICAL CREATIVITY IN ENGLISH LANGUAGE CHICK LIT AND MACHINE TRANSLATION Highlight: The article examines lexical creativity in English language chick lit, focusing on neologisms formed through deliberate deviations from conventional word-formation rules. | Maryna BIELOVA; | Germanic Philology Journal of Yuriy Fedkovych Chernivtsi … | 2026-05-27 |
| 23 | HardMTBench: Stress-Testing Chinese-English Translation on Knowledge-Intensive Domains Highlight: We introduce HardMTBench, a difficulty-aware diagnostic benchmark for bidirectional Chinese-English domain translation. | Zheng Li; Mao Zheng; Mingyang Song; Tianxiang Fei; | arxiv-cs.CL | 2026-05-27 |
| 24 | ChatGPT Vs. DeepL Vs. Google Translate: A Human Evaluation of Multiword Expressions’ Machine Translation Quality Highlight: Multiword expressions (MWEs) remain a persistent challenge in neural machine translation (NMT), particularly when they appear in discontinuous forms. In this context, the present study evaluates the ability of general-purpose large language models (LLMs) to address these limitations by systematically comparing the performance of Google Translate, DeepL, and GPT-4o in translating Spanish-to-English MWEs. | Carlos Manuel Hidalgo-Ternero; Vicent Briva-Iglesias; | MonTi Monografías de Traducción e Interpretación | 2026-05-27 |
| 25 | How Close Are LLMs to Human Judgment in Assessing Machine Translation Quality? Highlight: This paper examines the potential of large language models (LLMs) in the automatic evaluation of AI-generated text. | Zhenhai Li; Xindi Hao; Shuyin Zhang; | MonTi Monografías de Traducción e Interpretación | 2026-05-27 |
| 26 | How to Evaluate Speech Translation with Source-Aware Neural MT Metrics Highlight: In this work, we conduct the first systematic study of source-aware metrics for ST, with a particular focus on real-world operating conditions where source transcripts are not available. | Mauro Cettolo; Marco Gaido; Matteo Negri; Sara Papi; Luisa Bentivogli; | Computational Linguistics | 2026-05-26 |
| 27 | INTERTEXTUALITY AS MEANS OF ENSURING THE ADEQUACY OF TRANSLATING SCIENTIFIC TEXTS IN CONDITIONS OF AI USAGE (BASED ON ENGLISH AND GERMAN) Highlight: The article examines the role of intertextuality as a fundamental tool for ensuring translation adequacy in scientific and specialized texts. | Uliana BARAN; Dmytro POBEREZHNYI; | Germanic Philology Journal of Yuriy Fedkovych Chernivtsi … | 2026-05-26 |
| 28 | Beyond English: Evaluating Automated Measurement of Moral Foundations in Non-English Discourse with A Chinese Case Study Highlight: This study explores computational approaches for measuring moral foundations (MFs) in non-English corpora. | Calvin Yixiang Cheng; Scott A. Hale; | Proceedings of the International AAAI Conference on Web and … | 2026-05-25 |
| 29 | Discovering Lexical Gaps Using Embeddings from Multilingual LLMs Highlight: We propose a data-driven framework for identifying cross-lingual lexical gaps. | Yoonwon Jung; Aaron S. Cohen; Benjamin K. Bergen; | arxiv-cs.CL | 2026-05-22 |
| 30 | Exploring The Effects of Data Volume and Transfer-Language Choice on Transfer Learning with Application to Polish Highlight: Transfer learning offers a practical way to improve neural machine translation in low-resource settings, but its effectiveness depends on both the choice of transfer language and the amount of target-language data available for adaptation. In this study, we examine these factors specifically for Polish–English translation using mBART. | Juuso Eronen; Zhenzhen Liu; Michal Ptaszynski; Karol Nowakowski; Fumito Masui; | Electronics | 2026-05-22 |
| 31 | Moral Semantics Survive Machine Translation: Cross-Lingual Evidence from Moral Foundations Corpora Highlight: Using $\sim$50k morally-annotated social media posts from a diverse range of topics, we apply a principled four-method validation pipeline: LaBSE cross-lingual embedding similarity, Centered Kernel Alignment (CKA), LLM-as-judge evaluation, and deep learning classifier parity tests. | Maciej Skorski; | arxiv-cs.CL | 2026-05-21 |
| 32 | Enhancing Scientific Discourse: Machine Translation for The Scientific Domain Highlight: In this paper, we present the development of a collection of parallel and monolingual corpora for the scientific domain. | Dimitris Roussis; Sokratis Sofianopoulos; Stelios Piperidis; | arxiv-cs.CL | 2026-05-20 |
| 33 | AI-assisted Post-editing and The Development of Critical Technology Awareness: A Study of Chinese-Japanese Political Text Translation Pedagogy Highlight: The paper argues that post-editing pedagogy in the AI era should move beyond tool operation and be designed as a competence loop linking technology use, error diagnosis, discourse reconstruction, and critical reflection. | Qiushi Gu; Shiyan Wang; Xinyu Ji; Xinyao Ren; | Journal of Literature & Language | 2026-05-20 |
| 34 | Efficient English-Vietnamese Medical Machine Translation: Insights from The VLSP 2025 Shared Task Highlight: In this paper, we present a comprehensive overview of the VLSP 2025 Medical Machine Translation Shared Task, which focuses on English-Vietnamese translation in the medical domain using small language models (SLMs). | Tran Hong Viet; Tran Duy Long; Nguyen Minh Quy; Nguyen Van Vinh; | VNU Journal of Science: Computer Science and Communication … | 2026-05-19 |
| 35 | Direct Translation Between Sign Languages Highlight: To enable direct translation between sign language utterances, we use back-translation to produce synthetic sign-sign pairs from unaligned individual language utterance-sign corpora. Using this data, we jointly train a single MBART-based model for both text->sign (T2S) and sign->sign (S2S). | ZETIAN WU et al. | arxiv-cs.CL | 2026-05-19 |
| 36 | Analysis of Efficient English to Hindi Machine Translation Technique | Chetan Agarwal; Kamlesh Dutta; Pardeep Singh; | International Journal of Intelligent Engineering Informatics | 2026-05-17 |
| 37 | PaliBench: A Multi-Reference Blueprint for Classical Language Translation Benchmarks Highlight: This article introduces PaliBench, both a benchmark for Pali-to-English translation and a reusable method for constructing multi-reference translation benchmarks for classical languages. | Máté Metzger; Nadnapang Phophichit; | arxiv-cs.CL | 2026-05-16 |
| 38 | AI-Based Multilingual Video Summarization System with Text-to-Braille Conversion for Visually Impaired Users Highlight: This paper introduces an AI-based multilingual video summarization system with text-to-Braille conversion designed exclusively for visually impaired users. | Swayam Margudri; | International Journal for Research in Applied Science and … | 2026-05-16 |
| 39 | ForMaT: Dataset for Visually-Grounded Multilingual PDF Translation Highlight: We present ForMaT (Format-Preserving Multilingual Translation), a parallel corpus of 3,956 PDFs across 15 language pairs that preserves original layout metadata proposed for multimodal machine translation. | Michał Ciesiółka; Dawid Wiśniewski; Adrian Charkiewicz; Kamil Guttmann; | arxiv-cs.CL | 2026-05-15 |
| 40 | A Corpus-Based Evaluation of Human and Machine Translation Quality for The Literary Classic The Analects Highlight: Based on these findings, this study points out that both human and AI translations have their own strengths and limitations. Therefore, this paper proposes strategies to optimize the human-computer collaborative translation model in order to enhance the application efficiency of artificial intelligence in the translation of classical works. | Yanling Yuan; Jiaqi Hu; Yimei Xiao; | Social Sciences and Humanities | 2026-05-14 |
| 41 | ATD-Trans: A Geographically Grounded Japanese-English Travelogue Translation Dataset Highlight: To facilitate in-depth analysis for geographic text, we introduce ATD-Trans, a geographically grounded Japanese–English travelogue translation dataset, which enables evaluation of MT quality at both the overall and geo-entity levels across domestic (within Japan) and overseas regions. | Shohei Higashiyama; Hiroki Ouchi; Atsushi Fujita; Masao Utiyama; | arxiv-cs.CL | 2026-05-12 |
| 42 | A Quantitative Evaluation of Neural Machine Translation Systems for English-Arabic Customs Legal Texts Highlight: This study addresses the central question of how different neural machine translation paradigms perform under the extreme syntactic and terminological constraints of customs law. | Karam Damseh; Mozhgan Ghassemiazghandi; | Arab World English Journal For Translation and Literary … | 2026-05-10 |
| 43 | Why Do Large Language Models Fail in Low-resource Translation? Unraveling The Token Dynamics of Large Language Models for Machine Translation Highlight: In this work, we systematically analyze failure modes of LLMs in MT by evaluating 15 models, including four reasoning LLMs, across 22 language pairs (LPs) with varying resource levels. | Shenbin Qian; Yves Scherrer; | arxiv-cs.CL | 2026-05-08 |
| 44 | A Lightweight Chinese–English Translation Model Integrating Compressed BERT Attention and Phrase Discard Mechanism Highlight: In this work, we present an enhanced approach to neural machine translation that integrates multiple techniques to improve translation quality. | Xin Qi; Hang Bao; | Frontiers in Neurorobotics | 2026-05-08 |
| 45 | An Elicitation-Matrix Approach to Pragmatic Context Modeling in Low-Resource Machine Translation Highlight: Pragmatic ambiguity poses a major challenge for machine translation in low-resource languages like Akan, where a single English phrase may represent multiple pragmatic contexts and vice versa. To address this gap, we develop an elicitation matrix capturing key social and situational factors and use it to create a pragmatics-focused Akan–English dataset of 863 annotated pairs. | KWEKU ANDOH YAMOAH et al. | The International FLAIRS Conference Proceedings | 2026-05-06 |
| 46 | Modal Markers, Aspect and Light Verb Constructions in Literary Texts As Testing Ground For the Machine Translationese Hypothesis Highlight: The study draws on four components of the COVALT corpus. | Josep Marco; | Across Languages and Cultures | 2026-05-05 |
| 47 | Students’ Perceptions of Using Google Translate in Learning English: A Mixed-Method Study Highlight: This study aims to investigate undergraduate students’ perceptions of using Google Translate in learning English within academic contexts. | Desyca Claudia; Emma Martina Pakpahan; F. Ari Anggraini Sebayang; | Jurnal Impresi Indonesia | 2026-05-05 |
| 48 | Translanguaging with AI-powered Tools and English for Specific Academic Purposes: Chinese International Students’ Experiences in Anglophone English Medium Instruction Higher Education Highlight: The study highlights the interconnectedness of translanguaging, AI-powered tools, and English for specific academic purposes within the context of Anglophone English-medium instruction higher education and provides practical and theoretical implications for researchers, practitioners, and course administrators in similar contexts. | Pimsiri Taylor; Andrew Gunn; | Journal of English as a Lingua Franca | 2026-05-05 |
| 49 | Simultaneous Speech-to-Speech Translation Without Aligned Data Highlight: We instead propose Hibiki-Zero, a model for simultaneous speech translation trained without word-level alignments between source and target speech. | Tom Labiausse; Romain Fabre; Yannick Estève; Alexandre Défossez; Neil Zeghidour; | icml | 2026-05-05 |
| 50 | Nsanku: Evaluating Zero-Shot Translation Performance of LLMs for Ghanaian Languages Highlight: This paper presents Nsanku, a systematic benchmark that evaluates the zero-shot machine translation performance of 19 open-weight and proprietary LLMs across 43 Ghanaian languages paired with English. | STEPHEN E. MOORE et al. | arxiv-cs.CL | 2026-05-05 |
| 51 | ATOM: Adaptive Token-Level Optimal Transport Mixup for Speech Translation Highlight: This work presents ATOM, which combines Optimal Transport with token-level contrastive learning to enhance cross-modal alignment, and an adaptive modality mixup strategy. | J. Wang; Y. Zhao; Y. Zhang; H. Li; | icassp | 2026-05-04 |
| 52 | Is Textual Similarity Invariant Under Machine Translation? Evidence Based on The Political Manifesto Corpus Highlight: We investigate the extent to which cosine similarity between paragraph embeddings is invariant under machine translation, using the Manifesto Corpus of over 2,800 political party platforms in 28 languages translated to English via the EU eTranslation service. | DARIA BORATYN et al. | arxiv-cs.CL | 2026-05-01 |
| 53 | The Effects of Machine Translation (MT) Use on The Writing Quality and Perceptions of L2 Writers Across Proficiency Levels: A Longitudinal Examination Highlight: The advanced learners and the low-intermediate learners, who represented the lowest-level learners in this study, improved more in terms of lexical complexity and syntactic complexity, respectively. | Dae-Min Kang; | Language Teaching Research | 2026-04-30 |
| 54 | Prompting Strategies Enhance GPT-5’s Chinese-English Legal Translation Quality: Versus DeepL Highlight: On the whole, this study promotes the development of legal machine translation, reveals the different effects of prompt engineering, and shows that structured prompts can improve lexical richness, while the two systems have their own advantages. Based on this, an evidence-based prompt optimization framework is proposed. | Weiyue Feng; | English Language Teaching and Linguistics Studies | 2026-04-30 |
| 55 | Zero-Shot to Full-Resource: Cross-lingual Transfer Strategies for Aspect-Based Sentiment Analysis Highlight: This work presents a multilingual evaluation of state-of-the-art ABSA approaches across seven languages (English, German, French, Dutch, Russian, Spanish, and Czech) and four subtasks (ACD, ACSA, TASD, ASQP). | Jakob Fehle; Nils Constantin Hellwig; Udo Kruschwitz; Christian Wolff; | arxiv-cs.CL | 2026-04-29 |
| 56 | Backtranslation Augmented Direct Preference Optimization for Neural Machine Translation Highlight: We introduce a novel framework that requires only a general text corpus and an expert translator which can be either human or an AI system to provide iterative feedback. | Mehrdad Ghassabi; Spehr Rajabi; Hamidreza Baradaran Kashani; Sadra Hakim; Mahshid Keivandarian; | arxiv-cs.CL | 2026-04-28 |
| 57 | External Knowledge-Guided Tuning for Critical Error Detection in Machine Translation Highlight: In this paper, we introduce a simple yet effective approach, termed external knowledge-guided tuning, for the CED task. | Sugyeong Eo; Chanjun Park; | Mathematics | 2026-04-28 |
| 58 | Research on E-C Translation of Sci-Tech Texts from The Perspective of Translation Shifts Theory: A Case Study of Agentic Artificial Intelligence: Harnessing AI Agents to Reinvent Business, Work, and Life Highlight: From the perspective of Catford’s translation shifts theory, this study conducts a case analysis of the English-Chinese translation practice of Agentic Artificial Intelligence: Harnessing AI Agents to Reinvent Business, Work, and Life. | Sirui Xiang; Guangfang Huang; | Journal of Education and Culture Studies | 2026-04-27 |
| 59 | GAIA-v2-LILT: Multilingual Adaptation of Agent Benchmark Beyond Translation Highlight: We propose a refined workflow for adapting English benchmarks into multiple languages with explicit functional alignment, cultural alignment, and difficulty calibration using both automated checks and human review. | Yunsu Kim; Kaden Uhlig; Joern Wuebker; | arxiv-cs.CL | 2026-04-27 |
| 60 | Adapting Knowledge Graph Embedding for Neural Machine Translation Highlight: This paper explores the integration of Knowledge Graphs (KGs) into NMT models to enhance the translation of rare and out-of-vocabulary (OOV) words. | Nha Tran; Tri Le; Nam Nguyen; Long Nguyen; | Vietnam Journal of Science and Technology | 2026-04-25 |
| 61 | Enhancing Accessibility of Government Notices Through LLM-Based Multilingual Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study introduces Language-Specific Fine-Tuning with Low-rank adaptation (LSFTL), a method that enhances translation for low-resource languages by optimizing the multi-head attention and feed-forward networks of Transformer layers through low-rank matrix adaptation. |
Saurabh R. Patil; Shiva D. Hinge; Shruti S. Ujjainkar; Rohit V. Talele; Dr. Karande M. U.; | International Scientific Journal of Engineering and … | 2026-04-22 |
| 62 | MMTIT-Bench: A Multilingual and Multi-Scenario Benchmark with Cognition–Perception–Reasoning Guided Text-Image Machine Translation Highlight: We present MMTIT-Bench, a human-verified multilingual and multi-scenario benchmark with 1,400 images spanning fourteen non-English and non-Chinese languages and diverse settings such as documents, scenes, and web images, enabling rigorous assessment of end-to-end TIMT. |
GENGLUO LI et. al. | cvpr | 2026-04-21 |
| 63 | Exploring Language-Agnosticity in Function Vectors: A Case Study in Machine Translation Highlight: We study whether FVs exhibit language-agnosticity, using machine translation as a case study. |
Nurkhan Laiyk; Gerard I. Gállego; Javier Ferrando; Fajri Koto; | arxiv-cs.CL | 2026-04-21 |
| 64 | Syntax As A Rosetta Stone: Universal Dependencies for In-Context Coptic Translation Highlight: This paper proposes a novel in-context learning approach to support low-resource machine translation of the Coptic language to English, with syntactic augmentation from Universal Dependencies parses of input sentences. |
Abhishek Purushothama; Emma Thronson; Alexia Guo; Amir Zeldes; | arxiv-cs.CL | 2026-04-20 |
| 65 | Beyond Reproduction: A Paired-Task Framework for Assessing LLM Comprehension and Creativity in Literary Translation Highlight: Yet translational creativity remains underexplored and is rarely evaluated at scale, while source-text comprehension is typically studied in isolation, despite the fact that, in professional translation, comprehension and creativity are tightly intertwined. We address these gaps with a paired-task framework applied to literary excerpts from 11 books. |
RAN ZHANG et. al. | arxiv-cs.CL | 2026-04-20 |
| 66 | Optimizing Korean-Centric LLMs Via Token Pruning Highlight: This paper presents a systematic benchmark of state-of-the-art multilingual large language models (LLMs) adapted via token pruning – a compression technique that eliminates tokens and embedding parameters corresponding to languages irrelevant to the target application. |
Hoyeol Kim; Hyeonwoo Kim; | arxiv-cs.CL | 2026-04-17 |
| 67 | Improving Bidirectional English–Assamese Neural Machine Translation with Word Embeddings and Transformer Models |
Basab Nath; Krishna Kant Pandey; Vinod Yadav; Prabhat Dixit; | International Journal of Networked and Distributed Computing | 2026-04-16 |
| 68 | Should We Be Pedantic About Reasoning Errors in Machine Translation? Highlight: Across multiple language pairings (English → {Spanish, French, German, Mandarin, Japanese, Urdu, Cantonese}), we find reasoning errors in translation. |
Calvin Bao; Marine Carpuat; | arxiv-cs.CL | 2026-04-10 |
| 69 | UNIVERSAL COMMUNICATION BRIDGE FOR MULTI MODEL REAL TIME TRANSLATION USING AI AND COMPUTER VISION Highlight: This paper proposes a Universal Communication Bridge for Multi-Modal Real-Time Translation, integrating Artificial Intelligence and Computer Vision to support communication across speech, text, and sign language. |
HANIFYASAR N et. al. | International Scientific Journal of Engineering and … | 2026-04-09 |
| 70 | Automated Multilingual Translation of Documents and Question Papers Using Generative AI Techniques Highlight: This paper proposes an automated multilingual document translation system using Generative AI and Neural Machine Translation (NMT) approaches for the efficient translation of examination question papers in various languages. |
BALA VAMSI KRISHNA P; SAI KISHAN VARMA R; RAHUL REDDY P; SHAHEEN S; S YUGANDHAR KUMAR; | International Scientific Journal of Engineering and … | 2026-04-09 |
| 71 | XR-CareerAssist: An Immersive Platform for Personalised Career Guidance Leveraging Extended Reality and Multimodal AI Highlight: We introduce XR-CareerAssist, a platform that unifies Extended Reality (XR) with several Artificial Intelligence (AI) modules to deliver immersive, multilingual career guidance. |
N. D. Tantaroudas; A. J. McCracken; I. Karachalios; E. Papatheou; V. Pastrikakis; | arxiv-cs.CE | 2026-04-08 |
| 72 | MERIT: Multilingual Expert-Reward Informed Tuning for Chinese-Centric Low-Resource Machine Translation Highlight: We introduce Multilingual Expert-Reward Informed Tuning (MERIT), a unified translation framework that transforms the traditional English-centric ALT benchmark into a Chinese-centric evaluation suite for five Southeast Asian low-resource languages (LRLs). |
ZHIXIANG LU et. al. | arxiv-cs.CL | 2026-04-06 |
| 73 | AI-Based Multilingual Machine Translator from English to GorBoli Highlight: This paper presents an AI-based multilingual machine translation system designed to translate English text into GorBoli using a hybrid approach. |
Poojitha Chiradi; Poojitha Kavati; Pinki Banot; Sarika Rao P; | International Scientific Journal of Engineering and … | 2026-04-03 |
| 74 | Understanding Google Translate’s Limitations in English – Chinese Translation: A Linguistic Error Analysis Approach Highlight: This study examines the quality of GT by conducting an error analysis of Chinese-to-English and English-to-Chinese translations across texts of varying genres and lengths. |
R. D. A. E. Jayasundara; R. P. Yaddehige; | International Journal of Contemporary Business Research | 2026-03-31 |
| 75 | Open Machine Translation for Esperanto Highlight: In this work, we present the first comprehensive evaluation of open-source MT systems for Esperanto, comparing rule-based systems, encoder-decoder models, and LLMs across model sizes. |
Ona de Gibert; Lluís de Gibert; | arxiv-cs.CL | 2026-03-31 |
| 76 | A Transformer-Based Method for Bidirectional French–Lingala Machine Translation in Speech and Text Highlight: In this paper, we propose a deep neural network pipeline for bidirectional French–Lingala automatic translation, covering both text-to-text and voice-to-text scenarios, by integrating Long Short-Term Memory (LSTM) and Transformer models on a specialized parallel corpus. |
REAGAN E. MANDIYA et. al. | Applied Sciences | 2026-03-31 |
| 77 | CROSS-LINGUAL TRANSFORMER-BASED SCREENING OF POST-TRAUMATIC STRESS DISORDER BASED ON COMPARATIVE ANALYSIS OF BERT AND XLM-ROBERTA WITH MACHINE TRANSLATION ADAPTATION FOR UKRAINIAN LANGUAGE Highlight: The paper presents a comprehensive study on automated screening of post-traumatic stress disorder (PTSD) using transformer-based natural language processing models in a cross-lingual setting. |
Andrii FEDORYCHKO; Victoria VYSOTSKA; Lyubomyr CHYRUN; | Computer systems and information technologies | 2026-03-26 |
| 78 | MMTIT-Bench: A Multilingual and Multi-Scenario Benchmark with Cognition-Perception-Reasoning Guided Text-Image Machine Translation Highlight: We present MMTIT-Bench, a human-verified multilingual and multi-scenario benchmark with 1,400 images spanning fourteen non-English and non-Chinese languages and diverse settings such as documents, scenes, and web images, enabling rigorous assessment of end-to-end TIMT. |
GENGLUO LI et. al. | arxiv-cs.CV | 2026-03-24 |
| 79 | ConGA: Guidelines for Contextual Gender Annotation. A Framework for Annotating Gender in Machine Translation Highlight: This asymmetry often leads MT systems to default to masculine forms, reinforcing bias and reducing translation accuracy. To address this issue, we present the Contextual Gender Annotation (ConGA) framework, a linguistically grounded set of guidelines for word-level gender annotation. |
Argentina Anna Rescigno; Eva Vanmassenhove; Johanna Monti; | arxiv-cs.CL | 2026-03-18 |
| 80 | Zero-shot English–Assamese Neural Machine Translation Via Pivot-based Cross-lingual Embedding Alignment and Transfer Learning |
Basab Nath; Yonis Gulzar; | Scientific Reports | 2026-03-17 |
| 81 | Ensemble Self-Training for Unsupervised Machine Translation Highlight: We present an ensemble-driven self-training framework for unsupervised neural machine translation (UNMT). |
Ido Aharon; Jonathan Shaki; Sarit Kraus; | arxiv-cs.CL | 2026-03-17 |
| 82 | A Reinforcement Generative Adversarial Network for Multilingual Neural Machine Translation Highlight: This limitation restricts their capacity to fully comprehend contextual semantics and often results in translation imbalances. To address these challenges, this paper proposes a reinforcement generative adversarial network for multilingual neural machine translation (RGAN). |
Yiou Huang; Hui Liu; | Journal of Artificial Intelligence Research | 2026-03-17 |
| 83 | Bidirectional Chinese and English Passive Sentences Dataset for Machine Translation Highlight: Regarding English-Chinese language pairs, passive sentences are constructed and distributed differently due to language variation, thus need special attention in MT. This paper proposes a bidirectional multi-domain dataset of passive sentences, extracted from five Chinese-English parallel corpora and annotated automatically with structure labels according to human translation, and a test set with manually verified annotation. |
Xinyue Ma; Pol Pastells; Mireia Farrús; Mariona Taulé; | arxiv-cs.CL | 2026-03-16 |
| 84 | Machine Translation in The Wild: User Reaction to Xiaohongshu’s Built-In Translation Feature Highlight: Drawing on a dataset of 6,723 comments collected from 11 official posts promoting the translation function, this paper combines sentiment analysis with thematic analysis to investigate how users perceived and experimented with the function. |
Sui He; | arxiv-cs.HC | 2026-03-16 |
| 85 | Developing An English-Efik Corpus and Machine Translation System for Digitization Inclusion Highlight: This study evaluates the effectiveness of state-of-the-art multilingual neural machine translation models for English-Efik translation, leveraging a small-scale, community-curated parallel corpus of 13,865 sentence pairs. |
Offiong Bassey Edet; Mbuotidem Sunday Awak; Emmanuel Oyo-Ita; Benjamin Okon Nyong; Ita Etim Bassey; | arxiv-cs.CL | 2026-03-16 |
| 86 | Towards Privacy-Preserving Machine Translation at The Inference Stage: A New Task and Benchmark Highlight: The absence of these elements has seriously constrained researchers’ in-depth exploration of this direction. To bridge this gap, this paper proposes a novel Privacy-Preserving Machine Translation (PPMT) task, aiming to protect the private information in text during the model inference stage. |
WEI SHAO et. al. | arxiv-cs.CL | 2026-03-15 |
| 87 | Human Translation Versus Post-Editing of Machine-Translated Figurative Expressions: A Pilot Study Highlight: This research paper presents a pilot study of temporal effort and translation quality in human translation (HT) and machine translation post-editing (MTPE) by trainee translators. |
Ogareet Khoury; | World Journal of English Language | 2026-03-13 |
| 88 | Distinguishing ChatGPT‐Generated Translation From Neural Machine Translation and Human Translation: A Linguistic and Stylistic Approach Highlight: The widespread adoption of large language models (LLMs) for translation tasks necessitates a deeper understanding of the stylistic characteristics of their generated translations, an area that remains largely underexplored. To address this gap, this study examines whether a distinct LLM‐translationese emerges in diplomatic translation and identifies the linguistic features that differentiate LLM‐based translation from neural machine translation (NMT) and human translation (HT). |
Zhaokun Jiang; Qianxi Lv; Ziyin Zhang; Lei Lei; | International Journal of Applied Linguistics | 2026-03-13 |
| 89 | Translation Approaches to Support Systemic Anti-cancer Therapy Consent for Individuals with Limited English Proficiency Highlight: This study aimed to (a) compare professional and machine translation of written SACT information and (b) investigate whether providing a bilingual SACT consent form altered comprehension of key information during an interpreted SACT consent consultation. This randomised study included healthy, Bengali- or Sylheti-speaking adults with LEP across London, UK. |
STEPHEN P. HIBBS et. al. | Supportive Care in Cancer | 2026-03-13 |
| 90 | Translationese As A Rational Response to Translation Task Difficulty Highlight: We propose that translationese reflects cognitive load inherent in the translation task itself. |
Maria Kunilovskaya; | arxiv-cs.CL | 2026-03-12 |
| 91 | Semi-Synthetic Parallel Data for Translation Quality Estimation: A Case Study of Dataset Building for An Under-Resourced Language Pair Highlight: Yet, developing highly accurate, adaptable and reliable QE systems for under-resourced language pairs remains largely unsolved, due mainly to limited parallel corpora and to diverse language-dependent factors, such as with morphosyntactically complex languages. This study presents a semi-synthetic parallel dataset for English-to-Hebrew QE, generated by creating English sentences based on examples of usage that illustrate typical linguistic patterns, translating them to Hebrew using multiple MT engines, and filtering outputs via BLEU-based selection. |
Assaf Siani; Anna Kernerman; Ilan Kernerman; | arxiv-cs.CL | 2026-03-12 |
| 92 | Streaming Translation and Transcription Through Speech-to-Text Causal Alignment Highlight: We propose Hikari, a policy-free, fully end-to-end model that performs simultaneous speech-to-text translation and streaming transcription by encoding READ/WRITE decisions into a probabilistic WAIT token mechanism. |
ROMAN KOSHKIN et. al. | arxiv-cs.CL | 2026-03-12 |
| 93 | IMTBench: A Multi-Scenario Cross-Modal Collaborative Evaluation Benchmark for In-Image Machine Translation Highlight: However, existing IIMT benchmarks are largely synthetic and thus fail to reflect real-world complexity, while current evaluation protocols focus on single-modality metrics and overlook cross-modal faithfulness between rendered text and model outputs. To address these shortcomings, we present In-image Machine Translation Benchmark (IMTBench), a new benchmark of 2,500 image translation samples covering four practical scenarios and nine languages. |
JIAHAO LYU et. al. | arxiv-cs.CV | 2026-03-11 |
| 94 | Large Language Models As Annotators for Machine Translation Quality Estimation Highlight: We present a systematic approach for the development of a GPT-4o-based prompt, called PPbMQM (Prompt-Pattern-based-MQM). |
Sidi Wang; Sophie Arnoult; Amir Kamran; | arxiv-cs.CL | 2026-03-11 |
| 95 | EPIC-EuroParl-UdS: Information-Theoretic Perspectives on Translation and Interpreting Highlight: This paper introduces an updated and combined version of the bidirectional English-German EPIC-UdS (spoken) and EuroParl-UdS (written) corpora containing original European Parliament speeches as well as their translations and interpretations. |
Maria Kunilovskaya; Christina Pollkläsener; | arxiv-cs.CL | 2026-03-10 |
| 96 | Gender Bias in MT for A Genderless Language: New Benchmarks for Basque Highlight: Large language models (LLMs) and machine translation (MT) systems are increasingly used in our daily lives, but their outputs can reproduce gender bias present in the training data. |
Amaia Murillo; Naiara Perez; | arxiv-cs.CL | 2026-03-09 |
| 97 | Terminology Rarity Predicts Catastrophic Failure in LLM Translation of Low-Resource Ancient Languages: Evidence from Ancient Greek Highlight: This study presents the first systematic, reference-free human evaluation of large language model (LLM) machine translation (MT) for Ancient Greek (AG) technical prose. |
James L. Zainaldin; Cameron Pattison; Manuela Marai; Jacob Wu; Mark J. Schiefsky; | arxiv-cs.CL | 2026-02-27 |
| 98 | Computation of Machine Translation Errors Through Implicit Consideration of Word Sense Disambiguation in Bengali Text |
TANIYA SEAL et. al. | Cureus Journal of Computer Science | 2026-02-27 |
| 99 | Scalable Multilingual Multimodal Machine Translation with Speech-Text Fusion Highlight: In this paper, we propose a Speech-guided Machine Translation (SMT) framework that integrates speech and text as fused inputs into an MLLM to improve translation quality. |
YEXING DU et. al. | arxiv-cs.CL | 2026-02-25 |
| 100 | Application of Adversarial Transfer Learning in Domain Adaptive English Machine Translation Highlight: Machine translation systems have demonstrated a significant level of success, especially in rich language pairs endowed with heavy parallel corpora. When the model is used in domain-specific settings where in-domain data is scarce or even unavailable, however, its performance suffers greatly. |
Caiping Li; | Discover Artificial Intelligence | 2026-02-18 |
| 101 | A STUDY OF COLLOCATIONS IN A PARALLEL CORPUS OF TRANSLATIONS AND EQUIVALENT TEXTS Highlight: This study investigates the translation patterns and equivalence strategies of collocations in a parallel corpus consisting of English-Uzbek translations. |
Ibrohim Voxitov Sodiq o‘g‘li; | Advances in Science and Humanities | 2026-02-18 |
| 102 | DiscoX: Benchmarking Discourse-Level Translation in Expert Domains Highlight: While these translations demand discourse-level coherence and strict terminological precision, current evaluation methods predominantly focus on segment-level accuracy and fluency. To address this limitation, we introduce DiscoX, a new benchmark for discourse-level and expert-level Chinese-English translation. |
XIYING ZHAO et. al. | iclr | 2026-02-17 |
| 103 | LuxMT Technical Report Highlight: We introduce LuxMT, a machine translation system based on Gemma 3 27B and fine-tuned for translation from Luxembourgish (LB) into French (FR) and English (EN). |
Nils Rehlinger; | arxiv-cs.CL | 2026-02-17 |
| 104 | Building Bridges Between Computational Methods and Human Translation: An English to Brazilian Portuguese Application of Machine Translation in The Cross-Cultural Adaptation of Psychological and Health-Related Assessments Highlight: The present study evaluated the effectiveness of machine translation (MT) in both forward (English to Brazilian Portuguese) and backward translation (Brazilian Portuguese to English) of psychological and health-related assessments. |
Maicon Rodrigues Albuquerque; Renan Pedra de Souza; Débora Marques de Miranda; Marco Aurélio Romano-Silva; | Journal of Cross-Cultural Psychology | 2026-02-17 |
| 105 | Biologically Inspired Convolutional Neural Architectures for Enhanced Chinese–English Neural Machine Translation |
Zhihao Jiang; | Discover Artificial Intelligence | 2026-02-17 |
| 106 | Crowdsourcing Piedmontese to Test LLMs on Non-Standard Orthography Highlight: We present a crowdsourced dataset for Piedmontese, an endangered Romance language of northwestern Italy. |
Gianluca Vico; Jindřich Libovický; | arxiv-cs.CL | 2026-02-16 |
| 107 | Automated Evaluation of LLMs for Effective Machine Translation of Mandarin Chinese to English Highlight: In this paper, we utilise an automated machine learning framework featuring semantic and sentiment analysis to assess Mandarin Chinese to English translation using Google Translate and LLMs, including GPT-4, GPT-4o, and DeepSeek. |
Yue Zhang; Rodney Beard; John Hawkins; Rohitash Chandra; | arxiv-cs.CL | 2026-02-15 |
| 108 | Linguistic Knowledge Injected Into Large Language Model for Urdu-English Neural Machine Translation |
MUHAMMAD NAEEM UL HASSAN et. al. | Language Resources and Evaluation | 2026-02-09 |
| 109 | Beyond Scalar Scores: Reinforcement Learning for Error-Aware Quality Estimation of Machine Translation Highlight: Moreover, for low-resource languages where annotated QE data is limited, existing approaches struggle to achieve reliable performance. To address these challenges, we introduce the first segment-level QE dataset for English to Malayalam, a severely resource-scarce language pair in the QE domain, comprising human-annotated Direct Assessment (DA) scores and Translation Quality Remarks (TQR), which are short, contextual, free-form annotator comments that describe translation errors. |
Archchana Sindhujan; Girish A. Koushik; Shenbin Qian; Diptesh Kanojia; Constantin Orăsan; | arxiv-cs.CL | 2026-02-09 |
| 110 | The Logovista English-Japanese Machine Translation System Highlight: The paper is intended as a technical and historical record rather than an argument for reviving rule-based MT, and describes the software and linguistic resources that have been preserved for potential future study. |
Barton D. Wright; | arxiv-cs.CL | 2026-02-08 |
| 111 | MTQE.en-he: Machine Translation Quality Estimation for English-Hebrew Highlight: Fine-tuning experiments with TransQuest and CometKiwi reveal that full-model updates are sensitive to overfitting and distribution collapse, yet parameter-efficient methods (LoRA, BitFit, and FTHead, i.e., fine-tuning only the classification head) train stably and yield improvements of 2-3 percentage points. MTQE.en-he and our experimental results enable future research on this under-resourced language pair. |
Andy Rosenbaum; Assaf Siani; Ilan Kernerman; | arxiv-cs.CL | 2026-02-06 |
| 112 | Consensus-Aligned Neuron Efficient Fine-Tuning Large Language Models for Multi-Domain Machine Translation Highlight: In this work, we propose a neuron-efficient fine-tuning framework for MDMT that identifies and updates consensus-aligned neurons within LLMs. |
SHUTING JIANG et. al. | arxiv-cs.CL | 2026-02-05 |
| 113 | No One-Size-Fits-All: Building Systems For Translation to Bashkir, Kazakh, Kyrgyz, Tatar and Chuvash Using Synthetic And Original Data Highlight: We explore machine translation for five Turkic language pairs: Russian-Bashkir, Russian-Kazakh, Russian-Kyrgyz, English-Tatar, English-Chuvash. |
Dmitry Karpov; | arxiv-cs.CL | 2026-02-04 |
| 114 | PEGRL: Improving Machine Translation By Post-Editing Guided Reinforcement Learning Highlight: A task-specific weighting scheme further balances the contributions of translation and post-editing objectives, yielding a biased yet more sample-efficient estimator. |
YUNZHI SHEN et. al. | arxiv-cs.CL | 2026-02-03 |
| 115 | EmoAra: Emotion-Preserving English Speech Transcription and Cross-Lingual Translation with Arabic Text-to-Speech Highlight: This work presents EmoAra, an end-to-end emotion-preserving pipeline for cross-lingual spoken communication, motivated by banking customer service where emotional context affects service quality. |
BESHER HASSAN et. al. | arxiv-cs.CL | 2026-02-01 |
| 116 | A Comparative Study of Human and Machine Translation of Animal Metaphors in Mo Yan’s Frog Highlight: This study adopts conceptual metaphor theory and an integrated methodology combining qualitative and quantitative analysis with theoretical interpretation. |
Juechu Yin; Qiushi Gu; | Digital Technologies Research and Applications | 2026-01-30 |
| 117 | Ensuring Consistency for In-Image Translation Highlight: While this task has numerous applications in various scenarios such as film poster translation and everyday scene image translation, existing methods frequently neglect the aspect of consistency throughout this process. We propose the need to uphold two types of consistency in this task: translation consistency and image generation consistency. |
CHENGPENG FU et. al. | Mathematics | 2026-01-30 |
| 118 | Domain Adaptive Machine Translation with Synthetic Feedback for Large Language Models Highlight: Furthermore, the current ICL paradigm does not offer the fine-grained domain features in addition to parallel translation pairs. To address these challenges, we propose a pipeline that collects in-domain translations from LLMs and generates synthetic, human-like feedback for revising these translations. |
XINYI YANG et. al. | ACM Transactions on Asian and Low-Resource Language … | 2026-01-30 |
| 119 | TRANSLATION STRATEGIES IN EFL/ESL AND MTGEN/AI POST-EDITING Highlight: The study suggests combining MT/GenAI with rubric-based post-editing to improve translators’ technical and strategic skills in modern translation teaching. |
Lusinda Juliani Lusi; Nailah Zubaidah; Cyntya Isabel; Refika Andriani; | ELTR Journal | 2026-01-28 |
| 120 | ADAT Novel Time-series-aware Adaptive Transformer Architecture for Sign Language Translation Highlight: Moreover, their quadratic attention complexity leads to inefficient training. To mitigate these issues, we introduce ADAT, an Adaptive Transformer architecture that combines convolutional feature extraction, log-sparse self-attention, and an adaptive gating mechanism to efficiently model both short- and long-range temporal dependencies in sign language sequences. |
Nada Shahin; Leila Ismail; | Scientific Reports | 2026-01-28 |
| 121 | Reflective Translation: Improving Low-Resource Machine Translation Via Structured Self-Reflection Highlight: The proposed method is model-agnostic, requires no fine-tuning, and introduces a reflection-augmented dataset that can support future supervised or analysis-driven work. |
Nicholas Cheng; | arxiv-cs.CL | 2026-01-27 |
| 122 | DIETA: A Decoder-only Transformer-based Model for Italian-English Machine TrAnslation Highlight: In this paper, we present DIETA, a small, decoder-only Transformer model with 0.5 billion parameters, specifically designed and trained for Italian-English machine translation. |
PRANAV KASELA et. al. | arxiv-cs.CL | 2026-01-25 |
| 123 | La Traducción Automática En El Ámbito De La Atención Al Cliente: Percepciones De Los Estudiantes De Traducción Sobre La Idiomaticidad Highlight: The article analyzes the perception of idiomaticity and machine translation acceptability thresholds in customer service. |
Aurora Martín de Santa Olalla Sánchez; Celia Rico Pérez; | Hermēneus. Revista de traducción e interpretación | 2026-01-19 |
| 124 | Improving Low-Resource Machine Translation Via Round-Trip Reinforcement Learning Highlight: Using the NLLB-MD dataset, we evaluate both the 600M and 1.3B parameter NLLB models and observe consistent improvements for the following languages: Central Aymara, Friulian, Wolof and Russian. |
Ahmed Attia; Alham Fikri; | arxiv-cs.CL | 2026-01-18 |
| 125 | A Comparative Analysis of Translation and Back-Translation Vs. Machine Translation With Post-Editing Highlight: The study evaluated the TBT and MTPE methods in translating 15 self-administered (SA) scales of the Standard for Clinicians’ Interview in Psychiatry (SCIP) from English into Arabic. |
AHMED ABORAYA et. al. | Annals of Clinical Psychiatry | 2026-01-17 |
| 126 | Redefining Machine Simultaneous Interpretation: From Incremental Translation to Human-Like Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We extend the action space of SiMT with four adaptive actions: Sentence_Cut, Drop, Partial_Summarization and Pronominalization, which enable real-time restructuring, omission, and simplification while preserving semantic fidelity. We adapt these actions in a large language model (LLM) framework and construct training references through action-aware prompting. |
Qianen Zhang; Zeyu Yang; Satoshi Nakamura; | arxiv-cs.CL | 2026-01-16 |
| 127 | Multilingual-To-Multimodal (M2M): Unlocking New Languages with Monolingual Text Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce METAL, a lightweight alignment method that learns only a few linear layers using English text alone to map multilingual text embeddings into a multimodal space. |
Piyush Singh Pasi; | arxiv-cs.LG | 2026-01-15 |
| 128 | Get Away with Less: Need of Source Side Data Curation to Build Parallel Corpus for Low Resource Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: But, for low-resource languages, human translation to generate sufficient data is prohibitively expensive. Therefore, it is crucial to develop a framework that screens source sentences to form efficient parallel text, ensuring optimal MT system performance in low-resource environments. |
Saumitra Yadav; Manish Shrivastava; | arxiv-cs.CL | 2026-01-13 |
| 129 | Mitrasamgraha: A Comprehensive Classical Sanskrit Machine Translation Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: As of now, there is a strong lack of publicly available resources that cover these different domains and temporal layers of Sanskrit. We therefore introduce Mitrasamgraha, a high-quality Sanskrit-to-English machine translation dataset consisting of 391,548 bitext pairs, more than four times larger than the largest previously available Sanskrit dataset Itihāsa. |
SEBASTIAN NEHRDICH et al. | arxiv-cs.CL | 2026-01-12 |
| 130 | MITRA: A Large-Scale Parallel Corpus and Multilingual Pretrained Language Model for Machine Translation and Semantic Retrieval for Pāli, Sanskrit, Buddhist Chinese, and Tibetan Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present the MITRA framework, which consists of a novel pipeline for multilingual parallel passage mining, MITRA-parallel, a large-scale corpus of 1.74 million parallel sentence pairs between Sanskrit, Chinese, and Tibetan, and the development of the domain-specific pretrained language model Gemma 2 MITRA. |
Sebastian Nehrdich; Kurt Keutzer; | arxiv-cs.CL | 2026-01-09 |
| 131 | A Rising Tide Lifts All Boats: MTQE Rewards for Idioms Improve General Translation Quality Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Because models are fairly good at translating compositional text, we investigate GRPO-style fine-tuning using Machine Translation Quality Estimation (MTQE) models as reward functions to train models to better translate idioms. |
Ishika Agarwal; Zhenlin He; Dhruva Patil; Dilek Hakkani-Tür; | arxiv-cs.CL | 2026-01-09 |
| 132 | How Different Prompts Affect GPT-5’s Chinese-to-English Translation Performance of Government Work Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study examines the impact of different prompts (simple prompts, complex prompts, and few-shot prompts) on GPT-5’s translation performance for the 2024 Chinese Government Work Report, finding that while complex prompts yielded better results in automatic evaluation metrics, human assessment showed no substantial differences in translation quality between simple and complex prompts. |
Jingjing Feng; | Advances in Humanities Research | 2026-01-08 |
| 133 | NeoAMT: Neologism-Aware Agentic Machine Translation with Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose an agentic framework, NeoAMT, for neologism-aware machine translation using a Wiktionary search tool. |
Zhongtao Miao; Kaiyan Zhao; Masaaki Nagata; Yoshimasa Tsuruoka; | arxiv-cs.CL | 2026-01-07 |
| 134 | English–Uzbek Machine Translation Based on Independent Parts of Speech and New Mathematical Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The proposed model integrates morphological, syntactic, and semantic stages and introduces new mathematical approaches for processing both simple and complex words. |
Vazira Bekova; Muftakh Khakimov; Sirojiddinov Ziyoviddin; | WSEAS TRANSACTIONS ON COMPUTER RESEARCH | 2026-01-05 |
| 135 | AraBART-based Arabic Lemmatization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce AraBART, the first Arabic model to feature an end-to-end pre-trained encoder-decoder, leveraging the BART architecture. |
Soumia Afartass; Fadoua Ataa Allah; Khalid Minaoui; | WSEAS TRANSACTIONS ON INFORMATION SCIENCE AND APPLICATIONS | 2026-01-02 |
| 136 | TIMTQE: Benchmarking Machine Translation Quality Estimation for Text Images Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Accurate quality estimation is critical for the reliable machine translation of text from historical document images. However, existing quality estimation methods typically … |
Shuo Li; Xiaojun Bi; Yiwen Sun; | IEEE Signal Processing Letters | 2026-01-01 |
| 137 | An Entropy-based Study of Simplification in ChatGPT Translations Compared to Neural Machine Translation and Human Translation Across Genres Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study investigates the phenomenon of simplification in Chinese-to-English translation across Human Translation (HT), neural machine translation (NMT), and large language model (LLM)-based translation, ChatGPT as an example. |
Guangyuan Yao; Lingxi Fan; | PLOS One | 2025-12-31 |
| 138 | HY-MT1.5 Technical Report Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this report, we introduce our latest translation models, HY-MT1.5-1.8B and HY-MT1.5-7B, a new family of machine translation models developed through a holistic training framework tailored for high-performance translation. |
Mao Zheng; Zheng Li; Tao Chen; Mingyang Song; Di Wang; | arxiv-cs.CL | 2025-12-30 |
| 139 | Effort in Machine Translation Post-Editing: The Role of Expertise Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study examines the relationship between expertise and effort in MTPE, addressing the following two questions: (1) To what extent can expertise serve as a valid indicator of cognitive, temporal, and technical/linguistic effort in MTPE? |
Minel Sayar Öztürk; Alper Kumcu; | Cankaya University Journal of Humanities and Social Sciences | 2025-12-29 |
| 140 | A Study on Chinese-English Translation of Shiwan Ceramic Sculpture Culture for International Dissemination Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study centers on commentaries of Shiwan Ceramic Sculpture in the Guangdong Shiwan Ceramic Museum, exploring Chinese-English (C-E) translation strategies for Shiwan Ceramic Sculpture under the theoretical framework of Systemic Functional Linguistics (SFL), with additional discussions on machine translation. This study adopts a qualitative approach as its primary research method, supplemented by an appropriate amount of quantitative research involving data statistics. |
Yafei Pang; | International Journal of Social Science Education and … | 2025-12-28 |
| 141 | Assessing and Improving Punctuation Robustness in English-Marathi Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we focus on Marathi, a low- to middle-resource language. |
Kaustubh Shivshankar Shejole; Sourabh Deoghare; Pushpak Bhattacharyya; | arxiv-cs.CL | 2025-12-28 |
| 142 | Next-Generation Translation and Artificial Intelligence: Evaluating Human-Machine Interaction in The English – Turkish Language Pair Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper evaluates the performance of AI-assisted machine translation systems in translating English into Turkish, focusing on terminology management, semantic shifts, and contextual accuracy. |
Aziz Oğuzhan Çakaloğlu; İnönü Korkmaz; | Karamanoğlu Mehmetbey Üniversitesi Uluslararası Filoloji ve … | 2025-12-27 |
| 143 | Improve Arabic–English Translation Using Arabic Language Attention Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The purpose of the proposed work is to enhance Arabic-English translation and other Arabic NLP tasks by addressing the unique linguistic challenges of the Arabic language. |
Faisal Alamri; | Data Technologies and Applications | 2025-12-27 |
| 144 | A Comparative Dependency Analysis of Human Translation and Machine Translation: A Case Study of English Translation of To Live Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study examines syntactic differences between human and machine translations of a Chinese literary text from the perspective of mean dependency distance. |
Nuo Ding; Jingxiang Cao; | Studies in Linguistics and Literature | 2025-12-26 |
| 145 | AlignAR: Generative Sentence Alignment for Arabic-English Parallel Corpora of Legal and Literary Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present AlignAR, a generative sentence alignment method, and a new Arabic-English dataset comprising complex legal and literary texts. |
Baorong Huang; Ali Asiri; | arxiv-cs.CL | 2025-12-25 |
| 146 | From Scratch to Fine-Tuned: A Comparative Study of Transformer Training Strategies for Legal Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents our work for the JUST-NLP 2025 Legal MT shared task, focusing on English-Hindi translation using Transformer-based approaches. |
Amit Barman; Atanu Mandal; Sudip Kumar Naskar; | arxiv-cs.CL | 2025-12-20 |
| 147 | A Review on Handwritten Malayalam to English Digitization and Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The review covers a range of methodologies, including classical Optical Character Recognition (OCR) techniques, statistical machine translation (SMT), neural machine translation (NMT), and new Vision Language Models (VLMs). |
Mohammed Farhan; | International Journal for Research in Applied Science and … | 2025-12-20 |
| 148 | The Quality of AI-Powered Machine Translation: A Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Some instances of over-translation were also identified, underscoring the difficulty of balancing literal accuracy with cultural sensitivity. Overall, the study highlights the technical competence of AI translation while pointing to areas where its cultural and contextual awareness remains limited. |
Jacek Woźny; | Anglica Wratislaviensia | 2025-12-19 |
| 149 | Readability, Fluency and Error Identification When Using Machine Translation and AI to Translate Medical Research Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: With the improvements in the algorithms used to develop online machine translation tools and the growing relevance of AI, there is a rising concern related to the quality of these tools’ outputs. Therefore, we sought to evaluate the readability of the results of machine translation and AI for the translation of medical research texts from English into European Portuguese. |
Raquel Moreira; | Langues & Parole | 2025-12-17 |
| 150 | Google Translate Then and Now: Translations From Five Languages Into English and Arabic (2012–2025) Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study compares the translation of six texts from Hungarian, German, Spanish, Turkish and Japanese to English and Arabic by GT in 2012 (SMT era) and in 2025 (NMT era) in terms of intelligibility, fluency, and semantic, lexical and syntactic accuracy. |
Reima Al-Jarf; | Journal of Computer Science and Technology Studies | 2025-12-17 |
| 151 | An Empirical Study on Chinese Character Decomposition in Multiword Expression-Aware Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we conduct a systematic study of the Chinese character decomposition technology in the context of MWE-aware neural machine translation (NMT). |
Lifeng Han; Gareth J. F. Jones; Alan F. Smeaton; | arxiv-cs.CL | 2025-12-17 |
| 152 | PrahokBART: A Pre-trained Sequence-to-Sequence Model for Khmer Natural Language Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work introduces PrahokBART, a compact pre-trained sequence-to-sequence model trained from scratch for Khmer using carefully curated Khmer and English corpora. |
HOUR KAING et al. | arxiv-cs.CL | 2025-12-15 |
| 153 | Advancing Bangla Machine Translation Through Informal Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this research, we explore current state-of-the-art models and propose improvements to Bangla translation by developing a dataset from informal sources like social media and conversational texts. |
AYON ROY et al. | arxiv-cs.CL | 2025-12-15 |
| 154 | A Comparative Study of Human and Machine Translation in English and Urdu Language: Evaluating Accuracy Using Google Translate and ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The study highlighted that machine translation performs well in simple informational sentences, but human translation provides more natural, culturally appropriate, and meaningful results. |
Aneeqa Sabir; Moneeba Habib; | ACADEMIA International Journal for Social Sciences | 2025-12-15 |
| 155 | Improving Translation Quality By Selecting Better Data for LLM Fine-Tuning: A Comparative Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We investigated the impact of data selection on machine translation fine-tuning for open LLMs. |
Felipe Ribeiro Fujita de Mello; Hideyuki Takada; | arxiv-cs.CL | 2025-12-12 |
| 156 | MultiScript30k: Leveraging Multilingual Embeddings to Extend Cross Script Parallel Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Previous efforts to extend Multi30k exist, but the list of supported languages, represented language families, and scripts is still very short. To address these issues, we propose MultiScript30k, a new Multi30k dataset extension for global languages in various scripts, created by translating the English version of Multi30k (Multi30k-En) using NLLB200-3.3B. |
CHRISTOPHER DRIGGERS-ELLIS et al. | arxiv-cs.CL | 2025-12-11 |
| 157 | Optimizing Hate Speech Detection in Malayalam-English Code-Mixed Text: Handling Women’s Abuse By Synthetic Data Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work examines the efficacy of synthetic data augmentation techniques—Machine Translation (MT), Masked Language Modeling (MLM), and Few-Shot Learning (FSL)—in enhancing hate speech identification inside Malayalam-English (Manglish) social media text. We use these three methodologies to improve transformer-based models like mBERT, BERT, and IndicBERT. |
Dhanya LK; Kannan Balakrishnan; | Journal of Intelligent & Fuzzy Systems: Applications in … | 2025-12-11 |
| 158 | Automatic Evaluation Method for Machine Translation Quality Based on Cross Attention Recurrent Neural Network Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes an automatic scoring model, MTQ-CARNN (Machine Translation Quality of Cross Attention Recurrent Neural Networks). |
Anan Tian; Meijuan Sun; | Journal of Circuits, Systems and Computers | 2025-12-11 |
| 159 | Literacia Crítica Digital A Partir De Traduções Automáticas: Explorando Variação Linguística Em The Color Purple Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This article presents a didactic sequence aimed at basic education, which articulates the themes of linguistic variation, digital critical literacy, and machine translation. |
Camila De Bona; | #Tear: Revista de Educação, Ciência e Tecnologia | 2025-12-11 |
| 160 | TeluguST-46: A Benchmark Corpus and Comprehensive Evaluation for Telugu-English Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite Telugu being spoken by over 80 million people, speech translation research for this morphologically rich language remains severely underexplored. We address this gap by developing a high-quality Telugu–English speech translation benchmark from 46 hours of manually verified CSTD corpus data (30h/8h/8h train/dev/test split). |
BHAVANA AKKIRAJU et al. | arxiv-cs.CL | 2025-12-08 |
| 161 | CULTURAL AND LINGUISTIC ASPECTS OF TRANSLATION OF METAPHORS IN ADVERTISING Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The aim of the study is to identify common challenges, strategies, and tendencies in rendering metaphorical constructions in translated advertising slogans, as well as to determine the degree to which their pragmatic and emotional impact is preserved. |
Antonina KOROL; Oksana Kravets; | Germanic Philology Journal of Yuriy Fedkovych Chernivtsi … | 2025-12-08 |
| 162 | Translating English Number Idioms Into Arabic: A Neural Machine Translation Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study investigates the efficacy of NMT systems, specifically Google Translate and ChatGPT, in translating English number idioms into Arabic. |
Abdullah Sanad Mohammad Aldela’a; Mai Abdel karim Mohammad Malkawi; Ayman Altalahin; | Technium Social Sciences Journal | 2025-12-08 |
| 163 | Evolving L2 Translation Competence in Light of Technological Turn Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper explores the evolving landscape of second language (L2) translation competence in the context of technological advancements, particularly the impact of generative artificial intelligence (GenAI) and large language models. |
Ondřej Molnár; | AUC PHILOLOGICA | 2025-12-08 |
| 164 | SwissGov-RSD: A Human-annotated, Cross-lingual Benchmark for Token-level Recognition of Semantic Differences Between Related Documents Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We make our code and datasets publicly available. |
Michelle Wastl; Jannis Vamvas; Rico Sennrich; | arxiv-cs.CL | 2025-12-08 |
| 165 | The Effectiveness of AI-Human Mediation in Recreating Literary Style: A Case of English/Arabic Poetry Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The study aims to assess the quality and limitations of AI-powered translation enhanced with human prompt engineering in producing a stylistic translation of a poem from English into Arabic. |
Lamis Ismail Omar; Abdelrahman Abdalla Salih; Aladdin Al Zahran; | International Journal of Learning, Teaching and Educational … | 2025-12-05 |
| 166 | Translation in The Wild Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The key to explaining the origins of LLMs’ translation capabilities is a continuous iteration between Local and Global learning, which is a natural and helpful consequence of batch training. I discuss the prospects for testing the “duality hypothesis” empirically and its implications for reconceptualizing translation, human and machine, in the age of deep learning. |
Yuri Balashov; | Information | 2025-12-04 |
| 167 | A BLEU-Based Evaluation of ChatGPT’s Chinese-to-English Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This research contributes to understanding AI translation capabilities in political discourse and provides evidence-based recommendations for developing more appropriate evaluation frameworks for specialized translation domains. |
Linli He; Mozhgan Ghassemiazghandi; | Theory and Practice in Language Studies | 2025-12-01 |
| 168 | Cross-Linguistic Near-Synonym in The Context of Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: It is recommended that scholars benefit from the near-synonym corpora gathered in other subjects and their detections in other language pairs and that future research assess synonyms in other fields using machine translation or in various settings. |
Emrah Eriş; Yılmaz Akdemir; | Uluslararası Filoloji Bengü | 2025-11-30 |
| 169 | Conveying Imagistic Thinking in Traditional Chinese Medicine Translation: A Prompt Engineering and LLM-Based Evaluation Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study adopted a human-in-the-loop (HITL) framework and selected four passages from the medical canon Huangdi Neijing that are fundamental in theory. |
Jiatong Han; | arxiv-cs.CL | 2025-11-30 |
| 170 | Teaching Translation Through Machine Translation and Hybrid Approach: An Experimental Study at The Intermediate Level Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study has explored the phenomenon of the use of Machine Translation in teaching translation from Urdu to English and vice versa. |
Farkhanda Jabeen; Muhammad Abdullah; | Global Sociological Review | 2025-11-28 |
| 171 | Conveying Imagistic Thinking in TCM Translation: A Prompt Engineering and LLM-Based Evaluation Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study adopted a human-in-the-loop framework and selected four passages from the medical canon Huangdi Neijing that are fundamental in theory. |
Jiatong Han; | arxiv-cs.CL | 2025-11-28 |
| 172 | Neural Machine Translation for Multilingual Human–Machine Collaboration in Smart Factories Supporting The IIoT Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces xTUNE, a consistency‐regularized neural machine translation (NMT) model, and presents a comprehensive multilingual translation system that incorporates data cleaning, augmentation, and embedding construction. |
Jiang Huarui; Hua Shuai; | Internet Technology Letters | 2025-11-25 |
| 173 | Deromanization of Hindi-English Code-Mixed Text and Its Influence on Toxic Comment Classification and Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce two categories of deromanization models. |
Kiran Babu Nelatoori; Ashish Kumar Sahagal; Hima Bindu Kommanti; | ACM Transactions on Asian and Low-Resource Language … | 2025-11-22 |
| 174 | LangMark: A Multilingual Dataset for Automatic Post-Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite advances in neural machine translation (NMT), the development of effective APE systems has been hindered by the lack of large-scale multilingual datasets specifically tailored to NMT outputs. To address this gap, we present and release LangMark, a new human-annotated multilingual APE dataset for English translation to seven languages: Brazilian Portuguese, French, German, Italian, Japanese, Russian, and Spanish. |
DIEGO VELAZQUEZ et al. | arxiv-cs.CL | 2025-11-21 |
| 175 | Estonian WinoGrande Dataset: Comparative Analysis of LLM Performance on Human and Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present a localized and culturally adapted Estonian translation of the test set from the widely used commonsense reasoning benchmark, WinoGrande. |
MARII OJASTU et al. | arxiv-cs.CL | 2025-11-21 |
| 176 | Automated Polyglot Script Recognition and Translation Engine Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This project provides an Offline Multilingual OCR and Translation System allowing users to extract text from images, and convert or translate to a text output, and audio output, into a range of regional Indian languages, using Optical Character Recognition (OCR) dedicated to Tesseract, and a Neural Machine Translation (NMT) powered by Facebook’s NLLB-200 model, a language translation model trained on a large parallel corpus across languages like Kannada, Tamil, Telugu, Hindi, Marathi, and English. |
Sampath P B; S. Pandikumar; | INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING … | 2025-11-19 |
| 177 | English Audio-video to Marathi Audio-video Using Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This report details the objective, methodology, and system overview for developing an expert-level Speech-to-Speech Translation (S2ST) system for the resource-constrained English-to-Marathi language pair using machine learning. |
KARISHMA KARANDE et al. | International Journal For Multidisciplinary Research | 2025-11-16 |
| 178 | Exploring Parameter-Efficient Fine-Tuning and Backtranslation for The WMT 25 General Translation Task Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we explore the effectiveness of combining fine-tuning and backtranslation on a small Japanese corpus for neural machine translation. |
Felipe Fujita; Hideyuki Takada; | arxiv-cs.CL | 2025-11-15 |
| 179 | DiscoX: Benchmarking Discourse-Level Translation Task in Expert Domains Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While these translations demand discourse-level coherence and strict terminological precision, current evaluation methods predominantly focus on segment-level accuracy and fluency. To address this limitation, we introduce DiscoX, a new benchmark for discourse-level and expert-level Chinese-English translation. |
XIYING ZHAO et al. | arxiv-cs.CL | 2025-11-14 |
| 180 | Comprehension of Multilingual Expressions Referring to Target Objects in Visual Inputs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work addresses multilingual REC through two main contributions. First, we construct a unified multilingual dataset spanning 10 languages, by systematically expanding 12 existing English REC benchmarks through machine translation and context-based translation enhancement. |
Francisco Nogueira; Alexandre Bernardino; Bruno Martins; | arxiv-cs.CV | 2025-11-14 |
| 181 | Is Human Translation More Conservative Than Machine Translation? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The present study investigates whether conservatism exists in human- and machine-translated texts from Chinese into English, and whether this tendency is consistently observable across different registers and multiple lexico-grammatical features by applying profile-based correspondence analysis and mixed-effects logistic regression modelling. |
Jia Li; Xianyao Hu; | International Journal of Corpus Linguistics | 2025-11-14 |
| 182 | How Small Can You Go? Compact Language Models for On-Device Critical Error Detection in Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Large Language Models (LLMs) excel at evaluating machine translation (MT), but their scale and cost hinder deployment on edge devices and in privacy-sensitive workflows. We ask: … |
Muskaan Chopra; Lorenz Sparrenberg; Sarthak Khanna; Rafet Sifa; | arxiv-cs.CL | 2025-11-12 |
| 183 | EFL STUDENTS’ ACCEPTANCE OF DEEPL TRANSLATION: A TECHNOLOGY ACCEPTANCE MODEL STUDY Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study applied the Technology Acceptance Model (TAM) framework to investigate university EFL students’ acceptance of the DeepL machine translation tool, focusing on how frequency of use influences their perceptions. |
Reihayyu Dwi Cahyani; Syamdianita Syamdianita; Aridah Aridah; Weningtyas Parama Iswari; Ichi Ahada; | Lire Journal (Journal of Linguistics and Literature) | 2025-11-11 |
| 184 | A Picture Is Worth A Thousand (Correct) Captions: A Vision-Guided Judge-Corrector System for Multimodal Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we describe our system under the team name BLEU Monday for the English-to-Indic Multimodal Translation Task at WAT 2025. |
Siddharth Betala; Kushan Raj; Vipul Betala; Rohan Saswade; | arxiv-cs.CL | 2025-11-10 |
| 185 | Beyond English: Toward Inclusive and Scalable Multilingual Machine Translation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Strategic Downsampling, a simple yet effective method to mitigate this degeneration. |
YINGFENG LUO et al. | arxiv-cs.CL | 2025-11-10 |
| 186 | Measuring Creative Phraseology in Literature: Machine Translation Systems Versus Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study aims at assessing the quality of the output rendered by neural machine translation (NMT) systems, i.e., DeepL and Google Translate, and large language models (LLMs), i.e., ChatGPT and Gemini, in the English>Spanish translation of five comparative idioms extracted from literary texts. |
Laura Noriega-Santiáñez; Gloria Corpas Pastor; | Yearbook of Phraseology | 2025-11-10 |
| 187 | Multi-head Temporal Latent Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper proposes Multi-head Temporal Latent Attention (MTLA), which further reduces the KV cache size along the temporal dimension, greatly lowering the memory footprint of self-attention inference. |
Keqi Deng; Phil Woodland; | nips | 2025-11-07 |
| 188 | It Takes Two: A Dual Stage Approach for Terminology-Aware Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces DuTerm, a novel two-stage architecture for terminology-constrained machine translation. |
Akshat Singh Jaswal; | arxiv-cs.CL | 2025-11-07 |
| 189 | The Dilemma and Breakthrough of AI Translation from The Perspective of Discourse Coherence A Comparative Analysis of English-Chinese Literacy Translations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: By taking Hemingway’s The Old Man and the Sea as a case study, this research employed a comparative analysis methodology to systematically examine the capabilities of AI translations versus authoritative human translations. |
Boya Zheng; | Lecture Notes in Education Psychology and Public Media | 2025-11-05 |
| 190 | Automatic Machine Translation Detection Using A Surrogate Multilingual Translation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a novel approach that directly exploits the internal representations of a surrogate multilingual MT model to distinguish between human and machine-translated sentences. Experimental results show that our method outperforms current state-of-the-art techniques, particularly for non-English language pairs, achieving gains of at least 5 percentage points of accuracy. |
Cristian García-Romero; Miquel Esplà-Gomis; Felipe Sánchez-Martínez; | arxiv-cs.CL | 2025-11-04 |
| 191 | The Analysis of Lexical Errors in Machine Translation from English Into Romanian Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The research explores error analysis in the performance of translating by Machine Translation from English into Romanian, and it focuses on lexical errors found in texts which include official information, provided by the World Health Organization (WHO), the Gavi Organization, by the patient information leaflet (the information about the active ingredients of the vaccines or the medication, the indications, the dosage instructions, the storage instructions, the side effects and warning, etc.). |
Angela Stamatie; | arxiv-cs.CL | 2025-11-04 |
| 192 | Benchmarking LLMs for Translating Classical Chinese Poetry: Evaluating Adequacy, Fluency, and Elegance Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our study reveals that existing LLMs fall short of this task. To address these issues, we propose RAT, a Retrieval-Augmented machine Translation method that enhances the translation process by incorporating knowledge related to classical poetry. |
ANDONG CHEN et. al. | emnlp | 2025-11-02 |
| 193 | Liaozhai Through The Looking-Glass: On Paratextual Explicitation of Culture-Bound Terms in Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we formalize Genette’s (1987) theory of paratexts from literary and translation studies to introduce the task of paratextual explicitation for MT. We construct a dataset of 560 expert-aligned paratexts from four English translations of the classical Chinese short story collection Liaozhai and evaluate LLMs with and without reasoning traces on choice and content of explicitation. |
Sherrie Shen; Weixuan Wang; Alexandra Birch; | emnlp | 2025-11-02 |
| 194 | Translating Domain-Specific Terminology in Typologically-Diverse Languages: A Study in Tax and Financial Education Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a gold-standard terminology resource for the tax and financial education domains, built from curated governmental publications and covering seven typologically diverse languages: English, Spanish, Russian, Vietnamese, Korean, Chinese (traditional and simplified) and Haitian Creole. |
ARTURO ONCEVAY et. al. | emnlp | 2025-11-02 |
| 195 | Whisper-UT: A Unified Translation Framework for Speech and Text Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose Whisper-UT, a unified and efficient framework that leverages lightweight adapters to enable seamless adaptation across tasks, including a multi-modal machine translation (MMT) task that explicitly conditions translation on both speech and source language text inputs. |
CIHAN XIAO et. al. | emnlp | 2025-11-02 |
| 196 | EnAnchored-X2X: English-Anchored Optimization for Many-to-Many Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Large language models (LLMs) have demonstrated strong machine translation capabilities for English-centric language pairs but underperform in direct non-English (x2x) translation. |
SEN YANG et. al. | emnlp | 2025-11-02 |
| 197 | Explicit Learning and The LLM in Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This study explores an LLM’s ability to learn new languages using explanations found in a grammar book—a process we term “explicit learning.” |
Malik Marmonier; Rachel Bawden; Benoît Sagot; | emnlp | 2025-11-02 |
| 198 | BOUQuET : Dataset, Benchmark and Open Initiative for Universal Quality Evaluation in Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Compared with related machine translation datasets, we show that BOUQuET has a broader representation of domains while simplifying the translation task for non-experts. |
PIERRE ANDREWS et. al. | emnlp | 2025-11-02 |
| 199 | Assumed Identities: Quantifying Gender Bias in Machine Translation of Gender-Ambiguous Occupational Terms Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address this, we introduce GRAPE, a probability-based metric designed to evaluate gender bias by analyzing aggregated model responses. Alongside this, we present GAMBIT, a benchmarking dataset in English with gender-ambiguous occupational terms. |
Orfeas Menis Mastromichalakis; Giorgos Filandrianos; Maria Symeonaki; Giorgos Stamou; | emnlp | 2025-11-02 |
| 200 | PRIM: Towards Practical In-Image Multilingual Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose an end-to-end model VisTrans to handle the challenge of practical conditions in PRIM, which processes visual text and background information in the image separately, ensuring the capability of multilingual translation while improving the visual quality. |
YANZHI TIAN et. al. | emnlp | 2025-11-02 |
| 201 | Dynamic Jointly Batch Selection for Data Efficient Machine Translation Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a data selection methodology specifically designed for fine-tuning machine translation systems, which leverages the synergy between a learner model and a pre-trained reference model to enhance overall training effectiveness. |
Mohammad Amin Ghanizadeh; Mohammad Javad Dousti; | emnlp | 2025-11-02 |
| 202 | Languages Still Left Behind: Toward A Better Multilingual Machine Translation Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Notably, we show that MT models trained on naturalistic data perform poorly on FLORES+ while achieving significant gains on our domain-relevant evaluation set. Based on these findings, we advocate for multilingual MT benchmarks that use domain-general, named-entity-agnostic, and culturally neutral source texts to better reflect real-world translation challenges. |
CHIHIRO TAGUCHI et. al. | emnlp | 2025-11-02 |
| 203 | Speech Vecalign: An Embedding-based Method for Aligning Parallel Speech Documents Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present Speech Vecalign, a parallel speech document alignment method that monotonically aligns speech segment embeddings and does not depend on text transcriptions. |
Chutong Meng; Philipp Koehn; | emnlp | 2025-11-02 |
| 204 | Learning to Translate Ambiguous Terminology By Preference Optimization on Post-Edits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Luckily, in a corporate context, many examples of human post-edits of valid but incorrect terminology exist. The goal of this work is to learn how to disambiguate our terminology based on these corrections. |
Nathaniel Berger; Johannes Eschbach-Dymanus; Miriam Exel; Matthias Huck; Stefan Riezler; | emnlp | 2025-11-02 |
| 205 | AFRIDOC-MT: Document-level MT Corpus for African Languages IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper introduces AFRIDOC-MT, a document-level multi-parallel translation dataset covering English and five African languages: Amharic, Hausa, Swahili, Yorùbá, and Zulu. |
JESUJOBA OLUWADARA ALABI et. al. | emnlp | 2025-11-02 |
| 206 | MultiMed-ST: Large-scale Many-to-many Multilingual Medical Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we present the first systematic study on medical ST, to our best knowledge, by releasing MultiMedST, a large-scale ST dataset for the medical domain, spanning all translation directions in five languages: Vietnamese, English, German, French, and Simplified/Traditional Chinese, together with the models. |
KHAI LE-DUC et. al. | emnlp | 2025-11-02 |
| 207 | A Comparative Study of AI-Powered Tools for Arabic-English and English-Arabic Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study is a quantitative analysis that investigates and compares the quality level of Computer Assisted Translation (CAT) tools, Neural Machine Translation (NMT) systems, and Large Language Models (LLMs) against each other and how well they can perform on translation tasks from English into Arabic and vice versa, in comparison to human translator. |
Razan R. Khasawneh; Bilal B. Alsharif; | Journal of Language Teaching and Research | 2025-11-01 |
| 208 | Leveraging The Cross-Domain & Cross-Linguistic Corpus for Low Resource NMT: A Case Study On Bhili-Hindi-English Parallel Corpus Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Comprehensive evaluation demonstrates that the fine-tuned NLLB-200 distilled 600M variant model outperforms others, highlighting the potential of multilingual models in low resource scenarios. Furthermore, we investigated the generative translation capabilities of multilingual LLMs on BHEPC using in-context learning, assessing performance under cross-domain generalization and quantifying distributional divergence. This work bridges a critical resource gap and promotes inclusive natural language processing technologies for low-resource and marginalized languages globally. |
Pooja Singh; Shashwat Bhardwaj; Vaibhav Sharma; Sandeep Kumar; | arxiv-cs.CL | 2025-11-01 |
| 209 | Assessing Machine Translation of Emotional Depth, Metaphorical Complexity, and Cultural Nuances in Literary Texts — A Case Study of Dream of The Red Chamber Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These results align with the existing literature highlighting persistent limitations of MT in literary contexts. Future research should explore enhanced literary training datasets and emotion-aware MT techniques to bridge these identified gaps. |
Hao Fang; Boran Wang; | International Journal of Chinese and English Translation … | 2025-10-31 |
| 210 | Bridging Language Gaps In Real-Time: Investigating University Students’ Self-Initiated Use Of Speech-To-Text Translation In English Language Classrooms Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study investigates the self-initiated use of real-time speech-to-text translation (STTT) tools among Thai university students in English language classrooms, and how they perceive these tools to support their comprehension, motivation, engagement, and participation. |
Napattanissa Sangkawong; Junifer Leal Bucol; Rozelda Luciano; | Teaching English with Technology | 2025-10-30 |
| 211 | Gaperon: A Peppered English-French Generative Language Model Suite Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Through this work, we study how data filtering and contamination interact to shape both benchmark and generative performance. |
NATHAN GODEY et. al. | arxiv-cs.CL | 2025-10-29 |
| 212 | Challenging Multilingual LLMs: A New Taxonomy and Benchmark for Unraveling Hallucination in Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To disclose hallucination in multilingual LLMs, we introduce a diagnostic framework with a taxonomy that separates Instruction Detachment from Source Detachment. |
XINWEI WU et. al. | arxiv-cs.CL | 2025-10-28 |
| 213 | From Restriction to Responsibility: AI Guideline Development in A Project-based English Program Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study examines the development of guidelines for the use of AI-based machine translation (MT) and generative AI (GenAI) within the Project-based English Program (PEP) at a Japanese university. |
Hideki Goto; Mayumi Oga; Takuya Inoue; Takuya Hattori; Yukie Kondo; | English Language Teaching | 2025-10-28 |
| 214 | NEXUS-O: An Omni-Perceptive and -Interactive Model for Language, Audio, and Vision IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work proposes an industry-level omni-modal large language model (LLM) pipeline that integrates auditory, visual, and linguistic modalities to overcome challenges such as limited tri-modal datasets, high computational costs, and complex feature alignments. |
CHE LIU et. al. | mm | 2025-10-27 |
| 215 | A U-Net and Transformer Pipeline for Multilingual Image Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents an end-to-end multilingual translation pipeline that integrates a custom U-Net for text detection, the Tesseract engine for text recognition, and a from-scratch sequence-to-sequence (Seq2Seq) Transformer for Neural Machine Translation (NMT). |
Siddharth Sahay; Radhika Agarwal; | arxiv-cs.LG | 2025-10-27 |
| 216 | Iterative Layer Pruning for Efficient Translation Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, efficient deployment of LLMs remains challenging due to their intensive computational requirements. In this paper, we address this challenge and present our submissions to the Model Compression track at the Conference on Machine Translation (WMT 2025). |
Yasmin Moslem; Muhammad Hazim Al Farouq; John D. Kelleher; | arxiv-cs.CL | 2025-10-26 |
| 217 | A Comprehensive Survey on Transformer-Based Machine Translation: Identifying Research Gaps and Solutions for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper handles the key challenges in transformer-based architectures for machine translation, proposing solutions to specific issues and highlighting areas where researchers can focus on bridging existing gaps, thereby reducing the effort needed to identify research opportunities. |
Anasua Banerjee; Dr. Debajyoty Banik; | ACM Computing Surveys | 2025-10-24 |
| 218 | Evaluating Uighur Literary Translation: A Comparative Study of ChatGPT, Google Translate, and Bing Translator Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study compares generative artificial intelligence (GenAI) and neural machine translation (NMT) systems in translating Uighur literary text (قۇتادغۇ بىلىك) into English. |
Qiufen Wang; | PLOS One | 2025-10-23 |
| 219 | Spatio-temporal Sign Language Representation and Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper describes the DFKI-MLT submission to the WMT-SLT 2022 sign language translation (SLT) task from Swiss German Sign Language (video) into German (text). |
Yasser Hamidullah; Josef van Genabith; Cristina España-Bonet; | arxiv-cs.CL | 2025-10-22 |
| 220 | Multimodal Real-time English Translation Based on The Transformer Architecture Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study introduces a Transformer-based multimodal translation framework that integrates textual, visual, and audio modalities. |
Yanling Hu; Siyu Huang; | Journal of Computational Methods in Sciences and Engineering | 2025-10-22 |
| 221 | Re-evaluating Minimum Bayes Risk Decoding for Automatic Speech Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we evaluate MBR decoding for ASR and ST tasks on English and Japanese using Whisper and its derivative models. |
Yuu Jinnai; | arxiv-cs.CL | 2025-10-22 |
| 222 | Evaluating Large Language Models on Urdu Idiom Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Idiomatic translation remains a significant challenge in machine translation, especially for low resource languages such as Urdu, and has received limited prior attention. To advance research in this area, we introduce the first evaluation datasets for Urdu to English idiomatic translation, covering both Native Urdu and Roman Urdu scripts and annotated with gold-standard English equivalents. |
Muhammad Farmal Khan; Mousumi Akter; | arxiv-cs.CL | 2025-10-20 |
| 223 | Giving Social Media Post Authors More Control Over The Translation of Their Posts Enhances Their User Experience Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We collect objective and subjective measures of users’ experience using the assigned translation feature. |
Ananya Gupta; Heba Aly; Jae D. Takeuchi; Bart Piet Knijnenburg; | Proceedings of the ACM on Human-Computer Interaction | 2025-10-16 |
| 224 | An Empirical Study on The Quality of Chinese-English Translation Under Bilingual Prompts Based on Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper conducts an empirical study on the research hypothesis using Chinese-English bilingual prompts as the direction, concluding that the language factor of prompts affects translation quality, with impacts at the lexical and syntactic levels. |
Zhao Junzhe; Lv Nan; | English Language Teaching and Linguistics Studies | 2025-10-16 |
| 225 | Semantic Prosody in Machine Translation: The English-Chinese Case of Passive Structures Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To bridge the gap, we propose an approach to teach machine translation models about semantic prosody of a specific structure. |
Xinyue Ma; Pol Pastells; Mireia Farrús; Mariona Taulé; | arxiv-cs.CL | 2025-10-16 |
| 226 | ACADATA: Parallel Dataset of Academic Data for Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present ACADATA, a high-quality parallel dataset for academic translation, that consists of two subsets: ACAD-TRAIN, which contains approximately 1.5 million author-generated paragraph pairs across 96 language directions and ACAD-BENCH, a curated evaluation set of almost 6,000 translations covering 12 directions. |
IÑAKI LACUNZA et. al. | arxiv-cs.CL | 2025-10-14 |
| 227 | MCL-MT: Multi-Level Contrastive Learning for Many-to-many Multilingual Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a method to train an NMT model. |
Haolun Ran; | Advances in Engineering Technology Research | 2025-10-14 |
| 228 | DPO-Tuned Large Language Models for Segmentation in Simultaneous Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a segmentation framework based on large language models (LLMs) trained with Direct Preference Optimization (DPO). |
Zeyu Yang; Satoshi Nakamura; | arxiv-cs.CL | 2025-10-14 |
| 229 | Discovering Hyponymic Knowledge Patterns in English Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This article addresses the lack of a comprehensive inventory of hyponymic knowledge patterns (KPs) in English by presenting a robust methodology for their collection. |
Antonio San Martín; Catherine Trekker; | Terminology. International Journal of Theoretical and … | 2025-10-14 |
| 230 | When Machines Meet Gavel: A Case Study of The English–Arabic Machine Translation of The Egyptian Arguments Before The International Court of Justice (2024) Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The legal field heavily relies on audio–visual content such as witness testimonies and trials, making accurate transcription and translation crucial, especially in cross-border cases. This study examines the performance of neural machine translation (NMT) in handling such material, using the DQF-MQM harmonized error typology to categorize errors by type, including terminology, accuracy, and fluency. |
Aya Sayed Omran Elsayed; | Language and Semiotic Studies | 2025-10-13 |
| 231 | Machine Vs. Human Translation of Stylistic Neologisms in English Language Chick Lit Into Ukrainian Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite growing research on HT vs MT of neologisms, studies focusing on English > Ukrainian translation remain absent, leaving a critical research gap. This study addresses this gap by analysing how HT and MT rendered SNs formed through various morphological patterns in English language chick lit (ELCL) into Ukrainian. |
Maryna Bielova; | Respectus Philologicus | 2025-10-13 |
| 232 | End-to-end Automatic Speech Recognition and Speech Translation: Integration of Speech Foundational Models and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper explores a combined end-to-end architecture of pre-trained speech encoders and Large Language Models (LLMs) for performing both Automatic Speech Recognition (ASR) and ST simultaneously. |
Nam Luu; Ondřej Bojar; | arxiv-cs.CL | 2025-10-11 |
| 233 | DITING: A Multi-Agent Evaluation Framework for Benchmarking Web Novel Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Large language models (LLMs) have substantially advanced machine translation (MT), yet their effectiveness in translating web novels remains unclear. Existing benchmarks rely on surface-level metrics that fail to capture the distinctive traits of this genre. To address these gaps, we introduce DITING, the first comprehensive evaluation framework for web novel translation, assessing narrative and cultural fidelity across six dimensions: idiom translation, lexical ambiguity, terminology localization, tense consistency, zero-pronoun resolution, and cultural safety, supported by over 18K expert-annotated Chinese-English sentence pairs. |
ENZE ZHANG et. al. | arxiv-cs.CL | 2025-10-10 |
| 234 | A Systematic Review of Bias Detection Methods for Non-English Word Embeddings and Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we conduct a systematic literature review to identify and compare existing bias detection methods for non-English word embeddings and language models. |
ALEXANDRE PUTTICK et. al. | Artificial Intelligence Review | 2025-10-08 |
| 235 | TreePrompt: Leveraging Hierarchical Few-Shot Example Selection for Improved English-Persian and English-German Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Large Language Models (LLMs) have consistently demonstrated strong performance in machine translation, especially when guided by high-quality prompts. |
Ramtin Kakavand; Ebrahim Ansari; | arxiv-cs.CL | 2025-10-04 |
| 236 | SynCED-EnDe 2025: A Synthetic and Curated English – German Dataset for Critical Error Detection in Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present SynCED-EnDe, a new resource consisting of 1,000 gold-labeled and 8,000 silver-labeled sentence pairs, balanced 50/50 between error and non-error cases. |
Muskaan Chopra; Lorenz Sparrenberg; Rafet Sifa; | arxiv-cs.CL | 2025-10-01 |
| 237 | Redefining Machine Simultaneous Interpretation: From Incremental Translation to Human-Like Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We extend the action space of SiMT with four adaptive actions: SENTENCE_CUT, DROP, PARTIAL_SUMMARIZATION and PRONOMINALIZATION, which enable real-time restructuring, omission, and simplification while preserving semantic fidelity. |
Qianen Zhang; Satoshi Nakamura; | arxiv-cs.CL | 2025-09-25 |
| 238 | Low-Resource English-Tigrinya MT: Leveraging Multilingual Models, Custom Tokenizers, and Clean Evaluation Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a refined approach that integrates language-specific tokenization, informed embedding initialization, and domain-adaptive fine-tuning. |
Hailay Kidu Teklehaymanot; Gebrearegawi Gidey; Wolfgang Nejdl; | arxiv-cs.CL | 2025-09-24 |
| 239 | CorIL: Towards Enriching Indian Language to Indian Language Parallel Corpora and Machine Translation Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a large-scale, high-quality annotated parallel corpus covering 11 of these languages: English, Telugu, Hindi, Punjabi, Odia, Kashmiri, Sindhi, Dogri, Kannada, Urdu, and Gujarati comprising a total of 772,000 bi-text sentence pairs. |
SOHAM BHATTACHARJEE et. al. | arxiv-cs.CL | 2025-09-24 |
| 240 | SiniticMTError: A Machine Translation Dataset with Error Annotations for Sinitic Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Cantonese and Wu Chinese are two Sinitic examples, although each enjoys more than 80 million speakers around the world. In this paper, we introduce SiniticMTError, a novel dataset that builds on existing parallel corpora to provide error span, error type, and error severity annotations in machine-translated examples from English to Mandarin, Cantonese, and Wu Chinese. |
HANNAH LIU et. al. | arxiv-cs.CL | 2025-09-24 |
| 241 | Scaling, Simplification, and Adaptation: Lessons from Pretraining on Machine-Translated Text Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: (3) How well do models pretrained on MT-derived data adapt when continually trained on limited native text? |
Dan John Velasco; Matthew Theodore Roque; | arxiv-cs.CL | 2025-09-21 |
| 242 | CUTE: A Multilingual Dataset for Enhancing Cross-Lingual Knowledge Transfer in Low-Resource Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Large Language Models (LLMs) demonstrate exceptional zero-shot capabilities in various NLP tasks, significantly enhancing user experience and efficiency. However, this advantage is primarily limited to resource-rich languages. |
Wenhao Zhuang; Yuan Sun; | arxiv-cs.CL | 2025-09-21 |
| 243 | Multilingual LLM Prompting Strategies for Medical English-Vietnamese Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We systematically evaluate prompting strategies for six multilingual LLMs (0.5B-9B parameters) on the MedEV dataset, comparing zero-shot, few-shot, and dictionary-augmented prompting with Meddict, an English-Vietnamese medical lexicon. |
Nhu Vo; Nu-Uyen-Phuong Le; Dung D. Le; Massimo Piccardi; Wray Buntine; | arxiv-cs.CL | 2025-09-19 |
| 244 | Multilingual Sentiment Analysis with Data Augmentation: A Cross-Language Evaluation in French, German, and Japanese Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Machine learning in natural language processing (NLP) analyzes datasets to make future predictions, but developing accurate models requires large, high-quality, and balanced … |
Suboh M. Alkhushayni; Hye-ok Lee; | Inf. | 2025-09-17 |
| 245 | Audio-Based Crowd-Sourced Evaluation of Machine Translation Quality Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study compares text-only and audio-based evaluations of 10 MT systems from the WMT General MT Shared Task, using crowd-sourced judgments collected via Amazon Mechanical Turk. |
Sami Ul Haq; Sheila Castilho; Yvette Graham; | arxiv-cs.CL | 2025-09-17 |
| 246 | PATIMT-Bench: A Multi-Scenario Benchmark for Position-Aware Text Image Machine Translation in Large Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we extend traditional TIMT into position-aware TIMT (PATIMT), aiming to support fine-grained and layout-preserving translation, which holds great practical value but remains largely unexplored. |
WANRU ZHUANG et. al. | arxiv-cs.CV | 2025-09-14 |
| 247 | Mitigating Language Barriers in Education: Developing Multilingual Digital Learning Materials with Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Launched through collaboration between a major Czech academic institution and the country’s largest educational publisher, the project is aimed at translating up to 9,000 multimodal interactive exercises from Czech into Ukrainian, English, and German for an educational web portal. |
LUCIE POLÁKOVÁ et. al. | arxiv-cs.CL | 2025-09-11 |
| 248 | Small Open Models Achieve Near Parity with Large Models in Low Resource Literary Translation at A Fraction of The Cost Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We contribute to this ongoing research by introducing TINYFABULIST TRANSLATION FRAMEWORK (TF2), a unified framework for dataset creation, fine tuning, and evaluation in English-Romanian literary translations, centred on the creation and open release of both a compact, fine-tuned language model (TF2-12B) and large scale synthetic parallel datasets (DS-TF2-EN-RO-3M and DS-TF2-EN-RO-15K). Building on DS-TF1-EN-3M (TF1), the largest collection of synthetic English fables to date, we address the need for rich, high quality literary datasets in low resource languages such as Romanian. |
Mihai Nadas; Laura Diosan; Andreea Tomescu; Andrei Piscoran; | arxiv-cs.CL | 2025-09-09 |
| 249 | Hunyuan-MT Technical Report Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this report, we introduce Hunyuan-MT-7B, our first open-source multilingual translation model, which supports bidirectional translation across 33 major languages and places a special emphasis on translation between Mandarin and several ethnic minority languages as well as dialects. Furthermore, to serve and address diverse translation scenarios and enhance model performance at test time, we introduce Hunyuan-MT-Chimera-7B, a translation model inspired by the slow thinking mode. |
MAO ZHENG et. al. | arxiv-cs.CL | 2025-09-05 |
| 250 | On English-Chinese Neural Machine Translation Leveraging Transformer Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
S. MONDAL et. al. | Nat. Lang. Process. J. | 2025-09-01 |
| 251 | Evaluating The Impact of Verbal Multiword Expressions on Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we analyze the impact of three VMWE categories (verbal idioms, verb-particle constructions, and light verb constructions) on machine translation quality from English to multiple languages. |
Linfeng Liu; Saptarshi Ghosh; Tianyu Jiang; | arxiv-cs.CL | 2025-08-24 |
| 252 | DocHPLT: A Massively Multilingual Document-Level Translation Dataset Highlight: To facilitate the training and evaluation of document-level translation and, more broadly, long-context modeling for global communities, we create DocHPLT, the largest publicly available document-level translation dataset to date. |
DAYYÁN O’BRIEN et al. | arxiv-cs.CL | 2025-08-18 |
| 253 | Scaling Pseudo-labeling Data for End-to-end Low-resource Speech Translation (the Case of Kurdish Language) Abstract: In this paper we propose a pseudo-labeling pipeline to generate End-to-End Speech to Text Translation (E2E S2TT) data for low-resource languages. This pipeline allows us to … |
Mohammad MohammadAmini; Aghilas Sini; Marie Tahon; Antoine Laurent; | Interspeech 2025 | 2025-08-17 |
| 254 | PEACH: A Sentence-aligned Parallel English-Arabic Corpus for Healthcare Highlight: This paper introduces PEACH, a sentence-aligned parallel English-Arabic corpus of healthcare texts encompassing patient information leaflets and educational materials. |
Rania Al-Sabbagh; | arxiv-cs.CL | 2025-08-07 |
| 255 | Optimal Corpus Aware Training for Neural Machine Translation Highlight: In this work, we propose Optimal Corpus Aware Training (OCAT), which fine-tunes a CAT pre-trained model by freezing most of the model parameters and only tuning a small set of corpus-related parameters. |
Yi-Hsiu Liao; Cheng Shen; | arxiv-cs.LG | 2025-08-07 |
| 256 | In-Context Learning for Low-Resource Machine Translation: A Study on Tarifit with Large Language Models Abstract: This study presents the first systematic evaluation of in-context learning for Tarifit machine translation, a low-resource Amazigh language spoken by 5 million people in Morocco … |
Oussama Akallouch; Khalid Fardousse; | Algorithms | 2025-08-06 |
| 257 | CultureGuard: Towards Culturally-Aware Dataset and Guard Model for Multilingual Safety Applications Highlight: Our approach introduces a four-stage synthetic data generation and filtering pipeline: cultural data segregation, cultural data adaptation, machine translation, and quality filtering. |
RAVIRAJ JOSHI et al. | arxiv-cs.CL | 2025-08-03 |
| 258 | ArzEn-MultiGenre: An Aligned Parallel Dataset of Egyptian Arabic Song Lyrics, Novels, and Subtitles, with English Translations Abstract: ArzEn-MultiGenre is a parallel dataset of Egyptian Arabic song lyrics, novels, and TV show subtitles that are manually translated and aligned with their English counterparts. The … |
Rania Al-Sabbagh; | arxiv-cs.CL | 2025-08-02 |
| 259 | Enhancing Neural Machine Translation Through Incorporation of Unsupervised Language Understanding and Generation Techniques: The Case of English-Afaan Oromo Translation |
Chala Bekabil Geta; Fantahun Gereme; | SN Computer Science | 2025-08-01 |
| 260 | Zero-Shot Cross-Lingual Knowledge Transfer in VQA Via Multimodal Distillation IF:3 Abstract: As multilingual artificial intelligence systems proliferate, achieving robust cross-lingual understanding remains an open challenge. Recent works have made progress on visual … |
YU WENG et al. | IEEE Transactions on Computational Social Systems | 2025-08-01 |
| 261 | RL from Teacher-Model Refinement: Gradual Imitation Learning for Machine Translation Highlight: We propose Reinforcement Learning from Teacher-Model Refinement (RLfR), a novel framework that removes reliance on static triplets by leveraging continuous, high-quality feedback from an external teacher model (GPT-4o). |
Dongyub Jude Lee; Zhenyi Ye; Pengcheng He; | arxiv-cs.CL | 2025-07-29 |
| 262 | GIIFT: Graph-guided Inductive Image-free Multimodal Machine Translation Highlight: In this work, we construct novel multimodal scene graphs to preserve and integrate modality-specific information and introduce GIIFT, a two-stage Graph-guided Inductive Image-Free MMT framework that uses a cross-modal Graph Attention Network adapter to learn multimodal knowledge in a unified fused space and inductively generalize it to broader image-free translation domains. Experimental results on the Multi30K dataset of English-to-French and English-to-German tasks demonstrate that our GIIFT surpasses existing approaches and achieves the state-of-the-art, even without images during inference. |
Jiafeng Xiong; Yuting Zhao; | arxiv-cs.CL | 2025-07-24 |
| 263 | GG-BBQ: German Gender Bias Benchmark for Question Answering Highlight: In our work, we evaluate gender bias in German Large Language Models (LLMs) using the Bias Benchmark for Question Answering by Parrish et al. (2022) as a reference. |
SHALAKA SATHEESH et al. | arxiv-cs.CL | 2025-07-22 |
| 264 | Introducing Quality Estimation to Machine Translation Post-editing Workflow: An Empirical Study on Its Usefulness Highlight: This preliminary study investigates the usefulness of sentence-level Quality Estimation (QE) in English-Chinese Machine Translation Post-Editing (MTPE), focusing on its impact on post-editing speed and student translators’ perceptions. |
Siqi Liu; Guangrong Dai; Dechao Li; | arxiv-cs.CL | 2025-07-22 |
| 265 | Evaluating Text Style Transfer: A Nine-Language Benchmark for Text Detoxification Highlight: In this paper, we perform the first comprehensive multilingual study on evaluation of text detoxification system across nine languages: English, Spanish, German, Chinese, Arabic, Hindi, Ukrainian, Russian, Amharic. |
Vitaly Protasov; Nikolay Babakov; Daryna Dementieva; Alexander Panchenko; | arxiv-cs.CL | 2025-07-21 |
| 266 | It’s Not A Walk in The Park! Challenges of Idiom Translation in Speech-to-text Systems Highlight: In this paper, we systematically evaluate idiom translation as compared to conventional news translation in both text-to-text machine translation (MT) and speech-to-text translation (SLT) systems across two language pairs (German to English, Russian to English). |
IULIIA ZAITOVA et al. | acl | 2025-07-21 |
| 267 | AfroCS-xs: Creating A Compact, High-Quality, Human-Validated Code-Switched Dataset for African Languages Highlight: Code-switching is prevalent in multilingual communities but lacks adequate high-quality data for model development, especially for African languages. To address this, we present AfroCS-xs, a small human-validated synthetic code-switched dataset for four African languages (Afrikaans, Sesotho, Yoruba, isiZulu) and English within a specific domain: agriculture. |
KAYODE OLALEYE et al. | acl | 2025-07-21 |
| 268 | Making LLMs Better Many-to-Many Speech-to-Text Translators with Curriculum Learning IF:3 Highlight: While most existing research has focused on English-centric translation directions, the exploration of many-to-many translation is still limited by the scarcity of parallel data. To address this, we propose a three-stage curriculum learning strategy that leverages the machine translation capabilities of large language models and adapts them to S2TT tasks, enabling effective learning in low-resource settings. |
YEXING DU et al. | acl | 2025-07-21 |
| 269 | Different Speech Translation Models Encode and Translate Speaker Gender Differently Highlight: If so, what are the implications for the speaker’s gender assignment in translation? We address these questions from an interpretability perspective, using probing methods to assess gender encoding across diverse ST models. |
DENNIS FUCCI et al. | acl | 2025-07-21 |
| 270 | Truth Knows No Language: Evaluating Truthfulness Beyond English Highlight: We introduce a professionally translated extension of the TruthfulQA benchmark designed to evaluate truthfulness in Basque, Catalan, Galician, and Spanish. |
BLANCA CALVO FIGUERAS et al. | acl | 2025-07-21 |
| 271 | Multi-perspective Alignment for Increasing Naturalness in Neural Machine Translation Highlight: Inspired by the reinforcement learning from human feedback framework, we introduce a novel method that rewards both naturalness and content preservation. |
Huiyuan Lai; Esther Ploeger; Rik Van Noord; Antonio Toral; | acl | 2025-07-21 |
| 272 | SwiLTra-Bench: The Swiss Legal Translation Benchmark IF:3 Highlight: However, this process traditionally relies on professionals who must be both legal experts and skilled translators, creating bottlenecks and impacting effective access to justice. To address this challenge, we introduce SwiLTra-Bench, a comprehensive multilingual benchmark of over 180K aligned Swiss legal translation pairs comprising laws, headnotes, and press releases across all Swiss languages along with English, designed to evaluate LLM-based translation systems. |
JOEL NIKLAUS et al. | acl | 2025-07-21 |
| 273 | How Important Is ‘Perfect’ English for Machine Translation Prompts? Abstract: Large language models (LLMs) have achieved top results in recent machine translation evaluations, but they are also known to be sensitive to errors and perturbations in their … |
PATRÍCIA SCHMIDTOVÁ et al. | Conference of the European Chapter of the Association for … | 2025-07-13 |
| 274 | How Important Is ‘Perfect’ English for Machine Translation Prompts? Highlight: Large language models (LLMs) have achieved top results in recent machine translation evaluations, but they are also known to be sensitive to errors and perturbations in their prompts. |
PATRÍCIA SCHMIDTOVÁ et al. | arxiv-cs.CL | 2025-07-13 |
| 275 | GRAFT: A Graph-based Flow-aware Agentic Framework for Document-level Machine Translation Highlight: Existing approaches rely on heuristic rules to segment documents into discourse units, which rarely align with the true discourse structure required for accurate translation. Otherwise, they fail to maintain consistency throughout the document during translation. To address these challenges, we propose Graph Augmented Agentic Framework for Document Level Translation (GRAFT), a novel graph based DocMT system that leverages Large Language Model (LLM) agents for document translation. |
Himanshu Dutta; Sunny Manchanda; Prakhar Bapat; Meva Ram Gurjar; Pushpak Bhattacharyya; | arxiv-cs.CL | 2025-07-04 |
| 276 | Application of Intelligent Translation System Based on Machine Translation in English Education Curriculum Reform |
Shan Guo; | Discov. Artif. Intell. | 2025-07-01 |
| 277 | SH-RAG: A Syntax-Based Hierarchical Retrieval-Augmented Generation Framework for Handling Syntactic Complexity in Literary Machine Translation Abstract: With the rapid advancements in machine translation (MT), large language models (LLMs) have demonstrated the ability to generate high-quality translations with relatively low … |
Hao Lu; Zhen Zhao; | 2025 International Joint Conference on Neural Networks … | 2025-06-30 |
| 278 | Decoding Machine Translationese in English-Chinese News: LLMs Vs. NMTs Highlight: This study explores Machine Translationese (MTese), the linguistic peculiarities of machine translation outputs, focusing on the under-researched English-to-Chinese language pair in news texts. |
Delu Kong; Lieve Macken; | arxiv-cs.CL | 2025-06-27 |
| 279 | High-Fidelity Simultaneous Speech-To-Speech Translation IF:3 Highlight: We introduce Hibiki, a decoder-only model for simultaneous speech translation. |
Tom Labiausse; Laurent Mazaré; Edouard Grave; Alexandre Défossez; Neil Zeghidour; | icml | 2025-06-25 |
| 280 | Learning from Others’ Mistakes: Finetuning Machine Translation Models with Span-level Error Annotations Highlight: In this work, we explore the potential of utilizing fine-grained span-level annotations from offline datasets to improve model quality. |
LILY H ZHANG et al. | icml | 2025-06-25 |
| 281 | Lost in Machine Translation: The Sociocultural Implications of Language Technologies in Nigeria Abstract: In this paper, we explore machine translation within Nigerian contexts. We conducted 20 semi-structured interviews with native speakers of Ìgbò, Yorùbá, and Hausa that have used … |
SEYI OLOJO et al. | Proceedings of the 2025 ACM Conference on Fairness, … | 2025-06-23 |
| 282 | Gender-Neutral Machine Translation Strategies in Practice Highlight: Here we assess the sensitivity of 21 MT systems to the need for gender neutrality in response to gender ambiguity in three translation directions of varying difficulty. |
Hillary Dawkins; Isar Nejadgholi; Chi-kiu Lo; | arxiv-cs.CL | 2025-06-18 |
| 283 | Edeflip: Supervised Word Translation Between English and Yoruba Highlight: In this study, we implement an established supervised embedding alignment method for word translation from English to Yoruba, the latter a low-resource language. |
Ikeoluwa Abioye; Jiani Ge; | arxiv-cs.CL | 2025-06-15 |
| 284 | The Saturation Point of Backtranslation in High Quality Low Resource English Gujarati Machine Translation Abstract: Backtranslation (BT) is widely used in low resource machine translation (MT) to generate additional synthetic training data using monolingual corpora. While this approach has shown … |
Arwa Arif; | ArXiv | 2025-06-12 |
| 285 | Design of Intelligent Proofreading System for English Translation Based on CNN and BERT Highlight: This paper proposes a novel hybrid approach for robust proofreading that combines convolutional neural networks (CNN) with Bidirectional Encoder Representations from Transformers (BERT). |
Feijun Liu; Huifeng Wang; Kun Wang; Yizhen Wang; | arxiv-cs.CL | 2025-06-05 |
| 286 | It’s Not A Walk in The Park! Challenges of Idiom Translation in Speech-to-text Systems Highlight: In this paper, we systematically evaluate idiom translation as compared to conventional news translation in both text-to-text machine translation (MT) and speech-to-text translation (SLT) systems across two language pairs (German to English, Russian to English). |
IULIIA ZAITOVA et al. | arxiv-cs.CL | 2025-06-03 |
| 287 | The Quality Optimization of English–Chinese Machine Translation Based on Deep Neural Networks |
Ping Lu; Fangfang Xu; | Discover Artificial Intelligence | 2025-06-03 |
| 288 | How Programming Concepts and Neurons Are Shared in Code Language Models Highlight: In this paper, we investigate the relationship between multiple PLs and English in the concept space of LLMs. |
Amir Hossein Kargaran; Yihong Liu; François Yvon; Hinrich Schütze; | arxiv-cs.CL | 2025-06-01 |
| 289 | VietMix: A Naturally Occurring Vietnamese-English Code-Mixed Corpus with Iterative Augmentation for Machine Translation Highlight: Machine translation systems fail when processing code-mixed inputs for low-resource languages. We address this challenge by curating VietMix, a parallel corpus of naturally occurring code-mixed Vietnamese text paired with expert English translations. |
HIEU TRAN et al. | arxiv-cs.CL | 2025-05-30 |
| 290 | BeaverTalk: Oregon State University’s IWSLT 2025 Simultaneous Speech Translation System Highlight: This paper discusses the construction, fine-tuning, and deployment of BeaverTalk, a cascaded system for speech-to-text translation as part of the IWSLT 2025 simultaneous translation task. |
Matthew Raffel; Victor Agostinelli; Lizhong Chen; | arxiv-cs.CL | 2025-05-29 |
| 291 | MT$^{3}$: Scaling MLLM-based Text Image Machine Translation Via Multi-Task Reinforcement Learning Highlight: Recent advances in large-scale Reinforcement Learning (RL) have improved reasoning in Large Language Models (LLMs) and Multimodal LLMs (MLLMs), but their application to end-to-end TIMT is still underexplored. To bridge this gap, we introduce MT$^{3}$, the first framework to apply Multi-Task RL to MLLMs for end-to-end TIMT. |
ZHAOPENG FENG et al. | arxiv-cs.CL | 2025-05-26 |
| 292 | KIT’s Low-resource Speech Translation Systems for IWSLT2025: System Enhancement with Synthetic Data and Model Regularization Highlight: This paper presents KIT’s submissions to the IWSLT 2025 low-resource track. |
ZHAOLIN LI et al. | arxiv-cs.CL | 2025-05-26 |
| 293 | Evaluating Machine Translation Models for English-Hindi Language Pairs: A Comparative Analysis Highlight: The study aims to provide insights into the effectiveness of different machine translation approaches in handling both general and specialized language domains. |
Ahan Prasannakumar Shetty; | arxiv-cs.CL | 2025-05-26 |
| 294 | MT3: Scaling MLLM-based Text Image Machine Translation Via Multi-Task Reinforcement Learning Abstract: Text Image Machine Translation (TIMT), the task of translating textual content embedded in images, is critical for applications in accessibility, cross-lingual information access, … |
ZHAOPENG FENG et al. | ArXiv | 2025-05-26 |
| 295 | Evaluating Translation Quality: A Qualitative and Quantitative Assessment of Machine and LLM-Driven Arabic-English Translations Abstract: This study investigates translation quality between Arabic and English, comparing traditional rule-based machine translation systems, modern neural machine translation tools such … |
Tawffeek A. S. Mohammed; | Inf. | 2025-05-26 |
| 296 | Building A Functional Machine Translation Corpus for Kpelle Highlight: In this paper, we introduce the first publicly available English-Kpelle dataset for machine translation, comprising over 2000 sentence pairs drawn from everyday communication, religious texts, and educational materials. |
Kweku Andoh Yamoah; Jackson Weako; Emmanuel J. Dorley; | arxiv-cs.CL | 2025-05-24 |
| 297 | Low-Resource NMT: A Case Study on The Written and Spoken Languages in Hong Kong Highlight: This paper describes a transformer-based neural machine translation (NMT) system for written-Chinese-to-written-Cantonese translation. |
Hei Yi Mak; Tan Lee; | arxiv-cs.CL | 2025-05-23 |
| 298 | Comparative Analysis of Subword Tokenization Approaches for Indian Languages Highlight: This paper examines how different subword tokenization techniques, such as SentencePiece, Byte Pair Encoding (BPE), and WordPiece Tokenization, affect ILs. |
Sudhansu Bala Das; Samujjal Choudhury; Tapas Kumar Mishra; Bidyut Kr. Patra; | arxiv-cs.CL | 2025-05-22 |
| 299 | SSR-Zero: Simple Self-Rewarding Reinforcement Learning for Machine Translation Abstract: Large language models (LLMs) have recently demonstrated remarkable capabilities in machine translation (MT). However, most advanced MT-specific LLMs heavily rely on external … |
Wenjie Yang; Mao Zheng; Mingyang Song; Zheng Li; Sitong Wang; | arxiv-cs.CL | 2025-05-22 |
| 300 | SlangDIT: Benchmarking LLMs in Interpretative Slang Translation Highlight: Based on the benchmark, we propose a deep thinking model, named SlangOWL. |
Yunlong Liang; Fandong Meng; Jiaan Wang; Jie Zhou; | arxiv-cs.CL | 2025-05-20 |
| 301 | ExTrans: Multilingual Deep Reasoning Translation Via Exemplar-Enhanced Reinforcement Learning Highlight: With a carefully designed lightweight reward modeling in RL, we can simply transfer the strong MT ability from a single direction into multiple (i.e., 90) translation directions and achieve impressive multilingual MT performance. |
Jiaan Wang; Fandong Meng; Jie Zhou; | arxiv-cs.CL | 2025-05-19 |
| 302 | Multi-head Temporal Latent Attention Highlight: This paper proposes Multi-head Temporal Latent Attention (MTLA), which further reduces the KV cache size along the temporal dimension, greatly lowering the memory footprint of self-attention inference. |
Keqi Deng; Philip C. Woodland; | arxiv-cs.LG | 2025-05-18 |
| 303 | LLM-Based Evaluation of Low-Resource Machine Translation: A Reference-less Dialect Guided Approach with A Refined Sylheti-English Benchmark Highlight: In this work, we propose a comprehensive framework that enhances LLM-based MT evaluation using a dialect guided approach. |
Md. Atiqur Rahman; Sabrina Islam; Mushfiqul Haque Omi; | arxiv-cs.CL | 2025-05-18 |
| 304 | Multilingual Machine Translation with Quantum Encoder Decoder Attention-based Convolutional Variational Circuits Highlight: Cloud-based multilingual translation services like Google Translate and Microsoft Translator achieve state-of-the-art translation capabilities. These services inherently use large multilingual language models such as GRU, LSTM, BERT, GPT, T5, or similar encoder-decoder architectures with attention mechanisms as the backbone. |
Subrit Dikshit; Ritu Tiwari; Priyank Jain; | arxiv-cs.CL | 2025-05-14 |
| 305 | Do Not Change Me: On Transferring Entities Without Modification in Neural Machine Translation -- A Multilingual Perspective Highlight: In this paper, we explore the abilities of popular NMT models, including models from the OPUS project, Google Translate, MADLAD, and EuroLLM, to preserve entities such as URL addresses, IBAN numbers, or emails when producing translations between four languages: English, German, Polish, and Ukrainian. |
Dawid Wisniewski; Mikolaj Pokrywka; Zofia Rostek; | arxiv-cs.CL | 2025-05-09 |
| 306 | Data Augmentation With Back Translation for Low Resource Languages: A Case of English and Luganda Highlight: In this paper, we explore the application of Back translation (BT) as a semi-supervised technique to enhance Neural Machine Translation (NMT) models for the English-Luganda language pair, specifically addressing the challenges faced by low-resource languages. |
Richard Kimera; Dongnyeong Heo; Daniela N. Rim; Heeyoul Choi; | arxiv-cs.CL | 2025-05-05 |
| 307 | Giving The Old A Fresh Spin: Quality Estimation-Assisted Constrained Decoding for Automatic Post-Editing Highlight: In this paper, we propose a novel technique to mitigate over-correction by incorporating word-level Quality Estimation (QE) information during the decoding process. |
Sourabh Deoghare; Diptesh Kanojia; Pushpak Bhattacharyya; | naacl | 2025-05-04 |
| 308 | SwissADT: An Audio Description Translation System for Swiss Languages Highlight: In this work, we introduce SwissADT, an **emerging** ADT system for three main Swiss languages and English, designed for future use by our industry partners. |
Lukas Fischer; Yingqiang Gao; Alexa Lintner; Annette Rios; Sarah Ebling; | naacl | 2025-05-04 |
| 309 | The Impact of Domain-Specific Terminology on Machine Translation for Finance in European Languages Highlight: In this study, we present the first impact analysis of domain-specific terminology on multilingual MT for finance, focusing on European languages within the subdomain of macroeconomics. |
Arturo Oncevay; Charese Smiley; Xiaomo Liu; | naacl | 2025-05-04 |
| 310 | Large Language Models for Persian-English Idiom Translation Highlight: This paper introduces two parallel datasets of sentences containing idiomatic expressions for Persian→English and English→Persian translations, with Persian idioms sampled from our PersianIdioms resource, a collection of 2,200 idioms and their meanings, with 700 including usage examples. |
Sara Rezaeimanesh; Faezeh Hosseini; Yadollah Yaghoobzadeh; | naacl | 2025-05-04 |
| 311 | FLEURS-ASL: Including American Sign Language in Massively Multilingual Multitask Evaluation IF:3 Highlight: In order to help converge the fields, we introduce FLEURS-ASL, an extension of the multiway parallel benchmarks FLORES (for text) and FLEURS (for speech) to support their first sign language (as video), American Sign Language, translated by 5 Certified Deaf Interpreters. |
Garrett Tanzer; | naacl | 2025-05-04 |
| 312 | Automatic Input Rewriting Improves Translation with Large Language Models Highlight: We present an empirical study of 21 input rewriting methods with 3 open-weight LLMs for translating from English into 6 target languages. |
Dayeon Ki; Marine Carpuat; | naacl | 2025-05-04 |
| 313 | Wav2Prompt: End-to-End Speech Prompt Learning and Task-based Fine-tuning for Text-based LLMs IF:3 Highlight: Wav2Prompt uses a straightforward training process with only the same data used to train an automatic speech recognition (ASR) model. |
Keqi Deng; Guangzhi Sun; Phil Woodland; | naacl | 2025-05-04 |
| 314 | Team ACK at SemEval-2025 Task 2: Beyond Word-for-Word Machine Translation for English-Korean Pairs Highlight: We evaluate 13 models (LLMs and MT models) using automatic metrics and human assessment by bilingual annotators. |
DANIEL LEE et al. | arxiv-cs.CL | 2025-04-29 |
| 315 | To MT or Not to MT: An Eye-tracking Study on The Reception By Dutch Readers of Different Translation and Creativity Levels Highlight: This article presents the results of a pilot study involving the reception of a fictional short story translated from English into Dutch under four conditions: machine translation (MT), post-editing (PE), human translation (HT) and original source text (ST). |
Kyo Gerrits; Ana Guerberof-Arenas; | arxiv-cs.CL | 2025-04-28 |
| 316 | Optimising ChatGPT for Creativity in Literary Translation: A Case Study from English Into Dutch, Chinese, Catalan and Spanish Highlight: This study examines the variability of Chat-GPT machine translation (MT) outputs across six different configurations in four languages, with a focus on creativity in a literary text. |
Shuxiang Du; Ana Guerberof Arenas; Antonio Toral; Kyo Gerrits; Josep Marco Borillo; | arxiv-cs.CL | 2025-04-25 |
| 317 | Low-Resource Neural Machine Translation Using Recurrent Neural Networks and Transfer Learning: A Case Study on English-to-Igbo Highlight: In this study, we develop Neural Machine Translation (NMT) and Transformer-based transfer learning models for English-to-Igbo translation – a low-resource African language spoken by over 40 million people across Nigeria and West Africa. |
Ocheme Anthony Ekle; Biswarup Das; | arxiv-cs.CL | 2025-04-24 |
| 318 | Memory Reviving, Continuing Learning and Beyond: Evaluation of Pre-trained Encoders and Decoders for Multimodal Machine Translation Highlight: In this work, we conduct a systematic study on the impact of pre-trained encoders and decoders in multimodal translation models. |
Zhuang Yu; Shiliang Sun; Jing Zhao; Tengfei Song; Hao Yang; | arxiv-cs.CL | 2025-04-24 |
| 319 | Comparing Large Language Models and Traditional Machine Translation Tools for Translating Medical Consultation Summaries: A Pilot Study Highlight: This study evaluates how well large language models (LLMs) and traditional machine translation (MT) tools translate medical consultation summaries from English into Arabic, Chinese, and Vietnamese. |
Andy Li; Wei Zhou; Rashina Hoda; Chris Bain; Peter Poon; | arxiv-cs.CL | 2025-04-23 |
| 320 | CLIRudit: Cross-Lingual Information Retrieval of Scientific Documents Highlight: By making the dataset and code publicly available, we aim to facilitate further research that will help make scientific knowledge more accessible across language barriers. |
Francisco Valentini; Diego Kozlowski; Vincent Larivière; | arxiv-cs.IR | 2025-04-22 |
| 321 | The Paradox of Poetic Intent in Back-Translation: Evaluating The Quality of Large Language Models in Chinese Translation Highlight: This study constructs a diverse corpus encompassing Chinese scientific terminology, historical translation paradoxes, and literary metaphors. |
Li Weigang; Pedro Carvalho Brom; | arxiv-cs.CL | 2025-04-22 |
| 322 | FairTranslate: An English-French Dataset for Gender Bias Evaluation in Machine Translation By Overcoming Gender Binarity Highlight: This paper presents FairTranslate, a novel, fully human-annotated dataset designed to evaluate non-binary gender biases in machine translation systems from English to French. |
Fanny Jourdan; Yannick Chevalier; Cécile Favre; | arxiv-cs.CL | 2025-04-22 |
| 323 | Trans-Zero: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data Highlight: The rise of Large Language Models (LLMs) has reshaped machine translation (MT), but multilingual MT still relies heavily on parallel data for supervised fine-tuning (SFT), facing challenges like data scarcity for low-resource languages and catastrophic forgetting. To address these issues, we propose TRANS-ZERO, a self-play framework that leverages only monolingual data and the intrinsic multilingual knowledge of LLM. |
WEI ZOU et. al. | arxiv-cs.CL | 2025-04-20 |
| 324 | A Multimodal Recaptioning Framework to Account for Perceptual Diversity in Multilingual Vision-Language Modeling Highlight: Machine translation of captions has pushed multilingual capabilities in vision-language models (VLMs), but data comes mainly from English speakers, indicating a perceptual bias and lack of model flexibility. In this work, we address this challenge and outline a data-efficient framework to instill multilingual VLMs with greater understanding of perceptual diversity. |
Kyle Buettner; Jacob Emmerson; Adriana Kovashka; | arxiv-cs.CV | 2025-04-19 |
| 325 | Non-Autoregressive Multimodal Machine Translation Highlight: In this paper, we propose the non-autoregressive language model (NA-LM) for multimodal machine translation. |
G. LIU et. al. | icassp | 2025-04-15 |
| 326 | Evaluating Menu OCR and Translation: A Benchmark for Aligning Human and Automated Evaluations in Large Vision-Language Models Highlight: In this paper, we propose Menu OCR and Translation Benchmark (MOTBench), a specialized evaluation framework emphasizing the pivotal role of menu translation in cross-cultural communication. |
ZHANGLIN WU et. al. | arxiv-cs.LG | 2025-04-15 |
| 327 | Automated Python Translation Highlight: To that end, we introduce the task of automatically translating Python’s natural modality (keywords, error types, identifiers, etc.) into other human languages. |
Joshua Otten; Antonios Anastasopoulos; Kevin Moran; | arxiv-cs.CL | 2025-04-15 |
| 328 | Investigating Numerical Translation with Large Language Models Highlight: This study focuses on evaluating the reliability of LLM-based machine translation systems when handling numerical data. |
W. Tang; | icassp | 2025-04-15 |
| 329 | Textless Streaming Speech-to-Speech Translation Using Semantic Speech Tokens Highlight: In this paper, we propose a transducer-based speech translation model that outputs discrete speech tokens in a low-latency streaming fashion. |
J. Zhao; | icassp | 2025-04-15 |
| 330 | Can You Map It to English? The Role of Cross-Lingual Alignment in Multilingual Performance of LLMs Highlight: For this purpose, we introduce cross-lingual alignment metrics such as the Discriminative Alignment Index (DALI) to quantify the alignment at an instance level for discriminative tasks. |
Kartik Ravisankar; Hyojung Han; Marine Carpuat; | arxiv-cs.CL | 2025-04-12 |
| 331 | Enhancing Contrastive Demonstration Selection with Semantic Diversity for Robust In-Context Machine Translation Highlight: In this paper, we propose DiverseConE (Diversity-Enhanced Contrastive Example Selection), a novel approach for demonstration selection in in-context learning for machine translation. |
Owen Patterson; Chee Ng; | arxiv-cs.CL | 2025-04-12 |
| 332 | Adaptive English Translation Parameter Tuning Via Particle Swarm Optimization and Attention Mechanism Abstract: This paper proposes an English Translation Model Parameter Adaptive Tuning Method Integrating Particle Swarm Optimization (PSO) and Attention Mechanism. We encode key Neural … |
Zhigao Wei; | Int. J. Swarm Intell. Res. | 2025-04-12 |
| 333 | High-Resource Translation: Turning Abundance Into Accessibility Highlight: This paper presents a novel approach to constructing an English-to-Telugu translation model by leveraging transfer learning techniques and addressing the challenges associated with low-resource languages. |
Abhiram Reddy Yanampally; | arxiv-cs.CL | 2025-04-08 |
| 334 | GlotEval: A Test Suite for Massively Multilingual Evaluation of Large Language Models Highlight: Existing evaluation frameworks are disproportionately focused on English and a handful of high-resource languages, thereby overlooking the realistic performance of LLMs in multilingual and lower-resource scenarios. To address this gap, we introduce GlotEval, a lightweight framework designed for massively multilingual evaluation. Supporting seven key tasks (machine translation, text classification, summarization, open-ended generation, reading comprehension, sequence labeling, and intrinsic evaluation), spanning over dozens to hundreds of languages, GlotEval highlights consistent multilingual benchmarking, language-specific prompt templates, and non-English-centric machine translation. |
HENGYU LUO et. al. | arxiv-cs.CL | 2025-04-05 |
| 335 | MultiMed-ST: Large-scale Many-to-many Multilingual Medical Speech Translation Highlight: In this work, we present the first systematic study on medical ST, to our best knowledge, by releasing MultiMed-ST, a large-scale ST dataset for the medical domain, spanning all translation directions in five languages: Vietnamese, English, German, French, Traditional Chinese and Simplified Chinese, together with the models. |
KHAI LE-DUC et. al. | arxiv-cs.CL | 2025-04-04 |
| 336 | State-of-the-Art Translation of Text-to-Gloss Using MBART : A Case Study of Bangla Highlight: Based on the results, this study proposes a new paradigm for text-to-gloss task using mBART models. |
Sharif Md. Abdullah; Abhijit Paul; Shebuti Rayana; Ahmedul Kabir; Zarif Masud; | arxiv-cs.CL | 2025-04-03 |
| 337 | Limitations of Religious Data and The Importance of The Target Domain: Towards Machine Translation for Guinea-Bissau Creole Highlight: We introduce a new dataset for machine translation of Guinea-Bissau Creole (Kiriol), comprising around 40 thousand parallel sentences to English and Portuguese. |
Jacqueline Rowe; Edward Gow-Smith; Mark Hepple; | arxiv-cs.CL | 2025-04-03 |
| 338 | Beyond Vanilla Fine-Tuning: Leveraging Multistage, Multilingual, and Domain-Specific Methods for Low-Resource Machine Translation Highlight: However, conventional single-stage fine-tuning methods struggle in extremely low-resource NMT settings, where training data is very limited. This paper contributes to artificial intelligence by proposing two approaches for adapting msLLMs in these challenging scenarios: (1) continual pre-training (CPT), where the msLLM is further trained with domain-specific monolingual data to compensate for the under-representation of LRLs, and (2) intermediate task transfer learning (ITTL), a method that fine-tunes the msLLM with both in-domain and out-of-domain parallel data to enhance its translation capabilities across various domains and tasks. |
Sarubi Thillainathan; Songchen Yuan; En-Shiun Annie Lee; Sanath Jayasena; Surangika Ranathunga; | arxiv-cs.CL | 2025-03-28 |
| 339 | HausaNLP at SemEval-2025 Task 2: Entity-Aware Fine-tuning Vs. Prompt Engineering in Entity-Aware Machine Translation Highlight: This paper presents our findings for SemEval 2025 Task 2, a shared task on entity-aware machine translation (EA-MT). |
ABDULHAMID ABUBAKAR et. al. | arxiv-cs.CL | 2025-03-25 |
| 340 | New Trends for Modern Machine Translation with Large Reasoning Models IF:3 Highlight: We identify three foundational shifts: 1) contextual coherence, where LRMs resolve ambiguities and preserve discourse structure through explicit reasoning over cross-sentence and complex context or even lack of context; 2) cultural intentionality, enabling models to adapt outputs by inferring speaker intent, audience expectations, and socio-linguistic norms; 3) self-reflection, LRMs can perform self-reflection during the inference time to correct the potential errors in translation especially extremely noisy cases, showing better robustness compared to simply mapping X->Y translation. |
SINUO LIU et. al. | arxiv-cs.CL | 2025-03-13 |
| 341 | Word2winners at SemEval-2025 Task 7: Multilingual and Crosslingual Fact-Checked Claim Retrieval Highlight: This paper describes our system for SemEval 2025 Task 7: Previously Fact-Checked Claim Retrieval. |
Amirmohammad Azadi; Sina Zamani; Mohammadmostafa Rostamkhani; Sauleh Eetemadi; | arxiv-cs.CL | 2025-03-11 |
| 342 | Contextual Cues in Machine Translation: Investigating The Potential of Multi-Source Input Strategies in LLMs and NMT Systems Highlight: We explore the impact of multi-source input strategies on machine translation (MT) quality, comparing GPT-4o, a large language model (LLM), with a traditional multilingual neural machine translation (NMT) system. |
Lia Shahnazaryan; Patrick Simianer; Joern Wuebker; | arxiv-cs.CL | 2025-03-10 |
| 343 | Assumed Identities: Quantifying Gender Bias in Machine Translation of Ambiguous Occupational Terms Abstract: Machine Translation (MT) systems frequently encounter gender-ambiguous occupational terms, where they must assign gender without explicit contextual cues. While individual … |
O. M. Mastromichalakis; Giorgos Filandrianos; M. Symeonaki; G. Stamou; | ArXiv | 2025-03-06 |
| 344 | Improving Neural Machine Translation Through Code‐Mixed Data Augmentation Abstract: This paper studies neural machine translation (NMT) of code‐mixed (CM) text. Specifically, we generate synthetic CM data and how it can be used to improve the translation … |
Ramakrishna Appicharla; Kamal Kumar Gupta; Asif Ekbal; Pushpak Bhattacharyya; | Computational Intelligence | 2025-03-06 |
| 345 | Comparative Study of Zero-Shot Cross-Lingual Transfer for Bodo POS and NER Tagging Using Gemini 2.0 Flash Thinking Experimental Model Highlight: This article presents a comparative empirical study investigating the effectiveness of Google’s Gemini 2.0 Flash Thinking Experiment model for zero-shot cross-lingual transfer of POS and NER tagging to Bodo. |
SANJIB NARZARY et. al. | arxiv-cs.CL | 2025-03-06 |
| 346 | Semantic Relationship Extraction of English Long Sentences and Quality Optimization of Machine Translation Based on BERT Model Abstract: With the acceleration of globalization, cross-linguistic communication has become an indispensable part of daily life, and the status of English as an international lingua franca … |
Sha Chen; | Journal of Computational Methods in Sciences and Engineering | 2025-03-04 |
| 347 | Chitranuvad: Adapting Multi-Lingual LLMs for Multimodal Translation IF:3 Highlight: In this work, we provide the system description of our submission as part of the English to Lowres Multimodal Translation Task at the Workshop on Asian Translation (WAT2024). |
SHAHARUKH KHAN et. al. | arxiv-cs.CL | 2025-02-27 |
| 348 | Nexus: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision Highlight: This work proposes an industry-level omni-modal large language model (LLM) pipeline that integrates auditory, visual, and linguistic modalities to overcome challenges such as limited tri-modal datasets, high computational costs, and complex feature alignments. |
CHE LIU et. al. | arxiv-cs.MM | 2025-02-26 |
| 349 | Nexus-O: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision Abstract: This work proposes an industry-level omni-modal large language model (LLM) pipeline that integrates auditory, visual, and linguistic modalities to overcome challenges such as … |
CHE LIU et. al. | ArXiv | 2025-02-26 |
| 350 | Contextual Effects of Sentiment Deployment in Human and Machine Translation Highlight: This paper illustrates how the overall sentiment of a text may be shifted in translation and the implications for automated sentiment analyses, particularly those that utilize machine translation and assess findings via semantic similarity metrics. |
Lindy Comstock; Priyanshu Sharma; Mikhail Belov; | arxiv-cs.CL | 2025-02-25 |
| 351 | Adaptive Few-shot Prompting for Machine Translation with Pre-trained Language Models Highlight: However, existing evidence shows that LLMs are prompt-sensitive and it is sub-optimal to apply the fixed prompt to any input for downstream machine translation tasks. To address this issue, we propose an adaptive few-shot prompting (AFSP) framework to automatically select suitable translation demonstrations for various source input sentences to further elicit the translation capability of an LLM for better machine translation. |
Lei Tang; Jinghui Qin; Wenxuan Ye; Hao Tan; Zhijing Yang; | aaai | 2025-02-25 |
| 352 | Using Machine Learning to Detect Fraudulent SMSs in Chichewa Highlight: This paper introduces a first dataset for SMS fraud detection in Chichewa, a major language in Africa, and reports on experiments with machine learning algorithms for classifying SMSs in Chichewa as fraud or non-fraud. |
Amelia Taylor; Amoss Robert; | arxiv-cs.LG | 2025-02-24 |
| 353 | UrduLLaMA 1.0: Dataset Curation, Preprocessing, and Evaluation in Low-Resource Settings Highlight: This paper introduces UrduLLaMA 1.0, a model derived from the open-source Llama-3.1-8B-Instruct architecture and continually pre-trained on 128 million Urdu tokens, capturing the rich diversity of the language. |
Layba Fiaz; Munief Hassan Tahir; Sana Shams; Sarmad Hussain; | arxiv-cs.CL | 2025-02-24 |
| 354 | MultiSlav: Using Cross-Lingual Knowledge Transfer to Combat The Curse of Multilinguality Highlight: In this study, we explore multiple approaches for extending the available data-regime in NMT and we prove cross-lingual benefits even in 0-shot translation regime for low-resource languages. |
ARTUR KOT et. al. | arxiv-cs.CL | 2025-02-20 |
| 355 | English Please: Evaluating Machine Translation with Large Language Models for Multilingual Bug Reports Highlight: In this study, we conduct the first comprehensive evaluation of machine translation (MT) performance on bug reports, analyzing the capabilities of DeepL, AWS Translate, and large language models such as ChatGPT, Claude, Gemini, LLaMA, and Mistral using data from the Visual Studio Code GitHub repository, specifically focusing on reports labeled with the english-please tag. |
Avinash Patil; Siru Tao; Aryan Jadon; | arxiv-cs.CL | 2025-02-20 |
| 356 | A Study on The Translation of Spoken English from Speech to Text Abstract: Rapid translation of spoken English is conducive to international communication. This paper briefly introduces a convolutional neural network (CNN) algorithm for converting … |
Ying Zhang; | J. ICT Stand. | 2025-02-19 |
| 357 | Identifying Gender Stereotypes and Biases in Automated Translation from English to Italian Using Similarity Networks Highlight: This paper is a collaborative effort between Linguistics, Law, and Computer Science to evaluate stereotypes and biases in automated translation systems. |
Fatemeh Mohammadi; Marta Annamaria Tamborini; Paolo Ceravolo; Costanza Nardocci; Samira Maghool; | arxiv-cs.CL | 2025-02-17 |
| 358 | Truth Knows No Language: Evaluating Truthfulness Beyond English Highlight: We introduce a professionally translated extension of the TruthfulQA benchmark designed to evaluate truthfulness in Basque, Catalan, Galician, and Spanish. |
BLANCA CALVO FIGUERAS et. al. | arxiv-cs.CL | 2025-02-13 |
| 359 | Automatic Translation Between Kreol Morisien and English Using The Marian Machine Translation Framework Abstract: Kreol Morisien is a vibrant and expressive language that reflects the multicultural heritage of Mauritius. There are different versions of Kreol languages. While Kreol Morisien is … |
Z. Boodeea; S. Pudaruth; Nitish Chooramun; Aneerav Sukhoo; | Informatics | 2025-02-10 |
| 360 | Joint Pairwise Learning and Masked Language Models for Neural Machine Translation of English |
Shuhan Yang; Qun Yang; | Artif. Life Robotics | 2025-02-10 |
| 361 | Evaluating Text Style Transfer Evaluation: Are There Any Reliable Metrics? Highlight: In this paper, we examine both set of existing and novel metrics from broader NLP tasks for TST evaluation, focusing on two popular subtasks, sentiment transfer and detoxification, in a multilingual context comprising English, Hindi, and Bengali. |
Sourabrata Mukherjee; Atul Kr. Ojha; John P. McCrae; Ondrej Dusek; | arxiv-cs.CL | 2025-02-07 |
| 362 | BOUQuET: Dataset, Benchmark and Open Initiative for Universal Quality Evaluation in Translation Highlight: Compared with related machine translation datasets, we show that BOUQuET has a broader representation of domains while simplifying the translation task for non-experts. |
THE OMNILINGUAL MT TEAM et. al. | arxiv-cs.CL | 2025-02-06 |
| 363 | High-Fidelity Simultaneous Speech-To-Speech Translation IF:3 Highlight: We introduce Hibiki, a decoder-only model for simultaneous speech translation. |
TOM LABIAUSSE et. al. | arxiv-cs.CL | 2025-02-05 |
| 364 | Cross-Lingual Transfer for Low-Resource Natural Language Processing Highlight: For model-based transfer, we introduce a constrained decoding algorithm that enhances cross-lingual Sequence Labeling in zero-shot settings using text-to-text models. |
Iker García-Ferrero; | arxiv-cs.CL | 2025-02-04 |
| 365 | When End-to-End Is Overkill: Rethinking Cascaded Speech-to-Text Translation Highlight: In this paper, we explore the benefits of incorporating multiple candidates from ASR and self-supervised speech features into MT. Our analysis reveals that the primary cause of cascading errors stems from the increased divergence between similar samples in the speech domain when mapped to the text domain. |
Anna Min; Chenxu Hu; Yi Ren; Hang Zhao; | arxiv-cs.CL | 2025-02-01 |
| 366 | Cross-Language Approach for Quranic QA Highlight: However, these systems face unique challenges, including the linguistic disparity between questions written in Modern Standard Arabic and answers found in Quranic verses written in Classical Arabic, and the small size of existing datasets, which further restricts model performance. To address these challenges, we adopt a cross-language approach by (1) Dataset Augmentation: expanding and enriching the dataset through machine translation to convert Arabic questions into English, paraphrasing questions to create linguistic diversity, and retrieving answers from an English translation of the Quran to align with multilingual training requirements; and (2) Language Model Fine-Tuning: utilizing pre-trained models such as BERT-Medium, RoBERTa-Base, DeBERTa-v3-Base, ELECTRA-Large, Flan-T5, Bloom, and Falcon to address the specific requirements of Quranic QA. |
Islam Oshallah; Mohamed Basem; Ali Hamdi; Ammar Mohammed; | arxiv-cs.CL | 2025-01-29 |
| 367 | A Comparison of Data Filtering Techniques for English-Polish LLM-based Machine Translation in The Biomedical Domain Highlight: This paper evaluates the impact of commonly used data filtering techniques, such as LASER, MUSE, and LaBSE, on English-Polish translation within the biomedical domain. |
Jorge del Pozo Lérida; Kamil Kojs; János Máté; Mikołaj Antoni Barański; Christian Hardmeier; | arxiv-cs.CL | 2025-01-27 |
| 368 | Improving Estonian Text Simplification Through Pretrained Language Models and Custom Datasets Highlight: This study introduces an approach to Estonian text simplification using two model architectures: a neural machine translation model and a fine-tuned large language model (LLaMA). |
Eduard Barbu; Meeri-Ly Muru; Sten Marcus Malva; | arxiv-cs.CL | 2025-01-26 |
| 369 | CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation Highlight: Direct Preference Optimization (DPO) has emerged as a simpler and more efficient alternative, but its performance depends heavily on the quality of preference data. To address this, we propose Confidence-Reward driven Preference Optimization (CRPO), a novel method that combines reward scores with model confidence to improve data selection for fine-tuning. |
GUOFENG CUI et. al. | arxiv-cs.CL | 2025-01-23 |
| 370 | Domain-Specific Machine Translation to Translate Medicine Brochures in English to Sorani Kurdish Highlight: Access to Kurdish medicine brochures is limited, depriving Kurdish-speaking communities of critical health information. To address this problem, we developed a specialized Machine Translation (MT) model to translate English medicine brochures into Sorani Kurdish using a parallel corpus of 22,940 aligned sentence pairs from 319 brochures, sourced from two pharmaceutical companies in the Kurdistan Region of Iraq (KRI). |
Mariam Shamal; Hossein Hassani; | arxiv-cs.CL | 2025-01-23 |
| 371 | HERITAGE: An End-to-End Web Platform for Processing Korean Historical Documents in Hanja Highlight: Modern Koreans and Chinese cannot understand Korean historical documents without substantial additional help, and while previous efforts have produced some Korean and English translations, this requires in-depth expertise, and so most of the documents are not translated into any modern language. To address this gap, we present HERITAGE, the first open-source Hanja NLP toolkit to assist in understanding and translating the unexplored Korean historical documents written in Hanja. |
Seyoung Song; Haneul Yoo; Jiho Jin; Kyunghyun Cho; Alice Oh; | arxiv-cs.CL | 2025-01-21 |
| 372 | ViBidirectionMT-Eval: Machine Translation for Vietnamese-Chinese and Vietnamese-Lao Language Pair Highlight: This paper presents the results of the VLSP 2022-2023 Machine Translation Shared Tasks, focusing on Vietnamese-Chinese and Vietnamese-Lao machine translation. |
Hong-Viet Tran; Minh-Quy Nguyen; Van-Vinh Nguyen; | arxiv-cs.CL | 2025-01-15 |
| 373 | Improving Neural Machine Translation in The Field of Electrical Engineering By Using Sentence Backbone Information Abstract: Due to the limited availability of corpora in the field of Electrical Engineering and the presence of numerous specialized terms, neural machine translation (NMT) performs poorly … |
Bingtao Teng; Yuan Chen; Juwei Zhang; | ACM Transactions on Asian and Low-Resource Language … | 2025-01-15 |
| 374 | Multilingual LLMs Struggle to Link Orthography and Semantics in Bilingual Word Processing Highlight: In humans, this is evident in the ease with which cognate words – words similar in both orthographic form and meaning (e.g., blind, meaning sightless in both English and German) – are processed, compared to the challenges posed by interlingual homographs, which share orthographic form but differ in meaning (e.g., gift, meaning present in English but poison in German). We investigate how multilingual Large Language Models (LLMs) handle such phenomena, focusing on English-Spanish, English-French, and English-German cognates, non-cognate, and interlingual homographs. |
Eshaan Tanwar; Gayatri Oke; Tanmoy Chakraborty; | arxiv-cs.CL | 2025-01-15 |
| 375 | AFRIDOC-MT: Document-level MT Corpus for African Languages IF:3 Highlight: This paper introduces AFRIDOC-MT, a document-level multi-parallel translation dataset covering English and five African languages: Amharic, Hausa, Swahili, Yorùbá, and Zulu. |
JESUJOBA O. ALABI et. al. | arxiv-cs.CL | 2025-01-10 |
| 376 | Investigating Numerical Translation with Large Language Models Highlight: This study focuses on evaluating the reliability of LLM-based machine translation systems when handling numerical data. |
WEI TANG et. al. | arxiv-cs.CL | 2025-01-08 |
| 377 | Crossing Language Borders: A Pipeline for Indonesian Manhwa Translation Highlight: In this project, we develop a practical and efficient solution for automating the Manhwa translation from Indonesian to English. |
Nithyasri Narasimhan; Sagarika Singh; | arxiv-cs.LG | 2025-01-02 |
| 378 | Constitutive Artificial Neural Network for The Construction of An English Multimodal Corpus Abstract: Multimodal corpus is a novel multimedia teaching tool in social development and educational reform process. It uses a range of multimedia components to build a wide-ranging … |
Junhua Li; Yuehua Li; Lihao Han; | Int. Arab J. Inf. Technol. | 2025-01-01 |
| 379 | A Novel Approach to Continual Knowledge Transfer in Multilingual Neural Machine Translation Using Autoregressive and Non-Autoregressive Models for Indic Languages Abstract: Recent progress in multilingual pre-trained models has significantly improved translation quality for Indic languages. However, extending these models to new languages via … |
Shailashree K. Sheshadri; Deepa Gupta; Biswajit Paul; J. S. Bhavani; | IEEE Access | 2025-01-01 |
| 380 | Multimodal Machine Translation with Text-Image In-depth Questioning |
YUE GAO et. al. | Annual Meeting of the Association for Computational … | 2025-01-01 |
| 381 | Neural Machine Translation in Electrical Engineering With Cross-Layer Information Fusion and Multiple Positional Mapping Abstract: This study focuses on the English-Chinese neural machine translation task of professional texts in the field of electrical engineering. Such texts are usually dense in terms, … |
Zhenyu Zhang; Yuan Chen; Juwei Zhang; | IEEE Access | 2025-01-01 |
| 382 | InImageTrans: Multimodal LLM-based Text Image Machine Translation |
Fei Zuo; Kehai Chen; Yu Zhang; Zhengshan Xue; Min Zhang; | Annual Meeting of the Association for Computational … | 2025-01-01 |
| 383 | Adaptive Neural Machine Translation with Attention Mechanisms for English Texts |
Weiwei Suo; | Int. J. Inf. Commun. Technol. | 2025-01-01 |
| 384 | Improving Low-Resource Kazakh-English and Turkish-English Neural Machine Translation Using Transfer Learning and Part of Speech Tags Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study presents a novel translation framework by combining transfer learning and part-of-speech (POS) tagging methods to improve the performance of low-resource neural machine … |
Bilge Kagan Yazar; Erdal Kiliç; | IEEE Access | 2025-01-01 |
| 385 | Algorithms for Optimizing The Effect of English Machine Translation Using Transformer Mode Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Despite the widespread application of the Internet and the in-depth development of globalization, English machine translation still has problems such as poor translation accuracy, … |
Yan Ma; | Journal of Computational Methods in Sciences and Engineering | 2025-01-01 |
| 386 | English Please: Evaluating Machine Translation for Multilingual Bug Reports Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save |
Avinash Patil; Aryan Jadon; | ArXiv | 2025-01-01 |
| 387 | Application of Minimax Optimization Mechanism in Chinese-English Machine Translation Quality Estimation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Machine Translation Quality Estimation (MTQE) is pivotal in bridging the gap between machine-generated translations and human translation quality, especially in real-time … |
Xiaomei Zhang; | IEEE Access | 2025-01-01 |
| 388 | Knowledge‐Grounded Attention‐Based Neural Machine Translation Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Neural machine translation (NMT) model processes sentences in isolation and ignores additional contextual or side information beyond sentences. The input text alone often provides … |
HUMA ISRAR et. al. | Applied Computational Intelligence and Soft Computing | 2025-01-01 |
| 389 | Transformer-Based Amharic-to-English Machine Translation With Character Embedding and Combined Regularization Techniques Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Amharic is the working language of Ethiopia and, owing to its Semitic characteristics, the language is known for its complex morphology. It is also an under-resourced language, … |
Surafiel Habib Asefa; Yaregal Assabie; | IEEE Access | 2025-01-01 |
| 390 | Domain-Generalized Emotion Recognition on German Text Corpora Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Text-based emotion recognition plays a crucial role in various domains and applications due to its significance in understanding human behavior and improving communication … |
Oweys Momenzada; Michael Palk; Stefan Voss; | IEEE Access | 2025-01-01 |
| 391 | Developing Japanese CLIP Models Leveraging An Open-weight LLM for Large-scale Dataset Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: CLIP is a foundational model that bridges images and text, widely adopted as a key component in numerous vision-language models. However, the lack of large-scale open Japanese … |
Issa Sugiura; Shuhei Kurita; Yusuke Oda; Daisuke Kawahara; Naoaki Okazaki; | North American Chapter of the Association for Computational … | 2025-01-01 |
| 392 | Toward Low-Resource Languages Machine Translation: A Language-Specific Fine-Tuning With LoRA for Specialized Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the field of computational linguistics, addressing machine translation (MT) challenges for low-resource languages remains crucial, as these languages often lack extensive data … |
Xiao Liang; Yen-Min Jasmina Khaw; S. Liew; T. Tan; Donghong Qin; | IEEE Access | 2025-01-01 |
| 393 | QDLTrans: Enhancing English Neural Machine Translation With Quantized Attention Block and Tunable Dual Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Neural Machine Translation (NMT) is a fundamental task in natural language processing, typically relying on large-scale parallel corpora to achieve high translation quality. … |
Xing-Ying Liu; | IEEE Access | 2025-01-01 |
| 394 | Advancing Explainability in Neural Machine Translation: Analytical Metrics for Attention and Alignment Consistency Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The interpretability of these models, especially their internal attention mechanisms, is critical for building trust and verifying that these systems behave as intended. In this work, we introduce a systematic framework to quantitatively evaluate the explainability of an NMT model's attention patterns by comparing them against statistical alignments and correlating them with standard machine translation quality metrics. |
Anurag Mishra; | arxiv-cs.AI | 2024-12-24 |
| 395 | Towards Global AI Inclusivity: A Large-Scale Multilingual Terminology Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The field of machine translation has achieved significant advancements, yet domain-specific terminology translation, particularly in AI, remains challenging. We introduce GIST, a … |
JIARUI LIU et. al. | ArXiv | 2024-12-24 |
| 396 | Towards Global AI Inclusivity: A Large-Scale Multilingual Terminology Dataset (GIST) Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce GIST, a large-scale multilingual AI terminology dataset containing 5K terms extracted from top AI conference papers spanning 2000 to 2023. |
JIARUI LIU et. al. | arxiv-cs.CL | 2024-12-24 |
| 397 | Ensuring Consistency for In-Image Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The former entails incorporating image information during translation, while the latter involves maintaining consistency between the style of the text-image and the original image, ensuring background integrity. To address these consistency requirements, we introduce a novel two-stage framework named HCIIT (High-Consistency In-Image Translation) which involves text-image translation using a multimodal multilingual large language model in the first stage and image backfilling with a diffusion model in the second stage. |
CHENGPENG FU et. al. | arxiv-cs.CL | 2024-12-23 |
| 398 | Mention Attention for Pronoun Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We assume that extracting additional mention features can help pronoun translation. Therefore, we introduce an additional mention attention module in the decoder to pay extra attention to source mentions but not non-mention tokens. |
Gongbo Tang; Christian Hardmeier; | arxiv-cs.CL | 2024-12-19 |
| 399 | Understanding and Analyzing Model Robustness and Knowledge-Transfer in Multilingual Neural Machine Translation Using TX-Ray Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, Multilingual Neural Machine Translation (MNMT) in extremely low-resource settings remains underexplored. This research investigates how knowledge transfer across languages can enhance MNMT in such scenarios. |
Vageesh Saxena; Sharid Loáiciga; Nils Rethmeier; | arxiv-cs.CL | 2024-12-18 |
| 400 | Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Why do we build local large language models (LLMs)? |
KOSHIRO SAITO et. al. | arxiv-cs.CL | 2024-12-18 |
| 401 | The Role of Handling Attributive Nouns in Improving Chinese-To-English Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we specifically target the translation challenges posed by attributive nouns in Chinese, which frequently cause ambiguities in English translation. |
Lisa Wang; Adam Meyers; John E. Ortega; Rodolfo Zevallos; | arxiv-cs.CL | 2024-12-18 |
| 402 | Analyzing The Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we investigate the role of attention heads in Context-aware Machine Translation models for pronoun disambiguation in the English-to-German and English-to-French language directions. |
Paweł Mąka; Yusuf Can Semerci; Jan Scholtes; Gerasimos Spanakis; | arxiv-cs.CL | 2024-12-15 |
| 403 | Neural Machine Translation Techniques for English Text to Pakistan Sign Language Gloss Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Abdul Majid Tanwir; Muhammad Najeeb Jilani; Zaviar Khan; Abdul Samad; | Univers. Access Inf. Soc. | 2024-12-14 |
| 404 | Large Language Models for Persian ↔ English Idiom Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces two parallel datasets of sentences containing idiomatic expressions for Persian→English and English→Persian translations, with Persian idioms sampled from our PersianIdioms resource, a collection of 2,200 idioms and their meanings, with 700 including usage examples. |
Sara Rezaeimanesh; Faezeh Hosseini; Yadollah Yaghoobzadeh; | arxiv-cs.CL | 2024-12-13 |
| 405 | Shiksha: A Technical Domain Focused Translation Dataset and Model for Indian Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Finding a translation dataset that tends to these domains in particular, poses a difficult challenge. In this paper, we address this by creating a multilingual parallel corpus containing more than 2.8 million rows of English-to-Indic and Indic-to-Indic high-quality translation pairs across 8 Indian languages. |
Advait Joglekar; Srinivasan Umesh; | arxiv-cs.CL | 2024-12-12 |
| 406 | Neural Machine Translation Model Using GRU with Hybrid Attention Mechanism for English to Kannada Language Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Neural machine translation is a machine translation system that uses artificial neural networks to identify nonlinear relationships between bilingual sentence pairs. The language … |
Gunti Spandan; Prasannavenkatesan Theerthagiri; | J. Wirel. Mob. Networks Ubiquitous Comput. Dependable Appl. | 2024-12-12 |
| 407 | Domain-Specific Translation with Open-Source Large Language Models: Resource-Oriented Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we compare the domain-specific translation performance of open-source autoregressive decoder-only large language models (LLMs) with task-oriented machine translation (MT) models. |
Aman Kassahun Wassie; Mahdi Molaei; Yasmin Moslem; | arxiv-cs.CL | 2024-12-08 |
| 408 | BhashaVerse : Translation Ecosystem for Indian Subcontinent Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper focuses on developing translation models and related applications for 36 Indian languages, including Assamese, Awadhi, Bengali, Bhojpuri, Braj, Bodo, Dogri, English, Konkani, Gondi, Gujarati, Hindi, Hinglish, Ho, Kannada, Kangri, Kashmiri (Arabic and Devanagari), Khasi, Mizo, Magahi, Maithili, Malayalam, Marathi, Manipuri (Bengali and Meitei), Nepali, Oriya, Punjabi, Sanskrit, Santali, Sinhala, Sindhi (Arabic and Devanagari), Tamil, Tulu, Telugu, and Urdu. |
Vandan Mujadia; Dipti Misra Sharma; | arxiv-cs.CL | 2024-12-05 |
| 409 | Representation Purification for End-to-End Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we conceptualize speech representation as a combination of content-agnostic and content-relevant factors. |
Chengwei Zhang; Yue Zhou; Rui Zhao; Yidong Chen; Xiaodong Shi; | arxiv-cs.CL | 2024-12-05 |
| 410 | Agent AI with LangGraph: A Modular Framework for Enhancing Machine Translation Using Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper explores the transformative role of Agent AI and LangGraph in advancing the automation and effectiveness of machine translation (MT). |
Jialin Wang; Zhihua Duan; | arxiv-cs.CL | 2024-12-04 |
| 411 | Comparison of Machine Translation Services in The Biomedical Context Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Patients and healthcare providers often face communication barriers due to different languages, which can impede healthcare quality. Moreover, with digital medicine and health … |
Tamara Slosarek; Daniel Paeschke; Igor Sivtsev; Erwin P. Böttinger; | 2024 IEEE International Conference on Bioinformatics and … | 2024-12-03 |
| 412 | Automatic SQL Query Generation from Code Switched Natural Language Questions on Electronic Medical Records Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Electronic Medical Records (EMRs) meticulously document patient information in relational databases, presenting a challenging task for medical professionals to effectively … |
Haodi Zhang; Jinyin Nie; Zeming Liu; Dong Lei; Yuanfeng Song; | 2024 IEEE International Conference on Bioinformatics and … | 2024-12-03 |
| 413 | A Multi-way Parallel Named Entity Annotated Corpus for English, Tamil and Sinhala Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents a multi-way parallel English-Tamil-Sinhala corpus annotated with Named Entities (NEs), where Sinhala and Tamil are low-resource languages. |
SURANGIKA RANATHUNGA et. al. | arxiv-cs.CL | 2024-12-02 |
| 414 | TIFD: Tibetan Instruction-Following Dataset for Large Language Models Supervised Fine-Tuning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In addressing challenges within the field of Natural Language Processing (NLP), supervised fine-tuning is an efficient technique that allows pre-trained Large Language Models to … |
Wenhao Zhuang; Dawa Cairen; Yuan Sun; | Data Intell. | 2024-12-01 |
| 415 | Towards Santali Linguistic Inclusion: Building The First Santali-to-English Translation Model Using MT5 Transformer and Data Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our paper aims to include Santali to the NLP spectrum. |
SYED MOHAMMED MOSTAQUE BILLAH et. al. | arxiv-cs.CL | 2024-11-29 |
| 416 | Aligning Pre-trained Models for Spoken Language Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper investigates a novel approach to end-to-end speech translation (ST) based on aligning frozen pre-trained automatic speech recognition (ASR) and machine translation (MT) models via a small connector module (Q-Former, our Subsampler-Transformer Encoder). |
Šimon Sedláček; Santosh Kesiraju; Alexander Polok; Jan Černocký; | arxiv-cs.CL | 2024-11-27 |
| 417 | Keep It Local: Comparing Domain-Specific LLMs in Native and Machine Translated Text Using Parallel Corpora on Political Conflict Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The dynamics of political conflict and cooperation require powerful computerized tools capable of effectively tracking security threats and cooperation around the world. This … |
JAVIER OSORIO et. al. | 2024 2nd International Conference on Foundation and Large … | 2024-11-26 |
| 418 | From Scarcity to Sufficiency: Machine Translation Techniques for Low-Resource LLMs Enhancement Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study addresses the significant performance disparities in large language models between high-resource languages like English and low-resource languages such as Thai, … |
Hao Yang; Min Zhang; Jiaxin Guo; | 2024 2nd International Conference on Foundation and Large … | 2024-11-26 |
| 419 | SwissADT: An Audio Description Translation System for Swiss Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: By collecting well-crafted AD data augmented with video clips in German, French, Italian, and English, and leveraging the power of Large Language Models (LLMs), we aim to enhance information accessibility for diverse language populations in Switzerland by automatically translating AD scripts to the desired Swiss language. |
Lukas Fischer; Yingqiang Gao; Alexa Lintner; Sarah Ebling; | arxiv-cs.CL | 2024-11-22 |
| 420 | Benchmarking GPT-4 Against Human Translators: A Comprehensive Evaluation Across Languages, Domains, and Expertise Levels IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This study presents a comprehensive evaluation of GPT-4’s translation capabilities compared to human translators of varying expertise levels. |
JIANHAO YAN et. al. | arxiv-cs.CL | 2024-11-20 |
| 421 | A Comparative Study of Text Retrieval Models on DaReCzech Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This article presents a comprehensive evaluation of 7 off-the-shelf document retrieval models: Splade, Plaid, Plaid-X, SimCSE, Contriever, OpenAI ADA and Gemma2 chosen to determine their performance on the Czech retrieval dataset DaReCzech. |
Jakub Stetina; Martin Fajcik; Michal Stefanik; Michal Hradis; | arxiv-cs.IR | 2024-11-19 |
| 422 | Effect of Parallel Data Processing Model on Bi-Directional English-Khimtagne Machine Translation Using Deep Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Neural Machine Translation (NMT) is a key application of Natural Language Processing (NLP) that allows text to be translated automatically from one natural language to another … |
ADANE KASIE CHEKOLE et. al. | 2024 International Conference on Information and … | 2024-11-18 |
| 423 | MiTTenS: A Dataset for Evaluating Gender Mistranslation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Translation systems, including foundation models capable of translation, can produce errors that result in gender mistranslation, and such errors can be especially harmful. To measure the extent of such potential harms when translating into and out of English, we introduce a dataset, MiTTenS, covering 26 languages from a variety of language families and scripts, including several traditionally under-represented in digital resources. |
Kevin Robinson; Sneha Kudugunta; Romina Stella; Sunipa Dev; Jasmijn Bastings; | emnlp | 2024-11-11 |
| 424 | Reconsidering Sentence-Level Sign Language Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Historically, sign language machine translation has been posed as a sentence-level task: datasets consisting of continuous narratives are chopped up and presented to the model as isolated clips. In this work, we explore the limitations of this task framing. |
Garrett Tanzer; Maximus Shengelia; Ken Harrenstien; David Uthus; | emnlp | 2024-11-11 |
| 425 | Error Analysis of Multilingual Language Models in Machine Translation: A Case Study of English-Amharic Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We employed both automatic and human evaluation methods to analyze translation errors. |
Hizkiel Mitiku Alemayehu; Hamada M Zahera; Axel-Cyrille Ngonga Ngomo; | emnlp | 2024-11-11 |
| 426 | Using Language Models to Disambiguate Lexical Choices in Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We work with native speakers of nine languages to create DTAiLS, a dataset of 1,377 sentence pairs that exhibit cross-lingual concept variation when translating from English. |
Josh Barua; Sanjay Subramanian; Kayo Yin; Alane Suhr; | emnlp | 2024-11-11 |
| 427 | Towards Cross-Cultural Machine Translation with Retrieval-Augmented Generation from Multilingual Knowledge Graphs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we address the problem of cross-cultural translation on two fronts: (i) we introduce XC-Translate, the first large-scale, manually-created benchmark for machine translation that focuses on text that contains potentially culturally-nuanced entity names, and (ii) we propose KG-MT, a novel end-to-end method to integrate information from a multilingual knowledge graph into a neural machine translation model by leveraging a dense retrieval mechanism. |
SIMONE CONIA et. al. | emnlp | 2024-11-11 |
| 428 | Fine-Tuning Large Language Models to Translate: Will A Touch of Noisy Data in Misaligned Languages Suffice? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Traditionally, success in multilingual machine translation can be attributed to three key factors in training data: large volume, diverse translation directions, and high quality. In the current practice of fine-tuning large language models (LLMs) for translation, we revisit the importance of these factors. |
DAWEI ZHU et. al. | emnlp | 2024-11-11 |
| 429 | SpeechQE: Estimating The Quality of Direct Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we formulate the task of quality estimation for speech translation (SpeechQE), construct a benchmark, and evaluate a family of systems based on cascaded and end-to-end architectures. |
HyoJung Han; Kevin Duh; Marine Carpuat; | emnlp | 2024-11-11 |
| 430 | CULL-MT: Compression Using Language and Layer Pruning for Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present CULL-MT, a compression method for machine translation models based on structural layer pruning and selected language directions. |
Pedram Rostami; Mohammad Javad Dousti; | arxiv-cs.CL | 2024-11-10 |
| 431 | Fineweb-Edu-Ar: Machine-translated Corpus to Support Arabic Small Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This report introduces FineWeb-Edu-Ar, a machine-translated version of the exceedingly popular (deduplicated) FineWeb-Edu dataset from HuggingFace. |
Sultan Alrashed; Dmitrii Khizbullin; David R. Pugh; | arxiv-cs.CL | 2024-11-10 |
| 432 | Distilling Knowledge in Machine Translation of Agglutinative Languages with Backward and Morphological Decoders Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Agglutinative languages often have morphologically complex words (MCWs) composed of multiple morphemes arranged in a hierarchical structure, posing significant challenges in … |
Telem Joyson Singh; Sanasam Ranbir Singh; Priyankoo Sarmah; | ACM Transactions on Asian and Low-Resource Language … | 2024-11-07 |
| 433 | Predictor-Corrector Enhanced Transformers with Exponential Moving Average Coefficient Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present a series of advanced explorations of Transformer architecture design to minimize the error compared to the true “solution.” |
BEI LI et. al. | arxiv-cs.CL | 2024-11-05 |
| 434 | Context-Informed Machine Translation of Manga Using Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we investigate to what extent multimodal large language models (LLMs) can provide effective manga translation, thereby assisting manga authors and publishers in reaching wider audiences. |
Philip Lippmann; Konrad Skublicki; Joshua Tanner; Shonosuke Ishiwatari; Jie Yang; | arxiv-cs.CL | 2024-11-04 |
| 435 | Multimodal LLM Enhanced Cross-lingual Cross-modal Retrieval IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, aligning their representations poses challenges due to the significant semantic gap between vision and text, as well as the lower quality of non-English representations caused by pre-trained encoders and data noise. To overcome these challenges, we propose LECCR, a novel solution that incorporates the multi-modal large language model (MLLM) to improve the alignment between visual and non-English representations. |
YABING WANG et. al. | mm | 2024-10-30 |
| 436 | Virtual Visual-Guided Domain-Shadow Fusion Via Modal Exchanging for Domain-Specific Multi-Modal Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This challenge can lead to a decrease in machine translation performance for domain-specific terms. To tackle this problem, this paper presents a virtual visual scene-guided domain-shadow multi-modal fusion mechanism to simultaneously integrate multi-grained domain visual details and text with the guidance of modality-agnostic virtual visual scene, thereby enhancing machine translation performance for DMNMT, especially for domain terms. |
Zhenyu Hou; Junjun Guo; | mm | 2024-10-30 |
| 437 | Evaluating Terminology Translation in Machine Translation Systems Via Metamorphic Testing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Machine translation has become an integral part of daily life, with terminology translation playing a crucial role in ensuring the accuracy of translation results. However, … |
Yihui Xu; Yanhui Li; Jun Wang; Xiaofang Zhang; | 2024 39th IEEE/ACM International Conference on Automated … | 2024-10-27 |
| 438 | Enhancing Pretrained Multilingual Machine Translation Model with Code-Switching: A Study on Chinese, English and Malay Language Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the field of multilingual machine translation, many pretrained language models have achieved the inspiring results. However, the results based on pretrained models are not yet … |
Haijing Liu; Noraini Seman; | Proceedings of the 2024 13th International Conference on … | 2024-10-25 |
| 439 | Dialectal and Low Resource Machine Translation for Aromanian Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents the process of building a neural machine translation system with support for English, Romanian, and Aromanian – an endangered Eastern Romance language. The … |
Alexandru-Iulius Jerpelea; Alina-Ștefania Rădoi; Sergiu Nisioi; | International Conference on Computational Linguistics | 2024-10-23 |
| 440 | Dialectal and Low-Resource Machine Translation for Aromanian Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The primary contribution of this research is twofold: (1) the creation of the most extensive Aromanian-Romanian parallel corpus to date, consisting of 79,000 sentence pairs, and (2) the development and comparative analysis of several machine translation models optimized for Aromanian. |
Alexandru-Iulius Jerpelea; Alina Rădoi; Sergiu Nisioi; | arxiv-cs.CL | 2024-10-23 |
| 441 | Can General-Purpose Large Language Models Generalize to English-Thai Machine Translation ? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Large language models (LLMs) perform well on common tasks but struggle with generalization in low-resource and low-computation settings. We examine this limitation by testing various LLMs and specialized translation models on English-Thai machine translation and code-switching datasets. |
JIRAT CHIARANAIPANICH et. al. | arxiv-cs.CL | 2024-10-22 |
| 442 | On Creating An English-Thai Code-switched Machine Translation in Medical Domain Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Our research prioritizes not merely improving translation accuracy but also maintaining medical terminology in English within the translated text through code-switched (CS) translation. We developed a method to produce CS medical translation data, fine-tuned a CS translation model with this data, and evaluated its performance against strong baselines, such as Google Neural Machine Translation (NMT) and GPT-3.5/GPT-4. |
PARINTHAPAT PENGPUN et. al. | arxiv-cs.CL | 2024-10-21 |
| 443 | Learning from Others’ Mistakes: Finetuning Machine Translation Models with Span-level Error Annotations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we explore the potential of utilizing fine-grained span-level annotations from offline datasets to improve model quality. |
LILY H. ZHANG et. al. | arxiv-cs.CL | 2024-10-21 |
| 444 | A Study on Non-Autoregressive Mongolian-Chinese Neural Machine Translation for Multilingual Pre-Training Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Aiming at the problem that most of the current Mongolian-Chinese Neural Machine Translation (NMT) adopts autoregressive generation, which is prone to error accumulation and slow … |
Xiaoli Zheng; Yonghong Tian; Chang Ma; Kangkang Sun; | 2024 7th International Conference on Machine Learning and … | 2024-10-18 |
| 445 | IIIT-Speech Twins 1.0: An English-Hindi Parallel Speech Corpora for Speech-to-Speech Machine Translation and Automatic Dubbing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The demand for high-quality parallel speech data has been increasing as deep-learning based Speech to Speech Machine Translation (SSMT) and automatic dubbing approaches gain … |
Anindita Mondal; A. Vuppala; Chiranjeevi Yarra; | 2024 27th Conference of the Oriental COCOSDA International … | 2024-10-17 |
| 446 | Quantity Vs. Quality of Monolingual Source Data in Automatic Text Translation: Can It Be Too Little If It Is Too Good? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, it has been shown that too much of this data can be detrimental to the performance of the model if the available parallel data is comparatively extremely low. In this study, we investigate whether the monolingual data can also be too little and if this reduction, based on quality, has any effect on the performance of the translation model. |
Idris Abdulmumin; Bashir Shehu Galadanci; Garba Aliyu; Shamsuddeen Hassan Muhammad; | arxiv-cs.CL | 2024-10-17 |
| 447 | Findings of The WMT 2024 Shared Task on Chat Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents the findings from the third edition of the Chat Translation Shared Task. |
WAFAA MOHAMMED et. al. | arxiv-cs.CL | 2024-10-15 |
| 448 | Study on An Intelligent English Translation Method Using An Improved Convolutional Neural Network Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study uses an enhanced convolutional neural network (CNN) model for English translation. Traditional translation methods often struggle with complex language structures, … |
Lijie Su; | Int. J. e Collab. | 2024-10-15 |
| 449 | Machine Translation Evaluation Benchmark for Wu Chinese: Workflow and Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a FLORES+ dataset as an evaluation benchmark for modern Wu Chinese machine translation models and showcase its compatibility with existing Wu data. |
Hongjian Yu; Yiming Shi; Zherui Zhou; Christopher Haberland; | arxiv-cs.CL | 2024-10-14 |
| 450 | Code-Mixer Ya Nahi: Novel Approaches to Measuring Multilingual LLMs’ Code-Mixing Capabilities Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce Rule-Based Prompting, a novel prompting technique to generate code-mixed sentences. |
Ayushman Gupta; Akhil Bhogal; Kripabandhu Ghosh; | arxiv-cs.CL | 2024-10-14 |
| 451 | Is Hate Lost in Translation?: Evaluation of Multilingual LGBTQIA+ Hate Speech Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper explores the challenges of detecting LGBTQIA+ hate speech of large language models across multiple languages, including English, Italian, Chinese and (code-switched) English-Tamil, examining the impact of machine translation and whether the nuances of hate speech are preserved across translation. |
Fai Leui Chan; Duke Nguyen; Aditya Joshi; | arxiv-cs.CL | 2024-10-14 |
| 452 | QE-EBM: Using Quality Estimators As Energy Loss for Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose QE-EBM, a method of employing quality estimators as trainable loss networks that can directly backpropagate to the NMT model. |
Gahyun Yoo; Jay Yoon Lee; | arxiv-cs.CL | 2024-10-14 |
| 453 | Ukrainian-to-English Folktale Corpus: Parallel Corpus Creation and Augmentation for Machine Translation in Low-resource Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We have created a new Ukrainian-To-English parallel corpus of familiar Ukrainian folktales based on available English translations and suggested several new ones. We offer a combined domain-specific approach to building and augmenting this corpus, considering the nature of the domain and differences in the purpose of human versus machine translation. |
Olena Burda-Lassen; | arxiv-cs.CL | 2024-10-13 |
| 454 | QUEST: Quality-Aware Metropolis-Hastings Sampling for Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we address the problem of sampling a set of high-quality and diverse translations. |
GONÇALO FARIA et. al. | nips | 2024-10-07 |
| 455 | TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this study, we introduce a novel model framework TransVIP that leverages diverse datasets in a cascade fashion yet facilitates end-to-end inference through joint probability. |
CHENYANG LE et. al. | nips | 2024-10-07 |
| 456 | Predictor-Corrector Enhanced Transformers with Exponential Moving Average Coefficient Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present a series of advanced explorations of Transformer architecture design to minimize the error compared to the true “solution.” |
BEI LI et. al. | nips | 2024-10-07 |
| 457 | Tibetan-Chinese Machine Translation Enhanced on Cross-Lingual Pre-Trained Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Tibetan-Chinese machine translation has become a focal point of interest within the Tibetan community due to its importance for effective communication and cultural preservation. … |
Mingjun Zhou; Quzong Gesang; Nuo Qun; Tashi Nyima; Rinchen Dongrub; | 2024 IEEE International Conference on Systems, Man, and … | 2024-10-06 |
| 458 | Cogs in A Machine, Doing What They’re Meant to Do – The AMI Submission to The WMT24 General Translation Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents the submission of the Árni Magnússon Institute’s team to the WMT24 General translation task. We work on the English->Icelandic translation direction. Our … |
Atli Jasonarson; Hinrik Hafsteinsson; Bjarki Ármannsson; Steinþór Steingrímsson; | ArXiv | 2024-10-04 |
| 459 | Cogs in A Machine, Doing What They’re Meant to Do — The AMI Submission to The WMT24 General Translation Task Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents the submission of the Árni Magnússon Institute’s team to the WMT24 General translation task. |
Atli Jasonarson; Hinrik Hafsteinsson; Bjarki Ármannsson; Steinþór Steingrímsson; | arxiv-cs.CL | 2024-10-04 |
| 460 | Large Language Model for Multi-Domain Translation: Benchmarking and Domain CoT Fine-tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our evaluation of prominent LLMs reveals a discernible performance gap against traditional MT systems, highlighting domain overfitting and catastrophic forgetting issues after fine-tuning on domain-limited corpora. To mitigate this, we propose a domain Chain of Thought (CoT) fine-tuning technique that utilizes the intrinsic multi-domain intelligence of LLMs to improve translation performance. |
TIANXIANG HU et. al. | arxiv-cs.CL | 2024-10-03 |
| 461 | Research on English Translation Optimization Algorithm Based on Statistical Machine Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the study titled Research on English Translation Optimization Algorithm Based on Statistical Machine Learning: IAAM-NN (Integrating Advanced Attention Mechanisms with Neural … |
Jinghan Wang; | Scalable Comput. Pract. Exp. | 2024-10-01 |
| 462 | Semantic Textual Similarity: Overview and Comparative Study Between Arabic and English Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic Textual Similarity is crucial for various end-user applications of Natural Language Processing, including Search Engines, Chatbots, Machine Translation Systems, … |
Samira Boudaa; Tarik Boudaa; Anass El Haddadi; | Computación y Sistemas (CyS) | 2024-09-30 |
| 463 | Disentangling Singlish Discourse Particles with Task-Driven Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: After disentanglement, we cluster these discourse particles to differentiate their pragmatic functions, and perform Singlish-to-English machine translation. Our work provides a computational method to understanding Singlish discourse particles, and opens avenues towards a deeper comprehension of the language and its usage. |
Linus Tze En Foo; Lynnette Hui Xian Ng; | arxiv-cs.CL | 2024-09-30 |
| 464 | Generation of Test Datasets Using LLM – Quality Assurance Perspective Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Domain relevant data and an adequate number of samples are necessary to properly evaluate the robustness of the Machine Learning (ML) models. This is the case for ML models used … |
Jose Leandro Sousa; Cristian Souza; Raiza Hanada; Diogo Nascimento; Eliane Collins; | Brazilian Symposium on Software Engineering | 2024-09-30 |
| 465 | Can LLMs Really Learn to Translate A Low-Resource Language from One Grammar Book? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Machine Translation from One Book (Tanzer et al., 2024) suggests that prompting long-context LLMs with one grammar book enables English-Kalamang translation, an XLR language unseen by LLMs – a noteworthy case of linguistics helping an NLP task. We investigate the source of this translation ability, finding almost all improvements stem from the book’s parallel examples rather than its grammatical explanations. |
Seth Aycock; David Stap; Di Wu; Christof Monz; Khalil Sima’an; | arxiv-cs.CL | 2024-09-27 |
| 466 | On Translating Technical Terminology: A Translation Workflow for Machine-Translated Acronyms Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The typical workflow for a professional translator to translate a document from its source language (SL) to a target language (TL) is not always focused on what many language … |
Richard Yue; John E. Ortega; Kenneth Ward Church; | arxiv-cs.CL | 2024-09-26 |
| 467 | Predicting Anchored Text from Translation Memories for Machine Translation Using Deep Learning Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this article, we show that for a large part of those words which are anchored, we can use other techniques that are based on machine learning approaches such as Word2Vec. |
Richard Yue; John E. Ortega; | arxiv-cs.CL | 2024-09-26 |
| 468 | Machine Translation Advancements of Low-Resource Indian Languages By Transfer Learning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces the submission by Huawei Translation Center (HW-TSC) to the WMT24 Indian Languages Machine Translation (MT) Shared Task. |
BIN WEI et. al. | arxiv-cs.CL | 2024-09-24 |
| 469 | EuroLLM: Multilingual Language Models for Europe IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce the EuroLLM project, aimed at developing a suite of open-weight multilingual LLMs capable of understanding and generating text in all official European Union languages, as well as several additional relevant languages. |
PEDRO HENRIQUE MARTINS et. al. | arxiv-cs.CL | 2024-09-24 |
| 470 | Context-aware and Style-related Incremental Decoding Framework for Discourse-Level Literary Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This report outlines our approach for the WMT24 Discourse-Level Literary Translation Task, focusing on the Chinese-English language pair in the Constrained Track. |
YUANCHANG LUO et. al. | arxiv-cs.AI | 2024-09-24 |
| 471 | Brotherhood at WMT 2024: Leveraging LLM-Generated Contextual Conversations for Cross-Lingual Image Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we describe our system under the team name Brotherhood for the English-to-Lowres Multi-Modal Translation Task. |
Siddharth Betala; Ishan Chokshi; | arxiv-cs.CL | 2024-09-23 |
| 472 | HW-TSC’s Submission to The CCMT 2024 Machine Translation Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents the submission of Huawei Translation Services Center (HW-TSC) to machine translation tasks of the 20th China Conference on Machine Translation (CCMT 2024). |
ZHANGLIN WU et. al. | arxiv-cs.AI | 2024-09-23 |
| 473 | Choose The Final Translation from NMT and LLM Hypotheses Using MBR Decoding: HW-TSC’s Submission to The WMT24 General MT Shared Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents the submission of Huawei Translate Services Center (HW-TSC) to the WMT24 general machine translation (MT) shared task, where we participate in the English to … |
ZHANGLIN WU et. al. | ArXiv | 2024-09-23 |
| 474 | Cross-Lingual Short-Text Semantic Similarity for Kannada-English Language Pair Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Analyzing the semantic similarity of cross-lingual texts is a crucial part of natural language processing (NLP). The computation of semantic similarity is essential for a variety … |
Muralikrishna S N; Raghuram Holla; H. N; Raghavendra Ganiga; | Comput. | 2024-09-18 |
| 475 | RoMath: A Mathematical Reasoning Benchmark in Romanian Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper introduces RoMath, a Romanian mathematical reasoning benchmark suite comprising three subsets: Baccalaureate, Competitions and Synthetic, which cover a range of mathematical domains and difficulty levels, aiming to improve non-English language models and promote multilingual AI development. |
Adrian Cosma; Ana-Maria Bucur; Emilian Radoi; | arxiv-cs.CL | 2024-09-17 |
| 476 | GOSt-MT: A Knowledge Graph for Occupation-related Gender Biases in Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces a novel approach to studying occupation-related gender bias through the creation of the GOSt-MT (Gender and Occupation Statistics for Machine Translation) Knowledge Graph. |
ORFEAS MENIS MASTROMICHALAKIS et. al. | arxiv-cs.CL | 2024-09-17 |
| 477 | Research on English–Chinese Machine Translation Shift Based on Word Vector Similarity Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Qingqing Ma; | Artificial Life and Robotics | 2024-09-16 |
| 478 | Translating Step-by-Step: Decomposing The Translation Process for Improved Translation Quality of Long-Form Texts IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper we present a step-by-step approach to long-form text translation, drawing on established processes in translation studies. |
Eleftheria Briakou; Jiaming Luo; Colin Cherry; Markus Freitag; | arxiv-cs.CL | 2024-09-10 |
| 479 | Evaluation of Google Translate for Mandarin Chinese Translation Using Sentiment and Semantic Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this study, we provide an automated assessment of translation quality of Google Translate with human experts using sentiment and semantic analysis. |
Xuechun Wang; Rodney Beard; Rohitash Chandra; | arxiv-cs.CL | 2024-09-08 |
| 480 | Open Language Data Initiative: Advancing Low-Resource Machine Translation for Karakalpak Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study presents several contributions for the Karakalpak language: a FLORES+ devtest dataset translated to Karakalpak, parallel corpora for Uzbek-Karakalpak, Russian-Karakalpak and English-Karakalpak of 100,000 pairs each and open-sourced fine-tuned neural models for translation across these languages. |
Mukhammadsaid Mamasaidov; Abror Shopulatov; | arxiv-cs.CL | 2024-09-06 |
| 481 | A Data Selection Approach for Enhancing Low Resource Machine Translation Using Cross-Lingual Sentence Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To mitigate the impact of data quality issues, we propose a data filtering approach based on cross-lingual sentence representations. |
Nidhi Kowtal; Tejas Deshpande; Raviraj Joshi; | arxiv-cs.CL | 2024-09-04 |
| 482 | Creating Domain-Specific Translation Memories for Machine Translation Fine-tuning: The TRENCARD Bilingual Cardiology Corpus Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The article introduces a semi-automatic TM preparation methodology leveraging primarily translation tools used by translators in favor of data quality and control by the translators. |
Gokhan Dogru; | arxiv-cs.CL | 2024-09-04 |
| 483 | Human Versus Neural Machine Translation Creativity: A Study on Manipulated MWEs in Literature IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the digital era, the (r)evolution of neural machine translation (NMT) has reshaped both the market and translators’ workflow. However, the adoption of this technology has not … |
Gloria Corpas Pastor; Laura Noriega-Santiáñez; | Inf. | 2024-09-02 |
| 484 | Towards Cross-Lingual Explanation of Artwork in Large-scale Vision Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In addition, multilingual QA benchmarks that create datasets using machine translation have cultural differences and biases, remaining issues for use as evaluation tasks. To address these challenges, this study created an extended dataset in multiple languages without relying on machine translation. |
SHINTARO OZAKI et. al. | arxiv-cs.CL | 2024-09-02 |
| 485 | Towards Tailored Recovery of Lexical Diversity in Literary Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel approach that consists of reranking translation candidates with a classifier that distinguishes between original and translated text. |
Esther Ploeger; Huiyuan Lai; Rik van Noord; Antonio Toral; | arxiv-cs.CL | 2024-08-30 |
| 486 | Efficient News Synopsis Generation with T5 Transformers Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this paper we propose a Text-To-Text Transfer Transformer-based (T5) model to generate concise, human-like synopses of Merger and Acquisition (M&A) economic news. Leveraging … |
João António; Lyudmila Mihaylova; Petia Georgieva; | 2024 IEEE 12th International Conference on Intelligent … | 2024-08-29 |
| 487 | Cultural Adaptation of Menus: A Fine-Grained Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce the ChineseMenuCSI dataset, the largest for Chinese-English menu corpora, annotated with CSI vs Non-CSI labels and a fine-grained test set. |
Zhonghe Zhang; Xiaoyu He; Vivek Iyer; Alexandra Birch; | arxiv-cs.CL | 2024-08-24 |
| 488 | Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Intuitively, with a growing number of seen languages the encoder sentence representation grows more flexible and easily adaptable to new languages. In this work, we test this hypothesis by zero-shot translating from unseen languages. |
Carlos Mullov; Quan Pham; Alexander Waibel; | acl | 2024-08-20 |
| 489 | Babel-ImageNet: Massively Multilingual Evaluation of Vision-and-Language Representations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce Babel-ImageNet, a massively multilingual benchmark that offers (partial) translations of ImageNet labels to 100 languages, built without machine translation or manual annotation. |
Gregor Geigle; Radu Timofte; Goran Glavaš; | acl | 2024-08-20 |
| 490 | What Is The Best Way for ChatGPT to Translate Poetry? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite promising outcomes, our analysis reveals persistent issues in the translations generated by ChatGPT that warrant attention. To address these shortcomings, we propose an Explanation-Assisted Poetry Machine Translation (EAPMT) method, which leverages monolingual poetry explanation as a guiding information for the translation process. |
Shanshan Wang; Derek Wong; Jingming Yao; Lidia Chao; | acl | 2024-08-20 |
| 491 | Speech Sense Disambiguation: Tackling Homophone Ambiguity in End-to-End Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To facilitate this, we first create a comprehensive homophone dictionary and an annotated dataset rich with homophone information established based on speech-text alignment. Building on this unique dictionary, we introduce AmbigST, an innovative homophone-aware contrastive learning approach that integrates a homophone-aware masking strategy. |
TENGFEI YU et. al. | acl | 2024-08-20 |
| 492 | Large Language Models for Classical Chinese Poetry Translation: Benchmarking, Evaluating, and Improving Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Hence, we propose a Retrieval-Augmented Machine Translation (RAT) method which incorporates knowledge related to classical poetry for advancing the translation of Chinese Poetry in LLMs. |
ANDONG CHEN et. al. | arxiv-cs.CL | 2024-08-19 |
| 493 | Cross-Lingual Conversational Speech Summarization with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We build a baseline cascade-based system using open-source speech recognition and machine translation models. |
Max Nelson; Shannon Wotherspoon; Francis Keith; William Hartmann; Matthew Snover; | arxiv-cs.CL | 2024-08-12 |
| 494 | Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Intuitively, with a growing number of seen languages the encoder sentence representation grows more flexible and easily adaptable to new languages. In this work, we test this hypothesis by zero-shot translating from unseen languages. |
Carlos Mullov; Ngoc-Quan Pham; Alexander Waibel; | arxiv-cs.CL | 2024-08-05 |
| 495 | Transferring Zero-shot Multilingual Chinese-Chinese Translation Model for Chinese Minority Language Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Transfer learning is an effective method to improve the performance of low-resource translation, but its effectiveness heavily relies on specific languages, and transferring … |
Ziyue Yan; Hongying Zan; Yifan Guo; Hongfei Xu; | 2024 International Conference on Asian Language Processing … | 2024-08-04 |
| 496 | Encoder–Decoder Calibration for Multimodal Machine Translation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The main purpose of multimodal machine translation (MMT) is to improve the quality of translation results by taking the corresponding visual context as an additional input. … |
Turghun Tayir; Lin Li; Bei Li; Jianquan Liu; Kong Aik Lee; | IEEE Transactions on Artificial Intelligence | 2024-08-01 |
| 497 | In-Context Example Selection Via Similarity Search Improves Low-Resource Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we focus on machine translation (MT), a task that has been shown to benefit from in-context translation examples. |
Armel Zebaze; Benoît Sagot; Rachel Bawden; | arxiv-cs.CL | 2024-08-01 |
| 498 | Generating Gender Alternatives in Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our key technical contribution is a novel semi-supervised solution for generating alternatives that integrates seamlessly with standard MT models and maintains high performance without requiring additional components or increasing inference overhead. |
SARTHAK GARG et. al. | arxiv-cs.CL | 2024-07-29 |
| 499 | The Power of Prompts: Evaluating and Mitigating Gender Bias in MT with LLMs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our findings reveal pervasive gender bias across all models, with base LLMs exhibiting a higher degree of bias compared to NMT models. To combat this bias, we explore prompting engineering techniques applied to an instruction-tuned LLM. |
Aleix Sant; Carlos Escolano; Audrey Mash; Francesca De Luca Fornaciari; Maite Melero; | arxiv-cs.CL | 2024-07-26 |
| 500 | Research on Tibetan-Chinese Machine Translation Method Based on Graphic Multimodal Fusion Alignment Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This article explores a Tibetan-Chinese machine translation model based on multimodal alignment of images and texts, using the Resnet50 model for feature extraction of images, the … |
Chenghao He; Quzong Gesang; Nuo Qun; Gadeng Luosang; Tashi Nyima; | 2024 6th International Conference on Internet of Things, … | 2024-07-26 |
| 501 | Machine Translation for Open Scholarly Communication: Examining The Relationship Between Translation Quality and Reading Effort Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study assesses the usability of machine-translated texts in scholarly communication, using self-paced reading experiments with texts from three scientific disciplines, … |
L. Macken; Vanessa De Wilde; A. Tezcan; | Inf. | 2024-07-23 |
| 502 | CoVoSwitch: Machine Translation of Synthetic Code-Switched Text Based on Intonation Units Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: With our dataset, CoVoSwitch, spanning 13 languages, we evaluate the code-switching translation performance of two multilingual translation models, M2M-100 418M and NLLB-200 600M. |
Yeeun Kang; | arxiv-cs.CL | 2024-07-19 |
| 503 | Towards Zero-Shot Multimodal Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we propose a method to bypass the need for fully supervised data to train MMT systems, using multimodal English data only. |
Matthieu Futeral; Cordelia Schmid; Benoît Sagot; Rachel Bawden; | arxiv-cs.CL | 2024-07-18 |
| 504 | Adaptive Multi-task Learning for Speech to Text Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: End-to-end speech to text translation aims to directly translate speech from one language into text in another, posing a challenging cross-modal task particularly in scenarios of … |
Xin Feng; Yue Zhao; Wei Zong; Xiaona Xu; | EURASIP J. Audio Speech Music. Process. | 2024-07-13 |
| 505 | SPhinX: Sample Efficient Multilingual Instruction Fine-Tuning Through N-shot Guided Prompting IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address this, we introduce a novel approach for strategically constructing a multilingual synthetic instruction tuning dataset, sPhinX. |
SANCHIT AHUJA et. al. | arxiv-cs.CL | 2024-07-13 |
| 506 | Towards Chapter-to-Chapter Context-Aware Literary Translation Via Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Through our comprehensive analysis, we unveil that literary translation under the Ch2Ch setting is challenging in nature, with respect to both model learning methods and translation decoding algorithms. |
Linghao Jin; Li An; Xuezhe Ma; | arxiv-cs.CL | 2024-07-12 |
| 507 | Segment-Based Interactive Machine Translation for Pre-trained Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Pre-trained large language models (LLM) are starting to be widely used in many applications. In this work, we explore the use of these models in interactive machine translation (IMT) environments. |
Angel Navarro; Francisco Casacuberta; | arxiv-cs.CL | 2024-07-09 |
| 508 | An Automatic Quality Metric for Evaluating Simultaneous Interpretation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose an automatic evaluation metric for SI and SiMT focusing on word order synchronization. |
Mana Makinae; Katsuhito Sudoh; Masaru Yamada; Satoshi Nakamura; | arxiv-cs.CL | 2024-07-09 |
| 509 | Enhancing Language Learning Through Technology: Introducing A New English-Azerbaijani (Arabic Script) Parallel Corpus Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces a pioneering English-Azerbaijani (Arabic Script) parallel corpus, designed to bridge the technological gap in language learning and machine translation (MT) for under-resourced languages. |
JALIL NOURMOHAMMADI KHIARAK et. al. | arxiv-cs.CL | 2024-07-06 |
| 510 | Identifying Intensity of The Structure and Content in Tweets and The Discriminative Power of Attributes in Context with Referential Translation Machines Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We use referential translation machines (RTMs) to identify the similarity between an attribute and two words in English by casting the task as machine translation performance prediction (MTPP) between the words and the attribute word and the distance between their similarities for Task 10 with stacked RTM models. |
Ergun Biçici; | arxiv-cs.CL | 2024-07-06 |
| 511 | Low Resource Twi-English Parallel Corpus for Machine Translation in Multiple Domains (Twi-2-ENG) Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
EMMANUEL AGYEI et. al. | Discov. Comput. | 2024-07-05 |
| 512 | Finetuning End-to-End Models for Estonian Conversational Spoken Language Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We evaluated three publicly available end-to-end models: Whisper, OWSM 3.1, and SeamlessM4T. |
Tiia Sildam; Andra Velve; Tanel Alumäe; | arxiv-cs.CL | 2024-07-04 |
| 513 | A Case Study on Context-Aware Neural Machine Translation with Multi-Task Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We conduct experiments on cascade MTL architecture, which consists of one encoder and two decoders. |
Ramakrishna Appicharla; Baban Gain; Santanu Pal; Asif Ekbal; Pushpak Bhattacharyya; | arxiv-cs.CL | 2024-07-03 |
| 514 | Language Portability Strategies for Open-domain Dialogue with Pre-trained Language Models from High to Low Resource Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper we propose a study of linguistic portability strategies of large pre-trained language models (PLMs) used for open-domain dialogue systems in a high-resource language for this task. |
Ahmed Njifenjou; Virgile Sucal; Bassam Jabaian; Fabrice Lefèvre; | arxiv-cs.CL | 2024-07-01 |
| 515 | Is Translation Helpful? An Exploration of Cross-Lingual Transfer in Low-Resource Dialog Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Cross-lingual transfer is important for developing high-quality chatbots in multiple languages to address the imbalanced distribution of language resources. A typical approach of … |
Lei Shen; Shuai Yu; Xiaoyu Shen; | 2024 International Joint Conference on Neural Networks … | 2024-06-30 |
| 516 | Document-Level Machine Translation with Effective Batch-Level Context Representation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: It is critical to provide inter-sentential context for document-level neural machine translation (DocNMT) to achieve higher-quality translations. As the document-level information … |
Kang Zhong; Jie Zhang; Wu Guo; | 2024 International Joint Conference on Neural Networks … | 2024-06-30 |
| 517 | SITD-NMT: Synchronous Inference NMT with Turing Re-Translation Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Conventional Neural Machine Translation (NMT) relies on previous tokens and the hidden state of the target for the inference of the target tokens in the decoding phase, and this … |
NIER WU et. al. | 2024 International Joint Conference on Neural Networks … | 2024-06-30 |
| 518 | Towards Massive Multilingual Holistic Bias Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In the current landscape of automatic language generation, there is a need to understand, evaluate, and mitigate demographic biases as existing models are becoming increasingly multilingual. To address this, we present the initial eight languages from the MASSIVE MULTILINGUAL HOLISTICBIAS (MMHB) dataset and benchmark consisting of approximately 6 million sentences representing 13 demographic axes. |
XIAOQING ELLEN TAN et. al. | arxiv-cs.CL | 2024-06-29 |
| 519 | Less Is More: Accurate Speech Recognition & Translation Without Web-Scale Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We argue that state-of-the-art accuracy can be reached without relying on web-scale data. |
KRISHNA C. PUVVADA et. al. | arxiv-cs.CL | 2024-06-28 |
| 520 | Sparse Regression for Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce the dice instance selection method for proper selection of training instances, which plays an important role in learning correct feature mappings for improving the source and target coverage of the training set. |
Ergun Biçici; | arxiv-cs.CL | 2024-06-27 |
| 521 | ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Motivated by the widespread increase in the phenomenon of code-switching between Egyptian Arabic and English in recent times, this paper explores the intricacies of machine translation (MT) and automatic speech recognition (ASR) systems, focusing on translating code-switched Egyptian Arabic-English to either English or Egyptian Arabic. Our goal is to present the methodologies employed in developing these systems, utilizing large language models such as LLama and Gemma. |
Ahmed Heakl; Youssef Zaghloul; Mennatullah Ali; Rania Hossam; Walid Gomaa; | arxiv-cs.CL | 2024-06-26 |
| 522 | FFN: A Fine-grained Chinese-English Financial Domain Parallel Corpus Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: For comparison, we also trained an OpenNMT model based on our dataset. We detail problems of LLMs and provide in-depth analysis, intending to stimulate further research and solutions in this largely uncharted territory. |
Yuxin Fu; Shijing Si; Leyi Mai; Xi-ang Li; | arxiv-cs.CL | 2024-06-26 |
| 523 | Neural Machine Translation Using A Pivot Approach: Dogri to English Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The progress of Neural Machine Translation for Indic languages is gradually achieving a noteworthy result in terms of translation accuracy. However, its advancement is impeded by … |
Aarathi Rajagopalan Nair; Shailashree K. Sheshadri; Akula Dhanush; N. V. S. Pradyumna; Deepa Gupta; | 2024 15th International Conference on Computing … | 2024-06-24 |
| 524 | Innovations in Real-Time Speech Translation: Leveraging Griffin-Lim Algorithm Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: At present, speech-to-speech translator systems often involve multiple intermediary steps, such as automatic speech recognition (ASR), text-to-text machine translation (MT), and … |
HIMANSHU MAITHANI et. al. | 2024 15th International Conference on Computing … | 2024-06-24 |
| 525 | Bridging The Language Gap: Enhancing English-to-Telugu Translation Using NMT and Encoding Decoding Techniques Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Language translation, particularly in the context of natural language processing, involves converting text or speech from one language into another while preserving its meaning, … |
M. Asmitha; C. R. Kavitha; | 2024 15th International Conference on Computing … | 2024-06-24 |
| 526 | Complexity of Symbolic Representation in Working Memory of Transformer Correlates with The Complexity of A Task Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper explores the properties of the content of symbolic working memory added to the Transformer model decoder. |
Alsu Sagirova; Mikhail Burtsev; | arxiv-cs.CL | 2024-06-20 |
| 527 | Document Image Machine Translation with Dynamic Multi-pre-trained Models Assembling IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we extend the current TIMT task and propose a novel task, **D**ocument **I**mage **M**achine **T**ranslation to **Markdown** (**DIMT2Markdown**), which aims to translate a source document image with long context and complex layout structure to markdown-formatted target translation. |
YUPU LIANG et. al. | naacl | 2024-06-20 |
| 528 | An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we conduct empirical studies on intra-modal and cross-modal consistency and propose two training strategies, SimRegCR and SimZeroCR, for E2E ST in regular and zero-shot scenarios. |
Pengzhi Gao; Ruiqing Zhang; Zhongjun He; Hua Wu; Haifeng Wang; | naacl | 2024-06-20 |
| 529 | Do Multilingual Language Models Think Better in English? IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we introduce a new approach called self-translate that leverages the few-shot translation capabilities of multilingual language models. |
Julen Etxaniz; Gorka Azkune; Aitor Soroa; Oier Lacalle; Mikel Artetxe; | naacl | 2024-06-20 |
| 530 | Fixing Rogue Memorization in Many-to-One Multilingual Translators of Extremely-Low-Resource Languages By Rephrasing Training Samples Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper we study the fine-tuning of pre-trained large high-resource language models (LLMs) into many-to-one multilingual machine translators for extremely-low-resource languages such as endangered Indigenous languages. |
Paulo Cavalin; Pedro Henrique Domingues; Claudio Pinhanez; Julio Nogima; | naacl | 2024-06-20 |
| 531 | M3T: A New Benchmark Dataset for Multi-Modal Document-Level Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This complexity is particularly evident in widely used PDF documents, which represent information visually. This paper addresses this gap by introducing M3T, a novel benchmark dataset tailored to evaluate NMT systems on the comprehensive task of translating semi-structured documents. |
BENJAMIN HSU et. al. | naacl | 2024-06-20 |
| 532 | How Effective Is Multi-source Pivoting for Translation of Low Resource Indian Languages? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Taking the case of English to Indian language MT, this paper explores the ‘multi-source translation’ approach with pivoting, using both source and pivot sentences to improve translation. |
Pranav Gaikwad; Meet Doshi; Raj Dabre; Pushpak Bhattacharyya; | arxiv-cs.CL | 2024-06-19 |
| 533 | Does Context Help Mitigate Gender Bias in Neural Machine Translation? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Context-aware models have been previously suggested as a means to mitigate this type of bias. In this work, we examine this claim by analysing in detail the translation of stereotypical professions in English to German, and translation with non-informative context in Basque to Spanish. |
Harritxu Gete; Thierry Etchegoyhen; | arxiv-cs.CL | 2024-06-18 |
| 534 | Self-Distillation for Model Stacking Unlocks Cross-Lingual NLU in 200+ Languages IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we get the best of both worlds by integrating MT encoders directly into LLM backbones via sample-efficient self-distillation. |
Fabian David Schmidt; Philipp Borchert; Ivan Vulić; Goran Glavaš; | arxiv-cs.CL | 2024-06-18 |
| 535 | LiLiuM: EBay’s Large Language Models for E-commerce Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We introduce the LiLiuM series of large language models (LLMs): 1B, 7B, and 13B parameter models developed 100% in-house to fit eBay’s specific needs in the e-commerce domain. … |
CHRISTIAN HEROLD et. al. | ArXiv | 2024-06-17 |
| 536 | LiLiuM: EBay’s Large Language Models for E-commerce Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce the LiLiuM series of large language models (LLMs): 1B, 7B, and 13B parameter models developed 100% in-house to fit eBay’s specific needs in the e-commerce domain. |
CHRISTIAN HEROLD et. al. | arxiv-cs.CL | 2024-06-17 |
| 537 | CoSTA: Code-Switched Speech Translation Using Aligned Speech-Text Interleaving Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we focus on the problem of spoken translation (ST) of code-switched speech in Indian languages to English text. |
Bhavani Shankar; Preethi Jyothi; Pushpak Bhattacharyya; | arxiv-cs.CL | 2024-06-16 |
| 538 | Datasets for Multilingual Answer Sentence Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce new high-quality datasets for AS2 in five European languages (French, German, Italian, Portuguese, and Spanish), obtained through supervised Automatic Machine Translation (AMT) of existing English AS2 datasets such as ASNQ, WikiQA, and TREC-QA using a Large Language Model (LLM). |
Matteo Gabburo; Stefano Campese; Federico Agostini; Alessandro Moschitti; | arxiv-cs.CL | 2024-06-14 |
| 539 | Towards Multilingual Audio-Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we work towards extending Audio-Visual Question Answering (AVQA) to multilingual settings. |
ORCHID CHETIA PHUKAN et. al. | arxiv-cs.LG | 2024-06-13 |
| 540 | Word Order in English-Japanese Simultaneous Interpretation: Analyses and Evaluation Using Chunk-wise Monotonic Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper analyzes the features of monotonic translations, which follow the word order of the source language, in simultaneous interpreting (SI). |
Kosuke Doi; Yuka Ko; Mana Makinae; Katsuhito Sudoh; Satoshi Nakamura; | arxiv-cs.CL | 2024-06-13 |
| 541 | M3T: A New Benchmark Dataset for Multi-Modal Document-Level Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This complexity is particularly evident in widely used PDF documents, which represent information visually. This paper addresses this gap by introducing M3T, a novel benchmark dataset tailored to evaluate NMT systems on the comprehensive task of translating semi-structured documents. |
BENJAMIN HSU et. al. | arxiv-cs.CL | 2024-06-12 |
| 542 | Building Bridges: A Dataset for Evaluating Gender-Fair Machine Translation Into German Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We address this research gap by studying gender-fair language in English-to-German MT. Concretely, we enrich a community-created gender-fair language dictionary and sample multi-sentence test instances from encyclopedic text and parliamentary speeches. |
Manuel Lardelli; Giuseppe Attanasio; Anne Lauscher; | arxiv-cs.CL | 2024-06-10 |
| 543 | The Link Between Translation Difficulty and The Quality of Machine Translation: A Literature Review and Empirical Investigation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We survey the relevant literature on translation difficulty and automatic evaluation of machine translation (MT) quality and investigate whether source text’s translation … |
S. Araghi; Alfons Palangkaraya; | Lang. Resour. Evaluation | 2024-06-10 |
| 544 | Recovering Document Annotations for Sentence-level Bitext Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we reconstruct document-level information for three (ParaCrawl, News Commentary, and Europarl) large datasets in German, French, Spanish, Italian, Polish, and Portuguese (paired with English). |
Rachel Wicks; Matt Post; Philipp Koehn; | arxiv-cs.CL | 2024-06-06 |
| 545 | StatBot.Swiss: Bilingual Open Data Exploration in Natural Language IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we release the StatBot.Swiss dataset, the first bilingual benchmark for evaluating Text-to-SQL systems based on real-world applications. |
FARHAD NOORALAHZADEH et. al. | arxiv-cs.CL | 2024-06-05 |
| 546 | What Is The Best Way for ChatGPT to Translate Poetry? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite promising outcomes, our analysis reveals persistent issues in the translations generated by ChatGPT that warrant attention. To address these shortcomings, we propose an Explanation-Assisted Poetry Machine Translation (EAPMT) method, which leverages monolingual poetry explanation as guiding information for the translation process. |
Shanshan Wang; Derek F. Wong; Jingming Yao; Lidia S. Chao; | arxiv-cs.CL | 2024-06-05 |
| 547 | Translation Deserves Better: Analyzing Translation Artifacts in Cross-lingual Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We find that these artifacts can significantly affect the models, confirmed by extensive experiments across diverse models, languages, and translation processes. In light of this, we present a simple data augmentation strategy that can alleviate the adverse impacts of translation artifacts. |
CHAEHUN PARK et. al. | arxiv-cs.CL | 2024-06-04 |
| 548 | How Multilingual Are Large Language Models Fine-Tuned for Translation? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: How does translation fine-tuning impact the MT capabilities of LLMs for zero-shot languages, zero-shot language pairs, and translation tasks that do not involve English? To address these questions, we conduct an extensive empirical evaluation of the translation quality of the TOWER family of language models (Alves et al., 2024) on 132 translation tasks from the multi-parallel FLORES-200 data. |
Aquia Richburg; Marine Carpuat; | arxiv-cs.CL | 2024-05-30 |
| 549 | Significance of Chain of Thought in Gender Bias Mitigation for English-Dravidian Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper examines gender bias in machine translation systems for languages such as Telugu and Kannada from the Dravidian family, analyzing how gender inflections affect translation accuracy and neutrality using Google Translate and ChatGPT. |
Lavanya Prahallad; Radhika Mamidi; | arxiv-cs.CL | 2024-05-30 |
| 550 | Critical Learning Periods: Leveraging Early Training Dynamics for Efficient Data Pruning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a new data pruning technique: Checkpoints Across Time (CAT), that leverages early model training dynamics to identify the most relevant data points for model performance. |
EVERLYN ASIKO CHIMOTO et. al. | arxiv-cs.CL | 2024-05-29 |
| 551 | Spanish and LLM Benchmarks: Is MMLU Lost in Translation? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The evaluation of Large Language Models (LLMs) is a key element in their continuous improvement process and many benchmarks have been developed to assess the performance of LLMs … |
IRENE PLAZA et. al. | ArXiv | 2024-05-28 |
| 552 | QUEST: Quality-Aware Metropolis-Hastings Sampling for Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we address the problem of sampling a set of high-quality and diverse translations. |
GONÇALO R. A. FARIA et. al. | arxiv-cs.CL | 2024-05-28 |
| 553 | TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this study, we introduce a novel model framework TransVIP that leverages diverse datasets in a cascade fashion yet facilitates end-to-end inference through joint probability. |
CHENYANG LE et. al. | arxiv-cs.CL | 2024-05-28 |
| 554 | Multimodal Machine Translation Approaches for Indian Languages: A Comprehensive Survey Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Multimodal machine translation (MMT) is a challenging task in the linguistically diverse Indian landscape. Machine translation refers to the task of automatically converting … |
Binnu Paul; D. Rudrapal; Kunal Chakma; Anupam Jamatia; | J. Univers. Comput. Sci. | 2024-05-28 |
| 555 | Improving Language Models Trained on Translated Data with Continual Pre-Training and Dictionary Learning Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we investigate the role of translation and synthetic data in training language models. |
Sabri Boughorbel; MD Rizwan Parvez; Majd Hawasly; | arxiv-cs.CL | 2024-05-23 |
| 556 | MELD-ST: An Emotion-aware Speech Translation Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present the MELD-ST dataset for the emotion-aware speech translation task, comprising English-to-Japanese and English-to-German language pairs. |
SIROU CHEN et. al. | arxiv-cs.CL | 2024-05-21 |
| 557 | From Translation to Generative LLMs: Classification of Code-Mixed Affective Tasks IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Code-mixed (CM) discourse combines multiple languages in a single text. It is commonly used in informal discourse in countries with several official languages, but also in many … |
ANJALI YADAV et. al. | IEEE Transactions on Affective Computing | 2024-05-21 |
| 558 | DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we introduce DiffNorm, a diffusion-based normalization strategy that simplifies data distributions for training NAT models. |
Weiting Tan; Jingyu Zhang; Lingfeng Shen; Daniel Khashabi; Philipp Koehn; | arxiv-cs.CL | 2024-05-21 |
| 559 | MindWave App: Leveraging AI for Mental Health Support in English and Arabic Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The global concern regarding mental health, especially amidst crises like stress, burnout or depression, necessitates accessible platforms catering to diverse populations, … |
Nouhaila Bensalah; H. Ayad; A. Adib; Abdelhamid Ibn El Farouk; | 2024 IEEE 12th International Symposium on Signal, Image, … | 2024-05-21 |
| 560 | Beyond MLE: Investigating SEARNN for Low-Resourced Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: With an average BLEU score improvement of 5.4% over the MLE objective, we proved that SEARNN is indeed a viable algorithm to effectively train RNNs on machine translation for low-resourced languages. |
Chris Emezue; | arxiv-cs.CL | 2024-05-20 |
| 561 | Enhancing Gender-Inclusive Machine Translation with Neomorphemes and Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Accordingly, they still fall short in using gender-inclusive language, also representative of non-binary identities. In this paper, we look at gender-inclusive neomorphemes, neologistic elements that avoid binary gender markings as an approach towards fairer MT. In this direction, we explore prompting techniques with large language models (LLMs) to translate from English into Italian using neomorphemes. |
Andrea Piergentili; Beatrice Savoldi; Matteo Negri; Luisa Bentivogli; | arxiv-cs.CL | 2024-05-14 |
| 562 | LLM-Assisted Rule Based Machine Translation for Low/No-Resource Languages IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a new paradigm for machine translation that is particularly useful for no-resource languages (those without any publicly available bilingual or monolingual corpora): LLM-RBMT (LLM-Assisted Rule Based Machine Translation). |
Jared Coleman; Bhaskar Krishnamachari; Khalil Iskarous; Ruben Rosales; | arxiv-cs.CL | 2024-05-14 |
| 563 | CANTONMT: Investigating Back-Translation and Model-Switch Mechanisms for Cantonese-English Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper investigates the development and evaluation of machine translation models from Cantonese to English, where we propose a novel approach to tackle low-resource language translations. |
Kung Yin Hong; Lifeng Han; Riza Batista-Navarro; Goran Nenadic; | arxiv-cs.CL | 2024-05-13 |
| 564 | An Empirical Study on The Robustness of Massively Multilingual Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we empirically investigate the translation robustness of Indonesian-Chinese translation in the face of various naturally occurring noise. |
Leiyu Pan; Deyi Xiong; | arxiv-cs.CL | 2024-05-13 |
| 565 | Neural Machine Translation Based on Semantic Word Replacement Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In recent years, there has been significant progress in the quality of neural machine translation technology, largely attributed to the abundance of high-quality bilingual … |
Bo Jin; | Proceedings of the 2024 International Conference on … | 2024-05-10 |
| 566 | Using Machine Translation to Augment Multilingual Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Here, we explore the effects of using machine translation to fine-tune a multilingual model for a classification task across multiple languages. |
Adam King; | arxiv-cs.CL | 2024-05-08 |
| 567 | Sentiment Analysis Across Languages: Evaluation Before and After Machine Translation to English Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper examines the performance of transformer models in Sentiment Analysis tasks across multilingual datasets and text that has undergone machine translation. |
Aekansh Kathunia; Mohammad Kaif; Nalin Arora; N Narotam; | arxiv-cs.CL | 2024-05-05 |
| 568 | A Multimodal French Corpus of Aligned Speech, Text, and Pictogram Sequences for Speech-to-Pictogram Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The automatic translation of spoken language into pictogram units can facilitate communication involving individuals with language impairments. However, there is no established … |
C. MACAIRE et. al. | International Conference on Language Resources and … | 2024-05-01 |
| 569 | Multilinguality or Back-translation? A Case Study with Estonian Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Machine translation quality is highly reliant on large amounts of training data, and, when a limited amount of parallel data is available, synthetic back-translated or … |
Elizaveta Korotkova; Taido Purason; Agnes Luhtaru; Mark Fishel; | International Conference on Language Resources and … | 2024-05-01 |
| 570 | Construction of English Corpus Oral Instant Translation Model Based on Internet of Things and Deep Learning of Information Security Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In order to improve the security and performance of the oral English instant translation model, this paper optimizes the instant translation model through the Internet of Things … |
He Cang; Dan Feng; | Journal of Computational Methods in Science and Engineering | 2024-05-01 |
| 571 | Unmasking Biases: Exploring Gender Bias in English-Catalan Machine Translation Through Tokenization Analysis and Novel Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents a comprehensive evaluation of gender bias in English-Catalan machine translation, encompassing the creation of a novel language resource and an analysis of … |
Audrey Mash; C. Escolano; Aleix Sant; Maite Melero; Francesca de Luca Fornaciari; | International Conference on Language Resources and … | 2024-05-01 |
| 572 | Indic-TEDST: Datasets and Baselines for Low-Resource Speech to Text Translation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Speech-to-text (ST) task is the translation of speech in a language to text in a different language. It has use cases in subtitling, dubbing, etc. Traditionally, ST task has been … |
Nivedita Sethiya; Saanvi Nair; C. Maurya; | International Conference on Language Resources and … | 2024-05-01 |
| 573 | UMTIT: Unifying Recognition, Translation, and Generation for Multimodal Text Image Translation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Prior research in Image Machine Translation (IMT) has focused on either translating the source image solely into the target language text or exclusively into the target image. As … |
Liqiang Niu; Fandong Meng; Jie Zhou; | International Conference on Language Resources and … | 2024-05-01 |
| 574 | Born A BabyNet with Hierarchical Parental Supervision for End-to-End Text Image Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Text image machine translation (TIMT) aims at translating source language texts in images into another target language, which has been proven successful by bridging text image … |
CONG MA et. al. | International Conference on Language Resources and … | 2024-05-01 |
| 575 | Benchmarking The Performance of Machine Translation Evaluation Metrics with Chinese Multiword Expressions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: To investigate the impact of Multiword Expressions (MWEs) on the fine-grained performance of the state-of-the-art metrics for Machine Translation Evaluation (MTE), we conduct … |
Huacheng Song; Hongzhi Xu; | International Conference on Language Resources and … | 2024-05-01 |
| 576 | E-learning Application in English Writing Classroom Based on Neural Machine Translation and Semantic Analysis Algorithms IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Yaqiu Wang; | Entertain. Comput. | 2024-05-01 |
| 577 | Humanistic Buddhism Corpus: A Challenging Domain-Specific Dataset of English Translations for Classical and Modern Chinese Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We introduce the Humanistic Buddhism Corpus (HBC), a dataset containing over 80,000 Chinese-English parallel phrases extracted and translated from publications in the domain of … |
Youheng W. Wong; N. Parde; Erdem Koyuncu; | International Conference on Language Resources and … | 2024-05-01 |
| 578 | A Reinforcement Learning Approach to Improve Low-Resource Machine Translation Leveraging Domain Monolingual Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Due to the lack of parallel data, the mainstream fine-tuning-based domain adaptation methods have the overfitting problem in the translation of low-resource domains, and it is … |
HONGXIAO ZHANG et. al. | International Conference on Language Resources and … | 2024-05-01 |
| 579 | EPOQUE: An English-Persian Quality Estimation Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Translation quality estimation (QE) is an important component in real-world machine translation applications. Unfortunately, human labeled QE datasets, which play an important … |
Mohammed Hossein Jafari Harandi; Fatemeh Azadi; M. Dousti; Heshaam Faili; | International Conference on Language Resources and … | 2024-05-01 |
| 580 | GAATME: A Genetic Algorithm for Adversarial Translation Metrics Evaluation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Building on a recent method for decoding translation candidates from a Machine Translation (MT) model via a genetic algorithm, we modify it to generate adversarial translations to … |
J. Jon; Ondrej Bojar; | International Conference on Language Resources and … | 2024-05-01 |
| 581 | Esposito: An English-Persian Scientific Parallel Corpus for Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Neural machine translation requires a large number of parallel sentences along with in-domain parallel data to attain the best results. Nevertheless, no scientific parallel corpus for … |
Mersad Esalati; M. Dousti; Heshaam Faili; | International Conference on Language Resources and … | 2024-05-01 |
| 582 | Context-Aware Machine Translation with Source Coreference Explanation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This can lead to the explain-away effect, wherein the models only consider features easier to explain predictions, resulting in inaccurate translations. To address this issue, we propose a model that explains the decisions made for translation by predicting coreference features in the input. |
Huy Hien Vu; Hidetaka Kamigaito; Taro Watanabe; | arxiv-cs.CL | 2024-04-30 |
| 583 | Suvach — Generated Hindi QA Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a new benchmark specifically designed for evaluating Hindi EQA models and discusses the methodology to do the same for any task. |
Vaishak Narayanan; Prabin Raj KP; Saifudheen Nouphal; | arxiv-cs.CL | 2024-04-30 |
| 584 | Suvach – Generated Hindi QA Benchmark Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Current evaluation benchmarks for question answering (QA) in Indic languages often rely on machine translation of existing English datasets. This approach suffers from bias and … |
Vaishak Narayanan; KP PrabinRaj; Saifudheen Nouphal; | ArXiv | 2024-04-30 |
| 585 | 3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Multimodal machine translation (MMT) is a challenging task that seeks to improve translation quality by incorporating visual information. However, recent studies have indicated … |
XINYU MA et. al. | International Conference on Language Resources and … | 2024-04-29 |
| 586 | Learning Domain Specific Sub-layer Latent Variable for Multi-Domain Adaptation Neural Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Domain adaptation proves to be an effective solution for addressing inadequate translation performance within specific domains. However, the straightforward approach of mixing … |
SHUANGHONG HUANG et. al. | ACM Transactions on Asian and Low-Resource Language … | 2024-04-29 |
| 587 | Prefix Text As A Yarn: Eliciting Non-English Alignment in Foundation Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We critically examine this hypothesis within the scope of cross-lingual generation tasks, proposing that the effectiveness of SFT may be constrained by its reliance on prior tokens to guide cross-lingual generation. Based on this crucial insight, and in response to the challenges posed by the costly and limited availability of non-English data for SFT, we introduce a novel training-free alignment method named PreTTY, which employs minimal task-related prior tokens to bridge the foundation LLM and the SFT LLM, achieving comparable performance without training. |
Runzhe Zhan; Xinyi Yang; Derek F. Wong; Lidia S. Chao; Yue Zhang; | arxiv-cs.CL | 2024-04-25 |
| 588 | Setting Up The Data Printer with Improved English to Ukrainian Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Examples of task performance expressed in English are abundant, so with a high-quality translation system our community will be enabled to curate datasets faster. To aid this goal, we introduce a recipe to build a translation system using supervised finetuning of a large pretrained language model with a noisy parallel dataset of 3M pairs of Ukrainian and English sentences followed by a second phase of training using 17K examples selected by k-fold perplexity filtering on another dataset of higher quality. |
Yurii Paniv; Dmytro Chaplynskyi; Nikita Trynus; Volodymyr Kyrylov; | arxiv-cs.CL | 2024-04-23 |
| 589 | Automated Multi-Language to English Machine Translation Using Generative Pre-Trained Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we examine using local Generative Pretrained Transformer (GPT) models to perform automated zero shot black-box, sentence wise, multi-natural-language translation into English text. |
Elijah Pelofske; Vincent Urias; Lorie M. Liebrock; | arxiv-cs.CL | 2024-04-22 |
| 590 | From LLM to NMT: Advancing Low-Resource Machine Translation with Claude IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We show that Claude 3 Opus, a large language model (LLM) released by Anthropic in March 2024, exhibits stronger machine translation competence than other LLMs. |
Maxim Enis; Mark Hopkins; | arxiv-cs.CL | 2024-04-21 |
| 591 | Grammatical Error Correction for Code-Switched Sentences By Learners of English Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we conduct the first exploration into the use of GEC systems on CSW text. |
Kelvin Wey Han Chan; Christopher Bryant; Li Nguyen; Andrew Caines; Zheng Yuan; | arxiv-cs.CL | 2024-04-18 |
| 592 | Vector Quantization Knowledge Transfer for End-to-End Text Image Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To better align TIMT features with MT semantic features, we propose a novel Vector Quantization Knowledge Transfer (VQKT) method that employs a trainable codebook to quantize continuous features into discrete space. |
C. Ma; Y. Zhang; Y. Zhao; Y. Zhou; C. Zong; | icassp | 2024-04-15 |
| 593 | Memory-Augmented Speech-to-text Translation with Multi-Scale Context Translation Strategy Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose memory-augmented speech-to-text translation, which leverages a memory module to perform context-aware translation. |
Y. Yuan; Y. Zhou; X. Shi; | icassp | 2024-04-15 |
| 594 | M2BART: Multilingual and Multimodal Encoder-Decoder Pre-Training for Any-to-Any Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces M2BART, a streamlined multilingual and multimodal framework for encoder-decoder models. |
P. -J. Chen; B. Shi; K. Niu; A. Lee; W. -N. Hsu; | icassp | 2024-04-15 |
| 595 | Vector Quantization Knowledge Transfer for End-to-End Text Image Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: End-to-end text image machine translation (TIMT) aims at translating source language embedded in images into target language without recognizing intermediate texts in images. … |
Cong Ma; Yaping Zhang; Yang Zhao; Yu Zhou; Chengqing Zong; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
| 596 | CAILMD-23 at SemEval-2024 Task 1: Multilingual Evaluation of Semantic Textual Relatedness Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The explosive growth of online content demands robust Natural Language Processing (NLP) techniques that can capture nuanced meanings and cultural context across diverse languages. … |
SHARVI ENDAIT et. al. | ArXiv | 2024-04-13 |
| 597 | Multilingual Evaluation of Semantic Textual Relatedness Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work aims to not only showcase our achievements but also inspire further research in multilingual STR, particularly for low-resourced languages. |
SHARVI ENDAIT et. al. | arxiv-cs.CL | 2024-04-13 |
| 598 | Investigating Neural Machine Translation for Low-Resource Languages: Using Bavarian As A Case Study IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we revisit state-of-the-art Neural Machine Translation techniques to develop automatic translation systems between German and Bavarian. |
Wan-Hua Her; Udo Kruschwitz; | arxiv-cs.CL | 2024-04-12 |
| 599 | Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This work exploits the complementary strengths of LLMs and supervised MT by guiding LLMs to automatically post-edit MT with external feedback on its quality, derived from Multidimensional Quality Metric (MQM) annotations. Working with LLaMA-2 models, we consider prompting strategies varying the nature of feedback provided and then fine-tune the LLM to improve its ability to exploit the provided guidance. |
Dayeon Ki; Marine Carpuat; | arxiv-cs.CL | 2024-04-11 |
| 600 | MedMT5: An Open-Source Multilingual Text-to-Text LLM for The Medical Domain IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Research on language technology for the development of medical applications is currently a hot topic in Natural Language Understanding and Generation. Thus, a number of large … |
IKER GARCÍA-FERRERO et. al. | ArXiv | 2024-04-11 |
| 601 | Medical MT5: An Open-Source Multilingual Text-to-Text LLM for The Medical Domain IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This is particularly true of text-to-text models, which typically require large amounts of domain-specific pre-training data, often not easily accessible for many languages. In this paper, we address these shortcomings by compiling, to the best of our knowledge, the largest multilingual corpus for the medical domain in four languages, namely English, French, Italian and Spanish. |
IKER GARCÍA-FERRERO et. al. | arxiv-cs.CL | 2024-04-11 |
| 602 | A Data Selection Approach for Enhancing Low Resource Machine Translation Using Cross Lingual Sentence Representations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Machine translation in low-resource language pairs faces significant challenges due to the scarcity of parallel corpora and linguistic resources. This study focuses on the … |
Nidhi Kowtal; Tejas Deshpande; Raviraj Joshi; | 2024 IEEE 9th International Conference for Convergence in … | 2024-04-05 |
| 603 | Towards Better Understanding of Cybercrime: The Role of Fine-Tuned LLMs in Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose using fine-tuned Large Language Models (LLM) to generate translations that can accurately capture the nuances of cybercrime language. |
Veronica Valeros; Anna Širokova; Carlos Catania; Sebastian Garcia; | arxiv-cs.CL | 2024-04-02 |
| 604 | Low-resource Neural Machine Translation with Morphological Modeling IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we propose a framework-solution for modeling complex morphology in low-resource settings. |
Antoine Nzeyimana; | arxiv-cs.CL | 2024-04-02 |
| 605 | Going Beyond Word Matching: Syntax Improves In-context Example Selection for Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a syntax-based in-context example selection method for MT, by computing the syntactic similarity between dependency trees using Polynomial Distance. |
Chenming Tang; Zhixiang Wang; Yunfang Wu; | arxiv-cs.CL | 2024-03-28 |
| 606 | KazParC: Kazakh Parallel Corpus for Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce KazParC, a parallel corpus designed for machine translation across Kazakh, English, Russian, and Turkish. |
Rustem Yeshpanov; Alina Polonskaya; Huseyin Atakan Varol; | arxiv-cs.CL | 2024-03-28 |
| 607 | A Tulu Resource for Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present the first parallel dataset for English-Tulu translation. |
Manu Narayanan; Noëmi Aepli; | arxiv-cs.CL | 2024-03-28 |
| 608 | Improving Vietnamese-English Medical Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce MedEV — a high-quality Vietnamese-English parallel dataset constructed specifically for the medical domain, comprising approximately 360K sentence pairs. |
Nhu Vo; Dat Quoc Nguyen; Dung D. Le; Massimo Piccardi; Wray Buntine; | arxiv-cs.CL | 2024-03-28 |
| 609 | Synthetic Data Generation and Joint Learning for Robust Code-Mixed Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we tackle the problem of code-mixed (Hinglish and Bengalish) to English machine translation. |
Kartik Kartik; Sanjana Soni; Anoop Kunchukuttan; Tanmoy Chakraborty; Md Shad Akhtar; | arxiv-cs.CL | 2024-03-25 |
| 610 | Isometric Neural Machine Translation Using Phoneme Count Ratio Reward-based Reinforcement Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Traditional Automatic Video Dubbing (AVD) pipeline consists of three key modules, namely, Automatic Speech Recognition (ASR), Neural Machine Translation (NMT), and Text-to-Speech … |
SHIVAM MHASKAR et. al. | NAACL-HLT | 2024-03-20 |
| 611 | Multi-Dimensional Machine Translation Evaluation: Model Evaluation and Resource for Korean Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Previous studies have demonstrated the feasibility of MQM annotation but there are, to our knowledge, no computational models that predict MQM scores for novel texts, due to a lack of resources. In this paper, we address these shortcomings by (a) providing a 1200-sentence MQM evaluation benchmark for the language pair English-Korean and (b) reframing MT evaluation as the multi-task problem of simultaneously predicting several MQM scores using SOTA language models, both in a reference-based MT evaluation setup and a reference-free quality estimation (QE) setup. |
Dojun Park; Sebastian Padó; | arxiv-cs.CL | 2024-03-19 |
| 612 | Enhancing Taiwanese Hokkien Dual Translation By Exploring and Standardizing of Four Writing Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Machine translation focuses mainly on high-resource languages (HRLs), while low-resource languages (LRLs) like Taiwanese Hokkien are relatively under-explored. The study aims to address this gap by developing a dual translation model between Taiwanese Hokkien and both Traditional Mandarin Chinese and English. |
Bo-Han Lu; Yi-Hsuan Lin; En-Shiun Annie Lee; Richard Tzong-Han Tsai; | arxiv-cs.CL | 2024-03-18 |
| 613 | CantonMT: Cantonese to English NMT Platform with Fine-Tuned Models Using Real and Synthetic Back-Translation Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Neural Machine Translation (NMT) for low-resource languages remains a challenge for many NLP researchers. In this work, we deploy a standard data augmentation methodology by … |
Kung Yin Hong; Lifeng Han; R. Batista-Navarro; Goran Nenadic; | ArXiv | 2024-03-17 |
| 614 | CantonMT: Cantonese to English NMT Platform with Fine-Tuned Models Using Synthetic Back-Translation Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we deploy a standard data augmentation methodology by back-translation to a new language translation direction Cantonese-to-English. |
Kung Yin Hong; Lifeng Han; Riza Batista-Navarro; Goran Nenadic; | arxiv-cs.CL | 2024-03-17 |
| 615 | Scaling Behavior of Machine Translation with Large Language Models Under Prompt Injection Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Their generality, however, opens them up to subversion by end users who may embed into their requests instructions that cause the model to behave in unauthorized and possibly unsafe ways. In this work we study these Prompt Injection Attacks (PIAs) on multiple families of LLMs on a Machine Translation task, focusing on the effects of model size on the attack success rates. |
Zhifan Sun; Antonio Valerio Miceli-Barone; | arxiv-cs.CL | 2024-03-14 |
| 616 | To Label or Not to Label: Hybrid Active Learning for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Both approaches have limitations – diversity methods may extract varied but trivial examples, while uncertainty sampling can yield repetitive, uninformative instances. To bridge this gap, we propose Hybrid Uncertainty and Diversity Sampling (HUDS), an AL strategy for domain adaptation in NMT that combines uncertainty and diversity for sentence selection. |
Abdul Hameed Azeemi; Ihsan Ayyub Qazi; Agha Ali Raza; | arxiv-cs.CL | 2024-03-14 |
| 617 | Basque and Spanish Counter Narrative Generation: Data Creation and Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present CONAN-EUS, a new Basque and Spanish dataset for CN generation developed by means of Machine Translation (MT) and professional post-edition. |
Jaione Bengoetxea; Yi-Ling Chung; Marco Guerini; Rodrigo Agerri; | arxiv-cs.CL | 2024-03-14 |
| 618 | Triples-to-isiXhosa (T2X): Addressing The Challenges of Low-Resource Agglutinative Data-to-Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper we tackle data-to-text for isiXhosa, which is low-resource and agglutinative. |
Francois Meyer; Jan Buys; | arxiv-cs.CL | 2024-03-12 |
| 619 | Consensus-Based Machine Translation for Code-Mixed Texts Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Multilingualism in India is widespread due to its long history of foreign acquaintances. This leads to the presence of an audience familiar with conversing using more than one … |
S. Mahata; Dipankar Das; Sivaji Bandyopadhyay; | ACM Transactions on Asian and Low-Resource Language … | 2024-03-09 |
| 620 | Enhanced Auto Language Prediction with Dictionary Capsule — A Novel Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The paper presents a novel Auto Language Prediction Dictionary Capsule (ALPDC) framework for language prediction and machine translation. |
PINNI VENKATA ABHIRAM et. al. | arxiv-cs.CL | 2024-03-09 |
| 621 | Cross-lingual Transfer or Machine Translation? On Data Augmentation for Monolingual Semantic Textual Similarity Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we directly compared two data augmentation techniques as potential solutions for monolingual STS: (a) cross-lingual transfer that exploits English resources alone as training data to yield non-English sentence embeddings as zero-shot inference, and (b) machine translation that converts English data into pseudo non-English training data in advance. |
Sho Hoshino; Akihiko Kato; Soichiro Murakami; Peinan Zhang; | arxiv-cs.CL | 2024-03-08 |
| 622 | BiVert: Bidirectional Vocabulary Evaluation Using Relations for Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a bidirectional semantic-based evaluation method designed to assess the sense distance of the translation from the source text. |
Carinne Cherf; Yuval Pinter; | arxiv-cs.CL | 2024-03-06 |
| 623 | Attempt Towards Stress Transfer in Speech-to-Speech Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present an Indian English-to-Hindi SSMT system that can transfer stress and aim to enhance the overall quality and engagement of educational content. |
Sai Akarsh; Vamshi Raghusimha; Anindita Mondal; Anil Vuppala; | arxiv-cs.CL | 2024-03-06 |
| 624 | GaHealth: An English-Irish Bilingual Corpus of Health Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Our study outlines the process used in developing the corpus and empirically demonstrates the benefits of using an in-domain dataset for the health domain. |
Séamus Lankford; Haithem Afli; Órla Ní Loinsigh; Andy Way; | arxiv-cs.CL | 2024-03-06 |
| 625 | GaHealth: An English–Irish Bilingual Corpus of Health Data IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Machine Translation is a mature technology for many high-resource language pairs. However in the context of low-resource languages, there is a paucity of parallel data datasets … |
Séamus Lankford; Haithem Afli; Orla Ni Loinsigh; Andy Way; | ArXiv | 2024-03-06 |
| 626 | General2Specialized LLMs Translation for E-commerce IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Taking e-commerce as an example, the texts usually include amounts of domain-related words and have more grammar problems, which leads to inferior performances of current NMT methods. To address these problems, we collect two domain-related resources, including a set of term pairs (aligned Chinese-English bilingual terms) and a parallel corpus annotated for the e-commerce domain. |
KAIDI CHEN et. al. | arxiv-cs.CL | 2024-03-06 |
| 627 | Adding Multimodal Capabilities to A Text-only Translation Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In order to perform well on both Multi30k and typical text-only datasets, we use a performant text-only machine translation (MT) model as the starting point of our MMT model. |
Vipin Vijayan; Braeden Bowen; Scott Grigsby; Timothy Anderson; Jeremy Gwinnup; | arxiv-cs.CL | 2024-03-05 |
| 628 | The Case for Evaluating Multimodal Translation Models on Text Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Namely, the use of visual information by the MMT model cannot be shown directly from the Multi30k test set results and the sentences in Multi30k are image captions, i.e., short, descriptive sentences, as opposed to complex sentences that typical text-only machine translation models are evaluated against. Therefore, we propose that MMT models be evaluated using 1) the CoMMuTE evaluation framework, which measures the use of visual information by MMT models, 2) the text-only WMT news translation task test sets, which evaluates translation performance against complex sentences, and 3) the Multi30k test sets, for measuring MMT model performance against a real MMT dataset. |
Vipin Vijayan; Braeden Bowen; Scott Grigsby; Timothy Anderson; Jeremy Gwinnup; | arxiv-cs.CL | 2024-03-05 |
| 629 | Transformers for Low-Resource Languages: Is Féidir Linn! IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The Transformer model is the state-of-the-art in Machine Translation. However, in general, neural translation models often underperform on language pairs with insufficient … |
Séamus Lankford; H. Afli; Andy Way; | Machine Translation Summit | 2024-03-04 |
| 630 | Enhancing Neural Machine Translation of Low-Resource Languages: Corpus Development, Human Evaluation and Explainable AI Architectures Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the current machine translation (MT) landscape, the Transformer architecture stands out as the gold standard, especially for high-resource language pairs. This research delves … |
Séamus Lankford; | ArXiv | 2024-03-03 |
| 631 | Machine Translation in The Covid Domain: An English-Irish Case Study for LoResMT 2021 IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Translation models for the specific domain of translating Covid data from English to Irish were developed for the LoResMT 2021 shared task. |
Séamus Lankford; Haithem Afli; Andy Way; | arxiv-cs.CL | 2024-03-02 |
| 632 | A Benchmark for Learning to Translate A New Language from One Grammar Book IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We turn to a field that is explicitly motivated and bottlenecked by a scarcity of web data: low-resource languages. In this paper, we introduce MTOB (Machine Translation from One Book), a benchmark for learning to translate between English and Kalamang—a language with less than 200 speakers and therefore virtually no presence on the web—using several hundred pages of field linguistics reference materials. |
Garrett Tanzer; Mirac Suzgun; Eline Visser; Dan Jurafsky; Luke Melas-Kyriazi; | iclr | 2024-02-26 |
| 633 | TEaR: Improving LLM-based Machine Translation with Systematic Self-Refinement IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Importantly, feeding back such error information into the LLMs can lead to self-refinement and result in improved translation performance. Motivated by these insights, we introduce a systematic LLM-based self-refinement translation framework, named TEaR, which stands for Translate, Estimate, and Refine, marking a significant step forward in this direction. |
ZHAOPENG FENG et. al. | arxiv-cs.CL | 2024-02-26 |
| 634 | A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this study, we propose a novel fine-tuning approach for LLMs that is specifically designed for the translation task, eliminating the need for the abundant parallel data that traditional translation models usually depend on. |
Haoran Xu; Young Jin Kim; Amr Sharaf; Hany Hassan Awadalla; | iclr | 2024-02-26 |
| 635 | Direct Punjabi to English Speech Translation Using Discrete Units Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: With a motive to contribute towards speech translation research for low-resource languages, our work presents a direct speech-to-speech translation model for one of the Indic languages called Punjabi to English. |
Prabhjot Kaur; L. Andrew M. Bush; Weisong Shi; | arxiv-cs.CL | 2024-02-24 |
| 636 | Bangla AI: A Framework for Machine Translation Utilizing Large Language Models for Ethnic Media Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The paper outlines a theoretical framework elucidating the integration of LLM and MMT into the news searching and translation processes for ethnic media. |
MD Ashraful Goni; Fahad Mostafa; Kerk F. Kee; | arxiv-cs.CL | 2024-02-21 |
| 637 | GATE X-E : A Challenge Set for Gender-Fair Translations from Weakly-Gendered Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite numerous studies on gender bias in translations into English from weakly gendered-languages, there are no benchmarks for evaluating this phenomenon or for assessing mitigation strategies. To address this gap, we introduce GATE X-E, an extension to the GATE (Rarrick et al., 2023) corpus, that consists of human translations from Turkish, Hungarian, Finnish, and Persian into English. |
Spencer Rarrick; Ranjita Naik; Sundar Poudel; Vishal Chowdhary; | arxiv-cs.CL | 2024-02-21 |
| 638 | Could We Have Had Better Multilingual LLMs If English Was Not The Central Language? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Large Language Models (LLMs) demonstrate strong machine translation capabilities on languages they are trained on. |
Ryandito Diandaru; Lucky Susanto; Zilu Tang; Ayu Purwarianti; Derry Wijaya; | arxiv-cs.CL | 2024-02-21 |
| 639 | UMBCLU at SemEval-2024 Task 1: Semantic Textual Relatedness with and Without Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The aim of SemEval-2024 Task 1, “Semantic Textual Relatedness for African and Asian Languages” is to develop models for identifying semantic textual relatedness (STR) between two … |
Shubhashis Roy Dipta; Sai Vallurupalli; | Proceedings of the 18th International Workshop on Semantic … | 2024-02-20 |
| 640 | UMBCLU at SemEval-2024 Task 1A and 1C: Semantic Textual Relatedness with and Without Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Large language models (LLMs) have shown impressive performance on several natural language understanding tasks such as multilingual machine translation (MMT), semantic similarity (STS), and encoding sentence embeddings. Using a combination of LLMs that perform well on these tasks, we developed two STR models, TranSem and FineSem, for the supervised and cross-lingual settings. |
Shubhashis Roy Dipta; Sai Vallurupalli; | arxiv-cs.CL | 2024-02-20 |
| 641 | A Study for Enhancing Low-resource Thai-Myanmar-English Neural Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Several methodologies have recently been proposed to enhance the performance of low-resource Neural Machine Translation (NMT). However, these techniques have yet to be explored … |
Mya Ei San; Sasiporn Usanavasin; Ye Kyaw Thu; Manabu Okumura; | ACM Transactions on Asian and Low-Resource Language … | 2024-02-13 |
| 642 | Unsupervised Sign Language Translation and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a sliding window method to address the issues of aligning variable-length text with video sequences. |
ZHENGSHENG GUO et. al. | arxiv-cs.CL | 2024-02-12 |
| 643 | TransLLaMa: LLM-based Simultaneous Translation System IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This study demonstrates that, after fine-tuning on a small dataset comprising causally aligned source and target sentence pairs, a pre-trained open-source LLM can control input segmentation directly by generating a special wait token. |
Roman Koshkin; Katsuhito Sudoh; Satoshi Nakamura; | arxiv-cs.CL | 2024-02-07 |
| 644 | Error Analysis of Pretrained Language Models (PLMs) in English-to-Arabic Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
H. Al-Khalifa; Khaloud Al-Khalefah; Hesham Haroon; | Hum. Centric Intell. Syst. | 2024-02-05 |
| 645 | Leveraging Machine Translation to Enhance Sentiment Analysis on Multilingual Text Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This research investigates the complexity of sentiment analysis in the multilingual context of Malaysia, where various languages, including English and Malay, are often … |
Chuk Fong Ho; Kae Lun Chean; Tong Ming Lim; | Proceedings of the 2024 13th International Conference on … | 2024-02-01 |
| 646 | Neural Machine Translation for Malayalam Paraphrase Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores four methods of generating paraphrases in Malayalam, utilizing resources available for English paraphrasing and pre-trained Neural Machine Translation (NMT) models. |
Christeena Varghese; Sergey Koshelev; Ivan P. Yamshchikov; | arxiv-cs.CL | 2024-01-31 |
| 647 | MultiMUC: Multilingual Template Filling on MUC-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce MultiMUC, the first multilingual parallel corpus for template filling, comprising translations of the classic MUC-4 template filling benchmark into five languages: Arabic, Chinese, Farsi, Korean, and Russian. |
WILLIAM GANTT et. al. | arxiv-cs.CL | 2024-01-29 |
| 648 | Massively Multilingual Text Translation For Low-Resource Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We attempt to leverage translation resources from rich-resource languages to efficiently produce best possible translation quality for well known texts, which are available in multiple languages, in a new, low-resource language. |
Zhong Zhou; | arxiv-cs.CL | 2024-01-29 |
| 649 | How Far Can 100 Samples Go? Unlocking Overall Zero-Shot Multilingual Translation Via Tiny Multi-Parallel Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we show that for an English-centric model, surprisingly large zero-shot improvements can be achieved by simply fine-tuning with a very small amount of multi-parallel data. |
Di Wu; Shaomu Tan; Yan Meng; David Stap; Christof Monz; | arxiv-cs.CL | 2024-01-22 |
| 650 | Gender Bias in Machine Translation and The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This chapter examines the role of Machine Translation in perpetuating gender bias, highlighting the challenges posed by cross-linguistic settings and statistical dependencies. |
Eva Vanmassenhove; | arxiv-cs.CL | 2024-01-18 |
| 651 | Machine Translation with Large Language Models: Prompt Engineering for Persian, English, and Russian Directions IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Generative large language models (LLMs) have demonstrated exceptional proficiency in various natural language processing (NLP) tasks, including machine translation, question … |
Nooshin Pourkamali; Shler Ebrahim Sharifi; | ArXiv | 2024-01-16 |
| 652 | A Novel Approach for Automatic Program Repair Using Round-Trip Translation with Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back using neural machine translation with language models. We … |
Fernando Vallecillos Ruiz; Anastasiia Grishina; Max Hort; Leon Moonen; | ArXiv | 2024-01-15 |
| 653 | Machine Translation Models Are Zero-Shot Detectors of Translation Direction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we explore an unsupervised approach to translation direction detection based on the simple hypothesis that $p(\text{translation}|\text{original})>p(\text{original}|\text{translation})$, motivated by the well-known simplification effect in translationese or machine-translationese. |
Michelle Wastl; Jannis Vamvas; Rico Sennrich; | arxiv-cs.CL | 2024-01-12 |
| 654 | An Approach for Mistranslation Removal from Popular Dataset for Indic MT Task Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Hence, the MT systems built using this dataset cannot perform to their usual potential. In this paper, we propose an algorithm to remove mistranslations from the training corpus and evaluate its performance and efficiency. |
Sudhansu Bala Das; Leo Raphael Rodrigues; Tapas Kumar Mishra; Bidyut Kr. Patra; | arxiv-cs.CL | 2024-01-12 |
| 655 | Addressing Data Scarcity Issue for English–Mizo Neural Machine Translation Using Data Augmentation and Language Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Low-resource language in machine translation systems poses multiple complications regarding accuracy in translation due to insufficient incorporation of linguistic information. … |
Vanlalmuansangi Khenglawt; Sahinur Rahman Laskar; Partha Pakray; Ajoy Kumar Khan; | Journal of Intelligent & Fuzzy Systems | 2024-01-11 |
| 656 | End to End Hindi to English Speech Conversion Using Bark, MBART and A Finetuned XLSR Wav2Vec2 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Speech has long been a barrier to effective communication and connection, persisting as a challenge in our increasingly interconnected world. This research paper introduces a … |
Aniket Tathe; Anand Kamble; Suyash Kumbharkar; Atharva Bhandare; Anirban C. Mitra; | ArXiv | 2024-01-11 |
| 657 | Building Efficient and Effective OpenQA Systems for Low-Resource Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we show that effective, low-cost OpenQA systems can be developed for low-resource contexts. |
EMRAH BUDUR et. al. | arxiv-cs.CL | 2024-01-07 |
| 658 | Optimising LLM-Driven Machine Translation with Context-Aware Sliding Windows Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper describes SheffieldGATE’s submission to WMT 2024 Chat Shared Translation Task. We participate in three language pairs: English-German, English-Dutch, and … |
Xinye Yang; Yida Mu; Kalina Bontcheva; Xingyi Song; | Conference on Machine Translation | 2024-01-01 |
| 659 | The SETU-ADAPT Submissions to WMT 2024 Chat Translation Tasks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents the SETU-ADAPT submissions to the WMT24 Chat Translation Task. Large language models (LLMs) currently provide the state-of-the-art solutions in many natural … |
Maria Zafar; Antonio Castaldo; Prashant Nayak; Rejwanul Haque; Andy Way; | Conference on Machine Translation | 2024-01-01 |
| 660 | Reducing Redundancy in Japanese-to-English Translation: A Multi-Pipeline Approach for Translating Repeated Elements in Japanese Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents a multi-pipeline Japanese-to-English machine translation (MT) system designed to address the challenge of translating repeated elements from Japanese into … |
Qiao Wang; Yixuan Huang; Zheng Yuan; | Conference on Machine Translation | 2024-01-01 |
| 661 | SYSTRAN @ WMT24 Non-Repetitive Translation Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Many contemporary NLP systems rely on neural decoders for text generation, which demonstrate an impressive ability to generate text approaching human fluency levels. However, in … |
Marko Avila; Josep Crego; | Conference on Machine Translation | 2024-01-01 |
| 662 | Graph Representations for Machine Translation in Dialogue Settings Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this paper, we present our approach to the WMT24 – Chat Task, addressing the challenge of translating chat conversations. Chat conversations are characterised by their … |
Lea Krause; Selene Baez Santamaria; Jan-Christoph Kalo; | Conference on Machine Translation | 2024-01-01 |
| 663 | YNU-HPCC at SemEval-2024 Task 7: Instruction Fine-tuning Models for Numerical Understanding and Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents our systems for Task 7, Numeral-Aware Language Understanding and Generation of SemEval 2024. As participants of Task 7, we engage in all subtasks and implement … |
Kaiyuan Chen; Jin Wang; Xuejie Zhang; | International Workshop on Semantic Evaluation | 2024-01-01 |
| 664 | Findings of WMT2024 English-to-Low Resource Multimodal Translation Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents the results of the English-to-Low Resource Multimodal Translation shared tasks from the Ninth Conference on Machine Translation (WMT2024). This year, 7 teams … |
Shantipriya Parida; Ondrej Bojar; Idris Abdulmumin; Shamsuddeen Hassan Muhammad; I. Ahmad; | Conference on Machine Translation | 2024-01-01 |
| 665 | Findings of WMT 2024 Shared Task on Low-Resource Indic Languages Translation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents the results of the low-resource Indic language translation task, organized in conjunction with the Ninth Conference on Machine Translation (WMT) 2024. In this … |
PARTHA PAKRAY et. al. | Conference on Machine Translation | 2024-01-01 |
| 666 | Rakuten’s Participation in WMT 2024 Patent Translation Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper introduces our machine translation system (team sakura), developed for the 2024 WMT Patent Translation Task. Our system focuses on translations between … |
Ohnmar Htun; Alberto Poncelas; | Conference on Machine Translation | 2024-01-01 |
| 667 | Spanish Corpus and Provenance with Computer-Aided Translation for The WMT24 OLDI Shared Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents the SEED-CAT submission to the WMT24 Open Language Data Initiative shared task. We detail our data collection method, which involves a computer-aided … |
Jose Cols; | Conference on Machine Translation | 2024-01-01 |
| 668 | WMT24 System Description for The MultiIndic22MT Shared Task on Manipuri Language Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents a Transformer-based Neural Machine Translation (NMT) system developed by the Centre for Natural Language Processing and the Department of Computer Science and … |
Ningthoujam Justwant Singh; Kshetrimayum Boynao Singh; Avichandra Singh Ningthoujam; Sanjita Phijam; Thoudam Doren Singh; | Conference on Machine Translation | 2024-01-01 |
| 669 | English-to-Low-Resource Translation: A Multimodal Approach for Hindi, Malayalam, Bengali, and Hausa Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Multimodal machine translation leverages multiple data modalities to enhance translation quality, particularly for low-resourced languages. This paper uses a multimodal model that … |
ALI HATAMI et. al. | Conference on Machine Translation | 2024-01-01 |
| 670 | DCU ADAPT at WMT24: English to Low-resource Multi-Modal Translation Task Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Sami Haq; Rudali Huidrom; Sheila Castilho; | Conference on Machine Translation | 2024-01-01 |
| 671 | System Description of BV-SLP for Sindhi-English Machine Translation in MultiIndic22MT 2024 Shared Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents our machine translation system that was developed for the WAT2024 MultiIndic MT shared task. We built our system for the Sindhi-English language pair. We … |
Nisheeth Joshi; Pragya Katyayan; Palak Arora; Bharti Nathani; | Conference on Machine Translation | 2024-01-01 |
| 672 | NovelTrans: System for WMT24 Discourse-Level Literary Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper describes our submission system, NovelTrans, from NLP2CT and DeepTranx for the WMT24 Discourse-Level Literary Translation Task in Chinese-English, Chinese-German, and … |
Yuchen Liu; Yutong Yao; Runzhe Zhan; Yuchu Lin; Derek F. Wong; | Conference on Machine Translation | 2024-01-01 |
| 673 | CzeGPT-2–Training New Model for Czech Generative Text Processing Evaluated With The Summarization Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Automatic text summarization (ATS), alongside neural machine translation or question answering, is one of the leading tasks in Natural Language Processing (NLP). In recent years, … |
Adam Hájek; Aleš Horák; | IEEE Access | 2024-01-01 |
| 674 | Hidetsune at SemEval-2024 Task 10: An English Based Approach to Emotion Recognition in Hindi-English Code-mixed Conversations Using Machine Learning and Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this system paper for SemEval-2024 Task 10 subtask 1 (ERC), I present my approach to recognizing emotions in Hindi-English code-mixed conversations. I train a SpaCy model with … |
Hidetsune Takahashi; | International Workshop on Semantic Evaluation | 2024-01-01 |
| 675 | An Intelligent Error Detection Model for Machine Translation Using Composite Neural Network-Based Semantic Perception Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Although machine translation has received great progress in recent years, machine translation results usually existed some errors due to the complex relationship between sentence … |
Yaoxi Wu; Qiao Liang; | IEEE Access | 2024-01-01 |
| 676 | Improving BERTScore for Machine Translation Evaluation Through Contrastive Learning IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: BERTScore is an automatic evaluation metric for machine translation. It calculates similarity scores between candidate and reference tokens through embeddings. The quality of … |
Gongbo Tang; Oreen Yousuf; Zeying Jin; | IEEE Access | 2024-01-01 |
| 677 | BLASER 2.0: A Metric for Evaluation and Quality Estimation of Massively Multilingual Speech and Text Translation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We present BLASER 2.0, an automatic metric of machine translation quality which supports both speech and text modalities. Compared to its predecessor BLASER (Chen et al., 2023), … |
David Dale; M. Costa-jussà; | Conference on Empirical Methods in Natural Language … | 2024-01-01 |
| 678 | How Far Can 100 Samples Go? Unlocking Zero-Shot Translation with Tiny Multi-Parallel Data IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Zero-shot translation aims to translate between language pairs not seen during training in Multilingual Machine Translation (MMT) and is widely considered an open problem. A … |
Di Wu; Shaomu Tan; Yan Meng; David Stap; C. Monz; | Annual Meeting of the Association for Computational … | 2024-01-01 |
| 679 | English to Arabic Braille Neural Machine Translation Through Corpus Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Nisheeth Joshi; Pragya Katyayan; Syed Afroz Ahmed; | International Conference on Arabic Computational Linguistics | 2024-01-01 |
| 680 | Rethinking Efficient Multilingual Text Summarization Meta-Evaluation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Evaluating multilingual summarization evaluation metrics, i.e., meta-evaluation, is challenging because of the difficulty of human annotation collection. Therefore, we investigate … |
Rilyn Han; Jiawen Chen; Yixin Liu; Arman Cohan; | Annual Meeting of the Association for Computational … | 2024-01-01 |
| 681 | Naïve Bayes Approach for Word Sense Disambiguation System With A Focus on Parts-of-Speech Ambiguity Resolution IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Natural languages are written and spoken languages, and NLP (Natural Language Processing) is the ability of a computer program to recognize both written and spoken languages. Word … |
AJITH ABRAHAM et. al. | IEEE Access | 2024-01-01 |
| 682 | Tamil Lang TSP: Tamil Lang Transformer Neural Text to Sign Production Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Tamil lang Task-Specific Prompts (TSP) is an advanced machine translation system that seamlessly converts Tamil text into Tamil Sign Language. This innovative system integrates … |
S. ThillaiSivakavi; R. I. Minu; | Int. Arab J. Inf. Technol. | 2024-01-01 |
| 683 | Enhancing Nepali Text Understanding with Machine Translation and LoRA Fine-Tuning of Open-Source LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Kshitiz Rimal; Noorhan Abbas; | SGAI Conferences | 2024-01-01 |
| 684 | Improving LLM-based Machine Translation with Systematic Self-Correction IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Large Language Models (LLMs) have achieved impressive results in Machine Translation (MT). However, careful evaluations by human reveal that the translations produced by LLMs … |
ZHAOPENG FENG et. al. | ArXiv | 2024-01-01 |
| 685 | Entity-aware Multi-task Training Helps Rare Word Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Named entities (NE) are integral for preserving context and conveying accurate information in the machine translation (MT) task. Challenges often lie in handling NE diversity, … |
Matīss Rikters; Makoto Miwa; | International Conference on Natural Language Generation | 2024-01-01 |
| 686 | Large Language Models As Legal Translators of Arabic Legislatives: Does ChatGPT and Gemini Care for Context and Terminology? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Accurate translation of terminology and adaptation to in-context information is a pillar to high quality translation. Recently, there is a remarkable interest towards the use and … |
Khadija ElFqih; Johanna Monti; | ARABICNLP | 2024-01-01 |
| 687 | Enhancing Low-Resource NLP By Consistency Training With Data and Model Perturbations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Natural language processing (NLP) has recently shown significant progress in rich-resource scenarios. However, it is much less effective for low-resource scenarios due to the … |
XIAOBO LIANG et. al. | IEEE/ACM Transactions on Audio, Speech, and Language … | 2024-01-01 |
| 688 | Speech Recognition and Intelligent Translation Under Multimodal Human–computer Interaction System Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The traditional translation robot is limited to the translation of single-mode text images and text videos, which has the problem of low translation accuracy. Therefore, speech … |
Danhua Huang; Shuaiqiu Xiang; | Journal of Intelligent Systems | 2024-01-01 |
| 689 | MSLC24 Submissions to The General Machine Translation Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The MSLC (Metric Score Landscape Challenge) submissions for English–German, English–Spanish, and Japanese–Chinese are constrained systems built using Transformer models for the … |
Samuel Larkin; Chi-kiu Lo; Rebecca Knowles; | Conference on Machine Translation | 2024-01-01 |
| 690 | Findings of The WMT 2024 Biomedical Translation Shared Task: Test Sets on Abstract Level Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We present the results of the ninth edition of the Biomedical Translation Task at WMT’24. We released test sets for six language pairs, namely, French, German, Italian, … |
MARIANA L. NEVES et. al. | Conference on Machine Translation | 2024-01-01 |
| 691 | Arabic Text Formality Modification: A Review and Future Research Directions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Formality transfer seeks to adjust text formality without altering its core meaning, which carries substantial implications across diverse domains like machine translation, … |
Shadi I. Abudalfa; F. J. Abdu; Maad Alowaifeer; | IEEE Access | 2024-01-01 |
| 692 | English-chinese Bilingual Teaching: A DECTMT-NBO-DMSFNN Approach for Design and Application of Machine Translation Technology Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This manuscript introduces a bilingual teaching model prediction system called Deep Multi-Scale Fusion Neural Network (DMSFNN). The system utilizes data from the Back-end … |
Yuwei Wang; | Int. Arab J. Inf. Technol. | 2024-01-01 |
| 693 | Tulu Language Text Recognition and Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Language is a primary means of communication, but it is not the only means; knowing a language does, however, assist speed up the process. Many distinct languages are spoken … |
.. Prathwini; Anisha P. Rodrigues; P. Vijaya; Roshan Fernandes; | IEEE Access | 2024-01-01 |
| 694 | How Grammatical Features Impact Machine Translation: A New Test Suite for Chinese-English MT Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
HUACHENG SONG et. al. | Conference on Machine Translation | 2024-01-01 |
| 695 | T5G2P: Text-to-Text Transfer Transformer Based Grapheme-to-Phoneme Conversion Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The present paper explores the use of several deep neural network architectures to carry out a grapheme-to-phoneme (G2P) conversion, aiming to find a universal and … |
Markéta Řezáčková; Daniel Tihelka; J. Matousek; | IEEE/ACM Transactions on Audio, Speech, and Language … | 2024-01-01 |
| 696 | Investigating The Linguistic Performance of Large Language Models in Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper summarizes the results of our test suite evaluation on 39 machine translation systems submitted at the Shared Task of the Ninth Conference of Machine Translation … |
SHUSHEN MANAKHIMOVA et. al. | Conference on Machine Translation | 2024-01-01 |
| 697 | Domain Dynamics: Evaluating Large Language Models in English-Hindi Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities in machine translation, leveraging extensive pre-training on vast amounts of data. However, this generalist … |
Soham Bhattacharjee; Baban Gain; Asif Ekbal; | Conference on Machine Translation | 2024-01-01 |
| 698 | OZemi at SemEval-2024 Task 1: A Simplistic Approach to Textual Relatedness Evaluation Using Transformers and Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this system paper for SemEval-2024 Task 1 subtask A, we present our approach to evaluating the semantic relatedness of sentence pairs in nine languages. We use a mix of … |
HIDETSUNE TAKAHASHI et. al. | International Workshop on Semantic Evaluation | 2024-01-01 |
| 699 | Research on Grammatical Error Correction Algorithm in English Translation Via Deep Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study provides a concise overview of a grammatical error correction algorithm that is based on an encoder-decoder machine translation structure. Additionally, it incorporates … |
Lihua Cai; | Journal of Intelligent Systems | 2024-01-01 |
| 700 | Research on Automatic Identification of Machine English Translation Errors Based on Improved GLR Algorithm Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Machine translation is a powerful tool for overcoming linguistic obstacles, but it often introduces errors that lower the overall translation quality. This research project aims … |
Guanghuan Li; | Informatica (Slovenia) | 2024-01-01 |
| 701 | UvA-MT’s Participation in The WMT24 General Translation Shared Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Fine-tuning Large Language Models (FT-LLMs) with parallel data has emerged as a promising paradigm in recent machine translation research. In this paper, we explore the … |
Shaomu Tan; David Stap; Seth Aycock; C. Monz; Di Wu; | Conference on Machine Translation | 2024-01-01 |
| 702 | CUNI at WMT24 General Translation Task: LLMs, (Q)LoRA, CPO and Model Merging Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents the contributions of Charles University teams to the WMT24 General Translation task (English to Czech, German and Russian, and Czech to Ukrainian) and the … |
MIROSLAV HRABAL et. al. | Conference on Machine Translation | 2024-01-01 |
| 703 | Document-level Translation with LLM Reranking: Team-J at WMT 2024 General Translation Task IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We participated in the constrained track for English-Japanese and Japanese-Chinese translations at the WMT 2024 General Machine Translation Task. Our approach was to generate a … |
KEITO KUDO et. al. | Conference on Machine Translation | 2024-01-01 |
| 704 | TSU HITS’s Submissions to The WMT 2024 General Machine Translation Shared Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper describes the TSU HITS team’s submission system for the WMT’24 general translation task. We focused on exploring the capabilities of discrete diffusion models for the … |
Vladimir Mynka; Nikolay Mikhaylovskiy; | Conference on Machine Translation | 2024-01-01 |
| 705 | No Error Left Behind: Multilingual Grammatical Error Correction with Pre-trained Translation Models IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Grammatical Error Correction (GEC) enhances language proficiency and promotes effective communication, but research has primarily centered around English. We propose a simple … |
Agnes Luhtaru; Elizaveta Korotkova; Mark Fishel; | Conference of the European Chapter of the Association for … | 2024-01-01 |
| 706 | TMU-HIT’s Submission for The WMT24 Quality Estimation Shared Task: Is GPT-4 A Good Evaluator for Machine Translation? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In machine translation quality estimation (QE), translation quality is evaluated automatically without the need for reference translations. This paper describes our contribution … |
AYAKO SATO et. al. | Conference on Machine Translation | 2024-01-01 |
| 707 | The Bangla/Bengali Seed Dataset Submission to The WMT24 Open Language Data Initiative Shared Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We contribute a seed dataset for the Bangla/Bengali language as part of the WMT24 Open Language Data Initiative shared task. We validate the quality of the dataset against a mined … |
Firoz Ahmed; Nitin Venkateswaran; Sarah Moeller; | Conference on Machine Translation | 2024-01-01 |
| 708 | HW-TSC’s Participation in The WMT 2024 QEAPE Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The paper presents the submission by HW-TSC in the WMT 2024 Quality-informed Automatic Post Editing (QEAPE) shared task for the English-Hindi (En-Hi) and English-Tamil (En-Ta) … |
JIAWEI YU et. al. | Conference on Machine Translation | 2024-01-01 |
| 709 | Evaluating WMT 2024 Metrics Shared Task Submissions on AfriMTE (the African Challenge Set) Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The AfriMTE challenge set from WMT 2024 Metrics Shared Task aims to evaluate the capabilities of evaluation metrics for machine translation on low-resource African languages, … |
Jiayi Wang; D. Adelani; Pontus Stenetorp; | Conference on Machine Translation | 2024-01-01 |
| 710 | MSLC24: Further Challenges for Metrics on A Wide Landscape of Translation Quality Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this second edition of the Metric Score Landscape Challenge (MSLC), we examine how automatic metrics for machine translation perform on a wide variety of machine translation … |
Rebecca Knowles; Samuel Larkin; Chi-kiu Lo; | Conference on Machine Translation | 2024-01-01 |
| 711 | SRIB-NMT’s Submission to The Indic MT Shared Task in WMT 2024 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the context of the Indic Low Resource Machine Translation (MT) challenge at WMT-24 (Pakray et al., 2024), we participated in four language pairs: English-Assamese (en-as), … |
Pranamya Patil; Raghavendra Hr; Aditya Raghuwanshi; Kushal Verma; | Conference on Machine Translation | 2024-01-01 |
| 712 | MTNLP-IIITH: Machine Translation for Low-Resource Indic Languages Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Machine Translation for low-resource languages poses significant challenges, primarily due to the limited availability of data. The WMT24 Low-Resource Indic Neural Machine … |
Abhinav P M; Ketaki Shetye; Parameswari Krishnamurthy; | Conference on Machine Translation | 2024-01-01 |
| 713 | Findings of The WMT 2024 Shared Task on Non-Repetitive Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The repetition of words in an English sentence can create a monotonous or awkward impression. In such cases, repetition should be avoided appropriately. To evaluate the … |
Kazutaka Kinugawa; Hideya Mino; Isao Goto; Naoto Shirai; | Conference on Machine Translation | 2024-01-01 |
| 714 | Benchmarking and Improving Long-Text Translation with Large Language Models IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Recent studies have illuminated the promising capabilities of large language models (LLMs) in handling long texts. However, their performance in machine translation (MT) of long … |
LONGYUE WANG et. al. | Annual Meeting of the Association for Computational … | 2024-01-01 |
| 715 | Research on Tibetan-Chinese Neural Machine Translation Integrating Statistical Method Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In recent years, with the emergence of deep learning methods, Neural Machine Translation has become a new research direction of machine translation. Due to the scarcity of digital … |
Maoxian Zhou; | Proceedings of the 2023 6th International Conference on … | 2023-12-27 |
| 716 | A Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for Fairer Instruction-Tuned Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In MT, this might lead to misgendered translations, resulting, among other harms, in the perpetuation of stereotypes and prejudices. In this work, we address this gap by investigating whether and to what extent such models exhibit gender bias in machine translation and how we can mitigate it. |
Giuseppe Attanasio; Flor Plaza del Arco; Debora Nozza; Anne Lauscher; | emnlp | 2023-12-22 |
| 717 | Challenges in Context-Aware Neural Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we investigate and present several core challenges that impede progress within the field, relating to discourse phenomena, context usage, model architectures, and document-level evaluation. |
Linghao Jin; Jacqueline He; Jonathan May; Xuezhe Ma; | emnlp | 2023-12-22 |
| 718 | Video-Helpful Multimodal Machine Translation Highlight: We introduce EVA (Extensive training set and Video-helpful evaluation set for Ambiguous subtitles translation), an MMT dataset containing 852k Japanese-English parallel subtitle pairs, 520k Chinese-English parallel subtitle pairs, and corresponding video clips collected from movies and TV episodes. |
Yihang Li; Shuichiro Shimizu; Chenhui Chu; Sadao Kurohashi; Wei Li; | emnlp | 2023-12-22 |
| 719 | Revisiting Machine Translation for Cross-lingual Classification IF:3 Highlight: We show that, by using a stronger MT system and mitigating the mismatch between training on original text and running inference on machine translated text, translate-test can do substantially better than previously assumed. |
Mikel Artetxe; Vedanuj Goswami; Shruti Bhosale; Angela Fan; Luke Zettlemoyer; | emnlp | 2023-12-22 |
| 720 | Hi Guys or Hi Folks? Benchmarking Gender-Neutral Machine Translation with The GeNTE Corpus IF:3 Highlight: Based on GeNTE, we then overview existing reference-based evaluation approaches, highlight their limits, and propose a reference-free method more suitable to assess gender-neutral translation. |
Andrea Piergentili; Beatrice Savoldi; Dennis Fucci; Matteo Negri; Luisa Bentivogli; | emnlp | 2023-12-22 |
| 721 | Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs IF:3 Highlight: However, when it comes to non-English languages, the quantity and quality of textual information are comparatively scarce. To address this issue, we introduce the novel task of automatic Knowledge Graph Enhancement (KGE) and perform a thorough investigation on bridging the gap in both the quantity and quality of textual information between English and non-English languages. |
SIMONE CONIA et al. | emnlp | 2023-12-22 |
| 722 | PROSE: A Pronoun Omission Solution for Chinese-English Spoken Language Translation Highlight: To alleviate the negative impact introduced by pro-drop, we propose Mention-Aware Semantic Augmentation, a novel approach that leverages the semantic embedding of dropped pronouns to augment training pairs. |
Ke Wang; Xiutian Zhao; Yanghui Li; Wei Peng; | emnlp | 2023-12-22 |
| 723 | On The Use of Metaphor Translation in Psychiatry Highlight: Now, metaphor has been shown to be paramount in both identifying individuals struggling with mental problems and helping those individuals understand and communicate their experiences. Therefore, this paper aims to survey the potential of Machine Translation for providing equitable psychiatric healthcare and highlights the need for further research on the transferability of existing machine and metaphor translation research in the domain of psychiatry. |
Lois Wong; | arxiv-cs.CL | 2023-12-22 |
| 724 | CLAD-ST: Contrastive Learning with Adversarial Data for Robust Speech Translation Highlight: We address this robustness problem in downstream MT models by forcing the MT encoder to bring the representations of a noisy input closer to its clean version in the semantic space. This is achieved by introducing a contrastive learning method that leverages adversarial examples in the form of ASR outputs paired with their corresponding human transcripts to optimize the network parameters. |
Sathish Indurthi; Shamil Chollampatt; Ravi Agrawal; Marco Turchi; | emnlp | 2023-12-22 |
| 725 | An Empirical Study of Unsupervised Neural Machine Translation: Analyzing NMT Output, Model’s Behavior and Sentences’ Contribution Highlight: We focus on three very diverse languages, French, Gujarati, and Kazakh, and train bilingual NMT models, to and from English, with various levels of supervision, in high- and low- resource setups, measure quality of the NMT output and compare the generated sequences’ word order and semantic similarity to source and reference sentences. |
Isidora Chara Tourni; Derry Wijaya; | arxiv-cs.CL | 2023-12-19 |
| 726 | Fine-tuning Large Language Models for Adaptive Machine Translation IF:3 Highlight: This paper presents the outcomes of fine-tuning Mistral 7B, a general-purpose large language model (LLM), for adaptive machine translation (MT). |
Yasmin Moslem; Rejwanul Haque; Andy Way; | arxiv-cs.CL | 2023-12-19 |
| 727 | APE-then-QE: Correcting Then Filtering Pseudo Parallel Corpora for MT Training Data Creation Abstract: Automatic Post-Editing (APE) is the task of automatically identifying and correcting errors in the Machine Translation (MT) outputs. We propose a repair-filter-use methodology … |
Akshay Batheja; S. Deoghare; Diptesh Kanojia; Pushpak Bhattacharyya; | ArXiv | 2023-12-18 |
| 728 | Neural Machine Translation of Clinical Text: An Empirical Investigation Into Multilingual Pre-Trained Language Models and Transfer-Learning IF:3 Highlight: We conduct investigations on clinical text machine translation by examining multilingual neural network models using deep learning such as Transformer based structures. |
LIFENG HAN et al. | arxiv-cs.CL | 2023-12-12 |
| 729 | Performance Evaluation of Popular Deep Neural Networks for Neural Machine Translation IF:3 Abstract: The field of Neural Machine Translation (NMT) has shown impressive performance for quick and easy communication in various languages spoken all over the world. NMT helps us by … |
MUHAMMAD NAEEM et al. | 2023 International Conference on Frontiers of Information … | 2023-12-11 |
| 730 | Converting Epics/Stories Into Pseudocode Using Transformers Highlight: With this research paper, we aim to present a methodology to generate pseudocode from a given agile user story of small functionalities so as to reduce the overall time spent on the industrial project. |
Gaurav Kolhatkar; Akshit Madan; Nidhi Kowtal; Satyajit Roy; Sheetal Sonawane; | arxiv-cs.CL | 2023-12-08 |
| 731 | Design of Automatic Translation System for English for Special Purpose in Agriculture Based on Neural Machine Translation Abstract: Agricultural terms have some unique characteristics, which make them need special treatment in machine translation. Agriculture is a highly specialized field, with a large number … |
Meilin Wang; | Proceedings of the 3rd International Conference on … | 2023-12-08 |
| 732 | Efficient Monotonic Multihead Attention Highlight: We introduce the Efficient Monotonic Multihead Attention (EMMA), a state-of-the-art simultaneous translation model with numerically-stable and unbiased monotonic alignment estimation. |
Xutai Ma; Anna Sun; Siqi Ouyang; Hirofumi Inaguma; Paden Tomasello; | arxiv-cs.CL | 2023-12-07 |
| 733 | Improving Neural Machine Translation By Multi-Knowledge Integration with Prompting Highlight: In this work, we focus on how to integrate multi-knowledge, multiple types of knowledge, into NMT models to enhance the performance with prompting. |
Ke Wang; Jun Xie; Yuqi Zhang; Yu Zhao; | arxiv-cs.CL | 2023-12-07 |
| 734 | English-Arabic Text Translation and Abstractive Summarization Using Transformers Abstract: The vast growth of online and offline data has revolutionized how we gather, evaluate, and understand information. Comprehending lengthy text documents and extracting crucial … |
Heidi Ahmed Holiel; Nancy Mohamed; Arwa Ahmed; Walaa Medhat; | 2023 20th ACS/IEEE International Conference on Computer … | 2023-12-04 |
| 735 | End-to-End Speech-to-Text Translation: A Survey IF:3 Highlight: As a result, researchers have been exploring end-to-end (E2E) models for ST translation. |
Nivedita Sethiya; Chandresh Kumar Maurya; | arxiv-cs.CL | 2023-12-02 |
| 736 | Women Are Beautiful, Men Are Leaders: Gender Stereotypes in Machine Translation and Language Modeling IF:3 Highlight: We present GEST — a new manually created dataset designed to measure gender-stereotypical reasoning in language models and machine translation systems. |
Matúš Pikuliak; Andrea Hrckova; Stefan Oresko; Marián Šimko; | arxiv-cs.CL | 2023-11-30 |
| 737 | Relevance-guided Neural Machine Translation Highlight: We propose an explainability-based training approach for NMT, applied in Unsupervised and Supervised model training, for translation of three languages of varying resources, French, Gujarati, Kazakh, to and from English. |
Isidora Chara Tourni; Derry Wijaya; | arxiv-cs.CL | 2023-11-30 |
| 738 | English to Arabic Machine Translation of Mathematical Documents Abstract: This paper is about the development of a machine translation system tailored specifically for LATEX mathematical documents. The system focuses on translating English LATEX … |
Mustapha Eddahibi; Mohammed Mensouri; | ArXiv | 2023-11-29 |
| 739 | A Benchmark for Evaluating Machine Translation Metrics on Dialects Without Standard Orthography IF:3 Highlight: In this work, we evaluate how robust metrics are to non-standardized dialects, i.e. spelling differences in language varieties that do not have a standard orthography. |
Noëmi Aepli; Chantal Amrhein; Florian Schottmann; Rico Sennrich; | arxiv-cs.CL | 2023-11-28 |
| 740 | Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs IF:3 Highlight: However, when it comes to non-English languages, the quantity and quality of textual information are comparatively scarce. To address this issue, we introduce the novel task of automatic Knowledge Graph Enhancement (KGE) and perform a thorough investigation on bridging the gap in both the quantity and quality of textual information between English and non-English languages. |
SIMONE CONIA et al. | arxiv-cs.AI | 2023-11-27 |
| 741 | Reducing Gender Bias in Machine Translation Through Counterfactual Data Generation Highlight: We also propose a novel domain-adaptation technique that leverages in-domain data created with the counterfactual data generation techniques proposed by Zmigrod et al. (2019) to further improve accuracy on the WinoMT challenge test set without significant loss in translation quality. We show its effectiveness in NMT systems from English into three morphologically rich languages French, Spanish, and Italian. |
Ranjita Naik; Spencer Rarrick; Vishal Chowdhary; | arxiv-cs.CL | 2023-11-27 |
| 742 | Improving Word Sense Disambiguation in Neural Machine Translation with Salient Document Context Highlight: We introduce a simple and scalable approach to resolve translation ambiguity by incorporating a small amount of extra-sentential context in neural MT. Our approach requires no sense annotation and no change to standard model architectures. |
Elijah Rippeth; Marine Carpuat; Kevin Duh; Matt Post; | arxiv-cs.CL | 2023-11-26 |
| 743 | Machine Translation to Control Formality Features in The Target Language Highlight: When a language translation technique is used to translate from a source language that does not pertain the formality (e.g. English) to a target language that does, there is a missing information on formality that could be a challenge in producing an accurate outcome. This research explores how this issue should be resolved when machine learning methods are used to translate from English to languages with formality, using Hindi as the example data. |
Harshita Tyagi; Prashasta Jung; Hyowon Lee; | arxiv-cs.CL | 2023-11-22 |
| 744 | Context-aware Neural Machine Translation for English-Japanese Business Scene Dialogues Highlight: In this paper, we explore how context-awareness can improve the performance of the current Neural Machine Translation (NMT) models for English-Japanese business dialogues translation, and what kind of context provides meaningful information to improve translation. |
Sumire Honda; Patrick Fernandes; Chrysoula Zerva; | arxiv-cs.CL | 2023-11-20 |
| 745 | Thai-English Neural Machine Translation Method with Local and Global Syllable Feature Abstract: Neural Machine Translation (NMT) has superseded Statistical Machine Translation (SMT) owing to the advent of deep learning in natural language processing. However, enhancing the … |
Ming Xiong; Ruimin Liu; | 2023 International Conference on Asian Language Processing … | 2023-11-18 |
| 746 | Vashantor: A Large-scale Multilingual Benchmark Dataset for Automated Translation of Bangla Regional Dialects to Bangla Language IF:3 Highlight: Despite extensive study into translating Bangla to English, English to Bangla, and Banglish to Bangla in the past, there has been a noticeable gap in translating Bangla regional dialects into standard Bangla. In this study, we set out to fill this gap by creating a collection of 32,500 sentences, encompassing Bangla, Banglish, and English, representing five regional Bangla dialects. |
FATEMA TUJ JOHORA FARIA et al. | arxiv-cs.CL | 2023-11-18 |
| 747 | Multi-Task Self-Supervised Learning Based Tibetan-Chinese Speech-to-Speech Translation Abstract: Speech-to-speech translation tasks are commonly tackled by using a three-level cascade system which comprises of speech recognition, machine translation, and speech synthesis. … |
Rouhe Liu; Yue Zhao; Xiaona Xu; | 2023 International Conference on Asian Language Processing … | 2023-11-18 |
| 748 | English–Vietnamese Machine Translation Using Deep Learning for Chatbot Applications IF:3 |
Sakya Tuan; P. Meesad; Ha Huy Cuong Nguyen; | SN Computer Science | 2023-11-15 |
| 749 | Evaluating Gender Bias in The Translation of Gender-Neutral Languages Into English Highlight: Despite numerous studies into gender bias in translations from gender-neutral languages such as Turkish into more strongly gendered languages like English, there are no benchmarks for evaluating this phenomenon or for assessing mitigation strategies. To address this gap, we introduce GATE X-E, an extension to the GATE (Rarrick et al., 2023) corpus, that consists of human translations from Turkish, Hungarian, Finnish, and Persian into English. |
Spencer Rarrick; Ranjita Naik; Sundar Poudel; Vishal Chowdhary; | arxiv-cs.CL | 2023-11-15 |
| 750 | SentAlign: Accurate and Scalable Sentence Alignment IF:3 Highlight: We present SentAlign, an accurate sentence alignment tool designed to handle very large parallel document pairs. |
Steinþór Steingrímsson; Hrafn Loftsson; Andy Way; | arxiv-cs.CL | 2023-11-15 |
| 751 | Assessing Translation Capabilities of Large Language Models Involving English and Indian Languages IF:3 Highlight: In this work, our aim is to explore the multilingual capabilities of large language models by using machine translation as a task involving English and 22 Indian languages. |
VANDAN MUJADIA et al. | arxiv-cs.CL | 2023-11-15 |
| 752 | Extending Multilingual Machine Translation Through Imitation Learning Highlight: We aim to extend large-scale MNMT models to a new language, allowing for translation between the newly added and all of the already supported languages in a challenging scenario: using only a parallel corpus between the new language and English. |
Wen Lai; Viktor Hangya; Alexander Fraser; | arxiv-cs.CL | 2023-11-14 |
| 753 | Investigating Multi-Pivot Ensembling with Massively Multilingual Machine Translation Models Highlight: Pivoting via high-resource languages remains a strong strategy for low-resource directions, and in this paper we revisit ways of pivoting through multiple languages. |
Alireza Mohammadshahi; Jannis Vamvas; Rico Sennrich; | arxiv-cs.CL | 2023-11-13 |
| 754 | Don’t Overlook The Grammatical Gender: Bias Evaluation for Hindi-English Machine Translation Abstract: Neural Machine Translation (NMT) models, though state-of-the-art for translation, often reflect social biases, particularly gender bias. Existing evaluation benchmarks primarily … |
Pushpdeep Singh; | ArXiv | 2023-11-11 |
| 755 | Gender Inflected or Bias Inflicted: On Using Grammatical Gender Cues for Bias Evaluation in Machine Translation Highlight: To demonstrate our point, in this work, we use Hindi as the source language and construct two sets of gender-specific sentences: OTSC-Hindi and WinoMT-Hindi that we use to evaluate different Hindi-English (HI-EN) NMT systems automatically for gender bias. |
Pushpdeep Singh; | arxiv-cs.CL | 2023-11-07 |
| 756 | Findings of The WMT 2023 Shared Task on Discourse-Level Literary Translation: A Fresh Orb in The Cosmos of LLMs IF:3 Highlight: We employ both automatic and human evaluations to measure the performance of the submitted systems. |
LONGYUE WANG et al. | arxiv-cs.CL | 2023-11-06 |
| 757 | CBSiMT: Mitigating Hallucination in Simultaneous Machine Translation with Weighted Prefix-to-Prefix Training Highlight: In this work, we propose a Confidence-Based Simultaneous Machine Translation (CBSiMT) framework, which uses model confidence to perceive hallucination tokens and mitigates their negative impact with weighted prefix-to-prefix training. |
MENGGE LIU et al. | arxiv-cs.CL | 2023-11-06 |
| 758 | Bilingual Corpus Mining and Multistage Fine-Tuning for Improving Machine Translation of Lecture Transcripts Highlight: To create the parallel corpora, we propose a dynamic programming based sentence alignment algorithm which leverages the cosine similarity of machine-translated sentences. |
Haiyue Song; Raj Dabre; Chenhui Chu; Atsushi Fujita; Sadao Kurohashi; | arxiv-cs.CL | 2023-11-06 |
| 759 | Parts of Speech Tagged Phrase-Based Statistical Machine Translation System for English → Mizo Language |
Chanambam Sveta Devi; Amit Kumar Roy; Bipul Syam Purkayastha; | SN Computer Science | 2023-11-01 |
| 760 | Cross-Lingual Sentiment Analysis in Literary Translation: A Case Study of The Novel Crystal Boys Abstract: Sentiment analysis, which is a subdomain of natural language processing (NLP), has attracted significant interest due to its extensive applications. However, its utilization in … |
Yanwen Hou; Yurong Ma; | 2023 10th International Conference on Behavioural and … | 2023-10-30 |
| 761 | Gex’ez-English Bi-Directional Neural Machine Translation Using Transformer Abstract: Machine translation is the technique of translating texts from one language to another without human intervention using artificial intelligence. Neural Machine Translation (NMT) … |
Sefineh Getachew; Yirga Yayeh; | 2023 International Conference on Information and … | 2023-10-26 |
| 762 | Cultural Adaptation of Recipes IF:3 Highlight: We introduce a new task involving the translation and cultural adaptation of recipes between Chinese and English-speaking cuisines. |
YONG CAO et al. | arxiv-cs.CL | 2023-10-26 |
| 763 | DISCO: A Large Scale Human Annotated Corpus for Disfluency Correction in Indo-European Languages Highlight: Towards the goal of multilingual disfluency correction, we present a high-quality human-annotated DC corpus covering four important Indo-European languages: English, Hindi, German and French. |
Vineet Bhat; Preethi Jyothi; Pushpak Bhattacharyya; | arxiv-cs.CL | 2023-10-25 |
| 764 | Machine Translation for Nko: Tools, Corpora and Baseline Results Highlight: Currently, there is no usable machine translation system for Nko, a language spoken by tens of millions of people across multiple West African countries, which holds significant cultural and educational value. To address this issue, we present a set of tools, resources, and baseline results aimed towards the development of usable machine translation systems for Nko and other languages that do not currently have sufficiently large parallel text corpora available. |
MOUSSA KOULAKO BALA DOUMBOUYA et al. | arxiv-cs.CL | 2023-10-24 |
| 765 | ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation IF:3 Highlight: We present ComSL, a speech-language model built atop a composite architecture of public pre-trained speech-only and language-only models and optimized data-efficiently for spoken language tasks. |
CHENYANG LE et al. | nips | 2023-10-24 |
| 766 | Data Augmentation Techniques for Machine Translation of Code-Switched Texts: A Comparative Study Highlight: In this work, we compare three popular approaches: lexical replacements, linguistic theories, and back-translation (BT), in the context of Egyptian Arabic-English CSW. |
Injy Hamed; Nizar Habash; Ngoc Thang Vu; | arxiv-cs.CL | 2023-10-23 |
| 767 | Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing IF:3 Highlight: Surprisingly, our initial experiments find that fine-tuning for translation purposes even led to performance degradation. To overcome this, we propose an alternative approach: adapting LLM’s as Automatic Post-Editors (APE) rather than direct translators. |
Sai Koneru; Miriam Exel; Matthias Huck; Jan Niehues; | arxiv-cs.CL | 2023-10-23 |
| 768 | Boosting Unsupervised Machine Translation with Pseudo-Parallel Data Highlight: We propose a training strategy that relies on pseudo-parallel sentence pairs mined from monolingual corpora in addition to synthetic sentence pairs back-translated from monolingual corpora. |
Ivana Kvapilíková; Ondřej Bojar; | arxiv-cs.CL | 2023-10-22 |
| 769 | Domain Terminology Integration Into Machine Translation: Leveraging Large Language Models IF:3 Highlight: This paper discusses the methods that we used for our submissions to the WMT 2023 Terminology Shared Task for German-to-English (DE-EN), English-to-Czech (EN-CS), and Chinese-to-English (ZH-EN) language pairs. |
D. Kelleher; | arxiv-cs.CL | 2023-10-22 |
| 770 | Long-Form Speech Translation Through Segmentation with Finite-State Decoding Constraints on Large Language Models Highlight: One challenge in speech translation is that plenty of spoken content is long-form, but short units are necessary for obtaining high-quality translations. To address this mismatch, we adapt large language models (LLMs) to split long ASR transcripts into segments that can be independently translated so as to maximize the overall translation quality. |
Arya D. McCarthy; Hao Zhang; Shankar Kumar; Felix Stahlberg; Ke Wu; | arxiv-cs.CL | 2023-10-20 |
| 771 | CAPIVARA: Cost-Efficient Approach for Improving Multilingual CLIP Performance on Low-Resource Languages IF:3 Highlight: This work introduces CAPIVARA, a cost-efficient framework designed to enhance the performance of multilingual CLIP models in low-resource languages. |
GABRIEL OLIVEIRA DOS SANTOS et al. | arxiv-cs.LG | 2023-10-20 |
| 772 | Translation Performance from The User’s Perspective of Large Language Models and Neural Machine Translation Systems IF:3 Abstract: The rapid global expansion of ChatGPT, which plays a crucial role in interactive knowledge sharing and translation, underscores the importance of comparative performance … |
Jungha Son; Boyoung Kim; | Inf. | 2023-10-19 |
| 773 | Direct Neural Machine Translation with Task-level Mixture of Experts Models Highlight: In this work, we examine Task-level MoE’s applicability in direct NMT and propose a series of high-performing training and evaluation configurations, through which Task-level MoE-based direct NMT systems outperform bilingual and pivot-based models for a large number of low and high-resource direct pairs, and translation directions. |
Isidora Chara Tourni; Subhajit Naskar; | arxiv-cs.CL | 2023-10-18 |
| 774 | Knn-seq: Efficient, Extensible KNN-MT Framework Highlight: In this paper, we present an efficient and extensible kNN-MT framework, knn-seq, for researchers and developers that is carefully designed to run efficiently, even with a billion-scale large datastore. |
HIROYUKI DEGUCHI et al. | arxiv-cs.CL | 2023-10-18 |
| 775 | A Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for Fairer Instruction-Tuned Machine Translation IF:3 Highlight: In MT, this might lead to misgendered translations, resulting, among other harms, in the perpetuation of stereotypes and prejudices. In this work, we address this gap by investigating whether and to what extent such models exhibit gender bias in machine translation and how we can mitigate it. |
Giuseppe Attanasio; Flor Miriam Plaza-del-Arco; Debora Nozza; Anne Lauscher; | arxiv-cs.CL | 2023-10-18 |
| 776 | Exploring Automatic Evaluation Methods Based on A Decoder-based LLM for Text Generation Highlight: This paper compares various methods, including tuning with encoder-based models and large language models under equal conditions, on two different tasks, machine translation evaluation and semantic textual similarity, in two languages, Japanese and English. |
Tomohito Kasahara; Daisuke Kawahara; | arxiv-cs.CL | 2023-10-17 |
| 777 | UvA-MT’s Participation in The WMT23 General Translation Shared Task Highlight: This paper describes the UvA-MT’s submission to the WMT 2023 shared task on general machine translation. |
Di Wu; Shaomu Tan; David Stap; Ali Araabi; Christof Monz; | arxiv-cs.CL | 2023-10-15 |
| 778 | UvA-MT’s Participation in The WMT 2023 General Translation Shared Task Abstract: This paper describes the UvA-MT’s submission to the WMT 2023 shared task on general machine translation. We participate in the constrained track in two directions: English … |
Di Wu; Shaomu Tan; David Stap; Ali Araabi; C. Monz; | ArXiv | 2023-10-15 |
| 779 | MILPaC: A Novel Benchmark for Evaluating Translation of Legal Text to Indian Languages Highlight: In this work, we construct the first high-quality legal parallel corpus containing aligned text units in English and nine Indian languages, that includes several low-resource languages. |
Sayan Mahapatra; Debtanu Datta; Shubham Soni; Adrijit Goswami; Saptarshi Ghosh; | arxiv-cs.CL | 2023-10-15 |
| 780 | XDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark IF:3 Highlight: To address the issue, we introduce xDial-Eval, built on top of open-source English dialogue evaluation datasets. |
CHEN ZHANG et al. | arxiv-cs.CL | 2023-10-13 |
| 781 | Political Claim Identification and Categorization in A Multilingual Setting: First Experiments Highlight: This paper explores different strategies for the cross-lingual projection of political claims analysis. |
Urs Zaberer; Sebastian Padó; Gabriella Lapesa; | arxiv-cs.CL | 2023-10-13 |
| 782 | Human-in-the-loop Machine Translation with Large Language Model Highlight: In this study, we propose a human-in-the-loop pipeline that guides LLMs to produce customized outputs with revision instructions. |
Xinyi Yang; Runzhe Zhan; Derek F. Wong; Junchao Wu; Lidia S. Chao; | arxiv-cs.CL | 2023-10-13 |
| 783 | Enhancing Expressivity Transfer in Textless Speech-to-speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Expressivity plays a vital role in conveying emotions, nuances, and cultural subtleties, thereby enhancing communication across diverse languages. To address this issue this study presents a novel method that operates at the discrete speech unit level and leverages multilingual emotion embeddings to capture language-agnostic information. |
Jarod Duret; Benjamin O’Brien; Yannick Estève; Titouan Parcollet; | arxiv-cs.SD | 2023-10-11 |
| 784 | Task-Oriented Semantic Communications for Speech Transmission IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic communications execute intelligent tasks at the receiver by only transmitting necessary information. In this paper, we introduce TOS-ST, a task-oriented semantic … |
Zhenzi Weng; Zhijin Qin; Xiaoming Tao; | 2023 IEEE 98th Vehicular Technology Conference … | 2023-10-10 |
| 785 | Larth: Dataset and Machine Translation for Etruscan Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To the best of our knowledge, there are no publicly available Etruscan corpora for natural language processing. Therefore, we propose a dataset for machine translation from Etruscan to English, which contains 2891 translated examples from existing academic sources. |
Gianluca Vico; Gerasimos Spanakis; | arxiv-cs.CL | 2023-10-09 |
| 786 | Evaluation of Cross-Lingual Bug Localization: Two Industrial Cases Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study reports the results of applying the cross-lingual bug localization approach proposed by Xia et al. to industrial software projects. |
Shinpei Hayashi; Takashi Kobayashi; Tadahisa Kato; | arxiv-cs.SE | 2023-10-03 |
| 787 | Unlikelihood Tuning on Negative Samples Amazingly Improves Zero-Shot Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To understand when and why the navigation capabilities of language IDs are weakened, we compare two extreme decoder input cases in the ZST directions: Off-Target (OFF) and On-Target (ON) cases. |
CHANGTONG ZAN et. al. | arxiv-cs.CL | 2023-09-28 |
| 788 | CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, these are not directly applicable to MMT since they do not provide aligned multimodal multilingual features for generative tasks. To alleviate this issue, instead of designing complex modules for MMT, we propose CLIPTrans, which simply adapts the independently pre-trained multimodal M-CLIP and the multilingual mBART. |
DEVAANSH GUPTA et. al. | iccv | 2023-09-27 |
| 789 | Direct Models for Simultaneous Translation and Automatic Subtitling: FBK@IWSLT2023 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper describes the FBK’s participation in the Simultaneous Translation and Automatic Subtitling tracks of the IWSLT 2023 Evaluation Campaign. |
Sara Papi; Marco Gaido; Matteo Negri; | arxiv-cs.CL | 2023-09-27 |
| 790 | Unify Word-level and Span-level Tasks: NJUNLP’s Participation for The WMT2023 Quality Estimation Shared Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We introduce the submissions of the NJUNLP team to the WMT 2023 Quality Estimation (QE) shared task. Our team submitted predictions for the English-German language pair on all two … |
XIANG GENG et. al. | Conference on Machine Translation | 2023-09-23 |
| 791 | Audience-specific Explanations for Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work we explore techniques to extract example explanations from a parallel corpus. |
Renhan Lou; Jan Niehues; | arxiv-cs.CL | 2023-09-22 |
| 792 | Hindi to English: Transformer-Based Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we have developed a Neural Machine Translation (NMT) system by training the Transformer model to translate texts from Indian Language Hindi to English. |
Kavit Gangar; Hardik Ruparel; Shreyas Lele; | arxiv-cs.CL | 2023-09-22 |
| 793 | Unify Word-level and Span-level Tasks: NJUNLP’s Participation for The WMT2023 Quality Estimation Shared Task Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce the submissions of the NJUNLP team to the WMT 2023 Quality Estimation (QE) shared task. |
XIANG GENG et. al. | arxiv-cs.CL | 2023-09-22 |
| 794 | Domain Adaptation for Arabic Machine Translation: The Case of Financial Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we developed carefully a parallel corpus for Arabic-English (AR-EN) translation in the financial domain for benchmarking different domain adaptation methods. |
Emad A. Alghamdi; Jezia Zakraoui; Fares A. Abanmy; | arxiv-cs.CL | 2023-09-22 |
| 795 | Language Model Method for Collocation Rules of Parts of Speech in Machine Translation System Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the development of the times, modern society has now entered the Internet of Things (IoT) information age and Machine Translation (MT) plays an important role in increasingly … |
Jinhui Liu; Feng Zhang; | ACM Transactions on Asian and Low-Resource Language … | 2023-09-21 |
| 796 | OSN-MDAD: Machine Translation Dataset for Arabic Multi-Dialectal Conversations on Online Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While few attempts have been made to build translation datasets for dialectal Arabic, they are domain dependent and are not OSN cultural-language friendly. In this work, we attempt to alleviate these limitations by proposing an online social network-based multidialect Arabic dataset that is crafted by contextually translating English tweets into four Arabic dialects: Gulf, Yemeni, Iraqi, and Levantine. |
Fatimah Alzamzami; Abdulmotaleb El Saddik; | arxiv-cs.CL | 2023-09-21 |
| 797 | SignBank+: Preparing A Multilingual Sign Language Dataset for Machine Translation Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce SignBank+, a clean version of the SignBank dataset, optimized for machine translation between spoken language text and SignWriting, a phonetic sign language writing system. |
Amit Moryossef; Zifan Jiang; | arxiv-cs.CL | 2023-09-20 |
| 798 | Machine Translation of Electrical Terminology Constraints Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In practical applications, the accuracy of domain terminology translation is an important criterion for the performance evaluation of domain machine translation models. Aiming at … |
Zepeng Wang; Yuan Chen; Juwei Zhang; | Inf. | 2023-09-20 |
| 799 | SpeechAlign: A Framework for Speech Translation Alignment Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Speech-to-Speech and Speech-to-Text translation are currently dynamic areas of research. In our commitment to advance these fields, we present SpeechAlign, a framework designed to evaluate the underexplored field of source-target alignment in speech models. |
Belen Alastruey; Aleix Sant; Gerard I. Gállego; David Dale; Marta R. Costa-jussà; | arxiv-cs.CL | 2023-09-20 |
| 800 | Optimizing Machine Translation for Virtual Assistants: Multi-Variant Generation with VerbNet and Conditional Beam Search Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this paper, we introduce a domain-adapted machine translation (MT) model for intelligent virtual assistants (IVA) designed to translate natural language understanding (NLU) … |
Marcin Sowański; Artur Janicki; | 2023 18th Conference on Computer Science and Intelligence … | 2023-09-17 |
| 801 | Controllability for English-Ukrainian Machine Translation By Using Style Transfer Techniques Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: While straightforward machine translation got significant improvements in the last 10 years with the arrival of encoder-decoder neural networks and transformers architecture, … |
DANIIL MAKSYMENKO et. al. | 2023 18th Conference on Computer Science and Intelligence … | 2023-09-17 |
| 802 | A Benchmark for Text Expansion: Datasets, Metrics, and Baselines Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work presents a new task of Text Expansion (TE), which aims to insert fine-grained modifiers into proper locations of the plain text to concretize or vivify human writings. |
YI CHEN et. al. | arxiv-cs.CL | 2023-09-17 |
| 803 | A Multi-domain Adaptive Neural Machine Translation Method Based on Domain Data Balancer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Most methods for multi-domain adaptive neural machine translation (NMT) currently rely on mixing data from multiple domains in a single model to achieve multi-domain translation. … |
Jinlei Xu; Yonghua Wen; Shuanghong Huang; Zhengtao Yu; | Intelligent Data Analysis | 2023-09-14 |
| 804 | Design of A Smart Teaching English Translation System Based on Big Data Machine Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the context of artificial intelligence, the use of machine translation in English reading classroom teaching is a more common learning method. In traditional teaching methods, … |
Chunye Zhang; Tianyue Yu; Yingqi Gao; Mau Luen Tham; | Int. J. Web Based Learn. Teach. Technol. | 2023-09-12 |
| 805 | Dual-view Curricular Optimal Transport for Cross-lingual Cross-modal Retrieval IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Improperly assuming the pseudo-parallel data are correctly correlated will make the networks overfit to the noisy correspondence. Therefore, we propose Dual-view Curricular Optimal Transport (DCOT) to learn with noisy correspondence in CCR. |
YABING WANG et. al. | arxiv-cs.CV | 2023-09-11 |
| 806 | The Effect of Alignment Objectives on Code-Switching Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we are proposing a way of training a single machine translation model that is able to translate monolingual sentences from one language to another, along with translating code-switched sentences to either language. |
Mohamed Anwar; | arxiv-cs.CL | 2023-09-10 |
| 807 | Epi-Curriculum: Episodic Curriculum Learning for Low-Resource Domain Adaptation in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present a novel approach Epi-Curriculum to address low-resource domain adaptation (DA), which contains a new episodic training framework along with denoised curriculum learning. |
Keyu Chen; Di Zhuang; Mingchen Li; J. Morris Chang; | arxiv-cs.LG | 2023-09-05 |
| 808 | Advancing Text-to-GLOSS Neural Translation Using A Novel Hyper-parameter Optimization Technique Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we investigate the use of transformers for Neural Machine Translation of text-to-GLOSS for Deaf and Hard-of-Hearing communication. |
Younes Ouargani; Noussaima El Khattabi; | arxiv-cs.CL | 2023-09-05 |
| 809 | Algorithmic Translation Correction Mechanisms: An End-to-end Algorithmic Implementation of English-Chinese Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: INTRODUCTION: Machine translation is a modern natural language processing research field with important scientific and practical significance. In practice, the variation of … |
Lei Shi; | EAI Endorsed Trans. Scalable Inf. Syst. | 2023-09-05 |
| 810 | Neural Machine Translation Systems for English to Khasi: A Case Study of An Austroasiatic Language IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
A. V. Hujon; Thoudam Doren Singh; Khwairakpam Amitab; | Expert Syst. Appl. | 2023-09-01 |
| 811 | Impact of Visual Context on Noisy Multimodal NMT: An Empirical Study for English to Indian Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Neural Machine Translation (NMT) has made remarkable progress using large-scale textual data, but the potential of incorporating multimodal inputs, especially visual information, remains underexplored in high-resource settings. While prior research has focused on using multimodal data in low-resource scenarios, this study examines how image features impact translation when added to a large-scale, pre-trained unimodal NMT system. |
Baban Gain; Dibyanayan Bandyopadhyay; Samrat Mukherjee; Chandranath Adak; Asif Ekbal; | arxiv-cs.CL | 2023-08-30 |
| 812 | Improving Translation Faithfulness of Large Language Models Via Augmenting Instructions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Large Language Models (LLMs) present strong general capabilities, and a current compelling challenge is stimulating their specialized capabilities, such as machine translation, through low-cost instruction tuning. |
YIJIE CHEN et. al. | arxiv-cs.CL | 2023-08-24 |
| 813 | SeamlessM4T: Massively Multilingual & Multimodal Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: More specifically, conventional speech-to-speech translation systems rely on cascaded systems that perform translation progressively, putting high-performing unified systems out of reach. To address these gaps, we introduce SeamlessM4T, a single model that supports speech-to-speech translation, speech-to-text translation, text-to-speech translation, text-to-text translation, and automatic speech recognition for up to 100 languages. |
SEAMLESS COMMUNICATION et. al. | arxiv-cs.CL | 2023-08-22 |
| 814 | Knowledge Distillation on Joint Task End-to-End Speech Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: An End-to-End Speech Translation (E2E-ST) model takes input audio in one language and directly produces output text in another language. The model requires to learn both … |
Khandokar Md. Nayem; Ran Xue; Ching-Yun Chang; A. Shanbhogue; | Interspeech | 2023-08-20 |
| 815 | Factuality Detection Using Machine Translation — A Use Case for German Clinical Text Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In the context of factuality detection, this work presents a simple solution using machine translation to translate English data to German to train a transformer-based factuality detection model. |
Mohammed Bin Sumait; Aleksandra Gabryszak; Leonhard Hennig; Roland Roller; | arxiv-cs.CL | 2023-08-17 |
| 816 | Fast Training of NMT Model with Data Sorting Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: One potential area for improvement is to address the computation of empty tokens that the Transformer computes only to discard them later, leading to an unnecessary computational burden. To tackle this, we propose an algorithm that sorts translation sentence pairs based on their length before batching, minimizing the waste of computing power. |
Daniela N. Rim; Kimera Richard; Heeyoul Choi; | arxiv-cs.CL | 2023-08-16 |
| 817 | Extrapolating Large Language Models to Non-English By Aligning Languages IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we empower pre-trained LLMs on non-English languages by building semantic alignment across languages. |
WENHAO ZHU et. al. | arxiv-cs.CL | 2023-08-09 |
| 818 | Negative Lexical Constraints in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We compared various methods based on modifying either the decoding process or the training data. |
Josef Jon; Dušan Variš; Michal Novák; João Paulo Aires; Ondřej Bojar; | arxiv-cs.CL | 2023-08-07 |
| 819 | Show Me The World in My Language: Establishing The First Baseline for Scene-Text to Scene-Text Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we study the task of “visually” translating scene text from a source language (e.g., Hindi) to a target language (e.g., English). |
Shreyas Vaidya; Arvind Kumar Sharma; Prajwal Gatti; Anand Mishra; | arxiv-cs.CV | 2023-08-06 |
| 820 | Sinhala-English Parallel Word Dictionary Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Thus, in this work, we introduce three parallel English-Sinhala word dictionaries (En-Si-dict-large, En-Si-dict-filtered, En-Si-dict-FastText) which help in multilingual Natural Language Processing (NLP) tasks related to English and Sinhala languages. |
Kasun Wickramasinghe; Nisansa de Silva; | arxiv-cs.CL | 2023-08-04 |
| 821 | Do Multilingual Language Models Think Better in English? IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we introduce a new approach called self-translate, which overcomes the need of an external translation system by leveraging the few-shot translation capabilities of multilingual language models. |
Julen Etxaniz; Gorka Azkune; Aitor Soroa; Oier Lopez de Lacalle; Mikel Artetxe; | arxiv-cs.CL | 2023-08-02 |
| 822 | Using Online Machine Translation in International Scholarly Writing and Publishing: A Longitudinal Case of A Chinese Engineering Scholar Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Scholars who use English as an additional language (EAL) worldwide are under increasing pressure to write and publish in English due to the pervasive publish‐or‐perish culture and … |
C. Zou; Wei Gong; Ping Li; | Learned Publishing | 2023-07-31 |
| 823 | Predicting Perfect Quality Segments in MT Output with Fine-Tuned OpenAI LLM: Is It Possible to Capture Editing Distance Patterns from Historical Data? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Translation Quality Estimation (TQE) is an essential step before deploying the output translation into usage. TQE is also critical in assessing machine translation (MT) and human … |
Serge Gladkoff; G. Erofeev; Lifeng Han; G. Nenadic; | ArXiv | 2023-07-31 |
| 824 | MTUncertainty: Assessing The Need for Post-editing of Machine Translation Outputs By Fine-tuning OpenAI LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We take OpenAI models as the best state-of-the-art technology and approach TQE as a binary classification task. |
Serge Gladkoff; Lifeng Han; Gleb Erofeev; Irina Sorokina; Goran Nenadic; | arxiv-cs.CL | 2023-07-31 |
| 825 | Structural Transfer Learning in NL-to-Bash Semantic Parsers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a methodology for obtaining a quantitative understanding of structural overlap between machine translation tasks. |
Kyle Duffy; Satwik Bhattamishra; Phil Blunsom; | arxiv-cs.CL | 2023-07-31 |
| 826 | Multilingual Lexical Simplification Via Paraphrase Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose a novel multilingual LS method via paraphrase generation, as paraphrases provide diversity in word selection while preserving the sentence’s meaning. |
KANG LIU et. al. | arxiv-cs.CL | 2023-07-27 |
| 827 | The JOKER Corpus: English-French Parallel Data for Multilingual Wordplay Recognition IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce and analyze a new dataset for research and applications in the retrieval and processing of wordplay. |
Liana Ermakova; Anne-Gwenn Bosser; Adam Jatowt; Tristan Miller; | sigir | 2023-07-25 |
| 828 | Lost In Translation: Generating Adversarial Examples Robust to Round-Trip Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we present a comprehensive study on the robustness of current text adversarial attacks to round-trip translation. |
Neel Bhandari; Pin-Yu Chen; | arxiv-cs.CL | 2023-07-24 |
| 829 | Construction of Mizo: English Parallel Corpus for Machine Translation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Parallel corpus is a key component of statistical and Neural Machine Translation (NMT). While most research focuses on machine translation, corpus creation studies are limited for … |
Thangkhanhau Haulai; J. Hussain; | ACM Transactions on Asian and Low-Resource Language … | 2023-07-21 |
| 830 | Incorporating Human Translator Style Into English-Turkish Literary Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we focus on English-Turkish literary translation and develop machine translation models that take into account the stylistic features of translators. |
ZEYNEP YIRMIBEŞOĞLU et. al. | arxiv-cs.CL | 2023-07-21 |
| 831 | Improving End-to-End Speech Translation By Imitation-Based Knowledge Distillation with Synthetic Transcripts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present an imitation learning approach where a teacher NMT system corrects the errors of an AST student without relying on manual transcripts. |
Rebekka Hubert; Artem Sokolov; Stefan Riezler; | arxiv-cs.CL | 2023-07-17 |
| 832 | A Neural-Symbolic Approach Towards Identifying Grammatically Correct Sentences Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite the importance of having access to well-written sentences, figuring out ways to validate them is still an open area of research. To address this problem, we present a simplified way to validate English sentences through a novel neural-symbolic approach. |
Nicos Isaak; | arxiv-cs.CL | 2023-07-16 |
| 833 | Investigation of Facilitating English Speaking of Elementary School Students in Authentic Contexts with UEnglish Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Despite the importance of English in various aspects of life, students learning English as a foreign language often lack sufficient opportunities to practice speaking in their … |
Xiangfeng Zhao; Wu-Yuin Hwang; Anh Hoang; Yun-Chi Su; | Human-Centric Intelligent Systems | 2023-07-15 |
| 834 | Investigating Unsupervised Neural Machine Translation for Low-resource Language Pair English-Mizo Via Lexically Enhanced Pre-trained Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The vast majority of languages in the world at present are considered to be low-resource languages. Since the availability of large parallel data is crucial for the success of … |
C. Lalrempuii; B. Soni; | ACM Transactions on Asian and Low-Resource Language … | 2023-07-13 |
| 835 | The NPU-MSXF Speech-to-Speech Translation System for IWSLT 2023 Speech-to-Speech Translation Task Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper describes the NPU-MSXF system for the IWSLT 2023 speech-to-speech translation (S2ST) task which aims to translate from English speech of multi-source to Chinese speech. |
KUN SONG et. al. | arxiv-cs.SD | 2023-07-10 |
| 836 | Scene Graph As Pivoting: Inference-time Image-free Unsupervised Multimodal Machine Translation with Visual Scene Hallucination IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we investigate a more realistic unsupervised multimodal machine translation (UMMT) setup, inference-time image-free UMMT, where the model is trained with source-text image pairs, and tested with only source-text inputs. |
Hao Fei; Qian Liu; Meishan Zhang; Min Zhang; Tat-Seng Chua; | acl | 2023-07-08 |
| 837 | Multilingual Event Extraction from Historical Newspaper Adverts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce a new multilingual dataset in English, French, and Dutch composed of newspaper ads from the early modern colonial period reporting on enslaved people who liberated themselves from enslavement. |
Nadav Borenstein; Natália da Silva Perez; Isabelle Augenstein; | acl | 2023-07-08 |
| 838 | Continual Knowledge Distillation for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we propose a method called continual knowledge distillation to take advantage of existing translation models to improve one model of interest. |
Yuanchi Zhang; Peng Li; Maosong Sun; Yang Liu; | acl | 2023-07-08 |
| 839 | Exploiting Biased Models to De-bias Text: A Gender-Fair Rewriting Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To eliminate the rule-based nature of data creation, we instead propose using machine translation models to create gender-biased text from real gender-fair text via round-trip translation. |
Chantal Amrhein; Florian Schottmann; Rico Sennrich; Samuel Läubli; | acl | 2023-07-08 |
| 840 | MCLIP: Multilingual CLIP Via Cross-lingual Transfer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce mCLIP, a retrieval-efficient dual-stream multilingual VLP model, trained by aligning the CLIP model and a Multilingual Text Encoder (MTE) through a novel Triangle Cross-modal Knowledge Distillation (TriKD) method. |
GUANHUA CHEN et. al. | acl | 2023-07-08 |
| 841 | Songs Across Borders: Singable and Controllable Neural Lyric Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper bridges the singability quality gap by formalizing lyric translation into a constrained translation problem, converting theoretical guidance and practical techniques from translatology literature to prompt-driven NMT approaches, exploring better adaptation methods, and instantiating them to an English-Chinese lyric translation system. |
Longshen Ou; Xichu Ma; Min-Yen Kan; Ye Wang; | acl | 2023-07-08 |
| 842 | Stop Pre-Training: Adapt Visual-Language Models to Unseen Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose a simple yet efficient approach to adapt VLP to unseen languages using MPLM. |
Yasmine Karoui; Rémi Lebret; Negar Foroutan Eghlidi; Karl Aberer; | acl | 2023-07-08 |
| 843 | Using Neural Machine Translation for Generating Diverse Challenging Exercises for Language Learner Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel approach to automatically generate distractors for cloze exercises for English language learners, using round-trip neural machine translation. |
Frank Palma Gomez; Subhadarshi Panda; Michael Flor; Alla Rozovskaya; | acl | 2023-07-08 |
| 844 | Translation-Enhanced Multilingual Text-to-Image Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We provide two key contributions. 1) Relying on a multilingual multi-modal encoder, we provide a systematic empirical study of standard methods used in cross-lingual NLP when applied to mTTI: Translate Train, Translate Test, and Zero-Shot Transfer. 2) We propose Ensemble Adapter (EnsAd), a novel parameter-efficient approach that learns to weigh and consolidate the multilingual text knowledge within the mTTI framework, mitigating the language gap and thus improving mTTI performance. |
Yaoyiran Li; Ching-Yun Chang; Stephen Rawls; Ivan Vulic; Anna Korhonen; | acl | 2023-07-08 |
| 845 | A Simple Concatenation Can Effectively Improve Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Inspired by the works of video Transformer, we propose a simple unified cross-modal ST method, which concatenates speech and text as the input, and builds a teacher that can utilize both cross-modal information simultaneously. |
Linlin Zhang; Kai Fan; Boxing Chen; Luo Si; | acl | 2023-07-08 |
| 846 | Understanding and Improving The Robustness of Terminology Constraints in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we study the robustness of two typical terminology translation methods: Placeholder (PH) and Code-Switch (CS), concerning (1) the number of constraints and (2) the target constraint length. |
HUAAO ZHANG et. al. | acl | 2023-07-08 |
| 847 | MultiTACRED: A Multilingual Version of The TAC Relation Extraction Dataset IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Relation extraction (RE) is a fundamental task in information extraction, whose extension to multilingual settings has been hindered by the lack of supervised resources comparable in size to large English datasets such as TACRED (Zhang et al., 2017). To address this gap, we introduce the MultiTACRED dataset, covering 12 typologically diverse languages from 9 language families, which is created by machine-translating TACRED instances and automatically projecting their entity annotations. |
Leonhard Hennig; Philippe Thomas; Sebastian Möller; | acl | 2023-07-08 |
| 848 | Searching for Needles in A Haystack: On The Role of Incidental Bilingualism in PaLM’s Translation Capability Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a mixed-method approach to measure and understand incidental bilingualism at scale. |
Eleftheria Briakou; Colin Cherry; George Foster; | acl | 2023-07-08 |
| 849 | Cross2StrA: Unpaired Cross-lingual Image Captioning with Cross-lingual Cross-modal Structure-pivoted Alignment IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Unpaired cross-lingual image captioning has long suffered from irrelevancy and disfluency issues, due to the inconsistencies of the semantic scene and syntax attributes during transfer. In this work, we propose to address the above problems by incorporating the scene graph (SG) structures and the syntactic constituency (SC) trees. |
Shengqiong Wu; Hao Fei; Wei Ji; Tat-Seng Chua; | acl | 2023-07-08 |
| 850 | XPQA: Cross-Lingual Product Question Answering in 12 Languages IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While existing work on PQA focuses mainly on English, in practice there is need to support multiple customer languages while leveraging product information available in English. To study this practical industrial task, we present xPQA, a large-scale annotated cross-lingual PQA dataset in 12 languages, and report results in (1) candidate ranking, to select the best English candidate containing the information to answer a non-English question; and (2) answer generation, to generate a natural-sounding non-English answer based on the selected English candidate. |
Xiaoyu Shen; Akari Asai; Bill Byrne; Adria De Gispert; | acl | 2023-07-08 |
| 851 | Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present a new MMT approach based on a strong text-only MT model, which uses neural adapters, a novel guided self-attention mechanism and which is jointly trained on both visually-conditioned masking and MMT. |
Matthieu Futeral; Cordelia Schmid; Ivan Laptev; Benoît Sagot; Rachel Bawden; | acl | 2023-07-08 |
| 852 | Multi-VALUE: A Framework for Cross-Dialectal English NLP IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a suite of resources for evaluating and achieving English dialect invariance. |
CALEB ZIEMS et. al. | acl | 2023-07-08 |
| 853 | TeCS: A Dataset and Benchmark for Tense Consistency of Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we present a parallel tense test set containing 552 French-English utterances. |
Yiming Ai; Zhiwei He; Kai Yu; Rui Wang; | acl | 2023-07-08 |
| 854 | On Evaluating Multilingual Compositional Generalization with Translated Datasets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, we show that this entails critical semantic distortion. To address this limitation, we craft a faithful rule-based translation of the MCWQ dataset from English to Chinese and Japanese. |
Zi Wang; Daniel Hershcovich; | acl | 2023-07-08 |
| 855 | Neural Machine Translation Methods for Translating Text to Sign Language Glosses IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In our experiments, we improve the performance of the transformer-based models via (1) data augmentation, (2) semi-supervised Neural Machine Translation (NMT), (3) transfer learning and (4) multilingual NMT. |
Dele Zhu; Vera Czehmann; Eleftherios Avramidis; | acl | 2023-07-08 |
| 856 | Learning Language-Specific Layers for Multilingual Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce Language-Specific Transformer Layers (LSLs), which allow us to increase model capacity, while keeping the amount of computation and the number of parameters used in the forward pass constant. |
Telmo Pires; Robin Schmidt; Yi-Hsiu Liao; Stephan Peitz; | acl | 2023-07-08 |
| 857 | Exploring Better Text Image Translation with Multimodal Codebook IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we first annotate a Chinese-English TIT dataset named OCRMT30K, providing convenience for subsequent studies. |
ZHIBIN LAN et. al. | acl | 2023-07-08 |
| 858 | Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Firstly, the current objective of KD spreads its focus to whole distributions to learn the knowledge, yet lacks special treatment on the most crucial top-1 information. Secondly, the knowledge is largely covered by the golden information due to the fact that most top-1 predictions of teachers overlap with ground-truth tokens, which further restricts the potential of KD. To address these issues, we propose a new method named Top-1 Information Enhanced Knowledge Distillation (TIE-KD). |
SONGMING ZHANG et. al. | acl | 2023-07-08 |
| 859 | What About “em”? How Commercial Machine Translation Fails to Handle (Neo-)Pronouns Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Wrong pronoun translations can discriminate against marginalized groups, e.g., non-binary individuals (Dev et al., 2021). In this “reality check”, we study how three commercial MT systems translate 3rd-person pronouns. |
Anne Lauscher; Debora Nozza; Ehm Miltersen; Archie Crowley; Dirk Hovy; | acl | 2023-07-08 |
| 860 | CMOT: Cross-modal Mixup Via Optimal Transport for Speech Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose Cross-modal Mixup via Optimal Transport (CMOT) to overcome the modality gap. |
Yan Zhou; Qingkai Fang; Yang Feng; | acl | 2023-07-08 |
| 861 | PEIT: Bridging The Modality Gap with Pre-trained Models for End-to-End Image Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose PEIT, an end-to-end image translation framework that bridges the modality gap with pre-trained models. |
Shaolin Zhu; Shangjie Li; Yikun Lei; Deyi Xiong; | acl | 2023-07-08 |
| 862 | Do GPTs Produce Less Literal Translations? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, there has been relatively little investigation on how such translations differ qualitatively from the translations generated by standard Neural Machine Translation (NMT) models. In this work, we investigate these differences in terms of the literalness of translations produced by the two systems. |
Vikas Raunak; Arul Menezes; Matt Post; Hany Hassan; | acl | 2023-07-08 |
| 863 | Text Style Transfer Back-Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: For natural inputs, BT brings only slight improvements and sometimes even adverse effects. To address this issue, we propose Text Style Transfer Back Translation (TST BT), which uses a style transfer to modify the source side of BT data. |
DAIMENG WEI et. al. | acl | 2023-07-08 |
| 864 | An Analysis of Error Types in Chinese to English Translation By Google Neural Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Due to the rapid development of globalization and digitalization, neural machine translation (NMT) systems have gradually developed into the mainstream technology in the … |
Yanqi Lu; | Proceedings of the 2023 International Joint Conference on … | 2023-07-07 |
| 865 | Tokenization Effect on Neural Machine Translation: An Experimental Investigation for English-Assamese Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Tokenization, as a research task, is mostly overlooked when dealing with machine translation as much emphasis is placed on modelling or data enhancement, not to speak for language … |
Mazida Akhtara Ahmed; Kishore Kashyap; Shikhar Kumar Sarma; | 2023 14th International Conference on Computing … | 2023-07-06 |
| 866 | Performance Evaluation of English to Bodo Neural Machine Translation System with Varying Model Architecture and Vocabulary Size Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper is about a work done on Neural Machine Translation of English-Bodo language pair using deep learning technique. Bodo is a language of northeastern part of India … |
P. Boruah; Shikhar Kr. Sarma; Kishore Kashyap; Simanta Kalita; | 2023 14th International Conference on Computing … | 2023-07-06 |
| 867 | EGRUMET: Enhanced Gated Recurrent Unit Machine for English to Kannada Lingual Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: A branch of natural language processing called neural machine translation (NMT) focuses on using artificial neural networks for translating textual content in one language to … |
S. R. Kashi; Vineeth H R; Gururaja H S; | 2023 14th International Conference on Computing … | 2023-07-06 |
| 868 | To Be or Not to Be: A Translation Reception Study of A Literary Text Translated Into Dutch and Catalan Using Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This article presents the results of a study involving the reception of a fictional story by Kurt Vonnegut translated from English into Catalan and Dutch in three conditions: machine-translated (MT), post-edited (PE) and translated from scratch (HT). |
Ana Guerberof Arenas; Antonio Toral; | arxiv-cs.CL | 2023-07-05 |
| 869 | Shiftable Context: Addressing Training-Inference Context Mismatch in Simultaneous Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, such models create a context mismatch between training and inference environments, hindering potential translation accuracy. We solve this issue by proposing Shiftable Context, a simple yet effective scheme to ensure that consistent segment and context sizes are maintained throughout training and inference, even with the presence of partially filled segments due to the streaming nature of simultaneous translation. |
Matthew Raffel; Drew Penney; Lizhong Chen; | arxiv-cs.CL | 2023-07-03 |
| 870 | English Grammar Multiple-choice Question Generation Using Text-to-Text Transfer Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Peerawat Chomphooyod; A. Suchato; Nuengwong Tuaycharoen; P. Punyabukkana; | Comput. Educ. Artif. Intell. | 2023-07-01 |
| 871 | Simplification of Arabic Text: A Hybrid Approach Integrating Machine Translation and Transformer-based Lexical Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Suha Al-Thanyyan; Aqil M. Azmi; | J. King Saud Univ. Comput. Inf. Sci. | 2023-07-01 |
| 872 | X-RiSAWOZ: High-Quality End-to-End Multilingual Dialogue Datasets and Few-shot Agents IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Task-oriented dialogue research has mainly focused on a few popular languages like English and Chinese, due to the high dataset creation cost for a new language. |
MEHRAD MORADSHAHI et. al. | arxiv-cs.CL | 2023-06-30 |
| 873 | Stop Pre-Training: Adapt Visual-Language Models to Unseen Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose a simple yet efficient approach to adapt VLP to unseen languages using MPLM. |
Yasmine Karoui; Rémi Lebret; Negar Foroutan; Karl Aberer; | arxiv-cs.CL | 2023-06-29 |
| 874 | Learning Multilingual Expressive Speech Representation for Prosody Prediction Without Parallel Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We propose a method for speech-to-speech emotion-preserving translation that operates at the level of discrete speech units. Our approach relies on the use of multilingual emotion … |
J. Duret; Titouan Parcollet; Y. Estève; | ArXiv | 2023-06-29 |
| 875 | Slot Lost in Translation? Not Anymore: A Machine Translation Model for Virtual Assistants with Type-Independent Slot Transfer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this article, we present a machine translation model adapted to the domain of intelligent virtual assistants (IVA) that can be used to translate training and evaluation … |
Marcin Sowanski; A. Janicki; | 2023 30th International Conference on Systems, Signals and … | 2023-06-27 |
| 876 | Constructing Multilingual Code Search Dataset Using Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this research, we create a multilingual code search dataset in four natural and four programming languages using a neural machine translation model. |
Ryo Sekizawa; Nan Duan; Shuai Lu; Hitomi Yanaka; | arxiv-cs.CL | 2023-06-27 |
| 877 | The Unreasonable Effectiveness of Few-shot Learning for Machine Translation IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We show that with only 5 examples of high-quality translation data shown at inference, a transformer decoder-only model trained solely with self-supervised learning, is able to match specialized supervised state-of-the-art models as well as more general commercial translation systems. |
XAVIER GARCIA et. al. | icml | 2023-06-27 |
| 878 | A Graph Fusion Approach for Cross-Lingual Machine Reading Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a novel approach, which jointly models the cross-lingual alignment information and the mono-lingual syntax information using a graph. |
ZENAN XU et. al. | aaai | 2023-06-26 |
| 879 | Evaluation of Chinese-English Machine Translation of Emotion-Loaded Microblog Texts: A Human Annotated Dataset for The Quality Assessment of Emotion Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we focus on how current Machine Translation (MT) tools perform on the translation of emotion-loaded texts by evaluating outputs from Google Translate according to a framework proposed in this paper. |
Shenbin Qian; Constantin Orasan; Felix do Carmo; Qiuliang Li; Diptesh Kanojia; | arxiv-cs.CL | 2023-06-20 |
| 880 | BayLing: Bridging Cross-lingual Alignment and Instruction Following Through Interactive Translation for Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To minimize human workload, we propose to transfer the capabilities of language generation and instruction following from English to other languages through an interactive translation task. |
SHAOLEI ZHANG et. al. | arxiv-cs.CL | 2023-06-19 |
| 881 | Robust Secret Data Hiding for Transformer-based Neural Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Hiding secret information in text is a research area of significant importance and a great challenge. In recent years, there have been huge developments and exciting advances in … |
Tianhe Lu; Gongshen Liu; Ru Zhang; Peixuan Li; Tianjie Ju; | 2023 International Joint Conference on Neural Networks … | 2023-06-18 |
| 882 | Discourse Representation Structure Parsing for Chinese Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We describe the pipeline of automatically collecting the linearized Chinese meaning representation data for sequential-to-sequential neural networks. |
Chunliu Wang; Xiao Zhang; Johan Bos; | arxiv-cs.CL | 2023-06-16 |
| 883 | Baseline Transliteration Corpus for Improved English-Amharic Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Yohannes Biadgligne; Kamel Smaïli; | Informatica (Slovenia) | 2023-06-15 |
| 884 | Babel-ImageNet: Massively Multilingual Evaluation of Vision-and-Language Representations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce Babel-ImageNet, a massively multilingual benchmark that offers (partial) translations of ImageNet labels to 100 languages, built without machine translation or manual annotation. |
Gregor Geigle; Radu Timofte; Goran Glavaš; | arxiv-cs.CL | 2023-06-14 |
| 885 | Measuring Sentiment Bias in Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper we explore how machine translation might introduce a bias in sentiments as classified by sentiment analysis models. |
KAI HARTUNG et. al. | arxiv-cs.CL | 2023-06-12 |
| 886 | Textual Augmentation Techniques Applied to Low Resource Machine Translation: Case of Swahili Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work we investigate the impact of applying textual data augmentation tasks to low resource machine translation. |
Catherine Gitau; Vukosi Marivate; | arxiv-cs.CL | 2023-06-12 |
| 887 | A Survey of Vision-Language Pre-training from The Lens of Multimodal Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We summarize the common architectures, pre-training objectives, and datasets from literature and conjecture what further is needed to make progress on multimodal machine translation. |
Jeremy Gwinnup; Kevin Duh; | arxiv-cs.CL | 2023-06-12 |
| 888 | Good, But Not Always Fair: An Evaluation of Gender Bias for Three Commercial Machine Translation Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Consequently, analyses have been redirected to more nuanced aspects, intricate phenomena, as well as potential risks that may arise from the widespread use of MT tools. Along this line, this paper offers a meticulous assessment of three commercial MT systems – Google Translate, DeepL, and Modern MT – with a specific focus on gender translation and bias. |
Silvia Alma Piazzolla; Beatrice Savoldi; Luisa Bentivogli; | arxiv-cs.CL | 2023-06-09 |
| 889 | Assisting Language Learners: Automated Trans-Lingual Definition Generation Via Contrastive Prompt Learning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a novel task of Trans-Lingual Definition Generation (TLDG), which aims to generate definitions in another language, i.e., the native speaker’s language. |
HENGYUAN ZHANG et. al. | arxiv-cs.CL | 2023-06-09 |
| 890 | Twi Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: French is a strategically and economically important language in the regions where the African language Twi is spoken. However, only a very small proportion of Twi speakers in … |
Frederick Gyasi; Tim Schlippe; | Big Data Cogn. Comput. | 2023-06-08 |
| 891 | A Little Is Enough: Few-Shot Quality Estimation Based Corpus Filtering Improves Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: All the scripts and datasets utilized in this study will be publicly available. |
Akshay Batheja; Pushpak Bhattacharyya; | arxiv-cs.CL | 2023-06-06 |
| 892 | MCTS: A Multi-Reference Chinese Text Simplification Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce MCTS, a multi-reference Chinese text simplification dataset. |
RUINING CHONG et. al. | arxiv-cs.CL | 2023-06-05 |
| 893 | Speech Translation with Foundation Models and Optimal Transport: UPC at IWSLT23 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper describes the submission of the UPC Machine Translation group to the IWSLT 2023 Offline Speech Translation task. |
Ioannis Tsiamas; Gerard I. Gállego; José A. R. Fonollosa; Marta R. Costa-jussà; | arxiv-cs.CL | 2023-06-02 |
| 894 | Improved Cross-Lingual Transfer Learning For Automatic Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The goal of this work is to improve cross-lingual transfer learning in multilingual speech-to-text translation via semantic knowledge distillation. |
SAMEER KHURANA et. al. | arxiv-cs.CL | 2023-06-01 |
| 895 | Regressing Word and Sentence Embeddings for Low-Resource Neural Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In recent years, neural machine translation (NMT) has achieved unprecedented performance in the automated translation of resource-rich languages. However, it has not yet managed … |
Inigo Jauregi Unanue; E. Z. Borzeshi; M. Piccardi; | IEEE Transactions on Artificial Intelligence | 2023-06-01 |
| 896 | Improving Polish to English Neural Machine Translation with Transfer Learning: Effects of Data Volume and Language Similarity Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper investigates the impact of data volume and the use of similar languages on transfer learning in a machine translation task. |
Juuso Eronen; Michal Ptaszynski; Karol Nowakowski; Zheng Lin Chia; Fumito Masui; | arxiv-cs.CL | 2023-06-01 |
| 897 | Automatic Discrimination of Human and Neural Machine Translation in Multilingual Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We tackle the task of automatically discriminating between human and machine translations. |
Malina Chichirau; Rik van Noord; Antonio Toral; | arxiv-cs.CL | 2023-05-31 |
| 898 | How Does Pretraining Improve Discourse-Aware Translation? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the underlying reasons for their strong performance have not been well explained. To bridge this gap, we introduce a probing task to interpret the ability of PLMs to capture discourse relation knowledge. |
Zhihong Huang; Longyue Wang; Siyou Liu; Derek F. Wong; | arxiv-cs.CL | 2023-05-31 |
| 899 | Translation-Enhanced Multilingual Text-to-Image Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: 2) We propose Ensemble Adapter (EnsAd), a novel parameter-efficient approach that learns to weigh and consolidate the multilingual text knowledge within the mTTI framework, mitigating the language gap and thus improving mTTI performance. |
Yaoyiran Li; Ching-Yun Chang; Stephen Rawls; Ivan Vulić; Anna Korhonen; | arxiv-cs.CL | 2023-05-30 |
| 900 | A Corpus for Sentence-level Subjectivity Detection on English News Articles IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We develop novel annotation guidelines for sentence-level subjectivity detection, which are not limited to language-specific cues. |
FRANCESCO ANTICI et. al. | arxiv-cs.CL | 2023-05-29 |
| 901 | HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents HaVQA, the first multimodal dataset for visual question-answering (VQA) tasks in the Hausa language. |
SHANTIPRIYA PARIDA et. al. | arxiv-cs.CL | 2023-05-28 |
| 902 | An Open-Source Gloss-Based Baseline for Spoken to Signed Language Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present an open-source implementation of a text-to-gloss-to-pose-to-video pipeline approach, demonstrating conversion from German to Swiss German Sign Language, French to French Sign Language of Switzerland, and Italian to Italian Sign Language of Switzerland. |
AMIT MORYOSSEF et. al. | arxiv-cs.CL | 2023-05-28 |
| 903 | Enhancing Translation for Indigenous Languages: Experiments with Multilingual Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper describes CIC NLP’s submission to the AmericasNLP 2023 Shared Task on machine translation systems for indigenous languages of the Americas. |
ATNAFU LAMBEBO TONJA et. al. | arxiv-cs.CL | 2023-05-27 |
| 904 | Do GPTs Produce Less Literal Translations? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, there has been relatively little investigation on how such translations differ qualitatively from the translations generated by standard Neural Machine Translation (NMT) models. In this work, we investigate these differences in terms of the literalness of translations produced by the two systems. |
Vikas Raunak; Arul Menezes; Matt Post; Hany Hassan Awadalla; | arxiv-cs.CL | 2023-05-26 |
| 905 | Robustness of Multi-Source MT to Transcription Errors Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Automatic speech translation is sensitive to speech recognition errors, but in a multilingual scenario, the same content may be available in various languages via simultaneous interpreting, dubbing or subtitling. In this paper, we hypothesize that leveraging multiple sources will improve translation quality if the sources complement one another in terms of correct information they contain. |
Dominik Macháček; Peter Polák; Ondřej Bojar; Raj Dabre; | arxiv-cs.CL | 2023-05-26 |
| 906 | MTCue: Learning Zero-Shot Control of Extra-Textual Attributes By Leveraging Unstructured Context in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This work introduces MTCue, a novel neural machine translation (NMT) framework that interprets all context (including discrete variables) as text. |
Sebastian Vincent; Robert Flynn; Carolina Scarton; | arxiv-cs.CL | 2023-05-25 |
| 907 | What About “em”? How Commercial Machine Translation Fails to Handle (Neo-)Pronouns IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: As 3rd-person pronoun usage shifts to include novel forms, e.g., neopronouns, we need more research on identity-inclusive NLP. Exclusion is particularly harmful in one of the most … |
Anne Lauscher; Debora Nozza; Archie Crowley; E. Miltersen; Dirk Hovy; | Annual Meeting of the Association for Computational … | 2023-05-25 |
| 908 | Cross-Lingual Knowledge Distillation for Answer Sentence Selection in Low-Resource Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose Cross-Lingual Knowledge Distillation (CLKD) from a strong English AS2 teacher as a method to train AS2 models for low-resource languages in the tasks without the need of labeled data for the target language. |
Shivanshu Gupta; Yoshitomo Matsubara; Ankit Chadha; Alessandro Moschitti; | arxiv-cs.CL | 2023-05-25 |
| 909 | What About Em? How Commercial Machine Translation Fails to Handle (Neo-)Pronouns Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Wrong pronoun translations can discriminate against marginalized groups, e.g., non-binary individuals (Dev et al., 2021). In this “reality check”, we study how three commercial MT systems translate 3rd-person pronouns. |
Anne Lauscher; Debora Nozza; Archie Crowley; Ehm Miltersen; Dirk Hovy; | arxiv-cs.CL | 2023-05-25 |
| 910 | Eliciting The Translation Ability of Large Language Models Via Multilingual Finetuning with Translation Instructions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present a detailed analysis by finetuning a multilingual pretrained language model, XGLM-7B, to perform multilingual translation following given instructions. |
Jiahuan Li; Hao Zhou; Shujian Huang; Shanbo Cheng; Jiajun Chen; | arxiv-cs.CL | 2023-05-24 |
| 911 | Leveraging GPT-4 for Automatic Translation Post-Editing IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we formalize the task of direct translation post-editing with Large Language Models (LLMs) and explore the use of GPT-4 to automatically post-edit NMT outputs across several language pairs. |
Vikas Raunak; Amr Sharaf; Yiren Wang; Hany Hassan Awadallah; Arul Menezes; | arxiv-cs.CL | 2023-05-24 |
| 912 | Improving Speech Translation By Fusing Speech and Text Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we harness the complementary strengths of speech and text, which are disparate modalities. |
WENBIAO YIN et. al. | arxiv-cs.CL | 2023-05-23 |
| 913 | Non-parametric, Nearest-neighbor-assisted Fine-tuning for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Through qualitative analysis, we found particular improvements when it comes to translating grammatical relations or function words, which results in increased fluency of our model. |
Jiayi Wang; Ke Wang; Yuqi Zhang; Yu Zhao; Pontus Stenetorp; | arxiv-cs.CL | 2023-05-22 |
| 914 | VAKTA-SETU: A Speech-to-Speech Machine Translation Service in Select Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present our deployment-ready Speech-to-Speech Machine Translation (SSMT) system for English-Hindi, English-Marathi, and Hindi-Marathi language pairs. |
SHIVAM MHASKAR et. al. | arxiv-cs.CL | 2023-05-21 |
| 915 | Is Translation Helpful? An Empirical Analysis of Cross-Lingual Transfer in Low-Resource Dialog Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: A typical approach is to leverage off-the-shelf machine translation (MT) systems to utilize either the training corpus or developed models from high-resource languages. In this work, we investigate whether it is helpful to utilize MT at all in this task. |
Lei Shen; Shuai Yu; Xiaoyu Shen; | arxiv-cs.CL | 2023-05-21 |
| 916 | Low-resource Multilingual Neural Translation Using Linguistic Feature-based Relevance Mechanisms IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This article investigates approaches to effectively harness source-side linguistic features for low-resource multilingual neural machine translation (MNMT). Previous works focus … |
Abhisek Chakrabarty; Raj Dabre; Chenchen Ding; M. Utiyama; E. Sumita; | ACM Transactions on Asian and Low-Resource Language … | 2023-05-18 |
| 917 | Exploiting Biased Models to De-bias Text: A Gender-Fair Rewriting Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To eliminate the rule-based nature of data creation, we instead propose using machine translation models to create gender-biased text from real gender-fair text via round-trip translation. |
Chantal Amrhein; Florian Schottmann; Rico Sennrich; Samuel Läubli; | arxiv-cs.CL | 2023-05-18 |
| 918 | Cross-modality Data Augmentation for End-to-End Sign Language Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Due to these challenges, the input and output distributions of end-to-end sign language translation (i.e., video-to-text) are less effective compared to the gloss-to-text approach (i.e., text-to-text). To tackle these challenges, we propose a novel Cross-modality Data Augmentation (XmDA) framework to transfer the powerful gloss-to-text translation capabilities to end-to-end sign language translation (i.e. video-to-text) by exploiting pseudo gloss-text pairs from the sign gloss translation model. |
Jinhui Ye; Wenxiang Jiao; Xing Wang; Zhaopeng Tu; Hui Xiong; | arxiv-cs.CL | 2023-05-18 |
| 919 | Multilingual Event Extraction from Historical Newspaper Adverts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce a new multilingual dataset in English, French, and Dutch composed of newspaper ads from the early modern colonial period reporting on enslaved people who liberated themselves from enslavement. |
Nadav Borenstein; Natalia da Silva Perez; Isabelle Augenstein; | arxiv-cs.CL | 2023-05-18 |
| 920 | NollySenti: Leveraging Transfer Learning and Machine Translation for Nigerian Movie Sentiment Classification IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we focus on the task of sentiment classification for cross domain adaptation. |
Iyanuoluwa Shode; David Ifeoluwa Adelani; Jing Peng; Anna Feldman; | arxiv-cs.CL | 2023-05-18 |
| 921 | ChatGPT Perpetuates Gender Bias in Machine Translation and Ignores Non-Gendered Pronouns: Findings Across Bengali and Five Other Low-Resource Languages IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this multicultural age, language translation is one of the most performed tasks, and it is becoming increasingly AI-moderated and automated. As a novel AI system, ChatGPT claims to be proficient in such translation tasks and in this paper, we put that claim to the test. |
Sourojit Ghosh; Aylin Caliskan; | arxiv-cs.CY | 2023-05-17 |
| 922 | Searching for Needles in A Haystack: On The Role of Incidental Bilingualism in PaLM’s Translation Capability IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Large, multilingual language models exhibit surprisingly good zero- or few-shot machine translation capabilities, despite having never seen the intentionally-included translation … |
Eleftheria Briakou; Colin Cherry; George F. Foster; | Annual Meeting of the Association for Computational … | 2023-05-17 |
| 923 | Searching for Needles in A Haystack: On The Role of Incidental Bilingualism in PaLM’s Translation Capability Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a mixed-method approach to measure and understand incidental bilingualism at scale. |
Eleftheria Briakou; Colin Cherry; George Foster; | arxiv-cs.CL | 2023-05-17 |
| 924 | XPQA: Cross-Lingual Product Question Answering Across 12 Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: While existing work on PQA focuses mainly on English, in practice there is need to support multiple customer languages while leveraging product information available in English. To study this practical industrial task, we present xPQA, a large-scale annotated cross-lingual PQA dataset in 12 languages across 9 branches, and report results in (1) candidate ranking, to select the best English candidate containing the information to answer a non-English question; and (2) answer generation, to generate a natural-sounding non-English answer based on the selected English candidate. |
Xiaoyu Shen; Akari Asai; Bill Byrne; Adrià de Gispert; | arxiv-cs.CL | 2023-05-16 |
| 925 | The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided By Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Motivated particularly by the task of cross-lingual SLU, we demonstrate that the task of speech translation (ST) is a good means of pretraining speech models for end-to-end SLU on both intra- and cross-lingual scenarios. |
Mutian He; Philip N. Garner; | arxiv-cs.CL | 2023-05-16 |
| 926 | Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Secondly, the knowledge is largely covered by the golden information due to the fact that most top-1 predictions of teachers overlap with ground-truth tokens, which further restricts the potential of KD. To address these issues, we propose a novel method named Top-1 Information Enhanced Knowledge Distillation (TIE-KD). |
SONGMING ZHANG et. al. | arxiv-cs.CL | 2023-05-14 |
| 927 | Cross-language Information Retrieval for Poetry Form of Literature-based on Machine Transliteration Using CNN Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Transliteration is phonetically translating a language’s words into an international or non-native screenplay. The machine translation process now plays an essential role in … |
R. Jadhav; M. Dhore; | J. Intell. Fuzzy Syst. | 2023-05-13 |
| 928 | Subword Segmental Machine Translation: Unifying Segmentation and Target Sentence Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To use SSMT during inference we propose dynamic decoding, a text generation algorithm that adapts segmentations as it generates translations. |
Francois Meyer; Jan Buys; | arxiv-cs.CL | 2023-05-11 |
| 929 | Text-image Matching for Multi-model Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Xiayang Shi; Zhenqiang Yu; Xuhui Wang; Yijun Li; Yufeng Niu; | The Journal of Supercomputing | 2023-05-09 |
| 930 | MultiTACRED: A Multilingual Version of The TAC Relation Extraction Dataset IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Relation extraction (RE) is a fundamental task in information extraction, whose extension to multilingual settings has been hindered by the lack of supervised resources comparable in size to large English datasets such as TACRED (Zhang et al., 2017). To address this gap, we introduce the MultiTACRED dataset, covering 12 typologically diverse languages from 9 language families, which is created by machine-translating TACRED instances and automatically projecting their entity annotations. |
Leonhard Hennig; Philippe Thomas; Sebastian Möller; | arxiv-cs.CL | 2023-05-08 |
| 931 | Label-Free Multi-Domain Machine Translation with Stage-wise Training Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a label-free multi-domain machine translation model which requires only a few or no domain-annotated data in training and no domain labels in inference. |
Fan Zhang; Mei Tu; Sangha Kim; Song Liu; Jinyao Yan; | arxiv-cs.CL | 2023-05-06 |
| 932 | Investigating Lexical Sharing in Multilingual Machine Translation for Indian Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we investigate lexical sharing in multilingual machine translation (MT) from Hindi, Gujarati, Nepali into English. |
Sonal Sannigrahi; Rachel Bawden; | arxiv-cs.CL | 2023-05-04 |
| 933 | Unified Model Learning for Various Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although the dataset-specific models have achieved impressive performance, it is cumbersome as each dataset demands a model to be designed, trained, and stored. In this work, we aim to unify these translation tasks into a more general setting. |
YUNLONG LIANG et. al. | arxiv-cs.CL | 2023-05-04 |
| 934 | Learning Language-Specific Layers for Multilingual Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce Language-Specific Transformer Layers (LSLs), which allow us to increase model capacity, while keeping the amount of computation and the number of parameters used in the forward pass constant. |
Telmo Pessoa Pires; Robin M. Schmidt; Yi-Hsiu Liao; Stephan Peitz; | arxiv-cs.CL | 2023-05-04 |
| 935 | English-Assamese Neural Machine Translation Using Prior Alignment and Pre-trained Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
SAHINUR RAHMAN LASKAR et. al. | Comput. Speech Lang. | 2023-05-01 |
| 936 | Metamorphic Testing of Machine Translation Models Using Back Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Machine translation software has been widely adopted in recent years. The recent advance in deep learning research has massively improved the accuracy and fluency of the … |
Wentao Gao; Jiayuan He; Van-Thuan Pham; | 2023 IEEE/ACM International Workshop on Deep Learning for … | 2023-05-01 |
| 937 | Cross-lingual Text Reuse Detection at Document Level for English-Urdu Language Pair Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In recent years, the problem of Cross-Lingual Text Reuse Detection (CLTRD) has gained the interest of the research community due to the availability of large digital repositories … |
M. Sharjeel; I. Muneer; S. Nosheen; R. M. A. Nawab; Paul Rayson; | ACM Transactions on Asian and Low-Resource Language … | 2023-05-01 |
| 938 | Low-Resourced Machine Translation for Senegalese Wolof Language Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present a parallel Wolof/French corpus of 123,000 sentences on which we conducted experiments on machine translation models based on Recurrent Neural Networks (RNN) in different data configurations. |
Derguene Mbaye; Moussa Diallo; Thierno Ibrahima Diop; | arxiv-cs.CL | 2023-04-30 |
| 939 | Cross-lingual Search for E-Commerce Based on Query Translatability and Mixed-Domain Fine-Tuning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Online stores in the US offer a unique scenario for Cross-Lingual Information Retrieval (CLIR) due to the mix of Spanish and English in user queries. Machine Translation (MT) … |
JESUS PEREZ-MARTIN et. al. | Companion Proceedings of the ACM Web Conference 2023 | 2023-04-30 |
| 940 | Rethinking The Reasonability of The Test Set for Simultaneous Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we manually annotate a monotonic test set based on the MuST-C English-Chinese test set, denoted as SiMuST-C. |
M. LIU et. al. | icassp | 2023-04-27 |
| 941 | Improving Speech-to-Speech Translation Through Unlabeled Text Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose an effective way to utilize the massive existing unlabeled text from different languages to create a large amount of S2ST data to improve S2ST performance by applying various acoustic effects to the generated synthetic data. |
X. -P. NGUYEN et. al. | icassp | 2023-04-27 |
| 942 | Lost In Translation: Generating Adversarial Examples Robust to Round-Trip Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we present a comprehensive study on the robustness of current text adversarial attacks to round-trip translation. |
N. Bhandari; P. -Y. Chen; | icassp | 2023-04-27 |
| 943 | A Corpus-Based Auto-encoder-and-Decoder Machine Translation Using Deep Neural Network for Translation from English to Telugu Language IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Mohan Mahanty; B. Vamsi; Dasari Madhavi; | SN Computer Science | 2023-04-26 |
| 944 | NAIST-SIC-Aligned: An Aligned English-Japanese Simultaneous Interpretation Corpus Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we aim to fill in the gap by introducing NAIST-SIC-Aligned, which is an automatically-aligned parallel English-Japanese SI dataset. |
JINMING ZHAO et. al. | arxiv-cs.CL | 2023-04-23 |
| 945 | NAIST-SIC-Aligned: Automatically-Aligned English-Japanese Simultaneous Interpretation Corpus Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: It remains a question that how simultaneous interpretation (SI) data affects simultaneous machine translation (SiMT). Research has been limited due to the lack of a large-scale … |
JINMING ZHAO et. al. | ArXiv | 2023-04-23 |
| 946 | Lost in Translationese? Reducing Translation Effect Using Abstract Meaning Representation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We compare our AMR-based approach against three other techniques based on machine translation or paraphrase generation. |
Shira Wein; Nathan Schneider; | arxiv-cs.CL | 2023-04-22 |
| 947 | Building A Participatory Data Design Approach to Examine Gender Bias in English-Twi Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This project attempts to build a data-design approach to examine the detection and mitigation of gender bias in an English–Twi machine translation. This project makes use of an … |
Abigail Oppong; | Extended Abstracts of the 2023 CHI Conference on Human … | 2023-04-19 |
| 948 | TransDocs: Optical Character Recognition with Word to Word Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, I have shown comparative study for pre-trained OCR while using deep learning model using LSTM-based seq2seq architecture with attention for machine translation. |
Abhishek Bamotra; Phani Krishna Uppala; | arxiv-cs.CV | 2023-04-15 |
| 949 | RISC: Generating Realistic Synthetic Bilingual Insurance Contract Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents RISC, an open-source Python package data generator (https://github.com/GRAAL-Research/risc). |
David Beauchemin; Richard Khoury; | arxiv-cs.CL | 2023-04-09 |
| 950 | LAHM : Large Annotated Dataset for Multi-Domain and Multilingual Hate Speech Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present a new multilingual hate speech analysis dataset for English, Hindi, Arabic, French, German and Spanish languages for multiple domains across hate speech – Abuse, Racism, Sexism, Religious Hate and Extremism. |
Ankit Yadav; Shubham Chandel; Sushant Chatufale; Anil Bandhakavi; | arxiv-cs.CL | 2023-04-03 |
| 951 | LAHM : Large Annotated Dataset for Multilingual & Multi-Domain Hate Speech Identification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Current research on hate speech analysis is typically oriented towards monolingual and single classification tasks. In this paper, we present a new multilingual hate speech … |
Ankit Yadav; Shubham Chandel; Sushant Chatufale; Anil Bandhakavi; | ArXiv | 2023-04-03 |
| 952 | A Neural Attention-Based Encoder-Decoder Approach for English to Bangla Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Machine translation (MT) is the process of translating text from one language to another using bilingual data sets and grammatical rules. Recent works in the field of MT have … |
Abdullah Al Shiam; S. M. Redwan; Humaun Kabir; Jungpil Shin; | Comput. Sci. J. Moldova | 2023-04-01 |
| 953 | ε Kú [MASK]: Integrating Yorùbá Cultural Greetings Into Machine Translation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper investigates the performance of massively multilingual neural machine translation (NMT) systems in translating Yorùbá greetings (ε kú [MASK]), which are a big part of … |
Idris Akinade; Jesujoba Oluwadara Alabi; David Ifeoluwa Adelani; Clement Odoje; D. Klakow; | ArXiv | 2023-03-31 |
| 954 | ε KÚ Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper investigates the performance of massively multilingual neural machine translation (NMT) systems in translating Yorùbá greetings (ε kú [MASK]), which are a big part of Yorùbá language and culture, into English. To evaluate these models, we present IkiniYorùbá, a Yorùbá-English translation dataset containing some Yorùbá greetings, and sample use cases. |
Idris Akinade; Jesujoba Alabi; David Adelani; Clement Odoje; Dietrich Klakow; | arxiv-cs.CL | 2023-03-31 |
| 955 | Sentiment Analysis of Multilingual Dataset of Bahraini Dialects, Arabic, and English Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Sentiment analysis is an application of natural language processing (NLP) that requires a machine learning algorithm and a dataset. In some cases, the dataset availability is … |
Thuraya Omran; Baraa T. Sharef; C. Grosan; Yongming Li; | Data | 2023-03-30 |
| 956 | Hallucinations in Large Multilingual Translation Models IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Existing research on hallucinations has primarily focused on small bilingual models trained on high-resource languages, leaving a gap in our understanding of hallucinations in massively multilingual models across diverse translation scenarios. In this work, we fill this gap by conducting a comprehensive analysis on both the M2M family of conventional neural machine translation models and ChatGPT, a general-purpose large language model (LLM) that can be prompted for translation. |
NUNO M. GUERREIRO et. al. | arxiv-cs.CL | 2023-03-28 |
| 957 | Translate The Beauty in Songs: Jointly Learning to Align Melody and Translate Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose Lyrics-Melody Translation with Adaptive Grouping (LTAG), a holistic solution to automatic song translation by jointly modeling lyrics translation and lyrics-melody alignment. |
CHENGXI LI et. al. | arxiv-cs.CL | 2023-03-27 |
| 958 | Linguistically Informed ChatGPT Prompts to Enhance Japanese-Chinese Machine Translation: A Case Study on Attributive Clauses IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Present-day machine translation tools often fail to accurately translate attributive clauses from Japanese to Chinese. In light of this, this paper investigates the linguistic problem underlying such difficulties, namely how does the semantic role of the modified noun affect the selection of translation patterns for attributive clauses, from a linguistic perspective. |
Wenshi Gu; | arxiv-cs.CL | 2023-03-27 |
| 959 | Towards Making The Most of ChatGPT for Machine Translation IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we aim to further mine ChatGPT’s translation ability by revisiting several aspects: temperature, task information, and domain information, and correspondingly propose an optimal temperature setting and two (simple but effective) prompts: Task-Specific Prompts (TSP) and Domain-Specific Prompts (DSP). |
KEQIN PENG et. al. | arxiv-cs.CL | 2023-03-23 |
| 960 | Selective Data Augmentation for Robust Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose to use an e2e architecture for English-Hindi (en-hi) ST. We use two imperfect machine translation (MT) services to translate Libri-trans en text into hi text. |
Rajul Acharya; Ashish Panda; Sunil Kumar Kopparapu; | arxiv-cs.CL | 2023-03-22 |
| 961 | Towards Reliable Neural Machine Translation with Consistency-Aware Meta-Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: A contributing factor to this problem is that NMT models trained with the one-to-one paradigm struggle to handle the source diversity phenomenon, where inputs with the same meaning can be expressed differently. In this work, we treat this problem as a bilevel optimization problem and present a consistency-aware meta-learning (CAML) framework derived from the model-agnostic meta-learning (MAML) algorithm to address it. |
Rongxiang Weng; Qiang Wang; Wensen Cheng; Changfeng Zhu; Min Zhang; | arxiv-cs.CL | 2023-03-20 |
| 962 | Translate Your Gibberish: Black-box Adversarial Attack on Machine Translation Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we present a simple approach to fool state-of-the-art machine translation tools in the task of translation from Russian to English and vice versa. |
Andrei Chertkov; Olga Tsymboi; Mikhail Pautov; Ivan Oseledets; | arxiv-cs.CL | 2023-03-20 |
| 963 | On The Scalability of Data Augmentation Techniques for Low-resource Machine Translation Between Chinese and Vietnamese Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Neural Machine Translation (NMT) has constantly been shown to be a standard choice to build a translation system, in both academia and industry. For low-resource language pairs, … |
Huan Vu; Ngoc-Dung Bui; | Journal of Information and Telecommunication | 2023-03-19 |
| 964 | ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To relax the dependency on labeled data of downstream tasks, we propose an intuitive and effective zero-shot learning framework, ZeroNLG, which can deal with multiple NLG tasks, including image-to-text (image captioning), video-to-text (video captioning), and text-to-text (neural machine translation), across English, Chinese, German, and French within a unified framework. |
BANG YANG et. al. | arxiv-cs.CL | 2023-03-11 |
| 965 | A Multi-stack RNN-based Neural Machine Translation Model for English to Pakistan Sign Language Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
U. Farooq; Mohd Shafry Mohd Rahim; Adnan Abid; | Neural Computing and Applications | 2023-03-11 |
| 966 | Implications of Multi-Word Expressions on English to Bharti Braille Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this paper, we have shown the improvement of English to Bharti Braille machine translation system. We have shown how we can improve a baseline NMT model by adding some … |
Nisheeth Joshi; Pragya Katyayan; | 2023 6th International Conference on Information Systems … | 2023-03-03 |
| 967 | Rethinking The Reasonability of The Test Set for Simultaneous Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we manually annotate a monotonic test set based on the MuST-C English-Chinese test set, denoted as SiMuST-C. |
MENGGE LIU et. al. | arxiv-cs.CL | 2023-03-02 |
| 968 | Exploring The Potential of Machine Translation for Generating Named Entity Datasets: A Case Study Between Persian and English Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This study focuses on the generation of Persian named entity datasets through the application of machine translation on English datasets. |
Amir Sartipi; Afsaneh Fatemi; | arxiv-cs.CL | 2023-02-19 |
| 969 | Zero and Few-Shot Localization of Task-Oriented Dialogue Agents with A Distilled Representation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose automatic methods that use ToD training data in a source language to build a high-quality functioning dialogue agent in another target language that has no training data (i.e. zero-shot) or a small training set (i.e. few-shot). |
Mehrad Moradshahi; Sina J. Semnani; Monica S. Lam; | arxiv-cs.CL | 2023-02-18 |
| 970 | How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we present a comprehensive evaluation of GPT models for machine translation, covering various aspects such as quality of different GPT models in comparison with state-of-the-art research and commercial systems, effect of prompting strategies, robustness towards domain shifts and document-level translation. |
AMR HENDY et. al. | arxiv-cs.CL | 2023-02-17 |
| 971 | Document Flattening: Beyond Concatenating Context for Document-Level Neural Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We conduct comprehensive experiments and analyses on three benchmark datasets for English-German translation, and validate the effectiveness of two variants of DocFlat. |
Minghao Wu; George Foster; Lizhen Qu; Gholamreza Haffari; | arxiv-cs.CL | 2023-02-15 |
| 972 | Encoding Sentence Position in Context-Aware Neural Machine Translation with Concatenation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We compare various methods to encode sentence positions into token representations, including novel methods. |
Lorenzo Lupo; Marco Dinarelli; Laurent Besacier; | arxiv-cs.CL | 2023-02-13 |
| 973 | Approximating to The Real Translation Quality for Neural Machine Translation Via Causal Motivated Methods Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: It is hard to evaluate translations objectively and accurately, which limits the applications of machine translation. In this article, we assume that the above phenomenon is … |
Xuewen Shi; Heyan Huang; Ping Jian; Yi-Kun Tang; | ACM Transactions on Asian and Low-Resource Language … | 2023-02-13 |
| 974 | The Unreasonable Effectiveness of Few-shot Learning for Machine Translation IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We show that with only 5 examples of high-quality translation data shown at inference, a transformer decoder-only model trained solely with self-supervised learning, is able to match specialized supervised state-of-the-art models as well as more general commercial translation systems. |
XAVIER GARCIA et. al. | arxiv-cs.CL | 2023-02-02 |
| 975 | An Evaluation of Persian-English Machine Translation Datasets with Transformers Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Nowadays, many researchers are focusing their attention on the subject of machine translation (MT). However, Persian machine translation has remained unexplored despite a vast … |
Amir Sartipi; Meghdad Dehghan; Afsaneh Fatemi; | arxiv-cs.CL | 2023-02-01 |
| 976 | Adaptive Machine Translation with Large Language Models IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This work aims to investigate how we can utilize in-context learning to improve real-time adaptive MT. Our extensive experiments show promising results at translation time. |
Yasmin Moslem; Rejwanul Haque; John D. Kelleher; Andy Way; | arxiv-cs.CL | 2023-01-30 |
| 977 | Gender Neutralization for An Inclusive Machine Translation: from Theoretical Foundations to Open Challenges IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we explore gender-neutral translation (GNT) as a form of gender inclusivity and a goal to be achieved by machine translation (MT) models, which have been found to perpetuate gender bias and discrimination. |
Andrea Piergentili; Dennis Fucci; Beatrice Savoldi; Luisa Bentivogli; Matteo Negri; | arxiv-cs.CL | 2023-01-24 |
| 978 | Malayalam Natural Language Processing: Challenges in Building A Phrase-Based Statistical Machine Translation System IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Statistical Machine Translation (SMT) is a preferred Machine Translation approach to convert the text in a specific language into another by automatically learning translations … |
M. Sebastian; G. Santhosh Kumar; | ACM Transactions on Asian and Low-Resource Language … | 2023-01-19 |
| 979 | Improving Machine Translation with Phrase Pair Injection and Corpus Filtering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we show that the combination of Phrase Pair Injection and Corpus Filtering boosts the performance of Neural Machine Translation (NMT) systems. |
Akshay Batheja; Pushpak Bhattacharyya; | arxiv-cs.CL | 2023-01-19 |
| 980 | Machine Translation for Accessible Multi-Language Text Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we aim to leverage those very advances to demonstrate that multi-language analysis is currently accessible to all computational scholars. |
Edward W. Chew; William D. Weisman; Jingying Huang; Seth Frey; | arxiv-cs.CL | 2023-01-19 |
| 981 | Understanding and Detecting Hallucinations in Neural Machine Translation Via Model Introspection IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Neural sequence generation models are known to hallucinate, by producing outputs that are unrelated to the source text. These hallucinations are potentially harmful, yet it … |
Weijia Xu; Sweta Agrawal; Eleftheria Briakou; Marianna J. Martindale; Marine Carpuat; | arxiv-cs.CL | 2023-01-18 |
| 982 | Applying Automated Machine Translation to Educational Video Courses Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We studied the capability of automated machine translation in the online video education space by automatically translating Khan Academy videos with state-of-the-art translation models and applying text-to-speech synthesis and audio/video synchronization to build engaging videos in target languages. |
Linden Wang; | arxiv-cs.CL | 2023-01-08 |
| 983 | Building A Parallel Corpus and Training Translation Models Between Luganda and English Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we build a parallel corpus with 41,070 pairwise sentences for Luganda and English which is based on three different open-sourced corpora. |
Richard Kimera; Daniela N. Rim; Heeyoul Choi; | arxiv-cs.CL | 2023-01-06 |
| 984 | Statistical Machine Translation for Indic Languages IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Different preprocessing approaches are proposed in this paper to handle the noise of the dataset. |
Sudhansu Bala Das; Divyajoti Panda; Tapas Kumar Mishra; Bidyut Kr. Patra; | arxiv-cs.CL | 2023-01-02 |
| 985 | UM-DFKI Maltese Speech Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: For the 2023 IWSLT Maltese Speech Translation Task, UM-DFKI jointly presents a cascade solution which achieves 0.6 BLEU. While this is the first time that a Maltese speech … |
A. WILLIAMS et al. | International Workshop on Spoken Language Translation | 2023-01-01 |
| 986 | GUIT-NLP’s Submission to Shared Task: Low Resource Indic Language Translation Abstract: This paper describes the submission of the GUIT-NLP team in the “Shared Task: Low Resource Indic Language Translation” focusing on three low-resource language pairs: English-Mizo, … |
Mazida Akhtara Ahmed; Kuwali Talukdar; P. Boruah; Prof. Shikhar Kumar Sarma; Kishore Kashyap; | Conference on Machine Translation | 2023-01-01 |
| 987 | The Xiaomi AI Lab’s Speech Translation Systems for IWSLT 2023 Offline Task, Simultaneous Task and Speech-to-Speech Task Abstract: This system description paper introduces the systems submitted by Xiaomi AI Lab to the three tracks of the IWSLT 2023 Evaluation Campaign, namely the offline speech translation … |
WUWEI HUANG et al. | International Workshop on Spoken Language Translation | 2023-01-01 |
| 988 | Improving Formality-Sensitive Machine Translation Using Data-Centric Approaches and Prompt Engineering Abstract: In this paper, we present the KU x Upstage team’s submission for the Special Task on Formality Control on Spoken Language Translation, which involves translating English into four … |
Seungjun Lee; Hyeonseok Moon; Chanjun Park; Heu-Jeoung Lim; | International Workshop on Spoken Language Translation | 2023-01-01 |
| 989 | A Deep Learning-Based Intelligent Quality Detection Model for Machine Translation IF:3 Abstract: With more and more active international connections, the complex scenes-aware machine translation has been a novel concern in the area of natural language processing. Although … |
Meijuan Chen; | IEEE Access | 2023-01-01 |
| 990 | Simultaneous Interpreting As A Noisy Channel: How Much Information Gets Through Abstract: We explore the relationship between information density/surprisal of source and target texts in translation and interpreting in the language pair English-German, looking at the … |
M. Kunilovskaya; Heike Przybyl; Ekaterina Lapshinova-Koltunski; Elke Teich; | Recent Advances in Natural Language Processing | 2023-01-01 |
| 991 | Transformer-Based Neural Machine Translation for Post-OCR Error Correction in Cursive Text |
Nehal Yasin; Imran Siddiqi; Momina Moetesum; Sadaf Abdul Rauf; | ICDAR Workshops | 2023-01-01 |
| 992 | Enhancing Code-mixed Text Generation Using Synthetic Data Filtering in Neural Machine Translation Abstract: Code-Mixing, the act of mixing two or more languages, is a common communicative phenomenon in multi-lingual societies. The lack of quality in code-mixed data is a bottleneck for … |
D. Sravani; Radhika Mamidi; | Conference on Computational Natural Language Learning | 2023-01-01 |
| 993 | Alleviating Exposure Bias for Neural Machine Translation Via Contextual Augmentation and Self Distillation Abstract: In neural machine translation (NMT), most sequence-to-sequence (seq2seq) models are trained only with the teacher-forcing paradigm, where the ground truth history is used to … |
Zhidong Liu; Junhui Li; Muhua Zhu; | IEEE/ACM Transactions on Audio, Speech, and Language … | 2023-01-01 |
| 994 | MMT’s Submission for The WMT 2023 Quality Estimation Shared Task Abstract: This paper presents our submission to the WMT 2023 Quality Estimation (QE) shared task 1 (sentence-level subtask). We propose a straightforward training data augmentation approach … |
Yulong Wu; Viktor Schlegel; Daniel Beck; R. Batista-Navarro; | Conference on Machine Translation | 2023-01-01 |
| 995 | HW-TSC 2023 Submission for The Quality Estimation Shared Task Abstract: Quality estimation (QE) is an essential technique to assess machine translation quality without reference translations. In this paper, we focus on Huawei Translation Services … |
YUANG LI et al. | Conference on Machine Translation | 2023-01-01 |
| 996 | CCEval: A Representative Evaluation Benchmark for The Chinese-centric Multilingual Machine Translation |
Lianzhang Lou; Xi Yin; Yutao Xie; Yang Xiang; | Conference on Empirical Methods in Natural Language … | 2023-01-01 |
| 997 | An Automatic Error Detection Method for Machine Translation Results Via Deep Learning IF:3 Abstract: Nowadays, the rapid development of natural language processing has brought great progress for the area of machine translation. Various deep neural network-based machine … |
Weihong Zhang; | IEEE Access | 2023-01-01 |
| 998 | Machine Translation with Large Language Models: Prompting, Few-shot Learning, and Fine-tuning with QLoRA IF:3 Abstract: While large language models have made remarkable advancements in natural language generation, their potential in machine translation, especially when fine-tuned, remains … |
Xuan Zhang; Navid Rajabi; Kevin Duh; Philipp Koehn; | Conference on Machine Translation | 2023-01-01 |
| 999 | KnowComp Submission for WMT23 Word-Level AutoCompletion Task Abstract: The NLP community has recently witnessed the success of Large Language Models (LLMs) across various Natural Language Processing (NLP) tasks. However, the potential of LLMs for … |
Yi Wu; Haochen Shi; Weiqi Wang; Yangqiu Song; | Conference on Machine Translation | 2023-01-01 |
| 1000 | PRHLT’s Submission to WLAC 2023 Abstract: This paper describes our submission to the Word-Level AutoCompletion shared task of WMT23. We participated in the English–German and German–English categories. We extended our … |
Ángel Navarro; Miguel Domingo; Francisco Casacuberta; | Conference on Machine Translation | 2023-01-01 |