Tencent's Open-Source Models Dominate Translation Scene, Google Responds
Tencent has released two open-source translation models, Hunyuan-MT-7B and Hunyuan-MT-Chimera-7B. The models perform strongly on both widely spoken and minority languages and have outperformed established systems such as Google Translate.
Tencent's models support bidirectional translation between 33 languages, including some that are less frequently digitized. A particular focus is translation between Mandarin Chinese and ethnic minority languages of China, such as Kazakh, Uyghur, Mongolian, and Tibetan. At 7 billion parameters, the models are smaller than many foundation models, yet they have achieved remarkable results.
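To give a sense of scale, the short sketch below counts how many translation directions 33 languages would yield. It assumes that every language can be paired with every other one, which the article does not state explicitly; the function name is purely illustrative.

```python
from itertools import permutations


def count_directed_pairs(num_languages: int) -> int:
    """Count ordered (source, target) pairs among num_languages languages.

    Assumption: every language can be translated into every other language.
    """
    # Each ordered pair of distinct indices is one translation direction.
    return sum(1 for _ in permutations(range(num_languages), 2))


if __name__ == "__main__":
    # Under the all-pairs assumption this equals n * (n - 1).
    print(count_directed_pairs(33))
```

Under that assumption, a single model would have to cover more than a thousand directions, which is why broad multilingual training data matters.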
At the international WMT2025 workshop, Tencent's models took first place in 30 of the 31 language combinations tested. They outperformed systems such as Google Translate, GPT-4.1, Claude 4 Sonnet, and Gemini 2.5 Pro in most categories. This success is attributed to a five-stage training process: general text training, refinement with translation-specific data, supervised learning, reinforcement learning, and a final 'Weak-to-Strong' reinforcement learning stage. The models were trained on a dataset of 1.3 trillion tokens covering 112 languages and dialects, with a specific focus on minority languages.
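The five training stages run in sequence, each building on the model produced by the one before. The sketch below only illustrates that ordering. The stage names come from the article, while the `Stage` class, the `run_pipeline` function, and the checkpoint labels are hypothetical placeholders and not Tencent's actual training code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    name: str
    note: str


# Stage names as listed in the article; the notes are brief paraphrases.
PIPELINE: List[Stage] = [
    Stage("general text training", "broad text data"),
    Stage("translation-specific refinement", "translation-specific data"),
    Stage("supervised learning", "learning from labelled examples"),
    Stage("reinforcement learning", "optimising against a reward signal"),
    Stage("weak-to-strong reinforcement learning", "not detailed in the article"),
]


def run_pipeline(stages: List[Stage], checkpoint: str = "init") -> List[str]:
    """Chain stages so each one starts from the previous stage's checkpoint.

    Hypothetical placeholder: a real stage would train a model here;
    this version only records the order in which checkpoints are produced.
    """
    history = []
    for index, _stage in enumerate(stages, start=1):
        checkpoint = f"{checkpoint}->stage{index}"
        history.append(checkpoint)
    return history


if __name__ == "__main__":
    for stage, ckpt in zip(PIPELINE, run_pipeline(PIPELINE)):
        print(f"{stage.name}: {ckpt}")
```

The point of the chain is that later stages such as reinforcement learning refine a model that already translates reasonably well, rather than starting from scratch.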
The Chimera model takes a fusion approach: it receives multiple candidate translations from different systems and combines them into an improved final translation.
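The fusion idea can be sketched as a function that packages the source text and the candidate translations into one input for a combining model. Everything here is an assumption made for illustration: the prompt wording, the function names, and the `model` callable are hypothetical. When no model is supplied, the sketch falls back to a simple consensus pick (the candidate most similar to the others), which is a stand-in technique and not how Chimera itself works.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Optional


def build_fusion_prompt(source: str, candidates: List[str], target_lang: str) -> str:
    """Format the source text and numbered candidates for a fusion model."""
    numbered = "\n".join(f"{i}. {c}" for i, c in enumerate(candidates, start=1))
    return (
        f"Source text:\n{source}\n\n"
        f"Candidate translations into {target_lang}:\n{numbered}\n\n"
        "Combine the candidates into one improved translation."
    )


def consensus_pick(candidates: List[str]) -> str:
    """Stand-in for a fusion model: return the candidate closest to the rest."""
    def avg_similarity(i: int) -> float:
        others = [c for j, c in enumerate(candidates) if j != i]
        if not others:
            return 0.0
        scores = [SequenceMatcher(None, candidates[i], o).ratio() for o in others]
        return sum(scores) / len(scores)

    best = max(range(len(candidates)), key=avg_similarity)
    return candidates[best]


def fuse(
    source: str,
    candidates: List[str],
    target_lang: str,
    model: Optional[Callable[[str], str]] = None,
) -> str:
    """Combine candidates with a hypothetical model, or fall back to consensus."""
    if not candidates:
        raise ValueError("at least one candidate translation is required")
    if model is None:
        return consensus_pick(candidates)
    return model(build_fusion_prompt(source, candidates, target_lang))


if __name__ == "__main__":
    outputs = [
        "The cat sits on the mat.",
        "The cat sits on the mat.",
        "A dog is running.",
    ]
    print(fuse("Die Katze sitzt auf der Matte.", outputs, "English"))
```

A real fusion model can go beyond selection and merge the best parts of several candidates, which is the advantage the article describes.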
With these results, Tencent's Hunyuan models have set a new benchmark for translation systems, and their open-source release allows others to develop and improve them further. Meanwhile, Google has announced new AI features for its translation service, including live bidirectional conversations and personalized language learning. Translation technology appears to be heading toward more accessible, accurate, and personalized experiences.