Open Access Article

Modern Social Science Research. 2026; 6(6): 28-34. DOI: 10.12208/j.ssr.20260200

A comparative study on the quality of large language models’ translation in literary text: Taking excerpts from Distant Sunflower Fields as an example
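Chinese title (translated): A Comparative Study of the Literary Translation Quality of Mainstream Large Language Models: An Empirical Analysis Based on Excerpts from Distant Sunflower Fields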

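Authors: Shi Bowen (时博文), Liu Limin (刘丽敏) *

Fujian Agriculture and Forestry University, Fuzhou, Fujian

*Corresponding author: Liu Limin, Fujian Agriculture and Forestry University, Fuzhou, Fujian

Published: 2026-06-11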

Abstract

This study takes two consecutive sections, "Grandma's World" and "Grandma's Funeral," from Li Juan's essay collection Distant Sunflower Fields as its source text. The sections were translated with seven large language models: Kimi, DeepSeek 3.2, Gemini 3.0 Flash, Qwen-3.5 Plus, Qwen-3 Max, GPT 5.4, and ERNIE Bot. Each model's output was compared with the published human translation using several translation evaluation metrics (TTR, BLEU, and METEOR). In addition, House's scale and the MQM scale were used to assess the accuracy and literariness of the translations. The results indicate that large language models already perform well in literary translation, with Kimi and DeepSeek 3.2 standing out across the evaluations. The other models still show considerable room for improvement in understanding and conveying culture-specific content and in literary expression.

Key words: Large language models; Distant Sunflower Fields; Translation quality; Assessment; Comparison
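To make the automatic metrics named in the abstract more concrete, the following is a minimal sketch of two of the underlying ideas: the type-token ratio (TTR, distinct words divided by total words) and clipped unigram precision, which is one building block of BLEU. It is not the authors' evaluation pipeline. The tokenizer, the function names, and the two example sentences are assumptions made for illustration and are not taken from the study's data. Full BLEU additionally combines higher-order n-gram precisions with a brevity penalty, and METEOR adds stem and synonym matching, neither of which is shown here.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (an assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def type_token_ratio(text):
    """Return distinct tokens divided by total tokens; 0.0 for empty input."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def clipped_unigram_precision(candidate, reference):
    """Return the share of candidate tokens that also occur in the reference.

    Each candidate token is credited at most as many times as it appears
    in the reference (the "clipping" used in BLEU's modified precision).
    """
    cand_counts = Counter(tokenize(candidate))
    ref_counts = Counter(tokenize(reference))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[token]) for token, count in cand_counts.items())
    return matched / total


# Hypothetical sentences, invented for illustration only.
reference = "the old woman walked slowly across the field"
candidate = "the old woman walked across the the field"

print(type_token_ratio(candidate))                      # 6 distinct / 8 total
print(clipped_unigram_precision(candidate, reference))  # 7 matched / 8 total
```

In this toy case the repeated "the" lowers both scores: it adds a token without adding a new word type, and the third occurrence is not credited because the reference contains the word only twice.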

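Cite this article

Shi Bowen, Liu Limin. A Comparative Study of the Literary Translation Quality of Mainstream Large Language Models: An Empirical Analysis Based on Excerpts from Distant Sunflower Fields [J]. Modern Social Science Research, 2026; 6(6): 28-34.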