ArXiv says submissions must be in English: are AI translators up for the job?


Artificial-intelligence translators could help researchers to meet the arXiv preprint server’s new mandate that all manuscripts be submitted in English. Credit: Sharaf Maksumov/Alamy

Every month, more than 20,000 scientific manuscripts by authors from around the world are posted on the preprint repository arXiv, the oldest and best-known preprint site. Now researchers uploading their work to the site are facing a new requirement: from 11 February, all submissions must be either written in English or accompanied by a full English translation.

Until now, authors have had to submit only an abstract in English. Staff at arXiv say that the English rule will make life easier for its moderators and keep its readership broad. “We can’t be fair in judging papers if they are not in English,” says Ralph Wijers, the chair of the arXiv editorial advisory council and an astronomer at the University of Amsterdam, whose native language is Dutch. The site, based at Cornell University in Ithaca, New York, does not undertake peer review, but a team of some 300 volunteer moderators verifies that submissions are “appropriate and topical”.

ArXiv hosts nearly 3 million preprints across eight subject areas, although the vast majority of the manuscripts are in computer science, physics and mathematics. Just 1% of submissions are in a language other than English. Nonetheless, the revised language policy has prompted some vocal complaints, including arguments that the burden of the mandate might deter people from making content such as PhD theses and preprints of textbook chapters public. Authors of such texts might think it is not worth the effort to translate them or to find an alternative venue for making them accessible.

“I personally see it as a loss for our community,” says mathematician Angelo Lucia at the Polytechnic of Milan in Italy.

Several French mathematicians commented on the arXiv announcement, saying they might take their manuscripts to the French preprint server HAL (Hyper Articles en Ligne) instead. HAL hosts works in several languages, including English, French and Spanish, without requiring translations.

Machine translation

The arXiv policy specifies that automated translations, such as those done by artificial-intelligence chatbots, are acceptable, so long as they are faithful to the original work.

Editors at arXiv have some reservations about these systems’ capabilities, however. “Our advice is: feel free to use an AI or an LLM [large language model] to translate your text, but please check it,” says Wijers. “Our own experience is that AI translation is good but not good enough.”

This caution echoes that expressed by respondents to a Nature survey in 2025 of more than 5,000 researchers from around the world (respondents included both volunteers and randomly selected authors of recent papers). Although more than 90% of those surveyed felt it was acceptable to use AI to translate a paper into another language (and 8% had done so), more than half said this would be appropriate only if the translation was checked by a native speaker.

Delving deep

LLMs are widely considered to be excellent at generating conversational text, but limited attention has been given to their prowess at translating scientific papers.

James Zou, a computer scientist at Stanford University in California, and Hannah Kleidermacher, a doctoral student in electrical engineering also at Stanford, investigated one LLM’s ability to translate academic text from English into other languages. They asked GPT-4o — an LLM released by OpenAI in San Francisco, California, in 2024 — to create a 50-question multiple-choice quiz for each of six scientific papers in English across various topics, with an answer key. This produced an automated benchmark with which to evaluate the LLM’s performance. The authors then instructed the LLM to translate the six papers into 28 other languages, and take the quiz on the translated versions.
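
The scoring step of a quiz-based benchmark like this one can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the function names, the language codes and the four-question answer key are all hypothetical, and the steps that call the LLM (writing the quiz, translating the papers, answering the questions) are left out.

```python
from typing import Dict, List


def quiz_accuracy(answer_key: List[str], answers: List[str]) -> float:
    """Return the fraction of quiz answers that match the answer key."""
    if not answer_key:
        raise ValueError("answer key is empty")
    if len(answer_key) != len(answers):
        raise ValueError("answer key and answers differ in length")
    correct = sum(key == given for key, given in zip(answer_key, answers))
    return correct / len(answer_key)


def accuracy_by_language(
    answer_key: List[str], answers_by_language: Dict[str, List[str]]
) -> Dict[str, float]:
    """Score the same quiz once per translated version of a paper."""
    return {
        language: quiz_accuracy(answer_key, answers)
        for language, answers in answers_by_language.items()
    }


# Hypothetical data: a 4-question key stands in for the 50-question quiz,
# and the answers stand in for the model's responses after reading each
# translated paper.
answer_key = ["A", "C", "B", "D"]
answers_by_language = {
    "fr": ["A", "C", "B", "D"],
    "ja": ["A", "C", "D", "D"],
}

scores = accuracy_by_language(answer_key, answers_by_language)
for language, score in scores.items():
    print(f"{language}: {score:.0%}")
```

A lower score on a translated version than on the English original would suggest that information was lost or distorted in translation, which is the signal such a benchmark is designed to pick up.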


