arXiv:2509.04432v3
Abstract: Temporal reasoning and knowledge are essential capabilities for language models (LMs). While much prior work has analyzed and improved temporal reasoning in LMs, most studies have focused solely on the Gregorian calendar. However, many non-Gregorian systems, such as the Japanese, Hijri, and Hebrew calendars, are in active use and reflect culturally grounded conceptions of time. Whether and how well current LMs handle such non-Gregorian calendars has not been evaluated so far. Here, we present a systematic evaluation of how well language models handle one such non-Gregorian system: the Japanese wareki. We create datasets that require temporal knowledge and reasoning with wareki dates. Evaluating open and closed LMs, we find that some models can perform calendar conversions, but GPT-4o, DeepSeek V3, and even Japanese-centric models struggle with Japanese calendar arithmetic and knowledge involving wareki dates. Error analysis suggests the corpus frequency of Japanese calendar expressions and a Gregorian bias in the models' knowledge as possible explanations. Our results show the importance of developing LMs that are better equipped for culture-specific tasks such as calendar understanding.

A recent study evaluated how well language models (LMs) handle the Japanese wareki calendar. Some models can perform calendar conversions, but GPT-4o, DeepSeek V3, and even Japanese-centric models struggle with calendar arithmetic and with knowledge involving wareki dates. The authors' error analysis points to the corpus frequency of Japanese calendar expressions and a Gregorian bias in the models' knowledge as possible explanations. The findings highlight the need for LMs that better accommodate culturally significant non-Gregorian systems.
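To make the two task types concrete, here is a minimal Python sketch of a calendar conversion and a piece of wareki arithmetic. It is an illustration only, not the study's dataset or evaluation code. The function names (`to_gregorian_year`, `to_wareki`, `shift_years`) are hypothetical, and the era table is an assumption limited to the four most recent eras, with romanized era names.

```python
from datetime import date

# Gregorian start dates of the four most recent eras, newest first.
# Earlier eras are deliberately left out of this sketch.
ERAS = [
    ("Reiwa", date(2019, 5, 1)),
    ("Heisei", date(1989, 1, 8)),
    ("Showa", date(1926, 12, 25)),
    ("Taisho", date(1912, 7, 30)),
]


def to_gregorian_year(era: str, era_year: int) -> int:
    """Conversion: era name + era year -> Gregorian year.

    Year 1 of an era is the Gregorian year in which the era began.
    No check is made that the era actually lasted `era_year` years.
    """
    for name, start in ERAS:
        if name == era:
            return start.year + era_year - 1
    raise ValueError(f"unknown era: {era}")


def to_wareki(d: date) -> tuple[str, int]:
    """Conversion: Gregorian date -> (era name, era year)."""
    for name, start in ERAS:  # newest first, so the first match wins
        if d >= start:
            return name, d.year - start.year + 1
    raise ValueError("date precedes the eras in this table")


def shift_years(era: str, era_year: int, month: int, day: int, delta: int) -> tuple[str, int]:
    """Arithmetic: move a wareki date by `delta` years.

    The date is mapped to the Gregorian calendar, shifted, and mapped
    back, so the result may land in a different era.
    """
    target = date(to_gregorian_year(era, era_year) + delta, month, day)
    return to_wareki(target)


print(to_gregorian_year("Heisei", 31))      # Heisei 31 is 2019
print(to_wareki(date(2019, 5, 1)))          # first day of Reiwa
print(shift_years("Heisei", 25, 6, 1, 10))  # ten years after 1 June, Heisei 25
```

The arithmetic case is harder than the plain conversion because the answer depends on where era boundaries fall: ten years after Heisei 25 is not "Heisei 35" but a year in the Reiwa era.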

Can Language Models Handle a Non-Gregorian Calendar? The Case of the Japanese wareki
