Ai2's new Olmo 3.1 extends reinforcement learning training for stronger reasoning benchmarks

VentureBeat (AI) · Friday, December 12, 2025 at 5:00:00 AM
  • The Allen Institute for AI (Ai2) has released Olmo 3.1, an update to its Olmo model family that extends reinforcement learning training to improve performance on reasoning benchmarks. The update includes two optimized versions, Olmo 3.1 Think 32B for advanced research and Olmo 3.1 Instruct 32B for instruction-following tasks, alongside a programming-focused model, Olmo 3-Base.
  • The release reflects Ai2's focus on efficiency, transparency, and control, which are crucial for enterprise applications. The extended reinforcement learning training schedule aims to improve the model's performance on complex reasoning tasks.
  • Olmo 3.1 fits a growing trend toward models that prioritize customization and transparency, as seen in competing models like Qwen and Llama. It also reflects a broader industry push to strengthen reasoning and coding capabilities to meet the demands of a widening range of applications.
— via World Pulse Now AI Editorial System
