Nvidia's Nemotron 3 swaps pure Transformers for a Mamba hybrid to run AI agents efficiently

THE DECODER | Wednesday, December 17, 2025 at 7:50:29 PM
  • Nvidia has introduced the Nemotron 3 family, which combines Mamba and Transformer architectures to handle long context windows for AI agents efficiently. The hybrid design aims to reduce resource usage while improving performance in AI applications.
  • The launch of Nemotron 3 reinforces Nvidia's position in AI model development. The models also adopt a mixture-of-experts architecture, which is intended to improve accuracy and reliability as demand for more capable AI systems grows across sectors.
  • The release reflects a broader industry trend toward hybrid architectures that improve efficiency and performance. As competition intensifies, particularly with alternatives such as Google's TPUs gaining traction, Nemotron 3 underscores Nvidia's effort to keep its competitive edge.
— via World Pulse Now AI Editorial System

Continue Reading
Rivian Replaces Nvidia with Own AI Chip
Positive | Artificial Intelligence
Rivian has announced that it is replacing Nvidia's chips with its own AI chip, RAP1, as part of its strategy to reach Level 4 autonomy in its vehicles. The decision coincides with a partnership with Volkswagen and indicates a significant shift in Rivian's approach to autonomous driving technology.
Nvidia could slash GeForce RTX 50 production by up to 40% next year
Negative | Artificial Intelligence
Nvidia is reportedly considering a production cut of 30 to 40% for its GeForce RTX 50 series GPUs in the first half of 2026, primarily due to ongoing memory shortages. This decision, based on unconfirmed reports from Board Channels, suggests that availability of these graphics cards may become as challenging as it was at their initial launch.
SigMA: Path Signatures and Multi-head Attention for Learning Parameters in fBm-driven SDEs
Positive | Artificial Intelligence
A new neural architecture named SigMA has been introduced, integrating path signatures with multi-head self-attention for parameter learning in stochastic differential equations (SDEs) driven by fractional Brownian motion (fBm). This approach addresses the challenges posed by non-Markovian processes, which complicate traditional parameter estimation techniques.
Characterizing Mamba's Selective Memory using Auto-Encoders
Neutral | Artificial Intelligence
A recent study has characterized the selective memory of Mamba's state space models (SSMs) using auto-encoders, revealing the types of tokens and sequences that are frequently forgotten during long sequence processing. This research addresses a critical knowledge gap in understanding the information loss associated with SSMs in language modeling.
Model Agnostic Preference Optimization for Medical Image Segmentation
Positive | Artificial Intelligence
A new training framework called Model Agnostic Preference Optimization (MAPO) has been introduced for medical image segmentation, which utilizes Dropout-driven stochastic segmentation hypotheses to create preference-consistent gradients without relying on direct ground-truth supervision. This model-agnostic approach supports various architectures, including 2D/3D CNNs and Transformers.
MS-Temba: Multi-Scale Temporal Mamba for Understanding Long Untrimmed Videos
Positive | Artificial Intelligence
The introduction of MS-Temba, a Multi-Scale Temporal Mamba model, addresses significant challenges in Temporal Action Detection (TAD) for untrimmed videos, particularly in Activities of Daily Living (ADL). This model enhances the ability to process long-duration videos, capture temporal variations, and detect overlapping actions effectively through the use of dilated State-space Models (SSMs).
Empirical Investigation of the Impact of Phase Information on Fault Diagnosis of Rotating Machinery
Positive | Artificial Intelligence
An empirical investigation has revealed that incorporating phase information significantly enhances fault diagnosis in rotating machinery. The study introduces two innovative phase-aware preprocessing strategies that effectively address random phase variations in multi-axis vibration data, demonstrating improvements across various deep learning architectures.
From Nvidia to OpenAI, Silicon Valley woos Westminster as ex-politicians take tech firm roles
Neutral | Artificial Intelligence
The recent interactions between Silicon Valley executives and British politicians highlight a growing trend where former political leaders, such as George Osborne, Nick Clegg, and Tony Blair, are taking on roles in tech firms. This shift was exemplified by Nvidia CEO Jensen Huang's event in London during Donald Trump's state visit, where he promoted AI's potential and announced significant investments in the UK.
