CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning
Positive | Artificial Intelligence
- CUDA-L2 optimizes half-precision general matrix multiply (HGEMM) CUDA kernels by combining large language models with reinforcement learning. The system reportedly outperforms existing matrix multiplication baselines, including torch.matmul and NVIDIA's cuBLAS, with its strongest speedups in offline execution mode (a minimal timing sketch against the cuBLAS-backed torch.matmul follows this list).
- This matters because matrix multiplication dominates the cost of many machine learning and data-processing workloads, so kernel-level gains translate directly into end-to-end efficiency. By surpassing established baselines, CUDA-L2 positions itself as a practical tool for developers and researchers who need optimized GEMM performance.
- CUDA-L2 also fits a broader trend of applying advanced machine learning techniques to performance engineering itself. The related introduction of Low-Rank GEMM, which reduces the computational complexity of matrix multiplication, underscores the same growing emphasis on efficiency and performance in AI-driven applications (a toy low-rank sketch appears at the end of this section).
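
As a point of reference for the cuBLAS comparison above, the following is a minimal sketch of how one might time the cuBLAS-backed torch.matmul on FP16 inputs. The matrix size, warm-up count, and iteration count are illustrative assumptions, not figures reported for CUDA-L2.

```python
import torch

assert torch.cuda.is_available(), "this sketch requires a CUDA device"

M = N = K = 4096  # assumed problem size, not a figure from the CUDA-L2 work
a = torch.randn(M, K, dtype=torch.float16, device="cuda")
b = torch.randn(K, N, dtype=torch.float16, device="cuda")

# Warm up so one-time setup (algorithm selection, caches) is excluded.
for _ in range(10):
    torch.matmul(a, b)
torch.cuda.synchronize()

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
iters = 100
start.record()
for _ in range(iters):
    torch.matmul(a, b)
end.record()
torch.cuda.synchronize()

ms = start.elapsed_time(end) / iters  # average milliseconds per GEMM
tflops = 2 * M * N * K / (ms * 1e-3) / 1e12  # a GEMM costs 2*M*N*K FLOPs
print(f"torch.matmul (cuBLAS HGEMM): {ms:.3f} ms/iter, {tflops:.1f} TFLOP/s")
```

Any candidate kernel, CUDA-L2-generated or otherwise, can be dropped into the same timing loop in place of torch.matmul for an apples-to-apples comparison.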
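To make the complexity-reduction claim concrete, here is a toy sketch of the low-rank idea: if one operand factors (at least approximately) as a product of two rank-r matrices with r much smaller than the inner dimension, the GEMM can be rewritten as two skinny GEMMs. The shapes and rank below are assumptions chosen for illustration; this is not the Low-Rank GEMM implementation referenced above.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

M, K, N, r = 2048, 2048, 2048, 64  # assumed shapes; r << K, N
A = torch.randn(M, K, dtype=dtype, device=device)
U = torch.randn(K, r, dtype=dtype, device=device)
V = torch.randn(r, N, dtype=dtype, device=device)

dense = A @ (U @ V)     # dense path: materializes the full K x N operand
low_rank = (A @ U) @ V  # factored path: two skinny GEMMs

# Mathematically identical; small numerical deviation is expected in FP16.
err = (dense.float() - low_rank.float()).abs().max() / dense.float().abs().max()
print(f"max relative deviation: {err:.2e}")

dense_flops = 2 * M * K * N               # standard GEMM cost
lr_flops = 2 * M * K * r + 2 * M * r * N  # cost of the two skinny GEMMs
print(f"FLOP ratio (dense / low-rank): {dense_flops / lr_flops:.1f}x")
```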
— via World Pulse Now AI Editorial System