Baidu unveils proprietary ERNIE 5 beating GPT-5 performance on charts, document understanding and more

VentureBeat (AI), Thursday, November 13, 2025 at 8:23:00 PM
Baidu unveiled ERNIE 5.0 just as OpenAI updated its flagship model to GPT-5.1 with improved conversational abilities. Baidu positions ERNIE 5.0 as a direct competitor and claims it outperforms GPT-5 on chart and document understanding, among other areas. The launch underscores intensifying competition in AI, with companies such as Baidu pushing for a larger presence in the global market, particularly after the mixed reviews of GPT-5. It also reflects the rapid pace of AI development and the pressure on vendors to keep innovating as user demands evolve.
— via World Pulse Now AI Editorial System


Recommended Readings
UI2Code^N: A Visual Language Model for Test-Time Scalable Interactive UI-to-Code Generation
Positive | Artificial Intelligence
UI programming is a complex aspect of software development. Recent advancements in visual language models (VLMs) show promise for automatic UI coding, yet existing methods face limitations in multimodal capabilities and iterative feedback. The UI2Code^N model addresses these issues through an interactive UI-to-code approach, enhancing performance by integrating UI generation, editing, and polishing. This model is trained using staged pretraining, fine-tuning, and reinforcement learning, aiming to improve multimodal coding significantly.
MicroVQA++: High-Quality Microscopy Reasoning Dataset with Weakly Supervised Graphs for Multimodal Large Language Model
Positive | Artificial Intelligence
MicroVQA++ is a newly introduced, high-quality microscopy reasoning dataset designed for multimodal large language models (MLLMs). It is built from the BIOMEDICA archive through a three-stage process that combines expert-validated figure-caption pairs, a novel heterogeneous graph for filtering inconsistent samples, and human-checked multiple-choice questions. The dataset aims to improve scientific reasoning in biomedical imaging, a field currently limited by the lack of large-scale training data.