Baidu ERNIE multimodal AI beats GPT and Gemini in benchmarks

Artificial Intelligence News | Wednesday, November 12, 2025 at 4:09:44 PM
Baidu's latest multimodal AI model, ERNIE-4.5-VL-28B-A3B-Thinking, has outperformed both GPT and Gemini in key benchmarks. The model targets enterprise data types that conventional text-centric models frequently neglect: engineering schematics, factory-floor video feeds, medical scans, and logistics dashboards. Interpreting such varied sources can help organizations improve operational efficiency and make better-informed decisions. The development could shift how enterprises use AI, toward models that handle multimodal data effectively.
— via World Pulse Now AI Editorial System


Recommended Readings
Reinforcing Stereotypes of Anger: Emotion AI on African American Vernacular English
Negative | Artificial Intelligence
Automated emotion detection systems are increasingly used in fields such as mental health and hiring. However, these models often fail to accurately recognize emotional expressions in dialects like African American Vernacular English (AAVE) because they rely on dominant cultural norms. A study analyzing 2.7 million tweets from Los Angeles found that emotion recognition models exhibited significantly higher false positive rates for anger in AAVE than in General American English (GAE), highlighting the limitations of current emotion AI technologies.
Death’s Job Game
Positive | Artificial Intelligence
The article describes the creation of a 2D game titled 'Death's Job,' developed using Pygame. The developer started the project out of curiosity about game development. After implementing the initial game logic, including gravity and physics, the developer struggled to find suitable assets. Following a break for Diwali and university exams, the developer returned to complete the game, generating matching sprites with Gemini and sourcing sound effects from Pixabay. The final product features an interactive splash screen, Flappy Bird-style jumping, smooth physics, and pixel-perfect collision detection.
It was acceptable in the 80s
Positive | Artificial Intelligence
The article describes a project created for the Google AI Studio Education Track, in which the author developed a ZX Spectrum-style Loading Screen Generator. The application, built using Google AI Studio, Gemini, and Imagen, generates nostalgic loading screens from user-provided prompts. The project aims to evoke the vibrant aesthetics of 80s gaming, characterized by chunky pixels and bright colors.