ASR Under the Stethoscope: Evaluating Biases in Clinical Speech Recognition across Indian Languages

arXiv cs.CL | Monday, December 15, 2025 at 5:00:00 AM


A systematic audit of automatic speech recognition (ASR) performance in Indian healthcare settings examines languages such as Kannada, Hindi, and Indian English. The study compares several ASR models, including Indic Whisper and Google Speech-to-Text, and evaluates transcription accuracy across demographic groups. It reports significant performance variability and biases tied to speaker role and language use.
The evaluation matters because it tests the reliability of ASR technologies in clinical environments, where they are increasingly used to document patient interactions. Understanding the biases and error patterns can inform improvements to ASR systems so that they serve diverse populations effectively.
The findings underscore ongoing challenges for ASR in multilingual contexts such as India, where language diversity can produce disparities in healthcare communication. They reflect a broader pattern in AI language processing: performance gaps persist between languages and dialects, and closing them calls for more inclusive, context-aware systems.
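
To make "transcription accuracy across demographic groups" concrete, here is a minimal sketch of a per-group comparison. It assumes word error rate (WER) as the accuracy measure, which the summary does not specify, and it uses hypothetical group labels and invented sample sentences rather than data from the study.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_group(samples):
    """Pool errors per group; samples are (group, reference, hypothesis)."""
    errors = defaultdict(int)
    words = defaultdict(int)
    for group, reference, hypothesis in samples:
        ref_tokens = reference.lower().split()
        hyp_tokens = hypothesis.lower().split()
        errors[group] += edit_distance(ref_tokens, hyp_tokens)
        words[group] += len(ref_tokens)
    # Skip groups with no reference words to avoid dividing by zero.
    return {g: errors[g] / words[g] for g in words if words[g] > 0}


# Hypothetical samples: the "doctor"/"patient" labels and the sentences
# are illustrative only and do not come from the study.
samples = [
    ("doctor", "take one tablet after food", "take one tablet after food"),
    ("patient", "i have pain in my chest", "i have pen in chest"),
]
result = wer_by_group(samples)
```

Pooling errors and reference words per group, instead of averaging per-utterance rates, keeps short utterances from dominating a group's score. A gap between the per-group values is the kind of disparity an audit like this would flag.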

— via World Pulse Now AI Editorial System
