IndRegBias: A Dataset for Studying Indian Regional Biases in English and Code-Mixed Social Media Comments

arXiv (cs.CL) | Wednesday, January 14, 2026 at 5:00:00 AM
  • A new dataset named IndRegBias has been introduced to study regional biases in English and code-mixed comments on social media platforms like Reddit and YouTube, focusing on Indian contexts. This dataset comprises 25,000 comments that reflect regional biases, which have been less explored compared to other social biases such as gender and race.
  • IndRegBias aims to fill a gap in research on regional biases in India, providing a resource for understanding and addressing these biases in natural language processing applications.
  • The dataset reflects growing recognition that regional biases matter in AI and NLP. It parallels other studies of bias in speech recognition and language models, part of a broader effort to make AI technologies more inclusive and accurate across diverse linguistic and cultural settings.
— via World Pulse Now AI Editorial System

