IndRegBias: A Dataset for Studying Indian Regional Biases in English and Code-Mixed Social Media Comments
- IndRegBias is a new dataset for studying regional biases in English and code-mixed comments from social media platforms such as Reddit and YouTube, with a focus on Indian contexts. It comprises 25,000 comments that reflect regional biases, a category that has been explored less than other social biases such as gender and race.
- The IndRegBias dataset is significant because it aims to fill a gap in research on regional biases in India, providing a resource for understanding and addressing these biases in natural language processing applications.
- The work reflects growing recognition of regional bias as a concern in AI and NLP. It parallels other studies that address bias in speech recognition and language models, and points to a broader trend toward inclusivity and accuracy in AI technologies across diverse linguistic and cultural landscapes.
— via World Pulse Now AI Editorial System






