Introducing IndQA

December 8, 2025

TL;DR

IndQA is a new benchmark designed to evaluate AI models' understanding and reasoning in Indian languages.
It addresses two limitations of existing benchmarks: many are saturated, and most focus on translation or multiple-choice tasks.
IndQA comprises 2,278 questions across 12 languages and 10 cultural domains, developed with 261 domain experts from India.
The benchmark evaluates culturally nuanced, reasoning-heavy tasks that are difficult for current evaluations to capture.
IndQA uses a rubric-based grading approach, with questions and criteria created by native-level speakers and subject matter experts.
Questions were filtered adversarially, keeping only those where top AI models failed to produce acceptable answers, to preserve headroom for progress.
The benchmark aims to measure improvement over time within a model family or configuration, rather than to serve as a leaderboard that compares performance across languages.
The release of IndQA is intended to inspire the creation of similar benchmarks for other languages and cultural domains poorly covered by existing AI benchmarks.
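
As a rough illustration of the rubric-based grading and adversarial filtering described above, here is a minimal Python sketch. It is not the IndQA implementation: the per-criterion point weights, the 0.5 acceptability threshold, the rule that every reference model must fall short, and the names Criterion, rubric_score, and keep_question are all assumptions made for this example. The summary does not say how criteria are judged, so the sketch takes the met/not-met judgments as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One rubric line written by an expert (hypothetical structure)."""
    description: str
    points: float  # assumed weight; the summary does not specify weighting


def rubric_score(criteria, met):
    """Return the fraction of available rubric points an answer earned.

    `met` holds one boolean per criterion, in the same order.
    """
    if len(criteria) != len(met):
        raise ValueError("need exactly one judgment per criterion")
    total = sum(c.points for c in criteria)
    if total <= 0:
        raise ValueError("rubric must carry positive total points")
    earned = sum(c.points for c, ok in zip(criteria, met) if ok)
    return earned / total


def keep_question(model_scores, acceptable=0.5):
    """Adversarial filter: keep a question only if every reference model
    scored below the (assumed) acceptability threshold."""
    return all(score < acceptable for score in model_scores)


# Illustrative rubric, not taken from the benchmark.
rubric = [
    Criterion("States the key fact correctly", 3.0),
    Criterion("Explains the cultural context", 2.0),
    Criterion("Answers in the language of the question", 1.0),
]
score = rubric_score(rubric, [True, False, True])  # 4 of 6 points earned
```

Under these assumptions, a question on which the reference models score 0.2 and 0.4 would be kept, while one on which any model reaches 0.7 would be dropped, which is how the filter preserves headroom.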
