December 8, 2025
Introducing IndQA
A new benchmark for evaluating AI systems on Indian culture and languages.

TL;DR
- IndQA is a new benchmark designed to evaluate AI models' understanding and reasoning in Indian languages.
- It addresses limitations of existing benchmarks, which are saturated and focus mainly on translation or multiple-choice tasks.
- IndQA spans 2,278 questions across 12 languages and 10 cultural domains, developed with 261 domain experts from India.
- The benchmark evaluates culturally nuanced, reasoning-heavy tasks that are difficult for current evaluations to capture.
- IndQA uses a rubric-based grading approach, with questions and criteria created by native-level speakers and subject matter experts.
- Questions were adversarially filtered: only those that top AI models failed to answer acceptably were kept, which preserves headroom for progress.
- The benchmark aims to measure improvement over time within a model family or configuration, rather than serving as a direct language leaderboard.
- The release of IndQA is intended to inspire the creation of similar benchmarks for other languages and cultural domains poorly covered by existing AI benchmarks.
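
The rubric-based grading and adversarial filtering described in the list above can be illustrated with a minimal sketch. Everything here is hypothetical: the `Criterion` class, the per-criterion point weights, the `rubric_score` and `keep_question` helpers, and the 0.5 acceptance threshold are assumptions made for illustration, not the benchmark's actual implementation. The grader's per-criterion judgments are passed in as plain booleans rather than produced by a model.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One rubric line written by an expert (hypothetical structure)."""
    description: str
    points: float  # assumed weight reflecting the criterion's importance


def rubric_score(criteria: list[Criterion], met: list[bool]) -> float:
    """Return the fraction of available rubric points that an answer earned.

    `met[i]` says whether the grader judged criterion i to be satisfied.
    """
    if len(criteria) != len(met):
        raise ValueError("need exactly one judgment per criterion")
    total = sum(c.points for c in criteria)
    if total <= 0:
        raise ValueError("rubric must carry positive total points")
    earned = sum(c.points for c, ok in zip(criteria, met) if ok)
    return earned / total


def keep_question(model_scores: list[float], threshold: float = 0.5) -> bool:
    """Adversarial filter: keep a question only if every reference model
    scored below the (assumed) acceptance threshold."""
    return all(score < threshold for score in model_scores)


# Illustrative rubric for a single question.
rubric = [
    Criterion("Identifies the correct regional tradition", 3),
    Criterion("Explains its historical origin", 1),
    Criterion("Answers in the language of the question", 1),
]

score = rubric_score(rubric, [True, False, True])  # 4 of 5 points earned
retained = keep_question([0.2, 0.4])               # all models below threshold
```

Scoring each answer as a fraction of rubric points, rather than as right or wrong, is what lets a benchmark of this kind track gradual improvement within a model family over time.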