JinaVDR: New Visual Document Retrieval Benchmark with 95 Tasks in 20 Languages

April 24, 2026

TL;DR

JinaVDR is a new benchmark for visual document retrieval, covering 95 tasks across 20 languages.
It evaluates models on visually complex, multilingual documents that combine intricate layouts with elements such as graphs, charts, and tables.
The benchmark incorporates diverse domains such as historic documents, legal texts, and scientific papers.
JinaVDR was constructed from four sources: repurposed existing datasets, manual annotation, synthetic generation, and crawled data.
Existing benchmarks like MTEB are primarily text-based, while ViDoRe and MIEB have limitations in language diversity and document complexity.
Benchmarking results indicate that many recent embedding models struggle with JinaVDR's tasks, with Jina-embeddings-v4 showing superior performance due to its multi-vector capability.
JinaVDR is being integrated into the MTEB framework to increase adoption and ease of use.
Limitations include subsampling datasets to normalize their size and applying quality filtering, both done for practical usability and evaluation quality.
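
The summary credits multi-vector capability for the strongest results but does not say how multi-vector scoring works. The sketch below assumes it refers to late-interaction (MaxSim) scoring, a common approach in visual document retrieval: each query token vector is matched to its most similar page vector, and those maxima are summed. The function names and the toy two-dimensional vectors are hypothetical and chosen for illustration only; they are not taken from the JinaVDR code or from any model's API.

```python
import math

def normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def single_vector_score(query_vec, doc_vec):
    """Cosine similarity between one query vector and one document vector."""
    return dot(normalize(query_vec), normalize(doc_vec))

def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for every query vector take the best-matching
    document vector (by cosine similarity) and sum those maxima."""
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    return sum(max(dot(qv, dv) for dv in d) for qv in q)

# Toy data (assumed, not real embeddings): a query with two token vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
# page_a keeps separate vectors for separate regions (e.g. a chart and a table).
page_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# page_b has only one blended vector.
page_b = [[1.0, 1.0]]

score_a = maxsim_score(query, page_a)
score_b = maxsim_score(query, page_b)
print(score_a, score_b)
```

In this toy case the page that keeps one vector per region scores higher than the page with a single blended vector, because each query token finds an exact match. That is one plausible reason multi-vector models might do well on pages that mix charts, tables, and text, though the summary itself does not give the explanation.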
