Jina Embeddings v4: Universal Embeddings for Multimodal Multilingual Retrieval
Today we're releasing jina-embeddings-v4, our new 3.8 billion parameter universal embedding model for text and images. It includes a set of task-specific LoRA adapters that optimize performance for the most popular retrieval tasks, including query-document retrieval, semantic matching, and code search. jina-embeddings-v4 achieves state-of-the-art retrieval performance on multimodal and multilingual tasks across the MTEB, MMTEB, CoIR, LongEmbed, STS, Jina-VDR, CLIP, and ViDoRe benchmarks, with particular strength in processing visually rich content such as tables, charts, diagrams, and mixtures of them. The model supports both single-vector and multi-vector embeddings.
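To make the last distinction concrete: single-vector embeddings are typically compared with cosine similarity, while multi-vector embeddings are typically scored with ColBERT-style late interaction (MaxSim). The sketch below illustrates both scoring schemes on random NumPy arrays; the arrays merely stand in for model outputs, and the dimensions are arbitrary, not those of jina-embeddings-v4.

```python
import numpy as np

def cosine_score(q, d):
    # Single-vector scoring: one pooled embedding per text,
    # compared with cosine similarity.
    return float(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d)))

def maxsim_score(q_tokens, d_tokens):
    # Multi-vector (late-interaction) scoring: each query token
    # embedding is matched against its best-matching document token
    # embedding, and the per-token maxima are summed (ColBERT-style MaxSim).
    sims = q_tokens @ d_tokens.T  # shape: (num_q_tokens, num_d_tokens)
    return float(sims.max(axis=1).sum())

# Toy data standing in for embeddings (dim 8 here for illustration only).
rng = np.random.default_rng(0)
q_single, d_single = rng.normal(size=8), rng.normal(size=8)
q_multi = rng.normal(size=(4, 8))   # 4 query "tokens"
d_multi = rng.normal(size=(16, 8))  # 16 document "tokens"

print(cosine_score(q_single, d_single))
print(maxsim_score(q_multi, d_multi))
```

Single-vector output keeps indexing cheap (one vector per document), while multi-vector output trades storage for finer-grained token-level matching.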