December 3, 2025
Jina Code Embeddings: SOTA Code Retrieval at 0.5B and 1.5B
Today we're releasing jina-code-embeddings, a new suite of code embedding models in two sizes (0.5B and 1.5B parameters), along with 1-4 bit GGUF quantizations for both. Built on the latest code generation LLMs, these models achieve state-of-the-art retrieval performance despite their compact size. They support five retrieval tasks (nl2code, code2code, code2nl, code2completions, and qa) across more than 15 programming languages, including Python, JavaScript, Java, C++, C#, Go, Rust, TypeScript, SQL, MATLAB, R, Swift, Kotlin, HTML/CSS, PHP, Ruby, Scala, Perl, and Shell.

TL;DR
- Release of jina-code-embeddings models in 0.5B and 1.5B parameter sizes.
- Includes 1-4 bit GGUF quantizations for both model sizes.
- Built on the latest code generation LLMs for state-of-the-art retrieval performance.
- Supports five retrieval tasks: nl2code, code2code, code2nl, code2completions, and qa.
- Supports more than 15 programming languages, including Python, JavaScript, Java, and C++.
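
To make the retrieval tasks concrete, here is a minimal sketch of nl2code retrieval: a natural-language query and a set of code snippets are each mapped to a vector, and snippets are ranked by cosine similarity to the query. The vectors below are hand-written placeholders, not output from jina-code-embeddings, and the helper names `cosine` and `rank` are hypothetical; in practice the vectors would come from the model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, corpus):
    """Return corpus keys ordered from most to least similar to the query."""
    scored = [(cosine(query_vec, vec), snippet) for snippet, vec in corpus.items()]
    return [snippet for _, snippet in sorted(scored, reverse=True)]

# Assumption: toy 3-dimensional vectors standing in for real embeddings.
corpus = {
    "def read_json(path): ...": [0.9, 0.1, 0.0],
    "def quicksort(xs): ...": [0.1, 0.9, 0.2],
    "SELECT * FROM users": [0.0, 0.2, 0.9],
}

# Assumption: placeholder vector for the query "load a JSON file".
query_vec = [0.8, 0.2, 0.1]

ranking = rank(query_vec, corpus)
print(ranking[0])
```

The other tasks (code2code, code2nl, and so on) follow the same ranking pattern; only what is embedded as the query and as the candidates changes.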