December 3, 2025

What We Learned at SIGIR 2025

SIGIR (Special Interest Group on Information Retrieval) is a top-tier information retrieval conference bringing together researchers, developers, industry experts, and educators from across the globe to share the latest ground-breaking research. Jina AI was at this year’s conference in Padua in July, presenting our work on late chunking at the Robust IR Workshop.

TL;DR

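  • Jina AI presented its 'late chunking' method at SIGIR 2025's Robust IR workshop. It improves text retrieval by chunking after the embedding model has encoded the full text rather than before, so each chunk's embedding keeps context from the surrounding document.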
  • SIGIR 2025 highlighted research in reranking, sparse retrieval, and LLM integration for information retrieval.
  • Keynotes included Stephen Robertson on BM25 and Iryna Gurevych on AI in scientific research.
  • CLIP-AdaM was presented for open-set 3D object retrieval using multi-view CLIP models.
  • A framework for 'compound retrieval systems' was proposed to optimize the combination of multiple rerankers.
  • RE-AdaptIR suggests using weight differences from fine-tuned models to improve embeddings for new domains.
  • Evaluations of LLM-based relevance judgment methods indicate binary judgments and pairwise comparisons perform best.
  • Research explored the interplay of LLMs as rankers, judges, and assistants in IR evaluation, noting potential biases.
  • A distinction was made between relevance and usefulness in search results, with LLMs showing alignment with human judgments on usefulness.
  • Limitations of LLM-based relevance assessment were discussed, including insufficient evidence, vulnerability to manipulation, bias, and overfitting risks.
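
The late chunking idea from the first bullet can be sketched in a few lines. This is a minimal illustration, not Jina AI's implementation: the function name `late_chunk` is hypothetical, the token embeddings are random stand-ins for what a long-context embedding model would return from a single pass over the whole document, and mean pooling over each chunk's tokens is assumed as the pooling step.

```python
import numpy as np


def late_chunk(token_embeddings: np.ndarray, spans: list[tuple[int, int]]) -> np.ndarray:
    """Mean-pool token embeddings over each half-open [start, end) token span.

    token_embeddings: (num_tokens, dim) array, assumed to come from ONE
    forward pass over the full document, so every token vector already
    reflects context from outside its own chunk.
    """
    pooled = []
    for start, end in spans:
        if not 0 <= start < end <= len(token_embeddings):
            raise ValueError(f"invalid span: ({start}, {end})")
        pooled.append(token_embeddings[start:end].mean(axis=0))
    return np.stack(pooled)


# Stand-in for real model output (assumption): 10 tokens, 4 dimensions.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(10, 4))

# Chunk boundaries expressed as token index ranges (hypothetical values).
spans = [(0, 4), (4, 7), (7, 10)]

chunk_embeddings = late_chunk(token_embeddings, spans)
print(chunk_embeddings.shape)  # (3, 4): one vector per chunk
```

The contrast with conventional chunking is the order of operations: there, the text is split first and each chunk is embedded in isolation, so a chunk cannot draw on context from the rest of the document.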
