December 19, 2025

Gemma Scope 2: Helping the AI Safety Community Deepen Understanding of Complex Language Model Behavior

Announcing Gemma Scope 2, a comprehensive, open suite of interpretability tools for the entire Gemma 3 family to accelerate AI safety research.

TL;DR

  • Gemma Scope 2 is a new, open suite of interpretability tools for Gemma 3 models.
  • It aims to make the internal decision-making processes of LLMs more transparent.
  • The tools enable researchers to trace potential risks and debug emergent behaviors.
  • The announcement describes it as the largest open-source release of interpretability tools by an AI lab.
  • Gemma Scope 2 includes upgraded tools such as skip-transcoders and cross-layer transcoders, and uses the Matryoshka training technique.
  • Specialized tools are available for analyzing chatbot behaviors such as jailbreaks and refusal mechanisms.
  • An interactive demo and various resources are available for users to explore Gemma Scope 2.

