tech
February 23, 2026
Detecting and Preventing Distillation Attacks
We have identified industrial-scale campaigns by three AI laboratories—DeepSeek, Moonshot, and MiniMax—to illicitly extract Claude’s capabilities in order to improve their own models. These labs generated over 16 million exchanges with Claude through approximately 24,000 fraudulent accounts, in violation of our terms of service and regional access restrictions.
TL;DR
- Three AI labs—DeepSeek, Moonshot, and MiniMax—generated over 16 million exchanges with Claude through approximately 24,000 fraudulent accounts to illicitly extract its capabilities.
- This "distillation" technique trains a less capable model on the outputs of a stronger one, allowing competitors to gain capabilities rapidly and at low cost.
- Illicitly distilled models lack safeguards, creating national security risks: they enable the proliferation of dangerous capabilities and could be used by authoritarian governments for cyber operations, disinformation, and surveillance.
- Distillation attacks undermine export controls designed to maintain AI leadership, as they allow foreign labs to bypass restrictions by acquiring capabilities through illicit means.
- DeepSeek focused on reasoning, grading tasks, and censorship-safe alternatives; Moonshot AI targeted agentic reasoning, tool use, coding, and computer vision; MiniMax focused on agentic coding and tool use.
- The labs circumvented access restrictions using commercial proxy services and "hydra cluster" architectures of fraudulent accounts.
- Anthropic is investing in detection, intelligence sharing, access controls, and countermeasures (the second sketch below illustrates one generic detection idea), but stresses that a coordinated response across the AI industry, cloud providers, and policymakers is essential.
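For context on the technique itself, here is a minimal, hypothetical sketch of sequence-level distillation in PyTorch. The toy TinyLM models and every name below are illustrative stand-ins, not anything from the campaigns described; the property it demonstrates is that the attacker needs only sampled text from the teacher, exactly what an API caller receives, and never its weights or logits.

```python
# Minimal sketch of sequence-level knowledge distillation: a weaker
# "student" language model is trained on text sampled from a stronger
# "teacher". Toy models stand in for real LLMs (purely illustrative).
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, SEQ_LEN = 32, 16

class TinyLM(nn.Module):
    """A toy next-token predictor (stand-in for a real LLM)."""
    def __init__(self, dim):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, VOCAB)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)  # next-token logits at each position

@torch.no_grad()
def sample_from_teacher(teacher, batch, seq_len=SEQ_LEN):
    """Query the teacher and keep only its sampled text,
    exactly the visibility an API caller has."""
    tokens = torch.zeros(batch, 1, dtype=torch.long)  # fixed start token
    for _ in range(seq_len):
        logits = teacher(tokens)[:, -1]
        next_tok = torch.multinomial(F.softmax(logits, dim=-1), 1)
        tokens = torch.cat([tokens, next_tok], dim=1)
    return tokens

teacher = TinyLM(dim=128)  # "stronger" model (frozen black box)
student = TinyLM(dim=32)   # "weaker" model being distilled
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(200):
    # 1) Harvest teacher outputs; 2) fit the student to them with
    # ordinary next-token cross-entropy. No teacher internals needed.
    seqs = sample_from_teacher(teacher, batch=64)
    logits = student(seqs[:, :-1])
    loss = F.cross_entropy(logits.reshape(-1, VOCAB), seqs[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

This is why API access alone is enough to mount the attack: the harvested exchanges serve directly as supervised fine-tuning data, and the cost is only that of the API calls rather than of training a strong model from scratch.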
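On the defensive side, one generic heuristic a provider could use against a "hydra cluster" is to fingerprint request patterns and flag fingerprints shared by implausibly many accounts. The sketch below is purely illustrative and assumed; the function names, thresholds, and normalization rules are invented for this example and do not describe Anthropic's actual detection systems.

```python
# Hypothetical illustration: surface coordinated account clusters by
# grouping accounts that send near-identical templated prompts.
# All names and thresholds here are invented for illustration.
import hashlib
import re
from collections import defaultdict

def fingerprint(prompt: str) -> str:
    """Normalize a prompt into a template fingerprint: lowercase, mask
    digits, and collapse whitespace so scripted variants hash together."""
    template = re.sub(r"\d+", "<NUM>", prompt.lower())
    template = re.sub(r"\s+", " ", template).strip()
    return hashlib.sha256(template.encode()).hexdigest()[:16]

def find_clusters(requests, min_accounts=50):
    """Group accounts by shared prompt fingerprint and flag any
    fingerprint used by suspiciously many distinct accounts."""
    accounts_by_fp = defaultdict(set)
    for account_id, prompt in requests:
        accounts_by_fp[fingerprint(prompt)].add(account_id)
    return {fp: accts for fp, accts in accounts_by_fp.items()
            if len(accts) >= min_accounts}

# Example: 60 accounts sending the same numbered template trip the flag,
# while an ordinary one-off request does not.
reqs = [(f"acct-{i}", f"Grade answer #{i}: explain step by step")
        for i in range(60)]
reqs.append(("acct-x", "What's the weather like today?"))
print(find_clusters(reqs, min_accounts=50))
```

Real systems would combine many more signals (traffic volume, proxy infrastructure, payment patterns), but the core idea of clustering accounts by shared behavioral fingerprints is why 24,000 accounts behaving as one operation is detectable at all.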