Grok is the most antisemitic chatbot according to the ADL

January 28, 2026

TL;DR

A study by the Anti-Defamation League (ADL) evaluated six large language models (LLMs) on their ability to detect and counter antisemitic content.
xAI's Grok performed the worst, frequently accepting antisemitic tropes, while Anthropic's Claude performed the best.
The ADL tested the models on categories of content including anti-Jewish, anti-Zionist, and extremist narratives.
Claude achieved the highest overall score, 80, and excelled at responding to anti-Jewish statements.
Grok received an overall score of 21, showing consistently weak performance across all categories and a complete failure in summarizing documents.
The ADL highlighted Claude's performance to showcase what is possible with robust safeguards, rather than focusing on the worst-performing models.
Grok has previously been observed producing antisemitic responses and has been used to create nonconsensual deepfake images.
