January 29, 2026

Inside OpenAI’s in-house data agent

Data powers how systems learn, how products evolve, and how companies make decisions. But getting answers quickly, correctly, and with the right context is often harder than it should be. To make this easier as OpenAI scales, we built a bespoke in-house AI data agent that explores and reasons over our own platform.

TL;DR

  • OpenAI developed an internal-only AI data agent to simplify data exploration and analysis for its employees.
  • The agent uses OpenAI's own tools, including GPT-5.2 and Codex, to reason over the company's 600 petabytes of data.
  • It allows employees to ask complex questions in natural language and receive insights in minutes, rather than days.
  • The agent incorporates multiple layers of context, including metadata, human annotations, code enrichment, institutional knowledge, and memory, to ensure accurate results.
  • It integrates with existing workflows and tools like Slack and ChatGPT for seamless accessibility.
  • OpenAI uses its Evals API to systematically measure and maintain the agent's response quality and security.
  • Key lessons learned include simplifying the agent's toolset, guiding it toward goals rather than prescribing paths, and leveraging code-level definitions for deeper data understanding.
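
To make the idea of layered context concrete, here is a minimal sketch of how notes from several context layers might be merged into a single prompt for one table. It is only an illustration under stated assumptions: the layer names are taken from the list above, while `TableContext`, `build_prompt`, the table name, and the prompt layout are hypothetical and do not describe the agent's actual implementation.

```python
from dataclasses import dataclass, field

# Layer names mirror the summary above. Their order here is an assumption
# made for illustration, not something the article specifies.
LAYER_ORDER = [
    "metadata",
    "human_annotations",
    "code_enrichment",
    "institutional_knowledge",
    "memory",
]


@dataclass
class TableContext:
    """Hypothetical container: notes about one table, grouped by layer."""
    table: str
    layers: dict[str, list[str]] = field(default_factory=dict)

    def add(self, layer: str, note: str) -> None:
        # Reject layers outside the known set so typos fail loudly.
        if layer not in LAYER_ORDER:
            raise ValueError(f"unknown context layer: {layer}")
        self.layers.setdefault(layer, []).append(note)


def build_prompt(question: str, ctx: TableContext) -> str:
    """Render the question plus every non-empty layer in a fixed order."""
    lines = [f"Question: {question}", f"Table: {ctx.table}"]
    for layer in LAYER_ORDER:
        notes = ctx.layers.get(layer, [])
        if notes:
            lines.append(f"[{layer}]")
            lines.extend(f"- {note}" for note in notes)
    return "\n".join(lines)


if __name__ == "__main__":
    ctx = TableContext(table="example_events")  # hypothetical table name
    ctx.add("memory", "Earlier answer filtered out test accounts.")
    ctx.add("metadata", "Column event_ts is stored in UTC.")
    print(build_prompt("How many events were logged last week?", ctx))
```

Rendering the layers in a fixed order, regardless of the order in which notes were added, keeps the prompt stable between runs, which makes it easier to compare responses when evaluating quality.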