Strengthening cyber resilience as AI capabilities advance

December 12, 2025

TL;DR

AI models' cybersecurity capabilities have improved significantly, to the point where they could help develop advanced exploits or assist with intrusions.
OpenAI is strengthening models for defensive cybersecurity and developing tools for tasks like code auditing and vulnerability patching.
OpenAI uses a defense-in-depth approach, layering safety measures to mitigate risks and prevent misuse.
Models are trained to refuse harmful requests while remaining helpful for legitimate uses.
Detection systems monitor for malicious activity, with enforcement combining automated and human review.
Expert red teaming is employed to evaluate and improve safety mitigations.
Initiatives like a trusted access program and the Aardvark agent are designed to support cyberdefense.
The Frontier Risk Council will advise on balancing capability and potential misuse.
Collaboration through the Frontier Model Forum aims to develop a shared understanding of threat models within the industry.
OpenAI plans further initiatives and grants to foster innovation in defensive cybersecurity.
