OpenAI has introduced Codex Security, a new application security agent designed to identify complex vulnerabilities in software systems. The tool is currently rolling out in research preview for ChatGPT Pro, Enterprise, Business, and Edu customers through the Codex web interface, with free usage available for the first month.

Codex Security analyses software repositories in depth, detects vulnerabilities, validates them, and proposes fixes. The company says the system focuses on surfacing high-confidence security issues, so development teams spend less time triaging low-impact alerts.

Security teams today face a paradox. While AI tools promise to automate vulnerability detection, many existing systems generate large numbers of false positives or low-severity findings, forcing engineers to review alerts that may not matter. At the same time, the rise of AI-assisted coding is accelerating software development cycles, turning security review into a major bottleneck.

Codex Security attempts to address both challenges by combining agentic reasoning from OpenAI’s frontier models with automated validation. The goal is to help teams focus on vulnerabilities with real-world impact rather than sorting through thousands of alerts.

The system was previously known internally as Aardvark and began last year as a private beta with a small group of customers.

Context-Driven Scanning And Automated Fixes

A key difference in Codex Security’s approach lies in how it builds system context before scanning for vulnerabilities. Once configured, the agent analyses the repository structure and generates a project-specific threat model that captures what the system does, which components it trusts, and where potential attack surfaces may exist. Security teams can edit this model to ensure it reflects their architecture and risk posture.
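As a rough illustration of what such a threat model might capture, the sketch below represents it as a plain, editable data structure. The class name `ThreatModel`, its fields, and the example values are assumptions made for illustration only; OpenAI has not published the format Codex Security actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class ThreatModel:
    """Hypothetical, editable description of a project's security context."""
    purpose: str                                            # what the system does
    trusted_components: set[str] = field(default_factory=set)
    attack_surfaces: set[str] = field(default_factory=set)

    def is_untrusted_entry(self, component: str) -> bool:
        # A component deserves extra scrutiny when it is exposed to
        # attackers and has not been marked as trusted.
        return (component in self.attack_surfaces
                and component not in self.trusted_components)


# Illustrative values only, not taken from any real project.
model = ThreatModel(
    purpose="Multi-tenant billing API",
    trusted_components={"internal-auth-service"},
    attack_surfaces={"public-rest-api", "webhook-receiver"},
)

# A security team editing the model to reflect its own architecture.
model.attack_surfaces.add("admin-console")

print(model.is_untrusted_entry("public-rest-api"))        # True
print(model.is_untrusted_entry("internal-auth-service"))  # False
```

Keeping the model as explicit data, rather than something implicit in the scanner, is what would make it practical for a team to review and correct it.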

Using this context, Codex Security scans the codebase to identify vulnerabilities and categorise them based on expected real-world impact. When possible, the system validates findings in sandboxed environments, helping distinguish genuine risks from noise.

The platform can also generate proof-of-concept demonstrations for vulnerabilities when it has access to environments tailored to the project, giving security teams stronger evidence and a clearer path to remediation.

Beyond detection, Codex Security proposes patches aligned with system architecture and behaviour, helping reduce the risk of introducing new bugs during remediation. Teams can filter findings by severity or relevance to prioritise the issues with the highest security impact.
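The kind of severity filtering described above can be sketched in a few lines. This is a minimal example under stated assumptions: the `Severity` levels, the `Finding` fields (including the `validated` flag), the `prioritise` helper, and the sample findings are all invented for illustration and do not reflect Codex Security's real interface or output.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical severity scale; higher means more serious.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class Finding:
    title: str
    severity: Severity
    validated: bool  # hypothetical flag: reproduced in a sandbox


def prioritise(findings, minimum=Severity.HIGH, validated_only=False):
    """Return findings at or above `minimum`, most severe first."""
    kept = [
        f for f in findings
        if f.severity >= minimum and (f.validated or not validated_only)
    ]
    return sorted(kept, key=lambda f: f.severity, reverse=True)


# Invented sample data.
findings = [
    Finding("Verbose error message", Severity.LOW, validated=False),
    Finding("SSRF in URL fetcher", Severity.CRITICAL, validated=True),
    Finding("Missing rate limit", Severity.MEDIUM, validated=False),
    Finding("Cross-tenant auth bypass", Severity.HIGH, validated=True),
]

for f in prioritise(findings):
    print(f.severity.name, f.title)
# CRITICAL SSRF in URL fetcher
# HIGH Cross-tenant auth bypass
```

Raising or lowering `minimum` is the simplest way to trade review workload against coverage.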

The system also improves over time through feedback. When teams adjust the criticality of a finding, the agent refines its threat model and future scans, gradually improving precision.

Early Results And Expanding Role In Open Source Security

OpenAI says early internal deployments helped identify several critical issues, including a server-side request forgery (SSRF) vulnerability and a cross-tenant authentication flaw, both of which were patched within hours.

Over time, improvements in the system have reduced the noise generated during scans. In one case, scans on the same repositories showed an 84% reduction in noise since the initial rollout. The rate of findings with over-reported severity has dropped by more than 90%, while false positives have declined by over 50% across repositories.

Over the past 30 days, Codex Security analysed more than 1.2 million commits across external repositories in its beta cohort, identifying 792 critical findings and 10,561 high-severity findings. Critical issues appeared in fewer than 0.1% of commits. Companies participating in the early access program have already begun integrating the tool into their development workflows.
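The reported rate can be sanity-checked from the published counts. The short calculation below takes 1.2 million as the commit total; because the article says "more than" 1.2 million, the resulting percentages are upper bounds. It also treats findings per commit as a stand-in for the share of affected commits, which is an approximation, since a single commit could carry more than one finding.

```python
commits = 1_200_000   # "more than 1.2 million", so this is a lower bound
critical = 792
high = 10_561

critical_rate = critical / commits * 100   # findings per 100 commits
high_rate = high / commits * 100

print(f"critical: {critical_rate:.3f}% of commits")  # critical: 0.066% of commits
print(f"high:     {high_rate:.3f}% of commits")      # high:     0.880% of commits
```

The critical figure works out to roughly one finding per 1,500 commits, consistent with the "fewer than 0.1%" claim.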

“As a company laser-focused on product security, NETGEAR was pleased to join the early access program, and the results exceeded expectations. Codex Security integrated effortlessly into our robust security development environment, strengthening the pace and depth of our review processes. Its findings were impressively clear and comprehensive, often giving the sense that an experienced product security researcher was working alongside us,” said Chandan Nandakumaraiah, Head of Product Security, NETGEAR, and Member of the CVE Board.

Beyond enterprise deployments, OpenAI is also using Codex Security to strengthen open-source ecosystems. The system has been used to scan open-source repositories and report high-impact vulnerabilities to maintainers across widely used projects, including OpenSSH, GnuTLS, GOGS, Thorium, libssh, PHP, and Chromium. So far, 14 CVEs have been assigned, with two involving dual reporting.

OpenAI has also launched Codex for OSS, a program offering open-source maintainers free ChatGPT Pro and Plus accounts, code review tools, and access to Codex Security. Projects such as vLLM have already used the system to detect and patch vulnerabilities during routine development.

As AI agents increasingly shape software development, tools that combine deep system context with automated validation could become central to modern DevSecOps workflows, helping teams identify and fix critical vulnerabilities earlier in the development cycle.