India's Finance Minister Nirmala Sitharaman has flagged Anthropic's Claude Mythos as an unprecedented cybersecurity threat to the banking sector and directed IBA-led hardening with CERT-In intelligence sharing. Mythos has already autonomously discovered thousands of zero-days across every major OS and browser at trivial cost. This is the preemptive playbook every CISO needs to run before similar capabilities reach the offensive ecosystem.
By Sunil Yadav | Security Research, May 2026
The Moment Everything Changed
For most of my career as a security practitioner, the phrase "we are running out of time to patch" was a slogan. After spending the last several weeks reading through Anthropic's Mythos Preview disclosure, the OSS-Fuzz benchmark numbers, and the wave of advisories that quietly hit critical infrastructure between mid-April and the end of the month, I no longer think of it as a slogan. It is a planning constraint. The arithmetic of vulnerability management has changed underneath us, and most boards are still reviewing security programs designed for a world where attackers needed weeks of human effort to weaponize a flaw.
On April 7, Anthropic announced Claude Mythos Preview together with Project Glasswing, a closed consortium giving twelve founding partners — AWS, Apple, Broadcom, Cisco, Google, the Linux Foundation, Microsoft, NVIDIA, JPMorgan Chase, and others — early access to a frontier model deemed too dangerous for general release. Inside Anthropic's lab, Mythos identified thousands of zero-day vulnerabilities across every major operating system and every major browser. Some of those flaws had been sitting in production code for decades.
A signed-integer overflow in the OpenBSD TCP SACK implementation that had survived 27 years. An H.264 slice-counter mismatch in FFmpeg that survived 16 years. CVE-2026-4747, an unauthenticated NFS RCE in FreeBSD that yields root from anywhere on the internet, found and exploited fully autonomously by Mythos with no human steering. Anthropic's own engineers, who by their own admission have no formal security training, walked away with working exploits overnight.
The problem is not Mythos itself. Mythos is locked behind Glasswing. The problem is what Mythos proves: the unit economics of zero-day discovery just collapsed.
Why This Is Not Just Another AI Hype Cycle
I have been skeptical of AI-in-security claims for a long time. Most of what I tested between 2023 and 2025 was either glorified static analysis or LLMs hallucinating CVEs that did not exist. What changed with Mythos is verifiable in published numbers, not marketing copy.
Against the OSS-Fuzz corpus, Mythos achieved 595 tier-1 and tier-2 crashes and 10 tier-5 results — full control-flow hijack — versus a single tier-3 result from the previous-generation model. On Firefox, Mythos produced 181 working exploits where the prior generation produced two. The 1,000-repository sweep that surfaced thousands of high and critical issues cost Anthropic under twenty thousand dollars in compute. That is the line item that should worry every CISO.
The before-and-after picture is stark when laid out in operational terms — what an attacker needed yesterday versus what an AI agent can sustain today.
| Capability | Human-Driven Attacker (Pre-2026) | AI-Augmented Attacker (Today) |
|---|---|---|
| Recon across an unknown enterprise edge | Days to weeks of manual fingerprinting | Hours of automated, structured enumeration |
| Reading 200K LOC for vuln patterns | Senior reverse engineer, weeks of effort | Inference workload, single afternoon |
| Working exploit for memory corruption | Concentrated expertise, rare globally | Reproducible at low cost across many bugs |
| Chaining 4+ medium findings into business impact | Rare and bespoke | Single reasoning pass, repeated at scale |
| Cost to sweep 1,000 repositories | Not feasible without a large team | Under 20,000 USD in compute |
There are essentially two timelines now. In the first, models with Mythos-class capability remain inside Glasswing, defenders use them to harden critical software during the 90-day public-reporting window Anthropic promised, and offensive proliferation lags by six to eighteen months. In the second timeline, capability leaks faster — through open-weights replication, jailbreaks, fine-tuning of smaller models, or theft from a partner. Neither timeline is comforting, because the offensive ecosystem does not need Mythos itself. It only needs proof that the technique works.
PentestGPT, a fully open-source LLM-driven pentest agent, has been around since 2023 and is being forked aggressively. The capability gap between frontier defensive models and what an opportunistic threat cluster can stitch together with smaller open-weight models is closing every quarter.
When the Finance Minister Calls It Unprecedented
This is no longer a research-paper conversation. On April 25, 2026, Finance Minister Nirmala Sitharaman convened India's banking leadership to assess Mythos exposure. She called the threat unprecedented, noted that not much is known about how it will play out, and directed the Indian Banks' Association to lead a sector-wide hardening exercise — real-time threat-intelligence sharing with CERT-In, hardened core banking IT, and customer data as immediate priorities. MeitY was tasked with tracking the capability globally. The US government held parallel talks with Wall Street banks the same week.
I have spent the last week fielding the same question from CISOs at Indian banks, NBFCs, insurers, and fintechs: what does the regulator now expect us to have done? When the Finance Minister and MeitY are moving in lockstep on a single threat, the boardroom conversation has already moved past whether to act. It is now about how fast.
The asymmetry, importantly, still favors defenders who move first. The same Mythos-class capability that worries regulators is, right now, being used by Glasswing partners to clear decades of accumulated vulnerability debt; the OpenBSD, FFmpeg, FreeBSD, and browser patches landing this month are proof of that. Indian banks and large enterprises are not locked out of this leverage either: the Claude API is generally available on Amazon Bedrock and Google Cloud's Vertex AI, and a governed defensive AI program can start finding and closing the same kinds of issues internally before opportunistic adversaries get cheaper open-weight equivalents. The window is open. It will not stay open forever.
What AI Agents Actually Compress
It is worth being precise about what changes and what does not. AI agents are not magic. They do not invent attack classes. They compress three things very aggressively, and that compression is what reshapes the defender's calculus.
- Discovery time. A model can read a 200,000-line codebase, recognize a vulnerable pattern, and write a triggering input in an afternoon. The same work used to take a senior reverse engineer weeks.
- Exploitation cost. Working exploits — KASLR bypasses, JIT heap sprays, multi-packet ROP chains — used to require expertise concentrated in a few hundred people globally. Mythos showed that exploitation is now an inference workload.
- Chaining and reach. This is the one defenders consistently underestimate. An exposed API, predictable object IDs, weak token validation, and a misconfigured storage bucket might rate medium-medium-medium-low individually. An agent stitches them into mass data exfiltration in a single reasoning pass.
What does not change: the defender's fundamentals. Inventory still matters. Object-level authorization still matters. Least privilege still matters. The difference is that the cost of getting any of these wrong has gone up by an order of magnitude, and the patching window has shrunk to whatever the slowest dependency in your supply chain can deliver.
The AI Acceleration Factor: A Risk-Model Adjustment
Most enterprise risk frameworks I have audited still treat exploit difficulty as a static input. That assumption is broken. Over the last few weeks I have been working with security teams to introduce what we are calling an AI Acceleration Factor — a multiplier that re-weights inherent risk based on how amenable a vulnerability is to autonomous discovery and exploitation.
| Tier | Practical Definition | Examples | Risk Multiplier |
|---|---|---|---|
| Low | Requires rare physical access or deep, undocumented domain expertise | Hardware fault injection, custom proprietary protocol bugs | 1.0x |
| Medium | AI assists analysis or payload generation but human steering is required | Logic flaws in custom business workflows, niche framework abuse | 1.5x to 2x |
| High | Models can discover, validate, and exploit at machine speed | Memory corruption in widely deployed parsers, IDOR on documented APIs, deserialization in popular frameworks | 3x to 5x |
A CVSS-7.5 issue in your file-upload parser that historically would have sat in a backlog for three months is, under this framing, a CVSS-7.5 issue with a 4x multiplier — which means it competes for engineering attention with what your old model treated as criticals. The teams that have started running this exercise consistently find that 15 to 25 percent of their medium backlog is actually high-acceleration, and most of that lives in API surface, document parsers, and identity flows.
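To make the multiplier operational rather than rhetorical, it helps to fold it into whatever backlog-scoring script the team already runs. The sketch below is illustrative only: the Finding record, the example backlog entries, and the exact multiplier values are assumptions, with the tier mapping mirroring the table above.

```python
from dataclasses import dataclass

# Tier-to-multiplier mapping from the table above; the exact values are
# judgment calls to tune against your own threat model.
ACCELERATION_MULTIPLIER = {"low": 1.0, "medium": 1.75, "high": 4.0}

@dataclass
class Finding:
    title: str
    cvss: float          # CVSS base score as the scanner reports it
    acceleration: str    # "low" | "medium" | "high", assigned during triage

def adjusted_priority(f: Finding) -> float:
    """Acceleration-adjusted priority. Deliberately not capped at 10:
    this is a ranking signal for the backlog, not a CVSS score."""
    return f.cvss * ACCELERATION_MULTIPLIER[f.acceleration]

backlog = [
    Finding("File-upload parser memory corruption", 7.5, "high"),
    Finding("Fault injection against on-prem HSM", 8.0, "low"),
    Finding("Logic flaw in loan-approval workflow", 6.5, "medium"),
]

# Rank by acceleration-adjusted priority instead of raw CVSS.
for f in sorted(backlog, key=adjusted_priority, reverse=True):
    print(f"{adjusted_priority(f):5.1f}  (CVSS {f.cvss})  {f.title}")
```

Under raw CVSS the HSM issue leads; under the adjusted ranking the 7.5 parser bug does, which is exactly the kind of re-prioritization the teams above keep finding buried in their medium backlogs.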
Where Defenders Lose Ground First
When I look at where the asymmetry is sharpest, five exposure dimensions consistently emerge. Microsoft's April 22 advisory called out the same five, which lined up almost exactly with what I had seen in a banking-sector engagement earlier in the month. The common thread: all five are discovery-bound problems.
| Exposure Dimension | Why AI Agents Win Here | Defender Hardening Move |
|---|---|---|
| Internet-facing assets | Forgotten subdomains, shadow apps, unmanaged edges fall easily to automated enumeration | Continuous external attack surface discovery and ownership tagging |
| Open-source dependencies | Transitive paths and unpatched runtimes are easy for a model to graph and validate | Live SBOM reconciled against runtime, dependency health beyond CVEs |
| Custom source code | Parsers, deserialization, auth and crypto logic yield to model-driven pattern recognition | Reachability-aware SAST, fuzzing in CI, threat-model parsers explicitly |
| Patching latency | Every additional day past disclosure is now an exploitable day | SLA tightening on internet-facing and KEV-listed issues, virtual patching where feasible |
| Identity and IAM paths | Token, role, and trust chains are graphs that models traverse efficiently | Attack-path analysis on identity, ruthless least privilege, hardened CI/CD |
The defender's first weakness is not that they cannot patch; it is that they do not know what they have. Continuous discovery across the external attack surface, the API inventory, and software composition is the only foundation the rest of the program can stand on. This is exactly where exposure-management tooling stops being a nice-to-have. A platform like SecureNexus CTEM, correlating signals from SecureNexus API POS for the API landscape, SecureNexus SOVA for software composition and the emerging AI-BOM dimension, and SecureNexus CSPM for cloud posture, lets teams reason about paths, not isolated tickets. Without that correlation, an AI-augmented attacker will see structure that a human-driven SOC simply cannot enumerate fast enough.
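To be concrete about what "reasoning about paths" means mechanically, here is a toy illustration, emphatically not the SecureNexus implementation or its API: correlated findings become nodes and edges in a graph, and a path query surfaces chains of individually modest issues that terminate at a crown-jewel asset. The node names are invented, and the sketch assumes the open-source networkx library.

```python
import networkx as nx

# Toy exposure graph: nodes are assets or findings, an edge means "an attacker
# who holds the source can plausibly reach the target". Severities that look
# modest in isolation become serious once a complete path to crown jewels exists.
g = nx.DiGraph()
g.add_edge("internet", "forgotten-subdomain", severity="info")
g.add_edge("forgotten-subdomain", "undocumented-api", severity="medium")
g.add_edge("undocumented-api", "weak-token-validation", severity="medium")
g.add_edge("weak-token-validation", "customer-records-bucket", severity="medium")

# Enumerate every path from the internet to a crown-jewel asset.
for path in nx.all_simple_paths(g, "internet", "customer-records-bucket"):
    severities = [g.edges[u, v]["severity"] for u, v in zip(path, path[1:])]
    print(" -> ".join(path))
    print("  individual severities:", ", ".join(severities))
```

Run over a real correlated inventory, the same query is what turns three mediums and an informational into one critical attack path.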
The Defensive AI Question Most Boards Are Not Asking
Every CISO I have spoken to in the last month has asked the same question in different words: how aggressively should we use AI agents internally? The honest answer is that you should be using them as aggressively as your governance allows, and your governance probably needs an upgrade. Defensive AI is the only realistic path to working through the legacy vulnerability debt that has been crushing remediation queues for a decade. Mythos-class capability turns a backlog that previously needed a hundred reverse engineers into something a small, well-instrumented team can actually finish.
The risk is not Mythos-class behavior leaking out — Glasswing is locked down. The risk is internal misuse, prompt-injection-driven exfiltration, scope creep into production data, and lack of audit trails. Every team I have seen succeed with defensive AI has put six controls in place before letting an agent loose.
- Written scope and approval. Targets, data classes, and time windows defined before any run.
- Production data boundary. No sensitive data flows into models or vendors that have not been approved by legal and compliance.
- Human review of generated exploits. Nothing fires against production without a senior practitioner signing off.
- Full prompt and tool-use logging. Every action the agent took, with timestamps, retained for audit.
- Rate limits and a kill switch. The agent can be stopped instantly, and it cannot fan out to thousands of targets without an explicit gate.
- Legal and compliance sign-off. Especially for anything touching customer-impacting systems or regulated data.
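Several of these controls can be wired directly around the agent's tool calls. The sketch below is a shape, not a finished framework: the run_tool callable, file names, scope entries, and rate limit are all illustrative assumptions.

```python
import json, time
from pathlib import Path

AUDIT_LOG = Path("agent_audit.jsonl")   # append-only tool-use log
KILL_SWITCH = Path("agent.stop")        # touch this file to halt the agent
APPROVED_SCOPE = {"staging.example.internal", "sbom-service.example.internal"}
MAX_ACTIONS_PER_HOUR = 100

_action_times: list[float] = []

def guarded_tool_call(tool: str, target: str, run_tool) -> str:
    """Wrap every agent tool invocation with the controls listed above:
    kill switch, scope check, rate limit, and full audit logging."""
    if KILL_SWITCH.exists():
        raise RuntimeError("kill switch engaged; agent halted")
    if target not in APPROVED_SCOPE:
        raise PermissionError(f"{target} is outside the approved scope")

    now = time.time()
    _action_times[:] = [t for t in _action_times if now - t < 3600]
    if len(_action_times) >= MAX_ACTIONS_PER_HOUR:
        raise RuntimeError("hourly action budget exhausted; explicit gate required")
    _action_times.append(now)

    result = run_tool(tool, target)

    # Retain every action the agent took, with timestamps, for audit.
    with AUDIT_LOG.open("a") as fh:
        fh.write(json.dumps({"ts": now, "tool": tool,
                             "target": target, "result_len": len(result)}) + "\n")
    return result
```

Human review of generated exploits and legal sign-off stay outside the code path by design; those are people controls, not guardrails an agent should be able to satisfy for itself.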
Skip any one of these, and the AI agent that was supposed to find your shadow APIs becomes the audit finding that ends your next regulator interaction.
A Tightened 90-Day Plan
The 90-day plans I wrote a year ago were aspirational. The version I am writing for clients in May 2026 is built around the assumption that an AI-augmented adversary is already running reconnaissance against the enterprise edge.
Days 1 to 30: Visibility shock test
Refresh the full external attack surface, pull every API into a single inventory (public, partner, undocumented, internal-but-exposed), reconcile the SBOM against the live runtime, and overlay the CISA KEV catalog. Pay special attention to admin panels, remote access, and vendor portals. Most teams discover during this phase that their inventory is 20 to 40 percent off.
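The SBOM-versus-runtime reconciliation and the KEV overlay can start as something very small. The sketch below assumes a CycloneDX SBOM export, a pip-freeze-style runtime inventory, a downloaded copy of the CISA KEV feed, and a scanner export; all file names and the scanner record shape are illustrative, and the KEV field names should be verified against the copy you pull.

```python
import json
from pathlib import Path

def load_json(path: str):
    return json.loads(Path(path).read_text())

# Illustrative file names; substitute your own exports.
sbom = {c["name"]: c.get("version", "")
        for c in load_json("sbom.cdx.json")["components"]}          # CycloneDX export
runtime = dict(line.strip().split("==")
               for line in Path("runtime_packages.txt").read_text().splitlines()
               if "==" in line)                                      # e.g. pip freeze output

# Drift in either direction is a finding: declared-but-absent suggests a stale
# SBOM; running-but-undeclared is an unmanaged dependency.
stale = sbom.keys() - runtime.keys()
unmanaged = runtime.keys() - sbom.keys()
version_drift = {n for n in sbom.keys() & runtime.keys() if sbom[n] != runtime[n]}

# Overlay the CISA KEV catalog. Field names ("vulnerabilities", "cveID") are
# as published in the JSON feed at the time of writing; verify your copy.
kev = {v["cveID"] for v in load_json("known_exploited_vulnerabilities.json")["vulnerabilities"]}
scanner = load_json("scanner_findings.json")   # assumed shape: [{"cve": ..., "package": ...}]
kev_hits = [f for f in scanner if f.get("cve") in kev]

print(f"stale SBOM entries: {len(stale)}, unmanaged packages: {len(unmanaged)}, "
      f"version drift: {len(version_drift)}, findings on KEV-listed CVEs: {len(kev_hits)}")
```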
Days 31 to 60: Exploitability validation
Stop scoring on CVSS alone. Run reachability and chaining analysis against crown-jewel applications. Fuzz the parsers. Walk the IAM graph for token, role, and trust chains. Validate that your high-severity findings are actually exploitable in your environment — and validate that your mediums are not actually high-acceleration items hiding in the backlog.
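For the fuzz-the-parsers step, a coverage-guided harness can be as small as the sketch below, which assumes Google's open-source Atheris fuzzer for Python; parse_document and its module path are hypothetical stand-ins for your own parser entry point.

```python
import sys
import atheris

# Instrument the parser under test so Atheris gets coverage feedback.
with atheris.instrument_imports():
    from myapp.documents import parse_document  # hypothetical parser entry point

def TestOneInput(data: bytes) -> None:
    """Atheris calls this with mutated inputs; any uncaught exception,
    hang, or crash is surfaced as a finding."""
    try:
        parse_document(data)
    except ValueError:
        pass  # clean rejection of malformed input is expected, not a bug

atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()
```

Wiring the same harness into CI with a fixed time budget per run is usually enough to keep it from rotting.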
Days 61 to 90: Operational rewiring
Move risk reporting from CVSS-by-asset to attack-path-by-business-function. Stand up continuous validation in the form your team can actually sustain — purple teaming, fuzz integration into CI, scheduled ASM sweeps. Publish an executive dashboard that answers a single question: what is the longest time a critical exposure has gone unaddressed in our environment, and is that number going down each week? Add an AI governance policy if you do not already have one.
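The single dashboard question reduces to one small query over the open-findings backlog. A sketch, with illustrative records and field names:

```python
from datetime import datetime, timezone

# Illustrative records; in practice this comes from your VM or CTEM backend.
open_criticals = [
    {"id": "EXP-1041", "opened": "2026-02-11"},
    {"id": "EXP-1187", "opened": "2026-04-02"},
]

def oldest_open_age_days(findings, today=None) -> int:
    """The one number the executive dashboard publishes weekly: age in days
    of the longest-unaddressed critical exposure. The weekly trend of this
    number is the signal, not any single reading."""
    today = today or datetime.now(timezone.utc).date()
    opened = [datetime.strptime(f["opened"], "%Y-%m-%d").date() for f in findings]
    return max((today - d).days for d in opened) if opened else 0

print(oldest_open_age_days(open_criticals))
```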
Sector Realities
For the BFSI teams I work with most often, the priorities are unmistakable, and they now come with explicit regulatory weight after the IBA-led directive: customer-facing APIs, payment flows, document upload paths in onboarding, mobile app backends, and vendor integrations. Object-level authorization is the single highest-leverage control. Most of the breaches I have investigated in this sector in the last 18 months would have been blunted by it. Banks should also be wiring real-time threat-intelligence pipes into CERT-In and the IBA peer network — that is no longer a maturity-model line item, it is an instruction.
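Because object-level authorization keeps coming up as the highest-leverage control, it is worth showing how small the check actually is. The sketch below uses a generic handler with a hypothetical data-access call; the point is that ownership is verified server-side on every object fetch, and that missing records and foreign records return the same response.

```python
class AuthorizationError(Exception):
    pass

def get_statement(requesting_user_id: str, statement_id: str, db) -> dict:
    """Object-level authorization: authentication alone is not enough.
    The record must belong to the caller, checked server-side on every
    fetch, never inferred from IDs the client supplies."""
    statement = db.fetch_statement(statement_id)   # hypothetical data-access call
    if statement is None:
        raise AuthorizationError("not found")      # avoid leaking existence
    if statement["owner_id"] != requesting_user_id:
        raise AuthorizationError("not found")      # same response either way
    return statement
```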
For critical infrastructure operators, the urgency is around OT/IT boundaries, vendor remote access, and legacy systems that no AI agent should ever be able to reach unsupervised. Test the backups. Run the tabletop. Assume the patching window is shorter than your maintenance window. The threat clusters operating in this space have already been observed integrating AI into reconnaissance and credential-harvesting stages of the kill chain.
What I Tell Boards Now
When I am asked the boardroom question — are we ready for this? — I no longer answer with maturity scores. I answer with five questions, in this order.
- Do we have continuous, reconciled visibility of every internet-facing asset, API, dependency, and identity path?
- Can we measure exploitability — not just severity — in hours, not weeks?
- Have we modeled AI acceleration in our risk framework, or are we still scoring like it is 2023?
- Are we using defensive AI under governance, or are we waiting for our adversaries to set the pace?
- How fast can we contain a chained, AI-discovered attack path once it is identified?
If any of those answers is hand-wavy, the program is operating on borrowed time.
Closing Thoughts
Mythos is not the doomsday story. It is the proof point. Frontier AI agents capable of finding and chaining vulnerabilities now exist. Access controls will hold for a while, but the offensive ecosystem is already learning from the same playbook with smaller, cheaper models. The organizations I see pulling ahead are the ones that have stopped treating exposure management as a quarterly activity and started treating it as a real-time operational signal — fed by continuous discovery, prioritized by attack-path reasoning, and governed tightly enough that defensive AI is an accelerant rather than a liability. The winners will not be the ones with the biggest legacy controls budget. They will be the ones who put AI to work on their own attack surface first.
The models are here. The agents are hunting. The work is to make sure your environment is the boring, well-instrumented, well-correlated one they pass over in favor of an easier target.
About the Author
Sunil Yadav is the Founder and Head of Cybersecurity at X-Biz Techventures, with nearly two decades of experience across application security, red teaming, attack surface management, cloud security, API security, CTEM, threat intelligence, vulnerability management, and digital supply chain security.
