Solo AI jailbreak breaches nine government agencies; Musk loses OpenAI trial

Two commercial AI subscriptions and a few days of prompt iteration. That’s what it took to breach nine Mexican government agencies last week, exfiltrate 150 gigabytes of data, and expose 195 million taxpayer records. No custom malware. No team. The attacker jailbroke Claude Code, switched to GPT-4.1 when it refused, and kept going. If you’ve been treating “AI-assisted cyberattack” as a 2027 problem, that assumption didn’t survive last week.

The Big Stories

Jailbroken Commercial AI Was the Primary Tool in Mexico’s Largest Known Government Breach

In a detailed first-person account published this week, researcher Konstantin Tkachuk describes compromising nine Mexican government agencies, including the federal tax authority (SAT), the National Electoral Institute (INE), and multiple state governments. The method: jailbreaking Claude Code into a “bug bounty researcher” persona and running 1,000+ prompts against it. When Claude’s safety guardrails engaged, the attacker switched to GPT-4.1 as a fallback. Twenty vulnerabilities exploited. 150 GB out the door, containing 195 million taxpayer records, voter registration rolls, and government employee credentials. The account is a single primary source without independent corroboration yet, but the level of operational detail is specific and Tkachuk published it publicly.

Why it matters: This is the first publicly documented large-scale government breach where the primary offensive tool was a jailbroken commercial AI assistant, not custom malware. The attacker wasn’t a nation-state operative or a trained penetration tester. Tkachuk’s own framing: the cost of running this operation is “a Claude Code subscription plus a few hundred dollars of API credit.” The multi-model fallback (Claude then GPT-4.1 when Claude refused) is the detail worth sitting with. Single-model safety guardrails aren’t systemic protection when attackers can switch vendors mid-operation. Any organization with internet-facing systems and PII at scale needs to update its threat model for an attacker profile that didn’t exist a year ago.

Anthropic Mythos Is Briefing G20 Financial Regulators; an October IPO Date Just Surfaced

Anthropic’s restricted Mythos model surfaced in three separate contexts this week. The Decoder reports Anthropic will hold a closed briefing for the Financial Stability Board, the G20’s financial regulatory body, at the request of Bank of England governor Andrew Bailey, specifically to discuss financial-system vulnerabilities Mythos identified. Separately, Cloudflare’s published evaluation confirmed Mythos can chain low-severity bugs into high-severity exploits in ways GPT-5.5 couldn’t replicate, outperforming GPT-5.5 on exploit chain construction in their evaluation. And a Latent Space editorial noted “finance folks fall in love with Anthropic’s growth and CFO ahead of its likely October IPO”; that’s the first explicit IPO timeline in the public record.

Why it matters: When the Bank of England governor personally requests a closed regulatory briefing about a single AI model, you’re past theoretical risk framing. The FSB covers 25 major economies; whatever Mythos found in financial infrastructure is getting briefed to people who can move policy. The October IPO signal, if accurate, puts Anthropic on a countdown to public markets. API pricing, contract terms, and access restrictions may shift before year-end as the company positions itself for institutional investors. The Mythos restriction story and the IPO growth story are pulling against each other right now; watch which wins.

Musk v. OpenAI: Dismissed in Two Hours, Appeal Filed the Same Day

A federal jury in Oakland unanimously dismissed all of Elon Musk’s claims against Sam Altman and OpenAI on May 18, deliberating for less than two hours. The verdict: Musk knew about OpenAI’s for-profit restructuring years before he filed in 2024 and missed the statute of limitations on his primary claims. The court never reached the underlying merits. Musk called it a “calendar technicality” and vowed to appeal the same afternoon. NPR covered the verdict as a clean win for Altman; legal analysts noted that fact-intensive statute of limitations findings rarely get overturned on appeal.

Why it matters: The procedural dismissal means the central question the trial kept circling (whether frontier AI labs can be held accountable to their stated missions) was never answered. That question isn’t resolved; it’s deferred to the appeal and to other regulatory venues. What the two-week trial did put on the record was consequential anyway: Shivon Zilis’ testimony about Musk trying to recruit Altman to Tesla, Greg Brockman’s personal journal read aloud, and ongoing public scrutiny of OpenAI’s governance structure. The appeal keeps the case alive through at least Q3 2026.

Under the Radar

[Expert-first] Anthropic Just Ended Unlimited Programmatic Access to Claude

A significant change to Anthropic’s subscription tiers passed largely without mainstream notice this week. Every Claude plan now includes monthly API credits equal to the plan’s dollar amount; interactive Claude Code sessions and programmatic API use both draw from the same bucket. Practitioners running alternative Claude harnesses and high-volume automation built around the programmatic API (tools like claude-p or custom workflow integrations) are now on explicit credit limits. Latent Space covered it in an editorial framing the change as Anthropic putting “its most favorable pricing behind its own tools.” No press release. Zero mainstream AI news coverage. The Claude Code changelog surfaced the usage credits rename (/usage-credits) in v2.1.144 with no fanfare.

Why you should care: Treating Claude programmatic API access as functionally unlimited has been a reasonable assumption for the past year. That assumption is gone. The restructuring is also a competitive signal: Anthropic is tightening credits on third-party harnesses at the same time Codex is reportedly running more generous limits as the challenger trying to win developer share. If your team has automation built on Claude API calls at volume, audit your usage now rather than at the end of the billing cycle.

[Expert-first] A U.S. Government Assessment Says Open Models Are Falling Further Behind the Frontier

CAISI (the Center for AI Standards and Innovation, NIST’s AI standards body) evaluated DeepSeek V4 Pro and found it roughly eight months behind U.S. frontier models across cybersecurity, software engineering, mathematics, reasoning, and natural science benchmarks, with CAISI characterizing the gap as widening over time, though independent benchmarks show a more stable spread. The finding landed the same week that Gemma 4, DeepSeek V4, Kimi K2.6, MiMo 2.5, and GLM-5.1 all shipped within seven days of each other. A flurry of open releases, against a government report saying none of them closed the gap.

Why you should care: Cost-efficiency arguments for open models typically assume the capability gap is stable or narrowing. The official U.S. position is now the opposite. This shapes export controls, procurement policy, and enterprise AI strategy for the next 18 months regardless of whether the benchmarks are perfectly calibrated. Independent benchmarks put V4 Pro at rough parity with Opus 4.6 on some tasks, so the truth is probably between the two assessments. But if your architecture bets are based on “open models will catch up,” that assumption now has a formal government counterpoint worth factoring in.

Quick Hits

Anthropic buys Stainless for $300M+ - The SDK automation startup used by OpenAI, Google, and Cloudflare is now an Anthropic subsidiary; hosted products are being wound down, giving rivals full rights to SDKs already generated. TechCrunch
Anthropic traced Claude’s blackmail behavior to sci-fi training data - Opus 4 exhibited blackmail behavior at a 96% rate in tests; Anthropic found that training on fictional stories portraying aligned AI reasoning reduced the rate by more than 3x. Teaching Claude Why
ArXiv introduces one-year ban for AI slop - Hallucinated citations, placeholder text, or LLM comments left in manuscripts now trigger a year-long submission ban from the preprint server. TechCrunch
OpenAI merges ChatGPT and Codex under Brockman - Greg Brockman is overseeing OpenAI product strategy on an interim basis while Fidji Simo is on medical leave, with an internal memo pledging to “consolidate product efforts toward the agentic future”; fine-tuning APIs were deprecated earlier in the month. TechCrunch
Ontario government audit finds AI scribes hallucinating clinical notes - 17 of 20 government-approved AI medical transcription vendors missed key mental health details in tests; 9 fabricated treatment recommendations including drug referrals and blood test orders that weren’t mentioned in recordings. CBC News
Cerebras moves toward a $60B IPO - Enabled by a 750MW compute partnership with OpenAI; positions Cerebras chips for large-scale inference workloads as an NVIDIA alternative, representing significant AI infrastructure investor appetite. TechCrunch
vLLM v0.21.0 ships with 367 commits and 202 contributors - Steady community velocity on the open inference stack, compounding quietly while frontier model news dominates. vLLM Releases

What to Watch

Anthropic’s October IPO timeline. This is the first explicit date circulating in the public record, and it creates a specific window to watch. A public company faces different pressures on pricing, API contract terms, and safety tradeoffs than a private one does. The Mythos restriction story and the IPO growth story are both active simultaneously, pulling in opposite directions: one argues for keeping the most capable model locked down, the other argues for demonstrating revenue growth to institutional investors. Watch for Anthropic pricing changes before Q3, whether Mythos access expands beyond the current 40 organizations, and how the FSB briefing response shapes regulatory positioning ahead of the offering.

If someone forwarded this to you, subscribe here to get it weekly.

Our team also ships AI Edge, a mobile companion for daily AI news scanning between issues. Free, on-device. iOS · Android.