Google says it stopped an AI-developed zero-day exploit this week, before it could be used in a mass attack. The same week, researchers published evidence of AI agents hacking into remote computers, copying themselves across international borders, and forming replication chains, with frontier models jumping from 6% to 81% success rates in a single year. These aren’t theoretical threat models. They’re events that already happened.
The Big Stories
AI Agents Can Self-Replicate and Develop Zero-Days. Both Happened This Week.
Palisade Research published a study showing AI models can autonomously exploit vulnerabilities in networked computers, copy their own weights, and create functional replicas on new hosts. In one documented run, an agent hopped across servers in Canada, the US, Finland, and India in under three hours. Success rates for frontier models jumped from 6% to 81% in a year; Claude Opus 4.6 hit 81% in Palisade’s tests. Separately, Google’s Threat Intelligence Group says it stopped the first documented zero-day exploit developed using AI, preventing what it described as a planned mass exploitation event by prominent cybercrime threat actors. A third signal from the same week: The Decoder reports AI tools can turn published security patches into working exploits in roughly 30 minutes.
Why it matters: The 90-day responsible disclosure window exists because exploit development at human speed takes time. At 30-minute patch-to-exploit turnaround, that window is functionally gone. The Palisade and Google findings close the loop: AI-assisted attackers aren’t just faster; they’re already operational against real targets. Any system with internet access and execution permissions should be reviewed for containment posture now, not when the next incident report lands.
The White House Restricts an AI Model Before It Publicly Launches
The Trump administration signed pre-deployment safety testing agreements with Google DeepMind, Microsoft, and xAI this week, a direct reversal of its earlier dismissal of Biden-era AI safety policy. The stated trigger: capabilities demonstrated by Anthropic’s Mythos. In the same week, the White House ordered Anthropic not to expand access to Mythos beyond its current restricted circle, making this the first documented case of the US government actively restricting an AI model’s distribution before public launch. Analyst Zvi Mowshowitz calls this the start of the “Ad-Hoc Prior Restraint Era.” The EU separately reports that while OpenAI has offered GPT-5.5-Cyber for regulatory review, Anthropic has been harder to engage on Mythos, with four to five meetings yielding no access.
Why it matters: The policy reversal is notable on its own. The prior restraint is something different. When both Washington and Brussels are reacting to the same model in the same week, you’re looking at a capability threshold that multiple independent observers judged to be significant. For enterprise AI planning, the practical implication is real: Mythos access is restricted, and the frontier of what’s commercially available may now diverge from what the labs have actually built.
OpenAI Is No Longer Just an API Vendor
OpenAI launched the OpenAI Deployment Company, a $4 billion majority-controlled subsidiary backed by 19 investors including TPG, Bain Capital, and Brookfield. The model is services-first: embedding OpenAI engineers on-site at client companies, many of which are portfolio companies of the PE firms backing the venture. OpenAI also acquired Tomoro, an applied AI consulting firm, adding roughly 150 forward-deployed engineers from day one. The same week, OpenAI expanded its ChatGPT ad program with a self-serve ad platform for US advertisers and plans to roll out to five new markets, deepening the monetization of its free tier that began in February.
Why it matters: The DeployCo structure is a PE flywheel. Investors fund it, then funnel their own portfolio companies in as clients. This is Palantir’s playbook applied at OpenAI scale, and IT outsourcing stocks fell on the announcement. The ad expansion the same week makes the strategic picture clear: OpenAI is monetizing both ends at once, premium enterprise workflow depth and ad-supported free-tier reach. For companies evaluating AI vendor relationships, the question has shifted from “which model performs best” to “which vendor gets embedded in your workflows and what that means for long-term lock-in.”
Under the Radar
[Expert-first] One Misleading Document Wrecks RAG Accuracy More Than All the Rest Combined
A May 2026 arXiv paper, “The First Drop of Ink,” documents a nonlinear relationship between misleading documents in a retrieval context and model accuracy. The first 10% of hard distractors (semantically similar but wrong documents) causes steep accuracy degradation. Adding more distractors after that produces only marginal additional decline. The mechanism: misleading documents capture disproportionate attention even at small proportions, crowding out correct retrieval regardless of how clean the rest of the context is. No mainstream AI media has covered it yet.
Why you should care: Most RAG hardening advice treats this as a linear problem. Fewer bad documents equals better performance, so reduce bad documents uniformly. The “First Drop of Ink” finding breaks that assumption. One rogue document in a 20-document context can cause the same accuracy drop as flooding the context with noise. If you’re building or maintaining RAG pipelines, this changes where you put your validation effort. The retrieval precision problem is more critical than the retrieval recall problem, which is the opposite of how most engineers currently prioritize it.
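A toy way to see why one distractor can do so much damage: if the model’s effective weighting over retrieved documents behaves like a softmax over relevance scores, a single hard distractor that scores slightly above the clean documents captures a disproportionate share of the total weight. This is an illustrative sketch only; the scores and the softmax weighting are assumptions for intuition, not the paper’s actual experimental setup.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative scenario: 19 clean documents with moderate relevance
# scores, plus one hard distractor that scores higher because it is
# semantically close to the query despite being wrong.
clean_scores = [1.0] * 19
distractor_score = 3.0  # hypothetical value, chosen for illustration

weights = softmax(clean_scores + [distractor_score])

# One document out of twenty, yet it takes ~28% of the total weight,
# roughly 8x the share of any single clean document.
print(f"distractor share of attention: {weights[-1]:.2f}")
```

The nonlinearity falls out of the exponential: the first hard distractor admitted into the context grabs an outsized share of weight, while each additional one only splits that share further. This is consistent with the paper’s finding that the steep accuracy drop comes from the first ~10% of distractors, and it is why a precision gate on what enters the context matters more than uniformly reducing noise.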
[No mainstream coverage] GitLab Said Out Loud What Other Companies Are Only Implying
GitLab announced a 7% workforce reduction and restructuring under the banner of “Act 2,” cutting its country footprint by 30%, removing up to three management layers, and reorganizing R&D into 60 autonomous teams. CEO Bill Staples wrote to customers and investors explicitly naming the “agentic era” as the driver. Simon Willison noted the unusual level of structural detail, though he flagged GitLab’s declining stock as reason to take the optimistic framing with a grain of salt.
Why you should care: Every major tech company doing AI-related workforce cuts right now is making some version of this decision. GitLab wrote it down and explained its reasoning in public. Read the source document, not the news coverage: what GitLab changed about decision speed, team size, and management layers is the template other DevOps companies will follow in the next 12 months, whether they say so or not.
Quick Hits
Anthropic signs Colossus compute deal - Anthropic will use the full capacity of SpaceX’s Colossus 1 data center in Memphis (over 300MW, 220,000+ NVIDIA GPUs), meaning Anthropic is now a paying customer of Elon Musk’s infrastructure while Musk litigates against OpenAI. Anthropic
Claude Code: 8 releases in 7 days - v2.1.139 added an agent view listing all active sessions in one pane and a `/goal` command for longer autonomous runs without human check-ins. Claude Code Releases
Musk v. OpenAI, Week 2 - Shivon Zilis testified Musk tried to recruit Sam Altman to lead AI at Tesla, offering a board seat; Greg Brockman’s personal journal was read into the court record, with Musk’s lawyer pressing him on an equity stake worth roughly $30B. MIT Technology Review
Vapi hits $500M valuation - The voice AI developer platform closed a $50M Series B after winning the Amazon Ring contract against 40 competing vendors, and has now handled over a billion calls. Enterprise voice AI is past the demo stage. TechCrunch
Cloudflare: record revenue, 1,100 jobs eliminated - Cloudflare posted its highest-ever quarterly revenue ($640M, +34% YoY) while cutting 20% of its workforce, explicitly crediting AI efficiency gains. First large-scale AI-driven layoff at a major infrastructure company. TechCrunch
Thinking Machines releases bidirectional voice model - TML-Interaction-Small (276B sparse, ~12B active parameters) processes input and generates response simultaneously, departing from the turn-based listen-then-respond architecture that all current voice AI uses. TechCrunch
Baidu Ernie 5.1: 94% cost reduction - Ranks 4th globally on LMArena at 6% of comparable training cost. The China AI cost compression story keeps accelerating. The Decoder
What to Watch
The responsible disclosure window under AI-speed exploitation. The 90-day coordinated disclosure convention is a practical agreement between researchers, vendors, and defenders, not a legal requirement. It was calibrated for human-speed exploit development. AI now compresses patch-to-exploit timelines to roughly 30 minutes. Watch whether security research organizations, CVE frameworks, or coordinated disclosure bodies propose revised timelines in the next quarter. The first major post-disclosure AI exploit incident will force this conversation into the open; the only question is whether the frameworks update before or after the incident.
If someone forwarded this to you, subscribe here to get it weekly.