All Digests
Week 16, 2026 · 6 stories

The Week AI Found Every Bug

Anthropic's Claude Mythos discovered thousands of zero-day vulnerabilities and they won't let you use it. OpenAI bought a podcast. Google shipped a 2M-token context model. And the three biggest rivals in AI teamed up against China.

01.Claude Mythos & Project Glasswing

SecurityModels

Anthropic built the most capable AI model ever evaluated. Then they decided not to release it.

Claude Mythos is, by every available benchmark, the strongest model Anthropic has produced. But what made headlines wasn't the benchmarks. It was what happened when they pointed it at real software. Mythos autonomously discovered thousands of zero-day vulnerabilities across every major operating system, every major web browser, and a range of critical infrastructure software.

1,000s
Zero-Days Found
50
Organizations with Access
17 yrs
Oldest Bug Found
0
Public API Access

The FreeBSD moment

The flagship demonstration: Mythos fully autonomously identified and then exploited a 17-year-old remote code execution vulnerability in FreeBSD (CVE-2026-4747) that allows anyone to gain root on a machine running NFS. Starting from an unauthenticated user. Anywhere on the internet. That bug sat in production systems for nearly two decades. Mythos found it in minutes.

Project Glasswing

Instead of a public release, Anthropic created Project Glasswing, a gated program where 50 organizations, including AWS, Apple, Cisco, CrowdStrike, Google, JPMorgan Chase, Microsoft, and NVIDIA, get access to Mythos exclusively for defensive security work. The deal: find vulnerabilities in your own infrastructure, share learnings with the wider industry, and report what you discover.

Bruce Schneier wrote a detailed analysis calling the restriction "necessary." Simon Willison agreed. Foreign Affairs Forum called it "the most dangerous AI ever built and the emergency plan to control it."

What nobody is saying out loud

If Anthropic can build this, so can everyone else. The restricted release buys time, but it doesn't change the fundamental dynamic: offense is now cheaper than defense. A model that finds zero-days at this rate in friendly hands is terrifying. In unfriendly hands, it's a different conversation entirely.

thinkidiot take: This might be the most consequential AI release of 2026, precisely because it wasn't released. The fact that Anthropic chose to restrict access, eating the revenue, tells you how seriously they take the capability. The security industry just got a preview of what the next 12 months look like: AI-powered vulnerability discovery at a pace no human team can match. If you maintain critical infrastructure, get in line for Glasswing access. If you don't, make sure someone in your supply chain has.


02.The Anti-Distillation Alliance

GeopoliticsIndustry

OpenAI, Anthropic, and Google... working together. That's how bad the problem is.

The three biggest US AI labs announced they're sharing intelligence through the Frontier Model Forum to detect and counter adversarial distillation, where competitors extract the behavior of frontier models via the API and use those outputs to train cheaper imitations.

16M
Fraudulent Queries
24K
Fake Accounts
3
Labs Named
$160K
Cost to Distill

How it works

Someone signs up for thousands of API accounts. They feed the frontier model a carefully designed sequence of prompts, probing how the model "thinks," not just what it answers. They collect the outputs and use that corpus to fine-tune a much cheaper open-weights base model. The result: a model that behaves like Claude or GPT-5 at a fraction of the training cost.
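To make the pipeline concrete, here is a minimal, hypothetical sketch of the data-collection step: probing a teacher model and saving (prompt, output) pairs in the chat-style JSONL format commonly used for supervised fine-tuning. The `query_teacher` function is a stand-in for a frontier-model API call; nothing here uses a real vendor API.

```python
import json
from io import StringIO

# Hypothetical stand-in for a frontier-model API call. A real
# distillation run would hit a paid endpoint millions of times.
def query_teacher(prompt: str) -> str:
    return f"[teacher answer to: {prompt}]"

def build_distillation_corpus(prompts, out):
    """Collect (prompt, teacher output) pairs as JSONL chat records,
    the shape a student model is typically fine-tuned on."""
    for prompt in prompts:
        record = {
            "messages": [
                {"role": "user", "content": prompt},
                {"role": "assistant", "content": query_teacher(prompt)},
            ]
        }
        out.write(json.dumps(record) + "\n")

probes = [
    "Explain TLS handshakes step by step.",
    "Walk through your reasoning on this logic puzzle: ...",
]
buf = StringIO()
build_distillation_corpus(probes, buf)  # one JSONL record per probe
```

The attack described above is essentially this loop at scale, spread across thousands of accounts to stay under per-account rate limits, which is why detection keys on account patterns rather than on any single query.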

Anthropic claims it detected over 16 million distillation queries routed through roughly 24,000 fraudulent accounts. Three Chinese AI firms were specifically named: DeepSeek, Moonshot AI, and MiniMax.

The economics are brutal

Training a frontier model costs hundreds of millions to low billions of dollars per generation. A well-executed distillation run? About $160,000. That's the asymmetry that forced three sworn rivals to cooperate. Until April 2026, the Frontier Model Forum focused on research and policy. Now it's an operational intelligence-sharing center.

thinkidiot take: The distillation problem is real, but the framing matters. "Adversarial distillation" sounds scary. "Learning from examples" is literally what these models do. The line between "legitimate API use" and "distillation attack" is going to be extremely hard to draw, legally and technically. This alliance solves the symptom (fraudulent accounts) but not the root cause (the knowledge is in the outputs).


03.OpenAI Buys a Podcast

BusinessMedia

OpenAI acquired TBPN (The Breakout Product Network), a daily tech show hosted by John Coogan and Jordi Hays, for a price in the "low hundreds of millions" according to the FT. TBPN generated $5 million in ad revenue in 2025 and is on track for $30 million in 2026.

Why would a $300 billion AI company buy a podcast? Because OpenAI is becoming an advertising company.

$100B
2030 Ad Revenue Target
$2.5B
2026 Ad Revenue Forecast
$100M
Ad ARR (6 weeks in)
$30M
TBPN 2026 Revenue

OpenAI introduced experimental ads in January 2026. Six weeks later, the pilot program hit a $100 million annual recurring revenue run rate. The company is now forecasting $2.5 billion in ad revenue for 2026 and $100 billion annually by 2030.

TBPN will be housed within OpenAI's strategy organization. The two founders, who have interviewed Zuckerberg, Nadella, and Altman himself, will presumably continue producing content. But now the company behind ChatGPT owns a media property that interviews the CEOs of its competitors.

thinkidiot take: Sam Altman looked at Google's business model and thought "yeah, that." The ad revenue numbers are staggering, especially at the ramp rate. But the TBPN acquisition is the weird one. It's a $5M-revenue podcast bought for hundreds of millions by a company that prints text for a living. Either OpenAI sees a distribution channel nobody else does, or this is the most expensive talent acqui-hire in media history.


04.Gemini 3.1 Ultra Ships

ModelsMultimodal

Google launched Gemini 3.1 Ultra, its most significant model release of the year. The headline number: a 2 million token context window that works natively across text, image, audio, and video.

That's 1,500+ pages of text, or hours of video, in a single session. Unlike prior Gemini versions, 3.1 Ultra was trained from the ground up to reason across all modalities simultaneously, rather than routing them through transcription intermediaries.
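A quick back-of-envelope check of the page claim, assuming a dense page holds roughly 1,300 tokens (a rough rule of thumb; actual counts vary by tokenizer and layout):

```python
# Rough sanity check of the "1,500+ pages" figure.
# TOKENS_PER_PAGE is an assumed rule of thumb, not a spec.
CONTEXT_TOKENS = 2_000_000
TOKENS_PER_PAGE = 1_300

pages = CONTEXT_TOKENS // TOKENS_PER_PAGE  # about 1,538 pages
```

At lighter page densities the figure climbs well past 2,000 pages, so "1,500+" is the conservative end of the estimate.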

Why it matters beyond context length

Most frontier models process images by converting them to text descriptions internally. Gemini 3.1 Ultra processes the raw pixels and audio waveforms alongside the text tokens. This means it can notice things that would be lost in transcription: the expression on a speaker's face, background sounds in a video, the layout and visual hierarchy of a document.

Gemini 3.1 Pro (the cheaper version) leads 13 of 16 major benchmarks and ties with GPT-5.4 Pro on the Artificial Analysis Intelligence Index at roughly one-third the API cost. Gemini has now reached 750 million users across its consumer products.

thinkidiot take: The 2M context window is impressive, but the native multimodal reasoning is the real story. "Upload a 2-hour meeting recording and ask questions about it" is a real workflow that Gemini can do today. The context war is over. Everyone has enough. The next competition is about what you can do with that context. Google is betting on native multimodal. It might be right.


05.Agents Hit 66% Success

ResearchAgents

AI agents went from 12% to 66% success on real computer tasks, according to the latest round of benchmarks. They can now navigate software, use tools, and complete multi-step workflows at a level approaching human competence.

This isn't one model on one benchmark. It's a pattern across multiple agent frameworks and evaluation suites. The jump happened fast, in roughly 6 months, driven by better tool use, longer context windows, and improved planning capabilities.

What changed

Three things converged: frontier models got better at following multi-step instructions (GPT-5.4 Thinking scored 75% on OSWorld-Verified, a 27-point jump). Context windows got long enough to hold an entire workflow. And frameworks like OpenClaw, Claude Code, and Cursor gave agents persistent access to real computer environments.

Every major model release in April 2026 emphasized agentic capabilities. This is not coincidental. The companies are building what the market is buying.

thinkidiot take: 66% sounds impressive until you realize that means the agent fails one out of three times. On trivial tasks, 66% is fine. On "deploy my code to production" or "send this email to all customers"? You want 99.9%. The capability is real. The reliability is not. That gap is the entire agent opportunity for the next 12 months.
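The reliability gap compounds fast. A minimal illustration, under the simplifying assumption that steps fail independently (real failures often correlate, but the direction of the effect is the same):

```python
# Per-step success compounds across a workflow. Independence is
# assumed here as a simplification.
def workflow_success(per_step: float, steps: int) -> float:
    """Probability an entire multi-step workflow completes."""
    return per_step ** steps

agent_5_step = workflow_success(0.66, 5)    # about 12.5%
target_5_step = workflow_success(0.999, 5)  # about 99.5%
```

At 66% per step, a five-step workflow finishes barely one time in eight; at 99.9% per step, it finishes 99.5% of the time. That is the distance between a demo and a deployment.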


06.Quick Hits

PwC: 20% of companies capture 75% of AI gains. PwC's 2026 AI Performance Study found that a small group of companies is pulling sharply ahead in financial returns from AI. The leading companies are focused on growth, not just productivity. Everyone else is still running pilots.

California SB 53 compliance begins. The first enforceable US regulatory framework for frontier AI took effect January 1, 2026. Five to eight companies (OpenAI, Anthropic, Google, Meta, Microsoft) must now publish safety frameworks, report incidents, and protect whistleblowers. First enforcement reviews are underway.

Meta deploys MTIA 400 chips. Meta's custom Training and Inference Accelerators are now running in production data centers. MTIA 450 and 500 variants are slated for mass deployment by 2027. The big question: can Meta's silicon close the gap with NVIDIA's H200s?

AI energy use can drop 100x. A paper covered by ScienceDaily describes a new approach combining neural networks with symbolic reasoning that cuts AI energy consumption by up to 100x while actually improving accuracy. Early days, but the energy problem may have an algorithmic solution, not just a hardware one.

Global Quantum + AI Challenge launches. The 2026 Global Quantum and AI Challenge opened April 16, targeting practical enterprise use cases at the intersection of quantum computing and AI. The money is following the convergence.

Gemini hits 750M users. Google's AI assistant crossed 750 million monthly active users across its consumer products, making it the most widely deployed frontier AI by user count.



Join the Idiots

New lab every Sunday. No spam, unsubscribe anytime.