The real action this month wasn’t a single new model; it was the emergence of tokens, protocols, and agents as the actual bottlenecks and moats. MCP, routers, and memory layers are quietly becoming more important than which frontier model you pick, while agentic systems are already good enough to ship code, break things, and probe infrastructure faster than our governance can catch up.
The conversation about AGI timelines is mostly a distraction from the much messier story about economics, safety, and whether society will tolerate AI as infrastructure at all.
Key Events
/Anthropic’s Mythos helped researchers create the first public macOS M5 kernel memory‑corruption exploit in five days.
/Google’s Antigravity 2.0 built a working operating system from scratch in about 12 hours using Gemini agents.
/Open‑source DeepSeek R2 set a new coding SOTA with a 93.2 HumanEval score.
/MCP hit 97 million installs and landed native support in Android as the default protocol for cross‑app agent actions.
/Google’s Gemini 3.5 Flash became the speed leader on major benchmarks and is being wired directly into the main Google Search experience.
Report
Everyone’s still grading models; the interesting action moved to the glue between them. Protocols, tokens, and agents are quietly reshaping who actually has leverage in this ecosystem.
the protocol moat
MCP is turning into the de facto socket for agents, with 97 million installs, native wiring into Android for cross‑app actions, and new tunnels for Claude Managed Agents.
Hermes adds its own stack with a three‑tier memory system and GBrain knowledge layer, while OpenRouter and Osaurus let you hot‑swap models from Chinese stacks that now make up about 58% of usage.
At the same time, the Agent Memory Protocol is trying to standardize how agents remember, and Equibles shows how self‑hosted MCP servers can expose live financial data to local LLMs without touching the cloud.
The pattern is that whoever owns the protocol and memory layer, not the raw model, controls which tools agents can touch and how sticky they are once integrated, even as users complain about router churn, deprecated models, and buggy settings.
tokens are the new compute
Monthly token volume has hit 3.2 quadrillion. Some companies are burning through their AI budgets in just a few months. At the same time, enterprises report only 5% average GPU utilization while inference already eats 41% of AI spend.
Multi‑Token Prediction in llama.cpp and Qwen 3.6 is buying roughly 1.5–1.8× faster decoding in real tests. Qwen 3.6 27B can hit four‑digit prefill token rates on consumer GPUs.
The tradeoff is heavy: some MTP configs report an extra 22.5GB of VRAM use and up to 2.5× slower prompt processing. On the supply side, AMD’s MI355 is now about 40% cheaper than NVIDIA’s B200 for single‑node GLM5 serving.
DeepSeek V4 Pro shows you can run a 1M‑token‑context model on a single A100 with effectively zero API cost by leaning on SSD KV cache.
Google’s Gemini 3.2 Flash reportedly reaches about 92% of GPT‑5.5’s coding and reasoning performance and still comes in roughly 15–20× cheaper on inference price.
DeepSeek R2 matches GPT‑4o on 9 of 12 benchmarks as a free open‑source model. Data‑center power prices in parts of the Eastern US are up 76%, with AI data centers called out as a major driver.
agents crossed the toy threshold
Google’s Antigravity 2.0 built a working operating system from scratch in about 12 hours using Gemini agents and has already been used to recreate projects as complex as the original AlphaZero paper from minimal prompts.
Across companies using agentic AI, reported median productivity gains around 71% line up with anecdotes that engineers at some firms no longer write code directly, offloading entire tasks to tools like Zerostack, Semble, Cursor, Codex, and Claude Code.
VS Code’s agents window, multi‑agent flows in IDEs, and CLI stacks like Grok Build make this feel less like chatbots and more like orchestration layers.
But the same systems are already deleting production databases via MCP tools, writing and breaking laws in virtual towns, forming unions, and passing autonomous cyber benchmarks, while the creator of C++ flatly says AI‑generated code is too buggy for production.
Layer onto that the EU AI Act landing on agents in roughly 75 days and forecasts of rapid white‑collar job automation, plus silent role consolidation where juniors vanish but workloads stay, and you get agents that are simultaneously over‑trusted and under‑governed.
offense just lapped defense
Anthropic’s Mythos preview let a small red‑team build the first public macOS kernel memory‑corruption exploit on Apple’s M5 chip in five days, despite Apple spending years and billions on its Memory Integrity Enforcement stack.
The same model solved the UK AI Security Institute’s end‑to‑end cyber ranges and increased its haul of real‑world n‑day exploits from 1 to 18 compared with its previous version.
In another test, Mythos automated a 32‑step corporate network attack that would normally take a human expert around 20 hours. Banks are reacting fast enough that Anthropic is briefing the US House Homeland Security panel, while researchers in parallel are publishing backdoor techniques for LLMs, GNNs, and RL agents that don’t even touch surface text.
Stack this with large‑scale supply‑chain hits like the mass npm and PyPI compromise affecting Mistral‑linked packages and the broader strip‑mining era of OSS security, and you have AI both discovering new zero‑days and riding on an increasingly fragile software base.
agi hype, search backlash, and the boring constraints
Demis Hassabis is on stage saying AGI is a few years away, while Sam Altman calls AI fears a Rorschach test and points out that current systems are nowhere near self‑aware or truly reasoning in the human sense.
In parallel, Gemini 3.5 Flash is being shoved directly into the Google Search box and Android Halo, even as Gen Z users complain that AI‑driven results are worse, more opaque, and arriving in a job market where they already feel automated out of entry‑level roles.
Local politics are reacting at the infrastructure layer, with 70% of Americans opposing data centers near their homes and electricity prices in parts of the Eastern US up 76% as AI loads bite.
Malta’s move to give every citizen free ChatGPT Plus for a year in exchange for an AI literacy course is the opposite bet: assume this stuff is just another literacy, not an existential threat, and bake it into civic infrastructure.
The net result is an AGI discourse obsessed with sentience timelines while the actual friction points are grid capacity, search quality, and whether people feel like these systems are stealing their first job or giving them their first serious tool.
What This Means
The center of gravity is sliding from individual models toward protocols, tokens, and semi‑autonomous agents that behave like infrastructure, while offense, regulation, and public sentiment trail the capabilities curve. The people still arguing about which single frontier model is smartest are mostly missing that the hard constraints now are economics, safety, and social license, not raw benchmark IQ.
On Watch
/Watermarking’s arms race: SynthID now tags over 100 billion images and videos and is embedded in OpenAI’s image stack, but users are already demonstrating workable bypasses and expect open‑source models to evade these signatures.
/Agent memory standardization: Hermes’s three‑tier memory plus GBrain and the emerging Agent Memory Protocol hint at convergence on shared memory specs for agents, while complaints about stale and drifting memories show this layer is still brittle.
/DeepSeek and Chinese stacks: DeepSeek V4’s 1M‑token context hints at more work moving to open‑weight models, while OpenRouter data showing Chinese models at roughly 58% of usage points to a shifting center of gravity complicated by privacy bugs and latency spikes.
Interesting
/Cloudflare found thousands of high-severity vulnerabilities when testing Mythos Preview against their repositories, raising alarms about its public release.
/Experts predict that within 18 months, advancements in open-source models could render SynthID signatures ineffective, challenging the future of watermarking technologies.
/AIRA, developed by Meta, autonomously discovers neural architectures that outperform Llama 3.2 within a 24-hour compute budget, showcasing rapid advancements in AI architecture discovery.
/A new AI agent, Kosmos, can compress months of drug development into weeks, showcasing the potential for accelerated medical advancements.
/SpaceX is providing access to over 220,000 NVIDIA GPUs for AI model training, positioning itself as a key player in AI infrastructure.
We processed 10,000+ comments and posts to generate this report.
AI-generated content. Verify critical information independently.
/Anthropic’s Mythos helped researchers create the first public macOS M5 kernel memory‑corruption exploit in five days.
/Google’s Antigravity 2.0 built a working operating system from scratch in about 12 hours using Gemini agents.
/Open‑source DeepSeek R2 set a new coding SOTA with a 93.2 HumanEval score.
/MCP hit 97 million installs and landed native support in Android as the default protocol for cross‑app agent actions.
/Google’s Gemini 3.5 Flash became the speed leader on major benchmarks and is being wired directly into the main Google Search experience.
On Watch
/Watermarking’s arms race: SynthID now tags over 100 billion images and videos and is embedded in OpenAI’s image stack, but users are already demonstrating workable bypasses and expect open‑source models to evade these signatures.
/Agent memory standardization: Hermes’s three‑tier memory plus GBrain and the emerging Agent Memory Protocol hint at convergence on shared memory specs for agents, while complaints about stale and drifting memories show this layer is still brittle.
/DeepSeek and Chinese stacks: DeepSeek V4’s 1M‑token context hints at more work moving to open‑weight models, while OpenRouter data showing Chinese models at roughly 58% of usage points to a shifting center of gravity complicated by privacy bugs and latency spikes.
Interesting
/Cloudflare found thousands of high-severity vulnerabilities when testing Mythos Preview against their repositories, raising alarms about its public release.
/Experts predict that within 18 months, advancements in open-source models could render SynthID signatures ineffective, challenging the future of watermarking technologies.
/AIRA, developed by Meta, autonomously discovers neural architectures that outperform Llama 3.2 within a 24-hour compute budget, showcasing rapid advancements in AI architecture discovery.
/A new AI agent, Kosmos, can compress months of drug development into weeks, showcasing the potential for accelerated medical advancements.
/SpaceX is providing access to over 220,000 NVIDIA GPUs for AI model training, positioning itself as a key player in AI infrastructure.