The interesting frontier this month isn’t a new god‑model; it’s the messy stack underneath: NVIDIA turning into a semi‑open frontier lab, agents spreading everywhere while their protocols and state management fall apart, and AI‑generated code becoming a dependency that is already causing outages.
At the same time, multimodal systems quietly crossed a line: TV series from Seedance, film edits from Kling, and Niantic’s 30‑billion‑image haul. Together they make it clear that whoever controls data, memory, and reliability will matter more than whoever wins the AGI timeline argument.
Key Events
/GPT‑5.4 hit a $1B annualized run rate in net‑new API revenue within a week and became SOTA on ZeroBench.
/NVIDIA launched the Nemotron‑3 Super 120B model with a 1M‑token context and NVFP4 support, running about 2.2× faster than GPT‑OSS‑120B.
/Mistral Small 4 arrived as a 119B‑parameter model with a 256k context window in the new Mistral 4 family.
/Yann LeCun’s Advanced Machine Intelligence raised $1.03B to build AI systems with persistent memory and reasoning.
/Chinese studios started producing full TV series with Seedance 2.0 while ByteDance paused its global launch over copyright disputes.
Report
Everyone is arguing about AGI timelines while the interesting stuff is happening in the plumbing. The real action this month is NVIDIA quietly assembling a semi‑open frontier stack, agents overrunning everything, and AI coding hitting the reliability wall.
nvidia is quietly building the third frontier lab
NVIDIA’s Blackwell era now looks less like a GPU refresh and more like a semi‑open frontier stack. Nemotron 3 Super is a 120B‑parameter model tuned for multi‑agent applications, and the 120B‑A12B NVFP4 variant runs about 2.2× faster than GPT‑OSS‑120B. Blackwell token throughput has climbed from 400 tokens per second per GPU to 1,300 in four months.
DGX Spark and DGX Station put up to 20 petaflops of AI compute and 748GB of coherent memory in a single local box, while NemoClaw offers an open‑source, chip‑agnostic enterprise agent platform designed to run on Grace Blackwell with standardized safety controls.
In parallel, open‑weight contenders like Mistral Small 4 and locally deployable OmniCoder‑9B and Qwen 3.5‑27B show that serious coding and reasoning capabilities are no longer confined to closed labs, even if these models still bump into hardware limits and occasional crashes.
agents are everywhere, but the protocol layer is collapsing
Subagents have gone mainstream: Codex now uses specialized subagents for different parts of a task, and Claude exposes both sub‑agents for parallel execution and agent teams for longer negotiations.
OpenClaw auto‑generates subagents and routes work by task structure, and OpenClaw‑RL updates model weights from day‑to‑day interactions using feedback from replies and actions.
Frameworks like LangGraph 1.1 add type‑safe streaming, automatic dataclass coercion, and cryptographic identities for agents, yet users still say the hardest problems are state management and infrastructure when moving these systems into production.
DARPA’s AI Cyber Challenge produced powerful cyber‑reasoning systems, but OSS‑CRS shows these agents remain brittle and largely unusable outside their original competition context without heavy adaptation.
At the protocol layer, MCP is being declared dead: Perplexity has dropped it in favor of classic APIs and CLIs after reports that MCP can cost up to 32× more tokens than a CLI at only 72% reliability, while alternatives like LDP try to reframe agent communication around identity and delegation.
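The token gap is easy to see in miniature: a schema-heavy tool protocol ships a full JSON envelope on every call, while a CLI invocation is one line. The sketch below is a toy illustration only; the tool name, arguments, and command are invented, and the character-based token estimate is a crude proxy, not a real tokenizer. The reported 32× gap also includes tool schemas and results injected into context, which this toy skips.

```python
import json

# Invented example: an MCP-style tool call carries a JSON-RPC envelope
# plus structured arguments on every request.
mcp_call = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_issues",
        "arguments": {"repo": "acme/widgets", "query": "timeout", "limit": 20},
    },
})

# The same request expressed as a plain CLI invocation.
cli_call = "issues search acme/widgets --query timeout --limit 20"

def approx_tokens(s: str) -> int:
    # Rough rule of thumb: about 4 characters per token.
    return max(1, len(s) // 4)

print(approx_tokens(mcp_call), approx_tokens(cli_call))
```

Even in this stripped-down form, the envelope alone roughly triples the token count before any schema or response payload is added.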
ai coding went from experiment to dependency, and the cracks are obvious
Anthropic reports that 70–90% of the code for its future AI models is now generated by Claude, effectively turning an LLM into the primary software engineer for the next wave of LLMs.
Stripe merges over 1,300 AI‑generated pull requests per week with no human‑written code, while developers using tools like Cursor and Copilot say they now rarely write code without AI.
The failure modes are no longer hypothetical: Amazon convened mandatory meetings after outages linked to AI‑assisted code changes and now requires senior engineers to approve such changes, and Atlassian cut 1,600 mostly‑engineering roles as it pivots hard into AI‑enhanced products.
Engineers describe the result as vibe coding: reviewing AI‑generated patches is mentally harder than writing the code yourself, so the real skill gap shifts to spotting subtle errors, and the grind is driving measurable burnout and what practitioners call AI brain fry.
Security research is already exploiting the same stack, with AI agents detecting 45.6% of vulnerabilities in smart contracts and cyber‑reasoning systems like OSS‑CRS being packaged for real‑world open‑source projects.
multimodal is turning into the real platform layer
Gemini Embedding 2 collapses text, images, video, audio and PDFs into a single embedding space, supports 8,192‑token multimodal inputs, and works across more than 100 languages, making one embedding backbone span most data types people care about.
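The practical upshot of one shared embedding space is that cross-modal retrieval collapses into nearest-neighbor search over a single index. A minimal sketch of that comparison, using random stand-in vectors rather than any real Gemini API (the dimension and "image" vectors are placeholders for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in vectors: in a real system these would come from one multimodal
# encoder mapping text, images, audio, and PDFs into the same space.
DIM = 768
text_vec = rng.normal(size=DIM)          # pretend: an encoded text query
image_vecs = rng.normal(size=(5, DIM))   # pretend: five encoded images

def cosine(query: np.ndarray, corpus: np.ndarray) -> np.ndarray:
    # Cosine similarity of one query vector against each row of corpus.
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=-1, keepdims=True)
    return c @ q

scores = cosine(text_vec, image_vecs)
best = int(np.argmax(scores))  # index of the most similar "image"
```

Because every modality lands in the same space, the same similarity function and the same index serve text-to-image, image-to-PDF, or any other cross-modal lookup.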
Kling 3.0 lets users edit film scenes, swap actors and control motion for up to 15‑second clips, and Media io’s integration adds synchronized audio outputs to those generations.
Seedance 2.0 is already being used by Chinese studios to produce full TV series in native 2K with detailed keyframe control and audio‑visual sync, even though ByteDance has paused its global launch over disputes about the copyrighted material used for training.
Grok Imagine now tops independent video leaderboards, while Anima Preview 2 offers style‑specialized illustration that many users rate above Illustrious yet still struggles with anatomically correct full‑body images and Mac hardware compatibility.
Niantic’s admission that 30 billion Pokémon Go images were used to train delivery robots’ vision systems underscores that the data feeding these multimodal systems is being harvested from everyday user behavior at planetary scale.
What This Means
The center of gravity is drifting away from single chatbots debating AGI dates toward messy stacks of specialized agents, semi‑open frontier models, and aggressively optimized hardware and runtimes that already do real work but routinely fail in novel ways. The consensus conversation about when AGI arrives misses that the substrate it would run on—NVIDIA‑style stacks, open/local ecosystems, and industrial agent workflows—is getting locked in right now, largely by whoever solves reliability, memory and security fastest.
On Watch
/MCP being called dead while LDP and plain APIs/CLIs gain favor signals an impending shakeout in agent tool protocols and who controls the agent–tool boundary.
/OpenClaw’s viral adoption in China, simultaneous government bans, and malware posing as OpenClaw installers show how fast an agent platform can turn into a security flashpoint.
/California’s new dataset‑disclosure law lands just as RAG document‑poisoning and copyright fights over models like Seedance 2.0 heat up, making training data the next major battleground.
Interesting
/Researchers at Anthropic are observing early signs of recursive self-improvement in AI, potentially leading to significant advancements next year.
/DeepSeek-R1's full 256-expert MoE layer is 78.9× faster than cuBLAS and uses 98.7% less energy.
/Meta's investment in AI includes a 1-gigawatt compute cluster in Ohio for its Superintelligence Labs, showcasing its commitment to AI research.
/Covenant-72B, with 72 billion parameters, is the largest decentralized LLM pre-training run, allowing broad participation.
/NVIDIA's GreenBoost kernel modules allow large language models to run without modifying inference software by extending GPU VRAM using system RAM and NVMe storage.
We processed 10,000+ comments and posts to generate this report.
AI-generated content. Verify critical information independently.