TL;DR
This month was less a 'new GPT moment' and more a stress test of the whole stack: OpenAI kept its lead even as a Pentagon deal fueled a visible ethics‑driven migration toward Claude.
At the same time, Chinese/open models like Qwen, GLM‑5, and Kimi quietly hit frontier‑grade benchmarks while agent frameworks, KV‑cache hacks, and WebMCP‑style tool protocols emerged as the most unstable—and exploitable—parts of the ecosystem.
Key Events
Report
The strangest pattern this month: the labs selling themselves hardest as 'safe' and 'responsible' are now writing classified strike software and, in one case, powering a 150GB government data heist.
At the same time, an 'open' agent framework just leapfrogged React on GitHub while shipping with thousands of known vulnerabilities and a named attack class.
OpenAI still owns consumer mindshare—around 900 million weekly actives and 50 million paying subscribers—even as a Pentagon deal triggered a 295% spike in ChatGPT uninstalls and a loud 'Cancel ChatGPT' wave.
That backlash has real teeth at the edges: Claude Cowork hit #1 on the U.S. App Store, Claude topped free charts in the U.S. and Canada, and users explicitly cite Anthropic’s Pentagon stance and data‑handling as reasons for jumping ship.
But Anthropic is also running custom Claude models for the Pentagon that sit 1–2 generations ahead of its consumer releases, reportedly used in strikes on Iran and cleared for classified work, just as 300+ Google/OpenAI staff protest military AI and the Pentagon explores stripping safety features via the Defense Production Act.
GLM‑5 (744B params, AA Index 50) and Kimi K2.5 (50.2 on Humanity’s Last Exam at about $0.28 per task) now sit within single‑digit benchmark points of leading proprietary models while running on commodity NVIDIA Blackwell.
Qwen 3.5‑35B‑A3B reportedly beats GPT‑OSS‑120B on coding at a third of the size, runs beyond 1M tokens of context on a 32GB GPU, and hits roughly 2,000 tokens per second on dual‑3090 rigs, while smaller 0.8B–9B variants dominate Hugging Face charts and run on 5GB RAM.
The flip side is governance and stability: Qwen's larger models show notable hallucinations and odd zero‑shot drops, and key staff such as Junyang Lin have left; DeepSeek is racing from v3 to v4 in four months with open weights on Chinese chips amid bias‑gap and data‑theft allegations; and Google's Nano Banana 2 now leads text‑to‑image from a closed U.S. stack.
OpenClaw’s promise of local personal agents rocketed it to roughly 246k GitHub stars—above React—by making it trivial to orchestrate email, scheduling, and even multi‑device control.
Security audits then found over 2,000 vulnerabilities (10 critical) and a 'ClawJacked' technique where hostile sites hijack installs, while broader scans show 80% of agent repos with vulns and 38% with critical ones, usually lacking basic human‑oversight gates.
Underneath that, the tool layer is equally porous: 41% of official MCP servers ship without auth, honeypots like HoneyMCP already exist to catch rogue probes, and teams rely on observability tools such as LangSmith that both cost real money and complicate data privacy for production traces.
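The human‑oversight gates the audits say are missing are cheap to add in principle. A minimal sketch follows; the `gated` wrapper, the `DANGEROUS` set, and the tool names are all illustrative assumptions, not the API of OpenClaw, MCP, or any framework named above:

```python
# Minimal human-in-the-loop gate for agent tool calls.
# Illustrative only: tool names and the approval callback are assumptions,
# not taken from any framework mentioned in this report.

DANGEROUS = {"send_email", "delete_file", "run_shell"}  # tools needing sign-off

def gated(tool_name, fn, approve):
    """Wrap a tool so dangerous calls require explicit operator approval.

    `approve` is any callable taking a description string and returning a
    bool -- a CLI prompt, a chat-ops ping, or a policy engine.
    """
    def wrapper(*args, **kwargs):
        if tool_name in DANGEROUS:
            if not approve(f"agent wants {tool_name}{args}"):
                raise PermissionError(f"{tool_name} blocked by operator")
        return fn(*args, **kwargs)
    return wrapper

# Usage: with a deny-all policy, the risky call raises instead of running.
send = gated("send_email", lambda to, body: f"sent to {to}",
             approve=lambda msg: False)
```

The point is less the ten lines than where they sit: the scans above suggest most agent repos ship nothing equivalent between the model and side‑effecting tools.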
Qwen3.5‑35B‑A3B plus KV‑cache engineering shows how far raw context is stretching: about 74.7 tokens per second with a q8_0/bf16 cache, million‑token contexts on a single 32GB GPU, and roughly 2,000 tokens per second on dual‑3090 setups.
Developers are also running into the edge cases—slowdowns from aggressive KV cache clearing, fp8 caches corrupting outputs until switched to bf16, and unpredictable behavior during context switches—so the 'infinite history' illusion rests on brittle internals.
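The fp8‑versus‑bf16 failure mode is, at bottom, precision arithmetic. The toy emulation below (not real inference kernels; both quantizers are crude approximations, and it assumes the 'fp8' in those reports behaves like e4m3) shows why a format with an 8‑bit exponent survives the heavy‑tailed values attention caches hold, while a 4‑bit‑exponent format clips and coarsens them:

```python
import numpy as np

def to_bf16(x):
    """Emulate bfloat16 storage: keep float32's 8-bit exponent and
    truncate the mantissa to 7 bits (drop the low 16 bits)."""
    bits = np.ascontiguousarray(x, dtype=np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFF0000)).view(np.float32)

def to_fp8_e4m3(x):
    """Crudely emulate fp8 e4m3: clamp to the format's max finite value
    (+/-448) and round to ~4 significant bits (1 implicit + 3 stored)."""
    x = np.clip(np.asarray(x, dtype=np.float32), -448.0, 448.0)
    m, e = np.frexp(x)                  # x = m * 2**e with 0.5 <= |m| < 1
    return np.ldexp(np.round(m * 16) / 16, e).astype(np.float32)

# Heavy-tailed sample standing in for attention key/value magnitudes.
rng = np.random.default_rng(0)
kv = (rng.normal(size=20_000) * np.exp(rng.normal(0, 3, 20_000))).astype(np.float32)

err_fp8 = np.mean(np.abs(to_fp8_e4m3(kv) - kv))
err_bf16 = np.mean(np.abs(to_bf16(kv) - kv))
print(f"mean abs error  fp8-ish: {err_fp8:.4g}  bf16-ish: {err_bf16:.4g}")
```

In llama.cpp‑style runtimes the same trade‑off surfaces as cache‑type flags (e.g. `--cache-type-k q8_0`), which is consistent with reports that switching the cache dtype, not the weights, cleared the corruption.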
Meanwhile Claude now remembers across sessions and can import ChatGPT and Gemini histories in roughly 60 seconds while cutting Claude Code's memory usage 40×. Redis‑backed Memento MCP offers fragment‑based long‑term agent memory, and offline RAG tools like ConceptLens move knowledge graphs to the laptop. WebMCP lets websites register structured tools for agents, even as people immediately worry about dark patterns and insecure internal APIs.
What This Means
Power is drifting toward whoever can safely wire long‑lived agents into messy real‑world systems—defense clouds, Chinese/open‑weight stacks, browser‑level tool protocols—while the underlying security and reliability story is clearly nowhere near as mature as the marketing.
On Watch
Interesting
We processed 10,000+ comments and posts to generate this report.
AI-generated content. Verify critical information independently.