TL;DR
Nginx, curl, and BitLocker all picked up serious new vulnerabilities, while npm, GitHub Actions, Hugging Face, and Supabase had real supply-chain and auth failures. At the same time, AI coding tools are being pushed as mandatory, local LLM stacks just got much faster, and a few teams quietly burned seven figures on tokens and cloud AI.
The boring parts of your stack—reverse proxies, CI configs, billing, and “helper” tools—are now where things are most likely to break or leak.
Key Events
Report
Two things moved this week that actually touch your stack: core infra you probably run (Nginx, curl, BitLocker endpoints) picked up new, real vulns, and the AI/tooling ecosystem is starting to look like a cost and security problem, not just a productivity toy.
The rest of the noise clusters around those: supply‑chain compromises, mandated AI coding tools, and AI infra economics getting sharp edges.
Nginx Rift is a newly disclosed 18‑year‑old bug (CVE‑2026‑42945): a heap buffer overflow in the rewrite module that allows remote code execution. It affects versions below 1.30.1 and 1.31.0.
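To triage a fleet against those version numbers, a check along these lines may help. This is a minimal sketch, not an official detection method: the function names are hypothetical, and it assumes the version banner alone tells you whether a host is patched. Since 1.31.0 already compares greater than 1.30.1, one tuple comparison against 1.30.1 covers both thresholds from the report.

```python
import re

# Lower of the two thresholds named in the report. Assumption: anything that
# compares below it is affected; 1.31.0 and later already compare above it.
FIRST_FIXED = (1, 30, 1)


def parse_nginx_version(banner: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from text like 'nginx version: nginx/1.24.0'."""
    match = re.search(r"nginx/(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError(f"no nginx version found in {banner!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def is_affected(banner: str) -> bool:
    """True when the reported version is below the first fixed release."""
    return parse_nginx_version(banner) < FIRST_FIXED


if __name__ == "__main__":
    # Feed it whatever your inventory collects, e.g. the text printed by `nginx -v`.
    for banner in ["nginx version: nginx/1.24.0", "nginx version: nginx/1.30.1"]:
        print(banner, "->", "affected" if is_affected(banner) else "not affected")
```

Treat the result as a first pass only: distro packages often backport fixes without bumping the upstream version, so a "vulnerable" banner still needs checking against your vendor's advisory.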
Mythos also identified a new curl vulnerability and discussed it directly with Daniel Stenberg, adding scrutiny even to one of the tightest‑engineered tools in our stacks.
On Windows endpoints, the YellowKey zero‑day shows BitLocker‑protected Windows 11 drives can be unlocked with just a USB stick, bypassing default TPM‑only configs and effectively acting as a backdoor to “encrypted” disks.
Security folks are pointing out that BitLocker without a PIN or extra auth is especially weak, which is exactly how many corporate fleets ship it.
An npm supply‑chain attack hit 84 TanStack packages and over 170 packages total with more than 400 malicious versions that steal cloud credentials and tokens.
The same campaign and the Mini Shai‑Hulud worm abused GitHub Actions cache poisoning to grab CI/CD secrets, so the blast radius includes your workflows, not just your dependencies.
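If you have a list of known-bad releases from an advisory, checking a lockfile against it is quick. The sketch below assumes an npm lockfile with a top-level "packages" map (lockfileVersion 2 or 3) and a denylist you supply yourself as (name, version) pairs. The function names and the package names in the usage example are hypothetical placeholders, not real indicators from this campaign.

```python
import json


def installed_versions(lockfile_text: str) -> set[tuple[str, str]]:
    """Return (name, version) pairs from an npm lockfile (lockfileVersion 2 or 3)."""
    lock = json.loads(lockfile_text)
    found = set()
    for path, meta in lock.get("packages", {}).items():
        if not path or "version" not in meta:
            continue  # skip the root project entry and entries without a version
        # Nested installs look like "node_modules/a/node_modules/@scope/b";
        # the package name is whatever follows the last "node_modules/".
        name = meta.get("name") or path.rsplit("node_modules/", 1)[-1]
        found.add((name, meta["version"]))
    return found


def find_known_bad(lockfile_text: str, known_bad) -> list[tuple[str, str]]:
    """Intersect the lockfile contents with a caller-supplied denylist."""
    return sorted(installed_versions(lockfile_text) & set(known_bad))


if __name__ == "__main__":
    # Hypothetical denylist entry; replace with pairs from a real advisory.
    denylist = [("example-pkg", "9.9.9")]
    with open("package-lock.json", encoding="utf-8") as handle:
        for name, version in find_known_bad(handle.read(), denylist):
            print(f"known-bad release installed: {name}@{version}")
```

A clean lockfile does not clear your CI: if a poisoned cache or a stolen secret was already in play, rotating tokens and purging caches is a separate job.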
Outside npm, a fake “OpenAI Privacy Filter” on Hugging Face shipped a Rust infostealer to 244k downloads, and JDownloader’s site was hacked to deliver a Python RAT instead of the real installer.
Scans of vibe‑coded apps and Supabase projects show 90% with at least one vuln, 44% with auth gaps, and 22% of Supabase projects leaking data, often from missing Row Level Security and committed tokens.
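Missing Row Level Security is easy to catch before it ships. As a rough sketch, the check below reads SQL migration text and lists tables that are created but never get an `ALTER TABLE ... ENABLE ROW LEVEL SECURITY` statement. The function name is hypothetical, and the regex approach is a deliberate simplification: it ignores schemas and quoted-identifier case, and it says nothing about whether the policies themselves are sound.

```python
import re

CREATE_RE = re.compile(
    r"create\s+table\s+(?:if\s+not\s+exists\s+)?([\w.\"]+)",
    re.IGNORECASE,
)
ENABLE_RLS_RE = re.compile(
    r"alter\s+table\s+(?:if\s+exists\s+)?(?:only\s+)?([\w.\"]+)"
    r"\s+enable\s+row\s+level\s+security",
    re.IGNORECASE,
)


def _normalize(name: str) -> str:
    # Simplification: strip quotes, lowercase, and drop any schema prefix,
    # so "public.Profiles" and profiles are treated as the same table.
    return name.replace('"', "").lower().split(".")[-1]


def tables_missing_rls(sql: str) -> list[str]:
    """Tables with a CREATE TABLE but no matching ENABLE ROW LEVEL SECURITY."""
    created = {_normalize(name) for name in CREATE_RE.findall(sql)}
    protected = {_normalize(name) for name in ENABLE_RLS_RE.findall(sql)}
    return sorted(created - protected)


if __name__ == "__main__":
    migrations = """
    create table public.profiles (id uuid primary key);
    alter table public.profiles enable row level security;
    create table if not exists orders (id bigint primary key);
    """
    print(tables_missing_rls(migrations))
```

Run over concatenated migration files in CI, a non-empty result is a reasonable reason to fail the build or at least flag the change for review.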
One company now mandates daily GitHub Copilot usage for all engineers and even runs a prompt leaderboard, turning AI assistance into a tracked performance metric instead of an optional tool.
At the same time, Airbnb says AI writes 60% of its new code, Mistral’s founder claims their engineers “no longer write code,” and Claude Code is credited with 134k+ daily GitHub commits and 71% median productivity gains in agentic AI reports.
On the ground, devs complain Copilot is less accurate on large projects, Cursor loses context on big repos, and Claude Code costs $200/month while still feeling slow, with some worrying they’re losing core coding skills amid rising AI tool spend.
The scanner numbers (90% of vibe‑coded repos with vulns, 44% with auth gaps) point the same way, and data science teams report spending time reviewing often‑wrong AI analyses. Code volume is up, but correctness is not.
On the local side, llama.cpp just merged Multi‑Token Prediction for Qwen3.6, giving roughly a 40% speed boost and up to ~34 tokens/s, with 27B and 35B models now running on low‑VRAM setups and even a GTX 1080 hitting ~24 tokens/s.
Vulkan backends reach around 127 tokens/s while using slightly less VRAM than ROCm for KV cache, and vLLM 0.21 adds MTP for Gemma plus multi‑B200 scaling for higher per‑GPU throughput.
NVIDIA’s NVFP4 format is being used to train 12B‑parameter LLMs on 10T tokens and to serve models like Nemotron 3 Omni at roughly 270 tokens/s. A single AMD MI300X can also run a full cinematic video pipeline end‑to‑end. Together, these push more serious workloads onto single nodes instead of big clusters.
At the same time, enterprises report average GPU utilization around 5%, with inference eating 41% of AI costs. Projects like OpenClaw and CodexBar have each burned around $1.3M on token spend. Cheaper options like Gemini Pro (about $12 per million output tokens) exist, but so do incidents like a $30k runaway Bedrock bill and buggy, unverified models on OpenRouter.
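To put those figures in proportion, here is some back-of-the-envelope arithmetic. It is purely illustrative: the report does not say which models or prices the $1.3M projects actually used, so pairing that budget with the $12 per million rate is an assumption. The $2.00 GPU-hour rate is a made-up example, and the function names are hypothetical.

```python
def output_tokens_for_budget(budget_usd: float, usd_per_million_tokens: float) -> float:
    """How many output tokens a budget buys at a flat per-million price."""
    return budget_usd / usd_per_million_tokens * 1_000_000


def cost_per_useful_gpu_hour(hourly_rate_usd: float, utilization: float) -> float:
    """Effective price of one fully used GPU-hour when the fleet is mostly idle."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hourly_rate_usd / utilization


if __name__ == "__main__":
    # Assumption: the whole $1.3M goes to output tokens at $12 per million.
    tokens = output_tokens_for_budget(1_300_000, 12)
    print(f"{tokens / 1e9:.1f} billion output tokens")

    # Assumption: a hypothetical $2.00 GPU-hour at the reported 5% utilization.
    print(f"${cost_per_useful_gpu_hour(2.00, 0.05):.2f} per useful GPU-hour")
```

The second number is the one worth watching: at 5% utilization every useful GPU-hour costs 20 times the sticker rate, whatever that rate is.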
What This Means
The emerging split is between orgs that treat infra, AI models, and dev tooling as code to be measured, locked down, and swapped when needed, and those that still treat them as black‑box products to trust by default. Across the board, reliability, security, and cost observability are dominating the conversation more than new features or benchmarks.
On Watch
Interesting
We processed 10,000+ comments and posts to generate this report.
AI-generated content. Verify critical information independently.