Digest

June 24, 2026

Fourteen links on agent loops, engineering review, prompt injection, open models, tools, infrastructure, and game AI.

Wednesday, June 24, 2026 · 14 reads

Agents

The Coming Loop

— Armin Ronacher

On the two loops inside agentic coding — the inner agent loop that ends when the model says "done," and the outer harness loop that decides whether to keep going — and why the second is remarkable on disposable, verifiable work but corrosive on code meant to last.
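A minimal sketch of the two loops, for orientation only. The names `model_step`, `verify`, `inner_loop`, and `outer_loop` are hypothetical and do not come from the linked post; the "model" is any callable that returns the next reply given the transcript so far.

```python
from typing import Callable, List


def inner_loop(
    model_step: Callable[[List[str]], str],
    task: str,
    max_steps: int = 20,
) -> List[str]:
    """Inner agent loop: call the model until it replies 'done' (or steps run out)."""
    transcript = [task]
    for _ in range(max_steps):
        reply = model_step(transcript)
        transcript.append(reply)
        if reply == "done":
            break
    return transcript


def outer_loop(
    model_step: Callable[[List[str]], str],
    task: str,
    verify: Callable[[List[str]], bool],
    max_rounds: int = 5,
) -> int:
    """Outer harness loop: rerun the inner loop until `verify` accepts the result.

    Returns the number of rounds used, or -1 if no round passed verification.
    """
    for round_no in range(1, max_rounds + 1):
        transcript = inner_loop(model_step, task)
        if verify(transcript):
            return round_no
    return -1
```

In this sketch the outer loop is only as trustworthy as `verify`, which is one way to read the summary's distinction between verifiable work and code meant to last.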

Engineering

Slow Down to Speed Up

— The Pragmatic Engineer

How the November 2025 agents multiplied code output while human review stayed flat — and how Meta's largest-ever incident traced back to AI-written, AI-reviewed code shipping past a gutted Trust & Safety team.

Agents

Coinbase cut idea-to-production by 90%

— Cursor

How Coinbase compressed its delivery cycle from 20 days to 1.8 days using Plan Mode and five to seven parallel agents, with 75% of pull requests now opened by an agent.

Security

Prompt Injection as Role Confusion

— Simon Willison

A research finding that models infer who is speaking from a text's style rather than its role tags — and that rewriting an attack to read slightly off-format drops its success rate from 61% to 10%.

Security

After Mythos: AI Red-Teaming with Gray Swan

— Latent Space

On an automated red-teamer that now out-ranks human professionals, the finding that larger models are not automatically safer, and the "Lethal Trifecta" — untrusted input, private data, and an exfiltration path together.

Open Models

GLM-5.2 Is the Step Change for Open Models

— Interconnects

Why GLM-5.2 is the first open-weight model that works as a general agent inside a Claude Code-style harness, narrowing the US–China gap to about 6.8 months at a fraction of the price.

Research

VibeThinker: 3B Matches the Giants on Verifiable Reasoning

— arXiv

A 3-billion-parameter model that ties 600B–1T flagships on math and competitive programming where answers are machine-checkable, via a two-stage "Spectrum-to-Signal" post-training recipe.

Agents

CUGA: An Open Agent Harness

— Hugging Face / IBM

IBM's open harness tops AppWorld and WebArena on an open-weight model by moving planning, state, and reflection into the harness, leaving developers to write only tools and prompts.

Tools

Oak: A Version Control System Rebuilt for Agents

— oak.space

Flat Mercurial-style manifests and lazy mounting give an agent seconds-to-first-edit on a multi-GB monorepo without cloning the whole thing — at the cost of leaving the Git ecosystem behind.

Architecture

In Praise of memcached

— jchri.st

An argument that memcached suits caching precisely because it does less — no persistence, no clustering — forcing correct "cache can vanish" semantics and sidestepping the Redis-as-database trap.
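The "cache can vanish" semantics can be illustrated with a read-through helper. This is a sketch under stated assumptions: `VolatileCache` is a hypothetical in-memory stand-in, not a memcached client, and `read_through` is an illustrative name rather than anything from the linked article.

```python
from typing import Callable, Dict, Optional


class VolatileCache:
    """Hypothetical stand-in for a memcached-style cache: entries may vanish at any time."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value

    def flush(self) -> None:
        # Simulates a restart or eviction: everything is gone.
        self._data.clear()


def read_through(
    cache: VolatileCache,
    key: str,
    load_from_source: Callable[[str], str],
) -> str:
    """Treat every miss as normal: fall back to the source of truth, then repopulate."""
    value = cache.get(key)
    if value is None:
        value = load_from_source(key)
        cache.set(key, value)
    return value
```

Because the caller never assumes a key is present, flushing the cache costs only a reload and never loses data.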

Science

GPT-5 Pro Helps Crack a 3-Year Immunology Mystery

— OpenAI

How GPT-5 Pro gave an immunologist a new angle on T-cell behavior that explained an experiment he had been unable to account for over three years.

Game Dev

The Low-Tech AI of Elden Ring

— nega.tv

How FromSoftware builds boss behavior without planning algorithms — a pushdown-automaton goal stack, weighted-random action selection, and interrupt callbacks that keep designers in full control.
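A toy sketch of the three ingredients named above: a goal stack, weighted-random action selection, and interrupt callbacks. It is not FromSoftware's implementation; the class `GoalStack`, its methods, and the action and event names are all hypothetical.

```python
import random
from typing import Callable, Dict, List, Optional


class GoalStack:
    """Toy boss brain: goals live on a stack and the top goal runs first."""

    def __init__(self, rng: random.Random) -> None:
        self.stack: List[str] = []
        self.rng = rng
        self.interrupts: Dict[str, Callable[["GoalStack"], None]] = {}

    def push(self, goal: str) -> None:
        self.stack.append(goal)

    def choose_and_push(self, weighted_actions: Dict[str, float]) -> str:
        """Pick one action at random, in proportion to its designer-set weight."""
        actions = list(weighted_actions)
        weights = list(weighted_actions.values())
        choice = self.rng.choices(actions, weights=weights, k=1)[0]
        self.push(choice)
        return choice

    def on(self, event: str, handler: Callable[["GoalStack"], None]) -> None:
        """Register an interrupt callback for a named event."""
        self.interrupts[event] = handler

    def fire(self, event: str) -> bool:
        """Run the callback for `event`, if any; report whether one ran."""
        handler = self.interrupts.get(event)
        if handler is None:
            return False
        handler(self)
        return True

    def step(self) -> Optional[str]:
        """Pop and return the top goal, or None when the stack is empty."""
        return self.stack.pop() if self.stack else None
```

A short usage example, with made-up actions and weights:

```python
ai = GoalStack(random.Random(0))
ai.push("idle")
ai.choose_and_push({"slash": 3.0, "leap": 1.0})
ai.on("player_heals", lambda brain: brain.push("punish_heal"))
ai.fire("player_heals")
print(ai.step())  # the interrupt's goal runs before the chosen attack
```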

Infrastructure

Why American Data Centers Can't Plug In

— Works in Progress

Why the bottleneck for AI data centers is not power but a first-come interconnection queue that fills with speculative projects — and how auctioning slots and pricing flexibility could clear it.

Cities

Why the West Stopped Making Land

— Works in Progress

How land reclamation stalled across the West around 1970 — not by prohibition, but by litigable environmental review that pushed single-project approval times into decades.