Guides
Long-form visual explainers. Each one is built to be the article you send a colleague when they ask "how does this actually work?"
-
The KV Cache, Illustrated
How modern LLMs actually remember the conversation while they generate. From the redundant computation that made naïve attention untenable, through exactly what's inside the cache, to the optimizations every production team uses on top: paged attention, GQA, prompt caching, attention sinks, speculative decoding. With worked math and thirteen diagrams.
Jun 15, 2026
-
A Visual Guide to AI Agent Memory
Your agent rebuilds its entire mind from scratch on every single turn. This guide explains, visually and from first principles, how agents remember and forget, and what actually works in production.
Jun 10, 2026
-
Agents Need to Sleep: The Architecture of Memory Consolidation
Why a 24/7 agent inevitably suffers from context collapse, and how to build an offline memory consolidation daemon inspired by human sleep.
Jun 10, 2026
-
A Visual Guide to Belief Revision in AI Agents
Your agent stores facts but cannot change its mind. This guide explains, visually, why vector databases fail at contradictions, what forty years of belief revision theory can teach us, and how to build the machinery that lets an agent update what it knows.
Jun 2, 2026