Language at the edge, logic at the core
I write deep technical essays about how large language models behave in real systems, with a focus on observability, reliability, and the boundary between narrative and proof. The goal is simple: keep the magic, remove the nonsense.
Most articles are intentionally long. Each entry links to a dedicated page.
Newest first
A step-by-step inside-the-box explanation of what is cached, what is not, and why reuse is limited. Especially relevant for multiple questions over the same large document.
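The core constraint that essay walks through can be sketched in a few lines: attention state (the KV cache) is reusable only for an exact token-for-token prefix match, so putting the question before the document, or changing anything early in the prompt, invalidates everything after the first differing token. A minimal illustrative sketch, not any provider's actual API (all names here are hypothetical, and real inference servers do far more):

```python
def common_prefix_len(a, b):
    """Length of the shared token prefix of two sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

class PrefixKVCache:
    """Toy model of prefix-based reuse: only an exact token
    prefix of the previously processed request is 'free'."""

    def __init__(self):
        self.cached_tokens = []

    def process(self, tokens):
        """Return (reused, recomputed) token counts for a request."""
        reused = common_prefix_len(self.cached_tokens, tokens)
        recomputed = len(tokens) - reused
        self.cached_tokens = list(tokens)  # cache the newest request
        return reused, recomputed

cache = PrefixKVCache()
doc = list(range(1000))  # stand-in for a large tokenized document

# First question over the document: everything computed from scratch.
r1 = cache.process(doc + [1, 2, 3])   # (0, 1003)

# Second question over the same document: the document prefix is
# reused; only the new question's tokens are recomputed.
r2 = cache.process(doc + [7, 8, 9])   # (1000, 3)

# Put the question *before* the document and the prefix no longer
# matches: reuse collapses and the whole request is recomputed.
r3 = cache.process([7, 8, 9] + doc)   # (0, 1003)
```

This is why "many questions over the same large document" only benefits when the document sits at the front of the prompt, byte-for-byte identical each time.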
A practical argument for why LLMs are exceptional at explanation but unreliable as causal engines. Includes concrete failure modes: arithmetic errors, long-context decay, prompt bias, and the cost of fake reasoning.
A step-by-step inside-the-box explanation of how meaning emerges from training, why dimensions have no names, and why rotated spaces still think the same.
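The rotation claim is easy to check numerically: apply the same orthogonal transform to every embedding and all dot products, and hence cosine similarities, are unchanged, while every individual coordinate changes. A small sketch with toy 2-D vectors (the names are illustrative, not real model embeddings; the argument is identical in hundreds of dimensions):

```python
import math

def rotate(v, theta):
    """Rotate a 2-D vector by angle theta (an orthogonal transform)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def cosine(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

# Toy "embeddings" for two related concepts.
king, queen = (2.0, 1.0), (1.8, 1.4)

theta = 1.234  # an arbitrary rotation angle
king_r, queen_r = rotate(king, theta), rotate(queen, theta)

# Cosine similarity survives the rotation (up to float error)...
assert abs(cosine(king, queen) - cosine(king_r, queen_r)) < 1e-12

# ...while each individual coordinate changes, so "dimension 0"
# carries no meaning on its own.
assert abs(king[0] - king_r[0]) > 0.1
```

Since only the relative geometry is preserved under rotation, no dimension can have a name: meaning lives in the angles and distances between vectors, not in any coordinate axis.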