Field Notes

The Build Log

Practical notes on agentic AI, RAG, LLM engineering, Go infrastructure, and shipping full-stack AI products in production, from one engineer's desk.

Agentic AI · LLM · Production

Agentic AI in Production: What Actually Works in 2026

Demos are easy; production agents are not. Here's what separates an agent you can trust with real users from a notebook that impresses on a Friday.

May 28, 2026 · 8 min read

RAG · LLM · Retrieval

RAG Is Not Dead — It Just Grew Up

Every time context windows grow, someone declares RAG dead. Then they get the bill, and the hallucinations, and they come back to retrieval.

May 12, 2026 · 7 min read

Observability · LLM · Production

Observability for LLM Pipelines: Tracing, Evaluation Metrics, and Per-Request Cost Attribution

You cannot improve what you cannot observe. LLM pipelines have unique observability needs — token cost, quality drift, and latency across external APIs that don't behave like your own services.

May 9, 2026 · 11 min read

Go · Infrastructure · AI

Why Go Is Quietly Becoming the Language of AI Infrastructure

Python owns the notebook. But the gateways, orchestrators, and high-throughput pipelines around your model? More and more of them are written in Go.

Apr 22, 2026 · 6 min read

Go · Architecture · Patterns

Event-Driven Go in Practice: CQRS and Event Sourcing — When They Help and When They Hurt

CQRS and event sourcing are powerful patterns with real production benefits. They're also expensive to implement correctly, and most systems don't need them. Here's how to tell the difference.

Apr 17, 2026 · 12 min read

Hiring · Freelance · AI Engineering

Hiring a Freelance Full Stack AI Engineer: A Founder's Guide

You don't always need an AI team. Sometimes you need one engineer who can own the UI, the API, the infra, and the model layer — and actually finish.

Apr 3, 2026 · 6 min read

LLM · Cost · Production

Cutting LLM Costs in Production: Caching, Model Routing, and Graceful Fallbacks

The first LLM bill in production is always a surprise. Here are the specific techniques — semantic caching, model routing, fallback chains — that actually reduce it without making your product worse.

Mar 28, 2026 · 10 min read

SaaS · AI · Architecture

Multi-Tenant SaaS Architecture for an AI Reporting Engine at 48K Req/Min

Building SaaS for enterprise means one tenant's burst traffic cannot become another tenant's outage. At 48K requests per minute across 6 tenants, noisy-neighbor control isn't optional — it's the product.

Mar 5, 2026 · 13 min read

Architecture · Scale · Node.js

Scaling a Translation Platform to 1M+ Requests/Day Across 70+ Languages

One million translation requests per day sounds like a scale problem. It is — but the harder problems are cache invalidation, language-pair cost asymmetry, and keeping p99 tolerable when a user submits a 50,000-word document.

Feb 11, 2026 · 11 min read

Go · AWS · Architecture

Building a Real-Time Fraud-Detection Pipeline at 48K Events/Sec

Fraud doesn't wait for your system to warm up. Here's how we built a pipeline that processes 48,000 events every second and still responds in 12ms at the 99th percentile.

Jan 21, 2026 · 12 min read