Internal reference systems and prototypes that show how we build production AI — architecture, what gets logged, where evals run, and the failure modes we design for. Clearly labelled; not client case studies.
Reference system
A control layer for agent runs, costs, evals, approvals, and failures.
View demo →
Reference system
Ingestion to hybrid retrieval to reranking to cited answers.
View demo →
Reference system
A bounded LangGraph agent with explicit state, tools, and approvals.
View demo →
Build note
Repo instructions, skills, MCP, and a GitHub Actions review loop.
View demo →
Prototype
A TypeScript MCP server exposing internal tools behind auth.
View demo →
Reference system
Routes tasks across OpenAI, Claude, Gemini, and Grok with fallback and cost control.
View demo →
Build logs, agentic engineering decisions, agent failures, evals, and what survives contact with real users. Sent weekly, never more often.