// WRITING / TAGS / AI & LLMS

The infrastructure side of AI. What it actually takes to run language models in production, and why most of the hard parts have nothing to do with the model.

Almost every conversation about AI right now is about the model. Almost none of the actual cost, risk, or complexity lives there. The hard parts are everything else. Where the weights live. How the inference layer is operated. How data flows in and out. How the system is observed. How it fails. What happens when the model behind your product gets deprecated next quarter.

These posts come from running LLMs on our own infrastructure: Ollama plus persistent storage, vector stores, RAG pipelines, and the operational reality of model lifecycle management. The lessons mostly aren't AI-specific. They're platform engineering lessons that AI workloads happen to surface faster.

If you're standing up an AI initiative and the conversation has been entirely about prompts and accuracy, the posts here are aimed squarely at the questions you'll be asking in three months.
