// WRITING / TAGS / OPERATIONS

Running infrastructure in production. Upgrades, incidents, runbooks, and the operational discipline that keeps systems honest.

Operations is what happens after the architecture diagram lands and the migration plan is signed off. It's the upgrade you need to run on Tuesday, the alert that fires at 2am, the runbook that's three quarters out of date, and the slow accumulation of operational scar tissue that becomes institutional knowledge.

These posts come from the operating side of platform work. What we've broken and how. What we changed afterward. The runbook patterns that survive contact with reality. The operational habits that distinguish a system you can operate from a system you can demo.

If you've ever inherited a platform that "just works" and quietly discovered that nobody knew which parts of it were load-bearing, you'll recognize most of what's here.

// POSTS 9 entries
  1. FIG. 01

    The Cloud Didn't Simplify Infrastructure. It Redistributed the Complexity.

    The cloud didn't make infrastructure simpler. It moved the complexity somewhere less visible and replaced some of it with operational surfaces you have to learn from scratch. The 'cloud is awesome' moment and the 'I have no idea how this actually works' moment are often the same capability viewed six months apart.

  2. FIG. 02

    GitOps Is Not Continuous Delivery: The Difference Matters

    GitOps and continuous delivery are not the same thing. Most teams conflate them in ways that create real operational problems. GitOps is a deployment and reconciliation model. Continuous delivery is a software delivery practice. They compose well, but they solve different problems, and treating them as synonyms produces systems where neither works as well as it should.

  3. FIG. 03

    The Operational Surface Is the Cost Nobody Counts

    Most architecture evaluations compare tools in isolation. The better question is: what's the right tool given the operational surface I'm already committed to? Adding a new tool has a real cost that almost never shows up in the analysis. Reusing what's already there has a real value that almost never gets counted.

  4. FIG. 04

    You Can't Outsource Understanding

    You can delegate the work. You can use managed services. You can hire people who know the thing you don't. What you can't do is outsource the comprehension. When something breaks at 2am, the understanding either exists or it doesn't.

  5. FIG. 05

    handle_absent_entries: remove Almost Deleted Everything

    The thing that makes declarative automation powerful is exactly the thing that makes it dangerous. I wrote a user management task with handle_absent_entries: remove, defined a partial list, and RouterOS refused to execute because it would have deleted the last user with full access permissions. The safety net caught it. The lesson is about knowing where aggressive automation ends and self-inflicted disaster begins.

  6. FIG. 06

    MikroTik Will Delete Everything. It's Still the Right Choice.

    The 24-hour activation window is real. The support response time on a Friday night is real. The disk wipe if you miss the window is real. MikroTik is still the right choice. All of these things are true at the same time.

  7. FIG. 07

    Automating Network Config on Live Hardware

    The destination was never in question. The uncertainty was entirely in the path: how Ansible, RouterOS, and a task sequence would negotiate the journey from current state to desired state on live hardware.

  8. FIG. 08

    Bootstrapping Network Gear You've Never Touched Before

    The on-site session was two and a half hours. Rack, cable, verify, leave. That wasn't luck. The bootstrap happened weeks earlier at a desk, not in the rack.

  9. FIG. 09

    What Done Looks Like in Infrastructure Work

    Infrastructure work has a 'done' problem that software delivery has mostly solved, and infrastructure hasn't caught up. 'It's working' is not done. 'Nobody is complaining' is not done. And the cost of not defining done is that you never actually finish anything. You just stop actively working on it.