// WRITING / TAGS / INFRASTRUCTURE

The substrate everything else runs on. Compute, network, storage, the hardware decisions, and the trade-offs that shape what can be built above.

Infrastructure is the layer where the abstractions stop. Bare metal, switching fabric, storage arrays, the racks they sit in, and the operational model that keeps them honest. The decisions here have the longest half-life of anything we work on. Get them wrong and you'll be living with the consequences for years.

These posts cover both the physical side of infrastructure and the philosophical one. When to own hardware versus rent it. How to think about lifecycle management on equipment with a 10-year horizon. What cloud actually abstracts and what it just relocates. Why "the cloud" still has datacenters underneath.

Some of these come from the lab. Our own half rack of bare metal in a Los Angeles datacenter where we build everything we advise on. Others come from client work where the infrastructure question was the whole question.

// POSTS 12 entries
  1. FIG. 01

    The Cloud Didn't Simplify Infrastructure. It Redistributed the Complexity.

    The cloud didn't make infrastructure simpler. It moved the complexity somewhere less visible and replaced some of it with operational surfaces you have to learn from scratch. The 'cloud is awesome' moment and the 'I have no idea how this actually works' moment are often the same capability viewed six months apart.

  2. FIG. 02

    Idempotency Is a Promise

    Tools offer idempotency as a good intention. Systems need it as a guarantee. Knowing when to step in and upgrade one to the other is what separates automation that works from automation that works until it doesn't.

  3. FIG. 03

    Why Infrastructure Is Always Somebody's Second Priority

    Infrastructure work has a visibility problem baked into the nature of the work itself. When it's working nobody notices. When it fails everyone notices. That asymmetry shapes every prioritization conversation infrastructure teams ever have, and it doesn't fix itself with better communication.

  4. FIG. 04

    The Operational Surface Is the Cost Nobody Counts

    Most architecture evaluations compare tools in isolation. The better question is: what's the right tool given the operational surface I'm already committed to? Adding a new tool has a real cost that almost never shows up in the analysis. Reusing what's already there has a real value that almost never gets counted.

  5. FIG. 05

    You Can't Outsource Understanding

    You can delegate the work. You can use managed services. You can hire people who know the thing you don't. What you can't do is outsource the comprehension. When something breaks at 2am, the understanding either exists or it doesn't.

  6. FIG. 06

    The Plan Is Not the Schedule

    Good planning isn't about staying on schedule — it's about making better decisions in flight, taking on deliberate technical debt with clear eyes, and arriving at the right destination even when the route changes.

  7. FIG. 07

    handle_absent_entries: remove Almost Deleted Everything

    The thing that makes declarative automation powerful is exactly the thing that makes it dangerous. I wrote a user management task with handle_absent_entries: remove, defined a partial list, and RouterOS refused to execute because it would have deleted the last user with full access permissions. The safety net caught it. The lesson is about knowing where aggressive automation ends and self-inflicted disaster begins.

  8. FIG. 08

    MikroTik Will Delete Everything. It's Still the Right Choice.

    The 24-hour activation window is real. The support response time on a Friday night is real. The disk wipe if you miss the window is real. MikroTik is still the right choice. All of these things are true at the same time.

  9. FIG. 09

    Automating Network Config on Live Hardware

    The destination was never in question. The uncertainty was entirely in the path: how Ansible, RouterOS, and a task sequence would negotiate the journey from current state to desired state on live hardware.

  10. FIG. 10

    Bootstrapping Network Gear You've Never Touched Before

    The on-site session was two and a half hours. Rack, cable, verify, leave. That wasn't luck. The bootstrap happened weeks earlier at a desk, not in the rack.

  11. FIG. 11

    What Done Looks Like in Infrastructure Work

    Infrastructure work has a 'done' problem. Software delivery mostly solved it, and we haven't caught up. 'It's working' is not done. 'Nobody is complaining' is not done. And the cost of not defining done is that you never actually finish anything. You just stop actively working on it.

  12. FIG. 12

    Four Waves: How a Home Lab Grows Up

    A home lab isn't a static thing. It grows through distinct phases. Wave one is making something work. Wave two is making it more complicated. Wave three is adding rigor. Wave four is building a true datacenter analogue. Most people stop at wave two. Wave four is where the interesting work is.
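
A note on entry 7, for readers who haven't hit this failure mode yet. The sketch below is a toy model of declarative reconciliation with and without removal of absent entries. It is not the Ansible module or RouterOS itself: `User`, `reconcile`, `remove_absent`, and `LastAdminError` are hypothetical names invented for illustration, and the guard inside `reconcile` only stands in for the device-side refusal described in the post. It assumes a group named "full" means full access.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    group: str  # assumption: "full" stands in for a full-access group


class LastAdminError(RuntimeError):
    """Raised when a plan would remove the last full-access user."""


def reconcile(current, desired, remove_absent=False):
    """Return the state that results from applying `desired` to `current`.

    Both arguments are dicts keyed by user name. Entries missing from
    `desired` are kept unless remove_absent is True, in which case the
    desired list is treated as the complete truth.
    """
    result = dict(desired) if remove_absent else {**current, **desired}

    # Stand-in for the safety net: refuse a plan that leaves no admin behind.
    had_admin = any(u.group == "full" for u in current.values())
    has_admin = any(u.group == "full" for u in result.values())
    if had_admin and not has_admin:
        raise LastAdminError("plan would remove the last full-access user")
    return result


current = {
    "admin": User("admin", "full"),
    "backup": User("backup", "read"),
}
# A partial list: it names one new user and says nothing about the others.
partial = {"monitor": User("monitor", "read")}

# Without removal, the partial list is additive and safe to re-apply.
merged = reconcile(current, partial)
assert reconcile(merged, partial) == merged  # second run changes nothing

# With removal, the same partial list becomes a delete order.
try:
    reconcile(current, partial, remove_absent=True)
except LastAdminError as exc:
    print(f"refused: {exc}")
```

The point of the two calls is that the input never changed. Only the meaning of "absent" did. A partial list is harmless when absent means "not my concern" and destructive when absent means "should not exist".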