RAMBLINGS// field log
Of ramblings & field notes.
Homelab, Star Citizen, 3D printing, TTRPGs — whatever I'm nerding out on. Infrequent, and at length.
WORTH YOUR TIME
The canary migration went perfectly, so I ran the same playbook on the last two nodes. They found five new ways to make me earn it — node-local data that vaporises on reinstall, an OSD that booted faster than its network, a password bug I'd only half-fixed, a restore that raced itself, and a serial number I wrongly swore I couldn't read.
Enterprise SSDs arrived, so I migrated a live Talos control plane onto them. First I had to fix the backups, then learn that swapping a boot disk on Talos isn't a swap at all — it's a rebuild. Plus the canary node that taught me five things I only half-believed.
I was wrong about LACP. Time to rewire the Ceph fabric. Part 1 of 2 — the plan.
What happens when you put consumer NVMe under an etcd + Ceph mon workload. Part 1 of 3.
Ditching Ollama for LocalAI, battling P2P federation that doesn't work in Kubernetes, and building a self-hosted AI stack with persistent memory.
A journey through TrueNAS, Oracle Cloud, and Hetzner before finally landing on AWS Graviton for running Android containers with acceptable latency from New Zealand.
Deploying Pterodactyl Panel on Kubernetes with Wings running on TrueNAS for self-hosted game server management.
How a CephFS sparse file handling quirk silently corrupted my app configs during VolSync restores—and the multi-day recovery effort across qbittorrent, sabnzbd, sonarr, radarr, and filebrowser using a mix of Kopia snapshots and old Restic backups.
BGP was supposed to fix my hairpin routing issues. It didn't. Here's how CoreDNS rewriting saved the day when pods couldn't reach LoadBalancer VIPs on the same node.
A real-world walkthrough of upgrading Ceph from v18 (Reef) through v19 (Squid) to v20 (Tentacle) via GitOps—including the correction of my wrong assumptions about Rook version constraints.
How I replaced Barman Cloud Plugin with pgBackRest to get true dual-destination full backups to both Backblaze B2 and Cloudflare R2, then migrated my entire PostgreSQL infrastructure to PostgreSQL 18.
Why I deployed a self-hosted GitHub Actions runner and Cloudflare Pages to serve JSON schemas extracted from my cluster's CRDs, eliminating dependency on third-party schema hosts.
Why I moved from Cilium L2 announcements to BGP for LoadBalancer IP advertisement, and how a dedicated Services VLAN simplified everything.
Why etcd fragments over time and how to reclaim disk space with talosctl etcd defrag.
How I migrated my Kubernetes PVC backups from Restic to Kopia with a 3-2-1 backup strategy: hourly NFS backups for fast restores, plus daily cloud backups to Backblaze B2 and Cloudflare R2 for disaster recovery.
Why I moved every Flux Kustomization into its target namespace, the challenges with substituteFrom, and how strategic patching made it work.
How I replaced per-app Tailscale ingresses with a single Connector and Split DNS for same-URL-everywhere remote access
How I configured Model Context Protocol servers to give Claude Code superpowers over my Kubernetes cluster
What I broke, how I wiped everything, and the steps I'm using to bootstrap Talos + Flux again.
Setting up the official Tesla Fleet Addon for Home Assistant with Kubernetes
Rolling out Ollama in Kubernetes with shared storage and Open-WebUI
Prerequisites and getting Open-WebUI up and running
Breaking down the buzzwords and tech behind today's AI boom
All connections stopped
When recovery goes bad
Protecting your investment