The archive, mapped
Every blog post and case study on the site, mapped to the principles and playbooks each piece argues for. Filter by any principle or playbook to see just the evidence behind it.
6 items matching the filter:

- Two-Lane Text GPU Allocation: Quality + Vision/Fast (Plus a Media Lane) · Feb 9, 2026 · 11 min
How I redistributed 6 models across 3 GPU nodes to eliminate contention, using priority-based shared groups and label-based aliases for routing and failover.
- How I run multiple OpenAI-compatible LLM endpoints on a small K3s cluster with AMD Radeon GPUs, and what I had to do to make it stable.
- Standing Up a GPU-Ready Private AI Platform (Harvester + K3s + Flux + GitLab) · Dec 29, 2025 · 6 min
Field notes from building and operating a small private GPU platform with Harvester, K3s, and a GitLab → Flux delivery loop.
- Hybrid/On-Prem GPU: The Boring GitOps Path · Dec 29, 2025 · 4 min
A practical guide to running GPU workloads on-prem or hybrid, using Kubernetes and GitOps patterns that make operations boring.
- Welcome to My Homelab · Nov 27, 2025 · 5 min
The infrastructure, product surfaces, and live demos behind the FlexInfer, Loom, and fi-fhir work I publish here.
- How I built FlexDeck, a full-stack operations dashboard with real-time K8s monitoring, GitLab CI/CD visualization, and AI model management using Go and SolidJS.