
Principles and Playbooks

A decade of platform, integration, and AI work converges on a small set of reusable ideas. This page collects the thesis that the case studies and blog posts argue in detail, along with the four playbooks a team can actually adopt.

This page is designed to print cleanly. Use your browser’s Print to PDF to share a handout version.

The thesis

Most “AI” and “platform” content argues about tools. The tools change every quarter. What holds up is the shape of the work: an explicit loop, an evidence trail, graceful degradation, a single config that generates many surfaces, a contract that makes “looks right” into “is right,” and a boring GitOps substrate underneath.

Every serious write-up on this site argues one of those six ideas, usually with scar tissue attached. The playbooks are how those ideas become something a team can adopt without re-deriving them from first principles.

Principle · Reviewed Apr 2026

Explicit Control Loops

If you cannot draw the loop, you do not have a process — you have vibes.

A reliable system, whether it is an AI agent, a deploy pipeline, or an on-call rotation, is a loop with observable state, explicit gates, and a bounded retry budget. The model or the tool is almost never the thing that makes it work. The control structure is.

The smallest useful version of the loop is: produce an artifact, verify it against explicit criteria, revise until acceptable, or fail loudly with a reason. Everything else — prompts, models, tooling — is supporting machinery around that shape.

When something is drifting, the first question is not "which model" but "what is the loop, and where is it open?" Missing gates, invisible state, and unbounded retries cause more production incidents than bad models do.

Principle · Reviewed Apr 2026

Evidence First, Measure Before You Optimize

Every non-trivial claim needs a number, a link, or a reproducible command behind it.

Most "optimize X" work starts in the wrong place because the baseline is assumed, not measured. Before I accept a claim about GPU cost, inference latency, integration volume, or agent reliability, I want the baseline: what is being measured, how, over what window, with what instrumentation.

This applies to my own claims too. A principle without evidence is a preference. A case study without metrics is a story. Every serious write-up on this site traces back to a repo, a command, a dashboard, or a concrete failure mode.

Measurement is not a deliverable on its own — the point is that it unlocks honest prioritization. Once the baseline is real, the conversation shifts from "what should we do" to "what will move the number most for the least effort."
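A baseline only counts if the window and the statistic are stated up front. A minimal sketch, assuming wall-clock latency is the number in question; `workload` is a hypothetical stand-in for the thing being optimized:

```python
import statistics
import time

def baseline(workload, runs=50):
    """Median and p95 wall-clock latency of `workload` over `runs` calls."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return {
        "runs": runs,                    # the window, stated explicitly
        "median_s": statistics.median(samples),
        "p95_s": statistics.quantiles(samples, n=20)[-1],  # 95th percentile
    }
```

The point is not the two functions; it is that the result records what was measured and over what window, so the "optimized" version can be compared against something real.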

Principle · Reviewed Apr 2026

Warnings Over Errors, Graceful Degradation by Default

Real-world data is messy. Halting on every recoverable issue means halting all the time.

In healthcare integration, inference serving, and agent workflows, the input is almost never "clean." HL7v2 messages drop fields. Partner APIs return undocumented shapes. Models occasionally produce malformed tool calls. A system that treats every deviation as a fatal error spends all its time on incident pages and none of its time moving data.

The alternative is not ignoring problems — it is classifying them. Recoverable issues become warnings that flow downstream with the record. Unrecoverable ones hit a dead-letter queue with enough context to replay. The pipeline keeps moving. The operator gets a prioritized queue instead of an alert storm.

This is the same pattern as circuit breakers, bulkheads, and bounded retries — applied one layer up, at the shape of the data itself. The guiding question is: what is the minimum that must be true for this record to be useful downstream, and what should we log but tolerate?
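The classify-don't-halt pattern fits in a dozen lines. The field names and the notion of a "minimum viable record" below are illustrative, not a real HL7v2 schema:

```python
REQUIRED = {"patient_id", "timestamp"}        # minimum to be useful downstream
EXPECTED = REQUIRED | {"provider", "notes"}   # nice to have, not fatal

def process(records):
    accepted, dead_letter = [], []
    for rec in records:
        missing = REQUIRED - rec.keys()
        if missing:
            # unrecoverable: park it with enough context to replay later
            dead_letter.append({"record": rec, "missing": sorted(missing)})
            continue
        # recoverable: the warning travels downstream with the record
        warnings = sorted(EXPECTED - rec.keys())
        accepted.append({**rec, "_warnings": warnings})
    return accepted, dead_letter
```

The pipeline keeps moving: degraded records flow on with their warnings attached, and only the genuinely unusable ones land in the replay queue.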

Principle · Reviewed Apr 2026

One Config, Many Surfaces

Variance belongs in configuration. The code that reads it should stay small and consistent.

Every platform I have shipped has converged on the same shape: a single source of truth that generates multiple consumer surfaces. fi-fhir has Source Profiles that drive five parsers. Loom has one registry that generates MCP configs for six AI assistants. GitOps has a single manifest tree that reconciles two Kubernetes clusters.

The payoff is not elegance — it is blast radius. When a new feed, assistant, or cluster needs to be added, the change touches one file. When behavior drifts, the diff is in one place. You spend the saved time on the parts that actually vary: the profile, the adapter, the specific overlay.

This pattern also surfaces a useful question when new complexity shows up: is this a new feature of the config, or is it the signal that a new surface needs its own generator? That question has saved me more refactors than any tool.
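In miniature, the pattern is one registry and several small generators that read it. The registry shape and generator names below are illustrative, not the actual Loom or fi-fhir schemas:

```python
# Single source of truth: all variance lives here.
REGISTRY = {
    "weather": {"command": "weather-mcp", "args": ["--units", "metric"]},
    "search":  {"command": "search-mcp",  "args": []},
}

def assistant_config(registry):
    """One surface: an MCP-style servers block for an AI assistant."""
    return {"mcpServers": {name: dict(spec) for name, spec in registry.items()}}

def shell_aliases(registry):
    """Another surface: shell aliases generated from the same registry."""
    return [f"alias {name}='{' '.join([spec['command'], *spec['args']])}'"
            for name, spec in registry.items()]
```

Adding a new tool touches one dict entry; every surface regenerates. The generators stay small and consistent because the variance lives in the data, not the code.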

Principle · Reviewed Apr 2026

Contract-Driven Integration

Schemas, checklists, citations, and invariants are what turn "looks right" into "is right."

Every painful integration I have seen had an implicit contract somewhere. The spec said one thing, production said another, and the gap was absorbed by a human reading tickets. Making the contract explicit — as a schema, a checklist, a CEL expression, or a property-based test — is the single highest-leverage change in most integration work.

The same applies to AI agents. "The model produced reasonable output" is not a verification strategy. A structured schema, a citation check, or a policy gate turns a generator into an agent. It also turns silent failure into loud, actionable failure, which is the only kind you can operate on.

Contracts are also the vehicle for institutional memory. A new engineer on the team does not need to re-derive the patient matching behavior from eight ticket threads — they read the schema and the test, and the system tells them what it actually does.
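An explicit contract can be as small as a function that returns violations. The schema fields and the citation rule below are illustrative; in practice this might be JSON Schema, a CEL expression, or a property-based test:

```python
def check_contract(output, source_ids):
    """Return a list of violations; an empty list means the contract holds."""
    violations = []
    for field in ("summary", "citations"):
        if field not in output:
            violations.append(f"missing field: {field}")
    # invariant: every citation must point at a source we actually retrieved
    for cite in output.get("citations", []):
        if cite not in source_ids:
            violations.append(f"citation not in retrieved sources: {cite}")
    if not output.get("citations"):
        violations.append("no citations: claims are unverifiable")
    return violations
```

The return value is the point: a generator that "looks right" either produces an empty list or a concrete, loggable list of reasons it is not right, which is something a gate can act on.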

Principle · Reviewed Apr 2026

GitOps as the Boring Substrate

Pin versions, declare resources, reconcile from git. Operations should be boring on purpose.

Every AI platform that becomes load-bearing for a team eventually becomes infrastructure. At that point, the question stops being "how fast can we iterate" and becomes "how do we stop surprising ourselves on a Saturday." The answer is almost always the same: pin image versions, declare resources, reconcile from git, and make imperative changes the exception rather than the default.

The boringness is the feature. A GPU cluster running on a Flux-reconciled overlay is legible to anyone who can read YAML. A deployment that requires three kubectl commands and a vibe check is legible only to the person who last ran them. The former scales; the latter does not.

This principle is deliberately unexciting, and it is the one that most separates real platforms from demos.
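The core of reconcile-from-git is that desired state is data and the only operation is a diff. A toy sketch, with illustrative names; real reconcilers such as Flux work per-resource with much more machinery:

```python
def reconcile(desired, observed):
    """Return the imperative actions needed to converge observed toward desired."""
    actions = []
    for name, spec in desired.items():
        if name not in observed:
            actions.append(("create", name, spec))
        elif observed[name] != spec:
            actions.append(("update", name, spec))  # e.g. a pinned image bump
    for name in observed.keys() - desired.keys():
        actions.append(("delete", name, None))      # prune what git no longer declares
    return actions
```

Because every change is a diff against a declared state in git, the Saturday question "what changed?" has a boring answer: read the commit.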

Working through a similar transition?

If your team is adopting AI-assisted development, standing up GPU infrastructure, or taming healthcare integration work, I am happy to compare notes.