Private AI, operated like real infrastructure
FlexInfer is the runtime anchor for private inference and model customization. Loom coordinates MCP, context, and agent fleets around that runtime. MentatLab turns direct model/API calls into DAG workflows. fi-fhir proves the pattern on sensitive healthcare integration work.
The story starts with private inference
The stack is intentionally layered: run and customize models inside your boundary, govern how agents reach tools and context, orchestrate repeatable DAG workflows over direct API models, and use fi-fhir when the workload is sensitive healthcare ETL instead of generic demo data.
Operate private model runtimes first
FlexInfer gives teams a Kubernetes-native control plane for model lifecycle, OpenAI-compatible routing, GPU-aware placement, scale-to-zero activation, quantization, adapters, and model artifact delivery.
Can this model run, scale, route, and roll back inside our cluster boundary?
FlexInfer product →Govern the agent and tool boundary
Loom and Loom Core turn MCP sprawl into a single governed entrypoint with registry sync, daemon routing, tiered memory, fleet state, sandbox execution, RBAC, audit, and HUD visibility.
Which agents can reach which tools, with what context, and what audit trail?
Loom Core product →Compose repeatable DAG workflows
MentatLab is the mission-control surface for DAG agent orchestration over direct API model nodes and internal services. In this stack, those model calls can target FlexInfer private endpoints instead of a frontier coding harness.
Can operators design, validate, run, and observe the workflow as a graph?
MentatLab product →Apply it to sensitive integration work
fi-fhir ties the platform story to healthcare ETL: Source Profiles, HL7v2/FHIR parsing, semantic events, workflow routing, terminology mapping, and AI-assisted explanations inside a controlled environment.
Can legacy clinical data become validated, explainable, routable events?
fi-fhir product →Each product owns a clear boundary
The platform pitch only works if each layer has a defensible job. These are the boundaries I would validate first in a private deployment.
FlexInfer
Kubernetes-native model lifecycle, OpenAI-compatible routing, GPU-aware runtime controls, quantization, adapters, and artifact delivery for private or hybrid inference.
- Deployment boundary
- Model runtime placement, scheduling, caching, activation, and adapter workflows stay inside your cluster boundary.
- Integration boundary
- Applications hit standard inference APIs while runtime operations stay inside your network and observability stack.
Loom
Loom plus Loom Core coordinate MCP servers, editor config, agent sessions, tiered context, Weaver routing, sandbox execution, RBAC, audit, HUD, and mobile fleet visibility.
- Deployment boundary
- Centralizes MCP server lifecycle, daemon routing, context memory, and policy boundaries for internal tools and agent access.
- Integration boundary
- Defines how agents reach internal systems with bounded context, auditable routing, and least-privilege intent.
MentatLab
Mission Control for DAG-based agent workflows over direct API model calls, internal services, and private inference endpoints such as FlexInfer-hosted models.
- Deployment boundary
- UI, gateway, and orchestrator services run inside your Kubernetes footprint alongside private model and integration services.
- Integration boundary
- Workflows call direct model/API nodes and MCP-governed tools without depending on frontier coding harness semantics.
fi-fhir
Healthcare-focused ingestion and transformation workflows across HL7v2, FHIR, CSV, EDI X12, and CDA/CCDA with profile-driven, testable data handling.
- Deployment boundary
- Data transformation pipeline runs in your controlled environment and deployment topology.
- Integration boundary
- Source Profiles, semantic events, validation, routing, and terminology mapping isolate source variability while preserving operational traceability.
The handoffs are the product
FlexInfer, Loom, MentatLab, and fi-fhir are useful separately. The stronger pitch is how the handoffs preserve private runtime control, context governance, workflow visibility, and sensitive data traceability.
FlexInfer + Loom
FlexInfer runs private model workloads in-cluster while Loom governs MCP routing, context memory, policy boundaries, and agent/tool access.
- Deployment boundary
- Model runtime placement, rollout safety, and capacity controls stay inside your cluster boundary.
- Integration boundary
- Agent access is routed through auditable context policies before reaching internal runtime services and private model endpoints.
Loom + MentatLab
Loom provides policy-governed context and tool routing while MentatLab delivers mission-control UX for DAG design, execution planning, and run visibility.
- Deployment boundary
- UI, gateway, and orchestrator services run in your private Kubernetes footprint.
- Integration boundary
- Operator workflows connect to direct model/API nodes, internal MCP-governed tools, and runtime services without moving control to shared SaaS planes.
FlexInfer + fi-fhir
fi-fhir handles profile-driven healthcare transformation while FlexInfer provides private runtime execution for downstream model workflows and AI-assisted integration development.
- Deployment boundary
- Data transformation and runtime execution remain inside your controlled environment.
- Integration boundary
- Profile-driven mapping isolates source variability while preserving operational traceability.
Operator mission-control surfaces
Loom makes the MCP, context, and fleet boundary visible. MentatLab makes DAG design, execution planning, and run state visible when workflows call direct model APIs and internal services.
MentatLab
Mission Control interface for building, validating, monitoring, and executing DAG-based agent workflows in private environments.
Loom Core governs context routing and policy boundaries. MentatLab provides the DAG design and run-visibility layer over direct API model calls, internal services, and private FlexInfer endpoints.
- Deployment boundary
- UI, gateway, and orchestrator services run inside your Kubernetes footprint alongside internal agent workloads.
- Integration boundary
- Connects orchestration workflows to direct model/API nodes, internal MCP-governed tools, and runtime services without moving data to shared SaaS control planes.
Loom Companion
SwiftUI app for fleet monitoring, session management, real-time alerts, and lightweight operator control from iPhone and iPad.
Loom Core exposes a frozen v1 mobile API (18 endpoints) with OAuth PKCE auth; Companion consumes it for on-the-go visibility into agents, sessions, and infrastructure.
- Deployment boundary
- Connects via LAN (trusted network) or Gateway (zero-trust with mandatory TLS) to your Loom Core HUD instance.
- Integration boundary
- Read-only fleet access with scope-gated mutations. Mobile tokens are isolated from internal agent routes with per-actor rate limiting and structured audit logging.
Current foundations and in-progress controls
4 controls are available today. 2 controls are currently in progress.
Healthcare-aligned implementation stories
A healthcare-first proof path demonstrates how this portfolio handles sensitive data transformations and operational reliability in private environments.
fi-fhir healthcare integration workflows
Profile-driven parsing and transformation for HL7v2 and FHIR in production-oriented workflows.
Operationalizing healthcare API integrations
Reliability patterns for long-lived, partner-facing integration surfaces under operational pressure.
Move from positioning to execution
Start with a readiness audit to baseline risk, cost, and deployment constraints. Then scope architecture work for your environment.