Ensemble Overview - iGent Concert

Ensemble is the inference authority in the Fabric stack. Managed dev uses the company-wide staging Ensemble endpoint. Optional local checkouts can run the Go server for local-mode validation.

Legacy REST/catalog documentation still exists for historical clients, but the current service is Fabric RPC first.

What It Does

Lists available models through inference/models.list.
Generates responses through inference/generate.
Adapts provider APIs behind a common Fabric inference surface.
Streams generation events when the selected transport supports streaming.
Reports health for hosted staging and optional local-mode deployments.
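As an illustration of the common inference surface described above, the following is a minimal Go sketch, not Ensemble's actual code. The `Adapter` interface, `mockAdapter`, the model name `mock-1`, and the `dispatch` helper are all hypothetical names introduced for this example; only the method names `inference/models.list` and `inference/generate` come from this page.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Adapter is a hypothetical common surface that one provider family implements.
type Adapter interface {
	ListModels() []string
	Generate(model, prompt string) (string, error)
}

// mockAdapter stands in for the Mock provider; it echoes the prompt back.
type mockAdapter struct{}

func (mockAdapter) ListModels() []string { return []string{"mock-1"} }

func (mockAdapter) Generate(model, prompt string) (string, error) {
	if model != "mock-1" {
		return "", errors.New("unknown model: " + model)
	}
	return "echo: " + prompt, nil
}

// dispatch routes a Fabric method name to the matching adapter call.
// The comma-joined model list is a simplification for this sketch.
func dispatch(a Adapter, method, model, prompt string) (string, error) {
	switch method {
	case "inference/models.list":
		return strings.Join(a.ListModels(), ","), nil
	case "inference/generate":
		return a.Generate(model, prompt)
	default:
		return "", errors.New("unsupported method: " + method)
	}
}

func main() {
	out, err := dispatch(mockAdapter{}, "inference/generate", "mock-1", "hello")
	fmt.Println(out, err)
}
```

Under this shape, each provider family would plug in as another `Adapter` implementation, and callers would only ever see the two method names.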

Active Providers

Current active adapter coverage includes:

Provider family	Notes
Anthropic	Native Anthropic adapter
OpenAI-compatible	OpenAI and compatible APIs
Gemini	Google Gemini adapter
OpenRouter	Aggregated model provider
Mock	Development and validation path

Provider availability depends on configuration and credentials in the runtime environment.

Runtime

Aspect	Current shape
Language	Go
Default endpoint	SSM-hydrated staging `ENSEMBLE_URL`
Local endpoint	`127.0.0.1:8004` only when `FABRIC_ENSEMBLE_MODE=local`
Public host	`https://ensemble.fabric.dev.aws.igent.ai` retained for compatibility
Active route	Native Ensemble API routes
Legacy route	`ensemble-rest` on `127.0.0.1:8007` when `ensemble/bin/ensemble.old` exists
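The endpoint rows in the table above suggest a simple selection rule, sketched here in Go. This is an assumption about how a client might resolve the endpoint, not the stack's actual resolution logic: `resolveEndpoint` is a hypothetical helper, and the sketch assumes `ENSEMBLE_URL` has already been hydrated into the environment. The local address is returned exactly as the table lists it, without a scheme.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// localEndpoint is the local-mode address listed in the Runtime table.
const localEndpoint = "127.0.0.1:8004"

// resolveEndpoint is a hypothetical helper: it returns the local address only
// when mode is "local", and otherwise falls back to the staging URL.
func resolveEndpoint(mode, stagingURL string) (string, error) {
	if mode == "local" {
		return localEndpoint, nil
	}
	if stagingURL == "" {
		return "", errors.New("ENSEMBLE_URL is not set and FABRIC_ENSEMBLE_MODE is not local")
	}
	return stagingURL, nil
}

func main() {
	endpoint, err := resolveEndpoint(os.Getenv("FABRIC_ENSEMBLE_MODE"), os.Getenv("ENSEMBLE_URL"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("ensemble endpoint:", endpoint)
}
```

Passing the two values in as arguments, rather than reading the environment inside the helper, keeps the rule easy to test without touching process state.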

Current Limits

The active Fabric adapter supports model listing and generation. Status retrieval and cancellation surfaces are intentionally conservative placeholders unless a concrete runtime implementation is enabled.

In The System

Agents normally reach Ensemble as part of runtime work coordinated by Podium. Diminuendo may also broker model information or inference calls when the product surface needs a single gateway entry point.