Ensemble Architecture

Current architecture of the Ensemble Fabric inference service

Ensemble keeps provider-specific details behind a Fabric inference boundary. The active path is intentionally small: receive a Fabric RPC call, validate it, choose the provider adapter, execute the request, and return or stream a normalized result.

Active Request Path

Fabric client or gateway
        |
        | POST /rpc
        v
Ensemble Fabric server
        |
        +--> capability/model registry
        +--> provider adapter selection
        +--> generation execution
        +--> normalized Fabric response/events
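
The path in the diagram can be sketched as a single handler. This is a minimal illustration, not the Ensemble implementation: it assumes a JSON-RPC 2.0 envelope, and the names MODEL_REGISTRY, ADAPTERS, echo_adapter, handle_rpc, the "model" and "prompt" parameters, and the method name inference/generate are all hypothetical.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical registry: model name -> provider key (not taken from Ensemble).
MODEL_REGISTRY: Dict[str, str] = {"demo-model": "echo"}


def echo_adapter(params: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in provider adapter that returns the prompt unchanged."""
    return {"text": params["prompt"]}


# Hypothetical adapter table: provider key -> adapter callable.
ADAPTERS: Dict[str, Callable[[Dict[str, Any]], Dict[str, Any]]] = {
    "echo": echo_adapter,
}


def rpc_error(req_id: Optional[Any], code: int, message: str) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 error response."""
    return {"jsonrpc": "2.0", "id": req_id,
            "error": {"code": code, "message": message}}


def handle_rpc(request: Dict[str, Any]) -> Dict[str, Any]:
    req_id = request.get("id")

    # 1. Validate the envelope (assumed JSON-RPC 2.0) and the method family.
    if request.get("jsonrpc") != "2.0" or not isinstance(request.get("method"), str):
        return rpc_error(req_id, -32600, "Invalid Request")
    if not request["method"].startswith("inference/"):
        return rpc_error(req_id, -32601, "Method not found")
    params = request.get("params")
    if not isinstance(params, dict) or "model" not in params or "prompt" not in params:
        return rpc_error(req_id, -32602, "Invalid params")

    # 2. Consult the capability/model registry.
    provider = MODEL_REGISTRY.get(params["model"])
    if provider is None:
        return rpc_error(req_id, -32602, "Unknown model")

    # 3. Select the provider adapter, then 4. execute the generation.
    raw = ADAPTERS[provider](params)

    # 5. Return a normalized result, independent of the provider's own shape.
    return {"jsonrpc": "2.0", "id": req_id,
            "result": {"model": params["model"], "output": raw["text"]}}
```

Keeping validation ahead of registry lookup means malformed calls are rejected before any provider-specific code runs, which matches the intent of keeping provider details behind the boundary.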

Components

Component             Job
RPC server            Accepts Fabric JSON-RPC requests on /rpc
Model registry        Reports configured model availability and capabilities
Provider adapters     Translate Fabric inference requests to provider APIs
Streaming layer       Emits generation deltas and terminal events where supported
Health/observability  Exposes service health and telemetry for the root stack
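
To illustrate how the provider adapters and the streaming layer might fit together, the sketch below emits deltas followed by a terminal event, and falls back to a single delta when an adapter does not stream. The Event type, the supports_streaming flag, and the WordAdapter and BlockAdapter classes are assumptions made for this example, not Ensemble's actual interfaces.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Event:
    """Hypothetical normalized event: kind is "delta" or "done"."""
    kind: str
    text: str = ""


class WordAdapter:
    """Illustrative adapter whose provider can stream word by word."""
    supports_streaming = True

    def generate(self, prompt: str) -> str:
        return prompt.upper()

    def stream(self, prompt: str) -> Iterator[str]:
        for word in prompt.upper().split():
            yield word


class BlockAdapter:
    """Illustrative adapter whose provider only returns complete responses."""
    supports_streaming = False

    def generate(self, prompt: str) -> str:
        return prompt.upper()


def emit_events(adapter, prompt: str) -> Iterator[Event]:
    """Emit generation deltas where supported, then one terminal event."""
    if getattr(adapter, "supports_streaming", False):
        for chunk in adapter.stream(prompt):
            yield Event("delta", chunk)
    else:
        # No streaming support: deliver the whole result as a single delta.
        yield Event("delta", adapter.generate(prompt))
    yield Event("done")
```

With this shape, callers always see the same event sequence (zero or more deltas, then a terminal event) regardless of which provider produced the output.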

Legacy Compatibility

Some older docs and clients refer to /api/v1/generate, /api/v1/stream, /api/v1/models, and response-persistence endpoints. In the current dev stack, those endpoints belong to the optional ensemble-rest compatibility service, which starts only when ensemble/bin/ensemble.old exists.

New Fabric work should target /rpc and the inference/* method family.
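
As a rough illustration of a client targeting /rpc, the sketch below builds a POST request with the Python standard library. It assumes a JSON-RPC 2.0 envelope; the base URL, the method name inference/generate, and the parameter names are hypothetical placeholders, since the actual members of the inference/* family are not listed here.

```python
import json
import urllib.request
from typing import Any, Dict


def build_rpc_request(base_url: str, method: str,
                      params: Dict[str, Any], req_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) a POST to the /rpc endpoint."""
    if not method.startswith("inference/"):
        raise ValueError("expected a method in the inference/* family")
    body = json.dumps({
        "jsonrpc": "2.0",   # assumed envelope version
        "id": req_id,
        "method": method,
        "params": params,
    }).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/rpc",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, method, and parameters, for illustration only.
req = build_rpc_request(
    "http://localhost:8080",
    "inference/generate",
    {"model": "demo-model", "prompt": "hello"},
)
```

Passing req to urllib.request.urlopen would send it to a running server; the request is only constructed here so the example has no network dependency.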