Specification Details
Core Specification
The core specification (v1/spec.yaml) defines the standard vocabulary that all provider manifests and runtimes share.
Standard Parameters
These parameters have consistent meaning across all providers:
| Parameter | Type | Description |
|---|---|---|
| temperature | float | Randomness control (0.0 – 2.0) |
| max_tokens | integer | Maximum response tokens |
| top_p | float | Nucleus sampling threshold |
| stream | boolean | Enable streaming response |
| stop | string[] | Stop sequences |
| tools | object[] | Tool/function definitions |
| tool_choice | string/object | Tool selection mode |
| response_format | object | Structured output format |
Provider manifests map these standard names to provider-specific parameter names. For example, OpenAI uses max_completion_tokens while Anthropic uses max_tokens.
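As a rough illustration of that mapping, the sketch below renames standard parameters for a given provider. `PARAM_MAP` and `to_provider_params` are hypothetical names invented for this example, not part of the specification. Only the `max_tokens` entries reflect the OpenAI/Anthropic difference described above.

```python
from typing import Any, Dict

# Hypothetical rename table: standard name -> provider-specific name.
# Names that are absent pass through unchanged.
PARAM_MAP: Dict[str, Dict[str, str]] = {
    "openai": {"max_tokens": "max_completion_tokens"},
    "anthropic": {},  # Anthropic already uses max_tokens
}


def to_provider_params(provider: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Rename standard parameter keys to the provider's own names."""
    renames = PARAM_MAP.get(provider, {})
    return {renames.get(name, name): value for name, value in params.items()}
```

With this table, a request carrying `max_tokens` would be sent to OpenAI as `max_completion_tokens`, while the same request to Anthropic keeps the standard name.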
Streaming Events
The specification defines unified streaming event types that runtimes emit:
| Event | Description |
|---|---|
| PartialContentDelta | Text content fragment |
| ThinkingDelta | Reasoning/thinking block (extended thinking models) |
| ToolCallStarted | Function/tool invocation begins |
| PartialToolCall | Tool call argument streaming |
| ToolCallEnded | Tool invocation complete |
| StreamEnd | Response stream complete |
| StreamError | Stream-level error |
| Metadata | Usage statistics, model info |
Provider manifests declare JSONPath-based rules that map provider-specific events to these standard types.
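The sketch below shows how path-based rules of this kind could be applied. It is a simplified assumption rather than the manifest format itself: the rule layout (`when`, `emit`, `content`), the raw event shapes, and the `resolve` helper are all hypothetical. `resolve` handles only plain `$.a.b` paths, a small subset of JSONPath.

```python
from typing import Any, Optional, Tuple


def resolve(path: str, event: dict) -> Any:
    """Resolve a '$.a.b' style path; return None if any key is missing."""
    node: Any = event
    keys = path[2:] if path.startswith("$.") else path
    for key in keys.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Hypothetical rules: every path in "when" must equal the given value.
RULES = [
    {
        "when": {"$.type": "content_block_delta", "$.delta.type": "text_delta"},
        "emit": "PartialContentDelta",
        "content": "$.delta.text",
    },
    {"when": {"$.type": "message_stop"}, "emit": "StreamEnd", "content": None},
]


def normalize(event: dict) -> Optional[Tuple[str, Any]]:
    """Return (standard_event_type, payload) for the first matching rule."""
    for rule in RULES:
        if all(resolve(p, event) == v for p, v in rule["when"].items()):
            payload = resolve(rule["content"], event) if rule["content"] else None
            return rule["emit"], payload
    return None  # no rule matched; a runtime might ignore or log the event
```

Keeping the rules as data rather than code is what would let a manifest describe a new provider without changes to the runtime.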
Error Classes (V2 Standard Codes)
V2 defines 13 standardized error codes. Provider-specific errors are mapped to these codes for consistent handling across runtimes:
| Code | Name | Category | Retryable | Fallbackable |
|---|---|---|---|---|
| E1001 | invalid_request | Client | No | No |
| E1002 | authentication | Client | No | Yes |
| E1003 | permission_denied | Client | No | No |
| E1004 | not_found | Client | No | No |
| E1005 | request_too_large | Client | No | No |
| E2001 | rate_limited | Rate | Yes | Yes |
| E2002 | quota_exhausted | Rate | No | Yes |
| E3001 | server_error | Server | Yes | Yes |
| E3002 | overloaded | Server | Yes | Yes |
| E3003 | timeout | Server | Yes | Yes |
| E4001 | conflict | Operational | Yes | No |
| E4002 | cancelled | Operational | No | No |
| E9999 | unknown | Unknown | No | No |
- Retryable — Runtimes may retry the request (with backoff) for transient failures
- Fallbackable — Runtimes may try an alternative provider or model in a fallback chain
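One way a runtime could act on these two flags is sketched below. The table is transcribed from the codes above. The `next_action` function, its parameters, and the choice to exhaust retries before falling back are assumptions made for this example, not behavior the specification mandates.

```python
from typing import NamedTuple


class ErrorClass(NamedTuple):
    name: str
    retryable: bool
    fallbackable: bool


# Transcribed from the standard error code table.
ERROR_CLASSES = {
    "E1001": ErrorClass("invalid_request", False, False),
    "E1002": ErrorClass("authentication", False, True),
    "E1003": ErrorClass("permission_denied", False, False),
    "E1004": ErrorClass("not_found", False, False),
    "E1005": ErrorClass("request_too_large", False, False),
    "E2001": ErrorClass("rate_limited", True, True),
    "E2002": ErrorClass("quota_exhausted", False, True),
    "E3001": ErrorClass("server_error", True, True),
    "E3002": ErrorClass("overloaded", True, True),
    "E3003": ErrorClass("timeout", True, True),
    "E4001": ErrorClass("conflict", True, False),
    "E4002": ErrorClass("cancelled", False, False),
    "E9999": ErrorClass("unknown", False, False),
}


def next_action(code: str, attempt: int, max_retries: int = 3,
                has_fallback: bool = True) -> str:
    """Return "retry", "fallback", or "fail" for a failed attempt.

    `attempt` is the number of retries already made (0 on first failure).
    Assumption: retries are exhausted before trying a fallback.
    Unrecognized codes are treated as E9999 (unknown).
    """
    info = ERROR_CLASSES.get(code, ERROR_CLASSES["E9999"])
    if info.retryable and attempt < max_retries:
        return "retry"
    if info.fallbackable and has_fallback:
        return "fallback"
    return "fail"
```

For instance, an `authentication` error (E1002) is never retried but can move to the next provider in a fallback chain, while a `conflict` (E4001) is retried and then fails outright.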
Retry Policies
The spec defines standard retry strategies:
```yaml
retry_policy:
  strategy: "exponential_backoff"
  max_retries: 3
  initial_delay_ms: 1000
  max_delay_ms: 30000
  backoff_multiplier: 2.0
  retryable_errors:
    - "rate_limited"
    - "overloaded"
    - "server_error"
    - "timeout"
```
Termination Reasons
Normalized finish reasons for response completion:
| Reason | Description |
|---|---|
| end_turn | Natural completion |
| max_tokens | Token limit reached |
| tool_use | Model wants to call a tool |
| stop_sequence | Stop sequence encountered |
| content_filter | Filtered by content policy |
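A minimal sketch of this normalization follows. The mapping table, the raw values listed under `"openai"`, and the `normalize_finish_reason` function are assumptions for illustration. The actual provider-to-standard mapping would live in the provider manifests.

```python
from typing import Optional

# The normalized reasons from the table above.
NORMALIZED = {"end_turn", "max_tokens", "tool_use", "stop_sequence", "content_filter"}

# Hypothetical per-family mapping: raw provider value -> normalized reason.
FINISH_REASON_MAP = {
    "openai": {
        "stop": "end_turn",
        "length": "max_tokens",
        "tool_calls": "tool_use",
        "content_filter": "content_filter",
    },
}


def normalize_finish_reason(family: str, raw: str) -> Optional[str]:
    """Map a raw finish reason to a normalized one, or None if unrecognized."""
    mapped = FINISH_REASON_MAP.get(family, {}).get(raw)
    if mapped is not None:
        return mapped
    # Values that already use a normalized name pass through unchanged.
    return raw if raw in NORMALIZED else None
```

Returning `None` for unrecognized values leaves the decision about unknown reasons to the caller. That choice belongs to this sketch, not to the specification.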
API Families
Providers are categorized into API families to prevent request/response format confusion:
- openai — OpenAI-compatible APIs (also used by Groq, Together, DeepSeek, etc.)
- anthropic — Anthropic Messages API
- gemini — Google Gemini API
- custom — Provider-specific format
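The families above could be modeled as an enumeration with a provider lookup, as in the sketch below. The `ApiFamily` enum, the `PROVIDER_FAMILY` table, the lowercase provider identifiers, and the fallback to `custom` for unlisted providers are assumptions for this example. In practice each provider manifest would declare its own family.

```python
from enum import Enum


class ApiFamily(Enum):
    OPENAI = "openai"
    ANTHROPIC = "anthropic"
    GEMINI = "gemini"
    CUSTOM = "custom"


# Hypothetical provider identifiers mapped to their API family.
PROVIDER_FAMILY = {
    "openai": ApiFamily.OPENAI,
    "groq": ApiFamily.OPENAI,
    "together": ApiFamily.OPENAI,
    "deepseek": ApiFamily.OPENAI,
    "anthropic": ApiFamily.ANTHROPIC,
    "gemini": ApiFamily.GEMINI,
}


def family_for(provider: str) -> ApiFamily:
    """Look up a provider's family; unlisted providers default to CUSTOM."""
    return PROVIDER_FAMILY.get(provider.lower(), ApiFamily.CUSTOM)
```

A runtime can then select its request builder and response parser by family rather than by individual provider, so OpenAI-compatible providers share a single code path.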
Next Steps
- Provider Manifests — How provider configs work
- Model Registry — Model configuration details