Fallbacks
Specify model or provider fallbacks with your Universal endpoint to define what happens if a request fails.
For example, you could set up a gateway endpoint that:
- Sends a request to Workers AI Inference API.
- If that request fails, proceeds to OpenAI.
```mermaid
graph TD
A[AI Gateway] --> B[Request to Workers AI Inference API]
B -->|Success| C[Return Response]
B -->|Failure| D[Request to OpenAI API]
D --> E[Return Response]
```
You can add as many fallbacks as you need by adding more objects to the array.
```sh
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id} \
  --header 'Content-Type: application/json' \
  --data '[
    {
      "provider": "workers-ai",
      "endpoint": "@cf/meta/llama-3.1-8b-instruct",
      "headers": {
        "Authorization": "Bearer {cloudflare_token}",
        "Content-Type": "application/json"
      },
      "query": {
        "messages": [
          { "role": "system", "content": "You are a friendly assistant" },
          { "role": "user", "content": "What is Cloudflare?" }
        ]
      }
    },
    {
      "provider": "openai",
      "endpoint": "chat/completions",
      "headers": {
        "Authorization": "Bearer {open_ai_token}",
        "Content-Type": "application/json"
      },
      "query": {
        "model": "gpt-4o-mini",
        "stream": true,
        "messages": [
          { "role": "user", "content": "What is Cloudflare?" }
        ]
      }
    }
  ]'
```

When using the Universal endpoint with fallbacks, the response header `cf-aig-step` indicates which model successfully processed the request by returning the step number. This header provides visibility into whether a fallback was triggered and which model ultimately processed the response.
- `cf-aig-step: 0` – The first (primary) model was used successfully.
- `cf-aig-step: 1` – The request fell back to the second model.
- `cf-aig-step: 2` – The request fell back to the third model.
- Subsequent steps – Each fallback increments the step number by 1.
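Because `cf-aig-step` is a zero-based index into the fallback array you sent, you can map it back to the provider that actually answered. The sketch below is illustrative, not part of any SDK: the helper names are assumptions, and the real header would come from your HTTP client's response object.

```python
# Sketch: relate the Universal endpoint's fallback array to the
# cf-aig-step response header. Helper names here are hypothetical.

def build_fallback_payload(steps):
    """Each step is a dict with provider/endpoint/headers/query;
    list order is the fallback order sent to the Universal endpoint."""
    return list(steps)

def provider_for_step(payload, cf_aig_step):
    """cf-aig-step is a zero-based index into the fallback array."""
    return payload[int(cf_aig_step)]["provider"]

payload = build_fallback_payload([
    {"provider": "workers-ai", "endpoint": "@cf/meta/llama-3.1-8b-instruct",
     "headers": {}, "query": {}},
    {"provider": "openai", "endpoint": "chat/completions",
     "headers": {}, "query": {}},
])

# A response with cf-aig-step: 1 means the request fell back to OpenAI.
print(provider_for_step(payload, "1"))  # → openai
```

POST the payload itself to `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}` as shown in the curl example above, then read `cf-aig-step` from the response headers.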