mirror of
https://github.com/ksyasuda/dotfiles.git
synced 2026-03-21 18:11:27 -07:00
update skills
.agents/skills/cloudflare-deploy/references/ai-gateway/README.md
@@ -0,0 +1,175 @@
# Cloudflare AI Gateway

Expert guidance for implementing Cloudflare AI Gateway - a universal gateway for AI model providers with analytics, caching, rate limiting, and routing capabilities.

## When to Use This Reference

- Setting up AI Gateway for any AI provider (OpenAI, Anthropic, Workers AI, etc.)
- Implementing caching, rate limiting, or request retry/fallback
- Configuring dynamic routing with A/B testing or model fallbacks
- Managing provider API keys securely with BYOK
- Adding security features (guardrails, DLP)
- Setting up observability with logging and custom metadata
- Debugging AI Gateway requests or optimizing configurations

## Quick Start

**What's your setup?**

- **Using Vercel AI SDK** → Pattern 1 (recommended) - see [sdk-integration.md](./sdk-integration.md)
- **Using OpenAI SDK** → Pattern 2 - see [sdk-integration.md](./sdk-integration.md)
- **Cloudflare Worker + Workers AI** → Pattern 3 - see [sdk-integration.md](./sdk-integration.md)
- **Direct HTTP (any language)** → Pattern 4 - see [configuration.md](./configuration.md)
- **Framework (LangChain, etc.)** → See [sdk-integration.md](./sdk-integration.md)

## Pattern 1: Vercel AI SDK (Recommended)

The most current pattern: it uses the official `ai-gateway-provider` package and supports automatic fallbacks.

```typescript
import { createAiGateway } from 'ai-gateway-provider';
import { createOpenAI } from '@ai-sdk/openai';
import { createAnthropic } from '@ai-sdk/anthropic';
import { generateText } from 'ai';

const gateway = createAiGateway({
  accountId: process.env.CF_ACCOUNT_ID,
  gateway: process.env.CF_GATEWAY_ID,
});

const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

// Second provider, used by the fallback example
const anthropic = createAnthropic({
  apiKey: process.env.ANTHROPIC_API_KEY
});

// Single model
const { text } = await generateText({
  model: gateway(openai('gpt-4o')),
  prompt: 'Hello'
});

// Automatic fallback array
const { text: fallbackText } = await generateText({
  model: gateway([
    openai('gpt-4o'),               // Try first
    anthropic('claude-sonnet-4-5'), // Fallback
  ]),
  prompt: 'Hello'
});
```

**Install:** `npm install ai-gateway-provider ai @ai-sdk/openai @ai-sdk/anthropic`

## Pattern 2: OpenAI SDK

Drop-in replacement for the OpenAI API with multi-provider support.

```typescript
import OpenAI from 'openai';

// IDs and gateway token read from the environment (variable names are illustrative)
const accountId = process.env.CF_ACCOUNT_ID;
const gatewayId = process.env.CF_GATEWAY_ID;
const cfToken = process.env.CF_API_TOKEN;

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/compat`,
  defaultHeaders: {
    'cf-aig-authorization': `Bearer ${cfToken}` // For authenticated gateways
  }
});

// Switch providers by changing model format: {provider}/{model}
const response = await client.chat.completions.create({
  model: 'openai/gpt-4o', // or 'anthropic/claude-sonnet-4-5'
  messages: [{ role: 'user', content: 'Hello!' }]
});
```

## Pattern 3: Workers AI Binding

For Cloudflare Workers using Workers AI.

```typescript
export default {
  async fetch(request, env, ctx) {
    const response = await env.AI.run(
      '@cf/meta/llama-3-8b-instruct',
      { messages: [{ role: 'user', content: 'Hello!' }] },
      {
        gateway: {
          id: 'my-gateway',
          metadata: { userId: '123', team: 'engineering' }
        }
      }
    );

    return Response.json(response);
  }
};
```

## Headers Quick Reference

| Header | Purpose | Example | Notes |
|--------|---------|---------|-------|
| `cf-aig-authorization` | Gateway auth | `Bearer {token}` | Required for authenticated gateways |
| `cf-aig-metadata` | Tracking | `{"userId":"x"}` | Max 5 entries, flat structure |
| `cf-aig-cache-ttl` | Cache duration | `3600` | Seconds, min 60, max 2592000 (30 days) |
| `cf-aig-skip-cache` | Bypass cache | `true` | - |
| `cf-aig-cache-key` | Custom cache key | `my-key` | Must be unique per response |
| `cf-aig-collect-log` | Toggle logging | `false` | Default: true; `false` skips logging for the request |
| `cf-aig-cache-status` | Cache hit/miss | `HIT` or `MISS` | Response header only |
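
The request headers in this table can be assembled in one place so that the documented TTL bounds are checked before a request goes out. The following is a minimal sketch; `buildGatewayHeaders` and its option names are hypothetical helpers for illustration and are not part of any Cloudflare SDK.

```typescript
// Hypothetical helper (not a Cloudflare API): builds cf-aig-* request headers.
interface GatewayHeaderOptions {
  token?: string;            // Cloudflare API token for authenticated gateways
  cacheTtlSeconds?: number;  // 60..2592000 per the table above
  skipCache?: boolean;
  cacheKey?: string;
  collectLog?: boolean;      // false skips logging for this request
}

const MIN_TTL_SECONDS = 60;
const MAX_TTL_SECONDS = 2592000; // 30 days

function buildGatewayHeaders(opts: GatewayHeaderOptions): Record<string, string> {
  const headers: Record<string, string> = {};
  if (opts.token) headers['cf-aig-authorization'] = `Bearer ${opts.token}`;
  if (opts.cacheTtlSeconds !== undefined) {
    const ttl = opts.cacheTtlSeconds;
    if (!Number.isInteger(ttl) || ttl < MIN_TTL_SECONDS || ttl > MAX_TTL_SECONDS) {
      throw new RangeError(
        `cache TTL must be an integer between ${MIN_TTL_SECONDS} and ${MAX_TTL_SECONDS} seconds`
      );
    }
    headers['cf-aig-cache-ttl'] = String(ttl);
  }
  if (opts.skipCache) headers['cf-aig-skip-cache'] = 'true';
  if (opts.cacheKey) headers['cf-aig-cache-key'] = opts.cacheKey;
  if (opts.collectLog === false) headers['cf-aig-collect-log'] = 'false';
  return headers;
}
```

The result can be passed as `defaultHeaders` to an SDK client or merged into a `fetch` call's `headers`.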

## In This Reference

| File | Purpose |
|------|---------|
| [sdk-integration.md](./sdk-integration.md) | Vercel AI SDK, OpenAI SDK, Workers binding patterns |
| [configuration.md](./configuration.md) | Dashboard setup, wrangler, API tokens |
| [features.md](./features.md) | Caching, rate limits, guardrails, DLP, BYOK, unified billing |
| [dynamic-routing.md](./dynamic-routing.md) | Fallbacks, A/B testing, conditional routing |
| [troubleshooting.md](./troubleshooting.md) | Debugging, errors, observability, gotchas |

## Reading Order

| Task | Files |
|------|-------|
| First-time setup | README + [configuration.md](./configuration.md) |
| SDK integration | README + [sdk-integration.md](./sdk-integration.md) |
| Enable caching | README + [features.md](./features.md) |
| Set up fallbacks | README + [dynamic-routing.md](./dynamic-routing.md) |
| Debug errors | README + [troubleshooting.md](./troubleshooting.md) |

## Architecture

AI Gateway acts as a proxy between your application and AI providers:

```
Your App → AI Gateway → AI Provider (OpenAI, Anthropic, etc.)
               ↓
  Analytics, Caching, Rate Limiting, Logging
```

**Key URL patterns:**
- Unified API (OpenAI-compatible): `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat/chat/completions`
- Provider-specific: `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}/{endpoint}`
- Dynamic routes: Use route name instead of model: `dynamic/{route-name}`
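
The three URL patterns above can be expressed as small string builders, which keeps the account and gateway IDs out of hand-written URLs. This is only a sketch; the function names are hypothetical and the builders do nothing beyond filling in the templates listed above.

```typescript
// Hypothetical helpers that fill in the URL templates listed above.
const GATEWAY_BASE = 'https://gateway.ai.cloudflare.com/v1';

// Unified (OpenAI-compatible) chat completions endpoint
function unifiedChatUrl(accountId: string, gatewayId: string): string {
  return `${GATEWAY_BASE}/${accountId}/${gatewayId}/compat/chat/completions`;
}

// Provider-specific endpoint, e.g. provider "openai", endpoint "chat/completions"
function providerUrl(
  accountId: string,
  gatewayId: string,
  provider: string,
  endpoint: string
): string {
  const path = endpoint.replace(/^\/+/, ''); // tolerate a leading slash
  return `${GATEWAY_BASE}/${accountId}/${gatewayId}/${provider}/${path}`;
}

// Dynamic routes are addressed through the model field, not the URL
function dynamicRouteModel(routeName: string): string {
  return `dynamic/${routeName}`;
}
```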

## Gateway Types

1. **Unauthenticated Gateway**: Open access (not recommended for production)
2. **Authenticated Gateway**: Requires `cf-aig-authorization` header with Cloudflare API token (recommended)

## Provider Authentication Options

1. **Unified Billing**: Use AI Gateway billing to pay for inference (keyless mode - no provider API key needed)
2. **BYOK (Store Keys)**: Store provider API keys in Cloudflare dashboard
3. **Request Headers**: Include provider API key in each request

## Related Skills

- [Workers AI](../workers-ai/README.md) - For `env.AI.run()` details
- [Agents SDK](../agents-sdk/README.md) - For stateful AI patterns
- [Vectorize](../vectorize/README.md) - For RAG patterns with embeddings

## Resources

- [Official Docs](https://developers.cloudflare.com/ai-gateway/)
- [API Reference](https://developers.cloudflare.com/api/resources/ai_gateway/)
- [Provider Guides](https://developers.cloudflare.com/ai-gateway/usage/providers/)
- [Discord Community](https://discord.cloudflare.com)
@@ -0,0 +1,111 @@
# Configuration & Setup

## Creating a Gateway

### Dashboard
AI > AI Gateway > Create Gateway > Configure (auth, caching, rate limiting, logging)

### API
```bash
curl -X POST https://api.cloudflare.com/client/v4/accounts/{account_id}/ai-gateway/gateways \
  -H "Authorization: Bearer $CF_API_TOKEN" -H "Content-Type: application/json" \
  -d '{"id":"my-gateway","cache_ttl":3600,"rate_limiting_interval":60,"rate_limiting_limit":100,"collect_logs":true}'
```

**Naming:** lowercase alphanumeric + hyphens (e.g., `prod-api`, `dev-chat`)
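
The request body from the curl call and the naming rule can be combined in a small builder that rejects invalid IDs before any request is made. This is a sketch under assumptions: `createGatewayBody` is a hypothetical helper, the defaults simply repeat the values from the curl example, and the regular expression encodes only "lowercase alphanumeric + hyphens" (any further rules, such as a length limit, are not stated here).

```typescript
// Field names mirror the JSON payload in the curl example above.
interface CreateGatewayBody {
  id: string;
  cache_ttl: number;
  rate_limiting_interval: number;
  rate_limiting_limit: number;
  collect_logs: boolean;
}

// Assumption: the naming rule is exactly "lowercase alphanumeric + hyphens".
const GATEWAY_ID_PATTERN = /^[a-z0-9-]+$/;

// Hypothetical helper: validates the ID and fills in the example's values as defaults.
function createGatewayBody(
  id: string,
  overrides: Partial<Omit<CreateGatewayBody, 'id'>> = {}
): CreateGatewayBody {
  if (!GATEWAY_ID_PATTERN.test(id)) {
    throw new Error(`Invalid gateway ID "${id}": use lowercase letters, digits, and hyphens`);
  }
  return {
    id,
    cache_ttl: 3600,
    rate_limiting_interval: 60,
    rate_limiting_limit: 100,
    collect_logs: true,
    ...overrides,
  };
}
```

`JSON.stringify(createGatewayBody('prod-api'))` then yields a body suitable for the `-d` argument.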

## Wrangler Integration

```toml
[ai]
binding = "AI"

[[ai.gateway]]
id = "my-gateway"
```

```bash
wrangler secret put CF_API_TOKEN
wrangler secret put OPENAI_API_KEY  # If not using BYOK
```

## Authentication

### Gateway Auth (protects gateway access)
```typescript
const client = new OpenAI({
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`,
  defaultHeaders: { 'cf-aig-authorization': `Bearer ${cfToken}` }
});
```

### Provider Auth Options

**1. Unified Billing (keyless)** - pay through Cloudflare, no provider key:
```typescript
const client = new OpenAI({
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`,
  defaultHeaders: { 'cf-aig-authorization': `Bearer ${cfToken}` }
});
```
Supports: OpenAI, Anthropic, Google AI Studio

**2. BYOK** - store keys in dashboard (Provider Keys > Add), no key in code

**3. Request Headers** - pass provider key per request:
```typescript
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`,
  defaultHeaders: { 'cf-aig-authorization': `Bearer ${cfToken}` }
});
```

## API Token Permissions

- **Gateway management:** AI Gateway - Read + Edit
- **Gateway access:** AI Gateway - Read (minimum)

## Gateway Management API

```bash
# List
curl https://api.cloudflare.com/client/v4/accounts/{account_id}/ai-gateway/gateways \
  -H "Authorization: Bearer $CF_API_TOKEN"

# Get
curl .../gateways/{gateway_id}

# Update
curl -X PUT .../gateways/{gateway_id} \
  -d '{"cache_ttl":7200,"rate_limiting_limit":200}'

# Delete
curl -X DELETE .../gateways/{gateway_id}
```

## Getting IDs

- **Account ID:** Dashboard > Overview > Copy
- **Gateway ID:** AI Gateway > Gateway name column

## Python Example

```python
from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
    base_url=f"https://gateway.ai.cloudflare.com/v1/{os.environ['CF_ACCOUNT_ID']}/{os.environ['GATEWAY_ID']}/openai",
    default_headers={"cf-aig-authorization": f"Bearer {os.environ['CF_API_TOKEN']}"}
)
```

## Best Practices

1. **Always authenticate gateways in production**
2. **Use BYOK or unified billing** - secrets out of code
3. **Environment-specific gateways** - separate dev/staging/prod
4. **Set rate limits** - prevent runaway costs
5. **Enable logging** - track usage, debug issues
@@ -0,0 +1,82 @@
# Dynamic Routing

Configure complex routing in the dashboard without code changes. Requests reference a route name instead of a model name.

## Usage

```typescript
const response = await client.chat.completions.create({
  model: 'dynamic/smart-chat',  // Route name from dashboard
  messages: [{ role: 'user', content: 'Hello!' }]
});
```

## Node Types

| Node | Purpose | Use Case |
|------|---------|----------|
| **Conditional** | Branch on metadata | Paid vs free users, geo routing |
| **Percentage** | A/B split traffic | Model testing, gradual rollouts |
| **Rate Limit** | Enforce quotas | Per-user/team limits |
| **Budget Limit** | Cost quotas | Per-user spending caps |
| **Model** | Call provider | Final destination |

## Metadata

Pass via header (max 5 entries, flat only):
```typescript
headers: {
  'cf-aig-metadata': JSON.stringify({
    userId: 'user-123',
    tier: 'pro',
    region: 'us-east'
  })
}
```

## Common Patterns

**Multi-model fallback:**
```
Start → GPT-4 → On error: Claude → On error: Llama
```

**Tiered access:**
```
Conditional: tier == 'enterprise' → GPT-4 (no limit)
Conditional: tier == 'pro' → Rate Limit 1000/hr → GPT-4o
Conditional: tier == 'free' → Rate Limit 10/hr → GPT-4o-mini
```
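
To make the tiered-access table concrete, the same decisions can be written as a lookup. This is only an illustration of the branching logic: the real route is configured in the dashboard, and `routeForTier`, the `RouteDecision` shape, and the lowercase model identifiers are assumptions made for this sketch.

```typescript
// Illustrative model of the "Tiered access" route above; not a Cloudflare API.
type Tier = 'enterprise' | 'pro' | 'free';

interface RouteDecision {
  model: string;
  rateLimitPerHour: number | null; // null = no limit
}

const TIER_ROUTES: Record<Tier, RouteDecision> = {
  enterprise: { model: 'gpt-4', rateLimitPerHour: null },
  pro: { model: 'gpt-4o', rateLimitPerHour: 1000 },
  free: { model: 'gpt-4o-mini', rateLimitPerHour: 10 },
};

// Mirrors the Conditional node: branch on the `tier` metadata value.
function routeForTier(tier: Tier): RouteDecision {
  return TIER_ROUTES[tier];
}
```

A table like this can be handy in application tests that assert which tier metadata the app sends for a given user.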

**Gradual rollout:**
```
Percentage: 10% → New model, 90% → Old model
```

**Cost-based fallback:**
```
Budget Limit: $100/day per teamId
  < 80%: GPT-4
  >= 80%: GPT-4o-mini
  >= 100%: Error
```
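
The gradual-rollout and cost-based patterns above reduce to two small decisions, sketched below. Both functions are hypothetical illustrations of the logic, not how the gateway implements it: in particular, whether the Percentage node keeps a given user on the same branch is not stated here, so the hash-based bucketing is an assumption chosen to make the example deterministic.

```typescript
// FNV-1a-style string hash mapped to a bucket in [0, 100).
function bucketFor(key: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0) % 100;
}

// Gradual rollout: keys whose bucket falls below `percentNew` get the new model.
function rolloutModel(key: string, percentNew: number, newModel: string, oldModel: string): string {
  return bucketFor(key) < percentNew ? newModel : oldModel;
}

// Cost-based fallback: thresholds taken from the diagram above.
function modelForBudget(spentUsd: number, dailyBudgetUsd: number = 100): string {
  if (dailyBudgetUsd <= 0) throw new RangeError('dailyBudgetUsd must be positive');
  const used = spentUsd / dailyBudgetUsd;
  if (used >= 1) throw new Error('Budget exhausted');
  return used >= 0.8 ? 'gpt-4o-mini' : 'gpt-4';
}
```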

## Version Management

- Save changes as new version
- Test with `model: 'dynamic/route@v2'`
- Roll back by deploying previous version

## Monitoring

Dashboard → Gateway → Dynamic Routes:
- Request count per path
- Success/error rates
- Latency/cost by path

## Limitations

- Max 5 metadata entries
- Values: string/number/boolean/null only
- No nested objects
- Route names: alphanumeric + hyphens
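
The metadata limits listed above are easy to enforce client-side before the header is serialized. A minimal sketch follows; `validateMetadata` and `metadataHeader` are hypothetical helper names, and the checks cover only the limits stated in this section.

```typescript
type MetadataValue = string | number | boolean | null;

// Hypothetical helper: enforces "max 5 entries" and "flat values only".
function validateMetadata(metadata: Record<string, unknown>): Record<string, MetadataValue> {
  const entries = Object.entries(metadata);
  if (entries.length > 5) {
    throw new Error(`cf-aig-metadata allows at most 5 entries, got ${entries.length}`);
  }
  for (const [key, value] of entries) {
    const t = typeof value;
    const isFlat = value === null || t === 'string' || t === 'number' || t === 'boolean';
    if (!isFlat) {
      throw new TypeError(`metadata.${key} must be a string, number, boolean, or null`);
    }
  }
  return metadata as Record<string, MetadataValue>;
}

// Produces the header object shown in the Metadata section.
function metadataHeader(metadata: Record<string, unknown>): Record<string, string> {
  return { 'cf-aig-metadata': JSON.stringify(validateMetadata(metadata)) };
}
```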
@@ -0,0 +1,96 @@
# Features & Capabilities

## Caching

Dashboard: Settings → Cache Responses → Enable

```typescript
// Custom TTL (1 hour)
headers: { 'cf-aig-cache-ttl': '3600' }

// Skip cache
headers: { 'cf-aig-skip-cache': 'true' }

// Custom cache key
headers: { 'cf-aig-cache-key': 'greeting-en' }
```

**Limits:** TTL 60s - 30 days. **Does NOT work with streaming.**

## Rate Limiting

Dashboard: Settings → Rate-limiting → Enable

- **Fixed window:** Resets at intervals
- **Sliding window:** Rolling window (more accurate)
- Returns `429` when exceeded
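
The difference between the two window types is easiest to see side by side. The sketch below is a generic, in-memory illustration of the two techniques and makes no claim about how the gateway implements them; the class names are hypothetical, and timestamps are passed in explicitly so the behavior is reproducible.

```typescript
// Fixed window: the counter resets whenever a new interval begins.
class FixedWindowLimiter {
  private windowStart = 0;
  private count = 0;
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(nowMs: number): boolean {
    const start = Math.floor(nowMs / this.windowMs) * this.windowMs;
    if (start !== this.windowStart) {
      this.windowStart = start;
      this.count = 0;
    }
    if (this.count >= this.limit) return false;
    this.count++;
    return true;
  }
}

// Sliding window: only requests from the last `windowMs` count toward the limit.
class SlidingWindowLimiter {
  private timestamps: number[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(nowMs);
    return true;
  }
}
```

With a limit of 2 per second, a fixed window admits a burst around an interval boundary (two requests just before it, two just after), while the sliding window rejects the later pair until the earlier requests age out.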

## Guardrails

Dashboard: Settings → Guardrails → Enable

Filter prompts/responses for inappropriate content. Actions: Flag (log) or Block (reject).

## Data Loss Prevention (DLP)

Dashboard: Settings → DLP → Enable

Detect PII (emails, SSNs, credit cards). Actions: Flag, Block, or Redact.

## Billing Modes

| Mode | Description | Setup |
|------|-------------|-------|
| **Unified Billing** | Pay through Cloudflare, no provider keys | Use `cf-aig-authorization` header only |
| **BYOK** | Store provider keys in dashboard | Add keys in Provider Keys section |
| **Pass-through** | Send provider key with each request | Include provider's auth header |

## Zero Data Retention

Dashboard: Settings → Privacy → Zero Data Retention

No prompts/responses stored. Request counts and costs still tracked.

## Logging

Dashboard: Settings → Logs → Enable (up to 10M logs)

Each entry: prompt, response, provider, model, tokens, cost, duration, cache status, metadata.

```typescript
// Skip logging for request
headers: { 'cf-aig-collect-log': 'false' }
```

**Export:** Use Logpush to S3, GCS, Datadog, Splunk, etc.

## Custom Cost Tracking

For models not in Cloudflare's pricing database:

Dashboard: Gateway → Settings → Custom Costs

Or via API: set `model`, `input_cost`, `output_cost`.
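
As a rough illustration of how the three fields relate to a request's cost, the sketch below multiplies token counts by the configured rates. The unit is an assumption: this section does not say whether `input_cost` and `output_cost` are per token or per some larger batch, so the code treats them as per-token prices; check the API reference before relying on it. `estimateCost` is a hypothetical helper.

```typescript
// Field names come from the section above; per-token pricing is an assumption.
interface CustomCost {
  model: string;
  input_cost: number;  // assumed: price per input token
  output_cost: number; // assumed: price per output token
}

// Hypothetical helper: estimated cost of one request under the assumption above.
function estimateCost(cost: CustomCost, inputTokens: number, outputTokens: number): number {
  return inputTokens * cost.input_cost + outputTokens * cost.output_cost;
}
```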

## Supported Providers (22+)

| Provider | Unified API | Notes |
|----------|-------------|-------|
| OpenAI | `openai/gpt-4o` | Full support |
| Anthropic | `anthropic/claude-sonnet-4-5` | Full support |
| Google AI | `google-ai-studio/gemini-2.0-flash` | Full support |
| Workers AI | `workers-ai/@cf/meta/llama-3-8b-instruct` | Native |
| Azure OpenAI | `azure-openai/*` | Deployment names |
| AWS Bedrock | Provider endpoint only | `/bedrock/*` |
| Groq | `groq/*` | Fast inference |
| Mistral, Cohere, Perplexity, xAI, DeepSeek, Cerebras | Full support | - |
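
Unified API model strings follow the `{provider}/{model}` format shown in the table. A small parser makes the convention explicit and catches a missing prefix early. `parseUnifiedModel` is a hypothetical helper; it splits on the first slash only, since the model part may itself contain slashes.

```typescript
interface UnifiedModel {
  provider: string;
  model: string;
}

// Hypothetical helper: splits "{provider}/{model}" on the first slash.
function parseUnifiedModel(id: string): UnifiedModel {
  const slash = id.indexOf('/');
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`Expected "{provider}/{model}", got "${id}"`);
  }
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}
```

For example, `parseUnifiedModel('openai/gpt-4o')` yields provider `openai` and model `gpt-4o`, while a bare `gpt-4o` is rejected.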

## Best Practices

1. Enable caching for deterministic prompts
2. Set rate limits to prevent abuse
3. Use guardrails for user-facing AI
4. Enable DLP for sensitive data
5. Use unified billing or BYOK for simpler key management
6. Enable logging for debugging
7. Use zero data retention when privacy is required
@@ -0,0 +1,114 @@
# AI Gateway SDK Integration

## Vercel AI SDK (Recommended)

```typescript
import { createAiGateway } from 'ai-gateway-provider';
import { createOpenAI } from '@ai-sdk/openai';
import { createAnthropic } from '@ai-sdk/anthropic';
import { generateText } from 'ai';

const gateway = createAiGateway({
  accountId: process.env.CF_ACCOUNT_ID,
  gateway: process.env.CF_GATEWAY_ID,
  apiKey: process.env.CF_API_TOKEN  // Optional for auth gateways
});

const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });
// Second provider, used by the fallback example
const anthropic = createAnthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Single model
const { text } = await generateText({
  model: gateway(openai('gpt-4o')),
  prompt: 'Hello'
});

// Automatic fallback array
const { text: fallbackText } = await generateText({
  model: gateway([
    openai('gpt-4o'),
    anthropic('claude-sonnet-4-5'),
    openai('gpt-4o-mini')
  ]),
  prompt: 'Complex task'
});
```

### Options

```typescript
model: gateway(openai('gpt-4o'), {
  cacheKey: 'my-key',
  cacheTtl: 3600,
  metadata: { userId: 'u123', team: 'eng' },  // Max 5 entries
  retries: { maxAttempts: 3, backoff: 'exponential' }
})
```

## OpenAI SDK

```typescript
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`,
  defaultHeaders: { 'cf-aig-authorization': `Bearer ${cfToken}` }
});

// Unified API - use the `/compat` base URL instead of `/openai`,
// then switch providers via model name
model: 'openai/gpt-4o'  // or 'anthropic/claude-sonnet-4-5'
```

## Anthropic SDK

```typescript
const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/anthropic`,
  defaultHeaders: { 'cf-aig-authorization': `Bearer ${cfToken}` }
});
```

## Workers AI Binding

```toml
# wrangler.toml
[ai]
binding = "AI"
[[ai.gateway]]
id = "my-gateway"
```

```typescript
await env.AI.run('@cf/meta/llama-3-8b-instruct',
  { messages: [...] },
  { gateway: { id: 'my-gateway', metadata: { userId: '123' } } }
);
```

## LangChain / LlamaIndex

```typescript
// Use OpenAI SDK pattern with custom baseURL
new ChatOpenAI({
  configuration: {
    baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`
  }
});
```

## HTTP / cURL

```bash
curl https://gateway.ai.cloudflare.com/v1/{account}/{gateway}/openai/chat/completions \
  -H "Authorization: Bearer $OPENAI_KEY" \
  -H "Content-Type: application/json" \
  -H "cf-aig-authorization: Bearer $CF_TOKEN" \
  -H "cf-aig-metadata: {\"userId\":\"123\"}" \
  -d '{"model":"gpt-4o","messages":[...]}'
```

## Headers Reference

| Header | Purpose |
|--------|---------|
| `cf-aig-authorization` | Gateway auth token |
| `cf-aig-metadata` | JSON object (max 5 keys) |
| `cf-aig-cache-ttl` | Cache TTL in seconds |
| `cf-aig-skip-cache` | `true` to bypass cache |
@@ -0,0 +1,88 @@
# AI Gateway Troubleshooting

## Common Errors

| Error | Cause | Fix |
|-------|-------|-----|
| 401 | Missing `cf-aig-authorization` header | Add header with CF API token |
| 403 | Invalid provider key / BYOK expired | Check provider key in dashboard |
| 429 | Rate limit exceeded | Increase limit or implement backoff |

### 401 Fix

```typescript
const client = new OpenAI({
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`,
  defaultHeaders: { 'cf-aig-authorization': `Bearer ${CF_API_TOKEN}` }
});
```

### 429 Retry Pattern

```typescript
// Retry on HTTP 429 with exponential backoff (1s, 2s, 4s, ...); rethrow anything else
async function requestWithRetry(fn, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try { return await fn(); }
    catch (e) {
      // Back off only when rate limited and attempts remain
      if (e.status === 429 && i < maxRetries - 1) {
        await new Promise(r => setTimeout(r, Math.pow(2, i) * 1000));
        continue;
      }
      throw e;
    }
  }
}
```
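
For testing, it helps if the retry pattern above does not have to wait on real timers. The following typed variant is a sketch of the same technique with the delay function injected; `retryOn429`, `baseDelayMs`, and the `Sleep` type are names introduced here for illustration and do not come from any SDK.

```typescript
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retries `fn` on errors carrying status 429, doubling the delay each attempt.
async function retryOn429<T>(
  fn: () => Promise<T>,
  maxRetries: number = 3,
  baseDelayMs: number = 1000,
  sleep: Sleep = realSleep
): Promise<T> {
  let lastError: unknown = new Error('maxRetries must be at least 1');
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      const status = (err as { status?: number } | null)?.status;
      // Give up on non-429 errors or when no attempts remain
      if (status !== 429 || attempt === maxRetries - 1) throw err;
      await sleep(2 ** attempt * baseDelayMs);
    }
  }
  throw lastError;
}
```

In application code the call looks like `await retryOn429(() => client.chat.completions.create(...))`; in tests, passing a no-op `sleep` lets the backoff schedule be asserted without delay.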

## Gotchas

| Issue | Reality |
|-------|---------|
| Metadata limits | Max 5 entries, flat only (no nesting) |
| Cache key collision | Use unique keys per expected response |
| BYOK + Unified Billing | Mutually exclusive |
| Rate limit scope | Per-gateway, not per-user (use dynamic routing for per-user) |
| Log delay | 30-60 seconds normal |
| Streaming + caching | **Incompatible** |
| Model name (unified API) | Prefix required: `openai/gpt-4o`, not `gpt-4o` |

## Cache Not Working

**Causes:**
- Different request params (temperature, etc.)
- Streaming enabled
- Caching disabled in settings

**Check:** `response.headers.get('cf-aig-cache-status')` → HIT or MISS

## Logs Not Appearing

1. Check logging enabled: Dashboard → Gateway → Settings
2. Remove `cf-aig-collect-log: false` header
3. Wait 30-60 seconds
4. Check log limit (10M default)

## Debugging

```bash
# Test connectivity
curl -v https://gateway.ai.cloudflare.com/v1/{account}/{gateway}/openai/models \
  -H "Authorization: Bearer $OPENAI_KEY" \
  -H "cf-aig-authorization: Bearer $CF_TOKEN"
```

```typescript
// Check response headers
console.log('Cache:', response.headers.get('cf-aig-cache-status'));
console.log('Request ID:', response.headers.get('cf-ray'));
```

## Analytics

Dashboard → AI Gateway → Select gateway

**Metrics:** Requests, tokens, latency (p50/p95/p99), cache hit rate, costs

**Log filters:** `status: error`, `provider: openai`, `cost > 0.01`, `duration > 1000`

**Export:** Logpush to S3/GCS/Datadog/Splunk