update skills

This commit is contained in:
2026-03-17 16:53:22 -07:00
parent 0b0783ef8e
commit f9a530667e
389 changed files with 54512 additions and 1 deletions


@@ -0,0 +1,175 @@
# Cloudflare AI Gateway
Expert guidance for implementing Cloudflare AI Gateway - a universal gateway for AI model providers with analytics, caching, rate limiting, and routing capabilities.
## When to Use This Reference
- Setting up AI Gateway for any AI provider (OpenAI, Anthropic, Workers AI, etc.)
- Implementing caching, rate limiting, or request retry/fallback
- Configuring dynamic routing with A/B testing or model fallbacks
- Managing provider API keys securely with BYOK
- Adding security features (guardrails, DLP)
- Setting up observability with logging and custom metadata
- Debugging AI Gateway requests or optimizing configurations
## Quick Start
**What's your setup?**
- **Using Vercel AI SDK** → Pattern 1 (recommended) - see [sdk-integration.md](./sdk-integration.md)
- **Using OpenAI SDK** → Pattern 2 - see [sdk-integration.md](./sdk-integration.md)
- **Cloudflare Worker + Workers AI** → Pattern 3 - see [sdk-integration.md](./sdk-integration.md)
- **Direct HTTP (any language)** → Pattern 4 - see [configuration.md](./configuration.md)
- **Framework (LangChain, etc.)** → See [sdk-integration.md](./sdk-integration.md)
## Pattern 1: Vercel AI SDK (Recommended)
The most modern pattern, using the official `ai-gateway-provider` package with automatic fallbacks.
```typescript
import { createAiGateway } from 'ai-gateway-provider';
import { createOpenAI } from '@ai-sdk/openai';
import { createAnthropic } from '@ai-sdk/anthropic';
import { generateText } from 'ai';

const gateway = createAiGateway({
  accountId: process.env.CF_ACCOUNT_ID,
  gateway: process.env.CF_GATEWAY_ID,
});

const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });
const anthropic = createAnthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Single model
const { text } = await generateText({
  model: gateway(openai('gpt-4o')),
  prompt: 'Hello'
});

// Automatic fallback array: providers are tried in order
const { text: fallbackText } = await generateText({
  model: gateway([
    openai('gpt-4o'),               // Try first
    anthropic('claude-sonnet-4-5'), // Fallback
  ]),
  prompt: 'Hello'
});
```
**Install:** `npm install ai-gateway-provider ai @ai-sdk/openai @ai-sdk/anthropic`
## Pattern 2: OpenAI SDK
Drop-in replacement for the OpenAI API with multi-provider support.
```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/compat`,
  defaultHeaders: {
    'cf-aig-authorization': `Bearer ${cfToken}` // For authenticated gateways
  }
});

// Switch providers by changing the model format: {provider}/{model}
const response = await client.chat.completions.create({
  model: 'openai/gpt-4o', // or 'anthropic/claude-sonnet-4-5'
  messages: [{ role: 'user', content: 'Hello!' }]
});
```
## Pattern 3: Workers AI Binding
For Cloudflare Workers using Workers AI.
```typescript
export default {
  async fetch(request, env, ctx) {
    const response = await env.AI.run(
      '@cf/meta/llama-3-8b-instruct',
      { messages: [{ role: 'user', content: 'Hello!' }] },
      {
        gateway: {
          id: 'my-gateway',
          metadata: { userId: '123', team: 'engineering' }
        }
      }
    );
    return Response.json(response);
  }
};
```
## Headers Quick Reference
| Header | Purpose | Example | Notes |
|--------|---------|---------|-------|
| `cf-aig-authorization` | Gateway auth | `Bearer {token}` | Required for authenticated gateways |
| `cf-aig-metadata` | Tracking | `{"userId":"x"}` | Max 5 entries, flat structure |
| `cf-aig-cache-ttl` | Cache duration | `3600` | Seconds, min 60, max 2592000 (30 days) |
| `cf-aig-skip-cache` | Bypass cache | `true` | - |
| `cf-aig-cache-key` | Custom cache key | `my-key` | Must be unique per response |
| `cf-aig-collect-log` | Control logging | `false` | Set `false` to skip logging (default: `true`) |
| `cf-aig-cache-status` | Cache hit/miss | Response only | `HIT` or `MISS` |
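As an illustrative sketch (not part of any SDK), the request-scoped headers above can be assembled with a small helper; names like `aigHeaders` are made up here:

```typescript
// Illustrative: build per-request cf-aig-* headers for a gateway call.
// All option names below are hypothetical; only the header names are real.
function aigHeaders(opts: {
  cfToken?: string;                    // gateway auth token
  ttl?: number;                        // cache TTL in seconds
  metadata?: Record<string, string>;   // max 5 flat entries
}): Record<string, string> {
  const h: Record<string, string> = {};
  if (opts.cfToken) h['cf-aig-authorization'] = `Bearer ${opts.cfToken}`;
  if (opts.ttl !== undefined) h['cf-aig-cache-ttl'] = String(opts.ttl);
  if (opts.metadata) h['cf-aig-metadata'] = JSON.stringify(opts.metadata);
  return h;
}
```

Pass the result as `defaultHeaders` (SDK clients) or per-request headers.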
## In This Reference
| File | Purpose |
|------|---------|
| [sdk-integration.md](./sdk-integration.md) | Vercel AI SDK, OpenAI SDK, Workers binding patterns |
| [configuration.md](./configuration.md) | Dashboard setup, wrangler, API tokens |
| [features.md](./features.md) | Caching, rate limits, guardrails, DLP, BYOK, unified billing |
| [dynamic-routing.md](./dynamic-routing.md) | Fallbacks, A/B testing, conditional routing |
| [troubleshooting.md](./troubleshooting.md) | Debugging, errors, observability, gotchas |
## Reading Order
| Task | Files |
|------|-------|
| First-time setup | README + [configuration.md](./configuration.md) |
| SDK integration | README + [sdk-integration.md](./sdk-integration.md) |
| Enable caching | README + [features.md](./features.md) |
| Setup fallbacks | README + [dynamic-routing.md](./dynamic-routing.md) |
| Debug errors | README + [troubleshooting.md](./troubleshooting.md) |
## Architecture
AI Gateway acts as a proxy between your application and AI providers:
```
Your App → AI Gateway → AI Provider (OpenAI, Anthropic, etc.)
              │
              └─ Analytics, Caching, Rate Limiting, Logging
```
**Key URL patterns:**
- Unified API (OpenAI-compatible): `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat/chat/completions`
- Provider-specific: `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}/{endpoint}`
- Dynamic routes: Use route name instead of model: `dynamic/{route-name}`
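The URL patterns above differ only in the path segment after the gateway ID; a minimal sketch (account and gateway IDs are placeholders):

```typescript
// Illustrative helper for the documented AI Gateway URL patterns.
const BASE = 'https://gateway.ai.cloudflare.com/v1';

function gatewayUrl(accountId: string, gatewayId: string, path: string): string {
  return `${BASE}/${accountId}/${gatewayId}/${path}`;
}

// Unified (OpenAI-compatible) endpoint
const unified = gatewayUrl('acct123', 'my-gateway', 'compat/chat/completions');

// Provider-specific endpoint
const providerUrl = gatewayUrl('acct123', 'my-gateway', 'openai/chat/completions');
```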
## Gateway Types
1. **Unauthenticated Gateway**: Open access (not recommended for production)
2. **Authenticated Gateway**: Requires `cf-aig-authorization` header with Cloudflare API token (recommended)
## Provider Authentication Options
1. **Unified Billing**: Use AI Gateway billing to pay for inference (keyless mode - no provider API key needed)
2. **BYOK (Store Keys)**: Store provider API keys in Cloudflare dashboard
3. **Request Headers**: Include provider API key in each request
## Related Skills
- [Workers AI](../workers-ai/README.md) - For `env.AI.run()` details
- [Agents SDK](../agents-sdk/README.md) - For stateful AI patterns
- [Vectorize](../vectorize/README.md) - For RAG patterns with embeddings
## Resources
- [Official Docs](https://developers.cloudflare.com/ai-gateway/)
- [API Reference](https://developers.cloudflare.com/api/resources/ai_gateway/)
- [Provider Guides](https://developers.cloudflare.com/ai-gateway/usage/providers/)
- [Discord Community](https://discord.cloudflare.com)


@@ -0,0 +1,111 @@
# Configuration & Setup
## Creating a Gateway
### Dashboard
AI > AI Gateway > Create Gateway > Configure (auth, caching, rate limiting, logging)
### API
```bash
curl -X POST https://api.cloudflare.com/client/v4/accounts/{account_id}/ai-gateway/gateways \
-H "Authorization: Bearer $CF_API_TOKEN" -H "Content-Type: application/json" \
-d '{"id":"my-gateway","cache_ttl":3600,"rate_limiting_interval":60,"rate_limiting_limit":100,"collect_logs":true}'
```
**Naming:** lowercase alphanumeric + hyphens (e.g., `prod-api`, `dev-chat`)
## Wrangler Integration
```toml
[ai]
binding = "AI"
[[ai.gateway]]
id = "my-gateway"
```
```bash
wrangler secret put CF_API_TOKEN
wrangler secret put OPENAI_API_KEY # If not using BYOK
```
## Authentication
### Gateway Auth (protects gateway access)
```typescript
const client = new OpenAI({
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`,
  defaultHeaders: { 'cf-aig-authorization': `Bearer ${cfToken}` }
});
```
### Provider Auth Options
**1. Unified Billing (keyless)** - pay through Cloudflare, no provider key:
```typescript
const client = new OpenAI({
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`,
  defaultHeaders: { 'cf-aig-authorization': `Bearer ${cfToken}` }
});
```
Supports: OpenAI, Anthropic, Google AI Studio
**2. BYOK** - store keys in dashboard (Provider Keys > Add), no key in code
**3. Request Headers** - pass provider key per request:
```typescript
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`,
  defaultHeaders: { 'cf-aig-authorization': `Bearer ${cfToken}` }
});
```
## API Token Permissions
- **Gateway management:** AI Gateway - Read + Edit
- **Gateway access:** AI Gateway - Read (minimum)
## Gateway Management API
```bash
# List
curl https://api.cloudflare.com/client/v4/accounts/{account_id}/ai-gateway/gateways \
-H "Authorization: Bearer $CF_API_TOKEN"
# Get
curl .../gateways/{gateway_id}
# Update
curl -X PUT .../gateways/{gateway_id} \
-d '{"cache_ttl":7200,"rate_limiting_limit":200}'
# Delete
curl -X DELETE .../gateways/{gateway_id}
```
## Getting IDs
- **Account ID:** Dashboard > Overview > Copy
- **Gateway ID:** AI Gateway > Gateway name column
## Python Example
```python
from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
    base_url=f"https://gateway.ai.cloudflare.com/v1/{os.environ['CF_ACCOUNT_ID']}/{os.environ['GATEWAY_ID']}/openai",
    default_headers={"cf-aig-authorization": f"Bearer {os.environ['CF_API_TOKEN']}"}
)
```
## Best Practices
1. **Always authenticate gateways in production**
2. **Use BYOK or unified billing** - secrets out of code
3. **Environment-specific gateways** - separate dev/staging/prod
4. **Set rate limits** - prevent runaway costs
5. **Enable logging** - track usage, debug issues


@@ -0,0 +1,82 @@
# Dynamic Routing
Configure complex routing in the dashboard without code changes. Use route names instead of model names.
## Usage
```typescript
const response = await client.chat.completions.create({
  model: 'dynamic/smart-chat', // Route name from dashboard
  messages: [{ role: 'user', content: 'Hello!' }]
});
```
## Node Types
| Node | Purpose | Use Case |
|------|---------|----------|
| **Conditional** | Branch on metadata | Paid vs free users, geo routing |
| **Percentage** | A/B split traffic | Model testing, gradual rollouts |
| **Rate Limit** | Enforce quotas | Per-user/team limits |
| **Budget Limit** | Cost quotas | Per-user spending caps |
| **Model** | Call provider | Final destination |
## Metadata
Pass via header (max 5 entries, flat only):
```typescript
headers: {
  'cf-aig-metadata': JSON.stringify({
    userId: 'user-123',
    tier: 'pro',
    region: 'us-east'
  })
}
```
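The documented limits (max 5 entries, flat values only) can be checked client-side before sending; this validator is an illustrative sketch, not part of any official SDK:

```typescript
// Illustrative check mirroring the documented metadata limits:
// at most 5 entries, flat values of type string/number/boolean/null.
function isValidAigMetadata(meta: Record<string, unknown>): boolean {
  const entries = Object.entries(meta);
  if (entries.length > 5) return false;
  return entries.every(([, v]) =>
    v === null || ['string', 'number', 'boolean'].includes(typeof v)
  );
}
```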
## Common Patterns
**Multi-model fallback:**
```
Start → GPT-4 → On error: Claude → On error: Llama
```
**Tiered access:**
```
Conditional: tier == 'enterprise' → GPT-4 (no limit)
Conditional: tier == 'pro' → Rate Limit 1000/hr → GPT-4o
Conditional: tier == 'free' → Rate Limit 10/hr → GPT-4o-mini
```
**Gradual rollout:**
```
Percentage: 10% → New model, 90% → Old model
```
**Cost-based fallback:**
```
Budget Limit: $100/day per teamId
< 80%: GPT-4
>= 80%: GPT-4o-mini
>= 100%: Error
```
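The budget thresholds above are evaluated inside the gateway's routing graph, not in your code; purely as an illustration, the decision logic amounts to something like:

```typescript
// Illustrative only: dynamic routing runs in the gateway, not client-side.
// Picks a model (or errors) from the fraction of the daily budget spent.
function routeByBudget(spentFraction: number): string {
  if (spentFraction >= 1.0) throw new Error('Budget exhausted');
  return spentFraction >= 0.8 ? 'gpt-4o-mini' : 'gpt-4';
}
```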
## Version Management
- Save changes as new version
- Test with `model: 'dynamic/route@v2'`
- Roll back by deploying previous version
## Monitoring
Dashboard → Gateway → Dynamic Routes:
- Request count per path
- Success/error rates
- Latency/cost by path
## Limitations
- Max 5 metadata entries
- Values: string/number/boolean/null only
- No nested objects
- Route names: alphanumeric + hyphens


@@ -0,0 +1,96 @@
# Features & Capabilities
## Caching
Dashboard: Settings → Cache Responses → Enable
```typescript
// Custom TTL (1 hour)
headers: { 'cf-aig-cache-ttl': '3600' }
// Skip cache
headers: { 'cf-aig-skip-cache': 'true' }
// Custom cache key
headers: { 'cf-aig-cache-key': 'greeting-en' }
```
**Limits:** TTL 60s - 30 days. **Does NOT work with streaming.**
## Rate Limiting
Dashboard: Settings → Rate-limiting → Enable
- **Fixed window:** Resets at intervals
- **Sliding window:** Rolling window (more accurate)
- Returns `429` when exceeded
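Limiting is enforced gateway-side, so no client code is needed; the "rolling window" semantics of sliding-window mode can be sketched with a hypothetical in-memory model (not the gateway's actual implementation):

```typescript
// Hypothetical in-memory sliding-window counter illustrating why the
// sliding mode is more accurate: old requests age out continuously
// instead of all at once at an interval boundary.
class SlidingWindowLimiter {
  private timestamps: number[] = [];
  constructor(private limit: number, private windowMs: number) {}

  allow(now: number): boolean {
    // Drop requests older than the window, then check the remaining count.
    this.timestamps = this.timestamps.filter(t => now - t < this.windowMs);
    if (this.timestamps.length >= this.limit) return false; // gateway would return 429
    this.timestamps.push(now);
    return true;
  }
}
```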
## Guardrails
Dashboard: Settings → Guardrails → Enable
Filter prompts/responses for inappropriate content. Actions: Flag (log) or Block (reject).
## Data Loss Prevention (DLP)
Dashboard: Settings → DLP → Enable
Detect PII (emails, SSNs, credit cards). Actions: Flag, Block, or Redact.
## Billing Modes
| Mode | Description | Setup |
|------|-------------|-------|
| **Unified Billing** | Pay through Cloudflare, no provider keys | Use `cf-aig-authorization` header only |
| **BYOK** | Store provider keys in dashboard | Add keys in Provider Keys section |
| **Pass-through** | Send provider key with each request | Include provider's auth header |
## Zero Data Retention
Dashboard: Settings → Privacy → Zero Data Retention
No prompts/responses stored. Request counts and costs still tracked.
## Logging
Dashboard: Settings → Logs → Enable (up to 10M logs)
Each entry: prompt, response, provider, model, tokens, cost, duration, cache status, metadata.
```typescript
// Skip logging for request
headers: { 'cf-aig-collect-log': 'false' }
```
**Export:** Use Logpush to S3, GCS, Datadog, Splunk, etc.
## Custom Cost Tracking
For models not in Cloudflare's pricing database:
Dashboard: Gateway → Settings → Custom Costs
Or via API: set `model`, `input_cost`, `output_cost`.
## Supported Providers (22+)
| Provider | Unified API | Notes |
|----------|-------------|-------|
| OpenAI | `openai/gpt-4o` | Full support |
| Anthropic | `anthropic/claude-sonnet-4-5` | Full support |
| Google AI | `google-ai-studio/gemini-2.0-flash` | Full support |
| Workers AI | `workersai/@cf/meta/llama-3` | Native |
| Azure OpenAI | `azure-openai/*` | Deployment names |
| AWS Bedrock | Provider endpoint only | `/bedrock/*` |
| Groq | `groq/*` | Fast inference |
| Mistral, Cohere, Perplexity, xAI, DeepSeek, Cerebras | `{provider}/{model}` | Full support |
## Best Practices
1. Enable caching for deterministic prompts
2. Set rate limits to prevent abuse
3. Use guardrails for user-facing AI
4. Enable DLP for sensitive data
5. Use unified billing or BYOK for simpler key management
6. Enable logging for debugging
7. Use zero data retention when privacy required


@@ -0,0 +1,114 @@
# AI Gateway SDK Integration
## Vercel AI SDK (Recommended)
```typescript
import { createAiGateway } from 'ai-gateway-provider';
import { createOpenAI } from '@ai-sdk/openai';
import { createAnthropic } from '@ai-sdk/anthropic';
import { generateText } from 'ai';

const gateway = createAiGateway({
  accountId: process.env.CF_ACCOUNT_ID,
  gateway: process.env.CF_GATEWAY_ID,
  apiKey: process.env.CF_API_TOKEN // Optional: for authenticated gateways
});

const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });
const anthropic = createAnthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Single model
const { text } = await generateText({
  model: gateway(openai('gpt-4o')),
  prompt: 'Hello'
});

// Automatic fallback array: providers are tried in order
const { text: fallbackText } = await generateText({
  model: gateway([
    openai('gpt-4o'),
    anthropic('claude-sonnet-4-5'),
    openai('gpt-4o-mini')
  ]),
  prompt: 'Complex task'
});
```
### Options
```typescript
model: gateway(openai('gpt-4o'), {
  cacheKey: 'my-key',
  cacheTtl: 3600,
  metadata: { userId: 'u123', team: 'eng' }, // Max 5 entries
  retries: { maxAttempts: 3, backoff: 'exponential' }
})
```
## OpenAI SDK
```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`,
  defaultHeaders: { 'cf-aig-authorization': `Bearer ${cfToken}` }
});

// Unified API - switch providers via the model name:
// model: 'openai/gpt-4o' or 'anthropic/claude-sonnet-4-5'
```
## Anthropic SDK
```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/anthropic`,
  defaultHeaders: { 'cf-aig-authorization': `Bearer ${cfToken}` }
});
```
## Workers AI Binding
```toml
# wrangler.toml
[ai]
binding = "AI"
[[ai.gateway]]
id = "my-gateway"
```
```typescript
await env.AI.run(
  '@cf/meta/llama-3-8b-instruct',
  { messages: [...] },
  { gateway: { id: 'my-gateway', metadata: { userId: '123' } } }
);
```
## LangChain / LlamaIndex
```typescript
// Use the OpenAI SDK pattern with a custom baseURL
import { ChatOpenAI } from '@langchain/openai';

const model = new ChatOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  configuration: {
    baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`,
    defaultHeaders: { 'cf-aig-authorization': `Bearer ${cfToken}` } // Authenticated gateways
  }
});
```
## HTTP / cURL
```bash
curl https://gateway.ai.cloudflare.com/v1/{account}/{gateway}/openai/chat/completions \
-H "Authorization: Bearer $OPENAI_KEY" \
-H "cf-aig-authorization: Bearer $CF_TOKEN" \
-H "cf-aig-metadata: {\"userId\":\"123\"}" \
-d '{"model":"gpt-4o","messages":[...]}'
```
## Headers Reference
| Header | Purpose |
|--------|---------|
| `cf-aig-authorization` | Gateway auth token |
| `cf-aig-metadata` | JSON object (max 5 keys) |
| `cf-aig-cache-ttl` | Cache TTL in seconds |
| `cf-aig-skip-cache` | `true` to bypass cache |


@@ -0,0 +1,88 @@
# AI Gateway Troubleshooting
## Common Errors
| Error | Cause | Fix |
|-------|-------|-----|
| 401 | Missing `cf-aig-authorization` header | Add header with CF API token |
| 403 | Invalid provider key / BYOK expired | Check provider key in dashboard |
| 429 | Rate limit exceeded | Increase limit or implement backoff |
### 401 Fix
```typescript
const client = new OpenAI({
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`,
  defaultHeaders: { 'cf-aig-authorization': `Bearer ${CF_API_TOKEN}` }
});
```
### 429 Retry Pattern
```typescript
async function requestWithRetry(fn, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (e) {
      if (e.status === 429 && i < maxRetries - 1) {
        // Exponential backoff: 1s, 2s, 4s, ...
        await new Promise(r => setTimeout(r, Math.pow(2, i) * 1000));
        continue;
      }
      throw e;
    }
  }
}
```
## Gotchas
| Issue | Reality |
|-------|---------|
| Metadata limits | Max 5 entries, flat only (no nesting) |
| Cache key collision | Use unique keys per expected response |
| BYOK + Unified Billing | Mutually exclusive |
| Rate limit scope | Per-gateway, not per-user (use dynamic routing for per-user) |
| Log delay | 30-60 seconds normal |
| Streaming + caching | **Incompatible** |
| Model name (unified API) | Prefix required: `openai/gpt-4o`, not `gpt-4o` |
## Cache Not Working
**Causes:**
- Different request params (temperature, etc.)
- Streaming enabled
- Caching disabled in settings
**Check:** `response.headers.get('cf-aig-cache-status')` → HIT or MISS
## Logs Not Appearing
1. Check logging enabled: Dashboard → Gateway → Settings
2. Remove `cf-aig-collect-log: false` header
3. Wait 30-60 seconds
4. Check log limit (10M default)
## Debugging
```bash
# Test connectivity
curl -v https://gateway.ai.cloudflare.com/v1/{account}/{gateway}/openai/models \
-H "Authorization: Bearer $OPENAI_KEY" \
-H "cf-aig-authorization: Bearer $CF_TOKEN"
```
```typescript
// Check response headers
console.log('Cache:', response.headers.get('cf-aig-cache-status'));
console.log('Request ID:', response.headers.get('cf-ray'));
```
## Analytics
Dashboard → AI Gateway → Select gateway
**Metrics:** Requests, tokens, latency (p50/p95/p99), cache hit rate, costs
**Log filters:** `status: error`, `provider: openai`, `cost > 0.01`, `duration > 1000`
**Export:** Logpush to S3/GCS/Datadog/Splunk