update skills

2026-07-03 21:13:31 -07:00 · 2026-03-17 16:53:22 -07:00
parent 0b0783ef8e
commit f9a530667e
389 changed files with 54512 additions and 1 deletions
@@ -0,0 +1,138 @@
+# Cloudflare AI Search Reference
+
+Expert guidance for implementing Cloudflare AI Search (formerly AutoRAG), Cloudflare's managed semantic search and RAG service.
+
+## Overview
+
+**AI Search** is a managed RAG (Retrieval-Augmented Generation) pipeline that combines:
+- Automatic semantic indexing of your content
+- Vector similarity search
+- Built-in LLM generation
+
+**Key value propositions:**
+- **Zero vector management** - No manual embedding, indexing, or storage
+- **Auto-indexing** - Content automatically re-indexed every 6 hours
+- **Built-in generation** - Optional AI response generation from retrieved context
+- **Multi-source** - Index from R2 buckets or website crawls
+
+**Data source options:**
+- **R2 bucket** - Index files from Cloudflare R2 (supports MD, TXT, HTML, PDF, DOC, DOCX, CSV, JSON)
+- **Website** - Crawl and index website content (requires Cloudflare-hosted domain)
+
+**Indexing lifecycle:**
+- Automatic 6-hour refresh cycle
+- Manual "Force Sync" available (30s rate limit)
+- Not designed for real-time updates
+
+## Quick Start
+
+**1. Create AI Search instance in dashboard:**
+- Go to Cloudflare Dashboard → AI Search → Create
+- Choose data source (R2 or website)
+- Configure instance name and settings
+
+**2. Configure Worker:**
+
+```jsonc
+// wrangler.jsonc
+{
+  "ai": {
+    "binding": "AI"
+  }
+}
+```
+
+**3. Use in Worker:**
+
+```typescript
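+interface Env {
+  AI: Ai; // Workers AI binding declared in wrangler.jsonc
+}
+
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {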
+    const answer = await env.AI.autorag("my-search-instance").aiSearch({
+      query: "How do I configure caching?",
+      model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast"
+    });
+    
+    return Response.json({ answer: answer.response });
+  }
+};
+```
+
+## When to Use AI Search
+
+### AI Search vs Vectorize
+
+| Factor | AI Search | Vectorize |
+|--------|-----------|-----------|
+| **Management** | Fully managed | Manual embedding + indexing |
+| **Use when** | Want zero-ops RAG pipeline | Need custom embeddings/control |
+| **Indexing** | Automatic (6hr cycle) | Manual via API |
+| **Generation** | Built-in optional | Bring your own LLM |
+| **Data sources** | R2 or website | Manual insert |
+| **Best for** | Docs, support, enterprise search | Custom ML pipelines, real-time |
+
+### AI Search vs Direct Workers AI
+
+| Factor | AI Search | Workers AI (direct) |
+|--------|-----------|---------------------|
+| **Context** | Automatic retrieval | Manual context building |
+| **Use when** | Need RAG (search + generate) | Simple generation tasks |
+| **Indexing** | Built-in | Not applicable |
+| **Best for** | Knowledge bases, docs | Simple chat, transformations |
+
+### search() vs aiSearch()
+
+| Method | Returns | Use When |
+|--------|---------|----------|
+| `search()` | Search results only | Building custom UI, need raw chunks |
+| `aiSearch()` | AI response + results | Need ready-to-use answer (chatbot, Q&A) |
+
+### Real-time Updates Consideration
+
+**AI Search is NOT ideal if:**
+- Need real-time content updates (<6 hours)
+- Content changes multiple times per hour
+- Strict freshness requirements
+
+**AI Search IS ideal if:**
+- Content relatively stable (docs, policies, knowledge bases)
+- 6-hour refresh acceptable
+- Prefer zero-ops over real-time
+
+## Platform Limits
+
+| Limit | Value |
+|-------|-------|
+| Max instances per account | 10 |
+| Max files per instance | 100,000 |
+| Max file size | 4 MB |
+| Index frequency | Every 6 hours |
+| Force Sync rate limit | Once per 30 seconds |
+| Filter nesting depth | 2 levels |
+| Filters per compound | 10 |
+| Score threshold range | 0.0 - 1.0 |
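+
+The file-level limits can be checked before anything is uploaded to the source bucket. The following is a minimal sketch, not an official validator: `indexabilityProblem` is a hypothetical helper, the extension list is the one given for R2 sources in this reference, and treating 4 MB as 4 × 1024 × 1024 bytes is an assumption.
+
+```typescript
+// Limits taken from this reference; the binary interpretation of "4 MB" is an assumption.
+const MAX_FILE_BYTES = 4 * 1024 * 1024;
+const SUPPORTED_EXTENSIONS = [".md", ".txt", ".html", ".pdf", ".doc", ".docx", ".csv", ".json"];
+
+// Hypothetical helper: returns a reason if the file would likely be skipped
+// by the indexer, or null if it passes both checks.
+function indexabilityProblem(filename: string, sizeBytes: number): string | null {
+  const dot = filename.lastIndexOf(".");
+  const ext = dot === -1 ? "" : filename.slice(dot).toLowerCase();
+  if (!SUPPORTED_EXTENSIONS.includes(ext)) {
+    return `unsupported format: ${ext || "(no extension)"}`;
+  }
+  if (sizeBytes > MAX_FILE_BYTES) {
+    return `file exceeds 4 MB: ${sizeBytes} bytes`;
+  }
+  return null;
+}
+```
+
+Running a check like this in the upload path surfaces skipped files immediately instead of after the next indexing cycle.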
+
+## Reading Order
+
+Navigate these references based on your task:
+
+| Task | Read | Est. Time |
+|------|------|-----------|
+| **Understand AI Search** | README only | 5 min |
+| **Implement basic search** | README → api.md | 10 min |
+| **Configure data source** | README → configuration.md | 10 min |
+| **Production patterns** | patterns.md | 15 min |
+| **Debug issues** | gotchas.md | 10 min |
+| **Full implementation** | README → api.md → patterns.md | 30 min |
+
+## In This Reference
+
+- **[api.md](api.md)** - API endpoints, methods, TypeScript interfaces
+- **[configuration.md](configuration.md)** - Setup, data sources, wrangler config
+- **[patterns.md](patterns.md)** - Common patterns, decision guidance, code examples
+- **[gotchas.md](gotchas.md)** - Troubleshooting, code-level gotchas, limits
+
+## See Also
+
+- [Cloudflare AI Search Docs](https://developers.cloudflare.com/ai-search/)
+- [Workers AI Docs](https://developers.cloudflare.com/workers-ai/)
+- [Vectorize Docs](https://developers.cloudflare.com/vectorize/)
@@ -0,0 +1,87 @@
+# AI Search API Reference
+
+## Workers Binding
+
+```typescript
+const answer = await env.AI.autorag("instance-name").aiSearch(options);
+const results = await env.AI.autorag("instance-name").search(options);
+const instances = await env.AI.autorag("_").listInstances();
+```
+
+## aiSearch() Options
+
+```typescript
+interface AiSearchOptions {
+  query: string;                          // User query
+  model: string;                          // Workers AI model ID
+  system_prompt?: string;                 // LLM instructions
+  rewrite_query?: boolean;                // Fix typos (default: false)
+  max_num_results?: number;               // Max chunks (default: 10)
+  ranking_options?: { score_threshold?: number }; // 0.0-1.0 (default: 0.3)
+  reranking?: { enabled: boolean; model: string };
+  stream?: boolean;                       // Stream response (default: false)
+  filters?: Filter;                       // Metadata filters
+  page?: string;                          // Pagination token
+}
+```
+
+## Response
+
+```typescript
+interface AiSearchResponse {
+  search_query: string;      // Query used (rewritten if enabled)
+  response: string;          // AI-generated answer
+  data: SearchResult[];      // Retrieved chunks
+  has_more: boolean;
+  next_page?: string;
+}
+
+interface SearchResult {
+  id: string;
+  score: number;
+  content: string;
+  metadata: { filename: string; folder: string; timestamp: number };
+}
+```
+
+## Filters
+
+```typescript
+// Comparison
+{ column: "folder", operator: "gte", value: "docs/" }
+
+// Compound
+{ operator: "and", filters: [
+  { column: "folder", operator: "gte", value: "docs/" },
+  { column: "timestamp", operator: "gte", value: 1704067200 }
+]}
+```
+
+**Operators:** `eq`, `ne`, `gt`, `gte`, `lt`, `lte`
+
+**Built-in metadata:** `filename`, `folder`, `timestamp` (Unix seconds)
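+
+The filter shapes above can be written down as types and checked against the documented limits before a request is sent. This is a sketch under stated assumptions: the `Filter` types are inferred from the examples in this reference rather than copied from an official definition, `validateFilter` is a hypothetical helper, and a compound of comparisons is counted as two levels of nesting.
+
+```typescript
+// Shapes inferred from the examples in this reference (assumption).
+type ComparisonOperator = "eq" | "ne" | "gt" | "gte" | "lt" | "lte";
+
+interface ComparisonFilter {
+  column: string;
+  operator: ComparisonOperator;
+  value: string | number;
+}
+
+interface CompoundFilter {
+  operator: "and" | "or";
+  filters: Filter[];
+}
+
+type Filter = ComparisonFilter | CompoundFilter;
+
+function isCompound(filter: Filter): filter is CompoundFilter {
+  return "filters" in filter;
+}
+
+// Collects violations of the documented limits (nesting depth 2, 10 filters
+// per compound). An empty array means the filter passed.
+function validateFilter(filter: Filter, maxDepth = 2, depth = 1): string[] {
+  if (depth > maxDepth) return [`nesting deeper than ${maxDepth} levels`];
+  if (!isCompound(filter)) return [];
+  const problems: string[] = [];
+  if (filter.filters.length > 10) {
+    problems.push(`compound has ${filter.filters.length} filters (max 10)`);
+  }
+  for (const child of filter.filters) {
+    problems.push(...validateFilter(child, maxDepth, depth + 1));
+  }
+  return problems;
+}
+```
+
+Validating locally keeps malformed filters from reaching the service, where they would presumably surface as a validation error.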
+
+## Streaming
+
+```typescript
+const stream = await env.AI.autorag("docs").aiSearch({ query, model, stream: true });
+return new Response(stream, { headers: { "Content-Type": "text/event-stream" } });
+```
+
+## Error Types
+
+| Error | Cause |
+|-------|-------|
+| `AutoRAGNotFoundError` | Instance doesn't exist |
+| `AutoRAGUnauthorizedError` | Invalid/missing token |
+| `AutoRAGValidationError` | Invalid parameters |
+
+## REST API
+
+```bash
+curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/autorag/rags/{NAME}/ai-search \
+  -H "Authorization: Bearer {TOKEN}" \
+  -H "Content-Type: application/json" \
+  -d '{"query": "...", "model": "@cf/meta/llama-3.3-70b-instruct-fp8-fast"}'
+```
+
+Requires a Service API token with the "AI Search - Read" permission.
@@ -0,0 +1,88 @@
+# AI Search Configuration
+
+## Worker Setup
+
+```jsonc
+// wrangler.jsonc
+{
+  "ai": { "binding": "AI" }
+}
+```
+
+```typescript
+interface Env {
+  AI: Ai;
+}
+
+const answer = await env.AI.autorag("my-instance").aiSearch({
+  query: "How do I configure caching?",
+  model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast"
+});
+```
+
+## Data Sources
+
+### R2 Bucket
+
+Dashboard: AI Search → Create Instance → Select R2 bucket
+
+**Supported formats:** `.md`, `.txt`, `.html`, `.pdf`, `.doc`, `.docx`, `.csv`, `.json`
+
+**Auto-indexed metadata:** `filename`, `folder`, `timestamp`
+
+### Website Crawler
+
+Requirements:
+- Domain on Cloudflare
+- `sitemap.xml` at root
+- Bot protection must allow `CloudflareAISearch` user agent
+
+## Path Filtering (R2)
+
+```
+docs/**/*.md          # All .md in docs/ recursively
+**/*.draft.md         # Exclude (use in exclude patterns)
+```
+
+## Indexing
+
+- **Automatic:** Every 6 hours
+- **Force Sync:** Dashboard button (30s rate limit between syncs)
+- **Pause:** Settings → Pause Indexing (existing index remains searchable)
+
+## Service API Token
+
+Dashboard: AI Search → Instance → Use AI Search → API → Create Token
+
+Permissions:
+- **Read** - search operations
+- **Edit** - instance management
+
+Store securely:
+```bash
+wrangler secret put AI_SEARCH_TOKEN
+```
+
+## Multi-Environment
+
+```toml
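+# wrangler.toml (bindings are not inherited by named environments, so each one repeats [ai])
+[env.production.ai]
+binding = "AI"
+
+[env.production.vars]
+AI_SEARCH_INSTANCE = "prod-docs"
+
+[env.staging.ai]
+binding = "AI"
+
+[env.staging.vars]
+AI_SEARCH_INSTANCE = "staging-docs"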
+```
+
+```typescript
+const model = "@cf/meta/llama-3.3-70b-instruct-fp8-fast";
+const answer = await env.AI.autorag(env.AI_SEARCH_INSTANCE).aiSearch({ query, model });
+```
+
+## Monitoring
+
+```typescript
+const instances = await env.AI.autorag("_").listInstances();
+console.log(instances.find(i => i.name === "docs"));
+```
+
+Dashboard shows: files indexed, status, last index time, storage usage.
@@ -0,0 +1,81 @@
+# AI Search Gotchas
+
+## Type Safety
+
+**Timestamp precision:** Use seconds (10-digit), not milliseconds.
+```typescript
+const nowInSeconds = Math.floor(Date.now() / 1000); // Correct
+```
+
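+**Folder prefix matching:** `gte` is only a lower bound, so on its own it also matches every folder that sorts after the value. For "starts with", add an upper bound in an `and` compound (this assumes lexicographic string comparison).
+```typescript
+filters: { operator: "and", filters: [{ column: "folder", operator: "gte", value: "docs/api/" },
+  { column: "folder", operator: "lt", value: "docs/api0" }] } // "0" sorts right after "/"
+```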
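+
+A "starts with" match is a bounded range: a lower bound at the prefix and an upper bound just past it. The sketch below derives both bounds, assuming folder values compare lexicographically by character code. `folderPrefixRange` is a hypothetical helper, not part of the AI Search API.
+
+```typescript
+// Hypothetical helper: bounds for "folder starts with prefix".
+function folderPrefixRange(prefix: string): { gte: string; lt: string } {
+  if (prefix.length === 0) throw new Error("prefix must not be empty");
+  const lastCode = prefix.charCodeAt(prefix.length - 1);
+  // Incrementing the final character gives the smallest string that sorts
+  // after every string beginning with the prefix.
+  const upper = prefix.slice(0, -1) + String.fromCharCode(lastCode + 1);
+  return { gte: prefix, lt: upper };
+}
+
+const range = folderPrefixRange("docs/api/");
+const prefixFilter = {
+  operator: "and",
+  filters: [
+    { column: "folder", operator: "gte", value: range.gte },
+    { column: "folder", operator: "lt", value: range.lt },
+  ],
+};
+```
+
+With only the lower bound, an unrelated folder such as `guides/` would also match, because it sorts after `docs/api/`.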
+
+## Filter Limitations
+
+| Limit | Value |
+|-------|-------|
+| Max nesting depth | 2 levels |
+| Filters per compound | 10 |
+| `or` operator | Same column, `eq` only |
+
+**OR restriction example:**
+```typescript
+// ✅ Valid: same column, eq only
+{ operator: "or", filters: [
+  { column: "folder", operator: "eq", value: "docs/" },
+  { column: "folder", operator: "eq", value: "guides/" }
+]}
+```
+
+## Indexing Issues
+
+| Problem | Cause | Solution |
+|---------|-------|----------|
+| File not indexed | Unsupported format or >4MB | Check format (.md/.txt/.html/.pdf/.doc/.docx/.csv/.json) and keep each file at or under 4 MB |
+| Index out of sync | 6-hour index cycle | Wait or use "Force Sync" (30s rate limit) |
+| Empty results | Index incomplete | Check dashboard for indexing status |
+
+## Auth Errors
+
+| Error | Cause | Fix |
+|-------|-------|-----|
+| `AutoRAGUnauthorizedError` | Invalid/missing token | Create Service API token with AI Search permissions |
+| `AutoRAGNotFoundError` | Wrong instance name | Verify exact name from dashboard |
+
+## Performance
+
+**Slow responses (>3s):**
+```typescript
+// Raise the score threshold and fetch fewer chunks than the default of 10
+ranking_options: { score_threshold: 0.5 },
+max_num_results: 5
+```
+
+**Empty results debug:**
+1. Remove filters, test basic query
+2. Lower `score_threshold` to 0.1
+3. Check index is populated
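+
+The debug steps above can be scripted so that each relaxation is tried in order. This is a minimal sketch: `diagnoseEmptyResults` and the `SearchFn` type are hypothetical, and the search call is injected so the logic runs without the Workers runtime.
+
+```typescript
+// Minimal structural types for the sketch (assumptions, not official types).
+interface SearchOptions {
+  query: string;
+  filters?: unknown;
+  ranking_options?: { score_threshold?: number };
+}
+type SearchFn = (options: SearchOptions) => Promise<{ data: unknown[] }>;
+type Diagnosis = "ok" | "filters" | "threshold" | "index";
+
+// Relaxes the query one step at a time and reports which relaxation
+// first produced results.
+async function diagnoseEmptyResults(search: SearchFn, options: SearchOptions): Promise<Diagnosis> {
+  const hasResults = async (o: SearchOptions) => (await search(o)).data.length > 0;
+
+  if (await hasResults(options)) return "ok";
+
+  // Step 1: same query without metadata filters.
+  const unfiltered: SearchOptions = { query: options.query, ranking_options: options.ranking_options };
+  if (await hasResults(unfiltered)) return "filters";
+
+  // Step 2: also lower the score threshold to 0.1.
+  const relaxed: SearchOptions = { query: options.query, ranking_options: { score_threshold: 0.1 } };
+  if (await hasResults(relaxed)) return "threshold";
+
+  // Step 3: nothing matched, so check in the dashboard that the index is populated.
+  return "index";
+}
+```
+
+In a Worker, the injected function could plausibly be `(o) => env.AI.autorag("docs").search(o)`, assuming the binding accepts these option names as documented in this reference.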
+
+## Limits
+
+| Resource | Limit |
+|----------|-------|
+| Instances per account | 10 |
+| Files per instance | 100,000 |
+| Max file size | 4 MB |
+| Index frequency | 6 hours |
+
+## Anti-Patterns
+
+**Hardcoded instance names:** read the name from an env var instead.
+```typescript
+const answer = await env.AI.autorag(env.AI_SEARCH_INSTANCE).aiSearch({...});
+```
+
+**Catch-all error handling:** branch on the specific error types instead.
+```typescript
+if (error instanceof AutoRAGNotFoundError) { /* 404 */ }
+if (error instanceof AutoRAGUnauthorizedError) { /* 401 */ }
+```
@@ -0,0 +1,85 @@
+# AI Search Patterns
+
+## search() vs aiSearch()
+
+| Use | Method | Returns |
+|-----|--------|---------|
+| Custom UI, analytics | `search()` | Raw chunks only (~100-300ms) |
+| Chatbots, Q&A | `aiSearch()` | AI response + chunks (~500-2000ms) |
+
+## rewrite_query
+
+| Setting | Use When |
+|---------|----------|
+| `true` | User input (typos, vague queries) |
+| `false` | LLM-generated queries (already optimized) |
+
+## Multitenancy (Folder-Based)
+
+```typescript
+const answer = await env.AI.autorag("saas-docs").aiSearch({
+  query: "refund policy",
+  model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
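+  filters: {
+    operator: "and", // bounded range = "starts with"; `gte` alone would also match later tenants
+    filters: [
+      { column: "folder", operator: "gte", value: `tenants/${tenantId}/` },
+      { column: "folder", operator: "lt", value: `tenants/${tenantId}0` } // "0" sorts right after "/"
+    ]
+  }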
+});
+```
+
+## Streaming
+
+```typescript
+const stream = await env.AI.autorag("docs").aiSearch({
+  query, model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast", stream: true
+});
+return new Response(stream, { headers: { "Content-Type": "text/event-stream" } });
+```
+
+## Score Threshold
+
+| Threshold | Use |
+|-----------|-----|
+| 0.3 (default) | Broad recall, exploratory |
+| 0.5 | Balanced, reasonable starting point for production |
+| 0.7 | High precision, critical accuracy |
+
+## System Prompt Template
+
+```typescript
+const systemPrompt = `You are a documentation assistant.
+- Answer ONLY based on provided context
+- If context doesn't contain answer, say "I don't have information"
+- Include code examples from context`;
+```
+
+## Compound Filters
+
+```typescript
+// OR: Multiple exact folders (`or` accepts only `eq` on a single column)
+filters: {
+  operator: "or",
+  filters: [
+    { column: "folder", operator: "eq", value: "docs/api/" },
+    { column: "folder", operator: "eq", value: "docs/auth/" }
+  ]
+}
+
+// AND: Folder + date
+filters: {
+  operator: "and",
+  filters: [
+    { column: "folder", operator: "gte", value: "docs/" },
+    { column: "folder", operator: "lt", value: "docs0" }, // upper bound: "0" sorts right after "/"
+    { column: "timestamp", operator: "gte", value: oneWeekAgoSeconds }
+  ]
+}
+```
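+
+The date filter needs a cutoff in Unix seconds, which is easy to get wrong when starting from `Date.now()` (milliseconds). A minimal sketch of computing it follows; `weeksAgoSeconds` is a hypothetical helper, and the filter shape follows the examples in this reference.
+
+```typescript
+const SECONDS_PER_WEEK = 7 * 24 * 60 * 60;
+
+// Converts a millisecond clock reading to whole Unix seconds, then steps
+// back the given number of weeks.
+function weeksAgoSeconds(weeks: number, nowMs: number = Date.now()): number {
+  return Math.floor(nowMs / 1000) - weeks * SECONDS_PER_WEEK;
+}
+
+const oneWeekAgoSeconds = weeksAgoSeconds(1);
+const recentFilter = { column: "timestamp", operator: "gte", value: oneWeekAgoSeconds };
+```
+
+Passing `nowMs` explicitly keeps the helper deterministic in tests.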
+
+## Reranking
+
+Enable for high-stakes use cases (adds ~300ms latency):
+
+```typescript
+reranking: { enabled: true, model: "@cf/baai/bge-reranker-base" }
+```