mirror of https://github.com/ksyasuda/dotfiles.git
synced 2026-03-21 06:11:27 -07:00

update skills

This commit is contained in:

133 .agents/skills/cloudflare-deploy/references/vectorize/README.md (new file)
@@ -0,0 +1,133 @@
# Cloudflare Vectorize

Globally distributed vector database for AI applications. Store and query vector embeddings for semantic search, recommendations, RAG, and classification.

**Status:** Generally Available (GA) | **Last Updated:** 2026-01-27

## Quick Start

```typescript
// 1. Create index
// npx wrangler vectorize create my-index --dimensions=768 --metric=cosine

// 2. Configure binding (wrangler.jsonc)
// { "vectorize": [{ "binding": "VECTORIZE", "index_name": "my-index" }] }

// 3. Query vectors
const matches = await env.VECTORIZE.query(queryVector, { topK: 5 });
```

## Key Features

- **10M vectors per index** (V2)
- Dimensions up to 1536 (32-bit float)
- Three distance metrics: cosine, euclidean, dot-product
- Metadata filtering (up to 10 metadata indexes)
- Namespace support (50K namespaces paid, 1K free)
- Seamless Workers AI integration
- Global distribution

## Reading Order

| Task | Files to Read |
|------|---------------|
| New to Vectorize | README only |
| Implement feature | README + api + patterns |
| Setup/configure | README + configuration |
| Debug issues | gotchas |
| Integrate with AI | README + patterns |
| RAG implementation | README + patterns |

## File Guide

- **README.md** (this file): Overview, quick decisions
- **api.md**: Runtime API, types, operations (query/insert/upsert)
- **configuration.md**: Setup, CLI, metadata indexes
- **patterns.md**: RAG, Workers AI, OpenAI, LangChain, multi-tenant
- **gotchas.md**: Limits, pitfalls, troubleshooting

## Distance Metric Selection

Choose based on your use case:

```
What are you building?
├─ Text/semantic search → cosine (most common)
├─ Image similarity → euclidean
├─ Recommendation system → dot-product
└─ Pre-normalized vectors → dot-product
```

| Metric | Best For | Score Interpretation |
|--------|----------|---------------------|
| `cosine` | Text embeddings, semantic similarity | Higher = closer (1.0 = identical) |
| `euclidean` | Absolute distance, spatial data | Lower = closer (0.0 = identical) |
| `dot-product` | Recommendations, normalized vectors | Higher = closer |

**Note:** Index configuration is immutable. Cannot change dimensions or metric after creation.
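
Why pre-normalized vectors can use `dot-product`: for unit-length vectors, cosine similarity and dot product are equal, so the cheaper metric gives the same ranking. A standalone sketch in plain math (no Vectorize API involved):

```typescript
// Plain-math illustration: cosine vs dot-product on unit-length vectors.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function norm(a: number[]): number {
  return Math.sqrt(dot(a, a));
}

function cosine(a: number[], b: number[]): number {
  return dot(a, b) / (norm(a) * norm(b));
}

function normalize(a: number[]): number[] {
  const n = norm(a);
  return a.map((x) => x / n);
}

const a = normalize([3, 4]);
const b = normalize([4, 3]);
// For unit vectors, cosine(a, b) === dot(a, b)
console.log(cosine(a, b).toFixed(4), dot(a, b).toFixed(4)); // both 0.9600
```
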

## Multi-Tenancy Strategy

```
How many tenants?
├─ < 50K tenants → Use namespaces (recommended)
│  ├─ Fastest (filter before vector search)
│  └─ Strict isolation
├─ > 50K tenants → Use metadata filtering
│  ├─ Slower (post-filter after vector search)
│  └─ Requires metadata index
└─ Per-tenant indexes → Only if compliance mandated
   └─ 50K index limit per account (paid plan)
```

## Common Workflows

### Semantic Search

```typescript
// 1. Generate embedding
const result = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });

// 2. Query Vectorize
const matches = await env.VECTORIZE.query(result.data[0], {
  topK: 5,
  returnMetadata: "indexed"
});
```

### RAG Pattern

```typescript
// 1. Generate query embedding
const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });

// 2. Search Vectorize
const matches = await env.VECTORIZE.query(embedding.data[0], { topK: 5 });

// 3. Fetch full documents from R2/D1/KV
const docs = await Promise.all(matches.matches.map(m =>
  env.R2.get(m.metadata.key).then(obj => obj?.text())
));

// 4. Generate LLM response with context
const answer = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
  prompt: `Context: ${docs.join("\n\n")}\n\nQuestion: ${query}\n\nAnswer:`
});
```

## Critical Gotchas

See `gotchas.md` for details. Most important:

1. **Async mutations**: Inserts take 5-10s to be queryable
2. **500 batch limit**: Workers API enforces 500 vectors per call (undocumented)
3. **Metadata truncation**: `"indexed"` returns first 64 bytes only
4. **topK with metadata**: Max 20 (not 100) when using `returnValues` or `returnMetadata: "all"`
5. **Metadata indexes first**: Must create before inserting vectors

## Resources

- [Official Docs](https://developers.cloudflare.com/vectorize/)
- [Client API Reference](https://developers.cloudflare.com/vectorize/reference/client-api/)
- [Workers AI Models](https://developers.cloudflare.com/workers-ai/models/#text-embeddings)
- [Discord: #vectorize](https://discord.cloudflare.com)
88 .agents/skills/cloudflare-deploy/references/vectorize/api.md (new file)
@@ -0,0 +1,88 @@

# Vectorize API Reference

## Types

```typescript
interface VectorizeVector {
  id: string;                     // Max 64 bytes
  values: number[];               // Must match index dimensions
  namespace?: string;             // Optional partition (max 64 bytes)
  metadata?: Record<string, any>; // Max 10 KiB
}
```

## Query

```typescript
const matches = await env.VECTORIZE.query(queryVector, {
  topK: 10,                  // Max 100 (or 20 with returnValues / returnMetadata: "all")
  returnMetadata: "indexed", // "none" | "indexed" | "all"
  returnValues: false,
  namespace: "tenant-123",
  filter: { category: "docs" }
});
// matches.matches[0] = { id, score, metadata? }
```

**returnMetadata:** `"none"` (fastest) → `"indexed"` (recommended) → `"all"` (topK max 20)

**queryById (V2 only):** Search using an existing vector as the query.

```typescript
await env.VECTORIZE.queryById("doc-123", { topK: 5 });
```

## Insert/Upsert

```typescript
// Insert: ignores duplicates (keeps first)
await env.VECTORIZE.insert([{ id, values, metadata }]);

// Upsert: overwrites duplicates (keeps last)
await env.VECTORIZE.upsert([{ id, values, metadata }]);
```

**Max 500 vectors per call.** Queryable after 5-10 seconds.

## Other Operations

```typescript
// Get by IDs
const vectors = await env.VECTORIZE.getByIds(["id1", "id2"]);

// Delete (max 1000 IDs per call)
await env.VECTORIZE.deleteByIds(["id1", "id2"]);

// Index info
const info = await env.VECTORIZE.describe();
// { dimensions, metric, vectorCount }
```

## Filtering

Requires a metadata index. Filter operators:

| Operator | Example |
|----------|---------|
| `$eq` (implicit) | `{ category: "docs" }` |
| `$ne` | `{ status: { $ne: "deleted" } }` |
| `$in` / `$nin` | `{ tag: { $in: ["sale"] } }` |
| `$lt`, `$lte`, `$gt`, `$gte` | `{ price: { $lt: 100 } }` |

**Constraints:** Max 2048 bytes, no dots/`$` in keys, values: string/number/boolean/null.
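
The operators behave like their MongoDB-style counterparts. A minimal local evaluator of the table above, to make the semantics concrete (illustrative only; Vectorize applies filters server-side, this is not its API):

```typescript
// Minimal local evaluator for the operator table above.
// Illustrative only: Vectorize evaluates filters server-side.
type Primitive = string | number | boolean | null;
type Condition = Primitive | { [op: string]: Primitive | Primitive[] };

function matchesFilter(
  metadata: Record<string, Primitive>,
  filter: Record<string, Condition>
): boolean {
  return Object.entries(filter).every(([key, cond]) => {
    const value = metadata[key];
    if (cond === null || typeof cond !== "object") {
      return value === cond; // implicit $eq
    }
    return Object.entries(cond).every(([op, operand]) => {
      switch (op) {
        case "$eq":  return value === operand;
        case "$ne":  return value !== operand;
        case "$in":  return (operand as Primitive[]).includes(value);
        case "$nin": return !(operand as Primitive[]).includes(value);
        case "$lt":  return (value as number) < (operand as number);
        case "$lte": return (value as number) <= (operand as number);
        case "$gt":  return (value as number) > (operand as number);
        case "$gte": return (value as number) >= (operand as number);
        default:     return false;
      }
    });
  });
}

matchesFilter({ category: "docs", price: 50 },
              { category: "docs", price: { $lt: 100 } }); // → true
```
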

## Performance

| Configuration | topK Limit | Speed |
|--------------|------------|-------|
| No metadata | 100 | Fastest |
| `returnMetadata: "indexed"` | 100 | Fast |
| `returnMetadata: "all"` | 20 | Slower |
| `returnValues: true` | 20 | Slower |

**Batch operations:** Always batch (500/call) for optimal throughput.

```typescript
for (let i = 0; i < vectors.length; i += 500) {
  await env.VECTORIZE.upsert(vectors.slice(i, i + 500));
}
```
@@ -0,0 +1,88 @@

# Vectorize Configuration

## Create Index

```bash
npx wrangler vectorize create my-index --dimensions=768 --metric=cosine
```

**⚠️ Dimensions and metric are immutable** - cannot change after creation.

## Worker Binding

```jsonc
// wrangler.jsonc
{
  "vectorize": [
    { "binding": "VECTORIZE", "index_name": "my-index" }
  ]
}
```

```typescript
interface Env {
  VECTORIZE: Vectorize;
}
```

## Metadata Indexes

**Must create BEFORE inserting vectors** - existing vectors are not retroactively indexed.

```bash
wrangler vectorize create-metadata-index my-index --property-name=category --type=string
wrangler vectorize create-metadata-index my-index --property-name=price --type=number
```

| Type | Use For |
|------|---------|
| `string` | Categories, tags (first 64 bytes indexed) |
| `number` | Prices, timestamps |
| `boolean` | Flags |

## CLI Commands

```bash
# Index management
wrangler vectorize list
wrangler vectorize info <index-name>
wrangler vectorize delete <index-name>

# Vector operations
wrangler vectorize insert <index-name> --file=embeddings.ndjson
wrangler vectorize get <index-name> --ids=id1,id2
wrangler vectorize delete-by-ids <index-name> --ids=id1,id2

# Metadata indexes
wrangler vectorize list-metadata-index <index-name>
wrangler vectorize delete-metadata-index <index-name> --property-name=field
```

## Bulk Upload (NDJSON)

```json
{"id": "1", "values": [0.1, 0.2, ...], "metadata": {"category": "docs"}}
{"id": "2", "values": [0.4, 0.5, ...], "namespace": "tenant-abc"}
```

**Limits:** 5000 vectors per file, 100 MB max
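
To stay inside those limits, vectors can be serialized to NDJSON and split into chunks of at most 5000 lines, one file per chunk. A minimal sketch (the record shape mirrors `VectorizeVector`; the helper name is illustrative):

```typescript
// Serialize vectors to NDJSON chunks of at most 5000 lines each,
// matching the documented per-file limit for `wrangler vectorize insert`.
interface VectorRecord {
  id: string;
  values: number[];
  metadata?: Record<string, string | number | boolean | null>;
  namespace?: string;
}

const MAX_VECTORS_PER_FILE = 5000;

function toNdjsonChunks(vectors: VectorRecord[]): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < vectors.length; i += MAX_VECTORS_PER_FILE) {
    chunks.push(
      vectors
        .slice(i, i + MAX_VECTORS_PER_FILE)
        .map((v) => JSON.stringify(v))
        .join("\n")
    );
  }
  return chunks; // write each chunk to its own .ndjson file
}

const chunks = toNdjsonChunks([{ id: "1", values: [0.1, 0.2] }]);
// chunks[0] === '{"id":"1","values":[0.1,0.2]}'
```
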

## Cardinality Best Practice

Bucket high-cardinality data:

```typescript
// ❌ Millisecond timestamps
metadata: { timestamp: Date.now() }

// ✅ 5-minute buckets
metadata: { timestamp_bucket: Math.floor(Date.now() / 300000) * 300000 }
```

## Production Checklist

1. Create index with correct dimensions
2. Create metadata indexes FIRST
3. Test bulk upload
4. Configure bindings
5. Deploy Worker
6. Verify queries
@@ -0,0 +1,76 @@

# Vectorize Gotchas

## Critical Warnings

### Async Mutations

Insert/upsert/delete return immediately, but vectors aren't queryable for 5-10 seconds.
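
In tests or ingestion pipelines, one way to cope is to poll until a freshly written vector becomes visible. A generic sketch (the `waitFor` helper and its parameters are illustrative, not part of the Vectorize API):

```typescript
// Illustrative polling helper for eventually-consistent writes.
// Polls `check` until it returns true or `timeoutMs` elapses.
async function waitFor(
  check: () => Promise<boolean>,
  timeoutMs = 15_000,
  intervalMs = 1_000
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await check()) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}

// Usage sketch: wait until an upserted id shows up in getByIds.
// await waitFor(async () =>
//   (await env.VECTORIZE.getByIds(["doc-1"])).length > 0
// );
```
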

### Batch Size Limit

**Workers API: 500 vectors max per call** (undocumented, silently truncates)

```typescript
// ✅ Chunk into 500
for (let i = 0; i < vectors.length; i += 500) {
  await env.VECTORIZE.upsert(vectors.slice(i, i + 500));
}
```

### Metadata Truncation

`returnMetadata: "indexed"` returns only the first 64 bytes of strings. Use `"all"` for complete metadata (but max topK drops to 20).

### topK Limits

| returnMetadata | returnValues | Max topK |
|----------------|--------------|----------|
| `"none"` / `"indexed"` | `false` | 100 |
| `"all"` | any | **20** |
| any | `true` | **20** |
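
The table collapses to a one-liner; a hypothetical helper for clamping `topK` before querying (the name and shape are illustrative, not part of the Vectorize API):

```typescript
// Illustrative: maximum allowed topK for a query, per the table above.
function maxTopK(
  returnMetadata: "none" | "indexed" | "all",
  returnValues: boolean
): number {
  return returnMetadata === "all" || returnValues ? 20 : 100;
}

maxTopK("indexed", false); // → 100
maxTopK("all", false);     // → 20
maxTopK("none", true);     // → 20
```
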

### Metadata Indexes First

Create them BEFORE inserting - existing vectors are not retroactively indexed.

```bash
# ✅ Create index FIRST
wrangler vectorize create-metadata-index my-index --property-name=category --type=string
wrangler vectorize insert my-index --file=data.ndjson
```

### Index Config Immutable

Cannot change dimensions/metric after creation. Must create a new index and migrate.

## Limits (V2)

| Resource | Limit |
|----------|-------|
| Vectors per index | 10,000,000 |
| Max dimensions | 1536 |
| Batch upsert (Workers) | **500** |
| Indexed string metadata | **64 bytes** |
| Metadata indexes | 10 |
| Namespaces | 50,000 (paid) / 1,000 (free) |

## Common Mistakes

1. **Wrong embedding shape:** Extract `result.data[0]` from the Workers AI response
2. **Metadata index after data:** Re-upsert all vectors
3. **Insert vs upsert:** `insert` ignores duplicates, `upsert` overwrites
4. **Not batching:** Individual inserts ~1K/min, batched ~200K+/min

## Troubleshooting

**No results?**

- Wait 5-10s after insert
- Check namespace spelling (case-sensitive)
- Verify the metadata index exists
- Check for a dimension mismatch

**Metadata filter not working?**

- Index must exist before data insert
- Strings >64 bytes are truncated
- Use dot notation for nested fields: `"product.category"`

## Model Dimensions

- `@cf/baai/bge-small-en-v1.5`: 384
- `@cf/baai/bge-base-en-v1.5`: 768
- `@cf/baai/bge-large-en-v1.5`: 1024
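
A dimension mismatch between the embedding model and the index is a common silent failure. A small guard sketch (the model-to-dimension map mirrors the list above; the helper name is illustrative):

```typescript
// Illustrative guard: assert an embedding's length matches the expected
// dimensions for the model that produced it (values from the list above).
const MODEL_DIMENSIONS: Record<string, number> = {
  "@cf/baai/bge-small-en-v1.5": 384,
  "@cf/baai/bge-base-en-v1.5": 768,
  "@cf/baai/bge-large-en-v1.5": 1024,
};

function checkDimensions(model: string, embedding: number[]): void {
  const expected = MODEL_DIMENSIONS[model];
  if (expected === undefined) {
    throw new Error(`unknown model: ${model}`);
  }
  if (embedding.length !== expected) {
    throw new Error(
      `expected ${expected} dimensions for ${model}, got ${embedding.length}`
    );
  }
}

checkDimensions("@cf/baai/bge-base-en-v1.5", new Array(768).fill(0)); // ok
```
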
@@ -0,0 +1,90 @@

# Vectorize Patterns

## Workers AI Integration

```typescript
// Generate embedding + query
const result = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });
const matches = await env.VECTORIZE.query(result.data[0], { topK: 5 }); // Pass data[0]!
```

| Model | Dimensions |
|-------|------------|
| `@cf/baai/bge-small-en-v1.5` | 384 |
| `@cf/baai/bge-base-en-v1.5` | 768 (recommended) |
| `@cf/baai/bge-large-en-v1.5` | 1024 |

## OpenAI Integration

```typescript
const response = await openai.embeddings.create({ model: "text-embedding-ada-002", input: query });
const matches = await env.VECTORIZE.query(response.data[0].embedding, { topK: 5 });
```

## RAG Pattern

```typescript
// 1. Embed query
const emb = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });

// 2. Search vectors
const matches = await env.VECTORIZE.query(emb.data[0], { topK: 5, returnMetadata: "indexed" });

// 3. Fetch full docs from R2/D1/KV
const docs = await Promise.all(matches.matches.map(m => env.R2.get(m.metadata.key).then(o => o?.text())));

// 4. Generate with context
const answer = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
  prompt: `Context:\n${docs.filter(Boolean).join("\n\n")}\n\nQuestion: ${query}\n\nAnswer:`
});
```

## Multi-Tenant

### Namespaces (< 50K tenants, fastest)

```typescript
await env.VECTORIZE.upsert([{ id: "1", values: emb, namespace: `tenant-${id}` }]);
await env.VECTORIZE.query(vec, { namespace: `tenant-${id}`, topK: 10 });
```

### Metadata Filter (> 50K tenants)

```bash
wrangler vectorize create-metadata-index my-index --property-name=tenantId --type=string
```

```typescript
await env.VECTORIZE.upsert([{ id: "1", values: emb, metadata: { tenantId: id } }]);
await env.VECTORIZE.query(vec, { filter: { tenantId: id }, topK: 10 });
```

## Hybrid Search

```typescript
const matches = await env.VECTORIZE.query(vec, {
  topK: 20,
  filter: {
    category: { $in: ["tech", "science"] },
    published: { $gte: lastMonthTimestamp }
  }
});
```

## Batch Ingestion

```typescript
const BATCH = 500;
for (let i = 0; i < vectors.length; i += BATCH) {
  await env.VECTORIZE.upsert(vectors.slice(i, i + BATCH));
}
```

## Best Practices

1. **Pass `data[0]`**, not `data` or the full response
2. **Batch 500** vectors per upsert
3. **Create metadata indexes** before inserting
4. **Use namespaces** for tenant isolation (faster than filters)
5. **`returnMetadata: "indexed"`** for the best speed/data balance
6. **Handle the 5-10s mutation delay** in async operations