update skills

Commit f9a530667e, parent 0b0783ef8e, 2026-03-17 16:53:22 -07:00
389 changed files with 54512 additions and 1 deletions

# Cloudflare Vectorize
Globally distributed vector database for AI applications. Store and query vector embeddings for semantic search, recommendations, RAG, and classification.
**Status:** Generally Available (GA) | **Last Updated:** 2026-01-27
## Quick Start
```typescript
// 1. Create index
// npx wrangler vectorize create my-index --dimensions=768 --metric=cosine
// 2. Configure binding (wrangler.jsonc)
// { "vectorize": [{ "binding": "VECTORIZE", "index_name": "my-index" }] }
// 3. Query vectors
const matches = await env.VECTORIZE.query(queryVector, { topK: 5 });
```
## Key Features
- **10M vectors per index** (V2)
- Dimensions up to 1536 (32-bit float)
- Three distance metrics: cosine, euclidean, dot-product
- Metadata filtering (up to 10 indexes)
- Namespace support (50K namespaces paid, 1K free)
- Seamless Workers AI integration
- Global distribution
## Reading Order
| Task | Files to Read |
|------|---------------|
| New to Vectorize | README only |
| Implement feature | README + api + patterns |
| Setup/configure | README + configuration |
| Debug issues | gotchas |
| Integrate with AI | README + patterns |
| RAG implementation | README + patterns |
## File Guide
- **README.md** (this file): Overview, quick decisions
- **api.md**: Runtime API, types, operations (query/insert/upsert)
- **configuration.md**: Setup, CLI, metadata indexes
- **patterns.md**: RAG, Workers AI, OpenAI, LangChain, multi-tenant
- **gotchas.md**: Limits, pitfalls, troubleshooting
## Distance Metric Selection
Choose based on your use case:
```
What are you building?
├─ Text/semantic search → cosine (most common)
├─ Image similarity → euclidean
├─ Recommendation system → dot-product
└─ Pre-normalized vectors → dot-product
```
| Metric | Best For | Score Interpretation |
|--------|----------|---------------------|
| `cosine` | Text embeddings, semantic similarity | Higher = closer (1.0 = identical) |
| `euclidean` | Absolute distance, spatial data | Lower = closer (0.0 = identical) |
| `dot-product` | Recommendations, normalized vectors | Higher = closer |
**Note:** Index configuration is immutable. Cannot change dimensions or metric after creation.
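For `dot-product`, scores are only directly comparable when vectors are unit length (pre-normalized vectors make dot-product equivalent to cosine). A minimal sketch of L2-normalizing an embedding before upsert; the helper name is illustrative, not part of any API:

```typescript
// Sketch: L2-normalize an embedding before upserting into a dot-product index.
// With unit-length vectors, dot-product scores behave like cosine similarity.
function normalize(vec: number[]): number[] {
  const norm = Math.sqrt(vec.reduce((sum, v) => sum + v * v, 0));
  if (norm === 0) throw new Error("Cannot normalize a zero vector");
  return vec.map((v) => v / norm);
}
```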
## Multi-Tenancy Strategy
```
How many tenants?
├─ < 50K tenants → Use namespaces (recommended)
│ ├─ Fastest (filter before vector search)
│ └─ Strict isolation
├─ > 50K tenants → Use metadata filtering
│ ├─ Slower (post-filter after vector search)
│ └─ Requires metadata index
└─ Per-tenant indexes → Only if compliance mandated
└─ 50K index limit per account (paid plan)
```
## Common Workflows
### Semantic Search
```typescript
// 1. Generate embedding
const result = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });
// 2. Query Vectorize
const matches = await env.VECTORIZE.query(result.data[0], {
topK: 5,
returnMetadata: "indexed"
});
```
### RAG Pattern
```typescript
// 1. Generate query embedding
const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });
// 2. Search Vectorize
const matches = await env.VECTORIZE.query(embedding.data[0], { topK: 5, returnMetadata: "indexed" });
// 3. Fetch full documents from R2/D1/KV
const docs = await Promise.all(matches.matches.map(m =>
env.R2.get(m.metadata.key).then(obj => obj?.text())
));
// 4. Generate LLM response with context
const answer = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
  prompt: `Context: ${docs.filter(Boolean).join("\n\n")}\n\nQuestion: ${query}\n\nAnswer:`
});
```
## Critical Gotchas
See `gotchas.md` for details. Most important:
1. **Async mutations**: Inserts take 5-10s to be queryable
2. **500 batch limit**: Workers API enforces 500 vectors per call (undocumented)
3. **Metadata truncation**: `"indexed"` returns first 64 bytes only
4. **topK with metadata**: Max 20 (not 100) when using `returnValues: true` or `returnMetadata: "all"`
5. **Metadata indexes first**: Must create before inserting vectors
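For gotcha 1, one workable approach is to poll until a freshly written vector becomes visible. This is a hedged sketch: `waitUntilQueryable` is a hypothetical helper that takes any lookup function (e.g. a thin wrapper around `getByIds`), and it assumes that lookup reflects indexed state:

```typescript
// Sketch (hypothetical helper, not part of the Vectorize API): poll a lookup
// function until a freshly inserted vector shows up, absorbing the 5-10s
// async-indexing delay before the first real query.
async function waitUntilQueryable(
  getByIds: (ids: string[]) => Promise<{ id: string }[]>,
  id: string,
  { attempts = 10, delayMs = 1000 } = {}
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    const found = await getByIds([id]);
    if (found.some((v) => v.id === id)) return true;
    await new Promise((r) => setTimeout(r, delayMs));
  }
  return false; // still not visible after all attempts
}
```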
## Resources
- [Official Docs](https://developers.cloudflare.com/vectorize/)
- [Client API Reference](https://developers.cloudflare.com/vectorize/reference/client-api/)
- [Workers AI Models](https://developers.cloudflare.com/workers-ai/models/#text-embeddings)
- [Discord: #vectorize](https://discord.cloudflare.com)

# Vectorize API Reference
## Types
```typescript
interface VectorizeVector {
id: string; // Max 64 bytes
values: number[]; // Must match index dimensions
namespace?: string; // Optional partition (max 64 bytes)
metadata?: Record<string, any>; // Max 10 KiB
}
```
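A minimal client-side check against the limits above can catch bad batches before an upsert. This is an illustrative sketch, not part of the API: byte sizes are measured with `TextEncoder`, and the 10 KiB metadata limit is approximated via its JSON serialization:

```typescript
// Sketch: validate a vector against Vectorize limits before upserting.
// Returns a list of human-readable problems (empty = looks valid).
interface CandidateVector {
  id: string;
  values: number[];
  namespace?: string;
  metadata?: Record<string, unknown>;
}

function validateVector(vec: CandidateVector, dimensions: number): string[] {
  const errors: string[] = [];
  const bytes = (s: string) => new TextEncoder().encode(s).length;
  if (bytes(vec.id) > 64) errors.push("id exceeds 64 bytes");
  if (vec.values.length !== dimensions)
    errors.push(`expected ${dimensions} dimensions, got ${vec.values.length}`);
  if (vec.namespace !== undefined && bytes(vec.namespace) > 64)
    errors.push("namespace exceeds 64 bytes");
  if (vec.metadata !== undefined && bytes(JSON.stringify(vec.metadata)) > 10 * 1024)
    errors.push("metadata exceeds 10 KiB (approximated as JSON)");
  return errors;
}
```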
## Query
```typescript
const matches = await env.VECTORIZE.query(queryVector, {
topK: 10, // Max 100 (or 20 with returnValues/returnMetadata:"all")
returnMetadata: "indexed", // "none" | "indexed" | "all"
returnValues: false,
namespace: "tenant-123",
filter: { category: "docs" }
});
// matches.matches[0] = { id, score, metadata? }
```
**returnMetadata:** `"none"` (fastest) → `"indexed"` (recommended) → `"all"` (topK max 20)
**queryById (V2 only):** Search using an existing stored vector as the query.
```typescript
await env.VECTORIZE.queryById("doc-123", { topK: 5 });
```
## Insert/Upsert
```typescript
// Insert: ignores duplicates (keeps first)
await env.VECTORIZE.insert([{ id, values, metadata }]);
// Upsert: overwrites duplicates (keeps last)
await env.VECTORIZE.upsert([{ id, values, metadata }]);
```
**Max 500 vectors per call.** Queryable after 5-10 seconds.
## Other Operations
```typescript
// Get by IDs
const vectors = await env.VECTORIZE.getByIds(["id1", "id2"]);
// Delete (max 1000 IDs per call)
await env.VECTORIZE.deleteByIds(["id1", "id2"]);
// Index info
const info = await env.VECTORIZE.describe();
// { dimensions, metric, vectorCount }
```
## Filtering
Requires metadata index. Filter operators:
| Operator | Example |
|----------|---------|
| `$eq` (implicit) | `{ category: "docs" }` |
| `$ne` | `{ status: { $ne: "deleted" } }` |
| `$in` / `$nin` | `{ tag: { $in: ["sale"] } }` |
| `$lt`, `$lte`, `$gt`, `$gte` | `{ price: { $lt: 100 } }` |
**Constraints:** Max 2048 bytes, no dots/`$` in keys, values: string/number/boolean/null.
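A filter can be sanity-checked against these constraints before querying. The sketch below is an illustrative helper (not part of the API) covering the 2048-byte limit, the supported operators, and the allowed value types:

```typescript
// Sketch: validate a metadata filter against Vectorize filter constraints.
// Returns human-readable problems (empty = looks valid).
const ALLOWED_OPS = new Set(["$eq", "$ne", "$in", "$nin", "$lt", "$lte", "$gt", "$gte"]);
const isPrimitive = (v: unknown): boolean =>
  v === null || ["string", "number", "boolean"].includes(typeof v);

function validateFilter(filter: Record<string, unknown>): string[] {
  const errors: string[] = [];
  if (new TextEncoder().encode(JSON.stringify(filter)).length > 2048)
    errors.push("filter exceeds 2048 bytes");
  for (const [key, value] of Object.entries(filter)) {
    if (isPrimitive(value)) continue; // shorthand for implicit $eq
    if (typeof value !== "object" || value === null) {
      errors.push(`unsupported value for "${key}"`);
      continue;
    }
    for (const [op, operand] of Object.entries(value as Record<string, unknown>)) {
      if (!ALLOWED_OPS.has(op)) {
        errors.push(`unknown operator ${op} on "${key}"`);
      } else if (op === "$in" || op === "$nin") {
        if (!Array.isArray(operand) || !operand.every(isPrimitive))
          errors.push(`${op} on "${key}" requires an array of primitives`);
      } else if (!isPrimitive(operand)) {
        errors.push(`invalid operand for ${op} on "${key}"`);
      }
    }
  }
  return errors;
}
```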
## Performance
| Configuration | topK Limit | Speed |
|--------------|------------|-------|
| No metadata | 100 | Fastest |
| `returnMetadata: "indexed"` | 100 | Fast |
| `returnMetadata: "all"` | 20 | Slower |
| `returnValues: true` | 20 | Slower |
**Batch operations:** Always batch (500/call) for optimal throughput.
```typescript
for (let i = 0; i < vectors.length; i += 500) {
await env.VECTORIZE.upsert(vectors.slice(i, i + 500));
}
```

# Vectorize Configuration
## Create Index
```bash
npx wrangler vectorize create my-index --dimensions=768 --metric=cosine
```
**⚠️ Dimensions and metric are immutable** - cannot change after creation.
## Worker Binding
```jsonc
// wrangler.jsonc
{
"vectorize": [
{ "binding": "VECTORIZE", "index_name": "my-index" }
]
}
```
```typescript
interface Env {
VECTORIZE: Vectorize;
}
```
## Metadata Indexes
**Must create BEFORE inserting vectors** - existing vectors not retroactively indexed.
```bash
wrangler vectorize create-metadata-index my-index --property-name=category --type=string
wrangler vectorize create-metadata-index my-index --property-name=price --type=number
```
| Type | Use For |
|------|---------|
| `string` | Categories, tags (first 64 bytes indexed) |
| `number` | Prices, timestamps |
| `boolean` | Flags |
## CLI Commands
```bash
# Index management
wrangler vectorize list
wrangler vectorize info <index-name>
wrangler vectorize delete <index-name>
# Vector operations
wrangler vectorize insert <index-name> --file=embeddings.ndjson
wrangler vectorize get <index-name> --ids=id1,id2
wrangler vectorize delete-by-ids <index-name> --ids=id1,id2
# Metadata indexes
wrangler vectorize list-metadata-index <index-name>
wrangler vectorize delete-metadata-index <index-name> --property-name=field
```
## Bulk Upload (NDJSON)
```ndjson
{"id": "1", "values": [0.1, 0.2, ...], "metadata": {"category": "docs"}}
{"id": "2", "values": [0.4, 0.5, ...], "namespace": "tenant-abc"}
```
**Limits:** 5000 vectors per file, 100 MB max
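Generating NDJSON programmatically is one line of `JSON.stringify` per vector. A minimal sketch that also splits output to respect the 5000-vectors-per-file limit (helper and type names are illustrative):

```typescript
// Sketch: serialize vectors into NDJSON file contents for
// `wrangler vectorize insert`, chunked to stay under the per-file limit.
interface UploadVector {
  id: string;
  values: number[];
  namespace?: string;
  metadata?: Record<string, unknown>;
}

function toNDJSONChunks(vectors: UploadVector[], perFile = 5000): string[] {
  const files: string[] = [];
  for (let i = 0; i < vectors.length; i += perFile) {
    files.push(
      vectors
        .slice(i, i + perFile)
        .map((v) => JSON.stringify(v)) // one JSON object per line
        .join("\n")
    );
  }
  return files;
}
```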
## Cardinality Best Practice
Bucket high-cardinality data:
```typescript
// ❌ Millisecond timestamps
metadata: { timestamp: Date.now() }
// ✅ 5-minute buckets
metadata: { timestamp_bucket: Math.floor(Date.now() / 300000) * 300000 }
```
## Production Checklist
1. Create index with correct dimensions
2. Create metadata indexes FIRST
3. Test bulk upload
4. Configure bindings
5. Deploy Worker
6. Verify queries

# Vectorize Gotchas
## Critical Warnings
### Async Mutations
Insert/upsert/delete return immediately but vectors aren't queryable for 5-10 seconds.
### Batch Size Limit
**Workers API: 500 vectors max per call** (undocumented, silently truncates)
```typescript
// ✅ Chunk into 500
for (let i = 0; i < vectors.length; i += 500) {
await env.VECTORIZE.upsert(vectors.slice(i, i + 500));
}
```
### Metadata Truncation
`returnMetadata: "indexed"` returns only first 64 bytes of strings. Use `"all"` for complete metadata (but max topK drops to 20).
### topK Limits
| returnMetadata | returnValues | Max topK |
|----------------|--------------|----------|
| `"none"` / `"indexed"` | `false` | 100 |
| `"all"` | any | **20** |
| any | `true` | **20** |
### Metadata Indexes First
Create BEFORE inserting - existing vectors not retroactively indexed.
```bash
# ✅ Create index FIRST
wrangler vectorize create-metadata-index my-index --property-name=category --type=string
wrangler vectorize insert my-index --file=data.ndjson
```
### Index Config Immutable
Cannot change dimensions/metric after creation. Must create new index and migrate.
## Limits (V2)
| Resource | Limit |
|----------|-------|
| Vectors per index | 10,000,000 |
| Max dimensions | 1536 |
| Batch upsert (Workers) | **500** |
| Indexed string metadata | **64 bytes** |
| Metadata indexes | 10 |
| Namespaces | 50,000 (paid) / 1,000 (free) |
## Common Mistakes
1. **Wrong embedding shape:** Extract `result.data[0]` from Workers AI
2. **Metadata index created after data:** existing vectors aren't indexed; re-upsert them all
3. **Insert vs upsert:** `insert` ignores duplicates, `upsert` overwrites
4. **Not batching:** Individual inserts ~1K/min, batched ~200K+/min
## Troubleshooting
**No results?**
- Wait 5-10s after insert
- Check namespace spelling (case-sensitive)
- Verify metadata index exists
- Check dimension mismatch
**Metadata filter not working?**
- Index must exist before data insert
- Strings >64 bytes truncated
- Use dot notation for nested: `"product.category"`
## Model Dimensions
- `@cf/baai/bge-small-en-v1.5`: 384
- `@cf/baai/bge-base-en-v1.5`: 768
- `@cf/baai/bge-large-en-v1.5`: 1024
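A dimension mismatch between the embedding model and the index is a silent source of empty results (see Troubleshooting above). A small sketch that encodes the table above and fails fast; the helper name and map are illustrative:

```typescript
// Sketch: guard against a dimension mismatch between the embedding model
// and the query vector before hitting Vectorize.
const MODEL_DIMS: Record<string, number> = {
  "@cf/baai/bge-small-en-v1.5": 384,
  "@cf/baai/bge-base-en-v1.5": 768,
  "@cf/baai/bge-large-en-v1.5": 1024,
};

function assertDimensions(model: string, vector: number[]): void {
  const expected = MODEL_DIMS[model];
  if (expected !== undefined && vector.length !== expected) {
    throw new Error(`${model} emits ${expected}-dim vectors, got ${vector.length}`);
  }
}
```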

# Vectorize Patterns
## Workers AI Integration
```typescript
// Generate embedding + query
const result = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });
const matches = await env.VECTORIZE.query(result.data[0], { topK: 5 }); // Pass data[0]!
```
| Model | Dimensions |
|-------|------------|
| `@cf/baai/bge-small-en-v1.5` | 384 |
| `@cf/baai/bge-base-en-v1.5` | 768 (recommended) |
| `@cf/baai/bge-large-en-v1.5` | 1024 |
## OpenAI Integration
```typescript
const response = await openai.embeddings.create({ model: "text-embedding-ada-002", input: query });
const matches = await env.VECTORIZE.query(response.data[0].embedding, { topK: 5 });
```
## RAG Pattern
```typescript
// 1. Embed query
const emb = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });
// 2. Search vectors
const matches = await env.VECTORIZE.query(emb.data[0], { topK: 5, returnMetadata: "indexed" });
// 3. Fetch full docs from R2/D1/KV
const docs = await Promise.all(matches.matches.map(m => env.R2.get(m.metadata.key).then(o => o?.text())));
// 4. Generate with context
const answer = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
prompt: `Context:\n${docs.filter(Boolean).join("\n\n")}\n\nQuestion: ${query}\n\nAnswer:`
});
```
## Multi-Tenant
### Namespaces (< 50K tenants, fastest)
```typescript
await env.VECTORIZE.upsert([{ id: "1", values: emb, namespace: `tenant-${id}` }]);
await env.VECTORIZE.query(vec, { namespace: `tenant-${id}`, topK: 10 });
```
### Metadata Filter (> 50K tenants)
```bash
wrangler vectorize create-metadata-index my-index --property-name=tenantId --type=string
```
```typescript
await env.VECTORIZE.upsert([{ id: "1", values: emb, metadata: { tenantId: id } }]);
await env.VECTORIZE.query(vec, { filter: { tenantId: id }, topK: 10 });
```
## Hybrid Search
```typescript
const matches = await env.VECTORIZE.query(vec, {
topK: 20,
filter: {
category: { $in: ["tech", "science"] },
published: { $gte: lastMonthTimestamp }
}
});
```
## Batch Ingestion
```typescript
const BATCH = 500;
for (let i = 0; i < vectors.length; i += BATCH) {
await env.VECTORIZE.upsert(vectors.slice(i, i + BATCH));
}
```
## Best Practices
1. **Pass `data[0]`** not `data` or full response
2. **Batch 500** vectors per upsert
3. **Create metadata indexes** before inserting
4. **Use namespaces** for tenant isolation (faster than filters)
5. **`returnMetadata: "indexed"`** for best speed/data balance
6. **Handle 5-10s mutation delay** in async operations