mirror of https://github.com/ksyasuda/dotfiles.git
synced 2026-03-21 06:11:27 -07:00

update skills

This commit is contained in:

133 .agents/skills/cloudflare-deploy/references/vectorize/README.md (new file)
@@ -0,0 +1,133 @@
# Cloudflare Vectorize

Globally distributed vector database for AI applications. Store and query vector embeddings for semantic search, recommendations, RAG, and classification.

**Status:** Generally Available (GA) | **Last Updated:** 2026-01-27

## Quick Start

```typescript
// 1. Create index
// npx wrangler vectorize create my-index --dimensions=768 --metric=cosine

// 2. Configure binding (wrangler.jsonc)
// { "vectorize": [{ "binding": "VECTORIZE", "index_name": "my-index" }] }

// 3. Query vectors
const matches = await env.VECTORIZE.query(queryVector, { topK: 5 });
```

## Key Features

- **10M vectors per index** (V2)
- Dimensions up to 1536 (32-bit float)
- Three distance metrics: cosine, euclidean, dot-product
- Metadata filtering (up to 10 metadata indexes)
- Namespace support (50K namespaces paid, 1K free)
- Seamless Workers AI integration
- Global distribution

## Reading Order

| Task | Files to Read |
|------|---------------|
| New to Vectorize | README only |
| Implement feature | README + api + patterns |
| Setup/configure | README + configuration |
| Debug issues | gotchas |
| Integrate with AI | README + patterns |
| RAG implementation | README + patterns |

## File Guide

- **README.md** (this file): Overview, quick decisions
- **api.md**: Runtime API, types, operations (query/insert/upsert)
- **configuration.md**: Setup, CLI, metadata indexes
- **patterns.md**: RAG, Workers AI, OpenAI, LangChain, multi-tenant
- **gotchas.md**: Limits, pitfalls, troubleshooting

## Distance Metric Selection

Choose based on your use case:

```
What are you building?
├─ Text/semantic search → cosine (most common)
├─ Image similarity → euclidean
├─ Recommendation system → dot-product
└─ Pre-normalized vectors → dot-product
```

| Metric | Best For | Score Interpretation |
|--------|----------|---------------------|
| `cosine` | Text embeddings, semantic similarity | Higher = closer (1.0 = identical) |
| `euclidean` | Absolute distance, spatial data | Lower = closer (0.0 = identical) |
| `dot-product` | Recommendations, normalized vectors | Higher = closer |

**Note:** Index configuration is immutable. Cannot change dimensions or metric after creation.
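
Why pre-normalized vectors can use `dot-product`: for unit-length vectors, cosine similarity and dot product are equal, so the cheaper metric gives the same ranking. A standalone sketch in plain math (no Vectorize API involved):

```typescript
// Plain-math illustration: cosine vs dot-product on unit-length vectors.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function norm(a: number[]): number {
  return Math.sqrt(dot(a, a));
}

function cosine(a: number[], b: number[]): number {
  return dot(a, b) / (norm(a) * norm(b));
}

function normalize(a: number[]): number[] {
  const n = norm(a);
  return a.map((x) => x / n);
}

const a = normalize([3, 4]);
const b = normalize([4, 3]);
// For unit vectors, cosine(a, b) === dot(a, b)
console.log(cosine(a, b).toFixed(4), dot(a, b).toFixed(4)); // both 0.9600
```
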

## Multi-Tenancy Strategy

```
How many tenants?
├─ < 50K tenants → Use namespaces (recommended)
│  ├─ Fastest (filter before vector search)
│  └─ Strict isolation
├─ > 50K tenants → Use metadata filtering
│  ├─ Slower (post-filter after vector search)
│  └─ Requires metadata index
└─ Per-tenant indexes → Only if compliance mandated
   └─ 50K index limit per account (paid plan)
```

## Common Workflows

### Semantic Search

```typescript
// 1. Generate embedding
const result = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });

// 2. Query Vectorize
const matches = await env.VECTORIZE.query(result.data[0], {
  topK: 5,
  returnMetadata: "indexed"
});
```

### RAG Pattern

```typescript
// 1. Generate query embedding
const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });

// 2. Search Vectorize
const matches = await env.VECTORIZE.query(embedding.data[0], { topK: 5 });

// 3. Fetch full documents from R2/D1/KV
const docs = await Promise.all(matches.matches.map(m =>
  env.R2.get(m.metadata.key).then(obj => obj?.text())
));

// 4. Generate LLM response with context
const answer = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
  prompt: `Context: ${docs.join("\n\n")}\n\nQuestion: ${query}\n\nAnswer:`
});
```

## Critical Gotchas

See `gotchas.md` for details. Most important:

1. **Async mutations**: Inserts take 5-10s to be queryable
2. **500 batch limit**: Workers API enforces 500 vectors per call (undocumented)
3. **Metadata truncation**: `"indexed"` returns first 64 bytes only
4. **topK with metadata**: Max 20 (not 100) when using `returnValues` or `returnMetadata: "all"`
5. **Metadata indexes first**: Must create before inserting vectors

## Resources

- [Official Docs](https://developers.cloudflare.com/vectorize/)
- [Client API Reference](https://developers.cloudflare.com/vectorize/reference/client-api/)
- [Workers AI Models](https://developers.cloudflare.com/workers-ai/models/#text-embeddings)
- [Discord: #vectorize](https://discord.cloudflare.com)
88 .agents/skills/cloudflare-deploy/references/vectorize/api.md (new file)
@@ -0,0 +1,88 @@

# Vectorize API Reference

## Types

```typescript
interface VectorizeVector {
  id: string;                     // Max 64 bytes
  values: number[];               // Must match index dimensions
  namespace?: string;             // Optional partition (max 64 bytes)
  metadata?: Record<string, any>; // Max 10 KiB
}
```

## Query

```typescript
const matches = await env.VECTORIZE.query(queryVector, {
  topK: 10,                  // Max 100 (or 20 with returnValues / returnMetadata: "all")
  returnMetadata: "indexed", // "none" | "indexed" | "all"
  returnValues: false,
  namespace: "tenant-123",
  filter: { category: "docs" }
});
// matches.matches[0] = { id, score, metadata? }
```

**returnMetadata:** `"none"` (fastest) → `"indexed"` (recommended) → `"all"` (topK max 20)

**queryById (V2 only):** Search using an existing vector as the query.

```typescript
await env.VECTORIZE.queryById("doc-123", { topK: 5 });
```

## Insert/Upsert

```typescript
// Insert: ignores duplicates (keeps first)
await env.VECTORIZE.insert([{ id, values, metadata }]);

// Upsert: overwrites duplicates (keeps last)
await env.VECTORIZE.upsert([{ id, values, metadata }]);
```

**Max 500 vectors per call.** Queryable after 5-10 seconds.

## Other Operations

```typescript
// Get by IDs
const vectors = await env.VECTORIZE.getByIds(["id1", "id2"]);

// Delete (max 1000 IDs per call)
await env.VECTORIZE.deleteByIds(["id1", "id2"]);

// Index info
const info = await env.VECTORIZE.describe();
// { dimensions, metric, vectorCount }
```

## Filtering

Requires a metadata index. Filter operators:

| Operator | Example |
|----------|---------|
| `$eq` (implicit) | `{ category: "docs" }` |
| `$ne` | `{ status: { $ne: "deleted" } }` |
| `$in` / `$nin` | `{ tag: { $in: ["sale"] } }` |
| `$lt`, `$lte`, `$gt`, `$gte` | `{ price: { $lt: 100 } }` |

**Constraints:** Max 2048 bytes, no dots/`$` in keys, values: string/number/boolean/null.
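
The operators behave like their MongoDB-style counterparts. A minimal local evaluator of the table above, to make the semantics concrete (illustrative only; Vectorize applies filters server-side, this is not its API):

```typescript
// Minimal local evaluator for the operator table above.
// Illustrative only: Vectorize evaluates filters server-side.
type Primitive = string | number | boolean | null;
type Condition = Primitive | { [op: string]: Primitive | Primitive[] };

function matchesFilter(
  metadata: Record<string, Primitive>,
  filter: Record<string, Condition>
): boolean {
  return Object.entries(filter).every(([key, cond]) => {
    const value = metadata[key];
    if (cond === null || typeof cond !== "object") {
      return value === cond; // implicit $eq
    }
    return Object.entries(cond).every(([op, operand]) => {
      switch (op) {
        case "$eq":  return value === operand;
        case "$ne":  return value !== operand;
        case "$in":  return (operand as Primitive[]).includes(value);
        case "$nin": return !(operand as Primitive[]).includes(value);
        case "$lt":  return (value as number) < (operand as number);
        case "$lte": return (value as number) <= (operand as number);
        case "$gt":  return (value as number) > (operand as number);
        case "$gte": return (value as number) >= (operand as number);
        default:     return false;
      }
    });
  });
}

matchesFilter({ category: "docs", price: 50 },
              { category: "docs", price: { $lt: 100 } }); // → true
```
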

## Performance

| Configuration | topK Limit | Speed |
|--------------|------------|-------|
| No metadata | 100 | Fastest |
| `returnMetadata: "indexed"` | 100 | Fast |
| `returnMetadata: "all"` | 20 | Slower |
| `returnValues: true` | 20 | Slower |

**Batch operations:** Always batch (500/call) for optimal throughput.

```typescript
for (let i = 0; i < vectors.length; i += 500) {
  await env.VECTORIZE.upsert(vectors.slice(i, i + 500));
}
```
@@ -0,0 +1,88 @@

# Vectorize Configuration

## Create Index

```bash
npx wrangler vectorize create my-index --dimensions=768 --metric=cosine
```

**⚠️ Dimensions and metric are immutable** - cannot change after creation.

## Worker Binding

```jsonc
// wrangler.jsonc
{
  "vectorize": [
    { "binding": "VECTORIZE", "index_name": "my-index" }
  ]
}
```

```typescript
interface Env {
  VECTORIZE: Vectorize;
}
```

## Metadata Indexes

**Must create BEFORE inserting vectors** - existing vectors are not retroactively indexed.

```bash
wrangler vectorize create-metadata-index my-index --property-name=category --type=string
wrangler vectorize create-metadata-index my-index --property-name=price --type=number
```

| Type | Use For |
|------|---------|
| `string` | Categories, tags (first 64 bytes indexed) |
| `number` | Prices, timestamps |
| `boolean` | Flags |

## CLI Commands

```bash
# Index management
wrangler vectorize list
wrangler vectorize info <index-name>
wrangler vectorize delete <index-name>

# Vector operations
wrangler vectorize insert <index-name> --file=embeddings.ndjson
wrangler vectorize get <index-name> --ids=id1,id2
wrangler vectorize delete-by-ids <index-name> --ids=id1,id2

# Metadata indexes
wrangler vectorize list-metadata-index <index-name>
wrangler vectorize delete-metadata-index <index-name> --property-name=field
```

## Bulk Upload (NDJSON)

```json
{"id": "1", "values": [0.1, 0.2, ...], "metadata": {"category": "docs"}}
{"id": "2", "values": [0.4, 0.5, ...], "namespace": "tenant-abc"}
```

**Limits:** 5000 vectors per file, 100 MB max
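
To stay inside those limits, vectors can be serialized to NDJSON and split into chunks of at most 5000 lines, one file per chunk. A minimal sketch (the record shape mirrors `VectorizeVector`; the helper name is illustrative):

```typescript
// Serialize vectors to NDJSON chunks of at most 5000 lines each,
// matching the documented per-file limit for `wrangler vectorize insert`.
interface VectorRecord {
  id: string;
  values: number[];
  metadata?: Record<string, string | number | boolean | null>;
  namespace?: string;
}

const MAX_VECTORS_PER_FILE = 5000;

function toNdjsonChunks(vectors: VectorRecord[]): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < vectors.length; i += MAX_VECTORS_PER_FILE) {
    chunks.push(
      vectors
        .slice(i, i + MAX_VECTORS_PER_FILE)
        .map((v) => JSON.stringify(v))
        .join("\n")
    );
  }
  return chunks; // write each chunk to its own .ndjson file
}

const chunks = toNdjsonChunks([{ id: "1", values: [0.1, 0.2] }]);
// chunks[0] === '{"id":"1","values":[0.1,0.2]}'
```
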

## Cardinality Best Practice

Bucket high-cardinality data:

```typescript
// ❌ Millisecond timestamps
metadata: { timestamp: Date.now() }

// ✅ 5-minute buckets
metadata: { timestamp_bucket: Math.floor(Date.now() / 300000) * 300000 }
```

## Production Checklist

1. Create index with correct dimensions
2. Create metadata indexes FIRST
3. Test bulk upload
4. Configure bindings
5. Deploy Worker
6. Verify queries
@@ -0,0 +1,76 @@

# Vectorize Gotchas

## Critical Warnings

### Async Mutations

Insert/upsert/delete return immediately, but vectors aren't queryable for 5-10 seconds.
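
In tests or ingestion pipelines, one way to cope is to poll until a freshly written vector becomes visible. A generic sketch (the `waitFor` helper and its parameters are illustrative, not part of the Vectorize API):

```typescript
// Illustrative polling helper for eventually-consistent writes.
// Polls `check` until it returns true or `timeoutMs` elapses.
async function waitFor(
  check: () => Promise<boolean>,
  timeoutMs = 15_000,
  intervalMs = 1_000
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await check()) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}

// Usage sketch: wait until an upserted id shows up in getByIds.
// await waitFor(async () =>
//   (await env.VECTORIZE.getByIds(["doc-1"])).length > 0
// );
```
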

### Batch Size Limit

**Workers API: 500 vectors max per call** (undocumented, silently truncates)

```typescript
// ✅ Chunk into 500
for (let i = 0; i < vectors.length; i += 500) {
  await env.VECTORIZE.upsert(vectors.slice(i, i + 500));
}
```

### Metadata Truncation

`returnMetadata: "indexed"` returns only the first 64 bytes of strings. Use `"all"` for complete metadata (but max topK drops to 20).

### topK Limits

| returnMetadata | returnValues | Max topK |
|----------------|--------------|----------|
| `"none"` / `"indexed"` | `false` | 100 |
| `"all"` | any | **20** |
| any | `true` | **20** |
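
The table collapses to a one-liner; a hypothetical helper for clamping `topK` before querying (the name and shape are illustrative, not part of the Vectorize API):

```typescript
// Illustrative: maximum allowed topK for a query, per the table above.
function maxTopK(
  returnMetadata: "none" | "indexed" | "all",
  returnValues: boolean
): number {
  return returnMetadata === "all" || returnValues ? 20 : 100;
}

maxTopK("indexed", false); // → 100
maxTopK("all", false);     // → 20
maxTopK("none", true);     // → 20
```
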

### Metadata Indexes First

Create them BEFORE inserting - existing vectors are not retroactively indexed.

```bash
# ✅ Create index FIRST
wrangler vectorize create-metadata-index my-index --property-name=category --type=string
wrangler vectorize insert my-index --file=data.ndjson
```

### Index Config Immutable

Cannot change dimensions/metric after creation. Must create a new index and migrate.

## Limits (V2)

| Resource | Limit |
|----------|-------|
| Vectors per index | 10,000,000 |
| Max dimensions | 1536 |
| Batch upsert (Workers) | **500** |
| Indexed string metadata | **64 bytes** |
| Metadata indexes | 10 |
| Namespaces | 50,000 (paid) / 1,000 (free) |

## Common Mistakes

1. **Wrong embedding shape:** Extract `result.data[0]` from the Workers AI response
2. **Metadata index after data:** Re-upsert all vectors
3. **Insert vs upsert:** `insert` ignores duplicates, `upsert` overwrites
4. **Not batching:** Individual inserts ~1K/min, batched ~200K+/min

## Troubleshooting

**No results?**

- Wait 5-10s after insert
- Check namespace spelling (case-sensitive)
- Verify the metadata index exists
- Check for a dimension mismatch

**Metadata filter not working?**

- Index must exist before data insert
- Strings >64 bytes are truncated
- Use dot notation for nested fields: `"product.category"`

## Model Dimensions

- `@cf/baai/bge-small-en-v1.5`: 384
- `@cf/baai/bge-base-en-v1.5`: 768
- `@cf/baai/bge-large-en-v1.5`: 1024
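
A dimension mismatch between the embedding model and the index is a common silent failure. A small guard sketch (the model-to-dimension map mirrors the list above; the helper name is illustrative):

```typescript
// Illustrative guard: assert an embedding's length matches the expected
// dimensions for the model that produced it (values from the list above).
const MODEL_DIMENSIONS: Record<string, number> = {
  "@cf/baai/bge-small-en-v1.5": 384,
  "@cf/baai/bge-base-en-v1.5": 768,
  "@cf/baai/bge-large-en-v1.5": 1024,
};

function checkDimensions(model: string, embedding: number[]): void {
  const expected = MODEL_DIMENSIONS[model];
  if (expected === undefined) {
    throw new Error(`unknown model: ${model}`);
  }
  if (embedding.length !== expected) {
    throw new Error(
      `expected ${expected} dimensions for ${model}, got ${embedding.length}`
    );
  }
}

checkDimensions("@cf/baai/bge-base-en-v1.5", new Array(768).fill(0)); // ok
```
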
@@ -0,0 +1,90 @@

# Vectorize Patterns

## Workers AI Integration

```typescript
// Generate embedding + query
const result = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });
const matches = await env.VECTORIZE.query(result.data[0], { topK: 5 }); // Pass data[0]!
```

| Model | Dimensions |
|-------|------------|
| `@cf/baai/bge-small-en-v1.5` | 384 |
| `@cf/baai/bge-base-en-v1.5` | 768 (recommended) |
| `@cf/baai/bge-large-en-v1.5` | 1024 |

## OpenAI Integration

```typescript
const response = await openai.embeddings.create({ model: "text-embedding-ada-002", input: query });
const matches = await env.VECTORIZE.query(response.data[0].embedding, { topK: 5 });
```

## RAG Pattern

```typescript
// 1. Embed query
const emb = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });

// 2. Search vectors
const matches = await env.VECTORIZE.query(emb.data[0], { topK: 5, returnMetadata: "indexed" });

// 3. Fetch full docs from R2/D1/KV
const docs = await Promise.all(matches.matches.map(m => env.R2.get(m.metadata.key).then(o => o?.text())));

// 4. Generate with context
const answer = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
  prompt: `Context:\n${docs.filter(Boolean).join("\n\n")}\n\nQuestion: ${query}\n\nAnswer:`
});
```

## Multi-Tenant

### Namespaces (< 50K tenants, fastest)

```typescript
await env.VECTORIZE.upsert([{ id: "1", values: emb, namespace: `tenant-${id}` }]);
await env.VECTORIZE.query(vec, { namespace: `tenant-${id}`, topK: 10 });
```

### Metadata Filter (> 50K tenants)

```bash
wrangler vectorize create-metadata-index my-index --property-name=tenantId --type=string
```

```typescript
await env.VECTORIZE.upsert([{ id: "1", values: emb, metadata: { tenantId: id } }]);
await env.VECTORIZE.query(vec, { filter: { tenantId: id }, topK: 10 });
```

## Hybrid Search

```typescript
const matches = await env.VECTORIZE.query(vec, {
  topK: 20,
  filter: {
    category: { $in: ["tech", "science"] },
    published: { $gte: lastMonthTimestamp }
  }
});
```

## Batch Ingestion

```typescript
const BATCH = 500;
for (let i = 0; i < vectors.length; i += BATCH) {
  await env.VECTORIZE.upsert(vectors.slice(i, i + BATCH));
}
```

## Best Practices

1. **Pass `data[0]`**, not `data` or the full response
2. **Batch 500** vectors per upsert
3. **Create metadata indexes** before inserting
4. **Use namespaces** for tenant isolation (faster than filters)
5. **`returnMetadata: "indexed"`** for the best speed/data balance
6. **Handle the 5-10s mutation delay** in async operations