mirror of
https://github.com/ksyasuda/dotfiles.git
synced 2026-03-21 06:11:27 -07:00
Vectorize Configuration
Create Index
npx wrangler vectorize create my-index --dimensions=768 --metric=cosine
⚠️ Dimensions and metric are immutable: they cannot be changed after the index is created.
Worker Binding
// wrangler.jsonc
{
"vectorize": [
{ "binding": "VECTORIZE", "index_name": "my-index" }
]
}
interface Env {
VECTORIZE: Vectorize;
}
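Once the binding is configured, a Worker can query the index through `env.VECTORIZE`. The following is a minimal sketch, not a definitive implementation: `QueryableIndex` is a hypothetical structural stand-in for the subset of the `Vectorize` binding used here (so the snippet compiles without `@cloudflare/workers-types`), and `nearestIds` is an illustrative helper name.

```typescript
// Hypothetical stand-in for the part of the Vectorize binding this sketch uses.
// In a real Worker, pass env.VECTORIZE (typed as `Vectorize`) instead.
interface QueryMatch {
  id: string;
  score: number;
  metadata?: Record<string, unknown>;
}

interface QueryableIndex {
  query(
    vector: number[],
    options?: { topK?: number; returnMetadata?: "none" | "indexed" | "all" },
  ): Promise<{ matches: QueryMatch[]; count: number }>;
}

// Illustrative helper: return the ids of the topK closest vectors.
async function nearestIds(
  index: QueryableIndex,
  embedding: number[],
  topK: number = 5,
): Promise<string[]> {
  const result = await index.query(embedding, { topK, returnMetadata: "none" });
  return result.matches.map((match) => match.id);
}
```

Inside a fetch handler this would be called as `await nearestIds(env.VECTORIZE, embedding)`, where `embedding` has the same number of dimensions the index was created with (768 in the example above).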
Metadata Indexes
Create metadata indexes BEFORE inserting vectors. Vectors that already exist when the index is created are not indexed retroactively.
wrangler vectorize create-metadata-index my-index --property-name=category --type=string
wrangler vectorize create-metadata-index my-index --property-name=price --type=number
| Type | Use For |
|---|---|
| string | Categories, tags (first 64 bytes indexed) |
| number | Prices, timestamps |
| boolean | Flags |
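Because only the first 64 bytes of a string value are indexed, it can help to check metadata strings before inserting them. The helper below is a small sketch under that assumption; `fitsIndexedPrefix` and `INDEXED_STRING_BYTES` are hypothetical names, and the check measures UTF-8 bytes rather than characters.

```typescript
// Number of leading bytes of a string metadata value that get indexed (from the table above).
const INDEXED_STRING_BYTES = 64;

// Illustrative helper: true when the whole UTF-8 encoding of `value`
// fits inside the indexed prefix, so no part of it is ignored by filters.
function fitsIndexedPrefix(value: string): boolean {
  return new TextEncoder().encode(value).length <= INDEXED_STRING_BYTES;
}
```

Values that fail this check still store fine as metadata, but two strings sharing the same first 64 bytes would be indistinguishable to a filter on that property.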
CLI Commands
# Index management
wrangler vectorize list
wrangler vectorize info <index-name>
wrangler vectorize delete <index-name>
# Vector operations
wrangler vectorize insert <index-name> --file=embeddings.ndjson
wrangler vectorize get-vectors <index-name> --ids id1 id2
wrangler vectorize delete-vectors <index-name> --ids id1 id2
# Metadata indexes
wrangler vectorize list-metadata-index <index-name>
wrangler vectorize delete-metadata-index <index-name> --property-name=field
Bulk Upload (NDJSON)
{"id": "1", "values": [0.1, 0.2, ...], "metadata": {"category": "docs"}}
{"id": "2", "values": [0.4, 0.5, ...], "namespace": "tenant-abc"}
Limits: 5000 vectors per file, 100 MB max
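One way to stay under the per-file vector limit is to split records into several NDJSON payloads before writing them to disk. This is a rough sketch, assuming the record shape shown above; `VectorRecord` and `toNdjsonFiles` are hypothetical names, and the helper does not check the 100 MB size limit.

```typescript
// Assumed record shape, mirroring the NDJSON lines above.
interface VectorRecord {
  id: string;
  values: number[];
  namespace?: string;
  metadata?: Record<string, string | number | boolean>;
}

// Per-file vector limit stated above.
const MAX_VECTORS_PER_FILE = 5000;

// Illustrative helper: serialize records as NDJSON, one string per output file,
// with at most `maxPerFile` lines in each.
function toNdjsonFiles(
  records: VectorRecord[],
  maxPerFile: number = MAX_VECTORS_PER_FILE,
): string[] {
  const files: string[] = [];
  for (let start = 0; start < records.length; start += maxPerFile) {
    const chunk = records.slice(start, start + maxPerFile);
    files.push(chunk.map((record) => JSON.stringify(record)).join("\n"));
  }
  return files;
}
```

Each returned string can be written to its own `.ndjson` file and passed to `wrangler vectorize insert <index-name> --file=...`.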
Cardinality Best Practice
Bucket high-cardinality values (for example, timestamps) before storing them as metadata:
// ❌ Millisecond timestamps
metadata: { timestamp: Date.now() }
// ✅ 5-minute buckets
metadata: { timestamp_bucket: Math.floor(Date.now() / 300000) * 300000 }
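The bucketing expression can be wrapped in a small reusable function so every writer and every query filter rounds the same way. A minimal sketch; `bucketTimestamp` and `FIVE_MINUTES_MS` are hypothetical names.

```typescript
// 5 minutes in milliseconds (the 300000 used above).
const FIVE_MINUTES_MS = 5 * 60 * 1000;

// Illustrative helper: round a millisecond timestamp down to the start of its bucket.
function bucketTimestamp(
  timestampMs: number,
  bucketMs: number = FIVE_MINUTES_MS,
): number {
  return Math.floor(timestampMs / bucketMs) * bucketMs;
}

// Usage when building metadata for a vector:
const exampleMetadata = { timestamp_bucket: bucketTimestamp(Date.now()) };
```

With 5-minute buckets, a day of writes produces at most 288 distinct values for this property instead of one per millisecond.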
Production Checklist
- Create index with correct dimensions
- Create metadata indexes FIRST
- Test bulk upload
- Configure bindings
- Deploy Worker
- Verify queries