dotfiles/.agents/skills/cloudflare-deploy/references/durable-objects/gotchas.md
2026-03-17 16:53:22 -07:00

Durable Objects Gotchas

Common Errors

"Hibernation Cleared My In-Memory State"

Problem: Variables lost after hibernation
Cause: DO auto-hibernates when idle; in-memory state not persisted
Solution: Use ctx.storage for critical data, ws.serializeAttachment() for per-connection metadata

// ❌ Wrong - lost on hibernation
private userCount = 0;
async webSocketMessage(ws: WebSocket, msg: string) {
  this.userCount++;  // Lost!
}

// ✅ Right - persisted
async webSocketMessage(ws: WebSocket, msg: string) {
  const count = this.ctx.storage.kv.get("userCount") || 0;
  this.ctx.storage.kv.put("userCount", count + 1);
}
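
For per-connection metadata, the attachment calls named in the solution can carry a small record across hibernation. The following is a minimal sketch under stated assumptions, not the runtime's own types: `AttachableSocket` is a hypothetical stand-in for the two attachment methods on a Workers WebSocket, `SessionInfo` is an invented shape, and `FakeSocket` exists only so the example runs outside the runtime. In a real Durable Object the socket list would likely come from `this.ctx.getWebSockets()`.

```typescript
// Hypothetical stand-in for the two attachment methods on a Workers WebSocket.
interface AttachableSocket {
  serializeAttachment(value: unknown): void;
  deserializeAttachment(): unknown;
}

// Per-connection metadata; this shape is an assumption for illustration.
interface SessionInfo {
  userId: string;
  joinedAt: number;
}

// Call when a connection is accepted: store metadata on the socket itself.
function attachSession(ws: AttachableSocket, info: SessionInfo): void {
  ws.serializeAttachment(info);
}

// Call after a wake: rebuild the in-memory map from whatever each socket carries.
function restoreSessions<T extends AttachableSocket>(sockets: T[]): Map<T, SessionInfo> {
  const sessions = new Map<T, SessionInfo>();
  for (const ws of sockets) {
    const info = ws.deserializeAttachment() as SessionInfo | null;
    if (info) sessions.set(ws, info);
  }
  return sessions;
}

// In-memory fake used only to exercise the sketch outside the runtime.
class FakeSocket implements AttachableSocket {
  private stored: string | null = null;
  serializeAttachment(value: unknown): void {
    this.stored = JSON.stringify(value);
  }
  deserializeAttachment(): unknown {
    return this.stored === null ? null : JSON.parse(this.stored);
  }
}
```

Keeping the record on the socket means nothing has to be looked up in storage just to learn who a message came from after a wake.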

"setTimeout Didn't Fire After Restart"

Problem: Scheduled work lost on eviction
Cause: setTimeout in-memory only; eviction clears timers
Solution: Use ctx.storage.setAlarm() for reliable scheduling

// ❌ Wrong - lost on eviction
setTimeout(() => this.cleanup(), 3600000);

// ✅ Right - survives eviction (call from inside a DO method)
await this.ctx.storage.setAlarm(Date.now() + 3600000);
// Alarm handler, defined as a method on the DO class
async alarm() { await this.cleanup(); }

"Constructor Runs on Every Wake"

Problem: Expensive init logic slows all requests
Cause: Constructor runs on every wake (first request after eviction OR after hibernation)
Solution: Lazy initialization or cache in storage

Critical understanding: Constructor runs in two scenarios:

  1. Cold start - DO evicted from memory, first request creates new instance
  2. Wake from hibernation - DO with WebSockets hibernated, message/alarm wakes it

// ❌ Wrong - expensive on every wake
constructor(ctx: DurableObjectState, env: Env) {
  super(ctx, env);
  this.heavyData = this.loadExpensiveData();  // Slow!
}

// ✅ Right - lazy load
private heavyData?: HeavyData;
private getHeavyData() {
  if (!this.heavyData) this.heavyData = this.loadExpensiveData();
  return this.heavyData;
}

"Durable Object Overloaded (503 errors)"

Problem: 503 errors under load
Cause: Single DO exceeding ~1K req/s throughput limit
Solution: Shard across multiple DOs (see Patterns: Sharding)
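
One way to shard is to hash a routing key into a fixed set of DO names. This is a rough sketch, not a prescribed method: `fnv1a` and `shardName` are illustrative helpers, and the `shard-N` naming scheme is an assumption.

```typescript
// FNV-1a style 32-bit hash over UTF-16 code units: small and deterministic.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    // Multiply by the FNV prime and keep the result as an unsigned 32-bit value.
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Map a routing key (user ID, room ID, ...) to one of `shardCount` DO names.
function shardName(key: string, shardCount: number): string {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }
  return `shard-${fnv1a(key) % shardCount}`;
}
```

Inside a Worker, the result would feed something like `env.MY_DO.idFromName(shardName(userId, 16))`, where `MY_DO` and the count of 16 are placeholders. Changing the shard count remaps most keys, so it is worth picking a count with headroom up front.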

"Storage Quota Exceeded (Write failures)"

Problem: Write operations failing
Cause: DO storage exceeding 10GB limit or account quota
Solution: Cleanup with alarms, use deleteAll() for old data, upgrade plan

"CPU Time Exceeded (Terminated)"

Problem: Request terminated mid-execution
Cause: Processing exceeding 30s CPU time default limit
Solution: Increase limits.cpu_ms in wrangler.jsonc (max 300s) or chunk work
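
Chunking can be as simple as processing a bounded slice per invocation and remembering where to resume. The helper below is a hedged sketch: `processChunk` and `ChunkResult` are hypothetical names, and the items are assumed to be addressable by index.

```typescript
interface ChunkResult {
  nextCursor: number; // index of the first unprocessed item
  done: boolean;      // true once every item has been handled
}

// Handle at most `chunkSize` items starting at `cursor`, then report where to resume.
function processChunk<T>(
  items: readonly T[],
  cursor: number,
  chunkSize: number,
  handle: (item: T) => void,
): ChunkResult {
  const end = Math.min(cursor + chunkSize, items.length);
  for (const item of items.slice(cursor, end)) {
    handle(item);
  }
  return { nextCursor: end, done: end >= items.length };
}
```

In a Durable Object, `nextCursor` could be written to `ctx.storage` after each chunk, with `ctx.storage.setAlarm()` scheduling the next invocation until `done` is true, so that no single invocation accumulates too much CPU time.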

"WebSockets Disconnect on Eviction"

Problem: Connections drop unexpectedly
Cause: DO evicted from memory without hibernation API
Solution: Use WebSocket hibernation handlers + client reconnection logic
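
On the client side, reconnection usually pairs with capped exponential backoff so that many clients do not reconnect in lockstep after an eviction. A minimal sketch follows; `reconnectDelayMs`, its default values, and the "equal jitter" scheme are illustrative assumptions rather than anything the platform requires.

```typescript
// Delay before reconnect attempt `attempt` (0-based).
// Grows exponentially, is capped at `maxMs`, and applies "equal jitter":
// half of the delay is fixed, the other half is randomized.
function reconnectDelayMs(
  attempt: number,
  baseMs: number = 500,
  maxMs: number = 30000,
  random: () => number = Math.random,
): number {
  const capped = Math.min(maxMs, baseMs * 2 ** attempt);
  const half = capped / 2;
  return Math.floor(half + random() * half);
}
```

A client would typically call `setTimeout(connect, reconnectDelayMs(attempt))` from its WebSocket `close` handler and reset `attempt` to 0 once `open` fires; `connect` and `attempt` here are placeholders for the client's own code.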

"Migration Failed (Deploy error)"

Cause: Non-unique tags, non-sequential tags, or invalid class names in migration
Solution: Check tag uniqueness/sequential ordering and verify class names are correct

"RPC Method Not Found"

Cause: compatibility_date is earlier than 2024-04-03, so RPC methods on stubs are not available
Solution: Set compatibility_date to 2024-04-03 or later, or call the DO through fetch() instead of RPC

"Only One Alarm Allowed"

Cause: Need multiple scheduled tasks but only one alarm supported per DO
Solution: Use event queue pattern to schedule multiple tasks with single alarm
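
The event queue pattern keeps a list of pending tasks in storage and always points the single alarm at the earliest one. The pure helpers below sketch that bookkeeping; `ScheduledTask`, `takeDueTasks`, and `nextAlarmTime` are hypothetical names, and how the task list is stored is left open.

```typescript
interface ScheduledTask {
  id: string;
  runAt: number; // epoch milliseconds
  kind: string;  // what to do when the task fires
}

// Split tasks into those due at `now` (oldest first) and those still pending.
function takeDueTasks(
  tasks: readonly ScheduledTask[],
  now: number,
): { due: ScheduledTask[]; pending: ScheduledTask[] } {
  const due = tasks.filter((t) => t.runAt <= now).sort((a, b) => a.runAt - b.runAt);
  const pending = tasks.filter((t) => t.runAt > now);
  return { due, pending };
}

// Earliest pending time: the value for the single alarm, or null when idle.
function nextAlarmTime(tasks: readonly ScheduledTask[]): number | null {
  if (tasks.length === 0) return null;
  return Math.min(...tasks.map((t) => t.runAt));
}
```

An `alarm()` handler built on these would load the task list, run everything in `due`, save `pending`, and call `this.ctx.storage.setAlarm()` with `nextAlarmTime(pending)` when it is not null. Adding a task follows the same rule: append it, then move the alarm to the new minimum.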

"Race Condition Despite Single-Threading"

Problem: Concurrent requests see inconsistent state
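Cause: Awaiting non-storage I/O such as fetch() is a yield point where other requests can interleave; input gates only protect awaits on storage operations themselves
Solution: Use blockConcurrencyWhile() for critical sections or atomic storage ops

// ❌ Wrong - race condition
async incrementCounter() {
  const count = (await this.ctx.storage.get<number>("count")) ?? 0;
  // ⚠️ Other requests can run during this fetch and read the same stale count
  await fetch("https://example.com/audit");  // hypothetical external call
  await this.ctx.storage.put("count", count + 1);
}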

// ✅ Right - atomic operation
async incrementCounter() {
  return this.ctx.storage.sql.exec(
    "INSERT INTO counters (id, value) VALUES (1, 1) ON CONFLICT(id) DO UPDATE SET value = value + 1 RETURNING value"
  ).one().value;
}

// ✅ Right - explicit locking
async criticalOperation() {
  await this.ctx.blockConcurrencyWhile(async () => {
    const count = await this.ctx.storage.get("count") || 0;
    await this.ctx.storage.put("count", count + 1);
  });
}

"Migration Rollback Not Supported"

Cause: Attempting to rollback a migration after deployment
Solution: Test with --dry-run before deploying; migrations cannot be rolled back

"deleted_classes Destroys Data"

Problem: Migration deleted all data
Cause: deleted_classes migration immediately destroys all DO instances and data
Solution: Test with --dry-run; use transferred_classes to preserve data during moves

"Cold Starts Are Slow"

Problem: First request after eviction takes longer
Cause: DO constructor + initial storage access on cold start
Solution: Expected behavior; optimize constructor, use connection pooling in clients, consider warming strategy for critical DOs

// Warming strategy (periodically ping critical DOs)
export default {
  async scheduled(event: ScheduledEvent, env: Env) {
    const criticalIds = ["auth", "sessions", "locks"];
    await Promise.all(criticalIds.map(name => {
      const id = env.MY_DO.idFromName(name);
      const stub = env.MY_DO.get(id);
      return stub.ping();  // Keep warm
    }));
  }
};

Limits

| Limit | Free | Paid | Notes |
|---|---|---|---|
| SQLite storage per DO | 10 GB | 10 GB | Per Durable Object instance |
| SQLite total storage | 5 GB | Unlimited | Account-wide quota |
| Key+value size | 2 MB | 2 MB | Single KV pair (SQLite/async) |
| CPU time default | 30s | 30s | Per request; configurable |
| CPU time max | 300s | 300s | Set via limits.cpu_ms |
| DO classes | 100 | 500 | Distinct DO class definitions |
| SQL columns | 100 | 100 | Per table |
| SQL statement size | 100 KB | 100 KB | Max SQL query size |
| WebSocket message size | 32 MiB | 32 MiB | Per message |
| Request throughput | ~1K req/s | ~1K req/s | Per DO (soft limit - shard for more) |
| Alarms per DO | 1 | 1 | Use queue pattern for multiple events |
| Total DOs | Unlimited | Unlimited | Create as many instances as needed |
| WebSockets | Unlimited | Unlimited | Within 128 MB memory limit per DO |
| Memory per DO | 128 MB | 128 MB | In-memory state + WebSocket buffers |

Hibernation Caveats

  1. Memory cleared - All in-memory variables lost; reconstruct from storage or deserializeAttachment()
  2. Constructor reruns - Runs on wake; avoid expensive operations, use lazy initialization
  3. No guarantees - DO may evict instead of hibernate; design for both
  4. Attachment limit - serializeAttachment() data must be serializable via structured clone and is size-limited; keep it small and put larger state in ctx.storage
  5. Alarm wakes DO - Alarm prevents hibernation until handler completes
  6. WebSocket state not automatic - Must explicitly persist with serializeAttachment() or storage

See Also