update skills

2026-06-13 09:13:31 -07:00 · 2026-03-17 16:53:22 -07:00
parent 0b0783ef8e
commit f9a530667e
389 changed files with 54512 additions and 1 deletions
@@ -0,0 +1,164 @@
+# Upgrading to GPT-5.4
+
+Use this guide when the user explicitly asks to upgrade an existing integration to GPT-5.4. Pair it with lookups against the current OpenAI docs. The default target string is `gpt-5.4`.
+
+## Upgrade posture
+
+Upgrade with the narrowest safe change set:
+
+- replace the model string first
+- update only the prompts that are directly tied to that model usage
+- prefer prompt-only upgrades when possible
+- if the upgrade would require API-surface changes, parameter rewrites, tool rewiring, or broader code edits, mark it as blocked instead of stretching the scope
+
+## Upgrade workflow
+
+1. Inventory current model usage.
+   - Search for model strings, client calls, and prompt-bearing files.
+   - Include inline prompts, prompt templates, YAML or JSON configs, Markdown docs, and saved prompts when they are clearly tied to a model usage site.
+2. Pair each model usage with its prompt surface.
+   - Prefer the closest prompt surface first: inline system or developer text, then adjacent prompt files, then shared templates.
+   - If you cannot confidently tie a prompt to the model usage, say so instead of guessing.
+3. Classify the source model family.
+   - Common buckets: `gpt-4o` or `gpt-4.1`, `o1` or `o3` or `o4-mini`, early `gpt-5`, later `gpt-5.x`, or mixed and unclear.
+4. Decide the upgrade class.
+   - `model string only`
+   - `model string + light prompt rewrite`
+   - `blocked` (the upgrade cannot proceed without code changes)
+5. Run the no-code compatibility gate.
+   - Check whether the current integration can accept `gpt-5.4` without API-surface changes or implementation changes.
+   - For long-running Responses agents or tool-heavy agents, check whether `phase` is already preserved or round-tripped when the host replays assistant items or uses preambles.
+   - If compatibility depends on code changes, return `blocked`.
+   - If compatibility is unclear, return `unknown` rather than improvising.
+6. Recommend the upgrade.
+   - Default replacement string: `gpt-5.4`
+   - Keep the intervention small and behavior-preserving.
+7. Deliver a structured recommendation.
+   - `Current model usage`
+   - `Recommended model-string updates`
+   - `Starting reasoning recommendation`
+   - `Prompt updates`
+   - `Phase assessment` when the flow is long-running, replayed, or tool-heavy
+   - `No-code compatibility check`
+   - `Validation plan`
+   - `Launch-day refresh items`
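
Steps 1 and 3 of the workflow above can be sketched as a small scanner. This is a minimal Python sketch, not part of the guide: the regex `MODEL_PATTERN`, the suffix list `SCAN_SUFFIXES`, and the helper names `inventory` and `classify_family` are assumptions made for illustration. The sketch also assumes that "early `gpt-5`" means the bare `gpt-5` string or a `gpt-5-*` variant. The pattern will produce false positives (for example, a variable named `o1`), so treat hits as candidates to review by hand.

```python
import re
from pathlib import Path

# Hypothetical pattern covering the model families named in step 3.
MODEL_PATTERN = re.compile(r"\b(gpt-[0-9][\w.\-]*|o[134](?:-mini)?)\b")

# Assumed set of code and prompt-bearing file types; extend as needed.
SCAN_SUFFIXES = {".py", ".ts", ".js", ".yaml", ".yml", ".json", ".md", ".txt"}


def classify_family(model: str) -> str:
    """Map a model string to one of the step-3 buckets."""
    if model.startswith(("gpt-4o", "gpt-4.1")):
        return "gpt-4o or gpt-4.1"
    if re.fullmatch(r"o[134](-.*)?", model):
        return "o1 or o3 or o4-mini"
    if re.match(r"gpt-5\.\d", model):
        return "later gpt-5.x"
    if re.fullmatch(r"gpt-5(-.*)?", model):
        return "early gpt-5"
    return "mixed or unclear"


def inventory(root: str) -> list[dict]:
    """Return one record per model-string match found under root."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in SCAN_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in MODEL_PATTERN.finditer(line):
                model = match.group(1)
                hits.append(
                    {
                        "file": str(path),
                        "line": lineno,
                        "model": model,
                        "family": classify_family(model),
                    }
                )
    return hits
```

Calling `inventory(".")` from a repo root returns one record per match with the file, line, model string, and bucket. Pairing each hit with its prompt surface (step 2) still needs human judgment.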
+
+Output rule:
+
+- Always emit a starting `reasoning_effort_recommendation` for each usage site.
+- If the repo exposes the current reasoning setting, preserve it first unless the source guide says otherwise.
+- If the repo does not expose the current setting, use the source-family starting mapping instead of returning `null`.
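
The output rule above amounts to a fixed precedence, sketched here in Python under stated assumptions. The function name `starting_reasoning_effort` and its parameters are hypothetical. The source-family starting mapping is not reproduced in this guide, so the sketch takes it as an argument rather than inventing values.

```python
from typing import Mapping, Optional


def starting_reasoning_effort(
    source_family: str,
    family_mapping: Mapping[str, str],
    exposed_setting: Optional[str] = None,
    guide_override: Optional[str] = None,
) -> str:
    """Pick a starting reasoning_effort_recommendation for one usage site."""
    # An explicit instruction from the source guide wins over everything else.
    if guide_override is not None:
        return guide_override
    # Otherwise preserve the setting the repo already exposes.
    if exposed_setting is not None:
        return exposed_setting
    # Otherwise fall back to the source-family mapping; never return None.
    try:
        return family_mapping[source_family]
    except KeyError:
        raise ValueError(
            f"no starting mapping for source family {source_family!r}"
        ) from None
```

Raising on a missing family, instead of returning `None`, keeps the "always emit a recommendation" rule visible to the caller.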
+
+## Upgrade outcomes
+
+### `model string only`
+
+Choose this when:
+
+- the existing prompts are already short, explicit, and task-bounded
+- the workflow is not strongly research-heavy, tool-heavy, multi-agent, batch or completeness-sensitive, or long-horizon
+- there are no obvious compatibility blockers
+
+Default action:
+
+- replace the model string with `gpt-5.4`
+- keep prompts unchanged
+- validate behavior with existing evals or spot checks
+
+### `model string + light prompt rewrite`
+
+Choose this when:
+
+- the old prompt was compensating for weaker instruction following
+- the workflow needs more persistence than the default tool-use behavior will likely provide
+- the task needs stronger completeness, citation discipline, or verification
+- the upgraded model is too verbose or leaves tasks incomplete unless instructed otherwise
+- the workflow is research-heavy and needs stronger handling of sparse or empty retrieval results
+- the workflow is coding-oriented, tool-heavy, or multi-agent, but the existing API surface and tool definitions can remain unchanged
+
+Default action:
+
+- replace the model string with `gpt-5.4`
+- add one or two targeted prompt blocks
+- read `references/gpt-5p4-prompting-guide.md` to choose the smallest prompt changes that recover the old behavior
+- avoid broad prompt cleanup unrelated to the upgrade
+- for research workflows, default to `research_mode` + `citation_rules` + `empty_result_handling`; add `tool_persistence_rules` when the host already uses retrieval tools
+- for dependency-aware or tool-heavy workflows, default to `tool_persistence_rules` + `dependency_checks` + `verification_loop`; add `parallel_tool_calling` only when retrieval steps are truly independent
+- for coding or terminal workflows, default to `terminal_tool_hygiene` + `verification_loop`
+- for multi-agent support or triage workflows, default to at least one of `tool_persistence_rules`, `completeness_contract`, or `verification_loop`
+- for long-running Responses agents with preambles or multiple assistant messages, explicitly review whether `phase` is already handled; if adding or preserving `phase` would require code edits, mark the path as `blocked`
+- do not classify a coding or tool-using Responses workflow as `blocked` just because the visible snippet is minimal; prefer `model string + light prompt rewrite` unless the repo clearly shows that a safe GPT-5.4 path would require host-side code changes
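
The per-workflow defaults listed above can be written down as a lookup. This is a Python sketch under assumptions: the block names come from the list above, while the workflow labels (`"research"`, `"tool_heavy"`, `"coding"`, `"multi_agent"`), the function name, and the flag names are hypothetical.

```python
def default_prompt_blocks(
    workflow: str,
    *,
    uses_retrieval_tools: bool = False,
    independent_retrieval_steps: bool = False,
) -> dict[str, list[str]]:
    """Return default prompt blocks for a workflow type.

    "add" lists blocks to add by default; "pick_at_least_one" lists
    candidates where the guide only asks for at least one of them.
    """
    add: list[str] = []
    pick: list[str] = []
    if workflow == "research":
        add = ["research_mode", "citation_rules", "empty_result_handling"]
        if uses_retrieval_tools:
            add.append("tool_persistence_rules")
    elif workflow == "tool_heavy":
        add = ["tool_persistence_rules", "dependency_checks", "verification_loop"]
        # Only parallelize when retrieval steps do not depend on each other.
        if independent_retrieval_steps:
            add.append("parallel_tool_calling")
    elif workflow == "coding":
        add = ["terminal_tool_hygiene", "verification_loop"]
    elif workflow == "multi_agent":
        pick = ["tool_persistence_rules", "completeness_contract", "verification_loop"]
    else:
        raise ValueError(f"unknown workflow label: {workflow!r}")
    return {"add": add, "pick_at_least_one": pick}
```

The lookup only proposes a starting set. Choosing the smallest subset that recovers the old behavior still depends on `references/gpt-5p4-prompting-guide.md`.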
+
+### `blocked`
+
+Choose this when:
+
+- the upgrade appears to require API-surface changes
+- the upgrade appears to require parameter rewrites or reasoning-setting changes that are not exposed outside implementation code
+- the upgrade would require changing tool definitions, tool handler wiring, or schema contracts
+- you cannot confidently identify the prompt surface tied to the model usage
+
+Default action:
+
+- do not improvise a broader upgrade
+- report the blocker and explain that the fix is out of scope for this guide
+
+## No-code compatibility checklist
+
+Before recommending a no-code upgrade, check:
+
+1. Can the current host accept the `gpt-5.4` model string without changing client code or API surface?
+2. Are the related prompts identifiable and editable?
+3. Does the host depend on behavior that likely needs API-surface changes, parameter rewrites, or tool rewiring?
+4. Would the likely fix be prompt-only, or would it need implementation changes?
+5. Is the prompt surface close enough to the model usage that you can make a targeted change instead of a broad cleanup?
+6. For long-running Responses agents or tool-heavy agents, is `phase` already preserved if the host relies on preambles, replayed assistant items, or multiple assistant messages?
+
+If item 1 is no, item 3 or item 4 points to implementation work, or item 6 is no and the fix needs code changes, return `blocked`.
+
+If item 2 is no, return `unknown` unless the user can point to the prompt location.
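
The two decision rules above can be sketched as a single gate function. This is an illustrative Python sketch: the `Checklist` field names, the `gate` function, and the `"viable"` return label are assumptions. Checking `blocked` before `unknown` follows the order in which the rules are stated and is also an assumption. Item 5 informs how targeted the prompt change can be and does not affect the result here.

```python
from dataclasses import dataclass


@dataclass
class Checklist:
    host_accepts_model_string: bool       # item 1
    prompts_identifiable: bool            # item 2
    needs_api_or_tool_changes: bool       # item 3
    fix_needs_implementation: bool        # item 4
    phase_preserved: bool = True          # item 6; True when not applicable
    phase_fix_needs_code: bool = False    # whether fixing item 6 needs code edits
    user_can_locate_prompt: bool = False  # escape hatch for item 2


def gate(c: Checklist) -> str:
    """Return 'blocked', 'unknown', or 'viable' for one usage site."""
    blocked = (
        not c.host_accepts_model_string
        or c.needs_api_or_tool_changes
        or c.fix_needs_implementation
        or (not c.phase_preserved and c.phase_fix_needs_code)
    )
    if blocked:
        return "blocked"
    if not c.prompts_identifiable and not c.user_can_locate_prompt:
        return "unknown"
    return "viable"
```

A `"viable"` result only means the no-code gate passed. The choice between `model string only` and `model string + light prompt rewrite` is made separately.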
+
+Important:
+
+- Existing use of tools, agents, or multiple usage sites is not by itself a blocker.
+- If the current host can keep the same API surface and the same tool definitions, prefer `model string + light prompt rewrite` over `blocked`.
+- Reserve `blocked` for cases that truly require implementation changes, not cases that only need stronger prompt steering.
+
+## Scope boundaries
+
+This guide may:
+
+- update or recommend updated model strings
+- update or recommend updated prompts
+- inspect code and prompt files to understand where those changes belong
+- inspect whether existing Responses flows already preserve `phase`
+- flag compatibility blockers
+
+This guide may not:
+
+- move Chat Completions code to Responses
+- move Responses code to another API surface
+- rewrite parameter shapes
+- change tool definitions or tool-call handling
+- change structured-output wiring
+- add or retrofit `phase` handling in implementation code
+- edit business logic, orchestration logic, or SDK usage beyond a literal model-string replacement
+
+If a safe GPT-5.4 upgrade requires any of those changes, mark the path as blocked and out of scope.
+
+## Validation plan
+
+- Validate each upgraded usage site with existing evals or realistic spot checks.
+- Check whether the upgraded model still matches expected latency, output shape, and quality.
+- If prompt edits were added, confirm each block is doing real work instead of adding noise.
+- If the workflow has downstream impact, add a lightweight verification pass before finalization.
+
+## Launch-day refresh items
+
+When final GPT-5.4 guidance is published or changes:
+
+1. Replace release-candidate assumptions with final GPT-5.4 guidance where appropriate.
+2. Re-check whether the default target string should stay `gpt-5.4` for all source families.
+3. Re-check any prompt-block recommendations whose semantics may have changed.
+4. Re-check research, citation, and compatibility guidance against the final model behavior.
+5. Re-run the same upgrade scenarios and confirm the blocked-versus-viable boundaries still hold.