This commit is contained in:
2026-03-26 22:54:43 -07:00
parent 8c2da7fea7
commit a86f091338
32 changed files with 1261 additions and 2688 deletions

View File

@@ -0,0 +1,160 @@
---
name: plugin-creator
description: Create and scaffold plugin directories for Codex with a required `.codex-plugin/plugin.json`, optional plugin folders/files, and baseline placeholders you can edit before publishing or testing. Use when Codex needs to create a new local plugin, add optional plugin structure, or generate or update repo-root `.agents/plugins/marketplace.json` entries for plugin ordering and availability metadata.
---
# Plugin Creator
## Quick Start
1. Run the scaffold script:
```bash
# Plugin names are normalized to lower-case hyphen-case and must be <= 64 chars.
# The generated folder and plugin.json name are always the same.
# Run from repo root (or replace .agents/... with the absolute path to this SKILL).
# By default creates in <repo_root>/plugins/<plugin-name>.
python3 .agents/skills/plugin-creator/scripts/create_basic_plugin.py <plugin-name>
```
2. Open `<plugin-path>/.codex-plugin/plugin.json` and replace `[TODO: ...]` placeholders.
3. Generate or update the repo marketplace entry when the plugin should appear in Codex UI ordering:
```bash
# marketplace.json always lives at <repo-root>/.agents/plugins/marketplace.json
python3 .agents/skills/plugin-creator/scripts/create_basic_plugin.py my-plugin --with-marketplace
```
For a home-local plugin, treat `<home>` as the root and use:
```bash
python3 .agents/skills/plugin-creator/scripts/create_basic_plugin.py my-plugin \
--path ~/plugins \
--marketplace-path ~/.agents/plugins/marketplace.json \
--with-marketplace
```
4. Generate/adjust optional companion folders as needed:
```bash
python3 .agents/skills/plugin-creator/scripts/create_basic_plugin.py my-plugin --path <parent-plugin-directory> \
--with-skills --with-hooks --with-scripts --with-assets --with-mcp --with-apps --with-marketplace
```
`<parent-plugin-directory>` is the directory where the plugin folder `<plugin-name>` will be created (for example `~/code/plugins`).
## What this skill creates
- If the user has not made the plugin location explicit, ask whether they want a repo-local plugin or a home-local plugin before generating marketplace entries.
- Creates the plugin root at `<parent-plugin-directory>/<plugin-name>/`.
- Always creates `<parent-plugin-directory>/<plugin-name>/.codex-plugin/plugin.json`.
- Fills the manifest with the full schema shape, placeholder values, and the complete `interface` section.
- Creates or updates `<repo-root>/.agents/plugins/marketplace.json` when `--with-marketplace` is set.
- If the marketplace file does not exist yet, seed top-level `name` plus `interface.displayName` placeholders before adding the first plugin entry.
- `<plugin-name>` is normalized using skill-creator naming rules:
  - `My Plugin` → `my-plugin`
  - `My--Plugin` → `my-plugin`
  - underscores, spaces, and punctuation are converted to `-`
  - the result is lower-case and hyphen-delimited, with consecutive hyphens collapsed
- Supports optional creation of:
- `skills/`
- `hooks/`
- `scripts/`
- `assets/`
- `.mcp.json`
- `.app.json`
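The naming rules above can be sketched in a few lines (this mirrors `normalize_plugin_name` in the scaffold script):

```python
import re

def normalize_plugin_name(plugin_name: str) -> str:
    """Lower-case, convert runs of non-alphanumerics to '-', trim edge hyphens."""
    normalized = plugin_name.strip().lower()
    normalized = re.sub(r"[^a-z0-9]+", "-", normalized)  # also collapses repeats
    return normalized.strip("-")

print(normalize_plugin_name("My Plugin"))   # my-plugin
print(normalize_plugin_name("My--Plugin"))  # my-plugin
```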
## Marketplace workflow
- `marketplace.json` always lives at `<repo-root>/.agents/plugins/marketplace.json`.
- For a home-local plugin, use the same convention with `<home>` as the root:
  `~/.agents/plugins/marketplace.json`, with entries pointing at `./plugins/<plugin-name>`.
- Marketplace root metadata supports top-level `name` plus optional `interface.displayName`.
- Treat plugin order in `plugins[]` as render order in Codex. Append new entries unless a user explicitly asks to reorder the list.
- `displayName` belongs inside the marketplace `interface` object, not inside individual `plugins[]` entries.
- Each generated marketplace entry must include all of:
- `policy.installation`
- `policy.authentication`
- `category`
- Default new entries to:
- `policy.installation: "AVAILABLE"`
- `policy.authentication: "ON_INSTALL"`
- Override defaults only when the user explicitly specifies another allowed value.
- Allowed `policy.installation` values:
- `NOT_AVAILABLE`
- `AVAILABLE`
- `INSTALLED_BY_DEFAULT`
- Allowed `policy.authentication` values:
- `ON_INSTALL`
- `ON_USE`
- Treat `policy.products` as an override. Omit it unless the user explicitly requests product gating.
- The generated plugin entry shape is:
```json
{
"name": "plugin-name",
"source": {
"source": "local",
"path": "./plugins/plugin-name"
},
"policy": {
"installation": "AVAILABLE",
"authentication": "ON_INSTALL"
},
"category": "Productivity"
}
```
- Use `--force` only when intentionally replacing an existing marketplace entry for the same plugin name.
- If `<repo-root>/.agents/plugins/marketplace.json` does not exist yet, create it with top-level `"name"`, an `"interface"` object containing `"displayName"`, and a `plugins` array, then add the new entry.
- For a brand-new marketplace file, the root object should look like:
```json
{
"name": "[TODO: marketplace-name]",
"interface": {
"displayName": "[TODO: Marketplace Display Name]"
},
"plugins": [
{
"name": "plugin-name",
"source": {
"source": "local",
"path": "./plugins/plugin-name"
},
"policy": {
"installation": "AVAILABLE",
"authentication": "ON_INSTALL"
},
"category": "Productivity"
}
]
}
```
## Required behavior
- Outer folder name and `plugin.json` `"name"` are always the same normalized plugin name.
- Do not remove required structure; keep `.codex-plugin/plugin.json` present.
- Keep manifest values as placeholders until a human or follow-up step explicitly fills them.
- If creating files inside an existing plugin path, use `--force` only when overwrite is intentional.
- Preserve any existing marketplace `interface.displayName`.
- When generating marketplace entries, always write `policy.installation`, `policy.authentication`, and `category` even if their values are defaults.
- Add `policy.products` only when the user explicitly asks for that override.
- Keep marketplace `source.path` relative to repo root as `./plugins/<plugin-name>`.
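The append-or-replace behavior for marketplace entries can be sketched as follows (a simplified version of the scaffold script's update logic):

```python
def upsert_entry(plugins: list[dict], entry: dict, force: bool = False) -> None:
    """Append entry; replace an existing same-name entry only when force is set."""
    for index, existing in enumerate(plugins):
        if isinstance(existing, dict) and existing.get("name") == entry["name"]:
            if not force:
                raise FileExistsError(
                    f"Entry '{entry['name']}' already exists; use --force to replace it."
                )
            plugins[index] = entry
            return
    plugins.append(entry)
```

New entries are appended, preserving render order in Codex; an existing same-name entry is only overwritten when the caller passes `force`.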
## Reference to exact spec sample
For the exact canonical sample JSON for both plugin manifests and marketplace entries, use:
- `references/plugin-json-spec.md`
## Validation
After editing `SKILL.md`, run:
```bash
python3 <path-to-skill-creator>/scripts/quick_validate.py .agents/skills/plugin-creator
```

View File

@@ -0,0 +1,6 @@
interface:
display_name: "Plugin Creator"
short_description: "Scaffold plugins and marketplace entries"
default_prompt: "Use $plugin-creator to scaffold a plugin with placeholder plugin.json, optional structure, and a marketplace.json entry."
icon_small: "./assets/plugin-creator-small.svg"
icon_large: "./assets/plugin-creator.png"

View File

@@ -0,0 +1,3 @@
<svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" fill="currentColor" viewBox="0 0 20 20">
<path fill="#0D0D0D" d="M12.03 4.113a3.612 3.612 0 0 1 5.108 5.108l-6.292 6.29c-.324.324-.56.561-.791.752l-.235.176c-.205.14-.422.261-.65.36l-.229.093a4.136 4.136 0 0 1-.586.16l-.764.134-2.394.4c-.142.024-.294.05-.423.06-.098.007-.232.01-.378-.026l-.149-.05a1.081 1.081 0 0 1-.521-.474l-.046-.093a1.104 1.104 0 0 1-.075-.527c.01-.129.035-.28.06-.422l.398-2.394c.1-.602.162-.987.295-1.35l.093-.23c.1-.228.22-.445.36-.65l.176-.235c.19-.232.428-.467.751-.79l6.292-6.292Zm-5.35 7.232c-.35.35-.534.535-.66.688l-.11.147a2.67 2.67 0 0 0-.24.433l-.062.154c-.08.22-.124.462-.232 1.112l-.398 2.394-.001.001h.003l2.393-.399.717-.126a2.63 2.63 0 0 0 .394-.105l.154-.063a2.65 2.65 0 0 0 .433-.24l.147-.11c.153-.126.339-.31.688-.66l4.988-4.988-3.227-3.226-4.987 4.988Zm9.517-6.291a2.281 2.281 0 0 0-3.225 0l-.364.362 3.226 3.227.363-.364c.89-.89.89-2.334 0-3.225ZM4.583 1.783a.3.3 0 0 1 .294.241c.117.585.347 1.092.707 1.48.357.385.859.668 1.549.783a.3.3 0 0 1 0 .592c-.69.115-1.192.398-1.549.783-.315.34-.53.77-.657 1.265l-.05.215a.3.3 0 0 1-.588 0c-.117-.585-.347-1.092-.707-1.48-.357-.384-.859-.668-1.549-.783a.3.3 0 0 1 0-.592c.69-.115 1.192-.398 1.549-.783.36-.388.59-.895.707-1.48l.015-.05a.3.3 0 0 1 .279-.19Z"/>
</svg>


Binary file not shown.


View File

@@ -0,0 +1,170 @@
# Plugin JSON sample spec
```json
{
"name": "plugin-name",
"version": "1.2.0",
"description": "Brief plugin description",
"author": {
"name": "Author Name",
"email": "author@example.com",
"url": "https://github.com/author"
},
"homepage": "https://docs.example.com/plugin",
"repository": "https://github.com/author/plugin",
"license": "MIT",
"keywords": ["keyword1", "keyword2"],
"skills": "./skills/",
"hooks": "./hooks.json",
"mcpServers": "./.mcp.json",
"apps": "./.app.json",
"interface": {
"displayName": "Plugin Display Name",
"shortDescription": "Short description for subtitle",
"longDescription": "Long description for details page",
"developerName": "OpenAI",
"category": "Productivity",
"capabilities": ["Interactive", "Write"],
"websiteURL": "https://openai.com/",
"privacyPolicyURL": "https://openai.com/policies/row-privacy-policy/",
"termsOfServiceURL": "https://openai.com/policies/row-terms-of-use/",
"defaultPrompt": [
"Summarize my inbox and draft replies for me.",
"Find open bugs and turn them into Linear tickets.",
"Review today's meetings and flag scheduling gaps."
],
"brandColor": "#3B82F6",
"composerIcon": "./assets/icon.png",
"logo": "./assets/logo.png",
"screenshots": [
"./assets/screenshot1.png",
"./assets/screenshot2.png",
"./assets/screenshot3.png"
]
}
}
```
## Field guide
### Top-level fields
- `name` (`string`): Plugin identifier (kebab-case, no spaces). Required when `plugin.json` is provided; used as the manifest name and the component namespace.
- `version` (`string`): Plugin semantic version.
- `description` (`string`): Short purpose summary.
- `author` (`object`): Publisher identity.
- `name` (`string`): Author or team name.
- `email` (`string`): Contact email.
- `url` (`string`): Author/team homepage or profile URL.
- `homepage` (`string`): Documentation URL for plugin usage.
- `repository` (`string`): Source code URL.
- `license` (`string`): License identifier (for example `MIT`, `Apache-2.0`).
- `keywords` (`array` of `string`): Search/discovery tags.
- `skills` (`string`): Relative path to skill directories/files.
- `hooks` (`string`): Hook config path.
- `mcpServers` (`string`): MCP config path.
- `apps` (`string`): App manifest path for plugin integrations.
- `interface` (`object`): Interface/UX metadata block for plugin presentation.
### `interface` fields
- `displayName` (`string`): User-facing title shown for the plugin.
- `shortDescription` (`string`): Brief subtitle used in compact views.
- `longDescription` (`string`): Longer description used on details screens.
- `developerName` (`string`): Human-readable publisher name.
- `category` (`string`): Plugin category bucket.
- `capabilities` (`array` of `string`): Capability list from implementation.
- `websiteURL` (`string`): Public website for the plugin.
- `privacyPolicyURL` (`string`): Privacy policy URL.
- `termsOfServiceURL` (`string`): Terms of service URL.
- `defaultPrompt` (`array` of `string`): Starter prompts shown in composer/UX context.
  - Include at most three strings; entries after the first three are ignored.
- Each string is capped at 128 characters. Longer entries are truncated.
- Prefer short starter prompts around 50 characters so they scan well in the UI.
- `brandColor` (`string`): Theme color for the plugin card.
- `composerIcon` (`string`): Path to icon asset.
- `logo` (`string`): Path to logo asset.
- `screenshots` (`array` of `string`): List of screenshot asset paths.
- Screenshot entries must be PNG filenames and stored under `./assets/`.
- Keep file paths relative to plugin root.
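The `defaultPrompt` limits above can be sketched as a small sanitizer (the function name is illustrative, not part of the spec):

```python
MAX_PROMPTS = 3         # entries after the first three are ignored
MAX_PROMPT_CHARS = 128  # longer entries are truncated

def clamp_default_prompts(prompts: list[str]) -> list[str]:
    """Apply the documented defaultPrompt limits: keep 3 entries, cap at 128 chars."""
    return [p[:MAX_PROMPT_CHARS] for p in prompts[:MAX_PROMPTS]]
```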
### Path conventions and defaults
- Path values should be relative and begin with `./`.
- `skills`, `hooks`, and `mcpServers` supplement default component discovery; they do not replace it.
- Custom path values must follow the plugin root convention and naming/namespacing rules.
- This repo's scaffold writes `.codex-plugin/plugin.json`; treat that as the manifest location this skill generates.
# Marketplace JSON sample spec
`marketplace.json` depends on where the plugin should live:
- Repo plugin: `<repo-root>/.agents/plugins/marketplace.json`
- Local plugin: `~/.agents/plugins/marketplace.json`
```json
{
"name": "openai-curated",
"interface": {
"displayName": "ChatGPT Official"
},
"plugins": [
{
"name": "linear",
"source": {
"source": "local",
"path": "./plugins/linear"
},
"policy": {
"installation": "AVAILABLE",
"authentication": "ON_INSTALL"
},
"category": "Productivity"
}
]
}
```
## Marketplace field guide
### Top-level fields
- `name` (`string`): Marketplace identifier or catalog name.
- `interface` (`object`, optional): Marketplace presentation metadata.
- `plugins` (`array`): Ordered plugin entries. This order determines how Codex renders plugins.
### `interface` fields
- `displayName` (`string`, optional): User-facing marketplace title.
### Plugin entry fields
- `name` (`string`): Plugin identifier. Match the plugin folder name and `plugin.json` `name`.
- `source` (`object`): Plugin source descriptor.
- `source` (`string`): Use `local` for this repo workflow.
- `path` (`string`): Relative plugin path based on the marketplace root.
- Repo plugin: `./plugins/<plugin-name>`
- Local plugin in `~/.agents/plugins/marketplace.json`: `./plugins/<plugin-name>`
- The same relative path convention is used for both repo-rooted and home-rooted marketplaces.
- Example: with `~/.agents/plugins/marketplace.json`, `./plugins/<plugin-name>` resolves to `~/plugins/<plugin-name>`.
- `policy` (`object`): Marketplace policy block. Always include it.
- `installation` (`string`): Availability policy.
- Allowed values: `NOT_AVAILABLE`, `AVAILABLE`, `INSTALLED_BY_DEFAULT`
- Default for new entries: `AVAILABLE`
- `authentication` (`string`): Authentication timing policy.
- Allowed values: `ON_INSTALL`, `ON_USE`
- Default for new entries: `ON_INSTALL`
- `products` (`array` of `string`, optional): Product override for this plugin entry. Omit it unless product gating is explicitly requested.
- `category` (`string`): Display category bucket. Always include it.
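Because `marketplace.json` sits at `<root>/.agents/plugins/marketplace.json`, a `source.path` resolves against the directory three levels above the file. A minimal sketch (the helper name is illustrative):

```python
from pathlib import Path

def resolve_source_path(marketplace_json: str, source_path: str) -> Path:
    """Resolve a ./plugins/<name> entry relative to the marketplace root."""
    # marketplace.json lives at <root>/.agents/plugins/marketplace.json,
    # so the root is three parents up from the file itself.
    root = Path(marketplace_json).parents[2]
    return root / source_path
```

For example, with a home-rooted marketplace at `/home/me/.agents/plugins/marketplace.json`, the entry path `./plugins/my-plugin` resolves to `/home/me/plugins/my-plugin`.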
### Marketplace generation rules
- `displayName` belongs under the top-level `interface` object, not individual plugin entries.
- When creating a new marketplace file from scratch, seed `interface.displayName` alongside top-level `name`.
- Always include `policy.installation`, `policy.authentication`, and `category` on every generated or updated plugin entry.
- Treat `policy.products` as an override and omit it unless explicitly requested.
- Append new entries unless the user explicitly requests reordering.
- Replace an existing entry for the same plugin only when overwrite is intentional.
- Choose marketplace location to match the plugin destination:
- Repo plugin: `<repo-root>/.agents/plugins/marketplace.json`
- Local plugin: `~/.agents/plugins/marketplace.json`
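The generation rules above can be checked mechanically; here is a minimal validator sketch using the allowed values from this spec:

```python
VALID_INSTALL = {"NOT_AVAILABLE", "AVAILABLE", "INSTALLED_BY_DEFAULT"}
VALID_AUTH = {"ON_INSTALL", "ON_USE"}

def validate_entry(entry: dict) -> list[str]:
    """Return a list of problems; an empty list means the entry passes these checks."""
    problems = []
    policy = entry.get("policy") or {}
    if policy.get("installation") not in VALID_INSTALL:
        problems.append("policy.installation missing or invalid")
    if policy.get("authentication") not in VALID_AUTH:
        problems.append("policy.authentication missing or invalid")
    if not entry.get("category"):
        problems.append("category missing")
    if not (entry.get("source") or {}).get("path", "").startswith("./plugins/"):
        problems.append("source.path should be ./plugins/<plugin-name>")
    return problems
```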

View File

@@ -0,0 +1,301 @@
#!/usr/bin/env python3
"""Scaffold a plugin directory and optionally update marketplace.json."""
from __future__ import annotations
import argparse
import json
import re
from pathlib import Path
from typing import Any
MAX_PLUGIN_NAME_LENGTH = 64
DEFAULT_PLUGIN_PARENT = Path.cwd() / "plugins"
DEFAULT_MARKETPLACE_PATH = Path.cwd() / ".agents" / "plugins" / "marketplace.json"
DEFAULT_INSTALL_POLICY = "AVAILABLE"
DEFAULT_AUTH_POLICY = "ON_INSTALL"
DEFAULT_CATEGORY = "Productivity"
DEFAULT_MARKETPLACE_DISPLAY_NAME = "[TODO: Marketplace Display Name]"
VALID_INSTALL_POLICIES = {"NOT_AVAILABLE", "AVAILABLE", "INSTALLED_BY_DEFAULT"}
VALID_AUTH_POLICIES = {"ON_INSTALL", "ON_USE"}
def normalize_plugin_name(plugin_name: str) -> str:
"""Normalize a plugin name to lowercase hyphen-case."""
normalized = plugin_name.strip().lower()
normalized = re.sub(r"[^a-z0-9]+", "-", normalized)
normalized = normalized.strip("-")
normalized = re.sub(r"-{2,}", "-", normalized)
return normalized
def validate_plugin_name(plugin_name: str) -> None:
if not plugin_name:
raise ValueError("Plugin name must include at least one letter or digit.")
if len(plugin_name) > MAX_PLUGIN_NAME_LENGTH:
raise ValueError(
f"Plugin name '{plugin_name}' is too long ({len(plugin_name)} characters). "
f"Maximum is {MAX_PLUGIN_NAME_LENGTH} characters."
)
def build_plugin_json(plugin_name: str) -> dict:
return {
"name": plugin_name,
"version": "[TODO: 1.2.0]",
"description": "[TODO: Brief plugin description]",
"author": {
"name": "[TODO: Author Name]",
"email": "[TODO: author@example.com]",
"url": "[TODO: https://github.com/author]",
},
"homepage": "[TODO: https://docs.example.com/plugin]",
"repository": "[TODO: https://github.com/author/plugin]",
"license": "[TODO: MIT]",
"keywords": ["[TODO: keyword1]", "[TODO: keyword2]"],
"skills": "[TODO: ./skills/]",
"hooks": "[TODO: ./hooks.json]",
"mcpServers": "[TODO: ./.mcp.json]",
"apps": "[TODO: ./.app.json]",
"interface": {
"displayName": "[TODO: Plugin Display Name]",
"shortDescription": "[TODO: Short description for subtitle]",
"longDescription": "[TODO: Long description for details page]",
"developerName": "[TODO: OpenAI]",
"category": "[TODO: Productivity]",
"capabilities": ["[TODO: Interactive]", "[TODO: Write]"],
"websiteURL": "[TODO: https://openai.com/]",
"privacyPolicyURL": "[TODO: https://openai.com/policies/row-privacy-policy/]",
"termsOfServiceURL": "[TODO: https://openai.com/policies/row-terms-of-use/]",
"defaultPrompt": [
"[TODO: Summarize my inbox and draft replies for me.]",
"[TODO: Find open bugs and turn them into tickets.]",
"[TODO: Review today's meetings and flag gaps.]",
],
"brandColor": "[TODO: #3B82F6]",
"composerIcon": "[TODO: ./assets/icon.png]",
"logo": "[TODO: ./assets/logo.png]",
"screenshots": [
"[TODO: ./assets/screenshot1.png]",
"[TODO: ./assets/screenshot2.png]",
"[TODO: ./assets/screenshot3.png]",
],
},
}
def build_marketplace_entry(
plugin_name: str,
install_policy: str,
auth_policy: str,
category: str,
) -> dict[str, Any]:
return {
"name": plugin_name,
"source": {
"source": "local",
"path": f"./plugins/{plugin_name}",
},
"policy": {
"installation": install_policy,
"authentication": auth_policy,
},
"category": category,
}
def load_json(path: Path) -> dict[str, Any]:
with path.open() as handle:
return json.load(handle)
def build_default_marketplace() -> dict[str, Any]:
return {
"name": "[TODO: marketplace-name]",
"interface": {
"displayName": DEFAULT_MARKETPLACE_DISPLAY_NAME,
},
"plugins": [],
}
def validate_marketplace_interface(payload: dict[str, Any]) -> None:
interface = payload.get("interface")
if interface is not None and not isinstance(interface, dict):
raise ValueError("marketplace.json field 'interface' must be an object.")
def update_marketplace_json(
marketplace_path: Path,
plugin_name: str,
install_policy: str,
auth_policy: str,
category: str,
force: bool,
) -> None:
if marketplace_path.exists():
payload = load_json(marketplace_path)
else:
payload = build_default_marketplace()
if not isinstance(payload, dict):
raise ValueError(f"{marketplace_path} must contain a JSON object.")
validate_marketplace_interface(payload)
plugins = payload.setdefault("plugins", [])
if not isinstance(plugins, list):
raise ValueError(f"{marketplace_path} field 'plugins' must be an array.")
new_entry = build_marketplace_entry(plugin_name, install_policy, auth_policy, category)
for index, entry in enumerate(plugins):
if isinstance(entry, dict) and entry.get("name") == plugin_name:
if not force:
raise FileExistsError(
f"Marketplace entry '{plugin_name}' already exists in {marketplace_path}. "
"Use --force to overwrite that entry."
)
plugins[index] = new_entry
break
else:
plugins.append(new_entry)
write_json(marketplace_path, payload, force=True)
def write_json(path: Path, data: dict, force: bool) -> None:
if path.exists() and not force:
raise FileExistsError(f"{path} already exists. Use --force to overwrite.")
path.parent.mkdir(parents=True, exist_ok=True)
with path.open("w") as handle:
json.dump(data, handle, indent=2)
handle.write("\n")
def create_stub_file(path: Path, payload: dict, force: bool) -> None:
if path.exists() and not force:
return
path.parent.mkdir(parents=True, exist_ok=True)
with path.open("w") as handle:
json.dump(payload, handle, indent=2)
handle.write("\n")
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(
description="Create a plugin skeleton with placeholder plugin.json."
)
parser.add_argument("plugin_name")
parser.add_argument(
"--path",
default=str(DEFAULT_PLUGIN_PARENT),
help=(
"Parent directory for plugin creation (defaults to <cwd>/plugins). "
"When using a home-rooted marketplace, use <home>/plugins."
),
)
parser.add_argument("--with-skills", action="store_true", help="Create skills/ directory")
parser.add_argument("--with-hooks", action="store_true", help="Create hooks/ directory")
parser.add_argument("--with-scripts", action="store_true", help="Create scripts/ directory")
parser.add_argument("--with-assets", action="store_true", help="Create assets/ directory")
parser.add_argument("--with-mcp", action="store_true", help="Create .mcp.json placeholder")
parser.add_argument("--with-apps", action="store_true", help="Create .app.json placeholder")
parser.add_argument(
"--with-marketplace",
action="store_true",
help=(
"Create or update <cwd>/.agents/plugins/marketplace.json. "
"Marketplace entries always point to ./plugins/<plugin-name> relative to the "
"marketplace root."
),
)
parser.add_argument(
"--marketplace-path",
default=str(DEFAULT_MARKETPLACE_PATH),
help=(
"Path to marketplace.json (defaults to <cwd>/.agents/plugins/marketplace.json). "
"For a home-rooted marketplace, use <home>/.agents/plugins/marketplace.json."
),
)
parser.add_argument(
"--install-policy",
default=DEFAULT_INSTALL_POLICY,
choices=sorted(VALID_INSTALL_POLICIES),
help="Marketplace policy.installation value",
)
parser.add_argument(
"--auth-policy",
default=DEFAULT_AUTH_POLICY,
choices=sorted(VALID_AUTH_POLICIES),
help="Marketplace policy.authentication value",
)
parser.add_argument(
"--category",
default=DEFAULT_CATEGORY,
help="Marketplace category value",
)
parser.add_argument("--force", action="store_true", help="Overwrite existing files")
return parser.parse_args()
def main() -> None:
args = parse_args()
raw_plugin_name = args.plugin_name
plugin_name = normalize_plugin_name(raw_plugin_name)
if plugin_name != raw_plugin_name:
print(f"Note: Normalized plugin name from '{raw_plugin_name}' to '{plugin_name}'.")
validate_plugin_name(plugin_name)
plugin_root = (Path(args.path).expanduser().resolve() / plugin_name)
plugin_root.mkdir(parents=True, exist_ok=True)
plugin_json_path = plugin_root / ".codex-plugin" / "plugin.json"
write_json(plugin_json_path, build_plugin_json(plugin_name), args.force)
optional_directories = {
"skills": args.with_skills,
"hooks": args.with_hooks,
"scripts": args.with_scripts,
"assets": args.with_assets,
}
for folder, enabled in optional_directories.items():
if enabled:
(plugin_root / folder).mkdir(parents=True, exist_ok=True)
if args.with_mcp:
create_stub_file(
plugin_root / ".mcp.json",
{"mcpServers": {}},
args.force,
)
if args.with_apps:
create_stub_file(
plugin_root / ".app.json",
{
"apps": {},
},
args.force,
)
if args.with_marketplace:
marketplace_path = Path(args.marketplace_path).expanduser().resolve()
update_marketplace_json(
marketplace_path,
plugin_name,
args.install_policy,
args.auth_policy,
args.category,
args.force,
)
print(f"Created plugin scaffold: {plugin_root}")
print(f"plugin manifest: {plugin_json_path}")
if args.with_marketplace:
print(f"marketplace manifest: {marketplace_path}")
if __name__ == "__main__":
main()

View File

@@ -1,213 +0,0 @@
---
name: receiving-code-review
description: Use when receiving code review feedback, before implementing suggestions, especially if feedback seems unclear or technically questionable - requires technical rigor and verification, not performative agreement or blind implementation
---
# Code Review Reception
## Overview
Code review requires technical evaluation, not emotional performance.
**Core principle:** Verify before implementing. Ask before assuming. Technical correctness over social comfort.
## The Response Pattern
```
WHEN receiving code review feedback:
1. READ: Complete feedback without reacting
2. UNDERSTAND: Restate requirement in own words (or ask)
3. VERIFY: Check against codebase reality
4. EVALUATE: Technically sound for THIS codebase?
5. RESPOND: Technical acknowledgment or reasoned pushback
6. IMPLEMENT: One item at a time, test each
```
## Forbidden Responses
**NEVER:**
- "You're absolutely right!" (explicit CLAUDE.md violation)
- "Great point!" / "Excellent feedback!" (performative)
- "Let me implement that now" (before verification)
**INSTEAD:**
- Restate the technical requirement
- Ask clarifying questions
- Push back with technical reasoning if wrong
- Just start working (actions > words)
## Handling Unclear Feedback
```
IF any item is unclear:
STOP - do not implement anything yet
ASK for clarification on unclear items
WHY: Items may be related. Partial understanding = wrong implementation.
```
**Example:**
```
your human partner: "Fix 1-6"
You understand 1,2,3,6. Unclear on 4,5.
❌ WRONG: Implement 1,2,3,6 now, ask about 4,5 later
✅ RIGHT: "I understand items 1,2,3,6. Need clarification on 4 and 5 before proceeding."
```
## Source-Specific Handling
### From your human partner
- **Trusted** - implement after understanding
- **Still ask** if scope unclear
- **No performative agreement**
- **Skip to action** or technical acknowledgment
### From External Reviewers
```
BEFORE implementing:
1. Check: Technically correct for THIS codebase?
2. Check: Breaks existing functionality?
3. Check: Reason for current implementation?
4. Check: Works on all platforms/versions?
5. Check: Does reviewer understand full context?
IF suggestion seems wrong:
Push back with technical reasoning
IF can't easily verify:
Say so: "I can't verify this without [X]. Should I [investigate/ask/proceed]?"
IF conflicts with your human partner's prior decisions:
Stop and discuss with your human partner first
```
**your human partner's rule:** "External feedback - be skeptical, but check carefully"
## YAGNI Check for "Professional" Features
```
IF reviewer suggests "implementing properly":
grep codebase for actual usage
IF unused: "This endpoint isn't called. Remove it (YAGNI)?"
IF used: Then implement properly
```
**your human partner's rule:** "You and reviewer both report to me. If we don't need this feature, don't add it."
## Implementation Order
```
FOR multi-item feedback:
1. Clarify anything unclear FIRST
2. Then implement in this order:
- Blocking issues (breaks, security)
- Simple fixes (typos, imports)
- Complex fixes (refactoring, logic)
3. Test each fix individually
4. Verify no regressions
```
## When To Push Back
Push back when:
- Suggestion breaks existing functionality
- Reviewer lacks full context
- Violates YAGNI (unused feature)
- Technically incorrect for this stack
- Legacy/compatibility reasons exist
- Conflicts with your human partner's architectural decisions
**How to push back:**
- Use technical reasoning, not defensiveness
- Ask specific questions
- Reference working tests/code
- Involve your human partner if architectural
**Signal if uncomfortable pushing back out loud:** "Strange things are afoot at the Circle K"
## Acknowledging Correct Feedback
When feedback IS correct:
```
✅ "Fixed. [Brief description of what changed]"
✅ "Good catch - [specific issue]. Fixed in [location]."
✅ [Just fix it and show in the code]
❌ "You're absolutely right!"
❌ "Great point!"
❌ "Thanks for catching that!"
❌ "Thanks for [anything]"
❌ ANY gratitude expression
```
**Why no thanks:** Actions speak. Just fix it. The code itself shows you heard the feedback.
**If you catch yourself about to write "Thanks":** DELETE IT. State the fix instead.
## Gracefully Correcting Your Pushback
If you pushed back and were wrong:
```
✅ "You were right - I checked [X] and it does [Y]. Implementing now."
✅ "Verified this and you're correct. My initial understanding was wrong because [reason]. Fixing."
❌ Long apology
❌ Defending why you pushed back
❌ Over-explaining
```
State the correction factually and move on.
## Common Mistakes
| Mistake | Fix |
|---------|-----|
| Performative agreement | State requirement or just act |
| Blind implementation | Verify against codebase first |
| Batch without testing | One at a time, test each |
| Assuming reviewer is right | Check if breaks things |
| Avoiding pushback | Technical correctness > comfort |
| Partial implementation | Clarify all items first |
| Can't verify, proceed anyway | State limitation, ask for direction |
## Real Examples
**Performative Agreement (Bad):**
```
Reviewer: "Remove legacy code"
❌ "You're absolutely right! Let me remove that..."
```
**Technical Verification (Good):**
```
Reviewer: "Remove legacy code"
✅ "Checking... build target is 10.15+, this API needs 13+. Need legacy for backward compat. Current impl has wrong bundle ID - fix it or drop pre-13 support?"
```
**YAGNI (Good):**
```
Reviewer: "Implement proper metrics tracking with database, date filters, CSV export"
✅ "Grepped codebase - nothing calls this endpoint. Remove it (YAGNI)? Or is there usage I'm missing?"
```
**Unclear Item (Good):**
```
your human partner: "Fix items 1-6"
You understand 1,2,3,6. Unclear on 4,5.
✅ "Understand 1,2,3,6. Need clarification on 4 and 5 before implementing."
```
## GitHub Thread Replies
When replying to inline review comments on GitHub, reply in the comment thread (`gh api repos/{owner}/{repo}/pulls/{pr}/comments/{id}/replies`), not as a top-level PR comment.
## The Bottom Line
**External feedback = suggestions to evaluate, not orders to follow.**
Verify. Question. Then implement.
No performative agreement. Technical rigor always.

View File

@@ -1,105 +0,0 @@
---
name: requesting-code-review
description: Use when completing tasks, implementing major features, or before merging to verify work meets requirements
---
# Requesting Code Review
Dispatch superpowers:code-reviewer subagent to catch issues before they cascade.
**Core principle:** Review early, review often.
## When to Request Review
**Mandatory:**
- After each task in subagent-driven development
- After completing major feature
- Before merge to main
**Optional but valuable:**
- When stuck (fresh perspective)
- Before refactoring (baseline check)
- After fixing complex bug
## How to Request
**1. Get git SHAs:**
```bash
BASE_SHA=$(git rev-parse HEAD~1) # or origin/main
HEAD_SHA=$(git rev-parse HEAD)
```
**2. Dispatch code-reviewer subagent:**
Use Task tool with superpowers:code-reviewer type, fill template at `code-reviewer.md`
**Placeholders:**
- `{WHAT_WAS_IMPLEMENTED}` - What you just built
- `{PLAN_OR_REQUIREMENTS}` - What it should do
- `{BASE_SHA}` - Starting commit
- `{HEAD_SHA}` - Ending commit
- `{DESCRIPTION}` - Brief summary
**3. Act on feedback:**
- Fix Critical issues immediately
- Fix Important issues before proceeding
- Note Minor issues for later
- Push back if reviewer is wrong (with reasoning)
## Example
```
[Just completed Task 2: Add verification function]
You: Let me request code review before proceeding.
BASE_SHA=$(git log --oneline | grep "Task 1" | head -1 | awk '{print $1}')
HEAD_SHA=$(git rev-parse HEAD)
[Dispatch superpowers:code-reviewer subagent]
WHAT_WAS_IMPLEMENTED: Verification and repair functions for conversation index
PLAN_OR_REQUIREMENTS: Task 2 from docs/plans/deployment-plan.md
BASE_SHA: a7981ec
HEAD_SHA: 3df7661
DESCRIPTION: Added verifyIndex() and repairIndex() with 4 issue types
[Subagent returns]:
Strengths: Clean architecture, real tests
Issues:
Important: Missing progress indicators
Minor: Magic number (100) for reporting interval
Assessment: Ready to proceed
You: [Fix progress indicators]
[Continue to Task 3]
```
## Integration with Workflows
**Subagent-Driven Development:**
- Review after EACH task
- Catch issues before they compound
- Fix before moving to next task
**Executing Plans:**
- Review after each batch (3 tasks)
- Get feedback, apply, continue
**Ad-Hoc Development:**
- Review before merge
- Review when stuck
## Red Flags
**Never:**
- Skip review because "it's simple"
- Ignore Critical issues
- Proceed with unfixed Important issues
- Argue with valid technical feedback
**If reviewer wrong:**
- Push back with technical reasoning
- Show code/tests that prove it works
- Request clarification
See template at: requesting-code-review/code-reviewer.md


@@ -1,146 +0,0 @@
# Code Review Agent
You are reviewing code changes for production readiness.
**Your task:**
1. Review {WHAT_WAS_IMPLEMENTED}
2. Compare against {PLAN_OR_REQUIREMENTS}
3. Check code quality, architecture, testing
4. Categorize issues by severity
5. Assess production readiness
## What Was Implemented
{DESCRIPTION}
## Requirements/Plan
{PLAN_OR_REQUIREMENTS}
## Git Range to Review
**Base:** {BASE_SHA}
**Head:** {HEAD_SHA}
```bash
git diff --stat {BASE_SHA}..{HEAD_SHA}
git diff {BASE_SHA}..{HEAD_SHA}
```
## Review Checklist
**Code Quality:**
- Clean separation of concerns?
- Proper error handling?
- Type safety (if applicable)?
- DRY principle followed?
- Edge cases handled?
**Architecture:**
- Sound design decisions?
- Scalability considerations?
- Performance implications?
- Security concerns?
**Testing:**
- Tests actually test logic (not mocks)?
- Edge cases covered?
- Integration tests where needed?
- All tests passing?
**Requirements:**
- All plan requirements met?
- Implementation matches spec?
- No scope creep?
- Breaking changes documented?
**Production Readiness:**
- Migration strategy (if schema changes)?
- Backward compatibility considered?
- Documentation complete?
- No obvious bugs?
## Output Format
### Strengths
[What's well done? Be specific.]
### Issues
#### Critical (Must Fix)
[Bugs, security issues, data loss risks, broken functionality]
#### Important (Should Fix)
[Architecture problems, missing features, poor error handling, test gaps]
#### Minor (Nice to Have)
[Code style, optimization opportunities, documentation improvements]
**For each issue:**
- File:line reference
- What's wrong
- Why it matters
- How to fix (if not obvious)
### Recommendations
[Improvements for code quality, architecture, or process]
### Assessment
**Ready to merge?** [Yes/No/With fixes]
**Reasoning:** [Technical assessment in 1-2 sentences]
## Critical Rules
**DO:**
- Categorize by actual severity (not everything is Critical)
- Be specific (file:line, not vague)
- Explain WHY issues matter
- Acknowledge strengths
- Give clear verdict
**DON'T:**
- Say "looks good" without checking
- Mark nitpicks as Critical
- Give feedback on code you didn't review
- Be vague ("improve error handling")
- Avoid giving a clear verdict
## Example Output
```
### Strengths
- Clean database schema with proper migrations (db.ts:15-42)
- Comprehensive test coverage (18 tests, all edge cases)
- Good error handling with fallbacks (summarizer.ts:85-92)
### Issues
#### Important
1. **Missing help text in CLI wrapper**
- File: index-conversations:1-31
- Issue: No --help flag, users won't discover --concurrency
- Fix: Add --help case with usage examples
2. **Date validation missing**
- File: search.ts:25-27
- Issue: Invalid dates silently return no results
- Fix: Validate ISO format, throw error with example
#### Minor
1. **Progress indicators**
- File: indexer.ts:130
- Issue: No "X of Y" counter for long operations
- Impact: Users don't know how long to wait
### Recommendations
- Add progress reporting for user experience
- Consider config file for excluded projects (portability)
### Assessment
**Ready to merge: With fixes**
**Reasoning:** Core implementation is solid with good architecture and tests. Important issues (help text, date validation) are easily fixed and don't affect core functionality.
```


@@ -199,4 +199,4 @@
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
limitations under the License.


@@ -1,7 +1,8 @@
---
name: skill-creator
description: Guide for creating effective skills. This skill should be used when users want to create a new skill (or update an existing skill) that extends Claude's capabilities with specialized knowledge, workflows, or tool integrations.
license: Complete terms in LICENSE.txt
description: Guide for creating effective skills. This skill should be used when users want to create a new skill (or update an existing skill) that extends Codex's capabilities with specialized knowledge, workflows, or tool integrations.
metadata:
short-description: Create or update a skill
---
# Skill Creator
@@ -10,9 +11,9 @@ This skill provides guidance for creating effective skills.
## About Skills
Skills are modular, self-contained packages that extend Claude's capabilities by providing
Skills are modular, self-contained folders that extend Codex's capabilities by providing
specialized knowledge, workflows, and tools. Think of them as "onboarding guides" for specific
domains or tasks—they transform Claude from a general-purpose agent into a specialized agent
domains or tasks—they transform Codex from a general-purpose agent into a specialized agent
equipped with procedural knowledge that no model can fully possess.
### What Skills Provide
@@ -26,9 +27,9 @@ equipped with procedural knowledge that no model can fully possess.
### Concise is Key
The context window is a public good. Skills share the context window with everything else Claude needs: system prompt, conversation history, other Skills' metadata, and the actual user request.
The context window is a public good. Skills share the context window with everything else Codex needs: system prompt, conversation history, other Skills' metadata, and the actual user request.
**Default assumption: Claude is already very smart.** Only add context Claude doesn't already have. Challenge each piece of information: "Does Claude really need this explanation?" and "Does this paragraph justify its token cost?"
**Default assumption: Codex is already very smart.** Only add context Codex doesn't already have. Challenge each piece of information: "Does Codex really need this explanation?" and "Does this paragraph justify its token cost?"
Prefer concise examples over verbose explanations.
@@ -42,7 +43,15 @@ Match the level of specificity to the task's fragility and variability:
**Low freedom (specific scripts, few parameters)**: Use when operations are fragile and error-prone, consistency is critical, or a specific sequence must be followed.
Think of Claude as exploring a path: a narrow bridge with cliffs needs specific guardrails (low freedom), while an open field allows many routes (high freedom).
Think of Codex as exploring a path: a narrow bridge with cliffs needs specific guardrails (low freedom), while an open field allows many routes (high freedom).
### Protect Validation Integrity
You may use subagents during iteration to validate whether a skill works on realistic tasks or whether a suspected problem is real. This is most useful when you want an independent pass on the skill's behavior, outputs, or failure modes after a revision. Only do this when it is possible to start new subagents.
When using subagents for validation, treat that as an evaluation surface. The goal is to learn whether the skill generalizes, not whether another agent can reconstruct the answer from leaked context.
Prefer raw artifacts such as example prompts, outputs, diffs, logs, or traces. Give the minimum task-local context needed to perform the validation. Avoid passing the intended answer, suspected bug, intended fix, or your prior conclusions unless the validation explicitly requires them.
### Anatomy of a Skill
@@ -53,9 +62,10 @@ skill-name/
├── SKILL.md (required)
│ ├── YAML frontmatter metadata (required)
│ │ ├── name: (required)
│ │ ├── description: (required)
│ │ └── compatibility: (optional, rarely needed)
│ │ └── description: (required)
│ └── Markdown instructions (required)
├── agents/ (recommended)
│ └── openai.yaml - UI metadata for skill lists and chips
└── Bundled Resources (optional)
├── scripts/ - Executable code (Python/Bash/etc.)
├── references/ - Documentation intended to be loaded into context as needed
@@ -66,9 +76,19 @@ skill-name/
Every SKILL.md consists of:
- **Frontmatter** (YAML): Contains `name` and `description` fields (required), plus optional fields like `license`, `metadata`, and `compatibility`. Only `name` and `description` are read by Claude to determine when the skill triggers, so be clear and comprehensive about what the skill is and when it should be used. The `compatibility` field is for noting environment requirements (target product, system packages, etc.) but most skills don't need it.
- **Frontmatter** (YAML): Contains `name` and `description` fields. These are the only fields that Codex reads to determine when the skill gets used, thus it is very important to be clear and comprehensive in describing what the skill is, and when it should be used.
- **Body** (Markdown): Instructions and guidance for using the skill. Only loaded AFTER the skill triggers (if at all).
#### Agents metadata (recommended)
- UI-facing metadata for skill lists and chips
- Read references/openai_yaml.md before generating values and follow its descriptions and constraints
- Create: human-facing `display_name`, `short_description`, and `default_prompt` by reading the skill
- Generate deterministically by passing the values as `--interface key=value` to `scripts/generate_openai_yaml.py` or `scripts/init_skill.py`
- On updates: validate `agents/openai.yaml` still matches SKILL.md; regenerate if stale
- Only include other optional interface fields (icons, brand color) if explicitly provided
- See references/openai_yaml.md for field definitions and examples
#### Bundled Resources (optional)
##### Scripts (`scripts/`)
@@ -78,27 +98,27 @@ Executable code (Python/Bash/etc.) for tasks that require deterministic reliabil
- **When to include**: When the same code is being rewritten repeatedly or deterministic reliability is needed
- **Example**: `scripts/rotate_pdf.py` for PDF rotation tasks
- **Benefits**: Token efficient, deterministic, may be executed without loading into context
- **Note**: Scripts may still need to be read by Claude for patching or environment-specific adjustments
- **Note**: Scripts may still need to be read by Codex for patching or environment-specific adjustments
##### References (`references/`)
Documentation and reference material intended to be loaded as needed into context to inform Claude's process and thinking.
Documentation and reference material intended to be loaded as needed into context to inform Codex's process and thinking.
- **When to include**: For documentation that Claude should reference while working
- **When to include**: For documentation that Codex should reference while working
- **Examples**: `references/finance.md` for financial schemas, `references/mnda.md` for company NDA template, `references/policies.md` for company policies, `references/api_docs.md` for API specifications
- **Use cases**: Database schemas, API documentation, domain knowledge, company policies, detailed workflow guides
- **Benefits**: Keeps SKILL.md lean, loaded only when Claude determines it's needed
- **Benefits**: Keeps SKILL.md lean, loaded only when Codex determines it's needed
- **Best practice**: If files are large (>10k words), include grep search patterns in SKILL.md
- **Avoid duplication**: Information should live in either SKILL.md or references files, not both. Prefer references files for detailed information unless it's truly core to the skill—this keeps SKILL.md lean while making information discoverable without hogging the context window. Keep only essential procedural instructions and workflow guidance in SKILL.md; move detailed reference material, schemas, and examples to references files.
##### Assets (`assets/`)
Files not intended to be loaded into context, but rather used within the output Claude produces.
Files not intended to be loaded into context, but rather used within the output Codex produces.
- **When to include**: When the skill needs files that will be used in the final output
- **Examples**: `assets/logo.png` for brand assets, `assets/slides.pptx` for PowerPoint templates, `assets/frontend-template/` for HTML/React boilerplate, `assets/font.ttf` for typography
- **Use cases**: Templates, images, icons, boilerplate code, fonts, sample documents that get copied or modified
- **Benefits**: Separates output resources from documentation, enables Claude to use files without loading them into context
- **Benefits**: Separates output resources from documentation, enables Codex to use files without loading them into context
#### What to Not Include in a Skill
@@ -110,7 +130,7 @@ A skill should only contain essential files that directly support its functional
- CHANGELOG.md
- etc.
The skill should only contain the information needed for an AI agent to do the job at hand. It should not contain auxilary context about the process that went into creating it, setup and testing procedures, user-facing documentation, etc. Creating additional documentation files just adds clutter and confusion.
The skill should only contain the information needed for an AI agent to do the job at hand. It should not contain auxiliary context about the process that went into creating it, setup and testing procedures, user-facing documentation, etc. Creating additional documentation files just adds clutter and confusion.
### Progressive Disclosure Design Principle
@@ -118,7 +138,7 @@ Skills use a three-level loading system to manage context efficiently:
1. **Metadata (name + description)** - Always in context (~100 words)
2. **SKILL.md body** - When skill triggers (<5k words)
3. **Bundled resources** - As needed by Claude (Unlimited because scripts can be executed without reading into context window)
3. **Bundled resources** - As needed by Codex (Unlimited because scripts can be executed without reading into context window)
#### Progressive Disclosure Patterns
@@ -143,7 +163,7 @@ Extract text with pdfplumber:
- **Examples**: See [EXAMPLES.md](EXAMPLES.md) for common patterns
```
Claude loads FORMS.md, REFERENCE.md, or EXAMPLES.md only when needed.
Codex loads FORMS.md, REFERENCE.md, or EXAMPLES.md only when needed.
**Pattern 2: Domain-specific organization**
@@ -159,7 +179,7 @@ bigquery-skill/
└── marketing.md (campaigns, attribution)
```
When a user asks about sales metrics, Claude only reads sales.md.
When a user asks about sales metrics, Codex only reads sales.md.
Similarly, for skills supporting multiple frameworks or variants, organize by variant:
@@ -172,7 +192,7 @@ cloud-deploy/
└── azure.md (Azure deployment patterns)
```
When the user chooses AWS, Claude only reads aws.md.
When the user chooses AWS, Codex only reads aws.md.
**Pattern 3: Conditional details**
@@ -193,12 +213,12 @@ For simple edits, modify the XML directly.
**For OOXML details**: See [OOXML.md](OOXML.md)
```
Claude reads REDLINING.md or OOXML.md only when the user needs those features.
Codex reads REDLINING.md or OOXML.md only when the user needs those features.
**Important guidelines:**
- **Avoid deeply nested references** - Keep references one level deep from SKILL.md. All reference files should link directly from SKILL.md.
- **Structure longer reference files** - For files longer than 100 lines, include a table of contents at the top so Claude can see the full scope when previewing.
- **Structure longer reference files** - For files longer than 100 lines, include a table of contents at the top so Codex can see the full scope when previewing.
## Skill Creation Process
@@ -208,11 +228,19 @@ Skill creation involves these steps:
2. Plan reusable skill contents (scripts, references, assets)
3. Initialize the skill (run init_skill.py)
4. Edit the skill (implement resources and write SKILL.md)
5. Package the skill (run package_skill.py)
6. Iterate based on real usage
5. Validate the skill (run quick_validate.py)
6. Iterate based on real usage and forward-test complex skills.
Follow these steps in order, skipping only if there is a clear reason why they are not applicable.
### Skill Naming
- Use lowercase letters, digits, and hyphens only; normalize user-provided titles to hyphen-case (e.g., "Plan Mode" -> `plan-mode`).
- When generating names, generate a name under 64 characters (letters, digits, hyphens).
- Prefer short, verb-led phrases that describe the action.
- Namespace by tool when it improves clarity or triggering (e.g., `gh-address-comments`, `linear-address-issue`).
- Name the skill folder exactly after the skill name.
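The naming rules above can be sketched as a small normalizer. This is an illustrative helper (the `normalize_skill_name` function is hypothetical, not one of the bundled scripts):

```python
import re

def normalize_skill_name(title: str, max_len: int = 64) -> str:
    """Normalize a user-provided title to lowercase hyphen-case, <= max_len chars."""
    name = re.sub(r"[^a-z0-9]+", "-", title.lower())  # non-alphanumeric runs -> hyphen
    name = name.strip("-")                            # trim leading/trailing hyphens
    return name[:max_len].rstrip("-")                 # enforce the length limit

print(normalize_skill_name("Plan Mode"))             # plan-mode
print(normalize_skill_name("GH: Address Comments"))  # gh-address-comments
```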
### Step 1: Understanding the Skill with Concrete Examples
Skip this step only when the skill's usage patterns are already clearly understood. It remains valuable even when working with an existing skill.
@@ -225,6 +253,7 @@ For example, when building an image-editor skill, relevant questions include:
- "Can you give some examples of how this skill would be used?"
- "I can imagine users asking for things like 'Remove the red-eye from this image' or 'Rotate this image'. Are there other ways you imagine this skill being used?"
- "What would a user say that should trigger this skill?"
- "Where should I create this skill? If you do not have a preference, I will place it in `$CODEX_HOME/skills` (or `~/.codex/skills` when `CODEX_HOME` is unset) so Codex can discover it automatically."
To avoid overwhelming users, avoid asking too many questions in a single message. Start with the most important questions and follow up as needed for better effectiveness.
@@ -258,37 +287,49 @@ To establish the skill's contents, analyze each concrete example to create a lis
At this point, it is time to actually create the skill.
Skip this step only if the skill being developed already exists, and iteration or packaging is needed. In this case, continue to the next step.
Skip this step only if the skill being developed already exists. In this case, continue to the next step.
Before running `init_skill.py`, ask where the user wants the skill created. If they do not specify a location, default to `$CODEX_HOME/skills`; when `CODEX_HOME` is unset, fall back to `~/.codex/skills` so the skill is auto-discovered.
When creating a new skill from scratch, always run the `init_skill.py` script. The script conveniently generates a new template skill directory that automatically includes everything a skill requires, making the skill creation process much more efficient and reliable.
Usage:
```bash
scripts/init_skill.py <skill-name> --path <output-directory>
scripts/init_skill.py <skill-name> --path <output-directory> [--resources scripts,references,assets] [--examples]
```
Examples:
```bash
scripts/init_skill.py my-skill --path "${CODEX_HOME:-$HOME/.codex}/skills"
scripts/init_skill.py my-skill --path "${CODEX_HOME:-$HOME/.codex}/skills" --resources scripts,references
scripts/init_skill.py my-skill --path ~/work/skills --resources scripts --examples
```
The script:
- Creates the skill directory at the specified path
- Generates a SKILL.md template with proper frontmatter and TODO placeholders
- Creates example resource directories: `scripts/`, `references/`, and `assets/`
- Adds example files in each directory that can be customized or deleted
- Creates `agents/openai.yaml` using agent-generated `display_name`, `short_description`, and `default_prompt` passed via `--interface key=value`
- Optionally creates resource directories based on `--resources`
- Optionally adds example files when `--examples` is set
After initialization, customize or remove the generated SKILL.md and example files as needed.
After initialization, customize the SKILL.md and add resources as needed. If you used `--examples`, replace or delete placeholder files.
Generate `display_name`, `short_description`, and `default_prompt` by reading the skill, then pass them as `--interface key=value` to `init_skill.py` or regenerate with:
```bash
scripts/generate_openai_yaml.py <path/to/skill-folder> --interface key=value
```
Only include other optional interface fields when the user explicitly provides them. For full field descriptions and examples, see references/openai_yaml.md.
### Step 4: Edit the Skill
When editing the (newly-generated or existing) skill, remember that the skill is being created for another instance of Claude to use. Include information that would be beneficial and non-obvious to Claude. Consider what procedural knowledge, domain-specific details, or reusable assets would help another Claude instance execute these tasks more effectively.
When editing the (newly-generated or existing) skill, remember that the skill is being created for another instance of Codex to use. Include information that would be beneficial and non-obvious to Codex. Consider what procedural knowledge, domain-specific details, or reusable assets would help another Codex instance execute these tasks more effectively.
#### Learn Proven Design Patterns
Consult these helpful guides based on your skill's needs:
- **Multi-step processes**: See references/workflows.md for sequential workflows and conditional logic
- **Specific output formats or quality standards**: See references/output-patterns.md for template and example patterns
These files contain established best practices for effective skill design.
After substantial revisions, or if the skill is particularly tricky, you should use subagents to forward-test the skill on realistic tasks or artifacts. When doing so, pass the artifact under validation rather than your diagnosis of what is wrong, and keep the prompt generic enough that success depends on transferable reasoning rather than hidden ground truth.
#### Start with Reusable Skill Contents
@@ -296,7 +337,7 @@ To begin implementation, start with the reusable resources identified above: `sc
Added scripts must be tested by actually running them to ensure there are no bugs and that the output matches what is expected. If there are many similar scripts, only a representative sample needs to be tested to ensure confidence that they all work while balancing time to completion.
Any example files and directories not needed for the skill should be deleted. The initialization script creates example files in `scripts/`, `references/`, and `assets/` to demonstrate structure, but most skills won't need all of them.
If you used `--examples`, delete any placeholder files that are not needed for the skill. Only create resource directories that are actually required.
#### Update SKILL.md
@@ -307,10 +348,10 @@ Any example files and directories not needed for the skill should be deleted. Th
Write the YAML frontmatter with `name` and `description`:
- `name`: The skill name
- `description`: This is the primary triggering mechanism for your skill, and helps Claude understand when to use the skill.
- `description`: This is the primary triggering mechanism for your skill, and helps Codex understand when to use the skill.
- Include both what the Skill does and specific triggers/contexts for when to use it.
- Include all "when to use" information here - Not in the body. The body is only loaded after triggering, so "When to Use This Skill" sections in the body are not helpful to Claude.
- Example description for a `docx` skill: "Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction. Use when Claude needs to work with professional documents (.docx files) for: (1) Creating new documents, (2) Modifying or editing content, (3) Working with tracked changes, (4) Adding comments, or any other document tasks"
- Include all "when to use" information here - Not in the body. The body is only loaded after triggering, so "When to Use This Skill" sections in the body are not helpful to Codex.
- Example description for a `docx` skill: "Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction. Use when Codex needs to work with professional documents (.docx files) for: (1) Creating new documents, (2) Modifying or editing content, (3) Working with tracked changes, (4) Adding comments, or any other document tasks"
Do not include any other fields in YAML frontmatter.
@@ -318,40 +359,58 @@ Do not include any other fields in YAML frontmatter.
Write instructions for using the skill and its bundled resources.
### Step 5: Packaging a Skill
### Step 5: Validate the Skill
Once development of the skill is complete, it must be packaged into a distributable .skill file that gets shared with the user. The packaging process automatically validates the skill first to ensure it meets all requirements:
Once development of the skill is complete, validate the skill folder to catch basic issues early:
```bash
scripts/package_skill.py <path/to/skill-folder>
scripts/quick_validate.py <path/to/skill-folder>
```
Optional output directory specification:
```bash
scripts/package_skill.py <path/to/skill-folder> ./dist
```
The packaging script will:
1. **Validate** the skill automatically, checking:
- YAML frontmatter format and required fields
- Skill naming conventions and directory structure
- Description completeness and quality
- File organization and resource references
2. **Package** the skill if validation passes, creating a .skill file named after the skill (e.g., `my-skill.skill`) that includes all files and maintains the proper directory structure for distribution. The .skill file is a zip file with a .skill extension.
If validation fails, the script will report the errors and exit without creating a package. Fix any validation errors and run the packaging command again.
The validation script checks YAML frontmatter format, required fields, and naming rules. If validation fails, fix the reported issues and run the command again.
### Step 6: Iterate
After testing the skill, users may request improvements. Often this happens right after using the skill, with fresh context of how the skill performed.
After testing the skill, you may find it is complex enough to require forward-testing, or users may request improvements.
**Iteration workflow:**
User testing often happens right after using the skill, with fresh context of how the skill performed.
**Forward-testing and iteration workflow:**
1. Use the skill on real tasks
2. Notice struggles or inefficiencies
3. Identify how SKILL.md or bundled resources should be updated
4. Implement changes and test again
5. Forward-test if it is reasonable and appropriate
## Forward-testing
To forward-test, launch subagents as a way to stress test the skill with minimal context.
Subagents should *not* know that they are being asked to test the skill. Treat each subagent as an agent asked to perform a task by the user. Prompts to subagents should look like:
`Use $skill-x at /path/to/skill-x to solve problem y`
Not:
`Review the skill at /path/to/skill-x; pretend a user asks you to...`
Decision rule for forward-testing:
- Err on the side of forward-testing
- Ask for approval if you think there's a risk that forward-testing would:
* take a long time,
* require additional approvals from the user, or
* modify live production systems
In these cases, show the user your proposed prompt and request (1) a yes/no decision, and
(2) any suggested modifications.
Considerations when forward-testing:
- use fresh threads for independent passes
- pass the skill and a request phrased the way a user would
- pass raw artifacts, not your conclusions
- avoid showing expected answers or intended fixes
- rebuild context from source artifacts after each iteration
- review the subagent's output and reasoning and emitted artifacts
- avoid leaving artifacts the agent can find on disk between iterations;
clean up subagents' artifacts to avoid additional contamination.
If forward-testing only succeeds when subagents see leaked context, tighten the skill or the
forward-testing setup before trusting the result.


@@ -0,0 +1,5 @@
interface:
  display_name: "Skill Creator"
  short_description: "Create or update a skill"
  icon_small: "./assets/skill-creator-small.svg"
  icon_large: "./assets/skill-creator.png"


@@ -0,0 +1,3 @@
<svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" fill="currentColor" viewBox="0 0 20 20">
<path fill="#0D0D0D" d="M12.03 4.113a3.612 3.612 0 0 1 5.108 5.108l-6.292 6.29c-.324.324-.56.561-.791.752l-.235.176c-.205.14-.422.261-.65.36l-.229.093a4.136 4.136 0 0 1-.586.16l-.764.134-2.394.4c-.142.024-.294.05-.423.06-.098.007-.232.01-.378-.026l-.149-.05a1.081 1.081 0 0 1-.521-.474l-.046-.093a1.104 1.104 0 0 1-.075-.527c.01-.129.035-.28.06-.422l.398-2.394c.1-.602.162-.987.295-1.35l.093-.23c.1-.228.22-.445.36-.65l.176-.235c.19-.232.428-.467.751-.79l6.292-6.292Zm-5.35 7.232c-.35.35-.534.535-.66.688l-.11.147a2.67 2.67 0 0 0-.24.433l-.062.154c-.08.22-.124.462-.232 1.112l-.398 2.394-.001.001h.003l2.393-.399.717-.126a2.63 2.63 0 0 0 .394-.105l.154-.063a2.65 2.65 0 0 0 .433-.24l.147-.11c.153-.126.339-.31.688-.66l4.988-4.988-3.227-3.226-4.987 4.988Zm9.517-6.291a2.281 2.281 0 0 0-3.225 0l-.364.362 3.226 3.227.363-.364c.89-.89.89-2.334 0-3.225ZM4.583 1.783a.3.3 0 0 1 .294.241c.117.585.347 1.092.707 1.48.357.385.859.668 1.549.783a.3.3 0 0 1 0 .592c-.69.115-1.192.398-1.549.783-.315.34-.53.77-.657 1.265l-.05.215a.3.3 0 0 1-.588 0c-.117-.585-.347-1.092-.707-1.48-.357-.384-.859-.668-1.549-.783a.3.3 0 0 1 0-.592c.69-.115 1.192-.398 1.549-.783.36-.388.59-.895.707-1.48l.015-.05a.3.3 0 0 1 .279-.19Z"/>
</svg>


@@ -0,0 +1,49 @@
# openai.yaml fields (full example + descriptions)
`agents/openai.yaml` is an extended, product-specific config intended for the machine/harness to read, not the agent. Other product-specific config can also live in the `agents/` folder.
## Full example
```yaml
interface:
  display_name: "Optional user-facing name"
  short_description: "Optional user-facing description"
  icon_small: "./assets/small-400px.png"
  icon_large: "./assets/large-logo.svg"
  brand_color: "#3B82F6"
  default_prompt: "Optional surrounding prompt to use the skill with"
dependencies:
  tools:
    - type: "mcp"
      value: "github"
      description: "GitHub MCP server"
      transport: "streamable_http"
      url: "https://api.githubcopilot.com/mcp/"
policy:
  allow_implicit_invocation: true
```
## Field descriptions and constraints
Top-level constraints:
- Quote all string values.
- Keep keys unquoted.
- For `interface.default_prompt`: generate a helpful, short (typically 1 sentence) example starting prompt based on the skill. It must explicitly mention the skill as `$skill-name` (e.g., "Use $skill-name-here to draft a concise weekly status update.").
- `interface.display_name`: Human-facing title shown in UI skill lists and chips.
- `interface.short_description`: Human-facing short UI blurb (25-64 chars) for quick scanning.
- `interface.icon_small`: Path to a small icon asset (relative to skill dir). Default to `./assets/` and place icons in the skill's `assets/` folder.
- `interface.icon_large`: Path to a larger logo asset (relative to skill dir). Default to `./assets/` and place icons in the skill's `assets/` folder.
- `interface.brand_color`: Hex color used for UI accents (e.g., badges).
- `interface.default_prompt`: Default prompt snippet inserted when invoking the skill.
- `dependencies.tools[].type`: Dependency category. Only `mcp` is supported for now.
- `dependencies.tools[].value`: Identifier of the tool or dependency.
- `dependencies.tools[].description`: Human-readable explanation of the dependency.
- `dependencies.tools[].transport`: Connection type when `type` is `mcp`.
- `dependencies.tools[].url`: MCP server URL when `type` is `mcp`.
- `policy.allow_implicit_invocation`: When false, the skill is not injected into
the model context by default, but can still be invoked explicitly via `$skill`.
Defaults to true.
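The length and format constraints above can be checked mechanically before publishing. A minimal sketch (the `validate_interface` helper and its message wording are illustrative, not part of any shipped tool; only the 25-64 character bound, the hex color shape, and the `$skill-name` mention come from the constraints above):

```python
import re

# Six hex digits with a leading '#', e.g. "#3B82F6".
HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")

def validate_interface(interface):
    """Return a list of problems found in an `interface` mapping."""
    problems = []
    short = interface.get("short_description", "")
    if short and not (25 <= len(short) <= 64):
        problems.append(f"short_description must be 25-64 chars (got {len(short)})")
    color = interface.get("brand_color", "")
    if color and not HEX_COLOR.match(color):
        problems.append(f"brand_color must be a 6-digit hex color (got {color!r})")
    prompt = interface.get("default_prompt", "")
    if prompt and "$" not in prompt:
        problems.append("default_prompt must mention the skill as $skill-name")
    return problems
```

Running this against a parsed `interface` block (for example, the output of `yaml.safe_load`) returns an empty list when the documented constraints hold.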


@@ -0,0 +1,226 @@
#!/usr/bin/env python3
"""
OpenAI YAML Generator - Creates agents/openai.yaml for a skill folder.
Usage:
generate_openai_yaml.py <skill_dir> [--name <skill_name>] [--interface key=value]
"""
import argparse
import re
import sys
from pathlib import Path
ACRONYMS = {
"GH",
"MCP",
"API",
"CI",
"CLI",
"LLM",
"PDF",
"PR",
"UI",
"URL",
"SQL",
}
BRANDS = {
"openai": "OpenAI",
"openapi": "OpenAPI",
"github": "GitHub",
"pagerduty": "PagerDuty",
"datadog": "DataDog",
"sqlite": "SQLite",
"fastapi": "FastAPI",
}
SMALL_WORDS = {"and", "or", "to", "up", "with"}
ALLOWED_INTERFACE_KEYS = {
"display_name",
"short_description",
"icon_small",
"icon_large",
"brand_color",
"default_prompt",
}
def yaml_quote(value):
escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
return f'"{escaped}"'
def format_display_name(skill_name):
words = [word for word in skill_name.split("-") if word]
formatted = []
for index, word in enumerate(words):
lower = word.lower()
upper = word.upper()
if upper in ACRONYMS:
formatted.append(upper)
continue
if lower in BRANDS:
formatted.append(BRANDS[lower])
continue
if index > 0 and lower in SMALL_WORDS:
formatted.append(lower)
continue
formatted.append(word.capitalize())
return " ".join(formatted)
def generate_short_description(display_name):
description = f"Help with {display_name} tasks"
if len(description) < 25:
description = f"Help with {display_name} tasks and workflows"
if len(description) < 25:
description = f"Help with {display_name} tasks with guidance"
if len(description) > 64:
description = f"Help with {display_name}"
if len(description) > 64:
description = f"{display_name} helper"
if len(description) > 64:
description = f"{display_name} tools"
if len(description) > 64:
suffix = " helper"
max_name_length = 64 - len(suffix)
trimmed = display_name[:max_name_length].rstrip()
description = f"{trimmed}{suffix}"
if len(description) > 64:
description = description[:64].rstrip()
if len(description) < 25:
description = f"{description} workflows"
if len(description) > 64:
description = description[:64].rstrip()
return description
def read_frontmatter_name(skill_dir):
skill_md = Path(skill_dir) / "SKILL.md"
if not skill_md.exists():
print(f"[ERROR] SKILL.md not found in {skill_dir}")
return None
content = skill_md.read_text()
match = re.match(r"^---\n(.*?)\n---", content, re.DOTALL)
if not match:
print("[ERROR] Invalid SKILL.md frontmatter format.")
return None
frontmatter_text = match.group(1)
import yaml
try:
frontmatter = yaml.safe_load(frontmatter_text)
except yaml.YAMLError as exc:
print(f"[ERROR] Invalid YAML frontmatter: {exc}")
return None
if not isinstance(frontmatter, dict):
print("[ERROR] Frontmatter must be a YAML dictionary.")
return None
name = frontmatter.get("name", "")
if not isinstance(name, str) or not name.strip():
print("[ERROR] Frontmatter 'name' is missing or invalid.")
return None
return name.strip()
def parse_interface_overrides(raw_overrides):
overrides = {}
optional_order = []
for item in raw_overrides:
if "=" not in item:
print(f"[ERROR] Invalid interface override '{item}'. Use key=value.")
return None, None
key, value = item.split("=", 1)
key = key.strip()
value = value.strip()
if not key:
print(f"[ERROR] Invalid interface override '{item}'. Key is empty.")
return None, None
if key not in ALLOWED_INTERFACE_KEYS:
allowed = ", ".join(sorted(ALLOWED_INTERFACE_KEYS))
print(f"[ERROR] Unknown interface field '{key}'. Allowed: {allowed}")
return None, None
overrides[key] = value
if key not in ("display_name", "short_description") and key not in optional_order:
optional_order.append(key)
return overrides, optional_order
def write_openai_yaml(skill_dir, skill_name, raw_overrides):
overrides, optional_order = parse_interface_overrides(raw_overrides)
if overrides is None:
return None
display_name = overrides.get("display_name") or format_display_name(skill_name)
short_description = overrides.get("short_description") or generate_short_description(display_name)
if not (25 <= len(short_description) <= 64):
print(
"[ERROR] short_description must be 25-64 characters "
f"(got {len(short_description)})."
)
return None
interface_lines = [
"interface:",
f" display_name: {yaml_quote(display_name)}",
f" short_description: {yaml_quote(short_description)}",
]
for key in optional_order:
value = overrides.get(key)
if value is not None:
interface_lines.append(f" {key}: {yaml_quote(value)}")
agents_dir = Path(skill_dir) / "agents"
agents_dir.mkdir(parents=True, exist_ok=True)
output_path = agents_dir / "openai.yaml"
output_path.write_text("\n".join(interface_lines) + "\n")
print("[OK] Created agents/openai.yaml")
return output_path
def main():
parser = argparse.ArgumentParser(
description="Create agents/openai.yaml for a skill directory.",
)
parser.add_argument("skill_dir", help="Path to the skill directory")
parser.add_argument(
"--name",
help="Skill name override (defaults to SKILL.md frontmatter)",
)
parser.add_argument(
"--interface",
action="append",
default=[],
help="Interface override in key=value format (repeatable)",
)
args = parser.parse_args()
skill_dir = Path(args.skill_dir).resolve()
if not skill_dir.exists():
print(f"[ERROR] Skill directory not found: {skill_dir}")
sys.exit(1)
if not skill_dir.is_dir():
print(f"[ERROR] Path is not a directory: {skill_dir}")
sys.exit(1)
skill_name = args.name or read_frontmatter_name(skill_dir)
if not skill_name:
sys.exit(1)
result = write_openai_yaml(skill_dir, skill_name, args.interface)
if result:
sys.exit(0)
sys.exit(1)
if __name__ == "__main__":
main()

.agents/skills/skill-creator/scripts/init_skill.py Executable file → Normal file

@@ -3,17 +3,25 @@
Skill Initializer - Creates a new skill from template
Usage:
init_skill.py <skill-name> --path <path>
init_skill.py <skill-name> --path <path> [--resources scripts,references,assets] [--examples] [--interface key=value]
Examples:
init_skill.py my-new-skill --path skills/public
init_skill.py my-api-helper --path skills/private
init_skill.py my-new-skill --path skills/public --resources scripts,references
init_skill.py my-api-helper --path skills/private --resources scripts --examples
init_skill.py custom-skill --path /custom/location
init_skill.py my-skill --path skills/public --interface short_description="Short UI label"
"""
import argparse
import re
import sys
from pathlib import Path
from generate_openai_yaml import write_openai_yaml
MAX_SKILL_NAME_LENGTH = 64
ALLOWED_RESOURCES = {"scripts", "references", "assets"}
SKILL_TEMPLATE = """---
name: {skill_name}
@@ -32,23 +40,23 @@ description: [TODO: Complete and informative explanation of what the skill does
**1. Workflow-Based** (best for sequential processes)
- Works well when there are clear step-by-step procedures
- Example: DOCX skill with "Workflow Decision Tree" "Reading" "Creating" "Editing"
- Structure: ## Overview ## Workflow Decision Tree ## Step 1 ## Step 2...
- Example: DOCX skill with "Workflow Decision Tree" -> "Reading" -> "Creating" -> "Editing"
- Structure: ## Overview -> ## Workflow Decision Tree -> ## Step 1 -> ## Step 2...
**2. Task-Based** (best for tool collections)
- Works well when the skill offers different operations/capabilities
- Example: PDF skill with "Quick Start" "Merge PDFs" "Split PDFs" "Extract Text"
- Structure: ## Overview ## Quick Start ## Task Category 1 ## Task Category 2...
- Example: PDF skill with "Quick Start" -> "Merge PDFs" -> "Split PDFs" -> "Extract Text"
- Structure: ## Overview -> ## Quick Start -> ## Task Category 1 -> ## Task Category 2...
**3. Reference/Guidelines** (best for standards or specifications)
- Works well for brand guidelines, coding standards, or requirements
- Example: Brand styling with "Brand Guidelines" "Colors" "Typography" "Features"
- Structure: ## Overview ## Guidelines ## Specifications ## Usage...
- Example: Brand styling with "Brand Guidelines" -> "Colors" -> "Typography" -> "Features"
- Structure: ## Overview -> ## Guidelines -> ## Specifications -> ## Usage...
**4. Capabilities-Based** (best for integrated systems)
- Works well when the skill provides multiple interrelated features
- Example: Product Management with "Core Capabilities" numbered capability list
- Structure: ## Overview ## Core Capabilities ### 1. Feature ### 2. Feature...
- Example: Product Management with "Core Capabilities" -> numbered capability list
- Structure: ## Overview -> ## Core Capabilities -> ### 1. Feature -> ### 2. Feature...
Patterns can be mixed and matched as needed. Most skills combine patterns (e.g., start with task-based, add workflow for complex operations).
@@ -62,9 +70,9 @@ Delete this entire "Structuring This Skill" section when done - it's just guidan
- Concrete examples with realistic user requests
- References to scripts/templates/references as needed]
## Resources
## Resources (optional)
This skill includes example resource directories that demonstrate how to organize different types of bundled resources:
Create only the resource directories this skill actually needs. Delete this section if no resources are required.
### scripts/
Executable code (Python/Bash/etc.) that can be run directly to perform specific operations.
@@ -75,20 +83,20 @@ Executable code (Python/Bash/etc.) that can be run directly to perform specific
**Appropriate for:** Python scripts, shell scripts, or any executable code that performs automation, data processing, or specific operations.
**Note:** Scripts may be executed without loading into context, but can still be read by Claude for patching or environment adjustments.
**Note:** Scripts may be executed without loading into context, but can still be read by Codex for patching or environment adjustments.
### references/
Documentation and reference material intended to be loaded into context to inform Claude's process and thinking.
Documentation and reference material intended to be loaded into context to inform Codex's process and thinking.
**Examples from other skills:**
- Product management: `communication.md`, `context_building.md` - detailed workflow guides
- BigQuery: API reference documentation and query examples
- Finance: Schema documentation, company policies
**Appropriate for:** In-depth documentation, API references, database schemas, comprehensive guides, or any detailed information that Claude should reference while working.
**Appropriate for:** In-depth documentation, API references, database schemas, comprehensive guides, or any detailed information that Codex should reference while working.
### assets/
Files not intended to be loaded into context, but rather used within the output Claude produces.
Files not intended to be loaded into context, but rather used within the output Codex produces.
**Examples from other skills:**
- Brand styling: PowerPoint template files (.pptx), logo files
@@ -99,7 +107,7 @@ Files not intended to be loaded into context, but rather used within the output
---
**Any unneeded directories can be deleted.** Not every skill requires all three types of resources.
**Not every skill requires all three types of resources.**
"""
EXAMPLE_SCRIPT = '''#!/usr/bin/env python3
@@ -165,7 +173,7 @@ This placeholder represents where asset files would be stored.
Replace with actual asset files (templates, images, fonts, etc.) or delete if not needed.
Asset files are NOT intended to be loaded into context, but rather used within
the output Claude produces.
the output Codex produces.
Example asset files from other skills:
- Brand guidelines: logo.png, slides_template.pptx
@@ -186,18 +194,76 @@ Note: This is a text placeholder. Actual assets can be any file type.
"""
def normalize_skill_name(skill_name):
"""Normalize a skill name to lowercase hyphen-case."""
normalized = skill_name.strip().lower()
normalized = re.sub(r"[^a-z0-9]+", "-", normalized)
normalized = normalized.strip("-")
normalized = re.sub(r"-{2,}", "-", normalized)
return normalized
def title_case_skill_name(skill_name):
"""Convert hyphenated skill name to Title Case for display."""
return ' '.join(word.capitalize() for word in skill_name.split('-'))
return " ".join(word.capitalize() for word in skill_name.split("-"))
def init_skill(skill_name, path):
def parse_resources(raw_resources):
if not raw_resources:
return []
resources = [item.strip() for item in raw_resources.split(",") if item.strip()]
invalid = sorted({item for item in resources if item not in ALLOWED_RESOURCES})
if invalid:
allowed = ", ".join(sorted(ALLOWED_RESOURCES))
print(f"[ERROR] Unknown resource type(s): {', '.join(invalid)}")
print(f" Allowed: {allowed}")
sys.exit(1)
deduped = []
seen = set()
for resource in resources:
if resource not in seen:
deduped.append(resource)
seen.add(resource)
return deduped
def create_resource_dirs(skill_dir, skill_name, skill_title, resources, include_examples):
for resource in resources:
resource_dir = skill_dir / resource
resource_dir.mkdir(exist_ok=True)
if resource == "scripts":
if include_examples:
example_script = resource_dir / "example.py"
example_script.write_text(EXAMPLE_SCRIPT.format(skill_name=skill_name))
example_script.chmod(0o755)
print("[OK] Created scripts/example.py")
else:
print("[OK] Created scripts/")
elif resource == "references":
if include_examples:
example_reference = resource_dir / "api_reference.md"
example_reference.write_text(EXAMPLE_REFERENCE.format(skill_title=skill_title))
print("[OK] Created references/api_reference.md")
else:
print("[OK] Created references/")
elif resource == "assets":
if include_examples:
example_asset = resource_dir / "example_asset.txt"
example_asset.write_text(EXAMPLE_ASSET)
print("[OK] Created assets/example_asset.txt")
else:
print("[OK] Created assets/")
def init_skill(skill_name, path, resources, include_examples, interface_overrides):
"""
Initialize a new skill directory with template SKILL.md.
Args:
skill_name: Name of the skill
path: Path where the skill directory should be created
resources: Resource directories to create
include_examples: Whether to create example files in resource directories
Returns:
Path to created skill directory, or None if error
@@ -207,91 +273,122 @@ def init_skill(skill_name, path):
# Check if directory already exists
if skill_dir.exists():
print(f"❌ Error: Skill directory already exists: {skill_dir}")
print(f"[ERROR] Skill directory already exists: {skill_dir}")
return None
# Create skill directory
try:
skill_dir.mkdir(parents=True, exist_ok=False)
print(f"✅ Created skill directory: {skill_dir}")
print(f"[OK] Created skill directory: {skill_dir}")
except Exception as e:
print(f"❌ Error creating directory: {e}")
print(f"[ERROR] Error creating directory: {e}")
return None
# Create SKILL.md from template
skill_title = title_case_skill_name(skill_name)
skill_content = SKILL_TEMPLATE.format(
skill_name=skill_name,
skill_title=skill_title
)
skill_content = SKILL_TEMPLATE.format(skill_name=skill_name, skill_title=skill_title)
skill_md_path = skill_dir / 'SKILL.md'
skill_md_path = skill_dir / "SKILL.md"
try:
skill_md_path.write_text(skill_content)
print("✅ Created SKILL.md")
print("[OK] Created SKILL.md")
except Exception as e:
print(f"❌ Error creating SKILL.md: {e}")
print(f"[ERROR] Error creating SKILL.md: {e}")
return None
# Create resource directories with example files
# Create agents/openai.yaml
try:
# Create scripts/ directory with example script
scripts_dir = skill_dir / 'scripts'
scripts_dir.mkdir(exist_ok=True)
example_script = scripts_dir / 'example.py'
example_script.write_text(EXAMPLE_SCRIPT.format(skill_name=skill_name))
example_script.chmod(0o755)
print("✅ Created scripts/example.py")
# Create references/ directory with example reference doc
references_dir = skill_dir / 'references'
references_dir.mkdir(exist_ok=True)
example_reference = references_dir / 'api_reference.md'
example_reference.write_text(EXAMPLE_REFERENCE.format(skill_title=skill_title))
print("✅ Created references/api_reference.md")
# Create assets/ directory with example asset placeholder
assets_dir = skill_dir / 'assets'
assets_dir.mkdir(exist_ok=True)
example_asset = assets_dir / 'example_asset.txt'
example_asset.write_text(EXAMPLE_ASSET)
print("✅ Created assets/example_asset.txt")
result = write_openai_yaml(skill_dir, skill_name, interface_overrides)
if not result:
return None
except Exception as e:
print(f"❌ Error creating resource directories: {e}")
print(f"[ERROR] Error creating agents/openai.yaml: {e}")
return None
# Create resource directories if requested
if resources:
try:
create_resource_dirs(skill_dir, skill_name, skill_title, resources, include_examples)
except Exception as e:
print(f"[ERROR] Error creating resource directories: {e}")
return None
# Print next steps
print(f"\n✅ Skill '{skill_name}' initialized successfully at {skill_dir}")
print(f"\n[OK] Skill '{skill_name}' initialized successfully at {skill_dir}")
print("\nNext steps:")
print("1. Edit SKILL.md to complete the TODO items and update the description")
print("2. Customize or delete the example files in scripts/, references/, and assets/")
print("3. Run the validator when ready to check the skill structure")
if resources:
if include_examples:
print("2. Customize or delete the example files in scripts/, references/, and assets/")
else:
print("2. Add resources to scripts/, references/, and assets/ as needed")
else:
print("2. Create resource directories only if needed (scripts/, references/, assets/)")
print("3. Update agents/openai.yaml if the UI metadata should differ")
print("4. Run the validator when ready to check the skill structure")
print(
"5. Forward-test complex skills with realistic user requests to ensure they work as intended"
)
return skill_dir
def main():
if len(sys.argv) < 4 or sys.argv[2] != '--path':
print("Usage: init_skill.py <skill-name> --path <path>")
print("\nSkill name requirements:")
print(" - Kebab-case identifier (e.g., 'my-data-analyzer')")
print(" - Lowercase letters, digits, and hyphens only")
print(" - Max 64 characters")
print(" - Must match directory name exactly")
print("\nExamples:")
print(" init_skill.py my-new-skill --path skills/public")
print(" init_skill.py my-api-helper --path skills/private")
print(" init_skill.py custom-skill --path /custom/location")
parser = argparse.ArgumentParser(
description="Create a new skill directory with a SKILL.md template.",
)
parser.add_argument("skill_name", help="Skill name (normalized to hyphen-case)")
parser.add_argument("--path", required=True, help="Output directory for the skill")
parser.add_argument(
"--resources",
default="",
help="Comma-separated list: scripts,references,assets",
)
parser.add_argument(
"--examples",
action="store_true",
help="Create example files inside the selected resource directories",
)
parser.add_argument(
"--interface",
action="append",
default=[],
help="Interface override in key=value format (repeatable)",
)
args = parser.parse_args()
raw_skill_name = args.skill_name
skill_name = normalize_skill_name(raw_skill_name)
if not skill_name:
print("[ERROR] Skill name must include at least one letter or digit.")
sys.exit(1)
if len(skill_name) > MAX_SKILL_NAME_LENGTH:
print(
f"[ERROR] Skill name '{skill_name}' is too long ({len(skill_name)} characters). "
f"Maximum is {MAX_SKILL_NAME_LENGTH} characters."
)
sys.exit(1)
if skill_name != raw_skill_name:
print(f"Note: Normalized skill name from '{raw_skill_name}' to '{skill_name}'.")
resources = parse_resources(args.resources)
if args.examples and not resources:
print("[ERROR] --examples requires --resources to be set.")
sys.exit(1)
skill_name = sys.argv[1]
path = sys.argv[3]
path = args.path
print(f"🚀 Initializing skill: {skill_name}")
print(f"Initializing skill: {skill_name}")
print(f" Location: {path}")
if resources:
print(f" Resources: {', '.join(resources)}")
if args.examples:
print(" Examples: enabled")
else:
print(" Resources: none (create as needed)")
print()
result = init_skill(skill_name, path)
result = init_skill(skill_name, path, resources, args.examples, args.interface)
if result:
sys.exit(0)

.agents/skills/skill-creator/scripts/quick_validate.py Executable file → Normal file

@@ -3,34 +3,33 @@
Quick validation script for skills - minimal version
"""
import sys
import os
import re
import yaml
import sys
from pathlib import Path
import yaml
MAX_SKILL_NAME_LENGTH = 64
def validate_skill(skill_path):
"""Basic validation of a skill"""
skill_path = Path(skill_path)
# Check SKILL.md exists
skill_md = skill_path / 'SKILL.md'
skill_md = skill_path / "SKILL.md"
if not skill_md.exists():
return False, "SKILL.md not found"
# Read and validate frontmatter
content = skill_md.read_text()
if not content.startswith('---'):
if not content.startswith("---"):
return False, "No YAML frontmatter found"
# Extract frontmatter
match = re.match(r'^---\n(.*?)\n---', content, re.DOTALL)
match = re.match(r"^---\n(.*?)\n---", content, re.DOTALL)
if not match:
return False, "Invalid frontmatter format"
frontmatter_text = match.group(1)
# Parse YAML frontmatter
try:
frontmatter = yaml.safe_load(frontmatter_text)
if not isinstance(frontmatter, dict):
@@ -38,66 +37,65 @@ def validate_skill(skill_path):
except yaml.YAMLError as e:
return False, f"Invalid YAML in frontmatter: {e}"
# Define allowed properties
ALLOWED_PROPERTIES = {'name', 'description', 'license', 'allowed-tools', 'metadata', 'compatibility'}
allowed_properties = {"name", "description", "license", "allowed-tools", "metadata"}
# Check for unexpected properties (excluding nested keys under metadata)
unexpected_keys = set(frontmatter.keys()) - ALLOWED_PROPERTIES
unexpected_keys = set(frontmatter.keys()) - allowed_properties
if unexpected_keys:
return False, (
f"Unexpected key(s) in SKILL.md frontmatter: {', '.join(sorted(unexpected_keys))}. "
f"Allowed properties are: {', '.join(sorted(ALLOWED_PROPERTIES))}"
allowed = ", ".join(sorted(allowed_properties))
unexpected = ", ".join(sorted(unexpected_keys))
return (
False,
f"Unexpected key(s) in SKILL.md frontmatter: {unexpected}. Allowed properties are: {allowed}",
)
# Check required fields
if 'name' not in frontmatter:
if "name" not in frontmatter:
return False, "Missing 'name' in frontmatter"
if 'description' not in frontmatter:
if "description" not in frontmatter:
return False, "Missing 'description' in frontmatter"
# Extract name for validation
name = frontmatter.get('name', '')
name = frontmatter.get("name", "")
if not isinstance(name, str):
return False, f"Name must be a string, got {type(name).__name__}"
name = name.strip()
if name:
# Check naming convention (kebab-case: lowercase with hyphens)
if not re.match(r'^[a-z0-9-]+$', name):
return False, f"Name '{name}' should be kebab-case (lowercase letters, digits, and hyphens only)"
if name.startswith('-') or name.endswith('-') or '--' in name:
return False, f"Name '{name}' cannot start/end with hyphen or contain consecutive hyphens"
# Check name length (max 64 characters per spec)
if len(name) > 64:
return False, f"Name is too long ({len(name)} characters). Maximum is 64 characters."
if not re.match(r"^[a-z0-9-]+$", name):
return (
False,
f"Name '{name}' should be hyphen-case (lowercase letters, digits, and hyphens only)",
)
if name.startswith("-") or name.endswith("-") or "--" in name:
return (
False,
f"Name '{name}' cannot start/end with hyphen or contain consecutive hyphens",
)
if len(name) > MAX_SKILL_NAME_LENGTH:
return (
False,
f"Name is too long ({len(name)} characters). "
f"Maximum is {MAX_SKILL_NAME_LENGTH} characters.",
)
# Extract and validate description
description = frontmatter.get('description', '')
description = frontmatter.get("description", "")
if not isinstance(description, str):
return False, f"Description must be a string, got {type(description).__name__}"
description = description.strip()
if description:
# Check for angle brackets
if '<' in description or '>' in description:
if "<" in description or ">" in description:
return False, "Description cannot contain angle brackets (< or >)"
# Check description length (max 1024 characters per spec)
if len(description) > 1024:
return False, f"Description is too long ({len(description)} characters). Maximum is 1024 characters."
# Validate compatibility field if present (optional)
compatibility = frontmatter.get('compatibility', '')
if compatibility:
if not isinstance(compatibility, str):
return False, f"Compatibility must be a string, got {type(compatibility).__name__}"
if len(compatibility) > 500:
return False, f"Compatibility is too long ({len(compatibility)} characters). Maximum is 500 characters."
return (
False,
f"Description is too long ({len(description)} characters). Maximum is 1024 characters.",
)
return True, "Skill is valid!"
if __name__ == "__main__":
if len(sys.argv) != 2:
print("Usage: python quick_validate.py <skill_directory>")
sys.exit(1)
valid, message = validate_skill(sys.argv[1])
print(message)
sys.exit(0 if valid else 1)
sys.exit(0 if valid else 1)


@@ -1,201 +0,0 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf of
any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
@@ -1,153 +0,0 @@
---
name: "sora"
description: "Use when the user asks to generate, remix, poll, list, download, or delete Sora videos via OpenAI's video API using the bundled CLI (`scripts/sora.py`), including requests like "generate AI video," "Sora," "video remix," "download video/thumbnail/spritesheet," and batch video generation; requires `OPENAI_API_KEY` and Sora API access."
---
# Sora Video Generation Skill
Creates or manages short video clips for the current project (product demos, marketing spots, cinematic shots, UI mocks). Defaults to `sora-2` and a structured prompt augmentation workflow, and prefers the bundled CLI for deterministic runs. Note: `$sora` is a skill tag in prompts, not a shell command.
## When to use
- Generate a new video clip from a prompt
- Remix an existing video by ID
- Poll status, list jobs, or download assets (video/thumbnail/spritesheet)
- Batch runs (many prompts or variants)
## Decision tree (create vs remix vs status/download vs batch)
- If the user has a **video id** and wants a change → **remix**
- If the user has a **video id** and wants status or assets → **status/poll/download**
- If the user needs many prompts/assets → **create-batch**
- If the user asks for two versions with a small change (same shot, different subject/detail) → **create** the base, then **remix** for the variation
- Otherwise → **create** (or **create-and-poll** if they need a ready asset in one step)
## Workflow
1. Decide intent: create vs remix vs status/download vs batch.
2. Collect inputs: prompt, model, size, seconds, and any input reference image.
3. If batch: write a temporary JSONL under tmp/ (one job per line), run once, then delete the JSONL.
4. Prefer CLI augmentation flags (`--use-case`, `--scene`, `--camera`, etc.) instead of pre-writing a structured prompt. If you already produced a structured prompt file, pass `--no-augment` to avoid double-wrapping.
5. Run the bundled CLI (`scripts/sora.py`) with sensible defaults (see references/cli.md). For long prompts, prefer `--prompt-file` to avoid shell-escaping issues; pair with `--no-augment` if the prompt is already structured.
6. For async jobs, poll until complete (or use create-and-poll).
7. Download assets (video/thumbnail/spritesheet) and save locally.
8. Remove intermediate files created during invocation (for example `prompt.txt`, `remix_job.json`, temp JSONL). If the sandbox blocks `rm`, skip cleanup or truncate the files without surfacing an error.
9. Iterate with a single targeted change per prompt.
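Steps 6-7 of the workflow (poll until the job finishes, then download) can be sketched as a small helper. This is an illustrative sketch, not the implementation in `scripts/sora.py`; it assumes the client exposes `videos.retrieve(video_id)` returning an object with a `status` attribute, as recent `openai` SDKs do.

```python
import time

def poll_until_complete(client, video_id, interval=10, timeout=600):
    """Poll a video job until it leaves the queued/in_progress states.

    `client` is assumed to expose `videos.retrieve(video_id)` (openai SDK
    surface); inject any object with that shape for testing.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = client.videos.retrieve(video_id)
        if job.status not in ("queued", "in_progress"):
            return job  # typically "completed" or "failed"
        time.sleep(interval)
    raise TimeoutError(f"video {video_id} did not finish within {timeout}s")
```

Because the client is injected, the loop is easy to exercise without network access; in practice you would pass `openai.OpenAI()` and then download the chosen variant once `status` is `completed`.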
## Authentication
- `OPENAI_API_KEY` must be set for live API calls.
If the key is missing, give the user these steps:
1. Create an API key in the OpenAI platform UI: https://platform.openai.com/api-keys
2. Set `OPENAI_API_KEY` as an environment variable in their system.
3. Offer to guide them through setting the environment variable for their OS/shell if needed.
- Never ask the user to paste the full key in chat. Ask them to set it locally and confirm when ready.
## Defaults & rules
- Default model: `sora-2` (use `sora-2-pro` for higher fidelity).
- Default size: `1280x720`.
- Default seconds: `4` (allowed: "4", "8", "12" as strings).
- Always set size and seconds via API params; prose will not change them.
- Use the OpenAI Python SDK (`openai` package); do not use raw HTTP.
- Require `OPENAI_API_KEY` before any live API call.
- If uv cache permissions fail, set `UV_CACHE_DIR=/tmp/uv-cache`.
- Input reference images must be jpg/png/webp and should match target size.
- Download URLs expire after about 1 hour; copy assets to your own storage.
- Prefer the bundled CLI and **never modify** `scripts/sora.py` unless the user asks.
- Sora can generate audio; if a user requests voiceover/audio, specify it explicitly in the `Audio:` and `Dialogue:` lines and keep it short.
## API limitations
- Models are limited to `sora-2` and `sora-2-pro`.
- API access to Sora models requires an organization-verified account.
- Duration is limited to 4/8/12 seconds and must be set via the `seconds` parameter.
- The API expects `seconds` as a string enum ("4", "8", "12").
- Output sizes are limited by model (see `references/video-api.md` for the supported sizes).
- Video creation is async; you must poll for completion before downloading.
- Rate limits apply by usage tier (do not list specific limits).
- Content restrictions are enforced by the API (see Guardrails below).
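The model and duration limits above can be enforced before any API call. A minimal validation sketch (helper name and structure are illustrative, not part of the CLI):

```python
ALLOWED_MODELS = {"sora-2", "sora-2-pro"}
ALLOWED_SECONDS = {"4", "8", "12"}  # API expects seconds as a string enum

def validate_job(model, seconds):
    """Reject unsupported models/durations; normalize seconds to a string."""
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    seconds = str(seconds)
    if seconds not in ALLOWED_SECONDS:
        raise ValueError(f"seconds must be one of {sorted(ALLOWED_SECONDS)}, got {seconds!r}")
    return seconds
```

Normalizing `seconds` to a string up front avoids a late API rejection when a caller passes an integer.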
## Guardrails (must enforce)
- Only content suitable for audiences under 18.
- No copyrighted characters or copyrighted music.
- No real people (including public figures).
- Input images with human faces are rejected.
## Prompt augmentation
Reformat prompts into a structured, production-oriented spec. Only make implicit details explicit; do not invent new creative requirements.
Template (include only relevant lines):
```
Use case: <where the clip will be used>
Primary request: <user's main prompt>
Scene/background: <location, time of day, atmosphere>
Subject: <main subject>
Action: <single clear action>
Camera: <shot type, angle, motion>
Lighting/mood: <lighting + mood>
Color palette: <3-5 color anchors>
Style/format: <film/animation/format cues>
Timing/beats: <counts or beats>
Audio: <ambient cue / music / voiceover if requested>
Text (verbatim): "<exact text>"
Dialogue:
<dialogue>
- Speaker: "Short line."
</dialogue>
Constraints: <must keep/must avoid>
Avoid: <negative constraints>
```
Augmentation rules:
- Keep it short; add only details the user already implied or provided elsewhere.
- For remixes, explicitly list invariants ("same shot, change only X").
- If any critical detail is missing and blocks success, ask a question; otherwise proceed.
- If you pass a structured prompt file to the CLI, add `--no-augment` to avoid the tool re-wrapping it.
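The "include only relevant lines" rule amounts to rendering only the fields that have values, in template order. A sketch of that assembly (the key names mirror the CLI augmentation keys documented for `create-batch`; this is not the actual augmenter in `scripts/sora.py`):

```python
# (key, template label) pairs, in template order; assumed to mirror the CLI keys.
FIELD_LABELS = [
    ("use_case", "Use case"),
    ("primary_request", "Primary request"),
    ("scene", "Scene/background"),
    ("subject", "Subject"),
    ("action", "Action"),
    ("camera", "Camera"),
    ("lighting", "Lighting/mood"),
    ("palette", "Color palette"),
    ("style", "Style/format"),
    ("timing", "Timing/beats"),
    ("audio", "Audio"),
    ("constraints", "Constraints"),
    ("negative", "Avoid"),
]

def build_structured_prompt(fields):
    """Render only the template lines that have a value, in template order."""
    lines = []
    for key, label in FIELD_LABELS:
        value = fields.get(key)
        if value:
            lines.append(f"{label}: {value}")
    return "\n".join(lines)
```

Empty fields simply drop out, which keeps augmented prompts short, as the rules above require.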
## Examples
### Generation example (single shot)
```
Use case: product teaser
Primary request: a close-up of a matte black camera on a pedestal
Action: slow 30-degree orbit over 4 seconds
Camera: 85mm, shallow depth of field, gentle handheld drift
Lighting/mood: soft key light, subtle rim, premium studio feel
Constraints: no logos, no text
```
### Remix example (invariants)
```
Primary request: same shot and framing, switch palette to teal/sand/rust with warmer backlight
Constraints: keep the subject and camera move unchanged
```
## Prompting best practices (short list)
- One main action + one camera move per shot.
- Use counts or beats for timing ("two steps, pause, turn").
- Keep text short and the camera locked-off for UI or on-screen text.
- Add a brief avoid line when artifacts appear (flicker, jitter, fast motion).
- Shorter prompts are more creative; longer prompts are more controlled.
- Put dialogue in a dedicated block; keep lines short for 4-8s clips.
- State invariants explicitly for remixes (same shot, same camera move).
- Iterate with single-change follow-ups to preserve continuity.
## Guidance by asset type
Use these modules when the request is for a specific artifact. They provide targeted templates and defaults.
- Cinematic shots: `references/cinematic-shots.md`
- Social ads: `references/social-ads.md`
## CLI + environment notes
- CLI commands + examples: `references/cli.md`
- API parameter quick reference: `references/video-api.md`
- Prompting guidance: `references/prompting.md`
- Sample prompts: `references/sample-prompts.md`
- Troubleshooting: `references/troubleshooting.md`
- Network/sandbox tips: `references/codex-network.md`
## Reference map
- **`references/cli.md`**: how to run create/poll/remix/download/batch via `scripts/sora.py`.
- **`references/video-api.md`**: API-level knobs (models, sizes, duration, variants, status).
- **`references/prompting.md`**: prompt structure and iteration guidance.
- **`references/sample-prompts.md`**: copy/paste prompt recipes (examples only; no extra theory).
- **`references/cinematic-shots.md`**: templates for filmic shots.
- **`references/social-ads.md`**: templates for short social ad beats.
- **`references/troubleshooting.md`**: common errors and fixes.
- **`references/codex-network.md`**: network/approval troubleshooting.
@@ -1,6 +0,0 @@
interface:
display_name: "Sora Video Generation Skill"
short_description: "Generate and manage Sora videos"
icon_small: "./assets/sora-small.svg"
icon_large: "./assets/sora.png"
default_prompt: "Plan and generate a Sora video for this request, then iterate with concrete prompt edits."
@@ -1,4 +0,0 @@
<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" fill="currentColor" viewBox="0 0 14 14">
<path fill="currentColor" fill-rule="evenodd" d="M4.668 4.896c.552-.148 1.113.252 1.288.905l.192.718c.175.654-.11 1.281-.662 1.429-.551.147-1.112-.252-1.287-.906l-.193-.719c-.175-.654.111-1.28.662-1.427Zm.16.403a.057.057 0 0 0-.096.026l-.088.362a.058.058 0 0 1-.016.027l-.27.258a.057.057 0 0 0 .027.096l.361.088c.01.002.02.008.028.016l.257.269c.031.032.086.018.096-.025l.088-.362a.06.06 0 0 1 .016-.028l.27-.257a.057.057 0 0 0-.026-.096l-.362-.089a.058.058 0 0 1-.028-.015l-.257-.27Zm2.685-1.166c.551-.148 1.112.252 1.287.905l.192.719c.175.654-.11 1.28-.662 1.428-.551.147-1.111-.253-1.286-.907L6.85 5.56c-.175-.654.11-1.28.662-1.427Zm.159.403a.057.057 0 0 0-.096.026l-.088.361a.057.057 0 0 1-.016.028l-.27.257a.057.057 0 0 0 .026.096l.362.089c.01.002.02.007.028.015l.257.27a.057.057 0 0 0 .096-.026l.088-.362a.058.058 0 0 1 .016-.027l.27-.258a.057.057 0 0 0-.027-.096l-.36-.088a.058.058 0 0 1-.029-.016l-.257-.27Z" clip-rule="evenodd"/>
<path fill="currentColor" fill-rule="evenodd" d="M6.001 1.167c.743 0 1.423.27 1.948.715a3.015 3.015 0 0 1 3.502 3.502c.446.525.716 1.206.716 1.948 0 1.308-.834 2.42-1.997 2.838a3.015 3.015 0 0 1-4.786 1.281A3.015 3.015 0 0 1 1.882 7.95a3.004 3.004 0 0 1-.71-1.793l-.005-.155c0-1.308.833-2.42 1.996-2.839A3.016 3.016 0 0 1 6 1.167Zm0 .885c-.985 0-1.815.67-2.058 1.578a.444.444 0 0 1-.313.313 2.13 2.13 0 0 0-.954 3.564.443.443 0 0 1 .115.427 2.13 2.13 0 0 0 2.609 2.61l.057-.012a.443.443 0 0 1 .37.126 2.131 2.131 0 0 0 3.564-.955l.018-.056a.443.443 0 0 1 .294-.257 2.131 2.131 0 0 0 .955-3.564.443.443 0 0 1-.115-.427 2.13 2.13 0 0 0-2.61-2.61.443.443 0 0 1-.426-.113 2.123 2.123 0 0 0-1.396-.622l-.11-.002Z" clip-rule="evenodd"/>
</svg>
@@ -1,53 +0,0 @@
# Cinematic shot templates
Use these for filmic, mood-forward clips. Keep one subject, one action, one camera move.
## Shot grammar (pick one)
- Static wide: locked-off, slow atmosphere changes
- Dolly-in: slow push toward subject
- Dolly-out: reveal more context
- Orbit: 15-45 degree arc around subject
- Lateral move: smooth left-right slide
- Crane: subtle vertical rise
- Handheld drift: gentle, controlled sway
## Default template
```
Use case: cinematic shot
Primary request: <subject + setting>
Scene/background: <location, time of day, atmosphere>
Subject: <main subject>
Action: <one clear action>
Camera: <shot type, lens, motion>
Lighting/mood: <key light + mood>
Color palette: <3-5 anchors>
Style/format: filmic, natural grain
Constraints: no logos, no text, no people
Avoid: jitter; flicker; oversharpening
```
## Example: moody exterior
```
Use case: cinematic shot
Primary request: a lone cabin on a cliff above the sea
Scene/background: foggy coastline at dawn, drifting mist
Subject: small wooden cabin with warm window glow
Action: light fog rolls past the cabin
Camera: slow dolly-in, 35mm, steady
Lighting/mood: moody, soft dawn light, subtle contrast
Color palette: deep blue, slate, warm amber
Constraints: no logos, no text, no people
```
## Example: intimate detail
```
Use case: cinematic detail
Primary request: close-up of a vinyl record spinning
Scene/background: dim room, soft lamp glow
Subject: record grooves and stylus
Action: slow rotation, subtle dust motes
Camera: macro, locked-off
Lighting/mood: warm, low-key, soft highlights
Color palette: warm amber, deep brown, charcoal
Constraints: no logos, no text
```
@@ -1,248 +0,0 @@
# CLI reference (`scripts/sora.py`)
This file contains the command catalog for the bundled video generation CLI. Keep `SKILL.md` overview-first; put verbose CLI details here.
## What this CLI does
- `create`: create a new video job (async)
- `create-and-poll`: create a job, poll until complete, optionally download
- `poll`: wait for an existing job to finish
- `status`: retrieve job status/details
- `download`: download video/thumbnail/spritesheet
- `list`: list recent jobs
- `delete`: delete a job
- `remix`: remix a completed video
- `create-batch`: create multiple jobs from a JSONL file
Real API calls require **network access** and `OPENAI_API_KEY`; `--dry-run` requires neither.
## Quick start (works from any repo)
Set a stable path to the skill CLI (default `CODEX_HOME` is `~/.codex`):
```
export CODEX_HOME="${CODEX_HOME:-$HOME/.codex}"
export SORA_CLI="$CODEX_HOME/skills/sora/scripts/sora.py"
```
If you're in this repo, you can set the path directly:
```
# Use repo root (tests run from output/* so $PWD is not the root)
export SORA_CLI="$(git rev-parse --show-toplevel)/<path-to-skill>/scripts/sora.py"
```
If `git` isn't available, set `SORA_CLI` to the absolute path to `<path-to-skill>/scripts/sora.py`.
If uv cache fails with permission errors, set a writable cache:
```
export UV_CACHE_DIR="/tmp/uv-cache"
```
Dry-run (no API call; no network required; does not require the `openai` package):
```
python "$SORA_CLI" create --prompt "Test" --dry-run
```
Preflight a full prompt without running the API:
```
python "$SORA_CLI" create --prompt-file prompt.txt --dry-run --json-out out/request.json
```
Create a job (requires `OPENAI_API_KEY` + network):
```
uv run --with openai python "$SORA_CLI" create \
--model sora-2 \
--prompt "Wide tracking shot of a teal coupe on a desert highway" \
--size 1280x720 \
--seconds 8
```
Create from a prompt file (avoids shell-escaping issues for multi-line prompts):
```
cat > prompt.txt << 'EOF'
Use case: landing page hero
Primary request: a matte black camera on a pedestal
Action: slow 30-degree orbit over 4 seconds
Camera: 85mm, shallow depth of field
Lighting/mood: soft key light, subtle rim
Constraints: no logos, no text
EOF
uv run --with openai python "$SORA_CLI" create \
--prompt-file prompt.txt \
--size 1280x720 \
--seconds 4
```
If your prompt file is already structured (Use case/Scene/Camera/etc), disable tool augmentation:
```
uv run --with openai python "$SORA_CLI" create \
--prompt-file prompt.txt \
--no-augment \
--size 1280x720 \
--seconds 4
```
Create + poll + download:
```
uv run --with openai python "$SORA_CLI" create-and-poll \
--model sora-2-pro \
--prompt "Close-up of a steaming coffee cup on a wooden table" \
--size 1280x720 \
--seconds 8 \
--download \
--variant video \
--out coffee.mp4
```
Create + poll + write JSON bundle:
```
uv run --with openai python "$SORA_CLI" create-and-poll \
--prompt "Minimal product teaser of a matte black camera" \
--json-out out/coffee-job.json
```
Remix a completed video:
```
uv run --with openai python "$SORA_CLI" remix \
--id video_abc123 \
--prompt "Same shot, shift palette to teal/sand/rust with warm backlight."
```
Download a thumbnail or spritesheet:
```
uv run --with openai python "$SORA_CLI" download --id video_abc123 --variant thumbnail --out thumb.webp
uv run --with openai python "$SORA_CLI" download --id video_abc123 --variant spritesheet --out sheet.jpg
```
## Guardrails (important)
- Use `python "$SORA_CLI" ...` (or equivalent full path) for all video work.
- For API calls, prefer `uv run --with openai ...` to avoid missing SDK errors.
- Do **not** create one-off runners unless the user explicitly asks.
- **Never modify** `scripts/sora.py` unless the user asks.
## Defaults (unless overridden by flags)
- Model: `sora-2`
- Size: `1280x720`
- Seconds: `4` (API expects a string enum: "4", "8", "12")
- Variant: `video`
- Poll interval: `10` seconds
## JSON output (`--json-out`)
- For `create`, `status`, `list`, `delete`, `poll`, and `remix`, `--json-out` writes the JSON response to a file.
- For `create-and-poll`, `--json-out` writes a bundle: `{ "create": ..., "final": ... }`.
- If the path has no extension, `.json` is added automatically.
- In `--dry-run`, `--json-out` writes the request preview instead of a response.
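The "no extension → `.json` is added" behavior is a one-liner; a sketch of the documented convention (not the CLI's actual code):

```python
import os

def with_json_ext(path):
    """If the path has no extension, append .json (mirrors --json-out)."""
    _root, ext = os.path.splitext(path)
    return path if ext else path + ".json"
```

`os.path.splitext` only considers dots in the final path component, so a dotted directory like `out.d/status` still gets `.json` appended.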
## Input reference images
- Must be jpg/png/webp; they should match the target size.
- Provide the path with `--input-reference`.
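A quick client-side extension check saves a round trip before uploading a reference image. Sketch only; `.jpeg` is included here as an assumed alias for `.jpg`:

```python
from pathlib import Path

ALLOWED_REF_EXTS = {".jpg", ".jpeg", ".png", ".webp"}  # .jpeg assumed equivalent to .jpg

def check_input_reference(path):
    """Raise if the reference image extension is not jpg/png/webp."""
    ext = Path(path).suffix.lower()
    if ext not in ALLOWED_REF_EXTS:
        raise ValueError(f"input reference must be jpg/png/webp, got {ext or 'no extension'}")
```

This only checks the filename; matching the target size still has to be verified separately.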
## Optional deps
Prefer `uv run --with ...` for an out-of-the-box run without changing the current project env; otherwise install into your active env:
```
uv pip install openai
```
## JSONL schema for `create-batch`
Each line is a JSON object (or a raw prompt string). Required key: `prompt`.
Top-level override keys:
- `model`, `size`, `seconds`
- `input_reference` (path)
- `out` (optional output filename for the job JSON)
Prompt augmentation keys (top-level or under `fields`):
- `use_case`, `scene`, `subject`, `action`, `camera`, `style`, `lighting`, `palette`, `audio`, `dialogue`, `text`, `timing`, `constraints`, `negative`
Notes:
- `fields` merges into the prompt augmentation inputs.
- Top-level keys override CLI defaults.
- `seconds` must be one of: "4", "8", "12".
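The line-handling rules above (object or raw prompt string, top-level keys override defaults, string-enum seconds) can be sketched as a normalizer. Illustrative only; `fields` is left intact for the augmenter rather than merged here:

```python
import json

def parse_batch_line(line, defaults=None):
    """Normalize one JSONL line into a job dict.

    Raw strings become {"prompt": ...}; objects are merged over `defaults`
    so top-level keys override CLI defaults, per the schema notes.
    """
    defaults = dict(defaults or {})
    raw = line.strip()
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        obj = raw  # not JSON: treat the whole line as a raw prompt string
    if isinstance(obj, str):
        obj = {"prompt": obj}
    if not isinstance(obj, dict) or "prompt" not in obj:
        raise ValueError("each batch line needs a 'prompt'")
    job = {**defaults, **obj}
    if "seconds" in job:
        job["seconds"] = str(job["seconds"])
        if job["seconds"] not in {"4", "8", "12"}:
            raise ValueError("seconds must be '4', '8', or '12'")
    return job
```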
## Common recipes
Create with prompt augmentation fields:
```
uv run --with openai python "$SORA_CLI" create \
--prompt "A minimal product teaser shot of a matte black camera" \
--use-case "landing page hero" \
--camera "85mm, slow orbit" \
--lighting "soft key, subtle rim" \
--constraints "no logos, no text"
```
Two-variant workflow (base + remix):
```
# 1) Base clip
uv run --with openai python "$SORA_CLI" create-and-poll \
--prompt "Ceramic mug on a sunlit wooden table in a cozy cafe" \
--size 1280x720 --seconds 4 --download --out output.mp4
# 2) Remix with invariant (same shot, change only the drink)
uv run --with openai python "$SORA_CLI" remix \
--id video_abc123 \
--prompt "Same shot and framing; replace the mug with an iced americano in a glass, visible ice and condensation."
# 3) Poll and download the remix
uv run --with openai python "$SORA_CLI" poll \
--id video_def456 --download --out remix.mp4
```
Poll and download after a job finishes:
```
uv run --with openai python "$SORA_CLI" poll --id video_abc123 --download --variant video --out out.mp4
```
Write JSON response to a file:
```
uv run --with openai python "$SORA_CLI" status --id video_abc123 --json-out out/status.json
```
Batch create (JSONL input):
```
mkdir -p tmp/sora
cat > tmp/sora/prompts.jsonl << 'EOB'
{"prompt":"A neon-lit rainy alley, slow dolly-in","seconds":"4"}
{"prompt":"A warm sunrise over a misty lake, gentle pan","seconds":"8","fields":{"camera":"35mm, slow pan","lighting":"soft dawn light"}}
EOB
uv run --with openai python "$SORA_CLI" create-batch --input tmp/sora/prompts.jsonl --out-dir out --concurrency 3
# Cleanup (recommended)
rm -f tmp/sora/prompts.jsonl
```
Notes:
- `create-batch` writes one JSON response per job under `--out-dir`.
- Output names default to `NNN-<prompt-slug>.json`.
- Use `--concurrency` to control parallelism (default `3`). Higher concurrency can hit rate limits.
- Treat the JSONL file as temporary: write it under `tmp/` and delete it after the run (do not commit it). If `rm` is blocked in your sandbox, skip cleanup or truncate the file.
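The `NNN-<prompt-slug>.json` naming default can be sketched as follows (an illustration of the convention, not the CLI's actual slug code):

```python
import re

def batch_output_name(index, prompt, max_slug=40):
    """Build the default NNN-<prompt-slug>.json name; max_slug is an assumed cap."""
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")[:max_slug].rstrip("-")
    return f"{index:03d}-{slug or 'job'}.json"
```

The `or 'job'` fallback keeps the name valid when a prompt contains no alphanumeric characters.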
## CLI notes
- Supported sizes depend on model (see `references/video-api.md`).
- Seconds are limited to 4, 8, or 12.
- Download URLs expire after about 1 hour; copy assets to your own storage.
- In CI/sandboxes where long-running commands time out, prefer `create` + `poll` (or add `--timeout`).
## See also
- API parameter quick reference: `references/video-api.md`
- Prompt structure and examples: `references/prompting.md`
- Sample prompts: `references/sample-prompts.md`
- Troubleshooting: `references/troubleshooting.md`
@@ -1,28 +0,0 @@
# Codex network approvals / sandbox notes
This guidance is intentionally isolated from `SKILL.md` because it can vary by environment and may become stale. Prefer the defaults in your environment when in doubt.
## Why am I asked to approve every video generation call?
Video generation uses the OpenAI Video API, so the CLI needs outbound network access. In many Codex setups, network access is disabled by default (especially under stricter sandbox modes), and/or the approval policy may require confirmation before networked commands run.
## How do I reduce repeated approval prompts (network)?
If you trust the repo and want fewer prompts, enable network access for the relevant sandbox mode and relax the approval policy.
Example `~/.codex/config.toml` pattern:
```
approval_policy = "never"
sandbox_mode = "workspace-write"
[sandbox_workspace_write]
network_access = true
```
Or for a single session:
```
codex --sandbox workspace-write --ask-for-approval never
```
## Safety note
Use caution: enabling network and disabling approvals reduces friction but increases risk if you run untrusted code or work in an untrusted repository.
@@ -1,137 +0,0 @@
# Prompting best practices (Sora)
## Contents
- [Mindset & tradeoffs](#mindset--tradeoffs)
- [API-controlled params](#api-controlled-params)
- [Structure](#structure)
- [Specificity](#specificity)
- [Style & visual cues](#style--visual-cues)
- [Camera & composition](#camera--composition)
- [Motion & timing](#motion--timing)
- [Lighting & palette](#lighting--palette)
- [Character continuity](#character-continuity)
- [Multi-shot prompts](#multi-shot-prompts)
- [Ultra-detailed briefs](#ultra-detailed-briefs)
- [Image input](#image-input)
- [Constraints & invariants](#constraints--invariants)
- [Text, dialogue & audio](#text-dialogue--audio)
- [Avoiding artifacts](#avoiding-artifacts)
- [Remixing](#remixing)
- [Iterate deliberately](#iterate-deliberately)
## Mindset & tradeoffs
- Treat the prompt like a cinematography brief, not a contract.
- The same prompt can yield different results; rerun for variants.
- Short prompts give more creative freedom; longer prompts give more control.
- Shorter clips tend to follow instructions better; consider stitching two 4s clips instead of a single 8s if precision matters.
## API-controlled params
- Model, size, and seconds are controlled by API params, not prose.
- Put desired duration in the `seconds` param; the prompt cannot make a clip longer.
## Structure
- Use short labeled lines; omit sections that do not matter.
- Keep one main subject and one main action.
- Put timing in beats or counts if it matters.
- If you prefer a prose-first template, use:
```
<Prose scene description in plain language. Describe subject, setting, time of day, and key visual details.>
Cinematography:
Camera shot: <framing + angle>
Mood: <tone>
Actions:
- <clear action beat>
- <clear action beat>
Dialogue:
<short lines if needed>
```
## Specificity
- Name the subject and materials (metal, fabric, glass).
- Use camera language (lens, angle, shot type) for stability.
- Describe the environment with time of day and atmosphere.
## Style & visual cues
- Set style early (e.g., "1970s film", "IMAX-scale", "16mm black-and-white").
- Use visible nouns and verbs, not vague adjectives.
- Weak: "A beautiful street at night."
- Strong: "Wet asphalt, zebra crosswalk, neon signs reflecting in puddles."
## Camera & composition
- Prefer one camera move: dolly, orbit, lateral slide, or locked-off.
- Straight-on framing is best for UI and text.
- For close-ups, use longer lenses (85mm+); for wide scenes, 24-35mm.
- Depth of field is a strong lever: shallow for subject isolation, deep for context.
- Example framings: wide establishing, medium close-up, aerial wide, low angle.
- Example camera motions: slow tilt, gentle handheld drift, smooth lateral slide.
## Motion & timing
- Use short beats: "0-2s", "2-4s", "4-6s".
- Keep actions sequential, not simultaneous.
- For 4s clips, limit to 1-2 beats.
- Describe actions as counts or steps when possible (e.g., "takes four steps, pauses, turns in the final second").
## Lighting & palette
- Describe light quality and direction (soft window light, hard rim, backlight).
- Name 3-5 palette anchors to stabilize color across shots.
- If continuity matters, keep lighting logic consistent across clips.
## Character continuity
- Keep character descriptors consistent across shots; reuse phrasing.
- Avoid mixing competing traits that can shift identity or pose.
## Multi-shot prompts
- You can describe multiple shots in one prompt, but keep each shot block distinct.
- For each shot, specify one camera setup, one action, one lighting recipe.
- Treat each shot as a creative unit you can later edit or stitch.
## Ultra-detailed briefs
- Use when you need a specific, filmic look or strict continuity.
- Call out format/look, lensing/filters, grade/palette, lighting direction, texture, and sound.
- If needed, include a short shot list with timing beats.
## Image input
- Use an input image to lock composition, character design, or set dressing.
- The input image should match the target size and be jpg/png/webp.
- The image anchors the first frame; the prompt describes what happens next.
- If you lack a reference, generate one first and pass it as `input_reference`.
## Constraints & invariants
- State what must not change: "same shot", "same framing", "keep background".
- Repeat invariants in every remix to reduce drift.
## Text, dialogue & audio
- Keep text short and specific; quote exact strings.
- Specify placement and avoid motion blur.
- For dialogue, use a dedicated block and keep lines short.
- Label speakers consistently for multi-character scenes.
- If silent, you can still add a small ambient sound cue to set rhythm.
- Sora can generate audio; include an `Audio:` line and a short dialogue block when needed.
- As a rule of thumb, 4s clips fit 1-2 short lines; 8s clips can handle a few more.
Example:
```
Audio: soft ambient café noise, clear warm voiceover
Dialogue:
<dialogue>
- Speaker: "Let's get started."
</dialogue>
```
## Avoiding artifacts
- Avoid multiple actions in 4-8 seconds.
- Keep camera motion smooth and limited.
- Add explicit negatives when needed: "avoid flicker", "avoid jitter", "no fast motion".
## Remixing
- Change one thing at a time: palette, lighting, or action.
- Keep camera and subject consistent unless the change requests otherwise.
- If a shot misfires, simplify: freeze the camera, reduce action, clear background, then add complexity back in.
## Iterate deliberately
- Start simple, then add one constraint per iteration.
- If results look chaotic, reduce motion and simplify the scene.
- When a result is close, pin it as a reference and describe only the tweak.
@@ -1,95 +0,0 @@
# Sample prompts (copy/paste)
Use these as starting points. Keep user-provided requirements and constraints; do not invent new creative elements.
For prompting principles (structure, invariants, iteration), see `references/prompting.md`.
## Contents
- [Product teaser (single shot)](#product-teaser-single-shot)
- [UI demo (screen recording style)](#ui-demo-screen-recording-style)
- [Cinematic detail shot](#cinematic-detail-shot)
- [Social ad (6s with beats)](#social-ad-6s-with-beats)
- [Motion graphics explainer](#motion-graphics-explainer)
- [Ambient loop (atmosphere)](#ambient-loop-atmosphere)
## Product teaser (single shot)
```
Use case: product teaser
Primary request: close-up of a matte black wireless speaker on a stone pedestal
Scene/background: dark studio cyclorama, subtle haze
Subject: compact speaker with soft fabric texture
Action: slow 20-degree orbit over 4 seconds
Camera: 85mm, shallow depth of field, steady dolly
Lighting/mood: soft key, gentle rim, premium studio feel
Color palette: charcoal, slate, warm amber accents
Constraints: no logos, no text
Avoid: harsh bloom; oversharpening; clutter
```
## UI demo (screen recording style)
```
Use case: UI product demo
Primary request: a clean mobile budgeting app demo showing a weekly spend chart
Scene/background: neutral gradient backdrop
Subject: smartphone UI, centered, screen content crisp and legible
Action: tap the "Add expense" button, modal opens, amount typed, save
Camera: locked-off, straight-on, no tilt
Lighting/mood: soft studio light, minimal reflections
Color palette: off-white, slate, mint accent
Text (verbatim): "Add expense", "$24.50", "Groceries"
Constraints: no brand logos; keep UI text readable; avoid motion blur
```
## Cinematic detail shot
```
Use case: cinematic product detail
Primary request: macro shot of raindrops sliding across a car hood
Scene/background: night city bokeh, soft rain mist
Subject: glossy hood surface with water beads
Action: slow push-in over 4 seconds
Camera: 100mm macro, shallow depth of field
Lighting/mood: moody, high-contrast reflections, soft speculars
Color palette: deep navy, teal, silver highlights
Constraints: no logos, no text
Avoid: flicker; unstable reflections; excessive noise
```
## Social ad (6s with beats)
```
Use case: social ad
Primary request: minimal coffee subscription ad with three quick beats
Scene/background: warm kitchen counter, morning light
Subject: ceramic mug, coffee bag, steam
Action: beat 1 (0-2s) pour coffee; beat 2 (2-4s) steam rises; beat 3 (4-6s) mug slides to center
Camera: 50mm, gentle handheld drift
Lighting/mood: warm, cozy, natural light
Text (verbatim): "Fresh roast" (top-left), "Weekly delivery" (bottom-right)
Constraints: no logos; text must be legible; avoid fast motion
```
## Motion graphics explainer
```
Use case: explainer clip
Primary request: clean motion-graphics animation showing data flowing into a dashboard
Scene/background: soft gradient background
Subject: abstract nodes and lines, simple dashboard cards
Action: nodes connect, data pulses, cards fill with charts
Camera: locked-off, no depth, flat design
Lighting/mood: minimal, modern
Color palette: off-white, graphite, teal, coral accents
Constraints: no logos; keep shapes simple; avoid heavy texture
```
## Ambient loop (atmosphere)
```
Use case: ambient background loop
Primary request: fog drifting through a pine forest at dawn
Scene/background: tall pines, soft fog layers, distant hills
Subject: drifting fog and light rays
Action: slow lateral drift, subtle light change
Camera: wide, locked-off, no tilt
Lighting/mood: calm, soft dawn light
Color palette: muted greens, cool gray, pale gold
Constraints: no text, no logos, no people
Avoid: fast motion; flicker; abrupt lighting shifts
```
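Each template above is a plain field-per-line block, so it can also be assembled programmatically instead of hand-edited. A minimal sketch (the function name and field order are illustrative, mirroring the templates; only non-empty fields are emitted):

```python
def build_prompt(primary_request, **fields):
    """Assemble a field-per-line video prompt in template order."""
    order = [
        ("scene", "Scene/background"),
        ("subject", "Subject"),
        ("action", "Action"),
        ("camera", "Camera"),
        ("lighting", "Lighting/mood"),
        ("palette", "Color palette"),
        ("text", "Text (verbatim)"),
        ("constraints", "Constraints"),
        ("negative", "Avoid"),
    ]
    lines = []
    # "Use case" leads the block when present, then the primary request.
    if fields.get("use_case"):
        lines.append(f"Use case: {fields['use_case']}")
    lines.append(f"Primary request: {primary_request}")
    for key, label in order:
        if fields.get(key):
            lines.append(f"{label}: {fields[key]}")
    return "\n".join(lines)

print(build_prompt(
    "fog drifting through a pine forest at dawn",
    use_case="ambient background loop",
    camera="wide, locked-off, no tilt",
    constraints="no text, no logos, no people",
))
```

The output matches the ambient-loop template above, minus the fields left unset.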


@@ -1,42 +0,0 @@
# Social ad templates (4-8s)
Short clips work best with clear beats. Use 2-3 beats and keep text minimal.
## Default template
```
Use case: social ad
Primary request: <ad concept>
Scene/background: <simple backdrop>
Subject: <product or scene>
Action: beat 1 (0-2s) <action>; beat 2 (2-4s) <action>; beat 3 (4-6s) <action>
Camera: <shot type + motion>
Lighting/mood: <mood>
Text (verbatim): "<short headline>", "<short subhead>"
Constraints: no logos; keep text legible; avoid fast motion
```
## Example: product benefit
```
Use case: social ad
Primary request: a compact humidifier emphasizing quiet operation
Scene/background: minimal bedroom nightstand
Subject: matte white humidifier with soft vapor
Action: beat 1 (0-2s) vapor begins; beat 2 (2-4s) soft glow turns on; beat 3 (4-6s) device slides to center
Camera: 50mm, gentle push-in
Lighting/mood: calm, warm night light
Text (verbatim): "Quiet mist", "Sleep better"
Constraints: no logos; text must be legible; avoid harsh highlights
```
## Example: before/after
```
Use case: social ad
Primary request: before/after of a cluttered desk becoming tidy
Scene/background: home office desk, neutral wall
Subject: desk surface, organizer tray
Action: beat 1 (0-2s) cluttered desk; beat 2 (2-4s) quick tidy motion; beat 3 (4-6s) clean desk with organizer
Camera: top-down, locked-off
Lighting/mood: soft daylight
Text (verbatim): "Before", "After"
Constraints: no logos; keep motion minimal; avoid blur
```


@@ -1,58 +0,0 @@
# Troubleshooting
## Job fails with size or seconds errors
- Cause: size not supported by model, or seconds not in 4/8/12.
- Fix: match size to model; use only "4", "8", or "12" seconds (see `references/video-api.md`).
- If you see `invalid_type` for seconds, update `scripts/sora.py` or pass a string value for `--seconds`.
## openai SDK not installed
- Cause: running `python "$SORA_CLI" ...` without the OpenAI SDK available.
- Fix: run with `uv run --with openai python "$SORA_CLI" ...` instead of using pip directly.
## uv cache permission error
- Cause: uv cache directory is not writable in CI or sandboxed environments.
- Fix: set `UV_CACHE_DIR=/tmp/uv-cache` (or another writable path) before running `uv`.
## Prompt shell escaping issues
- Cause: multi-line prompts or quotes break the shell.
- Fix: use `--prompt-file prompt.txt` (see `references/cli.md` for an example).
## Prompt looks double-wrapped ("Primary request: Use case: ...")
- Cause: you structured the prompt manually but left CLI augmentation on.
- Fix: add `--no-augment` when passing a structured prompt file, or use the CLI fields (`--use-case`, `--scene`, etc.) instead of pre-formatting.
## Input reference rejected
- Cause: file is not jpg/png/webp, or has a human face, or dimensions do not match target size.
- Fix: convert to jpg/png/webp, remove faces, and resize to match `--size`.
## Download fails or returns expired URL
- Cause: download URLs expire after about 1 hour.
- Fix: re-download while the link is fresh; save to your own storage.
## Video completes but looks unstable or flickers
- Cause: multiple actions or aggressive camera motion.
- Fix: reduce to one main action and one camera move; keep beats simple; add constraints like "avoid flicker" or "stable motion".
## Text is unreadable
- Cause: text too long, too small, or moving.
- Fix: shorten text, increase size, keep camera locked-off, and avoid fast motion.
## Remix drifts from the original
- Cause: too many changes requested at once.
- Fix: state invariants explicitly ("same shot and camera move") and change only one element per remix.
## Job stuck in queued/in_progress for a long time
- Cause: temporary queue delays.
- Fix: increase poll timeout, or retry later; avoid high concurrency if you are rate-limited.
## create-and-poll times out in CI/sandbox
- Cause: long-running CLI commands can exceed CI time limits.
- Fix: run `create` (capture the ID) and then `poll` separately, or set `--timeout`.
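A sketch of the split workflow (the CLI invocations are commented out since they require credentials; the id extraction works on any create response saved with `--json-out`, and the stand-in JSON below is only there to make the extraction demonstrable):

```shell
# Step 1 (commented): create the job and save the response.
#   uv run --with openai python scripts/sora.py create --prompt "..." --json-out create.json
# Stand-in response in place of a real create.json:
printf '{"id": "video_123", "status": "queued"}' > create.json
VIDEO_ID=$(python3 -c 'import json; print(json.load(open("create.json"))["id"])')
echo "$VIDEO_ID"
# Step 2 (commented): poll separately with its own timeout.
#   uv run --with openai python scripts/sora.py poll --id "$VIDEO_ID" --timeout 1800 --download
```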
## Audio or voiceover missing / incorrect
- Cause: audio wasn't explicitly requested, or the dialogue/audio cue was too long or vague.
- Fix: add a clear `Audio:` line and a short `Dialogue:` block.
## Cleanup blocked by sandbox policy
- Cause: some environments block `rm`.
- Fix: skip cleanup, or truncate files instead of deleting.


@@ -1,45 +0,0 @@
# Sora Video API quick reference
Keep this file short; the full docs live in the OpenAI platform docs.
## Models
- sora-2: faster, flexible iteration
- sora-2-pro: higher fidelity, slower, more expensive
## Sizes (by model)
- sora-2: 1280x720, 720x1280
- sora-2-pro: 1280x720, 720x1280, 1024x1792, 1792x1024
Note: higher resolutions generally yield better detail, texture, and motion consistency.
## Duration
- seconds: "4", "8", "12" (string enum; set via API param; prose will not change clip length)
Shorter clips tend to follow instructions more reliably; consider stitching multiple 4s clips for precision.
## Input reference
- Optional `input_reference` image (jpg/png/webp).
- Input reference should match the target size.
## Jobs and status
- Create is async. Status values: queued, in_progress, completed, failed.
- Prefer polling every 10-20s or use webhooks in production.
## Endpoints (conceptual)
- POST /videos: create a job
- GET /videos/{id}: retrieve status
- GET /videos/{id}/content: download video data
- GET /videos: list
- DELETE /videos/{id}: delete
- POST /videos/{id}/remix: remix a completed job
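The status lifecycle above maps onto a simple poll loop. A generic sketch, not the SDK's API: `fetch_status` stands in for a `GET /videos/{id}` call, and the injectable `sleep`/`clock` parameters exist only so the loop can be tested without waiting:

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "canceled"}

def poll_until_terminal(fetch_status, interval=15.0, timeout=1800.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() every `interval` seconds until a terminal status or timeout."""
    start = clock()
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() - start > timeout:
            raise TimeoutError(f"gave up after {timeout:.0f}s (last status: {status})")
        sleep(interval)
```

In production, webhooks avoid the polling loop entirely; this shape is only for scripts and CI.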
## Download variants
- video (mp4)
- thumbnail (webp)
- spritesheet (jpg)
Download URLs expire after about 1 hour; copy files to your own storage for retention.
## Guardrails (content restrictions)
- Only content suitable for audiences under 18
- No copyrighted characters or copyrighted music
- No real people (including public figures)
- Input images with human faces are currently rejected


@@ -1,970 +0,0 @@
#!/usr/bin/env python3
"""Create and manage Sora videos with the OpenAI Video API.
Defaults to sora-2 and a structured prompt augmentation workflow.
"""
from __future__ import annotations
import argparse
import asyncio
import json
import os
from pathlib import Path
import re
import sys
import time
from typing import Any, Dict, Iterable, List, Optional, Tuple, Union
DEFAULT_MODEL = "sora-2"
DEFAULT_SIZE = "1280x720"
DEFAULT_SECONDS = "4"
DEFAULT_POLL_INTERVAL = 10.0
DEFAULT_VARIANT = "video"
DEFAULT_CONCURRENCY = 3
DEFAULT_MAX_ATTEMPTS = 3
ALLOWED_MODELS = {"sora-2", "sora-2-pro"}
ALLOWED_SIZES_SORA2 = {"1280x720", "720x1280"}
ALLOWED_SIZES_SORA2_PRO = {"1280x720", "720x1280", "1024x1792", "1792x1024"}
ALLOWED_SECONDS = {"4", "8", "12"}
ALLOWED_VARIANTS = {"video", "thumbnail", "spritesheet"}
ALLOWED_ORDERS = {"asc", "desc"}
ALLOWED_INPUT_EXTS = {".jpg", ".jpeg", ".png", ".webp"}
TERMINAL_STATUSES = {"completed", "failed", "canceled"}
VARIANT_EXTENSIONS = {"video": ".mp4", "thumbnail": ".webp", "spritesheet": ".jpg"}
MAX_BATCH_JOBS = 200
def _die(message: str, code: int = 1) -> None:
print(f"Error: {message}", file=sys.stderr)
raise SystemExit(code)
def _warn(message: str) -> None:
print(f"Warning: {message}", file=sys.stderr)
def _ensure_api_key(dry_run: bool) -> None:
if os.getenv("OPENAI_API_KEY"):
print("OPENAI_API_KEY is set.", file=sys.stderr)
return
if dry_run:
_warn("OPENAI_API_KEY is not set; dry-run only.")
return
_die("OPENAI_API_KEY is not set. Export it before running.")
def _read_prompt(prompt: Optional[str], prompt_file: Optional[str]) -> str:
if prompt and prompt_file:
_die("Use --prompt or --prompt-file, not both.")
if prompt_file:
path = Path(prompt_file)
if not path.exists():
_die(f"Prompt file not found: {path}")
return path.read_text(encoding="utf-8").strip()
if prompt:
return prompt.strip()
_die("Missing prompt. Use --prompt or --prompt-file.")
return "" # unreachable
def _normalize_model(model: Optional[str]) -> str:
value = (model or DEFAULT_MODEL).strip().lower()
if value not in ALLOWED_MODELS:
_die("model must be one of: sora-2, sora-2-pro")
return value
def _normalize_size(size: Optional[str], model: str) -> str:
value = (size or DEFAULT_SIZE).strip().lower()
allowed = ALLOWED_SIZES_SORA2 if model == "sora-2" else ALLOWED_SIZES_SORA2_PRO
if value not in allowed:
allowed_list = ", ".join(sorted(allowed))
_die(f"size must be one of: {allowed_list} for model {model}")
return value
def _normalize_seconds(seconds: Optional[Union[int, str]]) -> str:
if seconds is None:
value = DEFAULT_SECONDS
elif isinstance(seconds, int):
value = str(seconds)
else:
value = str(seconds).strip()
if value not in ALLOWED_SECONDS:
_die("seconds must be one of: 4, 8, 12")
return value
def _normalize_variant(variant: Optional[str]) -> str:
value = (variant or DEFAULT_VARIANT).strip().lower()
if value not in ALLOWED_VARIANTS:
_die("variant must be one of: video, thumbnail, spritesheet")
return value
def _normalize_order(order: Optional[str]) -> Optional[str]:
if order is None:
return None
value = order.strip().lower()
if value not in ALLOWED_ORDERS:
_die("order must be one of: asc, desc")
return value
def _normalize_poll_interval(interval: Optional[float]) -> float:
value = float(interval if interval is not None else DEFAULT_POLL_INTERVAL)
if value <= 0:
_die("poll-interval must be > 0")
return value
def _normalize_timeout(timeout: Optional[float]) -> Optional[float]:
if timeout is None:
return None
value = float(timeout)
if value <= 0:
_die("timeout must be > 0")
return value
def _default_out_path(variant: str) -> Path:
if variant == "video":
return Path("video.mp4")
if variant == "thumbnail":
return Path("thumbnail.webp")
return Path("spritesheet.jpg")
def _normalize_out_path(out: Optional[str], variant: str) -> Path:
expected_ext = VARIANT_EXTENSIONS[variant]
if not out:
return _default_out_path(variant)
path = Path(out)
if path.suffix == "":
return path.with_suffix(expected_ext)
if path.suffix.lower() != expected_ext:
_warn(f"Output extension {path.suffix} does not match {expected_ext} for {variant}.")
return path
def _normalize_json_out(out: Optional[str], default_name: str) -> Optional[Path]:
if not out:
return None
raw = str(out)
if raw.endswith("/") or raw.endswith(os.sep):
return Path(raw) / default_name
path = Path(out)
if path.exists() and path.is_dir():
return path / default_name
if path.suffix == "":
path = path.with_suffix(".json")
return path
def _open_input_reference(path: Optional[str]):
if not path:
return _NullContext()
p = Path(path)
if not p.exists():
_die(f"Input reference not found: {p}")
if p.suffix.lower() not in ALLOWED_INPUT_EXTS:
_warn("Input reference should be jpeg, png, or webp.")
return _SingleFile(p)
def _create_client():
try:
from openai import OpenAI
except ImportError:
_die("openai SDK not installed. Run with `uv run --with openai` or install with `uv pip install openai`.")
return OpenAI()
def _create_async_client():
try:
from openai import AsyncOpenAI
except ImportError:
try:
import openai as _openai # noqa: F401
except ImportError:
_die("openai SDK not installed. Run with `uv run --with openai` or install with `uv pip install openai`.")
_die(
"AsyncOpenAI not available in this openai SDK version. Upgrade with `uv pip install -U openai`."
)
return AsyncOpenAI()
def _to_dict(obj: Any) -> Any:
if isinstance(obj, dict):
return obj
if hasattr(obj, "model_dump"):
return obj.model_dump()
if hasattr(obj, "dict"):
return obj.dict()
if hasattr(obj, "__dict__"):
return obj.__dict__
return obj
def _print_json(obj: Any) -> None:
print(json.dumps(_to_dict(obj), indent=2, sort_keys=True))
def _print_request(payload: Dict[str, Any]) -> None:
print(json.dumps(payload, indent=2, sort_keys=True))
def _slugify(value: str) -> str:
value = value.strip().lower()
value = re.sub(r"[^a-z0-9]+", "-", value)
value = re.sub(r"-{2,}", "-", value).strip("-")
return value[:60] if value else "job"
def _normalize_job(job: Any, idx: int) -> Dict[str, Any]:
if isinstance(job, str):
prompt = job.strip()
if not prompt:
_die(f"Empty prompt at job {idx}")
return {"prompt": prompt}
if isinstance(job, dict):
if "prompt" not in job or not str(job["prompt"]).strip():
_die(f"Missing prompt for job {idx}")
return job
_die(f"Invalid job at index {idx}: expected string or object.")
return {} # unreachable
def _read_jobs_jsonl(path: str) -> List[Dict[str, Any]]:
p = Path(path)
if not p.exists():
_die(f"Input file not found: {p}")
jobs: List[Dict[str, Any]] = []
for line_no, raw in enumerate(p.read_text(encoding="utf-8").splitlines(), start=1):
line = raw.strip()
if not line or line.startswith("#"):
continue
try:
item: Any
if line.startswith("{"):
item = json.loads(line)
else:
item = line
jobs.append(_normalize_job(item, idx=line_no))
except json.JSONDecodeError as exc:
_die(f"Invalid JSON on line {line_no}: {exc}")
if not jobs:
_die("No jobs found in input file.")
if len(jobs) > MAX_BATCH_JOBS:
_die(f"Too many jobs ({len(jobs)}). Max is {MAX_BATCH_JOBS}.")
return jobs
def _merge_non_null(dst: Dict[str, Any], src: Dict[str, Any]) -> Dict[str, Any]:
merged = dict(dst)
for k, v in src.items():
if v is not None:
merged[k] = v
return merged
def _job_output_path(out_dir: Path, idx: int, prompt: str, explicit_out: Optional[str]) -> Path:
out_dir.mkdir(parents=True, exist_ok=True)
if explicit_out:
path = Path(explicit_out)
if path.suffix == "":
path = path.with_suffix(".json")
return out_dir / path.name
slug = _slugify(prompt[:80])
return out_dir / f"{idx:03d}-{slug}.json"
def _extract_retry_after_seconds(exc: Exception) -> Optional[float]:
for attr in ("retry_after", "retry_after_seconds"):
val = getattr(exc, attr, None)
if isinstance(val, (int, float)) and val >= 0:
return float(val)
msg = str(exc)
m = re.search(r"retry[- ]after[:= ]+([0-9]+(?:\.[0-9]+)?)", msg, re.IGNORECASE)
if m:
try:
return float(m.group(1))
except Exception:
return None
return None
def _is_rate_limit_error(exc: Exception) -> bool:
name = exc.__class__.__name__.lower()
if "ratelimit" in name or "rate_limit" in name:
return True
msg = str(exc).lower()
return "429" in msg or "rate limit" in msg or "too many requests" in msg
def _is_transient_error(exc: Exception) -> bool:
if _is_rate_limit_error(exc):
return True
name = exc.__class__.__name__.lower()
if "timeout" in name or "timedout" in name or "tempor" in name:
return True
msg = str(exc).lower()
return "timeout" in msg or "timed out" in msg or "connection reset" in msg
def _fields_from_args(args: argparse.Namespace) -> Dict[str, Optional[str]]:
return {
"use_case": getattr(args, "use_case", None),
"scene": getattr(args, "scene", None),
"subject": getattr(args, "subject", None),
"action": getattr(args, "action", None),
"camera": getattr(args, "camera", None),
"style": getattr(args, "style", None),
"lighting": getattr(args, "lighting", None),
"palette": getattr(args, "palette", None),
"audio": getattr(args, "audio", None),
"dialogue": getattr(args, "dialogue", None),
"text": getattr(args, "text", None),
"timing": getattr(args, "timing", None),
"constraints": getattr(args, "constraints", None),
"negative": getattr(args, "negative", None),
}
def _augment_prompt_fields(augment: bool, prompt: str, fields: Dict[str, Optional[str]]) -> str:
if not augment:
return prompt
sections: List[str] = []
if fields.get("use_case"):
sections.append(f"Use case: {fields['use_case']}")
sections.append(f"Primary request: {prompt}")
if fields.get("scene"):
sections.append(f"Scene/background: {fields['scene']}")
if fields.get("subject"):
sections.append(f"Subject: {fields['subject']}")
if fields.get("action"):
sections.append(f"Action: {fields['action']}")
if fields.get("camera"):
sections.append(f"Camera: {fields['camera']}")
if fields.get("lighting"):
sections.append(f"Lighting/mood: {fields['lighting']}")
if fields.get("palette"):
sections.append(f"Color palette: {fields['palette']}")
if fields.get("style"):
sections.append(f"Style/format: {fields['style']}")
if fields.get("timing"):
sections.append(f"Timing/beats: {fields['timing']}")
if fields.get("audio"):
sections.append(f"Audio: {fields['audio']}")
if fields.get("text"):
sections.append(f"Text (verbatim): \"{fields['text']}\"")
if fields.get("dialogue"):
dialogue = fields["dialogue"].strip()
sections.append("Dialogue:\n<dialogue>\n" + dialogue + "\n</dialogue>")
if fields.get("constraints"):
sections.append(f"Constraints: {fields['constraints']}")
if fields.get("negative"):
sections.append(f"Avoid: {fields['negative']}")
return "\n".join(sections)
def _augment_prompt(args: argparse.Namespace, prompt: str) -> str:
fields = _fields_from_args(args)
return _augment_prompt_fields(args.augment, prompt, fields)
def _get_status(video: Any) -> Optional[str]:
if isinstance(video, dict):
for key in ("status", "state"):
if key in video and isinstance(video[key], str):
return video[key]
data = video.get("data") if isinstance(video.get("data"), dict) else None
if data:
for key in ("status", "state"):
if key in data and isinstance(data[key], str):
return data[key]
return None
for key in ("status", "state"):
val = getattr(video, key, None)
if isinstance(val, str):
return val
return None
def _get_video_id(video: Any) -> Optional[str]:
if isinstance(video, dict):
if isinstance(video.get("id"), str):
return video["id"]
data = video.get("data") if isinstance(video.get("data"), dict) else None
if data and isinstance(data.get("id"), str):
return data["id"]
return None
vid = getattr(video, "id", None)
return vid if isinstance(vid, str) else None
def _poll_video(
client: Any,
video_id: str,
*,
poll_interval: float,
timeout: Optional[float],
) -> Any:
start = time.time()
last_status: Optional[str] = None
while True:
video = client.videos.retrieve(video_id)
status = _get_status(video) or "unknown"
if status != last_status:
print(f"Status: {status}", file=sys.stderr)
last_status = status
if status in TERMINAL_STATUSES:
return video
if timeout is not None and (time.time() - start) > timeout:
_die(f"Timed out after {timeout:.1f}s waiting for {video_id}")
time.sleep(poll_interval)
def _download_content(client: Any, video_id: str, variant: str) -> Any:
content = client.videos.download_content(video_id, variant=variant)
if hasattr(content, "write_to_file"):
return content
if hasattr(content, "read"):
return content.read()
if isinstance(content, (bytes, bytearray)):
return bytes(content)
if hasattr(content, "content"):
return content.content
return content
def _write_download(data: Any, out_path: Path, *, force: bool) -> None:
if out_path.exists() and not force:
_die(f"Output exists: {out_path} (use --force to overwrite)")
if hasattr(data, "write_to_file"):
data.write_to_file(out_path)
print(f"Wrote {out_path}")
return
if hasattr(data, "read"):
out_path.write_bytes(data.read())
print(f"Wrote {out_path}")
return
out_path.write_bytes(data)
print(f"Wrote {out_path}")
def _build_create_payload(args: argparse.Namespace, prompt: str) -> Dict[str, Any]:
model = _normalize_model(args.model)
size = _normalize_size(args.size, model)
seconds = _normalize_seconds(args.seconds)
return {
"model": model,
"prompt": prompt,
"size": size,
"seconds": seconds,
}
def _prepare_job_payload(
args: argparse.Namespace,
job: Dict[str, Any],
base_fields: Dict[str, Optional[str]],
base_payload: Dict[str, Any],
) -> Tuple[Dict[str, Any], Optional[str], str]:
prompt = str(job["prompt"]).strip()
fields = _merge_non_null(base_fields, job.get("fields", {}))
fields = _merge_non_null(fields, {k: job.get(k) for k in base_fields.keys()})
augmented = _augment_prompt_fields(args.augment, prompt, fields)
payload = dict(base_payload)
payload["prompt"] = augmented
payload = _merge_non_null(payload, {k: job.get(k) for k in base_payload.keys()})
payload = {k: v for k, v in payload.items() if v is not None}
model = _normalize_model(payload.get("model"))
size = _normalize_size(payload.get("size"), model)
seconds = _normalize_seconds(payload.get("seconds"))
payload["model"] = model
payload["size"] = size
payload["seconds"] = seconds
input_ref = (
job.get("input_reference")
or job.get("input_reference_path")
or job.get("input_reference_file")
)
return payload, input_ref, prompt
def _write_json(path: Path, obj: Any) -> None:
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(json.dumps(_to_dict(obj), indent=2, sort_keys=True), encoding="utf-8")
print(f"Wrote {path}")
def _write_json_out(out_path: Optional[Path], obj: Any) -> None:
if out_path is None:
return
_write_json(out_path, obj)
async def _create_one_with_retries(
client: Any,
payload: Dict[str, Any],
*,
attempts: int,
job_label: str,
) -> Any:
last_exc: Optional[Exception] = None
for attempt in range(1, attempts + 1):
try:
return await client.videos.create(**payload)
except Exception as exc:
last_exc = exc
if not _is_transient_error(exc):
raise
if attempt == attempts:
raise
sleep_s = _extract_retry_after_seconds(exc)
if sleep_s is None:
sleep_s = min(60.0, 2.0**attempt)
print(
f"{job_label} attempt {attempt}/{attempts} failed ({exc.__class__.__name__}); retrying in {sleep_s:.1f}s",
file=sys.stderr,
)
await asyncio.sleep(sleep_s)
raise last_exc or RuntimeError("unknown error")
async def _run_create_batch(args: argparse.Namespace) -> int:
jobs = _read_jobs_jsonl(args.input)
out_dir = Path(args.out_dir)
base_fields = _fields_from_args(args)
base_payload = {
"model": args.model,
"size": args.size,
"seconds": args.seconds,
}
if args.dry_run:
for i, job in enumerate(jobs, start=1):
payload, input_ref, prompt = _prepare_job_payload(args, job, base_fields, base_payload)
out_path = _job_output_path(out_dir, i, prompt, job.get("out"))
preview = dict(payload)
if input_ref:
preview["input_reference"] = input_ref
_print_request(
{
"endpoint": "/v1/videos",
"job": i,
"output": str(out_path),
**preview,
}
)
return 0
client = _create_async_client()
sem = asyncio.Semaphore(args.concurrency)
any_failed = False
async def run_job(i: int, job: Dict[str, Any]) -> Tuple[int, Optional[str]]:
nonlocal any_failed
payload, input_ref, prompt = _prepare_job_payload(args, job, base_fields, base_payload)
job_label = f"[job {i}/{len(jobs)}]"
out_path = _job_output_path(out_dir, i, prompt, job.get("out"))
try:
async with sem:
print(f"{job_label} starting", file=sys.stderr)
started = time.time()
with _open_input_reference(input_ref) as ref:
request = dict(payload)
if ref is not None:
request["input_reference"] = ref
result = await _create_one_with_retries(
client,
request,
attempts=args.max_attempts,
job_label=job_label,
)
elapsed = time.time() - started
print(f"{job_label} completed in {elapsed:.1f}s", file=sys.stderr)
_write_json(out_path, result)
return i, None
except Exception as exc:
any_failed = True
print(f"{job_label} failed: {exc}", file=sys.stderr)
if args.fail_fast:
raise
return i, str(exc)
tasks = [asyncio.create_task(run_job(i, job)) for i, job in enumerate(jobs, start=1)]
try:
await asyncio.gather(*tasks)
except Exception:
for t in tasks:
if not t.done():
t.cancel()
raise
return 1 if any_failed else 0
def _create_batch(args: argparse.Namespace) -> None:
exit_code = asyncio.run(_run_create_batch(args))
if exit_code:
raise SystemExit(exit_code)
def _cmd_create(args: argparse.Namespace) -> int:
prompt = _read_prompt(args.prompt, args.prompt_file)
prompt = _augment_prompt(args, prompt)
payload = _build_create_payload(args, prompt)
json_out = _normalize_json_out(args.json_out, "create.json")
if args.dry_run:
preview = dict(payload)
if args.input_reference:
preview["input_reference"] = args.input_reference
_print_request({"endpoint": "/v1/videos", **preview})
_write_json_out(json_out, {"dry_run": True, "request": {"endpoint": "/v1/videos", **preview}})
return 0
client = _create_client()
with _open_input_reference(args.input_reference) as input_ref:
if input_ref is not None:
payload["input_reference"] = input_ref
video = client.videos.create(**payload)
_print_json(video)
_write_json_out(json_out, video)
return 0
def _cmd_create_and_poll(args: argparse.Namespace) -> int:
prompt = _read_prompt(args.prompt, args.prompt_file)
prompt = _augment_prompt(args, prompt)
payload = _build_create_payload(args, prompt)
json_out = _normalize_json_out(args.json_out, "create-and-poll.json")
if args.dry_run:
preview = dict(payload)
if args.input_reference:
preview["input_reference"] = args.input_reference
_print_request({"endpoint": "/v1/videos", **preview})
print("Would poll for completion.")
if args.download:
variant = _normalize_variant(args.variant)
out_path = _normalize_out_path(args.out, variant)
print(f"Would download variant={variant} to {out_path}")
if json_out:
dry_bundle: Dict[str, Any] = {
"dry_run": True,
"request": {"endpoint": "/v1/videos", **preview},
"poll": True,
}
if args.download:
dry_bundle["download"] = {
"variant": variant,
"out": str(out_path),
}
_write_json_out(json_out, dry_bundle)
return 0
client = _create_client()
with _open_input_reference(args.input_reference) as input_ref:
if input_ref is not None:
payload["input_reference"] = input_ref
video = client.videos.create(**payload)
_print_json(video)
video_id = _get_video_id(video)
if not video_id:
_die("Could not determine video id from create response.")
poll_interval = _normalize_poll_interval(args.poll_interval)
timeout = _normalize_timeout(args.timeout)
final_video = _poll_video(
client,
video_id,
poll_interval=poll_interval,
timeout=timeout,
)
_print_json(final_video)
if args.download:
status = _get_status(final_video) or "unknown"
if status != "completed":
_die(f"Video status is {status}; download is available only after completion.")
variant = _normalize_variant(args.variant)
out_path = _normalize_out_path(args.out, variant)
data = _download_content(client, video_id, variant)
_write_download(data, out_path, force=args.force)
if json_out:
_write_json_out(
json_out,
{"create": _to_dict(video), "final": _to_dict(final_video)},
)
return 0
def _cmd_poll(args: argparse.Namespace) -> int:
poll_interval = _normalize_poll_interval(args.poll_interval)
timeout = _normalize_timeout(args.timeout)
json_out = _normalize_json_out(args.json_out, "poll.json")
client = _create_client()
final_video = _poll_video(
client,
args.id,
poll_interval=poll_interval,
timeout=timeout,
)
_print_json(final_video)
_write_json_out(json_out, final_video)
if args.download:
status = _get_status(final_video) or "unknown"
if status != "completed":
_die(f"Video status is {status}; download is available only after completion.")
variant = _normalize_variant(args.variant)
out_path = _normalize_out_path(args.out, variant)
data = _download_content(client, args.id, variant)
_write_download(data, out_path, force=args.force)
return 0
def _cmd_status(args: argparse.Namespace) -> int:
json_out = _normalize_json_out(args.json_out, "status.json")
client = _create_client()
video = client.videos.retrieve(args.id)
_print_json(video)
_write_json_out(json_out, video)
return 0
def _cmd_list(args: argparse.Namespace) -> int:
params: Dict[str, Any] = {
"limit": args.limit,
"order": _normalize_order(args.order),
"after": args.after,
"before": args.before,
}
params = {k: v for k, v in params.items() if v is not None}
json_out = _normalize_json_out(args.json_out, "list.json")
client = _create_client()
videos = client.videos.list(**params)
_print_json(videos)
_write_json_out(json_out, videos)
return 0
def _cmd_delete(args: argparse.Namespace) -> int:
json_out = _normalize_json_out(args.json_out, "delete.json")
client = _create_client()
result = client.videos.delete(args.id)
_print_json(result)
_write_json_out(json_out, result)
return 0
def _cmd_remix(args: argparse.Namespace) -> int:
prompt = _read_prompt(args.prompt, args.prompt_file)
prompt = _augment_prompt(args, prompt)
json_out = _normalize_json_out(args.json_out, "remix.json")
if args.dry_run:
preview = {"endpoint": f"/v1/videos/{args.id}/remix", "prompt": prompt}
_print_request(preview)
_write_json_out(json_out, {"dry_run": True, "request": preview})
return 0
client = _create_client()
result = client.videos.remix(video_id=args.id, prompt=prompt)
_print_json(result)
_write_json_out(json_out, result)
return 0
def _cmd_download(args: argparse.Namespace) -> int:
variant = _normalize_variant(args.variant)
out_path = _normalize_out_path(args.out, variant)
client = _create_client()
data = _download_content(client, args.id, variant)
_write_download(data, out_path, force=args.force)
return 0
class _NullContext:
def __enter__(self):
return None
def __exit__(self, exc_type, exc, tb):
return False
class _SingleFile:
def __init__(self, path: Path):
self._path = path
self._handle = None
def __enter__(self):
self._handle = self._path.open("rb")
return self._handle
def __exit__(self, exc_type, exc, tb):
if self._handle:
try:
self._handle.close()
except Exception:
pass
return False
def _add_prompt_args(parser: argparse.ArgumentParser) -> None:
parser.add_argument("--prompt")
parser.add_argument("--prompt-file")
parser.add_argument("--augment", dest="augment", action="store_true")
parser.add_argument("--no-augment", dest="augment", action="store_false")
parser.set_defaults(augment=True)
parser.add_argument("--use-case")
parser.add_argument("--scene")
parser.add_argument("--subject")
parser.add_argument("--action")
parser.add_argument("--camera")
parser.add_argument("--style")
parser.add_argument("--lighting")
parser.add_argument("--palette")
parser.add_argument("--audio")
parser.add_argument("--dialogue")
parser.add_argument("--text")
parser.add_argument("--timing")
parser.add_argument("--constraints")
parser.add_argument("--negative")
def _add_create_args(parser: argparse.ArgumentParser) -> None:
parser.add_argument("--model", default=DEFAULT_MODEL)
parser.add_argument("--size", default=DEFAULT_SIZE)
parser.add_argument("--seconds", default=DEFAULT_SECONDS)
parser.add_argument("--input-reference")
parser.add_argument("--dry-run", action="store_true")
_add_prompt_args(parser)
def _add_poll_args(parser: argparse.ArgumentParser) -> None:
parser.add_argument("--poll-interval", type=float, default=DEFAULT_POLL_INTERVAL)
parser.add_argument("--timeout", type=float)
def _add_download_args(parser: argparse.ArgumentParser) -> None:
parser.add_argument("--download", action="store_true")
parser.add_argument("--variant", default=DEFAULT_VARIANT)
parser.add_argument("--out")
parser.add_argument("--force", action="store_true")
def _add_json_out(parser: argparse.ArgumentParser) -> None:
parser.add_argument("--json-out")


def main() -> int:
    """Build the CLI, validate shared numeric options, and dispatch."""
    parser = argparse.ArgumentParser(description="Create and manage videos via the Sora Video API")
    subparsers = parser.add_subparsers(dest="command", required=True)

    create_parser = subparsers.add_parser("create", help="Create a new video job")
    _add_create_args(create_parser)
    _add_json_out(create_parser)
    create_parser.set_defaults(func=_cmd_create)

    create_poll_parser = subparsers.add_parser(
        "create-and-poll",
        help="Create a job, poll until complete, optionally download",
    )
    _add_create_args(create_poll_parser)
    _add_poll_args(create_poll_parser)
    _add_download_args(create_poll_parser)
    _add_json_out(create_poll_parser)
    create_poll_parser.set_defaults(func=_cmd_create_and_poll)

    poll_parser = subparsers.add_parser("poll", help="Poll a job until it completes")
    poll_parser.add_argument("--id", required=True)
    _add_poll_args(poll_parser)
    _add_download_args(poll_parser)
    _add_json_out(poll_parser)
    poll_parser.set_defaults(func=_cmd_poll)

    status_parser = subparsers.add_parser("status", help="Retrieve a job status")
    status_parser.add_argument("--id", required=True)
    _add_json_out(status_parser)
    status_parser.set_defaults(func=_cmd_status)

    list_parser = subparsers.add_parser("list", help="List recent video jobs")
    list_parser.add_argument("--limit", type=int)
    list_parser.add_argument("--order")
    list_parser.add_argument("--after")
    list_parser.add_argument("--before")
    _add_json_out(list_parser)
    list_parser.set_defaults(func=_cmd_list)

    delete_parser = subparsers.add_parser("delete", help="Delete a video job")
    delete_parser.add_argument("--id", required=True)
    _add_json_out(delete_parser)
    delete_parser.set_defaults(func=_cmd_delete)

    remix_parser = subparsers.add_parser("remix", help="Remix a completed video job")
    remix_parser.add_argument("--id", required=True)
    remix_parser.add_argument("--dry-run", action="store_true")
    _add_prompt_args(remix_parser)
    _add_json_out(remix_parser)
    remix_parser.set_defaults(func=_cmd_remix)

    download_parser = subparsers.add_parser("download", help="Download video/thumbnail/spritesheet")
    download_parser.add_argument("--id", required=True)
    download_parser.add_argument("--variant", default=DEFAULT_VARIANT)
    download_parser.add_argument("--out")
    download_parser.add_argument("--force", action="store_true")
    download_parser.set_defaults(func=_cmd_download)

    batch_parser = subparsers.add_parser("create-batch", help="Create multiple video jobs (JSONL input)")
    _add_create_args(batch_parser)
    batch_parser.add_argument("--input", required=True, help="Path to JSONL file (one job per line)")
    batch_parser.add_argument("--out-dir", required=True)
    batch_parser.add_argument("--concurrency", type=int, default=DEFAULT_CONCURRENCY)
    batch_parser.add_argument("--max-attempts", type=int, default=DEFAULT_MAX_ATTEMPTS)
    batch_parser.add_argument("--fail-fast", action="store_true")
    batch_parser.set_defaults(func=_create_batch)

    args = parser.parse_args()

    # These options exist only on create-batch; the getattr defaults let
    # every other subcommand pass the range checks trivially.
    concurrency = getattr(args, "concurrency", 1)
    if not 1 <= concurrency <= 10:
        _die("--concurrency must be between 1 and 10")
    max_attempts = getattr(args, "max_attempts", DEFAULT_MAX_ATTEMPTS)
    if not 1 <= max_attempts <= 10:
        _die("--max-attempts must be between 1 and 10")
dry_run = bool(getattr(args, "dry_run", False))
_ensure_api_key(dry_run)
args.func(args)
return 0


if __name__ == "__main__":
    raise SystemExit(main())
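
# Example invocations (illustrative only; assumes this script is saved as
# sora_videos.py and OPENAI_API_KEY is exported -- the filename and the
# video id below are placeholders, not values defined by this script):
#
#   python3 sora_videos.py create --prompt "a fox at dawn" --dry-run
#   python3 sora_videos.py create-and-poll --prompt "a fox at dawn" \
#       --download --out fox.mp4
#   python3 sora_videos.py status --id <video-id>
#   python3 sora_videos.py download --id <video-id> --variant thumbnail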