mirror of
https://github.com/ksyasuda/dotfiles.git
synced 2026-03-21 06:11:27 -07:00
133 lines
5.2 KiB
Markdown
133 lines
5.2 KiB
Markdown
# CLI reference (`scripts/image_gen.py`)
|
||
|
||
This file contains the “command catalog” for the bundled image generation CLI. Keep `SKILL.md` as overview-first; put verbose CLI details here.
|
||
|
||
## What this CLI does
|
||
- `generate`: generate new images from a prompt
|
||
- `edit`: edit an existing image (optionally with a mask) — inpainting / background replacement / “change only X”
|
||
- `generate-batch`: run many jobs from a JSONL file (one job per line)
|
||
|
||
Real API calls require **network access** + `OPENAI_API_KEY`. `--dry-run` does not.
|
||
|
||
## Quick start (works from any repo)
|
||
Set a stable path to the skill CLI (default `CODEX_HOME` is `~/.codex`):
|
||
|
||
```
|
||
export CODEX_HOME="${CODEX_HOME:-$HOME/.codex}"
|
||
export IMAGE_GEN="$CODEX_HOME/skills/imagegen/scripts/image_gen.py"
|
||
```
|
||
|
||
Dry-run (no API call; no network required; does not require the `openai` package):
|
||
|
||
```
|
||
python "$IMAGE_GEN" generate --prompt "Test" --dry-run
|
||
```
|
||
|
||
Generate (requires `OPENAI_API_KEY` + network):
|
||
|
||
```
|
||
uv run --with openai python "$IMAGE_GEN" generate --prompt "A cozy alpine cabin at dawn" --size 1024x1024
|
||
```
|
||
|
||
No `uv` installed? Use your active Python env:
|
||
|
||
```
|
||
python "$IMAGE_GEN" generate --prompt "A cozy alpine cabin at dawn" --size 1024x1024
|
||
```
|
||
|
||
## Guardrails (important)
|
||
- Use `python "$IMAGE_GEN" ...` (or equivalent full path) for generations/edits/batch work.
|
||
- Do **not** create one-off runners (e.g. `gen_images.py`) unless the user explicitly asks for a custom wrapper.
|
||
- **Never modify** `scripts/image_gen.py`. If something is missing, ask the user before doing anything else.
|
||
|
||
## Defaults (unless overridden by flags)
|
||
- Model: `gpt-image-1.5`
|
||
- Size: `1024x1024`
|
||
- Quality: `auto`
|
||
- Output format: `png`
|
||
- Background: unspecified (API default). If you set `--background transparent`, also set `--output-format png` or `webp`.
|
||
|
||
## Quality + input fidelity
|
||
- `--quality` works for `generate`, `edit`, and `generate-batch`: `low|medium|high|auto`.
|
||
- `--input-fidelity` is **edit-only**: `low|high` (use `high` for strict edits like identity or layout lock).
|
||
|
||
Example:
|
||
```
|
||
python "$IMAGE_GEN" edit --image input.png --prompt "Change only the background" --quality high --input-fidelity high
|
||
```
|
||
|
||
## Masks (edits)
|
||
- Use a **PNG** mask; an alpha channel is strongly recommended.
|
||
- The mask should match the input image dimensions.
|
||
- In the edit prompt, repeat invariants (e.g., “change only the background; keep the subject unchanged”) to reduce drift.
|
||
|
||
## Optional deps
|
||
Prefer `uv run --with ...` for an out-of-the-box run without changing the current project env; otherwise install into your active env:
|
||
|
||
```
|
||
uv pip install openai
|
||
```
|
||
|
||
## Common recipes
|
||
|
||
Generate + also write a downscaled copy for fast web loading:
|
||
|
||
```
|
||
uv run --with openai --with pillow python "$IMAGE_GEN" generate \
|
||
--prompt "A cozy alpine cabin at dawn" \
|
||
--size 1024x1024 \
|
||
--downscale-max-dim 1024
|
||
```
|
||
|
||
Notes:
|
||
- Downscaling writes an extra file next to the original (default suffix `-web`, e.g. `output-web.png`).
|
||
- Downscaling requires Pillow (use `uv run --with pillow ...` or install it into your env).
|
||
|
||
Generate with augmentation fields:
|
||
|
||
```
|
||
python "$IMAGE_GEN" generate \
|
||
--prompt "A minimal hero image of a ceramic coffee mug" \
|
||
--use-case "landing page hero" \
|
||
--style "clean product photography" \
|
||
--composition "centered product, generous negative space" \
|
||
--constraints "no logos, no text"
|
||
```
|
||
|
||
Generate multiple prompts concurrently (async batch):
|
||
|
||
```
|
||
mkdir -p tmp/imagegen
|
||
cat > tmp/imagegen/prompts.jsonl << 'EOF'
|
||
{"prompt":"Cavernous hangar interior with a compact shuttle parked center-left, open bay door","use_case":"game concept art environment","composition":"wide-angle, low-angle, cinematic framing","lighting":"volumetric light rays through drifting fog","constraints":"no logos or trademarks; no watermark","size":"1536x1024"}
|
||
{"prompt":"Gray wolf in profile in a snowy forest, crisp fur texture","use_case":"wildlife photography print","composition":"100mm, eye-level, shallow depth of field","constraints":"no logos or trademarks; no watermark","size":"1024x1024"}
|
||
EOF
|
||
|
||
python "$IMAGE_GEN" generate-batch --input tmp/imagegen/prompts.jsonl --out-dir out --concurrency 5
|
||
|
||
# Cleanup (recommended)
|
||
rm -f tmp/imagegen/prompts.jsonl
|
||
```
|
||
|
||
Notes:
|
||
- Use `--concurrency` to control parallelism (default `5`). Higher concurrency can hit rate limits; the CLI retries on transient errors.
|
||
- Per-job overrides are supported in JSONL (e.g., `size`, `quality`, `background`, `output_format`, `n`, and prompt-augmentation fields).
|
||
- `--n` generates multiple variants for a single prompt; `generate-batch` is for many different prompts.
|
||
- Treat the JSONL file as temporary: write it under `tmp/` and delete it after the run (don’t commit it).
|
||
|
||
Edit:
|
||
|
||
```
|
||
python "$IMAGE_GEN" edit --image input.png --mask mask.png --prompt "Replace the background with a warm sunset"
|
||
```
|
||
|
||
## CLI notes
|
||
- Supported sizes: `1024x1024`, `1536x1024`, `1024x1536`, or `auto`.
|
||
- Transparent backgrounds require `output_format` to be `png` or `webp`.
|
||
- Default output is `output.png`; multiple images become `output-1.png`, `output-2.png`, etc.
|
||
- Use `--no-augment` to skip prompt augmentation.
|
||
|
||
## See also
|
||
- API parameter quick reference: `references/image-api.md`
|
||
- Prompt examples: `references/sample-prompts.md`
|