Compare commits

..

2 Commits

Author SHA1 Message Date
41cb43a916 update 2026-03-28 00:02:25 -07:00
31e52230a3 update 2026-03-27 18:13:42 -07:00
70 changed files with 57 additions and 13438 deletions

View File

@@ -1,220 +0,0 @@
---
name: autofix
description: Auto-fix CodeRabbit review comments - get CodeRabbit review comments from GitHub and fix them interactively or in batch
version: 0.1.0
triggers:
- coderabbit.?autofix
- coderabbit.?auto.?fix
- autofix.?coderabbit
- coderabbit.?fix
- fix.?coderabbit
- coderabbit.?review
- review.?coderabbit
- coderabbit.?issues?
- show.?coderabbit
- get.?coderabbit
- cr.?autofix
- cr.?fix
- cr.?review
---
# CodeRabbit Autofix
Fetch CodeRabbit review comments for your current branch's PR and fix them interactively or in batch.
## Prerequisites
### Required Tools
- `gh` (GitHub CLI) - [Installation guide](./github.md)
- `git`
Verify: `gh auth status`
### Required State
- Git repo on GitHub
- Current branch has open PR
- PR reviewed by CodeRabbit bot (`coderabbitai`, `coderabbit[bot]`, `coderabbitai[bot]`)
## Workflow
### Step 0: Load Repository Instructions (`AGENTS.md`)
Before any autofix actions, search for `AGENTS.md` in the current repository and load applicable instructions.
- If found, follow its build/lint/test/commit guidance throughout the run.
- If not found, continue with default workflow.
### Step 1: Check Code Push Status
Check: `git status` + check for unpushed commits
**If uncommitted changes:**
- Warn: "⚠️ Uncommitted changes won't be in CodeRabbit review"
- Ask: "Commit and push first?" → If yes: wait for user action, then continue
**If unpushed commits:**
- Warn: "⚠️ N unpushed commits. CodeRabbit hasn't reviewed them"
- Ask: "Push now?" → If yes: `git push`, inform "CodeRabbit will review in ~5 min", EXIT skill
**Otherwise:** Proceed to Step 2
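The checks in this step can be sketched in plain git. This is a minimal sketch, assuming the current branch has an upstream; `check_push_status` is a hypothetical helper, not part of the skill:

```shell
# Hypothetical helper sketching Step 1: report a dirty worktree,
# unpushed commits, or a clean state.
check_push_status() {
  if [ -n "$(git status --porcelain)" ]; then
    echo "dirty"      # uncommitted changes won't be in the CodeRabbit review
  elif [ "$(git rev-list --count @{upstream}..HEAD 2>/dev/null || echo 0)" -gt 0 ]; then
    echo "unpushed"   # commits CodeRabbit hasn't reviewed yet
  else
    echo "clean"
  fi
}

# Demo in a throwaway repo with a simulated remote:
tmp=$(mktemp -d)
git init -q --bare "$tmp/origin.git"
git init -q "$tmp/work"
cd "$tmp/work"
git config user.email you@example.com
git config user.name you
git remote add origin "$tmp/origin.git"
git commit -q --allow-empty -m init
git push -q -u origin HEAD
s1=$(check_push_status)                    # everything pushed → clean
git commit -q --allow-empty -m local-only
s2=$(check_push_status)                    # one local commit → unpushed
echo "$s1 $s2"
```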
### Step 2: Find Open PR
```bash
gh pr list --head "$(git branch --show-current)" --state open --json number,title
```
**If no PR:** Ask "Create PR?" → If yes: create PR (see [github.md § 5](./github.md#5-create-pr-if-needed)), inform "Run skill again in ~5 min", EXIT
### Step 3: Fetch Unresolved CodeRabbit Threads
Fetch PR review threads (see [github.md § 2](./github.md#2-fetch-unresolved-threads)):
- Threads: `gh api graphql ... pullRequest.reviewThreads ...` (see [github.md § 2](./github.md#2-fetch-unresolved-threads))
Filter to:
- unresolved threads only (`isResolved == false`)
- threads started by CodeRabbit bot (`coderabbitai`, `coderabbit[bot]`, `coderabbitai[bot]`)
**If review in progress:** Check for "Come back again in a few minutes" message → Inform "⏳ Review in progress, try again in a few minutes", EXIT
**If no unresolved CodeRabbit threads:** Inform "No unresolved CodeRabbit review threads found", EXIT
**For each selected thread:**
- Extract issue metadata from root comment
### Step 4: Parse and Display Issues
**Extract from each comment:**
1. **Header:** `_([^_]+)_ \| _([^_]+)_` → Issue type | Severity
2. **Description:** Main body text
3. **Agent prompt:** Content in `<details><summary>🤖 Prompt for AI Agents</summary>` (this is the fix instruction)
- If missing, use description as fallback
4. **Location:** File path and line numbers
**Map severity:**
- 🔴 Critical/High → CRITICAL (action required)
- 🟠 Medium → HIGH (review recommended)
- 🟡 Minor/Low → MEDIUM (review recommended)
- 🟢 Info/Suggestion → LOW (optional)
- 🔒 Security → Treat as high priority
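The extraction and mapping above can be sketched with `sed` and a `case` statement. This is a minimal sketch; the sample header strings are hypothetical:

```shell
# Parse "_<issue type>_ | _<severity>_" and map severity to an action level.
parse_header() {
  header=$1
  kind=$(printf '%s\n' "$header" | sed -E 's/^_([^_]+)_ [|] _([^_]+)_$/\1/')
  sev=$(printf '%s\n' "$header" | sed -E 's/^_([^_]+)_ [|] _([^_]+)_$/\2/')
  case $sev in
    *Critical*|*High*) level=CRITICAL ;;  # action required
    *Medium*)          level=HIGH ;;      # review recommended
    *Minor*|*Low*)     level=MEDIUM ;;    # review recommended
    *)                 level=LOW ;;       # info/suggestion
  esac
  echo "$kind -> $level"
}

parse_header '_Potential issue_ | _Medium_'
```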
**Display in CodeRabbit's original order** (already severity-ordered):
```
CodeRabbit Issues for PR #123: [PR Title]
| # | Severity | Issue Title | Location & Details | Type | Action |
|---|----------|-------------|-------------------|------|--------|
| 1 | 🔴 CRITICAL | Insecure authentication check | src/auth/service.py:42<br>Authorization logic inverted | 🐛 Bug 🔒 Security | Fix |
| 2 | 🟠 HIGH | Database query not awaited | src/db/repository.py:89<br>Async call missing await | 🐛 Bug | Fix |
```
### Step 5: Ask User for Fix Preference
Use AskUserQuestion:
- 🔍 "Review each issue" - Manual review and approval (recommended)
- ⚡ "Auto-fix all" - Apply all "Fix" issues without approval
- ❌ "Cancel" - Exit
**Route based on choice:**
- Review → Step 6
- Auto-fix → Step 7
- Cancel → EXIT
### Step 6: Manual Review Mode
For each "Fix" issue (CRITICAL first):
1. Read relevant files
2. **Execute CodeRabbit's agent prompt as direct instruction** (from "🤖 Prompt for AI Agents" section)
3. Calculate proposed fix (DO NOT apply yet)
4. **Show fix and ask approval in ONE step:**
- Issue title + location
- CodeRabbit's agent prompt (so user can verify)
- Current code
- Proposed diff
- AskUserQuestion: ✅ Apply fix | ⏭️ Defer | 🔧 Modify
**If "Apply fix":**
- Apply with Edit tool
- Track changed files for a single consolidated commit after all fixes
- Confirm: "✅ Fix applied"
**If "Defer":**
- Ask for reason (AskUserQuestion)
- Move to next
**If "Modify":**
- Inform user can make changes manually
- Move to next
### Step 7: Auto-Fix Mode
For each "Fix" issue (CRITICAL first):
1. Read relevant files
2. **Execute CodeRabbit's agent prompt as direct instruction**
3. Apply fix with Edit tool
4. Track changed files for one consolidated commit
5. Report:
> ✅ **Fixed: [Issue Title]** at `[Location]`
> **Agent prompt:** [prompt used]
After all fixes, display summary of fixed/skipped issues.
### Step 8: Create Single Consolidated Commit
If any fixes were applied:
```bash
git add <all-changed-files>
git commit -m "fix: apply CodeRabbit auto-fixes"
```
Use one commit for all applied fixes in this run.
### Step 9: Prompt Build/Lint Before Push
If a consolidated commit was created:
- Prompt user interactively to run validation before push (recommended, not required).
- Remind the user of the `AGENTS.md` instructions already loaded in Step 0 (if present).
- If user agrees, run the requested checks and report results.
### Step 10: Push Changes
If a consolidated commit was created:
- Ask: "Push changes?" → If yes: `git push`
If all deferred (no commit): Skip this step.
### Step 11: Post Summary
**REQUIRED after all issues reviewed:**
```bash
gh pr comment <pr-number> --body "$(cat <<'EOF'
## Fixes Applied Successfully
Fixed <file-count> file(s) based on <issue-count> unresolved review comment(s).
**Files modified:**
- `path/to/file-a.ts`
- `path/to/file-b.ts`
**Commit:** `<commit-sha>`
The latest autofix changes are on the `<branch-name>` branch.
EOF
)"
```
See [github.md § 3](./github.md#3-post-summary-comment) for details.
Optionally react to CodeRabbit's main comment with 👍.
## Key Notes
- **Follow agent prompts literally** - The "🤖 Prompt for AI Agents" section IS the fix specification
- **One approval per fix** - Show context + diff + AskUserQuestion in single message (manual mode)
- **Preserve issue titles** - Use CodeRabbit's exact titles, don't paraphrase
- **Preserve ordering** - Display issues in CodeRabbit's original order
- **Do not post per-issue replies** - Keep the workflow summary-comment only

View File

@@ -1,110 +0,0 @@
# Git Platform Commands
GitHub CLI commands for the CodeRabbit Autofix skill.
## Prerequisites
**GitHub CLI (`gh`):**
- Install: `brew install gh` or [cli.github.com](https://cli.github.com/)
- Authenticate: `gh auth login`
- Verify: `gh auth status`
## Core Operations
### 1. Find Pull Request
```bash
gh pr list --head "$(git branch --show-current)" --state open --json number,title
```
Gets the PR number for the current branch.
### 2. Fetch Unresolved Threads
Use GitHub GraphQL `reviewThreads` (there is no REST `pulls/<pr-number>/threads` endpoint):
```bash
gh api graphql \
-F owner='{owner}' \
-F repo='{repo}' \
-F pr=<pr-number> \
-f query='query($owner:String!, $repo:String!, $pr:Int!) {
repository(owner:$owner, name:$repo) {
pullRequest(number:$pr) {
reviewThreads(first:100) {
nodes {
isResolved
comments(first:1) {
nodes {
databaseId
body
author { login }
}
}
}
}
}
}
}'
```
Filter criteria:
- `isResolved == false`
- root comment author is one of: `coderabbitai`, `coderabbit[bot]`, `coderabbitai[bot]`
Use the root comment body for the issue prompt.
### 3. Post Summary Comment
```bash
gh pr comment <pr-number> --body "$(cat <<'EOF'
## Fixes Applied Successfully
Fixed <file-count> file(s) based on <issue-count> unresolved review comment(s).
**Files modified:**
- `path/to/file-a.ts`
- `path/to/file-b.ts`
**Commit:** `<commit-sha>`
The latest autofix changes are on the `<branch-name>` branch.
EOF
)"
```
Post after the push step (if pushing) so branch state is final.
### 4. Acknowledge Review
```bash
# React with thumbs up to the CodeRabbit comment
gh api repos/{owner}/{repo}/issues/comments/<comment-id>/reactions \
-X POST \
-f content='+1'
```
Use the ID of CodeRabbit's summary issue comment. The thread comment IDs from § 2 are pull request *review* comments; to react to one of those, use `repos/{owner}/{repo}/pulls/comments/<comment-id>/reactions` instead.
### 5. Create PR (if needed)
```bash
gh pr create --title '<title>' --body '<body>'
```
## Error Handling
**Missing `gh` CLI:**
- Inform user and provide install instructions
- Exit skill
**API failures:**
- Log error and continue
- Don't abort for comment posting failures
**Getting repo info:**
```bash
gh repo view --json owner,name,nameWithOwner
```

View File

@@ -1,201 +0,0 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf of
any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

View File

@@ -1,174 +0,0 @@
---
name: "imagegen"
description: "Use when the user asks to generate or edit images via the OpenAI Image API (for example: generate image, edit/inpaint/mask, background removal or replacement, transparent background, product shots, concept art, covers, or batch variants); run the bundled CLI (`scripts/image_gen.py`) and require `OPENAI_API_KEY` for live calls."
---
# Image Generation Skill
Generates or edits images for the current project (e.g., website assets, game assets, UI mockups, product mockups, wireframes, logo design, photorealistic images, infographics). Defaults to `gpt-image-1.5` and the OpenAI Image API, and prefers the bundled CLI for deterministic, reproducible runs.
## When to use
- Generate a new image (concept art, product shot, cover, website hero)
- Edit an existing image (inpainting, masked edits, lighting or weather transformations, background replacement, object removal, compositing, transparent background)
- Batch runs (many prompts, or many variants across prompts)
## Decision tree (generate vs edit vs batch)
- If the user provides an input image (or says “edit/retouch/inpaint/mask/translate/localize/change only X”) → **edit**
- Else if the user needs many different prompts/assets → **generate-batch**
- Else → **generate**
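The tree above can be sketched as a tiny routing helper (hypothetical, for illustration only):

```shell
# Route a request to a subcommand. Inputs are yes/no answers to the two
# questions in the decision tree.
choose_mode() {
  has_input_image=$1   # user provided an image or said "edit/inpaint/mask"?
  many_prompts=$2      # user needs many different prompts/assets?
  if [ "$has_input_image" = yes ]; then
    echo edit
  elif [ "$many_prompts" = yes ]; then
    echo generate-batch
  else
    echo generate
  fi
}

choose_mode no yes
```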
## Workflow
1. Decide intent: generate vs edit vs batch (see decision tree above).
2. Collect inputs up front: prompt(s), exact text (verbatim), constraints/avoid list, and any input image(s)/mask(s). For multi-image edits, label each input by index and role; for edits, list invariants explicitly.
3. If batch: write a temporary JSONL under tmp/ (one job per line), run once, then delete the JSONL.
4. Augment prompt into a short labeled spec (structure + constraints) without inventing new creative requirements.
5. Run the bundled CLI (`scripts/image_gen.py`) with sensible defaults (see references/cli.md).
6. For complex edits/generations, inspect outputs (open/view images) and validate: subject, style, composition, text accuracy, and invariants/avoid items.
7. Iterate: make a single targeted change (prompt or mask), re-run, re-check.
8. Save/return final outputs and note the final prompt + flags used.
## Temp and output conventions
- Use `tmp/imagegen/` for intermediate files (for example JSONL batches); delete when done.
- Write final artifacts under `output/imagegen/` when working in this repo.
- Use `--out` or `--out-dir` to control output paths; keep filenames stable and descriptive.
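Staging a batch per these conventions can look like the sketch below. The JSONL field names shown are assumptions; check `references/cli.md` for the schema the CLI actually expects:

```shell
# Stage a throwaway JSONL batch (one job per line) under tmp/imagegen/.
mkdir -p tmp/imagegen
batch=tmp/imagegen/batch.jsonl
cat > "$batch" <<'EOF'
{"prompt": "A cozy alpine cabin at dawn", "size": "1024x1024"}
{"prompt": "The same cabin at dusk, warm interior light", "size": "1024x1024"}
EOF
jobs=$(wc -l < "$batch")
echo "$jobs jobs staged"
# Run once via scripts/image_gen.py (see references/cli.md), then clean up:
rm -r tmp/imagegen
```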
## Dependencies (install if missing)
Prefer `uv` for dependency management.
Python packages:
```
uv pip install openai pillow
```
If `uv` is unavailable:
```
python3 -m pip install openai pillow
```
## Environment
- `OPENAI_API_KEY` must be set for live API calls.
If the key is missing, give the user these steps:
1. Create an API key in the OpenAI platform UI: https://platform.openai.com/api-keys
2. Set `OPENAI_API_KEY` as an environment variable in their system.
3. Offer to guide them through setting the environment variable for their OS/shell if needed.
- Never ask the user to paste the full key in chat. Ask them to set it locally and confirm when ready.
If installation isn't possible in this environment, tell the user which dependency is missing and how to install it locally.
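A minimal preflight check for the key (a sketch; `require_key` is a hypothetical helper):

```shell
# Refuse live API calls when OPENAI_API_KEY is unset; never print the key.
require_key() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "missing: set OPENAI_API_KEY (see https://platform.openai.com/api-keys)" >&2
    return 1
  fi
  echo ok
}
```

Run it before any `generate`/`edit` invocation; on failure, walk the user through setting the variable for their OS/shell rather than asking them to paste the key in chat.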
## Defaults & rules
- Use `gpt-image-1.5` unless the user explicitly asks for `gpt-image-1-mini` or explicitly prefers a cheaper/faster model.
- Assume the user wants a new image unless they explicitly ask for an edit.
- Require `OPENAI_API_KEY` before any live API call.
- Use the OpenAI Python SDK (`openai` package) for all API calls; do not use raw HTTP.
- If the user requests edits, use `client.images.edit(...)` and include input images (and mask if provided).
- Prefer the bundled CLI (`scripts/image_gen.py`) over writing new one-off scripts.
- Never modify `scripts/image_gen.py`. If something is missing, ask the user before doing anything else.
- If the result isn't clearly relevant or doesn't satisfy constraints, iterate with small targeted prompt changes; only ask a question if a missing detail blocks success.
## Prompt augmentation
Reformat user prompts into a structured, production-oriented spec. Only make implicit details explicit; do not invent new requirements.
## Use-case taxonomy (exact slugs)
Classify each request into one of these buckets and keep the slug consistent across prompts and references.
Generate:
- photorealistic-natural — candid/editorial lifestyle scenes with real texture and natural lighting.
- product-mockup — product/packaging shots, catalog imagery, merch concepts.
- ui-mockup — app/web interface mockups that look shippable.
- infographic-diagram — diagrams/infographics with structured layout and text.
- logo-brand — logo/mark exploration, vector-friendly.
- illustration-story — comics, children's book art, narrative scenes.
- stylized-concept — style-driven concept art, 3D/stylized renders.
- historical-scene — period-accurate/world-knowledge scenes.
Edit:
- text-localization — translate/replace in-image text, preserve layout.
- identity-preserve — try-on, person-in-scene; lock face/body/pose.
- precise-object-edit — remove/replace a specific element (incl. interior swaps).
- lighting-weather — time-of-day/season/atmosphere changes only.
- background-extraction — transparent background / clean cutout.
- style-transfer — apply reference style while changing subject/scene.
- compositing — multi-image insert/merge with matched lighting/perspective.
- sketch-to-render — drawing/line art to photoreal render.
Quick clarification (augmentation vs invention):
- If the user says “a hero image for a landing page”, you may add *layout/composition constraints* that are implied by that use (e.g., “generous negative space on the right for headline text”).
- Do not introduce new creative elements the user didn't ask for (e.g., adding a mascot, changing the subject, inventing brand names/logos).
Template (include only relevant lines):
```
Use case: <taxonomy slug>
Asset type: <where the asset will be used>
Primary request: <user's main prompt>
Scene/background: <environment>
Subject: <main subject>
Style/medium: <photo/illustration/3D/etc>
Composition/framing: <wide/close/top-down; placement>
Lighting/mood: <lighting + mood>
Color palette: <palette notes>
Materials/textures: <surface details>
Quality: <low/medium/high/auto>
Input fidelity (edits): <low/high>
Text (verbatim): "<exact text>"
Constraints: <must keep/must avoid>
Avoid: <negative constraints>
```
Augmentation rules:
- Keep it short; add only details the user already implied or provided elsewhere.
- Always classify the request into a taxonomy slug above and tailor constraints/composition/quality to that bucket. Use the slug to find the matching example in `references/sample-prompts.md`.
- If the user gives a broad request (e.g., "Generate images for this website"), use judgment to propose tasteful, context-appropriate assets and map each to a taxonomy slug.
- For edits, explicitly list invariants ("change only X; keep Y unchanged").
- If any critical detail is missing and blocks success, ask a question; otherwise proceed.
## Examples
### Generation example (hero image)
```
Use case: stylized-concept
Asset type: landing page hero
Primary request: a minimal hero image of a ceramic coffee mug
Style/medium: clean product photography
Composition/framing: centered product, generous negative space on the right
Lighting/mood: soft studio lighting
Constraints: no logos, no text, no watermark
```
### Edit example (invariants)
```
Use case: precise-object-edit
Asset type: product photo background replacement
Primary request: replace the background with a warm sunset gradient
Constraints: change only the background; keep the product and its edges unchanged; no text; no watermark
```
## Prompting best practices (short list)
- Structure prompt as scene -> subject -> details -> constraints.
- Include intended use (ad, UI mock, infographic) to set the mode and polish level.
- Use camera/composition language for photorealism.
- Quote exact text and specify typography + placement.
- For tricky words, spell them letter-by-letter and require verbatim rendering.
- For multi-image inputs, reference images by index and describe how to combine them.
- For edits, repeat invariants every iteration to reduce drift.
- Iterate with single-change follow-ups.
- For latency-sensitive runs, start with quality=low; use quality=high for text-heavy or detail-critical outputs.
- For strict edits (identity/layout lock), consider input_fidelity=high.
- If results feel “tacky”, add a brief “Avoid:” line (stock-photo vibe; cheesy lens flare; oversaturated neon; harsh bloom; oversharpening; clutter) and specify restraint (“editorial”, “premium”, “subtle”).
More principles: `references/prompting.md`. Copy/paste specs: `references/sample-prompts.md`.
## Guidance by asset type
Asset-type templates (website assets, game assets, wireframes, logo) are consolidated in `references/sample-prompts.md`.
## CLI + environment notes
- CLI commands + examples: `references/cli.md`
- API parameter quick reference: `references/image-api.md`
- If network approvals / sandbox settings are getting in the way: `references/codex-network.md`
## Reference map
- **`references/cli.md`**: how to *run* image generation/edits/batches via `scripts/image_gen.py` (commands, flags, recipes).
- **`references/image-api.md`**: what knobs exist at the API level (parameters, sizes, quality, background, edit-only fields).
- **`references/prompting.md`**: prompting principles (structure, constraints/invariants, iteration patterns).
- **`references/sample-prompts.md`**: copy/paste prompt recipes (generate + edit workflows; examples only).
- **`references/codex-network.md`**: environment/sandbox/network-approval troubleshooting.

View File

@@ -1,6 +0,0 @@
interface:
display_name: "Image Gen"
short_description: "Generate and edit images using OpenAI"
icon_small: "./assets/imagegen-small.svg"
icon_large: "./assets/imagegen.png"
default_prompt: "Generate or edit images for this task and return the final prompt plus selected outputs."

View File

@@ -1,5 +0,0 @@
<svg xmlns="http://www.w3.org/2000/svg" width="16" height="16" fill="currentColor" viewBox="0 0 16 16">
<path fill="currentColor" d="M7.51 6.827a1 1 0 1 1 .278 1.982 1 1 0 0 1-.278-1.982Z"/>
<path fill="currentColor" fill-rule="evenodd" d="M8.31 4.47c.368-.016.699.008 1.016.124l.186.075c.423.194.786.5 1.047.888l.067.107c.148.253.235.533.3.848.073.354.126.797.193 1.343l.277 2.25.088.745c.024.224.041.425.049.605.013.322-.004.615-.085.896l-.04.12a2.53 2.53 0 0 1-.802 1.115l-.16.118c-.281.189-.596.292-.956.366a9.46 9.46 0 0 1-.6.1l-.743.094-2.25.277c-.547.067-.99.121-1.35.136a2.765 2.765 0 0 1-.896-.085l-.12-.039a2.533 2.533 0 0 1-1.115-.802l-.118-.161c-.189-.28-.292-.596-.366-.956a9.42 9.42 0 0 1-.1-.599l-.094-.744-.276-2.25a17.884 17.884 0 0 1-.137-1.35c-.015-.367.009-.698.124-1.015l.076-.185c.193-.423.5-.787.887-1.048l.107-.067c.253-.148.534-.234.849-.3.354-.073.796-.126 1.343-.193l2.25-.277.744-.088c.224-.024.425-.041.606-.049Zm-2.905 5.978a1.47 1.47 0 0 0-.875.074c-.127.052-.267.146-.475.344-.212.204-.462.484-.822.889l-.314.351c.018.115.036.219.055.313.061.295.127.458.206.575l.07.094c.167.211.39.372.645.465l.109.032c.119.027.273.038.499.029.308-.013.7-.06 1.264-.13l2.25-.275.727-.093.198-.03-2.05-1.64a16.848 16.848 0 0 0-.96-.738c-.18-.121-.31-.19-.421-.23l-.106-.03Zm2.95-4.915c-.154.006-.33.021-.536.043l-.729.086-2.25.276c-.564.07-.956.118-1.257.18a1.937 1.937 0 0 0-.478.15l-.097.057a1.47 1.47 0 0 0-.515.608l-.044.107c-.048.133-.073.307-.06.608.012.307.06.7.129 1.264l.22 1.8.178-.197c.145-.159.278-.298.403-.418.255-.243.507-.437.809-.56l.181-.067a2.526 2.526 0 0 1 1.328-.06l.118.029c.27.079.517.215.772.387.287.194.619.46 1.03.789l2.52 2.016c.146-.148.26-.326.332-.524l.031-.109c.027-.119.039-.273.03-.499a8.311 8.311 0 0 0-.044-.536l-.086-.728-.276-2.25c-.07-.564-.118-.956-.18-1.258a1.935 1.935 0 0 0-.15-.477l-.057-.098a1.468 1.468 0 0 0-.608-.515l-.107-.043c-.133-.049-.306-.074-.607-.061Z" clip-rule="evenodd"/>
<path fill="currentColor" d="M7.783 1.272c.36.014.803.07 1.35.136l2.25.277.743.095c.224.03.423.062.6.099.36.074.675.177.955.366l.161.118c.364.29.642.675.802 1.115l.04.12c.081.28.098.574.085.896a9.42 9.42 0 0 1-.05.605l-.087.745-.277 2.25c-.067.547-.12.989-.193 1.343a2.765 2.765 0 0 1-.3.848l-.067.107a2.534 2.534 0 0 1-.415.474l-.086.064a.532.532 0 0 1-.622-.858l.13-.13c.04-.046.077-.094.111-.145l.057-.098c.055-.109.104-.256.15-.477.062-.302.11-.694.18-1.258l.276-2.25.086-.728c.022-.207.037-.382.043-.536.01-.226-.002-.38-.029-.5l-.032-.108a1.469 1.469 0 0 0-.464-.646l-.094-.069c-.118-.08-.28-.145-.575-.206a8.285 8.285 0 0 0-.53-.088l-.728-.092-2.25-.276c-.565-.07-.956-.117-1.264-.13a1.94 1.94 0 0 0-.5.029l-.108.032a1.469 1.469 0 0 0-.647.465l-.068.094c-.054.08-.102.18-.146.33l-.04.1a.533.533 0 0 1-.98-.403l.055-.166c.059-.162.133-.314.23-.457l.117-.16c.29-.365.675-.643 1.115-.803l.12-.04c.28-.08.574-.097.896-.084Z"/>
</svg>

Before  |  Size: 2.8 KiB

Binary file not shown.

Before  |  Size: 1.7 KiB

View File

@@ -1,132 +0,0 @@
# CLI reference (`scripts/image_gen.py`)
This file contains the “command catalog” for the bundled image generation CLI. Keep `SKILL.md` as overview-first; put verbose CLI details here.
## What this CLI does
- `generate`: generate new images from a prompt
- `edit`: edit an existing image (optionally with a mask) — inpainting / background replacement / “change only X”
- `generate-batch`: run many jobs from a JSONL file (one job per line)
Real API calls require **network access** and `OPENAI_API_KEY`; `--dry-run` requires neither.
## Quick start (works from any repo)
Set a stable path to the skill CLI (default `CODEX_HOME` is `~/.codex`):
```
export CODEX_HOME="${CODEX_HOME:-$HOME/.codex}"
export IMAGE_GEN="$CODEX_HOME/skills/imagegen/scripts/image_gen.py"
```
Dry-run (no API call; no network required; does not require the `openai` package):
```
python "$IMAGE_GEN" generate --prompt "Test" --dry-run
```
Generate (requires `OPENAI_API_KEY` + network):
```
uv run --with openai python "$IMAGE_GEN" generate --prompt "A cozy alpine cabin at dawn" --size 1024x1024
```
No `uv` installed? Use your active Python env:
```
python "$IMAGE_GEN" generate --prompt "A cozy alpine cabin at dawn" --size 1024x1024
```
## Guardrails (important)
- Use `python "$IMAGE_GEN" ...` (or equivalent full path) for generations/edits/batch work.
- Do **not** create one-off runners (e.g. `gen_images.py`) unless the user explicitly asks for a custom wrapper.
- **Never modify** `scripts/image_gen.py`. If something is missing, ask the user before doing anything else.
## Defaults (unless overridden by flags)
- Model: `gpt-image-1.5`
- Size: `1024x1024`
- Quality: `auto`
- Output format: `png`
- Background: unspecified (API default). If you set `--background transparent`, also set `--output-format png` or `webp`.
## Quality + input fidelity
- `--quality` works for `generate`, `edit`, and `generate-batch`: `low|medium|high|auto`.
- `--input-fidelity` is **edit-only**: `low|high` (use `high` for strict edits like identity or layout lock).
Example:
```
python "$IMAGE_GEN" edit --image input.png --prompt "Change only the background" --quality high --input-fidelity high
```
## Masks (edits)
- Use a **PNG** mask with an alpha channel; fully transparent areas mark the region the edit may change.
- The mask must match the input image dimensions.
- In the edit prompt, repeat invariants (e.g., “change only the background; keep the subject unchanged”) to reduce drift.
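As a pre-flight check before spending an API call on a mismatched pair, you can compare dimensions without any imaging library by reading the PNG header directly (a stdlib-only sketch; `png_size` is an illustrative helper, not part of the CLI):

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) of a PNG byte string.

    After the 8-byte signature, the IHDR chunk stores width and height
    as big-endian uint32 values at byte offsets 16 and 20.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

Compare the result for the input image and the mask (e.g. via `Path(...).read_bytes()`) and abort on mismatch before calling `edit`.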
## Optional deps
Prefer `uv run --with ...` for an out-of-the-box run without changing the current project env; otherwise install into your active env:
```
uv pip install openai
```
## Common recipes
Generate + also write a downscaled copy for fast web loading:
```
uv run --with openai --with pillow python "$IMAGE_GEN" generate \
  --prompt "A cozy alpine cabin at dawn" \
  --size 1024x1024 \
  --downscale-max-dim 1024
```
Notes:
- Downscaling writes an extra file next to the original (default suffix `-web`, e.g. `output-web.png`).
- Downscaling requires Pillow (use `uv run --with pillow ...` or install it into your env).
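The sidecar naming can be sketched as follows (an illustrative helper mirroring the documented `-web` suffix behavior; the actual resize step needs Pillow):

```python
from pathlib import Path

def downscaled_path(original: str, suffix: str = "-web") -> Path:
    # output.png -> output-web.png, written next to the original
    p = Path(original)
    return p.with_name(f"{p.stem}{suffix}{p.suffix}")

def downscale(original: str, max_dim: int, suffix: str = "-web") -> Path:
    from PIL import Image  # requires Pillow (e.g. `uv run --with pillow`)
    dst = downscaled_path(original, suffix)
    with Image.open(original) as im:
        im.thumbnail((max_dim, max_dim))  # shrinks in place, keeps aspect ratio
        im.save(dst)
    return dst
```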
Generate with augmentation fields:
```
python "$IMAGE_GEN" generate \
  --prompt "A minimal hero image of a ceramic coffee mug" \
  --use-case "landing page hero" \
  --style "clean product photography" \
  --composition "centered product, generous negative space" \
  --constraints "no logos, no text"
```
Generate multiple prompts concurrently (async batch):
```
mkdir -p tmp/imagegen
cat > tmp/imagegen/prompts.jsonl << 'EOF'
{"prompt":"Cavernous hangar interior with a compact shuttle parked center-left, open bay door","use_case":"game concept art environment","composition":"wide-angle, low-angle, cinematic framing","lighting":"volumetric light rays through drifting fog","constraints":"no logos or trademarks; no watermark","size":"1536x1024"}
{"prompt":"Gray wolf in profile in a snowy forest, crisp fur texture","use_case":"wildlife photography print","composition":"100mm, eye-level, shallow depth of field","constraints":"no logos or trademarks; no watermark","size":"1024x1024"}
EOF
python "$IMAGE_GEN" generate-batch --input tmp/imagegen/prompts.jsonl --out-dir out --concurrency 5
# Cleanup (recommended)
rm -f tmp/imagegen/prompts.jsonl
```
Notes:
- Use `--concurrency` to control parallelism (default `5`). Higher concurrency can hit rate limits; the CLI retries on transient errors.
- Per-job overrides are supported in JSONL (e.g., `size`, `quality`, `background`, `output_format`, `n`, and prompt-augmentation fields).
- `--n` generates multiple variants for a single prompt; `generate-batch` is for many different prompts.
- Treat the JSONL file as temporary: write it under `tmp/` and delete it after the run (don't commit it).
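When the job list is built programmatically, emit one `json.dumps` line per job instead of hand-writing the file (a sketch; the field names follow the batch examples above):

```python
import json
from pathlib import Path

jobs = [
    {"prompt": "Gray wolf in profile in a snowy forest",
     "size": "1024x1024", "quality": "high"},
    {"prompt": "Cavernous hangar interior, open bay door",
     "size": "1536x1024", "n": 2},  # per-job overrides, one dict per line
]

path = Path("tmp/imagegen/prompts.jsonl")
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text("\n".join(json.dumps(job) for job in jobs) + "\n", encoding="utf-8")
```

Then run `generate-batch --input tmp/imagegen/prompts.jsonl ...` and delete the file afterwards.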
Edit:
```
python "$IMAGE_GEN" edit --image input.png --mask mask.png --prompt "Replace the background with a warm sunset"
```
## CLI notes
- Supported sizes: `1024x1024`, `1536x1024`, `1024x1536`, or `auto`.
- Transparent backgrounds require `output_format` to be `png` or `webp`.
- Default output is `output.png`; multiple images become `output-1.png`, `output-2.png`, etc.
- Use `--no-augment` to skip prompt augmentation.
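The naming rule for multiple outputs can be sketched as follows (an illustrative helper matching the behavior described above, not the CLI's exact internals):

```python
from pathlib import Path

def output_paths(out: str, count: int) -> list[Path]:
    p = Path(out)
    if count == 1:
        return [p]
    # output.png -> output-1.png, output-2.png, ...
    return [p.with_name(f"{p.stem}-{i}{p.suffix}") for i in range(1, count + 1)]
```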
## See also
- API parameter quick reference: `references/image-api.md`
- Prompt examples: `references/sample-prompts.md`

View File

@@ -1,28 +0,0 @@
# Codex network approvals / sandbox notes
This guidance is intentionally isolated from `SKILL.md` because it can vary by environment and may become stale. Prefer the defaults in your environment when in doubt.
## Why am I asked to approve every image generation call?
Image generation uses the OpenAI Image API, so the CLI needs outbound network access. In many Codex setups, network access is disabled by default (especially under stricter sandbox modes), and/or the approval policy may require confirmation before networked commands run.
## How do I reduce repeated approval prompts (network)?
If you trust the repo and want fewer prompts, enable network access for the relevant sandbox mode and relax the approval policy.
Example `~/.codex/config.toml` pattern:
```
approval_policy = "never"
sandbox_mode = "workspace-write"
[sandbox_workspace_write]
network_access = true
```
Or for a single session:
```
codex --sandbox workspace-write --ask-for-approval never
```
## Safety note
Use caution: enabling network and disabling approvals reduces friction but increases risk if you run untrusted code or work in an untrusted repository.

View File

@@ -1,36 +0,0 @@
# Image API quick reference
## Endpoints
- Generate: `POST /v1/images/generations` (`client.images.generate(...)`)
- Edit: `POST /v1/images/edits` (`client.images.edit(...)`)
## Models
- Default: `gpt-image-1.5`
- Alternative: `gpt-image-1-mini` (faster, lower-cost generation)
## Core parameters (generate + edit)
- `prompt`: text prompt
- `model`: image model
- `n`: number of images (1-10)
- `size`: `1024x1024`, `1536x1024`, `1024x1536`, or `auto`
- `quality`: `low`, `medium`, `high`, or `auto`
- `background`: `transparent`, `opaque`, or `auto` (transparent requires `png`/`webp`)
- `output_format`: `png` (default), `jpeg`, `webp`
- `output_compression`: 0-100 (jpeg/webp only)
- `moderation`: `auto` (default) or `low`
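These constraints can be pre-checked client-side before the request is sent (a sketch mirroring the parameter rules above; not an API call):

```python
ALLOWED_SIZES = {"1024x1024", "1536x1024", "1024x1536", "auto"}
ALLOWED_QUALITIES = {"low", "medium", "high", "auto"}

def check_payload(payload: dict) -> dict:
    """Raise ValueError on any parameter combination the API would reject."""
    if not 1 <= int(payload.get("n", 1)) <= 10:
        raise ValueError("n must be between 1 and 10")
    if payload.get("size", "auto") not in ALLOWED_SIZES:
        raise ValueError("unsupported size")
    if payload.get("quality", "auto") not in ALLOWED_QUALITIES:
        raise ValueError("unsupported quality")
    if (payload.get("background") == "transparent"
            and payload.get("output_format", "png") not in {"png", "webp"}):
        raise ValueError("transparent background requires png or webp")
    oc = payload.get("output_compression")
    if oc is not None and not 0 <= int(oc) <= 100:
        raise ValueError("output_compression must be 0-100")
    return payload
```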
## Edit-specific parameters
- `image`: one or more input images (first image is primary)
- `mask`: optional mask image (same size, alpha channel required)
- `input_fidelity`: `low` (default) or `high` (support varies by model). Set it to `high` when the user needs a precise edit that the default `low` fidelity can't achieve.
## Output
- `data[]` list with `b64_json` per image
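Decoding the response then looks like this (a sketch; `data` stands in for the `data[]` list returned by the API, and the helper name is illustrative):

```python
import base64
from pathlib import Path

def save_images(data: list[dict], stem: str = "output", ext: str = "png") -> list[Path]:
    """Write each b64_json entry to disk: output.png, or output-1.png, -2, ..."""
    paths = []
    for i, item in enumerate(data, start=1):
        raw = base64.b64decode(item["b64_json"])
        name = f"{stem}.{ext}" if len(data) == 1 else f"{stem}-{i}.{ext}"
        Path(name).write_bytes(raw)
        paths.append(Path(name))
    return paths
```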
## Limits & notes
- Input images and masks must be under 50MB.
- Use edits endpoint when the user requests changes to an existing image.
- Masking is prompt-guided; exact shapes are not guaranteed.
- Large sizes and high quality increase latency and cost.
- For fast iteration or latency-sensitive runs, start with `quality=low`; raise to `high` for text-heavy or detail-critical outputs.
- Use `input_fidelity=high` for strict edits (identity preservation, layout lock, or precise compositing).

View File

@@ -1,81 +0,0 @@
# Prompting best practices (gpt-image-1.5)
## Contents
- [Structure](#structure)
- [Specificity](#specificity)
- [Avoiding “tacky” outputs](#avoiding-tacky-outputs)
- [Composition & layout](#composition--layout)
- [Constraints & invariants](#constraints--invariants)
- [Text in images](#text-in-images)
- [Multi-image inputs](#multi-image-inputs)
- [Iterate deliberately](#iterate-deliberately)
- [Quality vs latency](#quality-vs-latency)
- [Use-case tips](#use-case-tips)
- [Where to find copy/paste recipes](#where-to-find-copypaste-recipes)
## Structure
- Use a consistent order: scene/background -> subject -> key details -> constraints -> output intent.
- Include intended use (ad, UI mock, infographic) to set the mode and polish level.
- For complex requests, use short labeled lines instead of a long paragraph.
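That ordering can be applied mechanically; the sketch below assembles short labeled lines in the recommended sequence (an illustrative helper, similar in spirit to the CLI's `--augment` flow):

```python
def build_prompt(primary: str, **fields: str) -> str:
    labels = [
        ("use_case", "Use case"),
        ("scene", "Scene/background"),
        ("subject", "Subject"),
        ("style", "Style/medium"),
        ("composition", "Composition/framing"),
        ("lighting", "Lighting/mood"),
        ("constraints", "Constraints"),
    ]
    lines = [f"Use case: {fields['use_case']}"] if fields.get("use_case") else []
    lines.append(f"Primary request: {primary}")
    for key, label in labels[1:]:  # remaining fields, only when provided
        if fields.get(key):
            lines.append(f"{label}: {fields[key]}")
    return "\n".join(lines)
```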
## Specificity
- Name materials, textures, and visual medium (photo, watercolor, 3D render).
- For photorealism, include camera/composition language (lens, framing, lighting).
- Add targeted quality cues only when needed (film grain, textured brushstrokes, macro detail); avoid generic "8K" style prompts.
## Avoiding “tacky” outputs
- Don't use vibe-only buzzwords (“epic”, “cinematic”, “trending”, “8k”, “award-winning”, “unreal engine”, “artstation”) unless the user explicitly wants that look.
- Specify restraint: “minimal”, “editorial”, “premium”, “subtle”, “natural color grading”, “soft contrast”, “no harsh bloom”, “no oversharpening”.
- For 3D/illustration, name the finish you want: “matte”, “paper grain”, “ink texture”, “flat color with soft shadow”; avoid “glossy plastic” unless requested.
- Add a short negative line when needed (especially for marketing art): “Avoid: stock-photo vibe; cheesy lens flare; oversaturated neon; excessive bokeh; fake-looking smiles; clutter”.
## Composition & layout
- Specify framing and viewpoint (close-up, wide, top-down) and placement ("logo top-right").
- Call out negative space if you need room for UI or overlays.
## Constraints & invariants
- State what must not change ("keep background unchanged").
- For edits, say "change only X; keep Y unchanged" and repeat invariants on every iteration to reduce drift.
## Text in images
- Put literal text in quotes or ALL CAPS and specify typography (font style, size, color, placement).
- Spell uncommon words letter-by-letter if accuracy matters.
- For in-image copy, require verbatim rendering and no extra characters.
## Multi-image inputs
- Reference inputs by index and role ("Image 1: product, Image 2: style").
- Describe how to combine them ("apply Image 2's style to Image 1").
- For compositing, specify what moves where and what must remain unchanged.
## Iterate deliberately
- Start with a clean base prompt, then make small single-change edits.
- Re-specify critical constraints when you iterate.
## Quality vs latency
- For latency-sensitive runs, start at `quality=low` and only raise it if needed.
- Use `quality=high` for text-heavy or detail-critical images.
- For strict edits (identity preservation, layout lock), consider `input_fidelity=high`.
## Use-case tips
Generate:
- photorealistic-natural: Prompt as if a real photo is captured in the moment; use photography language (lens, lighting, framing); call for real texture (pores, wrinkles, fabric wear, imperfections); avoid studio polish or staging; use `quality=high` when detail matters.
- product-mockup: Describe the product/packaging and materials; ensure clean silhouette and label clarity; if in-image text is needed, require verbatim rendering and specify typography.
- ui-mockup: Describe a real product; focus on layout, hierarchy, and common UI elements; avoid concept-art language so it looks shippable.
- infographic-diagram: Define the audience and layout flow; label parts explicitly; require verbatim text; use `quality=high`.
- logo-brand: Keep it simple and scalable; ask for a strong silhouette and balanced negative space; avoid gradients and fine detail.
- illustration-story: Define panels or scene beats; keep each action concrete; for continuity, restate character traits and outfit each time.
- stylized-concept: Specify style cues, material finish, and rendering approach (3D, painterly, clay); add a short "Avoid" line to prevent tacky effects.
- historical-scene: State the location/date and required period accuracy; constrain clothing, props, and environment to match the era.
Edit:
- text-localization: Change only the text; preserve layout, typography, spacing, and hierarchy; no extra words or reflow unless needed.
- identity-preserve: Lock identity (face, body, pose, hair, expression); change only the specified elements; match lighting and shadows; use `input_fidelity=high` if likeness drifts.
- precise-object-edit: Specify exactly what to remove/replace; preserve surrounding texture and lighting; keep everything else unchanged.
- lighting-weather: Change only environmental conditions (light, shadows, atmosphere, precipitation); keep geometry, framing, and subject identity.
- background-extraction: Request transparent background; crisp silhouette; no halos; preserve label text exactly; optionally add a subtle contact shadow.
- style-transfer: Specify style cues to preserve (palette, texture, brushwork) and what must change; add "no extra elements" to prevent drift.
- compositing: Reference inputs by index; specify what moves where; match lighting, perspective, and scale; keep background and framing unchanged.
- sketch-to-render: Preserve layout, proportions, and perspective; add plausible materials, lighting, and environment; "do not add new elements or text."
## Where to find copy/paste recipes
For copy/paste prompt specs (examples only), see `references/sample-prompts.md`. This file focuses on principles, structure, and iteration patterns.

View File

@@ -1,384 +0,0 @@
# Sample prompts (copy/paste)
Use these as starting points (recipes only). Keep user-provided requirements; do not invent new creative elements.
For prompting principles (structure, invariants, iteration), see `references/prompting.md`.
## Generate
### photorealistic-natural
```
Use case: photorealistic-natural
Primary request: candid photo of an elderly sailor on a small fishing boat adjusting a net
Scene/background: coastal water with soft haze
Subject: weathered skin with wrinkles and sun texture; a calm dog on deck nearby
Style/medium: photorealistic candid photo
Composition/framing: medium close-up, eye-level, 50mm lens
Lighting/mood: soft coastal daylight, shallow depth of field, subtle film grain
Materials/textures: real skin texture, worn fabric, salt-worn wood
Constraints: natural color balance; no heavy retouching; no glamorization; no watermark
Avoid: studio polish; staged look
Quality: high
```
### product-mockup
```
Use case: product-mockup
Primary request: premium product photo of a matte black shampoo bottle with a minimal label
Scene/background: clean studio gradient from light gray to white
Subject: single bottle centered with subtle reflection
Style/medium: premium product photography
Composition/framing: centered, slight three-quarter angle, generous padding
Lighting/mood: softbox lighting, clean highlights, controlled shadows
Materials/textures: matte plastic, crisp label printing
Constraints: no logos or trademarks; no watermark
Quality: high
```
### ui-mockup
```
Use case: ui-mockup
Primary request: mobile app UI for a local farmers market with vendors and specials
Scene/background: clean white background with subtle natural accents
Subject: header, vendor list with small photos, "Today's specials" section, location and hours
Style/medium: realistic product UI, not concept art
Composition/framing: iPhone frame, balanced spacing and hierarchy
Constraints: practical layout, clear typography, no logos or trademarks, no watermark
```
### infographic-diagram
```
Use case: infographic-diagram
Primary request: detailed infographic of an automatic coffee machine flow
Scene/background: clean, light neutral background
Subject: bean hopper -> grinder -> brew group -> boiler -> water tank -> drip tray
Style/medium: clean vector-like infographic with clear callouts and arrows
Composition/framing: vertical poster layout, top-to-bottom flow
Text (verbatim): "Bean Hopper", "Grinder", "Brew Group", "Boiler", "Water Tank", "Drip Tray"
Constraints: clear labels, strong contrast, no logos or trademarks, no watermark
Quality: high
```
### logo-brand
```
Use case: logo-brand
Primary request: original logo for "Field & Flour", a local bakery
Style/medium: vector logo mark; flat colors; minimal
Composition/framing: single centered logo on plain background with padding
Constraints: strong silhouette, balanced negative space; original design only; no gradients unless essential; no trademarks; no watermark
```
### illustration-story
```
Use case: illustration-story
Primary request: 4-panel comic about a pet left alone at home
Scene/background: cozy living room across panels
Subject: pet reacting to the owner leaving, then relaxing, then returning to a composed pose
Style/medium: comic illustration with clear panels
Composition/framing: 4 equal-sized vertical panels, readable actions per panel
Constraints: no text; no logos or trademarks; no watermark
```
### stylized-concept
```
Use case: stylized-concept
Primary request: cavernous hangar interior with tall support beams and drifting fog
Scene/background: industrial hangar interior, deep scale, light haze
Subject: compact shuttle, parked center-left, bay door open
Style/medium: cinematic concept art, industrial realism
Composition/framing: wide-angle, low-angle, cinematic framing
Lighting/mood: volumetric light rays cutting through fog
Constraints: no logos or trademarks; no watermark
```
### historical-scene
```
Use case: historical-scene
Primary request: outdoor crowd scene in Bethel, New York on August 16, 1969
Scene/background: open field, temporary stages, period-accurate tents and signage
Subject: crowd in period-accurate clothing, authentic staging and environment
Style/medium: photorealistic photo
Composition/framing: wide shot, eye-level
Constraints: period-accurate details; no modern objects; no logos or trademarks; no watermark
```
## Asset type templates (taxonomy-aligned)
### Website assets template
```
Use case: <photorealistic-natural|stylized-concept|product-mockup|infographic-diagram|ui-mockup>
Asset type: <hero image / section illustration / blog header>
Primary request: <short description>
Scene/background: <environment or abstract background>
Subject: <main subject>
Style/medium: <photo/illustration/3D>
Composition/framing: <wide/centered; specify negative space side>
Lighting/mood: <soft/bright/neutral>
Color palette: <brand colors or neutral>
Constraints: <no text; no logos; no watermark; leave space for UI>
```
### Website assets example: minimal hero background
```
Use case: stylized-concept
Asset type: landing page hero background
Primary request: minimal abstract background with a soft gradient and subtle texture (calm, modern)
Style/medium: matte illustration / soft-rendered abstract background (not glossy 3D)
Composition/framing: wide composition; large negative space on the right for headline
Lighting/mood: gentle studio glow
Color palette: cool neutrals with a restrained blue accent
Constraints: no text; no logos; no watermark
```
### Website assets example: feature section illustration
```
Use case: stylized-concept
Asset type: feature section illustration
Primary request: simple abstract shapes suggesting connection and flow (tasteful, minimal)
Scene/background: subtle light-gray backdrop with faint texture
Style/medium: flat illustration; soft shadows; restrained contrast
Composition/framing: centered cluster; open margins for UI
Color palette: muted teal and slate, low contrast accents
Constraints: no text; no logos; no watermark
```
### Website assets example: blog header image
```
Use case: photorealistic-natural
Asset type: blog header image
Primary request: overhead desk scene with notebook, pen, and coffee cup
Scene/background: warm wooden tabletop
Style/medium: photorealistic photo
Composition/framing: wide crop; subject placed left; right side left empty
Lighting/mood: soft morning light
Constraints: no text; no logos; no watermark
```
### Game assets template
```
Use case: stylized-concept
Asset type: <game environment concept art / game character concept / game UI icon / tileable game texture>
Primary request: <biome/scene/character/icon/material>
Scene/background: <location + set dressing> (if applicable)
Subject: <main focal element(s)>
Style/medium: <realistic/stylized>; <concept art / character render / UI icon / texture>
Composition/framing: <wide/establishing/top-down>; <camera angle>; <focal point placement>
Lighting/mood: <time of day>; <mood>; <volumetric/fog/etc>
Constraints: no logos or trademarks; no watermark
```
### Game assets example: environment concept art
```
Use case: stylized-concept
Asset type: game environment concept art
Primary request: cavernous hangar interior with tall support beams and drifting fog
Scene/background: industrial hangar interior, deep scale, light haze
Subject: compact shuttle, parked center-left, bay door open
Foreground: painted floor markings; cables; tool carts along edges
Style/medium: cinematic concept art, industrial realism
Composition/framing: wide-angle, low-angle, cinematic framing
Lighting/mood: volumetric light rays cutting through fog
Constraints: no logos or trademarks; no watermark
```
### Game assets example: character concept
```
Use case: stylized-concept
Asset type: game character concept
Primary request: desert scout character with layered travel gear
Silhouette: long coat with hood, wide boots, satchel
Outfit/gear: dusty canvas, leather straps, brass buckles
Face/hair: windworn face, short cropped hair
Style/medium: character render; stylized realism
Pose: neutral hero pose
Background: simple neutral backdrop
Constraints: no logos or trademarks; no watermark
```
### Game assets example: UI icon
```
Use case: stylized-concept
Asset type: game UI icon
Primary request: round shield icon with a subtle rune pattern
Style/medium: painted game UI icon
Composition/framing: centered icon; generous padding; clear silhouette
Background: transparent
Lighting/mood: subtle highlights; crisp edges
Constraints: no text; no logos or trademarks; no watermark
```
### Game assets example: tileable texture
```
Use case: stylized-concept
Asset type: tileable game texture
Primary request: worn sandstone blocks
Style/medium: seamless tileable texture; PBR-ish look
Scale: medium tiling
Lighting: neutral / flat lighting
Constraints: seamless edges; no obvious focal elements; no text; no logos or trademarks; no watermark
```
### Wireframe template
```
Use case: ui-mockup
Asset type: website wireframe
Primary request: <page or flow to sketch>
Fidelity: low-fi grayscale wireframe; hand-drawn feel; simple boxes
Layout: <sections in order; grid/columns>
Annotations: <labels for key blocks>
Resolution/orientation: <landscape or portrait to match expected device>
Constraints: no color; no logos; no real photos; no watermark
```
### Wireframe example: homepage (desktop)
```
Use case: ui-mockup
Asset type: website wireframe
Primary request: SaaS homepage layout with clear hierarchy
Fidelity: low-fi grayscale wireframe; hand-drawn feel; simple boxes
Layout: top nav; hero with headline and CTA; three feature cards; testimonial strip; pricing preview; footer
Annotations: label each block ("Nav", "Hero", "CTA", "Feature", "Testimonial", "Pricing", "Footer")
Resolution/orientation: landscape (wide) for desktop
Constraints: no color; no logos; no real photos; no watermark
```
### Wireframe example: pricing page
```
Use case: ui-mockup
Asset type: website wireframe
Primary request: pricing page layout with comparison table
Fidelity: low-fi grayscale wireframe; sketchy lines; simple boxes
Layout: header; plan toggle; 3 pricing cards; comparison table; FAQ accordion; footer
Annotations: label key areas ("Toggle", "Plan Card", "Table", "FAQ")
Resolution/orientation: landscape for desktop or portrait for tablet
Constraints: no color; no logos; no real photos; no watermark
```
### Wireframe example: mobile onboarding flow
```
Use case: ui-mockup
Asset type: website wireframe
Primary request: three-screen mobile onboarding flow
Fidelity: low-fi grayscale wireframe; hand-drawn feel; simple boxes
Layout: screen 1 (logo placeholder, headline, illustration placeholder, CTA); screen 2 (feature bullets); screen 3 (form fields + CTA)
Annotations: label each block and screen number
Resolution/orientation: portrait (tall) for mobile
Constraints: no color; no logos; no real photos; no watermark
```
### Logo template
```
Use case: logo-brand
Asset type: logo concept
Primary request: <brand idea or symbol concept>
Style/medium: vector logo mark; flat colors; minimal
Composition/framing: centered mark; clear silhouette; generous margin
Color palette: <1-2 colors; high contrast>
Text (verbatim): "<exact name>" (only if needed)
Constraints: no gradients; no mockups; no 3D; no watermark
```
### Logo example: abstract symbol mark
```
Use case: logo-brand
Asset type: logo concept
Primary request: geometric leaf symbol suggesting sustainability and growth
Style/medium: vector logo mark; flat colors; minimal
Composition/framing: centered mark; clear silhouette
Color palette: deep green and off-white
Constraints: no text; no gradients; no mockups; no 3D; no watermark
```
### Logo example: monogram mark
```
Use case: logo-brand
Asset type: logo concept
Primary request: interlocking monogram of the letters "AV"
Style/medium: vector logo mark; flat colors; minimal
Composition/framing: centered mark; balanced spacing
Color palette: black on white
Constraints: no gradients; no mockups; no 3D; no watermark
```
### Logo example: wordmark
```
Use case: logo-brand
Asset type: logo concept
Primary request: clean wordmark for a modern studio
Style/medium: vector wordmark; flat colors; minimal
Text (verbatim): "Studio North"
Composition/framing: centered text; even letter spacing
Color palette: charcoal on white
Constraints: no gradients; no mockups; no 3D; no watermark
```
## Edit
### text-localization
```
Use case: text-localization
Input images: Image 1: original infographic
Primary request: translate all in-image text to Spanish
Constraints: change only the text; preserve layout, typography, spacing, and hierarchy; no extra words; do not alter logos or imagery
```
### identity-preserve
```
Use case: identity-preserve
Input images: Image 1: person photo; Image 2..N: clothing items
Primary request: replace only the clothing with the provided garments
Constraints: preserve face, body shape, pose, hair, expression, and identity; match lighting and shadows; keep background unchanged; no accessories or text
Input fidelity (edits): high
```
### precise-object-edit
```
Use case: precise-object-edit
Input images: Image 1: room photo
Primary request: replace ONLY the white chairs with wooden chairs
Constraints: preserve camera angle, room lighting, floor shadows, and surrounding objects; keep all other aspects unchanged
```
### lighting-weather
```
Use case: lighting-weather
Input images: Image 1: original photo
Primary request: make it look like a winter evening with gentle snowfall
Constraints: preserve subject identity, geometry, camera angle, and composition; change only lighting, atmosphere, and weather
Quality: high
```
### background-extraction
```
Use case: background-extraction
Input images: Image 1: product photo
Primary request: extract the product on a transparent background
Output: transparent background (RGBA PNG)
Constraints: crisp silhouette, no halos/fringing; preserve label text exactly; no restyling
```
### style-transfer
```
Use case: style-transfer
Input images: Image 1: style reference
Primary request: apply Image 1's visual style to a man riding a motorcycle on a white background
Constraints: preserve palette, texture, and brushwork; no extra elements; plain white background
```
### compositing
```
Use case: compositing
Input images: Image 1: base scene; Image 2: subject to insert
Primary request: place the subject from Image 2 next to the person in Image 1
Constraints: match lighting, perspective, and scale; keep background and framing unchanged; no extra elements
Input fidelity (edits): high
```
### sketch-to-render
```
Use case: sketch-to-render
Input images: Image 1: drawing
Primary request: turn the drawing into a photorealistic image
Constraints: preserve layout, proportions, and perspective; choose realistic materials and lighting; do not add new elements or text
Quality: high
```

View File

@@ -1,876 +0,0 @@
#!/usr/bin/env python3
"""Generate or edit images with the OpenAI Image API.
Defaults to gpt-image-1.5 and a structured prompt augmentation workflow.
"""
from __future__ import annotations
import argparse
import asyncio
import base64
import json
import os
from pathlib import Path
import re
import sys
import time
from typing import Any, Dict, Iterable, List, Optional, Tuple
from io import BytesIO
DEFAULT_MODEL = "gpt-image-1.5"
DEFAULT_SIZE = "1024x1024"
DEFAULT_QUALITY = "auto"
DEFAULT_OUTPUT_FORMAT = "png"
DEFAULT_CONCURRENCY = 5
DEFAULT_DOWNSCALE_SUFFIX = "-web"
ALLOWED_SIZES = {"1024x1024", "1536x1024", "1024x1536", "auto"}
ALLOWED_QUALITIES = {"low", "medium", "high", "auto"}
ALLOWED_BACKGROUNDS = {"transparent", "opaque", "auto", None}
MAX_IMAGE_BYTES = 50 * 1024 * 1024
MAX_BATCH_JOBS = 500
def _die(message: str, code: int = 1) -> None:
print(f"Error: {message}", file=sys.stderr)
raise SystemExit(code)
def _warn(message: str) -> None:
print(f"Warning: {message}", file=sys.stderr)
def _ensure_api_key(dry_run: bool) -> None:
if os.getenv("OPENAI_API_KEY"):
print("OPENAI_API_KEY is set.", file=sys.stderr)
return
if dry_run:
_warn("OPENAI_API_KEY is not set; dry-run only.")
return
_die("OPENAI_API_KEY is not set. Export it before running.")
def _read_prompt(prompt: Optional[str], prompt_file: Optional[str]) -> str:
if prompt and prompt_file:
_die("Use --prompt or --prompt-file, not both.")
if prompt_file:
path = Path(prompt_file)
if not path.exists():
_die(f"Prompt file not found: {path}")
return path.read_text(encoding="utf-8").strip()
if prompt:
return prompt.strip()
_die("Missing prompt. Use --prompt or --prompt-file.")
return "" # unreachable
def _check_image_paths(paths: Iterable[str]) -> List[Path]:
resolved: List[Path] = []
for raw in paths:
path = Path(raw)
if not path.exists():
_die(f"Image file not found: {path}")
if path.stat().st_size > MAX_IMAGE_BYTES:
_warn(f"Image exceeds 50MB limit: {path}")
resolved.append(path)
return resolved
def _normalize_output_format(fmt: Optional[str]) -> str:
if not fmt:
return DEFAULT_OUTPUT_FORMAT
fmt = fmt.lower()
if fmt not in {"png", "jpeg", "jpg", "webp"}:
_die("output-format must be png, jpeg, jpg, or webp.")
return "jpeg" if fmt == "jpg" else fmt
def _validate_size(size: str) -> None:
if size not in ALLOWED_SIZES:
_die(
"size must be one of 1024x1024, 1536x1024, 1024x1536, or auto for GPT image models."
)
def _validate_quality(quality: str) -> None:
if quality not in ALLOWED_QUALITIES:
_die("quality must be one of low, medium, high, or auto.")
def _validate_background(background: Optional[str]) -> None:
if background not in ALLOWED_BACKGROUNDS:
_die("background must be one of transparent, opaque, or auto.")
def _validate_transparency(background: Optional[str], output_format: str) -> None:
if background == "transparent" and output_format not in {"png", "webp"}:
_die("transparent background requires output-format png or webp.")
def _validate_generate_payload(payload: Dict[str, Any]) -> None:
n = int(payload.get("n", 1))
if n < 1 or n > 10:
_die("n must be between 1 and 10")
size = str(payload.get("size", DEFAULT_SIZE))
quality = str(payload.get("quality", DEFAULT_QUALITY))
background = payload.get("background")
_validate_size(size)
_validate_quality(quality)
_validate_background(background)
oc = payload.get("output_compression")
if oc is not None and not (0 <= int(oc) <= 100):
_die("output_compression must be between 0 and 100")
def _build_output_paths(
out: str,
output_format: str,
count: int,
out_dir: Optional[str],
) -> List[Path]:
ext = "." + output_format
if out_dir:
out_base = Path(out_dir)
out_base.mkdir(parents=True, exist_ok=True)
return [out_base / f"image_{i}{ext}" for i in range(1, count + 1)]
out_path = Path(out)
    if out_path.is_dir():
return [out_path / f"image_{i}{ext}" for i in range(1, count + 1)]
if out_path.suffix == "":
out_path = out_path.with_suffix(ext)
elif output_format and out_path.suffix.lstrip(".").lower() != output_format:
_warn(
f"Output extension {out_path.suffix} does not match output-format {output_format}."
)
if count == 1:
return [out_path]
return [
out_path.with_name(f"{out_path.stem}-{i}{out_path.suffix}")
for i in range(1, count + 1)
]
def _augment_prompt(args: argparse.Namespace, prompt: str) -> str:
fields = _fields_from_args(args)
return _augment_prompt_fields(args.augment, prompt, fields)
def _augment_prompt_fields(augment: bool, prompt: str, fields: Dict[str, Optional[str]]) -> str:
if not augment:
return prompt
sections: List[str] = []
if fields.get("use_case"):
sections.append(f"Use case: {fields['use_case']}")
sections.append(f"Primary request: {prompt}")
if fields.get("scene"):
sections.append(f"Scene/background: {fields['scene']}")
if fields.get("subject"):
sections.append(f"Subject: {fields['subject']}")
if fields.get("style"):
sections.append(f"Style/medium: {fields['style']}")
if fields.get("composition"):
sections.append(f"Composition/framing: {fields['composition']}")
if fields.get("lighting"):
sections.append(f"Lighting/mood: {fields['lighting']}")
if fields.get("palette"):
sections.append(f"Color palette: {fields['palette']}")
if fields.get("materials"):
sections.append(f"Materials/textures: {fields['materials']}")
if fields.get("text"):
sections.append(f"Text (verbatim): \"{fields['text']}\"")
if fields.get("constraints"):
sections.append(f"Constraints: {fields['constraints']}")
if fields.get("negative"):
sections.append(f"Avoid: {fields['negative']}")
return "\n".join(sections)
def _fields_from_args(args: argparse.Namespace) -> Dict[str, Optional[str]]:
return {
"use_case": getattr(args, "use_case", None),
"scene": getattr(args, "scene", None),
"subject": getattr(args, "subject", None),
"style": getattr(args, "style", None),
"composition": getattr(args, "composition", None),
"lighting": getattr(args, "lighting", None),
"palette": getattr(args, "palette", None),
"materials": getattr(args, "materials", None),
"text": getattr(args, "text", None),
"constraints": getattr(args, "constraints", None),
"negative": getattr(args, "negative", None),
}
def _print_request(payload: dict) -> None:
print(json.dumps(payload, indent=2, sort_keys=True))
def _decode_and_write(images: List[str], outputs: List[Path], force: bool) -> None:
for idx, image_b64 in enumerate(images):
if idx >= len(outputs):
break
out_path = outputs[idx]
if out_path.exists() and not force:
_die(f"Output already exists: {out_path} (use --force to overwrite)")
out_path.parent.mkdir(parents=True, exist_ok=True)
out_path.write_bytes(base64.b64decode(image_b64))
print(f"Wrote {out_path}")
def _derive_downscale_path(path: Path, suffix: str) -> Path:
if suffix and not suffix.startswith("-") and not suffix.startswith("_"):
suffix = "-" + suffix
return path.with_name(f"{path.stem}{suffix}{path.suffix}")
def _downscale_image_bytes(image_bytes: bytes, *, max_dim: int, output_format: str) -> bytes:
try:
from PIL import Image
except Exception:
_die(
"Downscaling requires Pillow. Install with `uv pip install pillow` (then re-run)."
)
if max_dim < 1:
_die("--downscale-max-dim must be >= 1")
with Image.open(BytesIO(image_bytes)) as img:
img.load()
w, h = img.size
scale = min(1.0, float(max_dim) / float(max(w, h)))
target = (max(1, int(round(w * scale))), max(1, int(round(h * scale))))
resized = img if target == (w, h) else img.resize(target, Image.Resampling.LANCZOS)
fmt = output_format.lower()
if fmt == "jpg":
fmt = "jpeg"
if fmt == "jpeg":
if resized.mode in ("RGBA", "LA") or ("transparency" in getattr(resized, "info", {})):
bg = Image.new("RGB", resized.size, (255, 255, 255))
bg.paste(resized.convert("RGBA"), mask=resized.convert("RGBA").split()[-1])
resized = bg
else:
resized = resized.convert("RGB")
out = BytesIO()
resized.save(out, format=fmt.upper())
return out.getvalue()
def _decode_write_and_downscale(
images: List[str],
outputs: List[Path],
*,
force: bool,
downscale_max_dim: Optional[int],
downscale_suffix: str,
output_format: str,
) -> None:
for idx, image_b64 in enumerate(images):
if idx >= len(outputs):
break
out_path = outputs[idx]
if out_path.exists() and not force:
_die(f"Output already exists: {out_path} (use --force to overwrite)")
out_path.parent.mkdir(parents=True, exist_ok=True)
raw = base64.b64decode(image_b64)
out_path.write_bytes(raw)
print(f"Wrote {out_path}")
if downscale_max_dim is None:
continue
derived = _derive_downscale_path(out_path, downscale_suffix)
if derived.exists() and not force:
_die(f"Output already exists: {derived} (use --force to overwrite)")
derived.parent.mkdir(parents=True, exist_ok=True)
resized = _downscale_image_bytes(raw, max_dim=downscale_max_dim, output_format=output_format)
derived.write_bytes(resized)
print(f"Wrote {derived}")
def _create_client():
try:
from openai import OpenAI
    except ImportError:
_die("openai SDK not installed. Install with `uv pip install openai`.")
return OpenAI()
def _create_async_client():
try:
from openai import AsyncOpenAI
except ImportError:
try:
import openai as _openai # noqa: F401
except ImportError:
_die("openai SDK not installed. Install with `uv pip install openai`.")
_die(
"AsyncOpenAI not available in this openai SDK version. Upgrade with `uv pip install -U openai`."
)
return AsyncOpenAI()
def _slugify(value: str) -> str:
value = value.strip().lower()
value = re.sub(r"[^a-z0-9]+", "-", value)
value = re.sub(r"-{2,}", "-", value).strip("-")
return value[:60] if value else "job"
def _normalize_job(job: Any, idx: int) -> Dict[str, Any]:
if isinstance(job, str):
prompt = job.strip()
if not prompt:
_die(f"Empty prompt at job {idx}")
return {"prompt": prompt}
if isinstance(job, dict):
if "prompt" not in job or not str(job["prompt"]).strip():
_die(f"Missing prompt for job {idx}")
return job
_die(f"Invalid job at index {idx}: expected string or object.")
return {} # unreachable
def _read_jobs_jsonl(path: str) -> List[Dict[str, Any]]:
p = Path(path)
if not p.exists():
_die(f"Input file not found: {p}")
jobs: List[Dict[str, Any]] = []
for line_no, raw in enumerate(p.read_text(encoding="utf-8").splitlines(), start=1):
line = raw.strip()
if not line or line.startswith("#"):
continue
try:
item: Any
if line.startswith("{"):
item = json.loads(line)
else:
item = line
jobs.append(_normalize_job(item, idx=line_no))
except json.JSONDecodeError as exc:
_die(f"Invalid JSON on line {line_no}: {exc}")
if not jobs:
_die("No jobs found in input file.")
if len(jobs) > MAX_BATCH_JOBS:
_die(f"Too many jobs ({len(jobs)}). Max is {MAX_BATCH_JOBS}.")
return jobs
def _merge_non_null(dst: Dict[str, Any], src: Dict[str, Any]) -> Dict[str, Any]:
merged = dict(dst)
for k, v in src.items():
if v is not None:
merged[k] = v
return merged
def _job_output_paths(
*,
out_dir: Path,
output_format: str,
idx: int,
prompt: str,
n: int,
explicit_out: Optional[str],
) -> List[Path]:
out_dir.mkdir(parents=True, exist_ok=True)
ext = "." + output_format
if explicit_out:
base = Path(explicit_out)
if base.suffix == "":
base = base.with_suffix(ext)
elif base.suffix.lstrip(".").lower() != output_format:
_warn(
f"Job {idx}: output extension {base.suffix} does not match output-format {output_format}."
)
base = out_dir / base.name
else:
slug = _slugify(prompt[:80])
base = out_dir / f"{idx:03d}-{slug}{ext}"
if n == 1:
return [base]
return [
base.with_name(f"{base.stem}-{i}{base.suffix}")
for i in range(1, n + 1)
]
def _extract_retry_after_seconds(exc: Exception) -> Optional[float]:
# Best-effort: openai SDK errors vary by version. Prefer a conservative fallback.
for attr in ("retry_after", "retry_after_seconds"):
val = getattr(exc, attr, None)
if isinstance(val, (int, float)) and val >= 0:
return float(val)
msg = str(exc)
    m = re.search(r"retry[- ]after[:= ]+([0-9]+(?:\.[0-9]+)?)", msg, re.IGNORECASE)
if m:
try:
return float(m.group(1))
except Exception:
return None
return None
def _is_rate_limit_error(exc: Exception) -> bool:
name = exc.__class__.__name__.lower()
if "ratelimit" in name or "rate_limit" in name:
return True
msg = str(exc).lower()
return "429" in msg or "rate limit" in msg or "too many requests" in msg
def _is_transient_error(exc: Exception) -> bool:
if _is_rate_limit_error(exc):
return True
name = exc.__class__.__name__.lower()
if "timeout" in name or "timedout" in name or "tempor" in name:
return True
msg = str(exc).lower()
return "timeout" in msg or "timed out" in msg or "connection reset" in msg
async def _generate_one_with_retries(
client: Any,
payload: Dict[str, Any],
*,
attempts: int,
job_label: str,
) -> Any:
last_exc: Optional[Exception] = None
for attempt in range(1, attempts + 1):
try:
return await client.images.generate(**payload)
except Exception as exc:
last_exc = exc
if not _is_transient_error(exc):
raise
if attempt == attempts:
raise
sleep_s = _extract_retry_after_seconds(exc)
if sleep_s is None:
sleep_s = min(60.0, 2.0**attempt)
print(
f"{job_label} attempt {attempt}/{attempts} failed ({exc.__class__.__name__}); retrying in {sleep_s:.1f}s",
file=sys.stderr,
)
await asyncio.sleep(sleep_s)
raise last_exc or RuntimeError("unknown error")
async def _run_generate_batch(args: argparse.Namespace) -> int:
jobs = _read_jobs_jsonl(args.input)
out_dir = Path(args.out_dir)
base_fields = _fields_from_args(args)
base_payload = {
"model": args.model,
"n": args.n,
"size": args.size,
"quality": args.quality,
"background": args.background,
"output_format": args.output_format,
"output_compression": args.output_compression,
"moderation": args.moderation,
}
if args.dry_run:
for i, job in enumerate(jobs, start=1):
prompt = str(job["prompt"]).strip()
fields = _merge_non_null(base_fields, job.get("fields", {}))
# Allow flat job keys as well (use_case, scene, etc.)
fields = _merge_non_null(fields, {k: job.get(k) for k in base_fields.keys()})
augmented = _augment_prompt_fields(args.augment, prompt, fields)
job_payload = dict(base_payload)
job_payload["prompt"] = augmented
job_payload = _merge_non_null(job_payload, {k: job.get(k) for k in base_payload.keys()})
job_payload = {k: v for k, v in job_payload.items() if v is not None}
_validate_generate_payload(job_payload)
effective_output_format = _normalize_output_format(job_payload.get("output_format"))
_validate_transparency(job_payload.get("background"), effective_output_format)
if "output_format" in job_payload:
job_payload["output_format"] = effective_output_format
n = int(job_payload.get("n", 1))
outputs = _job_output_paths(
out_dir=out_dir,
output_format=effective_output_format,
idx=i,
prompt=prompt,
n=n,
explicit_out=job.get("out"),
)
downscaled = None
if args.downscale_max_dim is not None:
downscaled = [
str(_derive_downscale_path(p, args.downscale_suffix)) for p in outputs
]
_print_request(
{
"endpoint": "/v1/images/generations",
"job": i,
"outputs": [str(p) for p in outputs],
"outputs_downscaled": downscaled,
**job_payload,
}
)
return 0
client = _create_async_client()
sem = asyncio.Semaphore(args.concurrency)
any_failed = False
async def run_job(i: int, job: Dict[str, Any]) -> Tuple[int, Optional[str]]:
nonlocal any_failed
prompt = str(job["prompt"]).strip()
job_label = f"[job {i}/{len(jobs)}]"
fields = _merge_non_null(base_fields, job.get("fields", {}))
fields = _merge_non_null(fields, {k: job.get(k) for k in base_fields.keys()})
augmented = _augment_prompt_fields(args.augment, prompt, fields)
payload = dict(base_payload)
payload["prompt"] = augmented
payload = _merge_non_null(payload, {k: job.get(k) for k in base_payload.keys()})
payload = {k: v for k, v in payload.items() if v is not None}
n = int(payload.get("n", 1))
_validate_generate_payload(payload)
effective_output_format = _normalize_output_format(payload.get("output_format"))
_validate_transparency(payload.get("background"), effective_output_format)
if "output_format" in payload:
payload["output_format"] = effective_output_format
outputs = _job_output_paths(
out_dir=out_dir,
output_format=effective_output_format,
idx=i,
prompt=prompt,
n=n,
explicit_out=job.get("out"),
)
try:
async with sem:
print(f"{job_label} starting", file=sys.stderr)
started = time.time()
result = await _generate_one_with_retries(
client,
payload,
attempts=args.max_attempts,
job_label=job_label,
)
elapsed = time.time() - started
print(f"{job_label} completed in {elapsed:.1f}s", file=sys.stderr)
images = [item.b64_json for item in result.data]
_decode_write_and_downscale(
images,
outputs,
force=args.force,
downscale_max_dim=args.downscale_max_dim,
downscale_suffix=args.downscale_suffix,
output_format=effective_output_format,
)
return i, None
except Exception as exc:
any_failed = True
print(f"{job_label} failed: {exc}", file=sys.stderr)
if args.fail_fast:
raise
return i, str(exc)
tasks = [asyncio.create_task(run_job(i, job)) for i, job in enumerate(jobs, start=1)]
try:
await asyncio.gather(*tasks)
except Exception:
for t in tasks:
if not t.done():
t.cancel()
raise
return 1 if any_failed else 0
def _generate_batch(args: argparse.Namespace) -> None:
exit_code = asyncio.run(_run_generate_batch(args))
if exit_code:
raise SystemExit(exit_code)
def _generate(args: argparse.Namespace) -> None:
prompt = _read_prompt(args.prompt, args.prompt_file)
prompt = _augment_prompt(args, prompt)
payload = {
"model": args.model,
"prompt": prompt,
"n": args.n,
"size": args.size,
"quality": args.quality,
"background": args.background,
"output_format": args.output_format,
"output_compression": args.output_compression,
"moderation": args.moderation,
}
payload = {k: v for k, v in payload.items() if v is not None}
output_format = _normalize_output_format(args.output_format)
_validate_transparency(args.background, output_format)
if "output_format" in payload:
payload["output_format"] = output_format
output_paths = _build_output_paths(args.out, output_format, args.n, args.out_dir)
if args.dry_run:
_print_request({"endpoint": "/v1/images/generations", **payload})
return
print(
"Calling Image API (generation). This can take up to a couple of minutes.",
file=sys.stderr,
)
started = time.time()
client = _create_client()
result = client.images.generate(**payload)
elapsed = time.time() - started
print(f"Generation completed in {elapsed:.1f}s.", file=sys.stderr)
images = [item.b64_json for item in result.data]
_decode_write_and_downscale(
images,
output_paths,
force=args.force,
downscale_max_dim=args.downscale_max_dim,
downscale_suffix=args.downscale_suffix,
output_format=output_format,
)
def _edit(args: argparse.Namespace) -> None:
prompt = _read_prompt(args.prompt, args.prompt_file)
prompt = _augment_prompt(args, prompt)
image_paths = _check_image_paths(args.image)
mask_path = Path(args.mask) if args.mask else None
if mask_path:
if not mask_path.exists():
_die(f"Mask file not found: {mask_path}")
if mask_path.suffix.lower() != ".png":
_warn(f"Mask should be a PNG with an alpha channel: {mask_path}")
if mask_path.stat().st_size > MAX_IMAGE_BYTES:
_warn(f"Mask exceeds 50MB limit: {mask_path}")
payload = {
"model": args.model,
"prompt": prompt,
"n": args.n,
"size": args.size,
"quality": args.quality,
"background": args.background,
"output_format": args.output_format,
"output_compression": args.output_compression,
"input_fidelity": args.input_fidelity,
"moderation": args.moderation,
}
payload = {k: v for k, v in payload.items() if v is not None}
output_format = _normalize_output_format(args.output_format)
_validate_transparency(args.background, output_format)
if "output_format" in payload:
payload["output_format"] = output_format
output_paths = _build_output_paths(args.out, output_format, args.n, args.out_dir)
if args.dry_run:
payload_preview = dict(payload)
payload_preview["image"] = [str(p) for p in image_paths]
if mask_path:
payload_preview["mask"] = str(mask_path)
_print_request({"endpoint": "/v1/images/edits", **payload_preview})
return
print(
f"Calling Image API (edit) with {len(image_paths)} image(s).",
file=sys.stderr,
)
started = time.time()
client = _create_client()
with _open_files(image_paths) as image_files, _open_mask(mask_path) as mask_file:
request = dict(payload)
request["image"] = image_files if len(image_files) > 1 else image_files[0]
if mask_file is not None:
request["mask"] = mask_file
result = client.images.edit(**request)
elapsed = time.time() - started
print(f"Edit completed in {elapsed:.1f}s.", file=sys.stderr)
images = [item.b64_json for item in result.data]
_decode_write_and_downscale(
images,
output_paths,
force=args.force,
downscale_max_dim=args.downscale_max_dim,
downscale_suffix=args.downscale_suffix,
output_format=output_format,
)
def _open_files(paths: List[Path]):
return _FileBundle(paths)
def _open_mask(mask_path: Optional[Path]):
if mask_path is None:
return _NullContext()
return _SingleFile(mask_path)
class _NullContext:
def __enter__(self):
return None
def __exit__(self, exc_type, exc, tb):
return False
class _SingleFile:
def __init__(self, path: Path):
self._path = path
self._handle = None
def __enter__(self):
self._handle = self._path.open("rb")
return self._handle
def __exit__(self, exc_type, exc, tb):
if self._handle:
try:
self._handle.close()
except Exception:
pass
return False
class _FileBundle:
def __init__(self, paths: List[Path]):
self._paths = paths
self._handles: List[object] = []
def __enter__(self):
self._handles = [p.open("rb") for p in self._paths]
return self._handles
def __exit__(self, exc_type, exc, tb):
for handle in self._handles:
try:
handle.close()
except Exception:
pass
return False
def _add_shared_args(parser: argparse.ArgumentParser) -> None:
parser.add_argument("--model", default=DEFAULT_MODEL)
parser.add_argument("--prompt")
parser.add_argument("--prompt-file")
parser.add_argument("--n", type=int, default=1)
parser.add_argument("--size", default=DEFAULT_SIZE)
parser.add_argument("--quality", default=DEFAULT_QUALITY)
parser.add_argument("--background")
parser.add_argument("--output-format")
parser.add_argument("--output-compression", type=int)
parser.add_argument("--moderation")
parser.add_argument("--out", default="output.png")
parser.add_argument("--out-dir")
parser.add_argument("--force", action="store_true")
parser.add_argument("--dry-run", action="store_true")
parser.add_argument("--augment", dest="augment", action="store_true")
parser.add_argument("--no-augment", dest="augment", action="store_false")
parser.set_defaults(augment=True)
# Prompt augmentation hints
parser.add_argument("--use-case")
parser.add_argument("--scene")
parser.add_argument("--subject")
parser.add_argument("--style")
parser.add_argument("--composition")
parser.add_argument("--lighting")
parser.add_argument("--palette")
parser.add_argument("--materials")
parser.add_argument("--text")
parser.add_argument("--constraints")
parser.add_argument("--negative")
# Post-processing (optional): generate an additional downscaled copy for fast web loading.
parser.add_argument("--downscale-max-dim", type=int)
parser.add_argument("--downscale-suffix", default=DEFAULT_DOWNSCALE_SUFFIX)
def main() -> int:
parser = argparse.ArgumentParser(description="Generate or edit images via the Image API")
subparsers = parser.add_subparsers(dest="command", required=True)
gen_parser = subparsers.add_parser("generate", help="Create a new image")
_add_shared_args(gen_parser)
gen_parser.set_defaults(func=_generate)
batch_parser = subparsers.add_parser(
"generate-batch",
help="Generate multiple prompts concurrently (JSONL input)",
)
_add_shared_args(batch_parser)
batch_parser.add_argument("--input", required=True, help="Path to JSONL file (one job per line)")
batch_parser.add_argument("--concurrency", type=int, default=DEFAULT_CONCURRENCY)
batch_parser.add_argument("--max-attempts", type=int, default=3)
batch_parser.add_argument("--fail-fast", action="store_true")
batch_parser.set_defaults(func=_generate_batch)
edit_parser = subparsers.add_parser("edit", help="Edit an existing image")
_add_shared_args(edit_parser)
edit_parser.add_argument("--image", action="append", required=True)
edit_parser.add_argument("--mask")
edit_parser.add_argument("--input-fidelity")
edit_parser.set_defaults(func=_edit)
args = parser.parse_args()
if args.n < 1 or args.n > 10:
_die("--n must be between 1 and 10")
if getattr(args, "concurrency", 1) < 1 or getattr(args, "concurrency", 1) > 25:
_die("--concurrency must be between 1 and 25")
if getattr(args, "max_attempts", 3) < 1 or getattr(args, "max_attempts", 3) > 10:
_die("--max-attempts must be between 1 and 10")
if args.output_compression is not None and not (0 <= args.output_compression <= 100):
_die("--output-compression must be between 0 and 100")
if args.command == "generate-batch" and not args.out_dir:
_die("generate-batch requires --out-dir")
if getattr(args, "downscale_max_dim", None) is not None and args.downscale_max_dim < 1:
_die("--downscale-max-dim must be >= 1")
_validate_size(args.size)
_validate_quality(args.quality)
_validate_background(args.background)
_ensure_api_key(args.dry_run)
args.func(args)
return 0
if __name__ == "__main__":
raise SystemExit(main())
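The `generate-batch` subcommand reads one job per line, where a line is either a bare prompt string or a JSON object with a `prompt` plus optional overrides (per `_read_jobs_jsonl` and `_normalize_job` above). A minimal sketch of building such a file — the `images.py` filename in the comment is hypothetical, since this script's actual name is not shown:

```python
import json
from pathlib import Path

# Three job shapes accepted by _read_jobs_jsonl:
#   1. a bare prompt string (line must not start with "{")
#   2. an object overriding payload keys like "size" and "n"
#   3. an object with a nested "fields" dict for prompt augmentation
jobs = [
    "a watercolor fox in a misty forest",
    {"prompt": "site hero banner", "size": "1536x1024", "n": 2},
    {"prompt": "logo mark", "fields": {"style": "flat vector", "palette": "teal and cream"}},
]
lines = [json.dumps(j) if isinstance(j, dict) else j for j in jobs]
Path("jobs.jsonl").write_text("\n".join(lines) + "\n", encoding="utf-8")

# Hypothetical invocation (script name assumed):
#   python images.py generate-batch --input jobs.jsonl --out-dir out/ --dry-run
```

Note that plain-string jobs are written raw, not JSON-encoded: a quoted string would not start with `{`, so the parser would treat the surrounding quotes as part of the prompt.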


@@ -1,227 +0,0 @@
---
name: ios-simulator-skill
version: 1.3.0
description: 21 production-ready scripts for iOS app testing, building, and automation. Provides semantic UI navigation, build automation, accessibility testing, and simulator lifecycle management. Optimized for AI agents with minimal token output.
---
# iOS Simulator Skill
Build, test, and automate iOS applications using accessibility-driven navigation and structured data instead of pixel coordinates.
## Quick Start
```bash
# 1. Check environment
bash scripts/sim_health_check.sh
# 2. Launch app
python scripts/app_launcher.py --launch com.example.app
# 3. Map screen to see elements
python scripts/screen_mapper.py
# 4. Tap button
python scripts/navigator.py --find-text "Login" --tap
# 5. Enter text
python scripts/navigator.py --find-type TextField --enter-text "user@example.com"
```
All scripts support `--help` for detailed options and `--json` for machine-readable output.
## 21 Production Scripts
### Build & Development (2 scripts)
1. **build_and_test.py** - Build Xcode projects, run tests, parse results with progressive disclosure
- Build with live result streaming
- Parse errors and warnings from xcresult bundles
- Retrieve detailed build logs on demand
- Options: `--project`, `--scheme`, `--clean`, `--test`, `--verbose`, `--json`
2. **log_monitor.py** - Real-time log monitoring with intelligent filtering
- Stream logs or capture by duration
- Filter by severity (error/warning/info/debug)
- Deduplicate repeated messages
- Options: `--app`, `--severity`, `--follow`, `--duration`, `--output`, `--json`
### Navigation & Interaction (5 scripts)
3. **screen_mapper.py** - Analyze current screen and list interactive elements
- Element type breakdown
- Interactive button list
- Text field status
- Options: `--verbose`, `--hints`, `--json`
4. **navigator.py** - Find and interact with elements semantically
- Find by text (fuzzy matching)
- Find by element type
- Find by accessibility ID
- Enter text or tap elements
- Options: `--find-text`, `--find-type`, `--find-id`, `--tap`, `--enter-text`, `--json`
5. **gesture.py** - Perform swipes, scrolls, pinches, and complex gestures
- Directional swipes (up/down/left/right)
- Multi-swipe scrolling
- Pinch zoom
- Long press
- Pull to refresh
- Options: `--swipe`, `--scroll`, `--pinch`, `--long-press`, `--refresh`, `--json`
6. **keyboard.py** - Text input and hardware button control
- Type text (fast or slow)
- Special keys (return, delete, tab, space, arrows)
- Hardware buttons (home, lock, volume, screenshot)
- Key combinations
- Options: `--type`, `--key`, `--button`, `--slow`, `--clear`, `--dismiss`, `--json`
7. **app_launcher.py** - App lifecycle management
- Launch apps by bundle ID
- Terminate apps
- Install/uninstall from .app bundles
- Deep link navigation
- List installed apps
- Check app state
- Options: `--launch`, `--terminate`, `--install`, `--uninstall`, `--open-url`, `--list`, `--state`, `--json`
### Testing & Analysis (5 scripts)
8. **accessibility_audit.py** - Check WCAG compliance on current screen
- Critical issues (missing labels, empty buttons, no alt text)
- Warnings (missing hints, small touch targets)
- Info (missing IDs, deep nesting)
- Options: `--verbose`, `--output`, `--json`
9. **visual_diff.py** - Compare two screenshots for visual changes
- Pixel-by-pixel comparison
- Threshold-based pass/fail
- Generate diff images
- Options: `--threshold`, `--output`, `--details`, `--json`
10. **test_recorder.py** - Automatically document test execution
- Capture screenshots and accessibility trees per step
- Generate markdown reports with timing data
- Options: `--test-name`, `--output`, `--verbose`, `--json`
11. **app_state_capture.py** - Create comprehensive debugging snapshots
- Screenshot, UI hierarchy, app logs, device info
- Markdown summary for bug reports
- Options: `--app-bundle-id`, `--output`, `--log-lines`, `--json`
12. **sim_health_check.sh** - Verify environment is properly configured
- Check macOS, Xcode, simctl, IDB, Python
- List available and booted simulators
- Verify Python packages (Pillow)
### Advanced Testing & Permissions (4 scripts)
13. **clipboard.py** - Manage simulator clipboard for paste testing
- Copy text to clipboard
- Test paste flows without manual entry
- Options: `--copy`, `--test-name`, `--expected`, `--json`
14. **status_bar.py** - Override simulator status bar appearance
- Presets: clean (9:41, 100% battery), testing (11:11, 50%), low-battery (20%), airplane (offline)
- Custom time, network, battery, WiFi settings
- Options: `--preset`, `--time`, `--data-network`, `--battery-level`, `--clear`, `--json`
15. **push_notification.py** - Send simulated push notifications
- Simple mode (title + body + badge)
- Custom JSON payloads
- Test notification handling and deep links
- Options: `--bundle-id`, `--title`, `--body`, `--badge`, `--payload`, `--json`
16. **privacy_manager.py** - Grant, revoke, and reset app permissions
- 13 supported services (camera, microphone, location, contacts, photos, calendar, health, etc.)
- Batch operations (comma-separated services)
- Audit trail with test scenario tracking
- Options: `--bundle-id`, `--grant`, `--revoke`, `--reset`, `--list`, `--json`
### Device Lifecycle Management (5 scripts)
17. **simctl_boot.py** - Boot simulators with optional readiness verification
- Boot by UDID or device name
- Wait for device ready with timeout
- Batch boot operations (--all, --type)
- Performance timing
- Options: `--udid`, `--name`, `--wait-ready`, `--timeout`, `--all`, `--type`, `--json`
18. **simctl_shutdown.py** - Gracefully shutdown simulators
- Shutdown by UDID or device name
- Optional verification of shutdown completion
- Batch shutdown operations
- Options: `--udid`, `--name`, `--verify`, `--timeout`, `--all`, `--type`, `--json`
19. **simctl_create.py** - Create simulators dynamically
- Create by device type and iOS version
- List available device types and runtimes
- Custom device naming
- Returns UDID for CI/CD integration
- Options: `--device`, `--runtime`, `--name`, `--list-devices`, `--list-runtimes`, `--json`
20. **simctl_delete.py** - Permanently delete simulators
- Delete by UDID or device name
- Safety confirmation by default (skip with --yes)
- Batch delete operations
- Smart deletion (--old N to keep N per device type)
- Options: `--udid`, `--name`, `--yes`, `--all`, `--type`, `--old`, `--json`
21. **simctl_erase.py** - Factory reset simulators without deletion
- Preserve device UUID (faster than delete+create)
- Erase all, by type, or booted simulators
- Optional verification
- Options: `--udid`, `--name`, `--verify`, `--timeout`, `--all`, `--type`, `--booted`, `--json`
## Common Patterns
**Auto-UDID Detection**: Most scripts auto-detect the booted simulator if `--udid` is not provided.
**Device Name Resolution**: Use device names (e.g., "iPhone 16 Pro") instead of UDIDs; scripts resolve them automatically.
**Batch Operations**: Many scripts support `--all` for all simulators or `--type iPhone` for device type filtering.
**Output Formats**: Default is concise human-readable output. Use `--json` for machine-readable output in CI/CD.
**Help**: All scripts support `--help` for detailed options and examples.
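In CI, the `--json` output is meant to be parsed rather than read. A minimal parsing sketch — the field names here (`elements`, `type`, `label`) are assumptions for illustration, not the scripts' documented schema, so check `screen_mapper.py --help` for the real shape:

```python
import json

def button_labels(raw: str) -> list[str]:
    """Extract button labels from hypothetical screen_mapper --json output."""
    data = json.loads(raw)
    # Keep only elements reported as buttons and collect their labels.
    return [e.get("label", "") for e in data.get("elements", []) if e.get("type") == "Button"]

sample = '{"elements": [{"type": "Button", "label": "Login"}, {"type": "Image", "label": "Logo"}]}'
print(button_labels(sample))  # ['Login']
```

The same pattern applies to any of the scripts above: capture stdout from the `--json` run, `json.loads` it, and assert on the fields your pipeline cares about.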
## Typical Workflow
1. Verify environment: `bash scripts/sim_health_check.sh`
2. Launch app: `python scripts/app_launcher.py --launch com.example.app`
3. Analyze screen: `python scripts/screen_mapper.py`
4. Interact: `python scripts/navigator.py --find-text "Button" --tap`
5. Verify: `python scripts/accessibility_audit.py`
6. Debug if needed: `python scripts/app_state_capture.py --app-bundle-id com.example.app`
## Requirements
- macOS 12+
- Xcode Command Line Tools
- Python 3
- IDB (optional, for interactive features)
## Documentation
- **SKILL.md** (this file) - Script reference and quick start
- **README.md** - Installation and examples
- **CLAUDE.md** - Architecture and implementation details
- **references/** - Deep documentation on specific topics
- **examples/** - Complete automation workflows
## Key Design Principles
**Semantic Navigation**: Find elements by meaning (text, type, ID) not pixel coordinates. Survives UI changes.
**Token Efficiency**: Concise default output (3-5 lines) with optional verbose and JSON modes for detailed results.
**Accessibility-First**: Built on standard accessibility APIs for reliability and compatibility.
**Zero Configuration**: Works immediately on any macOS with Xcode. No setup required.
**Structured Data**: Scripts output JSON or formatted text, not raw logs. Easy to parse and integrate.
**Auto-Learning**: Build system remembers your device preference. Configuration stored per-project.
---
Use these scripts directly or let Claude Code invoke them automatically when your request matches the skill description.


@@ -1,292 +0,0 @@
#!/usr/bin/env python3
"""
iOS Simulator Accessibility Audit
Scans the current simulator screen for accessibility compliance issues.
Optimized for minimal token output while maintaining functionality.
Usage: python scripts/accessibility_audit.py [options]
"""
import argparse
import json
import subprocess
import sys
from dataclasses import asdict, dataclass
from typing import Any
from common import flatten_tree, get_accessibility_tree, resolve_udid
@dataclass
class Issue:
"""Represents an accessibility issue."""
severity: str # critical, warning, info
rule: str
element_type: str
issue: str
fix: str
def to_dict(self) -> dict:
"""Convert to dictionary for JSON serialization."""
return asdict(self)
class AccessibilityAuditor:
"""Performs accessibility audits on iOS simulator screens."""
# Critical rules that block users
CRITICAL_RULES = {
"missing_label": lambda e: e.get("type") in ["Button", "Link"] and not e.get("AXLabel"),
"empty_button": lambda e: e.get("type") == "Button"
and not (e.get("AXLabel") or e.get("AXValue")),
"image_no_alt": lambda e: e.get("type") == "Image" and not e.get("AXLabel"),
}
# Warnings that degrade UX
WARNING_RULES = {
"missing_hint": lambda e: e.get("type") in ["Slider", "TextField"] and not e.get("help"),
"missing_traits": lambda e: e.get("type") and not e.get("traits"),
}
# Info level suggestions
INFO_RULES = {
"no_identifier": lambda e: not e.get("AXUniqueId"),
"deep_nesting": lambda e: e.get("depth", 0) > 5,
}
def __init__(self, udid: str | None = None):
"""Initialize auditor with optional device UDID."""
self.udid = udid
def get_accessibility_tree(self) -> dict:
"""Fetch accessibility tree from simulator using shared utility."""
return get_accessibility_tree(self.udid, nested=True)
@staticmethod
def _is_small_target(element: dict) -> bool:
"""Check if touch target is too small (< 44x44 points)."""
frame = element.get("frame", {})
width = frame.get("width", 0)
height = frame.get("height", 0)
return width < 44 or height < 44
def _flatten_tree(self, node: dict, depth: int = 0) -> list[dict]:
"""Flatten nested accessibility tree for easier processing using shared utility."""
return flatten_tree(node, depth)
def audit_element(self, element: dict) -> list[Issue]:
"""Audit a single element for accessibility issues."""
issues = []
# Check critical rules
for rule_name, rule_func in self.CRITICAL_RULES.items():
if rule_func(element):
issues.append(
Issue(
severity="critical",
rule=rule_name,
element_type=element.get("type", "Unknown"),
issue=self._get_issue_description(rule_name),
fix=self._get_fix_suggestion(rule_name),
)
)
# Check warnings (skip if critical issues found)
if not issues:
for rule_name, rule_func in self.WARNING_RULES.items():
if rule_func(element):
issues.append(
Issue(
severity="warning",
rule=rule_name,
element_type=element.get("type", "Unknown"),
issue=self._get_issue_description(rule_name),
fix=self._get_fix_suggestion(rule_name),
)
)
# Check info level (only if verbose or no other issues)
if not issues:
for rule_name, rule_func in self.INFO_RULES.items():
if rule_func(element):
issues.append(
Issue(
severity="info",
rule=rule_name,
element_type=element.get("type", "Unknown"),
issue=self._get_issue_description(rule_name),
fix=self._get_fix_suggestion(rule_name),
)
)
return issues
def _get_issue_description(self, rule: str) -> str:
"""Get human-readable issue description."""
descriptions = {
"missing_label": "Interactive element missing accessibility label",
"empty_button": "Button has no text or label",
"image_no_alt": "Image missing alternative text",
"missing_hint": "Complex control missing hint",
"small_touch_target": "Touch target smaller than 44x44pt",
"missing_traits": "Element missing accessibility traits",
"no_identifier": "Missing accessibility identifier",
"deep_nesting": "Deeply nested (>5 levels)",
}
return descriptions.get(rule, "Accessibility issue")
def _get_fix_suggestion(self, rule: str) -> str:
"""Get fix suggestion for issue."""
fixes = {
"missing_label": "Add accessibilityLabel",
"empty_button": "Set button title or accessibilityLabel",
"image_no_alt": "Add accessibilityLabel with description",
"missing_hint": "Add accessibilityHint",
"small_touch_target": "Increase to minimum 44x44pt",
"missing_traits": "Set appropriate accessibilityTraits",
"no_identifier": "Add accessibilityIdentifier for testing",
"deep_nesting": "Simplify view hierarchy",
}
return fixes.get(rule, "Review accessibility")
def audit(self, verbose: bool = False) -> dict[str, Any]:
"""Perform full accessibility audit."""
# Get accessibility tree
tree = self.get_accessibility_tree()
# Flatten for processing
elements = self._flatten_tree(tree)
# Audit each element
all_issues = []
for element in elements:
issues = self.audit_element(element)
for issue in issues:
issue_dict = issue.to_dict()
# Add minimal element info for context
issue_dict["element"] = {
"type": element.get("type", "Unknown"),
"label": element.get("AXLabel", "")[:30] if element.get("AXLabel") else None,
}
all_issues.append(issue_dict)
# Count by severity
critical = len([i for i in all_issues if i["severity"] == "critical"])
warning = len([i for i in all_issues if i["severity"] == "warning"])
info = len([i for i in all_issues if i["severity"] == "info"])
# Build result (token-optimized)
result = {
"summary": {
"total": len(elements),
"issues": len(all_issues),
"critical": critical,
"warning": warning,
"info": info,
}
}
if verbose:
# Full details only if requested
result["issues"] = all_issues
else:
# Default: top issues only (token-efficient)
result["top_issues"] = self._get_top_issues(all_issues)
return result
def _get_top_issues(self, issues: list[dict]) -> list[dict]:
"""Get top 3 issues grouped by type (token-efficient)."""
if not issues:
return []
# Group by rule
grouped = {}
for issue in issues:
rule = issue["rule"]
if rule not in grouped:
grouped[rule] = {
"severity": issue["severity"],
"rule": rule,
"count": 0,
"fix": issue["fix"],
}
grouped[rule]["count"] += 1
# Sort by severity and count
severity_order = {"critical": 0, "warning": 1, "info": 2}
sorted_issues = sorted(
grouped.values(), key=lambda x: (severity_order[x["severity"]], -x["count"])
)
return sorted_issues[:3]
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(
description="Audit iOS simulator screen for accessibility issues"
)
parser.add_argument(
"--udid",
help="Device UDID (auto-detects booted simulator if not provided)",
)
parser.add_argument("--output", help="Save JSON report to file")
parser.add_argument(
"--verbose", action="store_true", help="Include all issue details (increases output)"
)
args = parser.parse_args()
# Resolve UDID with auto-detection
try:
udid = resolve_udid(args.udid)
except RuntimeError as e:
print(f"Error: {e}")
sys.exit(1)
# Perform audit
auditor = AccessibilityAuditor(udid=udid)
try:
result = auditor.audit(verbose=args.verbose)
except Exception as e:
print(f"Error: {e}")
sys.exit(1)
# Output results
if args.output:
# Save to file
with open(args.output, "w") as f:
json.dump(result, f, indent=2)
# Print minimal summary
summary = result["summary"]
print(f"Audit complete: {summary['issues']} issues ({summary['critical']} critical)")
print(f"Report saved to: {args.output}")
# Print to stdout (token-optimized by default)
elif args.verbose:
print(json.dumps(result, indent=2))
else:
# Ultra-compact output
summary = result["summary"]
print(f"Elements: {summary['total']}, Issues: {summary['issues']}")
print(
f"Critical: {summary['critical']}, Warning: {summary['warning']}, Info: {summary['info']}"
)
if result.get("top_issues"):
print("\nTop issues:")
for issue in result["top_issues"]:
print(
f" [{issue['severity']}] {issue['rule']} ({issue['count']}x) - {issue['fix']}"
)
# Exit with error if critical issues found
if result["summary"]["critical"] > 0:
sys.exit(1)
if __name__ == "__main__":
main()
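The token-efficient grouping in `_get_top_issues` can be exercised in isolation. The standalone sketch below reproduces its sort order (severity first, then descending count) outside the auditor class; the sample issue dicts reuse field names from the script above:

```python
SEVERITY_ORDER = {"critical": 0, "warning": 1, "info": 2}

def top_issues(issues: list[dict], limit: int = 3) -> list[dict]:
    """Group issues by rule, then sort by severity and frequency."""
    grouped: dict[str, dict] = {}
    for issue in issues:
        entry = grouped.setdefault(
            issue["rule"],
            {"severity": issue["severity"], "rule": issue["rule"], "count": 0, "fix": issue["fix"]},
        )
        entry["count"] += 1
    return sorted(
        grouped.values(), key=lambda x: (SEVERITY_ORDER[x["severity"]], -x["count"])
    )[:limit]

sample = [
    {"rule": "no_identifier", "severity": "info", "fix": "Add accessibilityIdentifier for testing"},
    {"rule": "missing_label", "severity": "critical", "fix": "Add accessibilityLabel"},
    {"rule": "missing_label", "severity": "critical", "fix": "Add accessibilityLabel"},
]
print(top_issues(sample)[0]["rule"])  # missing_label
```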


@@ -1,322 +0,0 @@
#!/usr/bin/env python3
"""
iOS App Launcher - App Lifecycle Control
Launches, terminates, and manages iOS apps in the simulator.
Handles deep links and app switching.
Usage: python scripts/app_launcher.py --launch com.example.app
"""
import argparse
import contextlib
import subprocess
import sys
import time
from common import build_simctl_command, resolve_udid
class AppLauncher:
"""Controls app lifecycle on iOS simulator."""
def __init__(self, udid: str | None = None):
"""Initialize app launcher."""
self.udid = udid
def launch(self, bundle_id: str, wait_for_debugger: bool = False) -> tuple[bool, int | None]:
"""
Launch an app.
Args:
bundle_id: App bundle identifier
wait_for_debugger: Wait for debugger attachment
Returns:
(success, pid) tuple
"""
cmd = build_simctl_command("launch", self.udid, bundle_id)
if wait_for_debugger:
cmd.insert(3, "--wait-for-debugger") # Insert after "launch" operation
try:
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
# Parse PID from output if available
pid = None
if result.stdout:
# Output format: "com.example.app: <PID>"
parts = result.stdout.strip().split(":")
if len(parts) > 1:
with contextlib.suppress(ValueError):
pid = int(parts[1].strip())
return (True, pid)
except subprocess.CalledProcessError:
return (False, None)
def terminate(self, bundle_id: str) -> bool:
"""
Terminate an app.
Args:
bundle_id: App bundle identifier
Returns:
Success status
"""
cmd = build_simctl_command("terminate", self.udid, bundle_id)
try:
subprocess.run(cmd, capture_output=True, check=True)
return True
except subprocess.CalledProcessError:
return False
def install(self, app_path: str) -> bool:
"""
Install an app.
Args:
app_path: Path to .app bundle
Returns:
Success status
"""
cmd = build_simctl_command("install", self.udid, app_path)
try:
subprocess.run(cmd, capture_output=True, check=True)
return True
except subprocess.CalledProcessError:
return False
def uninstall(self, bundle_id: str) -> bool:
"""
Uninstall an app.
Args:
bundle_id: App bundle identifier
Returns:
Success status
"""
cmd = build_simctl_command("uninstall", self.udid, bundle_id)
try:
subprocess.run(cmd, capture_output=True, check=True)
return True
except subprocess.CalledProcessError:
return False
def open_url(self, url: str) -> bool:
"""
Open URL (for deep linking).
Args:
url: URL to open (http://, myapp://, etc.)
Returns:
Success status
"""
cmd = build_simctl_command("openurl", self.udid, url)
try:
subprocess.run(cmd, capture_output=True, check=True)
return True
except subprocess.CalledProcessError:
return False
def list_apps(self) -> list[dict[str, str]]:
"""
List installed apps.
Returns:
List of app info dictionaries
"""
cmd = build_simctl_command("listapps", self.udid)
try:
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
# simctl listapps emits a plist; convert it to JSON with plutil
plist_data = result.stdout
convert_cmd = ["plutil", "-convert", "json", "-o", "-", "-"]
convert_result = subprocess.run(
convert_cmd, check=False, input=plist_data, capture_output=True, text=True
)
apps = []
if convert_result.returncode == 0:
import json
try:
data = json.loads(convert_result.stdout)
for bundle_id, app_info in data.items():
# Skip system internal apps that are hidden
if app_info.get("ApplicationType") == "Hidden":
continue
apps.append(
{
"bundle_id": bundle_id,
"name": app_info.get(
"CFBundleDisplayName", app_info.get("CFBundleName", bundle_id)
),
"path": app_info.get("Path", ""),
"version": app_info.get("CFBundleVersion", "Unknown"),
"type": app_info.get("ApplicationType", "User"),
}
)
except json.JSONDecodeError:
pass
return apps
except subprocess.CalledProcessError:
return []
def get_app_state(self, bundle_id: str) -> str:
"""
Get app state (running, suspended, etc.).
Args:
bundle_id: App bundle identifier
Returns:
State string or 'unknown'
"""
# Check if app is running by trying to get its PID
cmd = build_simctl_command("spawn", self.udid, "launchctl", "list")
try:
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
if bundle_id in result.stdout:
return "running"
return "not running"
except subprocess.CalledProcessError:
return "unknown"
def restart_app(self, bundle_id: str, delay: float = 1.0) -> bool:
"""
Restart an app (terminate then launch).
Args:
bundle_id: App bundle identifier
delay: Delay between terminate and launch
Returns:
Success status
"""
# Terminate
self.terminate(bundle_id)
time.sleep(delay)
# Launch
success, _ = self.launch(bundle_id)
return success
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(description="Control iOS app lifecycle")
# Actions
parser.add_argument("--launch", help="Launch app by bundle ID")
parser.add_argument("--terminate", help="Terminate app by bundle ID")
parser.add_argument("--restart", help="Restart app by bundle ID")
parser.add_argument("--install", help="Install app from .app path")
parser.add_argument("--uninstall", help="Uninstall app by bundle ID")
parser.add_argument("--open-url", help="Open URL (deep link)")
parser.add_argument("--list", action="store_true", help="List installed apps")
parser.add_argument("--state", help="Get app state by bundle ID")
# Options
parser.add_argument(
"--wait-for-debugger", action="store_true", help="Wait for debugger when launching"
)
parser.add_argument(
"--udid",
help="Device UDID (auto-detects booted simulator if not provided)",
)
args = parser.parse_args()
# Resolve UDID with auto-detection
try:
udid = resolve_udid(args.udid)
except RuntimeError as e:
print(f"Error: {e}")
sys.exit(1)
launcher = AppLauncher(udid=udid)
# Execute requested action
if args.launch:
success, pid = launcher.launch(args.launch, args.wait_for_debugger)
if success:
if pid:
print(f"Launched {args.launch} (PID: {pid})")
else:
print(f"Launched {args.launch}")
else:
print(f"Failed to launch {args.launch}")
sys.exit(1)
elif args.terminate:
if launcher.terminate(args.terminate):
print(f"Terminated {args.terminate}")
else:
print(f"Failed to terminate {args.terminate}")
sys.exit(1)
elif args.restart:
if launcher.restart_app(args.restart):
print(f"Restarted {args.restart}")
else:
print(f"Failed to restart {args.restart}")
sys.exit(1)
elif args.install:
if launcher.install(args.install):
print(f"Installed {args.install}")
else:
print(f"Failed to install {args.install}")
sys.exit(1)
elif args.uninstall:
if launcher.uninstall(args.uninstall):
print(f"Uninstalled {args.uninstall}")
else:
print(f"Failed to uninstall {args.uninstall}")
sys.exit(1)
elif args.open_url:
if launcher.open_url(args.open_url):
print(f"Opened URL: {args.open_url}")
else:
print(f"Failed to open URL: {args.open_url}")
sys.exit(1)
elif args.list:
apps = launcher.list_apps()
if apps:
print(f"Installed apps ({len(apps)}):")
for app in apps[:10]: # Limit for token efficiency
print(f" {app['bundle_id']}: {app['name']} (v{app['version']})")
if len(apps) > 10:
print(f" ... and {len(apps) - 10} more")
else:
print("No apps found or failed to list")
elif args.state:
state = launcher.get_app_state(args.state)
print(f"{args.state}: {state}")
else:
parser.print_help()
sys.exit(1)
if __name__ == "__main__":
main()


@@ -1,391 +0,0 @@
#!/usr/bin/env python3
"""
App State Capture for iOS Simulator
Captures complete app state including screenshot, accessibility tree, and logs.
Optimized for minimal token output.
Usage: python scripts/app_state_capture.py [options]
"""
import argparse
import json
import subprocess
import sys
from datetime import datetime
from pathlib import Path
from common import (
capture_screenshot,
count_elements,
get_accessibility_tree,
resolve_udid,
)
class AppStateCapture:
"""Captures comprehensive app state for debugging."""
def __init__(
self,
app_bundle_id: str | None = None,
udid: str | None = None,
inline: bool = False,
screenshot_size: str = "half",
):
"""
Initialize state capture.
Args:
app_bundle_id: Optional app bundle ID for log filtering
udid: Optional device UDID (uses booted if not specified)
inline: If True, return screenshots as base64 (for vision-based automation)
screenshot_size: 'full', 'half', 'quarter', 'thumb' (default: 'half')
"""
self.app_bundle_id = app_bundle_id
self.udid = udid
self.inline = inline
self.screenshot_size = screenshot_size
def capture_screenshot(self, output_path: Path) -> bool:
"""Capture screenshot of current screen."""
cmd = ["xcrun", "simctl", "io"]
if self.udid:
cmd.append(self.udid)
else:
cmd.append("booted")
cmd.extend(["screenshot", str(output_path)])
try:
subprocess.run(cmd, capture_output=True, check=True)
return True
except subprocess.CalledProcessError:
return False
def capture_accessibility_tree(self, output_path: Path) -> dict:
"""Capture accessibility tree using shared utility."""
try:
# Use shared utility to fetch tree
tree = get_accessibility_tree(self.udid, nested=True)
# Save tree
with open(output_path, "w") as f:
json.dump(tree, f, indent=2)
# Return summary using shared utility
return {"captured": True, "element_count": count_elements(tree)}
except Exception as e:
return {"captured": False, "error": str(e)}
def capture_logs(self, output_path: Path, line_limit: int = 100) -> dict:
"""Capture recent app logs."""
if not self.app_bundle_id:
# Can't capture logs without app ID
return {"captured": False, "reason": "No app bundle ID specified"}
# Get app name from bundle ID (simplified)
app_name = self.app_bundle_id.split(".")[-1]
cmd = ["xcrun", "simctl", "spawn"]
if self.udid:
cmd.append(self.udid)
else:
cmd.append("booted")
cmd.extend(
[
"log",
"show",
"--predicate",
f'process == "{app_name}"',
"--last",
"1m", # Last 1 minute
"--style",
"compact",
]
)
try:
result = subprocess.run(cmd, check=False, capture_output=True, text=True, timeout=5)
logs = result.stdout
# Limit lines for token efficiency
lines = logs.split("\n")
if len(lines) > line_limit:
lines = lines[-line_limit:]
# Save logs
with open(output_path, "w") as f:
f.write("\n".join(lines))
# Analyze for issues
warning_count = sum(1 for line in lines if "warning" in line.lower())
error_count = sum(1 for line in lines if "error" in line.lower())
return {
"captured": True,
"lines": len(lines),
"warnings": warning_count,
"errors": error_count,
}
except (subprocess.CalledProcessError, subprocess.TimeoutExpired) as e:
return {"captured": False, "error": str(e)}
def capture_device_info(self) -> dict:
"""Get device information."""
cmd = ["xcrun", "simctl", "list", "devices", "booted"]
if self.udid:
# Specific device info
cmd = ["xcrun", "simctl", "list", "devices"]
try:
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
# Parse output for device info (simplified)
lines = result.stdout.split("\n")
device_info = {}
for line in lines:
if "iPhone" in line or "iPad" in line:
# Extract device name and state
parts = line.strip().split("(")
if parts:
device_info["name"] = parts[0].strip()
if len(parts) > 2:
device_info["udid"] = parts[1].replace(")", "").strip()
device_info["state"] = parts[2].replace(")", "").strip()
break
return device_info
except subprocess.CalledProcessError:
return {}
def capture_all(
self, output_dir: str, log_lines: int = 100, app_name: str | None = None
) -> dict:
"""
Capture complete app state.
Args:
output_dir: Directory to save artifacts
log_lines: Number of log lines to capture
app_name: App name for semantic naming (for inline mode)
Returns:
Summary of captured state
"""
# Create output directory (only if not in inline mode)
output_path = Path(output_dir)
timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
if not self.inline:
capture_dir = output_path / f"app-state-{timestamp}"
capture_dir.mkdir(parents=True, exist_ok=True)
else:
capture_dir = None
summary = {
"timestamp": datetime.now().isoformat(),
"screenshot_mode": "inline" if self.inline else "file",
}
if capture_dir:
summary["output_dir"] = str(capture_dir)
# Capture screenshot using new unified utility
screenshot_result = capture_screenshot(
self.udid,
size=self.screenshot_size,
inline=self.inline,
app_name=app_name,
)
if self.inline:
# Inline mode: store base64
summary["screenshot"] = {
"mode": "inline",
"base64": screenshot_result["base64_data"],
"width": screenshot_result["width"],
"height": screenshot_result["height"],
"size_preset": self.screenshot_size,
}
else:
# File mode: save to disk
screenshot_path = capture_dir / "screenshot.png"
# Move temp file to target location
import shutil
shutil.move(screenshot_result["file_path"], screenshot_path)
summary["screenshot"] = {
"mode": "file",
"file": "screenshot.png",
"size_bytes": screenshot_result["size_bytes"],
}
# Capture accessibility tree
if not self.inline or capture_dir:
accessibility_path = (capture_dir or output_path) / "accessibility-tree.json"
else:
accessibility_path = None
if accessibility_path:
tree_info = self.capture_accessibility_tree(accessibility_path)
summary["accessibility"] = tree_info
# Capture logs (if app ID provided)
if self.app_bundle_id:
if not self.inline or capture_dir:
logs_path = (capture_dir or output_path) / "app-logs.txt"
else:
logs_path = None
if logs_path:
log_info = self.capture_logs(logs_path, log_lines)
summary["logs"] = log_info
# Get device info
device_info = self.capture_device_info()
if device_info:
summary["device"] = device_info
# Save device info (file mode only)
if capture_dir:
with open(capture_dir / "device-info.json", "w") as f:
json.dump(device_info, f, indent=2)
# Save summary (file mode only)
if capture_dir:
with open(capture_dir / "summary.json", "w") as f:
json.dump(summary, f, indent=2)
# Create markdown summary
self._create_summary_md(capture_dir, summary)
return summary
def _create_summary_md(self, capture_dir: Path, summary: dict) -> None:
"""Create markdown summary file."""
md_path = capture_dir / "summary.md"
with open(md_path, "w") as f:
f.write("# App State Capture\n\n")
f.write(f"**Timestamp:** {summary['timestamp']}\n\n")
if "device" in summary:
f.write("## Device\n")
device = summary["device"]
f.write(f"- Name: {device.get('name', 'Unknown')}\n")
f.write(f"- UDID: {device.get('udid', 'N/A')}\n")
f.write(f"- State: {device.get('state', 'Unknown')}\n\n")
f.write("## Screenshot\n")
f.write("![Current Screen](screenshot.png)\n\n")
if "accessibility" in summary:
acc = summary["accessibility"]
f.write("## Accessibility\n")
if acc.get("captured"):
f.write(f"- Elements: {acc.get('element_count', 0)}\n")
else:
f.write(f"- Error: {acc.get('error', 'Unknown')}\n")
f.write("\n")
if "logs" in summary:
logs = summary["logs"]
f.write("## Logs\n")
if logs.get("captured"):
f.write(f"- Lines: {logs.get('lines', 0)}\n")
f.write(f"- Warnings: {logs.get('warnings', 0)}\n")
f.write(f"- Errors: {logs.get('errors', 0)}\n")
else:
f.write(f"- {logs.get('reason', logs.get('error', 'Not captured'))}\n")
f.write("\n")
f.write("## Files\n")
f.write("- `screenshot.png` - Current screen\n")
f.write("- `accessibility-tree.json` - Full UI hierarchy\n")
if self.app_bundle_id:
f.write("- `app-logs.txt` - Recent app logs\n")
f.write("- `device-info.json` - Device details\n")
f.write("- `summary.json` - Complete capture metadata\n")
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(description="Capture complete app state for debugging")
parser.add_argument(
"--app-bundle-id", help="App bundle ID for log filtering (e.g., com.example.app)"
)
parser.add_argument(
"--output", default=".", help="Output directory (default: current directory)"
)
parser.add_argument(
"--log-lines", type=int, default=100, help="Number of log lines to capture (default: 100)"
)
parser.add_argument(
"--udid",
help="Device UDID (auto-detects booted simulator if not provided)",
)
parser.add_argument(
"--inline",
action="store_true",
help="Return screenshots as base64 (inline mode for vision-based automation)",
)
parser.add_argument(
"--size",
choices=["full", "half", "quarter", "thumb"],
default="half",
help="Screenshot size for token optimization (default: half)",
)
parser.add_argument("--app-name", help="App name for semantic screenshot naming")
args = parser.parse_args()
# Resolve UDID with auto-detection
try:
udid = resolve_udid(args.udid)
except RuntimeError as e:
print(f"Error: {e}")
sys.exit(1)
# Create capturer
capturer = AppStateCapture(
app_bundle_id=args.app_bundle_id,
udid=udid,
inline=args.inline,
screenshot_size=args.size,
)
# Capture state
try:
summary = capturer.capture_all(
output_dir=args.output, log_lines=args.log_lines, app_name=args.app_name
)
# Token-efficient output
if "output_dir" in summary:
print(f"State captured: {summary['output_dir']}/")
else:
# Inline mode
print(
f"State captured (inline mode): {summary['screenshot']['width']}x{summary['screenshot']['height']}"
)
# Report any issues found
if "logs" in summary and summary["logs"].get("captured"):
logs = summary["logs"]
if logs["errors"] > 0 or logs["warnings"] > 0:
print(f"Issues found: {logs['errors']} errors, {logs['warnings']} warnings")
if "accessibility" in summary and summary["accessibility"].get("captured"):
print(f"Elements: {summary['accessibility']['element_count']}")
except Exception as e:
print(f"Error: {e}")
sys.exit(1)
if __name__ == "__main__":
main()
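The log analysis in `capture_logs` is a plain case-insensitive substring scan. Isolated, it looks like the sketch below, with the same counting rules applied to an in-memory list of lines (`count_log_issues` is a hypothetical name for illustration):

```python
def count_log_issues(lines: list[str]) -> dict[str, int]:
    """Count warning/error lines the same way capture_logs does."""
    return {
        "lines": len(lines),
        "warnings": sum(1 for line in lines if "warning" in line.lower()),
        "errors": sum(1 for line in lines if "error" in line.lower()),
    }

log = ["App launched", "Warning: low memory", "error: failed to decode payload"]
print(count_log_issues(log))  # {'lines': 3, 'warnings': 1, 'errors': 1}
```

Note that a line containing both words would be counted in both buckets, matching the script's behavior.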


@@ -1,310 +0,0 @@
#!/usr/bin/env python3
"""
Build and Test Automation for Xcode Projects
Ultra token-efficient build automation with progressive disclosure via xcresult bundles.
Features:
- Minimal default output (5-10 tokens)
- Progressive disclosure for error/warning/log details
- Native xcresult bundle support
- Clean modular architecture
Usage Examples:
# Build (minimal output)
python scripts/build_and_test.py --project MyApp.xcodeproj
# Output: Build: SUCCESS (0 errors, 3 warnings) [xcresult-20251018-143052]
# Get error details
python scripts/build_and_test.py --get-errors xcresult-20251018-143052
# Get warnings
python scripts/build_and_test.py --get-warnings xcresult-20251018-143052
# Get build log
python scripts/build_and_test.py --get-log xcresult-20251018-143052
# Get everything as JSON
python scripts/build_and_test.py --get-all xcresult-20251018-143052 --json
# List recent builds
python scripts/build_and_test.py --list-xcresults
# Verbose mode (for debugging)
python scripts/build_and_test.py --project MyApp.xcodeproj --verbose
"""
import argparse
import sys
from pathlib import Path
# Import our modular components
from xcode import BuildRunner, OutputFormatter, XCResultCache, XCResultParser
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(
description="Build and test Xcode projects with progressive disclosure",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
# Build project (minimal output)
python scripts/build_and_test.py --project MyApp.xcodeproj
# Run tests
python scripts/build_and_test.py --project MyApp.xcodeproj --test
# Get error details from previous build
python scripts/build_and_test.py --get-errors xcresult-20251018-143052
# Get all details as JSON
python scripts/build_and_test.py --get-all xcresult-20251018-143052 --json
# List recent builds
python scripts/build_and_test.py --list-xcresults
""",
)
# Build/test mode arguments
build_group = parser.add_argument_group("Build/Test Options")
project_group = build_group.add_mutually_exclusive_group()
project_group.add_argument("--project", help="Path to .xcodeproj file")
project_group.add_argument("--workspace", help="Path to .xcworkspace file")
build_group.add_argument("--scheme", help="Build scheme (auto-detected if not specified)")
build_group.add_argument(
"--configuration",
default="Debug",
choices=["Debug", "Release"],
help="Build configuration (default: Debug)",
)
build_group.add_argument("--simulator", help="Simulator name (default: iPhone 15)")
build_group.add_argument("--clean", action="store_true", help="Clean before building")
build_group.add_argument("--test", action="store_true", help="Run tests")
build_group.add_argument("--suite", help="Specific test suite to run")
# Progressive disclosure arguments
disclosure_group = parser.add_argument_group("Progressive Disclosure Options")
disclosure_group.add_argument(
"--get-errors", metavar="XCRESULT_ID", help="Get error details from xcresult"
)
disclosure_group.add_argument(
"--get-warnings", metavar="XCRESULT_ID", help="Get warning details from xcresult"
)
disclosure_group.add_argument(
"--get-log", metavar="XCRESULT_ID", help="Get build log from xcresult"
)
disclosure_group.add_argument(
"--get-all", metavar="XCRESULT_ID", help="Get all details from xcresult"
)
disclosure_group.add_argument(
"--list-xcresults", action="store_true", help="List recent xcresult bundles"
)
# Output options
output_group = parser.add_argument_group("Output Options")
output_group.add_argument("--verbose", action="store_true", help="Show detailed output")
output_group.add_argument("--json", action="store_true", help="Output as JSON")
args = parser.parse_args()
# Initialize cache
cache = XCResultCache()
# Handle list mode
if args.list_xcresults:
xcresults = cache.list()
if args.json:
import json
print(json.dumps(xcresults, indent=2))
elif not xcresults:
print("No xcresult bundles found")
else:
print(f"Recent XCResult bundles ({len(xcresults)}):")
print()
for xc in xcresults:
print(f" {xc['id']}")
print(f" Created: {xc['created']}")
print(f" Size: {xc['size_mb']} MB")
print()
return 0
# Handle retrieval modes
xcresult_id = args.get_errors or args.get_warnings or args.get_log or args.get_all
if xcresult_id:
xcresult_path = cache.get_path(xcresult_id)
if not xcresult_path or not xcresult_path.exists():
print(f"Error: XCResult bundle not found: {xcresult_id}", file=sys.stderr)
print("Use --list-xcresults to see available bundles", file=sys.stderr)
return 1
# Load cached stderr for progressive disclosure
cached_stderr = cache.get_stderr(xcresult_id)
parser = XCResultParser(xcresult_path, stderr=cached_stderr)
# Get errors
if args.get_errors:
errors = parser.get_errors()
if args.json:
import json
print(json.dumps(errors, indent=2))
else:
print(OutputFormatter.format_errors(errors))
return 0
# Get warnings
if args.get_warnings:
warnings = parser.get_warnings()
if args.json:
import json
print(json.dumps(warnings, indent=2))
else:
print(OutputFormatter.format_warnings(warnings))
return 0
# Get log
if args.get_log:
log = parser.get_build_log()
if log:
print(OutputFormatter.format_log(log))
else:
print("No build log available", file=sys.stderr)
return 1
return 0
# Get all
if args.get_all:
error_count, warning_count = parser.count_issues()
errors = parser.get_errors()
warnings = parser.get_warnings()
build_log = parser.get_build_log()
if args.json:
import json
data = {
"xcresult_id": xcresult_id,
"error_count": error_count,
"warning_count": warning_count,
"errors": errors,
"warnings": warnings,
"log_preview": build_log[:1000] if build_log else None,
}
print(json.dumps(data, indent=2))
else:
print(f"XCResult: {xcresult_id}")
print(f"Errors: {error_count}, Warnings: {warning_count}")
print()
if errors:
print(OutputFormatter.format_errors(errors, limit=10))
print()
if warnings:
print(OutputFormatter.format_warnings(warnings, limit=10))
print()
if build_log:
print("Build Log (last 30 lines):")
print(OutputFormatter.format_log(build_log, lines=30))
return 0
# Build/test mode
if not args.project and not args.workspace:
# Try to auto-detect in current directory
cwd = Path.cwd()
projects = list(cwd.glob("*.xcodeproj"))
workspaces = list(cwd.glob("*.xcworkspace"))
if workspaces:
args.workspace = str(workspaces[0])
elif projects:
args.project = str(projects[0])
else:
parser.error("No project or workspace specified and none found in current directory")
# Initialize builder
builder = BuildRunner(
project_path=args.project,
workspace_path=args.workspace,
scheme=args.scheme,
configuration=args.configuration,
simulator=args.simulator,
cache=cache,
)
# Execute build or test
if args.test:
success, xcresult_id, stderr = builder.test(test_suite=args.suite)
else:
success, xcresult_id, stderr = builder.build(clean=args.clean)
if not xcresult_id and not stderr:
print("Error: Build/test failed without creating xcresult or error output", file=sys.stderr)
return 1
# Save stderr to cache for progressive disclosure
if xcresult_id and stderr:
cache.save_stderr(xcresult_id, stderr)
# Parse results
xcresult_path = cache.get_path(xcresult_id) if xcresult_id else None
parser = XCResultParser(xcresult_path, stderr=stderr)
error_count, warning_count = parser.count_issues()
# Format output
status = "SUCCESS" if success else "FAILED"
# Generate hints for failed builds
hints = None
if not success:
errors = parser.get_errors()
hints = OutputFormatter.generate_hints(errors)
if args.verbose:
# Verbose mode with error/warning details
errors = parser.get_errors() if error_count > 0 else None
warnings = parser.get_warnings() if warning_count > 0 else None
output = OutputFormatter.format_verbose(
status=status,
error_count=error_count,
warning_count=warning_count,
xcresult_id=xcresult_id or "N/A",
errors=errors,
warnings=warnings,
)
print(output)
elif args.json:
# JSON mode
data = {
"success": success,
"xcresult_id": xcresult_id or None,
"error_count": error_count,
"warning_count": warning_count,
}
if hints:
data["hints"] = hints
import json
print(json.dumps(data, indent=2))
else:
# Minimal mode (default)
output = OutputFormatter.format_minimal(
status=status,
error_count=error_count,
warning_count=warning_count,
xcresult_id=xcresult_id or "N/A",
hints=hints,
)
print(output)
# Exit with appropriate code
return 0 if success else 1
if __name__ == "__main__":
sys.exit(main())
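The minimal status line shown in the usage examples (`Build: SUCCESS (0 errors, 3 warnings) [xcresult-20251018-143052]`) can be reproduced with a one-function sketch. The real formatting lives in `OutputFormatter.format_minimal`, whose exact signature is not shown in this diff, so the function below is an assumption about its shape:

```python
def format_minimal(status: str, error_count: int, warning_count: int, xcresult_id: str) -> str:
    """Compose the default one-line build summary."""
    return f"Build: {status} ({error_count} errors, {warning_count} warnings) [{xcresult_id}]"

print(format_minimal("SUCCESS", 0, 3, "xcresult-20251018-143052"))
```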


@@ -1,103 +0,0 @@
#!/usr/bin/env python3
"""
iOS Simulator Clipboard Manager
Copy text to simulator clipboard for testing paste flows.
Optimized for minimal token output.
Usage: python scripts/clipboard.py --copy "text to copy"
"""
import argparse
import subprocess
import sys
from common import resolve_udid
class ClipboardManager:
"""Manages clipboard operations on iOS simulator."""
def __init__(self, udid: str | None = None):
"""Initialize clipboard manager.
Args:
udid: Optional device UDID (auto-detects booted simulator if None)
"""
self.udid = udid
def copy(self, text: str) -> bool:
"""
Copy text to simulator clipboard.
Args:
text: Text to copy to clipboard
Returns:
Success status
"""
cmd = ["xcrun", "simctl", "pbcopy", self.udid if self.udid else "booted"]
try:
# simctl pbcopy reads the text to copy from stdin, not from a positional argument
subprocess.run(cmd, input=text, text=True, capture_output=True, check=True)
return True
except subprocess.CalledProcessError:
return False
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(description="Copy text to iOS simulator clipboard")
parser.add_argument("--copy", required=True, help="Text to copy to clipboard")
parser.add_argument(
"--udid",
help="Device UDID (auto-detects booted simulator if not provided)",
)
parser.add_argument("--test-name", help="Test scenario name for tracking")
parser.add_argument("--expected", help="Expected behavior after paste")
args = parser.parse_args()
# Resolve UDID with auto-detection
try:
udid = resolve_udid(args.udid)
except RuntimeError as e:
print(f"Error: {e}", file=sys.stderr)
sys.exit(1)
# Create manager and copy text
manager = ClipboardManager(udid=udid)
if manager.copy(args.copy):
# Token-efficient output
output = f'Copied: "{args.copy}"'
if args.test_name:
output += f" (test: {args.test_name})"
print(output)
# Provide usage guidance
if args.expected:
print(f"Expected: {args.expected}")
print()
print("Next steps:")
print("1. Tap text field with: python scripts/navigator.py --find-type TextField --tap")
print("2. Paste with: python scripts/keyboard.py --key cmd+v")
else:
print("Failed to copy text to clipboard", file=sys.stderr)
sys.exit(1)
if __name__ == "__main__":
main()


@@ -1,59 +0,0 @@
"""
Common utilities shared across iOS simulator scripts.
This module centralizes genuinely reused code patterns to eliminate duplication
while respecting Jackson's Law - no over-abstraction, only truly shared logic.
Organization:
- device_utils: Device detection, command building, coordinate transformation
- idb_utils: IDB-specific operations (accessibility tree, element manipulation)
- cache_utils: Progressive disclosure caching for large outputs
- screenshot_utils: Screenshot capture with file and inline modes
"""
from .cache_utils import ProgressiveCache, get_cache
from .device_utils import (
build_idb_command,
build_simctl_command,
get_booted_device_udid,
get_device_screen_size,
resolve_udid,
transform_screenshot_coords,
)
from .idb_utils import (
count_elements,
flatten_tree,
get_accessibility_tree,
get_screen_size,
)
from .screenshot_utils import (
capture_screenshot,
format_screenshot_result,
generate_screenshot_name,
get_size_preset,
resize_screenshot,
)
__all__ = [
# cache_utils
"ProgressiveCache",
"get_cache",
# device_utils
"build_idb_command",
"build_simctl_command",
"get_booted_device_udid",
"get_device_screen_size",
"resolve_udid",
"transform_screenshot_coords",
# idb_utils
"count_elements",
"flatten_tree",
"get_accessibility_tree",
"get_screen_size",
# screenshot_utils
"capture_screenshot",
"format_screenshot_result",
"generate_screenshot_name",
"get_size_preset",
"resize_screenshot",
]


@@ -1,260 +0,0 @@
#!/usr/bin/env python3
"""
Progressive disclosure cache for large outputs.
Implements cache system to support progressive disclosure pattern:
- Return concise summary with cache_id for large outputs
- User retrieves full details on demand via cache_id
- Reduces token usage by 96% for common queries
Cache directory: ~/.ios-simulator-skill/cache/
Cache expiration: Configurable per cache type (default 1 hour)
Used by:
- sim_list.py - Simulator listing progressive disclosure
- Future: build logs, UI trees, etc.
"""
import json
import time
from datetime import datetime, timedelta
from pathlib import Path
from typing import Any
class ProgressiveCache:
"""Cache for progressive disclosure pattern.
Stores large outputs with timestamped IDs for on-demand retrieval.
Automatically cleans up expired entries.
"""
def __init__(self, cache_dir: str | None = None, max_age_hours: int = 1):
"""Initialize cache system.
Args:
cache_dir: Cache directory path (default: ~/.ios-simulator-skill/cache/)
max_age_hours: Max age for cache entries before expiration (default: 1 hour)
"""
if cache_dir is None:
cache_dir = str(Path("~/.ios-simulator-skill/cache").expanduser())
self.cache_dir = Path(cache_dir)
self.max_age_hours = max_age_hours
# Create cache directory if needed
self.cache_dir.mkdir(parents=True, exist_ok=True)
def save(self, data: dict[str, Any], cache_type: str) -> str:
"""Save data to cache and return cache_id.
Args:
data: Dictionary data to cache
cache_type: Type of cache ('simulator-list', 'build-log', 'ui-tree', etc.)
Returns:
Cache ID like 'sim-20251028-143052' for use in progressive disclosure
Example:
cache_id = cache.save({'devices': [...]}, 'simulator-list')
# Returns: 'sim-20251028-143052'
"""
# Generate cache_id with timestamp
timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
cache_prefix = cache_type.split("-")[0] # e.g., 'sim' from 'simulator-list'
cache_id = f"{cache_prefix}-{timestamp}"
# Save to file
cache_file = self.cache_dir / f"{cache_id}.json"
with open(cache_file, "w") as f:
json.dump(
{
"cache_id": cache_id,
"cache_type": cache_type,
"created_at": datetime.now().isoformat(),
"data": data,
},
f,
indent=2,
)
return cache_id
def get(self, cache_id: str) -> dict[str, Any] | None:
"""Retrieve data from cache by cache_id.
Args:
cache_id: Cache ID from save() or list_entries()
Returns:
Cached data dictionary, or None if not found/expired
Example:
data = cache.get('sim-20251028-143052')
if data:
print(f"Found {len(data)} devices")
"""
cache_file = self.cache_dir / f"{cache_id}.json"
if not cache_file.exists():
return None
# Check if expired
if self._is_expired(cache_file):
cache_file.unlink() # Delete expired file
return None
try:
with open(cache_file) as f:
entry = json.load(f)
return entry.get("data")
except (OSError, json.JSONDecodeError):
return None
def list_entries(self, cache_type: str | None = None) -> list[dict[str, Any]]:
"""List available cache entries with metadata.
Args:
cache_type: Filter by type (e.g., 'simulator-list'), or None for all
Returns:
List of cache entries with id, type, created_at, age_seconds
Example:
entries = cache.list_entries('simulator-list')
for entry in entries:
print(f"{entry['id']} - {entry['age_seconds']}s old")
"""
entries = []
for cache_file in sorted(self.cache_dir.glob("*.json"), reverse=True):
# Check if expired
if self._is_expired(cache_file):
cache_file.unlink()
continue
try:
with open(cache_file) as f:
entry = json.load(f)
# Filter by type if specified
if cache_type and entry.get("cache_type") != cache_type:
continue
created_at = datetime.fromisoformat(entry.get("created_at", ""))
age_seconds = (datetime.now() - created_at).total_seconds()
entries.append(
{
"id": entry.get("cache_id"),
"type": entry.get("cache_type"),
"created_at": entry.get("created_at"),
"age_seconds": int(age_seconds),
}
)
except (OSError, json.JSONDecodeError, ValueError):
continue
return entries
def cleanup(self, max_age_hours: int | None = None) -> int:
"""Remove expired cache entries.
Args:
max_age_hours: Age threshold (default: uses instance max_age_hours)
Returns:
Number of entries deleted
Example:
deleted = cache.cleanup()
print(f"Deleted {deleted} expired cache entries")
"""
if max_age_hours is None:
max_age_hours = self.max_age_hours
deleted = 0
for cache_file in self.cache_dir.glob("*.json"):
if self._is_expired(cache_file, max_age_hours):
cache_file.unlink()
deleted += 1
return deleted
def clear(self, cache_type: str | None = None) -> int:
"""Clear all cache entries of a type.
Args:
cache_type: Type to clear (e.g., 'simulator-list'), or None to clear all
Returns:
Number of entries deleted
Example:
cleared = cache.clear('simulator-list')
print(f"Cleared {cleared} simulator list entries")
"""
deleted = 0
for cache_file in self.cache_dir.glob("*.json"):
if cache_type is None:
# Clear all
cache_file.unlink()
deleted += 1
else:
# Clear by type
try:
with open(cache_file) as f:
entry = json.load(f)
if entry.get("cache_type") == cache_type:
cache_file.unlink()
deleted += 1
except (OSError, json.JSONDecodeError):
pass
return deleted
def _is_expired(self, cache_file: Path, max_age_hours: int | None = None) -> bool:
"""Check if cache file is expired.
Args:
cache_file: Path to cache file
max_age_hours: Age threshold (default: uses instance max_age_hours)
Returns:
True if file is older than max_age_hours
"""
if max_age_hours is None:
max_age_hours = self.max_age_hours
try:
with open(cache_file) as f:
entry = json.load(f)
created_at = datetime.fromisoformat(entry.get("created_at", ""))
age = datetime.now() - created_at
return age > timedelta(hours=max_age_hours)
except (OSError, json.JSONDecodeError, ValueError):
return True
# Module-level cache instances (lazy-loaded)
_cache_instances: dict[str, ProgressiveCache] = {}
def get_cache(cache_dir: str | None = None) -> ProgressiveCache:
"""Get or create global cache instance.
Args:
cache_dir: Custom cache directory (uses default if None)
Returns:
ProgressiveCache instance
"""
# Use cache_dir as key, or 'default' if None
key = cache_dir or "default"
if key not in _cache_instances:
_cache_instances[key] = ProgressiveCache(cache_dir)
return _cache_instances[key]


@@ -1,432 +0,0 @@
#!/usr/bin/env python3
"""
Shared device and simulator utilities.
Common patterns for interacting with simulators via xcrun simctl and IDB.
Standardizes command building and device targeting to prevent errors.
Follows Jackson's Law - only extracts genuinely reused patterns.
Used by:
- app_launcher.py (8 call sites) - App lifecycle commands
- Multiple scripts (15+ locations) - IDB command building
- navigator.py, gesture.py - Coordinate transformation
- test_recorder.py, app_state_capture.py - Auto-UDID detection
"""
import json
import re
import subprocess
def build_simctl_command(
operation: str,
udid: str | None = None,
*args,
) -> list[str]:
"""
Build xcrun simctl command with proper device handling.
Standardizes command building to prevent device targeting bugs.
Automatically uses "booted" if no UDID provided.
Used by:
- app_launcher.py: launch, terminate, install, uninstall, openurl, listapps, spawn
- Multiple scripts: generic simctl operations
Args:
operation: simctl operation (launch, terminate, install, etc.)
udid: Device UDID (uses 'booted' if None)
*args: Additional command arguments
Returns:
Complete command list ready for subprocess.run()
Examples:
# Launch app on booted simulator
cmd = build_simctl_command("launch", None, "com.app.bundle")
# Returns: ["xcrun", "simctl", "launch", "booted", "com.app.bundle"]
# Launch on specific device
cmd = build_simctl_command("launch", "ABC123", "com.app.bundle")
# Returns: ["xcrun", "simctl", "launch", "ABC123", "com.app.bundle"]
# Install app on specific device
cmd = build_simctl_command("install", "ABC123", "/path/to/app.app")
# Returns: ["xcrun", "simctl", "install", "ABC123", "/path/to/app.app"]
"""
cmd = ["xcrun", "simctl", operation]
# Add device (booted or specific UDID)
cmd.append(udid if udid else "booted")
# Add remaining arguments
cmd.extend(str(arg) for arg in args)
return cmd
def build_idb_command(
operation: str,
udid: str | None = None,
*args,
) -> list[str]:
"""
Build IDB command with proper device targeting.
Standardizes IDB command building across all scripts using IDB.
Handles device UDID consistently.
Used by:
- navigator.py: ui tap, ui text, ui describe-all
- gesture.py: ui swipe, ui tap
- keyboard.py: ui key, ui text, ui tap
- And more: 15+ locations
Args:
operation: IDB operation path (e.g., "ui tap", "ui text", "ui describe-all")
udid: Device UDID (omits --udid flag if None, IDB uses booted by default)
*args: Additional command arguments
Returns:
Complete command list ready for subprocess.run()
Examples:
# Tap on booted simulator
cmd = build_idb_command("ui tap", None, "200", "400")
# Returns: ["idb", "ui", "tap", "200", "400"]
# Tap on specific device
cmd = build_idb_command("ui tap", "ABC123", "200", "400")
# Returns: ["idb", "ui", "tap", "200", "400", "--udid", "ABC123"]
# Get accessibility tree
cmd = build_idb_command("ui describe-all", "ABC123", "--json", "--nested")
# Returns: ["idb", "ui", "describe-all", "--json", "--nested", "--udid", "ABC123"]
# Enter text
cmd = build_idb_command("ui text", None, "hello world")
# Returns: ["idb", "ui", "text", "hello world"]
"""
# Split operation into parts (e.g., "ui tap" -> ["ui", "tap"])
cmd = ["idb"] + operation.split()
# Add arguments
cmd.extend(str(arg) for arg in args)
# Add device targeting if specified (optional for IDB, uses booted by default)
if udid:
cmd.extend(["--udid", udid])
return cmd
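Both builders are pure functions, so the device-slot conventions (simctl always needs an explicit device, idb appends `--udid` only when one is given) can be verified without a simulator; a minimal re-statement for demonstration:

```python
def build_simctl_command(operation, udid=None, *args):
    # simctl always takes a device slot; "booted" targets the active simulator
    cmd = ["xcrun", "simctl", operation, udid if udid else "booted"]
    cmd.extend(str(a) for a in args)
    return cmd

def build_idb_command(operation, udid=None, *args):
    # idb defaults to the booted device, so --udid is appended only when provided
    cmd = ["idb"] + operation.split()
    cmd.extend(str(a) for a in args)
    if udid:
        cmd.extend(["--udid", udid])
    return cmd

print(build_simctl_command("launch", None, "com.app.bundle"))
print(build_idb_command("ui tap", "ABC123", 200, 400))
```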
def get_booted_device_udid() -> str | None:
"""
Auto-detect currently booted simulator UDID.
Queries xcrun simctl for booted devices and returns first match.
Returns:
UDID of booted simulator, or None if no simulator is booted.
Example:
udid = get_booted_device_udid()
if udid:
print(f"Booted simulator: {udid}")
else:
print("No simulator is currently booted")
"""
try:
result = subprocess.run(
["xcrun", "simctl", "list", "devices", "booted"],
capture_output=True,
text=True,
check=True,
)
# Parse output to find UDID
# Format: " iPhone 16 Pro (ABC123-DEF456) (Booted)"
for line in result.stdout.split("\n"):
# Look for UUID pattern in parentheses
match = re.search(r"\(([A-F0-9\-]{36})\)", line)
if match:
return match.group(1)
return None
except subprocess.CalledProcessError:
return None
def resolve_udid(udid_arg: str | None) -> str:
"""
Resolve device UDID with auto-detection fallback.
If udid_arg is provided, returns it immediately.
If None, attempts to auto-detect booted simulator.
Raises error if neither is available.
Args:
udid_arg: Explicit UDID from command line, or None
Returns:
Valid UDID string
Raises:
RuntimeError: If no UDID provided and no booted simulator found
Example:
try:
udid = resolve_udid(args.udid) # args.udid might be None
print(f"Using device: {udid}")
except RuntimeError as e:
print(f"Error: {e}")
sys.exit(1)
"""
if udid_arg:
return udid_arg
booted_udid = get_booted_device_udid()
if booted_udid:
return booted_udid
raise RuntimeError(
"No device UDID provided and no simulator is currently booted.\n"
"Boot a simulator or provide --udid explicitly:\n"
" xcrun simctl boot <device-name>\n"
" python scripts/script_name.py --udid <device-udid>"
)
def get_device_screen_size(udid: str) -> tuple[int, int]:
"""
Get actual screen dimensions for device via accessibility tree.
Queries IDB accessibility tree to determine actual device resolution.
Falls back to iPhone 14 defaults (390x844) if detection fails.
Args:
udid: Device UDID
Returns:
Tuple of (width, height) in pixels
Example:
width, height = get_device_screen_size("ABC123")
print(f"Device screen: {width}x{height}")
"""
try:
cmd = build_idb_command("ui describe-all", udid, "--json")
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
# Parse JSON response
data = json.loads(result.stdout)
tree = data[0] if isinstance(data, list) and len(data) > 0 else data
# Get frame size from root element
if tree and "frame" in tree:
frame = tree["frame"]
width = int(frame.get("width", 390))
height = int(frame.get("height", 844))
return (width, height)
# Fallback
return (390, 844)
except Exception:
# Graceful fallback to iPhone 14 defaults (matches the docstring)
return (390, 844)
def resolve_device_identifier(identifier: str) -> str:
"""
Resolve device name or partial UDID to full UDID.
Supports multiple identifier formats:
- Full UDID: "ABC-123-DEF456..." (36 character UUID)
- Device name: "iPhone 16 Pro" (matches full name)
- Partial match: "iPhone 16" (matches first device containing this string)
- Special: "booted" (resolves to currently booted device)
Args:
identifier: Device UDID, name, or special value "booted"
Returns:
Full device UDID
Raises:
RuntimeError: If identifier cannot be resolved
Example:
udid = resolve_device_identifier("iPhone 16 Pro")
# Returns: "ABC123DEF456..."
udid = resolve_device_identifier("booted")
# Returns UDID of booted simulator
"""
# Handle "booted" special case
if identifier.lower() == "booted":
booted = get_booted_device_udid()
if booted:
return booted
raise RuntimeError(
"No simulator is currently booted. "
"Boot a simulator first: xcrun simctl boot <device-udid>"
)
# Check if already a full UDID (36 character UUID format)
if re.match(r"^[A-F0-9\-]{36}$", identifier, re.IGNORECASE):
return identifier.upper()
# Try to match by device name
simulators = list_simulators(state=None)
exact_matches = [s for s in simulators if s["name"].lower() == identifier.lower()]
if exact_matches:
return exact_matches[0]["udid"]
# Try partial match
partial_matches = [s for s in simulators if identifier.lower() in s["name"].lower()]
if partial_matches:
return partial_matches[0]["udid"]
# No match found
raise RuntimeError(
f"Device '{identifier}' not found. "
f"Use 'xcrun simctl list devices' to see available simulators."
)
def list_simulators(state: str | None = None) -> list[dict]:
"""
List iOS simulators with optional state filtering.
Queries xcrun simctl and returns structured list of simulators.
Optionally filters by state (available, booted, all).
Args:
state: Optional filter - "available", "booted", or None for all
Returns:
List of simulator dicts with keys:
- "name": Device name (e.g., "iPhone 16 Pro")
- "udid": Device UDID (36 char UUID)
- "state": Device state ("Booted", "Shutdown", "Unavailable")
- "runtime": iOS version (e.g., "iOS 18.0", "unavailable")
- "type": Device type ("iPhone", "iPad", "Apple Watch", etc.)
Example:
# List all simulators
all_sims = list_simulators()
print(f"Total simulators: {len(all_sims)}")
# List only available simulators
available = list_simulators(state="available")
for sim in available:
print(f"{sim['name']} ({sim['state']}) - {sim['udid']}")
# List only booted simulators
booted = list_simulators(state="booted")
for sim in booted:
print(f"Booted: {sim['name']}")
"""
try:
# Query simctl for device list
cmd = ["xcrun", "simctl", "list", "devices", "-j"]
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
data = json.loads(result.stdout)
simulators = []
# Parse JSON response
# Format: {"devices": {"iOS 18.0": [{...}, {...}], "iOS 17.0": [...], ...}}
for ios_version, devices in data.get("devices", {}).items():
for device in devices:
sim = {
"name": device.get("name", "Unknown"),
"udid": device.get("udid", ""),
"state": device.get("state", "Unknown"),
"runtime": ios_version,
"type": _extract_device_type(device.get("name", "")),
}
simulators.append(sim)
# Apply state filtering
if state == "booted":
return [s for s in simulators if s["state"] == "Booted"]
if state == "available":
return [s for s in simulators if s["state"] == "Shutdown"] # Available to boot
if state is None:
return simulators
return [s for s in simulators if s["state"].lower() == state.lower()]
except (subprocess.CalledProcessError, json.JSONDecodeError, KeyError) as e:
raise RuntimeError(f"Failed to list simulators: {e}") from e
def _extract_device_type(device_name: str) -> str:
"""
Extract device type from device name.
Parses device name to determine type (iPhone, iPad, Watch, etc.).
Args:
device_name: Full device name (e.g., "iPhone 16 Pro")
Returns:
Device type string
Example:
_extract_device_type("iPhone 16 Pro") # Returns "iPhone"
_extract_device_type("iPad Air") # Returns "iPad"
_extract_device_type("Apple Watch Series 9") # Returns "Watch"
"""
if "iPhone" in device_name:
return "iPhone"
if "iPad" in device_name:
return "iPad"
if "Watch" in device_name:
return "Watch"
if "TV" in device_name:
return "TV"
return "Unknown"
def transform_screenshot_coords(
x: float,
y: float,
screenshot_width: int,
screenshot_height: int,
device_width: int,
device_height: int,
) -> tuple[int, int]:
"""
Transform screenshot coordinates to device coordinates.
Handles the case where a screenshot was downscaled (e.g., to 'half' size)
and needs to be transformed back to actual device pixel coordinates
for accurate tapping.
The transformation is linear:
device_x = (screenshot_x / screenshot_width) * device_width
device_y = (screenshot_y / screenshot_height) * device_height
Args:
x, y: Coordinates in the screenshot
screenshot_width, screenshot_height: Screenshot dimensions (e.g., 195, 422)
device_width, device_height: Actual device dimensions (e.g., 390, 844)
Returns:
Tuple of (device_x, device_y) in device pixels
Example:
# Screenshot taken at 'half' size: 195x422 (from 390x844 device)
device_x, device_y = transform_screenshot_coords(
100, 200, # Tap point in screenshot
195, 422, # Screenshot dimensions
390, 844 # Device dimensions
)
print(f"Tap at device coords: ({device_x}, {device_y})")
# Output: Tap at device coords: (200, 400)
"""
device_x = int((x / screenshot_width) * device_width)
device_y = int((y / screenshot_height) * device_height)
return (device_x, device_y)


@@ -1,180 +0,0 @@
#!/usr/bin/env python3
"""
Shared IDB utility functions.
This module provides common IDB operations used across multiple scripts.
Follows Jackson's Law - only shared code that's truly reused, not speculative.
Used by:
- navigator.py - Accessibility tree navigation
- screen_mapper.py - UI element analysis
- accessibility_audit.py - WCAG compliance checking
- test_recorder.py - Test documentation
- app_state_capture.py - State snapshots
- gesture.py - Touch gesture operations
"""
import json
import subprocess
import sys
def get_accessibility_tree(udid: str | None = None, nested: bool = True) -> dict:
"""
Fetch accessibility tree from IDB.
The accessibility tree represents the complete UI hierarchy of the current
screen, with all element properties needed for semantic navigation.
Args:
udid: Device UDID (uses booted simulator if None)
nested: Include nested structure (default True). If False, returns flat array.
Returns:
Root element of accessibility tree as dict.
Structure: {
"type": "Window",
"AXLabel": "App Name",
"frame": {"x": 0, "y": 0, "width": 390, "height": 844},
"children": [...]
}
Raises:
SystemExit: If IDB command fails or returns invalid JSON
Example:
tree = get_accessibility_tree("UDID123")
# Root is Window element with all children nested
"""
cmd = ["idb", "ui", "describe-all", "--json"]
if nested:
cmd.append("--nested")
if udid:
cmd.extend(["--udid", udid])
try:
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
tree_data = json.loads(result.stdout)
# IDB returns array format, extract first element (root)
if isinstance(tree_data, list) and len(tree_data) > 0:
return tree_data[0]
return tree_data
except subprocess.CalledProcessError as e:
print(f"Error: Failed to get accessibility tree: {e.stderr}", file=sys.stderr)
sys.exit(1)
except json.JSONDecodeError:
print("Error: Invalid JSON from idb", file=sys.stderr)
sys.exit(1)
def flatten_tree(node: dict, depth: int = 0, elements: list[dict] | None = None) -> list[dict]:
"""
Flatten nested accessibility tree into list of elements.
Converts the hierarchical accessibility tree into a flat list where each
element includes its depth for context.
Used by:
- navigator.py - Element finding
- screen_mapper.py - Element analysis
- accessibility_audit.py - Audit scanning
Args:
node: Root node of tree (typically from get_accessibility_tree)
depth: Current depth (used internally, start at 0)
elements: Accumulator list (used internally, start as None)
Returns:
Flat list of elements, each with "depth" key indicating nesting level.
Structure of each element: {
"type": "Button",
"AXLabel": "Login",
"frame": {...},
"depth": 2,
...
}
Example:
tree = get_accessibility_tree()
flat = flatten_tree(tree)
for elem in flat:
print(f"{' ' * elem['depth']}{elem.get('type')}: {elem.get('AXLabel')}")
"""
if elements is None:
elements = []
# Add current node with depth tracking
node_copy = node.copy()
node_copy["depth"] = depth
elements.append(node_copy)
# Process children recursively
for child in node.get("children", []):
flatten_tree(child, depth + 1, elements)
return elements
def count_elements(node: dict) -> int:
"""
Count total elements in tree (recursive).
Traverses entire tree counting all elements for reporting purposes.
Used by:
- test_recorder.py - Element counting per step
- screen_mapper.py - Summary statistics
Args:
node: Root node of tree
Returns:
Total element count including root and all descendants
Example:
tree = get_accessibility_tree()
total = count_elements(tree)
print(f"Screen has {total} elements")
"""
count = 1
for child in node.get("children", []):
count += count_elements(child)
return count
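`flatten_tree` and `count_elements` only touch plain dicts, so they can be demonstrated on a hand-built tree with no simulator attached:

```python
def flatten_tree(node, depth=0, elements=None):
    # Depth-first flatten; each copied node carries its nesting depth
    if elements is None:
        elements = []
    node_copy = dict(node)
    node_copy["depth"] = depth
    elements.append(node_copy)
    for child in node.get("children", []):
        flatten_tree(child, depth + 1, elements)
    return elements

def count_elements(node):
    # Root plus all descendants
    return 1 + sum(count_elements(c) for c in node.get("children", []))

tree = {
    "type": "Window",
    "children": [
        {"type": "Button", "AXLabel": "Login"},
        {"type": "Other", "children": [{"type": "TextField"}]},
    ],
}
flat = flatten_tree(tree)
print([(e["type"], e["depth"]) for e in flat])
print(count_elements(tree))
```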
def get_screen_size(udid: str | None = None) -> tuple[int, int]:
"""
Get screen dimensions from accessibility tree.
Extracts the screen size from the root element's frame. Useful for
gesture calculations and coordinate normalization.
Used by:
- gesture.py - Gesture positioning
- Potentially: screenshot positioning, screen-aware scaling
Args:
udid: Device UDID (uses booted if None)
Returns:
(width, height) tuple. Defaults to (390, 844) if detection fails
or tree cannot be accessed.
Example:
width, height = get_screen_size()
center_x = width // 2
center_y = height // 2
"""
DEFAULT_WIDTH = 390 # iPhone 14
DEFAULT_HEIGHT = 844
try:
tree = get_accessibility_tree(udid, nested=False)
frame = tree.get("frame", {})
width = int(frame.get("width", DEFAULT_WIDTH))
height = int(frame.get("height", DEFAULT_HEIGHT))
return (width, height)
except Exception:
# Silently fall back to defaults if tree access fails
return (DEFAULT_WIDTH, DEFAULT_HEIGHT)


@@ -1,338 +0,0 @@
#!/usr/bin/env python3
"""
Screenshot utilities with dual-mode support.
Provides unified screenshot handling with:
- File-based mode: Persistent artifacts for test documentation
- Inline base64 mode: Vision-based automation for agent analysis
- Size presets: Token optimization (full/half/quarter/thumb)
- Semantic naming: {appName}_{screenName}_{state}_{timestamp}.png
Supports resize operations via PIL (optional dependency).
Used by:
- test_recorder.py - Step-based screenshot recording
- app_state_capture.py - State snapshot captures
"""
import base64
import os
import subprocess
import sys
from datetime import datetime
from pathlib import Path
from typing import Any
# Try to import PIL for resizing, but make it optional
try:
from PIL import Image
HAS_PIL = True
except ImportError:
HAS_PIL = False
def generate_screenshot_name(
app_name: str | None = None,
screen_name: str | None = None,
state: str | None = None,
timestamp: str | None = None,
extension: str = "png",
) -> str:
"""Generate semantic screenshot filename.
Format: {appName}_{screenName}_{state}_{timestamp}.{ext}
Falls back to: screenshot_{timestamp}.{ext}
Args:
app_name: Application name (e.g., 'MyApp')
screen_name: Screen name (e.g., 'Login')
state: State description (e.g., 'Empty', 'Filled', 'Error')
timestamp: ISO timestamp (uses current time if None)
extension: File extension (default: 'png')
Returns:
Semantic filename ready for safe file creation
Example:
name = generate_screenshot_name('MyApp', 'Login', 'Empty')
# Returns: 'MyApp_Login_Empty_20251028-143052.png'
name = generate_screenshot_name()
# Returns: 'screenshot_20251028-143052.png'
"""
if timestamp is None:
timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
# Build semantic name
if app_name or screen_name or state:
parts = [app_name, screen_name, state]
parts = [p for p in parts if p] # Filter None/empty
name = "_".join(parts) + f"_{timestamp}"
else:
name = f"screenshot_{timestamp}"
return f"{name}.{extension}"
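With a fixed timestamp the naming scheme is fully deterministic, which makes both the semantic and fallback forms easy to verify; a standalone re-statement:

```python
from datetime import datetime

def generate_screenshot_name(app_name=None, screen_name=None, state=None,
                             timestamp=None, extension="png"):
    # Semantic name when any part is given, generic fallback otherwise
    if timestamp is None:
        timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    parts = [p for p in (app_name, screen_name, state) if p]
    if parts:
        name = "_".join(parts) + f"_{timestamp}"
    else:
        name = f"screenshot_{timestamp}"
    return f"{name}.{extension}"

print(generate_screenshot_name("MyApp", "Login", "Empty", timestamp="20251028-143052"))
print(generate_screenshot_name(timestamp="20251028-143052"))
```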
def get_size_preset(size: str = "half") -> tuple[float, float]:
"""Get scale factors for size preset.
Args:
size: 'full', 'half', 'quarter', 'thumb'
Returns:
Tuple of (scale_x, scale_y) for resizing
Example:
scale_x, scale_y = get_size_preset('half')
# Returns: (0.5, 0.5)
"""
presets = {
"full": (1.0, 1.0),
"half": (0.5, 0.5),
"quarter": (0.25, 0.25),
"thumb": (0.1, 0.1),
}
return presets.get(size, (0.5, 0.5))
def resize_screenshot(
input_path: str,
output_path: str | None = None,
size: str = "half",
quality: int = 85,
) -> tuple[str, int, int]:
"""Resize screenshot for token optimization.
Requires PIL (Pillow). Falls back gracefully without it.
Args:
input_path: Path to original screenshot
output_path: Output path (uses input_path if None)
size: 'full', 'half', 'quarter', 'thumb'
quality: JPEG quality (1-100, default: 85)
Returns:
Tuple of (output_path, width, height) of resized image
Raises:
FileNotFoundError: If input file doesn't exist
ValueError: If PIL not installed and size != 'full'
Example:
output, w, h = resize_screenshot(
'screenshot.png',
'screenshot_half.png',
'half'
)
print(f"Resized to {w}x{h}")
"""
input_file = Path(input_path)
if not input_file.exists():
raise FileNotFoundError(f"Screenshot not found: {input_path}")
# If full size, just copy
if size == "full":
if output_path:
import shutil
shutil.copy(input_path, output_path)
output_file = Path(output_path)
else:
output_file = input_file
# Get original dimensions
if HAS_PIL:
img = Image.open(str(output_file))
return (str(output_file), img.width, img.height)
return (str(output_file), 0, 0) # Dimensions unknown without PIL
# Need PIL to resize
if not HAS_PIL:
raise ValueError(
f"Size preset '{size}' requires PIL (Pillow). Install with: pip3 install pillow"
)
# Open original image
img = Image.open(str(input_file))
orig_w, orig_h = img.size
# Calculate new size
scale_x, scale_y = get_size_preset(size)
new_w = int(orig_w * scale_x)
new_h = int(orig_h * scale_y)
# Resize with high-quality resampling
resized = img.resize((new_w, new_h), Image.Resampling.LANCZOS)
# Determine output path
if output_path is None:
# Insert size marker before extension
stem = input_file.stem
suffix = input_file.suffix
output_path = str(input_file.parent / f"{stem}_{size}{suffix}")
# Save resized image
resized.save(output_path, quality=quality, optimize=True)
return (output_path, new_w, new_h)
def capture_screenshot(
udid: str,
output_path: str | None = None,
size: str = "half",
inline: bool = False,
app_name: str | None = None,
screen_name: str | None = None,
state: str | None = None,
) -> dict[str, Any]:
"""Capture screenshot with flexible output modes.
Supports both file-based (persistent artifacts) and inline base64 modes
(for vision-based automation).
Args:
udid: Device UDID
output_path: File path for file mode (generates semantic name if None)
size: 'full', 'half', 'quarter', 'thumb' (default: 'half')
inline: If True, returns base64 data instead of saving to file
app_name: App name for semantic naming
screen_name: Screen name for semantic naming
state: State description for semantic naming
Returns:
Dict with mode-specific fields:
File mode:
{
'mode': 'file',
'file_path': str,
'size_bytes': int,
'width': int,
'height': int,
'size_preset': str
}
Inline mode:
{
'mode': 'inline',
'base64_data': str,
'mime_type': 'image/png',
'width': int,
'height': int,
'size_preset': str
}
Example:
# File mode
result = capture_screenshot('ABC123', app_name='MyApp')
print(f"Saved to: {result['file_path']}")
# Inline mode
result = capture_screenshot('ABC123', inline=True, size='half')
print(f"Screenshot: {result['width']}x{result['height']}")
print(f"Base64: {result['base64_data'][:50]}...")
"""
try:
# Capture raw screenshot to a per-process temp file (avoids clobbering concurrent runs)
temp_path = f"/tmp/ios_simulator_screenshot_{os.getpid()}.png"
cmd = ["xcrun", "simctl", "io", udid, "screenshot", temp_path]
subprocess.run(cmd, capture_output=True, text=True, check=True)
if inline:
# Inline mode: resize and convert to base64
# Resize if needed
if size != "full" and HAS_PIL:
resized_path, width, height = resize_screenshot(temp_path, size=size)
else:
resized_path = temp_path
# Get dimensions via PIL if available
if HAS_PIL:
img = Image.open(resized_path)
width, height = img.size
else:
width, height = 390, 844 # Fallback to common device size
# Read and encode as base64
with open(resized_path, "rb") as f:
base64_data = base64.b64encode(f.read()).decode("utf-8")
# Clean up temp files
Path(temp_path).unlink(missing_ok=True)
if resized_path != temp_path:
Path(resized_path).unlink(missing_ok=True)
return {
"mode": "inline",
"base64_data": base64_data,
"mime_type": "image/png",
"width": width,
"height": height,
"size_preset": size,
}
# File mode: save to output path with semantic naming
if output_path is None:
output_path = generate_screenshot_name(app_name, screen_name, state)
# Resize if needed
if size != "full" and HAS_PIL:
final_path, width, height = resize_screenshot(temp_path, output_path, size)
else:
# Just move temp to output
import shutil
shutil.move(temp_path, output_path)
final_path = output_path
# Get dimensions via PIL if available
if HAS_PIL:
img = Image.open(final_path)
width, height = img.size
else:
width, height = 390, 844 # Fallback
# Get file size
size_bytes = Path(final_path).stat().st_size
return {
"mode": "file",
"file_path": final_path,
"size_bytes": size_bytes,
"width": width,
"height": height,
"size_preset": size,
}
except subprocess.CalledProcessError as e:
raise RuntimeError(f"Failed to capture screenshot: {e.stderr}") from e
except Exception as e:
raise RuntimeError(f"Screenshot capture error: {e!s}") from e
def format_screenshot_result(result: dict[str, Any]) -> str:
"""Format screenshot result for human-readable output.
Args:
result: Result dictionary from capture_screenshot()
Returns:
Formatted string for printing
Example:
result = capture_screenshot('ABC123', inline=True)
print(format_screenshot_result(result))
"""
if result["mode"] == "file":
return (
f"Screenshot: {result['file_path']}\n"
f"Dimensions: {result['width']}x{result['height']}\n"
f"Size: {result['size_bytes']} bytes"
)
return (
f"Screenshot (inline): {result['width']}x{result['height']}\n"
f"Base64 length: {len(result['base64_data'])} chars"
)


@@ -1,394 +0,0 @@
#!/usr/bin/env python3
"""
iOS Gesture Controller - Swipes and Complex Gestures
Performs navigation gestures like swipes, scrolls, and pinches.
Token-efficient output for common navigation patterns.
This script handles touch gestures for iOS simulator automation. It provides
directional swipes, multi-swipe scrolling, pull-to-refresh, and pinch gestures.
Automatically detects screen size from the device for accurate gesture positioning.
Key Features:
- Directional swipes (up, down, left, right)
- Multi-swipe scrolling with customizable amount
- Pull-to-refresh gesture
- Pinch to zoom (in/out)
- Custom swipe between any two points
- Drag and drop simulation
- Auto-detects screen dimensions from device
Usage Examples:
# Simple directional swipe
python scripts/gesture.py --swipe up --udid <device-id>
# Scroll down multiple times
python scripts/gesture.py --scroll down --scroll-amount 3 --udid <device-id>
# Pull to refresh
python scripts/gesture.py --refresh --udid <device-id>
# Custom swipe coordinates
python scripts/gesture.py --swipe-from 100,500 --swipe-to 100,100 --udid <device-id>
# Pinch to zoom
python scripts/gesture.py --pinch out --udid <device-id>
# Long press at coordinates
python scripts/gesture.py --long-press 200,300 --duration 2.0 --udid <device-id>
Output Format:
Swiped up
Scrolled down (3x)
Performed pull to refresh
Gesture Details:
- Swipes use 70% of screen by default (configurable)
- Scrolls are multiple small 30% swipes with delays
- Start points are offset from edges for reliability
- Screen size auto-detected from accessibility tree root element
- Falls back to iPhone 14 dimensions (390x844) if detection fails
Technical Details:
- Uses `idb ui swipe x1 y1 x2 y2` for gesture execution
- Duration parameter converts to milliseconds for IDB
- Automatically fetches screen size on initialization
- Parses IDB accessibility tree to get root frame dimensions
- All coordinates calculated as fractions of screen size for device independence
"""
import argparse
import subprocess
import sys
import time
from common import (
get_device_screen_size,
get_screen_size,
resolve_udid,
transform_screenshot_coords,
)
class GestureController:
"""Performs gestures on iOS simulator."""
# Standard screen dimensions (will be detected if possible)
DEFAULT_WIDTH = 390 # iPhone 14
DEFAULT_HEIGHT = 844
def __init__(self, udid: str | None = None):
"""Initialize gesture controller."""
self.udid = udid
self.screen_size = self._get_screen_size()
def _get_screen_size(self) -> tuple[int, int]:
"""Try to detect screen size from device using shared utility."""
return get_screen_size(self.udid)
def swipe(self, direction: str, distance_ratio: float = 0.7) -> bool:
"""
Perform directional swipe.
Args:
direction: up, down, left, right
distance_ratio: How far to swipe (0.0-1.0 of screen)
Returns:
Success status
"""
width, height = self.screen_size
center_x = width // 2
center_y = height // 2
        # Calculate swipe coordinates based on direction, clamping to screen bounds
        # so distance_ratio actually controls the swipe length
        if direction == "up":
            start = (center_x, int(height * 0.7))
            end = (center_x, int(height * max(0.0, 0.7 - distance_ratio)))
        elif direction == "down":
            start = (center_x, int(height * 0.3))
            end = (center_x, int(height * min(1.0, 0.3 + distance_ratio)))
        elif direction == "left":
            start = (int(width * 0.8), center_y)
            end = (int(width * max(0.0, 0.8 - distance_ratio)), center_y)
        elif direction == "right":
            start = (int(width * 0.2), center_y)
            end = (int(width * min(1.0, 0.2 + distance_ratio)), center_y)
        else:
            return False
return self.swipe_between(start, end)
def swipe_between(
self, start: tuple[int, int], end: tuple[int, int], duration: float = 0.3
) -> bool:
"""
Swipe between two points.
Args:
start: Starting coordinates (x, y)
end: Ending coordinates (x, y)
duration: Swipe duration in seconds
Returns:
Success status
"""
cmd = ["idb", "ui", "swipe"]
cmd.extend([str(start[0]), str(start[1]), str(end[0]), str(end[1])])
        # Pass non-default durations to IDB, converted to milliseconds
        if duration != 0.3:
            cmd.extend(["--duration", str(int(duration * 1000))])
if self.udid:
cmd.extend(["--udid", self.udid])
try:
subprocess.run(cmd, capture_output=True, check=True)
return True
except subprocess.CalledProcessError:
return False
def scroll(self, direction: str, amount: int = 3) -> bool:
"""
Perform multiple small swipes to scroll.
Args:
direction: up, down
amount: Number of small swipes
Returns:
Success status
"""
for _ in range(amount):
if not self.swipe(direction, distance_ratio=0.3):
return False
time.sleep(0.2) # Small delay between swipes
return True
def tap_and_hold(self, x: int, y: int, duration: float = 2.0) -> bool:
"""
Long press at coordinates.
Args:
x, y: Coordinates
duration: Hold duration in seconds
Returns:
Success status
"""
# IDB doesn't have native long press, simulate with tap
# In real implementation, might need to use different approach
cmd = ["idb", "ui", "tap", str(x), str(y)]
if self.udid:
cmd.extend(["--udid", self.udid])
try:
subprocess.run(cmd, capture_output=True, check=True)
# Simulate hold with delay
time.sleep(duration)
return True
except subprocess.CalledProcessError:
return False
def pinch(self, direction: str = "out", center: tuple[int, int] | None = None) -> bool:
"""
Perform pinch gesture (zoom in/out).
Args:
direction: 'in' (zoom out) or 'out' (zoom in)
center: Center point for pinch
Returns:
Success status
"""
if not center:
width, height = self.screen_size
center = (width // 2, height // 2)
# Calculate pinch points
offset = 100 if direction == "out" else 50
if direction == "out":
# Zoom in - fingers move apart
start1 = (center[0] - 20, center[1] - 20)
end1 = (center[0] - offset, center[1] - offset)
start2 = (center[0] + 20, center[1] + 20)
end2 = (center[0] + offset, center[1] + offset)
else:
# Zoom out - fingers move together
start1 = (center[0] - offset, center[1] - offset)
end1 = (center[0] - 20, center[1] - 20)
start2 = (center[0] + offset, center[1] + offset)
end2 = (center[0] + 20, center[1] + 20)
# Perform two swipes simultaneously (simulated)
success1 = self.swipe_between(start1, end1)
success2 = self.swipe_between(start2, end2)
return success1 and success2
def drag_and_drop(self, start: tuple[int, int], end: tuple[int, int]) -> bool:
"""
Drag element from one position to another.
Args:
start: Starting coordinates
end: Ending coordinates
Returns:
Success status
"""
# Use slow swipe to simulate drag
return self.swipe_between(start, end, duration=1.0)
def refresh(self) -> bool:
"""Pull to refresh gesture."""
width, _ = self.screen_size
start = (width // 2, 100)
end = (width // 2, 400)
return self.swipe_between(start, end)
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(description="Perform gestures on iOS simulator")
# Gesture options
parser.add_argument(
"--swipe", choices=["up", "down", "left", "right"], help="Perform directional swipe"
)
parser.add_argument("--swipe-from", help="Custom swipe start coordinates (x,y)")
parser.add_argument("--swipe-to", help="Custom swipe end coordinates (x,y)")
parser.add_argument(
"--scroll", choices=["up", "down"], help="Scroll in direction (multiple small swipes)"
)
parser.add_argument(
"--scroll-amount", type=int, default=3, help="Number of scroll swipes (default: 3)"
)
parser.add_argument("--long-press", help="Long press at coordinates (x,y)")
parser.add_argument(
"--duration", type=float, default=2.0, help="Duration for long press in seconds"
)
parser.add_argument(
"--pinch", choices=["in", "out"], help="Pinch gesture (in=zoom out, out=zoom in)"
)
parser.add_argument("--refresh", action="store_true", help="Pull to refresh gesture")
# Coordinate transformation
parser.add_argument(
"--screenshot-coords",
action="store_true",
help="Interpret swipe coordinates as from a screenshot (requires --screenshot-width/height)",
)
parser.add_argument(
"--screenshot-width",
type=int,
help="Screenshot width for coordinate transformation",
)
parser.add_argument(
"--screenshot-height",
type=int,
help="Screenshot height for coordinate transformation",
)
parser.add_argument(
"--udid",
help="Device UDID (auto-detects booted simulator if not provided)",
)
args = parser.parse_args()
# Resolve UDID with auto-detection
try:
udid = resolve_udid(args.udid)
except RuntimeError as e:
print(f"Error: {e}")
sys.exit(1)
controller = GestureController(udid=udid)
# Execute requested gesture
if args.swipe:
if controller.swipe(args.swipe):
print(f"Swiped {args.swipe}")
else:
print(f"Failed to swipe {args.swipe}")
sys.exit(1)
elif args.swipe_from and args.swipe_to:
# Custom swipe
start = tuple(map(int, args.swipe_from.split(",")))
end = tuple(map(int, args.swipe_to.split(",")))
# Handle coordinate transformation if requested
if args.screenshot_coords:
if not args.screenshot_width or not args.screenshot_height:
print(
"Error: --screenshot-coords requires --screenshot-width and --screenshot-height"
)
sys.exit(1)
device_w, device_h = get_device_screen_size(udid)
start = transform_screenshot_coords(
start[0],
start[1],
args.screenshot_width,
args.screenshot_height,
device_w,
device_h,
)
end = transform_screenshot_coords(
end[0],
end[1],
args.screenshot_width,
args.screenshot_height,
device_w,
device_h,
)
print("Transformed screenshot coords to device coords")
if controller.swipe_between(start, end):
print(f"Swiped from {start} to {end}")
else:
print("Failed to swipe")
sys.exit(1)
elif args.scroll:
if controller.scroll(args.scroll, args.scroll_amount):
print(f"Scrolled {args.scroll} ({args.scroll_amount}x)")
else:
print(f"Failed to scroll {args.scroll}")
sys.exit(1)
elif args.long_press:
coords = tuple(map(int, args.long_press.split(",")))
if controller.tap_and_hold(coords[0], coords[1], args.duration):
print(f"Long pressed at {coords} for {args.duration}s")
else:
print("Failed to long press")
sys.exit(1)
elif args.pinch:
if controller.pinch(args.pinch):
action = "Zoomed in" if args.pinch == "out" else "Zoomed out"
print(action)
else:
print(f"Failed to pinch {args.pinch}")
sys.exit(1)
elif args.refresh:
if controller.refresh():
print("Performed pull to refresh")
else:
print("Failed to refresh")
sys.exit(1)
else:
parser.print_help()
sys.exit(1)
if __name__ == "__main__":
main()
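The `--screenshot-coords` path above maps points picked on a (possibly scaled) screenshot onto device points before swiping. Assuming `transform_screenshot_coords` does simple proportional scaling, the mapping can be sketched as:

```python
def transform_coords(x: int, y: int,
                     shot_w: int, shot_h: int,
                     dev_w: int, dev_h: int) -> tuple[int, int]:
    # Proportional mapping from screenshot pixels to device points
    return (round(x * dev_w / shot_w), round(y * dev_h / shot_h))

# A point at the centre of a 2x-scale screenshot maps to the device centre
center = transform_coords(390, 844, 780, 1688, 390, 844)  # -> (195, 422)
```

Because everything is expressed as fractions of the respective sizes, the same screenshot coordinates work across devices with different point dimensions.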


@@ -1,391 +0,0 @@
#!/usr/bin/env python3
"""
iOS Keyboard Controller - Text Entry and Hardware Buttons
Handles keyboard input, special keys, and hardware button simulation.
Token-efficient text entry and navigation control.
This script provides text input and hardware button control for iOS simulator
automation. It handles both typing text strings and pressing special keys like
return, delete, tab, etc. Also controls hardware buttons like home and lock.
Key Features:
- Type text strings into focused elements
- Press special keys (return, delete, tab, space, arrows)
- Hardware button simulation (home, lock, volume, screenshot)
- Character-by-character typing with delays (for animations)
- Multiple key press support
- iOS HID key code mapping for reliability
Usage Examples:
# Type text into focused field
python scripts/keyboard.py --type "hello@example.com" --udid <device-id>
# Press return key to submit
python scripts/keyboard.py --key return --udid <device-id>
# Press delete 3 times
python scripts/keyboard.py --key delete --key delete --key delete --udid <device-id>
# Press home button
python scripts/keyboard.py --button home --udid <device-id>
# Press lock button
python scripts/keyboard.py --button lock --udid <device-id>
# Type with delay between characters (for animations)
python scripts/keyboard.py --type "slow typing" --delay 0.1 --udid <device-id>
Output Format:
Typed: "hello@example.com"
Pressed return
Pressed home button
Special Keys Supported:
- return/enter: Submit forms, new lines (HID code 40)
- delete/backspace: Remove characters (HID code 42)
- tab: Navigate between fields (HID code 43)
- space: Space character (HID code 44)
- escape: Cancel/dismiss (HID code 41)
- up/down/left/right: Arrow keys (HID codes 82/81/80/79)
Hardware Buttons Supported:
- home: Return to home screen
- lock/power: Lock device
- volume-up/volume-down: Volume control
- ringer: Toggle mute
- screenshot: Capture screen
Technical Details:
- Uses `idb ui text` for typing text strings
- Uses `idb ui key <code>` for special keys with iOS HID codes
- HID codes from Apple's UIKeyboardHIDUsage specification
- Hardware buttons use `xcrun simctl` button actions
- Text entry works on currently focused element
- Special keys are integers (40=Return, 42=Delete, etc.)
"""
import argparse
import subprocess
import sys
import time
from common import resolve_udid
class KeyboardController:
"""Controls keyboard and hardware buttons on iOS simulator."""
# Special key mappings to iOS HID key codes
# See: https://developer.apple.com/documentation/uikit/uikeyboardhidusage
SPECIAL_KEYS = {
"return": 40,
"enter": 40,
"delete": 42,
"backspace": 42,
"tab": 43,
"space": 44,
"escape": 41,
"up": 82,
"down": 81,
"left": 80,
"right": 79,
}
# Hardware button mappings
HARDWARE_BUTTONS = {
"home": "HOME",
"lock": "LOCK",
"volume-up": "VOLUME_UP",
"volume-down": "VOLUME_DOWN",
"ringer": "RINGER",
"power": "LOCK", # Alias
"screenshot": "SCREENSHOT",
}
def __init__(self, udid: str | None = None):
"""Initialize keyboard controller."""
self.udid = udid
def type_text(self, text: str, delay: float = 0.0) -> bool:
"""
Type text into current focus.
Args:
text: Text to type
delay: Delay between characters (for slow typing effect)
Returns:
Success status
"""
if delay > 0:
# Type character by character with delay
for char in text:
if not self._type_single(char):
return False
time.sleep(delay)
return True
# Type all at once (efficient)
return self._type_single(text)
def _type_single(self, text: str) -> bool:
"""Type text using IDB."""
cmd = ["idb", "ui", "text", text]
if self.udid:
cmd.extend(["--udid", self.udid])
try:
subprocess.run(cmd, capture_output=True, check=True)
return True
except subprocess.CalledProcessError:
return False
def press_key(self, key: str, count: int = 1) -> bool:
"""
Press a special key.
Args:
key: Key name (return, delete, tab, etc.)
count: Number of times to press
Returns:
Success status
"""
# Map key name to IDB key code
key_code = self.SPECIAL_KEYS.get(key.lower())
if not key_code:
# Try as literal integer key code
try:
key_code = int(key)
except ValueError:
return False
cmd = ["idb", "ui", "key", str(key_code)]
if self.udid:
cmd.extend(["--udid", self.udid])
try:
for _ in range(count):
subprocess.run(cmd, capture_output=True, check=True)
if count > 1:
time.sleep(0.1) # Small delay for multiple presses
return True
except subprocess.CalledProcessError:
return False
def press_key_sequence(self, keys: list[str]) -> bool:
"""
Press a sequence of keys.
Args:
keys: List of key names
Returns:
Success status
"""
cmd_base = ["idb", "ui", "key-sequence"]
# Map keys to codes
mapped_keys = []
for key in keys:
mapped = self.SPECIAL_KEYS.get(key.lower())
if mapped is None:
# Try as integer
try:
mapped = int(key)
except ValueError:
return False
mapped_keys.append(str(mapped))
cmd = cmd_base + mapped_keys
if self.udid:
cmd.extend(["--udid", self.udid])
try:
subprocess.run(cmd, capture_output=True, check=True)
return True
except subprocess.CalledProcessError:
return False
def press_hardware_button(self, button: str) -> bool:
"""
Press hardware button.
Args:
button: Button name (home, lock, volume-up, etc.)
Returns:
Success status
"""
button_code = self.HARDWARE_BUTTONS.get(button.lower())
if not button_code:
return False
cmd = ["idb", "ui", "button", button_code]
if self.udid:
cmd.extend(["--udid", self.udid])
try:
subprocess.run(cmd, capture_output=True, check=True)
return True
except subprocess.CalledProcessError:
return False
def clear_text(self, select_all: bool = True) -> bool:
"""
Clear text in current field.
Args:
select_all: Use Cmd+A to select all first
Returns:
Success status
"""
        if select_all:
            # Select all then delete
            # Note: This might need adjustment for iOS keyboard shortcuts
            if self.press_key_combo(["cmd", "a"]):
                return self.press_key("delete")
            return False
        # Just delete multiple times
        return self.press_key("delete", count=50)
def press_key_combo(self, keys: list[str]) -> bool:
"""
Press key combination (like Cmd+A).
Args:
keys: List of keys to press together
Returns:
Success status
"""
# IDB doesn't directly support key combos
# This is a workaround - may need platform-specific handling
if "cmd" in keys or "command" in keys:
# Handle common shortcuts
if "a" in keys:
# Select all - might work with key sequence
return self.press_key_sequence(["command", "a"])
if "c" in keys:
return self.press_key_sequence(["command", "c"])
if "v" in keys:
return self.press_key_sequence(["command", "v"])
if "x" in keys:
return self.press_key_sequence(["command", "x"])
# Try as sequence
return self.press_key_sequence(keys)
def dismiss_keyboard(self) -> bool:
"""Dismiss on-screen keyboard."""
# Common ways to dismiss keyboard on iOS
# Try Done button first, then Return
success = self.press_key("return")
if not success:
# Try tapping outside (would need coordinate)
pass
return success
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(description="Control keyboard and hardware buttons")
# Text input
parser.add_argument("--type", help="Type text into current focus")
parser.add_argument("--slow", action="store_true", help="Type slowly (character by character)")
# Special keys
parser.add_argument("--key", help="Press special key (return, delete, tab, space, etc.)")
parser.add_argument("--key-sequence", help="Press key sequence (comma-separated)")
parser.add_argument("--count", type=int, default=1, help="Number of times to press key")
# Hardware buttons
parser.add_argument(
"--button",
choices=["home", "lock", "volume-up", "volume-down", "ringer", "screenshot"],
help="Press hardware button",
)
# Other operations
parser.add_argument("--clear", action="store_true", help="Clear current text field")
parser.add_argument("--dismiss", action="store_true", help="Dismiss keyboard")
parser.add_argument(
"--udid",
help="Device UDID (auto-detects booted simulator if not provided)",
)
args = parser.parse_args()
# Resolve UDID with auto-detection
try:
udid = resolve_udid(args.udid)
except RuntimeError as e:
print(f"Error: {e}")
sys.exit(1)
controller = KeyboardController(udid=udid)
# Execute requested action
if args.type:
delay = 0.1 if args.slow else 0.0
if controller.type_text(args.type, delay):
if args.slow:
print(f'Typed: "{args.type}" (slowly)')
else:
print(f'Typed: "{args.type}"')
else:
print("Failed to type text")
sys.exit(1)
elif args.key:
if controller.press_key(args.key, args.count):
if args.count > 1:
print(f"Pressed {args.key} ({args.count}x)")
else:
print(f"Pressed {args.key}")
else:
print(f"Failed to press {args.key}")
sys.exit(1)
elif args.key_sequence:
keys = args.key_sequence.split(",")
if controller.press_key_sequence(keys):
print(f"Pressed sequence: {' -> '.join(keys)}")
else:
print("Failed to press key sequence")
sys.exit(1)
elif args.button:
if controller.press_hardware_button(args.button):
print(f"Pressed {args.button} button")
else:
print(f"Failed to press {args.button}")
sys.exit(1)
elif args.clear:
if controller.clear_text():
print("Cleared text field")
else:
print("Failed to clear text")
sys.exit(1)
elif args.dismiss:
if controller.dismiss_keyboard():
print("Dismissed keyboard")
else:
print("Failed to dismiss keyboard")
sys.exit(1)
else:
parser.print_help()
sys.exit(1)
if __name__ == "__main__":
main()
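The key-name-to-HID-code resolution described above (named keys first, then literal integer codes) can be sketched standalone; the code subset below is taken from the `SPECIAL_KEYS` table in this file:

```python
# Subset of the UIKeyboardHIDUsage codes used by the controller above
SPECIAL_KEYS = {"return": 40, "enter": 40, "delete": 42, "backspace": 42,
                "tab": 43, "space": 44, "escape": 41}

def to_key_code(key: str) -> int:
    """Resolve a key name (or a literal integer string) to an HID code."""
    code = SPECIAL_KEYS.get(key.lower())
    if code is None:
        code = int(key)  # raises ValueError for unknown key names
    return code

# The resolved code is what gets handed to `idb ui key`
cmd = ["idb", "ui", "key", str(to_key_code("return"))]
```

Accepting raw integers as a fallback lets callers press any HID key, not just the named subset.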


@@ -1,486 +0,0 @@
#!/usr/bin/env python3
"""
iOS Simulator Log Monitoring and Analysis
Real-time log streaming from iOS simulators with intelligent filtering, error detection,
and token-efficient summarization. Enhanced version of app_state_capture.py's log capture.
Features:
- Real-time log streaming from booted simulators
- Smart filtering by app bundle ID, subsystem, category, severity
- Error/warning classification and deduplication
- Duration-based or continuous follow mode
- Token-efficient summaries with full logs saved to file
- Integration with test_recorder and app_state_capture
Usage Examples:
# Monitor app logs in real-time (follow mode)
python scripts/log_monitor.py --app com.myapp.MyApp --follow
# Capture logs for specific duration
python scripts/log_monitor.py --app com.myapp.MyApp --duration 30s
# Extract errors and warnings only from last 5 minutes
python scripts/log_monitor.py --severity error,warning --last 5m
# Save logs to file
python scripts/log_monitor.py --app com.myapp.MyApp --duration 1m --output logs/
# Verbose output with full log lines
python scripts/log_monitor.py --app com.myapp.MyApp --verbose
"""
import argparse
import json
import re
import signal
import subprocess
import sys
from datetime import datetime, timedelta
from pathlib import Path
class LogMonitor:
"""Monitor and analyze iOS simulator logs with intelligent filtering."""
def __init__(
self,
app_bundle_id: str | None = None,
device_udid: str | None = None,
severity_filter: list[str] | None = None,
):
"""
Initialize log monitor.
Args:
app_bundle_id: Filter logs by app bundle ID
device_udid: Device UDID (uses booted if not specified)
severity_filter: List of severities to include (error, warning, info, debug)
"""
self.app_bundle_id = app_bundle_id
self.device_udid = device_udid or "booted"
self.severity_filter = severity_filter or ["error", "warning", "info", "debug"]
# Log storage
self.log_lines: list[str] = []
self.errors: list[str] = []
self.warnings: list[str] = []
self.info_messages: list[str] = []
# Statistics
self.error_count = 0
self.warning_count = 0
self.info_count = 0
self.debug_count = 0
self.total_lines = 0
# Deduplication
self.seen_messages: set[str] = set()
# Process control
self.log_process: subprocess.Popen | None = None
self.interrupted = False
def parse_time_duration(self, duration_str: str) -> float:
"""
Parse duration string to seconds.
Args:
duration_str: Duration like "30s", "5m", "1h"
Returns:
Duration in seconds
"""
match = re.match(r"(\d+)([smh])", duration_str.lower())
if not match:
raise ValueError(
f"Invalid duration format: {duration_str}. Use format like '30s', '5m', '1h'"
)
value, unit = match.groups()
value = int(value)
if unit == "s":
return value
if unit == "m":
return value * 60
if unit == "h":
return value * 3600
return 0
def classify_log_line(self, line: str) -> str | None:
"""
Classify log line by severity.
Args:
line: Log line to classify
Returns:
Severity level (error, warning, info, debug) or None
"""
line_lower = line.lower()
        # Error patterns (an empty pattern would match every line, so none are used)
        error_patterns = [
            r"\berror\b",
            r"\bfault\b",
            r"\bfailed\b",
            r"\bexception\b",
            r"\bcrash\b",
        ]
        # Warning patterns
        warning_patterns = [r"\bwarning\b", r"\bwarn\b", r"\bdeprecated\b", r"⚠️"]
        # Info patterns
        info_patterns = [r"\binfo\b", r"\bnotice\b"]
for pattern in error_patterns:
if re.search(pattern, line_lower):
return "error"
for pattern in warning_patterns:
if re.search(pattern, line_lower):
return "warning"
for pattern in info_patterns:
if re.search(pattern, line_lower):
return "info"
return "debug"
def deduplicate_message(self, line: str) -> bool:
"""
Check if message is duplicate.
Args:
line: Log line
Returns:
True if this is a new message, False if duplicate
"""
# Create signature by removing timestamps and process IDs
signature = re.sub(r"\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}", "", line)
signature = re.sub(r"\[\d+\]", "", signature)
signature = re.sub(r"\s+", " ", signature).strip()
if signature in self.seen_messages:
return False
self.seen_messages.add(signature)
return True
def process_log_line(self, line: str):
"""
Process a single log line.
Args:
line: Log line to process
"""
if not line.strip():
return
self.total_lines += 1
self.log_lines.append(line)
# Classify severity
severity = self.classify_log_line(line)
# Skip if not in filter
if severity not in self.severity_filter:
return
# Deduplicate (for errors and warnings)
if severity in ["error", "warning"] and not self.deduplicate_message(line):
return
# Store by severity
if severity == "error":
self.error_count += 1
self.errors.append(line)
elif severity == "warning":
self.warning_count += 1
self.warnings.append(line)
elif severity == "info":
self.info_count += 1
if len(self.info_messages) < 20: # Keep only recent info
self.info_messages.append(line)
else: # debug
self.debug_count += 1
def stream_logs(
self,
follow: bool = False,
duration: float | None = None,
last_minutes: float | None = None,
) -> bool:
"""
Stream logs from simulator.
Args:
follow: Follow mode (continuous streaming)
duration: Capture duration in seconds
last_minutes: Show logs from last N minutes
Returns:
True if successful
"""
        # Build log command: `log stream` for live capture, `log show` for history
        # (`--start` is only supported by `log show`, not `log stream`)
        if last_minutes:
            start_time = datetime.now() - timedelta(minutes=last_minutes)
            time_str = start_time.strftime("%Y-%m-%d %H:%M:%S")
            cmd = ["xcrun", "simctl", "spawn", self.device_udid, "log", "show", "--start", time_str]
        else:
            cmd = ["xcrun", "simctl", "spawn", self.device_udid, "log", "stream"]
        # Filter by process name (extracted from bundle ID)
        if self.app_bundle_id:
            app_name = self.app_bundle_id.split(".")[-1]
            cmd.extend(["--predicate", f'processImagePath CONTAINS "{app_name}"'])
# Setup signal handler for graceful interruption
def signal_handler(sig, frame):
self.interrupted = True
if self.log_process:
self.log_process.terminate()
signal.signal(signal.SIGINT, signal_handler)
try:
# Start log streaming process
self.log_process = subprocess.Popen(
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
bufsize=1, # Line buffered
)
# Track start time for duration
start_time = datetime.now()
# Process log lines
for line in iter(self.log_process.stdout.readline, ""):
if not line:
break
# Process the line
self.process_log_line(line.rstrip())
# Print in follow mode
if follow:
severity = self.classify_log_line(line)
if severity in self.severity_filter:
print(line.rstrip())
# Check duration
if duration and (datetime.now() - start_time).total_seconds() >= duration:
break
# Check if interrupted
if self.interrupted:
break
# Wait for process to finish
self.log_process.wait()
return True
except Exception as e:
print(f"Error streaming logs: {e}", file=sys.stderr)
return False
finally:
if self.log_process:
self.log_process.terminate()
def get_summary(self, verbose: bool = False) -> str:
"""
Get log summary.
Args:
verbose: Include full log details
Returns:
Formatted summary string
"""
lines = []
# Header
if self.app_bundle_id:
lines.append(f"Logs for: {self.app_bundle_id}")
else:
lines.append("Logs for: All processes")
# Statistics
lines.append(f"Total lines: {self.total_lines}")
lines.append(
f"Errors: {self.error_count}, Warnings: {self.warning_count}, Info: {self.info_count}"
)
# Top issues
if self.errors:
lines.append(f"\nTop Errors ({len(self.errors)}):")
            for error in self.errors[:5]:  # Show first 5
                lines.append(f"{error[:120]}")  # Truncate long lines
if self.warnings:
lines.append(f"\nTop Warnings ({len(self.warnings)}):")
for warning in self.warnings[:5]: # Show first 5
lines.append(f" ⚠️ {warning[:120]}")
# Verbose output
if verbose and self.log_lines:
lines.append("\n=== Recent Log Lines ===")
for line in self.log_lines[-50:]: # Last 50 lines
lines.append(line)
return "\n".join(lines)
def get_json_output(self) -> dict:
"""Get log results as JSON."""
return {
"app_bundle_id": self.app_bundle_id,
"device_udid": self.device_udid,
"statistics": {
"total_lines": self.total_lines,
"errors": self.error_count,
"warnings": self.warning_count,
"info": self.info_count,
"debug": self.debug_count,
},
"errors": self.errors[:20], # Limit to 20
"warnings": self.warnings[:20],
"sample_logs": self.log_lines[-50:], # Last 50 lines
}
def save_logs(self, output_dir: str) -> str:
"""
Save logs to file.
Args:
output_dir: Directory to save logs
Returns:
Path to saved log file
"""
# Create output directory
output_path = Path(output_dir)
output_path.mkdir(parents=True, exist_ok=True)
# Generate filename with timestamp
timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
app_name = self.app_bundle_id.split(".")[-1] if self.app_bundle_id else "simulator"
log_file = output_path / f"{app_name}-{timestamp}.log"
# Write all log lines
with open(log_file, "w") as f:
f.write("\n".join(self.log_lines))
# Also save JSON summary
json_file = output_path / f"{app_name}-{timestamp}-summary.json"
with open(json_file, "w") as f:
json.dump(self.get_json_output(), f, indent=2)
return str(log_file)
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(
description="Monitor and analyze iOS simulator logs",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
# Monitor app in real-time
python scripts/log_monitor.py --app com.myapp.MyApp --follow
# Capture logs for 30 seconds
python scripts/log_monitor.py --app com.myapp.MyApp --duration 30s
# Show errors/warnings from last 5 minutes
python scripts/log_monitor.py --severity error,warning --last 5m
# Save logs to file
python scripts/log_monitor.py --app com.myapp.MyApp --duration 1m --output logs/
""",
)
# Filtering options
parser.add_argument(
"--app", dest="app_bundle_id", help="App bundle ID to filter logs (e.g., com.myapp.MyApp)"
)
parser.add_argument("--device-udid", help="Device UDID (uses booted if not specified)")
parser.add_argument(
"--severity", help="Comma-separated severity levels (error,warning,info,debug)"
)
# Time options
time_group = parser.add_mutually_exclusive_group()
time_group.add_argument(
"--follow", action="store_true", help="Follow mode (continuous streaming)"
)
time_group.add_argument("--duration", help="Capture duration (e.g., 30s, 5m, 1h)")
time_group.add_argument(
"--last", dest="last_minutes", help="Show logs from last N minutes (e.g., 5m)"
)
# Output options
parser.add_argument("--output", help="Save logs to directory")
parser.add_argument("--verbose", action="store_true", help="Show detailed output")
parser.add_argument("--json", action="store_true", help="Output as JSON")
args = parser.parse_args()
# Parse severity filter
severity_filter = None
if args.severity:
severity_filter = [s.strip().lower() for s in args.severity.split(",")]
# Initialize monitor
monitor = LogMonitor(
app_bundle_id=args.app_bundle_id,
device_udid=args.device_udid,
severity_filter=severity_filter,
)
# Parse duration
duration = None
if args.duration:
duration = monitor.parse_time_duration(args.duration)
# Parse last minutes
last_minutes = None
if args.last_minutes:
last_minutes = monitor.parse_time_duration(args.last_minutes) / 60
# Stream logs
print("Monitoring logs...", file=sys.stderr)
if args.app_bundle_id:
print(f"App: {args.app_bundle_id}", file=sys.stderr)
success = monitor.stream_logs(follow=args.follow, duration=duration, last_minutes=last_minutes)
if not success:
sys.exit(1)
# Save logs if requested
if args.output:
log_file = monitor.save_logs(args.output)
print(f"\nLogs saved to: {log_file}", file=sys.stderr)
# Output results
if not args.follow: # Don't show summary in follow mode
if args.json:
print(json.dumps(monitor.get_json_output(), indent=2))
else:
print("\n" + monitor.get_summary(verbose=args.verbose))
sys.exit(0)
if __name__ == "__main__":
main()
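The `--duration` and `--last` flags accept suffixed values such as `30s`, `5m`, or `1h`, which `parse_time_duration` converts to seconds before streaming. A minimal standalone sketch of that conversion (the real helper lives elsewhere in the script and may differ in detail):

```python
def parse_time_duration(value: str) -> float:
    """Convert a duration string like '30s', '5m', or '1h' to seconds."""
    units = {"s": 1, "m": 60, "h": 3600}
    suffix = value[-1].lower()
    if suffix in units:
        return float(value[:-1]) * units[suffix]
    return float(value)  # bare number: assume seconds

print(parse_time_duration("30s"))  # 30.0
print(parse_time_duration("5m"))   # 300.0
```

Note that `--last 5m` is further divided by 60 in `main()`, since `simctl`'s log window is expressed in minutes.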

View File

@@ -1,453 +0,0 @@
#!/usr/bin/env python3
"""
iOS Simulator Navigator - Smart Element Finder and Interactor
Finds and interacts with UI elements using accessibility data.
Prioritizes structured navigation over pixel-based interaction.
This script is the core automation tool for iOS simulator navigation. It finds
UI elements by text, type, or accessibility ID and performs actions on them
(tap, enter text). Uses semantic element finding instead of fragile pixel coordinates.
Key Features:
- Find elements by text (fuzzy or exact matching)
- Find elements by type (Button, TextField, etc.)
- Find elements by accessibility identifier
- Tap elements at their center point
- Enter text into text fields
- List all tappable elements on screen
- Automatic element caching for performance
Usage Examples:
# Find and tap a button by text
python scripts/navigator.py --find-text "Login" --tap --udid <device-id>
# Enter text into first text field
python scripts/navigator.py --find-type TextField --index 0 --enter-text "username" --udid <device-id>
# Tap element by accessibility ID
python scripts/navigator.py --find-id "submitButton" --tap --udid <device-id>
# List all interactive elements
python scripts/navigator.py --list --udid <device-id>
# Tap at specific coordinates (fallback)
python scripts/navigator.py --tap-at 200,400 --udid <device-id>
Output Format:
Tapped: Button "Login" at (320, 450)
Entered text in: TextField "Username"
Not found: text='Submit'
Navigation Priority (best to worst):
1. Find by accessibility label/text (most reliable)
2. Find by element type + index (good for forms)
3. Find by accessibility ID (precise but app-specific)
4. Tap at coordinates (last resort, fragile)
Technical Details:
- Uses IDB's accessibility tree via `idb ui describe-all --json --nested`
- Caches tree for multiple operations (call with force_refresh to update)
- Finds elements by parsing tree recursively
- Calculates tap coordinates from element frame center
- Uses `idb ui tap` for tapping, `idb ui text` for text entry
- Extracts data from AXLabel, AXValue, and AXUniqueId fields
"""
import argparse
import json
import subprocess
import sys
from dataclasses import dataclass
from common import (
flatten_tree,
get_accessibility_tree,
get_device_screen_size,
resolve_udid,
transform_screenshot_coords,
)
@dataclass
class Element:
"""Represents a UI element from accessibility tree."""
type: str
label: str | None
value: str | None
identifier: str | None
frame: dict[str, float]
traits: list[str]
enabled: bool = True
@property
def center(self) -> tuple[int, int]:
"""Calculate center point for tapping."""
x = int(self.frame["x"] + self.frame["width"] / 2)
y = int(self.frame["y"] + self.frame["height"] / 2)
return (x, y)
@property
def description(self) -> str:
"""Human-readable description."""
label = self.label or self.value or self.identifier or "Unnamed"
return f'{self.type} "{label}"'
class Navigator:
"""Navigates iOS apps using accessibility data."""
def __init__(self, udid: str | None = None):
"""Initialize navigator with optional device UDID."""
self.udid = udid
self._tree_cache = None
def get_accessibility_tree(self, force_refresh: bool = False) -> dict:
"""Get accessibility tree (cached for efficiency)."""
if self._tree_cache and not force_refresh:
return self._tree_cache
# Delegate to shared utility
self._tree_cache = get_accessibility_tree(self.udid, nested=True)
return self._tree_cache
def _flatten_tree(self, node: dict, elements: list[Element] | None = None) -> list[Element]:
"""Flatten accessibility tree into list of elements."""
if elements is None:
elements = []
# Create element from node
if node.get("type"):
element = Element(
type=node.get("type", "Unknown"),
label=node.get("AXLabel"),
value=node.get("AXValue"),
identifier=node.get("AXUniqueId"),
frame=node.get("frame", {}),
traits=node.get("traits", []),
enabled=node.get("enabled", True),
)
elements.append(element)
# Process children
for child in node.get("children", []):
self._flatten_tree(child, elements)
return elements
def find_element(
self,
text: str | None = None,
element_type: str | None = None,
identifier: str | None = None,
index: int = 0,
fuzzy: bool = True,
) -> Element | None:
"""
Find element by various criteria.
Args:
text: Text to search in label/value
element_type: Type of element (Button, TextField, etc.)
identifier: Accessibility identifier
index: Which matching element to return (0-based)
fuzzy: Use fuzzy matching for text
Returns:
Element if found, None otherwise
"""
tree = self.get_accessibility_tree()
elements = self._flatten_tree(tree)
matches = []
for elem in elements:
# Skip disabled elements
if not elem.enabled:
continue
# Check type
if element_type and elem.type != element_type:
continue
# Check identifier (exact match)
if identifier and elem.identifier != identifier:
continue
# Check text (in label or value)
if text:
elem_text = (elem.label or "") + " " + (elem.value or "")
if fuzzy:
if text.lower() not in elem_text.lower():
continue
elif text not in (elem.label, elem.value):
continue
matches.append(elem)
if matches and index < len(matches):
return matches[index]
return None
def tap(self, element: Element) -> bool:
"""Tap on an element."""
x, y = element.center
return self.tap_at(x, y)
def tap_at(self, x: int, y: int) -> bool:
"""Tap at specific coordinates."""
cmd = ["idb", "ui", "tap", str(x), str(y)]
if self.udid:
cmd.extend(["--udid", self.udid])
try:
subprocess.run(cmd, capture_output=True, check=True)
return True
except subprocess.CalledProcessError:
return False
def enter_text(self, text: str, element: Element | None = None) -> bool:
"""
Enter text into element or current focus.
Args:
text: Text to enter
element: Optional element to tap first
Returns:
Success status
"""
# Tap element if provided
if element:
if not self.tap(element):
return False
# Small delay for focus
import time
time.sleep(0.5)
# Enter text
cmd = ["idb", "ui", "text", text]
if self.udid:
cmd.extend(["--udid", self.udid])
try:
subprocess.run(cmd, capture_output=True, check=True)
return True
except subprocess.CalledProcessError:
return False
def find_and_tap(
self,
text: str | None = None,
element_type: str | None = None,
identifier: str | None = None,
index: int = 0,
) -> tuple[bool, str]:
"""
Find element and tap it.
Returns:
(success, message) tuple
"""
element = self.find_element(text, element_type, identifier, index)
if not element:
criteria = []
if text:
criteria.append(f"text='{text}'")
if element_type:
criteria.append(f"type={element_type}")
if identifier:
criteria.append(f"id={identifier}")
return (False, f"Not found: {', '.join(criteria)}")
if self.tap(element):
return (True, f"Tapped: {element.description} at {element.center}")
return (False, f"Failed to tap: {element.description}")
def find_and_enter_text(
self,
text_to_enter: str,
find_text: str | None = None,
element_type: str | None = "TextField",
identifier: str | None = None,
index: int = 0,
) -> tuple[bool, str]:
"""
Find element and enter text into it.
Returns:
(success, message) tuple
"""
element = self.find_element(find_text, element_type, identifier, index)
if not element:
return (False, "TextField not found")
if self.enter_text(text_to_enter, element):
return (True, f"Entered text in: {element.description}")
return (False, "Failed to enter text")
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(description="Navigate iOS apps using accessibility data")
# Finding options
parser.add_argument("--find-text", help="Find element by text (fuzzy match)")
parser.add_argument("--find-exact", help="Find element by exact text")
parser.add_argument("--find-type", help="Element type (Button, TextField, etc.)")
parser.add_argument("--find-id", help="Accessibility identifier")
parser.add_argument("--index", type=int, default=0, help="Which match to use (0-based)")
# Action options
parser.add_argument("--tap", action="store_true", help="Tap the found element")
parser.add_argument("--tap-at", help="Tap at coordinates (x,y)")
parser.add_argument("--enter-text", help="Enter text into element")
# Coordinate transformation
parser.add_argument(
"--screenshot-coords",
action="store_true",
help="Interpret tap coordinates as from a screenshot (requires --screenshot-width/height)",
)
parser.add_argument(
"--screenshot-width",
type=int,
help="Screenshot width for coordinate transformation",
)
parser.add_argument(
"--screenshot-height",
type=int,
help="Screenshot height for coordinate transformation",
)
# Other options
parser.add_argument(
"--udid",
help="Device UDID (auto-detects booted simulator if not provided)",
)
parser.add_argument("--list", action="store_true", help="List all tappable elements")
args = parser.parse_args()
# Resolve UDID with auto-detection
try:
udid = resolve_udid(args.udid)
except RuntimeError as e:
print(f"Error: {e}")
sys.exit(1)
navigator = Navigator(udid=udid)
# List mode
if args.list:
tree = navigator.get_accessibility_tree()
elements = navigator._flatten_tree(tree)
# Filter to tappable elements
tappable = [
e
for e in elements
if e.enabled and e.type in ["Button", "Link", "Cell", "TextField", "SecureTextField"]
]
print(f"Tappable elements ({len(tappable)}):")
for elem in tappable[:10]: # Limit output for tokens
print(f" {elem.type}: \"{elem.label or elem.value or 'Unnamed'}\" {elem.center}")
if len(tappable) > 10:
print(f" ... and {len(tappable) - 10} more")
sys.exit(0)
# Direct tap at coordinates
if args.tap_at:
coords = args.tap_at.split(",")
if len(coords) != 2:
print("Error: --tap-at requires x,y format")
sys.exit(1)
x, y = int(coords[0]), int(coords[1])
# Handle coordinate transformation if requested
if args.screenshot_coords:
if not args.screenshot_width or not args.screenshot_height:
print(
"Error: --screenshot-coords requires --screenshot-width and --screenshot-height"
)
sys.exit(1)
device_w, device_h = get_device_screen_size(udid)
x, y = transform_screenshot_coords(
x,
y,
args.screenshot_width,
args.screenshot_height,
device_w,
device_h,
)
print(
f"Transformed screenshot coords ({coords[0]}, {coords[1]}) "
f"to device coords ({x}, {y})"
)
if navigator.tap_at(x, y):
print(f"Tapped at ({x}, {y})")
else:
print(f"Failed to tap at ({x}, {y})")
sys.exit(1)
# Find and tap
elif args.tap:
text = args.find_text or args.find_exact
success, message = navigator.find_and_tap(
text=text, element_type=args.find_type, identifier=args.find_id, index=args.index
)
print(message)
if not success:
sys.exit(1)
# Find and enter text
elif args.enter_text:
text = args.find_text or args.find_exact
success, message = navigator.find_and_enter_text(
text_to_enter=args.enter_text,
find_text=text,
element_type=args.find_type or "TextField",
identifier=args.find_id,
index=args.index,
)
print(message)
if not success:
sys.exit(1)
# Just find (no action)
else:
text = args.find_text or args.find_exact
fuzzy = args.find_text is not None
element = navigator.find_element(
text=text,
element_type=args.find_type,
identifier=args.find_id,
index=args.index,
fuzzy=fuzzy,
)
if element:
print(f"Found: {element.description} at {element.center}")
else:
print("Element not found")
sys.exit(1)
if __name__ == "__main__":
main()
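The tap target for an element is the midpoint of its accessibility frame, as computed by `Element.center` above. A standalone illustration with a hypothetical frame of the kind `idb ui describe-all` returns:

```python
def frame_center(frame: dict[str, float]) -> tuple[int, int]:
    """Mirror of Element.center: midpoint of an accessibility frame."""
    x = int(frame["x"] + frame["width"] / 2)
    y = int(frame["y"] + frame["height"] / 2)
    return (x, y)

# Hypothetical "Login" button frame (coordinates are illustrative)
login_frame = {"x": 280.0, "y": 430.0, "width": 80.0, "height": 40.0}
print(frame_center(login_frame))  # (320, 450)
```

Tapping the center rather than the frame origin keeps the gesture inside the hit area even when the element's edges sit under adjacent views.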

View File

@@ -1,310 +0,0 @@
#!/usr/bin/env python3
"""
iOS Privacy & Permissions Manager
Grant/revoke app permissions for testing permission flows.
Supports 13+ services with audit trail tracking.
Usage: python scripts/privacy_manager.py --grant camera --bundle-id com.app
"""
import argparse
import subprocess
import sys
from datetime import datetime
from common import resolve_udid
class PrivacyManager:
"""Manages iOS app privacy and permissions."""
# Supported services
SUPPORTED_SERVICES = {
"camera": "Camera access",
"microphone": "Microphone access",
"location": "Location services",
"contacts": "Contacts access",
"photos": "Photos library access",
"calendar": "Calendar access",
"health": "Health data access",
"reminders": "Reminders access",
"motion": "Motion & fitness",
"keyboard": "Keyboard access",
"mediaLibrary": "Media library",
"calls": "Call history",
"siri": "Siri access",
}
def __init__(self, udid: str | None = None):
"""Initialize privacy manager.
Args:
udid: Optional device UDID (auto-detects booted simulator if None)
"""
self.udid = udid
def grant_permission(
self,
bundle_id: str,
service: str,
scenario: str | None = None,
step: int | None = None,
) -> bool:
"""
Grant permission for app.
Args:
bundle_id: App bundle ID
service: Service name (camera, microphone, location, etc.)
scenario: Test scenario name for audit trail
step: Step number in test scenario
Returns:
Success status
"""
if service not in self.SUPPORTED_SERVICES:
print(f"Error: Unknown service '{service}'")
print(f"Supported: {', '.join(self.SUPPORTED_SERVICES.keys())}")
return False
cmd = ["xcrun", "simctl", "privacy"]
if self.udid:
cmd.append(self.udid)
else:
cmd.append("booted")
cmd.extend(["grant", service, bundle_id])
try:
subprocess.run(cmd, capture_output=True, check=True)
# Log audit entry
self._log_audit("grant", bundle_id, service, scenario, step)
return True
except subprocess.CalledProcessError:
return False
def revoke_permission(
self,
bundle_id: str,
service: str,
scenario: str | None = None,
step: int | None = None,
) -> bool:
"""
Revoke permission for app.
Args:
bundle_id: App bundle ID
service: Service name
scenario: Test scenario name for audit trail
step: Step number in test scenario
Returns:
Success status
"""
if service not in self.SUPPORTED_SERVICES:
print(f"Error: Unknown service '{service}'")
return False
cmd = ["xcrun", "simctl", "privacy"]
if self.udid:
cmd.append(self.udid)
else:
cmd.append("booted")
cmd.extend(["revoke", service, bundle_id])
try:
subprocess.run(cmd, capture_output=True, check=True)
# Log audit entry
self._log_audit("revoke", bundle_id, service, scenario, step)
return True
except subprocess.CalledProcessError:
return False
def reset_permission(
self,
bundle_id: str,
service: str,
scenario: str | None = None,
step: int | None = None,
) -> bool:
"""
Reset permission to default.
Args:
bundle_id: App bundle ID
service: Service name
scenario: Test scenario name for audit trail
step: Step number in test scenario
Returns:
Success status
"""
if service not in self.SUPPORTED_SERVICES:
print(f"Error: Unknown service '{service}'")
return False
cmd = ["xcrun", "simctl", "privacy"]
if self.udid:
cmd.append(self.udid)
else:
cmd.append("booted")
cmd.extend(["reset", service, bundle_id])
try:
subprocess.run(cmd, capture_output=True, check=True)
# Log audit entry
self._log_audit("reset", bundle_id, service, scenario, step)
return True
except subprocess.CalledProcessError:
return False
@staticmethod
def _log_audit(
action: str,
bundle_id: str,
service: str,
scenario: str | None = None,
step: int | None = None,
) -> None:
"""Log permission change to audit trail (for test tracking).
Args:
action: grant, revoke, or reset
bundle_id: App bundle ID
service: Service name
scenario: Test scenario name
step: Step number
"""
# Could write to file, but for now just log to stdout for transparency
timestamp = datetime.now().isoformat()
        location = f" (step {step})" if step is not None else ""
scenario_info = f" in {scenario}" if scenario else ""
print(
f"[Audit] {timestamp}: {action.upper()} {service} for {bundle_id}{scenario_info}{location}"
)
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(description="Manage iOS app privacy and permissions")
# Required
parser.add_argument("--bundle-id", required=True, help="App bundle ID (e.g., com.example.app)")
# Action (mutually exclusive)
action_group = parser.add_mutually_exclusive_group(required=True)
action_group.add_argument(
"--grant",
help="Grant permission (service name or comma-separated list)",
)
action_group.add_argument(
"--revoke", help="Revoke permission (service name or comma-separated list)"
)
action_group.add_argument(
"--reset",
help="Reset permission to default (service name or comma-separated list)",
)
action_group.add_argument(
"--list",
action="store_true",
help="List all supported services",
)
# Test tracking
parser.add_argument(
"--scenario",
help="Test scenario name for audit trail",
)
parser.add_argument("--step", type=int, help="Step number in test scenario")
# Device
parser.add_argument(
"--udid",
help="Device UDID (auto-detects booted simulator if not provided)",
)
args = parser.parse_args()
# List supported services
if args.list:
print("Supported Privacy Services:\n")
for service, description in PrivacyManager.SUPPORTED_SERVICES.items():
print(f" {service:<15} - {description}")
print()
print("Examples:")
print(" python scripts/privacy_manager.py --grant camera --bundle-id com.app")
print(" python scripts/privacy_manager.py --revoke location --bundle-id com.app")
print(" python scripts/privacy_manager.py --grant camera,photos --bundle-id com.app")
sys.exit(0)
# Resolve UDID with auto-detection
try:
udid = resolve_udid(args.udid)
except RuntimeError as e:
print(f"Error: {e}")
sys.exit(1)
manager = PrivacyManager(udid=udid)
# Parse service names (support comma-separated list)
if args.grant:
services = [s.strip() for s in args.grant.split(",")]
action = "grant"
action_fn = manager.grant_permission
elif args.revoke:
services = [s.strip() for s in args.revoke.split(",")]
action = "revoke"
action_fn = manager.revoke_permission
else: # reset
services = [s.strip() for s in args.reset.split(",")]
action = "reset"
action_fn = manager.reset_permission
# Execute action for each service
all_success = True
for service in services:
if service not in PrivacyManager.SUPPORTED_SERVICES:
print(f"Error: Unknown service '{service}'")
all_success = False
continue
success = action_fn(
args.bundle_id,
service,
scenario=args.scenario,
step=args.step,
)
if success:
description = PrivacyManager.SUPPORTED_SERVICES[service]
print(f"{action.capitalize()} {service}: {description}")
else:
print(f"✗ Failed to {action} {service}")
all_success = False
if not all_success:
sys.exit(1)
    # Summary
    if len(services) > 1:
        past = {"grant": "granted", "revoke": "revoked", "reset": "reset"}[action]
        print(f"\nPermissions {past}: {', '.join(services)}")
if args.scenario:
print(f"Test scenario: {args.scenario}" + (f" (step {args.step})" if args.step else ""))
if __name__ == "__main__":
main()

View File

@@ -1,240 +0,0 @@
#!/usr/bin/env python3
"""
iOS Push Notification Simulator
Send simulated push notifications to test notification handling.
Supports custom payloads and test tracking.
Usage: python scripts/push_notification.py --bundle-id com.app --title "Alert" --body "Message"
"""
import argparse
import json
import subprocess
import sys
import tempfile
from pathlib import Path
from common import resolve_udid
class PushNotificationSender:
"""Sends simulated push notifications to iOS simulator."""
def __init__(self, udid: str | None = None):
"""Initialize push notification sender.
Args:
udid: Optional device UDID (auto-detects booted simulator if None)
"""
self.udid = udid
def send(
self,
bundle_id: str,
payload: dict | str,
_test_name: str | None = None,
_expected_behavior: str | None = None,
) -> bool:
"""
Send push notification to app.
Args:
bundle_id: Target app bundle ID
payload: Push payload (dict or JSON string) or path to JSON file
            _test_name: Test scenario name for tracking (currently unused)
            _expected_behavior: Expected behavior after the notification arrives (currently unused)
Returns:
Success status
"""
# Handle different payload formats
if isinstance(payload, str):
# Check if it's a file path
payload_path = Path(payload)
if payload_path.exists():
with open(payload_path) as f:
payload_data = json.load(f)
else:
# Try to parse as JSON string
try:
payload_data = json.loads(payload)
except json.JSONDecodeError:
print(f"Error: Invalid JSON payload: {payload}")
return False
else:
payload_data = payload
# Ensure payload has aps dictionary
if "aps" not in payload_data:
payload_data = {"aps": payload_data}
# Create temp file with payload
try:
with tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False) as f:
json.dump(payload_data, f)
temp_payload_path = f.name
# Build simctl command
cmd = ["xcrun", "simctl", "push"]
if self.udid:
cmd.append(self.udid)
else:
cmd.append("booted")
cmd.extend([bundle_id, temp_payload_path])
            # Send notification, removing the temp payload file even on failure
            try:
                subprocess.run(cmd, capture_output=True, text=True, check=True)
            finally:
                Path(temp_payload_path).unlink(missing_ok=True)
            return True
except subprocess.CalledProcessError as e:
print(f"Error sending push notification: {e.stderr}")
return False
except Exception as e:
print(f"Error: {e}")
return False
def send_simple(
self,
bundle_id: str,
title: str | None = None,
body: str | None = None,
badge: int | None = None,
sound: bool = True,
) -> bool:
"""
Send simple push notification with common parameters.
Args:
bundle_id: Target app bundle ID
title: Alert title
body: Alert body
badge: Badge number
sound: Whether to play sound
Returns:
Success status
"""
payload = {}
if title or body:
alert = {}
if title:
alert["title"] = title
if body:
alert["body"] = body
payload["alert"] = alert
if badge is not None:
payload["badge"] = badge
if sound:
payload["sound"] = "default"
# Wrap in aps
full_payload = {"aps": payload}
return self.send(bundle_id, full_payload)
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(description="Send simulated push notification to iOS app")
# Required
parser.add_argument(
"--bundle-id", required=True, help="Target app bundle ID (e.g., com.example.app)"
)
# Simple payload options
parser.add_argument("--title", help="Alert title (for simple notifications)")
parser.add_argument("--body", help="Alert body message")
parser.add_argument("--badge", type=int, help="Badge number")
parser.add_argument("--no-sound", action="store_true", help="Don't play notification sound")
# Custom payload
parser.add_argument(
"--payload",
help="Custom JSON payload file or inline JSON string",
)
# Test tracking
parser.add_argument("--test-name", help="Test scenario name for tracking")
parser.add_argument(
"--expected",
help="Expected behavior after notification",
)
# Device
parser.add_argument(
"--udid",
help="Device UDID (auto-detects booted simulator if not provided)",
)
args = parser.parse_args()
# Resolve UDID with auto-detection
try:
udid = resolve_udid(args.udid)
except RuntimeError as e:
print(f"Error: {e}")
sys.exit(1)
sender = PushNotificationSender(udid=udid)
# Send notification
if args.payload:
# Custom payload mode
success = sender.send(args.bundle_id, args.payload)
else:
# Simple notification mode
success = sender.send_simple(
args.bundle_id,
title=args.title,
body=args.body,
badge=args.badge,
sound=not args.no_sound,
)
if success:
# Token-efficient output
output = "Push notification sent"
if args.test_name:
output += f" (test: {args.test_name})"
print(output)
if args.expected:
print(f"Expected: {args.expected}")
print()
print("Notification details:")
if args.title:
print(f" Title: {args.title}")
if args.body:
print(f" Body: {args.body}")
        if args.badge is not None:
            print(f"  Badge: {args.badge}")
print()
print("Verify notification handling:")
print("1. Check app log output: python scripts/log_monitor.py --app " + args.bundle_id)
print(
"2. Capture state: python scripts/app_state_capture.py --app-bundle-id "
+ args.bundle_id
)
else:
print("Failed to send push notification")
sys.exit(1)
if __name__ == "__main__":
main()

View File

@@ -1,292 +0,0 @@
#!/usr/bin/env python3
"""
iOS Screen Mapper - Current Screen Analyzer
Maps the current screen's UI elements for navigation decisions.
Provides token-efficient summaries of available interactions.
This script analyzes the iOS simulator screen using IDB's accessibility tree
and provides a compact, actionable summary of what's currently visible and
interactive on the screen. Perfect for AI agents making navigation decisions.
Key Features:
- Token-efficient output (5-7 lines by default)
- Identifies buttons, text fields, navigation elements
- Counts interactive and focusable elements
- Progressive detail with --verbose flag
- Navigation hints with --hints flag
Usage Examples:
# Quick summary (default)
python scripts/screen_mapper.py --udid <device-id>
# Detailed element breakdown
python scripts/screen_mapper.py --udid <device-id> --verbose
# Include navigation suggestions
python scripts/screen_mapper.py --udid <device-id> --hints
# Full JSON output for parsing
python scripts/screen_mapper.py --udid <device-id> --json
Output Format (default):
Screen: LoginViewController (45 elements, 7 interactive)
Buttons: "Login", "Cancel", "Forgot Password"
TextFields: 2 (0 filled)
Navigation: NavBar: "Sign In"
Focusable: 7 elements
Technical Details:
- Uses IDB's accessibility tree via `idb ui describe-all --json --nested`
- Parses IDB's array format: [{ root element with children }]
- Identifies element types: Button, TextField, NavigationBar, TabBar, etc.
- Extracts labels from AXLabel, AXValue, and AXUniqueId fields
"""
import argparse
import json
import subprocess
import sys
from collections import defaultdict
from common import get_accessibility_tree, resolve_udid
class ScreenMapper:
"""
Analyzes current screen for navigation decisions.
This class fetches the iOS accessibility tree from IDB and analyzes it
to provide actionable summaries for navigation. It categorizes elements
by type, counts interactive elements, and identifies key UI patterns.
Attributes:
udid (Optional[str]): Device UDID to target, or None for booted device
INTERACTIVE_TYPES (Set[str]): Element types that users can interact with
Design Philosophy:
- Token efficiency: Provide minimal but complete information
- Progressive disclosure: Summary by default, details on request
- Navigation-focused: Highlight elements relevant for automation
"""
# Element types we care about for navigation
# These are the accessibility element types that indicate user interaction points
INTERACTIVE_TYPES = {
"Button",
"Link",
"TextField",
"SecureTextField",
"Cell",
"Switch",
"Slider",
"Stepper",
"SegmentedControl",
"TabBar",
"NavigationBar",
"Toolbar",
}
def __init__(self, udid: str | None = None):
"""
Initialize screen mapper.
Args:
udid: Optional device UDID. If None, uses booted simulator.
Example:
mapper = ScreenMapper(udid="656DC652-1C9F-4AB2-AD4F-F38E65976BDA")
mapper = ScreenMapper() # Uses booted device
"""
self.udid = udid
def get_accessibility_tree(self) -> dict:
"""
Fetch accessibility tree from iOS simulator via IDB.
Delegates to shared utility for consistent tree fetching across all scripts.
"""
return get_accessibility_tree(self.udid, nested=True)
def analyze_tree(self, node: dict, depth: int = 0) -> dict:
"""Analyze accessibility tree for navigation info."""
analysis = {
"elements_by_type": defaultdict(list),
"total_elements": 0,
"interactive_elements": 0,
"text_fields": [],
"buttons": [],
"navigation": {},
"screen_name": None,
"focusable": 0,
}
self._analyze_recursive(node, analysis, depth)
# Post-process for clean output
analysis["elements_by_type"] = dict(analysis["elements_by_type"])
return analysis
def _analyze_recursive(self, node: dict, analysis: dict, depth: int):
"""Recursively analyze tree nodes."""
elem_type = node.get("type")
label = node.get("AXLabel", "")
value = node.get("AXValue", "")
identifier = node.get("AXUniqueId", "")
# Count element
if elem_type:
analysis["total_elements"] += 1
# Track by type
if elem_type in self.INTERACTIVE_TYPES:
analysis["interactive_elements"] += 1
# Store concise info (label only, not full node)
elem_info = label or value or identifier or "Unnamed"
analysis["elements_by_type"][elem_type].append(elem_info)
# Special handling for common types
if elem_type == "Button":
analysis["buttons"].append(elem_info)
elif elem_type in ("TextField", "SecureTextField"):
analysis["text_fields"].append(
{"type": elem_type, "label": elem_info, "has_value": bool(value)}
)
elif elem_type == "NavigationBar":
analysis["navigation"]["nav_title"] = label or "Navigation"
elif elem_type == "TabBar":
# Count tab items
tab_count = len(node.get("children", []))
analysis["navigation"]["tab_count"] = tab_count
# Track focusable elements
if node.get("enabled", False) and elem_type in self.INTERACTIVE_TYPES:
analysis["focusable"] += 1
# Try to identify screen name from view controller
if not analysis["screen_name"] and identifier:
if "ViewController" in identifier or "Screen" in identifier:
analysis["screen_name"] = identifier
# Process children
for child in node.get("children", []):
self._analyze_recursive(child, analysis, depth + 1)
def format_summary(self, analysis: dict, verbose: bool = False) -> str:
"""Format analysis as token-efficient summary."""
lines = []
# Screen identification (1 line)
screen = analysis["screen_name"] or "Unknown Screen"
total = analysis["total_elements"]
interactive = analysis["interactive_elements"]
lines.append(f"Screen: {screen} ({total} elements, {interactive} interactive)")
# Buttons summary (1 line)
if analysis["buttons"]:
button_list = ", ".join(f'"{b}"' for b in analysis["buttons"][:5])
if len(analysis["buttons"]) > 5:
button_list += f" +{len(analysis['buttons']) - 5} more"
lines.append(f"Buttons: {button_list}")
# Text fields summary (1 line)
if analysis["text_fields"]:
field_count = len(analysis["text_fields"])
filled = sum(1 for f in analysis["text_fields"] if f["has_value"])
lines.append(f"TextFields: {field_count} ({filled} filled)")
# Navigation summary (1 line)
nav_parts = []
if "nav_title" in analysis["navigation"]:
nav_parts.append(f"NavBar: \"{analysis['navigation']['nav_title']}\"")
if "tab_count" in analysis["navigation"]:
nav_parts.append(f"TabBar: {analysis['navigation']['tab_count']} tabs")
if nav_parts:
lines.append(f"Navigation: {', '.join(nav_parts)}")
# Focusable count (1 line)
lines.append(f"Focusable: {analysis['focusable']} elements")
# Verbose mode adds element type breakdown
if verbose:
lines.append("\nElements by type:")
for elem_type, items in analysis["elements_by_type"].items():
if items: # Only show types that exist
lines.append(f" {elem_type}: {len(items)}")
for item in items[:3]: # Show first 3
lines.append(f" - {item}")
if len(items) > 3:
lines.append(f" ... +{len(items) - 3} more")
return "\n".join(lines)
def get_navigation_hints(self, analysis: dict) -> list[str]:
"""Generate navigation hints based on screen analysis."""
hints = []
# Check for common patterns
if "Login" in str(analysis.get("buttons", [])):
hints.append("Login screen detected - find TextFields for credentials")
if analysis["text_fields"]:
unfilled = [f for f in analysis["text_fields"] if not f["has_value"]]
if unfilled:
hints.append(f"{len(unfilled)} empty text field(s) - may need input")
if not analysis["buttons"] and not analysis["text_fields"]:
hints.append("No interactive elements - try swiping or going back")
if "tab_count" in analysis.get("navigation", {}):
hints.append(f"Tab bar available with {analysis['navigation']['tab_count']} tabs")
return hints
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(description="Map current screen UI elements")
parser.add_argument("--verbose", action="store_true", help="Show detailed element breakdown")
parser.add_argument("--json", action="store_true", help="Output raw JSON analysis")
parser.add_argument("--hints", action="store_true", help="Include navigation hints")
parser.add_argument(
"--udid",
help="Device UDID (auto-detects booted simulator if not provided)",
)
args = parser.parse_args()
# Resolve UDID with auto-detection
try:
udid = resolve_udid(args.udid)
except RuntimeError as e:
print(f"Error: {e}")
sys.exit(1)
# Create mapper and analyze
mapper = ScreenMapper(udid=udid)
tree = mapper.get_accessibility_tree()
analysis = mapper.analyze_tree(tree)
# Output based on format
if args.json:
# Full JSON (verbose)
print(json.dumps(analysis, indent=2, default=str))
else:
# Token-efficient summary (default)
summary = mapper.format_summary(analysis, verbose=args.verbose)
print(summary)
# Add hints if requested
if args.hints:
hints = mapper.get_navigation_hints(analysis)
if hints:
print("\nHints:")
for hint in hints:
print(f" - {hint}")
if __name__ == "__main__":
main()
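The hint heuristics above are small enough to exercise in isolation. A minimal sketch, assuming a hypothetical `analysis` dict shaped like the mapper's output (the field names mirror the code above; the sample values are invented):

```python
def navigation_hints(analysis: dict) -> list[str]:
    """Standalone mirror of the get_navigation_hints() heuristics above."""
    hints = []
    # Crude pattern match: any button label containing "Login"
    if "Login" in str(analysis.get("buttons", [])):
        hints.append("Login screen detected - find TextFields for credentials")
    unfilled = [f for f in analysis.get("text_fields", []) if not f["has_value"]]
    if unfilled:
        hints.append(f"{len(unfilled)} empty text field(s) - may need input")
    if not analysis.get("buttons") and not analysis.get("text_fields"):
        hints.append("No interactive elements - try swiping or going back")
    if "tab_count" in analysis.get("navigation", {}):
        hints.append(f"Tab bar available with {analysis['navigation']['tab_count']} tabs")
    return hints

# Hypothetical screen analysis: a login form with one empty secure field
sample = {
    "buttons": [{"label": "Login"}],
    "text_fields": [{"type": "secure", "has_value": False}],
    "navigation": {"tab_count": 3},
}
```

Feeding `sample` through yields three hints (login detected, one empty field, tab bar available), which is the kind of compact nudge the `--hints` flag prints.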


@@ -1,239 +0,0 @@
#!/usr/bin/env bash
#
# iOS Simulator Testing Environment Health Check
#
# Verifies that all required tools and dependencies are properly installed
# and configured for iOS simulator testing.
#
# Usage: bash scripts/sim_health_check.sh [--help]
set -e
# Color codes for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Check flags
SHOW_HELP=false
# Parse arguments
for arg in "$@"; do
case $arg in
--help|-h)
SHOW_HELP=true
shift
;;
esac
done
if [ "$SHOW_HELP" = true ]; then
cat <<EOF
iOS Simulator Testing - Environment Health Check
Verifies that your environment is properly configured for iOS simulator testing.
Usage: bash scripts/sim_health_check.sh [options]
Options:
--help, -h Show this help message
This script checks for:
- Xcode Command Line Tools installation
- iOS Simulator availability
- IDB (iOS Development Bridge) installation
- Available simulator devices
- Python 3 installation (for scripts)
Exit codes:
0 - All checks passed
1 - One or more checks failed (see output for details)
EOF
exit 0
fi
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE} iOS Simulator Testing - Environment Health Check${NC}"
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo ""
CHECKS_PASSED=0
CHECKS_FAILED=0
# Function to print check status
check_passed() {
echo -e "${GREEN}✓${NC} $1"
# Arithmetic assignment rather than ((VAR++)): under `set -e`, ((VAR++))
# aborts the script when VAR is 0, because the expression evaluates to 0
CHECKS_PASSED=$((CHECKS_PASSED + 1))
}
check_failed() {
echo -e "${RED}✗${NC} $1"
CHECKS_FAILED=$((CHECKS_FAILED + 1))
}
check_warning() {
echo -e "${YELLOW}⚠${NC} $1"
}
# Check 1: macOS
echo -e "${BLUE}[1/8]${NC} Checking operating system..."
if [[ "$OSTYPE" == "darwin"* ]]; then
OS_VERSION=$(sw_vers -productVersion)
check_passed "macOS detected (version $OS_VERSION)"
else
check_failed "Not running on macOS (detected: $OSTYPE)"
echo " iOS Simulator testing requires macOS"
fi
echo ""
# Check 2: Xcode Command Line Tools
echo -e "${BLUE}[2/8]${NC} Checking Xcode Command Line Tools..."
if command -v xcrun &> /dev/null; then
XCODE_PATH=$(xcode-select -p 2>/dev/null || echo "not found")
if [ "$XCODE_PATH" != "not found" ]; then
XCODE_VERSION=$(xcodebuild -version 2>/dev/null | head -n 1 || echo "Unknown")
check_passed "Xcode Command Line Tools installed"
echo " Path: $XCODE_PATH"
echo " Version: $XCODE_VERSION"
else
check_failed "Xcode Command Line Tools path not set"
echo " Run: xcode-select --install"
fi
else
check_failed "xcrun command not found"
echo " Install Xcode Command Line Tools: xcode-select --install"
fi
echo ""
# Check 3: simctl availability
echo -e "${BLUE}[3/8]${NC} Checking simctl (Simulator Control)..."
if command -v xcrun &> /dev/null && xcrun simctl help &> /dev/null; then
check_passed "simctl is available"
else
check_failed "simctl not available"
echo " simctl comes with Xcode Command Line Tools"
fi
echo ""
# Check 4: IDB installation
echo -e "${BLUE}[4/8]${NC} Checking IDB (iOS Development Bridge)..."
if command -v idb &> /dev/null; then
IDB_PATH=$(which idb)
IDB_VERSION=$(idb --version 2>/dev/null || echo "Unknown")
check_passed "IDB is installed"
echo " Path: $IDB_PATH"
echo " Version: $IDB_VERSION"
else
check_warning "IDB not found in PATH"
echo " IDB is optional but provides advanced UI automation"
echo " Install: https://fbidb.io/docs/installation"
echo " Recommended: brew tap facebook/fb && brew install idb-companion"
fi
echo ""
# Check 5: Python 3 installation
echo -e "${BLUE}[5/8]${NC} Checking Python 3..."
if command -v python3 &> /dev/null; then
PYTHON_VERSION=$(python3 --version | cut -d' ' -f2)
check_passed "Python 3 is installed (version $PYTHON_VERSION)"
else
check_failed "Python 3 not found"
echo " Python 3 is required for testing scripts"
echo " Install: brew install python3"
fi
echo ""
# Check 6: Available simulators
echo -e "${BLUE}[6/8]${NC} Checking available iOS Simulators..."
if command -v xcrun &> /dev/null; then
# grep -c already prints "0" on no match; `|| true` only neutralises its exit code
SIMULATOR_COUNT=$(xcrun simctl list devices available 2>/dev/null | grep -c "iPhone\|iPad" || true)
if [ "$SIMULATOR_COUNT" -gt 0 ]; then
check_passed "Found $SIMULATOR_COUNT available simulator(s)"
# Show first 5 simulators
echo ""
echo " Available simulators (showing up to 5):"
xcrun simctl list devices available 2>/dev/null | grep "iPhone\|iPad" | head -5 | while read -r line; do
echo " - $line"
done
else
check_warning "No simulators found"
echo " Create simulators via Xcode or simctl"
echo " Example: xcrun simctl create 'iPhone 15' 'iPhone 15'"
fi
else
check_failed "Cannot check simulators (simctl not available)"
fi
echo ""
# Check 7: Booted simulators
echo -e "${BLUE}[7/8]${NC} Checking booted simulators..."
if command -v xcrun &> /dev/null; then
BOOTED_SIMS=$(xcrun simctl list devices booted 2>/dev/null | grep -c "iPhone\|iPad" || true)
if [ "$BOOTED_SIMS" -gt 0 ]; then
check_passed "$BOOTED_SIMS simulator(s) currently booted"
echo ""
echo " Booted simulators:"
xcrun simctl list devices booted 2>/dev/null | grep "iPhone\|iPad" | while read -r line; do
echo " - $line"
done
else
check_warning "No simulators currently booted"
echo " Boot a simulator to begin testing"
echo " Example: xcrun simctl boot <device-udid>"
echo " Or: open -a Simulator"
fi
else
check_failed "Cannot check booted simulators (simctl not available)"
fi
echo ""
# Check 8: Required Python packages (optional check)
echo -e "${BLUE}[8/8]${NC} Checking Python packages..."
if command -v python3 &> /dev/null; then
MISSING_PACKAGES=()
# Check for PIL/Pillow (for visual_diff.py)
if python3 -c "import PIL" 2>/dev/null; then
check_passed "Pillow (PIL) installed - visual diff available"
else
MISSING_PACKAGES+=("pillow")
check_warning "Pillow (PIL) not installed - visual diff won't work"
fi
if [ ${#MISSING_PACKAGES[@]} -gt 0 ]; then
echo ""
echo " Install missing packages:"
echo " pip3 install ${MISSING_PACKAGES[*]}"
fi
else
check_warning "Cannot check Python packages (Python 3 not available)"
fi
echo ""
# Summary
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE} Summary${NC}"
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo ""
echo -e "Checks passed: ${GREEN}$CHECKS_PASSED${NC}"
if [ "$CHECKS_FAILED" -gt 0 ]; then
echo -e "Checks failed: ${RED}$CHECKS_FAILED${NC}"
echo ""
echo -e "${YELLOW}Action required:${NC} Fix the failed checks above before testing"
exit 1
else
echo ""
echo -e "${GREEN}✓ Environment is ready for iOS simulator testing${NC}"
echo ""
echo "Next steps:"
echo " 1. Boot a simulator: open -a Simulator"
echo " 2. Launch your app: xcrun simctl launch booted <bundle-id>"
echo " 3. Run accessibility audit: python scripts/accessibility_audit.py"
exit 0
fi
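One subtlety worth spelling out in the simulator-counting lines: `grep -c` prints `0` itself when nothing matches but still exits non-zero, so wrapping it in `$(... || echo "0")` yields *two* lines of output and breaks numeric tests like `[ "$COUNT" -gt 0 ]`. A minimal sketch of the failure and the `|| true` fix (the `watchOS` input is just an illustrative non-matching line):

```shell
# Broken pattern: grep -c prints "0" AND exits 1 on no match, so the
# fallback echo appends a second "0" -> BROKEN holds "0<newline>0".
BROKEN=$(printf 'watchOS\n' | grep -c "iPhone\|iPad" || echo "0")

# Fixed pattern: keep grep's own "0" and only neutralise the exit code.
COUNT=$(printf 'watchOS\n' | grep -c "iPhone\|iPad" || true)

echo "broken='$BROKEN' fixed='$COUNT'"
```

With the fixed pattern, `COUNT` is always a single well-formed integer, whether or not any device lines match.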


@@ -1,299 +0,0 @@
#!/usr/bin/env python3
"""
iOS Simulator Listing with Progressive Disclosure
Lists available simulators with token-efficient summaries.
Full details available on demand via cache IDs.
Achieves 96% token reduction (57k→2k tokens) for common queries.
Usage Examples:
# Concise summary (default)
python scripts/sim_list.py
# Get full details for cached list
python scripts/sim_list.py --get-details <cache-id>
# Get recommendations
python scripts/sim_list.py --suggest
# Filter by device type
python scripts/sim_list.py --device-type iPhone
Output (default):
Simulator Summary [cache-sim-20251028-143052]
├─ Total: 47 devices
├─ Available: 31
└─ Booted: 1
✓ iPhone 16 Pro (iOS 18.1) [ABC-123...]
Use --get-details cache-sim-20251028-143052 for full list
Technical Details:
- Uses xcrun simctl list devices
- Caches results with 1-hour TTL
- Reduces output by 96% by default
- Token efficiency: summary = ~30 tokens, full list = ~1500 tokens
"""
import argparse
import json
import subprocess
import sys
from common import get_cache
class SimulatorLister:
"""Lists iOS simulators with progressive disclosure."""
def __init__(self):
"""Initialize lister with cache."""
self.cache = get_cache()
def list_simulators(self) -> dict:
"""
Get list of all simulators.
Returns:
Dict with structure:
{
"devices": [...],
"runtimes": [...],
"total_devices": int,
"available_devices": int,
"booted_devices": [...]
}
"""
try:
result = subprocess.run(
["xcrun", "simctl", "list", "devices", "--json"],
capture_output=True,
text=True,
check=True,
)
return json.loads(result.stdout)
except (subprocess.CalledProcessError, json.JSONDecodeError):
return {"devices": {}, "runtimes": []}
def parse_devices(self, sim_data: dict) -> list[dict]:
"""
Parse simulator data into flat list.
Returns:
List of device dicts with runtime info
"""
devices = []
devices_by_runtime = sim_data.get("devices", {})
for runtime_str, device_list in devices_by_runtime.items():
# Extract iOS version from runtime string
# Format: "iOS 18.1", "tvOS 18", etc.
runtime_name = runtime_str.replace(" Simulator", "").strip()
for device in device_list:
devices.append(
{
"name": device.get("name"),
"udid": device.get("udid"),
"state": device.get("state"),
"runtime": runtime_name,
"is_available": device.get("isAvailable", False),
}
)
return devices
def get_concise_summary(self, devices: list[dict]) -> dict:
"""
Generate concise summary with cache ID.
Returns 96% fewer tokens than full list.
"""
booted = [d for d in devices if d["state"] == "Booted"]
available = [d for d in devices if d["is_available"]]
iphone = [d for d in available if "iPhone" in d["name"]]
# Cache full list for later retrieval
from datetime import datetime  # local import, matching the script's lazy-import style

cache_id = self.cache.save(
{
"devices": devices,
"timestamp": datetime.now().isoformat(),
},
"simulator-list",
)
return {
"cache_id": cache_id,
"summary": {
"total_devices": len(devices),
"available_devices": len(available),
"booted_devices": len(booted),
},
"quick_access": {
"booted": booted[:3] if booted else [],
"recommended_iphone": iphone[:3] if iphone else [],
},
}
def get_full_list(
self,
cache_id: str,
device_type: str | None = None,
runtime: str | None = None,
) -> list[dict] | None:
"""
Retrieve full simulator list from cache.
Args:
cache_id: Cache ID from concise summary
device_type: Filter by type (iPhone, iPad, etc.)
runtime: Filter by iOS version
Returns:
List of devices matching filters
"""
data = self.cache.get(cache_id)
if not data:
return None
devices = data.get("devices", [])
# Apply filters
if device_type:
devices = [d for d in devices if device_type in d["name"]]
if runtime:
devices = [d for d in devices if runtime.lower() in d["runtime"].lower()]
return devices
def suggest_simulators(self, limit: int = 4) -> list[dict]:
"""
Get simulator recommendations.
Returns:
List of recommended simulators (best candidates for building)
"""
all_sims = self.list_simulators()
devices = self.parse_devices(all_sims)
# Score devices for recommendations
scored = []
for device in devices:
score = 0
# Prefer booted
if device["state"] == "Booted":
score += 10
# Prefer available
if device["is_available"]:
score += 5
# Prefer recent iOS versions
ios_version = device["runtime"]
if "18" in ios_version:
score += 3
elif "17" in ios_version:
score += 2
# Prefer iPhones over other types
if "iPhone" in device["name"]:
score += 1
scored.append({"device": device, "score": score})
# Sort by score and return top N
scored.sort(key=lambda x: x["score"], reverse=True)
return [s["device"] for s in scored[:limit]]
def format_device(device: dict) -> str:
"""Format device for display."""
state_icon = "●" if device["state"] == "Booted" else " "
avail_icon = "✓" if device["is_available"] else "✗"
name = device["name"]
runtime = device["runtime"]
udid_short = device["udid"][:8] + "..."
return f"{state_icon} {avail_icon} {name} ({runtime}) [{udid_short}]"
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(description="List iOS simulators with progressive disclosure")
parser.add_argument(
"--get-details",
metavar="CACHE_ID",
help="Get full details for cached simulator list",
)
parser.add_argument("--suggest", action="store_true", help="Get simulator recommendations")
parser.add_argument(
"--device-type",
help="Filter by device type (iPhone, iPad, Apple Watch, etc.)",
)
parser.add_argument("--runtime", help="Filter by iOS version (e.g., iOS-18, iOS-17)")
parser.add_argument("--json", action="store_true", help="Output as JSON")
args = parser.parse_args()
lister = SimulatorLister()
# Get full list with details
if args.get_details:
devices = lister.get_full_list(
args.get_details, device_type=args.device_type, runtime=args.runtime
)
if devices is None:
print(f"Error: Cache ID not found or expired: {args.get_details}")
sys.exit(1)
if args.json:
print(json.dumps(devices, indent=2))
else:
print(f"Simulators ({len(devices)}):\n")
for device in devices:
print(f" {format_device(device)}")
# Get recommendations
elif args.suggest:
suggestions = lister.suggest_simulators()
if args.json:
print(json.dumps(suggestions, indent=2))
else:
print("Recommended Simulators:\n")
for i, device in enumerate(suggestions, 1):
print(f"{i}. {format_device(device)}")
# Default: concise summary
else:
all_sims = lister.list_simulators()
devices = lister.parse_devices(all_sims)
summary = lister.get_concise_summary(devices)
if args.json:
print(json.dumps(summary, indent=2))
else:
# Human-readable concise output
cache_id = summary["cache_id"]
s = summary["summary"]
q = summary["quick_access"]
print(f"Simulator Summary [{cache_id}]")
print(f"├─ Total: {s['total_devices']} devices")
print(f"├─ Available: {s['available_devices']}")
print(f"└─ Booted: {s['booted_devices']}")
if q["booted"]:
print()
for device in q["booted"]:
print(f" {format_device(device)}")
print()
print(f"Use --get-details {cache_id} for full list")
if __name__ == "__main__":
main()
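The recommendation weights in `suggest_simulators()` are simple enough to check by hand. A standalone sketch using the same scoring rules, with hypothetical device dicts shaped like `parse_devices()` output:

```python
def score_device(device: dict) -> int:
    """Scoring weights copied from suggest_simulators() above."""
    score = 0
    if device["state"] == "Booted":       # prefer booted
        score += 10
    if device["is_available"]:            # prefer available
        score += 5
    if "18" in device["runtime"]:         # prefer recent iOS versions
        score += 3
    elif "17" in device["runtime"]:
        score += 2
    if "iPhone" in device["name"]:        # prefer iPhones
        score += 1
    return score

# Hypothetical parse_devices() output
devices = [
    {"name": "iPad Air", "state": "Shutdown", "is_available": True, "runtime": "iOS 17.5"},
    {"name": "iPhone 16 Pro", "state": "Booted", "is_available": True, "runtime": "iOS 18.1"},
]
ranked = sorted(devices, key=score_device, reverse=True)
```

A booted, available iPhone on iOS 18 scores 10 + 5 + 3 + 1 = 19 and sorts ahead of the shutdown iPad's 5 + 2 = 7, which matches the intent of the `--suggest` flag.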

View File

@@ -1,297 +0,0 @@
#!/usr/bin/env python3
"""
Boot iOS simulators and wait for readiness.
This script boots one or more simulators and optionally waits for them to reach
a ready state. It measures boot time and provides progress feedback.
Key features:
- Boot by UDID or device name
- Wait for device readiness with configurable timeout
- Measure boot performance
- Batch boot operations (boot all, boot by type)
- Progress reporting for CI/CD pipelines
"""
import argparse
import subprocess
import sys
import time
from common.device_utils import (
get_booted_device_udid,
list_simulators,
resolve_device_identifier,
)
class SimulatorBooter:
"""Boot iOS simulators with optional readiness waiting."""
def __init__(self, udid: str | None = None):
"""Initialize booter with optional device UDID."""
self.udid = udid
def boot(self, wait_ready: bool = False, timeout_seconds: int = 120) -> tuple[bool, str]:
"""
Boot simulator and optionally wait for readiness.
Args:
wait_ready: Wait for device to be ready before returning
timeout_seconds: Maximum seconds to wait for readiness
Returns:
(success, message) tuple
"""
if not self.udid:
return False, "Error: Device UDID not specified"
start_time = time.time()
# Check if already booted
try:
booted = get_booted_device_udid()
if booted == self.udid:
elapsed = time.time() - start_time
return True, (f"Device already booted: {self.udid} " f"[checked in {elapsed:.1f}s]")
except RuntimeError:
pass # No booted device, proceed with boot
# Execute boot command
try:
cmd = ["xcrun", "simctl", "boot", self.udid]
result = subprocess.run(cmd, check=False, capture_output=True, text=True, timeout=30)
if result.returncode != 0:
error = result.stderr.strip()
return False, f"Boot failed: {error}"
except subprocess.TimeoutExpired:
return False, "Boot command timed out"
except Exception as e:
return False, f"Boot error: {e}"
# Optionally wait for readiness
if wait_ready:
ready, wait_message = self._wait_for_ready(timeout_seconds)
elapsed = time.time() - start_time
if ready:
return True, (f"Device booted and ready: {self.udid} " f"[{elapsed:.1f}s total]")
return False, wait_message
elapsed = time.time() - start_time
return True, (
f"Device booted: {self.udid} [boot in {elapsed:.1f}s] "
"(use --wait-ready to wait for availability)"
)
def _wait_for_ready(self, timeout_seconds: int = 120) -> tuple[bool, str]:
"""
Wait for device to reach ready state.
Args:
timeout_seconds: Maximum seconds to wait
Returns:
(success, message) tuple
"""
start_time = time.time()
poll_interval = 0.5
checks = 0
while time.time() - start_time < timeout_seconds:
try:
checks += 1
# Check if device responds to simctl commands
result = subprocess.run(
["xcrun", "simctl", "spawn", self.udid, "launchctl", "list"],
check=False,
capture_output=True,
text=True,
timeout=5,
)
if result.returncode == 0:
elapsed = time.time() - start_time
return True, (
f"Device ready: {self.udid} " f"[{elapsed:.1f}s, {checks} checks]"
)
except (subprocess.TimeoutExpired, RuntimeError):
pass # Not ready yet
time.sleep(poll_interval)
elapsed = time.time() - start_time
return False, (
f"Boot timeout: Device did not reach ready state "
f"within {elapsed:.1f}s ({checks} checks)"
)
@staticmethod
def boot_all() -> tuple[int, int]:
"""
Boot all available simulators.
Returns:
(succeeded, failed) tuple with counts
"""
simulators = list_simulators(state="available")
succeeded = 0
failed = 0
for sim in simulators:
booter = SimulatorBooter(udid=sim["udid"])
success, _message = booter.boot(wait_ready=False)
if success:
succeeded += 1
else:
failed += 1
return succeeded, failed
@staticmethod
def boot_by_type(device_type: str) -> tuple[int, int]:
"""
Boot all simulators of a specific type.
Args:
device_type: Device type filter (e.g., "iPhone", "iPad")
Returns:
(succeeded, failed) tuple with counts
"""
simulators = list_simulators(state="available")
succeeded = 0
failed = 0
for sim in simulators:
if device_type.lower() in sim["name"].lower():
booter = SimulatorBooter(udid=sim["udid"])
success, _message = booter.boot(wait_ready=False)
if success:
succeeded += 1
else:
failed += 1
return succeeded, failed
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(description="Boot iOS simulators and wait for readiness")
parser.add_argument(
"--udid",
help="Device UDID or name (required unless using --all or --type)",
)
parser.add_argument(
"--name",
help="Device name (alternative to --udid)",
)
parser.add_argument(
"--wait-ready",
action="store_true",
help="Wait for device to reach ready state",
)
parser.add_argument(
"--timeout",
type=int,
default=120,
help="Timeout for --wait-ready in seconds (default: 120)",
)
parser.add_argument(
"--all",
action="store_true",
help="Boot all available simulators",
)
parser.add_argument(
"--type",
help="Boot all simulators of a specific type (e.g., iPhone, iPad)",
)
parser.add_argument(
"--json",
action="store_true",
help="Output as JSON",
)
args = parser.parse_args()
# Handle batch operations
if args.all:
succeeded, failed = SimulatorBooter.boot_all()
if args.json:
import json
print(
json.dumps(
{
"action": "boot_all",
"succeeded": succeeded,
"failed": failed,
"total": succeeded + failed,
}
)
)
else:
total = succeeded + failed
print(f"Boot summary: {succeeded}/{total} succeeded, " f"{failed} failed")
sys.exit(0 if failed == 0 else 1)
if args.type:
succeeded, failed = SimulatorBooter.boot_by_type(args.type)
if args.json:
import json
print(
json.dumps(
{
"action": "boot_by_type",
"type": args.type,
"succeeded": succeeded,
"failed": failed,
"total": succeeded + failed,
}
)
)
else:
total = succeeded + failed
print(f"Boot {args.type} summary: {succeeded}/{total} succeeded, " f"{failed} failed")
sys.exit(0 if failed == 0 else 1)
# Resolve device identifier
device_id = args.udid or args.name
if not device_id:
print("Error: Specify --udid, --name, --all, or --type", file=sys.stderr)
sys.exit(1)
try:
udid = resolve_device_identifier(device_id)
except RuntimeError as e:
print(f"Error: {e}", file=sys.stderr)
sys.exit(1)
# Boot device
booter = SimulatorBooter(udid=udid)
success, message = booter.boot(wait_ready=args.wait_ready, timeout_seconds=args.timeout)
if args.json:
import json
print(
json.dumps(
{
"action": "boot",
"device_id": device_id,
"udid": udid,
"success": success,
"message": message,
}
)
)
else:
print(message)
sys.exit(0 if success else 1)
if __name__ == "__main__":
main()
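The core of `_wait_for_ready()` is a poll-until-ready loop with a deadline. A minimal sketch of the same loop with the probe injected as a callable, so it can be exercised without a simulator (the real script's probe is `xcrun simctl spawn <udid> launchctl list`):

```python
import time

def wait_until_ready(probe, timeout_seconds: float = 120,
                     poll_interval: float = 0.5) -> tuple[bool, int]:
    """Poll probe() until it returns True or the deadline passes.

    Mirrors the _wait_for_ready() loop above; returns (ready, checks).
    """
    deadline = time.monotonic() + timeout_seconds
    checks = 0
    while time.monotonic() < deadline:
        checks += 1
        if probe():
            return True, checks
        time.sleep(poll_interval)  # back off between checks
    return False, checks

# Hypothetical probe that succeeds on the third attempt
attempts = iter([False, False, True])
```

Using `time.monotonic()` for the deadline avoids surprises if the wall clock is adjusted mid-boot, and returning the check count gives the same "[N checks]" telemetry the script reports.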


@@ -1,316 +0,0 @@
#!/usr/bin/env python3
"""
Create iOS simulators dynamically.
This script creates new simulators with specified device type and iOS version.
Useful for CI/CD pipelines that need on-demand test device provisioning.
Key features:
- Create by device type (iPhone 16 Pro, iPad Air, etc.)
- Specify iOS version (17.0, 18.0, etc.)
- Custom device naming
- Return newly created device UDID
- List available device types and runtimes
"""
import argparse
import subprocess
import sys
from common.device_utils import list_simulators
class SimulatorCreator:
"""Create iOS simulators with specified configurations."""
def __init__(self):
"""Initialize simulator creator."""
def create(
self,
device_type: str,
ios_version: str | None = None,
custom_name: str | None = None,
) -> tuple[bool, str, str | None]:
"""
Create new iOS simulator.
Args:
device_type: Device type (e.g., "iPhone 16 Pro", "iPad Air")
ios_version: iOS version (e.g., "18.0"). If None, uses latest.
custom_name: Custom device name. If None, uses default.
Returns:
(success, message, new_udid) tuple
"""
# Get available device types and runtimes
available_types = self._get_device_types()
if not available_types:
return False, "Failed to get available device types", None
# Normalize device type
device_type_id = None
for dt in available_types:
if device_type.lower() in dt["name"].lower():
device_type_id = dt["identifier"]
break
if not device_type_id:
return (
False,
f"Device type '{device_type}' not found. "
f"Use --list-devices for available types.",
None,
)
# Get available runtimes
available_runtimes = self._get_runtimes()
if not available_runtimes:
return False, "Failed to get available runtimes", None
# Resolve iOS version
runtime_id = None
if ios_version:
for rt in available_runtimes:
if ios_version in rt["name"]:
runtime_id = rt["identifier"]
break
if not runtime_id:
return (
False,
f"iOS version '{ios_version}' not found. "
f"Use --list-runtimes for available versions.",
None,
)
# Use latest runtime
elif available_runtimes:
runtime_id = available_runtimes[-1]["identifier"]
if not runtime_id:
return False, "No iOS runtime available", None
# Create device
try:
# Build device name
device_name = (
custom_name or f"{device_type_id.split('.')[-1]}-{ios_version or 'latest'}"
)
cmd = [
"xcrun",
"simctl",
"create",
device_name,
device_type_id,
runtime_id,
]
result = subprocess.run(cmd, check=False, capture_output=True, text=True, timeout=60)
if result.returncode != 0:
error = result.stderr.strip() or result.stdout.strip()
return False, f"Creation failed: {error}", None
# Extract UDID from output
new_udid = result.stdout.strip()
return (
True,
f"Device created: {device_name} ({device_type}) iOS {ios_version or 'latest'} "
f"UDID: {new_udid}",
new_udid,
)
except subprocess.TimeoutExpired:
return False, "Creation command timed out", None
except Exception as e:
return False, f"Creation error: {e}", None
@staticmethod
def _get_device_types() -> list[dict]:
"""
Get available device types.
Returns:
List of device type dicts with "name" and "identifier" keys
"""
try:
cmd = ["xcrun", "simctl", "list", "devicetypes", "-j"]
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
import json
data = json.loads(result.stdout)
devices = []
for device in data.get("devicetypes", []):
devices.append(
{
"name": device.get("name", ""),
"identifier": device.get("identifier", ""),
}
)
return devices
except Exception:
return []
@staticmethod
def _get_runtimes() -> list[dict]:
"""
Get available iOS runtimes.
Returns:
List of runtime dicts with "name" and "identifier" keys
"""
try:
cmd = ["xcrun", "simctl", "list", "runtimes", "-j"]
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
import json
data = json.loads(result.stdout)
runtimes = []
for runtime in data.get("runtimes", []):
# Only include iOS runtimes (skip watchOS, tvOS, etc.)
identifier = runtime.get("identifier", "")
if "iOS" in identifier or "iOS" in runtime.get("name", ""):
runtimes.append(
{
"name": runtime.get("name", ""),
"identifier": runtime.get("identifier", ""),
}
)
# Sort by version number (latest first). A plain string sort on the
# identifier would mis-order multi-digit versions (e.g. iOS-9 above
# iOS-18), so parse the numeric suffix instead.
def _version_key(r: dict) -> tuple:
suffix = r.get("identifier", "").rsplit("iOS-", 1)[-1]
return tuple(int(p) for p in suffix.split("-") if p.isdigit())
runtimes.sort(key=_version_key, reverse=True)
return runtimes
except Exception:
return []
@staticmethod
def list_device_types() -> list[dict]:
"""
List all available device types.
Returns:
List of device types with name and identifier
"""
return SimulatorCreator._get_device_types()
@staticmethod
def list_runtimes() -> list[dict]:
"""
List all available iOS runtimes.
Returns:
List of runtimes with name and identifier
"""
return SimulatorCreator._get_runtimes()
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(description="Create iOS simulators dynamically")
parser.add_argument(
"--device",
required=False,
help="Device type (e.g., 'iPhone 16 Pro', 'iPad Air')",
)
parser.add_argument(
"--runtime",
help="iOS version (e.g., '18.0', '17.0'). Defaults to latest.",
)
parser.add_argument(
"--name",
help="Custom device name. Defaults to auto-generated.",
)
parser.add_argument(
"--list-devices",
action="store_true",
help="List all available device types",
)
parser.add_argument(
"--list-runtimes",
action="store_true",
help="List all available iOS runtimes",
)
parser.add_argument(
"--json",
action="store_true",
help="Output as JSON",
)
args = parser.parse_args()
creator = SimulatorCreator()
# Handle info queries
if args.list_devices:
devices = creator.list_device_types()
if args.json:
import json
print(json.dumps({"devices": devices}))
else:
print(f"Available device types ({len(devices)}):")
for dev in devices[:20]: # Show first 20
print(f" - {dev['name']}")
if len(devices) > 20:
print(f" ... and {len(devices) - 20} more")
sys.exit(0)
if args.list_runtimes:
runtimes = creator.list_runtimes()
if args.json:
import json
print(json.dumps({"runtimes": runtimes}))
else:
print(f"Available iOS runtimes ({len(runtimes)}):")
for rt in runtimes:
print(f" - {rt['name']}")
sys.exit(0)
# Create device
if not args.device:
print(
"Error: Specify --device, --list-devices, or --list-runtimes",
file=sys.stderr,
)
sys.exit(1)
success, message, new_udid = creator.create(
device_type=args.device,
ios_version=args.runtime,
custom_name=args.name,
)
if args.json:
import json
print(
json.dumps(
{
"action": "create",
"device_type": args.device,
"runtime": args.runtime,
"success": success,
"message": message,
"new_udid": new_udid,
}
)
)
else:
print(message)
sys.exit(0 if success else 1)
if __name__ == "__main__":
main()
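Runtime identifiers follow CoreSimulator's `...SimRuntime.iOS-<major>-<minor>` naming, and sorting them as plain strings mis-orders multi-digit versions, which is why a numeric key matters when picking the "latest" runtime. A minimal sketch (the identifiers are illustrative but follow the real format):

```python
def runtime_version_key(identifier: str) -> tuple[int, ...]:
    """Extract the numeric version from a CoreSimulator runtime identifier.

    e.g. "com.apple.CoreSimulator.SimRuntime.iOS-18-1" -> (18, 1)
    """
    suffix = identifier.rsplit("iOS-", 1)[-1]
    return tuple(int(part) for part in suffix.split("-") if part.isdigit())

ids = [
    "com.apple.CoreSimulator.SimRuntime.iOS-9-3",
    "com.apple.CoreSimulator.SimRuntime.iOS-18-1",
    "com.apple.CoreSimulator.SimRuntime.iOS-17-0",
]
# Lexicographic sort compares '9' > '1', so iOS-9-3 wins a reverse
# string sort; the numeric key compares (9, 3) < (18, 1) correctly.
by_string = sorted(ids, reverse=True)
by_version = sorted(ids, key=runtime_version_key, reverse=True)
```

The same caveat applies anywhere a version string is used as a sort key, such as keeping the "newest" simulators per device type during cleanup.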


@@ -1,357 +0,0 @@
#!/usr/bin/env python3
"""
Delete iOS simulators permanently.
This script permanently removes simulators and frees disk space.
Includes safety confirmation to prevent accidental deletion.
Key features:
- Delete by UDID or device name
- Confirmation required for safety
- Batch delete operations
- Report freed disk space estimate
"""
import argparse
import subprocess
import sys
from common.device_utils import (
list_simulators,
resolve_device_identifier,
)
class SimulatorDeleter:
"""Delete iOS simulators with safety confirmation."""
def __init__(self, udid: str | None = None):
"""Initialize with optional device UDID."""
self.udid = udid
def delete(self, confirm: bool = False) -> tuple[bool, str]:
"""
Delete simulator permanently.
Args:
confirm: Skip confirmation prompt (for batch operations)
Returns:
(success, message) tuple
"""
if not self.udid:
return False, "Error: Device UDID not specified"
# Safety confirmation
if not confirm:
try:
response = input(
f"Permanently delete simulator {self.udid}? " f"(type 'yes' to confirm): "
)
if response.lower() != "yes":
return False, "Deletion cancelled by user"
except KeyboardInterrupt:
return False, "Deletion cancelled"
# Execute delete command
try:
cmd = ["xcrun", "simctl", "delete", self.udid]
result = subprocess.run(cmd, check=False, capture_output=True, text=True, timeout=60)
if result.returncode != 0:
error = result.stderr.strip() or result.stdout.strip()
return False, f"Deletion failed: {error}"
return True, f"Device deleted: {self.udid} [disk space freed]"
except subprocess.TimeoutExpired:
return False, "Deletion command timed out"
except Exception as e:
return False, f"Deletion error: {e}"
@staticmethod
def delete_all(confirm: bool = False) -> tuple[int, int]:
"""
Delete all simulators permanently.
Args:
confirm: Skip confirmation prompt
Returns:
(succeeded, failed) tuple with counts
"""
simulators = list_simulators(state=None)
if not confirm:
count = len(simulators)
try:
response = input(
f"Permanently delete ALL {count} simulators? " f"(type 'yes' to confirm): "
)
if response.lower() != "yes":
return 0, count
except KeyboardInterrupt:
return 0, count
succeeded = 0
failed = 0
for sim in simulators:
deleter = SimulatorDeleter(udid=sim["udid"])
success, _message = deleter.delete(confirm=True)
if success:
succeeded += 1
else:
failed += 1
return succeeded, failed
@staticmethod
def delete_by_type(device_type: str, confirm: bool = False) -> tuple[int, int]:
"""
Delete all simulators of a specific type.
Args:
device_type: Device type filter (e.g., "iPhone", "iPad")
confirm: Skip confirmation prompt
Returns:
(succeeded, failed) tuple with counts
"""
simulators = list_simulators(state=None)
matching = [s for s in simulators if device_type.lower() in s["name"].lower()]
if not matching:
return 0, 0
if not confirm:
count = len(matching)
try:
response = input(
f"Permanently delete {count} {device_type} simulators? "
f"(type 'yes' to confirm): "
)
if response.lower() != "yes":
return 0, count
except KeyboardInterrupt:
return 0, count
succeeded = 0
failed = 0
for sim in matching:
deleter = SimulatorDeleter(udid=sim["udid"])
success, _message = deleter.delete(confirm=True)
if success:
succeeded += 1
else:
failed += 1
return succeeded, failed
@staticmethod
def delete_old(keep_count: int = 3, confirm: bool = False) -> tuple[int, int]:
"""
Delete older simulators, keeping most recent versions.
Useful for cleanup after testing multiple iOS versions.
Keeps the most recent N simulators of each type.
Args:
keep_count: Number of recent simulators to keep per type (default: 3)
confirm: Skip confirmation prompt
Returns:
(succeeded, failed) tuple with counts
"""
simulators = list_simulators(state=None)
# Group by device type
by_type: dict[str, list] = {}
for sim in simulators:
dev_type = sim["type"]
if dev_type not in by_type:
by_type[dev_type] = []
by_type[dev_type].append(sim)
# Find candidates for deletion (older ones)
to_delete = []
for _dev_type, sims in by_type.items():
# Sort by runtime (iOS version) - keep newest
sorted_sims = sorted(sims, key=lambda s: s["runtime"], reverse=True)
# Mark older ones for deletion
to_delete.extend(sorted_sims[keep_count:])
if not to_delete:
return 0, 0
if not confirm:
count = len(to_delete)
try:
response = input(
f"Delete {count} older simulators, keeping {keep_count} per type? "
f"(type 'yes' to confirm): "
)
if response.lower() != "yes":
return 0, count
except KeyboardInterrupt:
return 0, count
succeeded = 0
failed = 0
for sim in to_delete:
deleter = SimulatorDeleter(udid=sim["udid"])
success, _message = deleter.delete(confirm=True)
if success:
succeeded += 1
else:
failed += 1
return succeeded, failed
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(description="Delete iOS simulators permanently")
parser.add_argument(
"--udid",
help="Device UDID or name (required unless using batch options)",
)
parser.add_argument(
"--name",
help="Device name (alternative to --udid)",
)
parser.add_argument(
"--yes",
action="store_true",
help="Skip confirmation prompt",
)
parser.add_argument(
"--all",
action="store_true",
help="Delete all simulators",
)
parser.add_argument(
"--type",
help="Delete all simulators of a specific type (e.g., iPhone)",
)
parser.add_argument(
"--old",
type=int,
metavar="KEEP_COUNT",
help="Delete older simulators, keeping this many per type (e.g., --old 3)",
)
parser.add_argument(
"--json",
action="store_true",
help="Output as JSON",
)
args = parser.parse_args()
# Handle batch operations
if args.all:
succeeded, failed = SimulatorDeleter.delete_all(confirm=args.yes)
if args.json:
import json
print(
json.dumps(
{
"action": "delete_all",
"succeeded": succeeded,
"failed": failed,
"total": succeeded + failed,
}
)
)
else:
total = succeeded + failed
print(f"Delete summary: {succeeded}/{total} succeeded, {failed} failed")
sys.exit(0 if failed == 0 else 1)
if args.type:
succeeded, failed = SimulatorDeleter.delete_by_type(args.type, confirm=args.yes)
if args.json:
import json
print(
json.dumps(
{
"action": "delete_by_type",
"type": args.type,
"succeeded": succeeded,
"failed": failed,
"total": succeeded + failed,
}
)
)
else:
total = succeeded + failed
print(f"Delete {args.type} summary: {succeeded}/{total} succeeded, {failed} failed")
sys.exit(0 if failed == 0 else 1)
if args.old is not None:
succeeded, failed = SimulatorDeleter.delete_old(keep_count=args.old, confirm=args.yes)
if args.json:
import json
print(
json.dumps(
{
"action": "delete_old",
"keep_count": args.old,
"succeeded": succeeded,
"failed": failed,
"total": succeeded + failed,
}
)
)
else:
total = succeeded + failed
print(
f"Delete old summary: {succeeded}/{total} succeeded, "
f"{failed} failed (kept {args.old} per type)"
)
sys.exit(0 if failed == 0 else 1)
# Delete single device
device_id = args.udid or args.name
if not device_id:
print("Error: Specify --udid, --name, --all, --type, or --old", file=sys.stderr)
sys.exit(1)
try:
udid = resolve_device_identifier(device_id)
except RuntimeError as e:
print(f"Error: {e}", file=sys.stderr)
sys.exit(1)
# Delete device
deleter = SimulatorDeleter(udid=udid)
success, message = deleter.delete(confirm=args.yes)
if args.json:
import json
print(
json.dumps(
{
"action": "delete",
"device_id": device_id,
"udid": udid,
"success": success,
"message": message,
}
)
)
else:
print(message)
sys.exit(0 if success else 1)
if __name__ == "__main__":
main()
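`delete_old` above sorts runtime identifiers as plain strings, which misorders versions once a component reaches two digits ("iOS-9-2" sorts above "iOS-18-0"). A minimal sketch of a numeric sort key, assuming runtime strings embed dashed or dotted version numbers as in `com.apple.CoreSimulator.SimRuntime.iOS-18-0`:

```python
import re

def runtime_sort_key(runtime: str) -> tuple[int, ...]:
    """Extract numeric components so "iOS-18-0" outranks "iOS-9-2"."""
    return tuple(int(p) for p in re.findall(r"\d+", runtime)) or (0,)

runtimes = [
    "com.apple.CoreSimulator.SimRuntime.iOS-9-2",
    "com.apple.CoreSimulator.SimRuntime.iOS-18-0",
]
# Newest runtime first, compared component-by-component rather than as text
newest_first = sorted(runtimes, key=runtime_sort_key, reverse=True)
```

Passing this as the `key=` argument to the `sorted(...)` call in `delete_old` would make the "keep newest" behavior robust across double-digit iOS versions.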


@@ -1,342 +0,0 @@
#!/usr/bin/env python3
"""
Erase iOS simulators (factory reset).
This script performs a factory reset on simulators, returning them to
a clean state while preserving the device UUID. Much faster than
delete + create for CI/CD cleanup.
Key features:
- Erase by UDID or device name
- Preserve device UUID (faster than delete)
- Verify erase completion
- Batch erase operations (all, by type)
"""
import argparse
import subprocess
import sys
import time
from common.device_utils import (
list_simulators,
resolve_device_identifier,
)
class SimulatorEraser:
"""Erase iOS simulators with optional verification."""
def __init__(self, udid: str | None = None):
"""Initialize with optional device UDID."""
self.udid = udid
def erase(self, verify: bool = True, timeout_seconds: int = 30) -> tuple[bool, str]:
"""
Erase simulator and optionally verify completion.
Performs a factory reset, clearing all app data and settings
while preserving the simulator UUID.
Args:
verify: Wait for erase to complete and verify state
timeout_seconds: Maximum seconds to wait for verification
Returns:
(success, message) tuple
"""
if not self.udid:
return False, "Error: Device UDID not specified"
start_time = time.time()
# Execute erase command
try:
cmd = ["xcrun", "simctl", "erase", self.udid]
result = subprocess.run(cmd, check=False, capture_output=True, text=True, timeout=60)
if result.returncode != 0:
error = result.stderr.strip()
return False, f"Erase failed: {error}"
except subprocess.TimeoutExpired:
return False, "Erase command timed out"
except Exception as e:
return False, f"Erase error: {e}"
# Optionally verify erase completion
if verify:
ready, verify_message = self._verify_erase(timeout_seconds)
elapsed = time.time() - start_time
if ready:
return True, f"Device erased: {self.udid} [factory reset complete, {elapsed:.1f}s]"
return False, verify_message
elapsed = time.time() - start_time
return True, (
f"Device erase initiated: {self.udid} [{elapsed:.1f}s] "
"(use --verify to wait for completion)"
)
def _verify_erase(self, timeout_seconds: int = 30) -> tuple[bool, str]:
"""
Verify erase has completed.
Polls device state to confirm erase finished successfully.
Args:
timeout_seconds: Maximum seconds to wait
Returns:
(success, message) tuple
"""
start_time = time.time()
poll_interval = 0.5
checks = 0
while time.time() - start_time < timeout_seconds:
try:
checks += 1
# Check if device can be queried (indicates boot status)
result = subprocess.run(
["xcrun", "simctl", "spawn", self.udid, "launchctl", "list"],
check=False,
capture_output=True,
text=True,
timeout=5,
)
# Device responding = erase likely complete
if result.returncode == 0:
elapsed = time.time() - start_time
return True, f"Erase verified: {self.udid} [{elapsed:.1f}s, {checks} checks]"
except (subprocess.TimeoutExpired, RuntimeError):
pass # Not ready yet, keep polling
time.sleep(poll_interval)
elapsed = time.time() - start_time
return False, (
f"Erase verification timeout: Device did not respond "
f"within {elapsed:.1f}s ({checks} checks)"
)
@staticmethod
def erase_all() -> tuple[int, int]:
"""
Erase all simulators (factory reset).
Returns:
(succeeded, failed) tuple with counts
"""
simulators = list_simulators(state=None)
succeeded = 0
failed = 0
for sim in simulators:
eraser = SimulatorEraser(udid=sim["udid"])
success, _message = eraser.erase(verify=False)
if success:
succeeded += 1
else:
failed += 1
return succeeded, failed
@staticmethod
def erase_by_type(device_type: str) -> tuple[int, int]:
"""
Erase all simulators of a specific type.
Args:
device_type: Device type filter (e.g., "iPhone", "iPad")
Returns:
(succeeded, failed) tuple with counts
"""
simulators = list_simulators(state=None)
succeeded = 0
failed = 0
for sim in simulators:
if device_type.lower() in sim["name"].lower():
eraser = SimulatorEraser(udid=sim["udid"])
success, _message = eraser.erase(verify=False)
if success:
succeeded += 1
else:
failed += 1
return succeeded, failed
@staticmethod
def erase_booted() -> tuple[int, int]:
"""
Erase all currently booted simulators.
Returns:
(succeeded, failed) tuple with counts
"""
simulators = list_simulators(state="booted")
succeeded = 0
failed = 0
for sim in simulators:
eraser = SimulatorEraser(udid=sim["udid"])
success, _message = eraser.erase(verify=False)
if success:
succeeded += 1
else:
failed += 1
return succeeded, failed
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(description="Erase iOS simulators (factory reset)")
parser.add_argument(
"--udid",
help="Device UDID or name (required unless using --all, --type, or --booted)",
)
parser.add_argument(
"--name",
help="Device name (alternative to --udid)",
)
parser.add_argument(
"--verify",
action="store_true",
help="Wait for erase to complete and verify state",
)
parser.add_argument(
"--timeout",
type=int,
default=30,
help="Timeout for --verify in seconds (default: 30)",
)
parser.add_argument(
"--all",
action="store_true",
help="Erase all simulators (factory reset)",
)
parser.add_argument(
"--type",
help="Erase all simulators of a specific type (e.g., iPhone)",
)
parser.add_argument(
"--booted",
action="store_true",
help="Erase all currently booted simulators",
)
parser.add_argument(
"--json",
action="store_true",
help="Output as JSON",
)
args = parser.parse_args()
# Handle batch operations
if args.all:
succeeded, failed = SimulatorEraser.erase_all()
if args.json:
import json
print(
json.dumps(
{
"action": "erase_all",
"succeeded": succeeded,
"failed": failed,
"total": succeeded + failed,
}
)
)
else:
total = succeeded + failed
print(f"Erase summary: {succeeded}/{total} succeeded, {failed} failed")
sys.exit(0 if failed == 0 else 1)
if args.type:
succeeded, failed = SimulatorEraser.erase_by_type(args.type)
if args.json:
import json
print(
json.dumps(
{
"action": "erase_by_type",
"type": args.type,
"succeeded": succeeded,
"failed": failed,
"total": succeeded + failed,
}
)
)
else:
total = succeeded + failed
print(f"Erase {args.type} summary: {succeeded}/{total} succeeded, {failed} failed")
sys.exit(0 if failed == 0 else 1)
if args.booted:
succeeded, failed = SimulatorEraser.erase_booted()
if args.json:
import json
print(
json.dumps(
{
"action": "erase_booted",
"succeeded": succeeded,
"failed": failed,
"total": succeeded + failed,
}
)
)
else:
total = succeeded + failed
print(f"Erase booted summary: {succeeded}/{total} succeeded, " f"{failed} failed")
sys.exit(0 if failed == 0 else 1)
# Erase single device
device_id = args.udid or args.name
if not device_id:
print("Error: Specify --udid, --name, --all, --type, or --booted", file=sys.stderr)
sys.exit(1)
try:
udid = resolve_device_identifier(device_id)
except RuntimeError as e:
print(f"Error: {e}", file=sys.stderr)
sys.exit(1)
# Erase device
eraser = SimulatorEraser(udid=udid)
success, message = eraser.erase(verify=args.verify, timeout_seconds=args.timeout)
if args.json:
import json
print(
json.dumps(
{
"action": "erase",
"device_id": device_id,
"udid": udid,
"success": success,
"message": message,
}
)
)
else:
print(message)
sys.exit(0 if success else 1)
if __name__ == "__main__":
main()
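`_verify_erase` and the shutdown script's `_verify_shutdown` both follow the same poll-until-deadline shape. A self-contained sketch of that pattern; the `probe` callable here is a stand-in for the `simctl`-based checks:

```python
import time
from typing import Callable

def poll_until(probe: Callable[[], bool], timeout_seconds: float,
               poll_interval: float = 0.5) -> tuple[bool, int]:
    """Call probe() until it returns True or the deadline passes.

    Returns (succeeded, number_of_checks) so callers can report
    elapsed effort the same way the verify methods above do.
    """
    deadline = time.monotonic() + timeout_seconds
    checks = 0
    while time.monotonic() < deadline:
        checks += 1
        if probe():
            return True, checks
        time.sleep(poll_interval)
    return False, checks

# A probe that succeeds immediately returns after a single check
ok, checks = poll_until(lambda: True, timeout_seconds=1.0)
```

Using `time.monotonic()` rather than `time.time()` keeps the deadline immune to wall-clock adjustments, which matters for long CI runs.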


@@ -1,290 +0,0 @@
#!/usr/bin/env python3
"""
Shutdown iOS simulators with optional state verification.
This script shuts down one or more running simulators and optionally
verifies completion. Supports batch operations for efficient cleanup.
Key features:
- Shutdown by UDID or device name
- Verify shutdown completion with timeout
- Batch shutdown operations (all, by type)
- Progress reporting for CI/CD pipelines
"""
import argparse
import subprocess
import sys
import time
from common.device_utils import (
list_simulators,
resolve_device_identifier,
)
class SimulatorShutdown:
"""Shutdown iOS simulators with optional verification."""
def __init__(self, udid: str | None = None):
"""Initialize with optional device UDID."""
self.udid = udid
def shutdown(self, verify: bool = True, timeout_seconds: int = 30) -> tuple[bool, str]:
"""
Shut down the simulator and optionally verify completion.
Args:
verify: Wait for shutdown to complete
timeout_seconds: Maximum seconds to wait for shutdown
Returns:
(success, message) tuple
"""
if not self.udid:
return False, "Error: Device UDID not specified"
start_time = time.time()
# Check if already shutdown
simulators = list_simulators(state="booted")
if not any(s["udid"] == self.udid for s in simulators):
elapsed = time.time() - start_time
return True, f"Device already shut down: {self.udid} [checked in {elapsed:.1f}s]"
# Execute shutdown command
try:
cmd = ["xcrun", "simctl", "shutdown", self.udid]
result = subprocess.run(cmd, check=False, capture_output=True, text=True, timeout=30)
if result.returncode != 0:
error = result.stderr.strip()
return False, f"Shutdown failed: {error}"
except subprocess.TimeoutExpired:
return False, "Shutdown command timed out"
except Exception as e:
return False, f"Shutdown error: {e}"
# Optionally verify shutdown
if verify:
ready, verify_message = self._verify_shutdown(timeout_seconds)
elapsed = time.time() - start_time
if ready:
return True, f"Device shutdown confirmed: {self.udid} [{elapsed:.1f}s total]"
return False, verify_message
elapsed = time.time() - start_time
return True, (
f"Device shutdown: {self.udid} [{elapsed:.1f}s] "
"(use --verify to wait for confirmation)"
)
def _verify_shutdown(self, timeout_seconds: int = 30) -> tuple[bool, str]:
"""
Verify the device has fully shut down.
Args:
timeout_seconds: Maximum seconds to wait
Returns:
(success, message) tuple
"""
start_time = time.time()
poll_interval = 0.5
checks = 0
while time.time() - start_time < timeout_seconds:
try:
checks += 1
# Check booted devices
simulators = list_simulators(state="booted")
if not any(s["udid"] == self.udid for s in simulators):
elapsed = time.time() - start_time
return True, (
f"Device shutdown verified: {self.udid} "
f"[{elapsed:.1f}s, {checks} checks]"
)
except RuntimeError:
pass # Error checking, retry
time.sleep(poll_interval)
elapsed = time.time() - start_time
return False, (
f"Shutdown verification timeout: Device did not fully shut down "
f"within {elapsed:.1f}s ({checks} checks)"
)
@staticmethod
def shutdown_all() -> tuple[int, int]:
"""
Shutdown all booted simulators.
Returns:
(succeeded, failed) tuple with counts
"""
simulators = list_simulators(state="booted")
succeeded = 0
failed = 0
for sim in simulators:
shutdown = SimulatorShutdown(udid=sim["udid"])
success, _message = shutdown.shutdown(verify=False)
if success:
succeeded += 1
else:
failed += 1
return succeeded, failed
@staticmethod
def shutdown_by_type(device_type: str) -> tuple[int, int]:
"""
Shutdown all booted simulators of a specific type.
Args:
device_type: Device type filter (e.g., "iPhone", "iPad")
Returns:
(succeeded, failed) tuple with counts
"""
simulators = list_simulators(state="booted")
succeeded = 0
failed = 0
for sim in simulators:
if device_type.lower() in sim["name"].lower():
shutdown = SimulatorShutdown(udid=sim["udid"])
success, _message = shutdown.shutdown(verify=False)
if success:
succeeded += 1
else:
failed += 1
return succeeded, failed
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(
description="Shutdown iOS simulators with optional verification"
)
parser.add_argument(
"--udid",
help="Device UDID or name (required unless using --all or --type)",
)
parser.add_argument(
"--name",
help="Device name (alternative to --udid)",
)
parser.add_argument(
"--verify",
action="store_true",
help="Wait for shutdown to complete and verify state",
)
parser.add_argument(
"--timeout",
type=int,
default=30,
help="Timeout for --verify in seconds (default: 30)",
)
parser.add_argument(
"--all",
action="store_true",
help="Shutdown all booted simulators",
)
parser.add_argument(
"--type",
help="Shutdown all booted simulators of a specific type (e.g., iPhone)",
)
parser.add_argument(
"--json",
action="store_true",
help="Output as JSON",
)
args = parser.parse_args()
# Handle batch operations
if args.all:
succeeded, failed = SimulatorShutdown.shutdown_all()
if args.json:
import json
print(
json.dumps(
{
"action": "shutdown_all",
"succeeded": succeeded,
"failed": failed,
"total": succeeded + failed,
}
)
)
else:
total = succeeded + failed
print(f"Shutdown summary: {succeeded}/{total} succeeded, {failed} failed")
sys.exit(0 if failed == 0 else 1)
if args.type:
succeeded, failed = SimulatorShutdown.shutdown_by_type(args.type)
if args.json:
import json
print(
json.dumps(
{
"action": "shutdown_by_type",
"type": args.type,
"succeeded": succeeded,
"failed": failed,
"total": succeeded + failed,
}
)
)
else:
total = succeeded + failed
print(f"Shutdown {args.type} summary: {succeeded}/{total} succeeded, {failed} failed")
sys.exit(0 if failed == 0 else 1)
# Resolve device identifier
device_id = args.udid or args.name
if not device_id:
print("Error: Specify --udid, --name, --all, or --type", file=sys.stderr)
sys.exit(1)
try:
udid = resolve_device_identifier(device_id)
except RuntimeError as e:
print(f"Error: {e}", file=sys.stderr)
sys.exit(1)
# Shutdown device
shutdown = SimulatorShutdown(udid=udid)
success, message = shutdown.shutdown(verify=args.verify, timeout_seconds=args.timeout)
if args.json:
import json
print(
json.dumps(
{
"action": "shutdown",
"device_id": device_id,
"udid": udid,
"success": success,
"message": message,
}
)
)
else:
print(message)
sys.exit(0 if success else 1)
if __name__ == "__main__":
main()
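Every batch command in these scripts repeats the same summary-and-exit-code logic. A small pure helper that could factor it out; the function name and signature are illustrative, not part of the scripts:

```python
import json

def batch_summary(action: str, succeeded: int, failed: int,
                  as_json: bool = False) -> tuple[str, int]:
    """Render a batch result and derive the process exit code.

    Exit code is 0 only when every device operation succeeded,
    matching the sys.exit(0 if failed == 0 else 1) pattern above.
    """
    total = succeeded + failed
    if as_json:
        text = json.dumps({"action": action, "succeeded": succeeded,
                           "failed": failed, "total": total})
    else:
        text = f"{action} summary: {succeeded}/{total} succeeded, {failed} failed"
    return text, 0 if failed == 0 else 1

text, code = batch_summary("Shutdown", 3, 1)
# code is 1 because one device failed
```

Keeping the rendering pure (no `print`/`sys.exit` inside) makes the summary logic trivially unit-testable.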


@@ -1,375 +0,0 @@
#!/usr/bin/env python3
"""
Intelligent Simulator Selector
Suggests the best available iOS simulators based on:
- Recently used (from config)
- Latest iOS version
- Common models for testing
- Boot status
Usage Examples:
# Get suggestions for user selection
python scripts/simulator_selector.py --suggest
# List all available simulators
python scripts/simulator_selector.py --list
# Boot a specific simulator
python scripts/simulator_selector.py --boot "67A99DF0-27BD-4507-A3DE-B7D8C38F764A"
# Get suggestions as JSON for programmatic use
python scripts/simulator_selector.py --suggest --json
"""
import argparse
import json
import re
import subprocess
import sys
from datetime import datetime
from pathlib import Path
# Try to import config from build_and_test if available
try:
from xcode.config import Config
except ImportError:
Config = None
class SimulatorInfo:
"""Information about an iOS simulator."""
def __init__(
self,
name: str,
udid: str,
ios_version: str,
status: str,
):
"""Initialize simulator info."""
self.name = name
self.udid = udid
self.ios_version = ios_version
self.status = status
self.reasons: list[str] = []
def to_dict(self) -> dict:
"""Convert to dictionary."""
return {
"device": self.name,
"udid": self.udid,
"ios": self.ios_version,
"status": self.status,
"reasons": self.reasons,
}
class SimulatorSelector:
"""Intelligent simulator selection."""
# Common iPhone models ranked by testing priority
COMMON_MODELS = [
"iPhone 16 Pro",
"iPhone 16",
"iPhone 15 Pro",
"iPhone 15",
"iPhone SE (3rd generation)",
]
def __init__(self):
"""Initialize selector."""
self.simulators: list[SimulatorInfo] = []
self.config: dict | None = None
self.last_used_simulator: str | None = None
# Load config if available
if Config:
try:
config = Config.load()
self.last_used_simulator = config.get_preferred_simulator()
except Exception:
pass
def list_simulators(self) -> list[SimulatorInfo]:
"""
List all available simulators.
Returns:
List of SimulatorInfo objects
"""
try:
result = subprocess.run(
["xcrun", "simctl", "list", "devices", "--json"],
capture_output=True,
text=True,
check=True,
)
data = json.loads(result.stdout)
simulators = []
# Parse devices by iOS version
for runtime, devices in data.get("devices", {}).items():
# Extract iOS version from runtime (e.g., "com.apple.CoreSimulator.SimRuntime.iOS-18-0")
ios_version_match = re.search(r"iOS-(\d+-\d+)", runtime)
if not ios_version_match:
continue
ios_version = ios_version_match.group(1).replace("-", ".")
for device in devices:
name = device.get("name", "")
udid = device.get("udid", "")
is_available = device.get("isAvailable", False)
if not is_available or "iPhone" not in name:
continue
status = device.get("state", "").capitalize()
sim_info = SimulatorInfo(name, udid, ios_version, status)
simulators.append(sim_info)
self.simulators = simulators
return simulators
except subprocess.CalledProcessError as e:
print(f"Error listing simulators: {e.stderr}", file=sys.stderr)
return []
except json.JSONDecodeError as e:
print(f"Error parsing simulator list: {e}", file=sys.stderr)
return []
def get_suggestions(self, count: int = 4) -> list[SimulatorInfo]:
"""
Get top N suggested simulators.
Ranking factors:
1. Recently used (from config)
2. Latest iOS version
3. Common models
4. Boot status (Booted preferred)
Args:
count: Number of suggestions to return
Returns:
List of suggested SimulatorInfo objects
"""
if not self.simulators:
return []
# Score each simulator
scored = []
for sim in self.simulators:
score = self._score_simulator(sim)
scored.append((score, sim))
# Sort by score (descending)
scored.sort(key=lambda x: x[0], reverse=True)
# Return top N
suggestions = [sim for _, sim in scored[:count]]
# Add reasons to each suggestion
for i, sim in enumerate(suggestions, 1):
if i == 1:
sim.reasons.append("Recommended")
# Check if recently used
if self.last_used_simulator and self.last_used_simulator == sim.name:
sim.reasons.append("Recently used")
# Check if latest iOS
# Compare versions numerically ("9.0" would sort above "18.0" as a string)
latest_ios = max((s.ios_version for s in self.simulators), key=lambda v: tuple(int(p) for p in v.split(".")))
if sim.ios_version == latest_ios:
sim.reasons.append("Latest iOS")
# Check if common model
for j, model in enumerate(self.COMMON_MODELS):
if model in sim.name:
sim.reasons.append(f"#{j+1} common model")
break
# Check if booted
if sim.status == "Booted":
sim.reasons.append("Currently running")
return suggestions
def _score_simulator(self, sim: SimulatorInfo) -> float:
"""
Score a simulator for ranking.
Higher score = better recommendation.
Args:
sim: Simulator to score
Returns:
Score value
"""
score = 0.0
# Recently used gets highest priority (100 points)
if self.last_used_simulator and self.last_used_simulator == sim.name:
score += 100
# Latest iOS version (50 points)
latest_ios = max((s.ios_version for s in self.simulators), key=lambda v: tuple(int(p) for p in v.split(".")))
if sim.ios_version == latest_ios:
score += 50
# Common models (30-20 points based on ranking)
for i, model in enumerate(self.COMMON_MODELS):
if model in sim.name:
score += 30 - (i * 2) # Higher ranking models get more points
break
# Currently booted (10 points)
if sim.status == "Booted":
score += 10
# iOS version as a numeric tie-breaker ("18.1" must outrank "17.10")
major, _, minor = sim.ios_version.partition(".")
score += int(major) * 0.1 + int(minor or 0) * 0.001
return score
def boot_simulator(self, udid: str) -> bool:
"""
Boot a simulator.
Args:
udid: Simulator UDID
Returns:
True if successful, False otherwise
"""
try:
subprocess.run(
["xcrun", "simctl", "boot", udid],
capture_output=True,
check=True,
)
return True
except subprocess.CalledProcessError as e:
print(f"Error booting simulator: {e.stderr}", file=sys.stderr)
return False
def format_suggestions(suggestions: list[SimulatorInfo], json_format: bool = False) -> str:
"""
Format suggestions for output.
Args:
suggestions: List of suggestions
json_format: If True, output as JSON
Returns:
Formatted string
"""
if json_format:
data = {"suggestions": [s.to_dict() for s in suggestions]}
return json.dumps(data, indent=2)
if not suggestions:
return "No simulators available"
lines = ["Available Simulators:\n"]
for i, sim in enumerate(suggestions, 1):
lines.append(f"{i}. {sim.name} (iOS {sim.ios_version})")
if sim.reasons:
lines.append(f" {', '.join(sim.reasons)}")
lines.append(f" UDID: {sim.udid}")
lines.append("")
return "\n".join(lines)
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(
description="Intelligent iOS simulator selector",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
# Get suggestions for user selection
python scripts/simulator_selector.py --suggest
# List all available simulators
python scripts/simulator_selector.py --list
# Boot a specific simulator
python scripts/simulator_selector.py --boot <UDID>
# Get suggestions as JSON
python scripts/simulator_selector.py --suggest --json
""",
)
parser.add_argument(
"--suggest",
action="store_true",
help="Get top simulator suggestions",
)
parser.add_argument(
"--list",
action="store_true",
help="List all available simulators",
)
parser.add_argument(
"--boot",
metavar="UDID",
help="Boot specific simulator by UDID",
)
parser.add_argument(
"--json",
action="store_true",
help="Output as JSON",
)
parser.add_argument(
"--count",
type=int,
default=4,
help="Number of suggestions (default: 4)",
)
args = parser.parse_args()
selector = SimulatorSelector()
if args.boot:
# Boot specific simulator
success = selector.boot_simulator(args.boot)
if success:
print(f"Booted simulator: {args.boot}")
return 0
return 1
if args.list:
# List all simulators
simulators = selector.list_simulators()
output = format_suggestions(simulators, args.json)
print(output)
return 0
if args.suggest:
# Get suggestions
selector.list_simulators()
suggestions = selector.get_suggestions(args.count)
output = format_suggestions(suggestions, args.json)
print(output)
return 0
# Default: show suggestions
selector.list_simulators()
suggestions = selector.get_suggestions(args.count)
output = format_suggestions(suggestions, args.json)
print(output)
return 0
if __name__ == "__main__":
sys.exit(main())


@@ -1,250 +0,0 @@
#!/usr/bin/env python3
"""
iOS Status Bar Controller
Override simulator status bar for clean screenshots and testing.
Control time, network, wifi, battery display.
Usage: python scripts/status_bar.py --preset clean
"""
import argparse
import subprocess
import sys
from common import resolve_udid
class StatusBarController:
"""Controls iOS simulator status bar appearance."""
# Preset configurations
PRESETS = {
"clean": {
"time": "9:41",
"data_network": "5g",
"wifi_mode": "active",
"battery_state": "charged",
"battery_level": 100,
},
"testing": {
"time": "11:11",
"data_network": "4g",
"wifi_mode": "active",
"battery_state": "discharging",
"battery_level": 50,
},
"low_battery": {
"time": "9:41",
"data_network": "5g",
"wifi_mode": "active",
"battery_state": "discharging",
"battery_level": 20,
},
"airplane": {
"time": "9:41",
"data_network": "none",
"wifi_mode": "failed",
"battery_state": "charged",
"battery_level": 100,
},
}
def __init__(self, udid: str | None = None):
"""Initialize status bar controller.
Args:
udid: Optional device UDID (auto-detects booted simulator if None)
"""
self.udid = udid
def override(
self,
time: str | None = None,
data_network: str | None = None,
wifi_mode: str | None = None,
battery_state: str | None = None,
battery_level: int | None = None,
) -> bool:
"""
Override status bar appearance.
Args:
time: Time in HH:MM format (e.g., "9:41")
data_network: Network type (none, 1x, 3g, 4g, 5g, lte, lte-a)
wifi_mode: WiFi state (active, searching, failed)
battery_state: Battery state (charging, charged, discharging)
battery_level: Battery percentage (0-100)
Returns:
Success status
"""
cmd = ["xcrun", "simctl", "status_bar"]
if self.udid:
cmd.append(self.udid)
else:
cmd.append("booted")
cmd.append("override")
# Add parameters if provided
if time:
cmd.extend(["--time", time])
if data_network:
cmd.extend(["--dataNetwork", data_network])
if wifi_mode:
cmd.extend(["--wifiMode", wifi_mode])
if battery_state:
cmd.extend(["--batteryState", battery_state])
if battery_level is not None:
cmd.extend(["--batteryLevel", str(battery_level)])
try:
subprocess.run(cmd, capture_output=True, check=True)
return True
except subprocess.CalledProcessError:
return False
def clear(self) -> bool:
"""
Clear status bar override and restore defaults.
Returns:
Success status
"""
cmd = ["xcrun", "simctl", "status_bar"]
if self.udid:
cmd.append(self.udid)
else:
cmd.append("booted")
cmd.append("clear")
try:
subprocess.run(cmd, capture_output=True, check=True)
return True
except subprocess.CalledProcessError:
return False
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(
description="Override iOS simulator status bar for screenshots and testing"
)
# Preset option
parser.add_argument(
"--preset",
choices=list(StatusBarController.PRESETS.keys()),
help="Use preset configuration (clean, testing, low-battery, airplane)",
)
# Custom options
parser.add_argument(
"--time",
help="Override time (HH:MM format, e.g., '9:41')",
)
parser.add_argument(
"--data-network",
choices=["none", "1x", "3g", "4g", "5g", "lte", "lte-a"],
help="Data network type",
)
parser.add_argument(
"--wifi-mode",
choices=["active", "searching", "failed"],
help="WiFi state",
)
parser.add_argument(
"--battery-state",
choices=["charging", "charged", "discharging"],
help="Battery state",
)
parser.add_argument(
"--battery-level",
type=int,
help="Battery level 0-100",
)
# Other options
parser.add_argument(
"--clear",
action="store_true",
help="Clear status bar override and restore defaults",
)
parser.add_argument(
"--udid",
help="Device UDID (auto-detects booted simulator if not provided)",
)
args = parser.parse_args()
# Resolve UDID with auto-detection
try:
udid = resolve_udid(args.udid)
except RuntimeError as e:
print(f"Error: {e}")
sys.exit(1)
controller = StatusBarController(udid=udid)
# Clear mode
if args.clear:
if controller.clear():
print("Status bar override cleared - defaults restored")
else:
print("Failed to clear status bar override")
sys.exit(1)
# Preset mode
elif args.preset:
preset = StatusBarController.PRESETS[args.preset]
if controller.override(**preset):
print(f"Status bar: {args.preset} preset applied")
print(
f" Time: {preset['time']}, "
f"Network: {preset['data_network']}, "
f"Battery: {preset['battery_level']}%"
)
else:
print(f"Failed to apply {args.preset} preset")
sys.exit(1)
# Custom mode
elif any(
[
args.time,
args.data_network,
args.wifi_mode,
args.battery_state,
args.battery_level is not None,
]
):
if controller.override(
time=args.time,
data_network=args.data_network,
wifi_mode=args.wifi_mode,
battery_state=args.battery_state,
battery_level=args.battery_level,
):
output = "Status bar override applied:"
if args.time:
output += f" Time={args.time}"
if args.data_network:
output += f" Network={args.data_network}"
if args.battery_level is not None:
output += f" Battery={args.battery_level}%"
print(output)
else:
print("Failed to override status bar")
sys.exit(1)
else:
parser.print_help()
sys.exit(1)
if __name__ == "__main__":
main()


@@ -1,323 +0,0 @@
#!/usr/bin/env python3
"""
Test Recorder for iOS Simulator Testing
Records test execution with automatic screenshots and documentation.
Optimized for minimal token output during execution.
Usage:
As a script: python scripts/test_recorder.py --test-name "Test Name" --output dir/
As a module: from scripts.test_recorder import TestRecorder
"""
import argparse
import json
import subprocess
import time
from datetime import datetime
from pathlib import Path
from common import (
capture_screenshot,
count_elements,
generate_screenshot_name,
get_accessibility_tree,
resolve_udid,
)
class TestRecorder:
"""Records test execution with screenshots and accessibility snapshots."""
def __init__(
self,
test_name: str,
output_dir: str = "test-artifacts",
udid: str | None = None,
inline: bool = False,
screenshot_size: str = "half",
app_name: str | None = None,
):
"""
Initialize test recorder.
Args:
test_name: Name of the test being recorded
output_dir: Directory for test artifacts
udid: Optional device UDID (uses booted if not specified)
inline: If True, return screenshots as base64 (for vision-based automation)
screenshot_size: 'full', 'half', 'quarter', 'thumb' (default: 'half')
app_name: App name for semantic screenshot naming
"""
self.test_name = test_name
self.udid = udid
self.inline = inline
self.screenshot_size = screenshot_size
self.app_name = app_name
self.start_time = time.time()
self.steps: list[dict] = []
self.current_step = 0
# Create timestamped output directory
timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
safe_name = test_name.lower().replace(" ", "-")
self.output_dir = Path(output_dir) / f"{safe_name}-{timestamp}"
self.output_dir.mkdir(parents=True, exist_ok=True)
# Create subdirectories (only if not in inline mode)
if not inline:
self.screenshots_dir = self.output_dir / "screenshots"
self.screenshots_dir.mkdir(exist_ok=True)
else:
self.screenshots_dir = None
self.accessibility_dir = self.output_dir / "accessibility"
self.accessibility_dir.mkdir(exist_ok=True)
# Token-efficient output
mode_str = "(inline mode)" if inline else ""
print(f"Recording: {test_name} {mode_str}")
print(f"Output: {self.output_dir}/")
def step(
self,
description: str,
screen_name: str | None = None,
state: str | None = None,
assertion: str | None = None,
metadata: dict | None = None,
):
"""
Record a test step with automatic screenshot.
Args:
description: Step description
screen_name: Screen name for semantic naming
state: State description for semantic naming
assertion: Optional assertion to verify
metadata: Optional metadata for the step
"""
self.current_step += 1
step_time = time.time() - self.start_time
# Capture screenshot using new utility
screenshot_result = capture_screenshot(
self.udid,
size=self.screenshot_size,
inline=self.inline,
app_name=self.app_name,
screen_name=screen_name or description,
state=state,
)
# Capture accessibility tree
accessibility_path = (
self.accessibility_dir
/ f"{self.current_step:03d}-{description.lower().replace(' ', '-')[:20]}.json"
)
element_count = self._capture_accessibility(accessibility_path)
# Store step data
step_data = {
"number": self.current_step,
"description": description,
"timestamp": step_time,
"element_count": element_count,
"accessibility": accessibility_path.name,
"screenshot_mode": screenshot_result["mode"],
"screenshot_size": self.screenshot_size,
}
# Handle screenshot data based on mode
if screenshot_result["mode"] == "file":
step_data["screenshot"] = screenshot_result["file_path"]
step_data["screenshot_name"] = Path(screenshot_result["file_path"]).name
else:
# Inline mode
step_data["screenshot_base64"] = screenshot_result["base64_data"]
step_data["screenshot_dimensions"] = (
screenshot_result["width"],
screenshot_result["height"],
)
if assertion:
step_data["assertion"] = assertion
step_data["assertion_passed"] = True
if metadata:
step_data["metadata"] = metadata
self.steps.append(step_data)
# Token-efficient output (single line)
        status = "✓" if not assertion or step_data.get("assertion_passed") else "✗"
screenshot_info = (
f" [{screenshot_result['width']}x{screenshot_result['height']}]" if self.inline else ""
)
print(
f"{status} Step {self.current_step}: {description} ({step_time:.1f}s){screenshot_info}"
)
def _capture_screenshot(self, output_path: Path) -> bool:
"""Capture screenshot using simctl."""
cmd = ["xcrun", "simctl", "io"]
if self.udid:
cmd.append(self.udid)
else:
cmd.append("booted")
cmd.extend(["screenshot", str(output_path)])
try:
subprocess.run(cmd, capture_output=True, check=True)
return True
except subprocess.CalledProcessError:
return False
def _capture_accessibility(self, output_path: Path) -> int:
"""Capture accessibility tree and return element count."""
try:
# Use shared utility to fetch tree
tree = get_accessibility_tree(self.udid, nested=True)
# Save tree
with open(output_path, "w") as f:
json.dump(tree, f, indent=2)
# Count elements using shared utility
return count_elements(tree)
except Exception:
return 0
def generate_report(self) -> dict[str, str]:
"""
Generate markdown test report.
Returns:
Dictionary with paths to generated files
"""
duration = time.time() - self.start_time
report_path = self.output_dir / "report.md"
# Generate markdown
with open(report_path, "w") as f:
f.write(f"# Test Report: {self.test_name}\n\n")
f.write(f"**Date:** {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n")
f.write(f"**Duration:** {duration:.1f} seconds\n")
f.write(f"**Steps:** {len(self.steps)}\n\n")
# Steps section
f.write("## Test Steps\n\n")
for step in self.steps:
f.write(
f"### Step {step['number']}: {step['description']} ({step['timestamp']:.1f}s)\n\n"
)
                if step.get("screenshot_name"):
                    f.write(f"![Screenshot](screenshots/{step['screenshot_name']})\n\n")
if step.get("assertion"):
                    status = "✓" if step.get("assertion_passed") else "✗"
f.write(f"**Assertion:** {step['assertion']} {status}\n\n")
if step.get("metadata"):
f.write("**Metadata:**\n")
for key, value in step["metadata"].items():
f.write(f"- {key}: {value}\n")
f.write("\n")
f.write(f"**Accessibility Elements:** {step['element_count']}\n\n")
f.write("---\n\n")
# Summary
f.write("## Summary\n\n")
f.write(f"- Total steps: {len(self.steps)}\n")
f.write(f"- Duration: {duration:.1f}s\n")
f.write(f"- Screenshots: {len(self.steps)}\n")
f.write(f"- Accessibility snapshots: {len(self.steps)}\n")
# Save metadata JSON
metadata_path = self.output_dir / "metadata.json"
with open(metadata_path, "w") as f:
json.dump(
{
"test_name": self.test_name,
"duration": duration,
"steps": self.steps,
"timestamp": datetime.now().isoformat(),
},
f,
indent=2,
)
# Token-efficient output
print(f"Report: {report_path}")
return {
"markdown_path": str(report_path),
"metadata_path": str(metadata_path),
"output_dir": str(self.output_dir),
}
def main():
"""Main entry point for command-line usage."""
parser = argparse.ArgumentParser(
description="Record test execution with screenshots and documentation"
)
parser.add_argument("--test-name", required=True, help="Name of the test being recorded")
parser.add_argument(
"--output", default="test-artifacts", help="Output directory for test artifacts"
)
parser.add_argument(
"--udid",
help="Device UDID (auto-detects booted simulator if not provided)",
)
parser.add_argument(
"--inline",
action="store_true",
help="Return screenshots as base64 (inline mode for vision-based automation)",
)
parser.add_argument(
"--size",
choices=["full", "half", "quarter", "thumb"],
default="half",
help="Screenshot size for token optimization (default: half)",
)
parser.add_argument("--app-name", help="App name for semantic screenshot naming")
args = parser.parse_args()
# Resolve UDID with auto-detection
try:
udid = resolve_udid(args.udid)
    except RuntimeError as e:
        import sys
        print(f"Error: {e}", file=sys.stderr)
        sys.exit(1)
    # Create recorder (bound to a name so the usage instructions printed below apply)
    recorder = TestRecorder(
test_name=args.test_name,
output_dir=args.output,
udid=udid,
inline=args.inline,
screenshot_size=args.size,
app_name=args.app_name,
)
print("Test recorder initialized. Use the following methods:")
print(' recorder.step("description") - Record a test step')
print(" recorder.generate_report() - Generate final report")
print()
print("Example:")
print(' recorder.step("Launch app", screen_name="Splash")')
print(
' recorder.step("Enter credentials", screen_name="Login", state="Empty", metadata={"user": "test"})'
)
print(' recorder.step("Verify login", assertion="Home screen visible")')
print(" recorder.generate_report()")
if __name__ == "__main__":
main()


@@ -1,235 +0,0 @@
#!/usr/bin/env python3
"""
Visual Diff Tool for iOS Simulator Screenshots
Compares two screenshots pixel-by-pixel to detect visual changes.
Optimized for minimal token output.
Usage: python scripts/visual_diff.py baseline.png current.png [options]
"""
import argparse
import json
import os
import sys
from pathlib import Path
try:
from PIL import Image, ImageChops, ImageDraw
except ImportError:
print("Error: Pillow not installed. Run: pip3 install pillow")
sys.exit(1)
class VisualDiffer:
"""Performs visual comparison between screenshots."""
def __init__(self, threshold: float = 0.01):
"""
Initialize differ with threshold.
Args:
threshold: Maximum acceptable difference ratio (0.01 = 1%)
"""
self.threshold = threshold
def compare(self, baseline_path: str, current_path: str) -> dict:
"""
Compare two images and return difference metrics.
Args:
baseline_path: Path to baseline image
current_path: Path to current image
Returns:
Dictionary with comparison results
"""
# Load images
try:
baseline = Image.open(baseline_path)
current = Image.open(current_path)
except FileNotFoundError as e:
print(f"Error: Image not found - {e}")
sys.exit(1)
except Exception as e:
print(f"Error: Failed to load image - {e}")
sys.exit(1)
# Verify dimensions match
if baseline.size != current.size:
return {
"error": "Image dimensions do not match",
"baseline_size": baseline.size,
"current_size": current.size,
}
# Convert to RGB if needed
if baseline.mode != "RGB":
baseline = baseline.convert("RGB")
if current.mode != "RGB":
current = current.convert("RGB")
# Calculate difference
diff = ImageChops.difference(baseline, current)
# Calculate metrics
total_pixels = baseline.size[0] * baseline.size[1]
diff_pixels = self._count_different_pixels(diff)
diff_percentage = (diff_pixels / total_pixels) * 100
# Determine pass/fail
passed = diff_percentage <= (self.threshold * 100)
return {
"dimensions": baseline.size,
"total_pixels": total_pixels,
"different_pixels": diff_pixels,
"difference_percentage": round(diff_percentage, 2),
"threshold_percentage": self.threshold * 100,
"passed": passed,
"verdict": "PASS" if passed else "FAIL",
}
def _count_different_pixels(self, diff_image: Image.Image) -> int:
"""Count number of pixels that are different."""
# Convert to grayscale for easier processing
diff_gray = diff_image.convert("L")
# Count non-zero pixels (different)
pixels = diff_gray.getdata()
return sum(1 for pixel in pixels if pixel > 10) # Threshold for noise
def generate_diff_image(self, baseline_path: str, current_path: str, output_path: str) -> None:
"""Generate highlighted difference image."""
baseline = Image.open(baseline_path).convert("RGB")
current = Image.open(current_path).convert("RGB")
# Create difference image
diff = ImageChops.difference(baseline, current)
# Enhance differences with red overlay
diff_enhanced = Image.new("RGB", baseline.size)
for x in range(baseline.size[0]):
for y in range(baseline.size[1]):
diff_pixel = diff.getpixel((x, y))
if sum(diff_pixel) > 30: # Threshold for visibility
# Highlight in red
diff_enhanced.putpixel((x, y), (255, 0, 0))
else:
# Keep original
diff_enhanced.putpixel((x, y), current.getpixel((x, y)))
diff_enhanced.save(output_path)
def generate_side_by_side(
self, baseline_path: str, current_path: str, output_path: str
) -> None:
"""Generate side-by-side comparison image."""
baseline = Image.open(baseline_path)
current = Image.open(current_path)
        # Create combined image (sum the widths in case the two images differ)
        width = baseline.size[0] + current.size[0] + 10  # 10px separator
height = max(baseline.size[1], current.size[1])
combined = Image.new("RGB", (width, height), color=(128, 128, 128))
# Paste images
combined.paste(baseline, (0, 0))
combined.paste(current, (baseline.size[0] + 10, 0))
combined.save(output_path)
def main():
"""Main entry point."""
parser = argparse.ArgumentParser(description="Compare screenshots for visual differences")
parser.add_argument("baseline", help="Path to baseline screenshot")
parser.add_argument("current", help="Path to current screenshot")
parser.add_argument(
"--output",
default=".",
help="Output directory for diff artifacts (default: current directory)",
)
parser.add_argument(
"--threshold",
type=float,
default=0.01,
help="Acceptable difference threshold (0.01 = 1%%, default: 0.01)",
)
parser.add_argument(
"--details", action="store_true", help="Show detailed output (increases tokens)"
)
args = parser.parse_args()
# Create output directory if needed
output_dir = Path(args.output)
output_dir.mkdir(parents=True, exist_ok=True)
# Initialize differ
differ = VisualDiffer(threshold=args.threshold)
# Perform comparison
result = differ.compare(args.baseline, args.current)
# Handle dimension mismatch
if "error" in result:
print(f"Error: {result['error']}")
print(f"Baseline: {result['baseline_size']}")
print(f"Current: {result['current_size']}")
sys.exit(1)
# Generate artifacts
diff_image_path = output_dir / "diff.png"
comparison_image_path = output_dir / "side-by-side.png"
try:
differ.generate_diff_image(args.baseline, args.current, str(diff_image_path))
differ.generate_side_by_side(args.baseline, args.current, str(comparison_image_path))
except Exception as e:
print(f"Warning: Could not generate images - {e}")
# Output results (token-optimized)
if args.details:
# Detailed output
report = {
"summary": {
"baseline": args.baseline,
"current": args.current,
"threshold": args.threshold,
"passed": result["passed"],
},
"results": result,
"artifacts": {
"diff_image": str(diff_image_path),
"comparison_image": str(comparison_image_path),
},
}
print(json.dumps(report, indent=2))
else:
# Minimal output (default)
print(f"Difference: {result['difference_percentage']}% ({result['verdict']})")
if result["different_pixels"] > 0:
print(f"Changed pixels: {result['different_pixels']:,}")
print(f"Artifacts saved to: {output_dir}/")
# Save JSON report
report_path = output_dir / "diff-report.json"
with open(report_path, "w") as f:
json.dump(
{
"baseline": os.path.basename(args.baseline),
"current": os.path.basename(args.current),
"results": result,
"artifacts": {"diff": "diff.png", "comparison": "side-by-side.png"},
},
f,
indent=2,
)
# Exit with error if test failed
sys.exit(0 if result["passed"] else 1)
if __name__ == "__main__":
main()
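The pass/fail arithmetic in `VisualDiffer.compare` and `_count_different_pixels` can be shown without Pillow. A sketch over plain grayscale values (the `diff_verdict` helper is illustrative, not part of the tool):

```python
# The diff-percentage math reduced to plain lists: values differing by more
# than the noise floor (10) count as changed, and the ratio is checked
# against the threshold (0.01 = 1%).
def diff_verdict(baseline: list[int], current: list[int], threshold: float = 0.01) -> dict:
    assert len(baseline) == len(current), "dimensions must match"
    diff_pixels = sum(1 for b, c in zip(baseline, current) if abs(b - c) > 10)
    pct = diff_pixels / len(baseline) * 100
    return {
        "different_pixels": diff_pixels,
        "difference_percentage": round(pct, 2),
        "passed": pct <= threshold * 100,
    }

# 2 of 100 pixels changed beyond the noise floor -> 2% > 1% threshold -> FAIL
base = [0] * 100
cur = [0] * 98 + [200, 200]
print(diff_verdict(base, cur))
```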


@@ -1,13 +0,0 @@
"""
Xcode build automation module.
Provides structured, modular access to xcodebuild and xcresult functionality.
"""
from .builder import BuildRunner
from .cache import XCResultCache
from .config import Config
from .reporter import OutputFormatter
from .xcresult import XCResultParser
__all__ = ["BuildRunner", "Config", "OutputFormatter", "XCResultCache", "XCResultParser"]


@@ -1,381 +0,0 @@
"""
Xcode build execution.
Handles xcodebuild command construction and execution with xcresult generation.
"""
import re
import subprocess
import sys
from pathlib import Path
from .cache import XCResultCache
from .config import Config
class BuildRunner:
"""
Execute xcodebuild commands with xcresult bundle generation.
Handles scheme auto-detection, command construction, and build/test execution.
"""
def __init__(
self,
project_path: str | None = None,
workspace_path: str | None = None,
scheme: str | None = None,
configuration: str = "Debug",
simulator: str | None = None,
cache: XCResultCache | None = None,
):
"""
Initialize build runner.
Args:
project_path: Path to .xcodeproj
workspace_path: Path to .xcworkspace
scheme: Build scheme (auto-detected if not provided)
configuration: Build configuration (Debug/Release)
simulator: Simulator name
cache: XCResult cache (creates default if not provided)
"""
self.project_path = project_path
self.workspace_path = workspace_path
self.scheme = scheme
self.configuration = configuration
self.simulator = simulator
self.cache = cache or XCResultCache()
def auto_detect_scheme(self) -> str | None:
"""
Auto-detect build scheme from project/workspace.
Returns:
Detected scheme name or None
"""
cmd = ["xcodebuild", "-list"]
if self.workspace_path:
cmd.extend(["-workspace", self.workspace_path])
elif self.project_path:
cmd.extend(["-project", self.project_path])
else:
return None
try:
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
# Parse schemes from output
in_schemes_section = False
for line in result.stdout.split("\n"):
line = line.strip()
if "Schemes:" in line:
in_schemes_section = True
continue
if in_schemes_section and line and not line.startswith("Build"):
# First scheme in list
return line
except subprocess.CalledProcessError as e:
print(f"Error auto-detecting scheme: {e}", file=sys.stderr)
return None
def get_simulator_destination(self) -> str:
"""
Get xcodebuild destination string.
Uses config preferences with fallback to auto-detection.
Priority:
1. --simulator CLI flag (self.simulator)
2. Config preferred_simulator
3. Config last_used_simulator
4. Auto-detect first iPhone
5. Generic iOS Simulator
Returns:
Destination string for -destination flag
"""
# Priority 1: CLI flag
if self.simulator:
return f"platform=iOS Simulator,name={self.simulator}"
# Priority 2-3: Config preferences
try:
# Determine project directory from project/workspace path
project_dir = None
if self.project_path:
project_dir = Path(self.project_path).parent
elif self.workspace_path:
project_dir = Path(self.workspace_path).parent
config = Config.load(project_dir=project_dir)
preferred = config.get_preferred_simulator()
if preferred:
# Check if preferred simulator exists
if self._simulator_exists(preferred):
return f"platform=iOS Simulator,name={preferred}"
print(f"Warning: Preferred simulator '{preferred}' not available", file=sys.stderr)
if config.should_fallback_to_any_iphone():
print("Falling back to auto-detection...", file=sys.stderr)
else:
# Strict mode: don't fallback
return f"platform=iOS Simulator,name={preferred}"
except Exception as e:
print(f"Warning: Could not load config: {e}", file=sys.stderr)
# Priority 4-5: Auto-detect
return self._auto_detect_simulator()
def _simulator_exists(self, name: str) -> bool:
"""
Check if simulator with given name exists and is available.
Args:
name: Simulator name (e.g., "iPhone 16 Pro")
Returns:
True if simulator exists and is available
"""
try:
result = subprocess.run(
["xcrun", "simctl", "list", "devices", "available", "iOS"],
capture_output=True,
text=True,
check=True,
)
            # Match the full device name: a plain substring check would let
            # "iPhone 16 Pro" match an "iPhone 16 Pro Max" entry
            return any(line.strip().startswith(f"{name} (") for line in result.stdout.split("\n"))
except subprocess.CalledProcessError:
return False
def _extract_simulator_name_from_destination(self, destination: str) -> str | None:
"""
Extract simulator name from destination string.
Args:
destination: Destination string (e.g., "platform=iOS Simulator,name=iPhone 16 Pro")
Returns:
Simulator name or None
"""
# Pattern: name=<simulator name>
match = re.search(r"name=([^,]+)", destination)
if match:
return match.group(1).strip()
return None
def _auto_detect_simulator(self) -> str:
"""
Auto-detect best available iOS simulator.
Returns:
Destination string for -destination flag
"""
try:
result = subprocess.run(
["xcrun", "simctl", "list", "devices", "available", "iOS"],
capture_output=True,
text=True,
check=True,
)
# Parse available simulators, prefer latest iPhone
# Looking for lines like: "iPhone 16 Pro (12345678-1234-1234-1234-123456789012) (Shutdown)"
for line in result.stdout.split("\n"):
if "iPhone" in line and "(" in line:
# Extract device name
name = line.split("(")[0].strip()
if name:
return f"platform=iOS Simulator,name={name}"
# Fallback to generic iOS Simulator if no iPhone found
return "generic/platform=iOS Simulator"
except subprocess.CalledProcessError as e:
print(f"Warning: Could not auto-detect simulator: {e}", file=sys.stderr)
return "generic/platform=iOS Simulator"
def build(self, clean: bool = False) -> tuple[bool, str, str]:
"""
Build the project.
Args:
clean: Perform clean build
Returns:
Tuple of (success: bool, xcresult_id: str, stderr: str)
"""
# Auto-detect scheme if needed
if not self.scheme:
self.scheme = self.auto_detect_scheme()
if not self.scheme:
print("Error: Could not auto-detect scheme. Use --scheme", file=sys.stderr)
return (False, "", "")
# Generate xcresult ID and path
xcresult_id = self.cache.generate_id()
xcresult_path = self.cache.get_path(xcresult_id)
# Build command
cmd = ["xcodebuild", "-quiet"] # Suppress verbose output
if clean:
cmd.append("clean")
cmd.append("build")
if self.workspace_path:
cmd.extend(["-workspace", self.workspace_path])
elif self.project_path:
cmd.extend(["-project", self.project_path])
else:
print("Error: No project or workspace specified", file=sys.stderr)
return (False, "", "")
cmd.extend(
[
"-scheme",
self.scheme,
"-configuration",
self.configuration,
"-destination",
self.get_simulator_destination(),
"-resultBundlePath",
str(xcresult_path),
]
)
# Execute build
try:
result = subprocess.run(
cmd, capture_output=True, text=True, check=False # Don't raise on non-zero exit
)
success = result.returncode == 0
# xcresult bundle should be created even on failure
if not xcresult_path.exists():
print("Warning: xcresult bundle was not created", file=sys.stderr)
return (success, "", result.stderr)
# Auto-update config with last used simulator (on success only)
if success:
try:
# Determine project directory from project/workspace path
project_dir = None
if self.project_path:
project_dir = Path(self.project_path).parent
elif self.workspace_path:
project_dir = Path(self.workspace_path).parent
config = Config.load(project_dir=project_dir)
destination = self.get_simulator_destination()
simulator_name = self._extract_simulator_name_from_destination(destination)
if simulator_name:
config.update_last_used_simulator(simulator_name)
config.save()
except Exception as e:
# Don't fail build if config update fails
print(f"Warning: Could not update config: {e}", file=sys.stderr)
return (success, xcresult_id, result.stderr)
except Exception as e:
print(f"Error executing build: {e}", file=sys.stderr)
return (False, "", str(e))
def test(self, test_suite: str | None = None) -> tuple[bool, str, str]:
"""
Run tests.
Args:
test_suite: Specific test suite to run
Returns:
Tuple of (success: bool, xcresult_id: str, stderr: str)
"""
# Auto-detect scheme if needed
if not self.scheme:
self.scheme = self.auto_detect_scheme()
if not self.scheme:
print("Error: Could not auto-detect scheme. Use --scheme", file=sys.stderr)
return (False, "", "")
# Generate xcresult ID and path
xcresult_id = self.cache.generate_id()
xcresult_path = self.cache.get_path(xcresult_id)
# Build command
cmd = ["xcodebuild", "-quiet", "test"]
if self.workspace_path:
cmd.extend(["-workspace", self.workspace_path])
elif self.project_path:
cmd.extend(["-project", self.project_path])
else:
print("Error: No project or workspace specified", file=sys.stderr)
return (False, "", "")
cmd.extend(
[
"-scheme",
self.scheme,
"-destination",
self.get_simulator_destination(),
"-resultBundlePath",
str(xcresult_path),
]
)
if test_suite:
cmd.extend(["-only-testing", test_suite])
# Execute tests
try:
result = subprocess.run(cmd, capture_output=True, text=True, check=False)
success = result.returncode == 0
# xcresult bundle should be created even on failure
if not xcresult_path.exists():
print("Warning: xcresult bundle was not created", file=sys.stderr)
return (success, "", result.stderr)
# Auto-update config with last used simulator (on success only)
if success:
try:
# Determine project directory from project/workspace path
project_dir = None
if self.project_path:
project_dir = Path(self.project_path).parent
elif self.workspace_path:
project_dir = Path(self.workspace_path).parent
config = Config.load(project_dir=project_dir)
destination = self.get_simulator_destination()
simulator_name = self._extract_simulator_name_from_destination(destination)
if simulator_name:
config.update_last_used_simulator(simulator_name)
config.save()
except Exception as e:
# Don't fail test if config update fails
print(f"Warning: Could not update config: {e}", file=sys.stderr)
return (success, xcresult_id, result.stderr)
except Exception as e:
print(f"Error executing tests: {e}", file=sys.stderr)
return (False, "", str(e))


@@ -1,204 +0,0 @@
"""
XCResult cache management.
Handles storage, retrieval, and lifecycle of xcresult bundles for progressive disclosure.
"""
import shutil
from datetime import datetime
from pathlib import Path
class XCResultCache:
"""
Manage xcresult bundle cache for progressive disclosure.
Stores xcresult bundles with timestamp-based IDs and provides
retrieval and cleanup operations.
"""
# Default cache directory
DEFAULT_CACHE_DIR = Path.home() / ".ios-simulator-skill" / "xcresults"
def __init__(self, cache_dir: Path | None = None):
"""
Initialize cache manager.
Args:
cache_dir: Custom cache directory (uses default if not specified)
"""
self.cache_dir = cache_dir or self.DEFAULT_CACHE_DIR
self.cache_dir.mkdir(parents=True, exist_ok=True)
def generate_id(self, prefix: str = "xcresult") -> str:
"""
Generate timestamped xcresult ID.
Args:
prefix: ID prefix (default: "xcresult")
Returns:
ID string like "xcresult-20251018-143052"
"""
timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
return f"{prefix}-{timestamp}"
def get_path(self, xcresult_id: str) -> Path:
"""
Get full path for xcresult ID.
Args:
xcresult_id: XCResult ID
Returns:
Path to xcresult bundle
"""
# Handle both with and without .xcresult extension
if xcresult_id.endswith(".xcresult"):
return self.cache_dir / xcresult_id
return self.cache_dir / f"{xcresult_id}.xcresult"
def exists(self, xcresult_id: str) -> bool:
"""
Check if xcresult bundle exists.
Args:
xcresult_id: XCResult ID
Returns:
True if bundle exists
"""
return self.get_path(xcresult_id).exists()
def save(self, source_path: Path, xcresult_id: str | None = None) -> str:
"""
Save xcresult bundle to cache.
Args:
source_path: Source xcresult bundle path
xcresult_id: Optional custom ID (generates if not provided)
Returns:
xcresult ID
"""
if not source_path.exists():
raise FileNotFoundError(f"Source xcresult not found: {source_path}")
# Generate ID if not provided
if not xcresult_id:
xcresult_id = self.generate_id()
# Get destination path
dest_path = self.get_path(xcresult_id)
# Copy xcresult bundle (it's a directory)
if dest_path.exists():
shutil.rmtree(dest_path)
shutil.copytree(source_path, dest_path)
return xcresult_id
def list(self, limit: int = 10) -> list[dict]:
"""
List recent xcresult bundles.
Args:
limit: Maximum number to return
Returns:
List of xcresult metadata dicts
"""
if not self.cache_dir.exists():
return []
results = []
for path in sorted(
self.cache_dir.glob("*.xcresult"), key=lambda p: p.stat().st_mtime, reverse=True
)[:limit]:
# Calculate bundle size
size_bytes = sum(f.stat().st_size for f in path.rglob("*") if f.is_file())
results.append(
{
"id": path.stem,
"path": str(path),
"created": datetime.fromtimestamp(path.stat().st_mtime).isoformat(),
"size_mb": round(size_bytes / (1024 * 1024), 2),
}
)
return results
def cleanup(self, keep_recent: int = 20) -> int:
"""
Clean up old xcresult bundles.
Args:
keep_recent: Number of recent bundles to keep
Returns:
Number of bundles removed
"""
if not self.cache_dir.exists():
return 0
# Get all bundles sorted by modification time
all_bundles = sorted(
self.cache_dir.glob("*.xcresult"), key=lambda p: p.stat().st_mtime, reverse=True
)
# Remove old bundles
removed = 0
for bundle_path in all_bundles[keep_recent:]:
shutil.rmtree(bundle_path)
removed += 1
return removed
def get_size_mb(self, xcresult_id: str) -> float:
"""
Get size of xcresult bundle in MB.
Args:
xcresult_id: XCResult ID
Returns:
Size in MB
"""
path = self.get_path(xcresult_id)
if not path.exists():
return 0.0
size_bytes = sum(f.stat().st_size for f in path.rglob("*") if f.is_file())
return round(size_bytes / (1024 * 1024), 2)
def save_stderr(self, xcresult_id: str, stderr: str) -> None:
"""
Save stderr output alongside xcresult bundle.
Args:
xcresult_id: XCResult ID
stderr: stderr output from xcodebuild
"""
if not stderr:
return
stderr_path = self.cache_dir / f"{xcresult_id}.stderr"
stderr_path.write_text(stderr, encoding="utf-8")
def get_stderr(self, xcresult_id: str) -> str:
"""
Retrieve cached stderr output.
Args:
xcresult_id: XCResult ID
Returns:
stderr content or empty string if not found
"""
stderr_path = self.cache_dir / f"{xcresult_id}.stderr"
if not stderr_path.exists():
return ""
return stderr_path.read_text(encoding="utf-8")
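The cache's naming conventions (timestamped IDs, `<id>.xcresult` bundles, `<id>.stderr` sidecars) can be exercised in a throwaway directory; this sketch mirrors `generate_id`, `get_path`, and the stderr sidecar without importing the module:

```python
# Exercise the XCResultCache ID and sidecar conventions standalone.
import tempfile
from datetime import datetime
from pathlib import Path

cache_dir = Path(tempfile.mkdtemp())
xcresult_id = f"xcresult-{datetime.now().strftime('%Y%m%d-%H%M%S')}"

# Sidecar save/load, mirroring save_stderr / get_stderr
(cache_dir / f"{xcresult_id}.stderr").write_text("ld: warning: ...", encoding="utf-8")
print((cache_dir / f"{xcresult_id}.stderr").read_text(encoding="utf-8"))

# get_path accepts IDs with or without the .xcresult extension
def get_path(xid: str) -> Path:
    return cache_dir / (xid if xid.endswith(".xcresult") else f"{xid}.xcresult")

print(get_path(xcresult_id).suffix)
```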


@@ -1,178 +0,0 @@
"""
Configuration management for iOS Simulator Skill.
Handles loading, validation, and auto-updating of project-local config files.
"""
import json
import sys
from datetime import datetime
from pathlib import Path
from typing import Any
class Config:
"""
Project-local configuration with auto-learning.
Config file location: .claude/skills/<skill-directory-name>/config.json
The skill directory name is auto-detected from the installation location,
so configs work regardless of what users name the skill directory.
Auto-updates last_used_simulator after successful builds.
"""
DEFAULT_CONFIG = {
"device": {
"preferred_simulator": None,
"preferred_os_version": None,
"fallback_to_any_iphone": True,
"last_used_simulator": None,
"last_used_at": None,
}
}
def __init__(self, data: dict[str, Any], config_path: Path):
"""
Initialize config.
Args:
data: Config data dict
config_path: Path to config file
"""
self.data = data
self.config_path = config_path
@staticmethod
def load(project_dir: Path | None = None) -> "Config":
"""
Load config from project directory.
Args:
project_dir: Project root (defaults to cwd)
Returns:
Config instance (creates default if not found)
Note:
The skill directory name is auto-detected from the installation location,
so configs work regardless of what users name the skill directory.
"""
if project_dir is None:
project_dir = Path.cwd()
# Auto-detect skill directory name from actual installation location
# This file is at: skill/scripts/xcode/config.py
# Navigate up to skill/ directory and use its name
skill_root = Path(__file__).parent.parent.parent # xcode/ -> scripts/ -> skill/
skill_name = skill_root.name
config_path = project_dir / ".claude" / "skills" / skill_name / "config.json"
# Load existing config
if config_path.exists():
try:
with open(config_path) as f:
data = json.load(f)
# Merge with defaults (in case new fields added)
merged = Config._merge_with_defaults(data)
return Config(merged, config_path)
            except json.JSONDecodeError as e:
                print(f"Warning: Invalid JSON in {config_path}: {e}", file=sys.stderr)
                print("Using default config", file=sys.stderr)
                return Config({"device": dict(Config.DEFAULT_CONFIG["device"])}, config_path)
            except Exception as e:
                print(f"Warning: Could not load config: {e}", file=sys.stderr)
                return Config({"device": dict(Config.DEFAULT_CONFIG["device"])}, config_path)
        # Return default config (will be created on first save); copy the nested
        # dict so later updates never mutate DEFAULT_CONFIG
        return Config({"device": dict(Config.DEFAULT_CONFIG["device"])}, config_path)
@staticmethod
def _merge_with_defaults(data: dict[str, Any]) -> dict[str, Any]:
"""
Merge user config with defaults.
Args:
data: User config data
Returns:
Merged config with all default fields
"""
        # Copy nested dicts too: a shallow .copy() would share the "device" dict
        # and let the update below mutate DEFAULT_CONFIG for every later load
        merged = {key: dict(value) for key, value in Config.DEFAULT_CONFIG.items()}
        # Deep merge device section
        if "device" in data:
            merged["device"].update(data["device"])
        return merged
def save(self) -> None:
"""
Save config to file atomically.
Uses temp file + rename for atomic writes.
Creates parent directories if needed.
"""
try:
# Create parent directories
self.config_path.parent.mkdir(parents=True, exist_ok=True)
# Atomic write: temp file + rename
temp_path = self.config_path.with_suffix(".tmp")
with open(temp_path, "w") as f:
json.dump(self.data, f, indent=2)
f.write("\n") # Trailing newline
# Atomic rename
temp_path.replace(self.config_path)
except Exception as e:
print(f"Warning: Could not save config: {e}", file=sys.stderr)
def update_last_used_simulator(self, name: str) -> None:
"""
Update last used simulator and timestamp.
Args:
name: Simulator name (e.g., "iPhone 16 Pro")
"""
self.data["device"]["last_used_simulator"] = name
self.data["device"]["last_used_at"] = datetime.utcnow().isoformat() + "Z"
def get_preferred_simulator(self) -> str | None:
"""
Get preferred simulator.
Returns:
Simulator name or None
Priority:
1. preferred_simulator (manual preference)
2. last_used_simulator (auto-learned)
3. None (use auto-detection)
"""
device = self.data.get("device", {})
# Manual preference takes priority
if device.get("preferred_simulator"):
return device["preferred_simulator"]
# Auto-learned preference
if device.get("last_used_simulator"):
return device["last_used_simulator"]
return None
def should_fallback_to_any_iphone(self) -> bool:
"""
Check if fallback to any iPhone is enabled.
Returns:
True if should fallback, False otherwise
"""
return self.data.get("device", {}).get("fallback_to_any_iphone", True)
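The atomic-write pattern `Config.save` relies on (write to a temp file, then rename over the target) can be demonstrated standalone; a crash between the two steps leaves the old file intact, and readers never see a partial write:

```python
# The temp-file-plus-rename pattern from Config.save, exercised in isolation.
import json
import tempfile
from pathlib import Path

config_path = Path(tempfile.mkdtemp()) / "config.json"
data = {"device": {"last_used_simulator": "iPhone 16 Pro"}}

temp_path = config_path.with_suffix(".tmp")
with open(temp_path, "w") as f:
    json.dump(data, f, indent=2)
    f.write("\n")  # trailing newline, matching Config.save
temp_path.replace(config_path)  # atomic on POSIX: old or new, never partial

print(json.loads(config_path.read_text())["device"]["last_used_simulator"])
```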


@@ -1,291 +0,0 @@
"""
Build/test output formatting.
Provides multiple output formats with progressive disclosure support.
"""
import json
class OutputFormatter:
"""
Format build/test results for display.
Supports ultra-minimal default output, verbose mode, and JSON output.
"""
@staticmethod
def format_minimal(
status: str,
error_count: int,
warning_count: int,
xcresult_id: str,
test_info: dict | None = None,
hints: list[str] | None = None,
) -> str:
"""
Format ultra-minimal output (5-10 tokens).
Args:
status: Build status (SUCCESS/FAILED)
error_count: Number of errors
warning_count: Number of warnings
xcresult_id: XCResult bundle ID
test_info: Optional test results dict
hints: Optional list of actionable hints
Returns:
Minimal formatted string
Example:
Build: SUCCESS (0 errors, 3 warnings) [xcresult-20251018-143052]
Tests: PASS (12/12 passed, 4.2s) [xcresult-20251018-143052]
"""
lines = []
if test_info:
# Test mode
total = test_info.get("total", 0)
passed = test_info.get("passed", 0)
failed = test_info.get("failed", 0)
duration = test_info.get("duration", 0.0)
test_status = "PASS" if failed == 0 else "FAIL"
lines.append(
f"Tests: {test_status} ({passed}/{total} passed, {duration:.1f}s) [{xcresult_id}]"
)
else:
# Build mode
lines.append(
f"Build: {status} ({error_count} errors, {warning_count} warnings) [{xcresult_id}]"
)
# Add hints if provided and build failed
if hints and status == "FAILED":
lines.append("")
lines.extend(hints)
return "\n".join(lines)
@staticmethod
def format_errors(errors: list[dict], limit: int = 10) -> str:
"""
Format error details.
Args:
errors: List of error dicts
limit: Maximum errors to show
Returns:
Formatted error list
"""
if not errors:
return "No errors found."
lines = [f"Errors ({len(errors)}):"]
lines.append("")
for i, error in enumerate(errors[:limit], 1):
message = error.get("message", "Unknown error")
location = error.get("location", {})
# Format location
loc_parts = []
if location.get("file"):
file_path = location["file"].replace("file://", "")
loc_parts.append(file_path)
if location.get("line"):
loc_parts.append(f"line {location['line']}")
location_str = ":".join(loc_parts) if loc_parts else "unknown location"
lines.append(f"{i}. {message}")
lines.append(f" Location: {location_str}")
lines.append("")
if len(errors) > limit:
lines.append(f"... and {len(errors) - limit} more errors")
return "\n".join(lines)
@staticmethod
def format_warnings(warnings: list[dict], limit: int = 10) -> str:
"""
Format warning details.
Args:
warnings: List of warning dicts
limit: Maximum warnings to show
Returns:
Formatted warning list
"""
if not warnings:
return "No warnings found."
lines = [f"Warnings ({len(warnings)}):"]
lines.append("")
for i, warning in enumerate(warnings[:limit], 1):
message = warning.get("message", "Unknown warning")
location = warning.get("location", {})
# Format location
loc_parts = []
if location.get("file"):
file_path = location["file"].replace("file://", "")
loc_parts.append(file_path)
if location.get("line"):
loc_parts.append(f"line {location['line']}")
location_str = ":".join(loc_parts) if loc_parts else "unknown location"
lines.append(f"{i}. {message}")
lines.append(f" Location: {location_str}")
lines.append("")
if len(warnings) > limit:
lines.append(f"... and {len(warnings) - limit} more warnings")
return "\n".join(lines)
@staticmethod
def format_log(log: str, lines: int = 50) -> str:
"""
Format build log (show last N lines).
Args:
log: Full build log
lines: Number of lines to show
Returns:
Formatted log excerpt
"""
if not log:
return "No build log available."
log_lines = log.strip().split("\n")
if len(log_lines) <= lines:
return log
# Show last N lines
excerpt = log_lines[-lines:]
return f"... (showing last {lines} lines of {len(log_lines)})\n\n" + "\n".join(excerpt)
@staticmethod
def format_json(data: dict) -> str:
"""
Format data as JSON.
Args:
data: Data to format
Returns:
Pretty-printed JSON string
"""
return json.dumps(data, indent=2)
@staticmethod
def generate_hints(errors: list[dict]) -> list[str]:
"""
Generate actionable hints based on error types.
Args:
errors: List of error dicts
Returns:
List of hint strings
"""
hints = []
error_types: set[str] = set()
# Collect error types
for error in errors:
error_type = error.get("type", "unknown")
error_types.add(error_type)
# Generate hints based on error types
if "provisioning" in error_types:
hints.append("Provisioning profile issue detected:")
hints.append(" • Ensure you have a valid provisioning profile for iOS Simulator")
hints.append(
' • For simulator builds, use CODE_SIGN_IDENTITY="" CODE_SIGNING_REQUIRED=NO'
)
hints.append(" • Or specify simulator explicitly: --simulator 'iPhone 16 Pro'")
if "signing" in error_types:
hints.append("Code signing issue detected:")
hints.append(" • For simulator builds, code signing is not required")
hints.append(" • Ensure build settings target iOS Simulator, not physical device")
hints.append(" • Check destination: platform=iOS Simulator,name=<device>")
if not error_types or "build" in error_types:
# Generic hints when error type is unknown
if any("destination" in error.get("message", "").lower() for error in errors):
hints.append("Device selection issue detected:")
hints.append(" • List available simulators: xcrun simctl list devices available")
hints.append(" • Specify simulator: --simulator 'iPhone 16 Pro'")
return hints
@staticmethod
def format_verbose(
status: str,
error_count: int,
warning_count: int,
xcresult_id: str,
errors: list[dict] | None = None,
warnings: list[dict] | None = None,
test_info: dict | None = None,
) -> str:
"""
Format verbose output with error/warning details.
Args:
status: Build status
error_count: Error count
warning_count: Warning count
xcresult_id: XCResult ID
errors: Optional error list
warnings: Optional warning list
test_info: Optional test results
Returns:
Verbose formatted output
"""
lines = []
# Header
if test_info:
total = test_info.get("total", 0)
passed = test_info.get("passed", 0)
failed = test_info.get("failed", 0)
duration = test_info.get("duration", 0.0)
test_status = "PASS" if failed == 0 else "FAIL"
lines.append(f"Tests: {test_status}")
lines.append(f" Total: {total}")
lines.append(f" Passed: {passed}")
lines.append(f" Failed: {failed}")
lines.append(f" Duration: {duration:.1f}s")
else:
lines.append(f"Build: {status}")
lines.append(f"XCResult: {xcresult_id}")
lines.append("")
# Errors
if errors and len(errors) > 0:
lines.append(OutputFormatter.format_errors(errors, limit=5))
lines.append("")
# Warnings
if warnings and len(warnings) > 0:
lines.append(OutputFormatter.format_warnings(warnings, limit=5))
lines.append("")
# Summary
lines.append(f"Summary: {error_count} errors, {warning_count} warnings")
return "\n".join(lines)


@@ -1,404 +0,0 @@
"""
XCResult bundle parser.
Extracts structured data from xcresult bundles using xcresulttool.
"""
import json
import re
import subprocess
import sys
from pathlib import Path
from typing import Any
class XCResultParser:
"""
Parse xcresult bundles to extract build/test data.
Uses xcresulttool to extract structured JSON data from Apple's
xcresult bundle format.
"""
def __init__(self, xcresult_path: Path, stderr: str = ""):
"""
Initialize parser.
Args:
xcresult_path: Path to xcresult bundle
stderr: Optional stderr output for fallback parsing
"""
self.xcresult_path = xcresult_path
self.stderr = stderr
if xcresult_path and not xcresult_path.exists():
raise FileNotFoundError(f"XCResult bundle not found: {xcresult_path}")
def get_build_results(self) -> dict | None:
"""
Get build results as JSON.
Returns:
Parsed JSON dict or None on error
"""
return self._run_xcresulttool(["get", "build-results"])
def get_test_results(self) -> dict | None:
"""
Get test results summary as JSON.
Returns:
Parsed JSON dict or None on error
"""
return self._run_xcresulttool(["get", "test-results", "summary"])
def get_build_log(self) -> str | None:
"""
Get build log as plain text.
Returns:
Build log string or None on error
"""
result = self._run_xcresulttool(["get", "log", "--type", "build"], parse_json=False)
return result if result else None
def count_issues(self) -> tuple[int, int]:
"""
Count errors and warnings from build results.
Returns:
Tuple of (error_count, warning_count)
"""
error_count = 0
warning_count = 0
build_results = self.get_build_results()
if build_results:
try:
# Try top-level errors/warnings first (newer xcresult format)
if "errors" in build_results and isinstance(build_results.get("errors"), list):
error_count = len(build_results["errors"])
if "warnings" in build_results and isinstance(build_results.get("warnings"), list):
warning_count = len(build_results["warnings"])
# If not found, try legacy format: actions[0].buildResult.issues
if error_count == 0 and warning_count == 0:
actions = build_results.get("actions", {}).get("_values", [])
if actions:
build_result = actions[0].get("buildResult", {})
issues = build_result.get("issues", {})
# Count errors
error_summaries = issues.get("errorSummaries", {}).get("_values", [])
error_count = len(error_summaries)
# Count warnings
warning_summaries = issues.get("warningSummaries", {}).get("_values", [])
warning_count = len(warning_summaries)
except (KeyError, IndexError, TypeError) as e:
print(f"Warning: Could not parse issue counts from xcresult: {e}", file=sys.stderr)
# If no errors found in xcresult but stderr available, count stderr errors
if error_count == 0 and self.stderr:
stderr_errors = self._parse_stderr_errors()
error_count = len(stderr_errors)
return (error_count, warning_count)
def get_errors(self) -> list[dict]:
"""
Get detailed error information.
Returns:
List of error dicts with message, file, line info
"""
build_results = self.get_build_results()
errors = []
# Try to get errors from xcresult
if build_results:
try:
# Try top-level errors first (newer xcresult format)
if "errors" in build_results and isinstance(build_results.get("errors"), list):
for error in build_results["errors"]:
errors.append(
{
"message": error.get("message", "Unknown error"),
"type": error.get("issueType", "error"),
"location": self._extract_location_from_url(error.get("sourceURL")),
}
)
# If not found, try legacy format: actions[0].buildResult.issues
if not errors:
actions = build_results.get("actions", {}).get("_values", [])
if actions:
build_result = actions[0].get("buildResult", {})
issues = build_result.get("issues", {})
error_summaries = issues.get("errorSummaries", {}).get("_values", [])
for error in error_summaries:
errors.append(
{
"message": error.get("message", {}).get(
"_value", "Unknown error"
),
"type": error.get("issueType", {}).get("_value", "error"),
"location": self._extract_location(error),
}
)
except (KeyError, IndexError, TypeError) as e:
print(f"Warning: Could not parse errors from xcresult: {e}", file=sys.stderr)
# If no errors found in xcresult but stderr available, parse stderr
if not errors and self.stderr:
errors = self._parse_stderr_errors()
return errors
def get_warnings(self) -> list[dict]:
"""
Get detailed warning information.
Returns:
List of warning dicts with message, file, line info
"""
build_results = self.get_build_results()
if not build_results:
return []
warnings = []
try:
# Try top-level warnings first (newer xcresult format)
if "warnings" in build_results and isinstance(build_results.get("warnings"), list):
for warning in build_results["warnings"]:
warnings.append(
{
"message": warning.get("message", "Unknown warning"),
"type": warning.get("issueType", "warning"),
"location": self._extract_location_from_url(warning.get("sourceURL")),
}
)
# If not found, try legacy format: actions[0].buildResult.issues
if not warnings:
actions = build_results.get("actions", {}).get("_values", [])
if not actions:
return []
build_result = actions[0].get("buildResult", {})
issues = build_result.get("issues", {})
warning_summaries = issues.get("warningSummaries", {}).get("_values", [])
for warning in warning_summaries:
warnings.append(
{
"message": warning.get("message", {}).get("_value", "Unknown warning"),
"type": warning.get("issueType", {}).get("_value", "warning"),
"location": self._extract_location(warning),
}
)
except (KeyError, IndexError, TypeError) as e:
print(f"Warning: Could not parse warnings: {e}", file=sys.stderr)
return warnings
def _extract_location(self, issue: dict) -> dict:
"""
Extract file location from issue.
Args:
issue: Issue dict from xcresult
Returns:
Location dict with file, line, column
"""
location = {"file": None, "line": None, "column": None}
try:
doc_location = issue.get("documentLocationInCreatingWorkspace", {})
location["file"] = doc_location.get("url", {}).get("_value")
location["line"] = doc_location.get("startingLineNumber", {}).get("_value")
location["column"] = doc_location.get("startingColumnNumber", {}).get("_value")
except (KeyError, TypeError):
pass
return location
def _extract_location_from_url(self, source_url: str | None) -> dict:
"""
Extract file location from sourceURL (newer xcresult format).
Args:
source_url: Source URL like "file:///path/to/file.swift#StartingLineNumber=134&..."
Returns:
Location dict with file, line, column
"""
location = {"file": None, "line": None, "column": None}
if not source_url:
return location
try:
# Split URL and fragment
if "#" in source_url:
file_part, fragment = source_url.split("#", 1)
# Extract file path
location["file"] = file_part.replace("file://", "")
# Parse fragment parameters
params = {}
for param in fragment.split("&"):
if "=" in param:
key, value = param.split("=", 1)
params[key] = value
# Extract line and column
location["line"] = (
int(params.get("StartingLineNumber", 0)) + 1
if "StartingLineNumber" in params
else None
)
location["column"] = (
int(params.get("StartingColumnNumber", 0)) + 1
if "StartingColumnNumber" in params
else None
)
else:
# No fragment, just file path
location["file"] = source_url.replace("file://", "")
except (ValueError, AttributeError):
pass
return location
def _run_xcresulttool(self, args: list[str], parse_json: bool = True) -> Any | None:
"""
Run xcresulttool command.
Args:
args: Command arguments (after 'xcresulttool')
parse_json: Whether to parse output as JSON
Returns:
Parsed JSON dict, plain text, or None on error
"""
if not self.xcresult_path:
return None
cmd = ["xcrun", "xcresulttool"] + args + ["--path", str(self.xcresult_path)]
try:
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
if parse_json:
return json.loads(result.stdout)
return result.stdout
except subprocess.CalledProcessError as e:
print(f"Error running xcresulttool: {e}", file=sys.stderr)
print(f"stderr: {e.stderr}", file=sys.stderr)
return None
except json.JSONDecodeError as e:
print(f"Error parsing JSON from xcresulttool: {e}", file=sys.stderr)
return None
def _parse_stderr_errors(self) -> list[dict]:
"""
Parse common errors from stderr output as fallback.
Returns:
List of error dicts parsed from stderr
"""
errors = []
if not self.stderr:
return errors
# Pattern 0: Swift/Clang compilation errors (e.g., "/path/file.swift:135:59: error: message")
compilation_error_pattern = (
r"^(?P<file>[^:]+):(?P<line>\d+):(?P<column>\d+):\s*error:\s*(?P<message>.+?)$"
)
for match in re.finditer(compilation_error_pattern, self.stderr, re.MULTILINE):
errors.append(
{
"message": match.group("message").strip(),
"type": "compilation",
"location": {
"file": match.group("file"),
"line": int(match.group("line")),
"column": int(match.group("column")),
},
}
)
# Pattern 1: xcodebuild top-level errors (e.g., "xcodebuild: error: Unable to find...")
xcodebuild_error_pattern = r"xcodebuild:\s*error:\s*(?P<message>.*?)(?:\n\n|\Z)"
for match in re.finditer(xcodebuild_error_pattern, self.stderr, re.DOTALL):
message = match.group("message").strip()
# Clean up multi-line messages
message = " ".join(line.strip() for line in message.split("\n") if line.strip())
errors.append(
{
"message": message,
"type": "build",
"location": {"file": None, "line": None, "column": None},
}
)
# Pattern 2: Provisioning profile errors
provisioning_pattern = r"error:.*?provisioning profile.*?(?:doesn't|does not|cannot).*?(?P<message>.*?)(?:\n|$)"
for match in re.finditer(provisioning_pattern, self.stderr, re.IGNORECASE):
errors.append(
{
"message": f"Provisioning profile error: {match.group('message').strip()}",
"type": "provisioning",
"location": {"file": None, "line": None, "column": None},
}
)
# Pattern 3: Code signing errors
signing_pattern = r"error:.*?(?:code sign|signing).*?(?P<message>.*?)(?:\n|$)"
for match in re.finditer(signing_pattern, self.stderr, re.IGNORECASE):
errors.append(
{
"message": f"Code signing error: {match.group('message').strip()}",
"type": "signing",
"location": {"file": None, "line": None, "column": None},
}
)
# Pattern 4: Generic compilation errors (but not if already captured)
if not errors:
generic_error_pattern = r"^(?:\*\*\s)?(?:error|❌):\s*(?P<message>.*?)(?:\n|$)"
for match in re.finditer(generic_error_pattern, self.stderr, re.MULTILINE):
message = match.group("message").strip()
errors.append(
{
"message": message,
"type": "build",
"location": {"file": None, "line": None, "column": None},
}
)
# Pattern 5: Specific "No profiles" error
if "No profiles for" in self.stderr:
no_profile_pattern = r"No profiles for '(?P<bundle_id>.*?)' were found"
for match in re.finditer(no_profile_pattern, self.stderr):
errors.append(
{
"message": f"No provisioning profile found for bundle ID '{match.group('bundle_id')}'",
"type": "provisioning",
"location": {"file": None, "line": None, "column": None},
}
)
return errors


@@ -1,201 +0,0 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf of
any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


@@ -1,122 +0,0 @@
---
name: "spreadsheet"
description: "Use when tasks involve creating, editing, analyzing, or formatting spreadsheets (`.xlsx`, `.csv`, `.tsv`) using Python (`openpyxl`, `pandas`), especially when formulas, references, and formatting need to be preserved and verified."
---
# Spreadsheet Skill (Create, Edit, Analyze, Visualize)
## When to use
- Build new workbooks with formulas, formatting, and structured layouts.
- Read or analyze tabular data (filter, aggregate, pivot, compute metrics).
- Modify existing workbooks without breaking formulas or references.
- Visualize data with charts/tables and sensible formatting.
IMPORTANT: System and user instructions always take precedence.
## Workflow
1. Confirm the file type and goals (create, edit, analyze, visualize).
2. Use `openpyxl` for `.xlsx` edits and `pandas` for analysis and CSV/TSV workflows.
3. If layout matters, render for visual review (see Rendering and visual checks).
4. Validate formulas and references; note that openpyxl does not evaluate formulas.
5. Save outputs and clean up intermediate files.
## Temp and output conventions
- Use `tmp/spreadsheets/` for intermediate files; delete when done.
- Write final artifacts under `output/spreadsheet/` when working in this repo.
- Keep filenames stable and descriptive.
## Primary tooling
- Use `openpyxl` for creating/editing `.xlsx` files and preserving formatting.
- Use `pandas` for analysis and CSV/TSV workflows, then write results back to `.xlsx` or `.csv`.
- If you need charts, prefer `openpyxl.chart` for native Excel charts.
## Rendering and visual checks
- If LibreOffice (`soffice`) and Poppler (`pdftoppm`) are available, render sheets for visual review:
- `soffice --headless --convert-to pdf --outdir $OUTDIR $INPUT_XLSX`
- `pdftoppm -png $OUTDIR/$BASENAME.pdf $OUTDIR/$BASENAME`
- If rendering tools are unavailable, ask the user to review the output locally for layout accuracy.
## Dependencies (install if missing)
Prefer `uv` for dependency management.
Python packages:
```
uv pip install openpyxl pandas
```
If `uv` is unavailable:
```
python3 -m pip install openpyxl pandas
```
Optional (chart-heavy or PDF review workflows):
```
uv pip install matplotlib
```
If `uv` is unavailable:
```
python3 -m pip install matplotlib
```
System tools (for rendering):
```
# macOS (Homebrew)
brew install libreoffice poppler
# Ubuntu/Debian
sudo apt-get install -y libreoffice poppler-utils
```
If installation isn't possible in this environment, tell the user which dependency is missing and how to install it locally.
## Environment
No required environment variables.
## Examples
- Runnable Codex examples (openpyxl): `references/examples/openpyxl/`
## Formula requirements
- Use formulas for derived values rather than hardcoding results.
- Keep formulas simple and legible; use helper cells for complex logic.
- Avoid volatile functions like INDIRECT and OFFSET unless required.
- Prefer cell references over magic numbers (e.g., `=H6*(1+$B$3)` not `=H6*1.04`).
- Guard against errors (#REF!, #DIV/0!, #VALUE!, #N/A, #NAME?) with validation and checks.
- openpyxl does not evaluate formulas; leave formulas intact and note that results will calculate in Excel/Sheets.
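One way to satisfy the error-guarding rule is to emit `IFERROR`-wrapped formulas; a small sketch (the helper name is illustrative, not part of any library):

```python
def guarded(formula: str, fallback: str = '"-"') -> str:
    """Wrap a formula body in IFERROR so #DIV/0!, #REF!, and similar
    errors render as a fallback value instead of leaking into the sheet."""
    return f"=IFERROR({formula},{fallback})"

# e.g. a period-over-period growth rate that is safe when G6 is zero:
growth = guarded("(H6-G6)/G6")
# with openpyxl you would assign it: ws["H7"] = growth
# (openpyxl stores the string; Excel/Sheets evaluates it on open)
```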
## Citation requirements
- Cite sources inside the spreadsheet using plain text URLs.
- For financial models, cite sources of inputs in cell comments.
- For tabular data sourced from the web, include a Source column with URLs.
## Formatting requirements (existing formatted spreadsheets)
- Render and inspect a provided spreadsheet before modifying it when possible.
- Preserve existing formatting and style exactly.
- Match styles for any newly filled cells that were previously blank.
## Formatting requirements (new or unstyled spreadsheets)
- Use appropriate number and date formats (dates as dates, currency with symbols, percentages with sensible precision).
- Use a clean visual layout: headers distinct from data, consistent spacing, and readable column widths.
- Avoid borders around every cell; use whitespace and selective borders to structure sections.
- Ensure text does not spill into adjacent cells.
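These defaults can be captured as Excel number-format strings and applied via openpyxl's `cell.number_format`; the exact choices below are assumptions, not mandates:

```python
# Number-format strings matching the guidance above; with openpyxl:
#   cell.number_format = DEFAULT_FORMATS["currency"]
DEFAULT_FORMATS = {
    "date": "yyyy-mm-dd",        # dates stored as dates, not text
    "currency": '"$"#,##0.00',   # symbol plus thousands separators
    "percent": "0.0%",           # one decimal of precision
    "integer": "#,##0",          # plain counts with separators
}
```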
## Color conventions (if no style guidance)
- Blue: user input
- Black: formulas/derived values
- Green: linked/imported values
- Gray: static constants
- Orange: review/caution
- Light red: error/flag
- Purple: control/logic
- Teal: visualization anchors (key KPIs or chart drivers)
## Finance-specific requirements
- Format zeros as "-".
- Negative numbers should be red and in parentheses.
- Always specify units in headers (e.g., "Revenue ($mm)").
- Cite sources for all raw inputs in cell comments.
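The zero and negative-number conventions map onto a single Excel custom format string (positive;negative;zero sections); a sketch, with a tiny pure-Python mirror of the three sections for quick checks:

```python
# positive;negative;zero: plain, red parenthesized, dash
FIN_FORMAT = '#,##0;[Red](#,##0);"-"'
# with openpyxl: cell.number_format = FIN_FORMAT

def fin_render(value: float) -> str:
    """Pure-Python mirror of FIN_FORMAT, useful in tests or logs;
    Excel itself applies the real formatting from the string above."""
    if value > 0:
        return f"{value:,.0f}"
    if value < 0:
        return f"({abs(value):,.0f})"
    return "-"
```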
## Investment banking layouts
If the spreadsheet is an IB-style model (LBO, DCF, 3-statement, valuation):
- Totals should sum the range directly above.
- Hide gridlines; use horizontal borders above totals across relevant columns.
- Section headers should be merged cells with dark fill and white text.
- Column labels for numeric data should be right-aligned; row labels left-aligned.
- Indent submetrics under their parent line items.
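The layout rules above can be sketched with openpyxl (sheet name, colors, and cell positions are illustrative assumptions):

```python
from io import BytesIO

from openpyxl import Workbook
from openpyxl.styles import Border, Font, PatternFill, Side

wb = Workbook()
ws = wb.active
ws.sheet_view.showGridLines = False          # hide gridlines

ws.merge_cells("A1:D1")                      # section header: merged, dark fill, white text
hdr = ws["A1"]
hdr.value = "Revenue Build ($mm)"
hdr.font = Font(color="FFFFFF", bold=True)
hdr.fill = PatternFill(fill_type="solid", fgColor="1F3864")

for row, (label, value) in enumerate([("Product", 120), ("Services", 45)], start=2):
    ws.cell(row=row, column=1, value="  " + label)     # indent submetrics
    ws.cell(row=row, column=4, value=value)

ws.cell(row=4, column=1, value="Total revenue")
total = ws.cell(row=4, column=4, value="=SUM(D2:D3)")  # total sums the range directly above
total.border = Border(top=Side(style="thin"))          # horizontal border above the total

buf = BytesIO()
wb.save(buf)
```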


@@ -1,6 +0,0 @@
interface:
display_name: "Spreadsheet Skill (Create, Edit, Analyze, Visualize)"
short_description: "Create, edit, and analyze spreadsheets"
icon_small: "./assets/spreadsheet-small.svg"
icon_large: "./assets/spreadsheet.png"
default_prompt: "Create or update a spreadsheet for this task with the right formulas, structure, and formatting."


@@ -1,3 +0,0 @@
<svg xmlns="http://www.w3.org/2000/svg" width="16" height="16" fill="currentColor" viewBox="0 0 16 16">
<path fill="currentColor" fill-rule="evenodd" d="M10.467 2.468c.551 0 .997 0 1.357.029.366.03.691.093.992.247l.175.097c.396.244.72.593.932 1.01l.054.114c.115.269.166.558.192.878.03.36.03.806.03 1.357v3.6c0 .551 0 .997-.03 1.357a2.76 2.76 0 0 1-.192.878l-.054.114a2.534 2.534 0 0 1-.932 1.01l-.175.097c-.3.154-.626.217-.992.247-.36.03-.806.029-1.357.029H5.534c-.552 0-.997 0-1.357-.029a2.764 2.764 0 0 1-.879-.194l-.114-.053a2.534 2.534 0 0 1-1.009-.932l-.098-.175c-.153-.301-.217-.626-.247-.992-.029-.36-.028-.806-.028-1.357V6.2c0-.551-.001-.997.028-1.357.03-.366.094-.69.247-.992a2.53 2.53 0 0 1 1.107-1.107c.302-.154.626-.217.993-.247.36-.03.805-.029 1.357-.029h4.933Zm-3.935 4.73v5.27h3.935c.569 0 .964 0 1.27-.026.3-.024.47-.07.597-.134l.1-.056c.23-.142.418-.344.541-.586l.045-.104a2 2 0 0 0 .09-.492 18 18 0 0 0 .025-1.27V7.198H6.532ZM2.866 9.8c0 .569 0 .963.025 1.27.025.3.07.47.135.596l.056.101c.141.23.343.418.585.54l.104.046c.115.041.267.07.492.09.295.023.671.024 1.205.024V7.198H2.866V9.8Zm3.666-3.666h6.603c0-.533-.002-.91-.026-1.204a1.933 1.933 0 0 0-.09-.493l-.044-.103a1.468 1.468 0 0 0-.54-.586l-.101-.056c-.127-.064-.296-.11-.596-.134a17.303 17.303 0 0 0-1.27-.026H6.531v2.602ZM5.468 3.532c-.534 0-.91.002-1.205.026-.3.024-.47.07-.596.134-.276.14-.5.365-.641.642-.065.126-.11.295-.135.596-.024.295-.025.67-.025 1.204h2.602V3.532Z" clip-rule="evenodd"/>
</svg>


Binary file not shown.


View File

@@ -1,51 +0,0 @@
"""Create a basic spreadsheet with two sheets and a simple formula.
Usage:
python3 create_basic_spreadsheet.py --output /tmp/basic_spreadsheet.xlsx
"""
from __future__ import annotations
import argparse
from pathlib import Path
from openpyxl import Workbook
from openpyxl.utils import get_column_letter
def main() -> None:
parser = argparse.ArgumentParser(description="Create a basic spreadsheet with example data.")
parser.add_argument(
"--output",
type=Path,
default=Path("basic_spreadsheet.xlsx"),
help="Output .xlsx path (default: basic_spreadsheet.xlsx)",
)
args = parser.parse_args()
wb = Workbook()
overview = wb.active
overview.title = "Overview"
employees = wb.create_sheet("Employees")
overview["A1"] = "Description"
overview["A2"] = "Awesome Company Report"
employees.append(["Title", "Name", "Address", "Score"])
employees.append(["Engineer", "Vicky", "90 50th Street", 98])
employees.append(["Manager", "Alex", "500 Market Street", 92])
employees.append(["Designer", "Jordan", "200 Pine Street", 88])
employees["A6"] = "Total Score"
employees["D6"] = "=SUM(D2:D4)"
for col in range(1, 5):
employees.column_dimensions[get_column_letter(col)].width = 20
args.output.parent.mkdir(parents=True, exist_ok=True)
wb.save(args.output)
print(f"Saved workbook to {args.output}")
if __name__ == "__main__":
main()

View File

@@ -1,96 +0,0 @@
"""Generate a styled games scoreboard workbook using openpyxl.
Usage:
python3 create_spreadsheet_with_styling.py --output /tmp/GamesSimpleStyling.xlsx
"""
from __future__ import annotations
import argparse
from pathlib import Path
from openpyxl import Workbook
from openpyxl.formatting.rule import FormulaRule
from openpyxl.styles import Alignment, Font, PatternFill
from openpyxl.utils import get_column_letter
HEADER_FILL_HEX = "B7E1CD"
HIGHLIGHT_FILL_HEX = "FFF2CC"
def apply_header_style(cell, fill_hex: str) -> None:
cell.fill = PatternFill("solid", fgColor=fill_hex)
cell.font = Font(bold=True)
cell.alignment = Alignment(horizontal="center", vertical="center")
def apply_highlight_style(cell, fill_hex: str) -> None:
cell.fill = PatternFill("solid", fgColor=fill_hex)
cell.font = Font(bold=True)
cell.alignment = Alignment(horizontal="center", vertical="center")
def populate_game_sheet(ws) -> None:
ws.title = "GameX"
ws.row_dimensions[2].height = 24
widths = {"B": 18, "C": 14, "D": 14, "E": 14, "F": 40}
for col, width in widths.items():
ws.column_dimensions[col].width = width
headers = ["", "Name", "Game 1 Score", "Game 2 Score", "Total Score", "Notes", ""]
for idx, value in enumerate(headers, start=1):
cell = ws.cell(row=2, column=idx, value=value)
if value:
apply_header_style(cell, HEADER_FILL_HEX)
players = [
("Vicky", 12, 30, "Dominated the minigames."),
("Yash", 20, 10, "Emily main with strong defense."),
("Bobby", 1000, 1030, "Numbers look suspiciously high."),
]
for row_idx, (name, g1, g2, note) in enumerate(players, start=3):
ws.cell(row=row_idx, column=2, value=name)
ws.cell(row=row_idx, column=3, value=g1)
ws.cell(row=row_idx, column=4, value=g2)
ws.cell(row=row_idx, column=5, value=f"=SUM(C{row_idx}:D{row_idx})")
ws.cell(row=row_idx, column=6, value=note)
ws.cell(row=7, column=2, value="Winner")
ws.cell(row=7, column=3, value="=INDEX(B3:B5, MATCH(MAX(E3:E5), E3:E5, 0))")
ws.cell(row=7, column=5, value="Congrats!")
ws.merge_cells("C7:D7")
for col in range(2, 6):
apply_highlight_style(ws.cell(row=7, column=col), HIGHLIGHT_FILL_HEX)
rule = FormulaRule(formula=["LEN(A2)>0"], fill=PatternFill("solid", fgColor=HEADER_FILL_HEX))
ws.conditional_formatting.add("A2:G2", rule)
def main() -> None:
parser = argparse.ArgumentParser(description="Create a styled games scoreboard workbook.")
parser.add_argument(
"--output",
type=Path,
default=Path("GamesSimpleStyling.xlsx"),
help="Output .xlsx path (default: GamesSimpleStyling.xlsx)",
)
args = parser.parse_args()
wb = Workbook()
ws = wb.active
populate_game_sheet(ws)
for col in range(1, 8):
col_letter = get_column_letter(col)
if col_letter not in ws.column_dimensions:
ws.column_dimensions[col_letter].width = 12
args.output.parent.mkdir(parents=True, exist_ok=True)
wb.save(args.output)
print(f"Saved workbook to {args.output}")
if __name__ == "__main__":
main()

View File

@@ -1,59 +0,0 @@
"""Read an existing .xlsx and print a small summary.
If --input is not provided, this script creates a tiny sample workbook in /tmp
and reads that instead.
"""
from __future__ import annotations
import argparse
import tempfile
from pathlib import Path
from openpyxl import Workbook, load_workbook
def create_sample(path: Path) -> Path:
wb = Workbook()
ws = wb.active
ws.title = "Sample"
ws.append(["Item", "Qty", "Price"])
ws.append(["Apples", 3, 1.25])
ws.append(["Oranges", 2, 0.95])
ws.append(["Bananas", 5, 0.75])
ws["D1"] = "Total"
ws["D2"] = "=B2*C2"
ws["D3"] = "=B3*C3"
ws["D4"] = "=B4*C4"
wb.save(path)
return path
def main() -> None:
parser = argparse.ArgumentParser(description="Read an existing spreadsheet.")
parser.add_argument("--input", type=Path, help="Path to an .xlsx file")
args = parser.parse_args()
if args.input:
input_path = args.input
else:
tmp_dir = Path(tempfile.gettempdir())
input_path = tmp_dir / "sample_read_existing.xlsx"
create_sample(input_path)
wb = load_workbook(input_path, data_only=False)
print(f"Loaded: {input_path}")
print("Sheet names:", wb.sheetnames)
for name in wb.sheetnames:
ws = wb[name]
max_row = ws.max_row or 0
max_col = ws.max_column or 0
print(f"\n== {name} (rows: {max_row}, cols: {max_col})")
for row in ws.iter_rows(min_row=1, max_row=min(max_row, 5), max_col=min(max_col, 5)):
values = [cell.value for cell in row]
print(values)
if __name__ == "__main__":
main()

View File

@@ -1,79 +0,0 @@
"""Create a styled spreadsheet with headers, borders, and a total row.
Usage:
python3 styling_spreadsheet.py --output /tmp/styling_spreadsheet.xlsx
"""
from __future__ import annotations
import argparse
from pathlib import Path
from openpyxl import Workbook
from openpyxl.styles import Alignment, Border, Font, PatternFill, Side
def main() -> None:
parser = argparse.ArgumentParser(description="Create a styled spreadsheet example.")
parser.add_argument(
"--output",
type=Path,
default=Path("styling_spreadsheet.xlsx"),
help="Output .xlsx path (default: styling_spreadsheet.xlsx)",
)
args = parser.parse_args()
wb = Workbook()
ws = wb.active
ws.title = "FirstGame"
ws.merge_cells("B2:E2")
ws["B2"] = "Name | Game 1 Score | Game 2 Score | Total Score"
header_fill = PatternFill("solid", fgColor="B7E1CD")
header_font = Font(bold=True)
header_alignment = Alignment(horizontal="center", vertical="center")
ws["B2"].fill = header_fill
ws["B2"].font = header_font
ws["B2"].alignment = header_alignment
ws["B3"] = "Vicky"
ws["C3"] = 50
ws["D3"] = 60
ws["E3"] = "=C3+D3"
ws["B4"] = "John"
ws["C4"] = 40
ws["D4"] = 50
ws["E4"] = "=C4+D4"
ws["B5"] = "Jane"
ws["C5"] = 30
ws["D5"] = 40
ws["E5"] = "=C5+D5"
ws["B6"] = "Jim"
ws["C6"] = 20
ws["D6"] = 30
ws["E6"] = "=C6+D6"
ws.merge_cells("B9:E9")
ws["B9"] = "=SUM(E3:E6)"
thin = Side(style="thin")
border = Border(top=thin, bottom=thin, left=thin, right=thin)
ws["B9"].border = border
ws["B9"].alignment = Alignment(horizontal="center")
ws["B9"].font = Font(bold=True)
for col in ("B", "C", "D", "E"):
ws.column_dimensions[col].width = 18
ws.row_dimensions[2].height = 24
args.output.parent.mkdir(parents=True, exist_ok=True)
wb.save(args.output)
print(f"Saved workbook to {args.output}")
if __name__ == "__main__":
main()

View File

@@ -1,201 +0,0 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf of
any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

View File

@@ -1,28 +0,0 @@
---
name: "yeet"
description: "Use only when the user explicitly asks to stage, commit, push, and open a GitHub pull request in one flow using the GitHub CLI (`gh`)."
---
## Prerequisites
- Require GitHub CLI `gh`. Check `gh --version`. If missing, ask the user to install `gh` and stop.
- Require authenticated `gh` session. Run `gh auth status`. If not authenticated, ask the user to run `gh auth login` (and re-run `gh auth status`) before continuing.
## Naming conventions
- Branch: `codex/{description}` when starting from main/master/default.
- Commit: `{description}` (terse).
- PR title: `[codex] {description}` summarizing the full diff.
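The naming conventions above can be sketched as a small helper. The slug rule (lowercasing and hyphenating the description for the branch name) is an assumption for illustration; the skill itself only specifies the `codex/` and `[codex]` prefixes:

```python
def yeet_names(description: str) -> dict:
    """Derive branch, commit, and PR title from a short description.

    Slugging (lowercase, spaces -> hyphens) is an assumed convention,
    since branch names cannot contain spaces.
    """
    slug = "-".join(description.lower().split())
    return {
        "branch": f"codex/{slug}",            # used when starting from the default branch
        "commit": description,                # terse commit message
        "pr_title": f"[codex] {description}", # summarizes the full diff
    }
```

For example, `yeet_names("fix retry logic")` yields branch `codex/fix-retry-logic`, commit `fix retry logic`, and PR title `[codex] fix retry logic`.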
## Workflow
- If on main/master/default, create a branch: `git checkout -b "codex/{description}"`
- Otherwise stay on the current branch.
- Confirm status, then stage everything: `git status -sb` then `git add -A`.
- Commit tersely with the description: `git commit -m "{description}"`
- Run checks if they have not already been run. If checks fail due to missing deps/tools, install the dependencies and rerun once.
- Push with tracking: `git push -u origin $(git branch --show-current)`
- If git push fails due to workflow auth errors, pull from master and retry the push.
- Open a PR and edit title/body to reflect the description and the deltas: `GH_PROMPT_DISABLED=1 GIT_TERMINAL_PROMPT=0 gh pr create --draft --fill --head $(git branch --show-current)`
- Write the PR description to a temp file with real newlines (e.g. `cat > pr-body.md <<'EOF' ... EOF`) and apply it with `gh pr edit --body-file pr-body.md` to avoid \n-escaped markdown.
- PR description (markdown) must be detailed prose covering the issue, the cause and effect on users, the root cause, the fix, and any tests or checks used to validate.

View File

@@ -1,6 +0,0 @@
interface:
display_name: "Yeet"
short_description: "Stage, commit, and open PR"
icon_small: "./assets/yeet-small.svg"
icon_large: "./assets/yeet.png"
default_prompt: "Prepare this branch for review: stage intended changes, write a focused commit, and open a PR."

View File

@@ -1,3 +0,0 @@
<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" fill="currentColor" viewBox="0 0 14 14">
<path d="M6.873 11.887a1.06 1.06 0 0 1-.813.38.971.971 0 0 1-.8-.34c-.2-.231-.242-.551-.127-.96l.58-2.107h-2.78c-.31 0-.553-.087-.726-.26a.896.896 0 0 1-.254-.64c0-.253.085-.482.254-.687l4.9-5.82c.209-.244.477-.37.806-.38.334-.008.6.105.8.34.205.236.25.556.134.96l-.6 2.147h2.786c.307 0 .547.089.72.267a.859.859 0 0 1 .267.64c0 .253-.087.48-.26.68l-4.887 5.78Zm4.054-6.22a.178.178 0 0 0 .04-.107.091.091 0 0 0-.034-.087.154.154 0 0 0-.113-.04H7.733a.544.544 0 0 1-.32-.093.467.467 0 0 1-.173-.253.663.663 0 0 1 .007-.36l.64-2.314c.017-.066.01-.117-.02-.153a.136.136 0 0 0-.127-.053.18.18 0 0 0-.127.073l-4.58 5.433a.225.225 0 0 0-.046.107c0 .036.013.065.04.087.026.022.066.033.12.033h3.046c.143 0 .258.031.347.093a.447.447 0 0 1 .187.26.664.664 0 0 1-.007.36l-.627 2.274c-.017.066-.01.12.02.16a.135.135 0 0 0 .12.046.194.194 0 0 0 .134-.066l4.56-5.4Z"/>
</svg>


Binary file not shown.


View File

@@ -27,7 +27,8 @@
"Bash(pkill -f \"python3 -m http.server 9876\")",
"Read(//home/sudacode/.claude/plugins/cache/claude-plugins-official/superpowers/5.0.5/skills/brainstorming/**)",
"Bash(/home/sudacode/.claude/plugins/cache/claude-plugins-official/superpowers/5.0.5/skills/brainstorming/scripts/start-server.sh --project-dir /home/sudacode/projects/japanese/SubMiner)",
"Read(//home/sudacode/**)"
"Read(//home/sudacode/**)",
"Bash(grep -E \"^d|\\\\.md$\")"
],
"deny": [
"Bash(curl *)",
@@ -78,5 +79,6 @@
]
},
"voiceEnabled": true,
"skipDangerousModePermissionPrompt": true
"skipDangerousModePermissionPrompt": true,
"model": "sonnet"
}

View File

@@ -19,10 +19,18 @@ web_request = true
skills = true
shell_snapshot = true
multi_agent = true
js_repl = true
[mcp_servers.deepwiki]
url = "https://mcp.deepwiki.com/mcp"
enabled = true
[mcp_servers.backlog]
command = "backlog"
args = ["mcp", "start"]
[mcp_servers.playwright]
command = "npx"
args = ["@playwright/mcp@latest"]
[projects."/home/sudacode/projects"]
trust_level = "trusted"
@@ -171,10 +179,5 @@ trust_level = "trusted"
[notice.model_migrations]
"gpt-5.3-codex" = "gpt-5.4"
[mcp_servers.backlog]
command = "backlog"
args = ["mcp", "start"]
[mcp_servers.playwright]
args = ["@playwright/mcp@latest"]
command = "npx"
[plugins."github@openai-curated"]
enabled = true

View File

@@ -173,6 +173,12 @@
"eng"
]
},
"youtube": {
"primarySubLanguages": [
"ja",
"jpn"
]
},
"subsync": {
"defaultMode": "manual",
"alass_path": null,
@@ -235,22 +241,6 @@
"languagePreference": "ja",
"maxEntryResults": 10
},
"youtubeSubgen": {
"mode": "automatic",
"whisperBin": "whisper-cli",
"whisperModel": "~/models/whisper.cpp/ggml-medium.bin",
"whisperVadModel": "~/models/ggml-silero-v6.2.0.bin",
"whisperThreads": 8,
"fixWithAi": true,
"ai": {
"model": "google/gemini-2.5-flash-lite",
"systemPrompt": "Fix transcription mistakes only. Preserve the original language exactly. Do not translate, paraphrase, summarize, merge, split, reorder, or omit cues. Preserve cue numbering, cue count, timestamps, line breaks within each cue, and valid SRT formatting exactly. Return only corrected SRT."
},
"primarySubLanguages": [
"ja",
"jpn"
]
},
"anilist": {
"characterDictionary": {
"enabled": true,
@@ -343,4 +333,4 @@
"preferredGamepadId": "8BitDo 8BitDo Ultimate 2 Wireless Controller for PC (Vendor: 2dc8 Product: 310b)",
"preferredGamepadLabel": "8BitDo 8BitDo Ultimate 2 Wireless Controller for PC (Vendor: 2dc8 Product: 310b)"
}
}
}

View File

@@ -105,6 +105,7 @@ bind = SUPER SHIFT, j, exec, "$HOME/.config/rofi/scripts/rofi-jellyfin-dir.sh"
bind = SUPER, t, exec, "$HOME/.config/rofi/scripts/rofi-launch-texthooker-steam.sh"
bind = $mainMod SHIFT, t, exec, "$HOME/projects/scripts/popup-ai-translator.py"
bind = SUPER SHIFT, g, exec, "$HOME/.config/rofi/scripts/rofi-vn-helper.sh"
bind = $mainMod SHIFT, i, exec, "$HOME/.config/rofi/scripts/rofi-image-browser.sh"
# ncmpcpp
bind = $mainMod, n, exec, uwsm app -sb -- ghostty --command=/usr/bin/ncmpcpp

View File

@@ -98,3 +98,7 @@ windowrule = no_shadow on, match:class feh
windowrule = no_blur on, match:class feh
windowrule = no_anim on, match:class feh
# }}}
windowrule = float on, match:title Picture in picture
windowrule = pin on, match:title Picture in picture

View File

@@ -2,20 +2,20 @@
application/epub+zip=calibre-ebook-viewer.desktop;calibre-ebook-edit.desktop;opencomic.desktop;
application/json=notepadqq.desktop;
application/octet-stream=nvim.desktop;vim.desktop;emacsclient.desktop;
application/pdf=okularApplication_pdf.desktop;zen.desktop;microsoft-edge-beta.desktop;org.inkscape.Inkscape.desktop;chromium.desktop;
application/pdf=okularApplication_pdf.desktop;helium.desktop;zen.desktop;microsoft-edge-beta.desktop;org.inkscape.Inkscape.desktop;chromium.desktop;
application/rss+xml=fluent-reader.desktop;
application/sql=notepadqq.desktop;nvim.desktop;gvim.desktop;
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet=libreoffice-calc.desktop;ms-office-online.desktop;
application/x-desktop=nvim.desktop;
application/x-extension-htm=zen.desktop;
application/x-extension-html=zen.desktop;
application/x-extension-shtml=zen.desktop;
application/x-extension-xht=zen.desktop;
application/x-extension-xhtml=zen.desktop;
application/x-extension-htm=helium.desktop;zen.desktop;
application/x-extension-html=helium.desktop;zen.desktop;
application/x-extension-shtml=helium.desktop;zen.desktop;
application/x-extension-xht=helium.desktop;zen.desktop;
application/x-extension-xhtml=helium.desktop;zen.desktop;
application/x-ms-dos-executable=wine.desktop;
application/x-ms-shortcut=wine.desktop;
application/x-yaml=notepadqq.desktop;nvim.desktop;
application/xhtml+xml=zen.desktop;microsoft-edge-beta.desktop;qutebrowser.desktop;
application/xhtml+xml=helium.desktop;zen.desktop;microsoft-edge-beta.desktop;qutebrowser.desktop;
application/zip=org.gnome.FileRoller.desktop;
audio/aac=mpv.desktop;
audio/mp4=mpv.desktop;
@@ -37,14 +37,14 @@ audio/x-vorbis+ogg=mpv.desktop;
audio/x-wav=mpv.desktop;
image/avif=okularApplication_kimgio.desktop;
image/bmp=okularApplication_kimgio.desktop;
image/gif=org.gnome.gThumb.desktop;zen.desktop;gimp.desktop;org.kde.gwenview.desktop;okularApplication_kimgio.desktop;
image/gif=org.gnome.gThumb.desktop;helium.desktop;zen.desktop;gimp.desktop;org.kde.gwenview.desktop;okularApplication_kimgio.desktop;
image/heif=okularApplication_kimgio.desktop;
image/jpeg=okularApplication_kimgio.desktop;
image/png=okularApplication_kimgio.desktop;org.gnome.gThumb.desktop;feh.desktop;gimp.desktop;org.kde.gwenview.desktop;
image/webp=okularApplication_kimgio.desktop;
inode/directory=thunar.desktop;
text/csv=libreoffice-calc.desktop;
text/html=zen.desktop;
text/html=helium.desktop;zen.desktop;
text/javascript=notepadqq.desktop;
text/plain=notepadqq.desktop;nvim.desktop;vim.desktop;okularApplication_txt.desktop;xed.desktop;
text/vnd.trolltech.linguist=mpv.desktop;
@@ -54,11 +54,11 @@ video/webm=mpv.desktop;vlc.desktop;io.github.celluloid_player.Celluloid.desktop;
video/x-matroska=mpv.desktop;vlc.desktop;
x-scheme-handler/betterdiscord=discord.desktop;
x-scheme-handler/bitwarden=bitwarden.desktop;Bitwarden.desktop;
x-scheme-handler/chrome=zen.desktop;
x-scheme-handler/chrome=helium.desktop;zen.desktop;
x-scheme-handler/exodus=Exodus.desktop;
x-scheme-handler/geo=google-maps-geo-handler.desktop;
x-scheme-handler/http=zen.desktop;firefox.desktop;microsoft-edge-beta.desktop;zen.desktop;
x-scheme-handler/https=zen.desktop;firefox.desktop;microsoft-edge-beta.desktop;zen.desktop;
x-scheme-handler/http=helium.desktop;zen.desktop;firefox.desktop;microsoft-edge-beta.desktop;helium.desktop;zen.desktop;
x-scheme-handler/https=helium.desktop;zen.desktop;firefox.desktop;microsoft-edge-beta.desktop;helium.desktop;zen.desktop;
x-scheme-handler/mailspring=Mailspring.desktop;
x-scheme-handler/mailto=org.mozilla.Thunderbird.desktop;Mailspring.desktop;userapp-Thunderbird-6JYZ12.desktop;
x-scheme-handler/mid=userapp-Thunderbird-6JYZ12.desktop;
@@ -72,11 +72,11 @@ x-scheme-handler/tradingview=tradingview.desktop;TradingView.desktop;
application/x-wine-extension-ini=nvim.desktop;
[Default Applications]
application/x-extension-htm=zen.desktop
application/x-extension-html=zen.desktop
application/x-extension-shtml=zen.desktop
application/x-extension-xht=zen.desktop
application/x-extension-xhtml=zen.desktop
application/x-extension-htm=helium.desktop;zen.desktop
application/x-extension-html=helium.desktop;zen.desktop
application/x-extension-shtml=helium.desktop;zen.desktop
application/x-extension-xht=helium.desktop;zen.desktop
application/x-extension-xhtml=helium.desktop;zen.desktop
audio/aac=mpv.desktop;
audio/mp4=mpv.desktop;
audio/mpeg=mpv.desktop;
@@ -112,8 +112,8 @@ x-scheme-handler/discord-712465656758665259=discord-712465656758665259.desktop
x-scheme-handler/eclipse+command=_usr_lib_dbeaver_.desktop
x-scheme-handler/exodus=Exodus.desktop
x-scheme-handler/geo=google-maps-geo-handler.desktop;
x-scheme-handler/http=zen.desktop;
x-scheme-handler/https=zen.desktop;
x-scheme-handler/http=helium.desktop;zen.desktop;
x-scheme-handler/https=helium.desktop;zen.desktop;
x-scheme-handler/mailspring=Mailspring.desktop
x-scheme-handler/mailto=Mailspring.desktop
x-scheme-handler/mid=userapp-Thunderbird-6JYZ12.desktop
@@ -125,7 +125,7 @@ x-scheme-handler/termius=Termius.desktop
x-scheme-handler/tg=org.telegram.desktop.desktop
x-scheme-handler/tonsite=org.telegram.desktop.desktop
x-scheme-handler/tradingview=tradingview.desktop
x-scheme-handler/webcal=zen.desktop
x-scheme-handler/webcal=helium.desktop;zen.desktop
video/webm=mpv.desktop
video/x-matroska=mpv.desktop
video/x-ms-wmv=mpv.desktop
@@ -144,14 +144,15 @@ video/x-theora+ogg=mpv.desktop
video/mpeg=mpv.desktop
video/vnd.mpegurl=mpv.desktop
video/3gpp=mpv.desktop
application/json=zen.desktop
application/xhtml+xml=zen.desktop
application/x-xpinstall=zen.desktop
application/xml=zen.desktop
application/pdf=zen.desktop
text/html=zen.desktop
text/vnd.trolltech.linguist=zen.desktop
application/json=helium.desktop;zen.desktop
application/xhtml+xml=helium.desktop;zen.desktop
application/x-xpinstall=helium.desktop;zen.desktop
application/xml=helium.desktop;zen.desktop
application/pdf=helium.desktop;zen.desktop
text/html=helium.desktop;zen.desktop
text/vnd.trolltech.linguist=helium.desktop;zen.desktop
x-scheme-handler/nxm=modorganizer2-nxm-handler.desktop
x-scheme-handler/discord-1361252452329848892=discord-1361252452329848892.desktop
x-scheme-handler/opencode=opencode-desktop-handler.desktop
x-scheme-handler/subminer=subminer.desktop
x-scheme-handler/claude-cli=claude-code-url-handler.desktop

View File

@@ -1,5 +1,5 @@
{
"browser": "zen-browser",
"browser": "helium-browser",
"default_open_type": "tab",
"options": [
"Anilist - https://anilist.co/home",

View File

@@ -1,4 +1,4 @@
export BROWSER=zen-browser
export BROWSER=helium-browser
export XCURSOR_THEME=dracula
export XCURSOR_SIZE=24
export GDK_SCALE=1