mirror of
https://github.com/ksyasuda/dotfiles.git
synced 2026-03-20 06:11:27 -07:00
123 lines
5.0 KiB
Markdown
123 lines
5.0 KiB
Markdown
---
|
|
name: "spreadsheet"
|
|
description: "Use when tasks involve creating, editing, analyzing, or formatting spreadsheets (`.xlsx`, `.csv`, `.tsv`) using Python (`openpyxl`, `pandas`), especially when formulas, references, and formatting need to be preserved and verified."
|
|
---
|
|
|
|
|
|
# Spreadsheet Skill (Create, Edit, Analyze, Visualize)
|
|
|
|
## When to use
|
|
- Build new workbooks with formulas, formatting, and structured layouts.
|
|
- Read or analyze tabular data (filter, aggregate, pivot, compute metrics).
|
|
- Modify existing workbooks without breaking formulas or references.
|
|
- Visualize data with charts/tables and sensible formatting.
|
|
|
|
IMPORTANT: System and user instructions always take precedence.
|
|
|
|
## Workflow
|
|
1. Confirm the file type and goals (create, edit, analyze, visualize).
|
|
2. Use `openpyxl` for `.xlsx` edits and `pandas` for analysis and CSV/TSV workflows.
|
|
3. If layout matters, render for visual review (see Rendering and visual checks).
|
|
4. Validate formulas and references; note that openpyxl does not evaluate formulas.
|
|
5. Save outputs and clean up intermediate files.
|
|
|
|
## Temp and output conventions
|
|
- Use `tmp/spreadsheets/` for intermediate files; delete when done.
|
|
- Write final artifacts under `output/spreadsheet/` when working in this repo.
|
|
- Keep filenames stable and descriptive.
|
|
|
|
## Primary tooling
|
|
- Use `openpyxl` for creating/editing `.xlsx` files and preserving formatting.
|
|
- Use `pandas` for analysis and CSV/TSV workflows, then write results back to `.xlsx` or `.csv`.
|
|
- If you need charts, prefer `openpyxl.chart` for native Excel charts.
|
|
|
|
## Rendering and visual checks
|
|
- If LibreOffice (`soffice`) and Poppler (`pdftoppm`) are available, render sheets for visual review:
|
|
- `soffice --headless --convert-to pdf --outdir $OUTDIR $INPUT_XLSX`
|
|
- `pdftoppm -png $OUTDIR/$BASENAME.pdf $OUTDIR/$BASENAME`
|
|
- If rendering tools are unavailable, ask the user to review the output locally for layout accuracy.
|
|
|
|
## Dependencies (install if missing)
|
|
Prefer `uv` for dependency management.
|
|
|
|
Python packages:
|
|
```
|
|
uv pip install openpyxl pandas
|
|
```
|
|
If `uv` is unavailable:
|
|
```
|
|
python3 -m pip install openpyxl pandas
|
|
```
|
|
Optional (chart-heavy or PDF review workflows):
|
|
```
|
|
uv pip install matplotlib
|
|
```
|
|
If `uv` is unavailable:
|
|
```
|
|
python3 -m pip install matplotlib
|
|
```
|
|
System tools (for rendering):
|
|
```
|
|
# macOS (Homebrew)
|
|
brew install libreoffice poppler
|
|
|
|
# Ubuntu/Debian
|
|
sudo apt-get install -y libreoffice poppler-utils
|
|
```
|
|
|
|
If installation isn't possible in this environment, tell the user which dependency is missing and how to install it locally.
|
|
|
|
## Environment
|
|
No required environment variables.
|
|
|
|
## Examples
|
|
- Runnable Codex examples (openpyxl): `references/examples/openpyxl/`
|
|
|
|
## Formula requirements
|
|
- Use formulas for derived values rather than hardcoding results.
|
|
- Keep formulas simple and legible; use helper cells for complex logic.
|
|
- Avoid volatile functions like INDIRECT and OFFSET unless required.
|
|
- Prefer cell references over magic numbers (e.g., `=H6*(1+$B$3)` not `=H6*1.04`).
|
|
- Guard against errors (#REF!, #DIV/0!, #VALUE!, #N/A, #NAME?) with validation and checks.
|
|
- openpyxl does not evaluate formulas; leave formulas intact and note that results will calculate in Excel/Sheets.
|
|
|
|
## Citation requirements
|
|
- Cite sources inside the spreadsheet using plain text URLs.
|
|
- For financial models, cite sources of inputs in cell comments.
|
|
- For tabular data sourced from the web, include a Source column with URLs.
|
|
|
|
## Formatting requirements (existing formatted spreadsheets)
|
|
- Render and inspect a provided spreadsheet before modifying it when possible.
|
|
- Preserve existing formatting and style exactly.
|
|
- Match styles for any newly filled cells that were previously blank.
|
|
|
|
## Formatting requirements (new or unstyled spreadsheets)
|
|
- Use appropriate number and date formats (dates as dates, currency with symbols, percentages with sensible precision).
|
|
- Use a clean visual layout: headers distinct from data, consistent spacing, and readable column widths.
|
|
- Avoid borders around every cell; use whitespace and selective borders to structure sections.
|
|
- Ensure text does not spill into adjacent cells.
|
|
|
|
## Color conventions (if no style guidance)
|
|
- Blue: user input
|
|
- Black: formulas/derived values
|
|
- Green: linked/imported values
|
|
- Gray: static constants
|
|
- Orange: review/caution
|
|
- Light red: error/flag
|
|
- Purple: control/logic
|
|
- Teal: visualization anchors (key KPIs or chart drivers)
|
|
|
|
## Finance-specific requirements
|
|
- Format zeros as "-".
|
|
- Negative numbers should be red and in parentheses.
|
|
- Always specify units in headers (e.g., "Revenue ($mm)").
|
|
- Cite sources for all raw inputs in cell comments.
|
|
|
|
## Investment banking layouts
|
|
If the spreadsheet is an IB-style model (LBO, DCF, 3-statement, valuation):
|
|
- Totals should sum the range directly above.
|
|
- Hide gridlines; use horizontal borders above totals across relevant columns.
|
|
- Section headers should be merged cells with dark fill and white text.
|
|
- Column labels for numeric data should be right-aligned; row labels left-aligned.
|
|
- Indent submetrics under their parent line items.
|