From b6e3c2169bbcb34a23dedf8242794f69d5407f25 Mon Sep 17 00:00:00 2001 From: sudacode Date: Wed, 27 May 2026 11:40:11 -0700 Subject: [PATCH] update --- .agents/skills/pdf/LICENSE.txt | 30 - .agents/skills/pdf/SKILL.md | 314 ----- .agents/skills/pdf/agents/openai.yaml | 5 - .agents/skills/pdf/assets/pdf.png | Bin 1312 -> 0 bytes .agents/skills/pdf/forms.md | 294 ----- .agents/skills/pdf/reference.md | 612 --------- .../pdf/scripts/check_bounding_boxes.py | 65 - .../pdf/scripts/check_fillable_fields.py | 11 - .../pdf/scripts/convert_pdf_to_images.py | 33 - .../pdf/scripts/create_validation_image.py | 37 - .../pdf/scripts/extract_form_field_info.py | 122 -- .../pdf/scripts/extract_form_structure.py | 115 -- .../pdf/scripts/fill_fillable_fields.py | 98 -- .../scripts/fill_pdf_form_with_annotations.py | 107 -- .agents/skills/screenshot/LICENSE.txt | 201 --- .agents/skills/screenshot/SKILL.md | 267 ---- .agents/skills/screenshot/agents/openai.yaml | 6 - .../screenshot/assets/screenshot-small.svg | 5 - .../skills/screenshot/assets/screenshot.png | Bin 860 -> 0 bytes .../scripts/ensure_macos_permissions.sh | 54 - .../scripts/macos_display_info.swift | 22 - .../scripts/macos_permissions.swift | 40 - .../scripts/macos_window_info.swift | 126 -- .../screenshot/scripts/take_screenshot.ps1 | 163 --- .../screenshot/scripts/take_screenshot.py | 585 --------- .../security-best-practices/LICENSE.txt | 201 --- .../skills/security-best-practices/SKILL.md | 86 -- .../agents/openai.yaml | 4 - .../golang-general-backend-security.md | 826 ------------ .../javascript-express-web-server-security.md | 1158 ----------------- ...avascript-general-web-frontend-security.md | 747 ----------- ...javascript-jquery-web-frontend-security.md | 678 ---------- ...t-typescript-nextjs-web-server-security.md | 1144 ---------------- ...-typescript-react-web-frontend-security.md | 990 -------------- ...pt-typescript-vue-web-frontend-security.md | 791 ----------- .../python-django-web-server-security.md | 
882 ------------- .../python-fastapi-web-server-security.md | 1036 --------------- .../python-flask-web-server-security.md | 705 ---------- .../skills/security-threat-model/LICENSE.txt | 201 --- .agents/skills/security-threat-model/SKILL.md | 81 -- .../security-threat-model/agents/openai.yaml | 4 - .../references/prompt-template.md | 255 ---- .../security-controls-and-assets.md | 32 - .agents/skills/speech/LICENSE.txt | 201 --- .agents/skills/speech/SKILL.md | 144 -- .agents/skills/speech/agents/openai.yaml | 6 - .agents/skills/speech/assets/speech-small.svg | 3 - .agents/skills/speech/assets/speech.png | Bin 1234 -> 0 bytes .../skills/speech/references/accessibility.md | 32 - .agents/skills/speech/references/audio-api.md | 31 - .agents/skills/speech/references/cli.md | 99 -- .../skills/speech/references/codex-network.md | 28 - .agents/skills/speech/references/ivr.md | 32 - .agents/skills/speech/references/narration.md | 31 - .agents/skills/speech/references/prompting.md | 38 - .../speech/references/sample-prompts.md | 44 - .../speech/references/voice-directions.md | 80 -- .agents/skills/speech/references/voiceover.md | 31 - .../skills/speech/scripts/text_to_speech.py | 528 -------- .agents/skills/transcribe/LICENSE.txt | 201 --- .agents/skills/transcribe/SKILL.md | 81 -- .agents/skills/transcribe/agents/openai.yaml | 6 - .../transcribe/assets/transcribe-small.svg | 3 - .../skills/transcribe/assets/transcribe.png | Bin 1288 -> 0 bytes .agents/skills/transcribe/references/api.md | 8 - .../transcribe/scripts/transcribe_diarize.py | 276 ---- .claude/settings.json##os.Darwin | 7 +- .codex/config.toml##os.Darwin | 51 +- .config/SubMiner/config.jsonc##os.Darwin | 153 +-- .config/mpv/mpv.conf##os.Darwin | 21 +- .config/mpv/script-opts/modernz.conf | 2 +- .zsh/.zshrc##os.Darwin | 19 +- 72 files changed, 135 insertions(+), 15154 deletions(-) delete mode 100644 .agents/skills/pdf/LICENSE.txt delete mode 100644 .agents/skills/pdf/SKILL.md delete mode 100644 
.agents/skills/pdf/agents/openai.yaml delete mode 100644 .agents/skills/pdf/assets/pdf.png delete mode 100644 .agents/skills/pdf/forms.md delete mode 100644 .agents/skills/pdf/reference.md delete mode 100644 .agents/skills/pdf/scripts/check_bounding_boxes.py delete mode 100644 .agents/skills/pdf/scripts/check_fillable_fields.py delete mode 100644 .agents/skills/pdf/scripts/convert_pdf_to_images.py delete mode 100644 .agents/skills/pdf/scripts/create_validation_image.py delete mode 100644 .agents/skills/pdf/scripts/extract_form_field_info.py delete mode 100755 .agents/skills/pdf/scripts/extract_form_structure.py delete mode 100644 .agents/skills/pdf/scripts/fill_fillable_fields.py delete mode 100644 .agents/skills/pdf/scripts/fill_pdf_form_with_annotations.py delete mode 100644 .agents/skills/screenshot/LICENSE.txt delete mode 100644 .agents/skills/screenshot/SKILL.md delete mode 100644 .agents/skills/screenshot/agents/openai.yaml delete mode 100644 .agents/skills/screenshot/assets/screenshot-small.svg delete mode 100644 .agents/skills/screenshot/assets/screenshot.png delete mode 100644 .agents/skills/screenshot/scripts/ensure_macos_permissions.sh delete mode 100644 .agents/skills/screenshot/scripts/macos_display_info.swift delete mode 100644 .agents/skills/screenshot/scripts/macos_permissions.swift delete mode 100644 .agents/skills/screenshot/scripts/macos_window_info.swift delete mode 100644 .agents/skills/screenshot/scripts/take_screenshot.ps1 delete mode 100644 .agents/skills/screenshot/scripts/take_screenshot.py delete mode 100644 .agents/skills/security-best-practices/LICENSE.txt delete mode 100644 .agents/skills/security-best-practices/SKILL.md delete mode 100644 .agents/skills/security-best-practices/agents/openai.yaml delete mode 100644 .agents/skills/security-best-practices/references/golang-general-backend-security.md delete mode 100644 .agents/skills/security-best-practices/references/javascript-express-web-server-security.md delete mode 100644 
.agents/skills/security-best-practices/references/javascript-general-web-frontend-security.md delete mode 100644 .agents/skills/security-best-practices/references/javascript-jquery-web-frontend-security.md delete mode 100644 .agents/skills/security-best-practices/references/javascript-typescript-nextjs-web-server-security.md delete mode 100644 .agents/skills/security-best-practices/references/javascript-typescript-react-web-frontend-security.md delete mode 100644 .agents/skills/security-best-practices/references/javascript-typescript-vue-web-frontend-security.md delete mode 100644 .agents/skills/security-best-practices/references/python-django-web-server-security.md delete mode 100644 .agents/skills/security-best-practices/references/python-fastapi-web-server-security.md delete mode 100644 .agents/skills/security-best-practices/references/python-flask-web-server-security.md delete mode 100644 .agents/skills/security-threat-model/LICENSE.txt delete mode 100644 .agents/skills/security-threat-model/SKILL.md delete mode 100644 .agents/skills/security-threat-model/agents/openai.yaml delete mode 100644 .agents/skills/security-threat-model/references/prompt-template.md delete mode 100644 .agents/skills/security-threat-model/references/security-controls-and-assets.md delete mode 100644 .agents/skills/speech/LICENSE.txt delete mode 100644 .agents/skills/speech/SKILL.md delete mode 100644 .agents/skills/speech/agents/openai.yaml delete mode 100644 .agents/skills/speech/assets/speech-small.svg delete mode 100644 .agents/skills/speech/assets/speech.png delete mode 100644 .agents/skills/speech/references/accessibility.md delete mode 100644 .agents/skills/speech/references/audio-api.md delete mode 100644 .agents/skills/speech/references/cli.md delete mode 100644 .agents/skills/speech/references/codex-network.md delete mode 100644 .agents/skills/speech/references/ivr.md delete mode 100644 .agents/skills/speech/references/narration.md delete mode 100644 
.agents/skills/speech/references/prompting.md delete mode 100644 .agents/skills/speech/references/sample-prompts.md delete mode 100644 .agents/skills/speech/references/voice-directions.md delete mode 100644 .agents/skills/speech/references/voiceover.md delete mode 100644 .agents/skills/speech/scripts/text_to_speech.py delete mode 100644 .agents/skills/transcribe/LICENSE.txt delete mode 100644 .agents/skills/transcribe/SKILL.md delete mode 100644 .agents/skills/transcribe/agents/openai.yaml delete mode 100644 .agents/skills/transcribe/assets/transcribe-small.svg delete mode 100644 .agents/skills/transcribe/assets/transcribe.png delete mode 100644 .agents/skills/transcribe/references/api.md delete mode 100644 .agents/skills/transcribe/scripts/transcribe_diarize.py diff --git a/.agents/skills/pdf/LICENSE.txt b/.agents/skills/pdf/LICENSE.txt deleted file mode 100644 index c55ab42..0000000 --- a/.agents/skills/pdf/LICENSE.txt +++ /dev/null @@ -1,30 +0,0 @@ -© 2025 Anthropic, PBC. All rights reserved. - -LICENSE: Use of these materials (including all code, prompts, assets, files, -and other components of this Skill) is governed by your agreement with -Anthropic regarding use of Anthropic's services. If no separate agreement -exists, use is governed by Anthropic's Consumer Terms of Service or -Commercial Terms of Service, as applicable: -https://www.anthropic.com/legal/consumer-terms -https://www.anthropic.com/legal/commercial-terms -Your applicable agreement is referred to as the "Agreement." "Services" are -as defined in the Agreement. 
- -ADDITIONAL RESTRICTIONS: Notwithstanding anything in the Agreement to the -contrary, users may not: - -- Extract these materials from the Services or retain copies of these - materials outside the Services -- Reproduce or copy these materials, except for temporary copies created - automatically during authorized use of the Services -- Create derivative works based on these materials -- Distribute, sublicense, or transfer these materials to any third party -- Make, offer to sell, sell, or import any inventions embodied in these - materials -- Reverse engineer, decompile, or disassemble these materials - -The receipt, viewing, or possession of these materials does not convey or -imply any license or right beyond those expressly granted above. - -Anthropic retains all right, title, and interest in these materials, -including all copyrights, patents, and other intellectual property rights. diff --git a/.agents/skills/pdf/SKILL.md b/.agents/skills/pdf/SKILL.md deleted file mode 100644 index d3e046a..0000000 --- a/.agents/skills/pdf/SKILL.md +++ /dev/null @@ -1,314 +0,0 @@ ---- -name: pdf -description: Use this skill whenever the user wants to do anything with PDF files. This includes reading or extracting text/tables from PDFs, combining or merging multiple PDFs into one, splitting PDFs apart, rotating pages, adding watermarks, creating new PDFs, filling PDF forms, encrypting/decrypting PDFs, extracting images, and OCR on scanned PDFs to make them searchable. If the user mentions a .pdf file or asks to produce one, use this skill. -license: Proprietary. LICENSE.txt has complete terms ---- - -# PDF Processing Guide - -## Overview - -This guide covers essential PDF processing operations using Python libraries and command-line tools. For advanced features, JavaScript libraries, and detailed examples, see REFERENCE.md. If you need to fill out a PDF form, read FORMS.md and follow its instructions. 
- -## Quick Start - -```python -from pypdf import PdfReader, PdfWriter - -# Read a PDF -reader = PdfReader("document.pdf") -print(f"Pages: {len(reader.pages)}") - -# Extract text -text = "" -for page in reader.pages: - text += page.extract_text() -``` - -## Python Libraries - -### pypdf - Basic Operations - -#### Merge PDFs -```python -from pypdf import PdfWriter, PdfReader - -writer = PdfWriter() -for pdf_file in ["doc1.pdf", "doc2.pdf", "doc3.pdf"]: - reader = PdfReader(pdf_file) - for page in reader.pages: - writer.add_page(page) - -with open("merged.pdf", "wb") as output: - writer.write(output) -``` - -#### Split PDF -```python -reader = PdfReader("input.pdf") -for i, page in enumerate(reader.pages): - writer = PdfWriter() - writer.add_page(page) - with open(f"page_{i+1}.pdf", "wb") as output: - writer.write(output) -``` - -#### Extract Metadata -```python -reader = PdfReader("document.pdf") -meta = reader.metadata -print(f"Title: {meta.title}") -print(f"Author: {meta.author}") -print(f"Subject: {meta.subject}") -print(f"Creator: {meta.creator}") -``` - -#### Rotate Pages -```python -reader = PdfReader("input.pdf") -writer = PdfWriter() - -page = reader.pages[0] -page.rotate(90) # Rotate 90 degrees clockwise -writer.add_page(page) - -with open("rotated.pdf", "wb") as output: - writer.write(output) -``` - -### pdfplumber - Text and Table Extraction - -#### Extract Text with Layout -```python -import pdfplumber - -with pdfplumber.open("document.pdf") as pdf: - for page in pdf.pages: - text = page.extract_text() - print(text) -``` - -#### Extract Tables -```python -with pdfplumber.open("document.pdf") as pdf: - for i, page in enumerate(pdf.pages): - tables = page.extract_tables() - for j, table in enumerate(tables): - print(f"Table {j+1} on page {i+1}:") - for row in table: - print(row) -``` - -#### Advanced Table Extraction -```python -import pandas as pd - -with pdfplumber.open("document.pdf") as pdf: - all_tables = [] - for page in pdf.pages: - tables = 
page.extract_tables() - for table in tables: - if table: # Check if table is not empty - df = pd.DataFrame(table[1:], columns=table[0]) - all_tables.append(df) - -# Combine all tables -if all_tables: - combined_df = pd.concat(all_tables, ignore_index=True) - combined_df.to_excel("extracted_tables.xlsx", index=False) -``` - -### reportlab - Create PDFs - -#### Basic PDF Creation -```python -from reportlab.lib.pagesizes import letter -from reportlab.pdfgen import canvas - -c = canvas.Canvas("hello.pdf", pagesize=letter) -width, height = letter - -# Add text -c.drawString(100, height - 100, "Hello World!") -c.drawString(100, height - 120, "This is a PDF created with reportlab") - -# Add a line -c.line(100, height - 140, 400, height - 140) - -# Save -c.save() -``` - -#### Create PDF with Multiple Pages -```python -from reportlab.lib.pagesizes import letter -from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, PageBreak -from reportlab.lib.styles import getSampleStyleSheet - -doc = SimpleDocTemplate("report.pdf", pagesize=letter) -styles = getSampleStyleSheet() -story = [] - -# Add content -title = Paragraph("Report Title", styles['Title']) -story.append(title) -story.append(Spacer(1, 12)) - -body = Paragraph("This is the body of the report. " * 20, styles['Normal']) -story.append(body) -story.append(PageBreak()) - -# Page 2 -story.append(Paragraph("Page 2", styles['Heading1'])) -story.append(Paragraph("Content for page 2", styles['Normal'])) - -# Build PDF -doc.build(story) -``` - -#### Subscripts and Superscripts - -**IMPORTANT**: Never use Unicode subscript/superscript characters (₀₁₂₃₄₅₆₇₈₉, ⁰¹²³⁴⁵⁶⁷⁸⁹) in ReportLab PDFs. The built-in fonts do not include these glyphs, causing them to render as solid black boxes. 
- -Instead, use ReportLab's XML markup tags in Paragraph objects: -```python -from reportlab.platypus import Paragraph -from reportlab.lib.styles import getSampleStyleSheet - -styles = getSampleStyleSheet() - -# Subscripts: use <sub> tag -chemical = Paragraph("H<sub>2</sub>O", styles['Normal']) - -# Superscripts: use <super> tag -squared = Paragraph("x<super>2</super> + y<super>2</super>", styles['Normal']) -``` - -For canvas-drawn text (not Paragraph objects), manually adjust the font size and position rather than using Unicode subscripts/superscripts. - -## Command-Line Tools - -### pdftotext (poppler-utils) -```bash -# Extract text -pdftotext input.pdf output.txt - -# Extract text preserving layout -pdftotext -layout input.pdf output.txt - -# Extract specific pages -pdftotext -f 1 -l 5 input.pdf output.txt # Pages 1-5 -``` - -### qpdf -```bash -# Merge PDFs -qpdf --empty --pages file1.pdf file2.pdf -- merged.pdf - -# Split pages -qpdf input.pdf --pages . 1-5 -- pages1-5.pdf -qpdf input.pdf --pages . 6-10 -- pages6-10.pdf - -# Rotate pages -qpdf input.pdf output.pdf --rotate=+90:1 # Rotate page 1 by 90 degrees - -# Remove password -qpdf --password=mypassword --decrypt encrypted.pdf decrypted.pdf -``` - -### pdftk (if available) -```bash -# Merge -pdftk file1.pdf file2.pdf cat output merged.pdf - -# Split -pdftk input.pdf burst - -# Rotate -pdftk input.pdf rotate 1east output rotated.pdf -``` - -## Common Tasks - -### Extract Text from Scanned PDFs -```python -# Requires: pip install pytesseract pdf2image -import pytesseract -from pdf2image import convert_from_path - -# Convert PDF to images -images = convert_from_path('scanned.pdf') - -# OCR each page -text = "" -for i, image in enumerate(images): - text += f"Page {i+1}:\n" - text += pytesseract.image_to_string(image) - text += "\n\n" - -print(text) -``` - -### Add Watermark -```python -from pypdf import PdfReader, PdfWriter - -# Create watermark (or load existing) -watermark = PdfReader("watermark.pdf").pages[0] - -# Apply to all pages -reader = 
PdfReader("document.pdf") -writer = PdfWriter() - -for page in reader.pages: - page.merge_page(watermark) - writer.add_page(page) - -with open("watermarked.pdf", "wb") as output: - writer.write(output) -``` - -### Extract Images -```bash -# Using pdfimages (poppler-utils) -pdfimages -j input.pdf output_prefix - -# This extracts all images as output_prefix-000.jpg, output_prefix-001.jpg, etc. -``` - -### Password Protection -```python -from pypdf import PdfReader, PdfWriter - -reader = PdfReader("input.pdf") -writer = PdfWriter() - -for page in reader.pages: - writer.add_page(page) - -# Add password -writer.encrypt("userpassword", "ownerpassword") - -with open("encrypted.pdf", "wb") as output: - writer.write(output) -``` - -## Quick Reference - -| Task | Best Tool | Command/Code | -|------|-----------|--------------| -| Merge PDFs | pypdf | `writer.add_page(page)` | -| Split PDFs | pypdf | One page per file | -| Extract text | pdfplumber | `page.extract_text()` | -| Extract tables | pdfplumber | `page.extract_tables()` | -| Create PDFs | reportlab | Canvas or Platypus | -| Command line merge | qpdf | `qpdf --empty --pages ...` | -| OCR scanned PDFs | pytesseract | Convert to image first | -| Fill PDF forms | pdf-lib or pypdf (see FORMS.md) | See FORMS.md | - -## Next Steps - -- For advanced pypdfium2 usage, see REFERENCE.md -- For JavaScript libraries (pdf-lib), see REFERENCE.md -- If you need to fill out a PDF form, follow the instructions in FORMS.md -- For troubleshooting guides, see REFERENCE.md diff --git a/.agents/skills/pdf/agents/openai.yaml b/.agents/skills/pdf/agents/openai.yaml deleted file mode 100644 index fe2876a..0000000 --- a/.agents/skills/pdf/agents/openai.yaml +++ /dev/null @@ -1,5 +0,0 @@ -interface: - display_name: "PDF Skill" - short_description: "Create, edit, and review PDFs" - icon_large: "./assets/pdf.png" - default_prompt: "Create, edit, or review this PDF and summarize the key output or changes." 
diff --git a/.agents/skills/pdf/assets/pdf.png b/.agents/skills/pdf/assets/pdf.png deleted file mode 100644 index dd16ba283f6056cfe99cc3d36612d3b8cd79e00d..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 1312 zcmeAS@N?(olHy`uVBq!ia0vp^DIm``W z$lZxy-8q?;Kn_c~qpu?a!^VE@KZ&eB?p9A1$B+ufw{!RVi#UoLzyD5d!2}6Ysoskj zJ-oMOtXOL%=8(u-&tf7H5Pb1+fP*iushC+nhKLvQ#R(jWQ6COpkYTy8xA^{y#-}%$ zt!K>q`|gw7Nj3I2j%D^|sz0ZtKiiYV!)fqC#X-^OfPi8`hk$_#hlCOfk04VU2cvUK z12*BsE|P(F?<_fWNhtK%wT`m#lPdq-&K9||-~G!M$%}{Gr>NUp_vcx>)3WQ?^Qk=1 zQ{-#|*n{_$&fden*Tp<9K(0Wog;mDSZ`HKK*-HdmqoptYUiV*@O-9}6&`e3L<;zcs zE^o<*@OsYebgU|QA>qiZ zb+l{3sYgnoVd0!|4w_*tErn|*N*`Di2Jvb+>kA za(S;^a^dd1jO1>#~z9$Eu1Q{k8fqp`wIuZ&)+yF_j8h&`3aMmXJbk@yDegV zE{&^}efuOfTUDRq#ik$Utu!k<-F6B&O$yoR9b*=}ydw4U*FK)PSYre^qKxEYzF1jx ztc&;m8(EsY`r+rx=WIXjvo3!T^X;3_e6HC$1^!>v&XSZ^@w+(g`nhb;uSKsmpL@Ra zdX;RAZtmm9C35Y;!YkjGa@L&Sy0>@nwTBPe&V8M4AHFH-*Kf`B4z~Q360)}UA{OkO z+hm@<;>vCBdykL5O1u8<-vpccyQ`-+ZO@1McnsXq2_YL1iiyYc@rXHBB;`Qfy z@xGJN0lF>Dli%@fAdo679+mSlu*RLztcVC~9-Z^jj(b#C&X*>Kr&a(+TImdIg z<0pr8tt=ZSM3p#pTu9!|#;HrSs--^YT3FR4_c3CS={JXn*>}t3`_z&AS-Eb^4lC>AD%YM{ca~>|?ue z;ME1s(-#633faVazdpfn^T;ZJT1M9ik6uO1dGmLjdFZ(+x8tsyj~=ZzF(@$cOA}r$ zA0BAElwG#2Z++3i!%5eFK5sXu`ZuL2f@NE#n!VGv$H({F(b^`, and depending on the result go to either the "Fillable fields" or "Non-fillable fields" and follow those instructions. - -# Fillable fields -If the PDF has fillable form fields: -- Run this script from this file's directory: `python scripts/extract_form_field_info.py `. 
It will create a JSON file with a list of fields in this format: -``` -[ - { - "field_id": (unique ID for the field), - "page": (page number, 1-based), - "rect": ([left, bottom, right, top] bounding box in PDF coordinates, y=0 is the bottom of the page), - "type": ("text", "checkbox", "radio_group", or "choice"), - }, - // Checkboxes have "checked_value" and "unchecked_value" properties: - { - "field_id": (unique ID for the field), - "page": (page number, 1-based), - "type": "checkbox", - "checked_value": (Set the field to this value to check the checkbox), - "unchecked_value": (Set the field to this value to uncheck the checkbox), - }, - // Radio groups have a "radio_options" list with the possible choices. - { - "field_id": (unique ID for the field), - "page": (page number, 1-based), - "type": "radio_group", - "radio_options": [ - { - "value": (set the field to this value to select this radio option), - "rect": (bounding box for the radio button for this option) - }, - // Other radio options - ] - }, - // Multiple choice fields have a "choice_options" list with the possible choices: - { - "field_id": (unique ID for the field), - "page": (page number, 1-based), - "type": "choice", - "choice_options": [ - { - "value": (set the field to this value to select this option), - "text": (display text of the option) - }, - // Other choice options - ], - } -] -``` -- Convert the PDF to PNGs (one image for each page) with this script (run from this file's directory): -`python scripts/convert_pdf_to_images.py ` -Then analyze the images to determine the purpose of each form field (make sure to convert the bounding box PDF coordinates to image coordinates). 
-- Create a `field_values.json` file in this format with the values to be entered for each field: -``` -[ - { - "field_id": "last_name", // Must match the field_id from `extract_form_field_info.py` - "description": "The user's last name", - "page": 1, // Must match the "page" value in field_info.json - "value": "Simpson" - }, - { - "field_id": "Checkbox12", - "description": "Checkbox to be checked if the user is 18 or over", - "page": 1, - "value": "/On" // If this is a checkbox, use its "checked_value" value to check it. If it's a radio button group, use one of the "value" values in "radio_options". - }, - // more fields -] -``` -- Run the `fill_fillable_fields.py` script from this file's directory to create a filled-in PDF: -`python scripts/fill_fillable_fields.py ` -This script will verify that the field IDs and values you provide are valid; if it prints error messages, correct the appropriate fields and try again. - -# Non-fillable fields -If the PDF doesn't have fillable form fields, you'll add text annotations. First try to extract coordinates from the PDF structure (more accurate), then fall back to visual estimation if needed. - -## Step 1: Try Structure Extraction First - -Run this script to extract text labels, lines, and checkboxes with their exact PDF coordinates: -`python scripts/extract_form_structure.py form_structure.json` - -This creates a JSON file containing: -- **labels**: Every text element with exact coordinates (x0, top, x1, bottom in PDF points) -- **lines**: Horizontal lines that define row boundaries -- **checkboxes**: Small square rectangles that are checkboxes (with center coordinates) -- **row_boundaries**: Row top/bottom positions calculated from horizontal lines - -**Check the results**: If `form_structure.json` has meaningful labels (text elements that correspond to form fields), use **Approach A: Structure-Based Coordinates**. If the PDF is scanned/image-based and has few or no labels, use **Approach B: Visual Estimation**. 
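The "check the results" decision above could be sketched roughly as follows. This is only an illustration, not part of the skill's scripts: the helper names, the `min_labels` threshold, and the assumption that each entry in `labels` carries a `"text"` key are all hypothetical, and the real `form_structure.json` layout may differ.

```python
import json
import re
from pathlib import Path

# Placeholder glyph names such as "(cid:12)" usually mean the text layer is unusable.
CID_PATTERN = re.compile(r"\(cid:\d+\)")


def choose_approach(structure, min_labels=3):
    """Return "A" (structure-based) or "B" (visual estimation).

    Assumes `structure` is the parsed form_structure.json and that each label
    is a dict with a "text" key (a hypothetical schema for this sketch).
    """
    labels = structure.get("labels", [])
    readable = [
        label
        for label in labels
        if label.get("text", "").strip() and not CID_PATTERN.search(label["text"])
    ]
    # min_labels is an arbitrary illustrative threshold, not a documented value.
    return "A" if len(readable) >= min_labels else "B"


def choose_approach_from_file(path="form_structure.json"):
    """Load the JSON file written by the extraction step and pick an approach."""
    return choose_approach(json.loads(Path(path).read_text()))
```

A judgment call by eye is still needed: a form can have plenty of readable text that does not correspond to any form field.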
- ---- - -## Approach A: Structure-Based Coordinates (Preferred) - -Use this when `extract_form_structure.py` found text labels in the PDF. - -### A.1: Analyze the Structure - -Read form_structure.json and identify: - -1. **Label groups**: Adjacent text elements that form a single label (e.g., "Last" + "Name") -2. **Row structure**: Labels with similar `top` values are in the same row -3. **Field columns**: Entry areas start after label ends (x0 = label.x1 + gap) -4. **Checkboxes**: Use the checkbox coordinates directly from the structure - -**Coordinate system**: PDF coordinates where y=0 is at TOP of page, y increases downward. - -### A.2: Check for Missing Elements - -The structure extraction may not detect all form elements. Common cases: -- **Circular checkboxes**: Only square rectangles are detected as checkboxes -- **Complex graphics**: Decorative elements or non-standard form controls -- **Faded or light-colored elements**: May not be extracted - -If you see form fields in the PDF images that aren't in form_structure.json, you'll need to use **visual analysis** for those specific fields (see "Hybrid Approach" below). 
- -### A.3: Create fields.json with PDF Coordinates - -For each field, calculate entry coordinates from the extracted structure: - -**Text fields:** -- entry x0 = label x1 + 5 (small gap after label) -- entry x1 = next label's x0, or row boundary -- entry top = same as label top -- entry bottom = row boundary line below, or label bottom + row_height - -**Checkboxes:** -- Use the checkbox rectangle coordinates directly from form_structure.json -- entry_bounding_box = [checkbox.x0, checkbox.top, checkbox.x1, checkbox.bottom] - -Create fields.json using `pdf_width` and `pdf_height` (signals PDF coordinates): -```json -{ - "pages": [ - {"page_number": 1, "pdf_width": 612, "pdf_height": 792} - ], - "form_fields": [ - { - "page_number": 1, - "description": "Last name entry field", - "field_label": "Last Name", - "label_bounding_box": [43, 63, 87, 73], - "entry_bounding_box": [92, 63, 260, 79], - "entry_text": {"text": "Smith", "font_size": 10} - }, - { - "page_number": 1, - "description": "US Citizen Yes checkbox", - "field_label": "Yes", - "label_bounding_box": [260, 200, 280, 210], - "entry_bounding_box": [285, 197, 292, 205], - "entry_text": {"text": "X"} - } - ] -} -``` - -**Important**: Use `pdf_width`/`pdf_height` and coordinates directly from form_structure.json. - -### A.4: Validate Bounding Boxes - -Before filling, check your bounding boxes for errors: -`python scripts/check_bounding_boxes.py fields.json` - -This checks for intersecting bounding boxes and entry boxes that are too small for the font size. Fix any reported errors before filling. - ---- - -## Approach B: Visual Estimation (Fallback) - -Use this when the PDF is scanned/image-based and structure extraction found no usable text labels (e.g., all text shows as "(cid:X)" patterns). 
- -### B.1: Convert PDF to Images - -`python scripts/convert_pdf_to_images.py ` - -### B.2: Initial Field Identification - -Examine each page image to identify form sections and get **rough estimates** of field locations: -- Form field labels and their approximate positions -- Entry areas (lines, boxes, or blank spaces for text input) -- Checkboxes and their approximate locations - -For each field, note approximate pixel coordinates (they don't need to be precise yet). - -### B.3: Zoom Refinement (CRITICAL for accuracy) - -For each field, crop a region around the estimated position to refine coordinates precisely. - -**Create a zoomed crop using ImageMagick:** -```bash -magick -crop x++ +repage -``` - -Where: -- `, ` = top-left corner of crop region (use your rough estimate minus padding) -- `, ` = size of crop region (field area plus ~50px padding on each side) - -**Example:** To refine a "Name" field estimated around (100, 150): -```bash -magick images_dir/page_1.png -crop 300x80+50+120 +repage crops/name_field.png -``` - -(Note: if the `magick` command isn't available, try `convert` with the same arguments). - -**Examine the cropped image** to determine precise coordinates: -1. Identify the exact pixel where the entry area begins (after the label) -2. Identify where the entry area ends (before next field or edge) -3. Identify the top and bottom of the entry line/box - -**Convert crop coordinates back to full image coordinates:** -- full_x = crop_x + crop_offset_x -- full_y = crop_y + crop_offset_y - -Example: If the crop started at (50, 120) and the entry box starts at (52, 18) within the crop: -- entry_x0 = 52 + 50 = 102 -- entry_top = 18 + 120 = 138 - -**Repeat for each field**, grouping nearby fields into single crops when possible. 
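The crop-to-full-image conversion in B.3 is plain offset arithmetic. A minimal sketch, with hypothetical helper names and boxes assumed to be `[x0, top, x1, bottom]` in pixels:

```python
def crop_to_full(crop_x, crop_y, crop_offset_x, crop_offset_y):
    """Map a point measured inside a crop back to full-image pixel coordinates."""
    return crop_x + crop_offset_x, crop_y + crop_offset_y


def crop_box_to_full(box, crop_offset_x, crop_offset_y):
    """Map an [x0, top, x1, bottom] box measured inside a crop to the full image."""
    x0, top, x1, bottom = box
    return [
        x0 + crop_offset_x,
        top + crop_offset_y,
        x1 + crop_offset_x,
        bottom + crop_offset_y,
    ]


# The worked example from the text: crop started at (50, 120),
# entry box starts at (52, 18) within the crop.
print(crop_to_full(52, 18, 50, 120))  # (102, 138)
```

Keeping the crop offset next to each saved crop (for instance in the file name) makes it harder to apply the wrong offset when several crops are in play.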
- -### B.4: Create fields.json with Refined Coordinates - -Create fields.json using `image_width` and `image_height` (signals image coordinates): -```json -{ - "pages": [ - {"page_number": 1, "image_width": 1700, "image_height": 2200} - ], - "form_fields": [ - { - "page_number": 1, - "description": "Last name entry field", - "field_label": "Last Name", - "label_bounding_box": [120, 175, 242, 198], - "entry_bounding_box": [255, 175, 720, 218], - "entry_text": {"text": "Smith", "font_size": 10} - } - ] -} -``` - -**Important**: Use `image_width`/`image_height` and the refined pixel coordinates from the zoom analysis. - -### B.5: Validate Bounding Boxes - -Before filling, check your bounding boxes for errors: -`python scripts/check_bounding_boxes.py fields.json` - -This checks for intersecting bounding boxes and entry boxes that are too small for the font size. Fix any reported errors before filling. - ---- - -## Hybrid Approach: Structure + Visual - -Use this when structure extraction works for most fields but misses some elements (e.g., circular checkboxes, unusual form controls). - -1. **Use Approach A** for fields that were detected in form_structure.json -2. **Convert PDF to images** for visual analysis of missing fields -3. **Use zoom refinement** (from Approach B) for the missing fields -4. **Combine coordinates**: For fields from structure extraction, use `pdf_width`/`pdf_height`. For visually-estimated fields, you must convert image coordinates to PDF coordinates: - - pdf_x = image_x * (pdf_width / image_width) - - pdf_y = image_y * (pdf_height / image_height) -5. 
**Use a single coordinate system** in fields.json - convert all to PDF coordinates with `pdf_width`/`pdf_height` - ---- - -## Step 2: Validate Before Filling - -**Always validate bounding boxes before filling:** -`python scripts/check_bounding_boxes.py fields.json` - -This checks for: -- Intersecting bounding boxes (which would cause overlapping text) -- Entry boxes that are too small for the specified font size - -Fix any reported errors in fields.json before proceeding. - -## Step 3: Fill the Form - -The fill script auto-detects the coordinate system and handles conversion: -`python scripts/fill_pdf_form_with_annotations.py fields.json ` - -## Step 4: Verify Output - -Convert the filled PDF to images and verify text placement: -`python scripts/convert_pdf_to_images.py ` - -If text is mispositioned: -- **Approach A**: Check that you're using PDF coordinates from form_structure.json with `pdf_width`/`pdf_height` -- **Approach B**: Check that image dimensions match and coordinates are accurate pixels -- **Hybrid**: Ensure coordinate conversions are correct for visually-estimated fields diff --git a/.agents/skills/pdf/reference.md b/.agents/skills/pdf/reference.md deleted file mode 100644 index 41400bf..0000000 --- a/.agents/skills/pdf/reference.md +++ /dev/null @@ -1,612 +0,0 @@ -# PDF Processing Advanced Reference - -This document contains advanced PDF processing features, detailed examples, and additional libraries not covered in the main skill instructions. - -## pypdfium2 Library (Apache/BSD License) - -### Overview -pypdfium2 is a Python binding for PDFium (Chromium's PDF library). It's excellent for fast PDF rendering, image generation, and serves as a PyMuPDF replacement. 
- -### Render PDF to Images -```python -import pypdfium2 as pdfium -from PIL import Image - -# Load PDF -pdf = pdfium.PdfDocument("document.pdf") - -# Render page to image -page = pdf[0] # First page -bitmap = page.render( - scale=2.0, # Higher resolution - rotation=0 # No rotation -) - -# Convert to PIL Image -img = bitmap.to_pil() -img.save("page_1.png", "PNG") - -# Process multiple pages -for i, page in enumerate(pdf): - bitmap = page.render(scale=1.5) - img = bitmap.to_pil() - img.save(f"page_{i+1}.jpg", "JPEG", quality=90) -``` - -### Extract Text with pypdfium2 -```python -import pypdfium2 as pdfium - -pdf = pdfium.PdfDocument("document.pdf") -for i, page in enumerate(pdf): - text = page.get_text() - print(f"Page {i+1} text length: {len(text)} chars") -``` - -## JavaScript Libraries - -### pdf-lib (MIT License) - -pdf-lib is a powerful JavaScript library for creating and modifying PDF documents in any JavaScript environment. - -#### Load and Manipulate Existing PDF -```javascript -import { PDFDocument } from 'pdf-lib'; -import fs from 'fs'; - -async function manipulatePDF() { - // Load existing PDF - const existingPdfBytes = fs.readFileSync('input.pdf'); - const pdfDoc = await PDFDocument.load(existingPdfBytes); - - // Get page count - const pageCount = pdfDoc.getPageCount(); - console.log(`Document has ${pageCount} pages`); - - // Add new page - const newPage = pdfDoc.addPage([600, 400]); - newPage.drawText('Added by pdf-lib', { - x: 100, - y: 300, - size: 16 - }); - - // Save modified PDF - const pdfBytes = await pdfDoc.save(); - fs.writeFileSync('modified.pdf', pdfBytes); -} -``` - -#### Create Complex PDFs from Scratch -```javascript -import { PDFDocument, rgb, StandardFonts } from 'pdf-lib'; -import fs from 'fs'; - -async function createPDF() { - const pdfDoc = await PDFDocument.create(); - - // Add fonts - const helveticaFont = await pdfDoc.embedFont(StandardFonts.Helvetica); - const helveticaBold = await pdfDoc.embedFont(StandardFonts.HelveticaBold); - 
- // Add page - const page = pdfDoc.addPage([595, 842]); // A4 size - const { width, height } = page.getSize(); - - // Add text with styling - page.drawText('Invoice #12345', { - x: 50, - y: height - 50, - size: 18, - font: helveticaBold, - color: rgb(0.2, 0.2, 0.8) - }); - - // Add rectangle (header background) - page.drawRectangle({ - x: 40, - y: height - 100, - width: width - 80, - height: 30, - color: rgb(0.9, 0.9, 0.9) - }); - - // Add table-like content - const items = [ - ['Item', 'Qty', 'Price', 'Total'], - ['Widget', '2', '$50', '$100'], - ['Gadget', '1', '$75', '$75'] - ]; - - let yPos = height - 150; - items.forEach(row => { - let xPos = 50; - row.forEach(cell => { - page.drawText(cell, { - x: xPos, - y: yPos, - size: 12, - font: helveticaFont - }); - xPos += 120; - }); - yPos -= 25; - }); - - const pdfBytes = await pdfDoc.save(); - fs.writeFileSync('created.pdf', pdfBytes); -} -``` - -#### Advanced Merge and Split Operations -```javascript -import { PDFDocument } from 'pdf-lib'; -import fs from 'fs'; - -async function mergePDFs() { - // Create new document - const mergedPdf = await PDFDocument.create(); - - // Load source PDFs - const pdf1Bytes = fs.readFileSync('doc1.pdf'); - const pdf2Bytes = fs.readFileSync('doc2.pdf'); - - const pdf1 = await PDFDocument.load(pdf1Bytes); - const pdf2 = await PDFDocument.load(pdf2Bytes); - - // Copy pages from first PDF - const pdf1Pages = await mergedPdf.copyPages(pdf1, pdf1.getPageIndices()); - pdf1Pages.forEach(page => mergedPdf.addPage(page)); - - // Copy specific pages from second PDF (pages 0, 2, 4) - const pdf2Pages = await mergedPdf.copyPages(pdf2, [0, 2, 4]); - pdf2Pages.forEach(page => mergedPdf.addPage(page)); - - const mergedPdfBytes = await mergedPdf.save(); - fs.writeFileSync('merged.pdf', mergedPdfBytes); -} -``` - -### pdfjs-dist (Apache License) - -PDF.js is Mozilla's JavaScript library for rendering PDFs in the browser. 
- -#### Basic PDF Loading and Rendering -```javascript -import * as pdfjsLib from 'pdfjs-dist'; - -// Configure worker (important for performance) -pdfjsLib.GlobalWorkerOptions.workerSrc = './pdf.worker.js'; - -async function renderPDF() { - // Load PDF - const loadingTask = pdfjsLib.getDocument('document.pdf'); - const pdf = await loadingTask.promise; - - console.log(`Loaded PDF with ${pdf.numPages} pages`); - - // Get first page - const page = await pdf.getPage(1); - const viewport = page.getViewport({ scale: 1.5 }); - - // Render to canvas - const canvas = document.createElement('canvas'); - const context = canvas.getContext('2d'); - canvas.height = viewport.height; - canvas.width = viewport.width; - - const renderContext = { - canvasContext: context, - viewport: viewport - }; - - await page.render(renderContext).promise; - document.body.appendChild(canvas); -} -``` - -#### Extract Text with Coordinates -```javascript -import * as pdfjsLib from 'pdfjs-dist'; - -async function extractText() { - const loadingTask = pdfjsLib.getDocument('document.pdf'); - const pdf = await loadingTask.promise; - - let fullText = ''; - - // Extract text from all pages - for (let i = 1; i <= pdf.numPages; i++) { - const page = await pdf.getPage(i); - const textContent = await page.getTextContent(); - - const pageText = textContent.items - .map(item => item.str) - .join(' '); - - fullText += `\n--- Page ${i} ---\n${pageText}`; - - // Get text with coordinates for advanced processing - const textWithCoords = textContent.items.map(item => ({ - text: item.str, - x: item.transform[4], - y: item.transform[5], - width: item.width, - height: item.height - })); - } - - console.log(fullText); - return fullText; -} -``` - -#### Extract Annotations and Forms -```javascript -import * as pdfjsLib from 'pdfjs-dist'; - -async function extractAnnotations() { - const loadingTask = pdfjsLib.getDocument('annotated.pdf'); - const pdf = await loadingTask.promise; - - for (let i = 1; i <= pdf.numPages; 
i++) { - const page = await pdf.getPage(i); - const annotations = await page.getAnnotations(); - - annotations.forEach(annotation => { - console.log(`Annotation type: ${annotation.subtype}`); - console.log(`Content: ${annotation.contents}`); - console.log(`Coordinates: ${JSON.stringify(annotation.rect)}`); - }); - } -} -``` - -## Advanced Command-Line Operations - -### poppler-utils Advanced Features - -#### Extract Text with Bounding Box Coordinates -```bash -# Extract text with bounding box coordinates (essential for structured data) -pdftotext -bbox-layout document.pdf output.xml - -# The XML output contains precise coordinates for each text element -``` - -#### Advanced Image Conversion -```bash -# Convert to PNG images with specific resolution -pdftoppm -png -r 300 document.pdf output_prefix - -# Convert specific page range with high resolution -pdftoppm -png -r 600 -f 1 -l 3 document.pdf high_res_pages - -# Convert to JPEG with quality setting -pdftoppm -jpeg -jpegopt quality=85 -r 200 document.pdf jpeg_output -``` - -#### Extract Embedded Images -```bash -# Extract all embedded images with metadata -pdfimages -j -p document.pdf page_images - -# List image info without extracting -pdfimages -list document.pdf - -# Extract images in their original format -pdfimages -all document.pdf images/img -``` - -### qpdf Advanced Features - -#### Complex Page Manipulation -```bash -# Split PDF into groups of pages -qpdf --split-pages=3 input.pdf output_group_%02d.pdf - -# Extract specific pages with complex ranges -qpdf input.pdf --pages input.pdf 1,3-5,8,10-end -- extracted.pdf - -# Merge specific pages from multiple PDFs -qpdf --empty --pages doc1.pdf 1-3 doc2.pdf 5-7 doc3.pdf 2,4 -- combined.pdf -``` - -#### PDF Optimization and Repair -```bash -# Optimize PDF for web (linearize for streaming) -qpdf --linearize input.pdf optimized.pdf - -# Remove unused objects and compress -qpdf --optimize-level=all input.pdf compressed.pdf - -# Attempt to repair corrupted PDF 
structure -qpdf --check input.pdf -qpdf --fix-qdf damaged.pdf repaired.pdf - -# Show detailed PDF structure for debugging -qpdf --show-all-pages input.pdf > structure.txt -``` - -#### Advanced Encryption -```bash -# Add password protection with specific permissions -qpdf --encrypt user_pass owner_pass 256 --print=none --modify=none -- input.pdf encrypted.pdf - -# Check encryption status -qpdf --show-encryption encrypted.pdf - -# Remove password protection (requires password) -qpdf --password=secret123 --decrypt encrypted.pdf decrypted.pdf -``` - -## Advanced Python Techniques - -### pdfplumber Advanced Features - -#### Extract Text with Precise Coordinates -```python -import pdfplumber - -with pdfplumber.open("document.pdf") as pdf: - page = pdf.pages[0] - - # Extract all text with coordinates - chars = page.chars - for char in chars[:10]: # First 10 characters - print(f"Char: '{char['text']}' at x:{char['x0']:.1f} y:{char['y0']:.1f}") - - # Extract text by bounding box (left, top, right, bottom) - bbox_text = page.within_bbox((100, 100, 400, 200)).extract_text() -``` - -#### Advanced Table Extraction with Custom Settings -```python -import pdfplumber -import pandas as pd - -with pdfplumber.open("complex_table.pdf") as pdf: - page = pdf.pages[0] - - # Extract tables with custom settings for complex layouts - table_settings = { - "vertical_strategy": "lines", - "horizontal_strategy": "lines", - "snap_tolerance": 3, - "intersection_tolerance": 15 - } - tables = page.extract_tables(table_settings) - - # Visual debugging for table extraction - img = page.to_image(resolution=150) - img.save("debug_layout.png") -``` - -### reportlab Advanced Features - -#### Create Professional Reports with Tables -```python -from reportlab.platypus import SimpleDocTemplate, Table, TableStyle, Paragraph -from reportlab.lib.styles import getSampleStyleSheet -from reportlab.lib import colors - -# Sample data -data = [ - ['Product', 'Q1', 'Q2', 'Q3', 'Q4'], - ['Widgets', '120', '135', 
'142', '158'], - ['Gadgets', '85', '92', '98', '105'] -] - -# Create PDF with table -doc = SimpleDocTemplate("report.pdf") -elements = [] - -# Add title -styles = getSampleStyleSheet() -title = Paragraph("Quarterly Sales Report", styles['Title']) -elements.append(title) - -# Add table with advanced styling -table = Table(data) -table.setStyle(TableStyle([ - ('BACKGROUND', (0, 0), (-1, 0), colors.grey), - ('TEXTCOLOR', (0, 0), (-1, 0), colors.whitesmoke), - ('ALIGN', (0, 0), (-1, -1), 'CENTER'), - ('FONTNAME', (0, 0), (-1, 0), 'Helvetica-Bold'), - ('FONTSIZE', (0, 0), (-1, 0), 14), - ('BOTTOMPADDING', (0, 0), (-1, 0), 12), - ('BACKGROUND', (0, 1), (-1, -1), colors.beige), - ('GRID', (0, 0), (-1, -1), 1, colors.black) -])) -elements.append(table) - -doc.build(elements) -``` - -## Complex Workflows - -### Extract Figures/Images from PDF - -#### Method 1: Using pdfimages (fastest) -```bash -# Extract all images with original quality -pdfimages -all document.pdf images/img -``` - -#### Method 2: Using pypdfium2 + Image Processing -```python -import pypdfium2 as pdfium -from PIL import Image -import numpy as np - -def extract_figures(pdf_path, output_dir): - pdf = pdfium.PdfDocument(pdf_path) - - for page_num, page in enumerate(pdf): - # Render high-resolution page - bitmap = page.render(scale=3.0) - img = bitmap.to_pil() - - # Convert to numpy for processing - img_array = np.array(img) - - # Simple figure detection (non-white regions) - mask = np.any(img_array != [255, 255, 255], axis=2) - - # Find contours and extract bounding boxes - # (This is simplified - real implementation would need more sophisticated detection) - - # Save detected figures - # ... 
implementation depends on specific needs -``` - -### Batch PDF Processing with Error Handling -```python -import os -import glob -from pypdf import PdfReader, PdfWriter -import logging - -logging.basicConfig(level=logging.INFO) -logger = logging.getLogger(__name__) - -def batch_process_pdfs(input_dir, operation='merge'): - pdf_files = glob.glob(os.path.join(input_dir, "*.pdf")) - - if operation == 'merge': - writer = PdfWriter() - for pdf_file in pdf_files: - try: - reader = PdfReader(pdf_file) - for page in reader.pages: - writer.add_page(page) - logger.info(f"Processed: {pdf_file}") - except Exception as e: - logger.error(f"Failed to process {pdf_file}: {e}") - continue - - with open("batch_merged.pdf", "wb") as output: - writer.write(output) - - elif operation == 'extract_text': - for pdf_file in pdf_files: - try: - reader = PdfReader(pdf_file) - text = "" - for page in reader.pages: - text += page.extract_text() - - output_file = pdf_file.replace('.pdf', '.txt') - with open(output_file, 'w', encoding='utf-8') as f: - f.write(text) - logger.info(f"Extracted text from: {pdf_file}") - - except Exception as e: - logger.error(f"Failed to extract text from {pdf_file}: {e}") - continue -``` - -### Advanced PDF Cropping -```python -from pypdf import PdfWriter, PdfReader - -reader = PdfReader("input.pdf") -writer = PdfWriter() - -# Crop page (left, bottom, right, top in points) -page = reader.pages[0] -page.mediabox.left = 50 -page.mediabox.bottom = 50 -page.mediabox.right = 550 -page.mediabox.top = 750 - -writer.add_page(page) -with open("cropped.pdf", "wb") as output: - writer.write(output) -``` - -## Performance Optimization Tips - -### 1. For Large PDFs -- Use streaming approaches instead of loading entire PDF in memory -- Use `qpdf --split-pages` for splitting large files -- Process pages individually with pypdfium2 - -### 2. 
For Text Extraction -- `pdftotext -bbox-layout` is fastest for plain text extraction -- Use pdfplumber for structured data and tables -- Avoid `pypdf.extract_text()` for very large documents - -### 3. For Image Extraction -- `pdfimages` is much faster than rendering pages -- Use low resolution for previews, high resolution for final output - -### 4. For Form Filling -- pdf-lib maintains form structure better than most alternatives -- Pre-validate form fields before processing - -### 5. Memory Management -```python -# Process PDFs in chunks -def process_large_pdf(pdf_path, chunk_size=10): - reader = PdfReader(pdf_path) - total_pages = len(reader.pages) - - for start_idx in range(0, total_pages, chunk_size): - end_idx = min(start_idx + chunk_size, total_pages) - writer = PdfWriter() - - for i in range(start_idx, end_idx): - writer.add_page(reader.pages[i]) - - # Process chunk - with open(f"chunk_{start_idx//chunk_size}.pdf", "wb") as output: - writer.write(output) -``` - -## Troubleshooting Common Issues - -### Encrypted PDFs -```python -# Handle password-protected PDFs -from pypdf import PdfReader - -try: - reader = PdfReader("encrypted.pdf") - if reader.is_encrypted: - reader.decrypt("password") -except Exception as e: - print(f"Failed to decrypt: {e}") -``` - -### Corrupted PDFs -```bash -# Use qpdf to repair -qpdf --check corrupted.pdf -qpdf --replace-input corrupted.pdf -``` - -### Text Extraction Issues -```python -# Fallback to OCR for scanned PDFs -import pytesseract -from pdf2image import convert_from_path - -def extract_text_with_ocr(pdf_path): - images = convert_from_path(pdf_path) - text = "" - for i, image in enumerate(images): - text += pytesseract.image_to_string(image) - return text -``` - -## License Information - -- **pypdf**: BSD License -- **pdfplumber**: MIT License -- **pypdfium2**: Apache/BSD License -- **reportlab**: BSD License -- **poppler-utils**: GPL-2 License -- **qpdf**: Apache License -- **pdf-lib**: MIT License -- **pdfjs-dist**: 
Apache License \ No newline at end of file diff --git a/.agents/skills/pdf/scripts/check_bounding_boxes.py b/.agents/skills/pdf/scripts/check_bounding_boxes.py deleted file mode 100644 index 2cc5e34..0000000 --- a/.agents/skills/pdf/scripts/check_bounding_boxes.py +++ /dev/null @@ -1,65 +0,0 @@ -from dataclasses import dataclass -import json -import sys - - - - -@dataclass -class RectAndField: - rect: list[float] - rect_type: str - field: dict - - -def get_bounding_box_messages(fields_json_stream) -> list[str]: - messages = [] - fields = json.load(fields_json_stream) - messages.append(f"Read {len(fields['form_fields'])} fields") - - def rects_intersect(r1, r2): - disjoint_horizontal = r1[0] >= r2[2] or r1[2] <= r2[0] - disjoint_vertical = r1[1] >= r2[3] or r1[3] <= r2[1] - return not (disjoint_horizontal or disjoint_vertical) - - rects_and_fields = [] - for f in fields["form_fields"]: - rects_and_fields.append(RectAndField(f["label_bounding_box"], "label", f)) - rects_and_fields.append(RectAndField(f["entry_bounding_box"], "entry", f)) - - has_error = False - for i, ri in enumerate(rects_and_fields): - for j in range(i + 1, len(rects_and_fields)): - rj = rects_and_fields[j] - if ri.field["page_number"] == rj.field["page_number"] and rects_intersect(ri.rect, rj.rect): - has_error = True - if ri.field is rj.field: - messages.append(f"FAILURE: intersection between label and entry bounding boxes for `{ri.field['description']}` ({ri.rect}, {rj.rect})") - else: - messages.append(f"FAILURE: intersection between {ri.rect_type} bounding box for `{ri.field['description']}` ({ri.rect}) and {rj.rect_type} bounding box for `{rj.field['description']}` ({rj.rect})") - if len(messages) >= 20: - messages.append("Aborting further checks; fix bounding boxes and try again") - return messages - if ri.rect_type == "entry": - if "entry_text" in ri.field: - font_size = ri.field["entry_text"].get("font_size", 14) - entry_height = ri.rect[3] - ri.rect[1] - if entry_height < font_size: - 
has_error = True - messages.append(f"FAILURE: entry bounding box height ({entry_height}) for `{ri.field['description']}` is too short for the text content (font size: {font_size}). Increase the box height or decrease the font size.") - if len(messages) >= 20: - messages.append("Aborting further checks; fix bounding boxes and try again") - return messages - - if not has_error: - messages.append("SUCCESS: All bounding boxes are valid") - return messages - -if __name__ == "__main__": - if len(sys.argv) != 2: - print("Usage: check_bounding_boxes.py [fields.json]") - sys.exit(1) - with open(sys.argv[1]) as f: - messages = get_bounding_box_messages(f) - for msg in messages: - print(msg) diff --git a/.agents/skills/pdf/scripts/check_fillable_fields.py b/.agents/skills/pdf/scripts/check_fillable_fields.py deleted file mode 100644 index 36dfb95..0000000 --- a/.agents/skills/pdf/scripts/check_fillable_fields.py +++ /dev/null @@ -1,11 +0,0 @@ -import sys -from pypdf import PdfReader - - - - -reader = PdfReader(sys.argv[1]) -if (reader.get_fields()): - print("This PDF has fillable form fields") -else: - print("This PDF does not have fillable form fields; you will need to visually determine where to enter data") diff --git a/.agents/skills/pdf/scripts/convert_pdf_to_images.py b/.agents/skills/pdf/scripts/convert_pdf_to_images.py deleted file mode 100644 index 7939cef..0000000 --- a/.agents/skills/pdf/scripts/convert_pdf_to_images.py +++ /dev/null @@ -1,33 +0,0 @@ -import os -import sys - -from pdf2image import convert_from_path - - - - -def convert(pdf_path, output_dir, max_dim=1000): - images = convert_from_path(pdf_path, dpi=200) - - for i, image in enumerate(images): - width, height = image.size - if width > max_dim or height > max_dim: - scale_factor = min(max_dim / width, max_dim / height) - new_width = int(width * scale_factor) - new_height = int(height * scale_factor) - image = image.resize((new_width, new_height)) - - image_path = os.path.join(output_dir, 
f"page_{i+1}.png") - image.save(image_path) - print(f"Saved page {i+1} as {image_path} (size: {image.size})") - - print(f"Converted {len(images)} pages to PNG images") - - -if __name__ == "__main__": - if len(sys.argv) != 3: - print("Usage: convert_pdf_to_images.py [input pdf] [output directory]") - sys.exit(1) - pdf_path = sys.argv[1] - output_directory = sys.argv[2] - convert(pdf_path, output_directory) diff --git a/.agents/skills/pdf/scripts/create_validation_image.py b/.agents/skills/pdf/scripts/create_validation_image.py deleted file mode 100644 index 10eadd8..0000000 --- a/.agents/skills/pdf/scripts/create_validation_image.py +++ /dev/null @@ -1,37 +0,0 @@ -import json -import sys - -from PIL import Image, ImageDraw - - - - -def create_validation_image(page_number, fields_json_path, input_path, output_path): - with open(fields_json_path, 'r') as f: - data = json.load(f) - - img = Image.open(input_path) - draw = ImageDraw.Draw(img) - num_boxes = 0 - - for field in data["form_fields"]: - if field["page_number"] == page_number: - entry_box = field['entry_bounding_box'] - label_box = field['label_bounding_box'] - draw.rectangle(entry_box, outline='red', width=2) - draw.rectangle(label_box, outline='blue', width=2) - num_boxes += 2 - - img.save(output_path) - print(f"Created validation image at {output_path} with {num_boxes} bounding boxes") - - -if __name__ == "__main__": - if len(sys.argv) != 5: - print("Usage: create_validation_image.py [page number] [fields.json file] [input image path] [output image path]") - sys.exit(1) - page_number = int(sys.argv[1]) - fields_json_path = sys.argv[2] - input_image_path = sys.argv[3] - output_image_path = sys.argv[4] - create_validation_image(page_number, fields_json_path, input_image_path, output_image_path) diff --git a/.agents/skills/pdf/scripts/extract_form_field_info.py b/.agents/skills/pdf/scripts/extract_form_field_info.py deleted file mode 100644 index 64cd470..0000000 --- 
a/.agents/skills/pdf/scripts/extract_form_field_info.py +++ /dev/null @@ -1,122 +0,0 @@ -import json -import sys - -from pypdf import PdfReader - - - - -def get_full_annotation_field_id(annotation): - components = [] - while annotation: - field_name = annotation.get('/T') - if field_name: - components.append(field_name) - annotation = annotation.get('/Parent') - return ".".join(reversed(components)) if components else None - - -def make_field_dict(field, field_id): - field_dict = {"field_id": field_id} - ft = field.get('/FT') - if ft == "/Tx": - field_dict["type"] = "text" - elif ft == "/Btn": - field_dict["type"] = "checkbox" - states = field.get("/_States_", []) - if len(states) == 2: - if "/Off" in states: - field_dict["checked_value"] = states[0] if states[0] != "/Off" else states[1] - field_dict["unchecked_value"] = "/Off" - else: - print(f"Unexpected state values for checkbox `${field_id}`. Its checked and unchecked values may not be correct; if you're trying to check it, visually verify the results.") - field_dict["checked_value"] = states[0] - field_dict["unchecked_value"] = states[1] - elif ft == "/Ch": - field_dict["type"] = "choice" - states = field.get("/_States_", []) - field_dict["choice_options"] = [{ - "value": state[0], - "text": state[1], - } for state in states] - else: - field_dict["type"] = f"unknown ({ft})" - return field_dict - - -def get_field_info(reader: PdfReader): - fields = reader.get_fields() - - field_info_by_id = {} - possible_radio_names = set() - - for field_id, field in fields.items(): - if field.get("/Kids"): - if field.get("/FT") == "/Btn": - possible_radio_names.add(field_id) - continue - field_info_by_id[field_id] = make_field_dict(field, field_id) - - - radio_fields_by_id = {} - - for page_index, page in enumerate(reader.pages): - annotations = page.get('/Annots', []) - for ann in annotations: - field_id = get_full_annotation_field_id(ann) - if field_id in field_info_by_id: - field_info_by_id[field_id]["page"] = page_index + 
1 - field_info_by_id[field_id]["rect"] = ann.get('/Rect') - elif field_id in possible_radio_names: - try: - on_values = [v for v in ann["/AP"]["/N"] if v != "/Off"] - except KeyError: - continue - if len(on_values) == 1: - rect = ann.get("/Rect") - if field_id not in radio_fields_by_id: - radio_fields_by_id[field_id] = { - "field_id": field_id, - "type": "radio_group", - "page": page_index + 1, - "radio_options": [], - } - radio_fields_by_id[field_id]["radio_options"].append({ - "value": on_values[0], - "rect": rect, - }) - - fields_with_location = [] - for field_info in field_info_by_id.values(): - if "page" in field_info: - fields_with_location.append(field_info) - else: - print(f"Unable to determine location for field id: {field_info.get('field_id')}, ignoring") - - def sort_key(f): - if "radio_options" in f: - rect = f["radio_options"][0]["rect"] or [0, 0, 0, 0] - else: - rect = f.get("rect") or [0, 0, 0, 0] - adjusted_position = [-rect[1], rect[0]] - return [f.get("page"), adjusted_position] - - sorted_fields = fields_with_location + list(radio_fields_by_id.values()) - sorted_fields.sort(key=sort_key) - - return sorted_fields - - -def write_field_info(pdf_path: str, json_output_path: str): - reader = PdfReader(pdf_path) - field_info = get_field_info(reader) - with open(json_output_path, "w") as f: - json.dump(field_info, f, indent=2) - print(f"Wrote {len(field_info)} fields to {json_output_path}") - - -if __name__ == "__main__": - if len(sys.argv) != 3: - print("Usage: extract_form_field_info.py [input pdf] [output json]") - sys.exit(1) - write_field_info(sys.argv[1], sys.argv[2]) diff --git a/.agents/skills/pdf/scripts/extract_form_structure.py b/.agents/skills/pdf/scripts/extract_form_structure.py deleted file mode 100755 index f219e7d..0000000 --- a/.agents/skills/pdf/scripts/extract_form_structure.py +++ /dev/null @@ -1,115 +0,0 @@ -""" -Extract form structure from a non-fillable PDF. 
- -This script analyzes the PDF to find: -- Text labels with their exact coordinates -- Horizontal lines (row boundaries) -- Checkboxes (small rectangles) - -Output: A JSON file with the form structure that can be used to generate -accurate field coordinates for filling. - -Usage: python extract_form_structure.py -""" - -import json -import sys -import pdfplumber - - -def extract_form_structure(pdf_path): - structure = { - "pages": [], - "labels": [], - "lines": [], - "checkboxes": [], - "row_boundaries": [] - } - - with pdfplumber.open(pdf_path) as pdf: - for page_num, page in enumerate(pdf.pages, 1): - structure["pages"].append({ - "page_number": page_num, - "width": float(page.width), - "height": float(page.height) - }) - - words = page.extract_words() - for word in words: - structure["labels"].append({ - "page": page_num, - "text": word["text"], - "x0": round(float(word["x0"]), 1), - "top": round(float(word["top"]), 1), - "x1": round(float(word["x1"]), 1), - "bottom": round(float(word["bottom"]), 1) - }) - - for line in page.lines: - if abs(float(line["x1"]) - float(line["x0"])) > page.width * 0.5: - structure["lines"].append({ - "page": page_num, - "y": round(float(line["top"]), 1), - "x0": round(float(line["x0"]), 1), - "x1": round(float(line["x1"]), 1) - }) - - for rect in page.rects: - width = float(rect["x1"]) - float(rect["x0"]) - height = float(rect["bottom"]) - float(rect["top"]) - if 5 <= width <= 15 and 5 <= height <= 15 and abs(width - height) < 2: - structure["checkboxes"].append({ - "page": page_num, - "x0": round(float(rect["x0"]), 1), - "top": round(float(rect["top"]), 1), - "x1": round(float(rect["x1"]), 1), - "bottom": round(float(rect["bottom"]), 1), - "center_x": round((float(rect["x0"]) + float(rect["x1"])) / 2, 1), - "center_y": round((float(rect["top"]) + float(rect["bottom"])) / 2, 1) - }) - - lines_by_page = {} - for line in structure["lines"]: - page = line["page"] - if page not in lines_by_page: - lines_by_page[page] = [] - 
lines_by_page[page].append(line["y"]) - - for page, y_coords in lines_by_page.items(): - y_coords = sorted(set(y_coords)) - for i in range(len(y_coords) - 1): - structure["row_boundaries"].append({ - "page": page, - "row_top": y_coords[i], - "row_bottom": y_coords[i + 1], - "row_height": round(y_coords[i + 1] - y_coords[i], 1) - }) - - return structure - - -def main(): - if len(sys.argv) != 3: - print("Usage: extract_form_structure.py ") - sys.exit(1) - - pdf_path = sys.argv[1] - output_path = sys.argv[2] - - print(f"Extracting structure from {pdf_path}...") - structure = extract_form_structure(pdf_path) - - with open(output_path, "w") as f: - json.dump(structure, f, indent=2) - - print(f"Found:") - print(f" - {len(structure['pages'])} pages") - print(f" - {len(structure['labels'])} text labels") - print(f" - {len(structure['lines'])} horizontal lines") - print(f" - {len(structure['checkboxes'])} checkboxes") - print(f" - {len(structure['row_boundaries'])} row boundaries") - print(f"Saved to {output_path}") - - -if __name__ == "__main__": - main() diff --git a/.agents/skills/pdf/scripts/fill_fillable_fields.py b/.agents/skills/pdf/scripts/fill_fillable_fields.py deleted file mode 100644 index 51c2600..0000000 --- a/.agents/skills/pdf/scripts/fill_fillable_fields.py +++ /dev/null @@ -1,98 +0,0 @@ -import json -import sys - -from pypdf import PdfReader, PdfWriter - -from extract_form_field_info import get_field_info - - - - -def fill_pdf_fields(input_pdf_path: str, fields_json_path: str, output_pdf_path: str): - with open(fields_json_path) as f: - fields = json.load(f) - fields_by_page = {} - for field in fields: - if "value" in field: - field_id = field["field_id"] - page = field["page"] - if page not in fields_by_page: - fields_by_page[page] = {} - fields_by_page[page][field_id] = field["value"] - - reader = PdfReader(input_pdf_path) - - has_error = False - field_info = get_field_info(reader) - fields_by_ids = {f["field_id"]: f for f in field_info} - for field in 
fields: - existing_field = fields_by_ids.get(field["field_id"]) - if not existing_field: - has_error = True - print(f"ERROR: `{field['field_id']}` is not a valid field ID") - elif field["page"] != existing_field["page"]: - has_error = True - print(f"ERROR: Incorrect page number for `{field['field_id']}` (got {field['page']}, expected {existing_field['page']})") - else: - if "value" in field: - err = validation_error_for_field_value(existing_field, field["value"]) - if err: - print(err) - has_error = True - if has_error: - sys.exit(1) - - writer = PdfWriter(clone_from=reader) - for page, field_values in fields_by_page.items(): - writer.update_page_form_field_values(writer.pages[page - 1], field_values, auto_regenerate=False) - - writer.set_need_appearances_writer(True) - - with open(output_pdf_path, "wb") as f: - writer.write(f) - - -def validation_error_for_field_value(field_info, field_value): - field_type = field_info["type"] - field_id = field_info["field_id"] - if field_type == "checkbox": - checked_val = field_info["checked_value"] - unchecked_val = field_info["unchecked_value"] - if field_value != checked_val and field_value != unchecked_val: - return f'ERROR: Invalid value "{field_value}" for checkbox field "{field_id}". The checked value is "{checked_val}" and the unchecked value is "{unchecked_val}"' - elif field_type == "radio_group": - option_values = [opt["value"] for opt in field_info["radio_options"]] - if field_value not in option_values: - return f'ERROR: Invalid value "{field_value}" for radio group field "{field_id}". Valid values are: {option_values}' - elif field_type == "choice": - choice_values = [opt["value"] for opt in field_info["choice_options"]] - if field_value not in choice_values: - return f'ERROR: Invalid value "{field_value}" for choice field "{field_id}". 
Valid values are: {choice_values}' - return None - - -def monkeypatch_pydpf_method(): - from pypdf.generic import DictionaryObject - from pypdf.constants import FieldDictionaryAttributes - - original_get_inherited = DictionaryObject.get_inherited - - def patched_get_inherited(self, key: str, default = None): - result = original_get_inherited(self, key, default) - if key == FieldDictionaryAttributes.Opt: - if isinstance(result, list) and all(isinstance(v, list) and len(v) == 2 for v in result): - result = [r[0] for r in result] - return result - - DictionaryObject.get_inherited = patched_get_inherited - - -if __name__ == "__main__": - if len(sys.argv) != 4: - print("Usage: fill_fillable_fields.py [input pdf] [field_values.json] [output pdf]") - sys.exit(1) - monkeypatch_pydpf_method() - input_pdf = sys.argv[1] - fields_json = sys.argv[2] - output_pdf = sys.argv[3] - fill_pdf_fields(input_pdf, fields_json, output_pdf) diff --git a/.agents/skills/pdf/scripts/fill_pdf_form_with_annotations.py b/.agents/skills/pdf/scripts/fill_pdf_form_with_annotations.py deleted file mode 100644 index b430069..0000000 --- a/.agents/skills/pdf/scripts/fill_pdf_form_with_annotations.py +++ /dev/null @@ -1,107 +0,0 @@ -import json -import sys - -from pypdf import PdfReader, PdfWriter -from pypdf.annotations import FreeText - - - - -def transform_from_image_coords(bbox, image_width, image_height, pdf_width, pdf_height): - x_scale = pdf_width / image_width - y_scale = pdf_height / image_height - - left = bbox[0] * x_scale - right = bbox[2] * x_scale - - top = pdf_height - (bbox[1] * y_scale) - bottom = pdf_height - (bbox[3] * y_scale) - - return left, bottom, right, top - - -def transform_from_pdf_coords(bbox, pdf_height): - left = bbox[0] - right = bbox[2] - - pypdf_top = pdf_height - bbox[1] - pypdf_bottom = pdf_height - bbox[3] - - return left, pypdf_bottom, right, pypdf_top - - -def fill_pdf_form(input_pdf_path, fields_json_path, output_pdf_path): - - with open(fields_json_path, "r") as 
f: - fields_data = json.load(f) - - reader = PdfReader(input_pdf_path) - writer = PdfWriter() - - writer.append(reader) - - pdf_dimensions = {} - for i, page in enumerate(reader.pages): - mediabox = page.mediabox - pdf_dimensions[i + 1] = [mediabox.width, mediabox.height] - - annotations = [] - for field in fields_data["form_fields"]: - page_num = field["page_number"] - - page_info = next(p for p in fields_data["pages"] if p["page_number"] == page_num) - pdf_width, pdf_height = pdf_dimensions[page_num] - - if "pdf_width" in page_info: - transformed_entry_box = transform_from_pdf_coords( - field["entry_bounding_box"], - float(pdf_height) - ) - else: - image_width = page_info["image_width"] - image_height = page_info["image_height"] - transformed_entry_box = transform_from_image_coords( - field["entry_bounding_box"], - image_width, image_height, - float(pdf_width), float(pdf_height) - ) - - if "entry_text" not in field or "text" not in field["entry_text"]: - continue - entry_text = field["entry_text"] - text = entry_text["text"] - if not text: - continue - - font_name = entry_text.get("font", "Arial") - font_size = str(entry_text.get("font_size", 14)) + "pt" - font_color = entry_text.get("font_color", "000000") - - annotation = FreeText( - text=text, - rect=transformed_entry_box, - font=font_name, - font_size=font_size, - font_color=font_color, - border_color=None, - background_color=None, - ) - annotations.append(annotation) - writer.add_annotation(page_number=page_num - 1, annotation=annotation) - - with open(output_pdf_path, "wb") as output: - writer.write(output) - - print(f"Successfully filled PDF form and saved to {output_pdf_path}") - print(f"Added {len(annotations)} text annotations") - - -if __name__ == "__main__": - if len(sys.argv) != 4: - print("Usage: fill_pdf_form_with_annotations.py [input pdf] [fields.json] [output pdf]") - sys.exit(1) - input_pdf = sys.argv[1] - fields_json = sys.argv[2] - output_pdf = sys.argv[3] - - fill_pdf_form(input_pdf, 
fields_json, output_pdf) diff --git a/.agents/skills/screenshot/LICENSE.txt b/.agents/skills/screenshot/LICENSE.txt deleted file mode 100644 index 13e25df..0000000 --- a/.agents/skills/screenshot/LICENSE.txt +++ /dev/null @@ -1,201 +0,0 @@ -Apache License -Version 2.0, January 2004 -http://www.apache.org/licenses/ - -TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION - -1. Definitions. - - "License" shall mean the terms and conditions for use, reproduction, - and distribution as defined by Sections 1 through 9 of this document. - - "Licensor" shall mean the copyright owner or entity authorized by - the copyright owner that is granting the License. - - "Legal Entity" shall mean the union of the acting entity and all - other entities that control, are controlled by, or are under common - control with that entity. For the purposes of this definition, - "control" means (i) the power, direct or indirect, to cause the - direction or management of such entity, whether by contract or - otherwise, or (ii) ownership of fifty percent (50%) or more of the - outstanding shares, or (iii) beneficial ownership of such entity. - - "You" (or "Your") shall mean an individual or Legal Entity - exercising permissions granted by this License. - - "Source" form shall mean the preferred form for making modifications, - including but not limited to software source code, documentation - source, and configuration files. - - "Object" form shall mean any form resulting from mechanical - transformation or translation of a Source form, including but - not limited to compiled object code, generated documentation, - and conversions to other media types. - - "Work" shall mean the work of authorship, whether in Source or - Object form, made available under the License, as indicated by a - copyright notice that is included in or attached to the work - (an example is provided in the Appendix below). 
- - "Derivative Works" shall mean any work, whether in Source or Object - form, that is based on (or derived from) the Work and for which the - editorial revisions, annotations, elaborations, or other modifications - represent, as a whole, an original work of authorship. For the purposes - of this License, Derivative Works shall not include works that remain - separable from, or merely link (or bind by name) to the interfaces of, - the Work and Derivative Works thereof. - - "Contribution" shall mean any work of authorship, including - the original version of the Work and any modifications or additions - to that Work or Derivative Works thereof, that is intentionally - submitted to Licensor for inclusion in the Work by the copyright owner - or by an individual or Legal Entity authorized to submit on behalf of - the copyright owner. For the purposes of this definition, "submitted" - means any form of electronic, verbal, or written communication sent - to the Licensor or its representatives, including but not limited to - communication on electronic mailing lists, source code control systems, - and issue tracking systems that are managed by, or on behalf of, the - Licensor for the purpose of discussing and improving the Work, but - excluding communication that is conspicuously marked or otherwise - designated in writing by the copyright owner as "Not a Contribution." - - "Contributor" shall mean Licensor and any individual or Legal Entity - on behalf of whom a Contribution has been received by Licensor and - subsequently incorporated within the Work. - -2. Grant of Copyright License. Subject to the terms and conditions of - this License, each Contributor hereby grants to You a perpetual, - worldwide, non-exclusive, no-charge, royalty-free, irrevocable - copyright license to reproduce, prepare Derivative Works of, - publicly display, publicly perform, sublicense, and distribute the - Work and such Derivative Works in Source or Object form. - -3. 
Grant of Patent License. Subject to the terms and conditions of - this License, each Contributor hereby grants to You a perpetual, - worldwide, non-exclusive, no-charge, royalty-free, irrevocable - (except as stated in this section) patent license to make, have made, - use, offer to sell, sell, import, and otherwise transfer the Work, - where such license applies only to those patent claims licensable - by such Contributor that are necessarily infringed by their - Contribution(s) alone or by combination of their Contribution(s) - with the Work to which such Contribution(s) was submitted. If You - institute patent litigation against any entity (including a - cross-claim or counterclaim in a lawsuit) alleging that the Work - or a Contribution incorporated within the Work constitutes direct - or contributory patent infringement, then any patent licenses - granted to You under this License for that Work shall terminate - as of the date such litigation is filed. - -4. Redistribution. You may reproduce and distribute copies of the - Work or Derivative Works thereof in any medium, with or without - modifications, and in Source or Object form, provided that You - meet the following conditions: - - (a) You must give any other recipients of the Work or - Derivative Works a copy of this License; and - - (b) You must cause any modified files to carry prominent notices - stating that You changed the files; and - - (c) You must retain, in the Source form of any Derivative Works - that You distribute, all copyright, patent, trademark, and - attribution notices from the Source form of the Work, - excluding those notices that do not pertain to any part of - the Derivative Works; and - - (d) If the Work includes a "NOTICE" text file as part of its - distribution, then any Derivative Works that You distribute must - include a readable copy of the attribution notices contained - within such NOTICE file, excluding those notices that do not - pertain to any part of the Derivative Works, 
in at least one - of the following places: within a NOTICE text file distributed - as part of the Derivative Works; within the Source form or - documentation, if provided along with the Derivative Works; or, - within a display generated by the Derivative Works, if and - wherever such third-party notices normally appear. The contents - of the NOTICE file are for informational purposes only and - do not modify the License. You may add Your own attribution - notices within Derivative Works that You distribute, alongside - or as an addendum to the NOTICE text from the Work, provided - that such additional attribution notices cannot be construed - as modifying the License. - - You may add Your own copyright statement to Your modifications and - may provide additional or different license terms and conditions - for use, reproduction, or distribution of Your modifications, or - for any such Derivative Works as a whole, provided Your use, - reproduction, and distribution of the Work otherwise complies with - the conditions stated in this License. - -5. Submission of Contributions. Unless You explicitly state otherwise, - any Contribution intentionally submitted for inclusion in the Work - by You to the Licensor shall be under the terms and conditions of - this License, without any additional terms or conditions. - Notwithstanding the above, nothing herein shall supersede or modify - the terms of any separate license agreement you may have executed - with Licensor regarding such Contributions. - -6. Trademarks. This License does not grant permission to use the trade - names, trademarks, service marks, or product names of the Licensor, - except as required for reasonable and customary use in describing the - origin of the Work and reproducing the content of the NOTICE file. - -7. Disclaimer of Warranty. 
Unless required by applicable law or - agreed to in writing, Licensor provides the Work (and each - Contributor provides its Contributions) on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or - implied, including, without limitation, any warranties or conditions - of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A - PARTICULAR PURPOSE. You are solely responsible for determining the - appropriateness of using or redistributing the Work and assume any - risks associated with Your exercise of permissions under this License. - -8. Limitation of Liability. In no event and under no legal theory, - whether in tort (including negligence), contract, or otherwise, - unless required by applicable law (such as deliberate and grossly - negligent acts) or agreed to in writing, shall any Contributor be - liable to You for damages, including any direct, indirect, special, - incidental, or consequential damages of any character arising as a - result of this License or out of the use or inability to use the - Work (including but not limited to damages for loss of goodwill, - work stoppage, computer failure or malfunction, or any and all - other commercial damages or losses), even if such Contributor - has been advised of the possibility of such damages. - -9. Accepting Warranty or Additional Liability. While redistributing - the Work or Derivative Works thereof, You may choose to offer, - and charge a fee for, acceptance of support, warranty, indemnity, - or other liability obligations and/or rights consistent with this - License. However, in accepting such obligations, You may act only - on Your own behalf and on Your sole responsibility, not on behalf of - any other Contributor, and only if You agree to indemnify, - defend, and hold each Contributor harmless for any liability - incurred by, or claims asserted against, such Contributor by reason - of your accepting any such warranty or additional liability. 
- -END OF TERMS AND CONDITIONS - -APPENDIX: How to apply the Apache License to your work. - - To apply the Apache License to your work, attach the following - boilerplate notice, with the fields enclosed by brackets "[]" - replaced with your own identifying information. (Don't include - the brackets!) The text should be enclosed in the appropriate - comment syntax for the file format. We also recommend that a - file or class name and description of purpose be included on the - same "printed page" as the copyright notice for easier - identification within third-party archives. - -Copyright [yyyy] [name of copyright owner] - -Licensed under the Apache License, Version 2.0 (the "License"); -you may not use this file except in compliance with the License. -You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - -Unless required by applicable law or agreed to in writing, software -distributed under the License is distributed on an "AS IS" BASIS, -WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -See the License for the specific language governing permissions and -limitations under the License. diff --git a/.agents/skills/screenshot/SKILL.md b/.agents/skills/screenshot/SKILL.md deleted file mode 100644 index 1d967bf..0000000 --- a/.agents/skills/screenshot/SKILL.md +++ /dev/null @@ -1,267 +0,0 @@ ---- -name: "screenshot" -description: "Use when the user explicitly asks for a desktop or system screenshot (full screen, specific app or window, or a pixel region), or when tool-specific capture capabilities are unavailable and an OS-level capture is needed." ---- - - -# Screenshot Capture - -Follow these save-location rules every time: - -1) If the user specifies a path, save there. -2) If the user asks for a screenshot without a path, save to the OS default screenshot location. -3) If Codex needs a screenshot for its own inspection, save to the temp directory.
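The three save-location rules above can be read as a small decision function. The following is a minimal sketch, not the bundled helper's actual code: `resolve_save_dir`, its parameters, and the idea of passing the OS default directory in as an argument are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_save_dir(user_path: Optional[str], for_inspection: bool, os_default: Path) -> Path:
    """Pick a save location by the three rules (hypothetical helper, not from the skill)."""
    if user_path:
        # Rule 1: an explicit user path always wins.
        return Path(user_path).expanduser()
    if for_inspection:
        # Rule 3: captures taken for the agent's own inspection go to temp.
        return Path(tempfile.gettempdir())
    # Rule 2: a plain "take a screenshot" request uses the OS default location.
    return os_default
```

Checking the explicit path first keeps a user-supplied location from being overridden when the capture also happens to be an inspection run.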
- -## Tool priority - -- Prefer tool-specific screenshot capabilities when available (for example: a Figma MCP/skill for Figma files, or Playwright/agent-browser tools for browsers and Electron apps). -- Use this skill when explicitly asked, for whole-system desktop captures, or when a tool-specific capture cannot get what you need. -- Otherwise, treat this skill as the default for desktop apps without a better-integrated capture tool. - -## macOS permission preflight (reduce repeated prompts) - -On macOS, run the preflight helper once before window/app capture. It checks -Screen Recording permission, explains why it is needed, and requests it in one -place. - -The helpers route Swift's module cache to `$TMPDIR/codex-swift-module-cache` -to avoid extra sandbox module-cache prompts. - -```bash -bash /scripts/ensure_macos_permissions.sh -``` - -To avoid multiple sandbox approval prompts, combine preflight + capture in one -command when possible: - -```bash -bash /scripts/ensure_macos_permissions.sh && \ -python3 /scripts/take_screenshot.py --app "Codex" -``` - -For Codex inspection runs, keep the output in temp: - -```bash -bash /scripts/ensure_macos_permissions.sh && \ -python3 /scripts/take_screenshot.py --app "" --mode temp -``` - -Use the bundled scripts to avoid re-deriving OS-specific commands. 
- -## macOS and Linux (Python helper) - -Run the helper from the repo root: - -```bash -python3 /scripts/take_screenshot.py -``` - -Common patterns: - -- Default location (user asked for "a screenshot"): - -```bash -python3 /scripts/take_screenshot.py -``` - -- Temp location (Codex visual check): - -```bash -python3 /scripts/take_screenshot.py --mode temp -``` - -- Explicit location (user provided a path or filename): - -```bash -python3 /scripts/take_screenshot.py --path output/screen.png -``` - -- App/window capture by app name (macOS only; substring match is OK; captures all matching windows): - -```bash -python3 /scripts/take_screenshot.py --app "Codex" -``` - -- Specific window title within an app (macOS only): - -```bash -python3 /scripts/take_screenshot.py --app "Codex" --window-name "Settings" -``` - -- List matching window ids before capturing (macOS only): - -```bash -python3 /scripts/take_screenshot.py --list-windows --app "Codex" -``` - -- Pixel region (x,y,w,h): - -```bash -python3 /scripts/take_screenshot.py --mode temp --region 100,200,800,600 -``` - -- Focused/active window (captures only the frontmost window; use `--app` to capture all windows): - -```bash -python3 /scripts/take_screenshot.py --mode temp --active-window -``` - -- Specific window id (use --list-windows on macOS to discover ids): - -```bash -python3 /scripts/take_screenshot.py --window-id 12345 -``` - -The script prints one path per capture. When multiple windows or displays match, it prints multiple paths (one per line) and adds suffixes like `-w` or `-d`. View each path sequentially with the image viewer tool, and only manipulate images if needed or requested. - -### Workflow examples - -- "Take a look at and tell me what you see": capture to temp, then view each printed path in order. 
- -```bash -bash /scripts/ensure_macos_permissions.sh && \ -python3 /scripts/take_screenshot.py --app "" --mode temp -``` - -- "The design from Figma is not matching what is implemented": use a Figma MCP/skill to capture the design first, then capture the running app with this skill (typically to temp) and compare the raw screenshots before any manipulation. - -### Multi-display behavior - -- On macOS, full-screen captures save one file per display when multiple monitors are connected. -- On Linux and Windows, full-screen captures use the virtual desktop (all monitors in one image); use `--region` to isolate a single display when needed. - -### Linux prerequisites and selection logic - -The helper automatically selects the first available tool: - -1) `scrot` -2) `gnome-screenshot` -3) ImageMagick `import` - -If none are available, ask the user to install one of them and retry. - -Coordinate regions require `scrot` or ImageMagick `import`. - -`--app`, `--window-name`, and `--list-windows` are macOS-only. On Linux, use -`--active-window` or provide `--window-id` when available. 
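The selection order described above (first available of `scrot`, `gnome-screenshot`, ImageMagick `import`, with region capture limited to `scrot` and `import`) might be sketched as follows. This is an illustrative assumption about how such a lookup could work, not the code in `take_screenshot.py`; `pick_linux_tool` and the injectable `which` parameter are hypothetical names.

```python
import shutil
from typing import Callable, Optional

# Preference order as listed in the section above.
LINUX_TOOLS = ("scrot", "gnome-screenshot", "import")
# Tools the section says can capture a coordinate region.
REGION_CAPABLE = ("scrot", "import")


def pick_linux_tool(
    need_region: bool = False,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first usable tool name, or None if nothing is installed."""
    candidates = REGION_CAPABLE if need_region else LINUX_TOOLS
    for tool in candidates:
        if which(tool):  # shutil.which returns a path string or None
            return tool
    return None
```

A `None` result corresponds to the case where the user should be asked to install one of the tools and retry. Passing `which` as a parameter is only a convenience so the lookup can be exercised without the tools installed.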
- -## Windows (PowerShell helper) - -Run the PowerShell helper: - -```powershell -powershell -ExecutionPolicy Bypass -File /scripts/take_screenshot.ps1 -``` - -Common patterns: - -- Default location: - -```powershell -powershell -ExecutionPolicy Bypass -File /scripts/take_screenshot.ps1 -``` - -- Temp location (Codex visual check): - -```powershell -powershell -ExecutionPolicy Bypass -File /scripts/take_screenshot.ps1 -Mode temp -``` - -- Explicit path: - -```powershell -powershell -ExecutionPolicy Bypass -File /scripts/take_screenshot.ps1 -Path "C:\Temp\screen.png" -``` - -- Pixel region (x,y,w,h): - -```powershell -powershell -ExecutionPolicy Bypass -File /scripts/take_screenshot.ps1 -Mode temp -Region 100,200,800,600 -``` - -- Active window (ask the user to focus it first): - -```powershell -powershell -ExecutionPolicy Bypass -File /scripts/take_screenshot.ps1 -Mode temp -ActiveWindow -``` - -- Specific window handle (only when provided): - -```powershell -powershell -ExecutionPolicy Bypass -File /scripts/take_screenshot.ps1 -WindowHandle 123456 -``` - -## Direct OS commands (fallbacks) - -Use these when you cannot run the helpers. 
- -### macOS - -- Full screen to a specific path: - -```bash -screencapture -x output/screen.png -``` - -- Pixel region: - -```bash -screencapture -x -R100,200,800,600 output/region.png -``` - -- Specific window id: - -```bash -screencapture -x -l12345 output/window.png -``` - -- Interactive selection or window pick: - -```bash -screencapture -x -i output/interactive.png -``` - -### Linux - -- Full screen: - -```bash -scrot output/screen.png -``` - -```bash -gnome-screenshot -f output/screen.png -``` - -```bash -import -window root output/screen.png -``` - -- Pixel region: - -```bash -scrot -a 100,200,800,600 output/region.png -``` - -```bash -import -window root -crop 800x600+100+200 output/region.png -``` - -- Active window: - -```bash -scrot -u output/window.png -``` - -```bash -gnome-screenshot -w -f output/window.png -``` - -## Error handling - -- On macOS, run `bash /scripts/ensure_macos_permissions.sh` first to request Screen Recording in one place. -- If you see "screen capture checks are blocked in the sandbox", "could not create image from display", or Swift `ModuleCache` permission errors in a sandboxed run, rerun the command with escalated permissions. -- If macOS app/window capture returns no matches, run `--list-windows --app "AppName"` and retry with `--window-id`, and make sure the app is visible on screen. -- If Linux region/window capture fails, check tool availability with `command -v scrot`, `command -v gnome-screenshot`, and `command -v import`. -- If saving to the OS default location fails with permission errors in a sandbox, rerun the command with escalated permissions. -- Always report the saved file path in the response. 
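The region fallbacks listed under "Direct OS commands" all take the same `x,y,w,h` input but spell it differently per tool. Below is a minimal sketch of that mapping, built only from the command forms shown above; `parse_region` and `region_command` are hypothetical names, and the bundled helper may assemble its commands differently.

```python
from typing import List, Tuple


def parse_region(text: str) -> Tuple[int, int, int, int]:
    """Parse 'x,y,w,h' into integers, rejecting malformed or empty regions."""
    parts = [p.strip() for p in text.split(",")]
    if len(parts) != 4:
        raise ValueError("region must be x,y,w,h")
    x, y, w, h = (int(p) for p in parts)
    if w <= 0 or h <= 0:
        raise ValueError("region width and height must be positive")
    return x, y, w, h


def region_command(tool: str, region: str, out_path: str) -> List[str]:
    """Build the argv for a region capture with one of the fallback tools."""
    x, y, w, h = parse_region(region)
    if tool == "screencapture":
        return ["screencapture", "-x", f"-R{x},{y},{w},{h}", out_path]
    if tool == "scrot":
        return ["scrot", "-a", f"{x},{y},{w},{h}", out_path]
    if tool == "import":
        # ImageMagick geometry is WxH followed by signed x and y offsets.
        return ["import", "-window", "root", "-crop", f"{w}x{h}{x:+d}{y:+d}", out_path]
    raise ValueError(f"no region fallback known for {tool}")
```

Returning an argument list rather than a shell string means the result can be handed to `subprocess.run` without quoting concerns.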
diff --git a/.agents/skills/screenshot/agents/openai.yaml b/.agents/skills/screenshot/agents/openai.yaml deleted file mode 100644 index 9426051..0000000 --- a/.agents/skills/screenshot/agents/openai.yaml +++ /dev/null @@ -1,6 +0,0 @@ -interface: - display_name: "Screenshot Capture" - short_description: "Capture screenshots" - icon_small: "./assets/screenshot-small.svg" - icon_large: "./assets/screenshot.png" - default_prompt: "Capture the right screenshot for this task (target, area, and output path)." diff --git a/.agents/skills/screenshot/assets/screenshot-small.svg b/.agents/skills/screenshot/assets/screenshot-small.svg deleted file mode 100644 index 11738c8..0000000 --- a/.agents/skills/screenshot/assets/screenshot-small.svg +++ /dev/null @@ -1,5 +0,0 @@ - - - - - diff --git a/.agents/skills/screenshot/assets/screenshot.png b/.agents/skills/screenshot/assets/screenshot.png deleted file mode 100644 index d2c805da6398b2d4b535819286271f929f27b399..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 860 zcmeAS@N?(olHy`uVBq!ia0vp^DIm``W z$lZxy-8q?;Kn_c~qpu?a!^VE@KZ&di3``$AT^vIy7~kIZ%@+!kXnQEG?3#0M%aX8M z;n*v!YGn_P)xSPv|M*y|UQEejC!Z*$tiTIS(a&22*k|xP=h1t&@%&VY%ZV0z#pjPp z25GymU%$@BAaCXZ4UV3s0~o}aTmmTj&m|S$Mao<-p<4>%wuJ=Ea?6mLZ zXYJ<9n#=R)xJAYc+icbU6|eq$`B*lok#GCidv%er-!iV}?BBQ{_oPJi?8~`dniuyh z(u=H1(>WGV^Y>}ud%~Ee7``XHX`}E``l_BxVdlohM^zD1~>g%5u z`k!u{c~ew&H@ltD`u6pe(?0M?Zaw)sI`Q?-rRv$~KQFHe8kU+;* zg5`7Eaw{V@rycyH*S+xHZ)25XKawhMCZrs66E9PCWsNkgp7l|3lE35TY5n@T9{+8s zE6Xzu?%KbmWKmPiFS%`T_ofuzHLN=6_m0)&wB`r)PcS-~X=k*PHEYpT_4fckX-J8Ym#a+K5K1+*5h|di@Q?l_gfPtc({`x!oKS fL|6tf@r=Lm&@7>KQ?4rjGXsOCtDnm{r-UW|qnvBP diff --git a/.agents/skills/screenshot/scripts/ensure_macos_permissions.sh b/.agents/skills/screenshot/scripts/ensure_macos_permissions.sh deleted file mode 100644 index a422985..0000000 --- a/.agents/skills/screenshot/scripts/ensure_macos_permissions.sh +++ /dev/null @@ -1,54 +0,0 @@ -#!/usr/bin/env bash -set -euo pipefail - -if [[ 
"$(uname)" != "Darwin" ]]; then - echo "ensure_macos_permissions.sh only supports macOS" >&2 - exit 1 -fi - -if ! command -v swift >/dev/null 2>&1; then - echo "swift is required to check macOS screen capture permissions" >&2 - exit 1 -fi - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -PERM_SWIFT="$SCRIPT_DIR/macos_permissions.swift" -MODULE_CACHE="${TMPDIR:-/tmp}/codex-swift-module-cache" -mkdir -p "$MODULE_CACHE" - -screen_capture_status() { - local json - json="$(swift -module-cache-path "$MODULE_CACHE" "$PERM_SWIFT" "$@")" - python3 -c 'import json, sys; data=json.loads(sys.argv[1]); print("1" if data.get("screenCapture") else "0")' "$json" -} - -if [[ -n "${CODEX_SANDBOX:-}" ]]; then - echo "Screen capture checks are blocked in the sandbox; rerun with escalated permissions." >&2 - exit 3 -fi - -if [[ "$(screen_capture_status)" == "1" ]]; then - echo "Screen Recording permission already granted." - exit 0 -fi - -cat <<'MSG' -This workflow needs macOS Screen Recording permission to capture screenshots. -macOS will show a single system prompt for Screen Recording. Approve it, then -return here. If macOS opens System Settings instead of prompting, enable Screen -Recording for your terminal and rerun the command. -MSG - -# Request permission once after explaining why it is needed. -screen_capture_status --request >/dev/null || true - -if [[ "$(screen_capture_status)" != "1" ]]; then - cat <<'MSG' -Screen Recording is still not granted. -Open System Settings > Privacy & Security > Screen Recording and enable it for -your terminal (and Codex if needed), then rerun your screenshot command. -MSG - exit 2 -fi - -echo "Screen Recording permission granted." 
diff --git a/.agents/skills/screenshot/scripts/macos_display_info.swift b/.agents/skills/screenshot/scripts/macos_display_info.swift deleted file mode 100644 index da6e2f1..0000000 --- a/.agents/skills/screenshot/scripts/macos_display_info.swift +++ /dev/null @@ -1,22 +0,0 @@ -import AppKit -import Foundation - -struct Response: Encodable { - let count: Int - let displays: [Int] -} - -let count = max(NSScreen.screens.count, 1) -let displays = Array(1...count) - -let response = Response(count: count, displays: displays) -let encoder = JSONEncoder() -encoder.outputFormatting = [.sortedKeys] - -if let data = try? encoder.encode(response), - let json = String(data: data, encoding: .utf8) { - print(json) -} else { - fputs("{\"count\":\(count)}\n", stderr) - exit(1) -} diff --git a/.agents/skills/screenshot/scripts/macos_permissions.swift b/.agents/skills/screenshot/scripts/macos_permissions.swift deleted file mode 100644 index c736d10..0000000 --- a/.agents/skills/screenshot/scripts/macos_permissions.swift +++ /dev/null @@ -1,40 +0,0 @@ -import CoreGraphics -import Foundation - -struct Status: Encodable { - let screenCapture: Bool - let requested: Bool -} - -let shouldRequest = CommandLine.arguments.contains("--request") - -@available(macOS 10.15, *) -func screenCaptureGranted(request: Bool) -> Bool { - if CGPreflightScreenCaptureAccess() { - return true - } - if request { - _ = CGRequestScreenCaptureAccess() - return CGPreflightScreenCaptureAccess() - } - return false -} - -let granted: Bool -if #available(macOS 10.15, *) { - granted = screenCaptureGranted(request: shouldRequest) -} else { - granted = true -} - -let status = Status(screenCapture: granted, requested: shouldRequest) -let encoder = JSONEncoder() -encoder.outputFormatting = [.sortedKeys] - -if let data = try? 
encoder.encode(status), - let json = String(data: data, encoding: .utf8) { - print(json) -} else { - fputs("{\"requested\":\(shouldRequest),\"screenCapture\":\(granted)}\n", stderr) - exit(1) -} diff --git a/.agents/skills/screenshot/scripts/macos_window_info.swift b/.agents/skills/screenshot/scripts/macos_window_info.swift deleted file mode 100644 index 83bd4a8..0000000 --- a/.agents/skills/screenshot/scripts/macos_window_info.swift +++ /dev/null @@ -1,126 +0,0 @@ -import AppKit -import CoreGraphics -import Foundation - -struct Bounds: Encodable { - let x: Int - let y: Int - let width: Int - let height: Int -} - -struct WindowInfo: Encodable { - let id: Int - let owner: String - let name: String - let layer: Int - let bounds: Bounds - let area: Int -} - -struct Response: Encodable { - let count: Int - let selected: WindowInfo? - let windows: [WindowInfo]? -} - -func value(for flag: String) -> String? { - guard let idx = CommandLine.arguments.firstIndex(of: flag) else { - return nil - } - let next = CommandLine.arguments.index(after: idx) - guard next < CommandLine.arguments.endIndex else { - return nil - } - return CommandLine.arguments[next] -} - -let frontmostFlag = CommandLine.arguments.contains("--frontmost") -let explicitApp = value(for: "--app") -let frontmostName = frontmostFlag ? NSWorkspace.shared.frontmostApplication?.localizedName : nil -if frontmostFlag && frontmostName == nil { - fputs("{\"count\":0}\n", stderr) - exit(1) -} -let appFilter = (explicitApp ?? frontmostName)?.lowercased() -let nameFilter = value(for: "--window-name")?.lowercased() -let includeList = CommandLine.arguments.contains("--list") - -let options: CGWindowListOption = [.optionOnScreenOnly, .excludeDesktopElements] -guard let raw = CGWindowListCopyWindowInfo(options, kCGNullWindowID) as? 
[[String: Any]] else { - fputs("{\"count\":0}\n", stderr) - exit(1) -} - -var exactMatches: [WindowInfo] = [] -var partialMatches: [WindowInfo] = [] -exactMatches.reserveCapacity(raw.count) -partialMatches.reserveCapacity(raw.count) - -for entry in raw { - guard let owner = entry[kCGWindowOwnerName as String] as? String else { continue } - let ownerLower = owner.lowercased() - if let appFilter, !ownerLower.contains(appFilter) { continue } - - let name = (entry[kCGWindowName as String] as? String) ?? "" - if let nameFilter, !name.lowercased().contains(nameFilter) { continue } - - guard let number = entry[kCGWindowNumber as String] as? Int else { continue } - let layer = (entry[kCGWindowLayer as String] as? Int) ?? 0 - - guard let boundsDict = entry[kCGWindowBounds as String] as? [String: Any] else { continue } - let x = Int((boundsDict["X"] as? Double) ?? 0) - let y = Int((boundsDict["Y"] as? Double) ?? 0) - let width = Int((boundsDict["Width"] as? Double) ?? 0) - let height = Int((boundsDict["Height"] as? Double) ?? 0) - if width <= 0 || height <= 0 { continue } - - let bounds = Bounds(x: x, y: y, width: width, height: height) - let area = width * height - let info = WindowInfo(id: number, owner: owner, name: name, layer: layer, bounds: bounds, area: area) - if let appFilter, ownerLower == appFilter { - exactMatches.append(info) - } else { - partialMatches.append(info) - } -} - -let windows: [WindowInfo] -if appFilter != nil && !exactMatches.isEmpty { - windows = exactMatches -} else { - windows = partialMatches -} - -func rank(_ window: WindowInfo) -> (Int, Int) { - // Prefer normal-layer windows, then larger area. - let layerScore = window.layer == 0 ? 0 : 1 - return (layerScore, -window.area) -} - -let ordered: [WindowInfo] -if frontmostFlag { - ordered = windows -} else { - ordered = windows.sorted { rank($0) < rank($1) } -} -let selected = ordered.first - -let list: [WindowInfo]? 
-if includeList { - list = ordered -} else { - list = nil -} - -let response = Response(count: windows.count, selected: selected, windows: list) -let encoder = JSONEncoder() -encoder.outputFormatting = [.sortedKeys] - -if let data = try? encoder.encode(response), - let json = String(data: data, encoding: .utf8) { - print(json) -} else { - fputs("{\"count\":\(windows.count)}\n", stderr) - exit(1) -} diff --git a/.agents/skills/screenshot/scripts/take_screenshot.ps1 b/.agents/skills/screenshot/scripts/take_screenshot.ps1 deleted file mode 100644 index d8b8de0..0000000 --- a/.agents/skills/screenshot/scripts/take_screenshot.ps1 +++ /dev/null @@ -1,163 +0,0 @@ -param( - [string]$Path, - [ValidateSet("default", "temp")][string]$Mode = "default", - [string]$Format = "png", - [string]$Region, - [switch]$ActiveWindow, - [int]$WindowHandle -) - -Set-StrictMode -Version Latest -$ErrorActionPreference = "Stop" - -function Get-Timestamp { - Get-Date -Format "yyyy-MM-dd_HH-mm-ss" -} - -function Get-DefaultDirectory { - $home = [Environment]::GetFolderPath("UserProfile") - $pictures = Join-Path $home "Pictures" - $screenshots = Join-Path $pictures "Screenshots" - if (Test-Path $screenshots) { return $screenshots } - if (Test-Path $pictures) { return $pictures } - return $home -} - -function New-DefaultFilename { - param([string]$Prefix) - if (-not $Prefix) { $Prefix = "screenshot" } - "$Prefix-$(Get-Timestamp).$Format" -} - -function Resolve-OutputPath { - if ($Path) { - $expanded = [Environment]::ExpandEnvironmentVariables($Path) - $homeDir = [Environment]::GetFolderPath("UserProfile") - if ($expanded -eq "~") { - $expanded = $homeDir - } elseif ($expanded.StartsWith("~/") -or $expanded.StartsWith("~\\")) { - $expanded = Join-Path $homeDir $expanded.Substring(2) - } - $full = [System.IO.Path]::GetFullPath($expanded) - if ((Test-Path $full) -and (Get-Item $full).PSIsContainer) { - $full = Join-Path $full (New-DefaultFilename "") - } elseif (($expanded.EndsWith("\") -or 
$expanded.EndsWith("/")) -and -not (Test-Path $full)) { - New-Item -ItemType Directory -Path $full -Force | Out-Null - $full = Join-Path $full (New-DefaultFilename "") - } elseif ([System.IO.Path]::GetExtension($full) -eq "") { - $full = "$full.$Format" - } - $parent = Split-Path -Parent $full - if ($parent) { - New-Item -ItemType Directory -Path $parent -Force | Out-Null - } - return $full - } - - if ($Mode -eq "temp") { - $tmp = [System.IO.Path]::GetTempPath() - return Join-Path $tmp (New-DefaultFilename "codex-shot") - } - - $dest = Get-DefaultDirectory - return Join-Path $dest (New-DefaultFilename "") -} - -function Parse-Region { - if (-not $Region) { return $null } - $parts = $Region.Split(",") | ForEach-Object { $_.Trim() } - if ($parts.Length -ne 4) { - throw "Region must be x,y,w,h" - } - $values = $parts | ForEach-Object { - $out = 0 - if (-not [int]::TryParse($_, [ref]$out)) { - throw "Region values must be integers" - } - $out - } - if ($values[2] -le 0 -or $values[3] -le 0) { - throw "Region width and height must be positive" - } - return $values -} - -if ($Region -and $ActiveWindow) { - throw "Choose either -Region or -ActiveWindow" -} -if ($Region -and $WindowHandle) { - throw "Choose either -Region or -WindowHandle" -} -if ($ActiveWindow -and $WindowHandle) { - throw "Choose either -ActiveWindow or -WindowHandle" -} - -$regionValues = Parse-Region -$outputPath = Resolve-OutputPath - -Add-Type -AssemblyName System.Windows.Forms -Add-Type -AssemblyName System.Drawing - -$imageFormat = switch ($Format.ToLowerInvariant()) { - "png" { [System.Drawing.Imaging.ImageFormat]::Png } - "jpg" { [System.Drawing.Imaging.ImageFormat]::Jpeg } - "jpeg" { [System.Drawing.Imaging.ImageFormat]::Jpeg } - "bmp" { [System.Drawing.Imaging.ImageFormat]::Bmp } - default { throw "Unsupported format: $Format" } -} - -Add-Type @" -using System; -using System.Runtime.InteropServices; -public static class NativeMethods { - [StructLayout(LayoutKind.Sequential)] - public struct 
RECT { - public int Left; - public int Top; - public int Right; - public int Bottom; - } - - [DllImport("user32.dll")] - public static extern IntPtr GetForegroundWindow(); - - [DllImport("user32.dll")] - public static extern bool GetWindowRect(IntPtr hWnd, out RECT rect); -} -"@ - -if ($regionValues) { - $x = $regionValues[0] - $y = $regionValues[1] - $w = $regionValues[2] - $h = $regionValues[3] - $bounds = New-Object System.Drawing.Rectangle($x, $y, $w, $h) -} elseif ($ActiveWindow -or $WindowHandle) { - $handle = if ($WindowHandle) { [IntPtr]$WindowHandle } else { [NativeMethods]::GetForegroundWindow() } - $rect = New-Object NativeMethods+RECT - if (-not [NativeMethods]::GetWindowRect($handle, [ref]$rect)) { - throw "Failed to get window bounds" - } - $width = $rect.Right - $rect.Left - $height = $rect.Bottom - $rect.Top - $bounds = New-Object System.Drawing.Rectangle($rect.Left, $rect.Top, $width, $height) -} else { - $vs = [System.Windows.Forms.SystemInformation]::VirtualScreen - $bounds = New-Object System.Drawing.Rectangle($vs.Left, $vs.Top, $vs.Width, $vs.Height) -} - -$bitmap = New-Object System.Drawing.Bitmap($bounds.Width, $bounds.Height) -$graphics = [System.Drawing.Graphics]::FromImage($bitmap) - -try { - $source = New-Object System.Drawing.Point($bounds.Left, $bounds.Top) - $target = [System.Drawing.Point]::Empty - $size = New-Object System.Drawing.Size($bounds.Width, $bounds.Height) - $graphics.CopyFromScreen($source, $target, $size) - $bitmap.Save($outputPath, $imageFormat) -} finally { - $graphics.Dispose() - $bitmap.Dispose() -} - -Write-Output $outputPath diff --git a/.agents/skills/screenshot/scripts/take_screenshot.py b/.agents/skills/screenshot/scripts/take_screenshot.py deleted file mode 100644 index 4cbb33b..0000000 --- a/.agents/skills/screenshot/scripts/take_screenshot.py +++ /dev/null @@ -1,585 +0,0 @@ -#!/usr/bin/env python3 -"""Cross-platform screenshot helper for Codex skills.""" - -from __future__ import annotations - -import argparse 
-import datetime as dt -import json -import os -import platform -import shutil -import subprocess -import tempfile -from pathlib import Path - -SCRIPT_DIR = Path(__file__).resolve().parent -MAC_PERM_SCRIPT = SCRIPT_DIR / "macos_permissions.swift" -MAC_PERM_HELPER = SCRIPT_DIR / "ensure_macos_permissions.sh" -MAC_WINDOW_SCRIPT = SCRIPT_DIR / "macos_window_info.swift" -MAC_DISPLAY_SCRIPT = SCRIPT_DIR / "macos_display_info.swift" -TEST_MODE_ENV = "CODEX_SCREENSHOT_TEST_MODE" -TEST_PLATFORM_ENV = "CODEX_SCREENSHOT_TEST_PLATFORM" -TEST_WINDOWS_ENV = "CODEX_SCREENSHOT_TEST_WINDOWS" -TEST_DISPLAYS_ENV = "CODEX_SCREENSHOT_TEST_DISPLAYS" -TEST_PNG = ( - b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00\x01\x00\x00\x00\x01" - b"\x08\x06\x00\x00\x00\x1f\x15\xc4\x89\x00\x00\x00\x0cIDAT\x08\xd7c" - b"\xf8\xff\xff?\x00\x05\xfe\x02\xfeA\xad\x1c\x1c\x00\x00\x00\x00IEND" - b"\xaeB`\x82" -) - - -def parse_region(value: str) -> tuple[int, int, int, int]: - parts = [p.strip() for p in value.split(",")] - if len(parts) != 4: - raise argparse.ArgumentTypeError("region must be x,y,w,h") - try: - x, y, w, h = (int(p) for p in parts) - except ValueError as exc: - raise argparse.ArgumentTypeError("region values must be integers") from exc - if w <= 0 or h <= 0: - raise argparse.ArgumentTypeError("region width and height must be positive") - return x, y, w, h - - -def test_mode_enabled() -> bool: - value = os.environ.get(TEST_MODE_ENV, "") - return value.lower() in {"1", "true", "yes", "on"} - - -def normalize_platform(value: str) -> str: - lowered = value.strip().lower() - if lowered in {"darwin", "mac", "macos", "osx"}: - return "Darwin" - if lowered in {"linux", "ubuntu"}: - return "Linux" - if lowered in {"windows", "win"}: - return "Windows" - return value - - -def test_platform_override() -> str | None: - value = os.environ.get(TEST_PLATFORM_ENV) - if value: - return normalize_platform(value) - return None - - -def parse_int_list(value: str) -> list[int]: - results: list[int] = [] - 
for part in value.split(","): - part = part.strip() - if not part: - continue - try: - results.append(int(part)) - except ValueError: - continue - return results - - -def test_window_ids() -> list[int]: - value = os.environ.get(TEST_WINDOWS_ENV, "101,102") - ids = parse_int_list(value) - return ids or [101] - - -def test_display_ids() -> list[int]: - value = os.environ.get(TEST_DISPLAYS_ENV, "1,2") - ids = parse_int_list(value) - return ids or [1] - - -def write_test_png(path: Path) -> None: - ensure_parent(path) - path.write_bytes(TEST_PNG) - - -def timestamp() -> str: - return dt.datetime.now().strftime("%Y-%m-%d_%H-%M-%S") - - -def default_filename(fmt: str, prefix: str = "screenshot") -> str: - return f"{prefix}-{timestamp()}.{fmt}" - - -def mac_default_dir() -> Path: - desktop = Path.home() / "Desktop" - try: - proc = subprocess.run( - ["defaults", "read", "com.apple.screencapture", "location"], - check=False, - capture_output=True, - text=True, - ) - location = proc.stdout.strip() - if location: - return Path(location).expanduser() - except OSError: - pass - return desktop - - -def default_dir(system: str) -> Path: - home = Path.home() - if system == "Darwin": - return mac_default_dir() - if system == "Windows": - pictures = home / "Pictures" - screenshots = pictures / "Screenshots" - if screenshots.exists(): - return screenshots - if pictures.exists(): - return pictures - return home - pictures = home / "Pictures" - screenshots = pictures / "Screenshots" - if screenshots.exists(): - return screenshots - if pictures.exists(): - return pictures - return home - - -def ensure_parent(path: Path) -> None: - try: - path.parent.mkdir(parents=True, exist_ok=True) - except OSError: - # Fall back to letting the capture command report a clearer error. 
- pass - - -def resolve_output_path( - requested_path: str | None, mode: str, fmt: str, system: str -) -> Path: - if requested_path: - path = Path(requested_path).expanduser() - if path.exists() and path.is_dir(): - path = path / default_filename(fmt) - elif requested_path.endswith(("/", "\\")) and not path.exists(): - path.mkdir(parents=True, exist_ok=True) - path = path / default_filename(fmt) - elif path.suffix == "": - path = path.with_suffix(f".{fmt}") - ensure_parent(path) - return path - - if mode == "temp": - tmp_dir = Path(tempfile.gettempdir()) - tmp_path = tmp_dir / default_filename(fmt, prefix="codex-shot") - ensure_parent(tmp_path) - return tmp_path - - dest_dir = default_dir(system) - dest_path = dest_dir / default_filename(fmt) - ensure_parent(dest_path) - return dest_path - - -def multi_output_paths(base: Path, suffixes: list[str]) -> list[Path]: - if len(suffixes) <= 1: - return [base] - paths: list[Path] = [] - for suffix in suffixes: - candidate = base.with_name(f"{base.stem}-{suffix}{base.suffix}") - ensure_parent(candidate) - paths.append(candidate) - return paths - - -def run(cmd: list[str]) -> None: - try: - subprocess.run(cmd, check=True) - except FileNotFoundError as exc: - raise SystemExit(f"required command not found: {cmd[0]}") from exc - except subprocess.CalledProcessError as exc: - raise SystemExit(f"command failed ({exc.returncode}): {' '.join(cmd)}") from exc - - -def swift_json(script: Path, extra_args: list[str] | None = None) -> dict: - module_cache = Path(tempfile.gettempdir()) / "codex-swift-module-cache" - module_cache.mkdir(parents=True, exist_ok=True) - cmd = ["swift", "-module-cache-path", str(module_cache), str(script)] - if extra_args: - cmd.extend(extra_args) - try: - proc = subprocess.run(cmd, check=True, capture_output=True, text=True) - except FileNotFoundError as exc: - raise SystemExit("swift not found; install Xcode command line tools") from exc - except subprocess.CalledProcessError as exc: - stderr = (exc.stderr 
or "").strip() - if "ModuleCache" in stderr and "Operation not permitted" in stderr: - raise SystemExit( - "swift needs module-cache access; rerun with escalated permissions" - ) from exc - msg = stderr or (exc.stdout or "").strip() or "swift helper failed" - raise SystemExit(msg) from exc - try: - return json.loads(proc.stdout) - except json.JSONDecodeError as exc: - raise SystemExit(f"swift helper returned invalid JSON: {proc.stdout.strip()}") from exc - - -def macos_screen_capture_granted(request: bool = False) -> bool: - args = ["--request"] if request else [] - payload = swift_json(MAC_PERM_SCRIPT, args) - return bool(payload.get("screenCapture")) - - -def ensure_macos_permissions() -> None: - if os.environ.get("CODEX_SANDBOX"): - raise SystemExit( - "screen capture checks are blocked in the sandbox; rerun with escalated permissions" - ) - if macos_screen_capture_granted(): - return - subprocess.run(["bash", str(MAC_PERM_HELPER)], check=False) - if not macos_screen_capture_granted(): - raise SystemExit( - "Screen Recording permission is required; enable it in System Settings and retry" - ) - - -def activate_app(app: str) -> None: - safe_app = app.replace('"', '\\"') - script = f'tell application "{safe_app}" to activate' - subprocess.run(["osascript", "-e", script], check=False, capture_output=True, text=True) - - -def macos_window_payload(args: argparse.Namespace, frontmost: bool, include_list: bool) -> dict: - flags: list[str] = [] - if frontmost: - flags.append("--frontmost") - if args.app: - flags.extend(["--app", args.app]) - if args.window_name: - flags.extend(["--window-name", args.window_name]) - if include_list: - flags.append("--list") - return swift_json(MAC_WINDOW_SCRIPT, flags) - - -def macos_display_indexes() -> list[int]: - payload = swift_json(MAC_DISPLAY_SCRIPT) - displays = payload.get("displays") or [] - indexes: list[int] = [] - for item in displays: - try: - value = int(item) - except (TypeError, ValueError): - continue - if value > 0: - 
indexes.append(value) - return indexes or [1] - - -def macos_window_ids(args: argparse.Namespace, capture_all: bool) -> list[int]: - payload = macos_window_payload( - args, - frontmost=args.active_window, - include_list=capture_all, - ) - if capture_all: - windows = payload.get("windows") or [] - ids: list[int] = [] - for item in windows: - win_id = item.get("id") - if win_id is None: - continue - try: - ids.append(int(win_id)) - except (TypeError, ValueError): - continue - if ids: - return ids - selected = payload.get("selected") or {} - win_id = selected.get("id") - if win_id is not None: - try: - return [int(win_id)] - except (TypeError, ValueError): - pass - raise SystemExit("no matching macOS window found; try --list-windows to inspect ids") - - -def list_macos_windows(args: argparse.Namespace) -> None: - payload = macos_window_payload(args, frontmost=args.active_window, include_list=True) - windows = payload.get("windows") or [] - if not windows: - print("no matching windows found") - return - for item in windows: - bounds = item.get("bounds") or {} - name = item.get("name") or "" - width = bounds.get("width", 0) - height = bounds.get("height", 0) - x = bounds.get("x", 0) - y = bounds.get("y", 0) - print(f"{item.get('id')}\t{item.get('owner')}\t{name}\t{width}x{height}+{x}+{y}") - - -def list_test_macos_windows(args: argparse.Namespace) -> None: - owner = args.app or "TestApp" - name = args.window_name or "" - ids = test_window_ids() - if args.active_window and ids: - ids = [ids[0]] - for idx, win_id in enumerate(ids, start=1): - window_name = name or f"Window {idx}" - print(f"{win_id}\t{owner}\t{window_name}\t800x600+0+0") - - -def resolve_macos_windows(args: argparse.Namespace) -> list[int]: - if args.app: - activate_app(args.app) - capture_all = not args.active_window - return macos_window_ids(args, capture_all=capture_all) - - -def resolve_test_macos_windows(args: argparse.Namespace) -> list[int]: - ids = test_window_ids() - if args.active_window and ids: 
- return [ids[0]] - return ids - - -def capture_macos( - args: argparse.Namespace, - output: Path, - *, - window_id: int | None = None, - display: int | None = None, -) -> None: - cmd = ["screencapture", "-x", f"-t{args.format}"] - if args.interactive: - cmd.append("-i") - if display is not None: - cmd.append(f"-D{display}") - effective_window_id = window_id if window_id is not None else args.window_id - if effective_window_id is not None: - cmd.append(f"-l{effective_window_id}") - elif args.region is not None: - x, y, w, h = args.region - cmd.append(f"-R{x},{y},{w},{h}") - cmd.append(str(output)) - run(cmd) - - -def capture_linux(args: argparse.Namespace, output: Path) -> None: - scrot = shutil.which("scrot") - gnome = shutil.which("gnome-screenshot") - imagemagick = shutil.which("import") - xdotool = shutil.which("xdotool") - - if args.region is not None: - x, y, w, h = args.region - if scrot: - run(["scrot", "-a", f"{x},{y},{w},{h}", str(output)]) - return - if imagemagick: - geometry = f"{w}x{h}+{x}+{y}" - run(["import", "-window", "root", "-crop", geometry, str(output)]) - return - raise SystemExit("region capture requires scrot or ImageMagick (import)") - - if args.window_id is not None: - if imagemagick: - run(["import", "-window", str(args.window_id), str(output)]) - return - raise SystemExit("window-id capture requires ImageMagick (import)") - - if args.active_window: - if scrot: - run(["scrot", "-u", str(output)]) - return - if gnome: - run(["gnome-screenshot", "-w", "-f", str(output)]) - return - if imagemagick and xdotool: - win_id = ( - subprocess.check_output(["xdotool", "getactivewindow"], text=True) - .strip() - ) - run(["import", "-window", win_id, str(output)]) - return - raise SystemExit("active-window capture requires scrot, gnome-screenshot, or import+xdotool") - - if scrot: - run(["scrot", str(output)]) - return - if gnome: - run(["gnome-screenshot", "-f", str(output)]) - return - if imagemagick: - run(["import", "-window", "root", 
str(output)]) - return - raise SystemExit("no supported screenshot tool found (scrot, gnome-screenshot, or import)") - - -def main() -> None: - parser = argparse.ArgumentParser(description=__doc__) - parser.add_argument( - "--path", - help="output file path or directory; overrides --mode", - ) - parser.add_argument( - "--mode", - choices=("default", "temp"), - default="default", - help="default saves to the OS screenshot location; temp saves to the temp dir", - ) - parser.add_argument( - "--format", - default="png", - help="image format/extension (default: png)", - ) - parser.add_argument( - "--app", - help="macOS only: capture all matching on-screen windows for this app name", - ) - parser.add_argument( - "--window-name", - help="macOS only: substring match for a window title (optionally scoped by --app)", - ) - parser.add_argument( - "--list-windows", - action="store_true", - help="macOS only: list matching window ids instead of capturing", - ) - parser.add_argument( - "--region", - type=parse_region, - help="capture region as x,y,w,h (pixel coordinates)", - ) - parser.add_argument( - "--window-id", - type=int, - help="capture a specific window id when supported", - ) - parser.add_argument( - "--active-window", - action="store_true", - help="capture the focused/active window only when supported", - ) - parser.add_argument( - "--interactive", - action="store_true", - help="use interactive selection where the OS tool supports it", - ) - args = parser.parse_args() - - if args.region and args.window_id is not None: - raise SystemExit("choose either --region or --window-id, not both") - if args.region and args.active_window: - raise SystemExit("choose either --region or --active-window, not both") - if args.window_id is not None and args.active_window: - raise SystemExit("choose either --window-id or --active-window, not both") - if args.app and args.window_id is not None: - raise SystemExit("choose either --app or --window-id, not both") - if args.region and 
args.app: - raise SystemExit("choose either --region or --app, not both") - if args.region and args.window_name: - raise SystemExit("choose either --region or --window-name, not both") - if args.interactive and args.app: - raise SystemExit("choose either --interactive or --app, not both") - if args.interactive and args.window_name: - raise SystemExit("choose either --interactive or --window-name, not both") - if args.interactive and args.window_id is not None: - raise SystemExit("choose either --interactive or --window-id, not both") - if args.interactive and args.active_window: - raise SystemExit("choose either --interactive or --active-window, not both") - if args.list_windows and (args.region or args.window_id is not None or args.interactive): - raise SystemExit("--list-windows only supports --app, --window-name, and --active-window") - - test_mode = test_mode_enabled() - system = platform.system() - if test_mode: - override = test_platform_override() - if override: - system = override - window_ids: list[int] = [] - display_ids: list[int] = [] - - if system != "Darwin" and (args.app or args.window_name or args.list_windows): - raise SystemExit("--app/--window-name/--list-windows are supported on macOS only") - - if system == "Darwin": - if test_mode: - if args.list_windows: - list_test_macos_windows(args) - return - if args.window_id is not None: - window_ids = [args.window_id] - elif args.app or args.window_name or args.active_window: - window_ids = resolve_test_macos_windows(args) - elif args.region is None and not args.interactive: - display_ids = test_display_ids() - else: - ensure_macos_permissions() - if args.list_windows: - list_macos_windows(args) - return - if args.window_id is not None: - window_ids = [args.window_id] - elif args.app or args.window_name or args.active_window: - window_ids = resolve_macos_windows(args) - elif args.region is None and not args.interactive: - display_ids = macos_display_indexes() - - output = resolve_output_path(args.path, 
args.mode, args.format, system) - - if test_mode: - if system == "Darwin": - if window_ids: - suffixes = [f"w{wid}" for wid in window_ids] - paths = multi_output_paths(output, suffixes) - for path in paths: - write_test_png(path) - for path in paths: - print(path) - return - if len(display_ids) > 1: - suffixes = [f"d{did}" for did in display_ids] - paths = multi_output_paths(output, suffixes) - for path in paths: - write_test_png(path) - for path in paths: - print(path) - return - write_test_png(output) - print(output) - return - - if system == "Darwin": - if window_ids: - suffixes = [f"w{wid}" for wid in window_ids] - paths = multi_output_paths(output, suffixes) - for wid, path in zip(window_ids, paths): - capture_macos(args, path, window_id=wid) - for path in paths: - print(path) - return - if len(display_ids) > 1: - suffixes = [f"d{did}" for did in display_ids] - paths = multi_output_paths(output, suffixes) - for did, path in zip(display_ids, paths): - capture_macos(args, path, display=did) - for path in paths: - print(path) - return - capture_macos(args, output) - elif system == "Linux": - capture_linux(args, output) - elif system == "Windows": - raise SystemExit( - "Windows support lives in scripts/take_screenshot.ps1; run it with PowerShell" - ) - else: - raise SystemExit(f"unsupported platform: {system}") - - print(output) - - -if __name__ == "__main__": - main() diff --git a/.agents/skills/security-best-practices/LICENSE.txt b/.agents/skills/security-best-practices/LICENSE.txt deleted file mode 100644 index 13e25df..0000000 --- a/.agents/skills/security-best-practices/LICENSE.txt +++ /dev/null @@ -1,201 +0,0 @@ -Apache License -Version 2.0, January 2004 -http://www.apache.org/licenses/ - -TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION - -1. Definitions. - - "License" shall mean the terms and conditions for use, reproduction, - and distribution as defined by Sections 1 through 9 of this document. 
- - "Licensor" shall mean the copyright owner or entity authorized by - the copyright owner that is granting the License. - - "Legal Entity" shall mean the union of the acting entity and all - other entities that control, are controlled by, or are under common - control with that entity. For the purposes of this definition, - "control" means (i) the power, direct or indirect, to cause the - direction or management of such entity, whether by contract or - otherwise, or (ii) ownership of fifty percent (50%) or more of the - outstanding shares, or (iii) beneficial ownership of such entity. - - "You" (or "Your") shall mean an individual or Legal Entity - exercising permissions granted by this License. - - "Source" form shall mean the preferred form for making modifications, - including but not limited to software source code, documentation - source, and configuration files. - - "Object" form shall mean any form resulting from mechanical - transformation or translation of a Source form, including but - not limited to compiled object code, generated documentation, - and conversions to other media types. - - "Work" shall mean the work of authorship, whether in Source or - Object form, made available under the License, as indicated by a - copyright notice that is included in or attached to the work - (an example is provided in the Appendix below). - - "Derivative Works" shall mean any work, whether in Source or Object - form, that is based on (or derived from) the Work and for which the - editorial revisions, annotations, elaborations, or other modifications - represent, as a whole, an original work of authorship. For the purposes - of this License, Derivative Works shall not include works that remain - separable from, or merely link (or bind by name) to the interfaces of, - the Work and Derivative Works thereof. 
- - "Contribution" shall mean any work of authorship, including - the original version of the Work and any modifications or additions - to that Work or Derivative Works thereof, that is intentionally - submitted to Licensor for inclusion in the Work by the copyright owner - or by an individual or Legal Entity authorized to submit on behalf of - the copyright owner. For the purposes of this definition, "submitted" - means any form of electronic, verbal, or written communication sent - to the Licensor or its representatives, including but not limited to - communication on electronic mailing lists, source code control systems, - and issue tracking systems that are managed by, or on behalf of, the - Licensor for the purpose of discussing and improving the Work, but - excluding communication that is conspicuously marked or otherwise - designated in writing by the copyright owner as "Not a Contribution." - - "Contributor" shall mean Licensor and any individual or Legal Entity - on behalf of whom a Contribution has been received by Licensor and - subsequently incorporated within the Work. - -2. Grant of Copyright License. Subject to the terms and conditions of - this License, each Contributor hereby grants to You a perpetual, - worldwide, non-exclusive, no-charge, royalty-free, irrevocable - copyright license to reproduce, prepare Derivative Works of, - publicly display, publicly perform, sublicense, and distribute the - Work and such Derivative Works in Source or Object form. - -3. Grant of Patent License. 
Subject to the terms and conditions of - this License, each Contributor hereby grants to You a perpetual, - worldwide, non-exclusive, no-charge, royalty-free, irrevocable - (except as stated in this section) patent license to make, have made, - use, offer to sell, sell, import, and otherwise transfer the Work, - where such license applies only to those patent claims licensable - by such Contributor that are necessarily infringed by their - Contribution(s) alone or by combination of their Contribution(s) - with the Work to which such Contribution(s) was submitted. If You - institute patent litigation against any entity (including a - cross-claim or counterclaim in a lawsuit) alleging that the Work - or a Contribution incorporated within the Work constitutes direct - or contributory patent infringement, then any patent licenses - granted to You under this License for that Work shall terminate - as of the date such litigation is filed. - -4. Redistribution. You may reproduce and distribute copies of the - Work or Derivative Works thereof in any medium, with or without - modifications, and in Source or Object form, provided that You - meet the following conditions: - - (a) You must give any other recipients of the Work or - Derivative Works a copy of this License; and - - (b) You must cause any modified files to carry prominent notices - stating that You changed the files; and - - (c) You must retain, in the Source form of any Derivative Works - that You distribute, all copyright, patent, trademark, and - attribution notices from the Source form of the Work, - excluding those notices that do not pertain to any part of - the Derivative Works; and - - (d) If the Work includes a "NOTICE" text file as part of its - distribution, then any Derivative Works that You distribute must - include a readable copy of the attribution notices contained - within such NOTICE file, excluding those notices that do not - pertain to any part of the Derivative Works, in at least one - of the 
following places: within a NOTICE text file distributed - as part of the Derivative Works; within the Source form or - documentation, if provided along with the Derivative Works; or, - within a display generated by the Derivative Works, if and - wherever such third-party notices normally appear. The contents - of the NOTICE file are for informational purposes only and - do not modify the License. You may add Your own attribution - notices within Derivative Works that You distribute, alongside - or as an addendum to the NOTICE text from the Work, provided - that such additional attribution notices cannot be construed - as modifying the License. - - You may add Your own copyright statement to Your modifications and - may provide additional or different license terms and conditions - for use, reproduction, or distribution of Your modifications, or - for any such Derivative Works as a whole, provided Your use, - reproduction, and distribution of the Work otherwise complies with - the conditions stated in this License. - -5. Submission of Contributions. Unless You explicitly state otherwise, - any Contribution intentionally submitted for inclusion in the Work - by You to the Licensor shall be under the terms and conditions of - this License, without any additional terms or conditions. - Notwithstanding the above, nothing herein shall supersede or modify - the terms of any separate license agreement you may have executed - with Licensor regarding such Contributions. - -6. Trademarks. This License does not grant permission to use the trade - names, trademarks, service marks, or product names of the Licensor, - except as required for reasonable and customary use in describing the - origin of the Work and reproducing the content of the NOTICE file. - -7. Disclaimer of Warranty. 
Unless required by applicable law or - agreed to in writing, Licensor provides the Work (and each - Contributor provides its Contributions) on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or - implied, including, without limitation, any warranties or conditions - of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A - PARTICULAR PURPOSE. You are solely responsible for determining the - appropriateness of using or redistributing the Work and assume any - risks associated with Your exercise of permissions under this License. - -8. Limitation of Liability. In no event and under no legal theory, - whether in tort (including negligence), contract, or otherwise, - unless required by applicable law (such as deliberate and grossly - negligent acts) or agreed to in writing, shall any Contributor be - liable to You for damages, including any direct, indirect, special, - incidental, or consequential damages of any character arising as a - result of this License or out of the use or inability to use the - Work (including but not limited to damages for loss of goodwill, - work stoppage, computer failure or malfunction, or any and all - other commercial damages or losses), even if such Contributor - has been advised of the possibility of such damages. - -9. Accepting Warranty or Additional Liability. While redistributing - the Work or Derivative Works thereof, You may choose to offer, - and charge a fee for, acceptance of support, warranty, indemnity, - or other liability obligations and/or rights consistent with this - License. However, in accepting such obligations, You may act only - on Your own behalf and on Your sole responsibility, not on behalf of - any other Contributor, and only if You agree to indemnify, - defend, and hold each Contributor harmless for any liability - incurred by, or claims asserted against, such Contributor by reason - of your accepting any such warranty or additional liability. 
- -END OF TERMS AND CONDITIONS - -APPENDIX: How to apply the Apache License to your work. - - To apply the Apache License to your work, attach the following - boilerplate notice, with the fields enclosed by brackets "[]" - replaced with your own identifying information. (Don\'t include - the brackets!) The text should be enclosed in the appropriate - comment syntax for the file format. We also recommend that a - file or class name and description of purpose be included on the - same "printed page" as the copyright notice for easier - identification within third-party archives. - -Copyright [yyyy] [name of copyright owner] - -Licensed under the Apache License, Version 2.0 (the "License"); -you may not use this file except in compliance with the License. -You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - -Unless required by applicable law or agreed to in writing, software -distributed under the License is distributed on an "AS IS" BASIS, -WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -See the License for the specific language governing permissions and -limitations under the License. diff --git a/.agents/skills/security-best-practices/SKILL.md b/.agents/skills/security-best-practices/SKILL.md deleted file mode 100644 index 45ccbd8..0000000 --- a/.agents/skills/security-best-practices/SKILL.md +++ /dev/null @@ -1,86 +0,0 @@ ---- -name: "security-best-practices" -description: "Perform language and framework specific security best-practice reviews and suggest improvements. Trigger only when the user explicitly requests security best practices guidance, a security review/report, or secure-by-default coding help. Trigger only for supported languages (python, javascript/typescript, go). Do not trigger for general code review, debugging, or non-security tasks." 
----
-
-# Security Best Practices
-
-## Overview
-
-This skill describes how to identify the language and frameworks used in the current context, and then how to load information from this skill's references directory about the security best practices for that language and/or those frameworks.
-
-This information, if present, can be used to write new secure-by-default code, to passively detect major issues within existing code, or (if requested by the user) to provide a vulnerability report and suggest fixes.
-
-## Workflow
-
-The initial step for this skill is to identify ALL languages and ALL frameworks which you are being asked to use or which already exist in the scope of the project you are working in. Focus on the primary core frameworks. Often you will want to identify both frontend and backend languages and frameworks.
-
-Then check this skill's references directory to see if there is any relevant documentation for the language and/or frameworks. Make sure you read ALL reference files which relate to the specific framework or language. The format of the filenames is `<language>-<framework>-<stack>-security.md`. You should also check if there is a `<language>-general-<stack>-security.md` which is agnostic to the framework you may be using.
-
-If working on a web application which includes a frontend and a backend, make sure you have checked for reference documents for BOTH the frontend and backend!
-
-If you are asked to make a web app which will include both a frontend and backend, but the frontend framework is not specified, also check out `javascript-general-web-frontend-security.md`. It is important that you understand how to secure both the frontend and backend.
-
-If no relevant information is available in the skill's references directory, think a little bit about what you know about the language, the framework, and all well-known security best practices for it. If you are unsure, you can try to search online for documentation on security best practices.
- -From there it can operate in a few ways. - -1. The primary mode is to just use the information to write secure by default code from this point forward. This is useful for starting a new project or when writing new code. - -2. The secondary mode is to passively detect vulnerabilities while working in the project and writing code for the user. Critical or very important vulnerabilities or major issues going against security guidance can be flagged and the user can be told about them. This passive mode should focus on the largest impact vulnerabilities and secure defaults. - -3. The user can ask for a security report or to improve the security of the codebase. In this case a full report should be produced describing any ways the project fails to follow security best practices guidance. The report should be prioritized and have clear sections of severity and urgency. Then offer to start working on fixes for these issues. See #fixes below. - -## Workflow Decision Tree - -- If the language/framework is unclear, inspect the repo to determine it and list your evidence. -- If matching guidance exists in `references/`, load only the relevant files and follow their instructions. -- If no matching guidance exists, consider if you know any well known security best practices for the chosen language and/or frameworks, but if asked to generate a report, let the user know that concrete guidance is not available (you can still generate the report or flag vulnerabilities that are clearly critical). - -# Overrides - -While these references contain the security best practices for languages and frameworks, customers may have cases where they need to bypass or override these practices. Pay attention to specific rules and instructions in the project's documentation and prompt files which may require you to override certain best practices. When overriding a best practice, you MAY report it to the user, but do not fight with them.
If a security best practice needs to be bypassed / ignored for some project specific reason, you can also suggest adding documentation about this to the project so it is clear why the best practice is not being followed and to follow that bypass in the future. - -# Report Format - -When producing a report, you should write the report as a markdown file in `security_best_practices_report.md` or some other location if provided by the user. You can ask the user where they would like the report to be written to. - -The report should have a short executive summary at the top. - -The report should be clearly delineated into multiple sections based on severity of the vulnerability. The report should focus on the most critical findings as these have the highest impact for the user. All findings should be noted with a numeric ID to make them easier to reference. - -For critical findings include a one sentence impact statement. - -Once the report is written, also report it to the user directly, although you may be less verbose. You can offer to explain any of the findings or the reasons behind the security best practices guidance if the user wants more info on any findings. - -Important: When referencing code in the report, make sure to find and include line numbers for the code you are referencing. - -After you write the report file, summarize the findings to the user. - -Also tell the user where the final report was written to. - -# Fixes - -If you produced a report, let the user read the report, then ask whether they want you to begin performing fixes. - -If you passively found a critical finding, notify the user and ask if they would like you to fix this finding. - -When producing fixes, focus on fixing a single finding at a time. The fixes should have concise clear comments explaining that the new code is based on the specific security best practice, and perhaps a very short reason why it would be dangerous to not do it in this way.
- -Always consider if the changes you want to make will impact the functionality of the user's code. Consider if the changes may cause regressions with how the project works currently. It is often the case that insecure code is relied on for other reasons (and this is why insecure code lives on for so long). Avoid breaking the user's project as this may make them not want to apply security fixes in the future. It is better to write a well thought out fix, well informed by the rest of the project, than a quick slapdash change. - -Always follow any normal change or commit flow the user has configured. If making git commits, provide clear commit messages explaining this is to align with security best practices. Try to avoid bunching a number of unrelated findings into a single commit. - -Always follow any normal testing flows the user has configured (if any) to confirm that your changes are not introducing regressions. Consider the second order impacts the changes may have and inform the user before making them if there are any. - -# General Security Advice - -Below are a few bits of secure coding advice that apply to almost any language or framework. - -### Avoid Using Incrementing IDs for Public IDs of Resources - -When assigning an ID for some resource, which will then be exposed to the internet, avoid using small auto-incrementing IDs. Use a longer, random UUID4 or random hex string instead. This will prevent users from learning the quantity of a resource and being able to guess resource IDs. - -### A note on TLS - -While TLS is important for production deployments, most development work will be with TLS disabled or provided by some out-of-scope TLS proxy. Due to this, be very careful not to report lack of TLS as a security issue. Also be very careful around use of "secure" cookies. They should only be set if the application will actually be over TLS.
If they are set on non-TLS applications (such as when deployed for local dev or testing), it will break the application. You can provide an env variable or other flag to override setting secure as a way to keep it off until on a TLS production deployment. Additionally avoid recommending HSTS. It is dangerous to use without full understanding of the lasting impacts (can cause major outages and user lockout) and it is not generally recommended for the scope of projects being reviewed by codex. diff --git a/.agents/skills/security-best-practices/agents/openai.yaml b/.agents/skills/security-best-practices/agents/openai.yaml deleted file mode 100644 index 1721736..0000000 --- a/.agents/skills/security-best-practices/agents/openai.yaml +++ /dev/null @@ -1,4 +0,0 @@ -interface: - display_name: "Security Best Practices" - short_description: "Security reviews and secure-by-default guidance" - default_prompt: "Review this codebase for security best practices and suggest secure-by-default improvements." diff --git a/.agents/skills/security-best-practices/references/golang-general-backend-security.md b/.agents/skills/security-best-practices/references/golang-general-backend-security.md deleted file mode 100644 index 85e9c6a..0000000 --- a/.agents/skills/security-best-practices/references/golang-general-backend-security.md +++ /dev/null @@ -1,826 +0,0 @@ -# Go (Golang) Security Spec (Go 1.25.x, Standard Library, net/http) - -This document is designed as a **security spec** that supports: -1) **Secure-by-default code generation** for new Go code. -2) **Security review / vulnerability hunting** in existing Go code (passive “notice issues while working” and active “scan the repo and report findings”). - -It is intentionally written as a set of **normative requirements** (“MUST/SHOULD/MAY”) plus **audit rules** (what bad patterns look like, how to detect them, and how to fix/mitigate them).
- --------------------------------------------------------------------- - -## 0) Safety, boundaries, and anti-abuse constraints (MUST FOLLOW) - -- MUST NOT request, output, log, or commit secrets (API keys, passwords, private keys, session cookies, JWTs, database URLs with credentials, signing keys, client secrets). -- MUST NOT “fix” security by disabling protections (e.g., `InsecureSkipVerify`, `GOSUMDB=off` for public modules, wildcard CORS + credentials, removing auth checks, disabling CSRF defenses on cookie-auth apps). -- MUST provide **evidence-based findings** during audits: cite file paths, code snippets, build/deploy configs, and concrete values that justify the claim. -- MUST treat uncertainty honestly: if a control might exist in infrastructure (reverse proxy, WAF, service mesh, platform config), report it as “not visible in app code; verify at runtime/config.” -- MUST keep fixes minimal, correct, and production-safe; avoid introducing breaking changes without warning (especially around auth/session flows, and proxies). - --------------------------------------------------------------------- - -## 1) Operating modes - -### 1.1 Generation mode (default) -When asked to write new Go code or modify existing code: -- MUST follow every **MUST** requirement in this spec. -- SHOULD follow every **SHOULD** requirement unless the user explicitly says otherwise. -- MUST prefer safe-by-default APIs and proven libraries over custom security code. -- MUST avoid introducing new risky sinks (shell execution, dynamic template execution, serving user files as HTML, unsafe redirects, weak crypto, unbounded parsing, etc.). - -### 1.2 Passive review mode (always on while editing) -While working anywhere in a Go repo (even if the user did not ask for a security scan): -- MUST “notice” violations of this spec in touched/nearby code. -- SHOULD mention issues as they come up, with a brief explanation + safe fix. 
- -### 1.3 Active audit mode (explicit scan request) -When the user asks to “scan”, “audit”, or “hunt for vulns”: -- MUST systematically search the codebase for violations of this spec. -- MUST output findings in a structured format (see §2.3). - -Recommended audit order: -1) Build/deploy entrypoints: `main.go`, `cmd/*`, Dockerfiles, Kubernetes manifests, systemd units, CI workflows. -2) Go toolchain & dependency policy: Go version, modules, `go.mod/go.sum`, proxy/sumdb settings, govulncheck usage. -3) Secret management and config loading (env, files, secret stores) + logging patterns. -4) HTTP server configuration (timeouts, body limits, proxy trust, security headers). -5) AuthN/AuthZ boundaries, session/cookie settings, token validation. -6) CSRF protections for cookie-authenticated state-changing endpoints. -7) Template usage and output encoding (XSS), and any “render template from string” behavior (SSTI). -8) File handling (uploads/downloads/path traversal/temp files), static file serving. -9) Injection sinks: SQL, OS command execution, SSRF/outbound fetch, open redirects. -10) Concurrency/resource exhaustion (unbounded goroutines/queues, missing timeouts/contexts). -11) Use of `unsafe` / `cgo` / `reflect` in security-sensitive paths. -12) Debug/diagnostic endpoints (pprof/expvar/metrics) exposure. -13) Cryptography usage (randomness, password hashing). 
- --------------------------------------------------------------------- - -## 2) Definitions and review guidance - -### 2.1 Untrusted input (treat as attacker-controlled unless proven otherwise) -Examples include: -- `*http.Request` fields: `r.URL.Path`, `r.URL.RawQuery`, `r.Form`, `r.PostForm`, headers, cookies, `r.Body` -- Path parameters from routers (including values extracted from URL paths) -- JSON/XML/YAML bodies, multipart form parts, uploaded files -- Any data from external systems (webhooks, third-party APIs, message queues) -- Any persisted user content (DB rows) that originated from users -- Configuration values that might be attacker-influenced in some deployments (headers set by upstream proxies, environment variables in multi-tenant systems) - -### 2.2 State-changing request -A request is state-changing if it can create/update/delete data, change auth/session state, trigger side effects (purchase, email send, webhook send), or initiate privileged actions. - -### 2.3 Required audit finding format -For each issue found, output: - -- Rule ID: -- Severity: Critical / High / Medium / Low -- Location: file path + function/handler name + line(s) -- Evidence: the exact code/config snippet -- Impact: what could go wrong, who can exploit it -- Fix: safe change (prefer minimal diff) -- Mitigation: defense-in-depth if immediate fix is hard -- False positive notes: what to verify if uncertain (edge configs, proxy behavior, auth assumptions) - --------------------------------------------------------------------- - -## 3) Secure baseline: minimum production configuration (MUST in production) - -This is the smallest “production baseline” that prevents common Go misconfigurations. - -### 3.1 Toolchain, patching, and dependency hygiene (MUST) -- MUST run a supported Go major version and keep to the latest patch releases. 
-- MUST treat Go standard library patch releases as security-relevant (many security fixes land in stdlib components like `net/http`, `crypto/*`, parsing packages). -- MUST use Go modules with committed `go.mod` and `go.sum`. -- MUST NOT disable module authenticity mechanisms for public modules (checksum DB) unless you have a controlled, documented replacement. -- MUST run `govulncheck` (source scan and/or binary scan) in CI and address findings. - -### 3.2 HTTP server baseline (MUST for network-facing services) -If the program serves HTTP (directly or via a framework built on `net/http`): -- MUST configure an `http.Server` with explicit timeouts and header limits. -- MUST set request body size limits (global and per-route as needed). -- MUST avoid exposing diagnostic endpoints (pprof/expvar) publicly. -- SHOULD set a consistent set of security headers (or verify they are set at the edge). -- MUST set cookie security attributes for any cookies you issue. -- SHOULD implement rate limiting and abuse controls for auth and expensive endpoints. - -Illustrative baseline skeleton (adjust to your project): -- Create a dedicated mux (avoid implicit global defaults unless intentionally managed). -- Wrap handlers with: panic-safe error handling, request ID, logging, auth, and limits. - --------------------------------------------------------------------- - -## 4) Rules (generation + audit) - -Each rule contains: required practice, insecure patterns, detection hints, and remediation. - -### GO-DEPLOY-001: Keep the Go toolchain and standard library updated (security releases) -Severity: Medium - -NOTE: Upgrading dependencies and the core Go version can break projects in unexpected ways. Focus on only security-critical dependencies and if noticed, let the user know rather than upgrading automatically. - -Required: -- MUST run a supported Go major release and apply patch releases promptly. 
-- SHOULD treat patch releases as security-relevant, even if your application code didn’t change. - -Insecure patterns: -- Production builds pinned to old Go versions without a patching process. -- Docker images like `golang:1.xx` or custom base images that are not updated regularly. -- CI pipelines that intentionally suppress Go updates. - -Detection hints: -- Inspect CI (`.github/workflows`, `gitlab-ci.yml`, etc.) for `go-version:` or toolchain setup. -- Inspect Dockerfiles for `FROM golang:` tags. -- Inspect `go.mod` `go` directive and any toolchain pinning. - -Fix: -- Upgrade to the latest patch of a supported Go version. -- Add an automated check (CI) that fails when Go is below an approved minimum. - -Notes: -- Go publishes regular minor releases that frequently include security fixes across standard library packages. - ---- - -### GO-SUPPLY-001: Go module authenticity MUST NOT be disabled for public dependencies -Severity: High - -Required: -- MUST keep module checksum verification enabled for public modules. -- SHOULD commit `go.sum` and treat changes as security-sensitive. -- MUST NOT use insecure module fetching settings for public modules. -- MAY configure private module behavior using `GOPRIVATE`/`GONOSUMDB` for private repos, but must do so narrowly and intentionally. - -Insecure patterns: -- `GOSUMDB=off` in CI or production build environments for public modules. -- `GONOSUMDB=*` or overly broad patterns that effectively disable verification. -- `GOINSECURE=*` or broad `GOINSECURE` patterns for public modules. -- `GOPROXY=direct` everywhere without a clear policy. - -Detection hints: -- Search build configs for `GOSUMDB`, `GONOSUMDB`, `GOINSECURE`, `GOPROXY`, `GOPRIVATE`. -- Look for documentation/scripts that recommend disabling checksum DB “to make builds work”. - -Fix: -- Restore defaults for public module verification. 
-- For private modules: - - Set `GOPRIVATE=your.private.domain/*` - - Configure an internal proxy or direct fetching, and restrict `GONOSUMDB` to private patterns only. - -Notes: -- Disabling checksum verification removes an important integrity layer against targeted or compromised upstream delivery. - ---- - -### GO-CONFIG-001: Secrets must be externalized and never logged or committed -Severity: High (Critical if credentials are committed) - -Required: -- MUST load secrets from environment variables, secret managers, or secure config files with restricted permissions. -- MUST NOT hard-code secrets in Go source, test fixtures that may reach production, or build args. -- MUST NOT log secrets or full credential-bearing connection strings. -- SHOULD fail closed in production if required secrets are missing. - -Insecure patterns: -- String constants containing tokens/keys/passwords. -- `.env` files or config files with secrets committed to repo. -- Logging `os.Environ()`, dumping full configs, or printing DSNs. - -Detection hints: -- Search for suspicious literals (`API_KEY`, `SECRET`, `PASSWORD`, `Authorization:`). -- Inspect config loaders and logging statements. -- Inspect CI logs or debug print paths. - -Fix: -- Move secrets to a secret store / environment variables. -- Redact sensitive fields in logs. -- Add secret scanning to CI and pre-commit. - ---- - -### GO-HTTP-001: HTTP servers MUST set timeouts and MaxHeaderBytes -Severity: High (DoS risk) - -Required: -- MUST set: `ReadHeaderTimeout`, and SHOULD set `ReadTimeout`, `WriteTimeout`, `IdleTimeout` as appropriate for the service. -- MUST set `MaxHeaderBytes` to a justified limit for your application. -- MUST NOT rely on default zero-values for timeouts in production for internet-facing servers. - -Insecure patterns: -- `http.ListenAndServe(":8080", handler)` with a default `http.Server` (no explicit timeouts). -- `&http.Server{}` with timeouts left at zero. -- Missing `MaxHeaderBytes`. 
- -Detection hints: -- Search for `http.ListenAndServe(`, `ListenAndServeTLS(`, `Server{` and inspect configured fields. -- Check for reverse proxies; even with a proxy, app-level timeouts still matter. - -Fix: -- Use `http.Server{ReadHeaderTimeout: ..., ReadTimeout: ..., WriteTimeout: ..., IdleTimeout: ..., MaxHeaderBytes: ...}`. -- Calibrate timeouts per endpoint type (streaming vs JSON APIs). - -Notes: -- Net/http documents that these timeouts exist and that zero/negative values mean “no timeout”; production services should choose explicit values. - ---- - -### GO-HTTP-002: Request body and multipart parsing MUST be size-bounded -Severity: Medium (DoS risk; can be High for upload-heavy apps) - -Required: -- MUST enforce a global maximum request body size for endpoints that accept bodies. -- MUST enforce strict multipart upload limits and avoid unbounded form parsing. -- SHOULD enforce per-route limits when some endpoints legitimately need larger bodies. -- SHOULD set upstream (proxy) limits as defense-in-depth. - -Insecure patterns: -- Reading `r.Body` with `io.ReadAll(r.Body)` without a size cap. -- Calling `r.ParseMultipartForm(...)` with overly large limits (or forgetting size controls). -- Accepting file uploads with no limits on file size, number of parts, or total body size. - -Detection hints: -- Search for `io.ReadAll(r.Body)`, `json.NewDecoder(r.Body)`, `ParseMultipartForm`, `FormFile`, `multipart`. -- Look for missing `http.MaxBytesReader` or equivalent per-handler limiting. -- Look for “upload” endpoints and check limits. - -Fix: -- Wrap request bodies with `http.MaxBytesReader(w, r.Body, maxBytes)` before parsing. -- For multipart, set conservative limits and validate file sizes/part counts explicitly. -- Set proxy limits (e.g., at ingress) in addition to app limits. - -Notes: -- There are known vulnerability classes and advisories related to excessive resource consumption in multipart/form parsing; treat unbounded parsing as a security issue. 
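The requirements in GO-HTTP-001 and GO-HTTP-002 can be sketched together as follows. This is a minimal illustration, not a recommended configuration: the durations, the 1 MiB body cap, the 64 KiB header cap, and the names `newServer`, `createNote`, and `statusFor` are all assumptions made for the example. It relies on `http.MaxBytesError`, which is available in Go 1.19 and later.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// maxBodyBytes is an illustrative cap (1 MiB); choose a limit that fits your API.
const maxBodyBytes = 1 << 20

// newServer returns an http.Server with explicit timeouts and a header limit.
// Every value here is a placeholder to be calibrated per service.
func newServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           h,
		ReadHeaderTimeout: 5 * time.Second,
		ReadTimeout:       15 * time.Second,
		WriteTimeout:      30 * time.Second,
		IdleTimeout:       60 * time.Second,
		MaxHeaderBytes:    1 << 16, // 64 KiB
	}
}

// createNote is a hypothetical JSON handler that caps the body before decoding it.
func createNote(w http.ResponseWriter, r *http.Request) {
	r.Body = http.MaxBytesReader(w, r.Body, maxBodyBytes)
	var in struct {
		Text string `json:"text"`
	}
	if err := json.NewDecoder(r.Body).Decode(&in); err != nil {
		var tooBig *http.MaxBytesError
		if errors.As(err, &tooBig) {
			http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
			return
		}
		http.Error(w, "invalid JSON", http.StatusBadRequest)
		return
	}
	w.WriteHeader(http.StatusCreated)
}

// statusFor runs createNote in-process with the given body and returns the status code.
func statusFor(body string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/notes", strings.NewReader(body))
	createNote(rec, req)
	return rec.Code
}

func main() {
	srv := newServer("127.0.0.1:8080", http.HandlerFunc(createNote))
	fmt.Println("read header timeout:", srv.ReadHeaderTimeout)
	fmt.Println("small body status:", statusFor(`{"text":"hi"}`))
	oversized := `{"text":"` + strings.Repeat("a", maxBodyBytes) + `"}`
	fmt.Println("oversized body status:", statusFor(oversized))
	// A real service would call srv.ListenAndServe() or srv.ListenAndServeTLS(...) here.
}
```

The body cap is applied inside the handler so that routes with legitimately larger payloads can use a different limit; a global cap could instead be applied in middleware.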
- ---- - -### GO-DEPLOY-002: Diagnostic endpoints (pprof/expvar/metrics) MUST NOT be publicly exposed -Severity: High - -NOTE: This only applies to production configurations. These endpoints are often used for debug or dev endpoints. If found, confirm that it would be reachable from the actual production deployment. - -Required: -- MUST NOT expose `net/http/pprof` handlers on a public internet-facing listener without strong access controls. -- SHOULD run diagnostics on a separate, internal-only listener (loopback/VPC-only) and require auth. -- MUST review what diagnostic endpoints reveal (stack traces, memory, command lines, environment, internal URLs). - -Insecure patterns: -- Side-effect import `import _ "net/http/pprof"` in a server binary with a public mux. -- `/debug/pprof/*` reachable without auth. -- `/debug/vars` (expvar) reachable without auth. - -Detection hints: -- Search for `net/http/pprof` imports (including blank imports). -- Search for route prefixes `/debug/pprof`, `/debug/vars`. -- Check whether `http.DefaultServeMux` is used and whether any debug handlers register globally. - -Fix: -- Remove diagnostics from production builds, or bind them to an internal-only listener. -- Add strong authentication/authorization (and ideally network-level restrictions). - -Notes: -- pprof is typically imported for its side effect of registering HTTP handlers under `/debug/pprof/`. - ---- - -### GO-HTTP-003: Reverse proxy and forwarded header trust MUST be explicit -Severity: High (auth, URL generation, logging/auditing correctness) - -Required: -- If behind a reverse proxy, MUST define which proxy is trusted and how client IP/scheme/host are derived. -- MUST NOT trust `X-Forwarded-For`, `X-Forwarded-Proto`, `Forwarded`, or similar headers from the open internet. -- MUST ensure “secure cookie” logic, redirects, and absolute URL generation do not rely on spoofable headers. 
- -Insecure patterns: -- Using `r.Header.Get("X-Forwarded-For")` as the client IP without validating the proxy boundary. -- Deriving “is HTTPS” from `X-Forwarded-Proto` without confirming it came from a trusted proxy. -- Using forwarded `Host` values for password reset links without allowlisting. - -Detection hints: -- Search for `X-Forwarded-For`, `X-Forwarded-Proto`, `Forwarded`, `Real-IP`, and any custom “client IP” helpers. -- Inspect ingress/proxy configs; if not visible, mark as “verify at edge”. - -Fix: -- Enforce proxy trust at the edge and in app: - - Accept forwarded headers only from known proxy IP ranges. - - Prefer platform-provided mechanisms where available. -- If generating external links, use a configured allowlisted canonical origin (not the request’s Host header). - ---- - -### GO-HTTP-004: Security headers SHOULD be set (in app or at the edge) -Severity: Medium - -Required (typical web app serving browsers): -- SHOULD set: - - `Content-Security-Policy` (CSP) appropriate to the app. NOTE: It is most important to set the CSP's script-src. All other directives are not as important and can generally be excluded for the ease of development. - - `X-Content-Type-Options: nosniff` - - Clickjacking protection (`X-Frame-Options` and/or CSP `frame-ancestors`) - - `Referrer-Policy` and `Permissions-Policy` where appropriate -- MUST ensure cookies have secure attributes (see GO-HTTP-005). - -NOTE: -- These headers may be set via reverse proxy/CDN; if not visible in app code, report as “verify at edge”. - -Insecure patterns: -- No security headers anywhere (app or edge) for a browser-facing app. -- CSP missing for apps rendering untrusted content. - -Detection hints: -- Search for middleware setting headers: `w.Header().Set("Content-Security-Policy", ...)`, etc. -- Search for reverse proxy config that sets headers. - -Fix: -- Add centralized header middleware in Go, or configure at the edge. -- Keep CSP realistic; avoid `unsafe-inline` where possible. 
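One way to centralize the headers listed in GO-HTTP-004 is a small middleware, sketched below. The header values are illustrative assumptions (in particular the CSP, which must be tuned to the scripts your app actually loads), and `securityHeaders` and `headerFor` are hypothetical names. If the same headers are already set at the edge, this middleware may be redundant.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// securityHeaders wraps a handler and sets a small baseline of response headers.
// The values are placeholders for illustration, not a universal policy.
func securityHeaders(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := w.Header()
		h.Set("Content-Security-Policy", "script-src 'self'; frame-ancestors 'none'")
		h.Set("X-Content-Type-Options", "nosniff")
		h.Set("X-Frame-Options", "DENY")
		h.Set("Referrer-Policy", "strict-origin-when-cross-origin")
		next.ServeHTTP(w, r)
	})
}

// headerFor serves one in-process GET through the middleware and returns
// the value of the named response header.
func headerFor(name string) string {
	hello := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello")
	})
	rec := httptest.NewRecorder()
	securityHeaders(hello).ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Header().Get(name)
}

func main() {
	names := []string{
		"Content-Security-Policy",
		"X-Content-Type-Options",
		"X-Frame-Options",
		"Referrer-Policy",
	}
	for _, name := range names {
		fmt.Printf("%s: %s\n", name, headerFor(name))
	}
}
```

Setting the headers before calling `next.ServeHTTP` matters: once the wrapped handler writes the response, later header changes are not sent.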
- ---- - -### GO-HTTP-005: Cookies MUST use secure attributes in production -Severity: Medium - -Required (production, HTTPS): -- MUST set `Secure` on cookies that carry auth/session state. IMPORTANT NOTE: Only set `Secure` in production environment when TLS is configured. When running in a local dev environment over HTTP, do not set `Secure` property on cookies. You should do this conditionally based on if the app is running in production mode. You should also include a property like `SESSION_COOKIE_SECURE` which can be used to disable `Secure` cookies when testing over HTTP. -- MUST set `HttpOnly` on auth/session cookies. -- SHOULD set `SameSite=Lax` by default (or `Strict` if compatible), and only use `None` when necessary (and only with `Secure`). -- SHOULD set bounded lifetimes (`Max-Age`/`Expires`) appropriate to the app. - -Insecure patterns: -- Setting auth/session cookies without `Secure` in HTTPS deployments. -- Cookies without `HttpOnly` for session identifiers. -- `SameSite=None` for cookie-authenticated apps without a strong CSRF strategy. - -Detection hints: -- Search for `http.SetCookie`, `&http.Cookie{`, `Set-Cookie`. -- Inspect cookie flags in auth/session code. - -Fix: -- Set the correct fields on `http.Cookie` and centralize cookie creation. - -Notes: -- SameSite is defense-in-depth and does not replace CSRF protections for cookie-auth apps. - ---- - -### GO-HTTP-006: Cookie-authenticated state-changing endpoints MUST be CSRF-protected -Severity: High - -- IMPORTANT NOTE: If cookies are not used for auth (e.g., pure bearer token in Authorization header with no ambient cookies), CSRF is not a risk for those endpoints. - -Required: -- MUST protect all state-changing endpoints (POST/PUT/PATCH/DELETE) that rely on cookies for authentication. -- SHOULD use a well-tested CSRF library/middleware rather than rolling your own. 
-- MAY use additional defenses (Origin/Referer checks, Fetch Metadata, SameSite cookies), but tokens remain the primary defense for cookie-authenticated apps. -If tokens are impractical, or for small applications: -* MUST at a minimum require a custom header to be set and set the session cookie SESSION_COOKIE_SAMESITE=lax, as this is the strongest method besides requiring a form token, and may be much easier to implement. - - -Insecure patterns: -- Cookie-authenticated JSON endpoints that mutate state with no CSRF checks. -- Using GET for state-changing actions. - -Detection hints: -- Enumerate all non-GET routes and identify auth mechanism. -- Look for CSRF middleware usage; if absent, treat as suspicious in browser-facing apps. - -Fix: -- Add CSRF middleware and ensure it covers all state-changing routes. -- If the service is an API intended for non-browser clients, avoid cookie auth; use Authorization headers. - ---- - -### GO-HTTP-007: CORS must be explicit and least-privilege -Severity: Medium (High if misconfigured with credentials) - -Required: -- If CORS is not needed, MUST keep it disabled. -- If CORS is needed: - - MUST allowlist trusted origins (do not reflect arbitrary origins) - - MUST be careful with credentialed requests; do not combine broad origins with cookies - - SHOULD restrict allowed methods/headers - -Insecure patterns: -- `Access-Control-Allow-Origin: *` paired with cookies (`Access-Control-Allow-Credentials: true`). -- Reflecting `Origin` without validation. - -Detection hints: -- Search for `Access-Control-Allow-` header setting. -- Search for CORS middleware configuration. - -Fix: -- Implement strict origin allowlists and minimal methods/headers. -- Ensure cookie-auth endpoints are not exposed cross-origin unless required. - ---- - -### GO-XSS-001: Use html/template and avoid bypassing auto-escaping with untrusted data -Severity: High - -Required: -- MUST use `html/template` for HTML rendering (not `text/template`). 
-- MUST NOT convert untrusted data into “trusted” template types (`template.HTML`, `template.JS`, `template.URL`, etc.). -- SHOULD keep templates static and controlled by developers; treat dynamic templates as high risk. -- MUST NOT serve user-uploaded HTML/JS as active content unless explicitly intended and safely sandboxed. - -Insecure patterns: -- `text/template` used to generate HTML. -- Using `template.HTML(userInput)` or similar typed wrappers. -- Directly writing unescaped user content into HTML responses. - -Detection hints: -- Search for `text/template`, `template.New(...).Parse(...)`, and typed wrappers like `template.HTML(`. -- Inspect handlers that return HTML with string concatenation. - -Fix: -- Use `html/template` and pass untrusted data as data, not markup. -- If you must allow limited HTML, use a vetted HTML sanitizer and still be careful with attributes/URLs. - ---- - -### GO-SSTI-001: Never parse/execute templates from untrusted input (SSTI) -Severity: Critical - -Required: -- MUST NOT call `template.Parse` / `template.ParseFiles` / `template.New(...).Parse(...)` on template text influenced by untrusted input. -- MUST treat “user-defined templates” as a special high-risk design: - - MUST use heavy sandboxing and strict allowlists - - MUST isolate execution (process/container boundary) if truly required - -Insecure patterns: -- `tmpl := template.Must(template.New("x").Parse(r.FormValue("tmpl")))` -- Reading templates from uploads / DB entries and executing them in the same trust domain as server code. - -Detection hints: -- Search for `.Parse(` and trace the origin of the template string. -- Look for “custom email templates”, “user theming templates”, etc. - -Fix: -- Replace with safe substitution mechanisms (no code execution). -- If templates must be user-controlled, isolate and sandbox aggressively. 
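The safe pattern behind GO-XSS-001 and GO-SSTI-001 is to keep template text static and developer-owned, and to pass untrusted values only as data. The sketch below assumes a hypothetical `renderGreeting` helper and a one-line template; it is meant to show the shape of the pattern, not a complete rendering layer.

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// greeting is parsed once from a constant string controlled by developers.
// Untrusted input never becomes part of the template text.
var greeting = template.Must(template.New("greeting").Parse(`<p>Hello, {{.Name}}!</p>`))

// renderGreeting executes the static template with name supplied as data,
// so html/template applies contextual escaping to it.
func renderGreeting(name string) (string, error) {
	var sb strings.Builder
	if err := greeting.Execute(&sb, struct{ Name string }{Name: name}); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	for _, input := range []string{`<script>alert(1)</script>`, `{{.Name}}`} {
		out, err := renderGreeting(input)
		if err != nil {
			fmt.Println("render error:", err)
			return
		}
		fmt.Println(out)
	}
}
```

Because the input is data rather than template source, markup in it is escaped and template syntax in it (such as `{{.Name}}`) is rendered as inert text instead of being evaluated.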
- ---- - -### GO-PATH-001: Prevent path traversal and unsafe file serving -Severity: High - -Required: -- MUST NOT pass user-controlled paths to `os.Open`, `os.ReadFile`, `http.ServeFile`, or `http.FileServer` without strict validation and base-dir enforcement. -- MUST treat `..`, absolute paths, and OS-specific path tricks as hostile input. -- SHOULD store user uploads outside any static web root; serve through controlled handlers. -- MUST avoid directory listing for sensitive file trees. - -Insecure patterns: -- `http.ServeFile(w, r, r.URL.Query().Get("path"))` -- `os.Open(filepath.Join(baseDir, userPath))` without checking that the result stays under `baseDir` -- `http.FileServer(http.Dir("."))` serving the project root or user-writable directories - -Detection hints: -- Search for `ServeFile(`, `FileServer(`, `http.Dir(`, `os.Open(`, `ReadFile(`, `filepath.Join(`. -- Trace whether path components come from request/DB. - -Fix: -- Use an allowlist of file identifiers (e.g., database IDs) mapped to server-side paths. -- Enforce base directory containment after cleaning and joining. -- Serve active formats as downloads (`Content-Disposition: attachment`) unless explicitly intended. - ---- - -### GO-UPLOAD-001: File uploads must be validated, stored safely, and served safely -Severity: High - -Required: -- MUST enforce upload size limits (app + edge). -- MUST validate file type using allowlists and content checks (not only extensions). -- MUST store uploads outside executable/static roots when possible. -- SHOULD generate server-side filenames (random IDs) and avoid trusting original names. -- MUST serve potentially active formats safely (download attachment) unless explicitly intended. - -Insecure patterns: -- Accepting arbitrary file types and serving them back inline. -- Using user-supplied filename as storage path. -- Missing size/type validation. - -Detection hints: -- Search for `multipart`, `FormFile`, `ParseMultipartForm`, `io.Copy` to disk. 
-- Check where files are stored and how they are served. - -Fix: -- Implement allowlist validation + safe storage + safe serving. -- Add scanning/quarantine workflows where applicable. - ---- - -### GO-INJECT-001: Prevent SQL injection (parameterized queries / ORM) -Severity: High - -Required: -- MUST use parameterized queries or an ORM that parameterizes under the hood. -- MUST NOT build SQL by string concatenation / `fmt.Sprintf` / string interpolation with untrusted input. - -Insecure patterns: -- `fmt.Sprintf("SELECT ... WHERE id=%s", r.URL.Query().Get("id"))` -- `query := "UPDATE users SET role='" + role + "' WHERE id=" + id` - -Detection hints: -- Grep for `SELECT`, `INSERT`, `UPDATE`, `DELETE` and check how query strings are built. -- Trace untrusted data into `db.Query`, `db.Exec`, `QueryRow`, etc. - -Fix: -- Replace with placeholders (`?`, `$1`, etc.) and pass parameters separately. -- Validate and type-check IDs before use. - ---- - -### GO-INJECT-002: Prevent OS command injection; avoid shelling out with untrusted input -Severity: Critical to High (depends on exposure) - -Required: -- MUST avoid executing external commands with attacker-controlled strings. -- If subprocess is necessary: - - MUST use `exec.CommandContext` with an argument list (not `sh -c`). - - MUST NOT pass untrusted input to a shell (`bash -c`, `sh -c`, PowerShell). - - SHOULD use strict allowlists for any variable component (subcommand, flags, filenames). -- MUST assume CLI tools may interpret attacker-controlled args as flags or special values. - -Insecure patterns: -- `exec.Command("sh", "-c", userString)` -- `exec.Command("bash", "-c", fmt.Sprintf("tool %s", user))` -- Calling the shell to get glob expansion for user-supplied globs. - -Detection hints: -- Search for `os/exec`, `exec.Command(`, `CommandContext(`, `"sh"`, `"bash"`, `"-c"`. -- Trace untrusted input into command name/args. - -Fix: -- Use library APIs instead of subprocesses. 
-- Hardcode command and allowlist/validate args. -- If a shell is unavoidable, escape robustly and treat as high risk (prefer avoiding). - -Notes: -- The Go `os/exec` package intentionally does not invoke a shell; introducing `sh -c` reintroduces shell injection hazards. - ---- - -### GO-SSRF-001: Prevent SSRF in outbound HTTP requests -Severity: Medium (High in cloud/LAN environments) - -- Note: For small standalone projects this is less important. It is most important when deploying into a LAN or alongside other services listening on the same server. - -Required: -- MUST treat outbound requests to user-provided URLs as high risk. -- SHOULD allowlist hosts/domains for any user-influenced URL fetch. -- SHOULD block access to localhost/private IP ranges/link-local addresses and cloud metadata endpoints. -- MUST restrict schemes to `http`/`https` (no `file:`, `gopher:`, etc.). -- MUST set client timeouts and restrict redirects. - -Insecure patterns: -- `http.Get(r.URL.Query().Get("url"))` -- “URL preview” / “webhook test” endpoints that fetch arbitrary URLs. - -Detection hints: -- Search for `http.Get`, `client.Do`, and URL values derived from requests/DB. -- Identify features that fetch remote resources. - -Fix: -- Parse URLs strictly; enforce scheme and allowlisted hostnames. -- Resolve DNS and enforce IP-range restrictions (with care for DNS rebinding). -- Set timeouts, disable redirects unless needed, and cap response sizes. - ---- - -### GO-HTTPCLIENT-001: Outbound HTTP clients MUST set timeouts and close bodies -Severity: High (DoS and resource exhaustion) - -Required: -- MUST set an overall timeout on `http.Client` usage (or equivalent per-request deadlines via context + transport timeouts). -- MUST ensure `resp.Body.Close()` is called for all successful requests (typically `defer resp.Body.Close()` immediately after error check). -- SHOULD limit response body reads (do not `io.ReadAll` unbounded responses). 
-- SHOULD restrict redirects for security-sensitive fetches (SSRF, auth flows). - -Insecure patterns: -- Using `http.DefaultClient` / `http.Get` for user-influenced destinations with no timeout policy. -- Missing `defer resp.Body.Close()` leading to resource leaks. -- `io.ReadAll(resp.Body)` with no limit. - -Detection hints: -- Search for `http.Get(`, `http.Post(`, `client := &http.Client{}` without `Timeout`, `client.Do(` and missing closes. -- Search for `io.ReadAll(resp.Body)`. - -Fix: -- Use a configured client with timeouts. -- Always close response bodies. -- Use bounded readers (`io.LimitReader`) for large/untrusted responses. - -Notes: -- The net/http package exposes `DefaultClient` as a zero-valued `http.Client`, which can easily lead to “no timeout” behavior unless configured. - ---- - -### GO-REDIRECT-001: Prevent open redirects -Severity: Medium (can be High with auth flows) - -Required: -- MUST validate redirect targets derived from untrusted input (`next`, `redirect`, `return_to`). -- SHOULD prefer only same-site relative paths. -- SHOULD fall back to a safe default on validation failure. - -Insecure patterns: -- `http.Redirect(w, r, r.URL.Query().Get("next"), http.StatusFound)` with no validation. - -Detection hints: -- Search for `http.Redirect(` and check origin of the location. - -Fix: -- Allowlist internal paths or known domains. -- Reject absolute URLs unless explicitly needed and allowlisted. - ---- - -### GO-CRYPTO-001: Cryptographic randomness MUST come from crypto/rand -Severity: High (Critical if used for auth/session tokens or keys) - -Required: -- MUST use `crypto/rand` for: - - session IDs, password reset tokens, API keys, CSRF tokens, nonces - - encryption keys, signing keys, salts when required -- MUST NOT use `math/rand` for any security-sensitive value. -- SHOULD use built-in helpers that produce appropriately strong tokens when available. 
- -Insecure patterns: -- `math/rand.Seed(time.Now().UnixNano())` followed by token generation for auth or sessions. -- Using UUIDv4-like constructs built from `math/rand`. - -Detection hints: -- Search for `math/rand`, `rand.Seed`, `rand.Intn` in code that touches auth/session/token flows. -- Search for custom token generators. - -Fix: -- Switch to `crypto/rand` (`rand.Reader`, `rand.Read`, or secure token helpers). -- Ensure sufficient entropy and use URL-safe encoding. - -Notes: -- The crypto/rand package provides secure randomness APIs and token generation helpers. - ---- - -### GO-AUTH-001: Password storage MUST use adaptive hashing (bcrypt/argon2id) and safe comparisons -Severity: High - -Required: -- MUST hash passwords using an adaptive password hashing function (bcrypt or argon2id). -- MUST NOT store plaintext passwords or reversible encryption of passwords. -- MUST compare secrets in constant time when relevant (tokens, MACs, API keys) to reduce timing leaks. -- SHOULD ensure password policies do not exceed algorithm constraints (e.g., bcrypt has input length limits; handle long passphrases appropriately). - -Insecure patterns: -- `sha256(password)` stored as password hash. -- Plaintext password storage. -- Comparing secrets with `==` in timing-sensitive contexts. - -Detection hints: -- Search for `sha1`, `sha256`, `md5` used on passwords. -- Search for `bcrypt`/`argon2` usage; if absent, suspect. -- Search for `==` comparisons on tokens/API keys. - -Fix: -- Use `bcrypt.GenerateFromPassword` / `CompareHashAndPassword` or argon2id with recommended parameters. -- Use constant-time compare helpers when comparing MACs/tokens. - -Notes: -- Go provides bcrypt in `golang.org/x/crypto/bcrypt`, and constant-time comparisons in `crypto/subtle`. 
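The token-generation and constant-time-comparison requirements in GO-CRYPTO-001 and GO-AUTH-001 can be sketched with the standard library alone. The helper names `newToken` and `tokensEqual` are hypothetical, and the 32-byte token size is an assumption rather than a value taken from the spec. Hashing both inputs before comparing is one way to avoid leaking length differences, since `subtle.ConstantTimeCompare` returns immediately when the lengths differ.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/base64"
	"fmt"
)

// newToken returns a URL-safe token carrying nBytes of entropy from crypto/rand.
// (Hypothetical helper; the caller chooses the size.)
func newToken(nBytes int) (string, error) {
	buf := make([]byte, nBytes)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(buf), nil
}

// tokensEqual compares two secrets without an early exit on the first
// mismatching byte. Both inputs are hashed to a fixed length first so the
// comparison does not reveal a length difference.
func tokensEqual(a, b string) bool {
	ha := sha256.Sum256([]byte(a))
	hb := sha256.Sum256([]byte(b))
	return subtle.ConstantTimeCompare(ha[:], hb[:]) == 1
}

func main() {
	tok, err := newToken(32) // 32 bytes is an assumed size for this sketch
	if err != nil {
		fmt.Println("token error:", err)
		return
	}
	fmt.Println("same token matches:", tokensEqual(tok, tok))
	fmt.Println("different token matches:", tokensEqual(tok, tok+"x"))
}
```

This sketch covers tokens, MACs, and API keys only. Passwords should still go through bcrypt or argon2id as the rule requires, not through a plain hash and compare.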
- ---- - -### GO-CONC-001: Data races and concurrency hazards MUST be treated as security-relevant -Severity: Medium to High (depends on what races affect) - -Required: -- MUST run tests with the race detector (`go test -race`) in CI for security-sensitive services. -- MUST fix detected races; do not suppress without deep justification. -- SHOULD treat shared mutable state in handlers as high risk; enforce synchronization or avoid shared mutability. - -Insecure patterns: -- Global maps/slices mutated from multiple goroutines without a mutex. -- Caches or auth/session state stored in globals without concurrency protection. -- Racy access to authorization state (can lead to bypasses or inconsistent enforcement). - -Detection hints: -- Search for `var someMap = map[...]...` used in handlers. -- Look for missing `sync.Mutex`, `sync.Map`, channels, or other synchronization. -- Ensure CI includes `-race` and that it runs relevant tests. - -Fix: -- Add proper synchronization or redesign to avoid shared mutable state. -- Add race tests and run them continuously. - -Notes: -- The Go race detector only finds races that occur in executed code paths; improve test coverage and run realistic workloads with `-race` where feasible. - ---- - -### GO-UNSAFE-001: Use of unsafe/cgo MUST be minimized and audited like memory-unsafe code -Severity: High (Critical in high-risk code paths) - -Required: -- SHOULD avoid importing `unsafe` in application code unless absolutely necessary. -- If `unsafe` is used, MUST treat it as “manual memory safety” requiring careful review and test coverage. -- If `cgo` is used, MUST treat the C/C++ boundary as memory-unsafe; apply secure coding practices on the C side and isolate where possible. - -Insecure patterns: -- Widespread `unsafe.Pointer` casts in parsing, serialization, auth, or network code. -- `cgo` used for parsing or security boundaries without sandboxing. 
- -Detection hints: -- Search for `import "unsafe"`, `unsafe.Pointer`, `// #cgo`, `import "C"`. -- Prioritize review where unsafe touches untrusted input. - -Fix: -- Replace unsafe/cgo usage with safe standard library alternatives where possible. -- Isolate unsafe code in small, well-tested modules with fuzz/race tests. - -Notes: -- The unsafe package explicitly provides operations that step around Go’s type safety guarantees. - --------------------------------------------------------------------- - -## 5) Practical scanning heuristics (how to “hunt”) - -When actively scanning, use these high-signal patterns: - -Toolchain & dependencies: -- `FROM golang:` (Dockerfiles), `go-version:` (CI), `toolchain go` (go.mod), pinned old versions -- `GOSUMDB=off`, `GOINSECURE`, `GONOSUMDB`, `GOPROXY=direct` -- `replace` directives in `go.mod` to forks/paths -- `govulncheck` missing in CI - -HTTP server hardening: -- `http.ListenAndServe(`, `ListenAndServeTLS(`, `&http.Server{` with missing timeouts -- `ReadHeaderTimeout: 0`, `ReadTimeout: 0`, `WriteTimeout: 0`, `IdleTimeout: 0`, missing `MaxHeaderBytes` - -Body parsing / DoS: -- `io.ReadAll(r.Body)`, `json.NewDecoder(r.Body)` without size cap -- `ParseMultipartForm`, `FormFile`, `multipart.NewReader` without explicit limits -- Missing `http.MaxBytesReader` - -Debug exposure: -- `import _ "net/http/pprof"` -- `/debug/pprof`, `/debug/vars` - -Templates / XSS / SSTI: -- `text/template` used for HTML output -- `template.HTML(`, `template.JS(`, `template.URL(` with user-controlled data -- `.Parse(` on user-controlled strings - -Files: -- `http.ServeFile(` with user path -- `http.FileServer(http.Dir(` pointing at repo root or uploads -- `os.Open(filepath.Join(base, user))` without containment checks - -Injection: -- SQL building with `fmt.Sprintf`, string concatenation near `db.Query/Exec` -- `exec.Command("sh","-c", ...)`, `exec.Command("bash","-c", ...)` - -SSRF / outbound HTTP: -- `http.Get(userURL)`, `client.Do(req)` where URL 
comes from request/DB -- Missing client timeout, missing `resp.Body.Close()`, unbounded `io.ReadAll(resp.Body)` - -Crypto: -- `math/rand` in token/session generation -- `InsecureSkipVerify: true` -- Password hashing with `sha256`/`md5` instead of bcrypt/argon2 - -Concurrency: -- Shared maps/slices mutated from handlers without locks -- CI lacking `go test -race` - -Always try to confirm: -- data origin (untrusted vs trusted) -- sink type (template/SQL/subprocess/files/http) -- protective controls present (limits, validation, allowlists, middleware, network controls) - --------------------------------------------------------------------- - -## 6) Sources (accessed 2026-01-28) - -Primary Go documentation: -- Go Security Policy — https://go.dev/doc/security/policy -- Go Release History (security fixes in patch releases) — https://go.dev/doc/devel/release -- Go 1.25 Release Notes — https://go.dev/doc/go1.25 -- net/http (server timeouts, MaxHeaderBytes, DefaultClient) — https://pkg.go.dev/net/http -- html/template (auto-escaping and trusted-template assumptions) — https://pkg.go.dev/html/template -- crypto/tls (MinVersion defaults, InsecureSkipVerify warnings) — https://pkg.go.dev/crypto/tls -- crypto/rand (secure randomness, token helpers) — https://pkg.go.dev/crypto/rand -- crypto/subtle (constant-time comparisons) — https://pkg.go.dev/crypto/subtle -- os/exec (no shell by default; command execution guidance) — https://pkg.go.dev/os/exec -- unsafe (bypasses type safety) — https://go.dev/src/unsafe/unsafe.go -- net/http/pprof (debug endpoints) — https://pkg.go.dev/net/http/pprof -- cmd/go (module authentication via go.sum/checksum DB; env vars like GOINSECURE) — https://pkg.go.dev/cmd/go -- Module Mirror and Checksum Database Launched (Go blog) — https://go.dev/blog/module-mirror-launch -- govulncheck documentation — https://pkg.go.dev/golang.org/x/vuln/cmd/govulncheck -- Go Race Detector documentation — https://go.dev/doc/articles/race_detector -- bcrypt (password 
hashing) — https://pkg.go.dev/golang.org/x/crypto/bcrypt -- Go vulnerability entry example (multipart resource consumption) — https://pkg.go.dev/vuln/GO-2023-1569 - -OWASP Cheat Sheet Series (general web security): -- Session Management — https://cheatsheetseries.owasp.org/cheatsheets/Session_Management_Cheat_Sheet.html -- CSRF Prevention — https://cheatsheetseries.owasp.org/cheatsheets/Cross-Site_Request_Forgery_Prevention_Cheat_Sheet.html -- SSRF Prevention — https://cheatsheetseries.owasp.org/cheatsheets/Server_Side_Request_Forgery_Prevention_Cheat_Sheet.html -- XSS Prevention — https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html -- HTTP Security Response Headers — https://cheatsheetseries.owasp.org/cheatsheets/HTTP_Headers_Cheat_Sheet.html \ No newline at end of file diff --git a/.agents/skills/security-best-practices/references/javascript-express-web-server-security.md b/.agents/skills/security-best-practices/references/javascript-express-web-server-security.md deleted file mode 100644 index 98d7906..0000000 --- a/.agents/skills/security-best-practices/references/javascript-express-web-server-security.md +++ /dev/null @@ -1,1158 +0,0 @@ -# Express (Node.js) Web Security Spec (Express 5.x / 4.19.2+, Node.js LTS) - -This document is designed as a **security spec** that supports: - -1. **Secure-by-default code generation** for new Express apps and routes. -2. **Security review / vulnerability hunting** in existing Express code (passive “notice issues while working” and active “scan the repo and report findings”). - -It is intentionally written as a set of **normative requirements** (“MUST/SHOULD/MAY”) plus **audit rules** (what bad patterns look like, how to detect them, and how to fix/mitigate them). - ---- - -## 0) Safety, boundaries, and anti-abuse constraints (MUST FOLLOW) - -* MUST NOT request, output, log, or commit secrets (API keys, passwords, private keys, session secrets, cookies, tokens). 
-* MUST NOT “fix” security by disabling protections (e.g., weakening cookie flags, disabling CSRF defenses for cookie-authenticated apps, enabling permissive CORS, trusting proxy headers from the open internet, turning on debugging/stack traces in production, disabling TLS without a replacement). -* MUST provide **evidence-based findings** during audits: cite file paths, code snippets, middleware/config values, and runtime assumptions that justify the claim. -* MUST treat uncertainty honestly: if a protection might exist in infrastructure (reverse proxy, gateway, WAF, CDN), report it as “not visible in app code; verify at runtime/config.” -* MUST prefer vetted libraries and platform controls over “roll your own” crypto/auth/session/CSRF. Express explicitly expects the application to validate/handle user input correctly; it does not do this automatically. ([Express][1]) - ---- - -## 1) Operating modes - -### 1.1 Generation mode (default) - -When asked to write new Express code or modify existing code: - -* MUST follow every **MUST** requirement in this spec. -* SHOULD follow every **SHOULD** requirement unless the user explicitly says otherwise. -* MUST prefer safe-by-default APIs and proven libraries over custom security code. -* MUST avoid introducing new risky sinks (shell execution, dynamic code evaluation, unsafe redirects, serving user files as HTML, template rendering from untrusted strings, unsafe filesystem paths, SSRF URL fetch endpoints, etc.). - -### 1.2 Passive review mode (always on while editing) - -While working anywhere in an Express repo (even if the user did not ask for a security scan): - -* MUST “notice” violations of this spec in touched/nearby code. -* SHOULD mention issues as they come up, with a brief explanation + safe fix. - -### 1.3 Active audit mode (explicit scan request) - -When the user asks to “scan”, “audit”, or “hunt for vulns”: - -* MUST systematically search the codebase for violations of this spec. 
-* MUST output findings in a structured format (see §2.3). - -Recommended audit order: - -1. Entrypoints (server/app bootstrap), deployment manifests, Dockerfiles, process manager config, CI/CD. -2. Express settings + middleware stack order (helmet, parsers, auth, sessions, CSRF, CORS). -3. Proxy trust (`trust proxy`) and IP/protocol/host handling. ([Express][2]) -4. Auth flows, sessions, cookies, password reset links, redirect handling. ([Express][1]) -5. State-changing routes + CSRF protections (cookie-authenticated apps). ([OWASP Cheat Sheet Series][3]) -6. Template rendering and XSS defenses (HTML generation, CSP, `res.locals`). ([OWASP Cheat Sheet Series][4]) -7. File handling (uploads + downloads + static files) and path traversal. ([Express][5]) -8. Injection classes (SQL, NoSQL, command execution, unsafe deserialization). ([OWASP Cheat Sheet Series][6]) -9. Outbound requests (SSRF) and webhook/callback delivery. ([OWASP Cheat Sheet Series][7]) -10. Rate limiting / brute-force defenses / abuse controls. ([Express][1]) -11. Dependency hygiene / lockfiles / npm audit / vulnerable Express versions. 
([Express][1]) - ---- - -## 2) Definitions and review guidance - -### 2.1 Untrusted input (treat as attacker-controlled unless proven otherwise) - -In Express, common untrusted inputs include: - -* `req.params` (route parameters) -* `req.query` (query string parameters; can be strings/arrays/objects depending on parsing) ([OWASP Cheat Sheet Series][8]) -* `req.body` from `express.json()`, `express.urlencoded()`, `express.text()`, `express.raw()` ([Express][5]) -* `req.headers` / `req.get(...)` -* `req.cookies` / `req.signedCookies` (if cookie parsing middleware is used) -* Upload metadata and filenames (e.g., multer `file.originalname`, `file.mimetype`) -* Any data from external systems (webhooks, third-party APIs, message queues) -* Any persisted user content (DB rows) that originated from users - -Special proxy note: - -* If `trust proxy` is enabled, values like `req.ip`, `req.hostname`, and `req.protocol` may be derived from `X-Forwarded-*` headers which **can be attacker-controlled** if your proxy chain is not correctly overwriting/removing them. ([Express][2]) - -### 2.2 State-changing request - -A request is state-changing if it can create/update/delete data, change auth/session state, trigger side effects (purchase, email send, webhook send), or initiate privileged actions. - -### 2.3 Required audit finding format - -For each issue found, output: - -* Rule ID: -* Severity: Critical / High / Medium / Low -* Location: file path + function/route/middleware name + line(s) -* Evidence: the exact code/config snippet -* Impact: what could go wrong, who can exploit it -* Fix: safe change (prefer minimal diff) -* Mitigation: defense-in-depth if immediate fix is hard -* False positive notes: what to verify if uncertain - ---- - -## 3) Secure baseline: minimum production configuration (MUST in production) - -This is the smallest “production baseline” that prevents common Express misconfigurations. 
- -Minimum baseline targets: - -* `helmet()` is used and configured (especially CSP where applicable), and fingerprinting is reduced (disable `x-powered-by`). ([Express][1]) -* A custom 404 handler and a custom error handler exist, and production does not leak internal stack traces. ([Express][1]) -* Cookie/session usage is deliberate: - - * Not using default session cookie names - * Cookies use secure attributes (`Secure`, `HttpOnly`, `SameSite`) as appropriate - * Cookie-backed sessions never store secrets (they are readable by the client) - * Server-side sessions never use MemoryStore in production. ([Express][1]) -* Request body parsing has explicit limits (`express.json({ limit })`, `express.urlencoded({ limit, parameterLimit, depth })`). ([Express][5]) -* `trust proxy` is configured explicitly to match your proxy topology; not blindly `true`. ([Express][2]) -* Login/auth endpoints have brute-force protection and rate limiting. ([Express][1]) -* Dependencies are regularly audited/updated (`npm audit` + advisory response). ([Express][1]) - ---- - -## 4) Rules (generation + audit) - -Each rule contains: required practice, insecure patterns, detection hints, and remediation. - -### EXPRESS-INPUT-001: Treat all user input as untrusted and validate it - -Severity: High - -Required: - -* MUST validate and normalize untrusted inputs before using them in security-sensitive logic or dangerous sinks (DB queries, redirects, filesystem, HTML output, shell commands). Ensure the untrusted inputs are type checked and structure checked before using or passing forward. -* SHOULD apply allowlists (known-good) rather than blocklists when feasible. -* MUST reject or safely handle unexpected types/shapes in `req.query`, `req.params`, and `req.body`. - -Insecure patterns: - -* Passing `req.query`, `req.params`, `req.body` directly into database/query builders, redirects, filesystem paths, or templates. 
-* Assuming `req.query.foo` is always a string (it can be an array/object depending on parsing). ([OWASP Cheat Sheet Series][8]) - -Detection hints: - -* Identify “untrusted-to-sink” flows: request → sink (`res.redirect`, SQL execution, `sendFile`, `child_process`, template render, outbound fetch). -* Search for direct usage of `req.query.*`, `req.body.*`, `req.params.*` in sensitive calls. - -Fix: - -* Add schema validation (e.g., zod/joi/express-validator) at route boundaries. -* Normalize types (e.g., force IDs to integers; reject arrays when scalar expected). - -Notes: - -* Express production security guidance explicitly says input validation/handling is the application’s responsibility. ([Express][1]) - ---- - -### EXPRESS-REDIRECT-001: Prevent open redirects; validate redirect targets - -Severity: Medium - -Required: - -* MUST validate redirect destinations derived from untrusted input (`next`, `return_to`, `url`). -* SHOULD allowlist only same-site relative paths (preferred) or a strict allowlist of domains. -* MUST fall back to a safe default when validation fails. - -Insecure patterns: - -* `res.redirect(req.query.next)` with no validation. -* `res.redirect(req.body.url)` or `res.location(...)` using untrusted URLs. - -Detection hints: - -* Search for `res.redirect(` and `res.location(` and trace the source of the target. -* Look for query params named `next`, `redirect`, `return`, `url`. - -Fix: - -* Only allow relative paths (starting with `/`) and disallow `//`, backslashes, and encoded variants. -* If cross-domain redirects are required, allowlist exact hosts and enforce `https`. - -Notes: - -* Express documentation calls out open redirects as dangerous user input and shows validating the host before redirecting. ([Express][1]) -* Keep Express updated: Express has had an open-redirect-related CVE affecting some versions, and upgrades are part of the mitigation posture. 
([NVD][9]) - ---- - -### EXPRESS-HEADERS-001: Use Helmet (or equivalent) to set essential security headers - -Severity: Medium - -Required: - -* SHOULD use `helmet()` to set common security headers. -* SHOULD configure CSP realistically (avoid `unsafe-inline` where possible) for pages that render user-influenced content. -* SHOULD set `X-Content-Type-Options: nosniff`, clickjacking defenses (`X-Frame-Options` or CSP `frame-ancestors`), and appropriate referrer policy. - -NOTE: It is most important to set the CSP's script-src. All other directives are not as important and can generally be excluded for the ease of development. - -Insecure patterns: - -* No security headers set in app code and no evidence they are set at the edge. -* CSP missing on apps that display user content. -* Misconfigured framing headers that unintentionally allow clickjacking. - -Detection hints: - -* Search for `helmet(` usage; check if CSP is configured or disabled. -* Search for `res.setHeader(` / `res.set(` for security header setting. -* If not visible in app code, check nginx/CDN config; otherwise flag “verify at edge.” - -Fix: - -* Add `helmet()` early in middleware order and configure: - - * CSP (`contentSecurityPolicy`) - * Frame protections (`frameguard` or CSP `frame-ancestors`) - * `X-Content-Type-Options` (`noSniff`) - -Notes: - -* Express production security best practices recommend Helmet and list headers Helmet sets by default. ([Express][1]) -* OWASP HTTP Headers guidance is a useful reference when tuning policies. ([OWASP Cheat Sheet Series][10]) - ---- - -### EXPRESS-FINGERPRINT-001: Reduce fingerprinting by disabling `x-powered-by` and customizing error/404 responses - -Severity: Low (defense-in-depth) - -Required: - -* SHOULD disable `X-Powered-By` using `app.disable('x-powered-by')`. -* SHOULD provide a custom 404 handler and a custom error handler to avoid distinct default responses and to control information leakage. 
- -Insecure patterns: - -* Default `X-Powered-By: Express` header left enabled. -* Default Express 404/error responses in production with identifiable formatting and/or stack traces. - -Detection hints: - -* Search for `app.disable('x-powered-by')`. -* Check middleware tail for a custom 404 (`app.use((req,res)=>...)`) and a custom error handler (`app.use((err,req,res,next)=>...)`). -* Check if `NODE_ENV` is correctly set for production behavior (see EXPRESS-ERROR-001). ([Express][11]) - -Fix: - -* Add: - - * `app.disable('x-powered-by')` - * A custom 404 handler - * A custom error handler that logs server-side and returns generic messages client-side - -Notes: - -* Express docs explicitly recommend disabling `x-powered-by` and adding your own not-found and error handlers. ([Express][1]) - ---- - -### EXPRESS-COOKIE-001: Cookies must use secure attributes and minimal scope - -Severity: Medium - -Required: - -* MUST set cookie flags appropriately for any authentication/session cookie: - - * `Secure` when HTTPS (production) IMPORTANT NOTE: Only set `Secure` in production environment if TLS is configured. When running in a local dev environment over HTTP, do not set `Secure` property on cookies. You should do this conditionally based on if the app is running in production mode. You should also include a property like `SESSION_COOKIE_SECURE` which can be used to disable `Secure` cookies when testing over HTTP. - * `HttpOnly` for auth/session cookies - * `SameSite` set deliberately (`Lax` is a common baseline; `Strict` if compatible; `None` only with `Secure` and a justified cross-site need) -* SHOULD avoid setting `domain` broadly (avoid “all subdomains” unless required). -* SHOULD set bounded expiry appropriate to risk and UX. - -Insecure patterns: - -* Session/auth cookies without `HttpOnly`. -* Cookies without `Secure` in production HTTPS. -* `SameSite=None` + cookie-authenticated state-changing endpoints without CSRF protections. 
- -Detection hints: - -* Search for `res.cookie(`, `Set-Cookie`, `cookie: { ... }`, `express-session`, `cookie-session`. -* Verify cookie flags in session middleware configuration. - -Fix: - -* Set these attributes centrally in session/cookie middleware configuration. - -Notes: - -* Express production security guidance lists cookie security options (`secure`, `httpOnly`, etc.). ([Express][1]) -* `res.cookie()` ultimately sets `Set-Cookie` with options; defaults follow RFC 6265 behavior when options are omitted. ([Express][5]) -* OWASP session management guidance is relevant for choosing flags and lifetimes. ([OWASP Cheat Sheet Series][12]) - ---- - -### EXPRESS-SESS-001: Do not use the default session cookie name; avoid session fingerprinting - -Severity: Low (defense-in-depth) - -Required: - -* SHOULD override the default session cookie name (e.g., do not keep `connect.sid` when using `express-session`). -* SHOULD use a generic name (e.g., `sessionId`) unless you have a compatibility reason. - -Insecure patterns: - -* `express-session` used with no `name:` configured (default cookie name). -* Multiple apps on the same domain sharing a cookie name accidentally. - -Detection hints: - -* Search for `express-session` config blocks; check for `name:`. - -Fix: - -* Set `name: 'sessionId'` (or similar) in `express-session` options. - -Notes: - -* Express docs explicitly recommend not using the default session cookie name to reduce fingerprinting. ([Express][1]) - ---- - -### EXPRESS-SESS-002: Session storage and lifecycle must be production-safe - -Severity: High - -Required: - -* MUST NOT use `MemoryStore` in production (it is not designed for production use). -* MUST store session secrets outside source control and rotate them safely. -* SHOULD regenerate sessions on login / privilege changes to reduce session fixation risk. -* MUST NOT store sensitive secrets in client-readable cookie sessions. 
- -Insecure patterns: - -* `app.use(session({ store: new MemoryStore(), ... }))` or missing store (defaults to MemoryStore). -* Hard-coded secrets committed to the repo, for example `secret: 'keyboard cat'` or `secret: 's3Cur3'`. -* Using `cookie-session` to store access tokens, refresh tokens, or PII. - -Detection hints: - -* Search for `express-session` and look for `MemoryStore` usage or missing `store`. -* Search for `secret:` in session config and check if it’s hard-coded. -* Look for `req.session = ...` patterns and whether sensitive data is stored. - -Fix: - -* Use a production session store (Redis, database-backed, etc.). -* Load secrets from environment/secret manager. -* On login: `req.session.regenerate(...)` or equivalent flow with safe privilege re-binding. - -Notes: - -* `express-session` explicitly warns that `MemoryStore` is not designed for production. ([Express][1]) -* `express-session` documents rotating secrets and session regeneration to guard against fixation. ([Express][1]) -* Express notes that cookie-backed sessions serialize data into the cookie and that cookie data is visible to the client; keep it small and non-secret. ([Express][1]) - ---- - -### EXPRESS-CSRF-001: Cookie-authenticated state-changing requests MUST be CSRF-protected - -Severity: High - -- IMPORTANT NOTE: If cookies are not used for auth (i.e., auth is via the `Authorization` header or another explicitly passed token), classic browser CSRF generally does not apply. - -Required: - -* MUST protect all state-changing endpoints (POST/PUT/PATCH/DELETE) that rely on cookies for authentication. -* SHOULD use a well-understood CSRF mitigation (token-based is the typical baseline). -* MAY add defense-in-depth: Origin/Referer validation, Fetch Metadata enforcement, SameSite cookies, custom header requirements for XHR/fetch, **but do not treat these as a full replacement** unless explicitly designed and justified. 
-* MUST, at a minimum, require a custom HTTP header if form-based CSRF tokens are not practical, as this is the second-strongest method. - -IMPORTANT NOTE: - -* If authentication is done via `Authorization: Bearer ...` headers (and not cookies), classic browser CSRF is typically not applicable. - -Insecure patterns: - -* Cookie-authenticated endpoints that change state with no CSRF protection. -* Using GET for state-changing actions (amplifies CSRF risk). -* “CSRF protection” that only checks a user-controlled field. - -Detection hints: - -* Enumerate routes with methods other than GET/HEAD and identify whether cookies gate auth. -* Look for presence/absence of CSRF middleware and token checks. -* Check JSON APIs too, not only HTML forms. - -Fix: - -* Implement CSRF tokens for cookie-authenticated flows. -* Add Origin/Referer checks where feasible, and ensure SameSite is set appropriately. - -Notes: - -* OWASP CSRF guidance and OWASP Node.js guidance both recommend anti-CSRF tokens as a standard control for web apps. ([OWASP Cheat Sheet Series][3]) - ---- - -### EXPRESS-CORS-001: CORS must be explicit and least-privilege - -Severity: Medium (High if misconfigured with credentials) - -Required: - -* If CORS is not needed, MUST keep it disabled. -* If CORS is needed: - - * MUST allowlist trusted origins (do not reflect arbitrary `Origin` without validation). - * MUST NOT combine broad origins with credentialed cookies (`Access-Control-Allow-Credentials: true`). - * SHOULD restrict methods, headers, and exposed headers to what’s required. - -Insecure patterns: - -* `Access-Control-Allow-Origin: *` with `Access-Control-Allow-Credentials: true`. -* Reflecting `Origin` for all requests without allowlist validation. -* Applying permissive CORS middleware globally when only a subset needs cross-origin access. - -Detection hints: - -* Search for `cors(`, `Access-Control-Allow-Origin`, `Access-Control-Allow-Credentials`.
-* Inspect whether cookies are used for auth on endpoints exposed cross-origin. - -Fix: - -* Implement strict origin allowlist and ensure credentialed requests only for intended origins. -* Consider splitting CORS config per route group rather than global. - -Notes: - -* OWASP HTTP header guidance covers security implications of response headers, including those that affect browser behavior; use it as a reference when reviewing header posture. ([OWASP Cheat Sheet Series][10]) - ---- - -### EXPRESS-PROXY-001: Reverse proxy trust (`trust proxy`) must be configured correctly - -Severity: Medium (High if using IP based authentication) - -Required: - -* If behind a reverse proxy/LB, MUST configure `app.set('trust proxy', ...)` to match the real proxy chain. -* MUST NOT blindly set `trust proxy = true` unless you fully control the proxy behavior and header rewriting. -* MUST ensure the last trusted proxy overwrites/removes `X-Forwarded-For`, `X-Forwarded-Host`, and `X-Forwarded-Proto` so clients cannot spoof them. - -Insecure patterns: - -* `app.set('trust proxy', true)` in an app directly exposed to the internet or behind unknown proxies. -* Using `req.ip`, `req.protocol`, `req.hostname` for security decisions without correct proxy trust configuration. -* Rate limiting keyed by `req.ip` with spoofable forwarded headers. - -Detection hints: - -* Search for `app.set('trust proxy'`. -* Check infra docs (nginx/LB) for header rewriting behavior. -* Identify any security logic using `req.ip`, `req.ips`, `req.protocol`, `req.hostname`. - -Fix: - -* Set `trust proxy` to a hop count, explicit IP/subnet list, or a custom function matching your network. -* Ensure proxies overwrite forwarded headers. - -Notes: - -* Express explicitly warns that when `trust proxy` is `true`, the client IP is derived from `X-Forwarded-For`, and if proxies don’t overwrite forwarded headers, the client can provide any value. 
It also describes that enabling trust proxy impacts `req.hostname` and `req.protocol` derived from forwarded headers. ([Express][2]) - ---- - -### EXPRESS-BODY-001: Request body size and parsing limits MUST be set appropriately - -Severity: Low - -Required: - -* SHOULD set explicit body size limits for: - - * `express.json({ limit })` - * `express.urlencoded({ limit, parameterLimit, depth })` -* SHOULD only enable the parsers you need; do not parse large bodies by default for all routes. -* SHOULD enforce additional limits at the reverse proxy / gateway level. - -Insecure patterns: - -* No explicit body limits (accepting arbitrarily large JSON/urlencoded). -* Global parsers applied to all routes when only some need bodies. -* `parameterLimit` very high without justification (DoS potential). - -Detection hints: - -* Search for `express.json(` and confirm `limit` is set (or consciously accepted). -* Search for `express.urlencoded(` and inspect `limit`, `parameterLimit`, and `depth`. -* Review upload/webhook endpoints for special parsing needs. - -Fix: - -* Configure parsers with conservative defaults and override per route group when needed. - -Notes: - -* Express documents `express.json` options (including `limit`, defaulting to 100kb) and explicitly notes `req.body` is untrusted and should be validated. ([Express][5]) -* Express documents `express.urlencoded` options including `limit`, `parameterLimit`, and `depth`. ([Express][5]) -* OWASP Node.js guidance also recommends setting request size limits. ([OWASP Cheat Sheet Series][8]) - ---- - -### EXPRESS-INPUT-002: Prevent HTTP Parameter Pollution and type confusion in `req.query` - -Severity: Medium - -Required: - -* MUST treat `req.query` values as potentially multi-valued (array/object), depending on query parsing. -* SHOULD reject ambiguous multi-valued parameters for security-sensitive fields (e.g., `role`, `isAdmin`, `redirect`, `amount`, `userId`). 
-* SHOULD consider explicit parsing or dedicated middleware if parameter pollution is a concern. - -Insecure patterns: - -* `if (req.query.admin) { ... }` without type checks (arrays/objects may coerce truthy). -* Passing `req.query` directly into ORM/NoSQL query objects. - -Detection hints: - -* Search for security-sensitive comparisons on `req.query.*` without type enforcement. -* Look for code that assumes query params are strings. - -Fix: - -* Validate shape: enforce string-only for certain params and reject arrays/objects. -* Normalize query parsing settings (simple vs extended) where applicable and documented. - -Notes: - -* OWASP Node.js cheat sheet explicitly highlights that Express query parsing can produce strings, arrays, or objects and recommends preventing HTTP Parameter Pollution. ([OWASP Cheat Sheet Series][8]) - ---- - -### EXPRESS-XSS-001: Prevent reflected/stored XSS in HTML responses and templating - -Severity: High - -Required: - -* MUST escape untrusted content in HTML output (templates should auto-escape by default; do not bypass). -* MUST NOT inject untrusted strings into HTML without escaping/sanitization. -* SHOULD set CSP (via Helmet) for apps rendering user-controlled content. -* SHOULD keep `res.locals` free of user-controlled input intended for templates unless it is validated/escaped. - -Insecure patterns: - -* `res.send("
" + req.query.q + "
")` -* Passing untrusted HTML through “safe” template flags/filters. -* Writing untrusted strings into `res.locals` and then rendering without escaping. - -Detection hints: - -* Search for `res.send(` with strings containing user input. -* Search for template “safe” flags (engine-specific) and trace data origin. -* Search for assignments to `res.locals` and whether they might contain untrusted data. - -Fix: - -* Use a template engine with autoescaping; pass only validated data. -* For rich text that must contain HTML, use a trusted sanitizer and an allowlist policy. -* Add CSP with realistic directives. - -Notes: - -* Express API docs explicitly warn that `res.locals` “should not contain user-controlled input” and is often used to expose things like CSRF tokens to templates. ([Express][5]) -* OWASP XSS prevention guidance provides standard output-encoding and policy recommendations. ([OWASP Cheat Sheet Series][4]) -* Helmet can mitigate some XSS classes via headers such as CSP. ([Express][1]) - ---- - -### EXPRESS-TEMPLATE-001: Never render untrusted templates or template paths (SSTI / LFI risk) - -Severity: Critical (if you can prove template strings/paths are user/attacker-controlled) - -Required: - -* MUST NOT render templates whose contents or template path/name is influenced by untrusted input. -* MUST NOT load templates from user-controlled filesystem locations. -* SHOULD treat “email template editors”, “theme engines”, and “CMS-like template storage” as high-risk designs requiring sandboxing and isolation. - -Insecure patterns: - -* `res.render(req.query.view, data)` where `view` is not allowlisted. -* Rendering a template from a string that includes user input (engine-specific). -* Loading templates from uploads directories. - -Detection hints: - -* Search for `res.render(` where the first argument is derived from request/DB without allowlist. -* Search for template compilation APIs (engine-specific) fed by user content. 
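The allowlist approach that EXPRESS-TEMPLATE-001 calls for can be sketched as a lookup that maps an untrusted request value onto a fixed set of template names. This is only an illustration: `ALLOWED_VIEWS`, `resolveView`, and the view names are hypothetical and would need to match the application's real templates.

```javascript
// Hypothetical allowlist; the keys and template names are placeholders.
const ALLOWED_VIEWS = new Map([
  ['home', 'pages/home'],
  ['about', 'pages/about'],
]);

// Maps an untrusted value (e.g. req.query.view) to a known template name.
// Returns null for anything not explicitly allowlisted.
function resolveView(requested) {
  // req.query values may be arrays or objects, so reject non-strings first.
  if (typeof requested !== 'string') return null;
  return ALLOWED_VIEWS.get(requested) ?? null;
}

// Sketch of use inside a route handler (not executed here):
//   const view = resolveView(req.query.view);
//   if (!view) return res.status(404).end();
//   res.render(view, data);
```

Using a `Map` rather than a plain object is a deliberate choice in this sketch, since it avoids accidental matches on inherited keys such as `__proto__`.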
- -Fix: - -* Use allowlisted template names and a fixed templates directory. -* If user-defined templates are required, implement strict sandboxing and isolate execution. - -Notes: - -* Express’s template system depends on the chosen engine; assume unsafe if user input influences template selection or source. - ---- - -### EXPRESS-FILES-001: Prevent path traversal and unsafe file serving (sendFile/download) - -Severity: High - -Required: - -* MUST NOT pass user-controlled filesystem paths directly to `res.sendFile()` / `res.download()` / filesystem APIs. -* SHOULD use `res.sendFile` with a fixed `root` and strict options (e.g., deny dotfiles) when serving user-selected files from a directory. -* MUST enforce authorization checks before serving user-specific files. - -Insecure patterns: - -* `res.sendFile(req.query.path)` or `res.download(req.params.file)` with no root restriction. -* File-serving routes that accept `..` segments, encoded traversal, or absolute paths. - -Detection hints: - -* Search for `res.sendFile(` and trace the `path` argument origin. -* Search for `res.download(` and trace the `path` argument origin. -* Look for `fs.readFile`/`createReadStream` on paths derived from requests. - -Fix: - -* Use an identifier-to-path mapping stored server-side (DB), not raw paths from clients. -* Use `root: ` and `dotfiles: 'deny'` where appropriate; validate the filename component strictly. - -Notes: - -* Express’s `res.sendFile` docs show using a `root` option and `dotfiles: 'deny'` as part of a safe serving configuration. ([Express][5]) -* `res.download` transfers the file as an attachment, but you still must control/validate the underlying `path`. 
([Express][5]) - ---- - -### EXPRESS-STATIC-001: Harden `express.static` / serve-static and never serve untrusted uploads as active content - -Severity: Medium (if serving untrusted user files without robust limits on the file extensions) - -Required: - -* MUST NOT serve user uploads from a public static directory as active content (especially HTML/JS/SVG) unless explicitly intended and sandboxed. If the content is known to be inactive (PNG, JPG, and other image formats), serving it may be safe; consider validating that image file extensions are allowlisted before serving them. -* SHOULD configure static serving to: - - * deny/ignore dotfiles - * avoid unintended directory indexes if not needed - * apply appropriate cache controls for immutable assets - -Insecure patterns: - -* `app.use(express.static('uploads'))` where users can upload arbitrary files. -* Serving uploaded HTML or SVG inline from the same origin as the app. - -Detection hints: - -* Search for `express.static(` and identify served directories. -* Compare served directories with upload storage locations. -* Check for `dotfiles` and `index` options in static middleware. - -Fix: - -* Store uploads outside any static web root and serve via controlled routes that set safe `Content-Type` and `Content-Disposition: attachment` when appropriate. -* Configure `express.static(root, { dotfiles: 'deny'|'ignore', index: false (if desired) })`. - -Notes: - -* Express documents `express.static` options, including `dotfiles` behavior and `index`. ([Express][5]) - ---- - -### EXPRESS-UPLOAD-001: File uploads must be validated, stored safely, and served safely - -Severity: Low - Medium - -Required: - -* SHOULD enforce upload size limits (app + edge). -* MUST validate file type using allowlists and content checks (not only filename extension). -* MUST store uploads outside executable/static roots when possible. -* SHOULD generate server-side filenames (random IDs); do not trust original names.
-* MUST serve potentially active formats safely (download attachment) unless explicitly intended. - -Insecure patterns: - -* Accepting arbitrary file types and serving them back inline. -* Using `file.originalname` as the storage path. -* Missing size/type validation. - -Detection hints: - -* Look for multer/busboy/formidable usage and check for `limits`. -* Check where uploaded files are written and how they are served. -* Check whether uploads end up under `public/` or any `express.static` root. - -Fix: - -* Implement allowlist validation + safe storage + safe serving, per OWASP upload guidance. - -Notes: - -* OWASP File Upload guidance covers allowlists, content validation, storage, and safe serving patterns. ([OWASP Cheat Sheet Series][13]) - ---- - -### EXPRESS-INJECT-001: Prevent SQL injection (use parameterized queries / ORM) - -Severity: High - -Required: - -* MUST use parameterized queries or an ORM/query builder that parameterizes under the hood. -* MUST NOT build SQL via string concatenation/template literals with untrusted input. - -Insecure patterns: - -* ``db.query(`SELECT * FROM users WHERE id = ${req.query.id}`)`` -* `"SELECT ... WHERE name = '" + req.body.name + "'"` - -Detection hints: - -* Grep for `SELECT`, `INSERT`, `UPDATE`, `DELETE` strings in JS/TS. -* Trace untrusted input into `.query(...)`, `.execute(...)`, or raw SQL APIs. - -Fix: - -* Replace with parameterized queries (placeholders) or ORM query APIs. -* Validate types (e.g., integer IDs) before querying. - -Notes: - -* OWASP SQL injection prevention guidance strongly favors parameterized queries. ([OWASP Cheat Sheet Series][6]) - ---- - -### EXPRESS-INJECT-002: Prevent NoSQL injection / operator injection (Mongo-style) - -Severity: High (app-dependent) - -Required: - -* MUST validate types and schemas for any query object built from untrusted input. -* MUST prevent operator injection (e.g., `$ne`, `$gt`, `$where`) if user input is merged into query objects. 
-* SHOULD consider defensive libraries/middleware when appropriate. - -Insecure patterns: - -* `collection.find(req.body)` where the body is attacker-controlled. -* Merging `req.query`/`req.body` into Mongo queries without schema validation. - -Detection hints: - -* Search for `find(`, `findOne(`, `aggregate(` calls where argument is request-derived. -* Check for patterns like `{ ...req.query }` or `Object.assign(query, req.body)`. - -Fix: - -* Use schema validation at boundary; explicitly construct query objects from validated fields only. - -Notes: - -* OWASP Node.js cheat sheet discusses input validation and mentions Node ecosystem modules commonly used for sanitization in NoSQL contexts. ([OWASP Cheat Sheet Series][8]) - ---- - -### EXPRESS-CMD-001: Prevent OS command injection (child_process) - -Severity: Critical to High (depends on exposure), please prove it is user/attacker controlled - -Required: - -* MUST avoid executing shell commands with untrusted input. -* If subprocess is necessary: - - * MUST avoid `exec()` / `execSync()` with attacker-influenced strings - * MUST NOT use `shell: true` with attacker-influenced data - * SHOULD use `spawn()` with an argument array and strict allowlists. Ensure the executable is hardcoded or allow-listed, do not use a user supplied command name. - * SHOULD place user-controlled values after `--` when supported by the subcommand to avoid flag injection - -Insecure patterns: - -* `exec(req.query.cmd)` -* `exec(`convert ${userPath} ...`)` -* `spawn('sh', ['-c', userString])` -* `spawn(userString, ['foo'])` - -Detection hints: - -* Search for `child_process`, `exec(`, `execSync(`, `spawn(`, `fork(`. -* Trace request/DB data into command construction. - -Fix: - -* If possible, write the functionality in javascript or use a library instead of subprocess. -* If unavoidable, hard-code command and strictly allowlist parameters. - -Notes: - -* OWASP OS command injection defense guidance covers avoid-shell and allowlist patterns. 
([OWASP Cheat Sheet Series][14]) - ---- - -### EXPRESS-SSRF-001: Prevent server-side request forgery (SSRF) in outbound HTTP - -Severity: Medium (High in cloud/LAN deployments) - -NOTE: This applies mostly to apps that are deployed in a cloud/LAN setup or that have other HTTP services on the same host. Some features (e.g., webhooks) unavoidably require this functionality. - -Required: - -* MUST treat outbound requests to user-provided URLs as high risk if there are other reachable private HTTP endpoints. -* SHOULD validate and restrict destinations (allowlist hosts/domains) for any user-influenced URL fetch. -* SHOULD block access to: - - * localhost / private IP ranges / link-local - * cloud metadata endpoints -* MUST allow only `http`/`https` for URL fetch features (to avoid schemes such as `file:` or `javascript:`). -* SHOULD set timeouts and restrict redirects. - -Insecure patterns: - -* `fetch(req.query.url)` -* “URL preview” / “import from URL” endpoints that accept arbitrary URLs. - -Detection hints: - -* Search for `fetch(`, `axios(`, `got(`, `request(`, `node-fetch` usage where URL originates from users/DB. -* Review webhook testers, previewers, image fetchers. - -Fix: - -* Enforce scheme allowlist, host allowlist, DNS/IP resolution checks, timeouts, and redirect policy. -* Consider network egress controls at infrastructure level. - -Notes: - -* OWASP SSRF prevention guidance provides standard controls and common pitfalls. ([OWASP Cheat Sheet Series][7]) - ---- - -### EXPRESS-ERROR-001: Error handling MUST not leak sensitive details in production - -Severity: Low - -Required: - -* SHOULD define a centralized error handler (`app.use((err, req, res, next) => ...)`) at the end of middleware. -* MUST avoid returning stack traces, internal error messages, or secrets to clients in production. -* SHOULD log errors server-side with appropriate redaction. -* SHOULD ensure the app runs with production settings so default behavior doesn’t leak details.
-* MUST avoid logging or returning sensitive information such as secrets, env vars, sessions, cookies in error messages in production. - -Insecure patterns: - -* Returning `err.stack` to clients. -* Using dev-only error middleware in production. -* `NODE_ENV` left as development, causing verbose error responses. - -Detection hints: - -* Verify there is a final error-handling middleware. -* Search for `res.status(500).send(err)` or similar. -* Check production environment variables and startup scripts. - -Fix: - -* Add a production-safe error handler that returns generic messages and logs details internally. -* Ensure environment is configured for production behavior. - -Notes: - -* Express production security guidance recommends custom error handling. ([Express][1]) -* Express error handling docs describe the default error handler behavior and how production mode affects what is exposed. ([Express][11]) - ---- - -### EXPRESS-AUTH-001: Prevent brute-force attacks against authorization endpoints - -Severity: Medium - -NOTE: This is highly application specific and while it is good to bring to the attention of the user, it is hard to fix without additional complex configurations. Prefer to inform the user and if they request you to help implement a solution, help walk them through possible solutions. - -Required: - -* SHOULD protect login/auth endpoints against brute forcing. -* SHOULD rate-limit by: - - 1. consecutive failed attempts per username+IP - 2. failed attempts per IP over a time window - -Insecure patterns: - -* Unlimited login attempts. - -Detection hints: - -* Identify all auth endpoints and check for rate limiting/throttling. -* Search for `rate-limiter-flexible`, `express-rate-limit`, or gateway policies. - -Fix: - -* Implement rate-limiting/throttling (app or edge). Express docs point to `rate-limiter-flexible` as a tool for this approach. ([Express][1]) - -Notes: - -* OWASP Node.js cheat sheet also recommends precautions against brute forcing. 
([OWASP Cheat Sheet Series][8]) - ---- - -### EXPRESS-DEPS-001: Dependency and patch hygiene (Express + Node + critical middleware) - -Severity: Medium / Low - -NOTE: `npm audit` often returns a large number of insignificant "vulnerabilities" that do not actually matter. Focus only on Express or other extremely critical packages, ignoring findings in dev tools, bundlers, etc. - -Do not upgrade packages without consent from the user. Upgrading may break existing code in unexpected ways. Instead, inform them of the outdated packages. - -Required: - -* MUST keep Express on a maintained version line (avoid EOL major versions). -* MAY use `npm audit` in CI and during maintenance work. -* SHOULD pin dependencies via lockfiles and review major updates carefully. - -Insecure patterns: - -* Running EOL Express versions (e.g., very old major lines). -* Ignoring `npm audit` findings without triage. -* Unpinned dependency ranges that auto-upgrade into insecure versions. - -Detection hints: - -* Check `package.json` and lockfiles for `express` version and other critical middleware versions. -* Inspect CI pipelines for `npm audit`/SCA steps. - -Fix: - -* Upgrade to latest stable Express and apply patches. -* Add automated dependency scanning and upgrade process. - -Notes: - -* Express production security guidance emphasizes that dependency vulnerabilities can compromise the app, and recommends `npm audit`. ([Express][1]) -* Track security issues affecting Express versions (including known open-redirect-related CVEs). ([NVD][9]) - ---- - -### EXPRESS-DOS-001: Configure DoS protections (timeouts, limits, reverse proxy) - -Severity: Low - -NOTE: It may be hard to tell from the provided application context if the application runs behind a reverse proxy. You can inform the user or recommend one, but do not attempt to configure one without them initiating it. This is highly deployment dependent.
- -Required: - -* SHOULD use a reverse proxy to provide caching, load balancing, and filtering controls when feasible. -* MAY configure server/proxy timeouts and connection limits to reduce exposure to Slowloris and similar DoS patterns. -* MUST ensure server/socket errors are handled so malformed connections do not crash the process. (Express should handle exceptions, but there are edge cases.) - -Insecure patterns: - -* No reverse proxy in front of a public Node server, with defaults everywhere. -* Missing error handlers on server/socket objects. -* Extremely permissive timeouts and unlimited body sizes. - -Detection hints: - -* Inspect server creation (`http.createServer`, `https.createServer`) and whether timeouts are set. -* Check proxy/gateway config for timeouts and max body size. - -Fix: - -* Explain how to configure a reverse proxy and timeouts, and set request size limits. -* Add robust error-handling middleware. - -Notes: - -* Node’s security guidance for HTTP DoS discusses using reverse proxies and correctly configuring server timeouts. ([Node.js][15]) - ---- - -### EXPRESS-NODE-INSPECT-001: Do not expose the Node inspector in production - -Severity: Critical - -NOTE: Ensure that this detection is actually in the production path, and not just being used for local debugging. - -Required: - -* MUST NOT run Node with `--inspect` (especially bound to non-loopback) in production. -* MUST ensure `NODE_OPTIONS` or startup scripts do not enable inspector in prod. -* SHOULD firewall the inspector port and debug locally only. - -Insecure patterns: - -* `node --inspect=0.0.0.0:9229 app.js` in production. -* Container/PM2/systemd configs enabling inspector. - -Detection hints: - -* Search for `--inspect` in Dockerfiles, Procfiles, systemd units, PM2 configs, npm scripts. -* Check `NODE_OPTIONS`. - -Fix: - -* Remove inspector flags from production start commands; restrict to local dev.
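As a complement to the detection hints above, a startup guard could check both the process arguments and `NODE_OPTIONS` for inspector flags. The sketch below is an assumption-laden illustration: `findInspectorFlags` is a hypothetical name, and the check only covers command-line style flags, not other ways of activating the inspector at runtime.

```javascript
// Hypothetical startup check; the function name and policy are assumptions.
// Returns every token that looks like a Node inspector flag.
function findInspectorFlags(execArgv, nodeOptions) {
  const tokens = [...execArgv, ...(nodeOptions || '').split(/\s+/)];
  // Matches --inspect, --inspect=host:port, --inspect-brk, --inspect-port=..., etc.
  return tokens.filter((token) => /^--inspect(-|=|$)/.test(token));
}

// Sketch of use at process start (not executed here):
//   if (process.env.NODE_ENV === 'production') {
//     const flags = findInspectorFlags(process.execArgv, process.env.NODE_OPTIONS);
//     if (flags.length > 0) {
//       throw new Error('Inspector flags present in production: ' + flags.join(' '));
//     }
//   }
```

Failing fast at startup is one possible policy; logging a warning and continuing would be a softer alternative, depending on how the deployment is managed.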
- -Notes: - -* Node security guidance discusses inspector exposure risks (e.g., DNS rebinding) and recommends not running inspector in production. ([Node.js][15]) - ---- - -### EXPRESS-NODE-HTTP-001: Do not enable insecure HTTP parsing in production - -Severity: High - -NOTE: Ensure that this detection is actually in the production path, and not just being used for local dev. - -Required: - -* MUST NOT use Node’s `insecureHTTPParser` in production. -* MAY suggest configuring front-end proxies to normalize ambiguous requests to reduce request smuggling risk. - -Insecure patterns: - -* Creating an HTTP server with `{ insecureHTTPParser: true }`. - -Detection hints: - -* Search for `insecureHTTPParser` in server creation code. - -Fix: - -* Remove insecure parsing; rely on spec-compliant parsing and normalize at the edge. - -Notes: - -* Node security guidance explicitly recommends not using `insecureHTTPParser`. ([Node.js][15]) - ---- - -## 5) Practical scanning heuristics (how to “hunt”) - -When actively scanning an Express repo, these patterns are high-signal: - -* TLS / transport: - - * `app.listen(80` without reverse proxy mention; missing `helmet`; cookies missing `secure` ([Express][1]) (NOTE this only applies to web facing applications, internal apps likely won't have TLS) -* Proxy trust: - - * `app.set('trust proxy', true)`; logic using `req.ip`/`req.protocol`/`req.hostname` ([Express][2]) -* Security headers / fingerprinting: - - * missing `helmet(`; missing `app.disable('x-powered-by')` ([Express][1]) -* Cookies / sessions: - - * `express-session` with missing `store` (MemoryStore risk), hard-coded `secret:`, missing `cookie: { secure/httpOnly/sameSite }` ([Express][1]) - * `cookie-session` storing large objects or secrets ([Express][1]) -* Body parsing limits: - - * `express.json()` or `express.urlencoded()` without `limit`/`parameterLimit`/`depth` ([Express][5]) -* CSRF: - - * POST/PUT/PATCH/DELETE routes using cookie auth with no CSRF tokens/origin checks 
([OWASP Cheat Sheet Series][3]) -* Open redirects: - - * `res.redirect(req.query.next)` or similar ([Express][1]) -* XSS / HTML output: - - * `res.send(` building HTML with user input; template “safe” flags; untrusted values in `res.locals` ([Express][5]) -* File handling: - - * `res.sendFile(` / `res.download(` where path originates from request; `express.static('uploads')` ([Express][5]) -* Injection: - - * SQL strings + template literals into DB calls ([OWASP Cheat Sheet Series][6]) - * `child_process.exec` / `execSync` / `shell: true` ([OWASP Cheat Sheet Series][14]) -* SSRF: - - * outbound `fetch/axios/got` to user-provided URLs ([OWASP Cheat Sheet Series][7]) -* Brute force / abuse: - - * auth endpoints lacking throttling; no rate limiting middleware ([Express][1]) -* Supply chain: - - * outdated Express versions; no lockfiles; no `npm audit` workflow ([Express][1]) -* Node runtime hazards: - - * `--inspect` in production scripts; `insecureHTTPParser` usage ([Node.js][15]) - -Always try to confirm: - -* data origin (untrusted vs trusted) -* sink type (HTML/template, SQL/NoSQL, subprocess, filesystem, redirect, outbound HTTP) -* protective controls present (validation, allowlists, middleware, proxy config, header policies) -* whether protections are at the edge vs in app code - ---- - -## 6) Sources (accessed 2026-01-27) - -Primary Express documentation: - -* Express: Production Best Practices — Security: `https://expressjs.com/en/advanced/best-practice-security.html` ([Express][1]) -* Express: Behind Proxies (`trust proxy`): `https://expressjs.com/en/guide/behind-proxies.html` ([Express][2]) -* Express 5.x API Reference (parsers, static, sendFile, redirect, cookies): `https://expressjs.com/en/5x/api.html` ([Express][5]) -* Express: Error Handling: `https://expressjs.com/en/guide/error-handling.html` ([Express][11]) - -Session middleware documentation: - -* express-session docs (cookie flags, secret rotation, fixation mitigation, MemoryStore warning): 
`https://expressjs.com/en/resources/middleware/session.html` ([Express][1]) - -Node.js and npm official references: - -* Node.js — Security Best Practices (DoS, proxy guidance, inspector risks, request smuggling notes): `https://nodejs.org/en/learn/getting-started/security-best-practices` ([Node.js][15]) -* npm Docs — `npm audit`: `https://docs.npmjs.com/cli/v9/commands/npm-audit/` ([npm Docs][16]) - -OWASP Cheat Sheet Series: - -* Session Management: `https://cheatsheetseries.owasp.org/cheatsheets/Session_Management_Cheat_Sheet.html` ([OWASP Cheat Sheet Series][12]) -* CSRF Prevention: `https://cheatsheetseries.owasp.org/cheatsheets/Cross-Site_Request_Forgery_Prevention_Cheat_Sheet.html` ([OWASP Cheat Sheet Series][3]) -* XSS Prevention: `https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html` ([OWASP Cheat Sheet Series][4]) -* Input Validation: `https://cheatsheetseries.owasp.org/cheatsheets/Input_Validation_Cheat_Sheet.html` ([OWASP Cheat Sheet Series][17]) -* SQL Injection Prevention: `https://cheatsheetseries.owasp.org/cheatsheets/SQL_Injection_Prevention_Cheat_Sheet.html` ([OWASP Cheat Sheet Series][6]) -* OS Command Injection Defense: `https://cheatsheetseries.owasp.org/cheatsheets/OS_Command_Injection_Defense_Cheat_Sheet.html` ([OWASP Cheat Sheet Series][14]) -* SSRF Prevention: `https://cheatsheetseries.owasp.org/cheatsheets/Server_Side_Request_Forgery_Prevention_Cheat_Sheet.html` ([OWASP Cheat Sheet Series][7]) -* File Upload: `https://cheatsheetseries.owasp.org/cheatsheets/File_Upload_Cheat_Sheet.html` ([OWASP Cheat Sheet Series][13]) -* Unvalidated Redirects: `https://cheatsheetseries.owasp.org/cheatsheets/Unvalidated_Redirects_and_Forwards_Cheat_Sheet.html` ([OWASP Cheat Sheet Series][18]) -* HTTP Headers: `https://cheatsheetseries.owasp.org/cheatsheets/HTTP_Headers_Cheat_Sheet.html` ([OWASP Cheat Sheet Series][10]) - -Versioning / advisories: - -* Express package version (npm): 
`https://www.npmjs.com/package/express` -* Express open redirect advisory (CVE): `https://nvd.nist.gov/vuln/detail/CVE-2024-29041` ([NVD][9]) - -[1]: https://expressjs.com/en/advanced/best-practice-security.html "Security Best Practices for Express in Production" -[2]: https://expressjs.com/en/guide/behind-proxies.html "Express behind proxies" -[3]: https://cheatsheetseries.owasp.org/cheatsheets/Cross-Site_Request_Forgery_Prevention_Cheat_Sheet.html "Cross-Site Request Forgery Prevention - OWASP Cheat Sheet Series" -[4]: https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html "Cross Site Scripting Prevention - OWASP Cheat Sheet Series" -[5]: https://expressjs.com/en/5x/api.html "Express 5.x - API Reference" -[6]: https://cheatsheetseries.owasp.org/cheatsheets/SQL_Injection_Prevention_Cheat_Sheet.html "SQL Injection Prevention - OWASP Cheat Sheet Series" -[7]: https://cheatsheetseries.owasp.org/cheatsheets/Server_Side_Request_Forgery_Prevention_Cheat_Sheet.html "Server Side Request Forgery Prevention - OWASP Cheat Sheet Series" -[8]: https://cheatsheetseries.owasp.org/cheatsheets/Nodejs_Security_Cheat_Sheet.html "Nodejs Security - OWASP Cheat Sheet Series" -[9]: https://nvd.nist.gov/vuln/detail/cve-2024-29041?utm_source=chatgpt.com "CVE-2024-29041 Detail - NVD" -[10]: https://cheatsheetseries.owasp.org/cheatsheets/HTTP_Headers_Cheat_Sheet.html "HTTP Headers - OWASP Cheat Sheet Series" -[11]: https://expressjs.com/en/guide/error-handling.html "Express error handling" -[12]: https://cheatsheetseries.owasp.org/cheatsheets/Session_Management_Cheat_Sheet.html "Session Management - OWASP Cheat Sheet Series" -[13]: https://cheatsheetseries.owasp.org/cheatsheets/File_Upload_Cheat_Sheet.html "File Upload - OWASP Cheat Sheet Series" -[14]: https://cheatsheetseries.owasp.org/cheatsheets/OS_Command_Injection_Defense_Cheat_Sheet.html "OS Command Injection Defense - OWASP Cheat Sheet Series" -[15]: 
https://nodejs.org/en/learn/getting-started/security-best-practices "Node.js — Security Best Practices" -[16]: https://docs.npmjs.com/cli/v9/commands/npm-audit/ "npm-audit | npm Docs" -[17]: https://cheatsheetseries.owasp.org/cheatsheets/Input_Validation_Cheat_Sheet.html "Input Validation - OWASP Cheat Sheet Series" -[18]: https://cheatsheetseries.owasp.org/cheatsheets/Unvalidated_Redirects_and_Forwards_Cheat_Sheet.html "Unvalidated Redirects and Forwards - OWASP Cheat Sheet Series" diff --git a/.agents/skills/security-best-practices/references/javascript-general-web-frontend-security.md b/.agents/skills/security-best-practices/references/javascript-general-web-frontend-security.md deleted file mode 100644 index d8b696f..0000000 --- a/.agents/skills/security-best-practices/references/javascript-general-web-frontend-security.md +++ /dev/null @@ -1,747 +0,0 @@ -# Frontend JavaScript/TypeScript Web Security Spec (Vanilla Browser JS/TS, Modern Browsers) - -This document is designed as a **security spec** that supports: - -1. **Secure-by-default code generation** for new frontend JavaScript/TypeScript (no specific framework assumed). -2. **Security review / vulnerability hunting** in existing frontend code (passive “notice issues while working” and active “scan the repo and report findings”). - -It is intentionally written as a set of **normative requirements** (“MUST/SHOULD/MAY”) plus **audit rules** (what bad patterns look like, how to detect them, and how to fix/mitigate them). - ---- - -## 0) Safety, boundaries, and anti-abuse constraints (MUST FOLLOW) - -* MUST NOT request, output, log, hard-code, or commit secrets (API keys intended to be secret, private keys, passwords, OAuth refresh tokens, session tokens, cookies). - Notes: - - * Frontend code is inherently observable by end users. If a value must remain secret, it must not be in browser-delivered code. 
- * If the project uses “public” keys (e.g., publishable analytics keys), they MUST be treated as non-secret and scoped accordingly. - -* MUST NOT “fix” security by disabling protections (e.g., weakening CSP with `unsafe-inline`/`unsafe-eval` without justification, removing origin checks for `postMessage`, switching to `innerHTML` for convenience, accepting arbitrary redirects/URLs, or turning off sanitization). - -* MUST provide **evidence-based findings** during audits: cite file paths, code snippets, and relevant HTML/CSP/config values that justify the claim. - -* MUST treat uncertainty honestly: - - * Security headers (CSP, frame-ancestors, etc.) might be set by server/edge/CDN rather than in repo code. If not visible, report as “not visible here; verify at runtime/edge config.” (Also note that `<meta http-equiv>` only simulates a subset of headers; don’t assume other security headers exist just because a meta tag exists.) ([MDN Web Docs][1]) - ---- - -## 1) Operating modes - -### 1.1 Generation mode (default) - -When asked to write new frontend JS/TS code or modify existing code: - -* MUST follow every **MUST** requirement in this spec. -* SHOULD follow every **SHOULD** requirement unless the user explicitly says otherwise. -* MUST prefer safe-by-default browser APIs and proven libraries over custom security code (especially for HTML sanitization). -* MUST avoid introducing new risky sinks (DOM XSS injection sinks like `innerHTML`, navigation to `javascript:` URLs, dynamic code execution via `eval`/`Function`, unsafe `postMessage`, unsafe third-party script loading, etc.). ([OWASP Cheat Sheet Series][2]) - -### 1.2 Passive review mode (always on while editing) - -While working anywhere in a frontend repo (even if the user did not ask for a security scan): - -* MUST “notice” violations of this spec in touched/nearby code. -* SHOULD mention issues as they come up, with a brief explanation + safe fix. 
- -### 1.3 Active audit mode (explicit scan request) - -When the user asks to “scan”, “audit”, or “hunt for vulns”: - -* MUST systematically search the codebase for violations of this spec. -* MUST output findings in a structured format (see §2.3). - -Recommended audit order: - -1. HTML entrypoints (`index.html`, server-rendered templates), script/style includes, and any CSP delivery (header vs meta). ([W3C][3]) -2. DOM XSS sinks (`innerHTML`, `document.write`, `insertAdjacentHTML`, event-handler attributes) and their data sources (URL params/hash, storage, postMessage, API responses). ([OWASP Cheat Sheet Series][2]) -3. Navigation/redirect handling (`window.location*`, link targets, URL allowlists) including `javascript:` URL hazards. ([MDN Web Docs][4]) -4. Cross-origin communication (`postMessage`, iframe embed patterns, sandboxing). ([MDN Web Docs][5]) -5. Storage of sensitive data (localStorage/sessionStorage) and assumptions about trust. ([OWASP Cheat Sheet Series][6]) -6. Third-party scripts / tag managers / CDNs, and integrity controls (SRI) and policy controls (CSP). ([OWASP Cheat Sheet Series][7]) -7. DOM clobbering gadgets and unsafe reliance on `window`/`document` named properties. ([OWASP Cheat Sheet Series][8]) - ---- - -## 2) Definitions and review guidance - -### 2.1 Untrusted input (treat as attacker-controlled unless proven otherwise) - -Examples include: - -* URL-derived data: `location.href`, `location.search`, `location.hash`, `document.baseURI`, `new URLSearchParams(location.search)`, routing fragments. ([OWASP Cheat Sheet Series][2]) -* DOM content that may include user-controlled markup (comments, profiles, CMS content, markdown-to-HTML output, etc.), especially if inserted dynamically. ([OWASP Cheat Sheet Series][2]) -* `postMessage` event data (`event.data`) and metadata (`event.origin`) from other windows/frames. 
([MDN Web Docs][5]) -* Browser storage: `localStorage`, `sessionStorage`, IndexedDB (contents can be attacker-influenced via XSS or local machine access; never treat as “trusted”). ([OWASP Cheat Sheet Series][6]) -* Any data returned from network calls (even if from “your API”), because it may contain stored attacker content that becomes dangerous only when inserted into the DOM. ([OWASP Cheat Sheet Series][2]) - -### 2.2 Dangerous sink (DOM XSS / code execution sink) - -A sink is any API/operation that can execute script or interpret attacker-controlled strings as HTML/JS/URL in a security-sensitive way. High-signal sinks include: - -* HTML parsing / insertion: `innerHTML`, `outerHTML`, `insertAdjacentHTML`, `document.write`, `document.writeln`. ([OWASP Cheat Sheet Series][2]) -* Dynamic code execution: `eval`, `new Function`, `setTimeout("...")`, `setInterval("...")`. ([MDN Web Docs][10]) -* Navigation to script-bearing URLs (e.g., `javascript:`) via setters like `Location.href`/`window.location` (and via link `href` if attacker-controlled). ([MDN Web Docs][4]) -* Setting event handler attributes from strings, e.g. `setAttribute("onclick", "...")`. ([OWASP Cheat Sheet Series][2]) - -### 2.3 Required audit finding format - -For each issue found, output: - -* Rule ID: -* Severity: Critical / High / Medium / Low -* Location: file path + function/class/module + line(s) -* Evidence: the exact code/config snippet -* Impact: what could go wrong, who can exploit it -* Fix: safe change (prefer minimal diff) -* Mitigation: defense-in-depth if immediate fix is hard -* False positive notes: what to verify if uncertain - ---- - -## 3) Secure baseline: minimum production configuration (MUST in production) - -This is the smallest baseline that prevents common frontend JS/TS security misconfigurations. Some items are “in repo” (HTML/JS) and some may live at the server/edge. 
- -### 3.1 Content Security Policy (CSP) baseline (SHOULD; MUST for high-risk apps) - -* SHOULD deliver CSP via HTTP response headers when possible. -* MAY deliver CSP via an HTML `<meta>` tag when you cannot set headers (e.g., purely static hosting constraints). ([MDN Web Docs][1]) -* If using CSP via `<meta>`, MUST understand the limitations: - - * The policy only applies to content that follows the meta element (so it must appear very early, before any scripts/resources you want governed). ([W3C][3]) - * The following directives are **not supported** in a meta-delivered policy and will be ignored: `report-uri`, `frame-ancestors`, and `sandbox`. ([W3C][3]) - * “Report-only” CSP cannot be set via a meta element. ([W3C][3]) - -Practical baseline goals: - -* Avoid script sources `unsafe-inline` and `unsafe-eval` (they significantly weaken CSP’s value against XSS). ([MDN Web Docs][10]) -* Prefer nonce- or hash-based script policies if you need inline scripts. ([MDN Web Docs][10]) -* Consider enabling Trusted Types enforcement where feasible. ([MDN Web Docs][11]) - -### 3.2 Third-party scripts baseline (SHOULD) - -* SHOULD minimize third-party script execution and treat it as equivalent privilege to first-party JS (it runs with your origin’s privileges). ([OWASP Cheat Sheet Series][7]) -* SHOULD use Subresource Integrity (SRI) for third-party scripts/styles loaded from CDNs. ([MDN Web Docs][12]) - -### 3.3 Cross-window communication baseline (SHOULD) - -* SHOULD restrict `postMessage` communications to explicit origins, and validate both origin and message shape. ([MDN Web Docs][5]) - ---- - -## 4) Rules (generation + audit) - -Each rule contains: required practice, insecure patterns, detection hints, and remediation. 
- -### JS-XSS-001: Do not inject untrusted HTML into the DOM (avoid `innerHTML` and friends) - -Severity: Critical if you can prove attacker-controlled input can reach these APIs; otherwise Medium - - -Required: - -* MUST treat `innerHTML`, `outerHTML`, and `insertAdjacentHTML` as dangerous sinks when their input can contain untrusted data. ([OWASP Cheat Sheet Series][2]) -* MUST prefer safe DOM APIs that do not parse HTML: - - * `textContent` for text. ([OWASP Cheat Sheet Series][2]) - * `document.createElement`, `appendChild`, `setAttribute` for non-event-handler attributes. ([OWASP Cheat Sheet Series][2]) -* If HTML insertion is truly required, SHOULD sanitize with a well-reviewed HTML sanitizer and strongly consider enforcing Trusted Types to confine usage to audited code paths. ([MDN Web Docs][11]) - -Insecure patterns: - -* `el.innerHTML = userInput` -* `el.insertAdjacentHTML('beforeend', userInput)` -* `el.outerHTML = userInput` - -Detection hints: - -* Search for: `.innerHTML`, `.outerHTML`, `insertAdjacentHTML(`. -* Trace the origin of inserted string: URL params/hash, postMessage, storage, API responses, DOM attributes. ([OWASP Cheat Sheet Series][2]) - -Fix: - -* Replace with `textContent` for plain text. ([OWASP Cheat Sheet Series][2]) -* For structured UI, build DOM nodes explicitly. -* For “rich text” requirements: - - * Sanitize using an allowlist-based sanitizer. - * Prefer returning safe “components” instead of arbitrary HTML strings. - * Use Trusted Types enforcement to ensure only `TrustedHTML` reaches sinks where supported. ([MDN Web Docs][11]) - -Mitigation: - -* Deploy a strict CSP and consider Trusted Types enforcement (`require-trusted-types-for 'script'`). ([MDN Web Docs][10]) - -False positive notes: - -* If the string is provably constant or fully generated from trusted constants, it may be safe. Still prefer safer APIs. 
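The "build DOM nodes explicitly" fix for JS-XSS-001 can be sketched roughly as follows. This is a minimal illustration and not part of the deleted spec: the helper name `renderComment` and its parameters are hypothetical, and `doc` is assumed to be a standard DOM `Document`.

```javascript
// Hypothetical helper (not from the spec): renders one user comment into a list
// without passing untrusted strings through an HTML parser.
// `doc` is assumed to be a DOM Document (for example `document` in a browser).
function renderComment(doc, list, author, text) {
  const item = doc.createElement("li");

  const name = doc.createElement("strong");
  name.textContent = author; // assigned as text, never parsed as markup

  item.appendChild(name);
  item.appendChild(doc.createTextNode(": " + text)); // text node, not HTML
  list.appendChild(item);
  return item;
}
```

In a browser this might be called as `renderComment(document, document.querySelector("#comments"), comment.author, comment.body)`, where `#comments` and the `comment` object are assumed names. Because neither value reaches `innerHTML`, markup inside them should be displayed literally instead of being interpreted.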
- ---- - -### JS-XSS-002: Avoid `document.write` / `document.writeln` (XSS + document clobbering hazards) - -Severity: Critical if you can prove attacker-controlled input can reach these APIs; otherwise Medium - -Required: - -* MUST avoid `document.write()` and `document.writeln()` in production code (they are XSS vectors and can be abused with crafted HTML even if some browsers block injected `` with no `integrity`. -* Loading `latest` or unpinned third-party resources. - -Detection hints: - -* Search for `` with no `integrity`. -* Loading jQuery from random third-party CDNs without an explicit trust decision. - -Detection hints: - -* Scan HTML for `` with no integrity. -* Tag managers that dynamically load arbitrary scripts without governance. - -Detection hints: - -* Search in `public/index.html`, templates, or SSR wrappers for: - - * `