initial commit

2025-08-28 20:48:10 -07:00
commit 69a7bd4f98
7 changed files with 702 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,91 @@
+# Nihongo Con Teppei Downloader
+
+A Python script to download audio files and subtitles from the Nihongo Con Teppei podcast.
+
+## Features
+
+- Download individual episodes or ranges of episodes
+- Automatic file existence checking (skips existing files)
+- Configurable delays between requests to avoid being rate-limited or banned
+- Error handling that automatically retries failed downloads up to 3 times
+- Progress tracking for batch downloads
+- Clean, modular code structure
+
+## Installation
+
+1. Install the required dependencies:
+   ```bash
+   pip install -r requirements.txt
+   ```
+
+2. Make sure Chrome or Chromium is installed; Selenium drives it to scrape the audio URLs.
+
+## Usage
+
+### Download a single episode
+```bash
+python teppei.py 11 --download
+```
+
+### Download a range of episodes
+```bash
+python teppei.py --start 1 --end 20 --download
+```
+
+### Download to a specific directory
+```bash
+python teppei.py --start 11 --end 15 --download --output ./teppei_episodes
+```
+
+### Force re-download existing files
+```bash
+python teppei.py 11 --download --force
+```
+
+### Show URLs without downloading
+```bash
+python teppei.py 11
+```
+
+### Customize request delays and timeouts
+```bash
+python teppei.py --start 1 --end 10 --download --delay 3 --timeout 60
+```
+
+## Command Line Options
+
+- `episode_num`: Episode number to process (single-episode mode)
+- `--start`: Starting episode number for range download
+- `--end`: Ending episode number for range download
+- `--download, -d`: Download the files (if not specified, only show URLs)
+- `--output, -o`: Output directory (default: current directory)
+- `--force`: Force re-download even if files already exist
+- `--delay`: Delay between requests in seconds (default: 2)
+- `--timeout`: HTTP request timeout in seconds (default: 30)
+
+## File Structure
+
+The script saves files using the following naming convention, where the episode number is zero-padded to at least two digits (episode 7 becomes `E07`):
+- Audio: `Nihongo-Con-Teppei-E{episode:02d}.mp3`
+- Subtitles: `Nihongo-Con-Teppei-E{episode:02d}.vtt`
+
+## Examples
+
+```bash
+# Download episodes 11-15 to a specific folder
+python teppei.py --start 11 --end 15 --download --output ./japanese_lessons
+
+# Download episode 20 with custom delay
+python teppei.py 20 --download --delay 5
+
+# Check what URLs would be downloaded without actually downloading
+python teppei.py --start 1 --end 3
+```
+
+## Notes
+
+- The script uses Selenium to scrape audio URLs from the website
+- Subtitle URLs are constructed directly (no scraping needed)
+- Built-in delays help prevent being rate-limited or banned
+- Files are checked for existence before downloading to avoid duplicates
+- Failed downloads are automatically retried up to 3 times
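
Reviewer note (not part of the committed README): the Notes above describe three behaviors, namely the file naming convention, skipping files that already exist, and retrying failed downloads. The sketch below shows one way these could fit together. It is a minimal illustration only; `episode_filename`, `download_with_retry`, and the `fetch` callable are hypothetical names, the actual `teppei.py` is not shown in this section, and reading "retried up to 3 times" as three attempts in total is an assumption.

```python
import time
from pathlib import Path
from typing import Callable


def episode_filename(episode: int, ext: str) -> str:
    """Build a file name following the README's naming convention."""
    # {episode:02d} zero-pads to at least two digits: 7 -> "07", 120 -> "120".
    return f"Nihongo-Con-Teppei-E{episode:02d}.{ext}"


def download_with_retry(
    fetch: Callable[[], bytes],  # hypothetical: returns the file's bytes or raises OSError
    dest: Path,
    force: bool = False,         # mirrors the README's --force flag
    max_attempts: int = 3,       # assumption: 3 attempts in total
    delay: float = 2.0,          # mirrors the README's --delay default
) -> bool:
    """Save fetch() output to dest. Return True if written, False if skipped."""
    if dest.exists() and not force:
        return False  # file already present, so skip it
    for attempt in range(1, max_attempts + 1):
        try:
            data = fetch()
            dest.write_bytes(data)  # written only after a complete fetch
            return True
        except OSError:
            # urllib.error.URLError is a subclass of OSError, so network
            # failures raised by urllib are also caught here.
            if attempt == max_attempts:
                raise
            time.sleep(delay)  # wait before the next attempt
    return False  # only reached when max_attempts < 1
```

In real use, `fetch` might wrap something like `urllib.request.urlopen(url, timeout=30).read()`, where `url` is the audio or subtitle URL. Passing the fetch step in as a callable keeps the retry logic testable without network access.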