Nihongo Con Teppei Downloader

A Python script to download audio files and subtitles from the Nihongo Con Teppei podcast.

Features

Download individual episodes or ranges of episodes
Automatic file existence checking (skips existing files)
Configurable delay between requests to reduce the risk of rate limiting or bans
Error handling with automatic retries (failed downloads are retried up to 3 times)
Progress tracking for batch downloads
Clean, modular code structure

Installation

Install the required dependencies:
```
pip install -r requirements.txt
```
Make sure Chrome or Chromium is installed (required for Selenium).

Usage

Download a single episode

```
python teppei.py 11 --download
```

Download a range of episodes

```
python teppei.py --start 1 --end 20 --download
```

Download to a specific directory

```
python teppei.py --start 11 --end 15 --download --output ./teppei_episodes
```

Force re-download existing files

```
python teppei.py 11 --download --force
```

Show URLs without downloading

```
python teppei.py 11
```

Customize request delays and timeouts

```
python teppei.py --start 1 --end 10 --download --delay 3 --timeout 60
```

Command Line Options

episode_num: Episode number to download (for single episode mode)
--start: Starting episode number for range download
--end: Ending episode number for range download
--download, -d: Download the files (if omitted, the script only shows the URLs)
--output, -o: Output directory (default: current directory)
--force: Force re-download even if files already exist
--delay: Delay between requests in seconds (default: 2)
--timeout: HTTP request timeout in seconds (default: 30)
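
As a rough illustration, the options above could be declared with `argparse` along the following lines. This is only a sketch derived from the option list: the actual parser in `teppei.py` may be organized differently, and the `build_parser` name, the help strings, and the numeric types chosen for `--delay` and `--timeout` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(
        prog="teppei.py",
        description="Download Nihongo Con Teppei audio and subtitles.",
    )
    # The positional argument is optional so --start/--end can be used instead.
    parser.add_argument("episode_num", nargs="?", type=int,
                        help="episode number (single episode mode)")
    parser.add_argument("--start", type=int, help="first episode of a range")
    parser.add_argument("--end", type=int, help="last episode of a range")
    parser.add_argument("--download", "-d", action="store_true",
                        help="download the files instead of only showing URLs")
    parser.add_argument("--output", "-o", default=".",
                        help="output directory (default: current directory)")
    parser.add_argument("--force", action="store_true",
                        help="re-download files that already exist")
    # Assumption: delays and timeouts are parsed as floats.
    parser.add_argument("--delay", type=float, default=2,
                        help="seconds to wait between requests")
    parser.add_argument("--timeout", type=float, default=30,
                        help="HTTP request timeout in seconds")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--start", "1", "--end", "3", "-d"])
print(args.start, args.end, args.download, args.delay)
```

Making `episode_num` optional (`nargs="?"`) is one way to let the same parser serve both single episode and range mode; the script may validate the combination differently.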

File Structure

The script saves files using the following naming convention, with the episode number zero-padded to at least two digits:

Audio: Nihongo-Con-Teppei-E{episode:02d}.mp3
Subtitles: Nihongo-Con-Teppei-E{episode:02d}.vtt
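
The `{episode:02d}` placeholder is a Python format specification. The small sketch below shows how the two names expand for a given episode number; the `episode_filenames` helper is a hypothetical name used only for illustration and is not necessarily a function in `teppei.py`.

```python
def episode_filenames(episode: int) -> tuple[str, str]:
    """Return the (audio, subtitle) file names for an episode number."""
    # 02d pads with zeros to at least two digits; longer numbers are kept as-is.
    base = f"Nihongo-Con-Teppei-E{episode:02d}"
    return f"{base}.mp3", f"{base}.vtt"


print(episode_filenames(7))    # ('Nihongo-Con-Teppei-E07.mp3', 'Nihongo-Con-Teppei-E07.vtt')
print(episode_filenames(120))  # ('Nihongo-Con-Teppei-E120.mp3', 'Nihongo-Con-Teppei-E120.vtt')
```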

Examples

```
# Download episodes 11-15 to a specific folder
python teppei.py --start 11 --end 15 --download --output ./japanese_lessons

# Download episode 20 with custom delay
python teppei.py 20 --download --delay 5

# Check what URLs would be downloaded without actually downloading
python teppei.py --start 1 --end 3
```

Notes

The script uses Selenium to scrape audio URLs from the website
Subtitle URLs are constructed directly (no scraping needed)
Built-in delays help prevent being rate-limited or banned
Files are checked for existence before downloading to avoid duplicates
Failed downloads are automatically retried up to 3 times
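
The existence check and retry behavior described in these notes could be sketched as follows. This is an illustrative outline, not the script's actual code: `download_with_retry` and its `fetch` parameter are hypothetical, `fetch` stands in for whatever HTTP call the script makes, and "up to 3 times" is assumed here to mean three attempts in total.

```python
import time
from pathlib import Path
from typing import Callable


def download_with_retry(
    fetch: Callable[[], bytes],
    dest: Path,
    retries: int = 3,
    delay: float = 2.0,
    force: bool = False,
) -> bool:
    """Write the bytes returned by `fetch` to `dest`.

    Returns True if the file was written and False if it was skipped
    because it already exists. Raises RuntimeError if every attempt fails.
    """
    # Skip files that are already on disk unless a re-download is forced.
    if dest.exists() and not force:
        return False

    last_error = None
    for attempt in range(1, retries + 1):
        try:
            data = fetch()
            dest.parent.mkdir(parents=True, exist_ok=True)
            dest.write_bytes(data)
            return True
        except OSError as exc:  # assumption: failures surface as OSError
            last_error = exc
            if attempt < retries:
                time.sleep(delay)  # wait before the next attempt

    raise RuntimeError(
        f"giving up on {dest.name} after {retries} attempts"
    ) from last_error
```

Passing the request as a callable keeps the sketch independent of any particular HTTP library; in practice `fetch` would wrap the real request and apply the configured timeout.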