# Nihongo Con Teppei Downloader

A Python script to download audio files and subtitles from the Nihongo Con Teppei podcast.

## Features

- Download individual episodes or ranges of episodes
- Skips files that already exist (override with `--force`)
- Configurable delay between requests to reduce the risk of being rate-limited or banned
- Error handling with automatic retries for failed downloads
- Progress tracking for batch downloads
- Clean, modular code structure

## Installation

1. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Make sure Chrome or Chromium is installed (required for Selenium).

## Usage

### Download a single episode

```bash
python teppei.py 11 --download
```

### Download a range of episodes

```bash
python teppei.py --start 1 --end 20 --download
```

### Download to a specific directory

```bash
python teppei.py --start 11 --end 15 --download --output ./teppei_episodes
```

### Force re-download existing files

```bash
python teppei.py 11 --download --force
```

### Show URLs without downloading

```bash
python teppei.py 11
```

### Customize request delays and timeouts

```bash
python teppei.py --start 1 --end 10 --download --delay 3 --timeout 60
```

## Command Line Options

- `episode_num`: Episode number to download (for single episode mode)
- `--start`: Starting episode number for range download
- `--end`: Ending episode number for range download
- `--download`, `-d`: Download the files (if not specified, only show URLs)
- `--output`, `-o`: Output directory (default: current directory)
- `--force`: Force re-download even if files already exist
- `--delay`: Delay between requests in seconds (default: 2)
- `--timeout`: HTTP request timeout in seconds (default: 30)

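
The options listed above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the script's actual parser: the function name `build_parser`, the help strings, and the choice of `float` for `--delay` and `--timeout` are assumptions; only the option names and defaults come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="teppei.py",
        description="Download Nihongo Con Teppei audio and subtitles.",
    )
    # Positional episode number; optional so --start/--end can be used instead.
    parser.add_argument("episode_num", nargs="?", type=int,
                        help="episode number (single episode mode)")
    parser.add_argument("--start", type=int, help="first episode of a range")
    parser.add_argument("--end", type=int, help="last episode of a range")
    parser.add_argument("--download", "-d", action="store_true",
                        help="download files instead of only showing URLs")
    parser.add_argument("--output", "-o", default=".",
                        help="output directory (default: current directory)")
    parser.add_argument("--force", action="store_true",
                        help="re-download files that already exist")
    # Assumption: fractional seconds are allowed, hence float.
    parser.add_argument("--delay", type=float, default=2,
                        help="seconds to wait between requests")
    parser.add_argument("--timeout", type=float, default=30,
                        help="HTTP request timeout in seconds")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(
    ["--start", "1", "--end", "10", "--download", "--delay", "3"]
)
print(args.start, args.end, args.download, args.delay, args.timeout)
```

Making `episode_num` optional with `nargs="?"` lets one parser serve both single episode mode and range mode; checking that exactly one of the two modes was requested would be a separate step.
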
## File Structure

Downloaded files are named as follows, with the episode number zero-padded to at least two digits:

- Audio: `Nihongo-Con-Teppei-E{episode:02d}.mp3`
- Subtitles: `Nihongo-Con-Teppei-E{episode:02d}.vtt`

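
To make the naming convention concrete, here is a small sketch that builds both file paths for an episode. The helper name `episode_paths` and its `output_dir` parameter are hypothetical; only the filename pattern is taken from the list above.

```python
from pathlib import Path


def episode_paths(episode: int, output_dir: str = ".") -> tuple[Path, Path]:
    """Return the (audio, subtitle) paths for one episode."""
    # {episode:02d} zero-pads to at least two digits: 7 -> "07", 123 -> "123".
    stem = f"Nihongo-Con-Teppei-E{episode:02d}"
    base = Path(output_dir)
    return base / f"{stem}.mp3", base / f"{stem}.vtt"


audio, subs = episode_paths(7, "teppei_episodes")
print(audio.name)  # Nihongo-Con-Teppei-E07.mp3
print(subs.name)   # Nihongo-Con-Teppei-E07.vtt
```

Note that `02d` sets a minimum width only, so episodes numbered 100 and above keep all of their digits.
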
## Examples

```bash
# Download episodes 11-15 to a specific folder
python teppei.py --start 11 --end 15 --download --output ./japanese_lessons

# Download episode 20 with custom delay
python teppei.py 20 --download --delay 5

# Check what URLs would be downloaded without actually downloading
python teppei.py --start 1 --end 3
```

## Notes

- The script uses Selenium to scrape audio URLs from the website
- Subtitle URLs are constructed directly (no scraping needed)
- Built-in delays between requests help avoid rate limiting or bans
- Existing files are detected before downloading and skipped unless `--force` is given
- Failed downloads are automatically retried up to 3 times

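
The skip-if-exists and retry behaviour described in these notes might look something like the sketch below. It is not the script's actual code: `download_with_retry` and the injected `fetch` callable are hypothetical, and "retried up to 3 times" is read here as three attempts in total, which is an assumption.

```python
import time
from pathlib import Path
from typing import Callable


def download_with_retry(
    url: str,
    dest: Path,
    fetch: Callable[[str], bytes],
    force: bool = False,
    retries: int = 3,
    delay: float = 2.0,
) -> bool:
    """Save url to dest; return True if written, False if skipped.

    `fetch` is a hypothetical callable that returns the response body;
    a real script would wrap its HTTP client here.
    """
    if dest.exists() and not force:
        return False  # file already present; skip unless forced
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            data = fetch(url)
            dest.parent.mkdir(parents=True, exist_ok=True)
            dest.write_bytes(data)
            return True
        except OSError as exc:  # network and filesystem errors
            last_error = exc
            if attempt < retries:
                time.sleep(delay)  # pause before the next attempt
    raise RuntimeError(
        f"giving up on {url} after {retries} attempts"
    ) from last_error
```

Passing `fetch` in as a parameter keeps the retry logic independent of any particular HTTP library and makes it easy to exercise with a fake function.
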