# whisper-remote

A split repo for running the upstream `openai/whisper` CLI remotely.

Two separate Python packages live here:

- `backend/`: FastAPI wrapper around the upstream `whisper` CLI
- `cli/`: local `whisper-remote` command that forwards work to the remote API

The repo also includes a Gitea Actions workflow at `.gitea/workflows/ci.yml` that tests and builds both packages on pushes to `main` and on pull requests.

## Docker backend (no image build)

Run the backend directly from an official Python image without creating a Dockerfile:

```bash
docker compose up backend
```

This uses `python:3.14-slim`, installs `ffmpeg` and `openai-whisper` at container startup, mounts this repo into the container, and serves the API on `http://localhost:8000`.
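
Because the container installs its dependencies at startup, the API may not answer for a while after `docker compose up` returns control. A minimal sketch of a readiness check follows; it only tests whether the TCP port accepts connections. The helper name `wait_for_port` and its timeout values are assumptions for illustration and are not part of this repo.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 300.0, interval: float = 2.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it checks only that something is listening,
    not that the backend is fully ready to transcribe.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example call (left commented out so importing this file has no side effects):
# wait_for_port("localhost", 8000)
```

A script could call `wait_for_port("localhost", 8000)` before the first `whisper-remote` invocation and stop early if it returns `False`.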

## Backend setup

```bash
cd backend
pip install -e .
uvicorn server:app --app-dir src --host 0.0.0.0 --port 8000
```

The backend machine must already have the upstream `whisper` CLI available on `PATH`.
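
To illustrate the `PATH` requirement, here is a hedged sketch of how a wrapper might locate the upstream CLI and assemble its arguments. The function `build_whisper_command` and its parameters are hypothetical, and the actual backend code may differ. The flags `--model`, `--language`, `--output_format`, and `--output_dir` are the ones the upstream `whisper` CLI accepts.

```python
import shutil


def build_whisper_command(audio_path, output_dir, model="base", language=None,
                          output_format="txt", which=shutil.which):
    """Build an argv list for the upstream `whisper` CLI.

    Hypothetical helper. `which` is injectable so the lookup can be
    replaced in tests; by default it searches PATH.
    """
    executable = which("whisper")
    if executable is None:
        raise RuntimeError("the upstream `whisper` CLI was not found on PATH")

    command = [
        executable,
        str(audio_path),
        "--model", model,
        "--output_format", output_format,
        "--output_dir", str(output_dir),
    ]
    # Upstream auto-detects the language when the flag is omitted.
    if language:
        command += ["--language", language]
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)`. Passing a list instead of a shell string avoids quoting problems with uploaded file names.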

## CLI setup

```bash
cd cli
pip install -e .
whisper-remote ./audio.mp3 --model base --language en --output-format txt
```

Point the CLI at the backend by setting `WHISPER_REMOTE` to the backend URL before running it.

PowerShell:

```powershell
$env:WHISPER_REMOTE = "http://your-server:8000"
whisper-remote .\audio.mp3 --model base --language en --output-format txt
```

Bash:

```bash
export WHISPER_REMOTE=http://your-server:8000
whisper-remote ./audio.mp3 --model base --language en --output-format txt
```
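
The examples above pass the backend address through the `WHISPER_REMOTE` environment variable. As a rough sketch of how a client could read and validate that value, assuming no built-in default (the real CLI may handle a missing variable differently), a hypothetical `resolve_backend_url` helper might look like this:

```python
import os
from urllib.parse import urlsplit


def resolve_backend_url(environ=None):
    """Return the backend base URL taken from WHISPER_REMOTE.

    Hypothetical helper; raising on a missing variable is an assumption
    made for this sketch, not documented behavior of the CLI.
    """
    env = os.environ if environ is None else environ
    raw = env.get("WHISPER_REMOTE", "").strip()
    if not raw:
        raise ValueError("WHISPER_REMOTE is not set, e.g. http://your-server:8000")

    parts = urlsplit(raw)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"WHISPER_REMOTE is not an http(s) URL: {raw!r}")

    # Drop a trailing slash so endpoint paths can be appended consistently.
    return raw.rstrip("/")
```

Validating the scheme up front turns a common mistake, such as writing `your-server:8000` without `http://`, into a clear error instead of a failed request.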

## Supported v1 options

- model selection
- language forwarding
- transcript output formats: `txt`, `vtt`, `srt`, `tsv`, `json`
- file upload to the backend
- backend-side cleanup of uploaded and generated files after each request

By default the CLI prints the returned transcript to stdout. Use `--to-file` to save it locally.
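
The per-request cleanup listed among the options above can be approximated with a temporary directory that holds both the upload and the generated transcript. The sketch below is one possible approach and is not necessarily how the backend does it. The names `request_workspace`, `check_output_format`, and `SUPPORTED_FORMATS` are hypothetical.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path

# Formats listed in this README; the constant name itself is an assumption.
SUPPORTED_FORMATS = ("txt", "vtt", "srt", "tsv", "json")


def check_output_format(fmt: str) -> str:
    """Reject formats outside the documented v1 set."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported output format: {fmt!r}")
    return fmt


@contextmanager
def request_workspace(upload_name: str, payload: bytes):
    """Yield (workdir, upload_path); everything is deleted when the block exits."""
    with tempfile.TemporaryDirectory(prefix="whisper-remote-") as tmp:
        workdir = Path(tmp)
        # Keep only the final path component so a client-supplied name
        # cannot point outside the workspace.
        safe_name = Path(upload_name).name
        if safe_name in ("", ".", ".."):
            safe_name = "upload"
        upload_path = workdir / safe_name
        upload_path.write_bytes(payload)
        yield workdir, upload_path
```

If the transcription step writes its output into `workdir`, the uploaded audio and the generated files are removed together when the `with` block ends, including when the step raises an exception.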