screenjob/README.md
Space-Banane a19b285232
test: add pytest verification suite and gitea ci workflow
2026-05-27 17:55:34 +02:00


# ScreenJob
Desktop-and-terminal task agent with:
- CLI runner
- FastAPI job server
- SQLite task history
- WebSocket-powered monitoring UI
- Safety pre-check and per-job tool disable controls
- Live/final token and cost estimation
## Install
```powershell
pip install openai pillow pyautogui python-dotenv fastapi uvicorn
```
## Environment
Create `.env` in project root:
```env
OPENAI_API_KEY=...
SCREENJOB_TOKEN=choose_a_strong_token
# Optional
SCREENJOB_DEFAULT_MODEL=gpt-5.4-mini
SCREENJOB_SAFETY_MODEL=gpt-5.4-mini
SCREENJOB_HOST=127.0.0.1
SCREENJOB_PORT=8787
DISABLE_UI=false
```
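As a rough sketch of how these variables could be read in Python: the `Settings` class and `load_settings` function below are hypothetical names, not the contents of `src/config.py`, and the fallback values are simply the example values shown above, assumed rather than confirmed as the project's defaults. With `python-dotenv` installed, calling `load_dotenv()` first would copy `.env` into `os.environ`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; the real config module may differ."""
    openai_api_key: str
    token: str
    default_model: str
    safety_model: str
    host: str
    port: int
    disable_ui: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Required values: fail early with a clear message if either is missing.
    missing = [k for k in ("OPENAI_API_KEY", "SCREENJOB_TOKEN") if not env.get(k)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        token=env["SCREENJOB_TOKEN"],
        # Fallbacks are assumed from the README example, not confirmed defaults.
        default_model=env.get("SCREENJOB_DEFAULT_MODEL", "gpt-5.4-mini"),
        safety_model=env.get("SCREENJOB_SAFETY_MODEL", "gpt-5.4-mini"),
        host=env.get("SCREENJOB_HOST", "127.0.0.1"),
        port=int(env.get("SCREENJOB_PORT", "8787")),
        # Assumption: common truthy spellings are accepted for the flag.
        disable_ui=env.get("DISABLE_UI", "false").strip().lower() in {"1", "true", "yes"},
    )
```

Passing a plain dict instead of relying on `os.environ` keeps the loader easy to exercise in tests.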
## Entry Points
- `python main.py run "<job>"`
- `python main.py server`
- Backward-compatible wrapper: `python screenjob.py "<job>"`
## CLI Usage
```powershell
python main.py run "Open amazon.de and go to my orders"
```
Useful flags:
- `--model gpt-5.4-mini`
- `--disable-tool click --disable-tool type`
- `--skip-safety-check`
- `--max-steps 80`
## HTTP API
All API routes require token authentication. Send the value of `SCREENJOB_TOKEN` in one of these ways:
- `Authorization: Bearer <token>` header
- `X-ScreenJob-Token: <token>` header
- `?token=<token>` query parameter (for browser/image fetches)
### Create Job
`POST /api/jobs`
Body:
```json
{
  "job": "Open amazon.de and go to my orders",
  "model": "gpt-5.4-mini",
  "disabled_tools": ["click"],
  "safety_override": false
}
```
Response:
```json
{ "job_id": "job_..." }
```
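A minimal client sketch for this endpoint, using only the Python standard library. The helper names `build_create_job_request` and `create_job` are illustrative and not part of ScreenJob; the body mirrors the four fields documented above, and the base URL is assumed to match `SCREENJOB_HOST` and `SCREENJOB_PORT`.

```python
import json
import urllib.request
from typing import Sequence


def build_create_job_request(
    base_url: str,
    token: str,
    job: str,
    model: str,
    disabled_tools: Sequence[str] = (),
    safety_override: bool = False,
) -> urllib.request.Request:
    """Prepare a POST /api/jobs request with Bearer token auth."""
    body = {
        "job": job,
        "model": model,
        "disabled_tools": list(disabled_tools),
        "safety_override": safety_override,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_job(base_url: str, token: str, job: str, model: str) -> str:
    """Send the request and return the job_id from the JSON response."""
    req = build_create_job_request(base_url, token, job, model)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["job_id"]
```

With the server running locally, `create_job("http://127.0.0.1:8787", token, "Open amazon.de and go to my orders", "gpt-5.4-mini")` would return the new job's ID.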
### Status / Output
- `GET /api/jobs/{job_id}`: full status + output + live/final usage/cost
- `GET /api/jobs/{job_id}/status`: status alias
- `GET /api/jobs/{job_id}/events`: detailed timeline
- `GET /api/jobs/{job_id}/artifact?path=<absolute_path>&token=<token>`: authenticated artifact file fetch for screenshots/enhancements
- `GET /api/jobs`: list active + past jobs
- `POST /api/jobs/{job_id}/cancel`: graceful cancellation
- `GET /api/stats`: aggregate metrics
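Because the artifact route takes an absolute file path and the token as query parameters, both need URL encoding. The sketch below shows one way to build such a URL; `artifact_url` is a hypothetical helper, and the job ID and path in the usage note are made-up placeholders.

```python
from urllib.parse import quote, urlencode


def artifact_url(base_url: str, job_id: str, path: str, token: str) -> str:
    """Build an artifact fetch URL with a safely encoded path and token."""
    # urlencode percent-encodes slashes, backslashes and colons in the
    # path and turns spaces into '+', which query-string parsers decode.
    query = urlencode({"path": path, "token": token})
    return f"{base_url.rstrip('/')}/api/jobs/{quote(job_id, safe='')}/artifact?{query}"
```

For example, `artifact_url("http://127.0.0.1:8787", "job_example", "/tmp/run 1/shot.png", "s3cret")` produces a URL whose query string is `path=%2Ftmp%2Frun+1%2Fshot.png&token=s3cret`.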
## Monitoring UI
- Served at `/` when `DISABLE_UI=false`
- Tailwind-based read-only dashboard
- Requires entering `SCREENJOB_TOKEN` in UI before data loads
- Uses WebSocket `/ws` for live updates (tool calls, step events, usage/cost updates)
- No task launch controls in UI (monitoring only)
If `DISABLE_UI=true`, `/` returns `{ "ui_disabled": true }` and only API endpoints remain.
## Safety
Before execution, each task is classified by a model safety gate:
- Safe: task runs
- Unsafe: task is rejected and recorded
- Override: set `"safety_override": true` in the API request body (or pass `--skip-safety-check` on the CLI) to bypass the gate
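The three outcomes above can be sketched as a small decision function. This is illustrative only: `gate` and the `classify` callback are hypothetical names, the returned strings are not confirmed status values, and it assumes the override skips the classifier entirely rather than ignoring its verdict.

```python
from typing import Callable


def gate(job: str, classify: Callable[[str], bool], safety_override: bool = False) -> str:
    """Return "run" or "rejected" for a job description.

    classify stands in for the model-based safety check and should
    return True when the job is considered safe.
    """
    if safety_override:
        # Assumption: an override means the classifier is never called.
        return "run"
    return "run" if classify(job) else "rejected"
```

Injecting the classifier as a callback lets the decision logic be tested with a stub, without calling a model.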
## Tool Controls
Per-job tool restriction via a disable list (every tool is available unless it is listed):
- API: `disabled_tools: ["type", "click"]`
- CLI: `--disable-tool type --disable-tool click`
Available tools:
- `execute_command(command)`
- `sleep(seconds)`
- `see_screen()`
- `enhance(coordinate)`
- `click(coordinate, offset_up/down/left/right, sleep_after_seconds)`
- `type(text)`
- `press_key(key, repeats=1)`
- `task_complete(result)`
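A disable list of this kind can be resolved into the set of tools offered to the agent roughly as follows. This is a sketch, not the project's implementation: `enabled_tools` is a hypothetical helper, and rejecting unknown names is an assumed design choice.

```python
from typing import Iterable, List

# Tool names as listed in this README.
ALL_TOOLS = (
    "execute_command",
    "sleep",
    "see_screen",
    "enhance",
    "click",
    "type",
    "press_key",
    "task_complete",
)


def enabled_tools(disabled: Iterable[str]) -> List[str]:
    """Return the tools that remain after applying a per-job disable list."""
    blocked = set(disabled)
    unknown = sorted(blocked - set(ALL_TOOLS))
    if unknown:
        # Assumption: a misspelled tool name is an error, so a typo does
        # not silently leave a tool enabled.
        raise ValueError("Unknown tool(s): " + ", ".join(unknown))
    return [name for name in ALL_TOOLS if name not in blocked]
```

Calling `enabled_tools(["type", "click"])` leaves the other six tools in their listed order.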
## Cost Estimation
Live/final cost is computed from OpenAI response usage (`input`, `cached_input`, `output`) and model pricing rates in `src/pricing.py`.
- Live: exposed in `GET /api/jobs/{job_id}` during execution
- Final: persisted in SQLite and returned in status output
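The arithmetic behind such an estimate can be sketched as below. The `Rates` class and `estimate_cost` function are hypothetical, not the contents of `src/pricing.py`; the sketch assumes rates are quoted in USD per one million tokens and that `input`, `cached_input`, and `output` are disjoint counts, each billed at its own rate. The rates in the usage note are placeholders, not real prices.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Rates:
    """Assumed pricing shape: USD per 1,000,000 tokens."""
    input: float
    cached_input: float
    output: float


def estimate_cost(usage: Mapping[str, int], rates: Rates) -> float:
    """Estimate cost in USD from token counts; missing keys count as 0."""
    total = (
        usage.get("input", 0) * rates.input
        + usage.get("cached_input", 0) * rates.cached_input
        + usage.get("output", 0) * rates.output
    )
    return total / 1_000_000
```

With placeholder rates `Rates(input=1.0, cached_input=0.1, output=4.0)` and usage `{"input": 1000, "cached_input": 500, "output": 200}`, the estimate is (1000 + 50 + 800) / 1,000,000, or about 0.00185 USD. Recomputing it after each model response would give the live figure, and the last value the final one.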
## Persistence
- SQLite DB: `screenjob.db`
- Runs/artifacts: `screenjob_runs/run_YYYYMMDD_HHMMSS/...`
- Full event log per job (for history and UI)
## Project Layout
```text
main.py
screenjob.py
src/
  __init__.py
  agent.py
  app_main.py
  cli.py
  config.py
  models.py
  pricing.py
  runtime.py
  safety.py
  server.py
  storage.py
  task_manager.py
  ui.py
tests/
  conftest.py
  test_pricing.py
  test_server_api.py
  test_storage.py
.gitea/
  workflows/
    ci.yml
```
## Verification
Run local verification (requires `pytest`, which is not part of the install command above):
```powershell
pytest -q
```
Gitea CI pipeline:
- File: `.gitea/workflows/ci.yml`
- Runs compile checks and pytest on push and pull request events.