# ScreenJob

Desktop-and-terminal task agent with:

- CLI runner
- FastAPI job server
- SQLite task history
- WebSocket-powered monitoring UI
- Safety pre-check and per-job tool disable controls
- Live/final token and cost estimation

## Install

```powershell
pip install openai pillow pyautogui python-dotenv fastapi uvicorn
```

## Environment

Create a `.env` file in the project root:

```env
OPENAI_API_KEY=...
SCREENJOB_TOKEN=choose_a_strong_token

# Optional
SCREENJOB_DEFAULT_MODEL=gpt-5.4-mini
SCREENJOB_SAFETY_MODEL=gpt-5.4-mini
SCREENJOB_HOST=127.0.0.1
SCREENJOB_PORT=8787
DISABLE_UI=false
```

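
As a rough illustration of how these variables might be consumed, the sketch below reads them from an environment mapping. It is not the project's `src/config.py`: the `Settings` class, the `load_settings` helper, and the fallback values (taken from the example above) are assumptions. In a real run, `load_dotenv()` from `python-dotenv` would typically populate `os.environ` first.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed above."""
    openai_api_key: str
    token: str
    default_model: str
    safety_model: str
    host: str
    port: int
    disable_ui: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping (illustrative helper)."""
    required = ("OPENAI_API_KEY", "SCREENJOB_TOKEN")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required variables: {', '.join(missing)}")

    # Assumption: defaults mirror the example values shown in the .env block.
    default_model = env.get("SCREENJOB_DEFAULT_MODEL", "gpt-5.4-mini")
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        token=env["SCREENJOB_TOKEN"],
        default_model=default_model,
        # Assumption: the safety model falls back to the default model.
        safety_model=env.get("SCREENJOB_SAFETY_MODEL", default_model),
        host=env.get("SCREENJOB_HOST", "127.0.0.1"),
        port=int(env.get("SCREENJOB_PORT", "8787")),
        disable_ui=env.get("DISABLE_UI", "false").strip().lower() in {"1", "true", "yes"},
    )
```

Passing the mapping explicitly keeps the helper easy to test without touching the real process environment.
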
## Entry Points

- `python main.py run "<job>"`
- `python main.py server`
- Backward-compatible wrapper: `python screenjob.py "<job>"`

## CLI Usage

```powershell
python main.py run "Open amazon.de and go to my orders"
```

Useful flags:

- `--model gpt-5.4-mini`
- `--disable-tool click --disable-tool type`
- `--skip-safety-check`
- `--max-steps 80`

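
The flags above map naturally onto `argparse`. The following is only a sketch of how such a parser could look, not the contents of `src/cli.py`; the `build_parser` name, the `disabled_tools` destination, and the `None` defaults are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser for the `run` and `server` entry points."""
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="Run a single job")
    run.add_argument("job", help="Natural-language job description")
    run.add_argument("--model", default=None)
    # action="append" lets --disable-tool be repeated, one tool per flag.
    run.add_argument("--disable-tool", dest="disabled_tools",
                     action="append", default=[])
    run.add_argument("--skip-safety-check", action="store_true")
    run.add_argument("--max-steps", type=int, default=None)

    sub.add_parser("server", help="Start the job server")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```
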
## HTTP API

All API routes require token authentication with `SCREENJOB_TOKEN`, supplied in one of these ways:

- `Authorization: Bearer <token>` header
- `X-ScreenJob-Token: <token>` header
- `?token=<token>` query parameter (for browser/image fetches)

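
The three token locations above could be checked along these lines. This is a framework-independent sketch rather than the server's actual code: the helper names, the lookup order, and the assumption that header names arrive lowercased are all illustrative.

```python
import hmac
from typing import Mapping, Optional


def extract_token(headers: Mapping[str, str], query: Mapping[str, str]) -> Optional[str]:
    """Return the presented token, or None if no location carries one.

    Assumes header names are already lowercased. The precedence
    (Bearer, then X-ScreenJob-Token, then ?token=) is an assumption.
    """
    scheme, _, value = headers.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and value.strip():
        return value.strip()
    return headers.get("x-screenjob-token") or query.get("token") or None


def is_authorized(headers: Mapping[str, str], query: Mapping[str, str], expected: str) -> bool:
    """Compare the presented token with SCREENJOB_TOKEN in constant time."""
    presented = extract_token(headers, query)
    if not presented:
        return False
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

`hmac.compare_digest` is used so that the comparison time does not depend on how many leading characters match.
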
### Create Job

`POST /api/jobs`

Body:

```json
{
  "job": "Open amazon.de and go to my orders",
  "model": "gpt-5.4-mini",
  "disabled_tools": ["click"],
  "safety_override": false
}
```

Response:

```json
{ "job_id": "job_..." }
```

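
A client could assemble this request with the standard library as sketched below. The `build_create_job_request` helper is hypothetical, and treating `model` as optional is an assumption; the field names and route come from the example above.

```python
import json
import urllib.request
from typing import Iterable, Optional


def build_create_job_request(
    base_url: str,
    token: str,
    job: str,
    model: Optional[str] = None,
    disabled_tools: Optional[Iterable[str]] = None,
    safety_override: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a POST /api/jobs request."""
    payload = {
        "job": job,
        "disabled_tools": list(disabled_tools or []),
        "safety_override": safety_override,
    }
    if model is not None:  # assumption: the server falls back to its default model
        payload["model"] = model
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending it against a running server would look like:
#   with urllib.request.urlopen(request) as response:
#       job_id = json.load(response)["job_id"]
```
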
### Status / Output

- `GET /api/jobs/{job_id}`: full status + output + live/final usage/cost
- `GET /api/jobs/{job_id}/status`: status alias
- `GET /api/jobs/{job_id}/events`: detailed timeline
- `GET /api/jobs/{job_id}/artifact?path=<absolute_path>&token=<token>`: authenticated artifact file fetch for screenshots/enhancements
- `GET /api/jobs`: list active + past jobs
- `POST /api/jobs/{job_id}/cancel`: graceful cancellation
- `GET /api/stats`: aggregate metrics

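
A client that does not use the WebSocket feed might poll the status route until the job finishes. The sketch below shows one way to do that. The set of terminal status strings and the `status` field name are assumptions, since this README does not list them, and `fetch` stands in for whatever function performs the authenticated `GET /api/jobs/{job_id}` call.

```python
import time
from typing import Any, Callable, Dict

# Assumption: these status strings are placeholders, not documented values.
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled", "rejected"})


def wait_for_job(
    fetch: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll `fetch(job_id)` until the job reaches a terminal status."""
    deadline = clock() + timeout
    while True:
        snapshot = fetch(job_id)
        if snapshot.get("status") in TERMINAL_STATUSES:
            return snapshot
        if clock() >= deadline:
            raise TimeoutError(f"{job_id} did not finish within {timeout} seconds")
        sleep(interval)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays.
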
## Monitoring UI

- Served at `/` when `DISABLE_UI=false`
- Tailwind-based read-only dashboard
- Requires entering `SCREENJOB_TOKEN` in the UI before data loads
- Uses WebSocket `/ws` for live updates (tool calls, step events, usage/cost updates)
- No task launch controls in the UI (monitoring only)

If `DISABLE_UI=true`, `/` returns `{ "ui_disabled": true }` and only the API endpoints remain available.

## Safety

Before execution, each task is classified by a model-based safety gate:

- Safe: task runs
- Unsafe: task is rejected and recorded
- Override: set `"safety_override": true` in the request body (or pass `--skip-safety-check` on the CLI)

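
The three outcomes above amount to a small decision function, sketched here. The `Verdict` enum and `gate` helper are hypothetical, `classify` stands in for the model call, and it is an assumption that the override skips classification entirely rather than overruling an unsafe verdict afterwards.

```python
from enum import Enum
from typing import Callable, Tuple


class Verdict(Enum):
    SAFE = "safe"
    UNSAFE = "unsafe"


def gate(
    classify: Callable[[str], Verdict],
    job: str,
    safety_override: bool = False,
) -> Tuple[bool, str]:
    """Return (should_run, reason) for a job description."""
    if safety_override:
        # Assumption: with the override set, the classifier is not called at all.
        return True, "safety check skipped by override"
    verdict = classify(job)
    if verdict is Verdict.SAFE:
        return True, "classified as safe"
    return False, "classified as unsafe; job rejected and recorded"
```
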
## Tool Controls

Tools are restricted per job through a disable list (every tool is enabled unless listed):

- API: `disabled_tools: ["type", "click"]`
- CLI: `--disable-tool type --disable-tool click`

Available tools:

- `execute_command(command)`
- `sleep(seconds)`
- `see_screen()`
- `enhance(coordinate)`
- `click(coordinate, offset_up/down/left/right, sleep_after_seconds)`
- `type(text)`
- `press_key(key, repeats=1)`
- `task_complete(result)`

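
Applying a disable list to the tool names above can be sketched as a simple filter. The `enabled_tools` helper is hypothetical, and rejecting unknown names with an error is an assumption about desirable behavior, not something this README specifies.

```python
from typing import Iterable, List

# Tool names as listed in this README.
AVAILABLE_TOOLS = (
    "execute_command",
    "sleep",
    "see_screen",
    "enhance",
    "click",
    "type",
    "press_key",
    "task_complete",
)


def enabled_tools(disabled: Iterable[str]) -> List[str]:
    """Return the tools left after removing the disabled ones, in listed order."""
    disabled_set = set(disabled)
    unknown = disabled_set - set(AVAILABLE_TOOLS)
    if unknown:
        # Assumption: a misspelled tool name should fail loudly, not silently.
        raise ValueError(f"Unknown tool(s): {', '.join(sorted(unknown))}")
    return [name for name in AVAILABLE_TOOLS if name not in disabled_set]
```
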
## Cost Estimation

Live/final cost is computed from OpenAI response usage (`input`, `cached_input`, `output`) and model pricing rates in `src/pricing.py`.

- Live: exposed in `GET /api/jobs/{job_id}` during execution
- Final: persisted in SQLite and returned in status output

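
The calculation is a weighted sum of token counts, sketched below. This is not the code in `src/pricing.py`: the `Rates` class, the per-million-token unit, and the rate values in the usage note are placeholders, and the sketch assumes `input` counts only uncached tokens, so that `cached_input` is billed separately.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Rates:
    """Hypothetical price table entry, in USD per 1,000,000 tokens."""
    input: float
    cached_input: float
    output: float


def estimate_cost(usage: Mapping[str, int], rates: Rates) -> float:
    """Return the estimated cost in USD for one usage record.

    Assumption: `input` excludes cached tokens. If the usage record reports
    cached tokens as part of `input`, subtract them before calling this.
    """
    total = (
        usage.get("input", 0) * rates.input
        + usage.get("cached_input", 0) * rates.cached_input
        + usage.get("output", 0) * rates.output
    )
    return total / 1_000_000
```

With placeholder rates of 1.00, 0.10, and 4.00 USD per million tokens, a record of 1,000 input, 500 cached, and 200 output tokens gives (1000 × 1.00 + 500 × 0.10 + 200 × 4.00) / 1,000,000 = 0.00185 USD. Live cost is then the running sum of this value over the responses received so far.
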
## Persistence

- SQLite DB: `screenjob.db`
- Runs/artifacts: `screenjob_runs/run_YYYYMMDD_HHMMSS/...`
- Full event log per job (for history and UI)

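
The run directory pattern above corresponds to a `strftime` format. A minimal sketch, assuming the timestamp is the job's local start time (the README does not say which clock or timezone is used) and using a hypothetical `run_directory` helper:

```python
from datetime import datetime
from pathlib import PurePosixPath


def run_directory(started_at: datetime, root: str = "screenjob_runs") -> PurePosixPath:
    """Return the artifact directory for a run started at `started_at`."""
    # run_YYYYMMDD_HHMMSS, matching the pattern shown above.
    return PurePosixPath(root) / started_at.strftime("run_%Y%m%d_%H%M%S")
```

`PurePosixPath` keeps the example platform-independent; real code that creates the directory would use `pathlib.Path` instead.
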
## Project Layout

```text
main.py
screenjob.py
src/
  __init__.py
  agent.py
  app_main.py
  cli.py
  config.py
  models.py
  pricing.py
  runtime.py
  safety.py
  server.py
  storage.py
  task_manager.py
  ui.py
tests/
  conftest.py
  test_pricing.py
  test_server_api.py
  test_storage.py
.gitea/
  workflows/
    ci.yml
```

## Verification

Run the test suite locally:

```powershell
pytest -q
```

Gitea CI pipeline:

- File: `.gitea/workflows/ci.yml`
- Runs compile checks and pytest on every push and pull request.