# ScreenJob

Desktop-and-terminal task agent with:

- CLI runner
- FastAPI job server
- SQLite task history
- WebSocket-powered monitoring UI
- Safety pre-check and per-job tool disable controls
- Live/final token and cost estimation

## Install

```powershell
pip install openai pillow pyautogui python-dotenv fastapi uvicorn
```

## Environment

Create `.env` in the project root:

```env
OPENAI_API_KEY=...
SCREENJOB_TOKEN=choose_a_strong_token
# Optional
SCREENJOB_DEFAULT_MODEL=gpt-5.4-mini
SCREENJOB_SAFETY_MODEL=gpt-5.4-mini
SCREENJOB_HOST=127.0.0.1
SCREENJOB_PORT=8787
DISABLE_UI=false
```

## Entry Points

- `python main.py run "<job>"`
- `python main.py server`
- Backward-compatible wrapper: `python screenjob.py "<job>"`

## CLI Usage

```powershell
python main.py run "Open amazon.de and go to my orders"
```

Useful flags:

- `--model gpt-5.4-mini`
- `--disable-tool click --disable-tool type`
- `--skip-safety-check`
- `--max-steps 80`

## HTTP API

All API routes require token auth using `SCREENJOB_TOKEN`, passed in one of these ways:

- `Authorization: Bearer <token>` header
- `X-ScreenJob-Token: <token>` header
- `?token=<token>` query parameter (for browser/image fetches)

### Create Job

`POST /api/jobs`

Body:

```json
{
  "job": "Open amazon.de and go to my orders",
  "model": "gpt-5.4-mini",
  "disabled_tools": ["click"],
  "safety_override": false
}
```

Response:

```json
{
  "job_id": "job_..."
}
```

### Status / Output

- `GET /api/jobs/{job_id}`: full status + output + live/final usage/cost
- `GET /api/jobs/{job_id}/status`: status alias
- `GET /api/jobs/{job_id}/events`: detailed timeline
- `GET /api/jobs/{job_id}/artifact?path=<path>&token=<token>`: authenticated artifact file fetch for screenshots/enhancements
- `GET /api/jobs`: list active + past jobs
- `POST /api/jobs/{job_id}/cancel`: graceful cancellation
- `GET /api/stats`: aggregate metrics

## Monitoring UI

- Served at `/` when `DISABLE_UI=false`
- Tailwind-based read-only dashboard
- Requires entering `SCREENJOB_TOKEN` in the UI before data loads
- Uses WebSocket `/ws` for live updates (tool calls, step events, usage/cost updates)
- No task launch controls in the UI (monitoring only)

If `DISABLE_UI=true`, `/` returns `{ "ui_disabled": true }` and only the API endpoints remain.

## Safety

Before execution, each task is classified by a model safety gate:

- Safe: task runs
- Unsafe: task is rejected and recorded
- Override: set `safety_override=true` (or `--skip-safety-check` in the CLI)

## Tool Controls

Tools are restricted per job via a disable list:

- API: `disabled_tools: ["type", "click"]`
- CLI: `--disable-tool type --disable-tool click`

Available tools:

- `execute_command(command)`
- `sleep(seconds)`
- `see_screen()`
- `enhance(coordinate)`
- `click(coordinate, offset_up/down/left/right, sleep_after_seconds)`
- `type(text)`
- `press_key(key, repeats=1)`
- `task_complete(result)`

## Cost Estimation

Live/final cost is computed from OpenAI response usage (`input`, `cached_input`, `output`) and model pricing rates in `src/pricing.py`.

- Live: exposed in `GET /api/jobs/{job_id}` during execution
- Final: persisted in SQLite and returned in status output

## Persistence

- SQLite DB: `screenjob.db`
- Runs/artifacts: `screenjob_runs/run_YYYYMMDD_HHMMSS/...`
- Full event log per job (for history and UI)

## Project Layout

```text
main.py
screenjob.py
src/
  __init__.py
  agent.py
  app_main.py
  cli.py
  config.py
  models.py
  pricing.py
  runtime.py
  safety.py
  server.py
  storage.py
  task_manager.py
  ui.py
tests/
  conftest.py
  test_pricing.py
  test_server_api.py
  test_storage.py
.gitea/
  workflows/
    ci.yml
```

## Verification

Run local verification:

```powershell
pytest -q
```

Gitea CI pipeline:

- File: `.gitea/workflows/ci.yml`
- Runs compile checks + pytest on push and PR.
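
## Example: Calling the API from Python

As a rough illustration of the HTTP API section, here is a minimal client sketch using only the Python standard library. It assumes the server is reachable at the default `SCREENJOB_HOST`/`SCREENJOB_PORT` from the Environment section and that responses are JSON shaped like the examples in this README. The helper names (`build_create_job_request`, `create_job`, `get_job`) are hypothetical and not part of the project.

```python
import json
import urllib.request

# Assumption: default host/port from the Environment section.
BASE_URL = "http://127.0.0.1:8787"


def build_create_job_request(job, token, model=None, disabled_tools=None,
                             base_url=BASE_URL):
    """Build (but do not send) a POST /api/jobs request with Bearer auth."""
    body = {"job": job, "safety_override": False}
    if model:
        body["model"] = model
    if disabled_tools:
        body["disabled_tools"] = list(disabled_tools)
    return urllib.request.Request(
        f"{base_url}/api/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_job(job, token, **kwargs):
    """Send the create request and return the job_id from the response."""
    request = build_create_job_request(job, token, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["job_id"]


def get_job(job_id, token, base_url=BASE_URL):
    """Fetch GET /api/jobs/{job_id} and return the decoded JSON payload."""
    request = urllib.request.Request(
        f"{base_url}/api/jobs/{job_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Usage (requires a running server and a valid SCREENJOB_TOKEN):
#   job_id = create_job("Open amazon.de and go to my orders", token,
#                       disabled_tools=["click"])
#   print(get_job(job_id, token))
```

Building the request separately from sending it keeps the auth header and body easy to inspect or test without a running server. The `X-ScreenJob-Token` header could be used in place of `Authorization` in the same way.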