screenjob/README.md
Space-Banane a19b285232
test: add pytest verification suite and gitea ci workflow
2026-05-27 17:55:34 +02:00

ScreenJob

Desktop-and-terminal task agent with:

  • CLI runner
  • FastAPI job server
  • SQLite task history
  • WebSocket-powered monitoring UI
  • Safety pre-check and per-job tool disable controls
  • Live/final token and cost estimation

Install

pip install openai pillow pyautogui python-dotenv fastapi uvicorn

Environment

Create .env in project root:

OPENAI_API_KEY=...
SCREENJOB_TOKEN=choose_a_strong_token

# Optional
SCREENJOB_DEFAULT_MODEL=gpt-5.4-mini
SCREENJOB_SAFETY_MODEL=gpt-5.4-mini
SCREENJOB_HOST=127.0.0.1
SCREENJOB_PORT=8787
DISABLE_UI=false
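
The variables above could be read at startup roughly as follows. This is a minimal standard-library sketch: the project lists python-dotenv, which would normally populate the environment from .env first. The Settings class, the load_settings function, and the fallback values (copied from the sample above) are assumptions for illustration and may not match src/config.py.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    # Hypothetical container; the real config module may be organised differently.
    openai_api_key: str
    token: str
    default_model: str
    safety_model: str
    host: str
    port: int
    disable_ui: bool


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    # The two non-optional variables must be present and non-empty.
    missing = [k for k in ("OPENAI_API_KEY", "SCREENJOB_TOKEN") if not env.get(k)]
    if missing:
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        token=env["SCREENJOB_TOKEN"],
        # Fallbacks mirror the sample .env values; the real defaults may differ.
        default_model=env.get("SCREENJOB_DEFAULT_MODEL", "gpt-5.4-mini"),
        safety_model=env.get("SCREENJOB_SAFETY_MODEL", "gpt-5.4-mini"),
        host=env.get("SCREENJOB_HOST", "127.0.0.1"),
        port=int(env.get("SCREENJOB_PORT", "8787")),
        disable_ui=env.get("DISABLE_UI", "false").strip().lower() in {"1", "true", "yes"},
    )
```

Passing a plain dict instead of os.environ keeps the loader easy to exercise in tests.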

Entry Points

  • python main.py run "<job>"
  • python main.py server
  • Backward-compatible wrapper: python screenjob.py "<job>"

CLI Usage

python main.py run "Open amazon.de and go to my orders"

Useful flags:

  • --model gpt-5.4-mini
  • --disable-tool click --disable-tool type
  • --skip-safety-check
  • --max-steps 80
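
To illustrate how the repeated --disable-tool flag can accumulate into a list, here is a hedged argparse sketch that mirrors the flags above. The build_parser function and the disabled_tools destination name are assumptions; the actual parser in src/cli.py may be structured differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented entry points and flags."""
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="run a single job from the CLI")
    run.add_argument("job", help="natural-language job description")
    run.add_argument("--model")
    # action="append" collects every occurrence of the flag into one list.
    run.add_argument("--disable-tool", action="append", default=[], dest="disabled_tools")
    run.add_argument("--skip-safety-check", action="store_true")
    run.add_argument("--max-steps", type=int)

    sub.add_parser("server", help="start the job server")
    return parser


args = build_parser().parse_args(
    ["run", "Open amazon.de and go to my orders",
     "--disable-tool", "click", "--disable-tool", "type", "--max-steps", "80"]
)
print(args.disabled_tools, args.max_steps)
```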

HTTP API

All API routes require the SCREENJOB_TOKEN value, supplied in one of three ways:

  • Authorization: Bearer <token> header
  • X-ScreenJob-Token: <token> header
  • ?token=<token> query parameter (for browser/image fetches)
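
The three ways of passing the token can be sketched with small helpers. The helper names are hypothetical; only the header names and the token query parameter come from the list above.

```python
from urllib.parse import urlencode


def bearer_headers(token: str) -> dict:
    """Headers for the Authorization: Bearer variant."""
    return {"Authorization": f"Bearer {token}"}


def custom_headers(token: str) -> dict:
    """Headers for the X-ScreenJob-Token variant."""
    return {"X-ScreenJob-Token": token}


def with_token_query(url: str, token: str) -> str:
    """Append ?token=... (or &token=...) with the token URL-encoded."""
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}{urlencode({'token': token})}"


print(with_token_query("http://127.0.0.1:8787/api/stats", "choose_a_strong_token"))
```

A token in the query string can end up in browser history and server logs, so the header variants are the safer choice whenever the client can set headers.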

Create Job

POST /api/jobs

Body:

{
  "job": "Open amazon.de and go to my orders",
  "model": "gpt-5.4-mini",
  "disabled_tools": ["click"],
  "safety_override": false
}

Response:

{ "job_id": "job_..." }
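
A client call for this endpoint might look like the sketch below, using only the standard library. The function names are hypothetical, and treating model and disabled_tools as optional fields is an assumption; the base URL matches the sample host and port.

```python
import json
import urllib.request


def build_create_job_request(base_url, token, job, model=None,
                             disabled_tools=None, safety_override=False):
    """Prepare (but do not send) a POST /api/jobs request."""
    payload = {"job": job, "safety_override": safety_override}
    # Assumption: these fields may be omitted so the server falls back to its defaults.
    if model:
        payload["model"] = model
    if disabled_tools:
        payload["disabled_tools"] = list(disabled_tools)
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_job(request) -> str:
    """Send the prepared request and return the job_id from the response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["job_id"]
```

Usage, with the server running locally: create_job(build_create_job_request("http://127.0.0.1:8787", token, "Open amazon.de and go to my orders", disabled_tools=["click"])).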

Status / Output

  • GET /api/jobs/{job_id}: full status + output + live/final usage/cost
  • GET /api/jobs/{job_id}/status: status alias
  • GET /api/jobs/{job_id}/events: detailed timeline
  • GET /api/jobs/{job_id}/artifact?path=<absolute_path>&token=<token>: authenticated artifact file fetch for screenshots/enhancements
  • GET /api/jobs: list active + past jobs
  • POST /api/jobs/{job_id}/cancel: graceful cancellation
  • GET /api/stats: aggregate metrics
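
Because the artifact route takes an absolute file path as a query parameter, the path has to be URL-encoded. A minimal sketch of building such a URL follows; artifact_url is a hypothetical helper, not part of the project.

```python
from urllib.parse import quote, urlencode


def artifact_url(base_url: str, job_id: str, path: str, token: str) -> str:
    """Build the authenticated artifact URL for a screenshot or enhancement."""
    # urlencode escapes slashes, spaces, and other reserved characters in the path.
    query = urlencode({"path": path, "token": token})
    return f"{base_url.rstrip('/')}/api/jobs/{quote(job_id, safe='')}/artifact?{query}"


print(artifact_url("http://127.0.0.1:8787", "job_1", "/tmp/run 1/shot.png", "t"))
# prints http://127.0.0.1:8787/api/jobs/job_1/artifact?path=%2Ftmp%2Frun+1%2Fshot.png&token=t
```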

Monitoring UI

  • Served at / when DISABLE_UI=false
  • Tailwind-based read-only dashboard
  • Requires entering SCREENJOB_TOKEN in UI before data loads
  • Uses WebSocket /ws for live updates (tool calls, step events, usage/cost updates)
  • No task launch controls in UI (monitoring only)

If DISABLE_UI=true, / returns { "ui_disabled": true } and only API endpoints remain.

Safety

Before execution, a model-based safety gate classifies each task:

  • Safe: task runs
  • Unsafe: task is rejected and recorded
  • Override: set safety_override=true in the API body (or pass --skip-safety-check on the CLI)
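
The decision logic above can be sketched as a small function. Everything here is illustrative: decide, the classify callable, and the "safe"/"unsafe" labels are assumptions, and whether an override skips the classifier entirely or only overrules its verdict is not stated in this README (the sketch skips it).

```python
from typing import Callable


def decide(job: str, classify: Callable[[str], str], safety_override: bool = False) -> str:
    """Return "run" or "rejected" for a job description."""
    if safety_override:
        # Assumption: an override bypasses the gate without consulting the classifier.
        return "run"
    verdict = classify(job)
    return "run" if verdict == "safe" else "rejected"


# A stand-in classifier; the real gate calls a model (SCREENJOB_SAFETY_MODEL).
def fake_classifier(job: str) -> str:
    return "unsafe" if "rm -rf" in job else "safe"


print(decide("Open amazon.de and go to my orders", fake_classifier))
print(decide("run rm -rf / in a terminal", fake_classifier))
print(decide("run rm -rf / in a terminal", fake_classifier, safety_override=True))
```

Treating any verdict other than "safe" as a rejection keeps the gate fail-closed if the classifier returns something unexpected.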

Tool Controls

All tools are enabled by default; individual tools can be turned off per job with a disable list:

  • API: disabled_tools: ["type", "click"]
  • CLI: --disable-tool type --disable-tool click

Available tools:

  • execute_command(command)
  • sleep(seconds)
  • see_screen()
  • enhance(coordinate)
  • click(coordinate, offset_up/down/left/right, sleep_after_seconds)
  • type(text)
  • press_key(key, repeats=1)
  • task_complete(result)
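
Applying a disable list to the tools above amounts to a simple filter. In this sketch the tool names come from the list, while enabled_tools and the choice to reject unknown names are assumptions about behavior this README does not specify.

```python
AVAILABLE_TOOLS = (
    "execute_command",
    "sleep",
    "see_screen",
    "enhance",
    "click",
    "type",
    "press_key",
    "task_complete",
)


def enabled_tools(disabled_tools) -> list:
    """Return the tools a job may use, preserving the documented order."""
    disabled = set(disabled_tools)
    unknown = disabled - set(AVAILABLE_TOOLS)
    if unknown:
        # Assumption: a typo in a tool name should fail loudly rather than be ignored.
        raise ValueError(f"unknown tools: {sorted(unknown)}")
    return [tool for tool in AVAILABLE_TOOLS if tool not in disabled]


print(enabled_tools(["type", "click"]))
```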

Cost Estimation

Live/final cost is computed from OpenAI response usage (input, cached_input, output) and model pricing rates in src/pricing.py.

  • Live: exposed in GET /api/jobs/{job_id} during execution
  • Final: persisted in SQLite and returned in status output
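
The arithmetic behind such an estimate can be sketched as follows. The rates are placeholders, not real prices, and the sketch assumes the three token counts are disjoint and that rates are quoted in USD per one million tokens; src/pricing.py may differ on both points.

```python
TOKENS_PER_UNIT = 1_000_000  # assumed pricing unit: USD per 1M tokens


def estimate_cost(usage: dict, rates: dict) -> float:
    """Sum tokens * rate over the three usage categories, in USD."""
    categories = ("input", "cached_input", "output")
    total = sum(usage.get(name, 0) * rates[name] for name in categories)
    return total / TOKENS_PER_UNIT


# Placeholder rates for illustration only.
example_rates = {"input": 2.0, "cached_input": 0.5, "output": 8.0}
example_usage = {"input": 1_000_000, "cached_input": 500_000, "output": 200_000}
print(round(estimate_cost(example_usage, example_rates), 4))
```

With the example numbers this is 2.00 + 0.25 + 1.60 USD. A live estimate would apply the same function to the running token totals after each model response.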

Persistence

  • SQLite DB: screenjob.db
  • Runs/artifacts: screenjob_runs/run_YYYYMMDD_HHMMSS/...
  • Full event log per job (for history and UI)
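
The run directory pattern above corresponds to a strftime format. A short sketch, where run_dir is a hypothetical helper and the use of local time is an assumption:

```python
from datetime import datetime
from pathlib import Path


def run_dir(now: datetime, root: str = "screenjob_runs") -> Path:
    """Return the artifact directory for a run started at the given time."""
    # run_YYYYMMDD_HHMMSS maps to the format string below.
    return Path(root) / now.strftime("run_%Y%m%d_%H%M%S")


print(run_dir(datetime(2026, 5, 27, 17, 55, 34)).as_posix())
# prints screenjob_runs/run_20260527_175534
```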

Project Layout

main.py
screenjob.py
src/
  __init__.py
  agent.py
  app_main.py
  cli.py
  config.py
  models.py
  pricing.py
  runtime.py
  safety.py
  server.py
  storage.py
  task_manager.py
  ui.py
tests/
  conftest.py
  test_pricing.py
  test_server_api.py
  test_storage.py
.gitea/
  workflows/
    ci.yml

Verification

Run local verification:

pytest -q

Gitea CI pipeline:

  • File: .gitea/workflows/ci.yml
  • Runs compile checks and pytest on every push and pull request.