ScreenJob
Desktop-and-terminal task agent with:
- CLI runner
- FastAPI job server
- SQLite task history
- WebSocket-powered monitoring UI
- Safety pre-check and per-job tool disable controls
- Live/final token and cost estimation
Install
pip install openai pillow pyautogui python-dotenv fastapi uvicorn
Environment
Create .env in project root:
OPENAI_API_KEY=...
SCREENJOB_TOKEN=choose_a_strong_token
# Optional
SCREENJOB_DEFAULT_MODEL=gpt-5.4-mini
SCREENJOB_SAFETY_MODEL=gpt-5.4-mini
SCREENJOB_HOST=127.0.0.1
SCREENJOB_PORT=8787
DISABLE_UI=false
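As an illustration of how these variables might be read at startup, here is a minimal sketch. The `Settings` class and `load_settings` function are hypothetical names, not taken from src/config.py, and treating the values shown above as fallback defaults is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_token: str
    default_model: str
    safety_model: str
    host: str
    port: int
    disable_ui: bool


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    token = env.get("SCREENJOB_TOKEN", "")
    if not token:
        # The API token has no sensible default, so fail early.
        raise ValueError("SCREENJOB_TOKEN must be set")
    return Settings(
        api_token=token,
        # Assumption: the optional values from the README act as defaults.
        default_model=env.get("SCREENJOB_DEFAULT_MODEL", "gpt-5.4-mini"),
        safety_model=env.get("SCREENJOB_SAFETY_MODEL", "gpt-5.4-mini"),
        host=env.get("SCREENJOB_HOST", "127.0.0.1"),
        port=int(env.get("SCREENJOB_PORT", "8787")),
        disable_ui=env.get("DISABLE_UI", "false").strip().lower() in {"1", "true", "yes"},
    )
```

Passing a plain dict instead of `os.environ` keeps the function easy to exercise in tests.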
Entry Points
- python main.py run "<job>"
- python main.py server
- Backward-compatible wrapper: python screenjob.py "<job>"
CLI Usage
python main.py run "Open amazon.de and go to my orders"
Useful flags:
- --model gpt-5.4-mini
- --disable-tool click --disable-tool type
- --skip-safety-check
- --max-steps 80
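The flags above could be parsed with the standard library's argparse roughly as follows. This is a sketch, not the contents of src/cli.py: `build_parser` is a hypothetical helper, and the default values for --model and --max-steps are assumptions taken from the examples above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="run a single job from the CLI")
    run.add_argument("job", help="natural-language job description")
    run.add_argument("--model", default="gpt-5.4-mini")  # assumed default
    # Repeatable flag: each occurrence appends one tool name.
    run.add_argument("--disable-tool", action="append", default=[],
                     dest="disabled_tools")
    run.add_argument("--skip-safety-check", action="store_true")
    run.add_argument("--max-steps", type=int, default=80)  # assumed default

    sub.add_parser("server", help="start the job server")
    return parser
```

Using `action="append"` is what lets --disable-tool be repeated once per tool.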
HTTP API
All API routes require token auth using SCREENJOB_TOKEN:
- Authorization: Bearer <token>
- or X-ScreenJob-Token: <token>
- ?token=<token> query parameter (for browser/image fetch)
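A server-side check that accepts all three token locations might look like the sketch below. The function names are hypothetical, and the lookup order (Authorization header, then X-ScreenJob-Token, then query parameter) is an assumption; the README does not state a precedence.

```python
import hmac
from typing import Mapping, Optional


def extract_token(headers: Mapping[str, str],
                  query: Mapping[str, str]) -> Optional[str]:
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {k.lower(): v for k, v in headers.items()}
    auth = lowered.get("authorization", "")
    if auth.lower().startswith("bearer "):
        return auth[len("bearer "):].strip()
    if "x-screenjob-token" in lowered:
        return lowered["x-screenjob-token"].strip()
    # Fallback for browser/image fetches that cannot set headers.
    return query.get("token")


def is_authorized(headers: Mapping[str, str], query: Mapping[str, str],
                  expected: str) -> bool:
    supplied = extract_token(headers, query)
    if not supplied or not expected:
        return False
    # Constant-time comparison avoids leaking the token via timing.
    return hmac.compare_digest(supplied.encode(), expected.encode())
```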
Create Job
POST /api/jobs
Body:
{
"job": "Open amazon.de and go to my orders",
"model": "gpt-5.4-mini",
"disabled_tools": ["click"],
"safety_override": false
}
Response:
{ "job_id": "job_..." }
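For reference, a client could assemble this request with only the standard library, as sketched here. `build_create_job_request` is a hypothetical helper, and the base URL in the usage note assumes the host and port from the Environment section.

```python
import json
import urllib.request


def build_create_job_request(base_url: str, token: str, job: str,
                             model: str = "gpt-5.4-mini",
                             disabled_tools=(),
                             safety_override: bool = False) -> urllib.request.Request:
    body = {
        "job": job,
        "model": model,
        "disabled_tools": list(disabled_tools),
        "safety_override": safety_override,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

With a server running, `urllib.request.urlopen(build_create_job_request("http://127.0.0.1:8787", token, "Open amazon.de and go to my orders"))` should return the JSON response shown above, from which `job_id` can be read.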
Status / Output
- GET /api/jobs/{job_id}: full status + output + live/final usage/cost
- GET /api/jobs/{job_id}/status: status alias
- GET /api/jobs/{job_id}/events: detailed timeline
- GET /api/jobs/{job_id}/artifact?path=<absolute_path>&token=<token>: authenticated artifact file fetch for screenshots/enhancements
- GET /api/jobs: list active + past jobs
- POST /api/jobs/{job_id}/cancel: graceful cancellation
- GET /api/stats: aggregate metrics
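Because the artifact route takes an absolute path and a token as query parameters, both need URL-encoding. A minimal sketch of building such a URL follows; `artifact_url` is a hypothetical helper and the example path is invented for illustration.

```python
from urllib.parse import quote, urlencode


def artifact_url(base_url: str, job_id: str, path: str, token: str) -> str:
    # urlencode percent-escapes slashes and other reserved characters
    # in the absolute path so it survives as a single query value.
    query = urlencode({"path": path, "token": token})
    return f"{base_url.rstrip('/')}/api/jobs/{quote(job_id, safe='')}/artifact?{query}"
```

For example, `artifact_url("http://127.0.0.1:8787", "job_1", "/tmp/shot.png", "t")` encodes the path as `%2Ftmp%2Fshot.png`.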
Monitoring UI
- Served at / when DISABLE_UI=false
- Tailwind-based read-only dashboard
- Requires entering SCREENJOB_TOKEN in UI before data loads
- Uses WebSocket /ws for live updates (tool calls, step events, usage/cost updates)
- No task launch controls in UI (monitoring only)
If DISABLE_UI=true, / returns { "ui_disabled": true } and only API endpoints remain.
Safety
Before execution, each task is classified by a model safety gate:
- Safe: task runs
- Unsafe: task is rejected and recorded
- Override: set safety_override=true (or --skip-safety-check in CLI)
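The three outcomes above can be sketched as a small decision function. Everything here is illustrative: `GateResult`, `safety_gate`, and the `classify` callable are hypothetical stand-ins for the model-backed check, and skipping the classifier entirely on override is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GateResult:
    allowed: bool
    reason: str  # "safe", "unsafe", or "override"


def safety_gate(job: str, classify: Callable[[str], bool],
                safety_override: bool = False) -> GateResult:
    if safety_override:
        # Assumption: an override skips the classifier call entirely.
        return GateResult(allowed=True, reason="override")
    if classify(job):
        return GateResult(allowed=True, reason="safe")
    # Rejected jobs still produce a result so they can be recorded.
    return GateResult(allowed=False, reason="unsafe")
```

Returning a result object rather than raising makes it straightforward to persist rejected tasks alongside completed ones.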
Tool Controls
Per-job tool restriction via a disable list (every tool stays enabled unless it is listed):
- API: disabled_tools: ["type", "click"]
- CLI: --disable-tool type --disable-tool click
Available tools:
- execute_command(command)
- sleep(seconds)
- see_screen()
- enhance(coordinate)
- click(coordinate, offset_up/down/left/right, sleep_after_seconds)
- type(text)
- press_key(key, repeats=1)
- task_complete(result)
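Applying a disable list to this tool set amounts to a simple filter, sketched below. `ALL_TOOLS` and `enabled_tools` are hypothetical names, and rejecting unknown tool names with an error is an assumption about how typos should be handled.

```python
ALL_TOOLS = (
    "execute_command", "sleep", "see_screen", "enhance",
    "click", "type", "press_key", "task_complete",
)


def enabled_tools(disabled):
    """Return the tool names that remain available for a job."""
    disabled = set(disabled)
    unknown = disabled - set(ALL_TOOLS)
    if unknown:
        # Assumption: a misspelled tool name should fail loudly rather
        # than silently leave the intended tool enabled.
        raise ValueError(f"unknown tools: {sorted(unknown)}")
    return [name for name in ALL_TOOLS if name not in disabled]
```

With `disabled_tools: ["type", "click"]`, the agent would be offered only the remaining six tools.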
Cost Estimation
Live/final cost is computed from OpenAI response usage (input, cached_input, output) and model pricing rates in src/pricing.py.
- Live: exposed in GET /api/jobs/{job_id} during execution
- Final: persisted in SQLite and returned in status output
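The calculation can be sketched as below. This is not the code in src/pricing.py: the `Rates` class and `estimate_cost` function are hypothetical, rates are expressed per million tokens, and the sketch assumes the input count includes the cached tokens, so the cached portion is billed at its own rate and subtracted from the regular input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Price per one million tokens, in the billing currency."""
    input: float
    cached_input: float
    output: float


def estimate_cost(input_tokens: int, cached_input_tokens: int,
                  output_tokens: int, rates: Rates) -> float:
    # Assumption: input_tokens already includes the cached tokens.
    uncached = max(input_tokens - cached_input_tokens, 0)
    total = (uncached * rates.input
             + cached_input_tokens * rates.cached_input
             + output_tokens * rates.output)
    return total / 1_000_000
```

A live estimate would call this with the running token totals after each model response; the final figure uses the totals at job completion. Real rates must come from the provider's current price list; none are given here.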
Persistence
- SQLite DB: screenjob.db
- Runs/artifacts: screenjob_runs/run_YYYYMMDD_HHMMSS/...
- Full event log per job (for history and UI)
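The run directory naming pattern maps directly onto a strftime format string. In this sketch `run_dir` is a hypothetical helper, and whether the timestamp is local time or UTC is not stated in the README.

```python
from datetime import datetime
from pathlib import Path


def run_dir(base: Path, started_at: datetime) -> Path:
    # run_YYYYMMDD_HHMMSS, matching the pattern documented above.
    return base / started_at.strftime("run_%Y%m%d_%H%M%S")
```

For a job started at 2025-01-02 03:04:05, this yields a directory named run_20250102_030405 under screenjob_runs.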
Project Layout
main.py
screenjob.py
src/
__init__.py
agent.py
app_main.py
cli.py
config.py
models.py
pricing.py
runtime.py
safety.py
server.py
storage.py
task_manager.py
ui.py