# ScreenJob Skill (OpenClaw Agents)
## What ScreenJob Solves
ScreenJob lets an agent execute tasks that require a real desktop UI plus terminal access, with structured tool calls and job tracking.
## Main Features
- Screen perception (`see_screen`, `enhance`)
- Mouse/keyboard control (`click`, `type`, `press_key`)
- Terminal execution (`execute_command`, `sleep`)
- Structured completion payload (`task_complete(return=..., data=...)`)
- Automatic final verification screen capture on completion
- Safety gate, auth, history, and live monitoring
## Important Environment Note
ScreenJob runs on a separate computer (the human/operator machine), not inside the agent's own runtime environment.
## Why It Is Useful
Agents can use ScreenJob to launch and control GUI workflows, including orchestrating other GUI agents/tools on a human computer.
## Example Tasks
- Open amazon.de and buy a USB-C to USB-C cable for 10 EUR or less.
- Open google.com, go to my account, and change my profile picture to a provided image URL.
- Run `ls -a` in `C:/Users/username/Documents` and return the output in `data`.
## Practical Usage
1. Submit a job via the CLI or the API.
2. The agent runs its tool loop until the job finishes.
3. Read the final `response.return`, `response.data`, and `verification` fields from the job status.
Keyboard combo rule:
- For shortcuts, use one `press_key` call with combo syntax, for example: `win+r`, `ctrl+shift+esc`.
- Do not split modifier combos into separate calls.
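The combo rule can be illustrated with a small helper that builds the single string passed to one `press_key` call. This is a minimal sketch: `key_combo` is a hypothetical name, not part of ScreenJob, and lower-casing the key names is an assumption based on the lowercase examples above.

```python
def key_combo(*keys: str) -> str:
    """Join key names into one combo string, e.g. ("win", "r") -> "win+r".

    Hypothetical helper; ScreenJob itself only needs the final string.
    """
    if not keys:
        raise ValueError("at least one key is required")
    # Assumption: key names are written in lowercase, as in the examples above.
    cleaned = [k.strip().lower() for k in keys]
    if any(not k or "+" in k for k in cleaned):
        raise ValueError("key names must be non-empty and must not contain '+'")
    return "+".join(cleaned)


# One string for one press_key call, instead of separate calls per modifier.
run_dialog = key_combo("win", "r")
task_manager = key_combo("ctrl", "shift", "esc")
```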
## API Quick Reference
Base URL:
- `http://127.0.0.1:8787` (default)
Auth (required on all `/api/*` routes):
- `Authorization: Bearer <SCREENJOB_TOKEN>`
- or `X-ScreenJob-Token: <SCREENJOB_TOKEN>`
Create a job:
- `POST /api/jobs`
- Body:
```json
{
  "job": "Open amazon.de and go to my orders",
  "model": "gpt-5.4-mini",
  "disabled_tools": [],
  "safety_override": false
}
```
- Response:
```json
{ "job_id": "job_..." }
```
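A job submission could look roughly like the following, using only the Python standard library. The function names are hypothetical, and sending a `Content-Type: application/json` header is an assumption; the endpoint, body fields, default base URL, and `Authorization: Bearer` header come from this document.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8787"  # default base URL listed above


def build_create_job_request(job: str, token: str,
                             model: str = "gpt-5.4-mini",
                             base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST /api/jobs request."""
    body = {
        "job": job,
        "model": model,
        "disabled_tools": [],
        "safety_override": False,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: the server expects a JSON content type.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_job(job: str, token: str) -> str:
    """Send the request and return the job_id from the response."""
    request = build_create_job_request(job, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["job_id"]
```

Splitting request construction from sending keeps the request easy to inspect or log before anything reaches the operator machine.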
Check progress/result, list or cancel jobs, and read stats:
- `GET /api/jobs/{job_id}`
- `GET /api/jobs/{job_id}/status`
- `GET /api/jobs/{job_id}/events`
- `GET /api/jobs`
- `POST /api/jobs/{job_id}/cancel`
- `GET /api/stats`
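Waiting for a job to finish can be sketched as a polling loop around the status endpoint. This is an illustration only: `wait_for_job` is a hypothetical helper, and apart from `"completed"` the terminal status names (`"failed"`, `"cancelled"`) are assumptions, since this document only shows `"completed"`.

```python
import time
from typing import Callable

# Assumption: only "completed" is confirmed by this document.
ASSUMED_TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled"})


def wait_for_job(fetch_status: Callable[[], dict],
                 poll_seconds: float = 2.0,
                 timeout_seconds: float = 600.0,
                 terminal=ASSUMED_TERMINAL_STATUSES,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status() until the job reaches a terminal status.

    fetch_status is any callable returning the parsed JSON payload of
    GET /api/jobs/{job_id}/status (with the auth header applied).
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        payload = fetch_status()
        if payload.get("status") in terminal:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not reach a terminal status in time")
        sleep(poll_seconds)
```

Passing the fetch function in as a parameter keeps the loop independent of the HTTP client and makes it straightforward to exercise without a running ScreenJob server.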
Result contract in the job payload (`return`, `data`, and `verification` appear both under `response` and at the top level):
```json
{
  "status": "completed",
  "response": {
    "return": "Task completed successfully",
    "data": "file1.txt\nfile2.txt",
    "verification": {
      "ok": true,
      "path": "C:/.../screens/screen_final_verification_step_006.png"
    }
  },
  "return": "Task completed successfully",
  "data": "file1.txt\nfile2.txt",
  "verification": {
    "ok": true,
    "path": "C:/.../screens/screen_final_verification_step_006.png"
  }
}
```
```
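A client might read this payload along the following lines. The `JobResult` type and `parse_job_result` function are hypothetical names; preferring the nested `response` fields and falling back to the top-level copies is a design assumption, not documented behavior.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class JobResult:
    status: str
    return_message: Optional[str]
    data: Any
    verification_ok: Optional[bool]
    verification_path: Optional[str]


def parse_job_result(payload: dict) -> JobResult:
    """Extract the result fields from a job payload."""
    response = payload.get("response") or {}
    # Assumption: prefer the nested copy, fall back to the top-level one.
    verification = (response.get("verification")
                    or payload.get("verification")
                    or {})
    return JobResult(
        status=payload.get("status", ""),
        return_message=response.get("return", payload.get("return")),
        data=response.get("data", payload.get("data")),
        verification_ok=verification.get("ok"),
        verification_path=verification.get("path"),
    )
```

Checking `verification_ok` before trusting `return_message` gives the caller a second signal beyond the agent's own success claim.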
Artifacts (screenshots/enhanced images):
- `GET /api/jobs/{job_id}/artifact?path=<absolute_artifact_path>&token=<SCREENJOB_TOKEN>`
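Because the artifact path is an absolute filesystem path, it generally needs URL-encoding before it goes into the query string. A minimal sketch, assuming the server accepts standard percent-encoded query parameters; `artifact_url` is a hypothetical helper name.

```python
from urllib.parse import quote, urlencode


def artifact_url(job_id: str, artifact_path: str, token: str,
                 base_url: str = "http://127.0.0.1:8787") -> str:
    """Build the artifact download URL with an encoded path and token."""
    # urlencode percent-encodes ':' and '/' and encodes spaces as '+'.
    query = urlencode({"path": artifact_path, "token": token})
    return f"{base_url}/api/jobs/{quote(job_id, safe='')}/artifact?{query}"
```

The `verification.path` value from the job payload is the natural input for `artifact_path`. Since the token travels in the URL here, avoid pasting such URLs into logs or shared channels.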