ScreenJob Skill (OpenClaw Agents)
What ScreenJob Solves
ScreenJob lets an agent execute tasks that require a real desktop UI plus terminal access, with structured tool calls and job tracking.
Main Features
- Screen perception (see_screen, enhance)
- Mouse/keyboard control (click, type, press_key)
- Terminal execution (execute_command, sleep)
- Structured completion payload (task_complete(return=..., data=...))
- Safety gate, auth, history, and live monitoring
Important Environment Note
ScreenJob runs on a separate computer (the human/operator machine), not inside the agent's own runtime environment.
Why It Is Useful
Agents can use ScreenJob to launch and control GUI workflows, including orchestrating other GUI agents/tools on a human computer.
Example Tasks
- Open amazon.de and buy a USB-C to USB-C cable for 10 EUR or less.
- Open google.com, go to my account, and change my profile picture to a provided image URL.
- Run ls -a in C:/Users/username/Documents and return the output in data.
Practical Usage
- Submit job via CLI or API.
- Agent performs tool loop.
- Read final response.return and response.data from job status.
Keyboard combo rule:
- For shortcuts, use one press_key call with combo syntax, for example: win+r, ctrl+shift+esc.
- Do not split modifier combos into separate calls.
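The combo rule above can be illustrated with a small helper that joins keys into one string, so a shortcut always goes out as a single press_key argument. This is only a sketch: the combo function is hypothetical (not part of ScreenJob), and lowercasing the key names is an assumption based on the lowercase examples above.

```python
def combo(*keys: str) -> str:
    """Join key names into one combo string, e.g. combo("win", "r") gives "win+r".

    Hypothetical helper; lowercasing is an assumption drawn from the
    documented examples (win+r, ctrl+shift+esc).
    """
    cleaned = [key.strip().lower() for key in keys if key.strip()]
    if not cleaned:
        raise ValueError("at least one key is required")
    # One string means one press_key call, never one call per modifier.
    return "+".join(cleaned)
```

For example, combo("ctrl", "shift", "esc") yields the single argument ctrl+shift+esc instead of three separate key presses.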
Verification rule:
- Before task_complete, verify actual on-screen content matches the expected outcome.
- Use see_screen (and enhance if needed) for this check.
- Include a concise observed_result in data when completing the task.
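As a rough illustration of the verification rule, the arguments for task_complete could be assembled so that an empty observed_result is rejected before completion. The completion_args helper is hypothetical, and passing data as an object with an observed_result key is an assumption; the result contract in this document only shows data as a string.

```python
def completion_args(summary: str, observed_result: str, **extra: object) -> dict:
    """Build arguments for task_complete(return=..., data=...).

    Hypothetical helper. Assumes data may be an object that carries
    observed_result next to any other fields.
    """
    observed = observed_result.strip()
    if not observed:
        # Refuse to complete without a note of what was actually seen on screen.
        raise ValueError("observed_result must describe the verified on-screen state")
    return {"return": summary, "data": {"observed_result": observed, **extra}}
```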
API Quick Reference
Base URL:
- http://127.0.0.1:8787 (default)
Auth (required on all /api/* routes):
- Authorization: Bearer <SCREENJOB_TOKEN>
- or X-ScreenJob-Token: <SCREENJOB_TOKEN>
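The two header forms can be produced by one small function. This is a minimal sketch; auth_headers and its use_bearer flag are illustrative names, not part of ScreenJob.

```python
def auth_headers(token: str, use_bearer: bool = True) -> dict:
    """Return one of the two documented auth header forms for /api/* routes."""
    if not token:
        raise ValueError("a ScreenJob token is required on all /api/* routes")
    if use_bearer:
        return {"Authorization": f"Bearer {token}"}
    return {"X-ScreenJob-Token": token}
```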
Create a job:
- POST /api/jobs
- Body:
  {
    "job": "Open amazon.de and go to my orders",
    "model": "gpt-5.4-mini",
    "disabled_tools": [],
    "safety_override": false
  }
- Response:
  { "job_id": "job_..." }
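A job submission might look like the following sketch, which uses only the Python standard library. The function names are hypothetical, and sending a Content-Type: application/json header is an assumption; the body fields, route, and Bearer header come from the reference above.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8787"  # documented default


def build_create_job_request(job: str, token: str, model: str = "gpt-5.4-mini",
                             base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST /api/jobs request."""
    body = {
        "job": job,
        "model": model,
        "disabled_tools": [],
        "safety_override": False,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumed; not stated in this document
        },
        method="POST",
    )


def create_job(job: str, token: str) -> str:
    """Send the request and return the job_id field of the JSON response."""
    with urllib.request.urlopen(build_create_job_request(job, token)) as resp:
        return json.load(resp)["job_id"]
```

Keeping request construction separate from sending makes the payload easy to inspect without a running ScreenJob server.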
Check progress/result:
- GET /api/jobs/{job_id}
- GET /api/jobs/{job_id}/status
- GET /api/jobs/{job_id}/events
- GET /api/jobs
- POST /api/jobs/{job_id}/cancel
- GET /api/stats
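One way to wait for a result is to poll the status route until the job reaches a terminal state. The sketch below takes the fetch step as a callable (for instance, a function that performs GET /api/jobs/{job_id}/status and returns the decoded JSON), so the loop does not depend on any HTTP client. Only the "completed" status appears in this document; "failed" and "cancelled" are assumed names, and wait_for_job is a hypothetical helper.

```python
import time
from typing import Callable

# Only "completed" is documented; the other two names are assumptions.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(fetch_status: Callable[[], dict],
                 poll_seconds: float = 2.0,
                 timeout_seconds: float = 300.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status until the payload reports a terminal status."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        payload = fetch_status()
        if payload.get("status") in TERMINAL_STATUSES:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not reach a terminal status in time")
        sleep(poll_seconds)
```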
Result contract in job payload:
{
  "status": "completed",
  "response": {
    "return": "Task completed successfully",
    "data": "file1.txt\nfile2.txt"
  },
  "return": "Task completed successfully",
  "data": "file1.txt\nfile2.txt"
}
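Since the payload carries return and data both inside response and at the top level, a reader could prefer the nested values and fall back to the top-level ones. This is a sketch under that assumption; extract_result is a hypothetical name, and the fallback order is a design choice rather than documented behavior.

```python
from typing import Any, Optional, Tuple


def extract_result(payload: dict) -> Tuple[Optional[str], Any]:
    """Return (return, data), preferring payload["response"] when present."""
    response = payload.get("response") or {}
    result_text = response.get("return", payload.get("return"))
    result_data = response.get("data", payload.get("data"))
    return result_text, result_data
```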
Artifacts (screenshots/enhanced images):
- GET /api/jobs/{job_id}/artifact?path=<absolute_artifact_path>&token=<SCREENJOB_TOKEN>
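Because the artifact path is absolute and may contain colons, slashes, or spaces, it needs URL-encoding before it goes into the query string. A minimal sketch, assuming the default base URL; artifact_url is a hypothetical helper name.

```python
from urllib.parse import quote, urlencode


def artifact_url(job_id: str, artifact_path: str, token: str,
                 base_url: str = "http://127.0.0.1:8787") -> str:
    """Build the artifact download URL with path and token safely encoded."""
    query = urlencode({"path": artifact_path, "token": token})
    return f"{base_url}/api/jobs/{quote(job_id, safe='')}/artifact?{query}"
```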