This repository has been archived on 2026-05-20. You can view files and clone it. You cannot open issues or pull requests or push a commit.
Files
clickthrough/README.md
Paul Wähner 1c03cab457
All checks were successful
python-syntax / syntax-check (push) Successful in 6s
refactor: simplify to see/interact/exec and split server modules
2026-05-03 20:07:12 +02:00

38 lines
1.7 KiB
Markdown

# Clickthrough
Clickthrough is a lightweight HTTP control layer that lets an AI safely operate a real computer by repeatedly capturing structured screenshots with coordinate-aware grids (`see`), executing precise mouse/keyboard actions from those coordinates (`interact`), and optionally running authenticated shell commands for system-level tasks (`exec`) under a consistent response contract.
## Core Methods
- `POST /see`: Capture a full screen or region, optionally with a click-ready grid overlay.
- `POST /see/zoom`: Capture a tighter crop around a point and draw a denser grid for precise targeting.
- `POST /interact`: Perform one mouse or keyboard action (`click`, `scroll`, `type`, `hotkey`, etc.).
- `POST /exec`: Run PowerShell/Bash/CMD commands when shell-level control is needed.
## Why this works for AI agents
- Agents do not need live vision; they iterate on snapshots.
- Grid metadata bridges image understanding to deterministic click coordinates.
- Interaction stays explicit and auditable (one action per request).
- A unified response envelope (`ok`, `data`, `error`) reduces agent-side branching.
## Minimal Agent Loop
1. Call `see` with a coarse grid.
2. If uncertain, call `see/zoom` with a denser grid.
3. Call `interact` once.
4. Call `see` again to verify state change.
5. Use `exec` only for explicit shell/system tasks.
## Safety and Auth
- `x-clickthrough-token` protects API access when enabled.
- `x-clickthrough-exec-secret` is required for `/exec`.
- Optional dry-run and allowed-region constraints reduce accidental risk.
## Docs
- API: `docs/API.md`
- Agent procedure: `skill/SKILL.md`
- Coordinate system details: `docs/coordinate-system.md`