Archived

This repository has been archived on 2026-05-20. You can view files and clone it. You cannot open issues or pull requests or push a commit.

Space-Banane 22ca0097d1

Remove interact verify endpoint

2026-05-04 15:59:43 +02:00

name: clickthrough-http-control
description: Use 3 methods to control a computer: see (screenshot+grid), interact (mouse/keyboard), and exec (shell).

Clickthrough Computer Control

Use these methods:

see
interact
exec

Method 1: See

Use POST /see to capture the full screen or a region with a grid overlay.
Use POST /see/zoom to capture a tighter crop with a denser grid.
Use POST /see with ocr=true when text localization is needed.
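
A minimal sketch of how a client might build these requests. Only the paths (/see, /see/zoom) and the ocr=true option come from the text above. The base URL, the JSON body layout, and the field names `grid`, `rows`, `cols`, and `region` are assumptions for illustration, so check the server's actual schema before relying on them.

```python
import json
import urllib.request

# Hypothetical address; replace with the real server's host and port.
BASE_URL = "http://127.0.0.1:8000"


def build_see_request(region=None, grid=(12, 12), ocr=False, zoom=False):
    """Build (but do not send) a POST request for /see or /see/zoom."""
    path = "/see/zoom" if zoom else "/see"
    body = {
        # Assumed field names, not a documented schema.
        "grid": {"rows": grid[0], "cols": grid[1]},
        "ocr": ocr,
    }
    if region is not None:
        body["region"] = region
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Coarse full-screen capture with OCR, then a denser zoomed crop.
coarse = build_see_request(ocr=True)
zoomed = build_see_request(
    region={"x": 100, "y": 200, "width": 400, "height": 300},
    grid=(20, 20),
    zoom=True,
)
```

Passing either request to `urllib.request.urlopen` would send it. The sketch stops short of that so it can be inspected without a running server.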

Rules:

Start with coarse grid (12x12).
For precision, zoom and use denser grid (20x20 or higher).
Always use returned meta.region and meta.grid when computing click targets.
Coordinates are global desktop coordinates.
OCR results are in data.meta.ocr and include confidence, bbox, and center.
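
The rules above can be illustrated with a small sketch that turns a grid cell into a global desktop coordinate and picks an OCR hit. It assumes meta.region has `x`, `y`, `width`, `height` keys, meta.grid has `rows` and `cols`, cells are indexed from zero, each OCR item carries a `text` key, and confidence is on a 0 to 1 scale. None of these shapes is confirmed by this document; only `confidence`, `bbox`, and `center` are named in it.

```python
def cell_center(region, grid, row, col):
    """Return the global (x, y) of the center of a zero-indexed grid cell."""
    cell_w = region["width"] / grid["cols"]
    cell_h = region["height"] / grid["rows"]
    x = region["x"] + (col + 0.5) * cell_w
    y = region["y"] + (row + 0.5) * cell_h
    return round(x), round(y)


def best_ocr_match(ocr_items, text, min_confidence=0.8):
    """Return the most confident OCR item whose text matches, or None."""
    wanted = text.strip().lower()
    matches = [
        item for item in ocr_items
        if item.get("text", "").strip().lower() == wanted
        and item.get("confidence", 0.0) >= min_confidence
    ]
    return max(matches, key=lambda item: item["confidence"], default=None)


region = {"x": 100, "y": 200, "width": 1200, "height": 600}
grid = {"rows": 12, "cols": 12}
print(cell_center(region, grid, 0, 0))  # -> (150, 225)
```

Because the region offset is added in, the result is already a global coordinate and needs no further translation when the capture was a crop.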

Method 2: Interact

Use POST /interact for one action at a time.

Mouse actions:

move, click, right_click, double_click, middle_click, scroll
click_text (OCR-driven click; optionally scope with click_text.region)

Keyboard actions:

type, hotkey

Rules:

Prefer grid targets derived from fresh see or see/zoom captures.
For text buttons/labels, prefer click_text and bound OCR with a region when possible.
Use pixel targets only when you already have reliable coordinates.
After each important action, call see again before continuing.
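
As a rough sketch of "one action at a time", the helper below builds a single /interact body and rejects anything outside the action names listed above. The body layout (an `action` key plus per-action parameters, with `click_text` as a nested object holding `text` and `region`) is an assumption inferred from the name click_text.region, not a confirmed schema.

```python
MOUSE_ACTIONS = {
    "move", "click", "right_click", "double_click",
    "middle_click", "scroll", "click_text",
}
KEYBOARD_ACTIONS = {"type", "hotkey"}


def build_interact_body(action, **params):
    """Build the JSON body for exactly one /interact action."""
    if action not in MOUSE_ACTIONS | KEYBOARD_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    # Assumed layout: the action name plus its parameters at the top level.
    return {"action": action, **params}


# OCR-driven click, with OCR bounded to a region of the screen.
body = build_interact_body(
    "click_text",
    click_text={
        "text": "Save",
        "region": {"x": 0, "y": 0, "width": 800, "height": 200},
    },
)
```

Validating the action name on the client side catches typos before a request is sent, which matters when every action is followed by a fresh capture.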

Method 3: Exec

Use POST /exec only for shell/system tasks.

Rules:

Requires x-clickthrough-exec-secret.
Do not use exec for normal clicking/typing flows.
Prefer GUI interaction first; exec is fallback or explicit shell task.
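
A hedged sketch of an /exec request follows. The header name x-clickthrough-exec-secret is taken from the rule above. The base URL and the `command` body field are hypothetical, and the secret is passed in by the caller rather than hard-coded.

```python
import json
import urllib.request


def build_exec_request(command, secret, base_url="http://127.0.0.1:8000"):
    """Build (but do not send) a POST /exec request carrying the secret header."""
    if not secret:
        raise ValueError("exec requires a non-empty secret")
    return urllib.request.Request(
        base_url + "/exec",
        # "command" is an assumed field name, not a documented one.
        data=json.dumps({"command": command}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-clickthrough-exec-secret": secret,
        },
        method="POST",
    )


req = build_exec_request("uname -a", secret="example-secret")
```

Failing early on a missing secret keeps a misconfigured client from sending shell commands that the server would reject anyway.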

Lightweight Procedure

see capture.
If needed, see/zoom refine.
interact one step (click_text for text UI targets).
see verify.
Repeat.
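
The procedure above can be sketched as a loop over injected callables, so the control flow is visible without committing to any HTTP details. The names `see`, `zoom`, `interact`, `verify`, and `needs_zoom` are hypothetical helpers that a caller would supply; they are not part of the service.

```python
def run_procedure(steps, see, interact, verify, needs_zoom=None, zoom=None):
    """Run see -> (zoom) -> interact -> see/verify for each step.

    Returns False as soon as a step fails verification, True otherwise.
    """
    for step in steps:
        capture = see()  # fresh capture before every action
        if needs_zoom is not None and zoom is not None and needs_zoom(step, capture):
            capture = zoom(step, capture)  # refine only when needed
        interact(step, capture)  # exactly one action per iteration
        if not verify(step, see()):  # re-capture to confirm the result
            return False
    return True
```

Stopping at the first failed verification avoids stacking further actions on top of an unexpected screen state.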

Quick Safety Rules

Never click with stale screenshots.
Never send multiple uncertain clicks in a row.
If localization is ambiguous, re-capture with a tighter zoom.
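
One possible way to encode these rules is a small decision function that runs before any click. The thresholds (`max_age_s`, `min_confidence`) and the idea of measuring capture age in seconds are assumptions chosen for illustration; the document does not define what counts as stale or ambiguous.

```python
def choose_next_step(capture_age_s, candidates, max_age_s=2.0, min_confidence=0.8):
    """Return "see", "zoom", or "click" based on freshness and ambiguity."""
    if capture_age_s > max_age_s:
        return "see"  # stale screenshot: capture again before acting
    confident = [c for c in candidates if c["confidence"] >= min_confidence]
    if len(confident) != 1:
        return "zoom"  # zero or several plausible targets: refine first
    return "click"


print(choose_next_step(0.4, [{"confidence": 0.93}]))  # -> click
```

Treating both "no confident match" and "several confident matches" as ambiguous means the client re-captures instead of sending an uncertain click.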
