feat(verify): add compound action+verify flows

2026-05-01 16:26:57 +02:00
parent 02bf069425
commit c66779d929
4 changed files with 111 additions and 4 deletions

@@ -420,6 +420,43 @@ Notes:
- Window waits build on the structured window discovery endpoint.
- Visual waits compare repeated captures of either the full selected display or an explicit region.
## `POST /action/verify`
Execute one action and wait for a structured success condition.
Query params:
- `screen` (int, default `0`)
```json
{
  "action": {
    "action": "click",
    "target": {"mode": "pixel", "x": 1300, "y": 740}
  },
  "condition": {
    "kind": "text",
    "mode": "screen",
    "text": "Settings",
    "match": "contains",
    "present": true,
    "language_hint": "eng",
    "min_confidence": 0.4
  },
  "retries": 1,
  "timeout_ms": 4000,
  "poll_interval_ms": 250,
  "retry_delay_ms": 250
}
```
Condition kinds mirror `POST /wait`:
- `text`
- `window`
- `visual`
The response returns per-attempt action output plus structured verification output.
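A minimal client-side sketch of how a caller might assemble this request, using only the Python standard library. The helper name `build_action_verify_request` and the base URL `http://localhost:8000` are assumptions for illustration and are not defined by this document. The field names and default values are copied from the example body above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_action_verify_request(base_url, action, condition, screen=0,
                                retries=1, timeout_ms=4000,
                                poll_interval_ms=250, retry_delay_ms=250):
    """Build (but do not send) a POST /action/verify request.

    Hypothetical helper: the defaults mirror the example body in this
    section, not documented server-side defaults.
    """
    body = {
        "action": action,
        "condition": condition,
        "retries": retries,
        "timeout_ms": timeout_ms,
        "poll_interval_ms": poll_interval_ms,
        "retry_delay_ms": retry_delay_ms,
    }
    # `screen` is passed as a query parameter, not in the JSON body.
    query = urlencode({"screen": screen})
    url = f"{base_url.rstrip('/')}/action/verify?{query}"
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_action_verify_request(
    "http://localhost:8000",  # assumed base URL; adjust to the real server
    action={"action": "click",
            "target": {"mode": "pixel", "x": 1300, "y": 740}},
    condition={"kind": "text", "mode": "screen", "text": "Settings",
               "match": "contains", "present": True,
               "language_hint": "eng", "min_confidence": 0.4},
)
```

Passing `request` to `urllib.request.urlopen` would send it. The response schema is not specified in this section, so the sketch stops before parsing it.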
## `POST /ocr`
Extract visible text from either a full screenshot, a region crop, or caller-provided image bytes.