feat(vision): add screenshot diff and stability helpers

2026-05-01 16:24:46 +02:00
parent f00c525721
commit 02bf069425
4 changed files with 206 additions and 0 deletions


@@ -54,6 +54,8 @@ Say what you actually have: screenshots, OCR output, and fresh verification capt
 - `POST /windows/action` → focus/restore/minimize/maximize/close a matched window
 - `POST /launch` → start an app/process without dropping to a shell
 - `POST /wait?screen=0` → wait for text, window, or visual state changes
+- `POST /vision/diff?screen=0` → compare screenshots or regions for meaningful visual change
+- `POST /vision/stability?screen=0` → measure short-interval visual stability
 - `POST /ocr` → text extraction with bounding boxes from full screen, region, or provided image bytes
 - `POST /ocr/find?screen=0` → search OCR output for matching text candidates
 - `POST /action?screen=0` → single interaction (`move`, `click`, `scroll`, `type`, `hotkey`, ...)
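
The two vision endpoints listed above are described only by their purpose, so the following is a minimal sketch of what "meaningful visual change" and "short-interval visual stability" could mean. It is not this commit's implementation. The names `changed_ratio` and `is_stable`, the grayscale list-of-rows frame format, and both thresholds are assumptions made for illustration.

```python
from typing import Sequence

# Hypothetical frame format: rows of grayscale pixel values in 0..255.
Frame = Sequence[Sequence[int]]


def changed_ratio(a: Frame, b: Frame, pixel_threshold: int = 16) -> float:
    """Return the fraction of pixels that differ by more than pixel_threshold."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("frames must have identical dimensions")
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    changed = sum(
        1
        for ra, rb in zip(a, b)
        for pa, pb in zip(ra, rb)
        if abs(pa - pb) > pixel_threshold  # small deltas are treated as noise
    )
    return changed / total


def is_stable(frames: Sequence[Frame], ratio_threshold: float = 0.01) -> bool:
    """Return True when every consecutive pair of frames stays under ratio_threshold."""
    return all(
        changed_ratio(prev, cur) < ratio_threshold
        for prev, cur in zip(frames, frames[1:])
    )


if __name__ == "__main__":
    before = [[0, 0], [0, 0]]
    after = [[0, 255], [0, 0]]
    print(changed_ratio(before, after))
    print(is_stable([before, before, before]))
```

The per-pixel threshold ignores compression or antialiasing noise, and the ratio threshold decides how much of the screen must change before a diff counts. A stability check then reduces to repeating the same diff over captures taken a short interval apart.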