feat(window): add window lifecycle and launch endpoints
All checks were successful
python-syntax / syntax-check (push) Successful in 28s

2026-05-01 15:52:02 +02:00
parent 1429e90be2
commit 493e5499e8
4 changed files with 382 additions and 2 deletions


@@ -194,6 +194,89 @@ Move only:
}
```
## `GET /windows`
List desktop windows using structured filters instead of shelling out.
Query params:
- `title_contains` (optional substring match)
- `title_regex` (optional case-insensitive regex)
- `process_name` (optional exact process name, e.g. `explorer.exe`)
- `hwnd` (optional exact window handle)
- `visible_only` (bool, default `true`)
```json
{
"ok": true,
"count": 1,
"windows": [
{
"hwnd": 132640,
"title": "WinDirStat",
"class_name": "WinDirStatMainWindow",
"pid": 18420,
"process_name": "windirstat.exe",
"visible": true,
"enabled": true,
"minimized": false,
"maximized": false,
"foreground": true,
"rect": {"x": 194, "y": 116, "width": 1532, "height": 870}
}
]
}
```
Notes:
- Currently supported on Windows hosts only.
- Mutation endpoints that rely on these filters (such as `POST /windows/action`) return `409` when the match is ambiguous, i.e. when the request would affect more than one window.
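The query parameters above can be assembled on the client side roughly as follows. This is a minimal sketch, not part of the server: `BASE_URL` and the helper name `build_windows_url` are hypothetical, and sending booleans as lowercase `true`/`false` is an assumption about how the server parses `visible_only`.

```python
from urllib.parse import urlencode

# Hypothetical base URL; the real host and port depend on your deployment.
BASE_URL = "http://127.0.0.1:8000"

# Filter names taken from the "Query params" list above.
ALLOWED_FILTERS = {"title_contains", "title_regex", "process_name", "hwnd", "visible_only"}


def build_windows_url(base_url=BASE_URL, **filters):
    """Build a GET /windows URL from the documented filter names."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    params = {}
    for key, value in filters.items():
        if value is None:
            continue  # omit filters the caller did not set
        # Assumption: booleans are sent as lowercase "true"/"false".
        params[key] = str(value).lower() if isinstance(value, bool) else str(value)
    query = urlencode(params)
    return f"{base_url}/windows" + (f"?{query}" if query else "")


url = build_windows_url(process_name="windirstat.exe", visible_only=False)
```

The resulting URL can then be fetched with any HTTP client, for example `urllib.request.urlopen(url)` followed by `json.load` on the response.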
## `POST /windows/action`
Perform a structured window action against exactly one matched window.
```json
{
"action": "focus",
"title_contains": "WinDirStat",
"visible_only": true,
"timeout_ms": 3000
}
```
Supported actions:
- `focus`
- `restore`
- `minimize`
- `maximize`
- `close`
The response includes the matched pre-action window and the final observed window state (or `closed=true` if it disappeared).
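A client might build the request body and interpret the outcome along these lines. This is a sketch under stated assumptions: the helper names are hypothetical, and the only response details used are the ones documented here (the `409` status for ambiguous matches and the `closed` flag); any other response fields are not assumed.

```python
import json

# Actions listed under "Supported actions" above.
SUPPORTED_ACTIONS = {"focus", "restore", "minimize", "maximize", "close"}


def build_action_body(action, timeout_ms=3000, **filters):
    """Return the JSON body (as bytes) for POST /windows/action."""
    if action not in SUPPORTED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    payload = {"action": action, **filters, "timeout_ms": timeout_ms}
    return json.dumps(payload).encode("utf-8")


def summarize_action_result(status, body):
    """Classify a response using only the documented signals."""
    if status == 409:
        return "ambiguous"  # filters matched more than one window
    if body.get("closed"):
        return "closed"  # the window disappeared after the action
    return "ok"


body = build_action_body("focus", title_contains="WinDirStat", visible_only=True)
```

Validating the action name before sending keeps obviously bad requests off the wire; the server remains the authority on what it accepts.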
## `POST /launch`
Start an app/process without invoking a shell.
```json
{
"executable": "C:/Program Files/WinDirStat/WinDirStat.exe",
"args": [],
"cwd": "C:/Program Files/WinDirStat",
"wait_for_window": true,
"match": {
"title_contains": "WinDirStat",
"visible_only": true
},
"timeout_ms": 8000
}
```
Notes:
- Launch uses direct process execution (`subprocess.Popen`) rather than PowerShell/CMD.
- If `wait_for_window=true`, the server polls for a matching window and returns `window_found`.
- `dry_run=true` returns the resolved argv/cwd without launching.
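The behaviour described in these notes could be sketched as below. This is an illustration, not the server's actual code: the function names, the response keys (`argv`, `cwd`, `pid`, `dry_run`), and the polling interval are all assumptions, and `find_windows` stands in for whatever window lookup the server performs.

```python
import subprocess
import time


def launch(executable, args=(), cwd=None, dry_run=False):
    """Start a process directly (no shell), or just report what would run."""
    argv = [executable, *args]
    if dry_run:
        # Report the resolved command line without starting anything.
        return {"ok": True, "dry_run": True, "argv": argv, "cwd": cwd}
    # Passing a list with the default shell=False avoids PowerShell/CMD parsing.
    proc = subprocess.Popen(argv, cwd=cwd)
    return {"ok": True, "pid": proc.pid, "argv": argv, "cwd": cwd}


def wait_for_window(find_windows, timeout_ms, interval_s=0.1):
    """Poll find_windows() until it returns a match or the timeout expires."""
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        matches = find_windows()
        if matches:
            return matches[0]
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval_s)


preview = launch(
    "C:/Program Files/WinDirStat/WinDirStat.exe",
    cwd="C:/Program Files/WinDirStat",
    dry_run=True,
)
```

Because the arguments are passed as a list, spaces in the executable path need no quoting, which is one practical benefit of bypassing the shell.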
## `POST /ocr`
Extract visible text from a full screenshot, a region crop, or caller-provided image bytes.