# API Reference Base URL: `http://127.0.0.1:8123` Auth header when enabled: ```http x-clickthrough-token: ``` This API is intended for AI computer control through 3 methods only: - `see` - `interact` - `exec` All responses use one envelope. ## Response Envelope Success: ```json { "ok": true, "request_id": "...", "time_ms": 1710000000000, "data": {}, "error": null } ``` Error: ```json { "ok": false, "request_id": "...", "time_ms": 1710000000000, "data": null, "error": { "code": "validation_error", "message": "request validation failed", "details": [] } } ``` ## 1) See ### `POST /see` Capture a full screen or a region. Optional grid overlay returns coordinate metadata for click mapping. ```json { "screen": 0, "region_x": null, "region_y": null, "region_width": null, "region_height": null, "with_grid": true, "grid_rows": 12, "grid_cols": 12, "include_labels": true, "image_format": "png", "jpeg_quality": 85 } ``` Returns: - `data.image.base64` - `data.meta.region` (global desktop coords) - `data.meta.grid` (rows/cols/cell size + formula) ### `POST /see/zoom` Capture a tighter crop around a global point and draw another grid over that crop. ```json { "screen": 0, "center_x": 1200, "center_y": 720, "width": 500, "height": 350, "with_grid": true, "grid_rows": 20, "grid_cols": 20, "include_labels": true, "image_format": "png", "jpeg_quality": 90 } ``` Use this for precision before clicking tiny controls. ## 2) Interact ### `POST /interact` Mouse/keyboard action execution. ```json { "screen": 0, "action": { "action": "click", "target": { "mode": "grid", "region_x": 0, "region_y": 0, "region_width": 1920, "region_height": 1080, "rows": 12, "cols": 12, "row": 7, "col": 3, "dx": 0.0, "dy": 0.0 }, "button": "left", "clicks": 1 } } ``` Supported actions: - `move`, `click`, `right_click`, `double_click`, `middle_click` - `scroll` (`scroll_amount`) - `type` (`text`, `interval_ms`) - `hotkey` (`keys`) Target modes: - `pixel`: absolute global `x,y` - `grid`: grid cell from a `see`/`see/zoom` response ## 3) Exec ### `POST /exec` Run host shell commands (PowerShell/Bash/CMD). ```json { "command": "Get-Process | Select-Object -First 5", "shell": "powershell", "timeout_s": 20, "cwd": "C:/Users/Paul", "dry_run": false } ``` Required header: ```http x-clickthrough-exec-secret: ``` ## Minimal Procedure for Agents 1. `see` full screen with coarse grid. 2. If uncertain, `see/zoom` target area with denser grid. 3. `interact` one action. 4. `see` again to confirm state change. 5. Use `exec` only when GUI interaction is not the right tool.