# API Reference (v0.1) Base URL: `http://127.0.0.1:8123` If `CLICKTHROUGH_TOKEN` is set, include header: ```http x-clickthrough-token: ``` ## `GET /health` Returns status and runtime safety flags, including `exec` capability config. ## `GET /screen` Query params: - `with_grid` (bool, default `true`) - `grid_rows` (int, default env or `12`) - `grid_cols` (int, default env or `12`) - `include_labels` (bool, default `true`) - `image_format` (`png`|`jpeg`, default `png`) - `jpeg_quality` (1-100, default `85`) - `asImage` (bool, default `false`) — if `true`, return raw image bytes only (`image/png` or `image/jpeg`) Default response includes base64 image and metadata (`meta.region`, optional `meta.grid`). ## `POST /zoom` Body: ```json { "center_x": 1200, "center_y": 700, "width": 500, "height": 350, "with_grid": true, "grid_rows": 20, "grid_cols": 20, "include_labels": true, "image_format": "png", "jpeg_quality": 90 } ``` Query params: - `asImage` (bool, default `false`) — if `true`, return raw image bytes only (`image/png` or `image/jpeg`) Default response returns cropped image + region metadata in global pixel coordinates. ## `POST /action` Body: one action. ### Pointer target modes #### Pixel target ```json { "mode": "pixel", "x": 100, "y": 200, "dx": 0, "dy": 0 } ``` #### Grid target ```json { "mode": "grid", "region_x": 0, "region_y": 0, "region_width": 1920, "region_height": 1080, "rows": 12, "cols": 12, "row": 5, "col": 9, "dx": 0.0, "dy": 0.0 } ``` `dx`/`dy` are normalized offsets in `[-1, 1]` inside the selected cell. ### Action examples Click: ```json { "action": "click", "target": { "mode": "grid", "region_x": 0, "region_y": 0, "region_width": 1920, "region_height": 1080, "rows": 12, "cols": 12, "row": 7, "col": 3, "dx": 0.2, "dy": -0.1 }, "clicks": 1, "button": "left" } ``` Scroll: ```json { "action": "scroll", "target": {"mode": "pixel", "x": 1300, "y": 740}, "scroll_amount": -500 } ``` Type text: ```json { "action": "type", "text": "hello world", "interval_ms": 20 } ``` Hotkey: ```json { "action": "hotkey", "keys": ["ctrl", "l"] } ``` ## `POST /ocr` Extract visible text from either a full screenshot, a region crop, or caller-provided image bytes. Body: ```json { "mode": "screen", "language_hint": "eng", "min_confidence": 0.4 } ``` Modes: - `screen` (default): OCR over full captured monitor - `region`: OCR over explicit region (`region_x`, `region_y`, `region_width`, `region_height`) - `image`: OCR over provided `image_base64` (supports plain base64 or data URL) Region mode example: ```json { "mode": "region", "region_x": 220, "region_y": 160, "region_width": 900, "region_height": 400, "language_hint": "eng", "min_confidence": 0.5 } ``` Image mode example: ```json { "mode": "image", "image_base64": "iVBORw0KGgoAAAANSUhEUgAA...", "language_hint": "eng" } ``` Response shape: ```json { "ok": true, "request_id": "...", "time_ms": 1710000000000, "result": { "mode": "screen", "language_hint": "eng", "min_confidence": 0.4, "region": {"x": 0, "y": 0, "width": 1920, "height": 1080}, "blocks": [ { "text": "Settings", "confidence": 0.9821, "bbox": {"x": 144, "y": 92, "width": 96, "height": 21} } ] } } ``` Notes: - Output is deterministic JSON (stable ordering by top-to-bottom, then left-to-right). - `bbox` coordinates are in global screen space for `screen`/`region`, and image-local for `image`. - Requires `tesseract` executable plus Python package `pytesseract`. - If `tesseract` is not on `PATH`, set `CLICKTHROUGH_TESSERACT_CMD` to the full executable path. ## `POST /exec` Execute a shell command on the host running Clickthrough. Requirements: - `CLICKTHROUGH_EXEC_SECRET` must be configured on the server - send header `x-clickthrough-exec-secret: ` ```json { "command": "Get-Process | Select-Object -First 5", "shell": "powershell", "timeout_s": 20, "cwd": "C:/Users/Paul", "dry_run": false } ``` Notes: - `shell` supports `powershell`, `bash`, `cmd` - if `shell` is omitted, server uses `CLICKTHROUGH_EXEC_DEFAULT_SHELL` - output is truncated based on `CLICKTHROUGH_EXEC_MAX_OUTPUT_CHARS` - endpoint can be disabled with `CLICKTHROUGH_EXEC_ENABLED=false` - if `CLICKTHROUGH_EXEC_SECRET` is missing, `/exec` is blocked (`403`) Response includes `stdout`, `stderr`, `exit_code`, timeout state, and execution metadata. ## `POST /batch` Runs multiple `action` payloads sequentially. ```json { "actions": [ {"action": "move", "target": {"mode": "pixel", "x": 100, "y": 100}}, {"action": "click", "target": {"mode": "pixel", "x": 100, "y": 100}} ], "stop_on_error": true } ```