# Clickthrough

Let an Agent interact with your computer over HTTP, with grid-aware screenshots and precise input actions.

## What this provides

- **Visual endpoints**: full-screen capture with optional grid overlay and labeled cells (`asImage=true` can return raw image bytes)
- **Zoom endpoint**: crop around a point with denser grid for fine targeting (`asImage=true` supported)
- **Action endpoints**: move/click/right-click/double-click/middle-click/scroll/type/hotkey
- **Coordinate transform metadata** in visual responses so agents can map grid cells to real pixels
- **Safety knobs**: token auth, dry-run mode, optional allowed-region restriction

## Quick start

```bash
cd /root/external-projects/clickthrough
python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
CLICKTHROUGH_TOKEN=change-me python -m server.app
```

Server defaults to `127.0.0.1:8123`.

## Minimal API flow

1. `GET /screen` with grid
2. Decide cell / target
3. Optional `POST /zoom` for finer targeting
4. `POST /action` to execute
5. `GET /screen` again to verify result

See:

- `docs/API.md`
- `docs/coordinate-system.md`
- `skill/SKILL.md`

## Configuration

Environment variables:

- `CLICKTHROUGH_HOST` (default `127.0.0.1`)
- `CLICKTHROUGH_PORT` (default `8123`)
- `CLICKTHROUGH_TOKEN` (optional; if set, require `x-clickthrough-token` header)
- `CLICKTHROUGH_DRY_RUN` (`true`/`false`; default `false`)
- `CLICKTHROUGH_GRID_ROWS` (default `12`)
- `CLICKTHROUGH_GRID_COLS` (default `12`)
- `CLICKTHROUGH_ALLOWED_REGION` (optional `x,y,width,height`)

## Gitea CI

A Gitea Actions workflow is included at `.gitea/workflows/python-syntax.yml`.

It runs Python syntax checks (`py_compile`) on every push and pull request.
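
## Example: reading the configuration from a client

The environment variables listed under Configuration, together with the first step of the minimal API flow, could be mirrored on the client side roughly as follows. This is a minimal sketch, not the server's own parsing code: `ClientConfig`, `load_config`, and `screen_request` are hypothetical names introduced here for illustration, and the exact way the server interprets boolean or region values is an assumption. Only the variable names, defaults, the `x-clickthrough-token` header, and the `GET /screen` path come from this README. The request is built but not sent; see `docs/API.md` for the actual parameters and response shapes.

```python
import os
import urllib.request
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class ClientConfig:
    """Hypothetical client-side view of the documented CLICKTHROUGH_* settings."""
    host: str
    port: int
    token: Optional[str]
    dry_run: bool
    grid_rows: int
    grid_cols: int
    allowed_region: Optional[Tuple[int, int, int, int]]  # x, y, width, height

    @property
    def base_url(self) -> str:
        return f"http://{self.host}:{self.port}"


def load_config(env: Mapping[str, str] = os.environ) -> ClientConfig:
    """Read the documented variables, falling back to the README defaults."""
    region = None
    raw_region = env.get("CLICKTHROUGH_ALLOWED_REGION", "").strip()
    if raw_region:
        parts = [int(p) for p in raw_region.split(",")]
        if len(parts) != 4:
            raise ValueError("CLICKTHROUGH_ALLOWED_REGION must be x,y,width,height")
        region = (parts[0], parts[1], parts[2], parts[3])
    return ClientConfig(
        host=env.get("CLICKTHROUGH_HOST", "127.0.0.1"),
        port=int(env.get("CLICKTHROUGH_PORT", "8123")),
        # An unset or empty token is treated as "no auth" (assumption).
        token=env.get("CLICKTHROUGH_TOKEN") or None,
        # Assumption: only the literal string "true" enables dry-run mode.
        dry_run=env.get("CLICKTHROUGH_DRY_RUN", "false").strip().lower() == "true",
        grid_rows=int(env.get("CLICKTHROUGH_GRID_ROWS", "12")),
        grid_cols=int(env.get("CLICKTHROUGH_GRID_COLS", "12")),
        allowed_region=region,
    )


def screen_request(cfg: ClientConfig) -> urllib.request.Request:
    """Build (but do not send) a GET /screen request, adding the token header if set."""
    req = urllib.request.Request(f"{cfg.base_url}/screen", method="GET")
    if cfg.token:
        req.add_header("x-clickthrough-token", cfg.token)
    return req
```

To actually send the request, the returned object could be passed to `urllib.request.urlopen` once the server from the quick start is running. Keeping request construction separate from sending makes the header logic easy to check without a live server.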