feat(ocr): add /ocr endpoint for screen, region, and image input
docs/API.md
@@ -143,6 +143,77 @@ Hotkey:
}
```

## `POST /ocr`

Extract visible text from a full screenshot, a region crop, or caller-provided image bytes.

Body:

```json
{
  "mode": "screen",
  "language_hint": "eng",
  "min_confidence": 0.4
}
```

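A minimal client sketch for the request body above, using only the Python standard library. The base URL is a placeholder assumption (this section does not state a host or port), `build_ocr_request` is a hypothetical helper name, and any authentication the server may require is not shown.

```python
import json
import urllib.request

# Placeholder address: an assumption, not taken from this document.
BASE_URL = "http://127.0.0.1:8123"


def build_ocr_request(body, base_url=BASE_URL):
    """Build (but do not send) a POST /ocr request carrying a JSON body."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ocr",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_ocr_request(
    {"mode": "screen", "language_hint": "eng", "min_confidence": 0.4}
)

# Sending is left commented out so the sketch has no network side effects:
# with urllib.request.urlopen(request, timeout=30) as response:
#     payload = json.load(response)
```

Building the request separately from sending it keeps the payload easy to inspect before it goes over the wire.
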
Modes:
- `screen` (default): OCR over the full captured monitor
- `region`: OCR over an explicit region (`region_x`, `region_y`, `region_width`, `region_height`)
- `image`: OCR over the provided `image_base64` (plain base64 or a data URL)

Region mode example:

```json
{
  "mode": "region",
  "region_x": 220,
  "region_y": 160,
  "region_width": 900,
  "region_height": 400,
  "language_hint": "eng",
  "min_confidence": 0.5
}
```

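The mode-specific fields listed under Modes can be checked client-side before a request is sent. The sketch below is one possible pre-check, not the server's validation: `check_ocr_body` is a hypothetical helper, and the rule that a region needs a positive width and height is an assumption this document does not state.

```python
REGION_FIELDS = ("region_x", "region_y", "region_width", "region_height")
KNOWN_MODES = ("screen", "region", "image")


def check_ocr_body(body):
    """Return a list of problems found in a /ocr request body (empty if none)."""
    problems = []
    mode = body.get("mode", "screen")  # `screen` is the documented default
    if mode not in KNOWN_MODES:
        problems.append(f"unknown mode: {mode!r}")
    if mode == "region":
        missing = [name for name in REGION_FIELDS if name not in body]
        if missing:
            problems.append("missing region fields: " + ", ".join(missing))
        # Assumption: a region must have a positive size.
        for name in ("region_width", "region_height"):
            if name in body and body[name] <= 0:
                problems.append(f"{name} must be positive")
    if mode == "image" and not body.get("image_base64"):
        problems.append("image mode needs image_base64")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.
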
Image mode example:

```json
{
  "mode": "image",
  "image_base64": "iVBORw0KGgoAAAANSUhEUgAA...",
  "language_hint": "eng"
}
```

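The two accepted forms of `image_base64` (plain base64 or a data URL) can be produced from raw image bytes as sketched below. Both helper names are hypothetical, and `decode_image_field` only illustrates one plausible way to accept either form; how the server actually decodes the field is not shown in this document.

```python
import base64


def encode_image_bytes(raw, mime_type=None):
    """Return plain base64, or a data URL when a MIME type is given."""
    encoded = base64.b64encode(raw).decode("ascii")
    if mime_type is None:
        return encoded
    return f"data:{mime_type};base64,{encoded}"


def decode_image_field(value):
    """Decode either form (plain base64 or data URL) back to raw bytes."""
    if value.startswith("data:"):
        # Keep only the payload after the first comma of the data URL.
        _, _, value = value.partition(",")
    return base64.b64decode(value)


# The 8-byte PNG signature stands in for a real image file here.
png_signature = b"\x89PNG\r\n\x1a\n"
plain = encode_image_bytes(png_signature)
data_url = encode_image_bytes(png_signature, "image/png")
```

For a file on disk, `encode_image_bytes(open(path, "rb").read())` would give the value to place in `image_base64`.
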
Response shape:

```json
{
  "ok": true,
  "request_id": "...",
  "time_ms": 1710000000000,
  "result": {
    "mode": "screen",
    "language_hint": "eng",
    "min_confidence": 0.4,
    "region": {"x": 0, "y": 0, "width": 1920, "height": 1080},
    "blocks": [
      {
        "text": "Settings",
        "confidence": 0.9821,
        "bbox": {"x": 144, "y": 92, "width": 96, "height": 21}
      }
    ]
  }
}
```

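A short sketch of consuming the response shape above: it pulls the text of every block at or above a caller-chosen confidence. The sample is a trimmed copy of the documented response, `texts_above` is a hypothetical helper, and treating a response with `ok` set to false as "no text" is an assumption, since the error shape is not described here.

```python
import json

# Trimmed copy of the documented response; "request_id" is elided in the source.
SAMPLE_RESPONSE = json.loads("""
{
  "ok": true,
  "request_id": "...",
  "time_ms": 1710000000000,
  "result": {
    "blocks": [
      {"text": "Settings", "confidence": 0.9821,
       "bbox": {"x": 144, "y": 92, "width": 96, "height": 21}}
    ]
  }
}
""")


def texts_above(response, threshold):
    """Return block texts whose confidence is at least `threshold`."""
    if not response.get("ok"):
        return []  # Assumption: a failed response carries no usable blocks.
    blocks = response.get("result", {}).get("blocks", [])
    return [b["text"] for b in blocks if b["confidence"] >= threshold]
```
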
Notes:
- Output is deterministic JSON: blocks are ordered top-to-bottom, then left-to-right.
- `bbox` coordinates are in global screen space for `screen` and `region` modes, and image-local for `image` mode.
- Requires the `tesseract` executable and the Python package `pytesseract`.

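The ordering and coordinate notes above can be illustrated with a small sketch. It assumes the sort key is the top-left corner of each `bbox`, which this document does not confirm (a server could, for example, group blocks into lines with some tolerance). `to_global` shows how a crop-local box relates to global screen space by adding the region origin; both function names are hypothetical.

```python
def reading_order(blocks):
    """Sort blocks top-to-bottom, then left-to-right, by bbox origin."""
    return sorted(blocks, key=lambda b: (b["bbox"]["y"], b["bbox"]["x"]))


def to_global(bbox, region):
    """Shift a crop-local bbox by the region origin to get screen coordinates."""
    return {
        "x": bbox["x"] + region["x"],
        "y": bbox["y"] + region["y"],
        "width": bbox["width"],
        "height": bbox["height"],
    }


blocks = [
    {"text": "Cancel", "bbox": {"x": 300, "y": 92, "width": 80, "height": 21}},
    {"text": "Help", "bbox": {"x": 10, "y": 200, "width": 50, "height": 21}},
    {"text": "Settings", "bbox": {"x": 144, "y": 92, "width": 96, "height": 21}},
]
ordered_texts = [b["text"] for b in reading_order(blocks)]
```

Because `sorted` is stable and the key is fully determined by the coordinates, the same input always yields the same order, which is the property the first note describes.
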
## `POST /exec`

Execute a shell command on the host running Clickthrough.