Support multi-display screen selection
All checks were successful
python-syntax / syntax-check (push) Successful in 1m33s
@@ -1,6 +1,8 @@
 # Coordinate System
 
-All interactions ultimately execute in **global pixel coordinates** of the primary monitor.
+All interactions ultimately execute in **global desktop pixel coordinates**.
+
+Use `GET /displays` to list available displays. Visual endpoints accept `?screen=X` where `X` is a zero-based display index. `screen=0` is the primary display when detectable, falling back to the first monitor reported by the capture backend. Invalid screen values fall back to `0`.
 
 ## Regions
 
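The fallback rule for `?screen=X` added in this hunk can be sketched as a small helper. This is only an illustration, not the project's implementation: `resolve_screen_index` and its `display_count` parameter are hypothetical names, and treating missing, non-numeric, or out-of-range values as "invalid" is an assumption about what the doc means.

```python
def resolve_screen_index(raw_value, display_count):
    """Map a raw ?screen= query value to a zero-based display index.

    Hypothetical helper: a value that is missing, non-numeric, or outside
    0..display_count-1 is assumed to be invalid and falls back to 0.
    """
    try:
        index = int(raw_value)
    except (TypeError, ValueError):
        # Missing (None) or non-numeric input: use the default display.
        return 0
    if 0 <= index < display_count:
        return index
    # Out-of-range index: fall back to display 0 as the doc describes.
    return 0
```

With two displays attached, `"1"` would select the second display, while `"5"`, `"abc"`, or a missing value would all resolve to `0`.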
@@ -12,6 +14,12 @@ Visual endpoints return a `region` object:
 
 This describes where the image sits in global desktop space.
 
+For a second display to the right of the primary display, `GET /screen?screen=1` might return:
+
+```json
+{"x": 1920, "y": 0, "width": 1920, "height": 1080}
+```
+
 ## Grid indexing
 
 - Rows/cols are **zero-based**
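To make the `region` object concrete, the following sketch converts positions in a returned image into global desktop coordinates. The function names are hypothetical, and two things are assumed rather than taken from the doc: that the image covers the region uniformly (possibly scaled), and that the grid divides the region into equally sized cells.

```python
def image_to_global(region, px, py, image_width, image_height):
    """Convert an image pixel (px, py) to global desktop coordinates.

    Assumes the image covers `region` uniformly; if the image was scaled,
    the scale factor is recovered from the image size.
    """
    scale_x = region["width"] / image_width
    scale_y = region["height"] / image_height
    return (region["x"] + px * scale_x, region["y"] + py * scale_y)


def cell_center(region, row, col, rows, cols):
    """Return the global coordinates of the center of a zero-based grid cell.

    Assumes the grid splits the region into rows x cols equal cells.
    """
    cell_w = region["width"] / cols
    cell_h = region["height"] / rows
    return (region["x"] + (col + 0.5) * cell_w,
            region["y"] + (row + 0.5) * cell_h)
```

For the example region above, pixel `(100, 50)` in an unscaled 1920x1080 image maps to global `(2020, 50)`: the display's `x` offset of 1920 is added to the local coordinate.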
@@ -35,7 +43,7 @@ Interpretation:
 
 ## Recommended agent loop
 
-1. Capture `/screen` with coarse grid
+1. Capture `/screen?screen=0` with coarse grid, or choose another display with `/screen?screen=1`
 2. Find candidate cell
 3. If uncertain, use `/zoom` around candidate
 4. Convert target to grid action
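The four steps of the recommended loop could be wired together roughly as follows. Everything here is a sketch under stated assumptions: the four callables are hypothetical stand-ins for whatever client code talks to `/screen` and `/zoom`, the locator is assumed to return a `(cell, confidence)` pair, and the `min_confidence` threshold is an invented way to express "if uncertain".

```python
def run_agent_step(capture, find_candidate, zoom, to_grid_action,
                   screen=0, min_confidence=0.8):
    """Run one pass of the capture / locate / zoom / act loop.

    All four callables are hypothetical and supplied by the caller:
      capture(screen)            -> screenshot of the chosen display
      find_candidate(shot)       -> (cell, confidence)
      zoom(screen, cell)         -> closer screenshot around the cell
      to_grid_action(shot, cell) -> the action to execute
    """
    # 1. Capture the chosen display with a coarse grid.
    shot = capture(screen)
    # 2. Find a candidate cell.
    cell, confidence = find_candidate(shot)
    # 3. If uncertain, zoom around the candidate and locate again.
    if confidence < min_confidence:
        shot = zoom(screen, cell)
        cell, confidence = find_candidate(shot)
    # 4. Convert the target to a grid action.
    return to_grid_action(shot, cell)
```

Passing the callables in keeps the loop independent of any particular HTTP client and makes it easy to exercise with stubs.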