feat(ocr): add higher-level text search helpers on top of OCR #13

Closed
opened 2026-05-01 15:39:00 +02:00 by luna · 0 comments
Collaborator

Why

Raw OCR blocks are useful but noisy. Agents often need “find matching text and give me candidate targets,” not a raw dump.

Scope

Add a higher-level OCR/text helper endpoint or mode that can:

  • search visible text by exact match / substring / regex-like pattern
  • return confidence-sorted matches
  • optionally group adjacent words into lines
  • optionally constrain to a region
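
The capabilities listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual API: `Block`, `find_text`, `group_lines`, the `(x, y, w, h)` region format, and the `y_tol` tolerance are all hypothetical names and assumptions, and the raw OCR output is assumed to be a list of word-level blocks, each with a confidence and a bounding box.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

Region = Tuple[int, int, int, int]  # hypothetical format: (x, y, width, height)


@dataclass
class Block:
    """Hypothetical shape of one raw OCR block (a word or a merged line)."""
    text: str
    confidence: float
    x: int
    y: int
    w: int
    h: int


def _matches(text: str, query: str, mode: str) -> bool:
    if mode == "exact":
        return text == query
    if mode == "substring":
        return query.lower() in text.lower()  # case-insensitive by assumption
    if mode == "regex":
        return re.search(query, text) is not None
    raise ValueError(f"unknown mode: {mode}")


def _inside(b: Block, region: Region) -> bool:
    # A block counts only if its whole box lies within the region.
    rx, ry, rw, rh = region
    return (b.x >= rx and b.y >= ry
            and b.x + b.w <= rx + rw and b.y + b.h <= ry + rh)


def group_lines(blocks: List[Block], y_tol: int = 5) -> List[Block]:
    """Merge words whose top edges are within y_tol pixels into one line block."""
    rows: List[List[Block]] = []
    for b in sorted(blocks, key=lambda b: b.y):
        if rows and abs(b.y - rows[-1][0].y) <= y_tol:
            rows[-1].append(b)
        else:
            rows.append([b])
    merged = []
    for row in rows:
        row.sort(key=lambda b: b.x)  # left-to-right reading order
        x0 = min(b.x for b in row)
        y0 = min(b.y for b in row)
        x1 = max(b.x + b.w for b in row)
        y1 = max(b.y + b.h for b in row)
        merged.append(Block(
            text=" ".join(b.text for b in row),
            confidence=min(b.confidence for b in row),  # weakest word bounds the line
            x=x0, y=y0, w=x1 - x0, h=y1 - y0,
        ))
    return merged


def find_text(blocks: List[Block], query: str, mode: str = "substring",
              region: Optional[Region] = None, lines: bool = False) -> List[Block]:
    """Return matching blocks, highest confidence first."""
    if region is not None:
        blocks = [b for b in blocks if _inside(b, region)]
    if lines:
        blocks = group_lines(blocks)
    hits = [b for b in blocks if _matches(b.text, query, mode)]
    return sorted(hits, key=lambda b: b.confidence, reverse=True)
```

With a helper of this shape, an agent would call something like `find_text(blocks, "Save As", mode="exact", lines=True)` and use the bounding boxes of the returned blocks as candidate click targets. Taking the minimum word confidence for a merged line is one conservative choice; an average would be equally defensible.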

Done when

  • agents can ask for “find text X on screen” without implementing custom client-side OCR post-processing every time
luna closed this issue 2026-05-01 16:23:17 +02:00
Reference: space/clickthrough#13