From 8857feaf7b11150e23379462daec7d111a30677e Mon Sep 17 00:00:00 2001
From: Luna
Date: Fri, 1 May 2026 16:04:24 +0200
Subject: [PATCH] docs(playbooks): expand clickthrough interaction routines

---
 skill/SKILL.md | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/skill/SKILL.md b/skill/SKILL.md
index 6e5b70c..5facc45 100644
--- a/skill/SKILL.md
+++ b/skill/SKILL.md
@@ -253,6 +253,82 @@ Do not collapse those steps into fake certainty.
 
 Build per-app routines for repetitive tasks instead of generic clicking.
 
+### Launcher / search / start app playbook
+
+Use this when the goal is "open app X" or "bring up tool Y".
+
+1. check `GET /windows` first in case the app is already open
+2. if present, use `POST /windows/action` to focus or restore it
+3. if absent, prefer `POST /launch` when you know the executable path
+4. if launch path is unknown but the OS launcher/search UI is available, use a keyboard-first flow:
+   - open launcher (`win`, `cmd+space`, or app-specific shortcut depending on host)
+   - type exact app name
+   - wait for stable results with `POST /wait` or recapture
+   - verify the result text with OCR or the `image` tool
+   - press Enter or click the exact result once
+5. verify the app window now exists or is focused
+
+Do not keep relaunching if the window already exists; that’s sloppy.
+
+### Dialog confirmation playbook
+
+Use for modals like save/discard, delete confirmation, permission prompts, and installer dialogs.
+
+1. capture the dialog region with `POST /zoom`
+2. use OCR first for title/body/button labels
+3. if button hierarchy or emphasis matters, inspect the zoomed screenshot with the `image` tool
+4. identify the exact intended action (`Cancel`, `Save`, `Allow`, `Delete`, etc.)
+5. for destructive actions, require explicit user confirmation unless already requested
+6. click once and verify the dialog disappeared or changed state
+
+Good verification targets:
+- dialog title vanished
+- expected next window appeared
+- destructive side effect is visible and confirmed
+
+### File picker playbook
+
+Use for open/save dialogs.
+
+1. verify the file picker window is focused
+2. OCR the visible breadcrumb/path area, filename field, and button row
+3. prefer keyboard-first entry when possible:
+   - type or paste the target path/name into the focused field
+   - use `tab` / `shift+tab` to move predictably between filename and action buttons
+4. if the target path is uncertain, use OCR plus the `image` tool to identify the active field and selected folder/file row
+5. verify the intended filename/path is visible before confirming
+6. activate `Open` / `Save` once and verify the picker closes
+
+If the picker stays open, stop and inspect why instead of hammering Enter like a maniac.
+
+### Browser tab / window playbook
+
+Use for browser navigation, tab targeting, or web app recovery.
+
+1. use `GET /windows` to focus the correct browser window first
+2. prefer keyboard-first navigation:
+   - `ctrl+l` / `cmd+l` to focus the address bar
+   - `ctrl+tab` / `ctrl+shift+tab` for tab movement when order is known
+   - `ctrl+w` only for explicitly requested close actions
+3. verify tab or page identity with OCR on the tab strip or page heading
+4. if multiple similar tabs are open, zoom into the tab strip and use the `image` tool to distinguish active vs inactive tabs
+5. after navigation, wait for visual stability or expected text before taking the next action
+
+Do not assume a page loaded just because the click landed. Verify it.
+
+### Settings / preferences navigation playbook
+
+Use when the task involves toggles, dropdowns, sidebars, or nested settings panels.
+
+1. identify the current settings page with OCR on the heading/sidebar
+2. use OCR to find the specific section label before trying to toggle anything
+3. if the layout is dense, zoom into the relevant pane and use the `image` tool to distinguish labels from controls
+4. prefer small reversible actions: one toggle, one dropdown, one field edit at a time
+5. after each change, verify the control state changed visually or via visible text
+6. if a save/apply button exists, treat it as a separate confirmation step and verify completion
+
+Settings UIs love hiding side effects. Assume nothing.
+
 ### Spotify playbook
 
 - Focus app window before search/navigation.