refactor: remove verification field from responses and update related documentation

2026-05-27 21:23:40 +02:00
parent 48a145d147
commit 595375e1a7
9 changed files with 26 additions and 64 deletions
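
Note for consumers of the JSON output: after this change the top-level `verification` key is no longer emitted. Below is a minimal sketch of a reader that tolerates both the old and new payloads. It assumes the nested object is keyed `response` (as the `response.return` / `response.data` wording in the README suggests); the helper name `read_result` and the sample payload are hypothetical and not part of this repository.

```python
import json


def read_result(raw: str) -> dict:
    """Extract `return` and `data` from a job/CLI JSON payload.

    Hypothetical helper: prefers the nested `response` object and falls
    back to the top-level aliases. `verification` is treated as optional,
    since this commit removes it from responses.
    """
    payload = json.loads(raw)
    response = payload.get("response") or {}
    return {
        "return": response.get("return", payload.get("return")),
        "data": response.get("data", payload.get("data")),
        # None for payloads produced after this commit.
        "verification": payload.get("verification"),
    }


# Sample payload shaped like the README example after this change
# (assumed shape, built here only for illustration).
sample = json.dumps(
    {
        "response": {
            "return": "Task completed successfully",
            "data": "file1.txt\nfile2.txt",
        },
        "return": "Task completed successfully",
        "data": "file1.txt\nfile2.txt",
    }
)

result = read_result(sample)
```

Callers that previously relied on `verification.path` would need another source for that information, for example an `observed_result` summary inside `data` as the updated README guidance suggests.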
--- a/README.md
+++ b/README.md
@@ -12,7 +12,6 @@ It lets an LLM use controlled local tools (screen, click, type, shell) to comple
 - Returns structured agent output as:
  - `return`: human-readable completion message
  - `data`: structured payload (for example command output)
-  - `verification`: final screen-capture metadata for completion accuracy checks

 ## Core Features

@@ -94,11 +93,7 @@ CLI JSON output includes both legacy and structured fields:
    "data": "file1.txt\nfile2.txt"
  },
  "return": "Task completed successfully",
-  "data": "file1.txt\nfile2.txt",
-  "verification": {
-    "ok": true,
-    "path": "C:/.../screens/screen_final_verification_step_003.png"
-  }
+  "data": "file1.txt\nfile2.txt"
 }
 ```

@@ -154,7 +149,6 @@ Each job payload includes:
 - `response.return`
 - `response.data`
 - top-level `return` and `data` aliases
-- `verification` (final screenshot path + metadata)

 ### Monitoring UI

@@ -174,7 +168,7 @@ Each job payload includes:
 - Use `click` offsets via `offset_up/down/left/right` and optional `sleep_after_seconds`.
 - When done, call:
  - `task_complete(return="...", data=...)`
-- A final verification screen capture is always taken automatically on completion.
+- Before `task_complete`, verify expected on-screen content with `see_screen` (and `enhance` if needed), and include an `observed_result` summary in `data`.

 `data` should contain useful structured output for the requester (text, object, list, etc.).