screenjob/todo.md
Space-Banane cceed18cf1
feat: (literally) "enhance" functionality with new parameters and improved image processing
2026-05-27 22:14:32 +02:00

TODO

P0

  • [Bug] Enforce single active desktop-control run (or a strict queue) so concurrent jobs cannot fight over the same mouse/keyboard/screen session.
  • [Bug] Fix run artifact collisions in setup_artifacts() (run_id is second-granularity, so two jobs in the same second can share/overwrite the same directory).
  • [Bug] Remove global logger handler clobbering in setup_logger() (logging.getLogger("screenjob").handlers.clear() breaks concurrent runs and can redirect logs to the wrong file).
  • Make clicks more consistent and make more use of enhance images.
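
One possible shape for the three P0 bugs, as a minimal sketch rather than the actual fix: a run id with a random suffix, a per-run child logger instead of clearing the shared "screenjob" handlers, and a non-blocking lock for the desktop session. All function names below (new_run_id, make_run_dir, make_run_logger, try_start_run, finish_run) are hypothetical, as is the run.log filename, and the lock only protects a single process.

```python
import logging
import threading
import uuid
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical process-wide guard: only one desktop-control run at a time.
_desktop_lock = threading.Lock()


def new_run_id() -> str:
    """Second-granularity timestamp plus a random suffix, so same-second runs differ."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
    return f"{stamp}-{uuid.uuid4().hex[:8]}"


def make_run_dir(base: Path) -> Path:
    """Create a fresh artifact directory; exist_ok=False fails loudly on a collision."""
    run_dir = base / new_run_id()
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir


def make_run_logger(run_dir: Path) -> logging.Logger:
    """Per-run child logger with its own file handler; the parent logger is left untouched."""
    logger = logging.getLogger(f"screenjob.{run_dir.name}")
    logger.addHandler(logging.FileHandler(run_dir / "run.log"))
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep this run's records out of other runs' handlers
    return logger


def try_start_run() -> bool:
    """Non-blocking acquire: returns False if another run already holds the desktop."""
    return _desktop_lock.acquire(blocking=False)


def finish_run() -> None:
    """Release the desktop so the next run can start."""
    _desktop_lock.release()
```

A caller would check try_start_run() before touching mouse or keyboard and call finish_run() in a finally block. A strict queue would replace the boolean check with a worker that drains jobs one at a time.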

P1

  • Move ui.py into a separate HTML file and a separate JS file.
  • Think harder: use effort "medium" by default.
  • Decay old screenshots after 3 to 5 steps to (1) save tokens and (2) reduce confusion in the agents.
  • [Bug] Validate disabled_tools against an allowlist and disallow disabling critical completion flow (task_complete) to avoid guaranteed step-limit failures.
  • [Bug] Improve execute_command cancellation/timeout handling to terminate full process trees, not only the parent shell process.
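
The disabled_tools check above could look roughly like this. It is a sketch under assumptions: only task_complete and execute_command are named in these notes, the other tool names are placeholders, and validate_disabled_tools is a hypothetical helper.

```python
# Assumption: the real tool registry differs; only task_complete and
# execute_command come from the notes, the other names are placeholders.
KNOWN_TOOLS = frozenset(
    {"task_complete", "execute_command", "click", "type_text", "screenshot"}
)

# Tools a run cannot finish without; disabling them guarantees a step-limit failure.
CRITICAL_TOOLS = frozenset({"task_complete"})


def validate_disabled_tools(disabled_tools):
    """Return a sorted, de-duplicated list of tools to disable, or raise ValueError."""
    requested = set(disabled_tools)

    # Reject anything that is not on the allowlist (typos, stale names).
    unknown = requested - KNOWN_TOOLS
    if unknown:
        raise ValueError(f"unknown tools: {sorted(unknown)}")

    # Reject attempts to disable the completion flow.
    critical = requested & CRITICAL_TOOLS
    if critical:
        raise ValueError(f"cannot disable critical tools: {sorted(critical)}")

    return sorted(requested)
```

Running this at job submission, before any step executes, would turn a guaranteed step-limit failure into an immediate, readable error.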

P2

  • [Bug] Fix UI event style mapping mismatch (tool_called events are emitted, but UI color map expects tool_call).
  • [Idea] Reduce monitoring UI backend load by throttling websocket-triggered refreshes and avoiding full job/event re-fetch on every event.
  • [Idea] Add retention controls/pruning for old runs, screenshots, and DB rows.
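
For the retention idea, a minimal pruning sketch, assuming each run lives in its own directory under one base directory. prune_runs and its keep parameter are hypothetical, and DB rows are not covered here.

```python
import shutil
from pathlib import Path


def prune_runs(base: Path, keep: int) -> list[Path]:
    """Delete all but the `keep` most recently modified run directories under base.

    Returns the directories that were removed.
    """
    if keep < 0:
        raise ValueError("keep must be >= 0")

    # Newest first, judged by directory modification time.
    run_dirs = sorted(
        (p for p in base.iterdir() if p.is_dir()),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )

    removed = run_dirs[keep:]
    for run_dir in removed:
        shutil.rmtree(run_dir)  # removes screenshots and logs inside the run
    return removed
```

An age-based cutoff (for example, anything older than N days) would be an equally reasonable policy. Count-based is just the simplest to reason about.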

P3

  • Add Replay Mode: the ability to replay a session by reconstructing the screen from screenshots and overlaying tool calls plus click and type events.
  • [Idea] Add lightweight analytics dashboards (success rate by objective category, avg steps/cost over time).
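
The per-category numbers for the analytics idea could be computed along these lines. The JobRecord shape is entirely hypothetical (the real job table may store different fields), and the "over time" part is left out of this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class JobRecord:
    """Hypothetical row shape; field names are assumptions, not the real schema."""
    category: str
    succeeded: bool
    steps: int
    cost_usd: float


def summarize(jobs):
    """Group jobs by category and compute success rate, average steps, average cost."""
    groups = defaultdict(list)
    for job in jobs:
        groups[job.category].append(job)

    summary = {}
    for category, items in groups.items():
        n = len(items)
        summary[category] = {
            "jobs": n,
            "success_rate": sum(j.succeeded for j in items) / n,
            "avg_steps": sum(j.steps for j in items) / n,
            "avg_cost_usd": sum(j.cost_usd for j in items) / n,
        }
    return summary
```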