# TODO ## P0 - [Bug] Fix CI & pytest - [Bug] Enforce single active desktop-control run (or a strict queue) so concurrent jobs cannot fight over the same mouse/keyboard/screen session. - [Bug] Fix run artifact collisions in `setup_artifacts()` (`run_id` is second-granularity, so two jobs in the same second can share/overwrite the same directory). - [Bug] Remove global logger handler clobbering in `setup_logger()` (`logging.getLogger("screenjob").handlers.clear()` breaks concurrent runs and can redirect logs to the wrong file). - [x] More consistent clicks and more uses of enhance images. ## P1 - [x] Move ui.py into a seperate html file and js file. - [x] Think harder using effort "medium" by default. - [x] Decay old screenshots after 3 to 5 steps to save (1) tokens and (2) brain fuck in the agents. - [Bug] Validate `disabled_tools` against an allowlist and disallow disabling critical completion flow (`task_complete`) to avoid guaranteed step-limit failures. - [Bug] Improve `execute_command` cancellation/timeout handling to terminate full process trees, not only the parent shell process. ## P2 - [Bug] Fix UI event style mapping mismatch (`tool_called` events are emitted, but UI color map expects `tool_call`). - [Idea] Reduce monitoring UI backend load by throttling websocket-triggered refreshes and avoiding full job/event re-fetch on every event. - [Idea] Add retention controls/pruning for old runs, screenshots, and DB rows. ## P3 - [x] Add Replay Mode; Ability to replay a session by reconstructing the screen from screenshots and overlaying tool calls and click and type events. - [x] Add lightweight analytics dashboards (success rate by objective category, avg steps/cost over time).