docs: update context compaction prompt with observe-decide-act-verify loop

Commit remaining workspace updates
Switch backend startup to interactive session
2026-05-31 20:52:49 +02:00 · 2026-05-31 20:43:36 +02:00 · 2026-05-31 20:43:36 +02:00 · 2026-05-31 18:35:35 +00:00 · 2026-05-28 13:44:31 +02:00 · 2026-05-28 13:30:27 +02:00
28 changed files with 6137 additions and 138 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -20,3 +20,8 @@ screenjob.db
 # IDE
 .vscode/
 .idea/
+
+# Service host build/publish artifacts
+service_host/**/bin/
+service_host/**/obj/
+service_host/publish/
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
 # ScreenJob

 ScreenJob is an autonomous desktop-and-terminal execution service.  
-It lets an LLM use controlled local tools (screen, click, type, shell) to complete GUI-heavy tasks on a real computer.
+It lets an LLM use controlled local tools (screen, mouse, keyboard, clipboard, shell) to complete GUI-heavy tasks on a real computer.

 ## What It Solves

@@ -15,7 +15,8 @@ It lets an LLM use controlled local tools (screen, click, type, shell) to comple

 ## Core Features

- Tool-based agent loop (`execute_command`, `see_screen`, `enhance`, `click`, `type`, `press_key`, `sleep`, `task_complete`)
+- Hybrid control model: screenshot grounding plus Windows-native window, dialog, and UI-element helpers when available
+- Tool-based agent loop (`execute_command`, `see_screen`, `enhance`, `list_windows`, `find_window`, `focus_window`, `close_window`, `wait_for_window`, `wait_for_focus_change`, `detect_dialog`, `dialog_action`, `dialog_set_filename`, `wait_for_dialog_close`, `list_ui_elements`, `invoke_ui_element`, `set_ui_element_value`, `select_ui_element`, `wait_for_ui_element`, `click`, `scroll`, `drag`, `move_mouse`, `type`, `press_key`, `clipboard_get`, `clipboard_set`, `get_cursor_position`, `get_active_window`, `sleep`, `task_complete`)
 - Safety pre-check with override support
 - Per-job tool disable list
 - Live/final usage and cost estimates
@@ -109,6 +110,80 @@ Or use the PowerShell launcher:
 .\start_backend.ps1
 ```

+### Backend Startup
+
+For screenshot-driven automation, start the backend in the logged-in user session.
+That gives `pyautogui` access to the interactive desktop, which a Windows service does not have because services run in the isolated, non-interactive session 0.
+If you previously installed the legacy service, remove it once from an elevated PowerShell session with `.\uninstall_backend_service.ps1`.
+
+Install a sign-in launcher for the current user:
+
+```powershell
+.\install_backend_service.ps1
+```
+
+Install it for all users:
+
+```powershell
+.\install_backend_service.ps1 -AllUsers
+```
+
+Start it immediately after installing:
+
+```powershell
+.\install_backend_service.ps1 -StartNow
+```
+
+Remove the launcher:
+
+```powershell
+.\uninstall_backend_service.ps1
+```
+
+The launcher runs `start_backend.ps1` hidden via `start_backend_hidden.vbs`.
+If you need to start the backend manually, run:
+
+```powershell
+.\start_backend.ps1
+```
+
+The legacy Windows service host remains in the tree for reference, but it is not the recommended path for GUI tasks.
+
+### System Tray Icon (Windows)
+
+Start tray icon now:
+
+```powershell
+powershell -NoProfile -ExecutionPolicy Bypass -STA -File .\screenjob_tray.ps1
+```
+
+Install startup shortcut (current user):
+
+```powershell
+.\install_tray_startup_shortcut.ps1
+```
+
+Install startup shortcut for all users:
+
+```powershell
+.\install_tray_startup_shortcut.ps1 -AllUsers
+```
+
+Remove startup shortcut:
+
+```powershell
+.\install_tray_startup_shortcut.ps1 -Remove
+```
+
+Tray menu actions:
+
+- Refresh service status
+- Start/Stop/Restart service (legacy Windows service host only; prompts for admin/UAC)
+- Open dashboard URL from `.env` `SCREENJOB_HOST` / `SCREENJOB_PORT`
+- Open service logs folder
+- Open project folder
+- Exit tray icon process
+
 Auth for all API routes:

 - `Authorization: Bearer <SCREENJOB_TOKEN>`
@@ -123,6 +198,11 @@ Auth for all API routes:
 {
  "job": "run \"ls -a\" in C:/Users/username/Documents and return output",
  "model": "gpt-5.4-mini",
+  "native_automation_mode": "prefer",
+  "dialog_timeout_seconds": 12,
+  "focus_timeout_seconds": 8,
+  "ui_element_timeout_seconds": 8,
+  "max_retries_per_surface": 3,
  "disabled_tools": [],
  "safety_override": false
 }
@@ -167,17 +247,28 @@ Each job payload includes:
 ## Agent Instructions (Practical)

 - Prefer `execute_command` for deterministic actions (opening URLs, filesystem checks).
+- First classify the current Windows surface, then choose the control channel.
+- Prefer native window/dialog/element tools for focus changes, file pickers, modal confirmations, and browser-owned dialogs when available.
 - Use `see_screen` before UI interaction.
 - Use `enhance` before clicking small/ambiguous targets; prefer `region="small"` for compact controls.
 - Use `enhance` `mode="text"` for tiny labels/text, or `mode="ui"` for general UI.
 - Optionally set `enhance` `scale` (2-6) for tighter zoom control.
+- Use `list_windows`, `find_window`, `focus_window`, and `wait_for_focus_change` instead of blind Alt+Tab retries.
+- Use `detect_dialog`, `dialog_set_filename`, `dialog_action`, and `wait_for_dialog_close` for native open/save/confirm flows.
+- Use `list_ui_elements`, `invoke_ui_element`, `set_ui_element_value`, `select_ui_element`, and `wait_for_ui_element` when controls are exposed natively.
 - Use `press_key` for non-text keys (Enter, Tab, arrows, Escape).
 - For shortcuts, use one `press_key` call with combo syntax (example: `win+r`).
- Use `click` offsets via `offset_up/down/left/right` and optional `sleep_after_seconds`.
+- Use `click` offsets via `offset_up/down/left/right`; set `button` and `click_count` there instead of inventing one-off click tools.
+- Use `move_mouse` when you need hover-only behavior and `drag` for slider, selection, or window moves.
+- Use `scroll` for vertical navigation; positive amounts scroll up and negative amounts scroll down.
+- Use `clipboard_get` / `clipboard_set` for copy-paste workflows, `get_cursor_position` for cursor inspection, and `get_active_window` before interacting with uncertain focus.
+- If native automation is unavailable or disabled, ScreenJob falls back to screenshots plus mouse/keyboard control and emits fallback events.
 - When done, call:
  - `task_complete(return="...", data=...)`
 - Before `task_complete`, verify expected on-screen content with `see_screen` (and `enhance` if needed), and include an `observed_result` summary in `data`.

+Every entry in a job's `disabled_tools` list must name a tool from the built-in tool allowlist. `task_complete` cannot be disabled.
+
 `data` should contain useful structured output for the requester (text, object, list, etc.).

 ## Verification
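
Reviewer sketch for the README job payload and auth header above: a minimal Python example that builds, but does not send, a job request carrying the new native-automation fields. The host and port reuse the tray script's defaults; the `/jobs` path and the `build_job_request` helper are assumptions made for illustration, not documented routes or project code.

```python
import json
import urllib.request

# Assumptions for this sketch: host/port mirror the tray script defaults,
# and "/jobs" is a hypothetical route, not one confirmed by the README.
BASE_URL = "http://127.0.0.1:8787"
JOBS_PATH = "/jobs"


def build_job_request(token: str, job: str, model: str = "gpt-5.4-mini") -> urllib.request.Request:
    """Build (but do not send) an authenticated job-submission request."""
    payload = {
        "job": job,
        "model": model,
        # Field names and values below are copied from the README example.
        "native_automation_mode": "prefer",
        "dialog_timeout_seconds": 12,
        "focus_timeout_seconds": 8,
        "ui_element_timeout_seconds": 8,
        "max_retries_per_surface": 3,
        "disabled_tools": [],
        "safety_override": False,
    }
    return urllib.request.Request(
        BASE_URL + JOBS_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected without a running backend.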
--- a/SKILL.md
+++ b/SKILL.md
@@ -6,8 +6,10 @@ ScreenJob lets an agent execute tasks that require a real desktop UI plus termin

 ## Main Features

+- Hybrid control model: screenshot grounding plus Windows-native window/dialog/element helpers when available
 - Screen perception (`see_screen`, `enhance`)
 - Mouse/keyboard control (`click`, `type`, `press_key`)
+- Native window/dialog control (`list_windows`, `find_window`, `focus_window`, `detect_dialog`, `dialog_action`, `dialog_set_filename`, `list_ui_elements`)
 - Terminal execution (`execute_command`, `sleep`)
 - Structured completion payload (`task_complete(return=..., data=...)`)
 - Safety gate, auth, history, and live monitoring
@@ -45,12 +47,25 @@ Enhance-first click rule:
 - Optional zoom control: set `scale` from `2` to `6` (defaults are tuned by region).
 - After checking the enhanced image, click using the same target coordinate (or a small directional offset if needed).

+Windows-native routing rule:
+
+- First classify whether the current surface is a normal app window, browser window, `#32770` dialog, Explorer file picker, or another system surface.
+- Prefer native window/dialog/element tools for focus changes, save/open dialogs, modal confirmations, and exposed controls.
+- Fall back to screenshots plus mouse/keyboard only when native automation is unavailable or the UI is custom-drawn.
+
 Verification rule:

 - Before `task_complete`, verify actual on-screen content matches the expected outcome.
 - Use `see_screen` (and `enhance` if needed) for this check.
 - Include a concise `observed_result` in `data` when completing the task.

+Patience / rerun rule:
+
+- If a job is still `running`, do not assume it is stuck just because it looks slow, repetitive, or token-heavy.
+- Prefer waiting longer and checking for a final status/result before starting a replacement run.
+- Only restart or replace a running job when there is clear evidence it is failed, irrecoverably stuck, or the user explicitly asks for a restart.
+- If you do replace a run, say why in one short sentence and reference the specific blocker you observed.
+
 ## API Quick Reference

 Base URL:
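
Reviewer sketch for the Windows-native routing rule added to SKILL.md: one possible way to express "classify the surface, then choose the control channel" in Python. The surface labels, the `Channel` enum, and `choose_channel` are hypothetical names for illustration; only `#32770` (the standard Windows dialog class) comes from the text, and treating unlisted system surfaces as pixel-only is an assumption.

```python
from enum import Enum


class Channel(Enum):
    NATIVE = "native"  # window/dialog/element tools
    PIXEL = "pixel"    # screenshots plus mouse/keyboard


# Surfaces for which the rule prefers native tools. Apart from "#32770",
# these labels are made up for this sketch.
NATIVE_FIRST_SURFACES = {
    "app_window",
    "browser_window",
    "#32770",
    "explorer_file_picker",
}


def choose_channel(surface: str, native_available: bool, custom_drawn: bool = False) -> Channel:
    """Pick a control channel for an already-classified surface."""
    # Fall back to pixels when native automation is unavailable
    # or the UI is custom-drawn, as the rule states.
    if not native_available or custom_drawn:
        return Channel.PIXEL
    if surface in NATIVE_FIRST_SURFACES:
        return Channel.NATIVE
    # Assumption: unknown system surfaces are handled via screenshots.
    return Channel.PIXEL
```

For example, `choose_channel("#32770", native_available=True)` selects the native channel, while the same dialog with `native_available=False` falls back to screenshots.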
--- a/install_backend_service.ps1
+++ b/install_backend_service.ps1
@@ -0,0 +1,84 @@
+[CmdletBinding(SupportsShouldProcess = $true)]
+param(
+    [switch]$Remove,
+    [switch]$AllUsers,
+    [switch]$StartNow
+)
+
+Set-StrictMode -Version Latest
+$ErrorActionPreference = "Stop"
+
+$scriptDir = Split-Path -Parent $PSCommandPath
+$backendScript = Join-Path $scriptDir "start_backend.ps1"
+$vbsLauncher = Join-Path $scriptDir "start_backend_hidden.vbs"
+$shortcutName = "ScreenJob Backend.lnk"
+
+if (-not (Test-Path -LiteralPath $backendScript)) {
+    throw "Backend launcher script not found: $backendScript"
+}
+
+if (-not (Test-Path -LiteralPath $vbsLauncher)) {
+    throw "Hidden backend launcher file not found: $vbsLauncher"
+}
+
+function Test-IsAdministrator {
+    $identity = [Security.Principal.WindowsIdentity]::GetCurrent()
+    $principal = New-Object Security.Principal.WindowsPrincipal($identity)
+    return $principal.IsInRole([Security.Principal.WindowsBuiltInRole]::Administrator)
+}
+
+$legacyService = Get-Service -Name "ScreenJobBackend" -ErrorAction SilentlyContinue
+if ($null -ne $legacyService) {
+    if (Test-IsAdministrator) {
+        if ($PSCmdlet.ShouldProcess("ScreenJobBackend", "Remove legacy Windows service")) {
+            if ($legacyService.Status -ne "Stopped") {
+                Stop-Service -Name "ScreenJobBackend" -Force -ErrorAction Stop
+            }
+
+            & sc.exe delete ScreenJobBackend | Out-Null
+            if ($LASTEXITCODE -ne 0) {
+                throw "Failed to delete legacy service 'ScreenJobBackend' (sc.exe exit code $LASTEXITCODE)."
+            }
+
+            Write-Host "Removed legacy Windows service: ScreenJobBackend"
+        }
+    } else {
+        Write-Warning "Legacy Windows service 'ScreenJobBackend' is still installed. Run uninstall_backend_service.ps1 from an elevated PowerShell session once to remove it."
+    }
+}
+
+$startupFolder = if ($AllUsers) {
+    [Environment]::GetFolderPath("CommonStartup")
+} else {
+    [Environment]::GetFolderPath("Startup")
+}
+
+$shortcutPath = Join-Path $startupFolder $shortcutName
+
+if ($Remove) {
+    if (Test-Path -LiteralPath $shortcutPath) {
+        if ($PSCmdlet.ShouldProcess($shortcutPath, "Remove backend startup shortcut")) {
+            Remove-Item -LiteralPath $shortcutPath -Force
+            Write-Host "Removed backend startup shortcut: $shortcutPath"
+        }
+    } else {
+        Write-Host "No backend startup shortcut found at: $shortcutPath"
+    }
+    return
+}
+
+if ($PSCmdlet.ShouldProcess($shortcutPath, "Create backend startup shortcut")) {
+    $shell = New-Object -ComObject WScript.Shell
+    $shortcut = $shell.CreateShortcut($shortcutPath)
+    $shortcut.TargetPath = "$env:SystemRoot\System32\wscript.exe"
+    $shortcut.Arguments = '"' + $vbsLauncher + '"'
+    $shortcut.WorkingDirectory = $scriptDir
+    $shortcut.Description = "Launch ScreenJob backend at sign-in in the current user session."
+    $shortcut.Save()
+    Write-Host "Created backend startup shortcut: $shortcutPath"
+}
+
+if ($StartNow) {
+    Start-Process -FilePath "$env:SystemRoot\System32\wscript.exe" -ArgumentList ('"' + $vbsLauncher + '"') -WorkingDirectory $scriptDir | Out-Null
+    Write-Host "Started backend launcher now."
+}
--- a/install_tray_startup_shortcut.ps1
+++ b/install_tray_startup_shortcut.ps1
@@ -0,0 +1,47 @@
+[CmdletBinding(SupportsShouldProcess = $true)]
+param(
+    [switch]$Remove,
+    [switch]$AllUsers
+)
+
+Set-StrictMode -Version Latest
+$ErrorActionPreference = "Stop"
+
+$scriptDir = Split-Path -Parent $PSCommandPath
+$vbsLauncher = Join-Path $scriptDir "start_screenjob_tray_hidden.vbs"
+$shortcutName = "ScreenJob Tray.lnk"
+
+if (-not (Test-Path -LiteralPath $vbsLauncher)) {
+    throw "Launcher file not found: $vbsLauncher"
+}
+
+$startupFolder = if ($AllUsers) {
+    [Environment]::GetFolderPath("CommonStartup")
+} else {
+    [Environment]::GetFolderPath("Startup")
+}
+
+$shortcutPath = Join-Path $startupFolder $shortcutName
+
+if ($Remove) {
+    if (Test-Path -LiteralPath $shortcutPath) {
+        if ($PSCmdlet.ShouldProcess($shortcutPath, "Remove startup shortcut")) {
+            Remove-Item -LiteralPath $shortcutPath -Force
+            Write-Host "Removed startup shortcut: $shortcutPath"
+        }
+    } else {
+        Write-Host "No startup shortcut found at: $shortcutPath"
+    }
+    return
+}
+
+if ($PSCmdlet.ShouldProcess($shortcutPath, "Create startup shortcut")) {
+    $shell = New-Object -ComObject WScript.Shell
+    $shortcut = $shell.CreateShortcut($shortcutPath)
+    $shortcut.TargetPath = "$env:SystemRoot\System32\wscript.exe"
+    $shortcut.Arguments = '"' + $vbsLauncher + '"'
+    $shortcut.WorkingDirectory = $scriptDir
+    $shortcut.Description = "Launch ScreenJob tray icon at sign-in."
+    $shortcut.Save()
+    Write-Host "Created startup shortcut: $shortcutPath"
+}
--- a/screenjob_tray.ps1
+++ b/screenjob_tray.ps1
@@ -0,0 +1,307 @@
+param(
+    [string]$ServiceName = "ScreenJobBackend"
+)
+
+Set-StrictMode -Version Latest
+$ErrorActionPreference = "Stop"
+
+Add-Type -AssemblyName System.Windows.Forms
+Add-Type -AssemblyName System.Drawing
+
+$scriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path
+$controlScript = Join-Path $scriptDir "tray_service_control.ps1"
+$logsDir = Join-Path $scriptDir "screenjob_runs\service"
+$defaultHost = "127.0.0.1"
+$defaultPort = "8787"
+
+function Read-EnvConfig {
+    param([string]$EnvFilePath)
+    $result = @{}
+    if (-not (Test-Path -LiteralPath $EnvFilePath)) {
+        return $result
+    }
+
+    foreach ($line in Get-Content -Path $EnvFilePath) {
+        $trimmed = $line.Trim()
+        if ($trimmed.Length -eq 0 -or $trimmed.StartsWith("#")) {
+            continue
+        }
+        $parts = $trimmed.Split("=", 2)
+        if ($parts.Count -eq 2) {
+            $key = $parts[0].Trim()
+            $value = $parts[1].Trim()
+            if (($value.StartsWith('"') -and $value.EndsWith('"')) -or ($value.StartsWith("'") -and $value.EndsWith("'"))) {
+                $value = $value.Substring(1, $value.Length - 2)
+            }
+            $result[$key] = $value
+        }
+    }
+    return $result
+}
+
+function Get-ServiceStatusSafe {
+    param([string]$Name)
+    try {
+        $svc = Get-Service -Name $Name -ErrorAction Stop
+        return $svc.Status.ToString()
+    } catch {
+        return "NotInstalled"
+    }
+}
+
+function Invoke-ServiceActionElevated {
+    param(
+        [Parameter(Mandatory = $true)][string]$Action,
+        [Parameter(Mandatory = $true)][string]$Name
+    )
+
+    if (-not (Test-Path -LiteralPath $controlScript)) {
+        [System.Windows.Forms.MessageBox]::Show(
+            "Missing control script: $controlScript",
+            "ScreenJob Tray",
+            [System.Windows.Forms.MessageBoxButtons]::OK,
+            [System.Windows.Forms.MessageBoxIcon]::Error
+        ) | Out-Null
+        return
+    }
+
+    $argList = @(
+        "-NoProfile",
+        "-ExecutionPolicy", "Bypass",
+        "-File", "`"$controlScript`"",
+        "-Action", $Action,
+        "-ServiceName", $Name
+    )
+
+    try {
+        Start-Process -FilePath "powershell.exe" -ArgumentList $argList -Verb RunAs -WindowStyle Hidden | Out-Null
+    } catch {
+        # User canceled UAC prompt or launch failed.
+    }
+}
+
+function Get-DashboardUrl {
+    $envFile = Join-Path $scriptDir ".env"
+    $envVars = Read-EnvConfig -EnvFilePath $envFile
+
+    $dashboardHost = $defaultHost
+    $dashboardPort = $defaultPort
+
+    if ($envVars.ContainsKey("SCREENJOB_HOST") -and -not [string]::IsNullOrWhiteSpace($envVars["SCREENJOB_HOST"])) {
+        $dashboardHost = $envVars["SCREENJOB_HOST"]
+    }
+    if ($envVars.ContainsKey("SCREENJOB_PORT") -and -not [string]::IsNullOrWhiteSpace($envVars["SCREENJOB_PORT"])) {
+        $dashboardPort = $envVars["SCREENJOB_PORT"]
+    }
+
+    $connectHost = Resolve-ConnectHost -ConfiguredHost $dashboardHost
+    return "http://{0}:{1}/" -f $connectHost, $dashboardPort
+}
+
+function Resolve-ConnectHost {
+    param([string]$ConfiguredHost)
+
+    if ([string]::IsNullOrWhiteSpace($ConfiguredHost)) {
+        return "127.0.0.1"
+    }
+
+    switch ($ConfiguredHost.Trim().ToLowerInvariant()) {
+        "0.0.0.0" { return "127.0.0.1" }
+        "::" { return "127.0.0.1" }
+        "*" { return "127.0.0.1" }
+        default { return $ConfiguredHost }
+    }
+}
+
+function Get-HealthCheckHosts {
+    param([string]$ConfiguredHost)
+
+    if ([string]::IsNullOrWhiteSpace($ConfiguredHost)) {
+        return @("127.0.0.1", "localhost")
+    }
+
+    $normalized = $ConfiguredHost.Trim().ToLowerInvariant()
+    switch ($normalized) {
+        "0.0.0.0" { return @("127.0.0.1", "localhost", "::1") }
+        "::" { return @("127.0.0.1", "localhost", "::1") }
+        "*" { return @("127.0.0.1", "localhost", "::1") }
+        default { return @($ConfiguredHost) }
+    }
+}
+
+function Test-TcpEndpoint {
+    param(
+        [Parameter(Mandatory = $true)][string]$HostName,
+        [Parameter(Mandatory = $true)][int]$Port,
+        [int]$TimeoutMs = 1200
+    )
+
+    $client = New-Object System.Net.Sockets.TcpClient
+    try {
+        $async = $client.BeginConnect($HostName, $Port, $null, $null)
+        $connected = $async.AsyncWaitHandle.WaitOne($TimeoutMs, $false)
+        if (-not $connected) {
+            return $false
+        }
+        $client.EndConnect($async) | Out-Null
+        return $true
+    } catch {
+        return $false
+    } finally {
+        $client.Dispose()
+    }
+}
+
+function Get-BackendReachability {
+    $envFile = Join-Path $scriptDir ".env"
+    $envVars = Read-EnvConfig -EnvFilePath $envFile
+    $configuredHost = $defaultHost
+    $configuredPort = $defaultPort
+
+    if ($envVars.ContainsKey("SCREENJOB_HOST") -and -not [string]::IsNullOrWhiteSpace($envVars["SCREENJOB_HOST"])) {
+        $configuredHost = $envVars["SCREENJOB_HOST"]
+    }
+    if ($envVars.ContainsKey("SCREENJOB_PORT") -and -not [string]::IsNullOrWhiteSpace($envVars["SCREENJOB_PORT"])) {
+        $configuredPort = $envVars["SCREENJOB_PORT"]
+    }
+
+    $portNumber = 0
+    if (-not [int]::TryParse([string]$configuredPort, [ref]$portNumber)) { $portNumber = [int]$defaultPort }
+    $hostsToTry = Get-HealthCheckHosts -ConfiguredHost $configuredHost
+
+    foreach ($candidateHost in $hostsToTry) {
+        if (Test-TcpEndpoint -HostName $candidateHost -Port $portNumber) {
+            return $true
+        }
+    }
+
+    return $false
+}
+
+function Update-TrayState {
+    param(
+        [System.Windows.Forms.NotifyIcon]$NotifyIcon,
+        [System.Windows.Forms.ToolStripMenuItem]$StatusItem,
+        [string]$Name
+    )
+
+    $status = Get-ServiceStatusSafe -Name $Name
+    $isBackendReachable = Get-BackendReachability
+
+    $displayStatus = $status
+    if ($status -eq "Running" -and -not $isBackendReachable) {
+        $displayStatus = "Running (Backend Down)"
+    } elseif ($status -eq "Stopped" -and $isBackendReachable) {
+        $displayStatus = "Stopped (Backend Up)"
+    } elseif ($status -eq "NotInstalled" -and $isBackendReachable) {
+        $displayStatus = "NotInstalled (Backend Up)"
+    }
+
+    $StatusItem.Text = "Status: $displayStatus"
+
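+    switch ($displayStatus) {
+        { $_ -in @("Running", "NotInstalled (Backend Up)") } {  # a reachable backend without the legacy service is the recommended setup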
+            $NotifyIcon.Icon = [System.Drawing.SystemIcons]::Information
+        }
+        "Stopped" {
+            $NotifyIcon.Icon = [System.Drawing.SystemIcons]::Warning
+        }
+        default {
+            $NotifyIcon.Icon = [System.Drawing.SystemIcons]::Error
+        }
+    }
+
+    $tooltip = "ScreenJob Backend: $displayStatus"
+    if ($tooltip.Length -gt 63) {
+        $tooltip = $tooltip.Substring(0, 63)
+    }
+    $NotifyIcon.Text = $tooltip
+}
+
+$appContext = New-Object System.Windows.Forms.ApplicationContext
+$notifyIcon = New-Object System.Windows.Forms.NotifyIcon
+$notifyIcon.Visible = $false
+
+$menu = New-Object System.Windows.Forms.ContextMenuStrip
+$statusItem = New-Object System.Windows.Forms.ToolStripMenuItem "Status: Unknown"
+$statusItem.Enabled = $false
+
+$refreshItem = New-Object System.Windows.Forms.ToolStripMenuItem "Refresh Status"
+$refreshItem.Add_Click({
+    Update-TrayState -NotifyIcon $notifyIcon -StatusItem $statusItem -Name $ServiceName
+})
+
+$startItem = New-Object System.Windows.Forms.ToolStripMenuItem "Start Service (Admin)"
+$startItem.Add_Click({
+    Invoke-ServiceActionElevated -Action "start" -Name $ServiceName
+})
+
+$stopItem = New-Object System.Windows.Forms.ToolStripMenuItem "Stop Service (Admin)"
+$stopItem.Add_Click({
+    Invoke-ServiceActionElevated -Action "stop" -Name $ServiceName
+})
+
+$restartItem = New-Object System.Windows.Forms.ToolStripMenuItem "Restart Service (Admin)"
+$restartItem.Add_Click({
+    Invoke-ServiceActionElevated -Action "restart" -Name $ServiceName
+})
+
+$dashboardItem = New-Object System.Windows.Forms.ToolStripMenuItem "Open Dashboard"
+$dashboardItem.Add_Click({
+    $url = Get-DashboardUrl
+    Start-Process $url | Out-Null
+})
+
+$logsItem = New-Object System.Windows.Forms.ToolStripMenuItem "Open Service Logs"
+$logsItem.Add_Click({
+    if (-not (Test-Path -LiteralPath $logsDir)) {
+        New-Item -ItemType Directory -Path $logsDir -Force | Out-Null
+    }
+    Start-Process explorer.exe $logsDir | Out-Null
+})
+
+$openFolderItem = New-Object System.Windows.Forms.ToolStripMenuItem "Open Project Folder"
+$openFolderItem.Add_Click({
+    Start-Process explorer.exe $scriptDir | Out-Null
+})
+
+$exitItem = New-Object System.Windows.Forms.ToolStripMenuItem "Exit Tray"
+$exitItem.Add_Click({
+    $refreshTimer.Stop()
+    $notifyIcon.Visible = $false
+    $notifyIcon.Dispose()
+    $menu.Dispose()
+    $appContext.ExitThread()
+})
+
+[void]$menu.Items.Add($statusItem)
+[void]$menu.Items.Add($refreshItem)
+[void]$menu.Items.Add((New-Object System.Windows.Forms.ToolStripSeparator))
+[void]$menu.Items.Add($startItem)
+[void]$menu.Items.Add($stopItem)
+[void]$menu.Items.Add($restartItem)
+[void]$menu.Items.Add((New-Object System.Windows.Forms.ToolStripSeparator))
+[void]$menu.Items.Add($dashboardItem)
+[void]$menu.Items.Add($logsItem)
+[void]$menu.Items.Add($openFolderItem)
+[void]$menu.Items.Add((New-Object System.Windows.Forms.ToolStripSeparator))
+[void]$menu.Items.Add($exitItem)
+
+$notifyIcon.ContextMenuStrip = $menu
+$notifyIcon.Visible = $true
+
+$notifyIcon.Add_DoubleClick({
+    $url = Get-DashboardUrl
+    Start-Process $url | Out-Null
+})
+
+$refreshTimer = New-Object System.Windows.Forms.Timer
+$refreshTimer.Interval = 5000
+$refreshTimer.Add_Tick({
+    Update-TrayState -NotifyIcon $notifyIcon -StatusItem $statusItem -Name $ServiceName
+})
+
+Update-TrayState -NotifyIcon $notifyIcon -StatusItem $statusItem -Name $ServiceName
+$refreshTimer.Start()
+
+[System.Windows.Forms.Application]::Run($appContext)
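
Reviewer sketch of the reachability logic in `Get-HealthCheckHosts` and `Test-TcpEndpoint`, restated in Python to make the wildcard handling easier to review. The function names are hypothetical, and the sketch trims the configured host before probing, which the PowerShell version does not do for non-wildcard hosts.

```python
import socket

# Bind-all addresses cannot be connected to directly, so they are
# replaced with loopback candidates, as in the tray script.
WILDCARD_HOSTS = {"0.0.0.0", "::", "*"}


def health_check_hosts(configured_host: str) -> list[str]:
    """Return the hosts to probe for a configured bind address."""
    host = (configured_host or "").strip()
    if not host:
        return ["127.0.0.1", "localhost"]
    if host.lower() in WILDCARD_HOSTS:
        return ["127.0.0.1", "localhost", "::1"]
    return [host]


def is_reachable(configured_host: str, port: int, timeout_s: float = 1.2) -> bool:
    """True if any candidate host accepts a TCP connection on the port."""
    for host in health_check_hosts(configured_host):
        try:
            with socket.create_connection((host, port), timeout=timeout_s):
                return True
        except OSError:
            # Refused, timed out, or unresolvable: try the next candidate.
            continue
    return False
```

A successful TCP connect only shows that something is listening on the port; it does not confirm that the listener is the ScreenJob backend.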
--- a/service_host/ScreenJob.WindowsServiceHost/BackendProcessService.cs
+++ b/service_host/ScreenJob.WindowsServiceHost/BackendProcessService.cs
@@ -0,0 +1,138 @@
+using System.Diagnostics;
+using Microsoft.Extensions.Hosting;
+using Microsoft.Extensions.Logging;
+
+namespace ScreenJob.WindowsServiceHost;
+
+internal sealed class BackendProcessService : BackgroundService
+{
+    private readonly ILogger<BackendProcessService> _logger;
+    private readonly ServiceOptions _options;
+    private readonly object _logLock = new();
+
+    private Process? _backendProcess;
+    private string _stdoutLogPath = string.Empty;
+    private string _stderrLogPath = string.Empty;
+
+    public BackendProcessService(ILogger<BackendProcessService> logger, ServiceOptions options)
+    {
+        _logger = logger;
+        _options = options;
+    }
+
+    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
+    {
+        Directory.CreateDirectory(_options.LogDirectory);
+        _stdoutLogPath = Path.Combine(_options.LogDirectory, "backend-service.stdout.log");
+        _stderrLogPath = Path.Combine(_options.LogDirectory, "backend-service.stderr.log");
+
+        LogStdOut("Service host starting backend process.");
+        LogStdOut($"Script: {_options.BackendScriptPath}");
+        LogStdOut($"Working directory: {_options.WorkingDirectory}");
+
+        var powershellPath = Path.Combine(
+            Environment.GetFolderPath(Environment.SpecialFolder.Windows),
+            "System32",
+            "WindowsPowerShell",
+            "v1.0",
+            "powershell.exe");
+
+        var startInfo = new ProcessStartInfo
+        {
+            FileName = powershellPath,
+            Arguments = $"-NoProfile -ExecutionPolicy Bypass -File \"{_options.BackendScriptPath}\"",
+            WorkingDirectory = _options.WorkingDirectory,
+            RedirectStandardOutput = true,
+            RedirectStandardError = true,
+            UseShellExecute = false,
+            CreateNoWindow = true
+        };
+
+        _backendProcess = new Process { StartInfo = startInfo };
+        if (!_backendProcess.Start())
+        {
+            throw new InvalidOperationException("Failed to start backend process.");
+        }
+
+        LogStdOut($"Backend process started with PID {_backendProcess.Id}.");
+        _logger.LogInformation("Backend process started with PID {Pid}.", _backendProcess.Id);
+
+        var stdoutPump = PumpStreamAsync(_backendProcess.StandardOutput, LogStdOut, stoppingToken);
+        var stderrPump = PumpStreamAsync(_backendProcess.StandardError, LogStdErr, stoppingToken);
+
+        try
+        {
+            await _backendProcess.WaitForExitAsync(stoppingToken);
+            var exitCode = _backendProcess.ExitCode;
+            LogStdErr($"Backend process exited unexpectedly with code {exitCode}.");
+            _logger.LogError("Backend process exited unexpectedly with code {ExitCode}.", exitCode);
+            Environment.ExitCode = exitCode == 0 ? 1 : exitCode;
+            throw new InvalidOperationException(
+                $"Backend process ended unexpectedly. Service host exit code: {Environment.ExitCode}.");
+        }
+        catch (OperationCanceledException)
+        {
+            LogStdOut("Service stop requested.");
+        }
+        finally
+        {
+            await Task.WhenAll(stdoutPump, stderrPump);
+        }
+    }
+
+    public override async Task StopAsync(CancellationToken cancellationToken)
+    {
+        if (_backendProcess is { HasExited: false })
+        {
+            try
+            {
+                LogStdOut("Stopping backend process.");
+                _backendProcess.Kill(entireProcessTree: true);
+            }
+            catch (Exception ex)
+            {
+                LogStdErr($"Failed to stop backend process cleanly: {ex.Message}");
+                _logger.LogError(ex, "Failed to stop backend process cleanly.");
+            }
+        }
+
+        await base.StopAsync(cancellationToken);
+    }
+
+    private async Task PumpStreamAsync(
+        StreamReader reader,
+        Action<string> sink,
+        CancellationToken stoppingToken)
+    {
+        while (!stoppingToken.IsCancellationRequested)
+        {
+            var line = await reader.ReadLineAsync();
+            if (line is null)
+            {
+                break;
+            }
+
+            sink(line);
+        }
+    }
+
+    private void LogStdOut(string message)
+    {
+        WriteLog(_stdoutLogPath, message);
+    }
+
+    private void LogStdErr(string message)
+    {
+        WriteLog(_stderrLogPath, message);
+    }
+
+    private void WriteLog(string path, string message)
+    {
+        var stamp = DateTimeOffset.Now.ToString("yyyy-MM-dd HH:mm:ss");
+        var line = $"[{stamp}] {message}{Environment.NewLine}";
+        lock (_logLock)
+        {
+            File.AppendAllText(path, line);
+        }
+    }
+}
--- a/service_host/ScreenJob.WindowsServiceHost/Program.cs
+++ b/service_host/ScreenJob.WindowsServiceHost/Program.cs
@@ -0,0 +1,18 @@
+using Microsoft.Extensions.DependencyInjection;
+using Microsoft.Extensions.Hosting;
+using ScreenJob.WindowsServiceHost;
+
+var options = ServiceOptions.Parse(args);
+
+Host.CreateDefaultBuilder(args)
+    .UseWindowsService(serviceOptions =>
+    {
+        serviceOptions.ServiceName = "ScreenJobBackend";
+    })
+    .ConfigureServices(services =>
+    {
+        services.AddSingleton(options);
+        services.AddHostedService<BackendProcessService>();
+    })
+    .Build()
+    .Run();
--- a/service_host/ScreenJob.WindowsServiceHost/ScreenJob.WindowsServiceHost.csproj
+++ b/service_host/ScreenJob.WindowsServiceHost/ScreenJob.WindowsServiceHost.csproj
@@ -0,0 +1,12 @@
+<Project Sdk="Microsoft.NET.Sdk.Worker">
+  <PropertyGroup>
+    <TargetFramework>net10.0-windows</TargetFramework>
+    <Nullable>enable</Nullable>
+    <ImplicitUsings>enable</ImplicitUsings>
+    <OutputType>Exe</OutputType>
+  </PropertyGroup>
+
+  <ItemGroup>
+    <PackageReference Include="Microsoft.Extensions.Hosting.WindowsServices" Version="10.0.0" />
+  </ItemGroup>
+</Project>
--- a/service_host/ScreenJob.WindowsServiceHost/ServiceOptions.cs
+++ b/service_host/ScreenJob.WindowsServiceHost/ServiceOptions.cs
@@ -0,0 +1,77 @@
+namespace ScreenJob.WindowsServiceHost;
+
+internal sealed record ServiceOptions(
+    string BackendScriptPath,
+    string WorkingDirectory,
+    string LogDirectory)
+{
+    public static ServiceOptions Parse(string[] args)
+    {
+        var map = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
+
+        for (var i = 0; i < args.Length; i++)
+        {
+            var raw = args[i];
+            if (!raw.StartsWith("--", StringComparison.Ordinal))
+            {
+                continue;
+            }
+
+            var key = raw[2..];
+            if (string.IsNullOrWhiteSpace(key))
+            {
+                continue;
+            }
+
+            if (i + 1 < args.Length && !args[i + 1].StartsWith("--", StringComparison.Ordinal))
+            {
+                map[key] = args[++i];
+            }
+            else
+            {
+                map[key] = "true";
+            }
+        }
+
+        if (!map.TryGetValue("backend-script", out var backendScript) || string.IsNullOrWhiteSpace(backendScript))
+        {
+            throw new ArgumentException("Missing required argument: --backend-script <absolute-path-to-start_backend.ps1>.");
+        }
+
+        if (!Path.IsPathRooted(backendScript))
+        {
+            throw new ArgumentException("The --backend-script value must be an absolute path.");
+        }
+
+        if (!File.Exists(backendScript))
+        {
+            throw new FileNotFoundException("Backend script not found.", backendScript);
+        }
+
+        if (!map.TryGetValue("working-dir", out var workingDir) || string.IsNullOrWhiteSpace(workingDir))
+        {
+            workingDir = Path.GetDirectoryName(backendScript)
+                ?? throw new ArgumentException("Could not resolve working directory from backend script path.");
+        }
+
+        if (!Path.IsPathRooted(workingDir))
+        {
+            throw new ArgumentException("The --working-dir value must be an absolute path.");
+        }
+
+        if (!map.TryGetValue("log-dir", out var logDir) || string.IsNullOrWhiteSpace(logDir))
+        {
+            logDir = Path.Combine(workingDir, "screenjob_runs", "service");
+        }
+
+        if (!Path.IsPathRooted(logDir))
+        {
+            throw new ArgumentException("The --log-dir value must be an absolute path.");
+        }
+
+        return new ServiceOptions(
+            Path.GetFullPath(backendScript),
+            Path.GetFullPath(workingDir),
+            Path.GetFullPath(logDir));
+    }
+}
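Review note: the flag-collection loop in `ServiceOptions.Parse` can be mirrored in a few lines of Python to make the rule explicit. This is only a sketch of the behaviour visible in the diff, not the service host's code. The name `parse_flag_map` and the `--verbose` flag are hypothetical, and lower-casing keys is an approximation of the case-insensitive dictionary used in the C# version.

```python
def parse_flag_map(args: list[str]) -> dict[str, str]:
    """Collect `--key value` pairs; a flag without a value maps to "true"."""
    flags: dict[str, str] = {}
    i = 0
    while i < len(args):
        raw = args[i]
        i += 1
        if not raw.startswith("--"):
            continue  # stray positional tokens are ignored
        key = raw[2:]
        if not key.strip():
            continue  # a bare "--" carries no key
        if i < len(args) and not args[i].startswith("--"):
            flags[key.lower()] = args[i]  # next token is the value
            i += 1
        else:
            flags[key.lower()] = "true"  # boolean-style flag
    return flags


example = parse_flag_map(
    ["--Backend-Script", r"C:\repo\start_backend.ps1", "--verbose", "--log-dir", r"D:\logs", "stray"]
)
```

One consequence of this rule is that a value which itself begins with `--` cannot be passed, because it is always read as the next flag.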
--- a/src/agent.py
+++ b/src/agent.py
--- a/src/app_main.py
+++ b/src/app_main.py
@@ -30,6 +30,7 @@ def main(argv: list[str] | None = None) -> int:
            print("  OPENAI_API_KEY=...")
            print("  SCREENJOB_TOKEN=...")
            print("  DISABLE_UI=true|false (optional)")
+            print("  SCREENJOB_PROHIBITED_KEY_COMBOS=ctrl+shift+s,alt+f4 (optional)")
            return 0
        server.main()
        return 0
--- a/src/cli.py
+++ b/src/cli.py
@@ -5,6 +5,7 @@ import json
 import sys
 from pathlib import Path

+from .agent import normalize_disabled_tools
 from .config import load_app_config
 from .models import RuntimeOptions
 from .runtime import create_openai_client, run_job
@@ -40,8 +41,55 @@ def build_parser() -> argparse.ArgumentParser:
        default=4,
        help="Compact model context every N steps to decay old screenshots (0 disables).",
    )
+    parser.add_argument(
+        "--max-visual-context-images",
+        type=int,
+        default=3,
+        help="Maximum screenshots/enhanced images retained in model-visible context during rebases.",
+    )
+    parser.add_argument(
+        "--native-automation-mode",
+        choices=["off", "prefer", "require_fallback"],
+        default="prefer",
+        help="How strongly the agent should prefer Windows-native automation helpers over pixel fallback.",
+    )
+    parser.add_argument(
+        "--dialog-timeout-seconds",
+        type=float,
+        default=12.0,
+        help="Timeout for dialog-oriented waits and retries.",
+    )
+    parser.add_argument(
+        "--focus-timeout-seconds",
+        type=float,
+        default=8.0,
+        help="Timeout for focus-change waits and verification.",
+    )
+    parser.add_argument(
+        "--ui-element-timeout-seconds",
+        type=float,
+        default=8.0,
+        help="Timeout for native UI element lookup waits.",
+    )
+    parser.add_argument(
+        "--max-retries-per-surface",
+        type=int,
+        default=3,
+        help="Maximum repeated retries on the same classified window/dialog surface before the agent must pivot.",
+    )
+    parser.add_argument(
+        "--pretty-logs",
+        action="store_true",
+        help="Emit expanded multi-line tool call/result logs for easier debugging.",
+    )
    parser.add_argument("--disable-tool", action="append", default=[], help="Disable a tool by name.")
-    parser.add_argument("--skip-safety-check", action="store_true", help="Bypass pre-flight safety check.")
+    parser.add_argument(
+        "--skip-safety-check",
+        "--skip-safety-chec",
+        dest="skip_safety_check",
+        action="store_true",
+        help="Bypass pre-flight safety check.",
+    )
    parser.add_argument("--no-failsafe", action="store_true", help="Disable PyAutoGUI fail-safe.")
    return parser

@@ -57,7 +105,10 @@ def main(argv: list[str] | None = None) -> int:
        return 2

    model = args.model or config.default_model
-    disabled_tools = sorted({str(x).strip() for x in args.disable_tool if str(x).strip()})
+    try:
+        disabled_tools = normalize_disabled_tools(args.disable_tool)
+    except ValueError as exc:
+        parser.error(str(exc))

    if not args.skip_safety_check:
        safety_client = create_openai_client(config.openai_api_key)
@@ -92,7 +143,15 @@ def main(argv: list[str] | None = None) -> int:
        click_pause=args.click_pause,
        reasoning_effort=args.reasoning_effort,
        screen_context_decay_steps=max(0, int(args.screen_context_decay_steps)),
+        max_visual_context_images=max(0, int(args.max_visual_context_images)),
+        native_automation_mode=args.native_automation_mode,
+        dialog_timeout_seconds=max(0.5, float(args.dialog_timeout_seconds)),
+        focus_timeout_seconds=max(0.5, float(args.focus_timeout_seconds)),
+        ui_element_timeout_seconds=max(0.5, float(args.ui_element_timeout_seconds)),
+        max_retries_per_surface=max(1, int(args.max_retries_per_surface)),
+        pretty_logs=bool(args.pretty_logs),
        disable_tools=set(disabled_tools),
+        prohibited_key_combos=set(config.prohibited_key_combos),
    )
    try:
        result, artifacts = run_job(
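Review note: the reworked `--skip-safety-check` argument registers two option strings that share one `dest`. A minimal standalone sketch of that argparse behaviour follows, assuming a throwaway parser (the `prog` name and the `parse` helper are hypothetical and not part of `src/cli.py`).

```python
import argparse

# Throwaway parser that reproduces only the argument shown in the diff.
parser = argparse.ArgumentParser(prog="screenjob-demo")
parser.add_argument(
    "--skip-safety-check",
    "--skip-safety-chec",
    dest="skip_safety_check",
    action="store_true",
    help="Bypass pre-flight safety check.",
)


def parse(argv: list[str]) -> bool:
    """Return the parsed value of the shared destination."""
    return parser.parse_args(argv).skip_safety_check
```

Both spellings are exact matches for the same action, so either one sets `skip_safety_check` to `True`, and omitting both leaves it `False`.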
--- a/src/config.py
+++ b/src/config.py
@@ -14,6 +14,13 @@ def _env_bool(name: str, default: bool = False) -> bool:
    return raw.strip().lower() in {"1", "true", "yes", "on"}


+def _env_csv(name: str) -> list[str]:
+    raw = os.getenv(name)
+    if raw is None:
+        return []
+    return [item.strip() for item in raw.split(",") if item.strip()]
+
+
@dataclass(frozen=True)
 class AppConfig:
    openai_api_key: str
@@ -25,6 +32,7 @@ class AppConfig:
    port: int
    runs_dir: Path
    db_path: Path
+    prohibited_key_combos: tuple[str, ...] = ()


 def load_app_config(cwd: Path) -> AppConfig:
@@ -38,6 +46,7 @@ def load_app_config(cwd: Path) -> AppConfig:
    runs_dir = cwd / "screenjob_runs"
    db_path = cwd / "screenjob.db"
    disable_ui = _env_bool("DISABLE_UI", default=False)
+    prohibited_key_combos = tuple(_env_csv("SCREENJOB_PROHIBITED_KEY_COMBOS"))
    return AppConfig(
        openai_api_key=openai_api_key,
        screenjob_token=screenjob_token,
@@ -48,5 +57,5 @@ def load_app_config(cwd: Path) -> AppConfig:
        port=port,
        runs_dir=runs_dir,
        db_path=db_path,
+        prohibited_key_combos=prohibited_key_combos,
    )
-
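Review note: `_env_csv` trims each item and drops empty ones, so stray spaces and doubled commas in `SCREENJOB_PROHIBITED_KEY_COMBOS` are tolerated. The sketch below copies the helper under the hypothetical name `env_csv` so it runs on its own. The sample value and the `SCREENJOB_DEMO_UNSET_VAR` name are made up for illustration.

```python
import os


def env_csv(name: str) -> list[str]:
    """Standalone copy of the helper added in src/config.py."""
    raw = os.getenv(name)
    if raw is None:
        return []
    return [item.strip() for item in raw.split(",") if item.strip()]


# Sample value with extra whitespace and an empty item.
os.environ["SCREENJOB_PROHIBITED_KEY_COMBOS"] = " ctrl+shift+s, alt+f4 ,, "
combos = tuple(env_csv("SCREENJOB_PROHIBITED_KEY_COMBOS"))

# An unset variable yields an empty list rather than raising.
os.environ.pop("SCREENJOB_DEMO_UNSET_VAR", None)
missing = env_csv("SCREENJOB_DEMO_UNSET_VAR")
```

Note that the helper does not change case or validate the combos, so any normalization has to happen where the values are consumed.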
--- a/src/desktop_overlay.py
+++ b/src/desktop_overlay.py
@@ -0,0 +1,272 @@
+from __future__ import annotations
+
+import logging
+import os
+import queue
+import threading
+from dataclasses import dataclass
+from typing import Any
+
+
+@dataclass(frozen=True)
+class CompletionOverlayPayload:
+    job_id: str
+    objective: str
+    return_message: str
+    steps: int
+    elapsed_seconds: float
+
+
+class DesktopOverlayManager:
+    def __init__(self, logger: logging.Logger | None = None, *, auto_dismiss_seconds: float = 10.0) -> None:
+        self.logger = logger or logging.getLogger("screenjob.overlay")
+        self._queue: queue.Queue[CompletionOverlayPayload] = queue.Queue()
+        self._thread: threading.Thread | None = None
+        self._lock = threading.Lock()
+        self._ready = threading.Event()
+        self._disabled = False
+        self._warned = False
+        self._auto_dismiss_ms = max(0, int(round(float(auto_dismiss_seconds) * 1000)))
+
+    def show_completion(
+        self,
+        *,
+        job_id: str,
+        objective: str,
+        return_message: str,
+        steps: int,
+        elapsed_seconds: float,
+    ) -> None:
+        if os.name != "nt":
+            self._disable_once("Desktop completion HUD is only enabled on Windows.")
+            return
+        if not self._ensure_thread():
+            return
+        self._queue.put(
+            CompletionOverlayPayload(
+                job_id=job_id,
+                objective=objective,
+                return_message=return_message,
+                steps=max(0, int(steps)),
+                elapsed_seconds=max(0.0, float(elapsed_seconds)),
+            )
+        )
+
+    def _ensure_thread(self) -> bool:
+        with self._lock:
+            if self._disabled:
+                return False
+            if self._thread is None or not self._thread.is_alive():
+                self._ready.clear()
+                self._thread = threading.Thread(target=self._ui_main, name="screenjob-overlay", daemon=True)
+                self._thread.start()
+        self._ready.wait(timeout=2.0)
+        return not self._disabled
+
+    def _disable_once(self, reason: str) -> None:
+        with self._lock:
+            self._disabled = True
+            already_warned = self._warned
+            self._warned = True
+            self._ready.set()
+        if not already_warned:
+            self.logger.warning("%s Overlay notifications disabled.", reason)
+
+    def _format_elapsed(self, elapsed_seconds: float) -> str:
+        total_seconds = max(0, int(round(elapsed_seconds)))
+        minutes, seconds = divmod(total_seconds, 60)
+        hours, minutes = divmod(minutes, 60)
+        if hours:
+            return f"{hours}h {minutes}m {seconds}s"
+        if minutes:
+            return f"{minutes}m {seconds}s"
+        return f"{seconds}s"
+
+    def _shorten(self, text: str, limit: int) -> str:
+        raw = " ".join(str(text or "").split())
+        if len(raw) <= limit:
+            return raw
+        return raw[: max(0, limit - 3)].rstrip() + "..."
+
+    def _ui_main(self) -> None:
+        try:
+            import tkinter as tk
+        except Exception as exc:  # noqa: BLE001
+            self._disable_once(f"tkinter is unavailable ({type(exc).__name__}: {exc}).")
+            return
+
+        try:
+            root = tk.Tk()
+            root.withdraw()
+            root.update_idletasks()
+        except Exception as exc:  # noqa: BLE001
+            self._disable_once(f"Desktop overlay could not initialize ({type(exc).__name__}: {exc}).")
+            return
+
+        cards: list[dict[str, Any]] = []
+        self._ready.set()
+
+        def reposition() -> None:
+            screen_width = root.winfo_screenwidth()
+            top = 24
+            for entry in cards:
+                window = entry["window"]
+                if not bool(window.winfo_exists()):
+                    continue
+                window.update_idletasks()
+                width = max(320, int(window.winfo_width() or 360))
+                height = max(120, int(window.winfo_height() or 160))
+                left = max(12, screen_width - width - 24)
+                window.geometry(f"{width}x{height}+{left}+{top}")
+                top += height + 16
+
+        def dismiss(window: Any) -> None:
+            for index, entry in enumerate(list(cards)):
+                if entry["window"] is window:
+                    after_id = entry.get("after_id")
+                    if after_id is not None:
+                        try:
+                            window.after_cancel(after_id)
+                        except Exception:  # noqa: BLE001
+                            pass
+                    cards.pop(index)
+                    break
+            try:
+                if bool(window.winfo_exists()):
+                    window.destroy()
+            except Exception:  # noqa: BLE001
+                pass
+            if cards:
+                reposition()
+
+        def add_card(payload: CompletionOverlayPayload) -> None:
+            card = tk.Toplevel(root)
+            card.withdraw()
+            card.overrideredirect(True)
+            card.attributes("-topmost", True)
+            card.configure(bg="#0f172a")
+
+            frame = tk.Frame(card, bg="#0f172a", highlightthickness=1, highlightbackground="#22c55e", bd=0)
+            frame.pack(fill="both", expand=True)
+
+            close_button = tk.Button(
+                frame,
+                text="×",
+                command=lambda win=card: dismiss(win),
+                bg="#0f172a",
+                fg="#cbd5e1",
+                activebackground="#111827",
+                activeforeground="#ffffff",
+                relief="flat",
+                borderwidth=0,
+                font=("Segoe UI", 14, "bold"),
+                padx=6,
+                pady=0,
+            )
+            close_button.place(relx=1.0, x=-8, y=6, anchor="ne")
+
+            header = tk.Label(
+                frame,
+                text="Completed",
+                bg="#0f172a",
+                fg="#86efac",
+                font=("Segoe UI", 10, "bold"),
+                anchor="w",
+            )
+            header.pack(fill="x", padx=14, pady=(12, 2))
+
+            title = tk.Label(
+                frame,
+                text=self._shorten(payload.objective, 72) or "Job complete",
+                bg="#0f172a",
+                fg="#f8fafc",
+                font=("Segoe UI", 11, "bold"),
+                justify="left",
+                wraplength=320,
+                anchor="w",
+            )
+            title.pack(fill="x", padx=14)
+
+            job_row = tk.Label(
+                frame,
+                text=f"Job {payload.job_id}",
+                bg="#0f172a",
+                fg="#94a3b8",
+                font=("Segoe UI", 9),
+                justify="left",
+                anchor="w",
+            )
+            job_row.pack(fill="x", padx=14, pady=(2, 8))
+
+            message = tk.Label(
+                frame,
+                text=self._shorten(payload.return_message, 180) or "Task completed.",
+                bg="#0f172a",
+                fg="#e2e8f0",
+                font=("Segoe UI", 9),
+                justify="left",
+                wraplength=320,
+                anchor="w",
+            )
+            message.pack(fill="x", padx=14)
+
+            footer = tk.Label(
+                frame,
+                text=f"{payload.steps} step(s)  |  {self._format_elapsed(payload.elapsed_seconds)}",
+                bg="#0f172a",
+                fg="#94a3b8",
+                font=("Segoe UI", 9),
+                justify="left",
+                anchor="w",
+            )
+            footer.pack(fill="x", padx=14, pady=(10, 12))
+
+            after_id = None
+            if self._auto_dismiss_ms > 0:
+                after_id = card.after(self._auto_dismiss_ms, lambda win=card: dismiss(win))
+
+            cards.insert(0, {"window": card, "after_id": after_id})
+            while len(cards) > 3:
+                stale = cards.pop()
+                try:
+                    stale_after_id = stale.get("after_id")
+                    if stale_after_id is not None:
+                        stale["window"].after_cancel(stale_after_id)
+                    stale["window"].destroy()
+                except Exception:  # noqa: BLE001
+                    pass
+
+            card.update_idletasks()
+            reposition()
+            card.deiconify()
+
+        def pump_queue() -> None:
+            try:
+                while True:
+                    add_card(self._queue.get_nowait())
+            except queue.Empty:
+                pass
+            try:
+                root.after(120, pump_queue)
+            except Exception:  # noqa: BLE001
+                self._disable_once("Desktop overlay event loop stopped unexpectedly.")
+
+        pump_queue()
+        try:
+            root.mainloop()
+        except Exception as exc:  # noqa: BLE001
+            self._disable_once(f"Desktop overlay main loop failed ({type(exc).__name__}: {exc}).")
+
+
+_overlay_singleton: DesktopOverlayManager | None = None
+_overlay_lock = threading.Lock()
+
+
+def get_desktop_overlay_manager(logger: logging.Logger | None = None) -> DesktopOverlayManager:
+    global _overlay_singleton
+    with _overlay_lock:
+        if _overlay_singleton is None:
+            _overlay_singleton = DesktopOverlayManager(logger=logger)
+        elif logger is not None:
+            _overlay_singleton.logger = logger
+        return _overlay_singleton
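Review note: `pump_queue` hands payloads from worker threads to the Tk thread by draining a `queue.Queue` without blocking. The sketch below isolates that drain step, assuming a simplified `Payload` dataclass as a hypothetical stand-in for `CompletionOverlayPayload`. It leaves out tkinter and the `root.after` rescheduling entirely.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class Payload:
    """Hypothetical, trimmed-down stand-in for CompletionOverlayPayload."""

    job_id: str
    steps: int


def drain(pending: "queue.Queue[Payload]") -> list[Payload]:
    """Pop everything currently queued; never block the caller."""
    drained: list[Payload] = []
    try:
        while True:
            drained.append(pending.get_nowait())
    except queue.Empty:
        pass  # queue is empty, the caller can schedule the next poll
    return drained


pending: "queue.Queue[Payload]" = queue.Queue()
pending.put(Payload("job_1", 4))
pending.put(Payload("job_2", 9))

first_pass = drain(pending)   # both payloads, in FIFO order
second_pass = drain(pending)  # nothing left to show
```

Because `get_nowait` raises `queue.Empty` instead of waiting, the UI thread returns to its event loop immediately and only pays the polling interval (120 ms in the diff) as latency.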
--- a/src/models.py
+++ b/src/models.py
@@ -60,4 +60,12 @@ class RuntimeOptions:
    click_pause: float = 0.10
    reasoning_effort: str = "medium"
    screen_context_decay_steps: int = 4
+    max_visual_context_images: int = 3
+    native_automation_mode: str = "prefer"
+    dialog_timeout_seconds: float = 12.0
+    focus_timeout_seconds: float = 8.0
+    ui_element_timeout_seconds: float = 8.0
+    max_retries_per_surface: int = 3
+    pretty_logs: bool = False
    disable_tools: set[str] | None = None
+    prohibited_key_combos: set[str] | None = None
--- a/src/server.py
+++ b/src/server.py
@@ -12,6 +12,7 @@ from fastapi.responses import FileResponse
 from fastapi.responses import HTMLResponse, JSONResponse
 from pydantic import BaseModel, Field

+from .agent import normalize_disabled_tools
 from .config import AppConfig, load_app_config
 from .storage import HistoryDB
 from .task_manager import JobManager
@@ -28,6 +29,13 @@ class CreateJobRequest(BaseModel):
    click_pause: float = Field(0.10, ge=0.0, le=2.0)
    reasoning_effort: str = Field("medium", pattern="^(low|medium|high)$")
    screen_context_decay_steps: int = Field(4, ge=0, le=50)
+    max_visual_context_images: int = Field(3, ge=0, le=12)
+    native_automation_mode: str = Field("prefer", pattern="^(off|prefer|require_fallback)$")
+    dialog_timeout_seconds: float = Field(12.0, ge=0.5, le=120.0)
+    focus_timeout_seconds: float = Field(8.0, ge=0.5, le=120.0)
+    ui_element_timeout_seconds: float = Field(8.0, ge=0.5, le=120.0)
+    max_retries_per_surface: int = Field(3, ge=1, le=10)
+    pretty_logs: bool = False
    disabled_tools: list[str] = Field(default_factory=list)
    safety_override: bool = False
    no_failsafe: bool = False
@@ -297,6 +305,8 @@ def create_app(config: AppConfig | None = None) -> FastAPI:

    @app.post("/api/jobs")
    def create_job(payload: CreateJobRequest, _: None = Depends(require_token)) -> dict[str, str]:
+        try:
+            disabled_tools = normalize_disabled_tools(payload.disabled_tools)
            job_id = manager.submit_job(
                objective=payload.job,
                model=payload.model,
@@ -306,10 +316,19 @@ def create_app(config: AppConfig | None = None) -> FastAPI:
                click_pause=payload.click_pause,
                reasoning_effort=payload.reasoning_effort,
                screen_context_decay_steps=payload.screen_context_decay_steps,
-            disabled_tools=payload.disabled_tools,
+                max_visual_context_images=payload.max_visual_context_images,
+                native_automation_mode=payload.native_automation_mode,
+                dialog_timeout_seconds=payload.dialog_timeout_seconds,
+                focus_timeout_seconds=payload.focus_timeout_seconds,
+                ui_element_timeout_seconds=payload.ui_element_timeout_seconds,
+                max_retries_per_surface=payload.max_retries_per_surface,
+                pretty_logs=payload.pretty_logs,
+                disabled_tools=disabled_tools,
                safety_override=payload.safety_override,
                no_failsafe=payload.no_failsafe,
            )
+        except ValueError as exc:
+            raise HTTPException(status_code=400, detail=str(exc)) from exc
        return {"job_id": job_id}

    @app.get("/api/jobs")
--- a/src/task_manager.py
+++ b/src/task_manager.py
@@ -8,7 +8,9 @@ from dataclasses import dataclass
 from pathlib import Path
 from typing import Any, Callable

+from .agent import normalize_disabled_tools
 from .config import AppConfig
+from .desktop_overlay import DesktopOverlayManager, get_desktop_overlay_manager
 from .models import RuntimeOptions
 from .runtime import create_openai_client, run_job
 from .safety import assess_task_safety
@@ -32,10 +34,12 @@ class JobManager:
        config: AppConfig,
        db: HistoryDB,
        broadcast: Callable[[dict[str, Any]], None] | None = None,
+        overlay_manager: DesktopOverlayManager | None = None,
    ) -> None:
        self.config = config
        self.db = db
        self.broadcast = broadcast
+        self.overlay_manager = overlay_manager or get_desktop_overlay_manager()
        self._running: dict[str, _RunningJob] = {}
        self._lock = threading.Lock()

@@ -50,6 +54,13 @@ class JobManager:
        click_pause: float = 0.10,
        reasoning_effort: str = "medium",
        screen_context_decay_steps: int = 4,
+        max_visual_context_images: int = 3,
+        native_automation_mode: str = "prefer",
+        dialog_timeout_seconds: float = 12.0,
+        focus_timeout_seconds: float = 8.0,
+        ui_element_timeout_seconds: float = 8.0,
+        max_retries_per_surface: int = 3,
+        pretty_logs: bool = False,
        disabled_tools: list[str] | None = None,
        safety_override: bool = False,
        no_failsafe: bool = False,
@@ -57,7 +68,7 @@ class JobManager:
        job_id = f"job_{int(time.time())}_{uuid.uuid4().hex[:8]}"
        created_at = utc_now_iso()
        selected_model = (model or self.config.default_model).strip() or self.config.default_model
-        disabled = sorted({tool.strip() for tool in (disabled_tools or []) if tool.strip()})
+        disabled = normalize_disabled_tools(disabled_tools)
        self.db.create_job(
            job_id=job_id,
            objective=objective,
@@ -97,6 +108,13 @@ class JobManager:
                "click_pause": click_pause,
                "reasoning_effort": reasoning_effort,
                "screen_context_decay_steps": screen_context_decay_steps,
+                "max_visual_context_images": max_visual_context_images,
+                "native_automation_mode": native_automation_mode,
+                "dialog_timeout_seconds": dialog_timeout_seconds,
+                "focus_timeout_seconds": focus_timeout_seconds,
+                "ui_element_timeout_seconds": ui_element_timeout_seconds,
+                "max_retries_per_surface": max_retries_per_surface,
+                "pretty_logs": pretty_logs,
                "no_failsafe": no_failsafe,
                "cancel_event": cancel_event,
            },
@@ -127,6 +145,13 @@ class JobManager:
        click_pause: float,
        reasoning_effort: str,
        screen_context_decay_steps: int,
+        max_visual_context_images: int,
+        native_automation_mode: str,
+        dialog_timeout_seconds: float,
+        focus_timeout_seconds: float,
+        ui_element_timeout_seconds: float,
+        max_retries_per_surface: int,
+        pretty_logs: bool,
        no_failsafe: bool,
        cancel_event: threading.Event,
    ) -> None:
@@ -226,7 +251,15 @@ class JobManager:
            click_pause=click_pause,
            reasoning_effort=reasoning_effort,
            screen_context_decay_steps=max(0, int(screen_context_decay_steps)),
+            max_visual_context_images=max(0, int(max_visual_context_images)),
+            native_automation_mode=str(native_automation_mode or "prefer").strip().lower() or "prefer",
+            dialog_timeout_seconds=max(0.5, float(dialog_timeout_seconds)),
+            focus_timeout_seconds=max(0.5, float(focus_timeout_seconds)),
+            ui_element_timeout_seconds=max(0.5, float(ui_element_timeout_seconds)),
+            max_retries_per_surface=max(1, int(max_retries_per_surface)),
+            pretty_logs=bool(pretty_logs),
            disable_tools=set(disabled_tools),
+            prohibited_key_combos=set(self.config.prohibited_key_combos),
        )
        try:
            result, artifacts = run_job(
@@ -297,6 +330,14 @@ class JobManager:
                },
            },
        )
+        if status == "completed":
+            self.overlay_manager.show_completion(
+                job_id=job_id,
+                objective=objective,
+                return_message=result.return_message,
+                steps=result.steps,
+                elapsed_seconds=max(0.0, float(result.ended_at - result.started_at)),
+            )
        with self._lock:
            self._running.pop(job_id, None)

--- a/start_backend.ps1
+++ b/start_backend.ps1
@@ -15,10 +15,76 @@ function Test-EnvVarLine {
    return [bool](Select-String -Path $FilePath -Pattern ("^\s*" + [regex]::Escape($Name) + "=") -Quiet)
 }

-if (-not (Get-Command python -ErrorAction SilentlyContinue)) {
-    throw "Python was not found in PATH. Install Python 3.11+ and retry."
+function Resolve-PythonExecutable {
+    $venvPython = Join-Path $scriptDir ".venv\Scripts\python.exe"
+    if (Test-Path -LiteralPath $venvPython) {
+        return $venvPython
+    }
+
+    $pythonCmd = Get-Command python -ErrorAction SilentlyContinue
+    if ($null -ne $pythonCmd -and (Test-Path -LiteralPath $pythonCmd.Source)) {
+        return $pythonCmd.Source
+    }
+
+    $candidatePyLaunchers = @()
+    $pyFromPath = Get-Command py -ErrorAction SilentlyContinue
+    if ($null -ne $pyFromPath -and (Test-Path -LiteralPath $pyFromPath.Source)) {
+        $candidatePyLaunchers += $pyFromPath.Source
+    }
+    $candidatePyLaunchers += "C:\Windows\py.exe"
+
+    if ($scriptDir -match "^[A-Za-z]:\\Users\\[^\\]+") {
+        $repoUserHome = $Matches[0]
+        $candidatePyLaunchers += (Join-Path $repoUserHome "AppData\Local\Programs\Python\Launcher\py.exe")
+    }
+
+    foreach ($pyLauncher in ($candidatePyLaunchers | Select-Object -Unique)) {
+        if (-not (Test-Path -LiteralPath $pyLauncher)) {
+            continue
+        }
+        try {
+            $resolved = (& $pyLauncher -3 -c "import sys; print(sys.executable)" 2>$null | Select-Object -Last 1).Trim()
+            if ($resolved -and (Test-Path -LiteralPath $resolved)) {
+                return $resolved
+            }
+        } catch {
+            continue
+        }
+    }
+
+    $candidatePythonPaths = @()
+    if ($scriptDir -match "^[A-Za-z]:\\Users\\[^\\]+") {
+        $repoUserHome = $Matches[0]
+        $pythonBase = Join-Path $repoUserHome "AppData\Local\Programs\Python"
+        if (Test-Path -LiteralPath $pythonBase) {
+            $candidatePythonPaths += (Get-ChildItem -LiteralPath $pythonBase -Directory -ErrorAction SilentlyContinue |
+                Sort-Object Name -Descending |
+                ForEach-Object { Join-Path $_.FullName "python.exe" })
+        }
+    }
+
+    $candidatePythonPaths += @(
+        "C:\Python314\python.exe",
+        "C:\Python313\python.exe",
+        "C:\Python312\python.exe",
+        "C:\Python311\python.exe",
+        "C:\Program Files\Python314\python.exe",
+        "C:\Program Files\Python313\python.exe",
+        "C:\Program Files\Python312\python.exe",
+        "C:\Program Files\Python311\python.exe"
+    )
+
+    foreach ($candidate in ($candidatePythonPaths | Select-Object -Unique)) {
+        if (Test-Path -LiteralPath $candidate) {
+            return $candidate
+        }
+    }
+
+    throw "Python was not found. Install Python 3.11+ system-wide, or create .venv in the repo root."
 }

+$pythonExe = Resolve-PythonExecutable
+
 $envFile = Join-Path $scriptDir ".env"
 if (-not (Test-Path -LiteralPath $envFile)) {
    Write-Warning ".env was not found at $envFile. Server startup may fail if required vars are missing."
@@ -31,5 +97,5 @@ if (-not (Test-Path -LiteralPath $envFile)) {
    }
 }

-Write-Host "Starting ScreenJob backend on configured host/port..." -ForegroundColor Cyan
-python main.py server
+Write-Host "Starting ScreenJob backend with Python: $pythonExe" -ForegroundColor Cyan
+& $pythonExe main.py server
--- a/start_backend_hidden.vbs
+++ b/start_backend_hidden.vbs
@@ -0,0 +1,11 @@
+Option Explicit
+
+Dim shell, fso, scriptDir, psScript, command
+Set shell = CreateObject("WScript.Shell")
+Set fso = CreateObject("Scripting.FileSystemObject")
+
+scriptDir = fso.GetParentFolderName(WScript.ScriptFullName)
+psScript = """" & fso.BuildPath(scriptDir, "start_backend.ps1") & """"
+
+command = "powershell.exe -NoProfile -ExecutionPolicy Bypass -WindowStyle Hidden -STA -File " & psScript
+shell.Run command, 0, False
--- a/start_screenjob_tray_hidden.vbs
+++ b/start_screenjob_tray_hidden.vbs
@@ -0,0 +1,11 @@
+Option Explicit
+
+Dim shell, fso, scriptDir, psScript, command
+Set shell = CreateObject("WScript.Shell")
+Set fso = CreateObject("Scripting.FileSystemObject")
+
+scriptDir = fso.GetParentFolderName(WScript.ScriptFullName)
+psScript = """" & fso.BuildPath(scriptDir, "screenjob_tray.ps1") & """"
+
+command = "powershell.exe -NoProfile -ExecutionPolicy Bypass -WindowStyle Hidden -STA -File " & psScript
+shell.Run command, 0, False
--- a/tests/test_agent_tools.py
+++ b/tests/test_agent_tools.py
@@ -1,8 +1,11 @@
 from __future__ import annotations

+import json
 import logging
 from pathlib import Path
+from typing import Any

+import pytest
 from PIL import Image

 import src.agent as agent_module
@@ -15,8 +18,12 @@ class _DummyPyAutoGUI:

    def __init__(self) -> None:
        self.last_move_to: tuple[int, int] | None = None
-        self.last_click: tuple[int, int] | None = None
+        self.last_move_duration: float | None = None
+        self.last_click: dict[str, object] | None = None
        self.last_hotkey: tuple[str, ...] | None = None
+        self.last_drag_to: dict[str, object] | None = None
+        self.last_scroll: int | None = None
+        self.current_position: tuple[int, int] = (640, 360)

    def screenshot(self) -> Image.Image:
        return Image.new("RGB", (1280, 720), color=(24, 24, 24))
@@ -26,9 +33,26 @@ class _DummyPyAutoGUI:

    def moveTo(self, x: int, y: int, duration: float = 0.0) -> None:  # noqa: N802
        self.last_move_to = (x, y)
+        self.last_move_duration = duration
+        self.current_position = (x, y)

-    def click(self, x: int, y: int) -> None:
-        self.last_click = (x, y)
+    def click(
+        self,
+        x: int,
+        y: int,
+        clicks: int = 1,
+        interval: float = 0.0,
+        button: str = "left",
+    ) -> None:
+        self.last_click = {"x": x, "y": y, "clicks": clicks, "interval": interval, "button": button}
+        self.current_position = (x, y)
+
+    def dragTo(self, x: int, y: int, duration: float = 0.0, button: str = "left") -> None:  # noqa: N802
+        self.last_drag_to = {"x": x, "y": y, "duration": duration, "button": button}
+        self.current_position = (x, y)
+
+    def scroll(self, amount: int) -> None:
+        self.last_scroll = amount

    def write(self, _: str, interval: float = 0.0) -> None:
        return None
@@ -39,6 +63,10 @@ class _DummyPyAutoGUI:
    def hotkey(self, *keys: str) -> None:
        self.last_hotkey = tuple(keys)

+    def position(self):
+        x, y = self.current_position
+        return type("Point", (), {"x": x, "y": y})()
+

 def _build_agent(tmp_path: Path, monkeypatch) -> agent_module.ScreenJobAgent:
    dummy_gui = _DummyPyAutoGUI()
@@ -84,11 +112,158 @@ def test_click_supports_directional_offsets(tmp_path: Path, monkeypatch) -> None
            "offset_up": "2px",
            "offset_right": 7,
            "offset": {"x": 3, "y": 4},
+            "button": "right",
+            "click_count": 2,
+            "interval_seconds": "0.5s",
+            "duration_seconds": "0.2s",
            "sleep_after_seconds": 0,
        }
    )
    assert click_result["ok"] is True
    assert click_result["clicked"] == {"x": 110, "y": 102}
+    assert click_result["button"] == "right"
+    assert click_result["click_count"] == 2
+    assert click_result["interval_seconds"] == 0.5
+    assert click_result["duration_seconds"] == 0.2
+    assert agent_module.pyautogui.last_click == {
+        "x": 110,
+        "y": 102,
+        "clicks": 2,
+        "interval": 0.5,
+        "button": "right",
+    }
+
+
+def test_scroll_supports_direction_and_amount(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    result = agent._tool_scroll(
+        {
+            "amount": 8,
+            "direction": "down",
+            "coordinate": {"x": 1400, "y": -5},
+            "sleep_after_seconds": 0,
+        }
+    )
+
+    assert result["ok"] is True
+    assert result["amount"] == -8
+    assert result["direction"] == "down"
+    assert result["moved_to"] == {"x": 1279, "y": 0}
+    assert agent_module.pyautogui.last_scroll == -8
+
+
+def test_drag_translates_coordinates_and_button(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    result = agent._tool_drag(
+        {
+            "start_coordinate": {"x": -10, "y": 100},
+            "end_coordinate": {"x": 1285, "y": 800},
+            "button": "middle",
+            "duration_seconds": "0.3s",
+            "sleep_after_seconds": 0,
+        }
+    )
+
+    assert result["ok"] is True
+    assert result["from"] == {"x": 0, "y": 100}
+    assert result["to"] == {"x": 1279, "y": 719}
+    assert result["button"] == "middle"
+    assert result["duration_seconds"] == 0.3
+    assert agent_module.pyautogui.last_drag_to == {
+        "x": 1279,
+        "y": 719,
+        "duration": 0.3,
+        "button": "middle",
+    }
+
+
+def test_move_mouse_clamps_target_coordinate(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    result = agent._tool_move_mouse({"coordinate": {"x": 1500, "y": -5}, "duration_seconds": "0.4s"})
+
+    assert result["ok"] is True
+    assert result["moved_to"] == {"x": 1279, "y": 0}
+    assert result["duration_seconds"] == 0.4
+    assert agent_module.pyautogui.last_move_to == (1279, 0)
+
+
+def test_clipboard_get_and_set_round_trip(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    state = {"text": ""}
+    monkeypatch.setattr(agent, "_clipboard_set_text", lambda text: state.__setitem__("text", text))
+    monkeypatch.setattr(agent, "_clipboard_get_text", lambda: state["text"])
+    monkeypatch.setattr(
+        agent,
+        "_clipboard_get_metadata",
+        lambda: {"has_text": bool(state["text"]), "has_image": True, "available_formats": ["CF_UNICODETEXT", "CF_DIB"]},
+    )
+
+    set_result = agent._tool_clipboard_set({"text": "hello clipboard"})
+    get_result = agent._tool_clipboard_get({})
+
+    assert set_result["ok"] is True
+    assert set_result["length"] == 15
+    assert get_result["ok"] is True
+    assert get_result["text"] == "hello clipboard"
+    assert get_result["length"] == 15
+    assert get_result["has_text"] is True
+    assert get_result["has_image"] is True
+    assert get_result["available_formats"] == ["CF_UNICODETEXT", "CF_DIB"]
+
+
+def test_clipboard_set_falls_back_to_powershell_when_native_path_fails(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    state = {"text": ""}
+
+    def fail_native(_: str) -> None:
+        raise OSError("[WinError 6] The handle is invalid.")
+
+    def shell_fallback(text: str) -> None:
+        state["text"] = text
+
+    monkeypatch.setattr(agent, "_clipboard_set_text", fail_native)
+    monkeypatch.setattr(agent, "_clipboard_set_text_via_shell", shell_fallback)
+
+    result = agent._tool_clipboard_set({"text": "Example Domain"})
+
+    assert result["ok"] is True
+    assert result["used_shell_fallback"] is True
+    assert "WinError 6" in result["native_error"]
+    assert state["text"] == "Example Domain"
+
+
+def test_get_cursor_position_returns_current_mouse_location(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent_module.pyautogui.current_position = (321, 654)
+
+    result = agent._tool_get_cursor_position({})
+
+    assert result["ok"] is True
+    assert result["position"] == {"x": 321, "y": 654}
+
+
+def test_get_active_window_returns_metadata_shape(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    monkeypatch.setattr(
+        agent,
+        "_get_active_window_info",
+        lambda: {
+            "available": True,
+            "hwnd": 1234,
+            "title": "Settings",
+            "class_name": "ApplicationFrameWindow",
+            "thread_id": 44,
+            "process_id": 77,
+            "is_visible": True,
+            "rect": {"left": 10, "top": 20, "right": 410, "bottom": 320, "width": 400, "height": 300},
+        },
+    )
+
+    result = agent._tool_get_active_window({})
+
+    assert result["ok"] is True
+    assert result["window"]["title"] == "Settings"
+    assert result["window"]["rect"]["width"] == 400


 def test_enhance_defaults_to_small_ui_preset(tmp_path: Path, monkeypatch) -> None:
@@ -135,6 +310,32 @@ def test_press_key_supports_hotkey_combo(tmp_path: Path, monkeypatch) -> None:
    assert agent_module.pyautogui.last_hotkey == ("win", "r")


+def test_press_key_blocks_prohibited_combo(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent.options.prohibited_key_combos = {"ctrl+shift+s"}
+    agent.prohibited_key_combos = agent._normalize_prohibited_key_combos(agent.options.prohibited_key_combos)
+
+    result = agent._tool_press_key({"key": "ctrl+shift+s"})
+
+    assert result["ok"] is False
+    assert result["blocked"] is True
+    assert result["key"] == "ctrl+shift+s"
+    assert "prohibited by runtime configuration" in result["error"]
+    assert "another allowed route" in result["hint"]
+
+
+def test_press_key_blocks_prohibited_combo_after_alias_normalization(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent.options.prohibited_key_combos = {"meta+r"}
+    agent.prohibited_key_combos = agent._normalize_prohibited_key_combos(agent.options.prohibited_key_combos)
+
+    result = agent._tool_press_key({"key": "win+r"})
+
+    assert result["ok"] is False
+    assert result["blocked"] is True
+    assert result["key"] == "win+r"
+
+
 def test_context_compaction_trigger_and_payload(tmp_path: Path, monkeypatch) -> None:
    agent = _build_agent(tmp_path, monkeypatch)
    agent.objective = "Open settings app"
@@ -147,7 +348,596 @@ def test_context_compaction_trigger_and_payload(tmp_path: Path, monkeypatch) ->
    agent.last_screen_meta = {"width": 1280, "height": 720, "path": "C:/tmp/frame.png"}

    assert agent._should_compact_context() is True
-    compacted = agent._build_compacted_pending_input()
+    visual_message = agent._build_visual_message("Current screen", "data:image/png;base64,abc", agent.last_screen_meta)
+    agent._register_visual_context_message(visual_message, agent.last_screen_meta, tool_name="see_screen")
+    compacted = agent._build_compacted_pending_input("decay")
    assert len(compacted) == 2
-    assert "Context compaction activated" in compacted[0]["content"][0]["text"]
+    assert "Context compaction activated due to stale context decay." in compacted[0]["content"][0]["text"]
    assert "Open settings app" in compacted[0]["content"][0]["text"]
+    assert "Treat prior reasoning as stale" in compacted[0]["content"][0]["text"]
+    assert "Retained visual observations:" in compacted[0]["content"][0]["text"]
+    assert "do not call see_screen again only because compaction happened" in compacted[0]["content"][0]["text"]
+    assert "observe -> decide -> act -> verify" in compacted[0]["content"][0]["text"]
+
+
+def test_context_compaction_drops_function_call_outputs_from_rebased_input(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent.objective = "Open settings app"
+    visual_meta = {"path": "C:/tmp/frame.png"}
+    visual_message = agent._build_visual_message("Current screen", "data:image/png;base64,abc", visual_meta)
+    agent._register_visual_context_message(visual_message, visual_meta, tool_name="see_screen")
+
+    compacted = agent._build_compacted_pending_input(
+        "decay",
+        carryover_items=[
+            {"type": "function_call_output", "call_id": "call_123", "output": "{\"ok\": true}"},
+            {"role": "user", "content": [{"type": "input_text", "text": "blocked hint"}]},
+        ],
+    )
+
+    assert len(compacted) == 3
+    assert compacted[1]["role"] == "user"
+    assert compacted[1]["content"][0]["text"] == "blocked hint"
+    assert all(item.get("type") != "function_call_output" for item in compacted)
+
+
+def test_visual_context_budget_keeps_only_latest_three_images(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent.options.max_visual_context_images = 3
+
+    captured_times = [
+        "2026-05-30T10:00:03+00:00",
+        "2026-05-30T10:00:01+00:00",
+        "2026-05-30T10:00:04+00:00",
+        "2026-05-30T10:00:02+00:00",
+    ]
+    for idx, captured_at in enumerate(captured_times):
+        meta = {"path": f"C:/tmp/frame_{idx}.png", "captured_at": captured_at}
+        message = agent._build_visual_message(f"frame {idx}", f"data:image/png;base64,{idx}", meta)
+        agent._register_visual_context_message(message, meta, tool_name="see_screen")
+
+    assert agent.visual_context_overflow_pending is True
+    assert [entry["meta"]["path"] for entry in agent.visual_context_messages] == [
+        "C:/tmp/frame_3.png",
+        "C:/tmp/frame_0.png",
+        "C:/tmp/frame_2.png",
+    ]
+
+
+def test_compacted_input_uses_latest_visuals_by_capture_time(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent.options.max_visual_context_images = 3
+    agent.objective = "Verify the current app window"
+
+    for idx, captured_at in enumerate(
+        [
+            "2026-05-30T10:00:04+00:00",
+            "2026-05-30T10:00:01+00:00",
+            "2026-05-30T10:00:03+00:00",
+            "2026-05-30T10:00:02+00:00",
+        ]
+    ):
+        meta = {"path": f"C:/tmp/frame_{idx}.png", "captured_at": captured_at}
+        message = agent._build_visual_message(f"frame {idx}", f"data:image/png;base64,{idx}", meta)
+        agent._register_visual_context_message(message, meta, tool_name="see_screen")
+
+    compacted = agent._build_compacted_pending_input("visual_budget")
+    visual_messages = [
+        item
+        for item in compacted
+        if isinstance(item.get("content"), list)
+        and any(part.get("type") == "input_image" for part in item["content"] if isinstance(part, dict))
+    ]
+
+    assert len(visual_messages) == 3
+    assert [
+        json.loads(message["content"][0]["text"].split("Metadata: ", 1)[1].split("\n", 1)[0])["path"]
+        for message in visual_messages
+    ] == [
+        "C:/tmp/frame_3.png",
+        "C:/tmp/frame_2.png",
+        "C:/tmp/frame_0.png",
+    ]
+
+
+def test_context_compaction_event_includes_visual_budget_reason_and_paths(tmp_path: Path, monkeypatch) -> None:
+    events: list[dict[str, object]] = []
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent.event_callback = events.append
+    agent.step = 5
+    agent.recent_tool_summaries = ["step=4 tool=enhance status=ok"]
+    agent.visual_context_messages = [
+        {"message": {"role": "user", "content": []}, "meta": {"path": "C:/tmp/1.png"}},
+        {"message": {"role": "user", "content": []}, "meta": {"path": "C:/tmp/2.png"}},
+        {"message": {"role": "user", "content": []}, "meta": {"path": "C:/tmp/3.png"}},
+    ]
+
+    agent._emit_context_compacted("visual_budget")
+
+    assert events[-1]["event_type"] == "context_compacted"
+    payload = events[-1]["payload"]
+    assert payload["rebuild_reason"] == "visual_budget"
+    assert payload["visual_context_paths"] == ["C:/tmp/1.png", "C:/tmp/2.png", "C:/tmp/3.png"]
+
+
+def test_observation_loop_blocks_repeated_broad_reobservation(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent.step_history = [
+        {
+            "step": 21,
+            "tool_names": ["get_active_window", "see_screen"],
+            "window_signature": "123|#32770|Save as",
+            "window_summary": "Save as [#32770]",
+            "had_visual": True,
+        },
+        {
+            "step": 22,
+            "tool_names": ["get_active_window", "see_screen"],
+            "window_signature": "123|#32770|Save as",
+            "window_summary": "Save as [#32770]",
+            "had_visual": True,
+        },
+        {
+            "step": 23,
+            "tool_names": ["get_active_window", "see_screen"],
+            "window_signature": "123|#32770|Save as",
+            "window_summary": "Save as [#32770]",
+            "had_visual": True,
+        },
+    ]
+
+    blocked = agent._dispatch_tool("see_screen", {})
+
+    assert blocked["ok"] is False
+    assert blocked["blocked"] is True
+    assert blocked["blocked_reason"] == "observation_loop"
+    assert "unchanged foreground window" in blocked["error"]
+    assert blocked["window_summary"] == "Save as [#32770]"
+
+
+def test_repeated_ambiguous_action_requires_verification_and_then_blocks(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    type_args = {"text": "repeat me"}
+
+    first = agent._dispatch_tool("type", type_args)
+    assert first["ok"] is True
+    assert first["verification_required"] is True
+    assert first["verification_channels"] == ["enhance", "get_active_window", "see_screen"]
+
+    blocked_without_verification = agent._dispatch_tool("type", type_args)
+    assert blocked_without_verification["blocked"] is True
+    assert "see_screen" in blocked_without_verification["error"]
+
+    assert agent._dispatch_tool("see_screen", {})["ok"] is True
+    assert agent._dispatch_tool("type", type_args)["ok"] is True
+    assert agent._dispatch_tool("see_screen", {})["ok"] is True
+    assert agent._dispatch_tool("type", type_args)["ok"] is True
+    assert agent._dispatch_tool("see_screen", {})["ok"] is True
+
+    blocked_after_retry_budget = agent._dispatch_tool("type", type_args)
+    assert blocked_after_retry_budget["blocked"] is True
+    assert "3 time(s) on the same surface" in blocked_after_retry_budget["error"]
+
+    assert agent._dispatch_tool("see_screen", {})["ok"] is True
+    reset_attempt = agent._dispatch_tool("type", type_args)
+    assert reset_attempt["ok"] is True
+
+
+def test_copy_shortcut_prefers_clipboard_verification(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    monkeypatch.setattr(
+        agent,
+        "_clipboard_get_metadata",
+        lambda: {"has_text": True, "has_image": False, "available_formats": ["CF_UNICODETEXT"]},
+    )
+    monkeypatch.setattr(agent, "_clipboard_get_text", lambda: "copied")
+
+    first = agent._dispatch_tool("press_key", {"key": "ctrl+c"})
+    assert first["ok"] is True
+    assert first["verification_channels"] == ["clipboard_get"]
+
+    blocked = agent._dispatch_tool("press_key", {"key": "ctrl+c"})
+    assert blocked["blocked"] is True
+    assert "clipboard_get" in blocked["error"]
+
+    observed = agent._dispatch_tool("clipboard_get", {})
+    assert observed["ok"] is True
+    assert observed["has_text"] is True
+
+    second = agent._dispatch_tool("press_key", {"key": "ctrl+c"})
+    assert second["ok"] is True
+
+
+def test_execute_command_blocks_unrequested_recursive_file_search(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent.objective = "Save the current note in Notepad"
+
+    result = agent._tool_execute_command({"command": "Get-ChildItem -Recurse -Filter *.txt"})
+
+    assert result["ok"] is False
+    assert result["blocked"] is True
+    assert "out of scope" in result["error"]
+
+
+def test_execute_command_allows_recursive_file_search_when_objective_requests_it(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent.objective = "Find the saved text file path"
+
+    called: dict[str, Any] = {}
+
+    class _FakeProcess:
+        returncode = 0
+
+        def poll(self) -> int:
+            return 0
+
+        def communicate(self, timeout: int = 2):
+            return ("ok", "")
+
+    def fake_popen(*args, **kwargs):
+        called["command"] = args[0]
+        return _FakeProcess()
+
+    monkeypatch.setattr(agent_module.subprocess, "Popen", fake_popen)
+
+    result = agent._tool_execute_command({"command": "Get-ChildItem -Recurse -Filter *.txt"})
+
+    assert result["ok"] is True
+    assert called["command"] == "Get-ChildItem -Recurse -Filter *.txt"
+
+
+def test_execute_command_launch_requires_focus_verification(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    called: dict[str, Any] = {}
+
+    class _FakeProcess:
+        returncode = 0
+
+        def poll(self) -> int:
+            return 0
+
+        def communicate(self, timeout: int = 2):
+            return ("", "")
+
+    def fake_popen(*args, **kwargs):
+        called["command"] = args[0]
+        return _FakeProcess()
+
+    monkeypatch.setattr(agent_module.subprocess, "Popen", fake_popen)
+
+    first = agent._dispatch_tool("execute_command", {"command": "start notepad"})
+
+    assert first["ok"] is True
+    assert first["background_launch_assumed"] is True
+    assert first["focus_change_assumed"] is False
+    assert first["verification_required"] is True
+    assert first["verification_channels"] == ["get_active_window", "see_screen"]
+    assert called["command"] == "start notepad"
+
+    blocked = agent._dispatch_tool("execute_command", {"command": "start notepad"})
+    assert blocked["blocked"] is True
+    assert "get_active_window" in blocked["error"]
+
+    observed = agent._dispatch_tool("get_active_window", {})
+    assert observed["ok"] is True
+
+    second = agent._dispatch_tool("execute_command", {"command": "start notepad"})
+    assert second["ok"] is True
+
+
+def test_system_prompt_emphasizes_situational_awareness() -> None:
+    prompt = agent_module.SYSTEM_PROMPT
+
+    assert "Maintain a live mental model" in prompt
+    assert "classify -> choose control channel -> execute one meaningful transition -> verify" in prompt
+    assert "First classify, then act." in prompt
+    assert "Use see_screen at a balanced cadence" in prompt
+    assert "get_active_window" in prompt
+    assert "detect_dialog" in prompt
+    assert "dialog_set_filename" in prompt
+    assert "list_ui_elements" in prompt
+    assert "clipboard_get" in prompt
+    assert "Do not invent new subgoals" in prompt
+    assert "verify-and-finish" in prompt
+    assert "data.observed_result" in prompt
+    assert "Treat command-launched apps or URLs as background" in prompt
+    assert "#32770" in prompt
+    assert "secure desktop" in prompt.lower()
+
+
+def test_observation_loop_prompt_pushes_action_or_finish() -> None:
+    prompt = agent_module.build_observation_loop_prompt("Save as [#32770]", repeated_steps=3)
+
+    assert "same stable window for 3 step(s)" in prompt
+    assert "Save as [#32770]" in prompt
+    assert "Do not keep calling broad observation tools" in prompt
+    assert "native window/dialog/element tool" in prompt
+    assert "Use enhance only if a small or text-heavy control must be read before acting." in prompt
+    assert "#32770 dialog" in prompt
+
+
+def test_finish_likely_prompt_pushes_verification_then_completion() -> None:
+    prompt = agent_module.build_finish_likely_prompt(
+        'Save dialog closed and focus returned to "todo-demo.txt - Notepad". | Command verification confirms "todo-demo.txt" exists.',
+        prohibited_key_combos={"ctrl+shift+s"},
+    )
+
+    assert "objective is likely already satisfied" in prompt
+    assert "todo-demo.txt - Notepad" in prompt
+    assert "call see_screen" in prompt
+    assert "then call task_complete" in prompt
+    assert "Do not reopen menus" in prompt
+    assert "Prohibited key combos for this run: ctrl+shift+s." in prompt
+
+
+def test_initial_action_prompt_reinforces_observation_and_verification() -> None:
+    prompt = agent_module.build_initial_action_prompt("Open calculator", {"ctrl+shift+s"})
+
+    assert "JOB: Open calculator" in prompt
+    assert "First classify the current UI state from the latest evidence." in prompt
+    assert "Identify what changed since the last action or screen capture." in prompt
+    assert "classify -> choose control channel -> execute one meaningful transition -> verify" in prompt
+    assert "Prefer native window/dialog/element tools" in prompt
+    assert "get_active_window plus detect_dialog" in prompt
+    assert "click then see_screen" in prompt
+    assert "Do not invent new subgoals" in prompt
+    assert "Prefer non-visual verification when available" in prompt
+    assert "wait_for_focus_change" in prompt
+    assert "#32770 dialogs" in prompt
+    assert "Prohibited key combos for this run: ctrl+shift+s." in prompt
+    assert "do not re-capture the screen just to reconfirm an obvious large input area" in prompt
+    assert 'task_complete(return=..., data={"observed_result": ...})' in prompt
+
+
+def test_no_tool_prompt_recovers_by_reobserving() -> None:
+    prompt = agent_module.build_no_tool_prompt({"ctrl+shift+s"})
+
+    assert "Recover by re-observing the current desktop state instead of guessing." in prompt
+    assert "Start by classifying the surface." in prompt
+    assert "get_active_window" in prompt
+    assert "detect_dialog" in prompt
+    assert "clipboard_get" in prompt
+    assert "native window/dialog/element tools" in prompt
+    assert "Do not assume execute_command launches changed the foreground window" in prompt
+    assert "Prohibited key combos for this run: ctrl+shift+s." in prompt
+    assert "If a modal, picker, or browser download/upload surface is likely" in prompt
+
+
+def test_blocked_action_prompt_reanchors_on_screen_state() -> None:
+    prompt = agent_module.build_blocked_action_prompt("click", prohibited_key_combos={"ctrl+shift+s"})
+
+    assert "The last action using click was blocked or unreliable." in prompt
+    assert "Do not retry blindly." in prompt
+    assert "classify the current surface" in prompt
+    assert "detect_dialog" in prompt
+    assert "dialog_set_filename" in prompt
+    assert "get_active_window" in prompt
+    assert "get_cursor_position before move_mouse or drag" in prompt
+    assert "wait_for_focus_change" in prompt
+    assert "secure desktop or UAC" in prompt
+    assert "Switch strategy after the fresh classification" in prompt
+    assert "Prohibited key combos for this run: ctrl+shift+s." in prompt
+    assert "native control instead of pixels" in prompt
+
+
+def test_tool_schemas_include_completion_and_desktop_awareness_guidance(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent.prohibited_key_combos = {"ctrl+shift+s"}
+    schemas = {tool["name"]: tool for tool in agent._tool_schemas()}
+
+    assert "data.observed_result" in schemas["task_complete"]["description"]
+    assert "before task_complete" in schemas["see_screen"]["description"]
+    assert "text-heavy targets" in schemas["enhance"]["description"]
+    assert "verify copy or cut results" in schemas["clipboard_get"]["description"]
+    assert "pointer state matters" in schemas["get_cursor_position"]["description"]
+    assert "verify focus and active app" in schemas["get_active_window"]["description"]
+    assert "foreground focus" in schemas["execute_command"]["description"]
+    assert "Prohibited for this run: ctrl+shift+s." in schemas["press_key"]["description"]
+    assert "dialog classification" in schemas["get_active_window"]["description"]
+    assert "visible top-level windows" in schemas["list_windows"]["description"]
+    assert "#32770 or picker surface" in schemas["detect_dialog"]["description"]
+    assert "filename or path field" in schemas["dialog_set_filename"]["description"]
+    assert "native child controls" in schemas["list_ui_elements"]["description"]
+
+
+def test_tool_schemas_hide_optional_native_tools_when_mode_off(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent.options.native_automation_mode = "off"
+
+    schemas = {tool["name"]: tool for tool in agent._tool_schemas()}
+
+    assert "get_active_window" in schemas
+    assert "list_windows" not in schemas
+    assert "detect_dialog" not in schemas
+    assert "list_ui_elements" not in schemas
+
+
+def test_list_windows_returns_structured_surface_metadata(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    monkeypatch.setattr(
+        agent,
+        "_list_windows_info",
+        lambda visible_only=True: [
+            {
+                "available": True,
+                "hwnd": 111,
+                "title": "Open",
+                "class_name": "#32770",
+                "executable_name": "notepad.exe",
+                "surface_kind": "file_dialog",
+                "dialog_kind": "file_open",
+            }
+        ],
+    )
+    monkeypatch.setattr(
+        agent,
+        "_get_active_window_info",
+        lambda: {
+            "available": True,
+            "hwnd": 111,
+            "title": "Open",
+            "class_name": "#32770",
+            "executable_name": "notepad.exe",
+        },
+    )
+
+    result = agent._tool_list_windows({})
+
+    assert result["ok"] is True
+    assert result["count"] == 1
+    assert result["surface_kind"] == "file_dialog"
+    assert result["dialog_kind"] == "file_open"
+    assert result["recommended_next_tools"][0] == "dialog_set_filename"
+
+
+def test_detect_dialog_returns_buttons_and_target_handle(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    monkeypatch.setattr(
+        agent,
+        "_find_dialog_info",
+        lambda title_contains="": {
+            "available": True,
+            "hwnd": 222,
+            "title": "Save as",
+            "class_name": "#32770",
+            "executable_name": "notepad.exe",
+        },
+    )
+    monkeypatch.setattr(
+        agent,
+        "_get_active_window_info",
+        lambda: {
+            "available": True,
+            "hwnd": 222,
+            "title": "Save as",
+            "class_name": "#32770",
+            "executable_name": "notepad.exe",
+        },
+    )
+    monkeypatch.setattr(
+        agent,
+        "_list_ui_elements_for_window",
+        lambda hwnd, include_hidden=False: [
+            {
+                "handle": 10,
+                "role": "button",
+                "text": "Save",
+                "target": {"type": "ui_element", "handle": 10, "window_handle": hwnd},
+            }
+        ],
+    )
+
+    result = agent._tool_detect_dialog({})
+
+    assert result["ok"] is True
+    assert result["dialog_kind"] == "file_save"
+    assert result["target"]["type"] == "dialog"
+    assert result["buttons"][0]["text"] == "Save"
+
+
+def test_notepad_save_pattern_enters_finish_likely_mode(tmp_path: Path, monkeypatch) -> None:
+    events: list[dict[str, object]] = []
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent.event_callback = events.append
+    agent.objective = "Open Notepad, type a short to-do list, save it as todo-demo.txt in Documents"
+    agent.finish_likely_state["target_filename"] = agent._infer_target_filename(agent.objective)
+    agent.last_observed_window = {
+        "available": True,
+        "title": "Save as",
+        "class_name": "#32770",
+    }
+
+    agent.step = 24
+    window_result = agent._update_finish_likely_from_tool(
+        "get_active_window",
+        {},
+        {
+            "ok": True,
+            "window": {
+                "available": True,
+                "title": "todo-demo.txt - Notepad",
+                "class_name": "Notepad",
+            },
+        },
+    )
+
+    assert agent.finish_likely_state["active"] is False
+    assert [item["kind"] for item in window_result["completion_evidence"]] == [
+        "active_window_title_matches_target",
+        "save_dialog_closed_to_target_window",
+    ]
+
+    agent.last_visual_signature = "stable-post-save"
+    agent.step = 25
+    command_result = agent._update_finish_likely_from_tool(
+        "execute_command",
+        {"command": "powershell -NoProfile -Command \"Test-Path ... todo-demo.txt\""},
+        {
+            "ok": True,
+            "exit_code": 0,
+            "stdout": r"C:\Users\paulw\Documents\todo-demo.txt",
+        },
+    )
+
+    assert agent.finish_likely_state["active"] is True
+    assert agent.finish_likely_state["summary"]
+    assert command_result["finish_likely"]["target_filename"] == "todo-demo.txt"
+    assert any(event["event_type"] == "completion_evidence" for event in events)
+    assert any(event["event_type"] == "finish_likely" for event in events)
+
+
+def test_finish_likely_guard_blocks_reopening_menu_after_fresh_verification(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent.objective = "Open Notepad, type a short to-do list, save it as todo-demo.txt in Documents"
+    agent.finish_likely_state.update(
+        {
+            "active": True,
+            "activated_at_step": 24,
+            "target_filename": "todo-demo.txt",
+            "summary": 'Save dialog closed and focus returned to "todo-demo.txt - Notepad". | Command verification confirms "todo-demo.txt" exists.',
+            "fresh_verification_done": False,
+            "verification_step": 0,
+            "post_completion_visual_signature": "",
+        }
+    )
+
+    agent.step = 25
+    verify_result = agent._dispatch_tool("see_screen", {})
+    assert verify_result["ok"] is True
+    assert verify_result["finish_likely_verification_done"] is True
+    assert agent.finish_likely_state["fresh_verification_done"] is True
+
+    blocked = agent._dispatch_tool("press_key", {"key": "alt+f"})
+    assert blocked["ok"] is False
+    assert blocked["blocked"] is True
+    assert blocked["blocked_reason"] == "finish_likely"
+    assert "appears satisfied" in blocked["error"]
+    assert "reopen menus" in blocked["hint"].lower()
+
+
+def test_dispatch_rejects_unknown_and_disabled_tools(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent.disabled_tools = {"scroll"}
+
+    assert agent._dispatch_tool("unknown_tool", {}) == {"ok": False, "error": "Unknown tool: unknown_tool"}
+    assert agent._dispatch_tool("scroll", {}) == {"ok": False, "error": "Tool 'scroll' is disabled for this job."}
+
+
+def test_tool_schemas_filter_disabled_tools(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent.disabled_tools = {"scroll", "clipboard_get"}
+
+    tool_names = {tool["name"] for tool in agent._tool_schemas()}
+
+    assert "scroll" not in tool_names
+    assert "clipboard_get" not in tool_names
+    assert "click" in tool_names
+    assert "task_complete" in tool_names
+
+
+def test_normalize_disabled_tools_rejects_invalid_and_required_names() -> None:
+    with pytest.raises(ValueError, match="Unknown disabled tool"):
+        agent_module.normalize_disabled_tools(["not_a_real_tool"])
+
+    with pytest.raises(ValueError, match="Cannot disable required tool"):
+        agent_module.normalize_disabled_tools(["task_complete"])
--- a/tests/test_cli.py
+++ b/tests/test_cli.py
@@ -20,6 +20,7 @@ def test_cli_emits_structured_return_and_data(monkeypatch: Any, capsys, tmp_path
        port=8787,
        runs_dir=tmp_path / "runs",
        db_path=tmp_path / "screenjob.db",
+        prohibited_key_combos=("ctrl+shift+s",),
    )
    config.runs_dir.mkdir(parents=True, exist_ok=True)

@@ -71,3 +72,11 @@ def test_cli_emits_structured_return_and_data(monkeypatch: Any, capsys, tmp_path
    assert payload["data"] == "file1.txt\nfile2.txt"
    assert captured_kwargs["options"].reasoning_effort == "medium"
    assert captured_kwargs["options"].screen_context_decay_steps == 4
+    assert captured_kwargs["options"].max_visual_context_images == 3
+    assert captured_kwargs["options"].native_automation_mode == "prefer"
+    assert captured_kwargs["options"].dialog_timeout_seconds == 12.0
+    assert captured_kwargs["options"].focus_timeout_seconds == 8.0
+    assert captured_kwargs["options"].ui_element_timeout_seconds == 8.0
+    assert captured_kwargs["options"].max_retries_per_surface == 3
+    assert captured_kwargs["options"].pretty_logs is False
+    assert captured_kwargs["options"].prohibited_key_combos == {"ctrl+shift+s"}
--- a/tests/test_desktop_overlay.py
+++ b/tests/test_desktop_overlay.py
@@ -0,0 +1,149 @@
+from __future__ import annotations
+
+import types
+from collections import deque
+from typing import Any
+
+from src.desktop_overlay import CompletionOverlayPayload, DesktopOverlayManager
+
+
+class _FakeWidget:
+    def __init__(self, root: "_FakeTk", *, width: int = 360, height: int = 160) -> None:
+        self._root = root
+        self._width = width
+        self._height = height
+        self._exists = True
+        self._after_ids: dict[str, tuple[int, Any]] = {}
+
+    def withdraw(self) -> None:
+        return None
+
+    def overrideredirect(self, *_args: Any, **_kwargs: Any) -> None:
+        return None
+
+    def attributes(self, *_args: Any, **_kwargs: Any) -> None:
+        return None
+
+    def configure(self, *_args: Any, **_kwargs: Any) -> None:
+        return None
+
+    def pack(self, *_args: Any, **_kwargs: Any) -> None:
+        return None
+
+    def place(self, *_args: Any, **_kwargs: Any) -> None:
+        return None
+
+    def update_idletasks(self) -> None:
+        return None
+
+    def winfo_width(self) -> int:
+        return self._width
+
+    def winfo_height(self) -> int:
+        return self._height
+
+    def winfo_exists(self) -> bool:
+        return self._exists
+
+    def geometry(self, *_args: Any, **_kwargs: Any) -> None:
+        return None
+
+    def deiconify(self) -> None:
+        return None
+
+    def destroy(self) -> None:
+        self._exists = False
+
+    def after(self, delay_ms: int, callback: Any) -> str:
+        after_id = self._root._schedule(delay_ms, callback)
+        self._after_ids[after_id] = (delay_ms, callback)
+        return after_id
+
+    def after_cancel(self, after_id: str) -> None:
+        self._after_ids.pop(after_id, None)
+        self._root._cancel(after_id)
+
+
+class _FakeButton(_FakeWidget):
+    def __init__(self, root: "_FakeTk", command: Any | None = None, **_kwargs: Any) -> None:
+        super().__init__(root)
+        self.command = command
+
+
+class _FakeTk(_FakeWidget):
+    def __init__(self) -> None:
+        super().__init__(self)
+        self._events: deque[tuple[str, int, Any]] = deque()
+        self._event_seq = 0
+        self.scheduled_delays: list[int] = []
+        self.cards: list[_FakeWidget] = []
+
+    def withdraw(self) -> None:
+        return None
+
+    def winfo_screenwidth(self) -> int:
+        return 1920
+
+    def _schedule(self, delay_ms: int, callback: Any) -> str:
+        after_id = f"after-{self._event_seq}"
+        self._event_seq += 1
+        self.scheduled_delays.append(delay_ms)
+        self._events.append((after_id, delay_ms, callback))
+        return after_id
+
+    def _cancel(self, after_id: str) -> None:
+        self._events = deque(event for event in self._events if event[0] != after_id)
+
+    def mainloop(self) -> None:
+        iterations = 0
+        while self._events and iterations < 20:
+            after_id, _delay_ms, callback = self._events.popleft()
+            iterations += 1
+            callback()
+            if any(not card.winfo_exists() for card in self.cards):
+                return
+
+
+class _FakeTkModule(types.SimpleNamespace):
+    def __init__(self, root: _FakeTk) -> None:
+        super().__init__()
+        self._root = root
+
+    def Tk(self) -> _FakeTk:
+        return self._root
+
+    def Toplevel(self, _root: _FakeTk) -> _FakeWidget:
+        card = _FakeWidget(self._root)
+        self._root.cards.append(card)
+        return card
+
+    def Frame(self, root: _FakeWidget, **_kwargs: Any) -> _FakeWidget:
+        return _FakeWidget(root._root)
+
+    def Label(self, root: _FakeWidget, **_kwargs: Any) -> _FakeWidget:
+        return _FakeWidget(root._root)
+
+    def Button(self, root: _FakeWidget, command: Any | None = None, **_kwargs: Any) -> _FakeButton:
+        return _FakeButton(root._root, command=command)
+
+
+def test_completion_overlay_auto_dismisses(monkeypatch: Any) -> None:
+    root = _FakeTk()
+    fake_tk = _FakeTkModule(root)
+    monkeypatch.setitem(__import__("sys").modules, "tkinter", fake_tk)
+
+    manager = DesktopOverlayManager(auto_dismiss_seconds=0.01)
+    manager._queue.put(
+        CompletionOverlayPayload(
+            job_id="job-123",
+            objective="Write a report",
+            return_message="Finished",
+            steps=5,
+            elapsed_seconds=12.4,
+        )
+    )
+
+    manager._ui_main()
+
+    assert any(delay == 10 for delay in root.scheduled_delays)
+    assert root.cards[0]._exists is False
--- a/tests/test_server_api.py
+++ b/tests/test_server_api.py
@@ -46,6 +46,13 @@ class FakeJobManager:
        click_pause: float = 0.10,
        reasoning_effort: str = "medium",
        screen_context_decay_steps: int = 4,
+        max_visual_context_images: int = 3,
+        native_automation_mode: str = "prefer",
+        dialog_timeout_seconds: float = 12.0,
+        focus_timeout_seconds: float = 8.0,
+        ui_element_timeout_seconds: float = 8.0,
+        max_retries_per_surface: int = 3,
+        pretty_logs: bool = False,
        disabled_tools: list[str] | None = None,
        safety_override: bool = False,
        no_failsafe: bool = False,
@@ -69,6 +76,13 @@ class FakeJobManager:
            "click_pause": click_pause,
            "reasoning_effort": reasoning_effort,
            "screen_context_decay_steps": screen_context_decay_steps,
+            "max_visual_context_images": max_visual_context_images,
+            "native_automation_mode": native_automation_mode,
+            "dialog_timeout_seconds": dialog_timeout_seconds,
+            "focus_timeout_seconds": focus_timeout_seconds,
+            "ui_element_timeout_seconds": ui_element_timeout_seconds,
+            "max_retries_per_surface": max_retries_per_surface,
+            "pretty_logs": pretty_logs,
            "no_failsafe": no_failsafe,
        }
        self._jobs[job_id] = {
@@ -293,6 +307,7 @@ def _build_app(tmp_path: Path, monkeypatch: Any, disable_ui: bool = False):
        port=8787,
        runs_dir=tmp_path / "runs",
        db_path=tmp_path / "screenjob_test.db",
+        prohibited_key_combos=("ctrl+shift+s",),
    )
    config.runs_dir.mkdir(parents=True, exist_ok=True)
    app = server_module.create_app(config)
@@ -326,6 +341,13 @@ def test_create_job_returns_only_job_id_and_defaults_model(tmp_path: Path, monke
    assert manager.last_submit_payload["disabled_tools"] == ["click"]
    assert manager.last_submit_payload["reasoning_effort"] == "medium"
    assert manager.last_submit_payload["screen_context_decay_steps"] == 4
+    assert manager.last_submit_payload["max_visual_context_images"] == 3
+    assert manager.last_submit_payload["native_automation_mode"] == "prefer"
+    assert manager.last_submit_payload["dialog_timeout_seconds"] == 12.0
+    assert manager.last_submit_payload["focus_timeout_seconds"] == 8.0
+    assert manager.last_submit_payload["ui_element_timeout_seconds"] == 8.0
+    assert manager.last_submit_payload["max_retries_per_surface"] == 3
+    assert manager.last_submit_payload["pretty_logs"] is False

    status_res = client.get(f"/api/jobs/{job_id}/status", headers=headers)
    assert status_res.status_code == 200
@@ -334,6 +356,36 @@ def test_create_job_returns_only_job_id_and_defaults_model(tmp_path: Path, monke
    assert "data" in status_res.json()["response"]


+def test_create_job_rejects_invalid_disabled_tool_names(tmp_path: Path, monkeypatch: Any) -> None:
+    app, _ = _build_app(tmp_path, monkeypatch, disable_ui=False)
+    client = TestClient(app)
+    headers = {"Authorization": "Bearer test_token"}
+
+    response = client.post(
+        "/api/jobs",
+        headers=headers,
+        json={"job": "Open amazon.de", "disabled_tools": ["not_a_real_tool"], "safety_override": True},
+    )
+
+    assert response.status_code == 400
+    assert "Unknown disabled tool" in response.json()["detail"]
+
+
+def test_create_job_rejects_disabling_task_complete(tmp_path: Path, monkeypatch: Any) -> None:
+    app, _ = _build_app(tmp_path, monkeypatch, disable_ui=False)
+    client = TestClient(app)
+    headers = {"Authorization": "Bearer test_token"}
+
+    response = client.post(
+        "/api/jobs",
+        headers=headers,
+        json={"job": "Open amazon.de", "disabled_tools": ["task_complete"], "safety_override": True},
+    )
+
+    assert response.status_code == 400
+    assert "Cannot disable required tool" in response.json()["detail"]
+
+
 def test_cancel_endpoint_and_events(tmp_path: Path, monkeypatch: Any) -> None:
    app, _ = _build_app(tmp_path, monkeypatch, disable_ui=False)
    client = TestClient(app)
--- a/tests/test_task_manager.py
+++ b/tests/test_task_manager.py
@@ -0,0 +1,238 @@
+from __future__ import annotations
+
+import threading
+from pathlib import Path
+from typing import Any
+
+import src.task_manager as task_manager_module
+from src.config import AppConfig
+from src.models import AgentResult, RunArtifacts, UsageSummary
+from src.storage import HistoryDB
+from src.task_manager import JobManager
+
+
+class _OverlayRecorder:
+    def __init__(self) -> None:
+        self.calls: list[dict[str, Any]] = []
+
+    def show_completion(self, **kwargs: Any) -> None:
+        self.calls.append(kwargs)
+
+
+def _build_manager(tmp_path: Path, overlay_manager: _OverlayRecorder) -> tuple[JobManager, HistoryDB, AppConfig]:
+    config = AppConfig(
+        openai_api_key="test-key",
+        screenjob_token="test-token",
+        disable_ui=False,
+        default_model="gpt-5.4-mini",
+        safety_model="gpt-5.4-mini",
+        host="127.0.0.1",
+        port=8787,
+        runs_dir=tmp_path / "runs",
+        db_path=tmp_path / "screenjob.db",
+    )
+    db = HistoryDB(config.db_path)
+    manager = JobManager(config=config, db=db, overlay_manager=overlay_manager)
+    return manager, db, config
+
+
+def _artifacts(tmp_path: Path) -> RunArtifacts:
+    root = tmp_path / "run_artifacts"
+    return RunArtifacts(
+        run_id="test_run",
+        root_dir=root,
+        logs_dir=root / "logs",
+        shots_dir=root / "shots",
+        enhance_dir=root / "enhanced",
+        log_file=root / "logs" / "screenjob.log",
+    )
+
+
+def _create_job(db: HistoryDB, job_id: str, objective: str) -> None:
+    db.create_job(
+        job_id=job_id,
+        objective=objective,
+        model="gpt-5.4-mini",
+        created_at="2026-05-30T12:00:00+00:00",
+        safety_override=True,
+        disabled_tools=[],
+    )
+
+
+def test_completed_job_triggers_desktop_overlay(tmp_path: Path, monkeypatch) -> None:
+    overlay = _OverlayRecorder()
+    manager, db, _config = _build_manager(tmp_path, overlay)
+    job_id = "job_overlay_complete"
+    objective = "Save todo-demo.txt in Documents"
+    _create_job(db, job_id, objective)
+
+    result = AgentResult(
+        completed=True,
+        result="Saved todo-demo.txt",
+        return_message="Saved todo-demo.txt",
+        data={"observed_result": "todo-demo.txt - Notepad is visible"},
+        steps=11,
+        started_at=100.0,
+        ended_at=112.6,
+        usage=UsageSummary(),
+    )
+    monkeypatch.setattr(task_manager_module, "run_job", lambda **_kwargs: (result, _artifacts(tmp_path)))
+
+    manager._execute_job(
+        job_id=job_id,
+        objective=objective,
+        model="gpt-5.4-mini",
+        disabled_tools=[],
+        safety_override=True,
+        max_steps=60,
+        command_timeout=45,
+        type_interval=0.02,
+        click_pause=0.10,
+        reasoning_effort="medium",
+        screen_context_decay_steps=4,
+        max_visual_context_images=3,
+        native_automation_mode="prefer",
+        dialog_timeout_seconds=12.0,
+        focus_timeout_seconds=8.0,
+        ui_element_timeout_seconds=8.0,
+        max_retries_per_surface=3,
+        pretty_logs=False,
+        no_failsafe=False,
+        cancel_event=threading.Event(),
+    )
+
+    assert overlay.calls == [
+        {
+            "job_id": job_id,
+            "objective": objective,
+            "return_message": "Saved todo-demo.txt",
+            "steps": 11,
+            "elapsed_seconds": 12.599999999999994,
+        }
+    ]
+    assert db.get_job(job_id)["status"] == "completed"
+
+
+def test_non_completed_jobs_do_not_trigger_desktop_overlay(tmp_path: Path, monkeypatch) -> None:
+    overlay = _OverlayRecorder()
+    manager, db, _config = _build_manager(tmp_path, overlay)
+
+    failed_job_id = "job_overlay_failed"
+    _create_job(db, failed_job_id, "Fail intentionally")
+    failed_result = AgentResult(
+        completed=False,
+        result="Failure",
+        return_message="Failure",
+        data=None,
+        steps=7,
+        started_at=10.0,
+        ended_at=18.0,
+        usage=UsageSummary(),
+        error="Failure",
+    )
+    monkeypatch.setattr(task_manager_module, "run_job", lambda **_kwargs: (failed_result, _artifacts(tmp_path)))
+
+    manager._execute_job(
+        job_id=failed_job_id,
+        objective="Fail intentionally",
+        model="gpt-5.4-mini",
+        disabled_tools=[],
+        safety_override=True,
+        max_steps=60,
+        command_timeout=45,
+        type_interval=0.02,
+        click_pause=0.10,
+        reasoning_effort="medium",
+        screen_context_decay_steps=4,
+        max_visual_context_images=3,
+        native_automation_mode="prefer",
+        dialog_timeout_seconds=12.0,
+        focus_timeout_seconds=8.0,
+        ui_element_timeout_seconds=8.0,
+        max_retries_per_surface=3,
+        pretty_logs=False,
+        no_failsafe=False,
+        cancel_event=threading.Event(),
+    )
+
+    cancelled_job_id = "job_overlay_cancelled"
+    _create_job(db, cancelled_job_id, "Cancel intentionally")
+    cancelled_result = AgentResult(
+        completed=False,
+        result="Cancelled",
+        return_message="Cancelled",
+        data=None,
+        steps=4,
+        started_at=20.0,
+        ended_at=23.0,
+        usage=UsageSummary(),
+        error="Cancelled",
+        cancelled=True,
+    )
+    monkeypatch.setattr(task_manager_module, "run_job", lambda **_kwargs: (cancelled_result, _artifacts(tmp_path)))
+
+    manager._execute_job(
+        job_id=cancelled_job_id,
+        objective="Cancel intentionally",
+        model="gpt-5.4-mini",
+        disabled_tools=[],
+        safety_override=True,
+        max_steps=60,
+        command_timeout=45,
+        type_interval=0.02,
+        click_pause=0.10,
+        reasoning_effort="medium",
+        screen_context_decay_steps=4,
+        max_visual_context_images=3,
+        native_automation_mode="prefer",
+        dialog_timeout_seconds=12.0,
+        focus_timeout_seconds=8.0,
+        ui_element_timeout_seconds=8.0,
+        max_retries_per_surface=3,
+        pretty_logs=False,
+        no_failsafe=False,
+        cancel_event=threading.Event(),
+    )
+
+    assert overlay.calls == []
+
+
+def test_rejected_job_does_not_trigger_desktop_overlay(tmp_path: Path, monkeypatch) -> None:
+    overlay = _OverlayRecorder()
+    manager, db, _config = _build_manager(tmp_path, overlay)
+    job_id = "job_overlay_rejected"
+    _create_job(db, job_id, "Do something unsafe")
+
+    monkeypatch.setattr(task_manager_module, "create_openai_client", lambda *_args, **_kwargs: object())
+    monkeypatch.setattr(
+        task_manager_module,
+        "assess_task_safety",
+        lambda *_args, **_kwargs: (False, "Unsafe request", {"decision": "blocked"}),
+    )
+
+    manager._execute_job(
+        job_id=job_id,
+        objective="Do something unsafe",
+        model="gpt-5.4-mini",
+        disabled_tools=[],
+        safety_override=False,
+        max_steps=60,
+        command_timeout=45,
+        type_interval=0.02,
+        click_pause=0.10,
+        reasoning_effort="medium",
+        screen_context_decay_steps=4,
+        max_visual_context_images=3,
+        native_automation_mode="prefer",
+        dialog_timeout_seconds=12.0,
+        focus_timeout_seconds=8.0,
+        ui_element_timeout_seconds=8.0,
+        max_retries_per_surface=3,
+        pretty_logs=False,
+        no_failsafe=False,
+        cancel_event=threading.Event(),
+    )
+
+    assert overlay.calls == []
+    events = db.get_job_events(job_id)
+    assert events[-1]["event_type"] == "job_rejected"
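
Note on `test_completed_job_triggers_desktop_overlay` above: the expected `elapsed_seconds` of `12.599999999999994` is the exact binary floating-point result of `112.6 - 100.0`, so the equality check depends on the elapsed time being computed as precisely `ended_at - started_at`. The short sketch below illustrates where that value comes from and shows a tolerance-based comparison as one possible alternative. It uses only the standard library and makes no claim about how `JobManager` computes the value internally.

```python
import math

started_at = 100.0
ended_at = 112.6

elapsed = ended_at - started_at

# The subtraction exposes the rounding error already present in 112.6,
# so the result is not the double closest to 12.6.
assert elapsed == 12.599999999999994
assert elapsed != 12.6

# A tolerance-based check is insensitive to that last-digit difference.
assert math.isclose(elapsed, 12.6, rel_tol=1e-9)
```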
--- a/tray_service_control.ps1
+++ b/tray_service_control.ps1
@@ -0,0 +1,53 @@
+[CmdletBinding()]
+param(
+    [ValidateSet("start", "stop", "restart")]
+    [string]$Action,
+    [string]$ServiceName = "ScreenJobBackend"
+)
+
+Set-StrictMode -Version Latest
+$ErrorActionPreference = "Stop"
+
+function Wait-ForStatus {
+    param(
+        [Parameter(Mandatory = $true)]$Service,
+        [Parameter(Mandatory = $true)][System.ServiceProcess.ServiceControllerStatus]$TargetStatus,
+        [int]$TimeoutSeconds = 20
+    )
+
+    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
+    while ((Get-Date) -lt $deadline) {
+        $Service.Refresh()
+        if ($Service.Status -eq $TargetStatus) {
+            return
+        }
+        Start-Sleep -Milliseconds 350
+    }
+
+    throw "Timed out waiting for service '$($Service.ServiceName)' to reach status '$TargetStatus'."
+}
+
+$service = Get-Service -Name $ServiceName -ErrorAction Stop
+
+switch ($Action) {
+    "start" {
+        if ($service.Status -ne [System.ServiceProcess.ServiceControllerStatus]::Running) {
+            Start-Service -Name $ServiceName -ErrorAction Stop
+            Wait-ForStatus -Service $service -TargetStatus ([System.ServiceProcess.ServiceControllerStatus]::Running)
+        }
+    }
+    "stop" {
+        if ($service.Status -ne [System.ServiceProcess.ServiceControllerStatus]::Stopped) {
+            Stop-Service -Name $ServiceName -Force -ErrorAction Stop
+            Wait-ForStatus -Service $service -TargetStatus ([System.ServiceProcess.ServiceControllerStatus]::Stopped)
+        }
+    }
+    "restart" {
+        if ($service.Status -eq [System.ServiceProcess.ServiceControllerStatus]::Running) {
+            Restart-Service -Name $ServiceName -Force -ErrorAction Stop
+        } else {
+            Start-Service -Name $ServiceName -ErrorAction Stop
+        }
+        Wait-ForStatus -Service $service -TargetStatus ([System.ServiceProcess.ServiceControllerStatus]::Running)
+    }
+}
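
Note on `Wait-ForStatus` above: it is a poll-until-deadline loop that refreshes the service, returns once the target status is seen, and throws after the timeout. For readers more at home in the repository's Python code, the same pattern can be sketched as follows. The function name, the `get_status` callable, and the injectable `sleep` and `clock` parameters are hypothetical and exist only to keep the example self-contained and testable.

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],
    target: str,
    timeout_seconds: float = 20.0,
    poll_seconds: float = 0.35,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Poll get_status() until it returns target or the deadline passes."""
    deadline = clock() + timeout_seconds
    while clock() < deadline:
        if get_status() == target:
            return
        sleep(poll_seconds)
    raise TimeoutError(f"Timed out waiting for status {target!r}.")
```

Using a monotonic clock here avoids surprises if the wall clock is adjusted while waiting; the PowerShell version relies on `Get-Date`, which is wall-clock time.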
--- a/uninstall_backend_service.ps1
+++ b/uninstall_backend_service.ps1
@@ -0,0 +1,45 @@
+[CmdletBinding(SupportsShouldProcess = $true)]
+param(
+    [switch]$AllUsers,
+    [string]$ServiceName = "ScreenJobBackend"
+)
+
+Set-StrictMode -Version Latest
+$ErrorActionPreference = "Stop"
+
+$scriptDir = Split-Path -Parent $PSCommandPath
+$shortcutName = "ScreenJob Backend.lnk"
+$startupFolder = if ($AllUsers) {
+    [Environment]::GetFolderPath("CommonStartup")
+} else {
+    [Environment]::GetFolderPath("Startup")
+}
+
+$shortcutPath = Join-Path $startupFolder $shortcutName
+
+$service = Get-Service -Name $ServiceName -ErrorAction SilentlyContinue
+if ($null -ne $service) {
+    if ($PSCmdlet.ShouldProcess($ServiceName, "Remove legacy Windows service")) {
+        if ($service.Status -ne "Stopped") {
+            Stop-Service -Name $ServiceName -Force -ErrorAction Stop
+        }
+
+        & sc.exe delete $ServiceName | Out-Null
+        if ($LASTEXITCODE -ne 0) {
+            throw "Failed to delete service '$ServiceName' (sc.exe exit code $LASTEXITCODE)."
+        }
+
+        Write-Host "Removed legacy Windows service: $ServiceName"
+    }
+}
+
+if (Test-Path -LiteralPath $shortcutPath) {
+    if ($PSCmdlet.ShouldProcess($shortcutPath, "Remove backend startup shortcut")) {
+        Remove-Item -LiteralPath $shortcutPath -Force
+        Write-Host "Removed backend startup shortcut: $shortcutPath"
+    }
+} else {
+    Write-Host "No backend startup shortcut found at: $shortcutPath"
+}
+
+Write-Host "Backend launcher uninstalled successfully." -ForegroundColor Green
Author	SHA1	Message	CI	Date
Space-Banane	d514fe161c	docs: update context compaction prompt with observe-decide-act-verify loop	CI / test (push): failing after 8s	2026-05-31 20:52:49 +02:00
Space-Banane	4123765aba	Commit remaining workspace updates	CI / test (push): failing after 8s	2026-05-31 20:43:36 +02:00
Space-Banane	79c9e98842	Switch backend startup to interactive session		2026-05-31 20:43:36 +02:00
Luna	a521142b89	docs: add patience rule for rerunning jobs	CI / test (push): successful in 8s	2026-05-31 18:35:35 +00:00
Space-Banane	880bfb1c70	Fix tray health detection and harden backend service startup	CI / test (push): successful in 7s	2026-05-28 13:44:31 +02:00
Space-Banane	114ddd80d6	Add Windows service host and system tray controller	CI / test (push): successful in 7s	2026-05-28 13:30:27 +02:00