Compare commits

...

8 Commits

Author SHA1 Message Date
Space-Banane
880bfb1c70 Fix tray health detection and harden backend service startup
All checks were successful
CI / test (push) Successful in 7s
2026-05-28 13:44:31 +02:00
Space-Banane
114ddd80d6 Add Windows service host and system tray controller
All checks were successful
CI / test (push) Successful in 7s
2026-05-28 13:30:27 +02:00
314311d8fc Merge pull request 'Add lightweight analytics dashboard' (#1) from feat/lightweight-dash into master
All checks were successful
CI / test (push) Successful in 7s
Reviewed-on: #1
2026-05-27 22:50:08 +02:00
Space-Banane
8126b57404 Add lightweight analytics dashboard
All checks were successful
CI / test (push) Successful in 7s
CI / test (pull_request) Successful in 7s
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-27 22:34:26 +02:00
Space-Banane
cceed18cf1 feat: (literally) "enhance" functionality with new parameters and improved image processing
All checks were successful
CI / test (push) Successful in 7s
2026-05-27 22:14:32 +02:00
Space-Banane
880468ef02 Mark completed P1 TODO items as done 2026-05-27 22:05:57 +02:00
Space-Banane
b05a7be668 Compact screenshot context every 4 steps by default 2026-05-27 22:04:15 +02:00
Space-Banane
0c019474af Default model reasoning effort to medium 2026-05-27 22:02:20 +02:00
27 changed files with 1947 additions and 29 deletions

5
.gitignore vendored

@@ -20,3 +20,8 @@ screenjob.db
 # IDE
 .vscode/
 .idea/
+# Service host build/publish artifacts
+service_host/**/bin/
+service_host/**/obj/
+service_host/publish/

View File

@@ -109,6 +109,77 @@ Or use the PowerShell launcher:
 .\start_backend.ps1
 ```
+### Windows Service
+Run these from an elevated PowerShell session (Run as Administrator):
+Requires .NET SDK 10+ (installer publishes a native service host executable).
+Install and start at boot:
+```powershell
+.\install_backend_service.ps1 -ForceReinstall -StartAfterInstall -DelayedAutoStart
+```
+Check status:
+```powershell
+Get-Service -Name ScreenJobBackend
+```
+Stop/start manually:
+```powershell
+Stop-Service -Name ScreenJobBackend
+Start-Service -Name ScreenJobBackend
+```
+Uninstall:
+```powershell
+.\uninstall_backend_service.ps1
+```
+Service logs are written to:
+```text
+screenjob_runs/service/backend-service.stdout.log
+screenjob_runs/service/backend-service.stderr.log
+```
+### System Tray Icon (Windows)
+Start tray icon now:
+```powershell
+powershell -NoProfile -ExecutionPolicy Bypass -STA -File .\screenjob_tray.ps1
+```
+Install startup shortcut (current user):
+```powershell
+.\install_tray_startup_shortcut.ps1
+```
+Install startup shortcut for all users:
+```powershell
+.\install_tray_startup_shortcut.ps1 -AllUsers
+```
+Remove startup shortcut:
+```powershell
+.\install_tray_startup_shortcut.ps1 -Remove
+```
+Tray menu actions:
+- Refresh service status
+- Start/Stop/Restart service (prompts for admin/UAC)
+- Open dashboard URL from `.env` `SCREENJOB_HOST` / `SCREENJOB_PORT`
+- Open service logs folder
+- Exit tray icon process
 Auth for all API routes:
 - `Authorization: Bearer <SCREENJOB_TOKEN>`
@@ -156,13 +227,21 @@ Each job payload includes:
 - Read-only dashboard (no run controls)
 - Requires token input
 - Live updates via `/ws`
+- Analytics dashboards for success rate by objective category and daily averages
 - Set `DISABLE_UI=true` to disable UI
+### Analytics API
+- `GET /api/analytics`
+- Returns objective-category success rates plus average steps/cost over time
 ## Agent Instructions (Practical)
 - Prefer `execute_command` for deterministic actions (opening URLs, filesystem checks).
 - Use `see_screen` before UI interaction.
-- Use `enhance` when text is unclear.
+- Use `enhance` before clicking small/ambiguous targets; prefer `region="small"` for compact controls.
+- Use `enhance` `mode="text"` for tiny labels/text, or `mode="ui"` for general UI.
+- Optionally set `enhance` `scale` (2-6) for tighter zoom control.
 - Use `press_key` for non-text keys (Enter, Tab, arrows, Escape).
 - For shortcuts, use one `press_key` call with combo syntax (example: `win+r`).
 - Use `click` offsets via `offset_up/down/left/right` and optional `sleep_after_seconds`.
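
As a rough illustration of the `GET /api/analytics` entry in the README hunk above, the following Python sketch builds a request carrying the `Authorization: Bearer <SCREENJOB_TOKEN>` header. The default host and port mirror the tray script's fallbacks (`127.0.0.1`, `8787`). The helper names `build_analytics_request` and `fetch_analytics` are hypothetical, and the response schema is not shown in this diff, so the sketch only decodes the body as JSON.

```python
import json
import urllib.request
from typing import Any


def build_analytics_request(
    token: str, host: str = "127.0.0.1", port: int = 8787
) -> urllib.request.Request:
    """Build a GET request for /api/analytics with the Bearer token header."""
    url = f"http://{host}:{port}/api/analytics"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def fetch_analytics(
    token: str, host: str = "127.0.0.1", port: int = 8787, timeout: float = 5.0
) -> Any:
    """Send the request and decode the JSON body (schema is an unknown here)."""
    request = build_analytics_request(token, host, port)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from the network call keeps the header logic checkable without a running backend.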

View File

@@ -37,6 +37,14 @@ Keyboard combo rule:
 - For shortcuts, use one `press_key` call with combo syntax, for example: `win+r`, `ctrl+shift+esc`.
 - Do not split modifier combos into separate calls.
+Enhance-first click rule:
+- Before clicking small buttons/icons, dense UI, or ambiguous targets, call `enhance` first.
+- Preferred preset for tiny controls: `enhance(coordinate, region="small", mode="ui")`.
+- For tiny labels/text: use `mode="text"` to improve readability.
+- Optional zoom control: set `scale` from `2` to `6` (defaults are tuned by region).
+- After checking the enhanced image, click using the same target coordinate (or a small directional offset if needed).
 Verification rule:
 - Before `task_complete`, verify actual on-screen content matches the expected outcome.
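
The enhance-first rule above leans on region presets and a clamped zoom factor. The sketch below restates that resolution logic in isolation, using the preset values visible in the `_tool_enhance` hunk of this diff (half-widths 96/160/240, default scales 4/3/2, scale clamped to 2 through 6). The function names `resolve_enhance_params` and `crop_box` are hypothetical, and `clamp` is an assumed stand-in for the project's helper of the same name.

```python
REGION_HALF = {"small": 96, "medium": 160, "large": 240}
DEFAULT_SCALE = {"small": 4, "medium": 3, "large": 2}


def clamp(value, low, high):
    # Assumed semantics of the project's clamp helper: bound value to [low, high].
    return max(low, min(high, value))


def resolve_enhance_params(region=None, mode=None, scale=None):
    """Normalize region/mode and pick a zoom factor, falling back to presets."""
    region = str(region or "small").strip().lower()
    mode = str(mode or "ui").strip().lower()
    if region not in REGION_HALF:
        region = "small"
    if mode not in {"ui", "text"}:
        mode = "ui"
    try:
        requested = int(float(str(scale).strip())) if scale is not None else 0
    except (ValueError, OverflowError):
        requested = 0
    chosen = requested if requested > 0 else DEFAULT_SCALE[region]
    return {
        "region": region,
        "mode": mode,
        "scale": clamp(chosen, 2, 6),
        "region_half": REGION_HALF[region],
    }


def crop_box(x, y, region_half, width, height):
    """Compute the crop rectangle around (x, y), kept inside the screen bounds."""
    source_x = clamp(x, 0, max(0, width - 1))
    source_y = clamp(y, 0, max(0, height - 1))
    left = clamp(source_x - region_half, 0, width - 1)
    top = clamp(source_y - region_half, 0, height - 1)
    right = clamp(source_x + region_half, left + 1, width)
    bottom = clamp(source_y + region_half, top + 1, height)
    return left, top, right, bottom
```

Unknown region or mode values fall back to `small` and `ui`, and a missing or unparseable `scale` falls back to the region default.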

125
install_backend_service.ps1 Normal file

@@ -0,0 +1,125 @@
[CmdletBinding(SupportsShouldProcess = $true)]
param(
[string]$ServiceName = "ScreenJobBackend",
[string]$DisplayName = "ScreenJob Backend",
[string]$Description = "Runs the ScreenJob backend (start_backend.ps1) as a Windows service.",
[ValidateSet("Automatic", "Manual", "Disabled")]
[string]$StartupType = "Automatic",
[switch]$DelayedAutoStart,
[switch]$ForceReinstall,
[switch]$StartAfterInstall
)
Set-StrictMode -Version Latest
$ErrorActionPreference = "Stop"
function Test-IsAdministrator {
$identity = [Security.Principal.WindowsIdentity]::GetCurrent()
$principal = New-Object Security.Principal.WindowsPrincipal($identity)
return $principal.IsInRole([Security.Principal.WindowsBuiltInRole]::Administrator)
}
if (-not (Test-IsAdministrator)) {
throw "Run this script from an elevated PowerShell session (Run as Administrator)."
}
$scriptDir = Split-Path -Parent $PSCommandPath
$backendScript = Join-Path $scriptDir "start_backend.ps1"
if (-not (Test-Path -LiteralPath $backendScript)) {
throw "Backend launcher script not found: $backendScript"
}
$projectFile = Join-Path $scriptDir "service_host\ScreenJob.WindowsServiceHost\ScreenJob.WindowsServiceHost.csproj"
if (-not (Test-Path -LiteralPath $projectFile)) {
throw "Windows service host project not found: $projectFile"
}
$dotnetCmd = Get-Command dotnet -ErrorAction SilentlyContinue
if ($null -eq $dotnetCmd) {
throw "dotnet SDK was not found in PATH. Install .NET SDK 10+ and retry."
}
$publishDir = Join-Path $scriptDir "service_host\publish"
$serviceExe = Join-Path $publishDir "ScreenJob.WindowsServiceHost.exe"
$logDir = Join-Path $scriptDir "screenjob_runs\service"
$existingService = Get-Service -Name $ServiceName -ErrorAction SilentlyContinue
if ($null -ne $existingService) {
if (-not $ForceReinstall) {
throw "Service '$ServiceName' already exists. Re-run with -ForceReinstall to replace it."
}
if ($PSCmdlet.ShouldProcess($ServiceName, "Remove existing service")) {
if ($existingService.Status -ne "Stopped") {
Stop-Service -Name $ServiceName -Force -ErrorAction Stop
}
& sc.exe delete $ServiceName | Out-Null
if ($LASTEXITCODE -ne 0) {
throw "Failed to delete existing service '$ServiceName' (sc.exe exit code $LASTEXITCODE)."
}
$deadline = (Get-Date).AddSeconds(15)
while ((Get-Date) -lt $deadline) {
$stillThere = Get-Service -Name $ServiceName -ErrorAction SilentlyContinue
if ($null -eq $stillThere) {
break
}
Start-Sleep -Milliseconds 300
}
}
}
if ($PSCmdlet.ShouldProcess($projectFile, "Publish Windows service host")) {
if (Test-Path -LiteralPath $serviceExe) {
Remove-Item -LiteralPath $serviceExe -Force -ErrorAction SilentlyContinue
}
& $dotnetCmd.Source publish `
$projectFile `
-c Release `
-r win-x64 `
--self-contained false `
-p:PublishSingleFile=true `
-o $publishDir
if ($LASTEXITCODE -ne 0) {
throw "dotnet publish failed with exit code $LASTEXITCODE."
}
}
if (-not (Test-Path -LiteralPath $serviceExe)) {
throw "Published service executable not found: $serviceExe"
}
$binaryPath = "`"$serviceExe`" --backend-script `"$backendScript`" --working-dir `"$scriptDir`" --log-dir `"$logDir`""
if ($PSCmdlet.ShouldProcess($ServiceName, "Create service")) {
New-Service `
-Name $ServiceName `
-BinaryPathName $binaryPath `
-DisplayName $DisplayName `
-Description $Description `
-StartupType $StartupType
if ($StartupType -eq "Automatic" -and $DelayedAutoStart) {
& sc.exe config $ServiceName start= delayed-auto | Out-Null
if ($LASTEXITCODE -ne 0) {
throw "Failed to enable delayed auto-start for '$ServiceName' (sc.exe exit code $LASTEXITCODE)."
}
}
# Restart on first/second/subsequent failure after 5 seconds.
& sc.exe failure $ServiceName reset= 86400 actions= restart/5000/restart/5000/restart/5000 | Out-Null
if ($LASTEXITCODE -ne 0) {
throw "Failed to configure failure actions for '$ServiceName' (sc.exe exit code $LASTEXITCODE)."
}
if ($StartAfterInstall) {
Start-Service -Name $ServiceName -ErrorAction Stop
}
}
Write-Host "Service '$ServiceName' installed successfully." -ForegroundColor Green
Write-Host "Check status with: Get-Service -Name $ServiceName"
Write-Host "View logs in: $logDir"
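
The installer above waits up to 15 seconds, polling every 300 ms, for the deleted service to disappear before republishing. A minimal Python sketch of that poll-until-deadline pattern follows. The name `wait_until` and its injectable `clock`/`sleep` parameters are hypothetical. Unlike the PowerShell loop, this version reports whether the condition was met.

```python
import time


def wait_until(
    predicate,
    timeout_seconds=15.0,
    interval_seconds=0.3,
    clock=time.monotonic,
    sleep=time.sleep,
):
    """Poll predicate until it returns True or the deadline passes."""
    deadline = clock() + timeout_seconds
    while clock() < deadline:
        if predicate():
            return True
        sleep(interval_seconds)
    # One last check so a condition met right at the deadline is not missed.
    return bool(predicate())
```

Returning a boolean lets a caller fail loudly on timeout instead of continuing silently.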

View File

@@ -0,0 +1,47 @@
[CmdletBinding(SupportsShouldProcess = $true)]
param(
[switch]$Remove,
[switch]$AllUsers
)
Set-StrictMode -Version Latest
$ErrorActionPreference = "Stop"
$scriptDir = Split-Path -Parent $PSCommandPath
$vbsLauncher = Join-Path $scriptDir "start_screenjob_tray_hidden.vbs"
$shortcutName = "ScreenJob Tray.lnk"
if (-not (Test-Path -LiteralPath $vbsLauncher)) {
throw "Launcher file not found: $vbsLauncher"
}
$startupFolder = if ($AllUsers) {
[Environment]::GetFolderPath("CommonStartup")
} else {
[Environment]::GetFolderPath("Startup")
}
$shortcutPath = Join-Path $startupFolder $shortcutName
if ($Remove) {
if (Test-Path -LiteralPath $shortcutPath) {
if ($PSCmdlet.ShouldProcess($shortcutPath, "Remove startup shortcut")) {
Remove-Item -LiteralPath $shortcutPath -Force
Write-Host "Removed startup shortcut: $shortcutPath"
}
} else {
Write-Host "No startup shortcut found at: $shortcutPath"
}
return
}
if ($PSCmdlet.ShouldProcess($shortcutPath, "Create startup shortcut")) {
$shell = New-Object -ComObject WScript.Shell
$shortcut = $shell.CreateShortcut($shortcutPath)
$shortcut.TargetPath = "$env:SystemRoot\System32\wscript.exe"
$shortcut.Arguments = '"' + $vbsLauncher + '"'
$shortcut.WorkingDirectory = $scriptDir
$shortcut.Description = "Launch ScreenJob tray icon at sign-in."
$shortcut.Save()
Write-Host "Created startup shortcut: $shortcutPath"
}

307
screenjob_tray.ps1 Normal file

@@ -0,0 +1,307 @@
param(
[string]$ServiceName = "ScreenJobBackend"
)
Set-StrictMode -Version Latest
$ErrorActionPreference = "Stop"
Add-Type -AssemblyName System.Windows.Forms
Add-Type -AssemblyName System.Drawing
$scriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path
$controlScript = Join-Path $scriptDir "tray_service_control.ps1"
$logsDir = Join-Path $scriptDir "screenjob_runs\service"
$defaultHost = "127.0.0.1"
$defaultPort = "8787"
function Read-EnvConfig {
param([string]$EnvFilePath)
$result = @{}
if (-not (Test-Path -LiteralPath $EnvFilePath)) {
return $result
}
foreach ($line in Get-Content -Path $EnvFilePath) {
$trimmed = $line.Trim()
if ($trimmed.Length -eq 0 -or $trimmed.StartsWith("#")) {
continue
}
$parts = $trimmed.Split("=", 2)
if ($parts.Count -eq 2) {
$key = $parts[0].Trim()
$value = $parts[1].Trim()
if (($value.StartsWith('"') -and $value.EndsWith('"')) -or ($value.StartsWith("'") -and $value.EndsWith("'"))) {
$value = $value.Substring(1, $value.Length - 2)
}
$result[$key] = $value
}
}
return $result
}
function Get-ServiceStatusSafe {
param([string]$Name)
try {
$svc = Get-Service -Name $Name -ErrorAction Stop
return $svc.Status.ToString()
} catch {
return "NotInstalled"
}
}
function Invoke-ServiceActionElevated {
param(
[Parameter(Mandatory = $true)][string]$Action,
[Parameter(Mandatory = $true)][string]$Name
)
if (-not (Test-Path -LiteralPath $controlScript)) {
[System.Windows.Forms.MessageBox]::Show(
"Missing control script: $controlScript",
"ScreenJob Tray",
[System.Windows.Forms.MessageBoxButtons]::OK,
[System.Windows.Forms.MessageBoxIcon]::Error
) | Out-Null
return
}
$argList = @(
"-NoProfile",
"-ExecutionPolicy", "Bypass",
"-File", "`"$controlScript`"",
"-Action", $Action,
"-ServiceName", $Name
)
try {
Start-Process -FilePath "powershell.exe" -ArgumentList $argList -Verb RunAs -WindowStyle Hidden | Out-Null
} catch {
# User canceled UAC prompt or launch failed.
}
}
function Get-DashboardUrl {
$envFile = Join-Path $scriptDir ".env"
$envVars = Read-EnvConfig -EnvFilePath $envFile
$dashboardHost = $defaultHost
$dashboardPort = $defaultPort
if ($envVars.ContainsKey("SCREENJOB_HOST") -and -not [string]::IsNullOrWhiteSpace($envVars["SCREENJOB_HOST"])) {
$dashboardHost = $envVars["SCREENJOB_HOST"]
}
if ($envVars.ContainsKey("SCREENJOB_PORT") -and -not [string]::IsNullOrWhiteSpace($envVars["SCREENJOB_PORT"])) {
$dashboardPort = $envVars["SCREENJOB_PORT"]
}
$connectHost = Resolve-ConnectHost -ConfiguredHost $dashboardHost
return "http://{0}:{1}/" -f $connectHost, $dashboardPort
}
function Resolve-ConnectHost {
param([string]$ConfiguredHost)
if ([string]::IsNullOrWhiteSpace($ConfiguredHost)) {
return "127.0.0.1"
}
switch ($ConfiguredHost.Trim().ToLowerInvariant()) {
"0.0.0.0" { return "127.0.0.1" }
"::" { return "127.0.0.1" }
"*" { return "127.0.0.1" }
default { return $ConfiguredHost }
}
}
function Get-HealthCheckHosts {
param([string]$ConfiguredHost)
if ([string]::IsNullOrWhiteSpace($ConfiguredHost)) {
return @("127.0.0.1", "localhost")
}
$normalized = $ConfiguredHost.Trim().ToLowerInvariant()
switch ($normalized) {
"0.0.0.0" { return @("127.0.0.1", "localhost", "::1") }
"::" { return @("127.0.0.1", "localhost", "::1") }
"*" { return @("127.0.0.1", "localhost", "::1") }
default { return @($ConfiguredHost) }
}
}
function Test-TcpEndpoint {
param(
[Parameter(Mandatory = $true)][string]$HostName,
[Parameter(Mandatory = $true)][int]$Port,
[int]$TimeoutMs = 1200
)
$client = New-Object System.Net.Sockets.TcpClient
try {
$async = $client.BeginConnect($HostName, $Port, $null, $null)
$connected = $async.AsyncWaitHandle.WaitOne($TimeoutMs, $false)
if (-not $connected) {
return $false
}
$client.EndConnect($async) | Out-Null
return $true
} catch {
return $false
} finally {
$client.Dispose()
}
}
function Get-BackendReachability {
$envFile = Join-Path $scriptDir ".env"
$envVars = Read-EnvConfig -EnvFilePath $envFile
$configuredHost = $defaultHost
$configuredPort = $defaultPort
if ($envVars.ContainsKey("SCREENJOB_HOST") -and -not [string]::IsNullOrWhiteSpace($envVars["SCREENJOB_HOST"])) {
$configuredHost = $envVars["SCREENJOB_HOST"]
}
if ($envVars.ContainsKey("SCREENJOB_PORT") -and -not [string]::IsNullOrWhiteSpace($envVars["SCREENJOB_PORT"])) {
$configuredPort = $envVars["SCREENJOB_PORT"]
}
$portNumber = 8787
[void][int]::TryParse([string]$configuredPort, [ref]$portNumber)
$hostsToTry = Get-HealthCheckHosts -ConfiguredHost $configuredHost
foreach ($candidateHost in $hostsToTry) {
if (Test-TcpEndpoint -HostName $candidateHost -Port $portNumber) {
return $true
}
}
return $false
}
function Update-TrayState {
param(
[System.Windows.Forms.NotifyIcon]$NotifyIcon,
[System.Windows.Forms.ToolStripMenuItem]$StatusItem,
[string]$Name
)
$status = Get-ServiceStatusSafe -Name $Name
$isBackendReachable = Get-BackendReachability
$displayStatus = $status
if ($status -eq "Running" -and -not $isBackendReachable) {
$displayStatus = "Running (Backend Down)"
} elseif ($status -eq "Stopped" -and $isBackendReachable) {
$displayStatus = "Stopped (Backend Up)"
} elseif ($status -eq "NotInstalled" -and $isBackendReachable) {
$displayStatus = "NotInstalled (Backend Up)"
}
$StatusItem.Text = "Status: $displayStatus"
switch ($displayStatus) {
"Running" {
$NotifyIcon.Icon = [System.Drawing.SystemIcons]::Information
}
"Stopped" {
$NotifyIcon.Icon = [System.Drawing.SystemIcons]::Warning
}
default {
$NotifyIcon.Icon = [System.Drawing.SystemIcons]::Error
}
}
$tooltip = "ScreenJob Backend: $displayStatus"
if ($tooltip.Length -gt 63) {
$tooltip = $tooltip.Substring(0, 63)
}
$NotifyIcon.Text = $tooltip
}
$appContext = New-Object System.Windows.Forms.ApplicationContext
$notifyIcon = New-Object System.Windows.Forms.NotifyIcon
$notifyIcon.Visible = $false
$menu = New-Object System.Windows.Forms.ContextMenuStrip
$statusItem = New-Object System.Windows.Forms.ToolStripMenuItem "Status: Unknown"
$statusItem.Enabled = $false
$refreshItem = New-Object System.Windows.Forms.ToolStripMenuItem "Refresh Status"
$refreshItem.Add_Click({
Update-TrayState -NotifyIcon $notifyIcon -StatusItem $statusItem -Name $ServiceName
})
$startItem = New-Object System.Windows.Forms.ToolStripMenuItem "Start Service (Admin)"
$startItem.Add_Click({
Invoke-ServiceActionElevated -Action "start" -Name $ServiceName
})
$stopItem = New-Object System.Windows.Forms.ToolStripMenuItem "Stop Service (Admin)"
$stopItem.Add_Click({
Invoke-ServiceActionElevated -Action "stop" -Name $ServiceName
})
$restartItem = New-Object System.Windows.Forms.ToolStripMenuItem "Restart Service (Admin)"
$restartItem.Add_Click({
Invoke-ServiceActionElevated -Action "restart" -Name $ServiceName
})
$dashboardItem = New-Object System.Windows.Forms.ToolStripMenuItem "Open Dashboard"
$dashboardItem.Add_Click({
$url = Get-DashboardUrl
Start-Process $url | Out-Null
})
$logsItem = New-Object System.Windows.Forms.ToolStripMenuItem "Open Service Logs"
$logsItem.Add_Click({
if (-not (Test-Path -LiteralPath $logsDir)) {
New-Item -ItemType Directory -Path $logsDir -Force | Out-Null
}
Start-Process explorer.exe $logsDir | Out-Null
})
$openFolderItem = New-Object System.Windows.Forms.ToolStripMenuItem "Open Project Folder"
$openFolderItem.Add_Click({
Start-Process explorer.exe $scriptDir | Out-Null
})
$exitItem = New-Object System.Windows.Forms.ToolStripMenuItem "Exit Tray"
$exitItem.Add_Click({
$refreshTimer.Stop()
$notifyIcon.Visible = $false
$notifyIcon.Dispose()
$menu.Dispose()
$appContext.ExitThread()
})
[void]$menu.Items.Add($statusItem)
[void]$menu.Items.Add($refreshItem)
[void]$menu.Items.Add((New-Object System.Windows.Forms.ToolStripSeparator))
[void]$menu.Items.Add($startItem)
[void]$menu.Items.Add($stopItem)
[void]$menu.Items.Add($restartItem)
[void]$menu.Items.Add((New-Object System.Windows.Forms.ToolStripSeparator))
[void]$menu.Items.Add($dashboardItem)
[void]$menu.Items.Add($logsItem)
[void]$menu.Items.Add($openFolderItem)
[void]$menu.Items.Add((New-Object System.Windows.Forms.ToolStripSeparator))
[void]$menu.Items.Add($exitItem)
$notifyIcon.ContextMenuStrip = $menu
$notifyIcon.Visible = $true
$notifyIcon.Add_DoubleClick({
$url = Get-DashboardUrl
Start-Process $url | Out-Null
})
$refreshTimer = New-Object System.Windows.Forms.Timer
$refreshTimer.Interval = 5000
$refreshTimer.Add_Tick({
Update-TrayState -NotifyIcon $notifyIcon -StatusItem $statusItem -Name $ServiceName
})
Update-TrayState -NotifyIcon $notifyIcon -StatusItem $statusItem -Name $ServiceName
$refreshTimer.Start()
[System.Windows.Forms.Application]::Run($appContext)
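
The tray script's health detection combines three ideas: wildcard bind addresses are mapped to loopback candidates, each candidate is probed with a TCP connect under a short timeout, and the service status is annotated when it disagrees with reachability. The Python sketch below mirrors that logic for illustration only. All function names are hypothetical, and the 1.2 second timeout matches the script's `TimeoutMs = 1200` default.

```python
import socket

WILDCARD_HOSTS = {"0.0.0.0", "::", "*"}


def resolve_connect_host(configured_host):
    """Map an empty or wildcard bind address to a loopback address for URLs."""
    if not configured_host or not configured_host.strip():
        return "127.0.0.1"
    if configured_host.strip().lower() in WILDCARD_HOSTS:
        return "127.0.0.1"
    return configured_host


def health_check_hosts(configured_host):
    """List the hosts worth probing for a given bind address."""
    if not configured_host or not configured_host.strip():
        return ["127.0.0.1", "localhost"]
    if configured_host.strip().lower() in WILDCARD_HOSTS:
        return ["127.0.0.1", "localhost", "::1"]
    return [configured_host]


def tcp_reachable(host, port, timeout_seconds=1.2):
    """Return True if a TCP connection can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        return False


def display_status(service_status, backend_reachable):
    """Annotate the service status when it disagrees with the TCP probe."""
    if service_status == "Running" and not backend_reachable:
        return "Running (Backend Down)"
    if service_status == "Stopped" and backend_reachable:
        return "Stopped (Backend Up)"
    if service_status == "NotInstalled" and backend_reachable:
        return "NotInstalled (Backend Up)"
    return service_status
```

A caller would probe `health_check_hosts(host)` in order and treat the backend as up on the first successful `tcp_reachable` call, as the script's `Get-BackendReachability` does.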

View File

@@ -0,0 +1,138 @@
using System.Diagnostics;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
namespace ScreenJob.WindowsServiceHost;
internal sealed class BackendProcessService : BackgroundService
{
private readonly ILogger<BackendProcessService> _logger;
private readonly ServiceOptions _options;
private readonly object _logLock = new();
private Process? _backendProcess;
private string _stdoutLogPath = string.Empty;
private string _stderrLogPath = string.Empty;
public BackendProcessService(ILogger<BackendProcessService> logger, ServiceOptions options)
{
_logger = logger;
_options = options;
}
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
Directory.CreateDirectory(_options.LogDirectory);
_stdoutLogPath = Path.Combine(_options.LogDirectory, "backend-service.stdout.log");
_stderrLogPath = Path.Combine(_options.LogDirectory, "backend-service.stderr.log");
LogStdOut("Service host starting backend process.");
LogStdOut($"Script: {_options.BackendScriptPath}");
LogStdOut($"Working directory: {_options.WorkingDirectory}");
var powershellPath = Path.Combine(
Environment.GetFolderPath(Environment.SpecialFolder.Windows),
"System32",
"WindowsPowerShell",
"v1.0",
"powershell.exe");
var startInfo = new ProcessStartInfo
{
FileName = powershellPath,
Arguments = $"-NoProfile -ExecutionPolicy Bypass -File \"{_options.BackendScriptPath}\"",
WorkingDirectory = _options.WorkingDirectory,
RedirectStandardOutput = true,
RedirectStandardError = true,
UseShellExecute = false,
CreateNoWindow = true
};
_backendProcess = new Process { StartInfo = startInfo };
if (!_backendProcess.Start())
{
throw new InvalidOperationException("Failed to start backend process.");
}
LogStdOut($"Backend process started with PID {_backendProcess.Id}.");
_logger.LogInformation("Backend process started with PID {Pid}.", _backendProcess.Id);
var stdoutPump = PumpStreamAsync(_backendProcess.StandardOutput, LogStdOut, stoppingToken);
var stderrPump = PumpStreamAsync(_backendProcess.StandardError, LogStdErr, stoppingToken);
try
{
await _backendProcess.WaitForExitAsync(stoppingToken);
var exitCode = _backendProcess.ExitCode;
LogStdErr($"Backend process exited unexpectedly with code {exitCode}.");
_logger.LogError("Backend process exited unexpectedly with code {ExitCode}.", exitCode);
Environment.ExitCode = exitCode == 0 ? 1 : exitCode;
throw new InvalidOperationException(
$"Backend process ended unexpectedly. Service host exit code: {Environment.ExitCode}.");
}
catch (OperationCanceledException)
{
LogStdOut("Service stop requested.");
}
finally
{
await Task.WhenAll(stdoutPump, stderrPump);
}
}
public override async Task StopAsync(CancellationToken cancellationToken)
{
if (_backendProcess is { HasExited: false })
{
try
{
LogStdOut("Stopping backend process.");
_backendProcess.Kill(entireProcessTree: true);
}
catch (Exception ex)
{
LogStdErr($"Failed to stop backend process cleanly: {ex.Message}");
_logger.LogError(ex, "Failed to stop backend process cleanly.");
}
}
await base.StopAsync(cancellationToken);
}
private async Task PumpStreamAsync(
StreamReader reader,
Action<string> sink,
CancellationToken stoppingToken)
{
while (!stoppingToken.IsCancellationRequested)
{
var line = await reader.ReadLineAsync();
if (line is null)
{
break;
}
sink(line);
}
}
private void LogStdOut(string message)
{
WriteLog(_stdoutLogPath, message);
}
private void LogStdErr(string message)
{
WriteLog(_stderrLogPath, message);
}
private void WriteLog(string path, string message)
{
var stamp = DateTimeOffset.Now.ToString("yyyy-MM-dd HH:mm:ss");
var line = $"[{stamp}] {message}{Environment.NewLine}";
lock (_logLock)
{
File.AppendAllText(path, line);
}
}
}
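
For readers less familiar with .NET hosting, the core of `BackendProcessService` is: start a child process with redirected output, pump stdout and stderr line by line into timestamped log files under a shared lock, and surface the exit code. The Python sketch below illustrates that pattern with threads instead of async tasks. It is not a port of the service, omits stop/cancellation handling, and `run_and_capture`, `pump`, and `write_log` are hypothetical names. The log file names match those in the diff.

```python
import subprocess
import sys
import threading
from datetime import datetime
from pathlib import Path

_log_lock = threading.Lock()


def write_log(path: Path, message: str) -> None:
    """Append one timestamped line; the lock serializes concurrent writers."""
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    with _log_lock:
        with path.open("a", encoding="utf-8") as handle:
            handle.write(f"[{stamp}] {message}\n")


def pump(stream, path: Path) -> None:
    """Copy lines from a child-process stream into a log file until EOF."""
    for line in stream:
        write_log(path, line.rstrip("\r\n"))


def run_and_capture(command: list[str], log_dir: Path) -> int:
    """Run command, log stdout/stderr to separate files, return the exit code."""
    log_dir.mkdir(parents=True, exist_ok=True)
    stdout_log = log_dir / "backend-service.stdout.log"
    stderr_log = log_dir / "backend-service.stderr.log"
    process = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    pumps = [
        threading.Thread(target=pump, args=(process.stdout, stdout_log)),
        threading.Thread(target=pump, args=(process.stderr, stderr_log)),
    ]
    for thread in pumps:
        thread.start()
    exit_code = process.wait()
    # Join the pumps so buffered output is flushed to disk before returning.
    for thread in pumps:
        thread.join()
    return exit_code
```

Draining both pipes concurrently matters: reading only one can stall the child once the other pipe's buffer fills.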

View File

@@ -0,0 +1,18 @@
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using ScreenJob.WindowsServiceHost;
var options = ServiceOptions.Parse(args);
Host.CreateDefaultBuilder(args)
.UseWindowsService(serviceOptions =>
{
serviceOptions.ServiceName = "ScreenJobBackend";
})
.ConfigureServices(services =>
{
services.AddSingleton(options);
services.AddHostedService<BackendProcessService>();
})
.Build()
.Run();

View File

@@ -0,0 +1,12 @@
<Project Sdk="Microsoft.NET.Sdk.Worker">
<PropertyGroup>
<TargetFramework>net10.0-windows</TargetFramework>
<Nullable>enable</Nullable>
<ImplicitUsings>enable</ImplicitUsings>
<OutputType>Exe</OutputType>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Microsoft.Extensions.Hosting.WindowsServices" Version="10.0.0" />
</ItemGroup>
</Project>

View File

@@ -0,0 +1,77 @@
namespace ScreenJob.WindowsServiceHost;
internal sealed record ServiceOptions(
string BackendScriptPath,
string WorkingDirectory,
string LogDirectory)
{
public static ServiceOptions Parse(string[] args)
{
var map = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
for (var i = 0; i < args.Length; i++)
{
var raw = args[i];
if (!raw.StartsWith("--", StringComparison.Ordinal))
{
continue;
}
var key = raw[2..];
if (string.IsNullOrWhiteSpace(key))
{
continue;
}
if (i + 1 < args.Length && !args[i + 1].StartsWith("--", StringComparison.Ordinal))
{
map[key] = args[++i];
}
else
{
map[key] = "true";
}
}
if (!map.TryGetValue("backend-script", out var backendScript) || string.IsNullOrWhiteSpace(backendScript))
{
throw new ArgumentException("Missing required argument: --backend-script <absolute-path-to-start_backend.ps1>.");
}
if (!Path.IsPathRooted(backendScript))
{
throw new ArgumentException("The --backend-script value must be an absolute path.");
}
if (!File.Exists(backendScript))
{
throw new FileNotFoundException("Backend script not found.", backendScript);
}
if (!map.TryGetValue("working-dir", out var workingDir) || string.IsNullOrWhiteSpace(workingDir))
{
workingDir = Path.GetDirectoryName(backendScript)
?? throw new ArgumentException("Could not resolve working directory from backend script path.");
}
if (!Path.IsPathRooted(workingDir))
{
throw new ArgumentException("The --working-dir value must be an absolute path.");
}
if (!map.TryGetValue("log-dir", out var logDir) || string.IsNullOrWhiteSpace(logDir))
{
logDir = Path.Combine(workingDir, "screenjob_runs", "service");
}
if (!Path.IsPathRooted(logDir))
{
throw new ArgumentException("The --log-dir value must be an absolute path.");
}
return new ServiceOptions(
Path.GetFullPath(backendScript),
Path.GetFullPath(workingDir),
Path.GetFullPath(logDir));
}
}
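
The argument loop in `ServiceOptions.Parse` treats `--key value` as a pair and a bare `--key` as the string `"true"`, skipping anything that does not start with `--`. The Python sketch below reproduces only that tokenizing step, under the assumption that lowercasing keys is an acceptable stand-in for the case-insensitive dictionary. The name `parse_flag_args` is hypothetical, and the path and existence checks that follow in the C# code are not reproduced.

```python
def parse_flag_args(args):
    """Collect --key value pairs; a --key with no value maps to "true"."""
    options = {}
    index = 0
    while index < len(args):
        raw = args[index]
        index += 1
        if not raw.startswith("--"):
            continue  # stray positional token, ignored as in the C# loop
        key = raw[2:].lower()
        if not key.strip():
            continue  # a bare "--" carries no key
        if index < len(args) and not args[index].startswith("--"):
            options[key] = args[index]
            index += 1
        else:
            options[key] = "true"
    return options
```

One consequence of this scheme is that a value beginning with `--` can never be passed, since it is always read as the next key.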

View File

@@ -9,7 +9,7 @@ import traceback
 from typing import Any, Callable
 from openai import OpenAI
-from PIL import Image, ImageEnhance, ImageFilter, ImageOps
+from PIL import Image, ImageDraw, ImageEnhance, ImageFilter, ImageOps
 from .models import AgentResult, RunArtifacts, RuntimeOptions, UsageSummary
 from .pricing import estimate_cost_usd
@@ -34,7 +34,8 @@ Rules:
 - launching apps or running terminal checks
 3) For UI tasks, inspect with see_screen before clicking/typing.
 4) Coordinates are absolute screen pixels (x, y) from top-left.
-5) Use enhance(coordinate) when text/UI is unclear.
+5) Use enhance before risky clicks: small buttons/icons, dense UI, or when target confidence is below high.
+5a) For tiny controls use enhance(coordinate, region="small", mode="ui"). For tiny text use mode="text".
 6) For keyboard-heavy interactions, prefer press_key for special keys.
 6a) For key combinations, call press_key once with combo syntax (example: "win+r", "ctrl+shift+esc"). Do not split modifier combos across separate calls.
 7) You may call multiple tools in one step. If needed, do click then sleep.
@@ -76,11 +77,14 @@ class ScreenJobAgent:
 self.final_data: Any | None = None
 self.previous_response_id: str | None = None
 self.usage = UsageSummary()
+self.objective = ""
 self.last_screen_data_url: str | None = None
 self.last_screen_meta: dict[str, Any] | None = None
 self.click_history: list[tuple[int, int, float]] = []
 self.disabled_tools = {tool.strip() for tool in (options.disable_tools or set()) if tool.strip()}
+self.recent_tool_summaries: list[str] = []
+self.last_context_compact_step = 0
 def _emit(self, event_type: str, payload: dict[str, Any]) -> None:
 if self.event_callback is None:
@@ -192,7 +196,10 @@ class ScreenJobAgent:
 {
 "type": "function",
 "name": "enhance",
-"description": "Create enhanced zoom around a coordinate for readability.",
+"description": (
+"Create enhanced zoom around a coordinate for readability and precise targeting. "
+"Prefer this before clicking tiny or ambiguous UI targets."
+),
 "parameters": {
 "type": "object",
 "properties": {
@@ -204,7 +211,19 @@ class ScreenJobAgent:
 },
 "required": ["x", "y"],
 "additionalProperties": False,
-}
+},
+"region": {
+"type": "string",
+"enum": ["small", "medium", "large"],
+},
+"mode": {
+"type": "string",
+"enum": ["ui", "text"],
+},
+"scale": {
+"type": ["integer", "string"],
+"description": "Zoom factor from 2 to 6. Defaults by region.",
+},
 },
 "required": ["coordinate"],
 "additionalProperties": False,
@@ -352,6 +371,23 @@ class ScreenJobAgent:
 sec = max_seconds
 return sec
+def _parse_int(self, value: Any, default: int = 0) -> int:
+if value is None:
+return default
+if isinstance(value, bool):
+return int(value)
+if isinstance(value, int):
+return value
+if isinstance(value, float):
+return int(round(value))
+text = str(value).strip()
+if not text:
+return default
+try:
+return int(float(text))
+except Exception: # noqa: BLE001
+return default
 def _tool_see_screen(self, _: dict[str, Any]) -> dict[str, Any]:
 image, meta = self._capture_screen(with_grid=True)
 out_path = self.artifacts.shots_dir / f"screen_step_{self.step:03d}.png"
@@ -369,34 +405,106 @@ class ScreenJobAgent:
 def _tool_enhance(self, args: dict[str, Any]) -> dict[str, Any]:
 coord = args.get("coordinate") or {}
-x = int(coord.get("x", 0))
-y = int(coord.get("y", 0))
+requested_x = self._parse_int(coord.get("x", 0), default=0)
+requested_y = self._parse_int(coord.get("y", 0), default=0)
+region = str(args.get("region", "small") or "small").strip().lower()
+mode = str(args.get("mode", "ui") or "ui").strip().lower()
+if region not in {"small", "medium", "large"}:
+region = "small"
+if mode not in {"ui", "text"}:
+mode = "ui"
+region_half_by_preset = {
+"small": 96,
+"medium": 160,
+"large": 240,
+}
+default_scale_by_region = {
+"small": 4,
+"medium": 3,
+"large": 2,
+}
+raw_scale = self._parse_int(args.get("scale"), default=0)
+scale = raw_scale if raw_scale > 0 else default_scale_by_region[region]
+scale = clamp(scale, 2, 6)
 base, base_meta = self._capture_screen(with_grid=False)
 width, height = base.size
-region_half = 180
-left = clamp(x - region_half, 0, width - 1)
-top = clamp(y - region_half, 0, height - 1)
-right = clamp(x + region_half, left + 1, width)
-bottom = clamp(y + region_half, top + 1, height)
+source_x = clamp(requested_x, 0, max(0, width - 1))
+source_y = clamp(requested_y, 0, max(0, height - 1))
+region_half = region_half_by_preset[region]
+left = clamp(source_x - region_half, 0, width - 1)
+top = clamp(source_y - region_half, 0, height - 1)
+right = clamp(source_x + region_half, left + 1, width)
+bottom = clamp(source_y + region_half, top + 1, height)
 crop = base.crop((left, top, right, bottom))
-upscaled = crop.resize((crop.width * 2, crop.height * 2), Image.Resampling.BICUBIC)
-enhanced = ImageOps.autocontrast(upscaled)
-enhanced = ImageEnhance.Sharpness(enhanced).enhance(2.0)
-enhanced = ImageEnhance.Contrast(enhanced).enhance(1.25)
-enhanced = enhanced.filter(ImageFilter.UnsharpMask(radius=1.8, percent=180, threshold=2))
-out_path = self.artifacts.enhance_dir / f"enhance_step_{self.step:03d}_{x}_{y}.png"
+out_w = max(2, crop.width * scale)
+out_h = max(2, crop.height * scale)
+upscaled = crop.resize((out_w, out_h), Image.Resampling.LANCZOS)
+if mode == "text":
+text_view = ImageOps.grayscale(upscaled)
+text_view = ImageOps.autocontrast(text_view, cutoff=1)
+text_view = ImageOps.equalize(text_view)
+text_view = ImageEnhance.Contrast(text_view).enhance(1.35)
+text_view = ImageEnhance.Sharpness(text_view).enhance(2.1)
+processed = text_view.filter(ImageFilter.UnsharpMask(radius=1.2, percent=160, threshold=1)).convert("RGB")
+else:
+ui_view = ImageOps.autocontrast(upscaled, cutoff=1)
+ui_view = ImageEnhance.Contrast(ui_view).enhance(1.2)
+ui_view = ImageEnhance.Sharpness(ui_view).enhance(1.8)
+processed = ui_view.filter(ImageFilter.UnsharpMask(radius=1.4, percent=150, threshold=2)).convert("RGB")
+edges = upscaled.convert("L").filter(ImageFilter.FIND_EDGES)
+edges = ImageOps.autocontrast(edges, cutoff=4)
edge_overlay = ImageOps.colorize(edges, black=(0, 0, 0), white=(60, 220, 255))
enhanced = Image.blend(processed, edge_overlay, alpha=0.18)
cx = clamp((source_x - left) * scale, 0, max(0, enhanced.width - 1))
cy = clamp((source_y - top) * scale, 0, max(0, enhanced.height - 1))
draw = ImageDraw.Draw(enhanced)
draw.rectangle([0, 0, enhanced.width - 1, enhanced.height - 1], outline=(255, 80, 80), width=2)
ring_radius = max(10, int(6 * scale / 2))
arm_len = max(14, int(9 * scale / 2))
gap = max(4, int(2 * scale / 2))
line_width = max(2, int(scale / 2))
draw.ellipse(
[cx - ring_radius, cy - ring_radius, cx + ring_radius, cy + ring_radius],
outline=(255, 80, 80),
width=line_width,
)
draw.line([(max(0, cx - arm_len), cy), (max(0, cx - gap), cy)], fill=(255, 80, 80), width=line_width)
draw.line(
[(min(enhanced.width - 1, cx + gap), cy), (min(enhanced.width - 1, cx + arm_len), cy)],
fill=(255, 80, 80),
width=line_width,
)
draw.line([(cx, max(0, cy - arm_len)), (cx, max(0, cy - gap))], fill=(255, 80, 80), width=line_width)
draw.line(
[(cx, min(enhanced.height - 1, cy + gap)), (cx, min(enhanced.height - 1, cy + arm_len))],
fill=(255, 80, 80),
width=line_width,
)
out_path = self.artifacts.enhance_dir / (
f"enhance_step_{self.step:03d}_{source_x}_{source_y}_{region}_{mode}_x{scale}.png"
)
self._save_image(enhanced, out_path) self._save_image(enhanced, out_path)
data_url = image_to_data_url(enhanced, "PNG") data_url = image_to_data_url(enhanced, "PNG")
meta = { meta = {
"captured_at": utc_now_iso(), "captured_at": utc_now_iso(),
"source_coord": {"x": x, "y": y}, "requested_coord": {"x": requested_x, "y": requested_y},
"source_coord": {"x": source_x, "y": source_y},
"source_box": {"left": left, "top": top, "right": right, "bottom": bottom}, "source_box": {"left": left, "top": top, "right": right, "bottom": bottom},
"scale": 2, "region": region,
"mode": mode,
"scale": scale,
"path": str(out_path.resolve()), "path": str(out_path.resolve()),
"size": {"width": enhanced.width, "height": enhanced.height},
"target_pixel": {"x": cx, "y": cy},
"screen_size": {"width": width, "height": height}, "screen_size": {"width": width, "height": height},
"base_capture_meta": base_meta, "base_capture_meta": base_meta,
} }
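Review note: the crop box and crosshair position in `_tool_enhance` are pure arithmetic, so they can be sanity-checked without capturing a screen. The sketch below mirrors the clamping logic from this hunk. `clamp` here is a hypothetical stand-in for the repository's own helper (its definition is not part of this diff), and `crop_geometry` is an illustrative name, not a function in the codebase.

```python
def clamp(value: int, low: int, high: int) -> int:
    # Hypothetical stand-in for the project's clamp helper (assumed semantics).
    return max(low, min(high, value))


REGION_HALF = {"small": 96, "medium": 160, "large": 240}


def crop_geometry(x: int, y: int, width: int, height: int,
                  region: str = "small", scale: int = 4) -> dict:
    """Return the source box, output size and marker pixel for a request."""
    # Requested coordinates are first pulled onto the screen.
    source_x = clamp(x, 0, max(0, width - 1))
    source_y = clamp(y, 0, max(0, height - 1))
    half = REGION_HALF[region]
    # The box is clipped at the screen edges, so it can be smaller than 2*half.
    left = clamp(source_x - half, 0, width - 1)
    top = clamp(source_y - half, 0, height - 1)
    right = clamp(source_x + half, left + 1, width)
    bottom = clamp(source_y + half, top + 1, height)
    out_w = max(2, (right - left) * scale)
    out_h = max(2, (bottom - top) * scale)
    # The marker is drawn where the requested point lands in the upscaled crop.
    cx = clamp((source_x - left) * scale, 0, max(0, out_w - 1))
    cy = clamp((source_y - top) * scale, 0, max(0, out_h - 1))
    return {
        "source_coord": (source_x, source_y),
        "source_box": (left, top, right, bottom),
        "size": (out_w, out_h),
        "target_pixel": (cx, cy),
    }


center = crop_geometry(960, 540, 1920, 1080)
corner = crop_geometry(10, 10, 1920, 1080)
offscreen = crop_geometry(5000, -20, 1920, 1080)
```

Near a screen edge the crop is clipped rather than shifted, so the marker is no longer centered in the output image. That is why the hunk reports `target_pixel` in the metadata instead of assuming the middle of the crop.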
@@ -628,6 +736,9 @@ class ScreenJobAgent:
return {"_raw": raw} return {"_raw": raw}
def _call_model(self, input_items: list[dict[str, Any]]) -> Any: def _call_model(self, input_items: list[dict[str, Any]]) -> Any:
effort = str(self.options.reasoning_effort or "medium").strip().lower()
if effort not in {"low", "medium", "high"}:
effort = "medium"
return self.client.responses.create( return self.client.responses.create(
model=self.options.model, model=self.options.model,
instructions=SYSTEM_PROMPT, instructions=SYSTEM_PROMPT,
@@ -636,9 +747,85 @@ class ScreenJobAgent:
previous_response_id=self.previous_response_id, previous_response_id=self.previous_response_id,
parallel_tool_calls=True, parallel_tool_calls=True,
max_tool_calls=8, max_tool_calls=8,
reasoning={"effort": effort},
) )
def _record_tool_summary(self, tool_name: str, result: dict[str, Any]) -> None:
ok = bool(result.get("ok"))
status = "ok" if ok else "fail"
summary = f"step={self.step} tool={tool_name} status={status}"
if tool_name == "click":
clicked = result.get("clicked") if isinstance(result.get("clicked"), dict) else {}
x = clicked.get("x")
y = clicked.get("y")
if isinstance(x, int) and isinstance(y, int):
summary = f"{summary} at=({x},{y})"
elif tool_name == "type":
typed_length = int(result.get("typed_length", 0) or 0)
summary = f"{summary} typed_length={typed_length}"
elif tool_name == "press_key":
key = str(result.get("key") or "").strip()
if key:
summary = f"{summary} key={key}"
elif tool_name == "execute_command":
exit_code = result.get("exit_code")
if exit_code is not None:
summary = f"{summary} exit_code={exit_code}"
elif tool_name in {"see_screen", "enhance"}:
meta = result.get("meta") if isinstance(result.get("meta"), dict) else {}
path = str(meta.get("path") or result.get("path") or "").strip()
if path:
summary = f"{summary} image={path}"
if not ok:
error_text = str(result.get("error") or "").strip()
if error_text:
summary = f"{summary} error={error_text[:140]}"
self.recent_tool_summaries.append(summary)
self.recent_tool_summaries = self.recent_tool_summaries[-20:]
def _should_compact_context(self) -> bool:
interval = max(0, int(self.options.screen_context_decay_steps or 0))
if interval <= 0:
return False
if self.previous_response_id is None:
return False
return (self.step - self.last_context_compact_step) >= interval
def _build_compacted_pending_input(self) -> list[dict[str, Any]]:
recent = self.recent_tool_summaries[-8:]
lines = "\n".join(f"- {line}" for line in recent) if recent else "- No recent tool activity."
content = (
"Context compaction activated to decay stale screenshots and reduce token usage.\n"
f"JOB: {self.objective}\n"
f"Current step: {self.step}\n"
"Recent tool activity:\n"
f"{lines}\n"
"Continue execution from the latest screen state. "
"Use tools only, and finish with task_complete when done."
)
compacted_input: list[dict[str, Any]] = [
{
"role": "user",
"content": [
{
"type": "input_text",
"text": content,
}
],
}
]
if self.last_screen_data_url and self.last_screen_meta:
compacted_input.append(
self._build_visual_message(
"Current screen after context compaction",
self.last_screen_data_url,
self.last_screen_meta,
)
)
return compacted_input
def run(self, job: str) -> AgentResult: def run(self, job: str) -> AgentResult:
self.objective = job
started_at = time.time() started_at = time.time()
self.logger.info("Starting run_id=%s model=%s", self.artifacts.run_id, self.options.model) self.logger.info("Starting run_id=%s model=%s", self.artifacts.run_id, self.options.model)
self.logger.info("Job: %s", job) self.logger.info("Job: %s", job)
@@ -648,6 +835,8 @@ class ScreenJobAgent:
{ {
"run_id": self.artifacts.run_id, "run_id": self.artifacts.run_id,
"model": self.options.model, "model": self.options.model,
"reasoning_effort": self.options.reasoning_effort,
"screen_context_decay_steps": self.options.screen_context_decay_steps,
"objective": job, "objective": job,
"disabled_tools": sorted(self.disabled_tools), "disabled_tools": sorted(self.disabled_tools),
}, },
@@ -664,6 +853,8 @@ class ScreenJobAgent:
f"JOB: {job}\n" f"JOB: {job}\n"
"You are in an action loop. Prefer execute_command for deterministic actions. " "You are in an action loop. Prefer execute_command for deterministic actions. "
"For modifier shortcuts, use a single press_key combo (example: win+r). " "For modifier shortcuts, use a single press_key combo (example: win+r). "
"Before clicking tiny buttons/icons or dense UI areas, call enhance first "
"(use region='small'; use mode='text' for tiny text labels). "
"You can return multiple tool calls in one step (example: click then sleep). " "You can return multiple tool calls in one step (example: click then sleep). "
"When done call task_complete(return=..., data=...). " "When done call task_complete(return=..., data=...). "
"Before task_complete, verify the screen content is what was expected " "Before task_complete, verify the screen content is what was expected "
@@ -692,6 +883,19 @@ class ScreenJobAgent:
self.step += 1 self.step += 1
self.logger.info("---- Agent step %d/%d ----", self.step, self.options.max_steps) self.logger.info("---- Agent step %d/%d ----", self.step, self.options.max_steps)
self._emit("step_started", {"step": self.step, "max_steps": self.options.max_steps}) self._emit("step_started", {"step": self.step, "max_steps": self.options.max_steps})
if self._should_compact_context():
self.previous_response_id = None
pending_input = self._build_compacted_pending_input()
self.last_context_compact_step = self.step
self.logger.info("Compacted model context at step %d.", self.step)
self._emit(
"context_compacted",
{
"step": self.step,
"decay_steps": self.options.screen_context_decay_steps,
"recent_tool_summaries": self.recent_tool_summaries[-8:],
},
)
try: try:
response = self._call_model(pending_input) response = self._call_model(pending_input)
self._register_usage(response) self._register_usage(response)
@@ -720,6 +924,8 @@ class ScreenJobAgent:
"text": ( "text": (
"No function call was returned. Continue by using tools. " "No function call was returned. Continue by using tools. "
"Use one press_key call for key combos like win+r. " "Use one press_key call for key combos like win+r. "
"Prefer enhance before clicking small/unclear targets "
"(region='small', mode='ui' or 'text'). "
"You may call multiple tools in one step. " "You may call multiple tools in one step. "
"Before task_complete, verify expected screen content with see_screen/enhance " "Before task_complete, verify expected screen content with see_screen/enhance "
"and include observed_result in data. " "and include observed_result in data. "
@@ -763,6 +969,7 @@ class ScreenJobAgent:
name, name,
json.dumps(result, ensure_ascii=False)[:2500], json.dumps(result, ensure_ascii=False)[:2500],
) )
self._record_tool_summary(name, result)
self._emit("tool_result", {"step": self.step, "tool": name, "result": result}) self._emit("tool_result", {"step": self.step, "tool": name, "result": result})
next_input.append( next_input.append(
{ {
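Review note: the compaction cadence implied by `_should_compact_context` can be simulated without a model. The sketch below rests on two assumptions that this diff does not show: `last_context_compact_step` starts at 0, and every model call stores a fresh `previous_response_id`. The name `compaction_steps` is illustrative only.

```python
def compaction_steps(max_steps: int, interval: int) -> list[int]:
    """Return the steps at which context compaction would trigger."""
    previous_response_id = None  # nothing to chain from before the first call
    last_compact_step = 0        # assumed initial value, not shown in the diff
    compacted: list[int] = []
    for step in range(1, max_steps + 1):
        should_compact = (
            interval > 0
            and previous_response_id is not None
            and (step - last_compact_step) >= interval
        )
        if should_compact:
            # Dropping the id starts a fresh conversation for this call.
            previous_response_id = None
            last_compact_step = step
            compacted.append(step)
        # Assumption: each call returns an id that is kept for the next step.
        previous_response_id = f"resp_{step}"
    return compacted


default_cadence = compaction_steps(12, 4)
disabled = compaction_steps(12, 0)
```

Under those assumptions the default of 4 compacts at steps 4, 8 and 12, and an interval of 0 never compacts, which matches the `--screen-context-decay-steps` help text.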


@@ -28,6 +28,18 @@ def build_parser() -> argparse.ArgumentParser:
parser.add_argument("--command-timeout", type=int, default=45, help="Timeout in seconds for execute_command.") parser.add_argument("--command-timeout", type=int, default=45, help="Timeout in seconds for execute_command.")
parser.add_argument("--type-interval", type=float, default=0.02, help="Seconds between typed characters.") parser.add_argument("--type-interval", type=float, default=0.02, help="Seconds between typed characters.")
parser.add_argument("--click-pause", type=float, default=0.10, help="Mouse move duration before click.") parser.add_argument("--click-pause", type=float, default=0.10, help="Mouse move duration before click.")
parser.add_argument(
"--reasoning-effort",
choices=["low", "medium", "high"],
default="medium",
help="Reasoning effort passed to the model.",
)
parser.add_argument(
"--screen-context-decay-steps",
type=int,
default=4,
help="Compact model context every N steps to decay old screenshots (0 disables).",
)
parser.add_argument("--disable-tool", action="append", default=[], help="Disable a tool by name.") parser.add_argument("--disable-tool", action="append", default=[], help="Disable a tool by name.")
parser.add_argument("--skip-safety-check", action="store_true", help="Bypass pre-flight safety check.") parser.add_argument("--skip-safety-check", action="store_true", help="Bypass pre-flight safety check.")
parser.add_argument("--no-failsafe", action="store_true", help="Disable PyAutoGUI fail-safe.") parser.add_argument("--no-failsafe", action="store_true", help="Disable PyAutoGUI fail-safe.")
@@ -78,6 +90,8 @@ def main(argv: list[str] | None = None) -> int:
command_timeout=args.command_timeout, command_timeout=args.command_timeout,
type_interval=args.type_interval, type_interval=args.type_interval,
click_pause=args.click_pause, click_pause=args.click_pause,
reasoning_effort=args.reasoning_effort,
screen_context_decay_steps=max(0, int(args.screen_context_decay_steps)),
disable_tools=set(disabled_tools), disable_tools=set(disabled_tools),
) )
try: try:


@@ -58,4 +58,6 @@ class RuntimeOptions:
command_timeout: int = 45 command_timeout: int = 45
type_interval: float = 0.02 type_interval: float = 0.02
click_pause: float = 0.10 click_pause: float = 0.10
reasoning_effort: str = "medium"
screen_context_decay_steps: int = 4
disable_tools: set[str] | None = None disable_tools: set[str] | None = None


@@ -16,6 +16,7 @@ from .config import AppConfig, load_app_config
from .storage import HistoryDB from .storage import HistoryDB
from .task_manager import JobManager from .task_manager import JobManager
from .ui import monitoring_js_path, monitoring_page_html from .ui import monitoring_js_path, monitoring_page_html
from .utils import utc_now_iso
class CreateJobRequest(BaseModel): class CreateJobRequest(BaseModel):
@@ -25,6 +26,8 @@ class CreateJobRequest(BaseModel):
command_timeout: int = Field(45, ge=1, le=600) command_timeout: int = Field(45, ge=1, le=600)
type_interval: float = Field(0.02, ge=0.0, le=1.0) type_interval: float = Field(0.02, ge=0.0, le=1.0)
click_pause: float = Field(0.10, ge=0.0, le=2.0) click_pause: float = Field(0.10, ge=0.0, le=2.0)
reasoning_effort: str = Field("medium", pattern="^(low|medium|high)$")
screen_context_decay_steps: int = Field(4, ge=0, le=50)
disabled_tools: list[str] = Field(default_factory=list) disabled_tools: list[str] = Field(default_factory=list)
safety_override: bool = False safety_override: bool = False
no_failsafe: bool = False no_failsafe: bool = False
@@ -301,6 +304,8 @@ def create_app(config: AppConfig | None = None) -> FastAPI:
command_timeout=payload.command_timeout, command_timeout=payload.command_timeout,
type_interval=payload.type_interval, type_interval=payload.type_interval,
click_pause=payload.click_pause, click_pause=payload.click_pause,
reasoning_effort=payload.reasoning_effort,
screen_context_decay_steps=payload.screen_context_decay_steps,
disabled_tools=payload.disabled_tools, disabled_tools=payload.disabled_tools,
safety_override=payload.safety_override, safety_override=payload.safety_override,
no_failsafe=payload.no_failsafe, no_failsafe=payload.no_failsafe,
@@ -382,6 +387,12 @@ def create_app(config: AppConfig | None = None) -> FastAPI:
def stats(_: None = Depends(require_token)) -> dict[str, Any]: def stats(_: None = Depends(require_token)) -> dict[str, Any]:
return manager.stats() return manager.stats()
@app.get("/api/analytics")
def analytics(_: None = Depends(require_token)) -> dict[str, Any]:
payload = manager.analytics()
payload["generated_at"] = utc_now_iso()
return payload
if not app_config.disable_ui: if not app_config.disable_ui:
@app.get("/", response_class=HTMLResponse) @app.get("/", response_class=HTMLResponse)
def ui_root() -> str: def ui_root() -> str:


@@ -7,6 +7,39 @@ from pathlib import Path
from typing import Any from typing import Any
_TERMINAL_STATUSES = {"completed", "failed", "cancelled"}
_CATEGORY_RULES: tuple[tuple[str, tuple[str, ...]], ...] = (
(
"Browser / web",
("browser", "website", "webpage", "chrome", "url", "amazon", "google", "login", "shopping", "checkout", "orders"),
),
(
"Files / terminal",
("file", "folder", "directory", "terminal", "shell", "command", "cli", "script", "git", "repo", "install", "pip", "npm", "powershell", "bash"),
),
(
"Writing / docs",
("write", "summary", "summarize", "document", "docs", "report", "email", "message", "readme", "markdown", "note", "proposal"),
),
(
"Data / analysis",
("data", "analysis", "analyze", "csv", "spreadsheet", "sheet", "table", "chart", "dashboard", "metric", "metrics", "sql"),
),
(
"Development / ops",
("code", "bug", "fix", "test", "debug", "api", "backend", "frontend", "database", "deploy", "docker", "service", "build"),
),
)
def _objective_category(objective: str) -> str:
text = objective.lower()
for category, keywords in _CATEGORY_RULES:
if any(keyword in text for keyword in keywords):
return category
return "Other"
class HistoryDB: class HistoryDB:
def __init__(self, db_path: Path) -> None: def __init__(self, db_path: Path) -> None:
self.db_path = db_path self.db_path = db_path
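Review note: `_objective_category` uses plain substring tests and returns the first matching rule, so short keywords can match inside unrelated words and rule order decides ties. The sketch below copies three of the rules from this hunk and compares them with a word-boundary variant. `category_word_boundary` is a hypothetical alternative for discussion, not something the commit contains.

```python
import re

# Three of the rules from the hunk above, copied verbatim.
RULES = (
    ("Browser / web",
     ("browser", "website", "webpage", "chrome", "url", "amazon", "google",
      "login", "shopping", "checkout", "orders")),
    ("Files / terminal",
     ("file", "folder", "directory", "terminal", "shell", "command", "cli",
      "script", "git", "repo", "install", "pip", "npm", "powershell", "bash")),
    ("Development / ops",
     ("code", "bug", "fix", "test", "debug", "api", "backend", "frontend",
      "database", "deploy", "docker", "service", "build")),
)


def category_substring(objective: str) -> str:
    """Same matching strategy as the diff: first rule with any substring hit."""
    text = objective.lower()
    for category, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            return category
    return "Other"


def category_word_boundary(objective: str) -> str:
    """Hypothetical variant: keywords must match as whole words."""
    text = objective.lower()
    for category, keywords in RULES:
        if any(re.search(rf"\b{re.escape(k)}\b", text) for k in keywords):
            return category
    return "Other"


profile_sub = category_substring("Update my profile picture")
profile_wb = category_word_boundary("Update my profile picture")
login_bug = category_substring("Fix the login bug")
files_wb = category_word_boundary("List the files in Downloads")
```

Substring matching files "profile" under "Files / terminal" because it contains "file". The whole-word variant avoids that but then misses plurals such as "files". Neither is clearly better for a lightweight dashboard, so this is only a trade-off to keep in mind when reading the per-category success rates.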
@@ -184,6 +217,131 @@ class HistoryDB:
).fetchone() ).fetchone()
return dict(totals) if totals else {} return dict(totals) if totals else {}
def analytics(self) -> dict[str, Any]:
with self._connect() as conn:
rows = conn.execute(
"""
SELECT job_id, objective, status, steps, estimated_cost_usd, created_at
FROM jobs
ORDER BY created_at ASC, job_id ASC
"""
).fetchall()
total_jobs = 0
finished_jobs = 0
completed_jobs = 0
failed_jobs = 0
cancelled_jobs = 0
steps_sum = 0
steps_count = 0
cost_sum = 0.0
cost_count = 0
by_category: dict[str, dict[str, Any]] = {}
by_day: dict[str, dict[str, Any]] = {}
def _bucket(target: dict[str, dict[str, Any]], key: str) -> dict[str, Any]:
bucket = target.setdefault(
key,
{
"label": key,
"total_jobs": 0,
"finished_jobs": 0,
"completed_jobs": 0,
"failed_jobs": 0,
"cancelled_jobs": 0,
"steps_sum": 0,
"steps_count": 0,
"cost_sum": 0.0,
"cost_count": 0,
},
)
return bucket
for row in rows:
total_jobs += 1
status = str(row["status"] or "")
finished = status in _TERMINAL_STATUSES
completed = status == "completed"
objective = str(row["objective"] or "")
category = _objective_category(objective)
created_at = str(row["created_at"] or "")
day = created_at[:10] if len(created_at) >= 10 else created_at or "unknown"
category_bucket = _bucket(by_category, category)
day_bucket = _bucket(by_day, day)
for bucket in (category_bucket, day_bucket):
bucket["total_jobs"] += 1
if not finished:
continue
finished_jobs += 1
if completed:
completed_jobs += 1
elif status == "failed":
failed_jobs += 1
elif status == "cancelled":
cancelled_jobs += 1
steps = row["steps"]
if steps is not None:
step_value = int(steps)
steps_sum += step_value
steps_count += 1
for bucket in (category_bucket, day_bucket):
bucket["steps_sum"] += step_value
bucket["steps_count"] += 1
estimated_cost = row["estimated_cost_usd"]
if estimated_cost is not None:
cost_value = float(estimated_cost)
cost_sum += cost_value
cost_count += 1
for bucket in (category_bucket, day_bucket):
bucket["cost_sum"] += cost_value
bucket["cost_count"] += 1
for bucket in (category_bucket, day_bucket):
bucket["finished_jobs"] += 1
if completed:
bucket["completed_jobs"] += 1
elif status == "failed":
bucket["failed_jobs"] += 1
elif status == "cancelled":
bucket["cancelled_jobs"] += 1
def _finalize(bucket: dict[str, Any]) -> dict[str, Any]:
finished = bucket["finished_jobs"]
return {
"label": bucket["label"],
"total_jobs": bucket["total_jobs"],
"finished_jobs": finished,
"completed_jobs": bucket["completed_jobs"],
"failed_jobs": bucket["failed_jobs"],
"cancelled_jobs": bucket["cancelled_jobs"],
"success_rate": round((bucket["completed_jobs"] / finished) * 100, 2) if finished else 0.0,
"avg_steps": round(bucket["steps_sum"] / bucket["steps_count"], 2) if bucket["steps_count"] else None,
"avg_cost_usd": round(bucket["cost_sum"] / bucket["cost_count"], 6) if bucket["cost_count"] else None,
}
category_rows = [_finalize(bucket) for bucket in by_category.values()]
category_rows.sort(key=lambda item: (-item["success_rate"], item["label"]))
day_rows = [_finalize(bucket) for bucket in by_day.values()]
day_rows.sort(key=lambda item: item["label"])
return {
"total_jobs": total_jobs,
"finished_jobs": finished_jobs,
"completed_jobs": completed_jobs,
"failed_jobs": failed_jobs,
"cancelled_jobs": cancelled_jobs,
"success_rate": round((completed_jobs / finished_jobs) * 100, 2) if finished_jobs else 0.0,
"avg_steps": round(steps_sum / steps_count, 2) if steps_count else None,
"avg_cost_usd": round(cost_sum / cost_count, 6) if cost_count else None,
"by_category": category_rows,
"timeline": day_rows,
}
def _row_to_job(self, row: sqlite3.Row) -> dict[str, Any]: def _row_to_job(self, row: sqlite3.Row) -> dict[str, Any]:
disabled_tools: list[str] = [] disabled_tools: list[str] = []
try: try:
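Review note: in `analytics()` the success rate and averages are computed over finished jobs only (completed, failed or cancelled); jobs in any other state count toward `total_jobs` but nothing else. The sketch below reproduces that arithmetic on in-memory tuples instead of SQLite rows. The `summarize` name, the `(status, steps)` row shape and the "running" status value are assumptions made for illustration.

```python
from typing import Optional

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def summarize(rows: list[tuple[str, Optional[int]]]) -> dict:
    """Aggregate (status, steps) rows the way the analytics hunk does."""
    total = finished = completed = 0
    steps_sum = steps_count = 0
    for status, steps in rows:
        total += 1
        if status not in TERMINAL_STATUSES:
            continue  # unfinished jobs only contribute to total_jobs
        finished += 1
        if status == "completed":
            completed += 1
        if steps is not None:
            steps_sum += int(steps)
            steps_count += 1
    return {
        "total_jobs": total,
        "finished_jobs": finished,
        "completed_jobs": completed,
        # Percentage of finished jobs that completed; 0.0 when none finished.
        "success_rate": round(completed / finished * 100, 2) if finished else 0.0,
        # None (not 0) when no finished job reported a step count.
        "avg_steps": round(steps_sum / steps_count, 2) if steps_count else None,
    }


summary = summarize([
    ("completed", 6),
    ("completed", 10),
    ("failed", None),
    ("running", 3),
    ("cancelled", 2),
])
empty = summarize([])
```

Cancelled jobs sit in the denominator, so a user cancelling a run lowers the reported success rate just as a failure does.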


@@ -48,6 +48,8 @@ class JobManager:
command_timeout: int = 45, command_timeout: int = 45,
type_interval: float = 0.02, type_interval: float = 0.02,
click_pause: float = 0.10, click_pause: float = 0.10,
reasoning_effort: str = "medium",
screen_context_decay_steps: int = 4,
disabled_tools: list[str] | None = None, disabled_tools: list[str] | None = None,
safety_override: bool = False, safety_override: bool = False,
no_failsafe: bool = False, no_failsafe: bool = False,
@@ -93,6 +95,8 @@ class JobManager:
"command_timeout": command_timeout, "command_timeout": command_timeout,
"type_interval": type_interval, "type_interval": type_interval,
"click_pause": click_pause, "click_pause": click_pause,
"reasoning_effort": reasoning_effort,
"screen_context_decay_steps": screen_context_decay_steps,
"no_failsafe": no_failsafe, "no_failsafe": no_failsafe,
"cancel_event": cancel_event, "cancel_event": cancel_event,
}, },
@@ -121,6 +125,8 @@ class JobManager:
command_timeout: int, command_timeout: int,
type_interval: float, type_interval: float,
click_pause: float, click_pause: float,
reasoning_effort: str,
screen_context_decay_steps: int,
no_failsafe: bool, no_failsafe: bool,
cancel_event: threading.Event, cancel_event: threading.Event,
) -> None: ) -> None:
@@ -218,6 +224,8 @@ class JobManager:
command_timeout=command_timeout, command_timeout=command_timeout,
type_interval=type_interval, type_interval=type_interval,
click_pause=click_pause, click_pause=click_pause,
reasoning_effort=reasoning_effort,
screen_context_decay_steps=max(0, int(screen_context_decay_steps)),
disable_tools=set(disabled_tools), disable_tools=set(disabled_tools),
) )
try: try:
@@ -343,6 +351,9 @@ class JobManager:
stats["live_running_threads"] = sum(1 for job in self._running.values() if job.thread.is_alive()) stats["live_running_threads"] = sum(1 for job in self._running.values() if job.thread.is_alive())
return stats return stats
def analytics(self) -> dict[str, Any]:
return self.db.analytics()
def _normalize_job_payload(self, job: dict[str, Any]) -> dict[str, Any]: def _normalize_job_payload(self, job: dict[str, Any]) -> dict[str, Any]:
response = job.get("response") response = job.get("response")
if not isinstance(response, dict): if not isinstance(response, dict):


@@ -21,6 +21,30 @@
<section class="grid grid-cols-2 md:grid-cols-6 gap-3" id="stats"></section> <section class="grid grid-cols-2 md:grid-cols-6 gap-3" id="stats"></section>
<section class="space-y-3">
<div class="flex items-center justify-between gap-3">
<h2 class="font-semibold">Analytics</h2>
<div id="analyticsMeta" class="text-[11px] text-slate-400"></div>
</div>
<div id="analyticsSummary" class="grid grid-cols-2 md:grid-cols-4 gap-3"></div>
<div class="grid grid-cols-1 xl:grid-cols-2 gap-4">
<div class="bg-slate-900/70 border border-slate-800 rounded-xl p-4 space-y-3">
<div class="flex items-center justify-between gap-3">
<h3 class="font-semibold text-sm">Success by Objective Category</h3>
<div id="analyticsCategorySummary" class="text-[11px] text-slate-400"></div>
</div>
<div id="analyticsCategories" class="space-y-3"></div>
</div>
<div class="bg-slate-900/70 border border-slate-800 rounded-xl p-4 space-y-3">
<div class="flex items-center justify-between gap-3">
<h3 class="font-semibold text-sm">Avg Steps / Cost Over Time</h3>
<div id="analyticsTrendSummary" class="text-[11px] text-slate-400"></div>
</div>
<div id="analyticsTrends" class="space-y-4"></div>
</div>
</div>
</section>
<section class="grid grid-cols-1 lg:grid-cols-5 gap-4"> <section class="grid grid-cols-1 lg:grid-cols-5 gap-4">
<div class="lg:col-span-2 bg-slate-900/70 border border-slate-800 rounded-xl p-4"> <div class="lg:col-span-2 bg-slate-900/70 border border-slate-800 rounded-xl p-4">
<div class="flex items-center justify-between mb-3"> <div class="flex items-center justify-between mb-3">


@@ -17,6 +17,12 @@ const replayPrevBtn = document.getElementById("replayPrevBtn");
const replayNextBtn = document.getElementById("replayNextBtn"); const replayNextBtn = document.getElementById("replayNextBtn");
const replaySpeedEl = document.getElementById("replaySpeed"); const replaySpeedEl = document.getElementById("replaySpeed");
const replaySeekEl = document.getElementById("replaySeek"); const replaySeekEl = document.getElementById("replaySeek");
const analyticsMetaEl = document.getElementById("analyticsMeta");
const analyticsSummaryEl = document.getElementById("analyticsSummary");
const analyticsCategorySummaryEl = document.getElementById("analyticsCategorySummary");
const analyticsCategoriesEl = document.getElementById("analyticsCategories");
const analyticsTrendSummaryEl = document.getElementById("analyticsTrendSummary");
const analyticsTrendsEl = document.getElementById("analyticsTrends");
const state = { const state = {
token: localStorage.getItem("screenjob_token") || "", token: localStorage.getItem("screenjob_token") || "",
@@ -35,6 +41,7 @@ const state = {
} }
}; };
const manuallyClosedSockets = new WeakSet(); const manuallyClosedSockets = new WeakSet();
const analyticsRefreshEvents = new Set(["job_finished", "job_failed", "job_rejected"]);
tokenInput.value = state.token; tokenInput.value = state.token;
function authHeaders() { function authHeaders() {
@@ -66,6 +73,197 @@ function renderStats(stats) {
`).join(""); `).join("");
} }
function escapeHtml(value) {
return String(value ?? "").replace(/[&<>"']/g, (ch) => ({
"&": "&amp;",
"<": "&lt;",
">": "&gt;",
'"': "&quot;",
"'": "&#39;"
})[ch]);
}
function formatNumber(value, digits = 2) {
const num = Number(value);
return Number.isFinite(num) ? num.toFixed(digits) : "—";
}
function formatCurrency(value, digits = 6) {
const num = Number(value);
return Number.isFinite(num) ? `$${num.toFixed(digits)}` : "—";
}
function formatPercent(value) {
const num = Number(value);
return Number.isFinite(num) ? `${num.toFixed(1)}%` : "—";
}
function formatDateLabel(value) {
const dt = new Date(value);
if (Number.isNaN(dt.getTime())) return String(value || "—");
return dt.toLocaleDateString(undefined, { month: "short", day: "numeric" });
}
function renderMetricCard(label, value) {
return `
<div class="bg-slate-950 border border-slate-800 rounded-xl p-3">
<div class="text-[11px] uppercase tracking-wide text-slate-400">${escapeHtml(label)}</div>
<div class="text-xl font-semibold mt-1">${escapeHtml(value)}</div>
</div>
`;
}
function renderLineChart(title, points, options = {}) {
const color = options.color || "#22d3ee";
const valueLabel = options.valueLabel || "";
const sourcePoints = Array.isArray(points)
? points.filter((point) => Number.isFinite(Number(point.value)))
: [];
if (!sourcePoints.length) {
return `
<div class="rounded-lg border border-slate-800 bg-slate-950/70 p-3">
<div class="flex items-center justify-between gap-3">
<div>
<div class="text-xs text-slate-400">${escapeHtml(title)}</div>
<div class="text-sm text-slate-200 font-semibold">No data yet</div>
</div>
</div>
</div>
`;
}
const width = 640;
const height = 220;
const margin = { top: 20, right: 18, bottom: 34, left: 44 };
const values = sourcePoints.map((point) => Number(point.value));
const minValue = Math.min(...values);
const maxValue = Math.max(...values);
const span = maxValue - minValue || 1;
const chartWidth = width - margin.left - margin.right;
const chartHeight = height - margin.top - margin.bottom;
const xStep = sourcePoints.length > 1 ? chartWidth / (sourcePoints.length - 1) : 0;
const coords = sourcePoints.map((point, index) => ({
x: margin.left + (index * xStep),
y: margin.top + ((maxValue - Number(point.value)) / span) * chartHeight,
}));
const linePath = coords.map((point, index) => `${index === 0 ? "M" : "L"} ${point.x} ${point.y}`).join(" ");
const baseline = height - margin.bottom;
const midIndex = Math.floor(sourcePoints.length / 2);
const xLabels = [
{ index: 0, label: sourcePoints[0].label },
{ index: midIndex, label: sourcePoints[midIndex].label },
{ index: sourcePoints.length - 1, label: sourcePoints[sourcePoints.length - 1].label },
].filter((item, index, array) => item.label && array.findIndex((candidate) => candidate.index === item.index) === index);
const minLabel = options.formatValue ? options.formatValue(minValue) : formatNumber(minValue, 2);
const maxLabel = options.formatValue ? options.formatValue(maxValue) : formatNumber(maxValue, 2);
const latest = sourcePoints[sourcePoints.length - 1];
const latestValue = options.formatValue ? options.formatValue(latest.value) : formatNumber(latest.value, 2);
return `
<div class="rounded-lg border border-slate-800 bg-slate-950/70 p-3 space-y-2">
<div class="flex items-center justify-between gap-3">
<div>
<div class="text-xs text-slate-400">${escapeHtml(title)}</div>
<div class="text-sm text-slate-200 font-semibold">${escapeHtml(latestValue)}${valueLabel ? ` <span class="text-slate-500 font-normal">${escapeHtml(valueLabel)}</span>` : ""}</div>
</div>
<div class="text-[11px] text-slate-400 text-right">
<div>${escapeHtml(sourcePoints.length)} points</div>
<div>${escapeHtml(minLabel)} - ${escapeHtml(maxLabel)}</div>
</div>
</div>
<svg viewBox="0 0 ${width} ${height}" class="w-full h-56">
${Array.from({ length: 4 }, (_, idx) => {
const y = margin.top + (chartHeight / 3) * idx;
return `<line x1="${margin.left}" y1="${y}" x2="${width - margin.right}" y2="${y}" stroke="rgba(51, 65, 85, 0.7)" stroke-width="1" />`;
}).join("")}
<line x1="${margin.left}" y1="${baseline}" x2="${width - margin.right}" y2="${baseline}" stroke="rgba(71, 85, 105, 0.8)" stroke-width="1.5" />
<path d="${linePath}" fill="none" stroke="${color}" stroke-width="3" stroke-linecap="round" stroke-linejoin="round" />
${coords.map((point) => `
<circle cx="${point.x}" cy="${point.y}" r="4.5" fill="${color}" />
`).join("")}
<text x="${margin.left - 8}" y="${margin.top + 4}" text-anchor="end" class="fill-slate-400 text-[10px]">${escapeHtml(maxLabel)}</text>
<text x="${margin.left - 8}" y="${baseline}" text-anchor="end" class="fill-slate-400 text-[10px]">${escapeHtml(minLabel)}</text>
${xLabels.map((item) => `
<text x="${coords[item.index].x}" y="${height - 10}" text-anchor="middle" class="fill-slate-500 text-[10px]">${escapeHtml(formatDateLabel(item.label))}</text>
`).join("")}
</svg>
</div>
`;
}
function renderAnalytics(payload) {
const analytics = payload || {};
const categories = Array.isArray(analytics.by_category) ? analytics.by_category : [];
const timeline = Array.isArray(analytics.timeline) ? analytics.timeline : [];
const finishedCategories = categories.filter((row) => Number(row.finished_jobs || 0) > 0);
if (analyticsMetaEl) {
analyticsMetaEl.textContent = analytics.generated_at
? `Updated ${new Date(analytics.generated_at).toLocaleString()}`
: "Historical snapshot";
}
analyticsSummaryEl.innerHTML = [
renderMetricCard("Finished Jobs", analytics.finished_jobs || 0),
renderMetricCard("Success Rate", formatPercent(analytics.success_rate)),
renderMetricCard("Avg Steps", formatNumber(analytics.avg_steps, 1)),
renderMetricCard("Avg Cost", formatCurrency(analytics.avg_cost_usd)),
].join("");
analyticsCategorySummaryEl.textContent = finishedCategories.length
? `${finishedCategories.length} categories`
: "No finished jobs yet";
if (finishedCategories.length) {
analyticsCategoriesEl.innerHTML = finishedCategories.map((row) => {
const successRate = Number(row.success_rate || 0);
const completed = Number(row.completed_jobs || 0);
const finished = Number(row.finished_jobs || 0);
const total = Number(row.total_jobs || 0);
const avgSteps = row.avg_steps == null ? "—" : formatNumber(row.avg_steps, 1);
const avgCost = row.avg_cost_usd == null ? "—" : formatCurrency(row.avg_cost_usd);
return `
<div class="rounded-lg border border-slate-800 bg-slate-950/70 p-3 space-y-2">
<div class="flex items-start justify-between gap-3">
<div>
<div class="font-medium">${escapeHtml(row.label || "Other")}</div>
<div class="text-[11px] text-slate-400">${finished} finished · ${completed} completed · ${total} total</div>
</div>
<div class="text-right">
<div class="text-base font-semibold">${formatPercent(successRate)}</div>
<div class="text-[11px] text-slate-500">success rate</div>
</div>
</div>
<div class="h-2 rounded bg-slate-800 overflow-hidden">
<div class="h-full rounded bg-cyan-400" style="width: ${Math.max(0, Math.min(successRate, 100))}%"></div>
</div>
<div class="grid grid-cols-2 gap-2 text-[11px] text-slate-300">
<div>Avg steps: ${escapeHtml(avgSteps)}</div>
<div>Avg cost: ${escapeHtml(avgCost)}</div>
</div>
</div>
`;
}).join("");
} else {
analyticsCategoriesEl.innerHTML = `
<div class="rounded-lg border border-dashed border-slate-800 bg-slate-950/70 p-4 text-sm text-slate-400">
No finished jobs yet.
</div>
`;
}
analyticsTrendSummaryEl.textContent = timeline.length ? `${timeline.length} days` : "No daily data yet";
analyticsTrendsEl.innerHTML = [
renderLineChart("Average steps per day", timeline.map((row) => ({ label: row.label, value: row.avg_steps })), { color: "#38bdf8" }),
renderLineChart("Average cost per day", timeline.map((row) => ({ label: row.label, value: row.avg_cost_usd })), {
color: "#34d399",
valueLabel: "USD",
formatValue: (value) => formatCurrency(value),
}),
].join("");
}
function renderJobs() {
jobListEl.innerHTML = state.jobs.map((job) => {
const active = job.job_id === state.selectedJobId;
@@ -310,6 +508,11 @@ async function refreshStats() {
renderStats(payload);
}
async function refreshAnalytics() {
const payload = await api("/api/analytics");
renderAnalytics(payload);
}
async function refreshJobDetail() {
if (!state.selectedJobId) return;
const [job, events, replay] = await Promise.all([
@@ -345,6 +548,9 @@ function connectWs() {
}
await refreshJobs();
await refreshStats();
if (analyticsRefreshEvents.has(payload.event_type)) {
await refreshAnalytics();
}
} catch (err) {
console.error(err);
}
@@ -362,6 +568,7 @@ function connectWs() {
async function fullRefresh() {
await refreshJobs();
await refreshStats();
await refreshAnalytics();
await refreshJobDetail();
}


@@ -15,10 +15,76 @@ function Test-EnvVarLine {
return [bool](Select-String -Path $FilePath -Pattern ("^\s*" + [regex]::Escape($Name) + "=") -Quiet)
}
if (-not (Get-Command python -ErrorAction SilentlyContinue)) { function Resolve-PythonExecutable {
throw "Python was not found in PATH. Install Python 3.11+ and retry." $venvPython = Join-Path $scriptDir ".venv\Scripts\python.exe"
if (Test-Path -LiteralPath $venvPython) {
return $venvPython
}
$pythonCmd = Get-Command python -ErrorAction SilentlyContinue
if ($null -ne $pythonCmd -and (Test-Path -LiteralPath $pythonCmd.Source)) {
return $pythonCmd.Source
}
$candidatePyLaunchers = @()
$pyFromPath = Get-Command py -ErrorAction SilentlyContinue
if ($null -ne $pyFromPath -and (Test-Path -LiteralPath $pyFromPath.Source)) {
$candidatePyLaunchers += $pyFromPath.Source
}
$candidatePyLaunchers += "C:\Windows\py.exe"
if ($scriptDir -match "^[A-Za-z]:\\Users\\[^\\]+") {
$repoUserHome = $Matches[0]
$candidatePyLaunchers += (Join-Path $repoUserHome "AppData\Local\Programs\Python\Launcher\py.exe")
}
foreach ($pyLauncher in ($candidatePyLaunchers | Select-Object -Unique)) {
if (-not (Test-Path -LiteralPath $pyLauncher)) {
continue
}
try {
$resolved = (& $pyLauncher -3 -c "import sys; print(sys.executable)" 2>$null | Select-Object -Last 1).Trim()
if ($resolved -and (Test-Path -LiteralPath $resolved)) {
return $resolved
}
} catch {
continue
}
}
$candidatePythonPaths = @()
if ($scriptDir -match "^[A-Za-z]:\\Users\\[^\\]+") {
$repoUserHome = $Matches[0]
$pythonBase = Join-Path $repoUserHome "AppData\Local\Programs\Python"
if (Test-Path -LiteralPath $pythonBase) {
$candidatePythonPaths += (Get-ChildItem -LiteralPath $pythonBase -Directory -ErrorAction SilentlyContinue |
Sort-Object Name -Descending |
ForEach-Object { Join-Path $_.FullName "python.exe" })
}
}
$candidatePythonPaths += @(
"C:\Python314\python.exe",
"C:\Python313\python.exe",
"C:\Python312\python.exe",
"C:\Python311\python.exe",
"C:\Program Files\Python314\python.exe",
"C:\Program Files\Python313\python.exe",
"C:\Program Files\Python312\python.exe",
"C:\Program Files\Python311\python.exe"
)
foreach ($candidate in ($candidatePythonPaths | Select-Object -Unique)) {
if (Test-Path -LiteralPath $candidate) {
return $candidate
}
}
throw "Python was not found. Install Python 3.11+ system-wide, or create .venv in the repo root."
}
$pythonExe = Resolve-PythonExecutable
$envFile = Join-Path $scriptDir ".env"
if (-not (Test-Path -LiteralPath $envFile)) {
Write-Warning ".env was not found at $envFile. Server startup may fail if required vars are missing."
@@ -31,5 +97,5 @@ if (-not (Test-Path -LiteralPath $envFile)) {
}
}
Write-Host "Starting ScreenJob backend on configured host/port..." -ForegroundColor Cyan Write-Host "Starting ScreenJob backend with Python: $pythonExe" -ForegroundColor Cyan
python main.py server & $pythonExe main.py server
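
The lookup order in `Resolve-PythonExecutable` (repo `.venv` first, then `python` on PATH, then known install paths) can be sketched in Python as follows. This is a simplified illustration, not the script's implementation: `resolve_python`, its `fallback_paths` parameter, and the injectable `which` argument are hypothetical names, and the `py` launcher probe is left out.

```python
import shutil
from pathlib import Path
from typing import Callable, Iterable, Optional


def resolve_python(
    repo_root: Path,
    fallback_paths: Iterable[Path] = (),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Path:
    """Return the first interpreter found, preferring the repo's virtualenv."""
    # 1. A repo-local virtualenv wins so the backend runs with the project's packages.
    venv_python = repo_root / ".venv" / "Scripts" / "python.exe"
    if venv_python.is_file():
        return venv_python

    # 2. Otherwise use whatever "python" resolves to on PATH.
    on_path = which("python")
    if on_path and Path(on_path).is_file():
        return Path(on_path)

    # 3. Finally probe well-known install locations, in the order given.
    for candidate in fallback_paths:
        if candidate.is_file():
            return candidate

    raise FileNotFoundError(
        "Python was not found. Install Python 3.11+ system-wide, "
        "or create .venv in the repo root."
    )
```

Passing `which` as a parameter is only a convenience for testing the ordering without touching the real PATH.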


@@ -0,0 +1,11 @@
Option Explicit
Dim shell, fso, scriptDir, psScript, command
Set shell = CreateObject("WScript.Shell")
Set fso = CreateObject("Scripting.FileSystemObject")
scriptDir = fso.GetParentFolderName(WScript.ScriptFullName)
psScript = """" & fso.BuildPath(scriptDir, "screenjob_tray.ps1") & """"
command = "powershell.exe -NoProfile -ExecutionPolicy Bypass -WindowStyle Hidden -STA -File " & psScript
shell.Run command, 0, False


@@ -91,6 +91,41 @@ def test_click_supports_directional_offsets(tmp_path: Path, monkeypatch) -> None
assert click_result["clicked"] == {"x": 110, "y": 102}
def test_enhance_defaults_to_small_ui_preset(tmp_path: Path, monkeypatch) -> None:
agent = _build_agent(tmp_path, monkeypatch)
result = agent._tool_enhance({"coordinate": {"x": 100, "y": 120}})
assert result["ok"] is True
meta = result["meta"]
assert meta["region"] == "small"
assert meta["mode"] == "ui"
assert meta["scale"] == 4
assert Path(meta["path"]).exists()
assert meta["target_pixel"]["x"] >= 0
assert meta["target_pixel"]["y"] >= 0
def test_enhance_supports_text_mode_and_scale_clamp(tmp_path: Path, monkeypatch) -> None:
agent = _build_agent(tmp_path, monkeypatch)
result = agent._tool_enhance(
{
"coordinate": {"x": -99, "y": 9999},
"region": "medium",
"mode": "text",
"scale": 99,
}
)
assert result["ok"] is True
meta = result["meta"]
assert meta["region"] == "medium"
assert meta["mode"] == "text"
assert meta["scale"] == 6
assert meta["requested_coord"] == {"x": -99, "y": 9999}
assert meta["source_coord"] == {"x": 0, "y": 719}
assert Path(meta["path"]).exists()
def test_press_key_supports_hotkey_combo(tmp_path: Path, monkeypatch) -> None:
agent = _build_agent(tmp_path, monkeypatch)
result = agent._tool_press_key({"key": "meta+r"})
@@ -98,3 +133,21 @@ def test_press_key_supports_hotkey_combo(tmp_path: Path, monkeypatch) -> None:
assert result["key"] == "win+r"
assert result["message"] == "Key combo executed."
assert agent_module.pyautogui.last_hotkey == ("win", "r")
def test_context_compaction_trigger_and_payload(tmp_path: Path, monkeypatch) -> None:
agent = _build_agent(tmp_path, monkeypatch)
agent.objective = "Open settings app"
agent.previous_response_id = "resp_123"
agent.step = 4
agent.last_context_compact_step = 0
agent.options.screen_context_decay_steps = 4
agent.recent_tool_summaries = ["step=1 tool=see_screen status=ok"]
agent.last_screen_data_url = "data:image/png;base64,abc"
agent.last_screen_meta = {"width": 1280, "height": 720, "path": "C:/tmp/frame.png"}
assert agent._should_compact_context() is True
compacted = agent._build_compacted_pending_input()
assert len(compacted) == 2
assert "Context compaction activated" in compacted[0]["content"][0]["text"]
assert "Open settings app" in compacted[0]["content"][0]["text"]
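
The clamping and compaction behaviour these tests assert can be approximated with the sketch below. The agent's real `_tool_enhance` and `_should_compact_context` are not shown in this diff, so the helper names, the lower scale bound of 1, and the `step - last_compact_step >= decay_steps` rule are assumptions chosen only to match the asserted values, on the assumption of a 1280x720 screen.

```python
def clamp(value: int, low: int, high: int) -> int:
    """Limit value to the inclusive range [low, high]."""
    return max(low, min(value, high))


def clamp_to_screen(x: int, y: int, width: int, height: int) -> dict[str, int]:
    """Pull an out-of-bounds coordinate back onto the nearest screen pixel."""
    return {"x": clamp(x, 0, width - 1), "y": clamp(y, 0, height - 1)}


def should_compact(step: int, last_compact_step: int, decay_steps: int) -> bool:
    """Assumed rule: compact once decay_steps steps have passed since the last compaction."""
    if decay_steps <= 0:
        return False  # assumed: a non-positive value disables compaction
    return step - last_compact_step >= decay_steps


# Values taken from the tests above.
print(clamp_to_screen(-99, 9999, 1280, 720))
print(clamp(99, 1, 6))  # upper bound 6 is from the test; lower bound 1 is assumed
print(should_compact(step=4, last_compact_step=0, decay_steps=4))
```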


@@ -29,7 +29,10 @@ def test_cli_emits_structured_return_and_data(monkeypatch: Any, capsys, tmp_path
def fake_assess_task_safety(*_args, **_kwargs):
return True, "safe", {"safe": True}
captured_kwargs: dict[str, Any] = {}
def fake_run_job(*_args, **_kwargs):
captured_kwargs.update(_kwargs)
result = AgentResult(
completed=True,
result="Done",
@@ -66,3 +69,5 @@ def test_cli_emits_structured_return_and_data(monkeypatch: Any, capsys, tmp_path
assert payload["response"]["data"] == "file1.txt\nfile2.txt"
assert payload["return"] == "Task completed successfully"
assert payload["data"] == "file1.txt\nfile2.txt"
assert captured_kwargs["options"].reasoning_effort == "medium"
assert captured_kwargs["options"].screen_context_decay_steps == 4


@@ -9,6 +9,24 @@ import src.server as server_module
from src.config import AppConfig
_TERMINAL_STATUSES = {"completed", "failed", "cancelled"}
def _objective_category(objective: str) -> str:
text = objective.lower()
if any(keyword in text for keyword in ("browser", "website", "amazon", "google", "login", "shopping", "checkout", "orders")):
return "Browser / web"
if any(keyword in text for keyword in ("file", "folder", "directory", "terminal", "shell", "command", "cli", "script", "git", "repo", "install", "pip", "npm")):
return "Files / terminal"
if any(keyword in text for keyword in ("write", "summary", "document", "docs", "report", "email", "message", "readme", "markdown")):
return "Writing / docs"
if any(keyword in text for keyword in ("data", "analysis", "csv", "spreadsheet", "sheet", "table", "chart", "dashboard", "metric", "sql")):
return "Data / analysis"
if any(keyword in text for keyword in ("code", "bug", "fix", "test", "debug", "api", "backend", "frontend", "database", "deploy", "docker", "service", "build")):
return "Development / ops"
return "Other"
class FakeJobManager:
def __init__(self, *, config: AppConfig, db: Any, broadcast: Any = None) -> None:
self.config = config
@@ -26,6 +44,8 @@ class FakeJobManager:
command_timeout: int = 45,
type_interval: float = 0.02,
click_pause: float = 0.10,
reasoning_effort: str = "medium",
screen_context_decay_steps: int = 4,
disabled_tools: list[str] | None = None,
safety_override: bool = False,
no_failsafe: bool = False,
@@ -37,6 +57,7 @@ class FakeJobManager:
artifacts_dir.mkdir(parents=True, exist_ok=True)
screenshot_path = artifacts_dir / "screen_step_001.png"
screenshot_path.write_bytes(b"not-a-real-png")
created_at = f"2026-05-27T00:00:{self._counter:02d}Z"
self.last_submit_payload = {
"objective": objective,
"model": selected_model,
@@ -46,6 +67,8 @@ class FakeJobManager:
"command_timeout": command_timeout,
"type_interval": type_interval,
"click_pause": click_pause,
"reasoning_effort": reasoning_effort,
"screen_context_decay_steps": screen_context_decay_steps,
"no_failsafe": no_failsafe,
}
self._jobs[job_id] = {
@@ -53,6 +76,10 @@ class FakeJobManager:
"objective": objective,
"model": selected_model,
"status": "running",
"created_at": created_at,
"started_at": created_at,
"ended_at": None,
"steps": 1,
"result": "Running",
"response": {"return": "Running", "data": None},
"return": "Running",
@@ -145,6 +172,114 @@ class FakeJobManager:
"live_running_threads": 0,
}
def analytics(self) -> dict[str, Any]:
by_category: dict[str, dict[str, Any]] = {}
by_day: dict[str, dict[str, Any]] = {}
def bucket(target: dict[str, dict[str, Any]], key: str) -> dict[str, Any]:
return target.setdefault(
key,
{
"label": key,
"total_jobs": 0,
"finished_jobs": 0,
"completed_jobs": 0,
"failed_jobs": 0,
"cancelled_jobs": 0,
"steps_sum": 0,
"steps_count": 0,
"cost_sum": 0.0,
"cost_count": 0,
},
)
total_jobs = 0
finished_jobs = 0
completed_jobs = 0
failed_jobs = 0
cancelled_jobs = 0
steps_sum = 0
steps_count = 0
cost_sum = 0.0
cost_count = 0
for job in self._jobs.values():
total_jobs += 1
status = str(job.get("status") or "")
finished = status in _TERMINAL_STATUSES
category = _objective_category(str(job.get("objective") or ""))
day = str(job.get("created_at") or "")[:10] or "unknown"
category_bucket = bucket(by_category, category)
day_bucket = bucket(by_day, day)
for item in (category_bucket, day_bucket):
item["total_jobs"] += 1
if not finished:
continue
finished_jobs += 1
if status == "completed":
completed_jobs += 1
elif status == "failed":
failed_jobs += 1
elif status == "cancelled":
cancelled_jobs += 1
steps_raw = job.get("steps")
if steps_raw is not None:
steps = int(steps_raw)
steps_sum += steps
steps_count += 1
for item in (category_bucket, day_bucket):
item["steps_sum"] += steps
item["steps_count"] += 1
estimated_cost_raw = (job.get("usage") or {}).get("estimated_cost_usd")
if estimated_cost_raw is not None:
estimated_cost = float(estimated_cost_raw)
cost_sum += estimated_cost
cost_count += 1
for item in (category_bucket, day_bucket):
item["cost_sum"] += estimated_cost
item["cost_count"] += 1
for item in (category_bucket, day_bucket):
item["finished_jobs"] += 1
if status == "completed":
item["completed_jobs"] += 1
elif status == "failed":
item["failed_jobs"] += 1
elif status == "cancelled":
item["cancelled_jobs"] += 1
def finalize(item: dict[str, Any]) -> dict[str, Any]:
finished = item["finished_jobs"]
return {
"label": item["label"],
"total_jobs": item["total_jobs"],
"finished_jobs": finished,
"completed_jobs": item["completed_jobs"],
"failed_jobs": item["failed_jobs"],
"cancelled_jobs": item["cancelled_jobs"],
"success_rate": round((item["completed_jobs"] / finished) * 100, 2) if finished else 0.0,
"avg_steps": round(item["steps_sum"] / item["steps_count"], 2) if item["steps_count"] else None,
"avg_cost_usd": round(item["cost_sum"] / item["cost_count"], 6) if item["cost_count"] else None,
}
return {
"total_jobs": total_jobs,
"finished_jobs": finished_jobs,
"completed_jobs": completed_jobs,
"failed_jobs": failed_jobs,
"cancelled_jobs": cancelled_jobs,
"success_rate": round((completed_jobs / finished_jobs) * 100, 2) if finished_jobs else 0.0,
"avg_steps": round(steps_sum / steps_count, 2) if steps_count else None,
"avg_cost_usd": round(cost_sum / cost_count, 6) if cost_count else None,
"by_category": sorted((finalize(item) for item in by_category.values()), key=lambda item: (-item["success_rate"], item["label"])),
"timeline": sorted((finalize(item) for item in by_day.values()), key=lambda item: item["label"]),
}
def _build_app(tmp_path: Path, monkeypatch: Any, disable_ui: bool = False):
monkeypatch.setattr(server_module, "JobManager", FakeJobManager)
@@ -189,6 +324,8 @@ def test_create_job_returns_only_job_id_and_defaults_model(tmp_path: Path, monke
manager = app.state.manager
assert manager.last_submit_payload["model"] == "gpt-5.4-mini"
assert manager.last_submit_payload["disabled_tools"] == ["click"]
assert manager.last_submit_payload["reasoning_effort"] == "medium"
assert manager.last_submit_payload["screen_context_decay_steps"] == 4
status_res = client.get(f"/api/jobs/{job_id}/status", headers=headers)
assert status_res.status_code == 200
@@ -270,12 +407,67 @@ def test_replay_endpoint_skips_visual_paths_outside_artifacts(tmp_path: Path, mo
assert payload["total_frames"] == 1
def test_analytics_endpoint_groups_by_category_and_time(tmp_path: Path, monkeypatch: Any) -> None:
app, _ = _build_app(tmp_path, monkeypatch, disable_ui=False)
manager = app.state.manager
client = TestClient(app)
headers = {"Authorization": "Bearer test_token"}
browser_completed = client.post("/api/jobs", headers=headers, json={"job": "Open amazon.de and checkout"}).json()["job_id"]
browser_failed = client.post("/api/jobs", headers=headers, json={"job": "Open website and login"}).json()["job_id"]
terminal_completed = client.post("/api/jobs", headers=headers, json={"job": "Run a shell command to inspect files"}).json()["job_id"]
manager._jobs[browser_completed].update(
status="completed",
ended_at="2026-05-27T00:10:00Z",
steps=4,
created_at="2026-05-27T00:00:01Z",
usage={**manager._jobs[browser_completed]["usage"], "estimated_cost_usd": 0.12},
)
manager._jobs[browser_failed].update(
status="failed",
ended_at="2026-05-28T00:10:00Z",
steps=6,
created_at="2026-05-28T00:00:01Z",
usage={**manager._jobs[browser_failed]["usage"], "estimated_cost_usd": 0.24},
)
manager._jobs[terminal_completed].update(
status="completed",
ended_at="2026-05-28T00:15:00Z",
steps=10,
created_at="2026-05-28T00:00:02Z",
usage={**manager._jobs[terminal_completed]["usage"], "estimated_cost_usd": 0.05},
)
analytics = client.get("/api/analytics", headers=headers)
assert analytics.status_code == 200
payload = analytics.json()
assert payload["total_jobs"] == 3
assert payload["finished_jobs"] == 3
assert payload["completed_jobs"] == 2
assert payload["failed_jobs"] == 1
assert payload["success_rate"] == 66.67
assert payload["avg_steps"] == 6.67
assert payload["avg_cost_usd"] == 0.136667
browser = next(row for row in payload["by_category"] if row["label"] == "Browser / web")
terminal = next(row for row in payload["by_category"] if row["label"] == "Files / terminal")
assert browser["finished_jobs"] == 2
assert browser["success_rate"] == 50.0
assert browser["avg_steps"] == 5.0
assert terminal["success_rate"] == 100.0
assert [row["label"] for row in payload["timeline"]] == ["2026-05-27", "2026-05-28"]
def test_ui_toggle(tmp_path: Path, monkeypatch: Any) -> None:
app_enabled, _ = _build_app(tmp_path / "enabled", monkeypatch, disable_ui=False)
client_enabled = TestClient(app_enabled)
root_enabled = client_enabled.get("/")
assert root_enabled.status_code == 200
assert "ScreenJob Monitor" in root_enabled.text
assert "Success by Objective Category" in root_enabled.text
js_enabled = client_enabled.get("/ui/monitoring.js")
assert js_enabled.status_code == 200
assert "const tokenInput" in js_enabled.text
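
The `_objective_category` helper in this test file uses first-match-wins substring checks, which has one consequence worth seeing in isolation: a keyword such as `cli` also matches inside `click`. The sketch below reproduces only the first two rule groups from the diff; the table-driven `RULES` layout and the `categorize` name are illustrative, not the project's code.

```python
# Abbreviated copy of the first two keyword groups from _objective_category.
RULES = [
    ("Browser / web", ("browser", "website", "amazon", "google", "login",
                       "shopping", "checkout", "orders")),
    ("Files / terminal", ("file", "folder", "directory", "terminal", "shell",
                          "command", "cli", "script", "git", "repo",
                          "install", "pip", "npm")),
]


def categorize(objective: str) -> str:
    """Return the label of the first rule group with a keyword found in the text."""
    text = objective.lower()
    for label, keywords in RULES:
        # Plain substring test: "cli" also matches inside "click".
        if any(keyword in text for keyword in keywords):
            return label
    return "Other"


print(categorize("Open amazon.de and checkout"))           # Browser / web
print(categorize("Run a shell command to inspect files"))  # Files / terminal
print(categorize("Click the Start button"))                # Files / terminal (via "cli")
```

If that last result is unwanted, matching on whole words would avoid it, at the cost of missing plurals such as `files`.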


@@ -72,3 +72,55 @@ def test_storage_response_fallback_uses_result_when_json_missing(tmp_path: Path)
assert job is not None
assert job["response"]["return"] == "Legacy result string"
assert job["response"]["data"] is None
def test_history_db_analytics_groups_by_category_and_day(tmp_path: Path) -> None:
db = HistoryDB(tmp_path / "screenjob_test_analytics.db")
db.create_job(
job_id="job_browser_ok",
objective="Open amazon.de and checkout",
model="gpt-5.4-mini",
created_at="2026-05-27T00:00:01Z",
safety_override=False,
disabled_tools=[],
)
db.update_job("job_browser_ok", status="completed", steps=4, estimated_cost_usd=0.12)
db.create_job(
job_id="job_browser_fail",
objective="Open website and login",
model="gpt-5.4-mini",
created_at="2026-05-28T00:00:01Z",
safety_override=False,
disabled_tools=[],
)
db.update_job("job_browser_fail", status="failed", steps=6, estimated_cost_usd=0.24)
db.create_job(
job_id="job_terminal_ok",
objective="Run a shell command to inspect files",
model="gpt-5.4-mini",
created_at="2026-05-28T00:00:02Z",
safety_override=False,
disabled_tools=[],
)
db.update_job("job_terminal_ok", status="completed", steps=10, estimated_cost_usd=0.05)
analytics = db.analytics()
assert analytics["total_jobs"] == 3
assert analytics["finished_jobs"] == 3
assert analytics["completed_jobs"] == 2
assert analytics["failed_jobs"] == 1
assert analytics["success_rate"] == 66.67
assert analytics["avg_steps"] == 6.67
assert analytics["avg_cost_usd"] == 0.136667
browser = next(row for row in analytics["by_category"] if row["label"] == "Browser / web")
terminal = next(row for row in analytics["by_category"] if row["label"] == "Files / terminal")
assert browser["finished_jobs"] == 2
assert browser["success_rate"] == 50.0
assert browser["avg_steps"] == 5.0
assert terminal["success_rate"] == 100.0
assert [row["label"] for row in analytics["timeline"]] == ["2026-05-27", "2026-05-28"]
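
The expected values in this test (66.67, 6.67, 0.136667) follow from the three jobs' statuses, step counts, and costs. A short worked example of that arithmetic, using the same rounding as the `finalize` helper in the fake manager (2 decimals for rates and steps, 6 for cost); the `summarize` function itself is a hypothetical stand-in, not `HistoryDB.analytics()`:

```python
def summarize(jobs: list[dict]) -> dict:
    """Aggregate finished jobs the way the analytics tests expect."""
    finished = len(jobs)
    completed = sum(1 for job in jobs if job["status"] == "completed")
    return {
        "success_rate": round(completed / finished * 100, 2) if finished else 0.0,
        "avg_steps": round(sum(j["steps"] for j in jobs) / finished, 2) if finished else None,
        "avg_cost_usd": round(sum(j["cost"] for j in jobs) / finished, 6) if finished else None,
    }


jobs = [
    {"status": "completed", "steps": 4, "cost": 0.12},   # browser, 2026-05-27
    {"status": "failed", "steps": 6, "cost": 0.24},      # browser, 2026-05-28
    {"status": "completed", "steps": 10, "cost": 0.05},  # terminal, 2026-05-28
]

print(summarize(jobs))      # overall: 2 of 3 completed, 20 steps / 3, 0.41 USD / 3
print(summarize(jobs[:2]))  # browser category only: 1 of 2 completed, 10 steps / 2
```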

10
todo.md

@@ -4,12 +4,12 @@
- [Bug] Enforce single active desktop-control run (or a strict queue) so concurrent jobs cannot fight over the same mouse/keyboard/screen session.
- [Bug] Fix run artifact collisions in `setup_artifacts()` (`run_id` is second-granularity, so two jobs in the same second can share/overwrite the same directory).
- [Bug] Remove global logger handler clobbering in `setup_logger()` (`logging.getLogger("screenjob").handlers.clear()` breaks concurrent runs and can redirect logs to the wrong file).
- [Bug] More consistent clicks and more uses of enhance images. - [x] More consistent clicks and more uses of enhance images.
## P1
- [Idea] Move ui.py into a seperate html file and js file. - [x] Move ui.py into a seperate html file and js file.
- [Idea] Think harder using effort "medium" by default. - [x] Think harder using effort "medium" by default.
- [Idea] Decay old screenshots after 3 to 5 steps to save (1) tokens and (2) brain fuck in the agents. - [x] Decay old screenshots after 3 to 5 steps to save (1) tokens and (2) brain fuck in the agents.
- [Bug] Validate `disabled_tools` against an allowlist and disallow disabling critical completion flow (`task_complete`) to avoid guaranteed step-limit failures.
- [Bug] Improve `execute_command` cancellation/timeout handling to terminate full process trees, not only the parent shell process.
@@ -20,4 +20,4 @@
## P3
- [x] Add Replay Mode; Ability to replay a session by reconstructing the screen from screenshots and overlaying tool calls and click and type events.
- [Idea] Add lightweight analytics dashboards (success rate by objective category, avg steps/cost over time). - [x] Add lightweight analytics dashboards (success rate by objective category, avg steps/cost over time).

53
tray_service_control.ps1 Normal file

@@ -0,0 +1,53 @@
[CmdletBinding()]
param(
[ValidateSet("start", "stop", "restart")]
[string]$Action,
[string]$ServiceName = "ScreenJobBackend"
)
Set-StrictMode -Version Latest
$ErrorActionPreference = "Stop"
function Wait-ForStatus {
param(
[Parameter(Mandatory = $true)]$Service,
[Parameter(Mandatory = $true)][System.ServiceProcess.ServiceControllerStatus]$TargetStatus,
[int]$TimeoutSeconds = 20
)
$deadline = (Get-Date).AddSeconds($TimeoutSeconds)
while ((Get-Date) -lt $deadline) {
$Service.Refresh()
if ($Service.Status -eq $TargetStatus) {
return
}
Start-Sleep -Milliseconds 350
}
throw "Timed out waiting for service '$($Service.ServiceName)' to reach status '$TargetStatus'."
}
$service = Get-Service -Name $ServiceName -ErrorAction Stop
switch ($Action) {
"start" {
if ($service.Status -ne [System.ServiceProcess.ServiceControllerStatus]::Running) {
Start-Service -Name $ServiceName -ErrorAction Stop
Wait-ForStatus -Service $service -TargetStatus ([System.ServiceProcess.ServiceControllerStatus]::Running)
}
}
"stop" {
if ($service.Status -ne [System.ServiceProcess.ServiceControllerStatus]::Stopped) {
Stop-Service -Name $ServiceName -Force -ErrorAction Stop
Wait-ForStatus -Service $service -TargetStatus ([System.ServiceProcess.ServiceControllerStatus]::Stopped)
}
}
"restart" {
if ($service.Status -eq [System.ServiceProcess.ServiceControllerStatus]::Running) {
Restart-Service -Name $ServiceName -Force -ErrorAction Stop
} else {
Start-Service -Name $ServiceName -ErrorAction Stop
}
Wait-ForStatus -Service $service -TargetStatus ([System.ServiceProcess.ServiceControllerStatus]::Running)
}
}
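
`Wait-ForStatus` is a poll-until-deadline loop: refresh the status, return once it matches, sleep briefly, and fail after the timeout. A minimal Python sketch of the same pattern follows; `wait_for_status` and its injectable `clock` and `sleep` parameters are hypothetical, added so the loop can be exercised without a real service.

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],
    target: str,
    timeout_seconds: float = 20.0,
    poll_interval: float = 0.35,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll get_status() until it returns target or the deadline passes."""
    deadline = clock() + timeout_seconds
    while clock() < deadline:
        if get_status() == target:
            return
        sleep(poll_interval)  # back off briefly before asking again
    raise TimeoutError(f"Timed out waiting for status '{target}'.")


# Usage with a scripted status sequence instead of a real service.
statuses = iter(["StartPending", "StartPending", "Running"])
wait_for_status(lambda: next(statuses), "Running", sleep=lambda _seconds: None)
```

A monotonic clock is used here so that wall-clock adjustments cannot shorten or extend the wait; the PowerShell script compares `Get-Date` values instead.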


@@ -0,0 +1,36 @@
[CmdletBinding(SupportsShouldProcess = $true)]
param(
[string]$ServiceName = "ScreenJobBackend"
)
Set-StrictMode -Version Latest
$ErrorActionPreference = "Stop"
function Test-IsAdministrator {
$identity = [Security.Principal.WindowsIdentity]::GetCurrent()
$principal = New-Object Security.Principal.WindowsPrincipal($identity)
return $principal.IsInRole([Security.Principal.WindowsBuiltInRole]::Administrator)
}
if (-not (Test-IsAdministrator)) {
throw "Run this script from an elevated PowerShell session (Run as Administrator)."
}
$service = Get-Service -Name $ServiceName -ErrorAction SilentlyContinue
if ($null -eq $service) {
Write-Host "Service '$ServiceName' is not installed."
exit 0
}
if ($PSCmdlet.ShouldProcess($ServiceName, "Uninstall service")) {
if ($service.Status -ne "Stopped") {
Stop-Service -Name $ServiceName -Force -ErrorAction Stop
}
& sc.exe delete $ServiceName | Out-Null
if ($LASTEXITCODE -ne 0) {
throw "Failed to delete service '$ServiceName' (sc.exe exit code $LASTEXITCODE)."
}
}
Write-Host "Service '$ServiceName' uninstalled successfully." -ForegroundColor Green