Compare commits

8 Commits

Author SHA1 Message Date
Space-Banane
8126b57404 Add lightweight analytics dashboard 2026-05-27 22:34:26 +02:00
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
All checks were successful
CI / test (push) Successful in 7s
CI / test (pull_request) Successful in 7s
Space-Banane
cceed18cf1 feat: (literally) "enhance" functionality with new parameters and improved image processing 2026-05-27 22:14:32 +02:00
All checks were successful
CI / test (push) Successful in 7s
Space-Banane
880468ef02 Mark completed P1 TODO items as done 2026-05-27 22:05:57 +02:00
Space-Banane
b05a7be668 Compact screenshot context every 4 steps by default 2026-05-27 22:04:15 +02:00
Space-Banane
0c019474af Default model reasoning effort to medium 2026-05-27 22:02:20 +02:00
Space-Banane
a8ef8ee552 Split monitor UI into separate HTML and JS assets 2026-05-27 22:01:06 +02:00
All checks were successful
CI / test (push) Successful in 7s
Space-Banane
111a1e84af feat: implement replay functionality with UI controls and backend support 2026-05-27 21:57:37 +02:00
Space-Banane
620fcc4aa6 removed slop 2026-05-27 21:53:32 +02:00
16 changed files with 1782 additions and 329 deletions

View File

@@ -156,13 +156,21 @@ Each job payload includes:
- Read-only dashboard (no run controls)
- Requires token input
- Live updates via `/ws`
- Analytics dashboards for success rate by objective category and daily averages
- Set `DISABLE_UI=true` to disable UI
### Analytics API
- `GET /api/analytics`
- Returns objective-category success rates plus average steps/cost over time
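
A minimal sketch of how a client might summarize the analytics payload. The field names (`success_rate`, `completed_jobs`, `finished_jobs`, `by_category`, `label`, `avg_steps`) follow the `analytics()` return value in this change; the `summarize_analytics` helper and the sample values are hypothetical, and fetching the payload (including how the token is sent) is deliberately left out.

```python
from typing import Any


def summarize_analytics(payload: dict[str, Any]) -> list[str]:
    """Turn an analytics payload into short human-readable lines."""
    lines = [
        f"overall: {payload['success_rate']:.2f}% success "
        f"({payload['completed_jobs']}/{payload['finished_jobs']} finished jobs)"
    ]
    for row in payload.get("by_category", []):
        # avg_steps is None when no finished job in the bucket reported steps.
        avg_steps = row.get("avg_steps")
        steps_text = "n/a" if avg_steps is None else f"{avg_steps:.2f}"
        lines.append(f"{row['label']}: {row['success_rate']:.2f}% success, avg steps {steps_text}")
    return lines


# Hypothetical sample payload; real values come from the server.
sample = {
    "success_rate": 75.0,
    "completed_jobs": 3,
    "finished_jobs": 4,
    "by_category": [
        {"label": "Browser / web", "success_rate": 100.0, "avg_steps": 6.5},
        {"label": "Other", "success_rate": 0.0, "avg_steps": None},
    ],
}
for line in summarize_analytics(sample):
    print(line)
```

Note that `success_rate` is computed over finished jobs only, so queued or running jobs do not lower it.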
## Agent Instructions (Practical)
- Prefer `execute_command` for deterministic actions (opening URLs, filesystem checks).
- Use `see_screen` before UI interaction.
- Use `enhance` when text is unclear.
- Use `enhance` before clicking small/ambiguous targets; prefer `region="small"` for compact controls.
- Use `enhance` `mode="text"` for tiny labels/text, or `mode="ui"` for general UI.
- Optionally set `enhance` `scale` (2-6) for tighter zoom control.
- Use `press_key` for non-text keys (Enter, Tab, arrows, Escape).
- For shortcuts, use one `press_key` call with combo syntax (example: `win+r`).
- Use `click` offsets via `offset_up/down/left/right` and optional `sleep_after_seconds`.

View File

@@ -37,6 +37,14 @@ Keyboard combo rule:
- For shortcuts, use one `press_key` call with combo syntax, for example: `win+r`, `ctrl+shift+esc`.
- Do not split modifier combos into separate calls.
Enhance-first click rule:
- Before clicking small buttons/icons, dense UI, or ambiguous targets, call `enhance` first.
- Preferred preset for tiny controls: `enhance(coordinate, region="small", mode="ui")`.
- For tiny labels/text: use `mode="text"` to improve readability.
- Optional zoom control: set `scale` from `2` to `6` (defaults are tuned by region).
- After checking the enhanced image, click using the same target coordinate (or a small directional offset if needed).
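
The rule above can be illustrated with a small sketch of the arguments an `enhance` call would carry. The argument names (`coordinate`, `region`, `mode`, `scale`) and their allowed values come from the tool schema in this change; the `build_enhance_args` helper is hypothetical and validates strictly, whereas the agent itself falls back to defaults on unknown values.

```python
from typing import Any, Optional

REGIONS = {"small", "medium", "large"}
MODES = {"ui", "text"}


def build_enhance_args(
    x: int,
    y: int,
    region: str = "small",
    mode: str = "ui",
    scale: Optional[int] = None,
) -> dict[str, Any]:
    """Build an argument dict for the enhance tool (hypothetical helper)."""
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region}")
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    args: dict[str, Any] = {"coordinate": {"x": x, "y": y}, "region": region, "mode": mode}
    if scale is not None:
        # The schema documents a zoom factor from 2 to 6.
        if not 2 <= scale <= 6:
            raise ValueError("scale must be between 2 and 6")
        args["scale"] = scale
    return args


# Preferred preset for a tiny control, then a tiny text label with explicit zoom.
print(build_enhance_args(840, 412))
print(build_enhance_args(840, 412, mode="text", scale=5))
```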
Verification rule:
- Before `task_complete`, verify actual on-screen content matches the expected outcome.

View File

@@ -9,7 +9,7 @@ import traceback
from typing import Any, Callable
from openai import OpenAI
-from PIL import Image, ImageEnhance, ImageFilter, ImageOps
+from PIL import Image, ImageDraw, ImageEnhance, ImageFilter, ImageOps
from .models import AgentResult, RunArtifacts, RuntimeOptions, UsageSummary
from .pricing import estimate_cost_usd
@@ -34,7 +34,8 @@ Rules:
- launching apps or running terminal checks
3) For UI tasks, inspect with see_screen before clicking/typing.
4) Coordinates are absolute screen pixels (x, y) from top-left.
-5) Use enhance(coordinate) when text/UI is unclear.
+5) Use enhance before risky clicks: small buttons/icons, dense UI, or when target confidence is below high.
+5a) For tiny controls use enhance(coordinate, region="small", mode="ui"). For tiny text use mode="text".
6) For keyboard-heavy interactions, prefer press_key for special keys.
6a) For key combinations, call press_key once with combo syntax (example: "win+r", "ctrl+shift+esc"). Do not split modifier combos across separate calls.
7) You may call multiple tools in one step. If needed, do click then sleep.
@@ -76,11 +77,14 @@ class ScreenJobAgent:
self.final_data: Any | None = None
self.previous_response_id: str | None = None
self.usage = UsageSummary()
self.objective = ""
self.last_screen_data_url: str | None = None
self.last_screen_meta: dict[str, Any] | None = None
self.click_history: list[tuple[int, int, float]] = []
self.disabled_tools = {tool.strip() for tool in (options.disable_tools or set()) if tool.strip()}
self.recent_tool_summaries: list[str] = []
self.last_context_compact_step = 0
def _emit(self, event_type: str, payload: dict[str, Any]) -> None:
if self.event_callback is None:
@@ -192,7 +196,10 @@ class ScreenJobAgent:
{
"type": "function",
"name": "enhance",
-"description": "Create enhanced zoom around a coordinate for readability.",
+"description": (
+"Create enhanced zoom around a coordinate for readability and precise targeting. "
+"Prefer this before clicking tiny or ambiguous UI targets."
+),
"parameters": {
"type": "object",
"properties": {
@@ -204,7 +211,19 @@ class ScreenJobAgent:
},
"required": ["x", "y"],
"additionalProperties": False,
-}
+},
"region": {
"type": "string",
"enum": ["small", "medium", "large"],
},
"mode": {
"type": "string",
"enum": ["ui", "text"],
},
"scale": {
"type": ["integer", "string"],
"description": "Zoom factor from 2 to 6. Defaults by region.",
},
},
"required": ["coordinate"],
"additionalProperties": False,
@@ -352,6 +371,23 @@ class ScreenJobAgent:
sec = max_seconds
return sec
def _parse_int(self, value: Any, default: int = 0) -> int:
if value is None:
return default
if isinstance(value, bool):
return int(value)
if isinstance(value, int):
return value
if isinstance(value, float):
return int(round(value))
text = str(value).strip()
if not text:
return default
try:
return int(float(text))
except Exception: # noqa: BLE001
return default
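
A standalone sketch of the coercion rules in `_parse_int`, to make the edge cases visible. The logic mirrors the method above, except that it is a free function named `parse_int` and catches only `ValueError` and `OverflowError` rather than every exception; both differences are choices made for this illustration.

```python
from typing import Any


def parse_int(value: Any, default: int = 0) -> int:
    """Coerce loosely typed tool arguments to int, falling back to a default."""
    if value is None:
        return default
    if isinstance(value, bool):
        return int(value)
    if isinstance(value, int):
        return value
    if isinstance(value, float):
        # Floats are rounded to the nearest integer.
        return int(round(value))
    text = str(value).strip()
    if not text:
        return default
    try:
        # Numeric strings go through float() and are then truncated toward zero.
        return int(float(text))
    except (ValueError, OverflowError):
        return default


print(parse_int(4.7))             # float input: rounded
print(parse_int("4.7"))           # string input: truncated
print(parse_int(" 3 "))           # surrounding whitespace is ignored
print(parse_int("abc", default=2))
```

One consequence worth knowing: the float `4.7` becomes `5`, while the string `"4.7"` becomes `4`, because the two branches round differently.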
def _tool_see_screen(self, _: dict[str, Any]) -> dict[str, Any]:
image, meta = self._capture_screen(with_grid=True)
out_path = self.artifacts.shots_dir / f"screen_step_{self.step:03d}.png"
@@ -369,34 +405,106 @@ class ScreenJobAgent:
def _tool_enhance(self, args: dict[str, Any]) -> dict[str, Any]:
coord = args.get("coordinate") or {}
-x = int(coord.get("x", 0))
-y = int(coord.get("y", 0))
+requested_x = self._parse_int(coord.get("x", 0), default=0)
+requested_y = self._parse_int(coord.get("y", 0), default=0)
region = str(args.get("region", "small") or "small").strip().lower()
mode = str(args.get("mode", "ui") or "ui").strip().lower()
if region not in {"small", "medium", "large"}:
region = "small"
if mode not in {"ui", "text"}:
mode = "ui"
region_half_by_preset = {
"small": 96,
"medium": 160,
"large": 240,
}
default_scale_by_region = {
"small": 4,
"medium": 3,
"large": 2,
}
raw_scale = self._parse_int(args.get("scale"), default=0)
scale = raw_scale if raw_scale > 0 else default_scale_by_region[region]
scale = clamp(scale, 2, 6)
base, base_meta = self._capture_screen(with_grid=False)
width, height = base.size
-region_half = 180
-left = clamp(x - region_half, 0, width - 1)
-top = clamp(y - region_half, 0, height - 1)
-right = clamp(x + region_half, left + 1, width)
-bottom = clamp(y + region_half, top + 1, height)
+source_x = clamp(requested_x, 0, max(0, width - 1))
+source_y = clamp(requested_y, 0, max(0, height - 1))
+region_half = region_half_by_preset[region]
+left = clamp(source_x - region_half, 0, width - 1)
+top = clamp(source_y - region_half, 0, height - 1)
+right = clamp(source_x + region_half, left + 1, width)
+bottom = clamp(source_y + region_half, top + 1, height)
crop = base.crop((left, top, right, bottom))
-upscaled = crop.resize((crop.width * 2, crop.height * 2), Image.Resampling.BICUBIC)
-enhanced = ImageOps.autocontrast(upscaled)
-enhanced = ImageEnhance.Sharpness(enhanced).enhance(2.0)
-enhanced = ImageEnhance.Contrast(enhanced).enhance(1.25)
-enhanced = enhanced.filter(ImageFilter.UnsharpMask(radius=1.8, percent=180, threshold=2))
+out_w = max(2, crop.width * scale)
+out_h = max(2, crop.height * scale)
+upscaled = crop.resize((out_w, out_h), Image.Resampling.LANCZOS)
-out_path = self.artifacts.enhance_dir / f"enhance_step_{self.step:03d}_{x}_{y}.png"
if mode == "text":
text_view = ImageOps.grayscale(upscaled)
text_view = ImageOps.autocontrast(text_view, cutoff=1)
text_view = ImageOps.equalize(text_view)
text_view = ImageEnhance.Contrast(text_view).enhance(1.35)
text_view = ImageEnhance.Sharpness(text_view).enhance(2.1)
processed = text_view.filter(ImageFilter.UnsharpMask(radius=1.2, percent=160, threshold=1)).convert("RGB")
else:
ui_view = ImageOps.autocontrast(upscaled, cutoff=1)
ui_view = ImageEnhance.Contrast(ui_view).enhance(1.2)
ui_view = ImageEnhance.Sharpness(ui_view).enhance(1.8)
processed = ui_view.filter(ImageFilter.UnsharpMask(radius=1.4, percent=150, threshold=2)).convert("RGB")
edges = upscaled.convert("L").filter(ImageFilter.FIND_EDGES)
edges = ImageOps.autocontrast(edges, cutoff=4)
edge_overlay = ImageOps.colorize(edges, black=(0, 0, 0), white=(60, 220, 255))
enhanced = Image.blend(processed, edge_overlay, alpha=0.18)
cx = clamp((source_x - left) * scale, 0, max(0, enhanced.width - 1))
cy = clamp((source_y - top) * scale, 0, max(0, enhanced.height - 1))
draw = ImageDraw.Draw(enhanced)
draw.rectangle([0, 0, enhanced.width - 1, enhanced.height - 1], outline=(255, 80, 80), width=2)
ring_radius = max(10, int(6 * scale / 2))
arm_len = max(14, int(9 * scale / 2))
gap = max(4, int(2 * scale / 2))
line_width = max(2, int(scale / 2))
draw.ellipse(
[cx - ring_radius, cy - ring_radius, cx + ring_radius, cy + ring_radius],
outline=(255, 80, 80),
width=line_width,
)
draw.line([(max(0, cx - arm_len), cy), (max(0, cx - gap), cy)], fill=(255, 80, 80), width=line_width)
draw.line(
[(min(enhanced.width - 1, cx + gap), cy), (min(enhanced.width - 1, cx + arm_len), cy)],
fill=(255, 80, 80),
width=line_width,
)
draw.line([(cx, max(0, cy - arm_len)), (cx, max(0, cy - gap))], fill=(255, 80, 80), width=line_width)
draw.line(
[(cx, min(enhanced.height - 1, cy + gap)), (cx, min(enhanced.height - 1, cy + arm_len))],
fill=(255, 80, 80),
width=line_width,
)
out_path = self.artifacts.enhance_dir / (
f"enhance_step_{self.step:03d}_{source_x}_{source_y}_{region}_{mode}_x{scale}.png"
)
self._save_image(enhanced, out_path)
data_url = image_to_data_url(enhanced, "PNG")
meta = {
"captured_at": utc_now_iso(),
-"source_coord": {"x": x, "y": y},
+"requested_coord": {"x": requested_x, "y": requested_y},
+"source_coord": {"x": source_x, "y": source_y},
"source_box": {"left": left, "top": top, "right": right, "bottom": bottom},
-"scale": 2,
+"region": region,
+"mode": mode,
+"scale": scale,
"path": str(out_path.resolve()),
"size": {"width": enhanced.width, "height": enhanced.height},
"target_pixel": {"x": cx, "y": cy},
"screen_size": {"width": width, "height": height},
"base_capture_meta": base_meta,
}
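
The crop-and-scale geometry in `_tool_enhance` can be checked with a small worked example. This is a sketch under stated assumptions: `clamp(value, low, high)` is assumed to behave like the helper used above (its definition is not part of this diff), and `enhance_geometry` and `enhanced_to_screen` are hypothetical names introduced here. The second function shows how a pixel in the enhanced image could be mapped back to screen coordinates using `source_box` and `scale` from the returned metadata.

```python
def clamp(value: int, low: int, high: int) -> int:
    """Assumed behavior of the project's clamp helper."""
    return max(low, min(value, high))


REGION_HALF = {"small": 96, "medium": 160, "large": 240}


def enhance_geometry(x: int, y: int, width: int, height: int, region: str = "small", scale: int = 4) -> dict:
    """Reproduce the crop box and marker position computed by the enhance tool."""
    half = REGION_HALF[region]
    source_x = clamp(x, 0, max(0, width - 1))
    source_y = clamp(y, 0, max(0, height - 1))
    left = clamp(source_x - half, 0, width - 1)
    top = clamp(source_y - half, 0, height - 1)
    right = clamp(source_x + half, left + 1, width)
    bottom = clamp(source_y + half, top + 1, height)
    out_w = max(2, (right - left) * scale)
    out_h = max(2, (bottom - top) * scale)
    cx = clamp((source_x - left) * scale, 0, max(0, out_w - 1))
    cy = clamp((source_y - top) * scale, 0, max(0, out_h - 1))
    return {"box": (left, top, right, bottom), "size": (out_w, out_h), "target_pixel": (cx, cy)}


def enhanced_to_screen(px: int, py: int, box: tuple, scale: int) -> tuple:
    """Map a pixel in the enhanced image back to absolute screen coordinates."""
    left, top, _, _ = box
    return left + px // scale, top + py // scale


# A target near the left screen edge: the crop is cut off at x=0,
# so the marker is not centered in the enhanced image.
geo = enhance_geometry(10, 500, width=1920, height=1080)
print(geo)
print(enhanced_to_screen(*geo["target_pixel"], geo["box"], 4))
```

Near screen edges the crop is truncated rather than shifted, which is why the metadata reports `target_pixel` explicitly instead of assuming the image center.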
@@ -628,6 +736,9 @@ class ScreenJobAgent:
return {"_raw": raw}
def _call_model(self, input_items: list[dict[str, Any]]) -> Any:
effort = str(self.options.reasoning_effort or "medium").strip().lower()
if effort not in {"low", "medium", "high"}:
effort = "medium"
return self.client.responses.create(
model=self.options.model,
instructions=SYSTEM_PROMPT,
@@ -636,9 +747,85 @@ class ScreenJobAgent:
previous_response_id=self.previous_response_id,
parallel_tool_calls=True,
max_tool_calls=8,
reasoning={"effort": effort},
)
def _record_tool_summary(self, tool_name: str, result: dict[str, Any]) -> None:
ok = bool(result.get("ok"))
status = "ok" if ok else "fail"
summary = f"step={self.step} tool={tool_name} status={status}"
if tool_name == "click":
clicked = result.get("clicked") if isinstance(result.get("clicked"), dict) else {}
x = clicked.get("x")
y = clicked.get("y")
if isinstance(x, int) and isinstance(y, int):
summary = f"{summary} at=({x},{y})"
elif tool_name == "type":
typed_length = int(result.get("typed_length", 0) or 0)
summary = f"{summary} typed_length={typed_length}"
elif tool_name == "press_key":
key = str(result.get("key") or "").strip()
if key:
summary = f"{summary} key={key}"
elif tool_name == "execute_command":
exit_code = result.get("exit_code")
if exit_code is not None:
summary = f"{summary} exit_code={exit_code}"
elif tool_name in {"see_screen", "enhance"}:
meta = result.get("meta") if isinstance(result.get("meta"), dict) else {}
path = str(meta.get("path") or result.get("path") or "").strip()
if path:
summary = f"{summary} image={path}"
if not ok:
error_text = str(result.get("error") or "").strip()
if error_text:
summary = f"{summary} error={error_text[:140]}"
self.recent_tool_summaries.append(summary)
self.recent_tool_summaries = self.recent_tool_summaries[-20:]
def _should_compact_context(self) -> bool:
interval = max(0, int(self.options.screen_context_decay_steps or 0))
if interval <= 0:
return False
if self.previous_response_id is None:
return False
return (self.step - self.last_context_compact_step) >= interval
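
To see which steps trigger compaction, the check above can be simulated in isolation. This sketch assumes that every model call stores a response id, so `previous_response_id` is set again right after a compaction resets it; that assignment is not visible in this diff. The `compaction_steps` function is a hypothetical stand-in for the agent loop.

```python
def compaction_steps(interval: int, max_steps: int) -> list[int]:
    """Return the step numbers at which context compaction would fire."""
    compacted: list[int] = []
    last_compact_step = 0
    has_previous_response = False
    for step in range(1, max_steps + 1):
        if interval > 0 and has_previous_response and step - last_compact_step >= interval:
            # Mirrors the run loop: drop the response chain and remember the step.
            has_previous_response = False
            last_compact_step = step
            compacted.append(step)
        # Assumption: the model call that follows always yields a response id.
        has_previous_response = True
    return compacted


print(compaction_steps(4, 12))  # default interval
print(compaction_steps(0, 12))  # 0 disables compaction
```

Under these assumptions the default of 4 compacts at steps 4, 8 and 12, and the first step never compacts because there is no prior response to discard.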
def _build_compacted_pending_input(self) -> list[dict[str, Any]]:
recent = self.recent_tool_summaries[-8:]
lines = "\n".join(f"- {line}" for line in recent) if recent else "- No recent tool activity."
content = (
"Context compaction activated to decay stale screenshots and reduce token usage.\n"
f"JOB: {self.objective}\n"
f"Current step: {self.step}\n"
"Recent tool activity:\n"
f"{lines}\n"
"Continue execution from the latest screen state. "
"Use tools only, and finish with task_complete when done."
)
compacted_input: list[dict[str, Any]] = [
{
"role": "user",
"content": [
{
"type": "input_text",
"text": content,
}
],
}
]
if self.last_screen_data_url and self.last_screen_meta:
compacted_input.append(
self._build_visual_message(
"Current screen after context compaction",
self.last_screen_data_url,
self.last_screen_meta,
)
)
return compacted_input
def run(self, job: str) -> AgentResult:
self.objective = job
started_at = time.time()
self.logger.info("Starting run_id=%s model=%s", self.artifacts.run_id, self.options.model)
self.logger.info("Job: %s", job)
@@ -648,6 +835,8 @@ class ScreenJobAgent:
{
"run_id": self.artifacts.run_id,
"model": self.options.model,
"reasoning_effort": self.options.reasoning_effort,
"screen_context_decay_steps": self.options.screen_context_decay_steps,
"objective": job,
"disabled_tools": sorted(self.disabled_tools),
},
@@ -664,6 +853,8 @@ class ScreenJobAgent:
f"JOB: {job}\n"
"You are in an action loop. Prefer execute_command for deterministic actions. "
"For modifier shortcuts, use a single press_key combo (example: win+r). "
"Before clicking tiny buttons/icons or dense UI areas, call enhance first "
"(use region='small'; use mode='text' for tiny text labels). "
"You can return multiple tool calls in one step (example: click then sleep). "
"When done call task_complete(return=..., data=...). "
"Before task_complete, verify the screen content is what was expected "
@@ -692,6 +883,19 @@ class ScreenJobAgent:
self.step += 1
self.logger.info("---- Agent step %d/%d ----", self.step, self.options.max_steps)
self._emit("step_started", {"step": self.step, "max_steps": self.options.max_steps})
if self._should_compact_context():
self.previous_response_id = None
pending_input = self._build_compacted_pending_input()
self.last_context_compact_step = self.step
self.logger.info("Compacted model context at step %d.", self.step)
self._emit(
"context_compacted",
{
"step": self.step,
"decay_steps": self.options.screen_context_decay_steps,
"recent_tool_summaries": self.recent_tool_summaries[-8:],
},
)
try:
response = self._call_model(pending_input)
self._register_usage(response)
@@ -720,6 +924,8 @@ class ScreenJobAgent:
"text": (
"No function call was returned. Continue by using tools. "
"Use one press_key call for key combos like win+r. "
"Prefer enhance before clicking small/unclear targets "
"(region='small', mode='ui' or 'text'). "
"You may call multiple tools in one step. "
"Before task_complete, verify expected screen content with see_screen/enhance "
"and include observed_result in data. "
@@ -763,6 +969,7 @@ class ScreenJobAgent:
name,
json.dumps(result, ensure_ascii=False)[:2500],
)
self._record_tool_summary(name, result)
self._emit("tool_result", {"step": self.step, "tool": name, "result": result})
next_input.append(
{

View File

@@ -28,6 +28,18 @@ def build_parser() -> argparse.ArgumentParser:
parser.add_argument("--command-timeout", type=int, default=45, help="Timeout in seconds for execute_command.")
parser.add_argument("--type-interval", type=float, default=0.02, help="Seconds between typed characters.")
parser.add_argument("--click-pause", type=float, default=0.10, help="Mouse move duration before click.")
parser.add_argument(
"--reasoning-effort",
choices=["low", "medium", "high"],
default="medium",
help="Reasoning effort passed to the model.",
)
parser.add_argument(
"--screen-context-decay-steps",
type=int,
default=4,
help="Compact model context every N steps to decay old screenshots (0 disables).",
)
parser.add_argument("--disable-tool", action="append", default=[], help="Disable a tool by name.")
parser.add_argument("--skip-safety-check", action="store_true", help="Bypass pre-flight safety check.")
parser.add_argument("--no-failsafe", action="store_true", help="Disable PyAutoGUI fail-safe.")
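
A minimal sketch of how the two new flags parse. The flag names, choices, and defaults are taken from the hunk above; the `prog` name and the reduced parser are hypothetical, and the real parser defines many more options.

```python
import argparse


def build_sketch_parser() -> argparse.ArgumentParser:
    """Reduced parser containing only the two flags added in this change."""
    parser = argparse.ArgumentParser(prog="screenjob-sketch")
    parser.add_argument("--reasoning-effort", choices=["low", "medium", "high"], default="medium")
    parser.add_argument("--screen-context-decay-steps", type=int, default=4)
    return parser


defaults = build_sketch_parser().parse_args([])
custom = build_sketch_parser().parse_args(
    ["--reasoning-effort", "high", "--screen-context-decay-steps", "0"]
)
# main() additionally clamps the decay value with max(0, ...); 0 disables compaction.
print(defaults.reasoning_effort, defaults.screen_context_decay_steps)
print(custom.reasoning_effort, max(0, int(custom.screen_context_decay_steps)))
```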
@@ -78,6 +90,8 @@ def main(argv: list[str] | None = None) -> int:
command_timeout=args.command_timeout,
type_interval=args.type_interval,
click_pause=args.click_pause,
reasoning_effort=args.reasoning_effort,
screen_context_decay_steps=max(0, int(args.screen_context_decay_steps)),
disable_tools=set(disabled_tools),
)
try:

View File

@@ -58,4 +58,6 @@ class RuntimeOptions:
command_timeout: int = 45
type_interval: float = 0.02
click_pause: float = 0.10
reasoning_effort: str = "medium"
screen_context_decay_steps: int = 4
disable_tools: set[str] | None = None

View File

@@ -15,7 +15,8 @@ from pydantic import BaseModel, Field
from .config import AppConfig, load_app_config
from .storage import HistoryDB
from .task_manager import JobManager
-from .ui import monitoring_page_html
+from .ui import monitoring_js_path, monitoring_page_html
from .utils import utc_now_iso
class CreateJobRequest(BaseModel):
@@ -25,11 +26,188 @@ class CreateJobRequest(BaseModel):
command_timeout: int = Field(45, ge=1, le=600)
type_interval: float = Field(0.02, ge=0.0, le=1.0)
click_pause: float = Field(0.10, ge=0.0, le=2.0)
reasoning_effort: str = Field("medium", pattern="^(low|medium|high)$")
screen_context_decay_steps: int = Field(4, ge=0, le=50)
disabled_tools: list[str] = Field(default_factory=list)
safety_override: bool = False
no_failsafe: bool = False
def _safe_int(value: Any) -> int | None:
try:
return int(value)
except Exception: # noqa: BLE001
return None
def _safe_text(value: Any, limit: int = 180) -> str:
text = str(value or "").strip()
if len(text) <= limit:
return text
return f"{text[:limit]}..."
def _resolve_artifact_path(artifacts_dir: Path | None, path_raw: Any) -> Path | None:
if artifacts_dir is None:
return None
text = str(path_raw or "").strip()
if not text:
return None
candidate = Path(text).resolve()
try:
candidate.relative_to(artifacts_dir)
except ValueError:
return None
return candidate
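
The containment check in `_resolve_artifact_path` relies on `Path.relative_to` raising `ValueError` for paths outside the base directory. A self-contained sketch, using a temporary directory as a stand-in for the job's artifacts directory (`resolve_inside` is a hypothetical name):

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_inside(base_dir: Path, raw_path: str) -> Optional[Path]:
    """Return the resolved path only if it lies inside base_dir."""
    text = raw_path.strip()
    if not text:
        return None
    # resolve() collapses ".." segments before the containment check.
    candidate = Path(text).resolve()
    try:
        candidate.relative_to(base_dir)
    except ValueError:
        return None
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    # The base must be resolved too, otherwise symlinked temp dirs would never match.
    base = Path(tmp).resolve()
    inside = resolve_inside(base, str(base / "shots" / "screen_step_001.png"))
    escaped = resolve_inside(base, str(base / ".." / "secrets.txt"))
    print(inside is not None, escaped is None)
```

This is why `_build_replay_payload` resolves `artifacts_dir` before passing it in: both sides of the comparison need to be in the same normalized form.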
def _extract_replay_action(
event: dict[str, Any],
pending_tool_args: dict[tuple[int, str], list[dict[str, Any]]],
) -> dict[str, Any] | None:
event_type = str(event.get("event_type") or "")
payload = event.get("payload") if isinstance(event.get("payload"), dict) else {}
step = int(event.get("step") or 0)
ts = str(event.get("ts") or "")
event_id = int(event.get("id") or 0)
if event_type == "tool_called":
tool = str(payload.get("tool") or "").strip()
args = payload.get("args") if isinstance(payload.get("args"), dict) else {}
if tool:
pending_tool_args.setdefault((step, tool), []).append(args)
action: dict[str, Any] = {
"ts": ts,
"step": step,
"event_id": event_id,
"kind": "tool_called",
"tool": tool,
"label": f"Call: {tool}" if tool else "Tool call",
}
if tool == "click":
coord = args.get("coordinate") if isinstance(args, dict) else None
if isinstance(coord, dict):
x = _safe_int(coord.get("x"))
y = _safe_int(coord.get("y"))
if x is not None and y is not None:
action["requested_click"] = {"x": x, "y": y}
action["label"] = f"Call: click ({x}, {y})"
elif tool == "type":
text = _safe_text((args or {}).get("text"), 120)
if text:
action["text_preview"] = text
action["label"] = f"Call: type \"{text}\""
return action
if event_type == "tool_result":
tool = str(payload.get("tool") or "").strip()
result = payload.get("result") if isinstance(payload.get("result"), dict) else {}
matching_args: dict[str, Any] = {}
key = (step, tool)
queued = pending_tool_args.get(key) or []
if queued:
matching_args = queued.pop(0)
if not queued:
pending_tool_args.pop(key, None)
action = {
"ts": ts,
"step": step,
"event_id": event_id,
"kind": "tool_result",
"tool": tool,
"ok": bool(result.get("ok")),
"label": f"Result: {tool}",
}
if tool == "click":
clicked = result.get("clicked") if isinstance(result.get("clicked"), dict) else {}
x = _safe_int(clicked.get("x"))
y = _safe_int(clicked.get("y"))
if x is not None and y is not None:
action["click"] = {"x": x, "y": y}
action["label"] = f"Clicked ({x}, {y})" if bool(result.get("ok")) else f"Click failed ({x}, {y})"
elif tool == "type":
text = _safe_text((matching_args or {}).get("text"), 120)
typed_length = _safe_int(result.get("typed_length"))
if typed_length is not None:
action["typed_length"] = typed_length
if text:
action["text_preview"] = text
action["label"] = f"Typed \"{text}\""
elif tool == "press_key":
key_name = _safe_text(result.get("key"), 80)
if key_name:
action["label"] = f"Pressed {key_name}"
elif tool == "execute_command":
command = _safe_text((matching_args or {}).get("command"), 140)
if command:
action["command_preview"] = command
action["label"] = f"Command: {command}"
return action
return None
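
The pairing of `tool_called` arguments with later `tool_result` events is a first-in, first-out queue per `(step, tool)` key. The sketch below isolates that idea; it uses a flattened, hypothetical event shape (`step`, `tool`, `args`, `ok` at the top level) rather than the nested `payload` structure handled above.

```python
from typing import Any


def pair_calls_with_results(events: list[dict[str, Any]]) -> list[tuple[str, dict[str, Any], bool]]:
    """Match each tool_result with the oldest unmatched tool_called of the same step and tool."""
    pending: dict[tuple[int, str], list[dict[str, Any]]] = {}
    paired: list[tuple[str, dict[str, Any], bool]] = []
    for event in events:
        key = (event["step"], event["tool"])
        if event["event_type"] == "tool_called":
            pending.setdefault(key, []).append(event["args"])
        elif event["event_type"] == "tool_result":
            queue = pending.get(key) or []
            args = queue.pop(0) if queue else {}
            if not queue:
                # Drop empty queues so the dict does not grow over a long run.
                pending.pop(key, None)
            paired.append((event["tool"], args, event["ok"]))
    return paired


# Two type calls in the same step; results arrive in call order.
events = [
    {"event_type": "tool_called", "step": 3, "tool": "type", "args": {"text": "hello"}},
    {"event_type": "tool_called", "step": 3, "tool": "type", "args": {"text": "world"}},
    {"event_type": "tool_result", "step": 3, "tool": "type", "ok": True},
    {"event_type": "tool_result", "step": 3, "tool": "type", "ok": False},
]
print(pair_calls_with_results(events))
```

The approach depends on results being emitted in the same order as calls within a step; a result with no queued call simply gets empty arguments.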
def _build_replay_payload(job_id: str, job: dict[str, Any], events: list[dict[str, Any]]) -> dict[str, Any]:
artifacts_dir_raw = str(job.get("artifacts_dir") or "").strip()
artifacts_dir = Path(artifacts_dir_raw).resolve() if artifacts_dir_raw else None
pending_tool_args: dict[tuple[int, str], list[dict[str, Any]]] = {}
buffered_actions: list[dict[str, Any]] = []
frames: list[dict[str, Any]] = []
for event in events:
action = _extract_replay_action(event, pending_tool_args)
if action is not None:
buffered_actions.append(action)
if str(event.get("event_type") or "") != "visual_update":
continue
payload = event.get("payload") if isinstance(event.get("payload"), dict) else {}
image_meta = payload.get("image_meta") if isinstance(payload.get("image_meta"), dict) else {}
resolved = _resolve_artifact_path(artifacts_dir, image_meta.get("path"))
if resolved is None or not resolved.exists() or not resolved.is_file():
continue
width = _safe_int(image_meta.get("width"))
height = _safe_int(image_meta.get("height"))
if width is None or height is None:
size = image_meta.get("screen_size") if isinstance(image_meta.get("screen_size"), dict) else {}
width = _safe_int(size.get("width"))
height = _safe_int(size.get("height"))
is_fullscreen = (
str(payload.get("kind") or "") == "see_screen"
and bool(image_meta.get("grid"))
and isinstance(width, int)
and isinstance(height, int)
and width > 0
and height > 0
)
frames.append(
{
"frame_index": len(frames),
"event_id": int(event.get("id") or 0),
"ts": str(event.get("ts") or ""),
"step": int(event.get("step") or 0),
"kind": str(payload.get("kind") or "visual_update"),
"image_path": str(resolved),
"image_meta": image_meta,
"screen_size": {"width": width, "height": height} if width and height else None,
"is_fullscreen": is_fullscreen,
"overlays": buffered_actions,
}
)
buffered_actions = []
return {
"job_id": job_id,
"total_events": len(events),
"total_frames": len(frames),
"frames": frames,
"trailing_events": buffered_actions,
}
class _WebSocketHub:
def __init__(self) -> None:
self._connections: set[WebSocket] = set()
@@ -126,6 +304,8 @@ def create_app(config: AppConfig | None = None) -> FastAPI:
command_timeout=payload.command_timeout,
type_interval=payload.type_interval,
click_pause=payload.click_pause,
reasoning_effort=payload.reasoning_effort,
screen_context_decay_steps=payload.screen_context_decay_steps,
disabled_tools=payload.disabled_tools,
safety_override=payload.safety_override,
no_failsafe=payload.no_failsafe,
@@ -161,6 +341,18 @@ def create_app(config: AppConfig | None = None) -> FastAPI:
raise HTTPException(status_code=404, detail="Job not found")
return {"events": manager.get_events(job_id, limit=limit)}
@app.get("/api/jobs/{job_id}/replay")
def get_job_replay(
job_id: str,
limit: int = Query(default=5000, ge=1, le=5000),
_: None = Depends(require_token),
) -> dict[str, Any]:
job = manager.get_job(job_id)
if job is None:
raise HTTPException(status_code=404, detail="Job not found")
events = manager.get_events(job_id, limit=limit)
return _build_replay_payload(job_id, job, events)
@app.post("/api/jobs/{job_id}/cancel")
def cancel_job(job_id: str, _: None = Depends(require_token)) -> dict[str, Any]:
job = manager.get_job(job_id)
@@ -195,11 +387,21 @@ def create_app(config: AppConfig | None = None) -> FastAPI:
def stats(_: None = Depends(require_token)) -> dict[str, Any]:
return manager.stats()
@app.get("/api/analytics")
def analytics(_: None = Depends(require_token)) -> dict[str, Any]:
payload = manager.analytics()
payload["generated_at"] = utc_now_iso()
return payload
if not app_config.disable_ui:
@app.get("/", response_class=HTMLResponse)
def ui_root() -> str:
return monitoring_page_html(device_hostname=device_hostname)
@app.get("/ui/monitoring.js")
def ui_monitoring_js() -> FileResponse:
return FileResponse(str(monitoring_js_path()), media_type="application/javascript")
@app.websocket("/ws")
async def ws_endpoint(websocket: WebSocket, token: str = Query(default="")) -> None:
if not token or not secrets.compare_digest(token, app_config.screenjob_token):

View File

@@ -7,6 +7,39 @@ from pathlib import Path
from typing import Any
_TERMINAL_STATUSES = {"completed", "failed", "cancelled"}
_CATEGORY_RULES: tuple[tuple[str, tuple[str, ...]], ...] = (
(
"Browser / web",
("browser", "website", "webpage", "chrome", "url", "amazon", "google", "login", "shopping", "checkout", "orders"),
),
(
"Files / terminal",
("file", "folder", "directory", "terminal", "shell", "command", "cli", "script", "git", "repo", "install", "pip", "npm", "powershell", "bash"),
),
(
"Writing / docs",
("write", "summary", "summarize", "document", "docs", "report", "email", "message", "readme", "markdown", "note", "proposal"),
),
(
"Data / analysis",
("data", "analysis", "analyze", "csv", "spreadsheet", "sheet", "table", "chart", "dashboard", "metric", "metrics", "sql"),
),
(
"Development / ops",
("code", "bug", "fix", "test", "debug", "api", "backend", "frontend", "database", "deploy", "docker", "service", "build"),
),
)
def _objective_category(objective: str) -> str:
text = objective.lower()
for category, keywords in _CATEGORY_RULES:
if any(keyword in text for keyword in keywords):
return category
return "Other"
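
Two properties of `_objective_category` are easy to miss: rules are tried in order, so the first matching category wins, and keywords match as substrings rather than whole words. The sketch below uses a shortened copy of the rule table (the keywords shown are a subset of the ones above) to demonstrate both.

```python
# Shortened copy of the rule table; order matters.
RULES: tuple[tuple[str, tuple[str, ...]], ...] = (
    ("Browser / web", ("browser", "url", "login")),
    ("Files / terminal", ("file", "terminal", "git")),
    ("Development / ops", ("code", "bug", "fix", "test")),
)


def objective_category(objective: str) -> str:
    """Return the first category whose keyword appears anywhere in the objective."""
    text = objective.lower()
    for category, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            return category
    return "Other"


print(objective_category("Fix the login bug"))          # "login" matches before "fix"/"bug"
print(objective_category("Update my profile picture"))  # "file" matches inside "profile"
print(objective_category("Play some music"))            # no keyword matches
```

If the substring behavior turns out to misclassify too many objectives, matching on word boundaries would be one option, at the cost of missing plurals and compounds.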
class HistoryDB:
def __init__(self, db_path: Path) -> None:
self.db_path = db_path
@@ -184,6 +217,131 @@ class HistoryDB:
).fetchone()
return dict(totals) if totals else {}
def analytics(self) -> dict[str, Any]:
with self._connect() as conn:
rows = conn.execute(
"""
SELECT job_id, objective, status, steps, estimated_cost_usd, created_at
FROM jobs
ORDER BY created_at ASC, job_id ASC
"""
).fetchall()
total_jobs = 0
finished_jobs = 0
completed_jobs = 0
failed_jobs = 0
cancelled_jobs = 0
steps_sum = 0
steps_count = 0
cost_sum = 0.0
cost_count = 0
by_category: dict[str, dict[str, Any]] = {}
by_day: dict[str, dict[str, Any]] = {}
def _bucket(target: dict[str, dict[str, Any]], key: str) -> dict[str, Any]:
bucket = target.setdefault(
key,
{
"label": key,
"total_jobs": 0,
"finished_jobs": 0,
"completed_jobs": 0,
"failed_jobs": 0,
"cancelled_jobs": 0,
"steps_sum": 0,
"steps_count": 0,
"cost_sum": 0.0,
"cost_count": 0,
},
)
return bucket
for row in rows:
total_jobs += 1
status = str(row["status"] or "")
finished = status in _TERMINAL_STATUSES
completed = status == "completed"
objective = str(row["objective"] or "")
category = _objective_category(objective)
created_at = str(row["created_at"] or "")
day = created_at[:10] if len(created_at) >= 10 else created_at or "unknown"
category_bucket = _bucket(by_category, category)
day_bucket = _bucket(by_day, day)
for bucket in (category_bucket, day_bucket):
bucket["total_jobs"] += 1
if not finished:
continue
finished_jobs += 1
if completed:
completed_jobs += 1
elif status == "failed":
failed_jobs += 1
elif status == "cancelled":
cancelled_jobs += 1
steps = row["steps"]
if steps is not None:
step_value = int(steps)
steps_sum += step_value
steps_count += 1
for bucket in (category_bucket, day_bucket):
bucket["steps_sum"] += step_value
bucket["steps_count"] += 1
estimated_cost = row["estimated_cost_usd"]
if estimated_cost is not None:
cost_value = float(estimated_cost)
cost_sum += cost_value
cost_count += 1
for bucket in (category_bucket, day_bucket):
bucket["cost_sum"] += cost_value
bucket["cost_count"] += 1
for bucket in (category_bucket, day_bucket):
bucket["finished_jobs"] += 1
if completed:
bucket["completed_jobs"] += 1
elif status == "failed":
bucket["failed_jobs"] += 1
elif status == "cancelled":
bucket["cancelled_jobs"] += 1
def _finalize(bucket: dict[str, Any]) -> dict[str, Any]:
finished = bucket["finished_jobs"]
return {
"label": bucket["label"],
"total_jobs": bucket["total_jobs"],
"finished_jobs": finished,
"completed_jobs": bucket["completed_jobs"],
"failed_jobs": bucket["failed_jobs"],
"cancelled_jobs": bucket["cancelled_jobs"],
"success_rate": round((bucket["completed_jobs"] / finished) * 100, 2) if finished else 0.0,
"avg_steps": round(bucket["steps_sum"] / bucket["steps_count"], 2) if bucket["steps_count"] else None,
"avg_cost_usd": round(bucket["cost_sum"] / bucket["cost_count"], 6) if bucket["cost_count"] else None,
}
category_rows = [_finalize(bucket) for bucket in by_category.values()]
category_rows.sort(key=lambda item: (-item["success_rate"], item["label"]))
day_rows = [_finalize(bucket) for bucket in by_day.values()]
day_rows.sort(key=lambda item: item["label"])
return {
"total_jobs": total_jobs,
"finished_jobs": finished_jobs,
"completed_jobs": completed_jobs,
"failed_jobs": failed_jobs,
"cancelled_jobs": cancelled_jobs,
"success_rate": round((completed_jobs / finished_jobs) * 100, 2) if finished_jobs else 0.0,
"avg_steps": round(steps_sum / steps_count, 2) if steps_count else None,
"avg_cost_usd": round(cost_sum / cost_count, 6) if cost_count else None,
"by_category": category_rows,
"timeline": day_rows,
}
def _row_to_job(self, row: sqlite3.Row) -> dict[str, Any]:
disabled_tools: list[str] = []
try:

View File

@@ -48,6 +48,8 @@ class JobManager:
command_timeout: int = 45,
type_interval: float = 0.02,
click_pause: float = 0.10,
reasoning_effort: str = "medium",
screen_context_decay_steps: int = 4,
disabled_tools: list[str] | None = None,
safety_override: bool = False,
no_failsafe: bool = False,
@@ -93,6 +95,8 @@ class JobManager:
"command_timeout": command_timeout,
"type_interval": type_interval,
"click_pause": click_pause,
"reasoning_effort": reasoning_effort,
"screen_context_decay_steps": screen_context_decay_steps,
"no_failsafe": no_failsafe,
"cancel_event": cancel_event,
},
@@ -121,6 +125,8 @@ class JobManager:
command_timeout: int,
type_interval: float,
click_pause: float,
reasoning_effort: str,
screen_context_decay_steps: int,
no_failsafe: bool,
cancel_event: threading.Event,
) -> None:
@@ -218,6 +224,8 @@ class JobManager:
command_timeout=command_timeout,
type_interval=type_interval,
click_pause=click_pause,
reasoning_effort=reasoning_effort,
screen_context_decay_steps=max(0, int(screen_context_decay_steps)),
disable_tools=set(disabled_tools),
)
try:
@@ -343,6 +351,9 @@ class JobManager:
stats["live_running_threads"] = sum(1 for job in self._running.values() if job.thread.is_alive())
return stats
def analytics(self) -> dict[str, Any]:
return self.db.analytics()
def _normalize_job_payload(self, job: dict[str, Any]) -> dict[str, Any]:
response = job.get("response")
if not isinstance(response, dict):
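
Reviewer sketch (not part of the diff): the hunks above thread `screen_context_decay_steps` through the job manager and normalize it with `max(0, int(...))` before handing it to the runner. The example below isolates that normalization. The second function, `should_compact`, is purely hypothetical: it shows one plausible way a "compact every N steps" setting could be consumed, and the assumption that `0` disables compaction is not confirmed by the diff.

```python
def normalize_decay_steps(value: object) -> int:
    """Coerce to int and clamp negatives to 0, as the diff does inline."""
    # int() raises ValueError/TypeError for non-numeric input; callers may
    # want to validate earlier if the value comes from an untrusted payload.
    return max(0, int(value))


def should_compact(step: int, decay_steps: int) -> bool:
    """Hypothetical consumer: compact on every Nth step, never when N is 0."""
    return decay_steps > 0 and step > 0 and step % decay_steps == 0
```

With the default of 4, this sketch would compact at steps 4, 8, 12, and so on.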

310
src/ui.py

@@ -1,307 +1,19 @@
from __future__ import annotations
from html import escape
from pathlib import Path
_UI_DIR = Path(__file__).resolve().parent / "ui_assets"
_HTML_TEMPLATE_PATH = _UI_DIR / "monitoring.html"
_JS_PATH = _UI_DIR / "monitoring.js"
def monitoring_page_html(device_hostname: str = "") -> str:
host_suffix = f" ({escape(device_hostname)})" if device_hostname else ""
return """<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>ScreenJob Monitor</title>
<script src="https://cdn.tailwindcss.com"></script>
</head>
<body class="bg-slate-950 text-slate-100 min-h-screen">
<div class="max-w-7xl mx-auto p-4 md:p-8 space-y-6">
<header class="flex flex-col gap-3 md:flex-row md:items-center md:justify-between">
<div>
<h1 class="text-2xl md:text-3xl font-bold tracking-tight">ScreenJob Monitor<span class="text-slate-400 text-base md:text-lg font-medium">__MONITOR_HOST__</span></h1>
<p class="text-slate-400 text-sm">Read-only monitoring for active and historical tasks.</p>
</div>
<div class="flex flex-col md:flex-row gap-2 md:items-center">
<input id="tokenInput" type="password" placeholder="SCREENJOB_TOKEN" class="bg-slate-900 border border-slate-700 rounded px-3 py-2 text-sm w-72" />
<button id="saveTokenBtn" class="bg-cyan-500 hover:bg-cyan-400 text-slate-950 font-semibold px-4 py-2 rounded">Connect</button>
</div>
</header>
html = _HTML_TEMPLATE_PATH.read_text(encoding="utf-8")
return html.replace("__MONITOR_HOST__", host_suffix)
<section class="grid grid-cols-2 md:grid-cols-6 gap-3" id="stats"></section>
<section class="grid grid-cols-1 lg:grid-cols-5 gap-4">
<div class="lg:col-span-2 bg-slate-900/70 border border-slate-800 rounded-xl p-4">
<div class="flex items-center justify-between mb-3">
<h2 class="font-semibold">Jobs</h2>
<button id="refreshBtn" class="text-xs bg-slate-800 px-2 py-1 rounded">Refresh</button>
</div>
<div id="jobList" class="space-y-2 max-h-[62vh] overflow-auto"></div>
</div>
<div class="lg:col-span-3 bg-slate-900/70 border border-slate-800 rounded-xl p-4 space-y-3">
<h2 class="font-semibold">Job Detail</h2>
<pre id="jobDetail" class="bg-slate-950 border border-slate-800 rounded p-3 text-xs overflow-auto max-h-[24vh]"></pre>
<h3 class="font-semibold text-sm">Latest Visual</h3>
<div class="bg-slate-950 border border-slate-800 rounded p-2">
<img id="latestVisual" alt="Latest visual update" class="max-h-[24vh] w-full object-contain rounded" />
</div>
<div class="flex items-center justify-between">
<h3 class="font-semibold text-sm">Live Events</h3>
<label for="eventsViewToggle" class="flex items-center gap-2 text-xs text-slate-300 cursor-pointer select-none">
<span>Raw</span>
<input id="eventsViewToggle" type="checkbox" class="accent-cyan-400 h-4 w-4" />
<span>Beautiful</span>
</label>
</div>
<div id="events" class="bg-slate-950 border border-slate-800 rounded p-3 text-xs overflow-auto max-h-[36vh] space-y-1"></div>
</div>
</section>
</div>
<script>
const tokenInput = document.getElementById("tokenInput");
const saveTokenBtn = document.getElementById("saveTokenBtn");
const refreshBtn = document.getElementById("refreshBtn");
const jobListEl = document.getElementById("jobList");
const jobDetailEl = document.getElementById("jobDetail");
const eventsEl = document.getElementById("events");
const statsEl = document.getElementById("stats");
const latestVisualEl = document.getElementById("latestVisual");
const eventsViewToggle = document.getElementById("eventsViewToggle");
const state = {
token: localStorage.getItem("screenjob_token") || "",
jobs: [],
selectedJobId: null,
ws: null,
wsReconnectTimer: null,
eventsViewMode: localStorage.getItem("screenjob_events_view_mode") === "beautiful" ? "beautiful" : "raw"
};
const manuallyClosedSockets = new WeakSet();
tokenInput.value = state.token;
function authHeaders() {
return { "Authorization": "Bearer " + state.token };
}
async function api(path, opts = {}) {
if (!state.token) throw new Error("Token required");
const headers = Object.assign({}, authHeaders(), opts.headers || {});
const response = await fetch(path, Object.assign({}, opts, { headers }));
if (!response.ok) throw new Error(await response.text());
return response.json();
}
function renderStats(stats) {
const cards = [
["Total Jobs", stats.total_jobs || 0],
["Running", stats.running_jobs || 0],
["Completed", stats.completed_jobs || 0],
["Failed", stats.failed_jobs || 0],
["Cancelled", stats.cancelled_jobs || 0],
["Total Cost (USD)", Number(stats.total_estimated_cost || 0).toFixed(4)]
];
statsEl.innerHTML = cards.map(([name, val]) => `
<div class="bg-slate-900/70 border border-slate-800 rounded-xl p-3">
<div class="text-slate-400 text-xs">${name}</div>
<div class="text-lg font-semibold">${val}</div>
</div>
`).join("");
}
function renderJobs() {
jobListEl.innerHTML = state.jobs.map((job) => {
const active = job.job_id === state.selectedJobId;
return `
<button data-job-id="${job.job_id}" class="w-full text-left p-3 rounded border ${active ? "border-cyan-400 bg-slate-800" : "border-slate-800 bg-slate-950"} hover:bg-slate-800">
<div class="flex items-center justify-between">
<span class="font-medium">${job.job_id}</span>
<span class="text-xs px-2 py-0.5 rounded bg-slate-700">${job.status}</span>
</div>
<div class="text-xs text-slate-400 mt-1">${job.model}</div>
<div class="text-xs text-slate-300 mt-1 line-clamp-2">${job.objective}</div>
<div class="text-xs text-slate-500 mt-1">$${Number((job.usage && job.usage.estimated_cost_usd) || 0).toFixed(6)}</div>
</button>
`;
}).join("");
for (const btn of jobListEl.querySelectorAll("button[data-job-id]")) {
btn.addEventListener("click", () => {
state.selectedJobId = btn.getAttribute("data-job-id");
renderJobs();
refreshJobDetail();
});
}
}
function pushEventLine(obj) {
if (!obj || !obj.job_id || !obj.event_type) return;
const line = document.createElement("div");
const ts = obj.ts || "-";
const step = (obj.step ?? "-");
if (state.eventsViewMode === "raw") {
line.className = "border-b border-slate-800 pb-1";
line.textContent = `[${ts}] ${obj.job_id} step=${step} ${obj.event_type} ${JSON.stringify(obj.payload || {})}`;
} else {
const typeColors = {
info: "bg-sky-900/50 text-sky-200 border border-sky-800",
warning: "bg-amber-900/40 text-amber-200 border border-amber-800",
error: "bg-rose-900/40 text-rose-200 border border-rose-800",
visual_update: "bg-emerald-900/40 text-emerald-200 border border-emerald-800",
tool_call: "bg-violet-900/40 text-violet-200 border border-violet-800",
tool_result: "bg-indigo-900/40 text-indigo-200 border border-indigo-800"
};
const dt = new Date(ts);
const tsText = Number.isNaN(dt.getTime()) ? ts : dt.toLocaleString();
const payload = obj.payload || {};
line.className = "rounded-lg border border-slate-800 bg-slate-900/80 p-2 space-y-2";
const header = document.createElement("div");
header.className = "flex flex-wrap items-center gap-2";
const typePill = document.createElement("span");
typePill.className = `px-2 py-0.5 rounded text-[10px] font-semibold ${typeColors[obj.event_type] || "bg-slate-800 text-slate-200 border border-slate-700"}`;
typePill.textContent = obj.event_type;
const stepPill = document.createElement("span");
stepPill.className = "px-2 py-0.5 rounded text-[10px] bg-slate-800 text-slate-300 border border-slate-700";
stepPill.textContent = `step ${step}`;
const tsSpan = document.createElement("span");
tsSpan.className = "text-[10px] text-slate-400";
tsSpan.textContent = tsText;
header.appendChild(typePill);
header.appendChild(stepPill);
header.appendChild(tsSpan);
const jobLine = document.createElement("div");
jobLine.className = "text-[11px] text-slate-300 font-medium";
jobLine.textContent = obj.job_id;
const body = document.createElement("pre");
body.className = "bg-slate-950 border border-slate-800 rounded p-2 text-[11px] text-slate-200 overflow-auto";
body.textContent = JSON.stringify(payload, null, 2);
line.appendChild(header);
line.appendChild(jobLine);
line.appendChild(body);
}
eventsEl.prepend(line);
while (eventsEl.childNodes.length > 400) {
eventsEl.removeChild(eventsEl.lastChild);
}
}
function scheduleWsReconnect() {
if (state.wsReconnectTimer || !state.token) return;
state.wsReconnectTimer = setTimeout(() => {
state.wsReconnectTimer = null;
connectWs();
}, 1200);
}
function updateLatestVisualFromEvent(ev) {
if (!ev || ev.event_type !== "visual_update") return;
if (!state.selectedJobId || ev.job_id !== state.selectedJobId) return;
const imagePath = ev.payload && ev.payload.image_meta && ev.payload.image_meta.path;
if (!imagePath) return;
const q = encodeURIComponent(imagePath);
latestVisualEl.src = `/api/jobs/${state.selectedJobId}/artifact?path=${q}&token=${encodeURIComponent(state.token)}`;
}
async function refreshJobs() {
const payload = await api("/api/jobs?limit=100");
state.jobs = payload.jobs || [];
if (!state.selectedJobId && state.jobs.length > 0) state.selectedJobId = state.jobs[0].job_id;
renderJobs();
}
async function refreshStats() {
const payload = await api("/api/stats");
renderStats(payload);
}
async function refreshJobDetail() {
if (!state.selectedJobId) return;
const [job, events] = await Promise.all([
api(`/api/jobs/${state.selectedJobId}`),
api(`/api/jobs/${state.selectedJobId}/events?limit=120`)
]);
jobDetailEl.textContent = JSON.stringify(job, null, 2);
eventsEl.innerHTML = "";
const list = (events.events || []).slice().reverse();
for (const ev of list) pushEventLine(ev);
const visual = list.find((ev) => ev.event_type === "visual_update");
if (visual) updateLatestVisualFromEvent(visual);
}
function connectWs() {
if (!state.token) return;
if (state.ws && (state.ws.readyState === WebSocket.OPEN || state.ws.readyState === WebSocket.CONNECTING)) {
return;
}
const scheme = location.protocol === "https:" ? "wss" : "ws";
const ws = new WebSocket(`${scheme}://${location.host}/ws?token=${encodeURIComponent(state.token)}`);
state.ws = ws;
ws.onmessage = async (event) => {
try {
const payload = JSON.parse(event.data);
if (!payload || payload.event_type === "connected") return;
pushEventLine(payload);
updateLatestVisualFromEvent(payload);
if (!state.selectedJobId || payload.job_id === state.selectedJobId) {
await refreshJobDetail();
}
await refreshJobs();
await refreshStats();
} catch (err) {
console.error(err);
}
};
ws.onclose = () => {
if (state.ws === ws) state.ws = null;
if (manuallyClosedSockets.has(ws)) {
manuallyClosedSockets.delete(ws);
return;
}
scheduleWsReconnect();
};
}
async function fullRefresh() {
await refreshJobs();
await refreshStats();
await refreshJobDetail();
}
async function connect() {
state.token = tokenInput.value.trim();
localStorage.setItem("screenjob_token", state.token);
if (state.ws) {
manuallyClosedSockets.add(state.ws);
try { state.ws.close(); } catch (_) {}
state.ws = null;
}
if (state.wsReconnectTimer) {
clearTimeout(state.wsReconnectTimer);
state.wsReconnectTimer = null;
}
await fullRefresh();
connectWs();
}
function syncEventsViewToggle() {
eventsViewToggle.checked = state.eventsViewMode === "beautiful";
}
saveTokenBtn.addEventListener("click", () => connect().catch((err) => alert(err.message)));
refreshBtn.addEventListener("click", () => fullRefresh().catch((err) => alert(err.message)));
eventsViewToggle.addEventListener("change", () => {
state.eventsViewMode = eventsViewToggle.checked ? "beautiful" : "raw";
localStorage.setItem("screenjob_events_view_mode", state.eventsViewMode);
refreshJobDetail().catch((err) => alert(err.message));
});
syncEventsViewToggle();
if (state.token) connect().catch(() => {});
</script>
</body>
</html>
""".replace("__MONITOR_HOST__", host_suffix)
def monitoring_js_path() -> Path:
return _JS_PATH
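
Reviewer sketch (not part of the diff): after this change `monitoring_page_html` reads the template from disk and substitutes the `__MONITOR_HOST__` placeholder with an escaped hostname suffix. The example below shows the same substitution on an in-memory string so it can be exercised without the asset files. The function name `render_host_suffix` and the sample template are assumptions made for illustration.

```python
from html import escape


def render_host_suffix(template: str, device_hostname: str = "") -> str:
    """Replace the host placeholder with an HTML-escaped ' (hostname)' suffix."""
    # escape() neutralizes <, >, & and quotes so a hostname cannot inject markup.
    host_suffix = f" ({escape(device_hostname)})" if device_hostname else ""
    return template.replace("__MONITOR_HOST__", host_suffix)
```

An empty hostname yields an empty suffix, so the heading renders without stray parentheses.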


@@ -0,0 +1,106 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>ScreenJob Monitor</title>
<script src="https://cdn.tailwindcss.com"></script>
</head>
<body class="bg-slate-950 text-slate-100 min-h-screen">
<div class="max-w-7xl mx-auto p-4 md:p-8 space-y-6">
<header class="flex flex-col gap-3 md:flex-row md:items-center md:justify-between">
<div>
<h1 class="text-2xl md:text-3xl font-bold tracking-tight">ScreenJob Monitor<span class="text-slate-400 text-base md:text-lg font-medium">__MONITOR_HOST__</span></h1>
<p class="text-slate-400 text-sm">Read-only monitoring for active and historical tasks.</p>
</div>
<div class="flex flex-col md:flex-row gap-2 md:items-center">
<input id="tokenInput" type="password" placeholder="SCREENJOB_TOKEN" class="bg-slate-900 border border-slate-700 rounded px-3 py-2 text-sm w-72" />
<button id="saveTokenBtn" class="bg-cyan-500 hover:bg-cyan-400 text-slate-950 font-semibold px-4 py-2 rounded">Connect</button>
</div>
</header>
<section class="grid grid-cols-2 md:grid-cols-6 gap-3" id="stats"></section>
<section class="space-y-3">
<div class="flex items-center justify-between gap-3">
<h2 class="font-semibold">Analytics</h2>
<div id="analyticsMeta" class="text-[11px] text-slate-400"></div>
</div>
<div id="analyticsSummary" class="grid grid-cols-2 md:grid-cols-4 gap-3"></div>
<div class="grid grid-cols-1 xl:grid-cols-2 gap-4">
<div class="bg-slate-900/70 border border-slate-800 rounded-xl p-4 space-y-3">
<div class="flex items-center justify-between gap-3">
<h3 class="font-semibold text-sm">Success by Objective Category</h3>
<div id="analyticsCategorySummary" class="text-[11px] text-slate-400"></div>
</div>
<div id="analyticsCategories" class="space-y-3"></div>
</div>
<div class="bg-slate-900/70 border border-slate-800 rounded-xl p-4 space-y-3">
<div class="flex items-center justify-between gap-3">
<h3 class="font-semibold text-sm">Avg Steps / Cost Over Time</h3>
<div id="analyticsTrendSummary" class="text-[11px] text-slate-400"></div>
</div>
<div id="analyticsTrends" class="space-y-4"></div>
</div>
</div>
</section>
<section class="grid grid-cols-1 lg:grid-cols-5 gap-4">
<div class="lg:col-span-2 bg-slate-900/70 border border-slate-800 rounded-xl p-4">
<div class="flex items-center justify-between mb-3">
<h2 class="font-semibold">Jobs</h2>
<button id="refreshBtn" class="text-xs bg-slate-800 px-2 py-1 rounded">Refresh</button>
</div>
<div id="jobList" class="space-y-2 max-h-[62vh] overflow-auto"></div>
</div>
<div class="lg:col-span-3 bg-slate-900/70 border border-slate-800 rounded-xl p-4 space-y-3">
<h2 class="font-semibold">Job Detail</h2>
<pre id="jobDetail" class="bg-slate-950 border border-slate-800 rounded p-3 text-xs overflow-auto max-h-[24vh]"></pre>
<h3 class="font-semibold text-sm">Latest Visual</h3>
<div class="bg-slate-950 border border-slate-800 rounded p-2">
<img id="latestVisual" alt="Latest visual update" class="max-h-[24vh] w-full object-contain rounded" />
</div>
<div class="flex items-center justify-between">
<h3 class="font-semibold text-sm">Replay</h3>
<div id="replayStatus" class="text-[11px] text-slate-400">No replay loaded.</div>
</div>
<div class="flex flex-wrap items-center gap-2">
<button id="replayPlayBtn" class="text-xs bg-slate-800 px-2 py-1 rounded">Play</button>
<button id="replayPrevBtn" class="text-xs bg-slate-800 px-2 py-1 rounded">Prev</button>
<button id="replayNextBtn" class="text-xs bg-slate-800 px-2 py-1 rounded">Next</button>
<label class="text-xs text-slate-300 flex items-center gap-1">
Speed
<select id="replaySpeed" class="bg-slate-900 border border-slate-700 rounded px-1 py-0.5">
<option value="0.5">0.5x</option>
<option value="1" selected>1.0x</option>
<option value="1.5">1.5x</option>
<option value="2">2.0x</option>
</select>
</label>
</div>
<input id="replaySeek" type="range" min="0" max="0" value="0" class="w-full accent-cyan-400" />
<div class="bg-slate-950 border border-slate-800 rounded p-2">
<div class="relative w-full min-h-[180px] bg-black/40 rounded">
<img id="replayVisual" alt="Replay frame" class="max-h-[30vh] w-full object-contain rounded" />
<svg id="replayOverlay" class="absolute inset-0 w-full h-full pointer-events-none" preserveAspectRatio="xMidYMid meet"></svg>
</div>
<div id="replayFrameMeta" class="text-[11px] text-slate-400 mt-2"></div>
<div id="replayFrameEvents" class="mt-2 space-y-1"></div>
</div>
<div class="flex items-center justify-between">
<h3 class="font-semibold text-sm">Live Events</h3>
<label for="eventsViewToggle" class="flex items-center gap-2 text-xs text-slate-300 cursor-pointer select-none">
<span>Raw</span>
<input id="eventsViewToggle" type="checkbox" class="accent-cyan-400 h-4 w-4" />
<span>Beautiful</span>
</label>
</div>
<div id="events" class="bg-slate-950 border border-slate-800 rounded p-3 text-xs overflow-auto max-h-[36vh] space-y-1"></div>
</div>
</section>
</div>
<script src="/ui/monitoring.js"></script>
</body>
</html>

625
src/ui_assets/monitoring.js Normal file

@@ -0,0 +1,625 @@
const tokenInput = document.getElementById("tokenInput");
const saveTokenBtn = document.getElementById("saveTokenBtn");
const refreshBtn = document.getElementById("refreshBtn");
const jobListEl = document.getElementById("jobList");
const jobDetailEl = document.getElementById("jobDetail");
const eventsEl = document.getElementById("events");
const statsEl = document.getElementById("stats");
const latestVisualEl = document.getElementById("latestVisual");
const eventsViewToggle = document.getElementById("eventsViewToggle");
const replayVisualEl = document.getElementById("replayVisual");
const replayOverlayEl = document.getElementById("replayOverlay");
const replayFrameMetaEl = document.getElementById("replayFrameMeta");
const replayFrameEventsEl = document.getElementById("replayFrameEvents");
const replayStatusEl = document.getElementById("replayStatus");
const replayPlayBtn = document.getElementById("replayPlayBtn");
const replayPrevBtn = document.getElementById("replayPrevBtn");
const replayNextBtn = document.getElementById("replayNextBtn");
const replaySpeedEl = document.getElementById("replaySpeed");
const replaySeekEl = document.getElementById("replaySeek");
const analyticsMetaEl = document.getElementById("analyticsMeta");
const analyticsSummaryEl = document.getElementById("analyticsSummary");
const analyticsCategorySummaryEl = document.getElementById("analyticsCategorySummary");
const analyticsCategoriesEl = document.getElementById("analyticsCategories");
const analyticsTrendSummaryEl = document.getElementById("analyticsTrendSummary");
const analyticsTrendsEl = document.getElementById("analyticsTrends");
const state = {
token: localStorage.getItem("screenjob_token") || "",
jobs: [],
selectedJobId: null,
ws: null,
wsReconnectTimer: null,
eventsViewMode: localStorage.getItem("screenjob_events_view_mode") === "beautiful" ? "beautiful" : "raw",
replay: {
frames: [],
trailingEvents: [],
frameIndex: 0,
isPlaying: false,
speed: 1,
timer: null
}
};
const manuallyClosedSockets = new WeakSet();
const analyticsRefreshEvents = new Set(["job_finished", "job_failed", "job_rejected"]);
tokenInput.value = state.token;
function authHeaders() {
return { "Authorization": "Bearer " + state.token };
}
async function api(path, opts = {}) {
if (!state.token) throw new Error("Token required");
const headers = Object.assign({}, authHeaders(), opts.headers || {});
const response = await fetch(path, Object.assign({}, opts, { headers }));
if (!response.ok) throw new Error(await response.text());
return response.json();
}
function renderStats(stats) {
const cards = [
["Total Jobs", stats.total_jobs || 0],
["Running", stats.running_jobs || 0],
["Completed", stats.completed_jobs || 0],
["Failed", stats.failed_jobs || 0],
["Cancelled", stats.cancelled_jobs || 0],
["Total Cost (USD)", Number(stats.total_estimated_cost || 0).toFixed(4)]
];
statsEl.innerHTML = cards.map(([name, val]) => `
<div class="bg-slate-900/70 border border-slate-800 rounded-xl p-3">
<div class="text-slate-400 text-xs">${name}</div>
<div class="text-lg font-semibold">${val}</div>
</div>
`).join("");
}
function escapeHtml(value) {
return String(value ?? "").replace(/[&<>"']/g, (ch) => ({
"&": "&amp;",
"<": "&lt;",
">": "&gt;",
'"': "&quot;",
"'": "&#39;"
})[ch]);
}
function formatNumber(value, digits = 2) {
const num = Number(value);
return Number.isFinite(num) ? num.toFixed(digits) : "—";
}
function formatCurrency(value, digits = 6) {
const num = Number(value);
return Number.isFinite(num) ? `$${num.toFixed(digits)}` : "—";
}
function formatPercent(value) {
const num = Number(value);
return Number.isFinite(num) ? `${num.toFixed(1)}%` : "—";
}
function formatDateLabel(value) {
const dt = new Date(value);
if (Number.isNaN(dt.getTime())) return String(value || "—");
return dt.toLocaleDateString(undefined, { month: "short", day: "numeric" });
}
function renderMetricCard(label, value) {
return `
<div class="bg-slate-950 border border-slate-800 rounded-xl p-3">
<div class="text-[11px] uppercase tracking-wide text-slate-400">${escapeHtml(label)}</div>
<div class="text-xl font-semibold mt-1">${escapeHtml(value)}</div>
</div>
`;
}
function renderLineChart(title, points, options = {}) {
const color = options.color || "#22d3ee";
const valueLabel = options.valueLabel || "";
const sourcePoints = Array.isArray(points)
? points.filter((point) => Number.isFinite(Number(point.value)))
: [];
if (!sourcePoints.length) {
return `
<div class="rounded-lg border border-slate-800 bg-slate-950/70 p-3">
<div class="flex items-center justify-between gap-3">
<div>
<div class="text-xs text-slate-400">${escapeHtml(title)}</div>
<div class="text-sm text-slate-200 font-semibold">No data yet</div>
</div>
</div>
</div>
`;
}
const width = 640;
const height = 220;
const margin = { top: 20, right: 18, bottom: 34, left: 44 };
const values = sourcePoints.map((point) => Number(point.value));
const minValue = Math.min(...values);
const maxValue = Math.max(...values);
const span = maxValue - minValue || 1;
const chartWidth = width - margin.left - margin.right;
const chartHeight = height - margin.top - margin.bottom;
const xStep = sourcePoints.length > 1 ? chartWidth / (sourcePoints.length - 1) : 0;
const coords = sourcePoints.map((point, index) => ({
x: margin.left + (index * xStep),
y: margin.top + ((maxValue - Number(point.value)) / span) * chartHeight,
}));
const linePath = coords.map((point, index) => `${index === 0 ? "M" : "L"} ${point.x} ${point.y}`).join(" ");
const baseline = height - margin.bottom;
const midIndex = Math.floor(sourcePoints.length / 2);
const xLabels = [
{ index: 0, label: sourcePoints[0].label },
{ index: midIndex, label: sourcePoints[midIndex].label },
{ index: sourcePoints.length - 1, label: sourcePoints[sourcePoints.length - 1].label },
].filter((item, index, array) => item.label && array.findIndex((candidate) => candidate.index === item.index) === index);
const minLabel = options.formatValue ? options.formatValue(minValue) : formatNumber(minValue, 2);
const maxLabel = options.formatValue ? options.formatValue(maxValue) : formatNumber(maxValue, 2);
const latest = sourcePoints[sourcePoints.length - 1];
const latestValue = options.formatValue ? options.formatValue(latest.value) : formatNumber(latest.value, 2);
return `
<div class="rounded-lg border border-slate-800 bg-slate-950/70 p-3 space-y-2">
<div class="flex items-center justify-between gap-3">
<div>
<div class="text-xs text-slate-400">${escapeHtml(title)}</div>
<div class="text-sm text-slate-200 font-semibold">${escapeHtml(latestValue)}${valueLabel ? ` <span class="text-slate-500 font-normal">${escapeHtml(valueLabel)}</span>` : ""}</div>
</div>
<div class="text-[11px] text-slate-400 text-right">
<div>${escapeHtml(sourcePoints.length)} points</div>
<div>${escapeHtml(minLabel)} - ${escapeHtml(maxLabel)}</div>
</div>
</div>
<svg viewBox="0 0 ${width} ${height}" class="w-full h-56">
${Array.from({ length: 4 }, (_, idx) => {
const y = margin.top + (chartHeight / 3) * idx;
return `<line x1="${margin.left}" y1="${y}" x2="${width - margin.right}" y2="${y}" stroke="rgba(51, 65, 85, 0.7)" stroke-width="1" />`;
}).join("")}
<line x1="${margin.left}" y1="${baseline}" x2="${width - margin.right}" y2="${baseline}" stroke="rgba(71, 85, 105, 0.8)" stroke-width="1.5" />
<path d="${linePath}" fill="none" stroke="${color}" stroke-width="3" stroke-linecap="round" stroke-linejoin="round" />
${coords.map((point) => `
<circle cx="${point.x}" cy="${point.y}" r="4.5" fill="${color}" />
`).join("")}
<text x="${margin.left - 8}" y="${margin.top + 4}" text-anchor="end" class="fill-slate-400 text-[10px]">${escapeHtml(maxLabel)}</text>
<text x="${margin.left - 8}" y="${baseline}" text-anchor="end" class="fill-slate-400 text-[10px]">${escapeHtml(minLabel)}</text>
${xLabels.map((item) => `
<text x="${coords[item.index].x}" y="${height - 10}" text-anchor="middle" class="fill-slate-500 text-[10px]">${escapeHtml(formatDateLabel(item.label))}</text>
`).join("")}
</svg>
</div>
`;
}
function renderAnalytics(payload) {
const analytics = payload || {};
const categories = Array.isArray(analytics.by_category) ? analytics.by_category : [];
const timeline = Array.isArray(analytics.timeline) ? analytics.timeline : [];
const finishedCategories = categories.filter((row) => Number(row.finished_jobs || 0) > 0);
if (analyticsMetaEl) {
analyticsMetaEl.textContent = analytics.generated_at
? `Updated ${new Date(analytics.generated_at).toLocaleString()}`
: "Historical snapshot";
}
analyticsSummaryEl.innerHTML = [
renderMetricCard("Finished Jobs", analytics.finished_jobs || 0),
renderMetricCard("Success Rate", formatPercent(analytics.success_rate)),
renderMetricCard("Avg Steps", formatNumber(analytics.avg_steps, 1)),
renderMetricCard("Avg Cost", formatCurrency(analytics.avg_cost_usd)),
].join("");
analyticsCategorySummaryEl.textContent = finishedCategories.length
? `${finishedCategories.length} categories`
: "No finished jobs yet";
if (finishedCategories.length) {
analyticsCategoriesEl.innerHTML = finishedCategories.map((row) => {
const successRate = Number(row.success_rate || 0);
const completed = Number(row.completed_jobs || 0);
const finished = Number(row.finished_jobs || 0);
const total = Number(row.total_jobs || 0);
const avgSteps = row.avg_steps == null ? "—" : formatNumber(row.avg_steps, 1);
const avgCost = row.avg_cost_usd == null ? "—" : formatCurrency(row.avg_cost_usd);
return `
<div class="rounded-lg border border-slate-800 bg-slate-950/70 p-3 space-y-2">
<div class="flex items-start justify-between gap-3">
<div>
<div class="font-medium">${escapeHtml(row.label || "Other")}</div>
<div class="text-[11px] text-slate-400">${finished} finished · ${completed} completed · ${total} total</div>
</div>
<div class="text-right">
<div class="text-base font-semibold">${formatPercent(successRate)}</div>
<div class="text-[11px] text-slate-500">success rate</div>
</div>
</div>
<div class="h-2 rounded bg-slate-800 overflow-hidden">
<div class="h-full rounded bg-cyan-400" style="width: ${Math.max(0, Math.min(successRate, 100))}%"></div>
</div>
<div class="grid grid-cols-2 gap-2 text-[11px] text-slate-300">
<div>Avg steps: ${escapeHtml(avgSteps)}</div>
<div>Avg cost: ${escapeHtml(avgCost)}</div>
</div>
</div>
`;
}).join("");
} else {
analyticsCategoriesEl.innerHTML = `
<div class="rounded-lg border border-dashed border-slate-800 bg-slate-950/70 p-4 text-sm text-slate-400">
No finished jobs yet.
</div>
`;
}
analyticsTrendSummaryEl.textContent = timeline.length ? `${timeline.length} days` : "No daily data yet";
analyticsTrendsEl.innerHTML = [
renderLineChart("Average steps per day", timeline.map((row) => ({ label: row.label, value: row.avg_steps })), { color: "#38bdf8" }),
renderLineChart("Average cost per day", timeline.map((row) => ({ label: row.label, value: row.avg_cost_usd })), {
color: "#34d399",
valueLabel: "USD",
formatValue: (value) => formatCurrency(value),
}),
].join("");
}
function renderJobs() {
jobListEl.innerHTML = state.jobs.map((job) => {
const active = job.job_id === state.selectedJobId;
return `
<button data-job-id="${escapeHtml(job.job_id)}" class="w-full text-left p-3 rounded border ${active ? "border-cyan-400 bg-slate-800" : "border-slate-800 bg-slate-950"} hover:bg-slate-800">
<div class="flex items-center justify-between">
<span class="font-medium">${escapeHtml(job.job_id)}</span>
<span class="text-xs px-2 py-0.5 rounded bg-slate-700">${escapeHtml(job.status)}</span>
</div>
<div class="text-xs text-slate-400 mt-1">${escapeHtml(job.model)}</div>
<div class="text-xs text-slate-300 mt-1 line-clamp-2">${escapeHtml(job.objective)}</div>
<div class="text-xs text-slate-500 mt-1">$${Number((job.usage && job.usage.estimated_cost_usd) || 0).toFixed(6)}</div>
</button>
`;
}).join("");
for (const btn of jobListEl.querySelectorAll("button[data-job-id]")) {
btn.addEventListener("click", () => {
state.selectedJobId = btn.getAttribute("data-job-id");
renderJobs();
refreshJobDetail().catch((err) => console.error(err));
});
}
}
function pushEventLine(obj) {
if (!obj || !obj.job_id || !obj.event_type) return;
const line = document.createElement("div");
const ts = obj.ts || "-";
const step = (obj.step ?? "-");
if (state.eventsViewMode === "raw") {
line.className = "border-b border-slate-800 pb-1";
line.textContent = `[${ts}] ${obj.job_id} step=${step} ${obj.event_type} ${JSON.stringify(obj.payload || {})}`;
} else {
const typeColors = {
info: "bg-sky-900/50 text-sky-200 border border-sky-800",
warning: "bg-amber-900/40 text-amber-200 border border-amber-800",
error: "bg-rose-900/40 text-rose-200 border border-rose-800",
visual_update: "bg-emerald-900/40 text-emerald-200 border border-emerald-800",
tool_call: "bg-violet-900/40 text-violet-200 border border-violet-800",
tool_result: "bg-indigo-900/40 text-indigo-200 border border-indigo-800"
};
const dt = new Date(ts);
const tsText = Number.isNaN(dt.getTime()) ? ts : dt.toLocaleString();
const payload = obj.payload || {};
line.className = "rounded-lg border border-slate-800 bg-slate-900/80 p-2 space-y-2";
const header = document.createElement("div");
header.className = "flex flex-wrap items-center gap-2";
const typePill = document.createElement("span");
typePill.className = `px-2 py-0.5 rounded text-[10px] font-semibold ${typeColors[obj.event_type] || "bg-slate-800 text-slate-200 border border-slate-700"}`;
typePill.textContent = obj.event_type;
const stepPill = document.createElement("span");
stepPill.className = "px-2 py-0.5 rounded text-[10px] bg-slate-800 text-slate-300 border border-slate-700";
stepPill.textContent = `step ${step}`;
const tsSpan = document.createElement("span");
tsSpan.className = "text-[10px] text-slate-400";
tsSpan.textContent = tsText;
header.appendChild(typePill);
header.appendChild(stepPill);
header.appendChild(tsSpan);
const jobLine = document.createElement("div");
jobLine.className = "text-[11px] text-slate-300 font-medium";
jobLine.textContent = obj.job_id;
const body = document.createElement("pre");
body.className = "bg-slate-950 border border-slate-800 rounded p-2 text-[11px] text-slate-200 overflow-auto";
body.textContent = JSON.stringify(payload, null, 2);
line.appendChild(header);
line.appendChild(jobLine);
line.appendChild(body);
}
eventsEl.prepend(line);
while (eventsEl.childNodes.length > 400) {
eventsEl.removeChild(eventsEl.lastChild);
}
}
function clearReplayTimer() {
if (state.replay.timer) {
clearTimeout(state.replay.timer);
state.replay.timer = null;
}
}
function stopReplay() {
state.replay.isPlaying = false;
clearReplayTimer();
replayPlayBtn.textContent = "Play";
}
function replayImageSrc(path) {
const q = encodeURIComponent(path || "");
return `/api/jobs/${state.selectedJobId}/artifact?path=${q}&token=${encodeURIComponent(state.token)}`;
}
function renderReplayOverlay(frame) {
replayOverlayEl.innerHTML = "";
const size = frame && frame.screen_size;
if (!frame || !frame.is_fullscreen || !size || !size.width || !size.height) {
replayOverlayEl.removeAttribute("viewBox");
return;
}
replayOverlayEl.setAttribute("viewBox", `0 0 ${size.width} ${size.height}`);
const overlayEvents = Array.isArray(frame.overlays) ? frame.overlays : [];
const points = overlayEvents.filter((ev) => ev && ev.kind === "tool_result" && ev.tool === "click" && ev.click);
for (const ev of points) {
const x = Number(ev.click.x);
const y = Number(ev.click.y);
if (!Number.isFinite(x) || !Number.isFinite(y)) continue;
const halo = document.createElementNS("http://www.w3.org/2000/svg", "circle");
halo.setAttribute("cx", String(x));
halo.setAttribute("cy", String(y));
halo.setAttribute("r", "14");
halo.setAttribute("fill", "rgba(14, 165, 233, 0.22)");
halo.setAttribute("stroke", "#38bdf8");
halo.setAttribute("stroke-width", "2");
const dot = document.createElementNS("http://www.w3.org/2000/svg", "circle");
dot.setAttribute("cx", String(x));
dot.setAttribute("cy", String(y));
dot.setAttribute("r", "4");
dot.setAttribute("fill", "#38bdf8");
replayOverlayEl.appendChild(halo);
replayOverlayEl.appendChild(dot);
}
}
function renderReplayFrameEvents(frame) {
replayFrameEventsEl.innerHTML = "";
if (!frame) return;
const events = Array.isArray(frame.overlays) ? frame.overlays : [];
const shown = events.slice(-8);
for (const ev of shown) {
const row = document.createElement("div");
row.className = "text-[11px] rounded border border-slate-800 bg-slate-900/80 px-2 py-1";
row.textContent = ev.label || `${ev.kind || "event"} ${ev.tool || ""}`.trim();
replayFrameEventsEl.appendChild(row);
}
if (!shown.length) {
const empty = document.createElement("div");
empty.className = "text-[11px] text-slate-500";
empty.textContent = "No overlay events for this frame.";
replayFrameEventsEl.appendChild(empty);
}
}
function setReplayFrame(index) {
const frames = state.replay.frames;
if (!frames.length) {
replayVisualEl.removeAttribute("src");
replayOverlayEl.innerHTML = "";
replayFrameMetaEl.textContent = "No replay frames.";
replaySeekEl.value = "0";
replaySeekEl.max = "0";
replayStatusEl.textContent = "No replay loaded.";
return;
}
const bounded = Math.max(0, Math.min(index, frames.length - 1));
state.replay.frameIndex = bounded;
const frame = frames[bounded];
replayVisualEl.src = replayImageSrc(frame.image_path);
replayFrameMetaEl.textContent = `Frame ${bounded + 1}/${frames.length} | step ${frame.step} | ${frame.kind} | ${frame.ts}`;
replaySeekEl.max = String(Math.max(0, frames.length - 1));
replaySeekEl.value = String(bounded);
replayStatusEl.textContent = state.replay.isPlaying ? "Playing replay." : "Replay ready.";
renderReplayOverlay(frame);
renderReplayFrameEvents(frame);
}
function advanceReplay() {
const frames = state.replay.frames;
if (!state.replay.isPlaying || !frames.length) return;
if (state.replay.frameIndex >= frames.length - 1) {
stopReplay();
setReplayFrame(frames.length - 1);
replayStatusEl.textContent = "Replay finished.";
return;
}
setReplayFrame(state.replay.frameIndex + 1);
clearReplayTimer();
const delayMs = Math.max(120, Math.round(700 / (state.replay.speed || 1)));
state.replay.timer = setTimeout(advanceReplay, delayMs);
}
function toggleReplayPlay() {
if (!state.replay.frames.length) return;
if (state.replay.isPlaying) {
stopReplay();
setReplayFrame(state.replay.frameIndex);
return;
}
state.replay.isPlaying = true;
replayPlayBtn.textContent = "Pause";
replayStatusEl.textContent = "Playing replay.";
advanceReplay();
}
function resetReplay(payload) {
stopReplay();
const replayPayload = payload || {};
state.replay.frames = Array.isArray(replayPayload.frames) ? replayPayload.frames : [];
state.replay.trailingEvents = Array.isArray(replayPayload.trailing_events) ? replayPayload.trailing_events : [];
state.replay.frameIndex = 0;
setReplayFrame(0);
}
function scheduleWsReconnect() {
if (state.wsReconnectTimer || !state.token) return;
state.wsReconnectTimer = setTimeout(() => {
state.wsReconnectTimer = null;
connectWs();
}, 1200);
}
function updateLatestVisualFromEvent(ev) {
if (!ev || ev.event_type !== "visual_update") return;
if (!state.selectedJobId || ev.job_id !== state.selectedJobId) return;
const imagePath = ev.payload && ev.payload.image_meta && ev.payload.image_meta.path;
if (!imagePath) return;
const q = encodeURIComponent(imagePath);
latestVisualEl.src = `/api/jobs/${state.selectedJobId}/artifact?path=${q}&token=${encodeURIComponent(state.token)}`;
}
async function refreshJobs() {
const payload = await api("/api/jobs?limit=100");
state.jobs = payload.jobs || [];
if (!state.selectedJobId && state.jobs.length > 0) state.selectedJobId = state.jobs[0].job_id;
renderJobs();
}
async function refreshStats() {
const payload = await api("/api/stats");
renderStats(payload);
}
async function refreshAnalytics() {
const payload = await api("/api/analytics");
renderAnalytics(payload);
}
async function refreshJobDetail() {
if (!state.selectedJobId) return;
const [job, events, replay] = await Promise.all([
api(`/api/jobs/${state.selectedJobId}`),
api(`/api/jobs/${state.selectedJobId}/events?limit=120`),
api(`/api/jobs/${state.selectedJobId}/replay?limit=5000`)
]);
jobDetailEl.textContent = JSON.stringify(job, null, 2);
eventsEl.innerHTML = "";
const list = (events.events || []).slice().reverse();
for (const ev of list) pushEventLine(ev);
const visual = list.find((ev) => ev.event_type === "visual_update");
if (visual) updateLatestVisualFromEvent(visual);
resetReplay(replay);
}
function connectWs() {
if (!state.token) return;
if (state.ws && (state.ws.readyState === WebSocket.OPEN || state.ws.readyState === WebSocket.CONNECTING)) {
return;
}
const scheme = location.protocol === "https:" ? "wss" : "ws";
const ws = new WebSocket(`${scheme}://${location.host}/ws?token=${encodeURIComponent(state.token)}`);
state.ws = ws;
ws.onmessage = async (event) => {
try {
const payload = JSON.parse(event.data);
if (!payload || payload.event_type === "connected") return;
pushEventLine(payload);
updateLatestVisualFromEvent(payload);
if (!state.selectedJobId || payload.job_id === state.selectedJobId) {
await refreshJobDetail();
}
await refreshJobs();
await refreshStats();
if (analyticsRefreshEvents.has(payload.event_type)) {
await refreshAnalytics();
}
} catch (err) {
console.error(err);
}
};
ws.onclose = () => {
if (state.ws === ws) state.ws = null;
if (manuallyClosedSockets.has(ws)) {
manuallyClosedSockets.delete(ws);
return;
}
scheduleWsReconnect();
};
}
async function fullRefresh() {
await refreshJobs();
await refreshStats();
await refreshAnalytics();
await refreshJobDetail();
}
async function connect() {
state.token = tokenInput.value.trim();
localStorage.setItem("screenjob_token", state.token);
if (state.ws) {
manuallyClosedSockets.add(state.ws);
try { state.ws.close(); } catch (_) {}
state.ws = null;
}
if (state.wsReconnectTimer) {
clearTimeout(state.wsReconnectTimer);
state.wsReconnectTimer = null;
}
await fullRefresh();
connectWs();
}
function syncEventsViewToggle() {
eventsViewToggle.checked = state.eventsViewMode === "beautiful";
}
saveTokenBtn.addEventListener("click", () => connect().catch((err) => alert(err.message)));
refreshBtn.addEventListener("click", () => fullRefresh().catch((err) => alert(err.message)));
eventsViewToggle.addEventListener("change", () => {
state.eventsViewMode = eventsViewToggle.checked ? "beautiful" : "raw";
localStorage.setItem("screenjob_events_view_mode", state.eventsViewMode);
refreshJobDetail().catch((err) => alert(err.message));
});
replayPlayBtn.addEventListener("click", () => toggleReplayPlay());
replayPrevBtn.addEventListener("click", () => {
stopReplay();
setReplayFrame(state.replay.frameIndex - 1);
});
replayNextBtn.addEventListener("click", () => {
stopReplay();
setReplayFrame(state.replay.frameIndex + 1);
});
replaySpeedEl.addEventListener("change", () => {
const speed = Number(replaySpeedEl.value);
state.replay.speed = Number.isFinite(speed) && speed > 0 ? speed : 1;
if (state.replay.isPlaying) {
clearReplayTimer();
advanceReplay();
}
});
replaySeekEl.addEventListener("input", () => {
stopReplay();
setReplayFrame(Number(replaySeekEl.value || 0));
});
syncEventsViewToggle();
resetReplay(null);
if (state.token) connect().catch(() => {});


@@ -91,6 +91,41 @@ def test_click_supports_directional_offsets(tmp_path: Path, monkeypatch) -> None
assert click_result["clicked"] == {"x": 110, "y": 102}
def test_enhance_defaults_to_small_ui_preset(tmp_path: Path, monkeypatch) -> None:
agent = _build_agent(tmp_path, monkeypatch)
result = agent._tool_enhance({"coordinate": {"x": 100, "y": 120}})
assert result["ok"] is True
meta = result["meta"]
assert meta["region"] == "small"
assert meta["mode"] == "ui"
assert meta["scale"] == 4
assert Path(meta["path"]).exists()
assert meta["target_pixel"]["x"] >= 0
assert meta["target_pixel"]["y"] >= 0
def test_enhance_supports_text_mode_and_scale_clamp(tmp_path: Path, monkeypatch) -> None:
agent = _build_agent(tmp_path, monkeypatch)
result = agent._tool_enhance(
{
"coordinate": {"x": -99, "y": 9999},
"region": "medium",
"mode": "text",
"scale": 99,
}
)
assert result["ok"] is True
meta = result["meta"]
assert meta["region"] == "medium"
assert meta["mode"] == "text"
assert meta["scale"] == 6
assert meta["requested_coord"] == {"x": -99, "y": 9999}
assert meta["source_coord"] == {"x": 0, "y": 719}
assert Path(meta["path"]).exists()
def test_press_key_supports_hotkey_combo(tmp_path: Path, monkeypatch) -> None:
agent = _build_agent(tmp_path, monkeypatch)
result = agent._tool_press_key({"key": "meta+r"})
@@ -98,3 +133,21 @@ def test_press_key_supports_hotkey_combo(tmp_path: Path, monkeypatch) -> None:
assert result["key"] == "win+r"
assert result["message"] == "Key combo executed."
assert agent_module.pyautogui.last_hotkey == ("win", "r")
def test_context_compaction_trigger_and_payload(tmp_path: Path, monkeypatch) -> None:
agent = _build_agent(tmp_path, monkeypatch)
agent.objective = "Open settings app"
agent.previous_response_id = "resp_123"
agent.step = 4
agent.last_context_compact_step = 0
agent.options.screen_context_decay_steps = 4
agent.recent_tool_summaries = ["step=1 tool=see_screen status=ok"]
agent.last_screen_data_url = "data:image/png;base64,abc"
agent.last_screen_meta = {"width": 1280, "height": 720, "path": "C:/tmp/frame.png"}
assert agent._should_compact_context() is True
compacted = agent._build_compacted_pending_input()
assert len(compacted) == 2
assert "Context compaction activated" in compacted[0]["content"][0]["text"]
assert "Open settings app" in compacted[0]["content"][0]["text"]
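
The two `enhance` tests above pin down clamping behaviour: an out-of-range `scale` of 99 comes back as 6, and the coordinate `(-99, 9999)` is pulled to `(0, 719)`. A minimal sketch of that clamping follows, assuming a 1280x720 screen (inferred from the `source_coord` value and `last_screen_meta` in these tests) and the 2-6 scale range stated in the README. The helper names `clamp` and `clamp_enhance_request` are hypothetical and are not taken from the agent's actual implementation.

```python
from typing import Dict, Tuple

# Assumed bounds: the README documents scale as 2-6; the real constants may differ.
MIN_SCALE, MAX_SCALE = 2, 6


def clamp(value: int, low: int, high: int) -> int:
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(value, high))


def clamp_enhance_request(
    coord: Dict[str, int],
    scale: int,
    screen_size: Tuple[int, int] = (1280, 720),  # assumed test screen size
) -> Dict[str, object]:
    """Hypothetical helper: keep the requested point on screen and the scale in range."""
    width, height = screen_size
    return {
        # The original request is echoed back unchanged, as the test expects.
        "requested_coord": dict(coord),
        # Valid pixel indices run from 0 to width - 1 and 0 to height - 1.
        "source_coord": {
            "x": clamp(coord["x"], 0, width - 1),
            "y": clamp(coord["y"], 0, height - 1),
        },
        "scale": clamp(scale, MIN_SCALE, MAX_SCALE),
    }


meta = clamp_enhance_request({"x": -99, "y": 9999}, scale=99)
print(meta["source_coord"], meta["scale"])
```

Echoing `requested_coord` alongside the clamped `source_coord` lets a caller see when its request was adjusted rather than silently receiving a different crop.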


@@ -29,7 +29,10 @@ def test_cli_emits_structured_return_and_data(monkeypatch: Any, capsys, tmp_path
def fake_assess_task_safety(*_args, **_kwargs):
return True, "safe", {"safe": True}
captured_kwargs: dict[str, Any] = {}
def fake_run_job(*_args, **_kwargs):
captured_kwargs.update(_kwargs)
result = AgentResult(
completed=True,
result="Done",
@@ -66,3 +69,5 @@ def test_cli_emits_structured_return_and_data(monkeypatch: Any, capsys, tmp_path
assert payload["response"]["data"] == "file1.txt\nfile2.txt"
assert payload["return"] == "Task completed successfully"
assert payload["data"] == "file1.txt\nfile2.txt"
assert captured_kwargs["options"].reasoning_effort == "medium"
assert captured_kwargs["options"].screen_context_decay_steps == 4


@@ -9,6 +9,24 @@ import src.server as server_module
from src.config import AppConfig
_TERMINAL_STATUSES = {"completed", "failed", "cancelled"}
def _objective_category(objective: str) -> str:
text = objective.lower()
if any(keyword in text for keyword in ("browser", "website", "amazon", "google", "login", "shopping", "checkout", "orders")):
return "Browser / web"
if any(keyword in text for keyword in ("file", "folder", "directory", "terminal", "shell", "command", "cli", "script", "git", "repo", "install", "pip", "npm")):
return "Files / terminal"
if any(keyword in text for keyword in ("write", "summary", "document", "docs", "report", "email", "message", "readme", "markdown")):
return "Writing / docs"
if any(keyword in text for keyword in ("data", "analysis", "csv", "spreadsheet", "sheet", "table", "chart", "dashboard", "metric", "sql")):
return "Data / analysis"
if any(keyword in text for keyword in ("code", "bug", "fix", "test", "debug", "api", "backend", "frontend", "database", "deploy", "docker", "service", "build")):
return "Development / ops"
return "Other"
class FakeJobManager:
def __init__(self, *, config: AppConfig, db: Any, broadcast: Any = None) -> None:
self.config = config
@@ -26,6 +44,8 @@ class FakeJobManager:
command_timeout: int = 45,
type_interval: float = 0.02,
click_pause: float = 0.10,
reasoning_effort: str = "medium",
screen_context_decay_steps: int = 4,
disabled_tools: list[str] | None = None,
safety_override: bool = False,
no_failsafe: bool = False,
@@ -33,6 +53,11 @@ class FakeJobManager:
self._counter += 1
job_id = f"job_fake_{self._counter:03d}"
selected_model = (model or self.config.default_model).strip()
artifacts_dir = (self.config.runs_dir / f"run_{job_id}").resolve()
artifacts_dir.mkdir(parents=True, exist_ok=True)
screenshot_path = artifacts_dir / "screen_step_001.png"
screenshot_path.write_bytes(b"not-a-real-png")
created_at = f"2026-05-27T00:00:{self._counter:02d}Z"
self.last_submit_payload = {
"objective": objective,
"model": selected_model,
@@ -42,6 +67,8 @@ class FakeJobManager:
"command_timeout": command_timeout,
"type_interval": type_interval,
"click_pause": click_pause,
"reasoning_effort": reasoning_effort,
"screen_context_decay_steps": screen_context_decay_steps,
"no_failsafe": no_failsafe,
}
self._jobs[job_id] = {
@@ -49,6 +76,10 @@ class FakeJobManager:
"objective": objective,
"model": selected_model,
"status": "running",
"created_at": created_at,
"started_at": created_at,
"ended_at": None,
"steps": 1,
"result": "Running",
"response": {"return": "Running", "data": None},
"return": "Running",
@@ -61,7 +92,7 @@ class FakeJobManager:
"total_tokens": 14,
"estimated_cost_usd": 0.0001,
},
"artifacts_dir": str(self.config.runs_dir.resolve()),
"artifacts_dir": str(artifacts_dir),
}
self._events[job_id] = [
{
@@ -70,7 +101,47 @@ class FakeJobManager:
"ts": "2026-05-27T00:00:00Z",
"step": 1,
"event_type": "tool_called",
"payload": {"tool": "execute_command"},
"payload": {"tool": "click", "args": {"coordinate": {"x": 320, "y": 180}}},
},
{
"id": 2,
"job_id": job_id,
"ts": "2026-05-27T00:00:01Z",
"step": 1,
"event_type": "tool_result",
"payload": {"tool": "click", "result": {"ok": True, "clicked": {"x": 322, "y": 182}}},
},
{
"id": 3,
"job_id": job_id,
"ts": "2026-05-27T00:00:02Z",
"step": 1,
"event_type": "tool_called",
"payload": {"tool": "type", "args": {"text": "hello world"}},
},
{
"id": 4,
"job_id": job_id,
"ts": "2026-05-27T00:00:03Z",
"step": 1,
"event_type": "tool_result",
"payload": {"tool": "type", "result": {"ok": True, "typed_length": 11}},
},
{
"id": 5,
"job_id": job_id,
"ts": "2026-05-27T00:00:04Z",
"step": 1,
"event_type": "visual_update",
"payload": {
"kind": "see_screen",
"image_meta": {
"path": str(screenshot_path),
"width": 1920,
"height": 1080,
"grid": True,
},
},
}
]
return job_id
@@ -101,6 +172,114 @@ class FakeJobManager:
"live_running_threads": 0,
}
def analytics(self) -> dict[str, Any]:
by_category: dict[str, dict[str, Any]] = {}
by_day: dict[str, dict[str, Any]] = {}
def bucket(target: dict[str, dict[str, Any]], key: str) -> dict[str, Any]:
return target.setdefault(
key,
{
"label": key,
"total_jobs": 0,
"finished_jobs": 0,
"completed_jobs": 0,
"failed_jobs": 0,
"cancelled_jobs": 0,
"steps_sum": 0,
"steps_count": 0,
"cost_sum": 0.0,
"cost_count": 0,
},
)
total_jobs = 0
finished_jobs = 0
completed_jobs = 0
failed_jobs = 0
cancelled_jobs = 0
steps_sum = 0
steps_count = 0
cost_sum = 0.0
cost_count = 0
for job in self._jobs.values():
total_jobs += 1
status = str(job.get("status") or "")
finished = status in _TERMINAL_STATUSES
category = _objective_category(str(job.get("objective") or ""))
day = str(job.get("created_at") or "")[:10] or "unknown"
category_bucket = bucket(by_category, category)
day_bucket = bucket(by_day, day)
for item in (category_bucket, day_bucket):
item["total_jobs"] += 1
if not finished:
continue
finished_jobs += 1
if status == "completed":
completed_jobs += 1
elif status == "failed":
failed_jobs += 1
elif status == "cancelled":
cancelled_jobs += 1
steps_raw = job.get("steps")
if steps_raw is not None:
steps = int(steps_raw)
steps_sum += steps
steps_count += 1
for item in (category_bucket, day_bucket):
item["steps_sum"] += steps
item["steps_count"] += 1
estimated_cost_raw = (job.get("usage") or {}).get("estimated_cost_usd")
if estimated_cost_raw is not None:
estimated_cost = float(estimated_cost_raw)
cost_sum += estimated_cost
cost_count += 1
for item in (category_bucket, day_bucket):
item["cost_sum"] += estimated_cost
item["cost_count"] += 1
for item in (category_bucket, day_bucket):
item["finished_jobs"] += 1
if status == "completed":
item["completed_jobs"] += 1
elif status == "failed":
item["failed_jobs"] += 1
elif status == "cancelled":
item["cancelled_jobs"] += 1
def finalize(item: dict[str, Any]) -> dict[str, Any]:
finished = item["finished_jobs"]
return {
"label": item["label"],
"total_jobs": item["total_jobs"],
"finished_jobs": finished,
"completed_jobs": item["completed_jobs"],
"failed_jobs": item["failed_jobs"],
"cancelled_jobs": item["cancelled_jobs"],
"success_rate": round((item["completed_jobs"] / finished) * 100, 2) if finished else 0.0,
"avg_steps": round(item["steps_sum"] / item["steps_count"], 2) if item["steps_count"] else None,
"avg_cost_usd": round(item["cost_sum"] / item["cost_count"], 6) if item["cost_count"] else None,
}
return {
"total_jobs": total_jobs,
"finished_jobs": finished_jobs,
"completed_jobs": completed_jobs,
"failed_jobs": failed_jobs,
"cancelled_jobs": cancelled_jobs,
"success_rate": round((completed_jobs / finished_jobs) * 100, 2) if finished_jobs else 0.0,
"avg_steps": round(steps_sum / steps_count, 2) if steps_count else None,
"avg_cost_usd": round(cost_sum / cost_count, 6) if cost_count else None,
"by_category": sorted((finalize(item) for item in by_category.values()), key=lambda item: (-item["success_rate"], item["label"])),
"timeline": sorted((finalize(item) for item in by_day.values()), key=lambda item: item["label"]),
}
def _build_app(tmp_path: Path, monkeypatch: Any, disable_ui: bool = False):
monkeypatch.setattr(server_module, "JobManager", FakeJobManager)
@@ -145,6 +324,8 @@ def test_create_job_returns_only_job_id_and_defaults_model(tmp_path: Path, monke
manager = app.state.manager
assert manager.last_submit_payload["model"] == "gpt-5.4-mini"
assert manager.last_submit_payload["disabled_tools"] == ["click"]
assert manager.last_submit_payload["reasoning_effort"] == "medium"
assert manager.last_submit_payload["screen_context_decay_steps"] == 4
status_res = client.get(f"/api/jobs/{job_id}/status", headers=headers)
assert status_res.status_code == 200
@@ -174,12 +355,122 @@ def test_cancel_endpoint_and_events(tmp_path: Path, monkeypatch: Any) -> None:
assert status_after["data"] is None
def test_replay_endpoint_builds_frames_and_overlays(tmp_path: Path, monkeypatch: Any) -> None:
app, _ = _build_app(tmp_path, monkeypatch, disable_ui=False)
client = TestClient(app)
headers = {"Authorization": "Bearer test_token"}
create = client.post("/api/jobs", headers=headers, json={"job": "Replay test"})
job_id = create.json()["job_id"]
replay = client.get(f"/api/jobs/{job_id}/replay?limit=200", headers=headers)
assert replay.status_code == 200
payload = replay.json()
assert payload["job_id"] == job_id
assert payload["total_frames"] == 1
frame = payload["frames"][0]
assert frame["kind"] == "see_screen"
assert frame["is_fullscreen"] is True
labels = [item.get("label", "") for item in frame["overlays"]]
assert any("click" in text.lower() for text in labels)
assert any("typed" in text.lower() for text in labels)
def test_replay_endpoint_skips_visual_paths_outside_artifacts(tmp_path: Path, monkeypatch: Any) -> None:
app, _ = _build_app(tmp_path, monkeypatch, disable_ui=False)
manager = app.state.manager
client = TestClient(app)
headers = {"Authorization": "Bearer test_token"}
create = client.post("/api/jobs", headers=headers, json={"job": "Replay path check"})
job_id = create.json()["job_id"]
manager._events[job_id].append(
{
"id": 999,
"job_id": job_id,
"ts": "2026-05-27T00:01:00Z",
"step": 2,
"event_type": "visual_update",
"payload": {
"kind": "see_screen",
"image_meta": {
"path": str((tmp_path / "outside.png").resolve()),
"width": 100,
"height": 100,
"grid": True,
},
},
}
)
replay = client.get(f"/api/jobs/{job_id}/replay?limit=500", headers=headers)
assert replay.status_code == 200
payload = replay.json()
assert payload["total_frames"] == 1
def test_analytics_endpoint_groups_by_category_and_time(tmp_path: Path, monkeypatch: Any) -> None:
app, _ = _build_app(tmp_path, monkeypatch, disable_ui=False)
manager = app.state.manager
client = TestClient(app)
headers = {"Authorization": "Bearer test_token"}
browser_completed = client.post("/api/jobs", headers=headers, json={"job": "Open amazon.de and checkout"}).json()["job_id"]
browser_failed = client.post("/api/jobs", headers=headers, json={"job": "Open website and login"}).json()["job_id"]
terminal_completed = client.post("/api/jobs", headers=headers, json={"job": "Run a shell command to inspect files"}).json()["job_id"]
manager._jobs[browser_completed].update(
status="completed",
ended_at="2026-05-27T00:10:00Z",
steps=4,
created_at="2026-05-27T00:00:01Z",
usage={**manager._jobs[browser_completed]["usage"], "estimated_cost_usd": 0.12},
)
manager._jobs[browser_failed].update(
status="failed",
ended_at="2026-05-28T00:10:00Z",
steps=6,
created_at="2026-05-28T00:00:01Z",
usage={**manager._jobs[browser_failed]["usage"], "estimated_cost_usd": 0.24},
)
manager._jobs[terminal_completed].update(
status="completed",
ended_at="2026-05-28T00:15:00Z",
steps=10,
created_at="2026-05-28T00:00:02Z",
usage={**manager._jobs[terminal_completed]["usage"], "estimated_cost_usd": 0.05},
)
analytics = client.get("/api/analytics", headers=headers)
assert analytics.status_code == 200
payload = analytics.json()
assert payload["total_jobs"] == 3
assert payload["finished_jobs"] == 3
assert payload["completed_jobs"] == 2
assert payload["failed_jobs"] == 1
assert payload["success_rate"] == 66.67
assert payload["avg_steps"] == 6.67
assert payload["avg_cost_usd"] == 0.136667
browser = next(row for row in payload["by_category"] if row["label"] == "Browser / web")
terminal = next(row for row in payload["by_category"] if row["label"] == "Files / terminal")
assert browser["finished_jobs"] == 2
assert browser["success_rate"] == 50.0
assert browser["avg_steps"] == 5.0
assert terminal["success_rate"] == 100.0
assert [row["label"] for row in payload["timeline"]] == ["2026-05-27", "2026-05-28"]
def test_ui_toggle(tmp_path: Path, monkeypatch: Any) -> None:
app_enabled, _ = _build_app(tmp_path / "enabled", monkeypatch, disable_ui=False)
client_enabled = TestClient(app_enabled)
root_enabled = client_enabled.get("/")
assert root_enabled.status_code == 200
assert "ScreenJob Monitor" in root_enabled.text
assert "Success by Objective Category" in root_enabled.text
js_enabled = client_enabled.get("/ui/monitoring.js")
assert js_enabled.status_code == 200
assert "const tokenInput" in js_enabled.text
app_disabled, _ = _build_app(tmp_path / "disabled", monkeypatch, disable_ui=True)
client_disabled = TestClient(app_disabled)
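
`test_replay_endpoint_skips_visual_paths_outside_artifacts` above expects the replay endpoint to drop any `visual_update` whose image path lies outside the job's artifacts directory. One way such a containment check could be written is sketched below. The function name `is_inside_artifacts` is hypothetical, and the server's real check may be implemented differently.

```python
from pathlib import Path


def is_inside_artifacts(candidate: str, artifacts_dir: str) -> bool:
    """Hypothetical helper: True only if candidate resolves to a path under artifacts_dir."""
    root = Path(artifacts_dir).resolve()
    # resolve() normalises ".." segments and symlinks before comparing.
    target = Path(candidate).resolve()
    try:
        # relative_to raises ValueError when target is not below root.
        target.relative_to(root)
    except ValueError:
        return False
    return True
```

Resolving both paths first matters: a plain string prefix comparison would accept inputs such as `run_job_fake_001/../outside.png`.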


@@ -72,3 +72,55 @@ def test_storage_response_fallback_uses_result_when_json_missing(tmp_path: Path)
assert job is not None
assert job["response"]["return"] == "Legacy result string"
assert job["response"]["data"] is None
def test_history_db_analytics_groups_by_category_and_day(tmp_path: Path) -> None:
db = HistoryDB(tmp_path / "screenjob_test_analytics.db")
db.create_job(
job_id="job_browser_ok",
objective="Open amazon.de and checkout",
model="gpt-5.4-mini",
created_at="2026-05-27T00:00:01Z",
safety_override=False,
disabled_tools=[],
)
db.update_job("job_browser_ok", status="completed", steps=4, estimated_cost_usd=0.12)
db.create_job(
job_id="job_browser_fail",
objective="Open website and login",
model="gpt-5.4-mini",
created_at="2026-05-28T00:00:01Z",
safety_override=False,
disabled_tools=[],
)
db.update_job("job_browser_fail", status="failed", steps=6, estimated_cost_usd=0.24)
db.create_job(
job_id="job_terminal_ok",
objective="Run a shell command to inspect files",
model="gpt-5.4-mini",
created_at="2026-05-28T00:00:02Z",
safety_override=False,
disabled_tools=[],
)
db.update_job("job_terminal_ok", status="completed", steps=10, estimated_cost_usd=0.05)
analytics = db.analytics()
assert analytics["total_jobs"] == 3
assert analytics["finished_jobs"] == 3
assert analytics["completed_jobs"] == 2
assert analytics["failed_jobs"] == 1
assert analytics["success_rate"] == 66.67
assert analytics["avg_steps"] == 6.67
assert analytics["avg_cost_usd"] == 0.136667
browser = next(row for row in analytics["by_category"] if row["label"] == "Browser / web")
terminal = next(row for row in analytics["by_category"] if row["label"] == "Files / terminal")
assert browser["finished_jobs"] == 2
assert browser["success_rate"] == 50.0
assert browser["avg_steps"] == 5.0
assert terminal["success_rate"] == 100.0
assert [row["label"] for row in analytics["timeline"]] == ["2026-05-27", "2026-05-28"]
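
The expected totals in the analytics assertions above follow directly from the three fixture jobs (steps 4, 6, and 10; costs 0.12, 0.24, and 0.05; two of three completed). The arithmetic can be reproduced with the same rounding the fake manager uses, two decimals for rate and steps and six for cost. This is a standalone worked example, not code from the repository.

```python
# Fixture values copied from the tests above.
jobs = [
    {"status": "completed", "steps": 4, "cost": 0.12},
    {"status": "failed", "steps": 6, "cost": 0.24},
    {"status": "completed", "steps": 10, "cost": 0.05},
]

finished = len(jobs)  # all three jobs are in a terminal status
completed = sum(1 for job in jobs if job["status"] == "completed")

success_rate = round(completed / finished * 100, 2)                       # 2 / 3 * 100
avg_steps = round(sum(job["steps"] for job in jobs) / finished, 2)        # 20 / 3
avg_cost_usd = round(sum(job["cost"] for job in jobs) / finished, 6)      # 0.41 / 3

print(success_rate, avg_steps, avg_cost_usd)  # prints: 66.67 6.67 0.136667
```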

todo.md

@@ -4,21 +4,20 @@
- [Bug] Enforce single active desktop-control run (or a strict queue) so concurrent jobs cannot fight over the same mouse/keyboard/screen session.
- [Bug] Fix run artifact collisions in `setup_artifacts()` (`run_id` is second-granularity, so two jobs in the same second can share/overwrite the same directory).
- [Bug] Remove global logger handler clobbering in `setup_logger()` (`logging.getLogger("screenjob").handlers.clear()` breaks concurrent runs and can redirect logs to the wrong file).
- [Bug] More consistent clicks and more uses of enhance images.
- [x] More consistent clicks and more uses of enhance images.
## P1
- [x] Move ui.py into a separate html file and js file.
- [x] Think harder using effort "medium" by default.
- [x] Decay old screenshots after 3 to 5 steps to save (1) tokens and (2) brain fuck in the agents.
- [Bug] Validate `disabled_tools` against an allowlist and disallow disabling critical completion flow (`task_complete`) to avoid guaranteed step-limit failures.
- [Bug] Improve `execute_command` cancellation/timeout handling to terminate full process trees, not only the parent shell process.
- [Bug] Reduce API/UI token leakage risk by moving away from query-string token usage for websocket/artifact access where possible.
- [Idea] Add per-token rate limiting and request size limits (objective length + payload bounds) for API hardening.
## P2
- [Bug] Fix UI event style mapping mismatch (`tool_called` events are emitted, but UI color map expects `tool_call`).
- [Idea] Reduce monitoring UI backend load by throttling websocket-triggered refreshes and avoiding full job/event re-fetch on every event.
- [Idea] Add cursor-based pagination for jobs/events instead of large fixed limits.
- [Idea] Support offline/self-hosted UI assets (bundle Tailwind instead of CDN dependency).
- [Idea] Add retention controls/pruning for old runs, screenshots, and DB rows.
## P3
- [Idea] Add Replay Mode; Ability to replay a session by reconstructing the screen from screenshots and overlaying tool calls and click and type events.
- [Idea] Add lightweight analytics dashboards (success rate by objective category, avg steps/cost over time).
- [x] Add Replay Mode; Ability to replay a session by reconstructing the screen from screenshots and overlaying tool calls and click and type events.
- [x] Add lightweight analytics dashboards (success rate by objective category, avg steps/cost over time).
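
The open P0 item about `setup_artifacts()` notes that a second-granularity `run_id` lets two jobs started in the same second share a directory. One possible fix, sketched here under the assumption that the ID is timestamp-based, is to append a short random suffix. `make_run_id` is a hypothetical name, and the timestamp format is illustrative rather than the project's actual one.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def make_run_id(now: Optional[datetime] = None) -> str:
    """Hypothetical helper: timestamp prefix for readability plus a random suffix for uniqueness."""
    moment = now or datetime.now(timezone.utc)
    # uuid4 supplies 32 bits of randomness here, so same-second IDs are very unlikely to collide.
    suffix = uuid.uuid4().hex[:8]
    return f"{moment:%Y%m%d_%H%M%S}_{suffix}"


print(make_run_id())
```

Keeping the timestamp as a prefix preserves chronological sorting of run directories. Creating the directory with `Path.mkdir(exist_ok=False)` would additionally surface the rare remaining collision instead of silently overwriting.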