Compare commits
8 Commits: 160b4e6292...feat/light
| Author | SHA1 | Date |
|---|---|---|
| | 8126b57404 | |
| | cceed18cf1 | |
| | 880468ef02 | |
| | b05a7be668 | |
| | 0c019474af | |
| | a8ef8ee552 | |
| | 111a1e84af | |
| | 620fcc4aa6 | |
README.md (10 changed lines)
@@ -156,13 +156,21 @@ Each job payload includes:
 - Read-only dashboard (no run controls)
 - Requires token input
 - Live updates via `/ws`
+- Analytics dashboards for success rate by objective category and daily averages
 - Set `DISABLE_UI=true` to disable UI
 
+### Analytics API
+
+- `GET /api/analytics`
+- Returns objective-category success rates plus average steps/cost over time
+
 ## Agent Instructions (Practical)
 
 - Prefer `execute_command` for deterministic actions (opening URLs, filesystem checks).
 - Use `see_screen` before UI interaction.
-- Use `enhance` when text is unclear.
+- Use `enhance` before clicking small/ambiguous targets; prefer `region="small"` for compact controls.
+- Use `enhance` `mode="text"` for tiny labels/text, or `mode="ui"` for general UI.
+- Optionally set `enhance` `scale` (2-6) for tighter zoom control.
 - Use `press_key` for non-text keys (Enter, Tab, arrows, Escape).
 - For shortcuts, use one `press_key` call with combo syntax (example: `win+r`).
 - Use `click` offsets via `offset_up/down/left/right` and optional `sleep_after_seconds`.
SKILL.md (8 changed lines)
@@ -37,6 +37,14 @@ Keyboard combo rule:
 - For shortcuts, use one `press_key` call with combo syntax, for example: `win+r`, `ctrl+shift+esc`.
 - Do not split modifier combos into separate calls.
 
+Enhance-first click rule:
+
+- Before clicking small buttons/icons, dense UI, or ambiguous targets, call `enhance` first.
+- Preferred preset for tiny controls: `enhance(coordinate, region="small", mode="ui")`.
+- For tiny labels/text: use `mode="text"` to improve readability.
+- Optional zoom control: set `scale` from `2` to `6` (defaults are tuned by region).
+- After checking the enhanced image, click using the same target coordinate (or a small directional offset if needed).
+
 Verification rule:
 
 - Before `task_complete`, verify actual on-screen content matches the expected outcome.
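The enhance-first click rule can be sketched as a small helper that builds the argument payload for an `enhance` call. This is only an illustration: the helper name `build_enhance_args` and its `tiny_text` flag are hypothetical, while the `coordinate`, `region`, `mode`, and `scale` fields and the 2 to 6 scale range are taken from the rule text and the tool schema in this diff.

```python
from typing import Any, Dict, Optional


def build_enhance_args(x: int, y: int, tiny_text: bool = False, scale: Optional[int] = None) -> Dict[str, Any]:
    """Build arguments for an enhance call made before a risky click (hypothetical helper)."""
    args: Dict[str, Any] = {
        "coordinate": {"x": x, "y": y},
        "region": "small",                      # preferred preset for tiny controls
        "mode": "text" if tiny_text else "ui",  # "text" for tiny labels, "ui" otherwise
    }
    if scale is not None:
        # Keep an explicit zoom factor inside the documented 2..6 range.
        args["scale"] = max(2, min(6, scale))
    return args


print(build_enhance_args(812, 44))
print(build_enhance_args(812, 44, tiny_text=True, scale=9))
```

Leaving `scale` out lets the tool pick its region-specific default, which is what the rule recommends unless tighter zoom is needed.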
src/agent.py (245 changed lines)
@@ -9,7 +9,7 @@ import traceback
 from typing import Any, Callable
 
 from openai import OpenAI
-from PIL import Image, ImageEnhance, ImageFilter, ImageOps
+from PIL import Image, ImageDraw, ImageEnhance, ImageFilter, ImageOps
 
 from .models import AgentResult, RunArtifacts, RuntimeOptions, UsageSummary
 from .pricing import estimate_cost_usd
@@ -34,7 +34,8 @@ Rules:
 - launching apps or running terminal checks
 3) For UI tasks, inspect with see_screen before clicking/typing.
 4) Coordinates are absolute screen pixels (x, y) from top-left.
-5) Use enhance(coordinate) when text/UI is unclear.
+5) Use enhance before risky clicks: small buttons/icons, dense UI, or when target confidence is below high.
+5a) For tiny controls use enhance(coordinate, region="small", mode="ui"). For tiny text use mode="text".
 6) For keyboard-heavy interactions, prefer press_key for special keys.
 6a) For key combinations, call press_key once with combo syntax (example: "win+r", "ctrl+shift+esc"). Do not split modifier combos across separate calls.
 7) You may call multiple tools in one step. If needed, do click then sleep.
@@ -76,11 +77,14 @@ class ScreenJobAgent:
         self.final_data: Any | None = None
         self.previous_response_id: str | None = None
         self.usage = UsageSummary()
+        self.objective = ""
 
         self.last_screen_data_url: str | None = None
         self.last_screen_meta: dict[str, Any] | None = None
         self.click_history: list[tuple[int, int, float]] = []
         self.disabled_tools = {tool.strip() for tool in (options.disable_tools or set()) if tool.strip()}
+        self.recent_tool_summaries: list[str] = []
+        self.last_context_compact_step = 0
 
     def _emit(self, event_type: str, payload: dict[str, Any]) -> None:
         if self.event_callback is None:
@@ -192,7 +196,10 @@ class ScreenJobAgent:
             {
                 "type": "function",
                 "name": "enhance",
-                "description": "Create enhanced zoom around a coordinate for readability.",
+                "description": (
+                    "Create enhanced zoom around a coordinate for readability and precise targeting. "
+                    "Prefer this before clicking tiny or ambiguous UI targets."
+                ),
                 "parameters": {
                     "type": "object",
                     "properties": {
@@ -204,7 +211,19 @@ class ScreenJobAgent:
                             },
                             "required": ["x", "y"],
                             "additionalProperties": False,
-                        }
+                        },
+                        "region": {
+                            "type": "string",
+                            "enum": ["small", "medium", "large"],
+                        },
+                        "mode": {
+                            "type": "string",
+                            "enum": ["ui", "text"],
+                        },
+                        "scale": {
+                            "type": ["integer", "string"],
+                            "description": "Zoom factor from 2 to 6. Defaults by region.",
+                        },
                     },
                     "required": ["coordinate"],
                     "additionalProperties": False,
@@ -352,6 +371,23 @@ class ScreenJobAgent:
             sec = max_seconds
         return sec
 
+    def _parse_int(self, value: Any, default: int = 0) -> int:
+        if value is None:
+            return default
+        if isinstance(value, bool):
+            return int(value)
+        if isinstance(value, int):
+            return value
+        if isinstance(value, float):
+            return int(round(value))
+        text = str(value).strip()
+        if not text:
+            return default
+        try:
+            return int(float(text))
+        except Exception:  # noqa: BLE001
+            return default
+
     def _tool_see_screen(self, _: dict[str, Any]) -> dict[str, Any]:
         image, meta = self._capture_screen(with_grid=True)
         out_path = self.artifacts.shots_dir / f"screen_step_{self.step:03d}.png"
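To see what the new `_parse_int` helper does with typical tool arguments, here is a standalone copy of the same logic as a plain function (the name `parse_int` is only for this sketch). It shows one detail worth knowing when reading the diff: float inputs are rounded, while numeric strings are truncated through `int(float(...))`.

```python
from typing import Any


def parse_int(value: Any, default: int = 0) -> int:
    """Standalone copy of the coercion logic in _parse_int, for illustration only."""
    if value is None:
        return default
    if isinstance(value, bool):
        return int(value)
    if isinstance(value, int):
        return value
    if isinstance(value, float):
        return int(round(value))      # floats are rounded
    text = str(value).strip()
    if not text:
        return default
    try:
        return int(float(text))       # numeric strings are truncated
    except Exception:
        return default


print(parse_int("4"))       # 4
print(parse_int(" 3.9 "))   # 3
print(parse_int(3.9))       # 4
print(parse_int("abc", 7))  # 7
```

Note that the float branch sits outside the `try`, so a non-finite float such as `float("inf")` would raise rather than fall back to `default`. Whether that can occur depends on how tool arguments are decoded, which this diff does not show.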
@@ -369,34 +405,106 @@ class ScreenJobAgent:
 
     def _tool_enhance(self, args: dict[str, Any]) -> dict[str, Any]:
         coord = args.get("coordinate") or {}
-        x = int(coord.get("x", 0))
-        y = int(coord.get("y", 0))
+        requested_x = self._parse_int(coord.get("x", 0), default=0)
+        requested_y = self._parse_int(coord.get("y", 0), default=0)
+        region = str(args.get("region", "small") or "small").strip().lower()
+        mode = str(args.get("mode", "ui") or "ui").strip().lower()
+        if region not in {"small", "medium", "large"}:
+            region = "small"
+        if mode not in {"ui", "text"}:
+            mode = "ui"
+
+        region_half_by_preset = {
+            "small": 96,
+            "medium": 160,
+            "large": 240,
+        }
+        default_scale_by_region = {
+            "small": 4,
+            "medium": 3,
+            "large": 2,
+        }
+        raw_scale = self._parse_int(args.get("scale"), default=0)
+        scale = raw_scale if raw_scale > 0 else default_scale_by_region[region]
+        scale = clamp(scale, 2, 6)
+
         base, base_meta = self._capture_screen(with_grid=False)
         width, height = base.size
 
-        region_half = 180
-        left = clamp(x - region_half, 0, width - 1)
-        top = clamp(y - region_half, 0, height - 1)
-        right = clamp(x + region_half, left + 1, width)
-        bottom = clamp(y + region_half, top + 1, height)
+        source_x = clamp(requested_x, 0, max(0, width - 1))
+        source_y = clamp(requested_y, 0, max(0, height - 1))
+        region_half = region_half_by_preset[region]
+        left = clamp(source_x - region_half, 0, width - 1)
+        top = clamp(source_y - region_half, 0, height - 1)
+        right = clamp(source_x + region_half, left + 1, width)
+        bottom = clamp(source_y + region_half, top + 1, height)
 
         crop = base.crop((left, top, right, bottom))
-        upscaled = crop.resize((crop.width * 2, crop.height * 2), Image.Resampling.BICUBIC)
-        enhanced = ImageOps.autocontrast(upscaled)
-        enhanced = ImageEnhance.Sharpness(enhanced).enhance(2.0)
-        enhanced = ImageEnhance.Contrast(enhanced).enhance(1.25)
-        enhanced = enhanced.filter(ImageFilter.UnsharpMask(radius=1.8, percent=180, threshold=2))
+        out_w = max(2, crop.width * scale)
+        out_h = max(2, crop.height * scale)
+        upscaled = crop.resize((out_w, out_h), Image.Resampling.LANCZOS)
 
-        out_path = self.artifacts.enhance_dir / f"enhance_step_{self.step:03d}_{x}_{y}.png"
+        if mode == "text":
+            text_view = ImageOps.grayscale(upscaled)
+            text_view = ImageOps.autocontrast(text_view, cutoff=1)
+            text_view = ImageOps.equalize(text_view)
+            text_view = ImageEnhance.Contrast(text_view).enhance(1.35)
+            text_view = ImageEnhance.Sharpness(text_view).enhance(2.1)
+            processed = text_view.filter(ImageFilter.UnsharpMask(radius=1.2, percent=160, threshold=1)).convert("RGB")
+        else:
+            ui_view = ImageOps.autocontrast(upscaled, cutoff=1)
+            ui_view = ImageEnhance.Contrast(ui_view).enhance(1.2)
+            ui_view = ImageEnhance.Sharpness(ui_view).enhance(1.8)
+            processed = ui_view.filter(ImageFilter.UnsharpMask(radius=1.4, percent=150, threshold=2)).convert("RGB")
+
+        edges = upscaled.convert("L").filter(ImageFilter.FIND_EDGES)
+        edges = ImageOps.autocontrast(edges, cutoff=4)
+        edge_overlay = ImageOps.colorize(edges, black=(0, 0, 0), white=(60, 220, 255))
+        enhanced = Image.blend(processed, edge_overlay, alpha=0.18)
+
+        cx = clamp((source_x - left) * scale, 0, max(0, enhanced.width - 1))
+        cy = clamp((source_y - top) * scale, 0, max(0, enhanced.height - 1))
+        draw = ImageDraw.Draw(enhanced)
+        draw.rectangle([0, 0, enhanced.width - 1, enhanced.height - 1], outline=(255, 80, 80), width=2)
+        ring_radius = max(10, int(6 * scale / 2))
+        arm_len = max(14, int(9 * scale / 2))
+        gap = max(4, int(2 * scale / 2))
+        line_width = max(2, int(scale / 2))
+        draw.ellipse(
+            [cx - ring_radius, cy - ring_radius, cx + ring_radius, cy + ring_radius],
+            outline=(255, 80, 80),
+            width=line_width,
+        )
+        draw.line([(max(0, cx - arm_len), cy), (max(0, cx - gap), cy)], fill=(255, 80, 80), width=line_width)
+        draw.line(
+            [(min(enhanced.width - 1, cx + gap), cy), (min(enhanced.width - 1, cx + arm_len), cy)],
+            fill=(255, 80, 80),
+            width=line_width,
+        )
+        draw.line([(cx, max(0, cy - arm_len)), (cx, max(0, cy - gap))], fill=(255, 80, 80), width=line_width)
+        draw.line(
+            [(cx, min(enhanced.height - 1, cy + gap)), (cx, min(enhanced.height - 1, cy + arm_len))],
+            fill=(255, 80, 80),
+            width=line_width,
+        )
+
+        out_path = self.artifacts.enhance_dir / (
+            f"enhance_step_{self.step:03d}_{source_x}_{source_y}_{region}_{mode}_x{scale}.png"
+        )
         self._save_image(enhanced, out_path)
         data_url = image_to_data_url(enhanced, "PNG")
 
         meta = {
             "captured_at": utc_now_iso(),
-            "source_coord": {"x": x, "y": y},
+            "requested_coord": {"x": requested_x, "y": requested_y},
+            "source_coord": {"x": source_x, "y": source_y},
             "source_box": {"left": left, "top": top, "right": right, "bottom": bottom},
-            "scale": 2,
+            "region": region,
+            "mode": mode,
+            "scale": scale,
             "path": str(out_path.resolve()),
+            "size": {"width": enhanced.width, "height": enhanced.height},
+            "target_pixel": {"x": cx, "y": cy},
             "screen_size": {"width": width, "height": height},
             "base_capture_meta": base_meta,
         }
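The geometry in the new `_tool_enhance` (scale selection, crop box, and the marker position reported as `target_pixel`) can be followed without any image code. The sketch below reproduces only the arithmetic from the hunk above. The function name `enhance_geometry` is hypothetical, and `clamp` is assumed to behave as `max(low, min(high, value))`, since its definition is not part of this diff.

```python
REGION_HALF = {"small": 96, "medium": 160, "large": 240}
DEFAULT_SCALE = {"small": 4, "medium": 3, "large": 2}


def clamp(value: int, low: int, high: int) -> int:
    # Assumed behavior of the clamp helper used by the agent (not shown in the diff).
    return max(low, min(high, value))


def enhance_geometry(x, y, width, height, region="small", scale=0):
    """Return (crop_box, scale, target_pixel) for an enhance request."""
    # A non-positive scale means "use the region default"; the result is kept in 2..6.
    scale = clamp(scale if scale > 0 else DEFAULT_SCALE[region], 2, 6)
    sx = clamp(x, 0, max(0, width - 1))
    sy = clamp(y, 0, max(0, height - 1))
    half = REGION_HALF[region]
    left = clamp(sx - half, 0, width - 1)
    top = clamp(sy - half, 0, height - 1)
    right = clamp(sx + half, left + 1, width)
    bottom = clamp(sy + half, top + 1, height)
    out_w, out_h = (right - left) * scale, (bottom - top) * scale
    # Marker position inside the upscaled crop.
    cx = clamp((sx - left) * scale, 0, max(0, out_w - 1))
    cy = clamp((sy - top) * scale, 0, max(0, out_h - 1))
    return (left, top, right, bottom), scale, (cx, cy)


print(enhance_geometry(10, 500, 1920, 1080))
# ((0, 404, 106, 596), 4, (40, 384))
```

Near a screen edge the crop is clipped rather than shifted, so the target is no longer centered in the output image. That appears to be why the new metadata reports `target_pixel` explicitly instead of leaving the reader to assume the center.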
@@ -628,6 +736,9 @@ class ScreenJobAgent:
         return {"_raw": raw}
 
     def _call_model(self, input_items: list[dict[str, Any]]) -> Any:
+        effort = str(self.options.reasoning_effort or "medium").strip().lower()
+        if effort not in {"low", "medium", "high"}:
+            effort = "medium"
         return self.client.responses.create(
             model=self.options.model,
             instructions=SYSTEM_PROMPT,
@@ -636,9 +747,85 @@ class ScreenJobAgent:
             previous_response_id=self.previous_response_id,
             parallel_tool_calls=True,
             max_tool_calls=8,
+            reasoning={"effort": effort},
         )
 
+    def _record_tool_summary(self, tool_name: str, result: dict[str, Any]) -> None:
+        ok = bool(result.get("ok"))
+        status = "ok" if ok else "fail"
+        summary = f"step={self.step} tool={tool_name} status={status}"
+        if tool_name == "click":
+            clicked = result.get("clicked") if isinstance(result.get("clicked"), dict) else {}
+            x = clicked.get("x")
+            y = clicked.get("y")
+            if isinstance(x, int) and isinstance(y, int):
+                summary = f"{summary} at=({x},{y})"
+        elif tool_name == "type":
+            typed_length = int(result.get("typed_length", 0) or 0)
+            summary = f"{summary} typed_length={typed_length}"
+        elif tool_name == "press_key":
+            key = str(result.get("key") or "").strip()
+            if key:
+                summary = f"{summary} key={key}"
+        elif tool_name == "execute_command":
+            exit_code = result.get("exit_code")
+            if exit_code is not None:
+                summary = f"{summary} exit_code={exit_code}"
+        elif tool_name in {"see_screen", "enhance"}:
+            meta = result.get("meta") if isinstance(result.get("meta"), dict) else {}
+            path = str(meta.get("path") or result.get("path") or "").strip()
+            if path:
+                summary = f"{summary} image={path}"
+        if not ok:
+            error_text = str(result.get("error") or "").strip()
+            if error_text:
+                summary = f"{summary} error={error_text[:140]}"
+        self.recent_tool_summaries.append(summary)
+        self.recent_tool_summaries = self.recent_tool_summaries[-20:]
+
+    def _should_compact_context(self) -> bool:
+        interval = max(0, int(self.options.screen_context_decay_steps or 0))
+        if interval <= 0:
+            return False
+        if self.previous_response_id is None:
+            return False
+        return (self.step - self.last_context_compact_step) >= interval
+
+    def _build_compacted_pending_input(self) -> list[dict[str, Any]]:
+        recent = self.recent_tool_summaries[-8:]
+        lines = "\n".join(f"- {line}" for line in recent) if recent else "- No recent tool activity."
+        content = (
+            "Context compaction activated to decay stale screenshots and reduce token usage.\n"
+            f"JOB: {self.objective}\n"
+            f"Current step: {self.step}\n"
+            "Recent tool activity:\n"
+            f"{lines}\n"
+            "Continue execution from the latest screen state. "
+            "Use tools only, and finish with task_complete when done."
+        )
+        compacted_input: list[dict[str, Any]] = [
+            {
+                "role": "user",
+                "content": [
+                    {
+                        "type": "input_text",
+                        "text": content,
+                    }
+                ],
+            }
+        ]
+        if self.last_screen_data_url and self.last_screen_meta:
+            compacted_input.append(
+                self._build_visual_message(
+                    "Current screen after context compaction",
+                    self.last_screen_data_url,
+                    self.last_screen_meta,
+                )
+            )
+        return compacted_input
+
     def run(self, job: str) -> AgentResult:
+        self.objective = job
         started_at = time.time()
         self.logger.info("Starting run_id=%s model=%s", self.artifacts.run_id, self.options.model)
         self.logger.info("Job: %s", job)
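The compaction schedule introduced by `_should_compact_context` is easier to reason about in isolation. The sketch below mirrors the predicate from the hunk above and runs it in a simplified loop. It assumes that every model call stores a fresh `previous_response_id`, which the agent presumably does elsewhere but which is not shown in this diff; the function names are hypothetical.

```python
from typing import List, Optional


def should_compact(step: int, last_compact_step: int, interval: int, previous_response_id: Optional[str]) -> bool:
    """Mirror of the predicate: compact every `interval` steps once a response chain exists."""
    if interval <= 0 or previous_response_id is None:
        return False
    return (step - last_compact_step) >= interval


def simulate(max_steps: int, interval: int) -> List[int]:
    """Return the steps at which compaction would fire in a simplified agent loop."""
    compacted_at: List[int] = []
    previous_response_id: Optional[str] = None
    last_compact_step = 0
    for step in range(1, max_steps + 1):
        if should_compact(step, last_compact_step, interval, previous_response_id):
            previous_response_id = None   # drop the chained conversation state
            last_compact_step = step
            compacted_at.append(step)
        previous_response_id = f"resp_{step}"  # assumption: each call yields a new id
    return compacted_at


print(simulate(12, 4))  # [4, 8, 12]
print(simulate(12, 0))  # []
```

With the default of 4, the chain is reset on steps 4, 8, 12, and so on, and each reset replaces the accumulated history with the short text summary plus the latest screenshot.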
@@ -648,6 +835,8 @@ class ScreenJobAgent:
             {
                 "run_id": self.artifacts.run_id,
                 "model": self.options.model,
+                "reasoning_effort": self.options.reasoning_effort,
+                "screen_context_decay_steps": self.options.screen_context_decay_steps,
                 "objective": job,
                 "disabled_tools": sorted(self.disabled_tools),
             },
@@ -664,6 +853,8 @@ class ScreenJobAgent:
                             f"JOB: {job}\n"
                             "You are in an action loop. Prefer execute_command for deterministic actions. "
                             "For modifier shortcuts, use a single press_key combo (example: win+r). "
+                            "Before clicking tiny buttons/icons or dense UI areas, call enhance first "
+                            "(use region='small'; use mode='text' for tiny text labels). "
                             "You can return multiple tool calls in one step (example: click then sleep). "
                             "When done call task_complete(return=..., data=...). "
                             "Before task_complete, verify the screen content is what was expected "
@@ -692,6 +883,19 @@ class ScreenJobAgent:
             self.step += 1
             self.logger.info("---- Agent step %d/%d ----", self.step, self.options.max_steps)
             self._emit("step_started", {"step": self.step, "max_steps": self.options.max_steps})
+            if self._should_compact_context():
+                self.previous_response_id = None
+                pending_input = self._build_compacted_pending_input()
+                self.last_context_compact_step = self.step
+                self.logger.info("Compacted model context at step %d.", self.step)
+                self._emit(
+                    "context_compacted",
+                    {
+                        "step": self.step,
+                        "decay_steps": self.options.screen_context_decay_steps,
+                        "recent_tool_summaries": self.recent_tool_summaries[-8:],
+                    },
+                )
             try:
                 response = self._call_model(pending_input)
                 self._register_usage(response)
@@ -720,6 +924,8 @@ class ScreenJobAgent:
                                 "text": (
                                     "No function call was returned. Continue by using tools. "
                                     "Use one press_key call for key combos like win+r. "
+                                    "Prefer enhance before clicking small/unclear targets "
+                                    "(region='small', mode='ui' or 'text'). "
                                     "You may call multiple tools in one step. "
                                     "Before task_complete, verify expected screen content with see_screen/enhance "
                                     "and include observed_result in data. "
@@ -763,6 +969,7 @@ class ScreenJobAgent:
                     name,
                     json.dumps(result, ensure_ascii=False)[:2500],
                 )
+                self._record_tool_summary(name, result)
                 self._emit("tool_result", {"step": self.step, "tool": name, "result": result})
                 next_input.append(
                     {
src/cli.py (14 changed lines)
@@ -28,6 +28,18 @@ def build_parser() -> argparse.ArgumentParser:
     parser.add_argument("--command-timeout", type=int, default=45, help="Timeout in seconds for execute_command.")
     parser.add_argument("--type-interval", type=float, default=0.02, help="Seconds between typed characters.")
     parser.add_argument("--click-pause", type=float, default=0.10, help="Mouse move duration before click.")
+    parser.add_argument(
+        "--reasoning-effort",
+        choices=["low", "medium", "high"],
+        default="medium",
+        help="Reasoning effort passed to the model.",
+    )
+    parser.add_argument(
+        "--screen-context-decay-steps",
+        type=int,
+        default=4,
+        help="Compact model context every N steps to decay old screenshots (0 disables).",
+    )
     parser.add_argument("--disable-tool", action="append", default=[], help="Disable a tool by name.")
     parser.add_argument("--skip-safety-check", action="store_true", help="Bypass pre-flight safety check.")
     parser.add_argument("--no-failsafe", action="store_true", help="Disable PyAutoGUI fail-safe.")
@@ -78,6 +90,8 @@ def main(argv: list[str] | None = None) -> int:
         command_timeout=args.command_timeout,
         type_interval=args.type_interval,
         click_pause=args.click_pause,
+        reasoning_effort=args.reasoning_effort,
+        screen_context_decay_steps=max(0, int(args.screen_context_decay_steps)),
         disable_tools=set(disabled_tools),
     )
     try:
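A minimal sketch of how the two new CLI options behave, using only `argparse` from the standard library. The parser here contains just the options added in this diff (the real `build_parser` defines many more), and the final line repeats the `max(0, ...)` guard that `main` applies before building `RuntimeOptions`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Reduced parser for illustration: only the two options added in this diff.
    parser = argparse.ArgumentParser()
    parser.add_argument("--reasoning-effort", choices=["low", "medium", "high"], default="medium")
    parser.add_argument("--screen-context-decay-steps", type=int, default=4)
    return parser


defaults = build_parser().parse_args([])
print(defaults.reasoning_effort, defaults.screen_context_decay_steps)  # medium 4

args = build_parser().parse_args(["--reasoning-effort", "high", "--screen-context-decay-steps=-3"])
decay_steps = max(0, int(args.screen_context_decay_steps))  # negative values disable compaction
print(args.reasoning_effort, decay_steps)  # high 0
```

Because of the guard, a negative value passed on the command line ends up equivalent to `0`, which the help text documents as "disabled".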
@@ -58,4 +58,6 @@ class RuntimeOptions:
     command_timeout: int = 45
     type_interval: float = 0.02
     click_pause: float = 0.10
+    reasoning_effort: str = "medium"
+    screen_context_decay_steps: int = 4
     disable_tools: set[str] | None = None
src/server.py (204 changed lines)
@@ -15,7 +15,8 @@ from pydantic import BaseModel, Field
 from .config import AppConfig, load_app_config
 from .storage import HistoryDB
 from .task_manager import JobManager
-from .ui import monitoring_page_html
+from .ui import monitoring_js_path, monitoring_page_html
+from .utils import utc_now_iso
 
 
 class CreateJobRequest(BaseModel):
@@ -25,11 +26,188 @@ class CreateJobRequest(BaseModel):
     command_timeout: int = Field(45, ge=1, le=600)
     type_interval: float = Field(0.02, ge=0.0, le=1.0)
     click_pause: float = Field(0.10, ge=0.0, le=2.0)
+    reasoning_effort: str = Field("medium", pattern="^(low|medium|high)$")
+    screen_context_decay_steps: int = Field(4, ge=0, le=50)
     disabled_tools: list[str] = Field(default_factory=list)
     safety_override: bool = False
     no_failsafe: bool = False
 
 
+def _safe_int(value: Any) -> int | None:
+    try:
+        return int(value)
+    except Exception:  # noqa: BLE001
+        return None
+
+
+def _safe_text(value: Any, limit: int = 180) -> str:
+    text = str(value or "").strip()
+    if len(text) <= limit:
+        return text
+    return f"{text[:limit]}..."
+
+
+def _resolve_artifact_path(artifacts_dir: Path | None, path_raw: Any) -> Path | None:
+    if artifacts_dir is None:
+        return None
+    text = str(path_raw or "").strip()
+    if not text:
+        return None
+    candidate = Path(text).resolve()
+    try:
+        candidate.relative_to(artifacts_dir)
+    except ValueError:
+        return None
+    return candidate
+
+
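`_resolve_artifact_path` guards against paths that escape the job's artifacts directory by resolving the candidate and checking it with `Path.relative_to`. A minimal standalone sketch of that containment check follows; the function name `resolve_inside` and the demo directory are hypothetical, and the sketch resolves the base directory itself, whereas the diff relies on the caller having done so.

```python
from pathlib import Path
from typing import Optional


def resolve_inside(base_dir: Path, raw_path: str) -> Optional[Path]:
    """Return the resolved path only if it lies under base_dir, otherwise None."""
    text = (raw_path or "").strip()
    if not text:
        return None
    candidate = Path(text).resolve()  # normalizes ".." segments and symlinks
    try:
        candidate.relative_to(base_dir.resolve())
    except ValueError:
        return None  # candidate is outside base_dir
    return candidate


base = Path("artifacts_demo").resolve()  # hypothetical directory; it need not exist
print(resolve_inside(base, str(base / "shots" / "screen_step_001.png")))
print(resolve_inside(base, str(base / ".." / "secrets.txt")))  # None
```

Resolving before the check is what makes `..` segments harmless: the comparison is done on the normalized absolute path, not on the raw string stored in the event.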
|
def _extract_replay_action(
|
||||||
|
event: dict[str, Any],
|
||||||
|
pending_tool_args: dict[tuple[int, str], list[dict[str, Any]]],
|
||||||
|
) -> dict[str, Any] | None:
|
||||||
|
event_type = str(event.get("event_type") or "")
|
||||||
|
payload = event.get("payload") if isinstance(event.get("payload"), dict) else {}
|
||||||
|
step = int(event.get("step") or 0)
|
||||||
|
ts = str(event.get("ts") or "")
|
||||||
|
event_id = int(event.get("id") or 0)
|
||||||
|
|
||||||
|
if event_type == "tool_called":
|
||||||
|
tool = str(payload.get("tool") or "").strip()
|
||||||
|
args = payload.get("args") if isinstance(payload.get("args"), dict) else {}
|
||||||
|
if tool:
|
||||||
|
pending_tool_args.setdefault((step, tool), []).append(args)
|
||||||
|
        action: dict[str, Any] = {
            "ts": ts,
            "step": step,
            "event_id": event_id,
            "kind": "tool_called",
            "tool": tool,
            "label": f"Call: {tool}" if tool else "Tool call",
        }
        if tool == "click":
            coord = args.get("coordinate") if isinstance(args, dict) else None
            if isinstance(coord, dict):
                x = _safe_int(coord.get("x"))
                y = _safe_int(coord.get("y"))
                if x is not None and y is not None:
                    action["requested_click"] = {"x": x, "y": y}
                    action["label"] = f"Call: click ({x}, {y})"
        elif tool == "type":
            text = _safe_text((args or {}).get("text"), 120)
            if text:
                action["text_preview"] = text
                action["label"] = f"Call: type \"{text}\""
        return action

    if event_type == "tool_result":
        tool = str(payload.get("tool") or "").strip()
        result = payload.get("result") if isinstance(payload.get("result"), dict) else {}
        matching_args: dict[str, Any] = {}
        key = (step, tool)
        queued = pending_tool_args.get(key) or []
        if queued:
            matching_args = queued.pop(0)
            if not queued:
                pending_tool_args.pop(key, None)

        action = {
            "ts": ts,
            "step": step,
            "event_id": event_id,
            "kind": "tool_result",
            "tool": tool,
            "ok": bool(result.get("ok")),
            "label": f"Result: {tool}",
        }
        if tool == "click":
            clicked = result.get("clicked") if isinstance(result.get("clicked"), dict) else {}
            x = _safe_int(clicked.get("x"))
            y = _safe_int(clicked.get("y"))
            if x is not None and y is not None:
                action["click"] = {"x": x, "y": y}
                action["label"] = f"Clicked ({x}, {y})" if bool(result.get("ok")) else f"Click failed ({x}, {y})"
        elif tool == "type":
            text = _safe_text((matching_args or {}).get("text"), 120)
            typed_length = _safe_int(result.get("typed_length"))
            if typed_length is not None:
                action["typed_length"] = typed_length
            if text:
                action["text_preview"] = text
                action["label"] = f"Typed \"{text}\""
        elif tool == "press_key":
            key_name = _safe_text(result.get("key"), 80)
            if key_name:
                action["label"] = f"Pressed {key_name}"
        elif tool == "execute_command":
            command = _safe_text((matching_args or {}).get("command"), 140)
            if command:
                action["command_preview"] = command
                action["label"] = f"Command: {command}"
        return action

    return None
||||||
|
|
||||||
|
|
def _build_replay_payload(job_id: str, job: dict[str, Any], events: list[dict[str, Any]]) -> dict[str, Any]:
    artifacts_dir_raw = str(job.get("artifacts_dir") or "").strip()
    artifacts_dir = Path(artifacts_dir_raw).resolve() if artifacts_dir_raw else None
    pending_tool_args: dict[tuple[int, str], list[dict[str, Any]]] = {}
    buffered_actions: list[dict[str, Any]] = []
    frames: list[dict[str, Any]] = []

    for event in events:
        action = _extract_replay_action(event, pending_tool_args)
        if action is not None:
            buffered_actions.append(action)

        if str(event.get("event_type") or "") != "visual_update":
            continue
        payload = event.get("payload") if isinstance(event.get("payload"), dict) else {}
        image_meta = payload.get("image_meta") if isinstance(payload.get("image_meta"), dict) else {}
        resolved = _resolve_artifact_path(artifacts_dir, image_meta.get("path"))
        if resolved is None or not resolved.exists() or not resolved.is_file():
            continue

        width = _safe_int(image_meta.get("width"))
        height = _safe_int(image_meta.get("height"))
        if width is None or height is None:
            size = image_meta.get("screen_size") if isinstance(image_meta.get("screen_size"), dict) else {}
            width = _safe_int(size.get("width"))
            height = _safe_int(size.get("height"))
        is_fullscreen = (
            str(payload.get("kind") or "") == "see_screen"
            and bool(image_meta.get("grid"))
            and isinstance(width, int)
            and isinstance(height, int)
            and width > 0
            and height > 0
        )

        frames.append(
            {
                "frame_index": len(frames),
                "event_id": int(event.get("id") or 0),
                "ts": str(event.get("ts") or ""),
                "step": int(event.get("step") or 0),
                "kind": str(payload.get("kind") or "visual_update"),
                "image_path": str(resolved),
                "image_meta": image_meta,
                "screen_size": {"width": width, "height": height} if width and height else None,
                "is_fullscreen": is_fullscreen,
                "overlays": buffered_actions,
            }
        )
        buffered_actions = []

    return {
        "job_id": job_id,
        "total_events": len(events),
        "total_frames": len(frames),
        "frames": frames,
        "trailing_events": buffered_actions,
    }
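The buffering done by `_build_replay_payload` may be easier to follow in a reduced form. The sketch below is not the code from this change: it assumes simplified events that carry a hypothetical `type` field (`"action"` or `"frame"`) instead of the real `event_type`/`payload` structure, and it skips artifact resolution entirely. `group_overlays` is an illustrative name.

```python
from typing import Any


def group_overlays(events: list[dict[str, Any]]) -> dict[str, Any]:
    """Attach buffered actions to the next frame, as the replay loop does."""
    buffered: list[dict[str, Any]] = []
    frames: list[dict[str, Any]] = []
    for event in events:
        if event.get("type") == "action":
            # Actions wait in the buffer until a frame arrives.
            buffered.append(event)
            continue
        if event.get("type") != "frame":
            continue
        frames.append({"frame_index": len(frames), "overlays": buffered})
        # Start a fresh buffer so the next frame only sees later actions.
        buffered = []
    # Actions after the last frame are reported separately.
    return {"frames": frames, "trailing_events": buffered}


sample = [
    {"type": "action", "label": "Call: click (10, 20)"},
    {"type": "frame"},
    {"type": "action", "label": "Typed \"hi\""},
]
result = group_overlays(sample)
print(len(result["frames"]), len(result["trailing_events"]))
```

Actions that arrive after the last usable frame end up in `trailing_events`, which matches the shape of the payload returned above.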
||||||
|
|
||||||
|
|
class _WebSocketHub:
    def __init__(self) -> None:
        self._connections: set[WebSocket] = set()
||||||
@@ -126,6 +304,8 @@ def create_app(config: AppConfig | None = None) -> FastAPI:
|
|||||||
command_timeout=payload.command_timeout,
|
command_timeout=payload.command_timeout,
|
||||||
type_interval=payload.type_interval,
|
type_interval=payload.type_interval,
|
||||||
click_pause=payload.click_pause,
|
click_pause=payload.click_pause,
|
||||||
|
reasoning_effort=payload.reasoning_effort,
|
||||||
|
screen_context_decay_steps=payload.screen_context_decay_steps,
|
||||||
disabled_tools=payload.disabled_tools,
|
disabled_tools=payload.disabled_tools,
|
||||||
safety_override=payload.safety_override,
|
safety_override=payload.safety_override,
|
||||||
no_failsafe=payload.no_failsafe,
|
no_failsafe=payload.no_failsafe,
|
||||||
@@ -161,6 +341,18 @@ def create_app(config: AppConfig | None = None) -> FastAPI:
|
|||||||
raise HTTPException(status_code=404, detail="Job not found")
|
raise HTTPException(status_code=404, detail="Job not found")
|
||||||
return {"events": manager.get_events(job_id, limit=limit)}
|
return {"events": manager.get_events(job_id, limit=limit)}
|
||||||
|
|
    @app.get("/api/jobs/{job_id}/replay")
    def get_job_replay(
        job_id: str,
        limit: int = Query(default=5000, ge=1, le=5000),
        _: None = Depends(require_token),
    ) -> dict[str, Any]:
        job = manager.get_job(job_id)
        if job is None:
            raise HTTPException(status_code=404, detail="Job not found")
        events = manager.get_events(job_id, limit=limit)
        return _build_replay_payload(job_id, job, events)
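A client call to the replay endpoint might look like the sketch below. It only builds the request and does not send it. The base URL, job ID, and token are placeholders, and the `Authorization: Bearer` header is an assumption based on what the monitoring UI sends; `require_token` itself is not shown in this diff. `build_replay_request` is a hypothetical helper.

```python
import urllib.parse
import urllib.request


def build_replay_request(
    base_url: str, job_id: str, token: str, limit: int = 5000
) -> urllib.request.Request:
    """Build (but do not send) a GET request for /api/jobs/{job_id}/replay."""
    # The endpoint declares limit with ge=1, le=5000, so reject other values early.
    if not 1 <= limit <= 5000:
        raise ValueError("limit must be between 1 and 5000")
    safe_job_id = urllib.parse.quote(job_id, safe="")
    url = f"{base_url.rstrip('/')}/api/jobs/{safe_job_id}/replay?limit={limit}"
    # Assumed auth scheme: bearer token, as used by the monitoring UI.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


request = build_replay_request("http://localhost:8000", "job-123", "example-token", limit=200)
print(request.full_url)
```

Passing the result to `urllib.request.urlopen` would perform the call; an unknown job ID returns a 404 with `"Job not found"`, as in the handler above.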
||||||
|
|
||||||
@app.post("/api/jobs/{job_id}/cancel")
|
@app.post("/api/jobs/{job_id}/cancel")
|
||||||
def cancel_job(job_id: str, _: None = Depends(require_token)) -> dict[str, Any]:
|
def cancel_job(job_id: str, _: None = Depends(require_token)) -> dict[str, Any]:
|
||||||
job = manager.get_job(job_id)
|
job = manager.get_job(job_id)
|
||||||
@@ -195,11 +387,21 @@ def create_app(config: AppConfig | None = None) -> FastAPI:
|
|||||||
def stats(_: None = Depends(require_token)) -> dict[str, Any]:
|
def stats(_: None = Depends(require_token)) -> dict[str, Any]:
|
||||||
return manager.stats()
|
return manager.stats()
|
||||||
|
|
    @app.get("/api/analytics")
    def analytics(_: None = Depends(require_token)) -> dict[str, Any]:
        payload = manager.analytics()
        payload["generated_at"] = utc_now_iso()
        return payload
||||||
|
|
||||||
if not app_config.disable_ui:
|
if not app_config.disable_ui:
|
||||||
@app.get("/", response_class=HTMLResponse)
|
@app.get("/", response_class=HTMLResponse)
|
||||||
def ui_root() -> str:
|
def ui_root() -> str:
|
||||||
return monitoring_page_html(device_hostname=device_hostname)
|
return monitoring_page_html(device_hostname=device_hostname)
|
||||||
|
|
||||||
|
@app.get("/ui/monitoring.js")
|
||||||
|
def ui_monitoring_js() -> FileResponse:
|
||||||
|
return FileResponse(str(monitoring_js_path()), media_type="application/javascript")
|
||||||
|
|
||||||
@app.websocket("/ws")
|
@app.websocket("/ws")
|
||||||
async def ws_endpoint(websocket: WebSocket, token: str = Query(default="")) -> None:
|
async def ws_endpoint(websocket: WebSocket, token: str = Query(default="")) -> None:
|
||||||
if not token or not secrets.compare_digest(token, app_config.screenjob_token):
|
if not token or not secrets.compare_digest(token, app_config.screenjob_token):
|
||||||
|
|||||||
158
src/storage.py
@@ -7,6 +7,39 @@ from pathlib import Path
|
|||||||
from typing import Any
|
from typing import Any
|
||||||
|
|
||||||
|
|
_TERMINAL_STATUSES = {"completed", "failed", "cancelled"}
_CATEGORY_RULES: tuple[tuple[str, tuple[str, ...]], ...] = (
    (
        "Browser / web",
        ("browser", "website", "webpage", "chrome", "url", "amazon", "google", "login", "shopping", "checkout", "orders"),
    ),
    (
        "Files / terminal",
        ("file", "folder", "directory", "terminal", "shell", "command", "cli", "script", "git", "repo", "install", "pip", "npm", "powershell", "bash"),
    ),
    (
        "Writing / docs",
        ("write", "summary", "summarize", "document", "docs", "report", "email", "message", "readme", "markdown", "note", "proposal"),
    ),
    (
        "Data / analysis",
        ("data", "analysis", "analyze", "csv", "spreadsheet", "sheet", "table", "chart", "dashboard", "metric", "metrics", "sql"),
    ),
    (
        "Development / ops",
        ("code", "bug", "fix", "test", "debug", "api", "backend", "frontend", "database", "deploy", "docker", "service", "build"),
    ),
)


def _objective_category(objective: str) -> str:
    text = objective.lower()
    for category, keywords in _CATEGORY_RULES:
        if any(keyword in text for keyword in keywords):
            return category
    return "Other"
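`_objective_category` checks the rules in order and matches keywords as substrings, so the first category with any keyword contained in the lowercased objective wins. One consequence worth knowing is that short keywords also match inside longer words, for example `cli` inside `click`. The sketch below uses a reduced rule table and compares the substring strategy with a hypothetical whole-word variant (`categorize_by_word`), which is not part of this change.

```python
import re

# Reduced copy of the rule table, for illustration only.
RULES: tuple[tuple[str, tuple[str, ...]], ...] = (
    ("Browser / web", ("browser", "url", "login")),
    ("Files / terminal", ("file", "cli", "git")),
)


def categorize_by_substring(objective: str) -> str:
    """Same strategy as the diff: first rule with a keyword contained in the text."""
    text = objective.lower()
    for category, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            return category
    return "Other"


def categorize_by_word(objective: str) -> str:
    """Hypothetical variant: a keyword must equal a whole word in the objective."""
    words = set(re.findall(r"[a-z0-9]+", objective.lower()))
    for category, keywords in RULES:
        if any(keyword in words for keyword in keywords):
            return category
    return "Other"


objective = "Click the Save button"
print(categorize_by_substring(objective))  # "cli" is found inside "click"
print(categorize_by_word(objective))
```

Whole-word matching has its own cost: it would miss plurals and inflections such as `files`, which the substring approach catches for free.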
||||||
|
|
||||||
|
|
class HistoryDB:
    def __init__(self, db_path: Path) -> None:
        self.db_path = db_path
||||||
@@ -184,6 +217,131 @@ class HistoryDB:
|
|||||||
).fetchone()
|
).fetchone()
|
||||||
return dict(totals) if totals else {}
|
return dict(totals) if totals else {}
|
||||||
|
|
    def analytics(self) -> dict[str, Any]:
        with self._connect() as conn:
            rows = conn.execute(
                """
                SELECT job_id, objective, status, steps, estimated_cost_usd, created_at
                FROM jobs
                ORDER BY created_at ASC, job_id ASC
                """
            ).fetchall()

        total_jobs = 0
        finished_jobs = 0
        completed_jobs = 0
        failed_jobs = 0
        cancelled_jobs = 0
        steps_sum = 0
        steps_count = 0
        cost_sum = 0.0
        cost_count = 0
        by_category: dict[str, dict[str, Any]] = {}
        by_day: dict[str, dict[str, Any]] = {}

        def _bucket(target: dict[str, dict[str, Any]], key: str) -> dict[str, Any]:
            bucket = target.setdefault(
                key,
                {
                    "label": key,
                    "total_jobs": 0,
                    "finished_jobs": 0,
                    "completed_jobs": 0,
                    "failed_jobs": 0,
                    "cancelled_jobs": 0,
                    "steps_sum": 0,
                    "steps_count": 0,
                    "cost_sum": 0.0,
                    "cost_count": 0,
                },
            )
            return bucket

        for row in rows:
            total_jobs += 1
            status = str(row["status"] or "")
            finished = status in _TERMINAL_STATUSES
            completed = status == "completed"
            objective = str(row["objective"] or "")
            category = _objective_category(objective)
            created_at = str(row["created_at"] or "")
            day = created_at[:10] if len(created_at) >= 10 else created_at or "unknown"

            category_bucket = _bucket(by_category, category)
            day_bucket = _bucket(by_day, day)
            for bucket in (category_bucket, day_bucket):
                bucket["total_jobs"] += 1

            if not finished:
                continue

            finished_jobs += 1
            if completed:
                completed_jobs += 1
            elif status == "failed":
                failed_jobs += 1
            elif status == "cancelled":
                cancelled_jobs += 1

            steps = row["steps"]
            if steps is not None:
                step_value = int(steps)
                steps_sum += step_value
                steps_count += 1
                for bucket in (category_bucket, day_bucket):
                    bucket["steps_sum"] += step_value
                    bucket["steps_count"] += 1

            estimated_cost = row["estimated_cost_usd"]
            if estimated_cost is not None:
                cost_value = float(estimated_cost)
                cost_sum += cost_value
                cost_count += 1
                for bucket in (category_bucket, day_bucket):
                    bucket["cost_sum"] += cost_value
                    bucket["cost_count"] += 1

            for bucket in (category_bucket, day_bucket):
                bucket["finished_jobs"] += 1
                if completed:
                    bucket["completed_jobs"] += 1
                elif status == "failed":
                    bucket["failed_jobs"] += 1
                elif status == "cancelled":
                    bucket["cancelled_jobs"] += 1

        def _finalize(bucket: dict[str, Any]) -> dict[str, Any]:
            finished = bucket["finished_jobs"]
            return {
                "label": bucket["label"],
                "total_jobs": bucket["total_jobs"],
                "finished_jobs": finished,
                "completed_jobs": bucket["completed_jobs"],
                "failed_jobs": bucket["failed_jobs"],
                "cancelled_jobs": bucket["cancelled_jobs"],
                "success_rate": round((bucket["completed_jobs"] / finished) * 100, 2) if finished else 0.0,
                "avg_steps": round(bucket["steps_sum"] / bucket["steps_count"], 2) if bucket["steps_count"] else None,
                "avg_cost_usd": round(bucket["cost_sum"] / bucket["cost_count"], 6) if bucket["cost_count"] else None,
            }

        category_rows = [_finalize(bucket) for bucket in by_category.values()]
        category_rows.sort(key=lambda item: (-item["success_rate"], item["label"]))
        day_rows = [_finalize(bucket) for bucket in by_day.values()]
        day_rows.sort(key=lambda item: item["label"])

        return {
            "total_jobs": total_jobs,
            "finished_jobs": finished_jobs,
            "completed_jobs": completed_jobs,
            "failed_jobs": failed_jobs,
            "cancelled_jobs": cancelled_jobs,
            "success_rate": round((completed_jobs / finished_jobs) * 100, 2) if finished_jobs else 0.0,
            "avg_steps": round(steps_sum / steps_count, 2) if steps_count else None,
            "avg_cost_usd": round(cost_sum / cost_count, 6) if cost_count else None,
            "by_category": category_rows,
            "timeline": day_rows,
        }
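The success rate above uses finished jobs as the denominator: jobs that are still queued or running count toward `total_jobs` but not toward the rate, and the rate falls back to `0.0` when nothing has finished. A minimal sketch of just that rule, with a hypothetical helper name and a plain list of status strings as input:

```python
TERMINAL = {"completed", "failed", "cancelled"}


def success_rate(statuses: list[str]) -> float:
    """Percentage of finished jobs that completed; 0.0 when none have finished."""
    finished = [status for status in statuses if status in TERMINAL]
    if not finished:
        return 0.0
    completed = sum(1 for status in finished if status == "completed")
    return round(completed / len(finished) * 100, 2)


# The running job is ignored: 2 completed out of 3 finished.
print(success_rate(["completed", "failed", "running", "completed"]))  # 66.67
```

Cancelled jobs are treated as finished, so they lower the rate in the same way as failures.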
||||||
|
|
||||||
def _row_to_job(self, row: sqlite3.Row) -> dict[str, Any]:
|
def _row_to_job(self, row: sqlite3.Row) -> dict[str, Any]:
|
||||||
disabled_tools: list[str] = []
|
disabled_tools: list[str] = []
|
||||||
try:
|
try:
|
||||||
|
|||||||
@@ -48,6 +48,8 @@ class JobManager:
|
|||||||
command_timeout: int = 45,
|
command_timeout: int = 45,
|
||||||
type_interval: float = 0.02,
|
type_interval: float = 0.02,
|
||||||
click_pause: float = 0.10,
|
click_pause: float = 0.10,
|
||||||
|
reasoning_effort: str = "medium",
|
||||||
|
screen_context_decay_steps: int = 4,
|
||||||
disabled_tools: list[str] | None = None,
|
disabled_tools: list[str] | None = None,
|
||||||
safety_override: bool = False,
|
safety_override: bool = False,
|
||||||
no_failsafe: bool = False,
|
no_failsafe: bool = False,
|
||||||
@@ -93,6 +95,8 @@ class JobManager:
|
|||||||
"command_timeout": command_timeout,
|
"command_timeout": command_timeout,
|
||||||
"type_interval": type_interval,
|
"type_interval": type_interval,
|
||||||
"click_pause": click_pause,
|
"click_pause": click_pause,
|
||||||
|
"reasoning_effort": reasoning_effort,
|
||||||
|
"screen_context_decay_steps": screen_context_decay_steps,
|
||||||
"no_failsafe": no_failsafe,
|
"no_failsafe": no_failsafe,
|
||||||
"cancel_event": cancel_event,
|
"cancel_event": cancel_event,
|
||||||
},
|
},
|
||||||
@@ -121,6 +125,8 @@ class JobManager:
|
|||||||
command_timeout: int,
|
command_timeout: int,
|
||||||
type_interval: float,
|
type_interval: float,
|
||||||
click_pause: float,
|
click_pause: float,
|
||||||
|
reasoning_effort: str,
|
||||||
|
screen_context_decay_steps: int,
|
||||||
no_failsafe: bool,
|
no_failsafe: bool,
|
||||||
cancel_event: threading.Event,
|
cancel_event: threading.Event,
|
||||||
) -> None:
|
) -> None:
|
||||||
@@ -218,6 +224,8 @@ class JobManager:
|
|||||||
command_timeout=command_timeout,
|
command_timeout=command_timeout,
|
||||||
type_interval=type_interval,
|
type_interval=type_interval,
|
||||||
click_pause=click_pause,
|
click_pause=click_pause,
|
||||||
|
reasoning_effort=reasoning_effort,
|
||||||
|
screen_context_decay_steps=max(0, int(screen_context_decay_steps)),
|
||||||
disable_tools=set(disabled_tools),
|
disable_tools=set(disabled_tools),
|
||||||
)
|
)
|
||||||
try:
|
try:
|
||||||
@@ -343,6 +351,9 @@ class JobManager:
|
|||||||
stats["live_running_threads"] = sum(1 for job in self._running.values() if job.thread.is_alive())
|
stats["live_running_threads"] = sum(1 for job in self._running.values() if job.thread.is_alive())
|
||||||
return stats
|
return stats
|
||||||
|
|
    def analytics(self) -> dict[str, Any]:
        return self.db.analytics()
||||||
|
|
||||||
def _normalize_job_payload(self, job: dict[str, Any]) -> dict[str, Any]:
|
def _normalize_job_payload(self, job: dict[str, Any]) -> dict[str, Any]:
|
||||||
response = job.get("response")
|
response = job.get("response")
|
||||||
if not isinstance(response, dict):
|
if not isinstance(response, dict):
|
||||||
|
|||||||
310
src/ui.py
@@ -1,307 +1,19 @@
|
|||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
from html import escape
|
from html import escape
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
|
||||||
|
_UI_DIR = Path(__file__).resolve().parent / "ui_assets"
|
||||||
|
_HTML_TEMPLATE_PATH = _UI_DIR / "monitoring.html"
|
||||||
|
_JS_PATH = _UI_DIR / "monitoring.js"
|
||||||
|
|
||||||
|
|
||||||
def monitoring_page_html(device_hostname: str = "") -> str:
|
def monitoring_page_html(device_hostname: str = "") -> str:
|
||||||
host_suffix = f" ({escape(device_hostname)})" if device_hostname else ""
|
host_suffix = f" ({escape(device_hostname)})" if device_hostname else ""
|
||||||
return """<!doctype html>
|
html = _HTML_TEMPLATE_PATH.read_text(encoding="utf-8")
|
||||||
<html lang="en">
|
return html.replace("__MONITOR_HOST__", host_suffix)
|
||||||
<head>
|
|
||||||
<meta charset="utf-8" />
|
|
||||||
<meta name="viewport" content="width=device-width, initial-scale=1" />
|
|
||||||
<title>ScreenJob Monitor</title>
|
|
||||||
<script src="https://cdn.tailwindcss.com"></script>
|
|
||||||
</head>
|
|
||||||
<body class="bg-slate-950 text-slate-100 min-h-screen">
|
|
||||||
<div class="max-w-7xl mx-auto p-4 md:p-8 space-y-6">
|
|
||||||
<header class="flex flex-col gap-3 md:flex-row md:items-center md:justify-between">
|
|
||||||
<div>
|
|
||||||
<h1 class="text-2xl md:text-3xl font-bold tracking-tight">ScreenJob Monitor<span class="text-slate-400 text-base md:text-lg font-medium">__MONITOR_HOST__</span></h1>
|
|
||||||
<p class="text-slate-400 text-sm">Read-only monitoring for active and historical tasks.</p>
|
|
||||||
</div>
|
|
||||||
<div class="flex flex-col md:flex-row gap-2 md:items-center">
|
|
||||||
<input id="tokenInput" type="password" placeholder="SCREENJOB_TOKEN" class="bg-slate-900 border border-slate-700 rounded px-3 py-2 text-sm w-72" />
|
|
||||||
<button id="saveTokenBtn" class="bg-cyan-500 hover:bg-cyan-400 text-slate-950 font-semibold px-4 py-2 rounded">Connect</button>
|
|
||||||
</div>
|
|
||||||
</header>
|
|
||||||
|
|
||||||
<section class="grid grid-cols-2 md:grid-cols-6 gap-3" id="stats"></section>
|
|
||||||
|
|
||||||
<section class="grid grid-cols-1 lg:grid-cols-5 gap-4">
|
def monitoring_js_path() -> Path:
|
||||||
<div class="lg:col-span-2 bg-slate-900/70 border border-slate-800 rounded-xl p-4">
|
return _JS_PATH
|
||||||
<div class="flex items-center justify-between mb-3">
|
|
||||||
<h2 class="font-semibold">Jobs</h2>
|
|
||||||
<button id="refreshBtn" class="text-xs bg-slate-800 px-2 py-1 rounded">Refresh</button>
|
|
||||||
</div>
|
|
||||||
<div id="jobList" class="space-y-2 max-h-[62vh] overflow-auto"></div>
|
|
||||||
</div>
|
|
||||||
|
|
||||||
<div class="lg:col-span-3 bg-slate-900/70 border border-slate-800 rounded-xl p-4 space-y-3">
|
|
||||||
<h2 class="font-semibold">Job Detail</h2>
|
|
||||||
<pre id="jobDetail" class="bg-slate-950 border border-slate-800 rounded p-3 text-xs overflow-auto max-h-[24vh]"></pre>
|
|
||||||
<h3 class="font-semibold text-sm">Latest Visual</h3>
|
|
||||||
<div class="bg-slate-950 border border-slate-800 rounded p-2">
|
|
||||||
<img id="latestVisual" alt="Latest visual update" class="max-h-[24vh] w-full object-contain rounded" />
|
|
||||||
</div>
|
|
||||||
<div class="flex items-center justify-between">
|
|
||||||
<h3 class="font-semibold text-sm">Live Events</h3>
|
|
||||||
<label for="eventsViewToggle" class="flex items-center gap-2 text-xs text-slate-300 cursor-pointer select-none">
|
|
||||||
<span>Raw</span>
|
|
||||||
<input id="eventsViewToggle" type="checkbox" class="accent-cyan-400 h-4 w-4" />
|
|
||||||
<span>Beautiful</span>
|
|
||||||
</label>
|
|
||||||
</div>
|
|
||||||
<div id="events" class="bg-slate-950 border border-slate-800 rounded p-3 text-xs overflow-auto max-h-[36vh] space-y-1"></div>
|
|
||||||
</div>
|
|
||||||
</section>
|
|
||||||
</div>
|
|
||||||
|
|
||||||
<script>
|
|
||||||
const tokenInput = document.getElementById("tokenInput");
|
|
||||||
const saveTokenBtn = document.getElementById("saveTokenBtn");
|
|
||||||
const refreshBtn = document.getElementById("refreshBtn");
|
|
||||||
const jobListEl = document.getElementById("jobList");
|
|
||||||
const jobDetailEl = document.getElementById("jobDetail");
|
|
||||||
const eventsEl = document.getElementById("events");
|
|
||||||
const statsEl = document.getElementById("stats");
|
|
||||||
const latestVisualEl = document.getElementById("latestVisual");
|
|
||||||
const eventsViewToggle = document.getElementById("eventsViewToggle");
|
|
||||||
|
|
||||||
const state = {
|
|
||||||
token: localStorage.getItem("screenjob_token") || "",
|
|
||||||
jobs: [],
|
|
||||||
selectedJobId: null,
|
|
||||||
ws: null,
|
|
||||||
wsReconnectTimer: null,
|
|
||||||
eventsViewMode: localStorage.getItem("screenjob_events_view_mode") === "beautiful" ? "beautiful" : "raw"
|
|
||||||
};
|
|
||||||
const manuallyClosedSockets = new WeakSet();
|
|
||||||
tokenInput.value = state.token;
|
|
||||||
|
|
||||||
function authHeaders() {
|
|
||||||
return { "Authorization": "Bearer " + state.token };
|
|
||||||
}
|
|
||||||
|
|
||||||
async function api(path, opts = {}) {
|
|
||||||
if (!state.token) throw new Error("Token required");
|
|
||||||
const headers = Object.assign({}, authHeaders(), opts.headers || {});
|
|
||||||
const response = await fetch(path, Object.assign({}, opts, { headers }));
|
|
||||||
if (!response.ok) throw new Error(await response.text());
|
|
||||||
return response.json();
|
|
||||||
}
|
|
||||||
|
|
||||||
function renderStats(stats) {
|
|
||||||
const cards = [
|
|
||||||
["Total Jobs", stats.total_jobs || 0],
|
|
||||||
["Running", stats.running_jobs || 0],
|
|
||||||
["Completed", stats.completed_jobs || 0],
|
|
||||||
["Failed", stats.failed_jobs || 0],
|
|
||||||
["Cancelled", stats.cancelled_jobs || 0],
|
|
||||||
["Total Cost (USD)", Number(stats.total_estimated_cost || 0).toFixed(4)]
|
|
||||||
];
|
|
||||||
statsEl.innerHTML = cards.map(([name, val]) => `
|
|
||||||
<div class="bg-slate-900/70 border border-slate-800 rounded-xl p-3">
|
|
||||||
<div class="text-slate-400 text-xs">${name}</div>
|
|
||||||
<div class="text-lg font-semibold">${val}</div>
|
|
||||||
</div>
|
|
||||||
`).join("");
|
|
||||||
}
|
|
||||||
|
|
||||||
function renderJobs() {
|
|
||||||
jobListEl.innerHTML = state.jobs.map((job) => {
|
|
||||||
const active = job.job_id === state.selectedJobId;
|
|
||||||
return `
|
|
||||||
<button data-job-id="${job.job_id}" class="w-full text-left p-3 rounded border ${active ? "border-cyan-400 bg-slate-800" : "border-slate-800 bg-slate-950"} hover:bg-slate-800">
|
|
||||||
<div class="flex items-center justify-between">
|
|
||||||
<span class="font-medium">${job.job_id}</span>
|
|
||||||
<span class="text-xs px-2 py-0.5 rounded bg-slate-700">${job.status}</span>
|
|
||||||
</div>
|
|
||||||
<div class="text-xs text-slate-400 mt-1">${job.model}</div>
|
|
||||||
<div class="text-xs text-slate-300 mt-1 line-clamp-2">${job.objective}</div>
|
|
||||||
<div class="text-xs text-slate-500 mt-1">$${Number((job.usage && job.usage.estimated_cost_usd) || 0).toFixed(6)}</div>
|
|
||||||
</button>
|
|
||||||
`;
|
|
||||||
}).join("");
|
|
||||||
for (const btn of jobListEl.querySelectorAll("button[data-job-id]")) {
|
|
||||||
btn.addEventListener("click", () => {
|
|
||||||
state.selectedJobId = btn.getAttribute("data-job-id");
|
|
||||||
renderJobs();
|
|
||||||
refreshJobDetail();
|
|
||||||
});
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
function pushEventLine(obj) {
|
|
||||||
if (!obj || !obj.job_id || !obj.event_type) return;
|
|
||||||
const line = document.createElement("div");
|
|
||||||
const ts = obj.ts || "-";
|
|
||||||
const step = (obj.step ?? "-");
|
|
||||||
if (state.eventsViewMode === "raw") {
|
|
||||||
line.className = "border-b border-slate-800 pb-1";
|
|
||||||
line.textContent = `[${ts}] ${obj.job_id} step=${step} ${obj.event_type} ${JSON.stringify(obj.payload || {})}`;
|
|
||||||
} else {
|
|
||||||
const typeColors = {
|
|
||||||
info: "bg-sky-900/50 text-sky-200 border border-sky-800",
|
|
||||||
warning: "bg-amber-900/40 text-amber-200 border border-amber-800",
|
|
||||||
error: "bg-rose-900/40 text-rose-200 border border-rose-800",
|
|
||||||
visual_update: "bg-emerald-900/40 text-emerald-200 border border-emerald-800",
|
|
||||||
tool_call: "bg-violet-900/40 text-violet-200 border border-violet-800",
|
|
||||||
tool_result: "bg-indigo-900/40 text-indigo-200 border border-indigo-800"
|
|
||||||
};
|
|
||||||
const dt = new Date(ts);
|
|
||||||
const tsText = Number.isNaN(dt.getTime()) ? ts : dt.toLocaleString();
|
|
||||||
const payload = obj.payload || {};
|
|
||||||
|
|
||||||
line.className = "rounded-lg border border-slate-800 bg-slate-900/80 p-2 space-y-2";
|
|
||||||
const header = document.createElement("div");
|
|
||||||
header.className = "flex flex-wrap items-center gap-2";
|
|
||||||
|
|
||||||
const typePill = document.createElement("span");
|
|
||||||
typePill.className = `px-2 py-0.5 rounded text-[10px] font-semibold ${typeColors[obj.event_type] || "bg-slate-800 text-slate-200 border border-slate-700"}`;
|
|
||||||
typePill.textContent = obj.event_type;
|
|
||||||
|
|
||||||
const stepPill = document.createElement("span");
|
|
||||||
stepPill.className = "px-2 py-0.5 rounded text-[10px] bg-slate-800 text-slate-300 border border-slate-700";
|
|
||||||
stepPill.textContent = `step ${step}`;
|
|
||||||
|
|
||||||
const tsSpan = document.createElement("span");
|
|
||||||
tsSpan.className = "text-[10px] text-slate-400";
|
|
||||||
tsSpan.textContent = tsText;
|
|
||||||
|
|
||||||
header.appendChild(typePill);
|
|
||||||
header.appendChild(stepPill);
|
|
||||||
header.appendChild(tsSpan);
|
|
||||||
|
|
||||||
const jobLine = document.createElement("div");
|
|
||||||
jobLine.className = "text-[11px] text-slate-300 font-medium";
|
|
||||||
jobLine.textContent = obj.job_id;
|
|
||||||
|
|
||||||
const body = document.createElement("pre");
|
|
||||||
body.className = "bg-slate-950 border border-slate-800 rounded p-2 text-[11px] text-slate-200 overflow-auto";
|
|
||||||
body.textContent = JSON.stringify(payload, null, 2);
|
|
||||||
|
|
||||||
line.appendChild(header);
|
|
||||||
line.appendChild(jobLine);
|
|
||||||
line.appendChild(body);
|
|
||||||
}
|
|
||||||
eventsEl.prepend(line);
|
|
||||||
while (eventsEl.childNodes.length > 400) {
|
|
||||||
eventsEl.removeChild(eventsEl.lastChild);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
function scheduleWsReconnect() {
|
|
||||||
if (state.wsReconnectTimer || !state.token) return;
|
|
||||||
state.wsReconnectTimer = setTimeout(() => {
|
|
||||||
state.wsReconnectTimer = null;
|
|
||||||
connectWs();
|
|
||||||
}, 1200);
|
|
||||||
}
|
|
||||||
|
|
||||||
function updateLatestVisualFromEvent(ev) {
|
|
||||||
if (!ev || ev.event_type !== "visual_update") return;
|
|
||||||
if (!state.selectedJobId || ev.job_id !== state.selectedJobId) return;
|
|
||||||
const imagePath = ev.payload && ev.payload.image_meta && ev.payload.image_meta.path;
|
|
||||||
if (!imagePath) return;
|
|
||||||
const q = encodeURIComponent(imagePath);
|
|
||||||
latestVisualEl.src = `/api/jobs/${state.selectedJobId}/artifact?path=${q}&token=${encodeURIComponent(state.token)}`;
|
|
||||||
}
|
|
||||||
|
|
||||||
async function refreshJobs() {
|
|
||||||
const payload = await api("/api/jobs?limit=100");
|
|
||||||
state.jobs = payload.jobs || [];
|
|
||||||
if (!state.selectedJobId && state.jobs.length > 0) state.selectedJobId = state.jobs[0].job_id;
|
|
||||||
renderJobs();
|
|
||||||
}
|
|
||||||
|
|
||||||
async function refreshStats() {
|
|
||||||
const payload = await api("/api/stats");
|
|
||||||
renderStats(payload);
|
|
||||||
}
|
|
||||||
|
|
||||||
async function refreshJobDetail() {
|
|
||||||
if (!state.selectedJobId) return;
|
|
||||||
const [job, events] = await Promise.all([
|
|
||||||
api(`/api/jobs/${state.selectedJobId}`),
|
|
||||||
api(`/api/jobs/${state.selectedJobId}/events?limit=120`)
|
|
||||||
]);
|
|
||||||
jobDetailEl.textContent = JSON.stringify(job, null, 2);
|
|
||||||
eventsEl.innerHTML = "";
|
|
||||||
const list = (events.events || []).slice().reverse();
|
|
||||||
for (const ev of list) pushEventLine(ev);
|
|
||||||
const visual = list.find((ev) => ev.event_type === "visual_update");
|
|
||||||
if (visual) updateLatestVisualFromEvent(visual);
|
|
||||||
}
|
|
||||||
|
|
||||||
function connectWs() {
|
|
||||||
if (!state.token) return;
|
|
||||||
if (state.ws && (state.ws.readyState === WebSocket.OPEN || state.ws.readyState === WebSocket.CONNECTING)) {
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
const scheme = location.protocol === "https:" ? "wss" : "ws";
|
|
||||||
const ws = new WebSocket(`${scheme}://${location.host}/ws?token=${encodeURIComponent(state.token)}`);
|
|
||||||
state.ws = ws;
|
|
||||||
ws.onmessage = async (event) => {
|
|
||||||
try {
|
|
||||||
const payload = JSON.parse(event.data);
|
|
||||||
if (!payload || payload.event_type === "connected") return;
|
|
||||||
pushEventLine(payload);
|
|
||||||
updateLatestVisualFromEvent(payload);
|
|
||||||
if (!state.selectedJobId || payload.job_id === state.selectedJobId) {
|
|
||||||
await refreshJobDetail();
|
|
||||||
}
|
|
||||||
await refreshJobs();
|
|
||||||
await refreshStats();
|
|
||||||
} catch (err) {
|
|
||||||
console.error(err);
|
|
||||||
}
|
|
||||||
};
|
|
||||||
ws.onclose = () => {
|
|
||||||
if (state.ws === ws) state.ws = null;
|
|
||||||
if (manuallyClosedSockets.has(ws)) {
|
|
||||||
manuallyClosedSockets.delete(ws);
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
scheduleWsReconnect();
|
|
||||||
};
|
|
||||||
}
|
|
||||||
|
|
||||||
async function fullRefresh() {
|
|
||||||
await refreshJobs();
|
|
||||||
await refreshStats();
|
|
||||||
await refreshJobDetail();
|
|
||||||
}
|
|
||||||
|
|
||||||
async function connect() {
|
|
||||||
state.token = tokenInput.value.trim();
|
|
||||||
localStorage.setItem("screenjob_token", state.token);
|
|
||||||
if (state.ws) {
|
|
||||||
manuallyClosedSockets.add(state.ws);
|
|
||||||
try { state.ws.close(); } catch (_) {}
|
|
||||||
state.ws = null;
|
|
||||||
}
|
|
||||||
if (state.wsReconnectTimer) {
|
|
||||||
clearTimeout(state.wsReconnectTimer);
|
|
||||||
state.wsReconnectTimer = null;
|
|
||||||
}
|
|
||||||
await fullRefresh();
|
|
||||||
connectWs();
|
|
||||||
}
|
|
||||||
|
|
||||||
function syncEventsViewToggle() {
|
|
||||||
eventsViewToggle.checked = state.eventsViewMode === "beautiful";
|
|
||||||
}
|
|
||||||
|
|
||||||
saveTokenBtn.addEventListener("click", () => connect().catch((err) => alert(err.message)));
|
|
||||||
refreshBtn.addEventListener("click", () => fullRefresh().catch((err) => alert(err.message)));
|
|
||||||
eventsViewToggle.addEventListener("change", () => {
|
|
||||||
state.eventsViewMode = eventsViewToggle.checked ? "beautiful" : "raw";
|
|
||||||
localStorage.setItem("screenjob_events_view_mode", state.eventsViewMode);
|
|
||||||
refreshJobDetail().catch((err) => alert(err.message));
|
|
||||||
});
|
|
||||||
syncEventsViewToggle();
|
|
||||||
if (state.token) connect().catch(() => {});
|
|
||||||
</script>
|
|
||||||
</body>
|
|
||||||
</html>
|
|
||||||
""".replace("__MONITOR_HOST__", host_suffix)
106
src/ui_assets/monitoring.html
Normal file
@@ -0,0 +1,106 @@
|
|||||||
|
<!doctype html>
|
||||||
|
<html lang="en">
|
||||||
|
<head>
|
||||||
|
<meta charset="utf-8" />
|
||||||
|
<meta name="viewport" content="width=device-width, initial-scale=1" />
|
||||||
|
<title>ScreenJob Monitor</title>
|
||||||
|
<script src="https://cdn.tailwindcss.com"></script>
|
||||||
|
</head>
|
||||||
|
<body class="bg-slate-950 text-slate-100 min-h-screen">
|
||||||
|
<div class="max-w-7xl mx-auto p-4 md:p-8 space-y-6">
|
||||||
|
<header class="flex flex-col gap-3 md:flex-row md:items-center md:justify-between">
|
||||||
|
<div>
|
||||||
|
<h1 class="text-2xl md:text-3xl font-bold tracking-tight">ScreenJob Monitor<span class="text-slate-400 text-base md:text-lg font-medium">__MONITOR_HOST__</span></h1>
|
||||||
|
<p class="text-slate-400 text-sm">Read-only monitoring for active and historical tasks.</p>
|
||||||
|
</div>
|
||||||
|
<div class="flex flex-col md:flex-row gap-2 md:items-center">
|
||||||
|
<input id="tokenInput" type="password" placeholder="SCREENJOB_TOKEN" class="bg-slate-900 border border-slate-700 rounded px-3 py-2 text-sm w-72" />
|
||||||
|
<button id="saveTokenBtn" class="bg-cyan-500 hover:bg-cyan-400 text-slate-950 font-semibold px-4 py-2 rounded">Connect</button>
|
||||||
|
</div>
|
||||||
|
</header>
|
||||||
|
|
||||||
|
<section class="grid grid-cols-2 md:grid-cols-6 gap-3" id="stats"></section>
|
||||||
|
|
||||||
|
<section class="space-y-3">
|
||||||
|
<div class="flex items-center justify-between gap-3">
|
||||||
|
<h2 class="font-semibold">Analytics</h2>
|
||||||
|
<div id="analyticsMeta" class="text-[11px] text-slate-400"></div>
|
||||||
|
</div>
|
||||||
|
<div id="analyticsSummary" class="grid grid-cols-2 md:grid-cols-4 gap-3"></div>
|
||||||
|
<div class="grid grid-cols-1 xl:grid-cols-2 gap-4">
|
||||||
|
<div class="bg-slate-900/70 border border-slate-800 rounded-xl p-4 space-y-3">
|
||||||
|
<div class="flex items-center justify-between gap-3">
|
||||||
|
<h3 class="font-semibold text-sm">Success by Objective Category</h3>
|
||||||
|
<div id="analyticsCategorySummary" class="text-[11px] text-slate-400"></div>
|
||||||
|
</div>
|
||||||
|
<div id="analyticsCategories" class="space-y-3"></div>
|
||||||
|
</div>
|
||||||
|
<div class="bg-slate-900/70 border border-slate-800 rounded-xl p-4 space-y-3">
|
||||||
|
<div class="flex items-center justify-between gap-3">
|
||||||
|
<h3 class="font-semibold text-sm">Avg Steps / Cost Over Time</h3>
|
||||||
|
<div id="analyticsTrendSummary" class="text-[11px] text-slate-400"></div>
|
||||||
|
</div>
|
||||||
|
<div id="analyticsTrends" class="space-y-4"></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</section>
|
||||||
|
|
||||||
|
<section class="grid grid-cols-1 lg:grid-cols-5 gap-4">
|
||||||
|
<div class="lg:col-span-2 bg-slate-900/70 border border-slate-800 rounded-xl p-4">
|
||||||
|
<div class="flex items-center justify-between mb-3">
|
||||||
|
<h2 class="font-semibold">Jobs</h2>
|
||||||
|
<button id="refreshBtn" class="text-xs bg-slate-800 px-2 py-1 rounded">Refresh</button>
|
||||||
|
</div>
|
||||||
|
<div id="jobList" class="space-y-2 max-h-[62vh] overflow-auto"></div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="lg:col-span-3 bg-slate-900/70 border border-slate-800 rounded-xl p-4 space-y-3">
|
||||||
|
<h2 class="font-semibold">Job Detail</h2>
|
||||||
|
<pre id="jobDetail" class="bg-slate-950 border border-slate-800 rounded p-3 text-xs overflow-auto max-h-[24vh]"></pre>
|
||||||
|
<h3 class="font-semibold text-sm">Latest Visual</h3>
|
||||||
|
<div class="bg-slate-950 border border-slate-800 rounded p-2">
|
||||||
|
<img id="latestVisual" alt="Latest visual update" class="max-h-[24vh] w-full object-contain rounded" />
|
||||||
|
</div>
|
||||||
|
<div class="flex items-center justify-between">
|
||||||
|
<h3 class="font-semibold text-sm">Replay</h3>
|
||||||
|
<div id="replayStatus" class="text-[11px] text-slate-400">No replay loaded.</div>
|
||||||
|
</div>
|
||||||
|
<div class="flex flex-wrap items-center gap-2">
|
||||||
|
<button id="replayPlayBtn" class="text-xs bg-slate-800 px-2 py-1 rounded">Play</button>
|
||||||
|
<button id="replayPrevBtn" class="text-xs bg-slate-800 px-2 py-1 rounded">Prev</button>
|
||||||
|
<button id="replayNextBtn" class="text-xs bg-slate-800 px-2 py-1 rounded">Next</button>
|
||||||
|
<label class="text-xs text-slate-300 flex items-center gap-1">
|
||||||
|
Speed
|
||||||
|
<select id="replaySpeed" class="bg-slate-900 border border-slate-700 rounded px-1 py-0.5">
|
||||||
|
<option value="0.5">0.5x</option>
|
||||||
|
<option value="1" selected>1.0x</option>
|
||||||
|
<option value="1.5">1.5x</option>
|
||||||
|
<option value="2">2.0x</option>
|
||||||
|
</select>
|
||||||
|
</label>
|
||||||
|
</div>
|
||||||
|
<input id="replaySeek" type="range" min="0" max="0" value="0" class="w-full accent-cyan-400" />
|
||||||
|
<div class="bg-slate-950 border border-slate-800 rounded p-2">
|
||||||
|
<div class="relative w-full min-h-[180px] bg-black/40 rounded">
|
||||||
|
<img id="replayVisual" alt="Replay frame" class="max-h-[30vh] w-full object-contain rounded" />
|
||||||
|
<svg id="replayOverlay" class="absolute inset-0 w-full h-full pointer-events-none" preserveAspectRatio="xMidYMid meet"></svg>
|
||||||
|
</div>
|
||||||
|
<div id="replayFrameMeta" class="text-[11px] text-slate-400 mt-2"></div>
|
||||||
|
<div id="replayFrameEvents" class="mt-2 space-y-1"></div>
|
||||||
|
</div>
|
||||||
|
<div class="flex items-center justify-between">
|
||||||
|
<h3 class="font-semibold text-sm">Live Events</h3>
|
||||||
|
<label for="eventsViewToggle" class="flex items-center gap-2 text-xs text-slate-300 cursor-pointer select-none">
|
||||||
|
<span>Raw</span>
|
||||||
|
<input id="eventsViewToggle" type="checkbox" class="accent-cyan-400 h-4 w-4" />
|
||||||
|
<span>Beautiful</span>
|
||||||
|
</label>
|
||||||
|
</div>
|
||||||
|
<div id="events" class="bg-slate-950 border border-slate-800 rounded p-3 text-xs overflow-auto max-h-[36vh] space-y-1"></div>
|
||||||
|
</div>
|
||||||
|
</section>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<script src="/ui/monitoring.js"></script>
|
||||||
|
</body>
|
||||||
|
</html>
625
src/ui_assets/monitoring.js
Normal file
@@ -0,0 +1,625 @@
|
|||||||
|
const tokenInput = document.getElementById("tokenInput");
|
||||||
|
const saveTokenBtn = document.getElementById("saveTokenBtn");
|
||||||
|
const refreshBtn = document.getElementById("refreshBtn");
|
||||||
|
const jobListEl = document.getElementById("jobList");
|
||||||
|
const jobDetailEl = document.getElementById("jobDetail");
|
||||||
|
const eventsEl = document.getElementById("events");
|
||||||
|
const statsEl = document.getElementById("stats");
|
||||||
|
const latestVisualEl = document.getElementById("latestVisual");
|
||||||
|
const eventsViewToggle = document.getElementById("eventsViewToggle");
|
||||||
|
const replayVisualEl = document.getElementById("replayVisual");
|
||||||
|
const replayOverlayEl = document.getElementById("replayOverlay");
|
||||||
|
const replayFrameMetaEl = document.getElementById("replayFrameMeta");
|
||||||
|
const replayFrameEventsEl = document.getElementById("replayFrameEvents");
|
||||||
|
const replayStatusEl = document.getElementById("replayStatus");
|
||||||
|
const replayPlayBtn = document.getElementById("replayPlayBtn");
|
||||||
|
const replayPrevBtn = document.getElementById("replayPrevBtn");
|
||||||
|
const replayNextBtn = document.getElementById("replayNextBtn");
|
||||||
|
const replaySpeedEl = document.getElementById("replaySpeed");
|
||||||
|
const replaySeekEl = document.getElementById("replaySeek");
|
||||||
|
const analyticsMetaEl = document.getElementById("analyticsMeta");
|
||||||
|
const analyticsSummaryEl = document.getElementById("analyticsSummary");
|
||||||
|
const analyticsCategorySummaryEl = document.getElementById("analyticsCategorySummary");
|
||||||
|
const analyticsCategoriesEl = document.getElementById("analyticsCategories");
|
||||||
|
const analyticsTrendSummaryEl = document.getElementById("analyticsTrendSummary");
|
||||||
|
const analyticsTrendsEl = document.getElementById("analyticsTrends");
|
||||||
|
|
||||||
|
const state = {
|
||||||
|
token: localStorage.getItem("screenjob_token") || "",
|
||||||
|
jobs: [],
|
||||||
|
selectedJobId: null,
|
||||||
|
ws: null,
|
||||||
|
wsReconnectTimer: null,
|
||||||
|
eventsViewMode: localStorage.getItem("screenjob_events_view_mode") === "beautiful" ? "beautiful" : "raw",
|
||||||
|
replay: {
|
||||||
|
frames: [],
|
||||||
|
trailingEvents: [],
|
||||||
|
frameIndex: 0,
|
||||||
|
isPlaying: false,
|
||||||
|
speed: 1,
|
||||||
|
timer: null
|
||||||
|
}
|
||||||
|
};
|
||||||
|
const manuallyClosedSockets = new WeakSet();
|
||||||
|
const analyticsRefreshEvents = new Set(["job_finished", "job_failed", "job_rejected"]);
|
||||||
|
tokenInput.value = state.token;
|
||||||
|
|
||||||
|
function authHeaders() {
|
||||||
|
return { "Authorization": "Bearer " + state.token };
|
||||||
|
}
|
||||||
|
|
||||||
|
async function api(path, opts = {}) {
|
||||||
|
if (!state.token) throw new Error("Token required");
|
||||||
|
const headers = Object.assign({}, authHeaders(), opts.headers || {});
|
||||||
|
const response = await fetch(path, Object.assign({}, opts, { headers }));
|
||||||
|
if (!response.ok) throw new Error(await response.text());
|
||||||
|
return response.json();
|
||||||
|
}
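The `api` helper merges `authHeaders()` first and `opts.headers` second, so a caller-supplied header with the same name takes precedence. A minimal sketch of that merge order, isolated from `fetch`; `mergeHeaders` is a hypothetical name used only for illustration and is not part of this change:

```javascript
// Hypothetical helper mirroring the merge order used in api():
// defaults first, caller-supplied headers second, so the caller wins on conflicts.
function mergeHeaders(token, extra = {}) {
  const base = { Authorization: "Bearer " + token };
  return Object.assign({}, base, extra);
}

// Extra headers are added alongside the bearer token.
const merged = mergeHeaders("abc", { Accept: "application/json" });
// A caller-supplied Authorization header replaces the default one.
const overridden = mergeHeaders("abc", { Authorization: "Bearer other" });
```

If overriding `Authorization` from call sites is not intended, swapping the argument order in `Object.assign` would make the token header win instead.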
|
||||||
|
|
||||||
|
function renderStats(stats) {
|
||||||
|
const cards = [
|
||||||
|
["Total Jobs", stats.total_jobs || 0],
|
||||||
|
["Running", stats.running_jobs || 0],
|
||||||
|
["Completed", stats.completed_jobs || 0],
|
||||||
|
["Failed", stats.failed_jobs || 0],
|
||||||
|
["Cancelled", stats.cancelled_jobs || 0],
|
||||||
|
["Total Cost (USD)", Number(stats.total_estimated_cost || 0).toFixed(4)]
|
||||||
|
];
|
||||||
|
statsEl.innerHTML = cards.map(([name, val]) => `
|
||||||
|
<div class="bg-slate-900/70 border border-slate-800 rounded-xl p-3">
|
||||||
|
<div class="text-slate-400 text-xs">${name}</div>
|
||||||
|
<div class="text-lg font-semibold">${val}</div>
|
||||||
|
</div>
|
||||||
|
`).join("");
|
||||||
|
}
|
||||||
|
|
||||||
|
function escapeHtml(value) {
  return String(value ?? "").replace(/[&<>"']/g, (ch) => ({
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;"
  })[ch]);
}
|
||||||
|
|
||||||
|
function formatNumber(value, digits = 2) {
  // Treat null/undefined as missing; Number(null) is 0 and would render as a real value.
  const num = value == null ? NaN : Number(value);
  return Number.isFinite(num) ? num.toFixed(digits) : "—";
}

function formatCurrency(value, digits = 6) {
  // Same missing-value handling as formatNumber.
  const num = value == null ? NaN : Number(value);
  return Number.isFinite(num) ? `$${num.toFixed(digits)}` : "—";
}

function formatPercent(value) {
  // Same missing-value handling as formatNumber.
  const num = value == null ? NaN : Number(value);
  return Number.isFinite(num) ? `${num.toFixed(1)}%` : "—";
}
|
||||||
|
|
||||||
|
function formatDateLabel(value) {
|
||||||
|
const dt = new Date(value);
|
||||||
|
if (Number.isNaN(dt.getTime())) return String(value || "—");
|
||||||
|
return dt.toLocaleDateString(undefined, { month: "short", day: "numeric" });
|
||||||
|
}
|
||||||
|
|
||||||
|
function renderMetricCard(label, value) {
|
||||||
|
return `
|
||||||
|
<div class="bg-slate-950 border border-slate-800 rounded-xl p-3">
|
||||||
|
<div class="text-[11px] uppercase tracking-wide text-slate-400">${escapeHtml(label)}</div>
|
||||||
|
<div class="text-xl font-semibold mt-1">${escapeHtml(value)}</div>
|
||||||
|
</div>
|
||||||
|
`;
|
||||||
|
}
|
||||||
|
|
||||||
|
function renderLineChart(title, points, options = {}) {
|
||||||
|
const color = options.color || "#22d3ee";
|
||||||
|
const valueLabel = options.valueLabel || "";
|
||||||
|
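// Skip missing values explicitly: Number(null) is 0, which would be plotted as a real data point.
const sourcePoints = Array.isArray(points)
  ? points.filter((point) => point && point.value != null && Number.isFinite(Number(point.value)))
  : [];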
|
||||||
|
|
||||||
|
if (!sourcePoints.length) {
|
||||||
|
return `
|
||||||
|
<div class="rounded-lg border border-slate-800 bg-slate-950/70 p-3">
|
||||||
|
<div class="flex items-center justify-between gap-3">
|
||||||
|
<div>
|
||||||
|
<div class="text-xs text-slate-400">${escapeHtml(title)}</div>
|
||||||
|
<div class="text-sm text-slate-200 font-semibold">No data yet</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
`;
|
||||||
|
}
|
||||||
|
|
||||||
|
const width = 640;
|
||||||
|
const height = 220;
|
||||||
|
const margin = { top: 20, right: 18, bottom: 34, left: 44 };
|
||||||
|
const values = sourcePoints.map((point) => Number(point.value));
|
||||||
|
const minValue = Math.min(...values);
|
||||||
|
const maxValue = Math.max(...values);
|
||||||
|
const span = maxValue - minValue || 1;
|
||||||
|
const chartWidth = width - margin.left - margin.right;
|
||||||
|
const chartHeight = height - margin.top - margin.bottom;
|
||||||
|
const xStep = sourcePoints.length > 1 ? chartWidth / (sourcePoints.length - 1) : 0;
|
||||||
|
const coords = sourcePoints.map((point, index) => ({
|
||||||
|
x: margin.left + (index * xStep),
|
||||||
|
y: margin.top + ((maxValue - Number(point.value)) / span) * chartHeight,
|
||||||
|
}));
|
||||||
|
const linePath = coords.map((point, index) => `${index === 0 ? "M" : "L"} ${point.x} ${point.y}`).join(" ");
|
||||||
|
const baseline = height - margin.bottom;
|
||||||
|
const midIndex = Math.floor(sourcePoints.length / 2);
|
||||||
|
const xLabels = [
|
||||||
|
{ index: 0, label: sourcePoints[0].label },
|
||||||
|
{ index: midIndex, label: sourcePoints[midIndex].label },
|
||||||
|
{ index: sourcePoints.length - 1, label: sourcePoints[sourcePoints.length - 1].label },
|
||||||
|
].filter((item, index, array) => item.label && array.findIndex((candidate) => candidate.index === item.index) === index);
|
||||||
|
const minLabel = options.formatValue ? options.formatValue(minValue) : formatNumber(minValue, 2);
|
||||||
|
const maxLabel = options.formatValue ? options.formatValue(maxValue) : formatNumber(maxValue, 2);
|
||||||
|
const latest = sourcePoints[sourcePoints.length - 1];
|
||||||
|
const latestValue = options.formatValue ? options.formatValue(latest.value) : formatNumber(latest.value, 2);
|
||||||
|
|
||||||
|
return `
|
||||||
|
<div class="rounded-lg border border-slate-800 bg-slate-950/70 p-3 space-y-2">
|
||||||
|
<div class="flex items-center justify-between gap-3">
|
||||||
|
<div>
|
||||||
|
<div class="text-xs text-slate-400">${escapeHtml(title)}</div>
|
||||||
|
<div class="text-sm text-slate-200 font-semibold">${escapeHtml(latestValue)}${valueLabel ? ` <span class="text-slate-500 font-normal">${escapeHtml(valueLabel)}</span>` : ""}</div>
|
||||||
|
</div>
|
||||||
|
<div class="text-[11px] text-slate-400 text-right">
|
||||||
|
<div>${escapeHtml(sourcePoints.length)} points</div>
|
||||||
|
<div>${escapeHtml(minLabel)} - ${escapeHtml(maxLabel)}</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<svg viewBox="0 0 ${width} ${height}" class="w-full h-56">
|
||||||
|
${Array.from({ length: 4 }, (_, idx) => {
|
||||||
|
const y = margin.top + (chartHeight / 3) * idx;
|
||||||
|
return `<line x1="${margin.left}" y1="${y}" x2="${width - margin.right}" y2="${y}" stroke="rgba(51, 65, 85, 0.7)" stroke-width="1" />`;
|
||||||
|
}).join("")}
|
||||||
|
<line x1="${margin.left}" y1="${baseline}" x2="${width - margin.right}" y2="${baseline}" stroke="rgba(71, 85, 105, 0.8)" stroke-width="1.5" />
|
||||||
|
<path d="${linePath}" fill="none" stroke="${color}" stroke-width="3" stroke-linecap="round" stroke-linejoin="round" />
|
||||||
|
${coords.map((point) => `
|
||||||
|
<circle cx="${point.x}" cy="${point.y}" r="4.5" fill="${color}" />
|
||||||
|
`).join("")}
|
||||||
|
<text x="${margin.left - 8}" y="${margin.top + 4}" text-anchor="end" class="fill-slate-400 text-[10px]">${escapeHtml(maxLabel)}</text>
|
||||||
|
<text x="${margin.left - 8}" y="${baseline}" text-anchor="end" class="fill-slate-400 text-[10px]">${escapeHtml(minLabel)}</text>
|
||||||
|
${xLabels.map((item) => `
|
||||||
|
<text x="${coords[item.index].x}" y="${height - 10}" text-anchor="middle" class="fill-slate-500 text-[10px]">${escapeHtml(formatDateLabel(item.label))}</text>
|
||||||
|
`).join("")}
|
||||||
|
</svg>
|
||||||
|
</div>
|
||||||
|
`;
|
||||||
|
}
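The coordinate mapping inside `renderLineChart` can be read in isolation: x positions are spread evenly across the plot area, and y positions are inverted so the maximum value sits at the top margin. A minimal sketch, assuming the same width, height, and margins as above; `scalePoints` is a hypothetical name and does not exist in this file:

```javascript
// Hypothetical standalone version of the coordinate mapping in renderLineChart.
function scalePoints(
  values,
  width = 640,
  height = 220,
  margin = { top: 20, right: 18, bottom: 34, left: 44 }
) {
  const minValue = Math.min(...values);
  const maxValue = Math.max(...values);
  const span = maxValue - minValue || 1; // avoid dividing by zero on a flat series
  const chartWidth = width - margin.left - margin.right;
  const chartHeight = height - margin.top - margin.bottom;
  const xStep = values.length > 1 ? chartWidth / (values.length - 1) : 0;
  return values.map((value, index) => ({
    x: margin.left + index * xStep,
    // SVG y grows downward, so the largest value maps to the top margin.
    y: margin.top + ((maxValue - value) / span) * chartHeight,
  }));
}

const coords = scalePoints([2, 4, 6]);
```

With a single value, `xStep` is 0 and `span` falls back to 1, so the lone point lands at the top-left corner of the plot area rather than producing `NaN` coordinates.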
|
||||||
|
|
||||||
|
function renderAnalytics(payload) {
|
||||||
|
const analytics = payload || {};
|
||||||
|
const categories = Array.isArray(analytics.by_category) ? analytics.by_category : [];
|
||||||
|
const timeline = Array.isArray(analytics.timeline) ? analytics.timeline : [];
|
||||||
|
const finishedCategories = categories.filter((row) => Number(row.finished_jobs || 0) > 0);
|
||||||
|
|
||||||
|
if (analyticsMetaEl) {
|
||||||
|
analyticsMetaEl.textContent = analytics.generated_at
|
||||||
|
? `Updated ${new Date(analytics.generated_at).toLocaleString()}`
|
||||||
|
: "Historical snapshot";
|
||||||
|
}
|
||||||
|
|
||||||
|
analyticsSummaryEl.innerHTML = [
|
||||||
|
renderMetricCard("Finished Jobs", analytics.finished_jobs || 0),
|
||||||
|
renderMetricCard("Success Rate", formatPercent(analytics.success_rate)),
|
||||||
|
renderMetricCard("Avg Steps", formatNumber(analytics.avg_steps, 1)),
|
||||||
|
renderMetricCard("Avg Cost", formatCurrency(analytics.avg_cost_usd)),
|
||||||
|
].join("");
|
||||||
|
|
||||||
|
analyticsCategorySummaryEl.textContent = finishedCategories.length
|
||||||
|
? `${finishedCategories.length} categories`
|
||||||
|
: "No finished jobs yet";
|
||||||
|
|
||||||
|
if (finishedCategories.length) {
|
||||||
|
analyticsCategoriesEl.innerHTML = finishedCategories.map((row) => {
|
||||||
|
const successRate = Number(row.success_rate || 0);
|
||||||
|
const completed = Number(row.completed_jobs || 0);
|
||||||
|
const finished = Number(row.finished_jobs || 0);
|
||||||
|
const total = Number(row.total_jobs || 0);
|
||||||
|
const avgSteps = row.avg_steps == null ? "—" : formatNumber(row.avg_steps, 1);
|
||||||
|
const avgCost = row.avg_cost_usd == null ? "—" : formatCurrency(row.avg_cost_usd);
|
||||||
|
return `
|
||||||
|
<div class="rounded-lg border border-slate-800 bg-slate-950/70 p-3 space-y-2">
|
||||||
|
<div class="flex items-start justify-between gap-3">
|
||||||
|
<div>
|
||||||
|
<div class="font-medium">${escapeHtml(row.label || "Other")}</div>
|
||||||
|
<div class="text-[11px] text-slate-400">${finished} finished · ${completed} completed · ${total} total</div>
|
||||||
|
</div>
|
||||||
|
<div class="text-right">
|
||||||
|
<div class="text-base font-semibold">${formatPercent(successRate)}</div>
|
||||||
|
<div class="text-[11px] text-slate-500">success rate</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="h-2 rounded bg-slate-800 overflow-hidden">
|
||||||
|
<div class="h-full rounded bg-cyan-400" style="width: ${Math.max(0, Math.min(successRate, 100))}%"></div>
|
||||||
|
</div>
|
||||||
|
<div class="grid grid-cols-2 gap-2 text-[11px] text-slate-300">
|
||||||
|
<div>Avg steps: ${escapeHtml(avgSteps)}</div>
|
||||||
|
<div>Avg cost: ${escapeHtml(avgCost)}</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
`;
|
||||||
|
}).join("");
|
||||||
|
} else {
|
||||||
|
analyticsCategoriesEl.innerHTML = `
|
||||||
|
<div class="rounded-lg border border-dashed border-slate-800 bg-slate-950/70 p-4 text-sm text-slate-400">
|
||||||
|
No finished jobs yet.
|
||||||
|
</div>
|
||||||
|
`;
|
||||||
|
}
|
||||||
|
|
||||||
|
analyticsTrendSummaryEl.textContent = timeline.length ? `${timeline.length} days` : "No daily data yet";
|
||||||
|
analyticsTrendsEl.innerHTML = [
|
||||||
|
renderLineChart("Average steps per day", timeline.map((row) => ({ label: row.label, value: row.avg_steps })), { color: "#38bdf8" }),
|
||||||
|
renderLineChart("Average cost per day", timeline.map((row) => ({ label: row.label, value: row.avg_cost_usd })), {
|
||||||
|
color: "#34d399",
|
||||||
|
valueLabel: "USD",
|
||||||
|
formatValue: (value) => formatCurrency(value),
|
||||||
|
}),
|
||||||
|
].join("");
|
||||||
|
}
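For reviewers, the shape of the `/api/analytics` response can be inferred from the fields `renderAnalytics` reads. The sketch below is illustrative only: the field names come from the function above, while every value is invented, and treating `success_rate` as a 0 to 100 percentage is an assumption based on the progress-bar clamp.

```javascript
// Illustrative payload: field names are taken from renderAnalytics, values are made up.
const sampleAnalytics = {
  generated_at: "2024-01-02T12:00:00Z",
  finished_jobs: 4,
  success_rate: 75, // assumed to be a percentage in the 0-100 range
  avg_steps: 12.5,
  avg_cost_usd: 0.0123,
  by_category: [
    { label: "Browser", total_jobs: 5, finished_jobs: 4, completed_jobs: 3,
      success_rate: 75, avg_steps: 12.5, avg_cost_usd: 0.0123 },
    { label: "Other", total_jobs: 1, finished_jobs: 0, completed_jobs: 0,
      success_rate: null, avg_steps: null, avg_cost_usd: null },
  ],
  timeline: [
    { label: "2024-01-01", avg_steps: 10, avg_cost_usd: 0.01 },
    { label: "2024-01-02", avg_steps: 15, avg_cost_usd: 0.0146 },
  ],
};

// Same filter renderAnalytics applies before drawing category cards:
// categories with no finished jobs are hidden.
const finishedCategories = sampleAnalytics.by_category
  .filter((row) => Number(row.finished_jobs || 0) > 0);
```

Only the first category would be rendered here, because the second has no finished jobs.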
|
||||||
|
|
||||||
|
function renderJobs() {
  // Escape server-provided fields before assigning them to innerHTML.
  jobListEl.innerHTML = state.jobs.map((job) => {
    const active = job.job_id === state.selectedJobId;
    return `
      <button data-job-id="${escapeHtml(job.job_id)}" class="w-full text-left p-3 rounded border ${active ? "border-cyan-400 bg-slate-800" : "border-slate-800 bg-slate-950"} hover:bg-slate-800">
        <div class="flex items-center justify-between">
          <span class="font-medium">${escapeHtml(job.job_id)}</span>
          <span class="text-xs px-2 py-0.5 rounded bg-slate-700">${escapeHtml(job.status)}</span>
        </div>
        <div class="text-xs text-slate-400 mt-1">${escapeHtml(job.model)}</div>
        <div class="text-xs text-slate-300 mt-1 line-clamp-2">${escapeHtml(job.objective)}</div>
        <div class="text-xs text-slate-500 mt-1">$${Number((job.usage && job.usage.estimated_cost_usd) || 0).toFixed(6)}</div>
      </button>
    `;
  }).join("");
  for (const btn of jobListEl.querySelectorAll("button[data-job-id]")) {
    btn.addEventListener("click", () => {
      state.selectedJobId = btn.getAttribute("data-job-id");
      renderJobs();
      // Log failures instead of leaving an unhandled promise rejection.
      refreshJobDetail().catch((err) => console.error(err));
    });
  }
}
|
||||||
|
|
||||||
|
function pushEventLine(obj) {
|
||||||
|
if (!obj || !obj.job_id || !obj.event_type) return;
|
||||||
|
const line = document.createElement("div");
|
||||||
|
const ts = obj.ts || "-";
|
||||||
|
const step = (obj.step ?? "-");
|
||||||
|
if (state.eventsViewMode === "raw") {
|
||||||
|
line.className = "border-b border-slate-800 pb-1";
|
||||||
|
line.textContent = `[${ts}] ${obj.job_id} step=${step} ${obj.event_type} ${JSON.stringify(obj.payload || {})}`;
|
||||||
|
} else {
|
||||||
|
const typeColors = {
|
||||||
|
info: "bg-sky-900/50 text-sky-200 border border-sky-800",
|
||||||
|
warning: "bg-amber-900/40 text-amber-200 border border-amber-800",
|
||||||
|
error: "bg-rose-900/40 text-rose-200 border border-rose-800",
|
||||||
|
visual_update: "bg-emerald-900/40 text-emerald-200 border border-emerald-800",
|
||||||
|
tool_call: "bg-violet-900/40 text-violet-200 border border-violet-800",
|
||||||
|
tool_result: "bg-indigo-900/40 text-indigo-200 border border-indigo-800"
|
||||||
|
};
|
||||||
|
const dt = new Date(ts);
|
||||||
|
const tsText = Number.isNaN(dt.getTime()) ? ts : dt.toLocaleString();
|
||||||
|
const payload = obj.payload || {};
|
||||||
|
|
||||||
|
line.className = "rounded-lg border border-slate-800 bg-slate-900/80 p-2 space-y-2";
|
||||||
|
const header = document.createElement("div");
|
||||||
|
header.className = "flex flex-wrap items-center gap-2";
|
||||||
|
|
||||||
|
const typePill = document.createElement("span");
|
||||||
|
typePill.className = `px-2 py-0.5 rounded text-[10px] font-semibold ${typeColors[obj.event_type] || "bg-slate-800 text-slate-200 border border-slate-700"}`;
|
||||||
|
typePill.textContent = obj.event_type;
|
||||||
|
|
||||||
|
const stepPill = document.createElement("span");
|
||||||
|
stepPill.className = "px-2 py-0.5 rounded text-[10px] bg-slate-800 text-slate-300 border border-slate-700";
|
||||||
|
stepPill.textContent = `step ${step}`;
|
||||||
|
|
||||||
|
const tsSpan = document.createElement("span");
|
||||||
|
tsSpan.className = "text-[10px] text-slate-400";
|
||||||
|
tsSpan.textContent = tsText;
|
||||||
|
|
||||||
|
header.appendChild(typePill);
|
||||||
|
header.appendChild(stepPill);
|
||||||
|
header.appendChild(tsSpan);
|
||||||
|
|
||||||
|
const jobLine = document.createElement("div");
|
||||||
|
jobLine.className = "text-[11px] text-slate-300 font-medium";
|
||||||
|
jobLine.textContent = obj.job_id;
|
||||||
|
|
||||||
|
const body = document.createElement("pre");
|
||||||
|
body.className = "bg-slate-950 border border-slate-800 rounded p-2 text-[11px] text-slate-200 overflow-auto";
|
||||||
|
body.textContent = JSON.stringify(payload, null, 2);
|
||||||
|
|
||||||
|
line.appendChild(header);
|
||||||
|
line.appendChild(jobLine);
|
||||||
|
line.appendChild(body);
|
||||||
|
}
|
||||||
|
eventsEl.prepend(line);
|
||||||
|
while (eventsEl.childNodes.length > 400) {
|
||||||
|
eventsEl.removeChild(eventsEl.lastChild);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
function clearReplayTimer() {
|
||||||
|
if (state.replay.timer) {
|
||||||
|
clearTimeout(state.replay.timer);
|
||||||
|
state.replay.timer = null;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
function stopReplay() {
|
||||||
|
state.replay.isPlaying = false;
|
||||||
|
clearReplayTimer();
|
||||||
|
replayPlayBtn.textContent = "Play";
|
||||||
|
}
|
||||||
|
|
||||||
|
function replayImageSrc(path) {
|
||||||
|
const q = encodeURIComponent(path || "");
|
||||||
|
return `/api/jobs/${state.selectedJobId}/artifact?path=${q}&token=${encodeURIComponent(state.token)}`;
|
||||||
|
}
|
||||||
|
|
||||||
|
function renderReplayOverlay(frame) {
|
||||||
|
replayOverlayEl.innerHTML = "";
|
||||||
|
const size = frame && frame.screen_size;
|
||||||
|
if (!frame || !frame.is_fullscreen || !size || !size.width || !size.height) {
|
||||||
|
replayOverlayEl.removeAttribute("viewBox");
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
replayOverlayEl.setAttribute("viewBox", `0 0 ${size.width} ${size.height}`);
|
||||||
|
const overlayEvents = Array.isArray(frame.overlays) ? frame.overlays : [];
|
||||||
|
const points = overlayEvents.filter((ev) => ev && ev.kind === "tool_result" && ev.tool === "click" && ev.click);
|
||||||
|
for (const ev of points) {
|
||||||
|
const x = Number(ev.click.x);
|
||||||
|
const y = Number(ev.click.y);
|
||||||
|
if (!Number.isFinite(x) || !Number.isFinite(y)) continue;
|
||||||
|
|
||||||
|
const halo = document.createElementNS("http://www.w3.org/2000/svg", "circle");
|
||||||
|
halo.setAttribute("cx", String(x));
|
||||||
|
halo.setAttribute("cy", String(y));
|
||||||
|
halo.setAttribute("r", "14");
|
||||||
|
halo.setAttribute("fill", "rgba(14, 165, 233, 0.22)");
|
||||||
|
halo.setAttribute("stroke", "#38bdf8");
|
||||||
|
halo.setAttribute("stroke-width", "2");
|
||||||
|
|
||||||
|
const dot = document.createElementNS("http://www.w3.org/2000/svg", "circle");
|
||||||
|
dot.setAttribute("cx", String(x));
|
||||||
|
dot.setAttribute("cy", String(y));
|
||||||
|
dot.setAttribute("r", "4");
|
||||||
|
dot.setAttribute("fill", "#38bdf8");
|
||||||
|
|
||||||
|
replayOverlayEl.appendChild(halo);
|
||||||
|
replayOverlayEl.appendChild(dot);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
function renderReplayFrameEvents(frame) {
|
||||||
|
replayFrameEventsEl.innerHTML = "";
|
||||||
|
if (!frame) return;
|
||||||
|
const events = Array.isArray(frame.overlays) ? frame.overlays : [];
|
||||||
|
const shown = events.slice(-8);
|
||||||
|
for (const ev of shown) {
|
||||||
|
const row = document.createElement("div");
|
||||||
|
row.className = "text-[11px] rounded border border-slate-800 bg-slate-900/80 px-2 py-1";
|
||||||
|
row.textContent = ev.label || `${ev.kind || "event"} ${ev.tool || ""}`.trim();
|
||||||
|
replayFrameEventsEl.appendChild(row);
|
||||||
|
}
|
||||||
|
if (!shown.length) {
|
||||||
|
const empty = document.createElement("div");
|
||||||
|
empty.className = "text-[11px] text-slate-500";
|
||||||
|
empty.textContent = "No overlay events for this frame.";
|
||||||
|
replayFrameEventsEl.appendChild(empty);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
function setReplayFrame(index) {
|
||||||
|
const frames = state.replay.frames;
|
||||||
|
if (!frames.length) {
|
||||||
|
replayVisualEl.removeAttribute("src");
|
||||||
|
replayOverlayEl.innerHTML = "";
|
||||||
|
replayFrameMetaEl.textContent = "No replay frames.";
|
||||||
|
replaySeekEl.value = "0";
|
||||||
|
replaySeekEl.max = "0";
|
||||||
|
replayStatusEl.textContent = "No replay loaded.";
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
const bounded = Math.max(0, Math.min(index, frames.length - 1));
|
||||||
|
state.replay.frameIndex = bounded;
|
||||||
|
const frame = frames[bounded];
|
||||||
|
replayVisualEl.src = replayImageSrc(frame.image_path);
|
||||||
|
replayFrameMetaEl.textContent = `Frame ${bounded + 1}/${frames.length} | step ${frame.step} | ${frame.kind} | ${frame.ts}`;
|
||||||
|
replaySeekEl.max = String(Math.max(0, frames.length - 1));
|
||||||
|
replaySeekEl.value = String(bounded);
|
||||||
|
replayStatusEl.textContent = state.replay.isPlaying ? "Playing replay." : "Replay ready.";
|
||||||
|
renderReplayOverlay(frame);
|
||||||
|
renderReplayFrameEvents(frame);
|
||||||
|
}
|
||||||
|
|
||||||
|
function advanceReplay() {
|
||||||
|
const frames = state.replay.frames;
|
||||||
|
if (!state.replay.isPlaying || !frames.length) return;
|
||||||
|
if (state.replay.frameIndex >= frames.length - 1) {
|
||||||
|
stopReplay();
|
||||||
|
setReplayFrame(frames.length - 1);
|
||||||
|
replayStatusEl.textContent = "Replay finished.";
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
setReplayFrame(state.replay.frameIndex + 1);
|
||||||
|
clearReplayTimer();
|
||||||
|
const delayMs = Math.max(120, Math.round(700 / (state.replay.speed || 1)));
|
||||||
|
state.replay.timer = setTimeout(advanceReplay, delayMs);
|
||||||
|
}
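The frame delay in `advanceReplay` divides a 700 ms base by the selected speed and never drops below 120 ms. A minimal sketch of that formula on its own, evaluated for the four options in the `replaySpeed` select; `replayDelayMs` is a hypothetical name introduced here for illustration:

```javascript
// Hypothetical extraction of the delay formula used by advanceReplay.
function replayDelayMs(speed) {
  // A falsy speed (0, NaN, undefined) falls back to 1x; the result is floored at 120 ms.
  return Math.max(120, Math.round(700 / (speed || 1)));
}

// Delays for the 0.5x, 1.0x, 1.5x and 2.0x options.
const delays = [0.5, 1, 1.5, 2].map(replayDelayMs);
```

The 120 ms floor only matters for speeds above roughly 5.8x, so none of the current select options reach it.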
|
||||||
|
|
||||||
|
function toggleReplayPlay() {
|
||||||
|
if (!state.replay.frames.length) return;
|
||||||
|
if (state.replay.isPlaying) {
|
||||||
|
stopReplay();
|
||||||
|
setReplayFrame(state.replay.frameIndex);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
state.replay.isPlaying = true;
|
||||||
|
replayPlayBtn.textContent = "Pause";
|
||||||
|
replayStatusEl.textContent = "Playing replay.";
|
||||||
|
advanceReplay();
|
||||||
|
}
|
||||||
|
|
||||||
|
function resetReplay(payload) {
|
||||||
|
stopReplay();
|
||||||
|
const replayPayload = payload || {};
|
||||||
|
state.replay.frames = Array.isArray(replayPayload.frames) ? replayPayload.frames : [];
|
||||||
|
state.replay.trailingEvents = Array.isArray(replayPayload.trailing_events) ? replayPayload.trailing_events : [];
|
||||||
|
state.replay.frameIndex = 0;
|
||||||
|
setReplayFrame(0);
|
||||||
|
}
|
||||||
|
|
||||||
|
function scheduleWsReconnect() {
|
||||||
|
if (state.wsReconnectTimer || !state.token) return;
|
||||||
|
state.wsReconnectTimer = setTimeout(() => {
|
||||||
|
state.wsReconnectTimer = null;
|
||||||
|
connectWs();
|
||||||
|
}, 1200);
|
||||||
|
}
|
||||||
|
|
||||||
|
function updateLatestVisualFromEvent(ev) {
|
||||||
|
if (!ev || ev.event_type !== "visual_update") return;
|
||||||
|
if (!state.selectedJobId || ev.job_id !== state.selectedJobId) return;
|
||||||
|
const imagePath = ev.payload && ev.payload.image_meta && ev.payload.image_meta.path;
|
||||||
|
if (!imagePath) return;
|
||||||
|
const q = encodeURIComponent(imagePath);
|
||||||
|
latestVisualEl.src = `/api/jobs/${state.selectedJobId}/artifact?path=${q}&token=${encodeURIComponent(state.token)}`;
|
||||||
|
}
|
||||||
|
|
||||||
|
async function refreshJobs() {
|
||||||
|
const payload = await api("/api/jobs?limit=100");
|
||||||
|
state.jobs = payload.jobs || [];
|
||||||
|
if (!state.selectedJobId && state.jobs.length > 0) state.selectedJobId = state.jobs[0].job_id;
|
||||||
|
renderJobs();
|
||||||
|
}
|
||||||
|
|
||||||
|
async function refreshStats() {
|
||||||
|
const payload = await api("/api/stats");
|
||||||
|
renderStats(payload);
|
||||||
|
}
|
||||||
|
|
||||||
|
async function refreshAnalytics() {
|
||||||
|
const payload = await api("/api/analytics");
|
||||||
|
renderAnalytics(payload);
|
||||||
|
}
|
||||||
|
|
||||||
|
async function refreshJobDetail() {
|
||||||
|
if (!state.selectedJobId) return;
|
||||||
|
const [job, events, replay] = await Promise.all([
|
||||||
|
api(`/api/jobs/${state.selectedJobId}`),
|
||||||
|
api(`/api/jobs/${state.selectedJobId}/events?limit=120`),
|
||||||
|
api(`/api/jobs/${state.selectedJobId}/replay?limit=5000`)
|
||||||
|
]);
|
||||||
|
jobDetailEl.textContent = JSON.stringify(job, null, 2);
|
||||||
|
eventsEl.innerHTML = "";
|
||||||
|
const list = (events.events || []).slice().reverse();
|
||||||
|
for (const ev of list) pushEventLine(ev);
|
||||||
|
const visual = list.find((ev) => ev.event_type === "visual_update");
|
||||||
|
if (visual) updateLatestVisualFromEvent(visual);
|
||||||
|
resetReplay(replay);
|
||||||
|
}
|
||||||
|
|
||||||
|
function connectWs() {
|
||||||
|
if (!state.token) return;
|
||||||
|
if (state.ws && (state.ws.readyState === WebSocket.OPEN || state.ws.readyState === WebSocket.CONNECTING)) {
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
const scheme = location.protocol === "https:" ? "wss" : "ws";
|
||||||
|
const ws = new WebSocket(`${scheme}://${location.host}/ws?token=${encodeURIComponent(state.token)}`);
|
||||||
|
state.ws = ws;
|
||||||
|
ws.onmessage = async (event) => {
|
||||||
|
try {
|
||||||
|
const payload = JSON.parse(event.data);
|
||||||
|
if (!payload || payload.event_type === "connected") return;
|
||||||
|
pushEventLine(payload);
|
||||||
|
updateLatestVisualFromEvent(payload);
|
||||||
|
if (!state.selectedJobId || payload.job_id === state.selectedJobId) {
|
||||||
|
await refreshJobDetail();
|
||||||
|
}
|
||||||
|
await refreshJobs();
|
||||||
|
await refreshStats();
|
||||||
|
if (analyticsRefreshEvents.has(payload.event_type)) {
|
||||||
|
await refreshAnalytics();
|
||||||
|
}
|
||||||
|
} catch (err) {
|
||||||
|
console.error(err);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
ws.onclose = () => {
|
||||||
|
if (state.ws === ws) state.ws = null;
|
||||||
|
if (manuallyClosedSockets.has(ws)) {
|
||||||
|
manuallyClosedSockets.delete(ws);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
scheduleWsReconnect();
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
async function fullRefresh() {
|
||||||
|
await refreshJobs();
|
||||||
|
await refreshStats();
|
||||||
|
await refreshAnalytics();
|
||||||
|
await refreshJobDetail();
|
||||||
|
}
|
||||||
|
|
||||||
|
async function connect() {
|
||||||
|
state.token = tokenInput.value.trim();
|
||||||
|
localStorage.setItem("screenjob_token", state.token);
|
||||||
|
if (state.ws) {
|
||||||
|
manuallyClosedSockets.add(state.ws);
|
||||||
|
try { state.ws.close(); } catch (_) {}
|
||||||
|
state.ws = null;
|
||||||
|
}
|
||||||
|
if (state.wsReconnectTimer) {
|
||||||
|
clearTimeout(state.wsReconnectTimer);
|
||||||
|
state.wsReconnectTimer = null;
|
||||||
|
}
|
||||||
|
await fullRefresh();
|
||||||
|
connectWs();
|
||||||
|
}
|
||||||
|
|
||||||
|
function syncEventsViewToggle() {
|
||||||
|
eventsViewToggle.checked = state.eventsViewMode === "beautiful";
|
||||||
|
}
|
||||||
|
|
||||||
|
saveTokenBtn.addEventListener("click", () => connect().catch((err) => alert(err.message)));
|
||||||
|
refreshBtn.addEventListener("click", () => fullRefresh().catch((err) => alert(err.message)));
|
||||||
|
eventsViewToggle.addEventListener("change", () => {
|
||||||
|
state.eventsViewMode = eventsViewToggle.checked ? "beautiful" : "raw";
|
||||||
|
localStorage.setItem("screenjob_events_view_mode", state.eventsViewMode);
|
||||||
|
refreshJobDetail().catch((err) => alert(err.message));
|
||||||
|
});
|
||||||
|
replayPlayBtn.addEventListener("click", () => toggleReplayPlay());
|
||||||
|
replayPrevBtn.addEventListener("click", () => {
|
||||||
|
stopReplay();
|
||||||
|
setReplayFrame(state.replay.frameIndex - 1);
|
||||||
|
});
|
||||||
|
replayNextBtn.addEventListener("click", () => {
|
||||||
|
stopReplay();
|
||||||
|
setReplayFrame(state.replay.frameIndex + 1);
|
||||||
|
});
|
||||||
|
replaySpeedEl.addEventListener("change", () => {
|
||||||
|
const speed = Number(replaySpeedEl.value);
|
||||||
|
state.replay.speed = Number.isFinite(speed) && speed > 0 ? speed : 1;
|
||||||
|
if (state.replay.isPlaying) {
|
||||||
|
clearReplayTimer();
|
||||||
|
advanceReplay();
|
||||||
|
}
|
||||||
|
});
|
||||||
|
replaySeekEl.addEventListener("input", () => {
|
||||||
|
stopReplay();
|
||||||
|
setReplayFrame(Number(replaySeekEl.value || 0));
|
||||||
|
});
|
||||||
|
syncEventsViewToggle();
|
||||||
|
resetReplay(null);
|
||||||
|
if (state.token) connect().catch(() => {});
|
||||||
@@ -91,6 +91,41 @@ def test_click_supports_directional_offsets(tmp_path: Path, monkeypatch) -> None
|
|||||||
assert click_result["clicked"] == {"x": 110, "y": 102}
|
assert click_result["clicked"] == {"x": 110, "y": 102}
|
||||||
|
|
||||||
|
|
||||||
|
def test_enhance_defaults_to_small_ui_preset(tmp_path: Path, monkeypatch) -> None:
|
||||||
|
agent = _build_agent(tmp_path, monkeypatch)
|
||||||
|
result = agent._tool_enhance({"coordinate": {"x": 100, "y": 120}})
|
||||||
|
|
||||||
|
assert result["ok"] is True
|
||||||
|
meta = result["meta"]
|
||||||
|
assert meta["region"] == "small"
|
||||||
|
assert meta["mode"] == "ui"
|
||||||
|
assert meta["scale"] == 4
|
||||||
|
assert Path(meta["path"]).exists()
|
||||||
|
assert meta["target_pixel"]["x"] >= 0
|
||||||
|
assert meta["target_pixel"]["y"] >= 0
|
||||||
|
|
||||||
|
|
||||||
|
def test_enhance_supports_text_mode_and_scale_clamp(tmp_path: Path, monkeypatch) -> None:
|
||||||
|
agent = _build_agent(tmp_path, monkeypatch)
|
||||||
|
result = agent._tool_enhance(
|
||||||
|
{
|
||||||
|
"coordinate": {"x": -99, "y": 9999},
|
||||||
|
"region": "medium",
|
||||||
|
"mode": "text",
|
||||||
|
"scale": 99,
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
assert result["ok"] is True
|
||||||
|
meta = result["meta"]
|
||||||
|
assert meta["region"] == "medium"
|
||||||
|
assert meta["mode"] == "text"
|
||||||
|
assert meta["scale"] == 6
|
||||||
|
assert meta["requested_coord"] == {"x": -99, "y": 9999}
|
||||||
|
assert meta["source_coord"] == {"x": 0, "y": 719}
|
||||||
|
assert Path(meta["path"]).exists()
|
||||||
|
|
||||||
|
|
||||||
def test_press_key_supports_hotkey_combo(tmp_path: Path, monkeypatch) -> None:
|
def test_press_key_supports_hotkey_combo(tmp_path: Path, monkeypatch) -> None:
|
||||||
agent = _build_agent(tmp_path, monkeypatch)
|
agent = _build_agent(tmp_path, monkeypatch)
|
||||||
result = agent._tool_press_key({"key": "meta+r"})
|
result = agent._tool_press_key({"key": "meta+r"})
|
||||||
@@ -98,3 +133,21 @@ def test_press_key_supports_hotkey_combo(tmp_path: Path, monkeypatch) -> None:
|
|||||||
assert result["key"] == "win+r"
|
assert result["key"] == "win+r"
|
||||||
assert result["message"] == "Key combo executed."
|
assert result["message"] == "Key combo executed."
|
||||||
assert agent_module.pyautogui.last_hotkey == ("win", "r")
|
assert agent_module.pyautogui.last_hotkey == ("win", "r")
|
||||||
|
|
||||||
|
|
||||||
|
def test_context_compaction_trigger_and_payload(tmp_path: Path, monkeypatch) -> None:
|
||||||
|
agent = _build_agent(tmp_path, monkeypatch)
|
||||||
|
agent.objective = "Open settings app"
|
||||||
|
agent.previous_response_id = "resp_123"
|
||||||
|
agent.step = 4
|
||||||
|
agent.last_context_compact_step = 0
|
||||||
|
agent.options.screen_context_decay_steps = 4
|
||||||
|
agent.recent_tool_summaries = ["step=1 tool=see_screen status=ok"]
|
||||||
|
agent.last_screen_data_url = "data:image/png;base64,abc"
|
||||||
|
agent.last_screen_meta = {"width": 1280, "height": 720, "path": "C:/tmp/frame.png"}
|
||||||
|
|
||||||
|
assert agent._should_compact_context() is True
|
||||||
|
compacted = agent._build_compacted_pending_input()
|
||||||
|
assert len(compacted) == 2
|
||||||
|
assert "Context compaction activated" in compacted[0]["content"][0]["text"]
|
||||||
|
assert "Open settings app" in compacted[0]["content"][0]["text"]
|
||||||
|
|||||||
@@ -29,7 +29,10 @@ def test_cli_emits_structured_return_and_data(monkeypatch: Any, capsys, tmp_path
|
|||||||
def fake_assess_task_safety(*_args, **_kwargs):
|
def fake_assess_task_safety(*_args, **_kwargs):
|
||||||
return True, "safe", {"safe": True}
|
return True, "safe", {"safe": True}
|
||||||
|
|
||||||
|
captured_kwargs: dict[str, Any] = {}
|
||||||
|
|
||||||
def fake_run_job(*_args, **_kwargs):
|
def fake_run_job(*_args, **_kwargs):
|
||||||
|
captured_kwargs.update(_kwargs)
|
||||||
result = AgentResult(
|
result = AgentResult(
|
||||||
completed=True,
|
completed=True,
|
||||||
result="Done",
|
result="Done",
|
||||||
@@ -66,3 +69,5 @@ def test_cli_emits_structured_return_and_data(monkeypatch: Any, capsys, tmp_path
|
|||||||
assert payload["response"]["data"] == "file1.txt\nfile2.txt"
|
assert payload["response"]["data"] == "file1.txt\nfile2.txt"
|
||||||
assert payload["return"] == "Task completed successfully"
|
assert payload["return"] == "Task completed successfully"
|
||||||
assert payload["data"] == "file1.txt\nfile2.txt"
|
assert payload["data"] == "file1.txt\nfile2.txt"
|
||||||
|
assert captured_kwargs["options"].reasoning_effort == "medium"
|
||||||
|
assert captured_kwargs["options"].screen_context_decay_steps == 4
|
||||||
|
|||||||
@@ -9,6 +9,24 @@ import src.server as server_module
|
|||||||
from src.config import AppConfig
|
from src.config import AppConfig
|
||||||
|
|
||||||
|
|
||||||
|
_TERMINAL_STATUSES = {"completed", "failed", "cancelled"}
|
||||||
|
|
||||||
|
|
||||||
|
def _objective_category(objective: str) -> str:
|
||||||
|
text = objective.lower()
|
||||||
|
if any(keyword in text for keyword in ("browser", "website", "amazon", "google", "login", "shopping", "checkout", "orders")):
|
||||||
|
return "Browser / web"
|
||||||
|
if any(keyword in text for keyword in ("file", "folder", "directory", "terminal", "shell", "command", "cli", "script", "git", "repo", "install", "pip", "npm")):
|
||||||
|
return "Files / terminal"
|
||||||
|
if any(keyword in text for keyword in ("write", "summary", "document", "docs", "report", "email", "message", "readme", "markdown")):
|
||||||
|
return "Writing / docs"
|
||||||
|
if any(keyword in text for keyword in ("data", "analysis", "csv", "spreadsheet", "sheet", "table", "chart", "dashboard", "metric", "sql")):
|
||||||
|
return "Data / analysis"
|
||||||
|
if any(keyword in text for keyword in ("code", "bug", "fix", "test", "debug", "api", "backend", "frontend", "database", "deploy", "docker", "service", "build")):
|
||||||
|
return "Development / ops"
|
||||||
|
return "Other"
|
||||||
|
|
||||||
|
|
||||||
class FakeJobManager:
|
class FakeJobManager:
|
||||||
def __init__(self, *, config: AppConfig, db: Any, broadcast: Any = None) -> None:
|
def __init__(self, *, config: AppConfig, db: Any, broadcast: Any = None) -> None:
|
||||||
self.config = config
|
self.config = config
|
||||||
@@ -26,6 +44,8 @@ class FakeJobManager:
|
|||||||
command_timeout: int = 45,
|
command_timeout: int = 45,
|
||||||
type_interval: float = 0.02,
|
type_interval: float = 0.02,
|
||||||
click_pause: float = 0.10,
|
click_pause: float = 0.10,
|
||||||
|
reasoning_effort: str = "medium",
|
||||||
|
screen_context_decay_steps: int = 4,
|
||||||
disabled_tools: list[str] | None = None,
|
disabled_tools: list[str] | None = None,
|
||||||
safety_override: bool = False,
|
safety_override: bool = False,
|
||||||
no_failsafe: bool = False,
|
no_failsafe: bool = False,
|
||||||
@@ -33,6 +53,11 @@ class FakeJobManager:
|
|||||||
self._counter += 1
|
self._counter += 1
|
||||||
job_id = f"job_fake_{self._counter:03d}"
|
job_id = f"job_fake_{self._counter:03d}"
|
||||||
selected_model = (model or self.config.default_model).strip()
|
selected_model = (model or self.config.default_model).strip()
|
||||||
|
artifacts_dir = (self.config.runs_dir / f"run_{job_id}").resolve()
|
||||||
|
artifacts_dir.mkdir(parents=True, exist_ok=True)
|
||||||
|
screenshot_path = artifacts_dir / "screen_step_001.png"
|
||||||
|
screenshot_path.write_bytes(b"not-a-real-png")
|
||||||
|
created_at = f"2026-05-27T00:00:{self._counter:02d}Z"
|
||||||
self.last_submit_payload = {
|
self.last_submit_payload = {
|
||||||
"objective": objective,
|
"objective": objective,
|
||||||
"model": selected_model,
|
"model": selected_model,
|
||||||
@@ -42,6 +67,8 @@ class FakeJobManager:
|
|||||||
"command_timeout": command_timeout,
|
"command_timeout": command_timeout,
|
||||||
"type_interval": type_interval,
|
"type_interval": type_interval,
|
||||||
"click_pause": click_pause,
|
"click_pause": click_pause,
|
||||||
|
"reasoning_effort": reasoning_effort,
|
||||||
|
"screen_context_decay_steps": screen_context_decay_steps,
|
||||||
"no_failsafe": no_failsafe,
|
"no_failsafe": no_failsafe,
|
||||||
}
|
}
|
||||||
self._jobs[job_id] = {
|
self._jobs[job_id] = {
|
||||||
@@ -49,6 +76,10 @@ class FakeJobManager:
|
|||||||
"objective": objective,
|
"objective": objective,
|
||||||
"model": selected_model,
|
"model": selected_model,
|
||||||
"status": "running",
|
"status": "running",
|
||||||
|
"created_at": created_at,
|
||||||
|
"started_at": created_at,
|
||||||
|
"ended_at": None,
|
||||||
|
"steps": 1,
|
||||||
"result": "Running",
|
"result": "Running",
|
||||||
"response": {"return": "Running", "data": None},
|
"response": {"return": "Running", "data": None},
|
||||||
"return": "Running",
|
"return": "Running",
|
||||||
@@ -61,7 +92,7 @@ class FakeJobManager:
|
|||||||
"total_tokens": 14,
|
"total_tokens": 14,
|
||||||
"estimated_cost_usd": 0.0001,
|
"estimated_cost_usd": 0.0001,
|
||||||
},
|
},
|
||||||
"artifacts_dir": str(self.config.runs_dir.resolve()),
|
"artifacts_dir": str(artifacts_dir),
|
||||||
}
|
}
|
||||||
self._events[job_id] = [
|
self._events[job_id] = [
|
||||||
{
|
{
|
||||||
@@ -70,7 +101,47 @@ class FakeJobManager:
|
|||||||
"ts": "2026-05-27T00:00:00Z",
|
"ts": "2026-05-27T00:00:00Z",
|
||||||
"step": 1,
|
"step": 1,
|
||||||
"event_type": "tool_called",
|
"event_type": "tool_called",
|
||||||
"payload": {"tool": "execute_command"},
|
"payload": {"tool": "click", "args": {"coordinate": {"x": 320, "y": 180}}},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": 2,
|
||||||
|
"job_id": job_id,
|
||||||
|
"ts": "2026-05-27T00:00:01Z",
|
||||||
|
"step": 1,
|
||||||
|
"event_type": "tool_result",
|
||||||
|
"payload": {"tool": "click", "result": {"ok": True, "clicked": {"x": 322, "y": 182}}},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": 3,
|
||||||
|
"job_id": job_id,
|
||||||
|
"ts": "2026-05-27T00:00:02Z",
|
||||||
|
"step": 1,
|
||||||
|
"event_type": "tool_called",
|
||||||
|
"payload": {"tool": "type", "args": {"text": "hello world"}},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": 4,
|
||||||
|
"job_id": job_id,
|
||||||
|
"ts": "2026-05-27T00:00:03Z",
|
||||||
|
"step": 1,
|
||||||
|
"event_type": "tool_result",
|
||||||
|
"payload": {"tool": "type", "result": {"ok": True, "typed_length": 11}},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": 5,
|
||||||
|
"job_id": job_id,
|
||||||
|
"ts": "2026-05-27T00:00:04Z",
|
||||||
|
"step": 1,
|
||||||
|
"event_type": "visual_update",
|
||||||
|
"payload": {
|
||||||
|
"kind": "see_screen",
|
||||||
|
"image_meta": {
|
||||||
|
"path": str(screenshot_path),
|
||||||
|
"width": 1920,
|
||||||
|
"height": 1080,
|
||||||
|
"grid": True,
|
||||||
|
},
|
||||||
|
},
|
||||||
}
|
}
|
||||||
]
|
]
|
||||||
return job_id
|
return job_id
|
||||||
@@ -101,6 +172,114 @@ class FakeJobManager:
|
|||||||
"live_running_threads": 0,
|
"live_running_threads": 0,
|
||||||
}
|
}
|
||||||
|
|
||||||
|
def analytics(self) -> dict[str, Any]:
|
||||||
|
by_category: dict[str, dict[str, Any]] = {}
|
||||||
|
by_day: dict[str, dict[str, Any]] = {}
|
||||||
|
|
||||||
|
def bucket(target: dict[str, dict[str, Any]], key: str) -> dict[str, Any]:
|
||||||
|
return target.setdefault(
|
||||||
|
key,
|
||||||
|
{
|
||||||
|
"label": key,
|
||||||
|
"total_jobs": 0,
|
||||||
|
"finished_jobs": 0,
|
||||||
|
"completed_jobs": 0,
|
||||||
|
"failed_jobs": 0,
|
||||||
|
"cancelled_jobs": 0,
|
||||||
|
"steps_sum": 0,
|
||||||
|
"steps_count": 0,
|
||||||
|
"cost_sum": 0.0,
|
||||||
|
"cost_count": 0,
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
total_jobs = 0
|
||||||
|
finished_jobs = 0
|
||||||
|
completed_jobs = 0
|
||||||
|
failed_jobs = 0
|
||||||
|
cancelled_jobs = 0
|
||||||
|
steps_sum = 0
|
||||||
|
steps_count = 0
|
||||||
|
cost_sum = 0.0
|
||||||
|
cost_count = 0
|
||||||
|
|
||||||
|
for job in self._jobs.values():
|
||||||
|
total_jobs += 1
|
||||||
|
status = str(job.get("status") or "")
|
||||||
|
finished = status in _TERMINAL_STATUSES
|
||||||
|
category = _objective_category(str(job.get("objective") or ""))
|
||||||
|
day = str(job.get("created_at") or "")[:10] or "unknown"
|
||||||
|
|
||||||
|
category_bucket = bucket(by_category, category)
|
||||||
|
day_bucket = bucket(by_day, day)
|
||||||
|
for item in (category_bucket, day_bucket):
|
||||||
|
item["total_jobs"] += 1
|
||||||
|
|
||||||
|
if not finished:
|
||||||
|
continue
|
||||||
|
|
||||||
|
finished_jobs += 1
|
||||||
|
if status == "completed":
|
||||||
|
completed_jobs += 1
|
||||||
|
elif status == "failed":
|
||||||
|
failed_jobs += 1
|
||||||
|
elif status == "cancelled":
|
||||||
|
cancelled_jobs += 1
|
||||||
|
|
||||||
|
steps_raw = job.get("steps")
|
||||||
|
if steps_raw is not None:
|
||||||
|
steps = int(steps_raw)
|
||||||
|
steps_sum += steps
|
||||||
|
steps_count += 1
|
||||||
|
for item in (category_bucket, day_bucket):
|
||||||
|
item["steps_sum"] += steps
|
||||||
|
item["steps_count"] += 1
|
||||||
|
|
||||||
|
estimated_cost_raw = (job.get("usage") or {}).get("estimated_cost_usd")
|
||||||
|
if estimated_cost_raw is not None:
|
||||||
|
estimated_cost = float(estimated_cost_raw)
|
||||||
|
cost_sum += estimated_cost
|
||||||
|
cost_count += 1
|
||||||
|
for item in (category_bucket, day_bucket):
|
||||||
|
item["cost_sum"] += estimated_cost
|
||||||
|
item["cost_count"] += 1
|
||||||
|
|
||||||
|
for item in (category_bucket, day_bucket):
|
||||||
|
item["finished_jobs"] += 1
|
||||||
|
if status == "completed":
|
||||||
|
item["completed_jobs"] += 1
|
||||||
|
elif status == "failed":
|
||||||
|
item["failed_jobs"] += 1
|
||||||
|
elif status == "cancelled":
|
||||||
|
item["cancelled_jobs"] += 1
|
||||||
|
|
||||||
|
def finalize(item: dict[str, Any]) -> dict[str, Any]:
|
||||||
|
finished = item["finished_jobs"]
|
||||||
|
return {
|
||||||
|
"label": item["label"],
|
||||||
|
"total_jobs": item["total_jobs"],
|
||||||
|
"finished_jobs": finished,
|
||||||
|
"completed_jobs": item["completed_jobs"],
|
||||||
|
"failed_jobs": item["failed_jobs"],
|
||||||
|
"cancelled_jobs": item["cancelled_jobs"],
|
||||||
|
"success_rate": round((item["completed_jobs"] / finished) * 100, 2) if finished else 0.0,
|
||||||
|
"avg_steps": round(item["steps_sum"] / item["steps_count"], 2) if item["steps_count"] else None,
|
||||||
|
"avg_cost_usd": round(item["cost_sum"] / item["cost_count"], 6) if item["cost_count"] else None,
|
||||||
|
}
|
||||||
|
|
||||||
|
return {
|
||||||
|
"total_jobs": total_jobs,
|
||||||
|
"finished_jobs": finished_jobs,
|
||||||
|
"completed_jobs": completed_jobs,
|
||||||
|
"failed_jobs": failed_jobs,
|
||||||
|
"cancelled_jobs": cancelled_jobs,
|
||||||
|
"success_rate": round((completed_jobs / finished_jobs) * 100, 2) if finished_jobs else 0.0,
|
||||||
|
"avg_steps": round(steps_sum / steps_count, 2) if steps_count else None,
|
||||||
|
"avg_cost_usd": round(cost_sum / cost_count, 6) if cost_count else None,
|
||||||
|
"by_category": sorted((finalize(item) for item in by_category.values()), key=lambda item: (-item["success_rate"], item["label"])),
|
||||||
|
"timeline": sorted((finalize(item) for item in by_day.values()), key=lambda item: item["label"]),
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
def _build_app(tmp_path: Path, monkeypatch: Any, disable_ui: bool = False):
|
def _build_app(tmp_path: Path, monkeypatch: Any, disable_ui: bool = False):
|
||||||
monkeypatch.setattr(server_module, "JobManager", FakeJobManager)
|
monkeypatch.setattr(server_module, "JobManager", FakeJobManager)
|
||||||
@@ -145,6 +324,8 @@ def test_create_job_returns_only_job_id_and_defaults_model(tmp_path: Path, monke
|
|||||||
manager = app.state.manager
|
manager = app.state.manager
|
||||||
assert manager.last_submit_payload["model"] == "gpt-5.4-mini"
|
assert manager.last_submit_payload["model"] == "gpt-5.4-mini"
|
||||||
assert manager.last_submit_payload["disabled_tools"] == ["click"]
|
assert manager.last_submit_payload["disabled_tools"] == ["click"]
|
||||||
|
assert manager.last_submit_payload["reasoning_effort"] == "medium"
|
||||||
|
assert manager.last_submit_payload["screen_context_decay_steps"] == 4
|
||||||
|
|
||||||
status_res = client.get(f"/api/jobs/{job_id}/status", headers=headers)
|
status_res = client.get(f"/api/jobs/{job_id}/status", headers=headers)
|
||||||
assert status_res.status_code == 200
|
assert status_res.status_code == 200
|
||||||
@@ -174,12 +355,122 @@ def test_cancel_endpoint_and_events(tmp_path: Path, monkeypatch: Any) -> None:
|
|||||||
assert status_after["data"] is None
|
assert status_after["data"] is None
|
||||||
|
|
||||||
|
|
||||||
|
def test_replay_endpoint_builds_frames_and_overlays(tmp_path: Path, monkeypatch: Any) -> None:
|
||||||
|
app, _ = _build_app(tmp_path, monkeypatch, disable_ui=False)
|
||||||
|
client = TestClient(app)
|
||||||
|
headers = {"Authorization": "Bearer test_token"}
|
||||||
|
create = client.post("/api/jobs", headers=headers, json={"job": "Replay test"})
|
||||||
|
job_id = create.json()["job_id"]
|
||||||
|
|
||||||
|
replay = client.get(f"/api/jobs/{job_id}/replay?limit=200", headers=headers)
|
||||||
|
assert replay.status_code == 200
|
||||||
|
payload = replay.json()
|
||||||
|
assert payload["job_id"] == job_id
|
||||||
|
assert payload["total_frames"] == 1
|
||||||
|
frame = payload["frames"][0]
|
||||||
|
assert frame["kind"] == "see_screen"
|
||||||
|
assert frame["is_fullscreen"] is True
|
||||||
|
labels = [item.get("label", "") for item in frame["overlays"]]
|
||||||
|
assert any("click" in text.lower() for text in labels)
|
||||||
|
assert any("typed" in text.lower() for text in labels)
|
||||||
|
|
||||||
|
|
||||||
|
def test_replay_endpoint_skips_visual_paths_outside_artifacts(tmp_path: Path, monkeypatch: Any) -> None:
|
||||||
|
app, _ = _build_app(tmp_path, monkeypatch, disable_ui=False)
|
||||||
|
manager = app.state.manager
|
||||||
|
client = TestClient(app)
|
||||||
|
headers = {"Authorization": "Bearer test_token"}
|
||||||
|
create = client.post("/api/jobs", headers=headers, json={"job": "Replay path check"})
|
||||||
|
job_id = create.json()["job_id"]
|
||||||
|
manager._events[job_id].append(
|
||||||
|
{
|
||||||
|
"id": 999,
|
||||||
|
"job_id": job_id,
|
||||||
|
"ts": "2026-05-27T00:01:00Z",
|
||||||
|
"step": 2,
|
||||||
|
"event_type": "visual_update",
|
||||||
|
"payload": {
|
||||||
|
"kind": "see_screen",
|
||||||
|
"image_meta": {
|
||||||
|
"path": str((tmp_path / "outside.png").resolve()),
|
||||||
|
"width": 100,
|
||||||
|
"height": 100,
|
||||||
|
"grid": True,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
replay = client.get(f"/api/jobs/{job_id}/replay?limit=500", headers=headers)
|
||||||
|
assert replay.status_code == 200
|
||||||
|
payload = replay.json()
|
||||||
|
assert payload["total_frames"] == 1
|
||||||
|
|
||||||
|
|
||||||
|
def test_analytics_endpoint_groups_by_category_and_time(tmp_path: Path, monkeypatch: Any) -> None:
|
||||||
|
app, _ = _build_app(tmp_path, monkeypatch, disable_ui=False)
|
||||||
|
manager = app.state.manager
|
||||||
|
client = TestClient(app)
|
||||||
|
headers = {"Authorization": "Bearer test_token"}
|
||||||
|
|
||||||
|
browser_completed = client.post("/api/jobs", headers=headers, json={"job": "Open amazon.de and checkout"}).json()["job_id"]
|
||||||
|
browser_failed = client.post("/api/jobs", headers=headers, json={"job": "Open website and login"}).json()["job_id"]
|
||||||
|
terminal_completed = client.post("/api/jobs", headers=headers, json={"job": "Run a shell command to inspect files"}).json()["job_id"]
|
||||||
|
|
||||||
|
manager._jobs[browser_completed].update(
|
||||||
|
status="completed",
|
||||||
|
ended_at="2026-05-27T00:10:00Z",
|
||||||
|
steps=4,
|
||||||
|
created_at="2026-05-27T00:00:01Z",
|
||||||
|
usage={**manager._jobs[browser_completed]["usage"], "estimated_cost_usd": 0.12},
|
||||||
|
)
|
||||||
|
manager._jobs[browser_failed].update(
|
||||||
|
status="failed",
|
||||||
|
ended_at="2026-05-28T00:10:00Z",
|
||||||
|
steps=6,
|
||||||
|
created_at="2026-05-28T00:00:01Z",
|
||||||
|
usage={**manager._jobs[browser_failed]["usage"], "estimated_cost_usd": 0.24},
|
||||||
|
)
|
||||||
|
manager._jobs[terminal_completed].update(
|
||||||
|
status="completed",
|
||||||
|
ended_at="2026-05-28T00:15:00Z",
|
||||||
|
steps=10,
|
||||||
|
created_at="2026-05-28T00:00:02Z",
|
||||||
|
usage={**manager._jobs[terminal_completed]["usage"], "estimated_cost_usd": 0.05},
|
||||||
|
)
|
||||||
|
|
||||||
|
analytics = client.get("/api/analytics", headers=headers)
|
||||||
|
assert analytics.status_code == 200
|
||||||
|
payload = analytics.json()
|
||||||
|
|
||||||
|
assert payload["total_jobs"] == 3
|
||||||
|
assert payload["finished_jobs"] == 3
|
||||||
|
assert payload["completed_jobs"] == 2
|
||||||
|
assert payload["failed_jobs"] == 1
|
||||||
|
assert payload["success_rate"] == 66.67
|
||||||
|
assert payload["avg_steps"] == 6.67
|
||||||
|
assert payload["avg_cost_usd"] == 0.136667
|
||||||
|
|
||||||
|
browser = next(row for row in payload["by_category"] if row["label"] == "Browser / web")
|
||||||
|
terminal = next(row for row in payload["by_category"] if row["label"] == "Files / terminal")
|
||||||
|
assert browser["finished_jobs"] == 2
|
||||||
|
assert browser["success_rate"] == 50.0
|
||||||
|
assert browser["avg_steps"] == 5.0
|
||||||
|
assert terminal["success_rate"] == 100.0
|
||||||
|
|
||||||
|
assert [row["label"] for row in payload["timeline"]] == ["2026-05-27", "2026-05-28"]
|
||||||
|
|
||||||
|
|
||||||
def test_ui_toggle(tmp_path: Path, monkeypatch: Any) -> None:
|
def test_ui_toggle(tmp_path: Path, monkeypatch: Any) -> None:
|
||||||
app_enabled, _ = _build_app(tmp_path / "enabled", monkeypatch, disable_ui=False)
|
app_enabled, _ = _build_app(tmp_path / "enabled", monkeypatch, disable_ui=False)
|
||||||
client_enabled = TestClient(app_enabled)
|
client_enabled = TestClient(app_enabled)
|
||||||
root_enabled = client_enabled.get("/")
|
root_enabled = client_enabled.get("/")
|
||||||
assert root_enabled.status_code == 200
|
assert root_enabled.status_code == 200
|
||||||
assert "ScreenJob Monitor" in root_enabled.text
|
assert "ScreenJob Monitor" in root_enabled.text
|
||||||
|
assert "Success by Objective Category" in root_enabled.text
|
||||||
|
js_enabled = client_enabled.get("/ui/monitoring.js")
|
||||||
|
assert js_enabled.status_code == 200
|
||||||
|
assert "const tokenInput" in js_enabled.text
|
||||||
|
|
||||||
app_disabled, _ = _build_app(tmp_path / "disabled", monkeypatch, disable_ui=True)
|
app_disabled, _ = _build_app(tmp_path / "disabled", monkeypatch, disable_ui=True)
|
||||||
client_disabled = TestClient(app_disabled)
|
client_disabled = TestClient(app_disabled)
|
||||||
|
|||||||
@@ -72,3 +72,55 @@ def test_storage_response_fallback_uses_result_when_json_missing(tmp_path: Path)
|
|||||||
assert job is not None
|
assert job is not None
|
||||||
assert job["response"]["return"] == "Legacy result string"
|
assert job["response"]["return"] == "Legacy result string"
|
||||||
assert job["response"]["data"] is None
|
assert job["response"]["data"] is None
|
||||||
|
|
||||||
|
|
||||||
|
def test_history_db_analytics_groups_by_category_and_day(tmp_path: Path) -> None:
|
||||||
|
db = HistoryDB(tmp_path / "screenjob_test_analytics.db")
|
||||||
|
|
||||||
|
db.create_job(
|
||||||
|
job_id="job_browser_ok",
|
||||||
|
objective="Open amazon.de and checkout",
|
||||||
|
model="gpt-5.4-mini",
|
||||||
|
created_at="2026-05-27T00:00:01Z",
|
||||||
|
safety_override=False,
|
||||||
|
disabled_tools=[],
|
||||||
|
)
|
||||||
|
db.update_job("job_browser_ok", status="completed", steps=4, estimated_cost_usd=0.12)
|
||||||
|
|
||||||
|
db.create_job(
|
||||||
|
job_id="job_browser_fail",
|
||||||
|
objective="Open website and login",
|
||||||
|
model="gpt-5.4-mini",
|
||||||
|
created_at="2026-05-28T00:00:01Z",
|
||||||
|
safety_override=False,
|
||||||
|
disabled_tools=[],
|
||||||
|
)
|
||||||
|
db.update_job("job_browser_fail", status="failed", steps=6, estimated_cost_usd=0.24)
|
||||||
|
|
||||||
|
db.create_job(
|
||||||
|
job_id="job_terminal_ok",
|
||||||
|
objective="Run a shell command to inspect files",
|
||||||
|
model="gpt-5.4-mini",
|
||||||
|
created_at="2026-05-28T00:00:02Z",
|
||||||
|
safety_override=False,
|
||||||
|
disabled_tools=[],
|
||||||
|
)
|
||||||
|
db.update_job("job_terminal_ok", status="completed", steps=10, estimated_cost_usd=0.05)
|
||||||
|
|
||||||
|
analytics = db.analytics()
|
||||||
|
assert analytics["total_jobs"] == 3
|
||||||
|
assert analytics["finished_jobs"] == 3
|
||||||
|
assert analytics["completed_jobs"] == 2
|
||||||
|
assert analytics["failed_jobs"] == 1
|
||||||
|
assert analytics["success_rate"] == 66.67
|
||||||
|
assert analytics["avg_steps"] == 6.67
|
||||||
|
assert analytics["avg_cost_usd"] == 0.136667
|
||||||
|
|
||||||
|
browser = next(row for row in analytics["by_category"] if row["label"] == "Browser / web")
|
||||||
|
terminal = next(row for row in analytics["by_category"] if row["label"] == "Files / terminal")
|
||||||
|
assert browser["finished_jobs"] == 2
|
||||||
|
assert browser["success_rate"] == 50.0
|
||||||
|
assert browser["avg_steps"] == 5.0
|
||||||
|
assert terminal["success_rate"] == 100.0
|
||||||
|
|
||||||
|
assert [row["label"] for row in analytics["timeline"]] == ["2026-05-27", "2026-05-28"]
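Reviewer note: the top-level numbers asserted above follow from simple arithmetic over the three finished jobs. The sketch below reproduces them under the assumption that averages are taken over finished jobs only, as in the `FakeJobManager.analytics` helper in this diff. `summarize` and the flat `cost` key are hypothetical names for illustration, not the `HistoryDB` API.

```python
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def summarize(jobs: list[dict]) -> dict:
    # Only jobs in a terminal status count toward rates and averages.
    finished = [job for job in jobs if job["status"] in TERMINAL_STATUSES]
    completed = sum(1 for job in finished if job["status"] == "completed")
    n = len(finished)
    return {
        "total_jobs": len(jobs),
        "finished_jobs": n,
        "completed_jobs": completed,
        "success_rate": round(completed / n * 100, 2) if n else 0.0,
        "avg_steps": round(sum(j["steps"] for j in finished) / n, 2) if n else None,
        "avg_cost_usd": round(sum(j["cost"] for j in finished) / n, 6) if n else None,
    }


# Same three jobs as the test: 2 of 3 completed, 20 steps and 0.41 USD in total.
summary = summarize([
    {"status": "completed", "steps": 4, "cost": 0.12},
    {"status": "failed", "steps": 6, "cost": 0.24},
    {"status": "completed", "steps": 10, "cost": 0.05},
])
```

Here 2 / 3 gives a success rate of 66.67, 20 / 3 gives 6.67 average steps, and 0.41 / 3 gives 0.136667 average cost, matching the assertions.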
|
||||||
|
|||||||
13
todo.md
@@ -4,21 +4,20 @@
|
|||||||
- [Bug] Enforce single active desktop-control run (or a strict queue) so concurrent jobs cannot fight over the same mouse/keyboard/screen session.
|
- [Bug] Enforce single active desktop-control run (or a strict queue) so concurrent jobs cannot fight over the same mouse/keyboard/screen session.
|
||||||
- [Bug] Fix run artifact collisions in `setup_artifacts()` (`run_id` is second-granularity, so two jobs in the same second can share/overwrite the same directory).
|
- [Bug] Fix run artifact collisions in `setup_artifacts()` (`run_id` is second-granularity, so two jobs in the same second can share/overwrite the same directory).
|
||||||
- [Bug] Remove global logger handler clobbering in `setup_logger()` (`logging.getLogger("screenjob").handlers.clear()` breaks concurrent runs and can redirect logs to the wrong file).
|
- [Bug] Remove global logger handler clobbering in `setup_logger()` (`logging.getLogger("screenjob").handlers.clear()` breaks concurrent runs and can redirect logs to the wrong file).
|
||||||
- [Bug] More consistent clicks and more uses of enhance images.
|
- [x] More consistent clicks and more uses of enhance images.
|
||||||
|
|
||||||
## P1
|
## P1
|
||||||
|
- [x] Move ui.py into a separate HTML file and JS file.
|
||||||
|
- [x] Think harder using effort "medium" by default.
|
||||||
|
- [x] Decay old screenshots after 3 to 5 steps to save tokens and reduce confusion in the agents.
|
||||||
- [Bug] Validate `disabled_tools` against an allowlist and disallow disabling critical completion flow (`task_complete`) to avoid guaranteed step-limit failures.
|
- [Bug] Validate `disabled_tools` against an allowlist and disallow disabling critical completion flow (`task_complete`) to avoid guaranteed step-limit failures.
|
||||||
- [Bug] Improve `execute_command` cancellation/timeout handling to terminate full process trees, not only the parent shell process.
|
- [Bug] Improve `execute_command` cancellation/timeout handling to terminate full process trees, not only the parent shell process.
|
||||||
- [Bug] Reduce API/UI token leakage risk by moving away from query-string token usage for websocket/artifact access where possible.
|
|
||||||
- [Idea] Add per-token rate limiting and request size limits (objective length + payload bounds) for API hardening.
|
|
||||||
|
|
||||||
## P2
|
## P2
|
||||||
- [Bug] Fix UI event style mapping mismatch (`tool_called` events are emitted, but UI color map expects `tool_call`).
|
- [Bug] Fix UI event style mapping mismatch (`tool_called` events are emitted, but UI color map expects `tool_call`).
|
||||||
- [Idea] Reduce monitoring UI backend load by throttling websocket-triggered refreshes and avoiding full job/event re-fetch on every event.
|
- [Idea] Reduce monitoring UI backend load by throttling websocket-triggered refreshes and avoiding full job/event re-fetch on every event.
|
||||||
- [Idea] Add cursor-based pagination for jobs/events instead of large fixed limits.
|
|
||||||
- [Idea] Support offline/self-hosted UI assets (bundle Tailwind instead of CDN dependency).
|
|
||||||
- [Idea] Add retention controls/pruning for old runs, screenshots, and DB rows.
|
- [Idea] Add retention controls/pruning for old runs, screenshots, and DB rows.
|
||||||
|
|
||||||
## P3
|
## P3
|
||||||
- [Idea] Add Replay Mode; Ability to replay a session by reconstructing the screen from screenshots and overlaying tool calls and click and type events.
|
- [x] Add Replay Mode; Ability to replay a session by reconstructing the screen from screenshots and overlaying tool calls and click and type events.
|
||||||
- [Idea] Add lightweight analytics dashboards (success rate by objective category, avg steps/cost over time).
|
- [x] Add lightweight analytics dashboards (success rate by objective category, avg steps/cost over time).
|
||||||
|
|||||||