Add lightweight analytics dashboard

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
feat: (literally) "enhance" functionality with new parameters and improved image processing
2026-05-27 22:34:26 +02:00 · 2026-05-27 22:14:32 +02:00 · 2026-05-27 22:05:57 +02:00 · 2026-05-27 22:04:15 +02:00 · 2026-05-27 22:02:20 +02:00
15 changed files with 977 additions and 25 deletions
--- a/README.md
+++ b/README.md
@@ -156,13 +156,21 @@ Each job payload includes:
 - Read-only dashboard (no run controls)
 - Requires token input
 - Live updates via `/ws`
+- Analytics dashboards for success rate by objective category and daily averages
 - Set `DISABLE_UI=true` to disable UI

+### Analytics API
+
+- `GET /api/analytics`
+- Returns overall totals, success rates by objective category (`by_category`), and daily average steps and cost (`timeline`)
+
 ## Agent Instructions (Practical)

 - Prefer `execute_command` for deterministic actions (opening URLs, filesystem checks).
 - Use `see_screen` before UI interaction.
- Use `enhance` when text is unclear.
+- Use `enhance` before clicking small/ambiguous targets; prefer `region="small"` for compact controls.
+- Call `enhance` with `mode="text"` for tiny labels/text, or `mode="ui"` (the default) for general UI.
+- Optionally pass `scale` (2-6) to `enhance` to override the region's default zoom.
 - Use `press_key` for non-text keys (Enter, Tab, arrows, Escape).
 - For shortcuts, use one `press_key` call with combo syntax (example: `win+r`).
 - Use `click` offsets via `offset_up/down/left/right` and optional `sleep_after_seconds`.
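A minimal sketch of how a client might read the `GET /api/analytics` response described in the README hunk above. The field names (`by_category`, `timeline`, `success_rate`, `avg_steps`, `avg_cost_usd`) come from `HistoryDB.analytics()` later in this diff; the helper `summarize_analytics` and the sample payload are hypothetical and exist only for illustration.

```python
from typing import Any


def summarize_analytics(payload: dict[str, Any]) -> list[str]:
    """Render one line overall, one per category, and one per day."""
    lines = [
        f"overall: {payload['success_rate']:.1f}% "
        f"({payload['completed_jobs']}/{payload['finished_jobs']} finished)"
    ]
    for row in payload.get("by_category", []):
        lines.append(
            f"{row['label']}: {row['success_rate']:.1f}% "
            f"({row['completed_jobs']}/{row['finished_jobs']} finished)"
        )
    for row in payload.get("timeline", []):
        # avg_steps / avg_cost_usd are None when no finished job reported them.
        steps = "n/a" if row["avg_steps"] is None else f"{row['avg_steps']:.2f}"
        cost = "n/a" if row["avg_cost_usd"] is None else f"${row['avg_cost_usd']:.4f}"
        lines.append(f"{row['label']}: avg_steps={steps} avg_cost={cost}")
    return lines


# Illustrative payload only; the numbers are made up for the example.
SAMPLE = {
    "success_rate": 75.0,
    "completed_jobs": 3,
    "finished_jobs": 4,
    "by_category": [
        {"label": "Browser / web", "success_rate": 100.0, "completed_jobs": 2, "finished_jobs": 2},
        {"label": "Other", "success_rate": 50.0, "completed_jobs": 1, "finished_jobs": 2},
    ],
    "timeline": [
        {"label": "2026-05-26", "avg_steps": 7.5, "avg_cost_usd": 0.0123},
        {"label": "2026-05-27", "avg_steps": None, "avg_cost_usd": None},
    ],
}

for line in summarize_analytics(SAMPLE):
    print(line)
```

Averages are nullable in the payload, so a consumer probably wants an explicit fallback such as the `n/a` used here rather than treating a missing average as zero.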
--- a/SKILL.md
+++ b/SKILL.md
@@ -37,6 +37,14 @@ Keyboard combo rule:
 - For shortcuts, use one `press_key` call with combo syntax, for example: `win+r`, `ctrl+shift+esc`.
 - Do not split modifier combos into separate calls.

+Enhance-first click rule:
+
+- Before clicking small buttons/icons, dense UI, or ambiguous targets, call `enhance` first.
+- Preferred preset for tiny controls: `enhance(coordinate, region="small", mode="ui")`.
+- For tiny labels/text: use `mode="text"` to improve readability.
+- Optional zoom control: set `scale` from `2` to `6` (defaults by region: `small`=4, `medium`=3, `large`=2).
+- After checking the enhanced image, click using the same target coordinate (or a small directional offset if needed).
+
 Verification rule:

 - Before `task_complete`, verify actual on-screen content matches the expected outcome.
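The enhance-first rule above relies on the crop geometry that `_tool_enhance` computes in `src/agent.py`. The following is a rough, standalone sketch of that arithmetic, assuming `clamp(value, low, high)` behaves like the usual min/max clamp (its definition is not part of this diff). `enhance_geometry` and `enhanced_to_screen` are hypothetical helper names; the second one shows how a pixel in the enhanced image maps back to an absolute screen coordinate.

```python
# Presets copied from the _tool_enhance hunk in this diff.
REGION_HALF = {"small": 96, "medium": 160, "large": 240}
DEFAULT_SCALE = {"small": 4, "medium": 3, "large": 2}


def clamp(value: int, low: int, high: int) -> int:
    # Assumption: same semantics as the project's clamp helper.
    return max(low, min(high, value))


def enhance_geometry(x, y, width, height, region="small", scale=None):
    """Return the crop box, zoom factor, and marker position for a request."""
    half = REGION_HALF[region]
    chosen = scale if scale and scale > 0 else DEFAULT_SCALE[region]
    chosen = clamp(chosen, 2, 6)
    sx = clamp(x, 0, max(0, width - 1))
    sy = clamp(y, 0, max(0, height - 1))
    left = clamp(sx - half, 0, width - 1)
    top = clamp(sy - half, 0, height - 1)
    right = clamp(sx + half, left + 1, width)
    bottom = clamp(sy + half, top + 1, height)
    return {
        "source_box": (left, top, right, bottom),
        "scale": chosen,
        "target_pixel": ((sx - left) * chosen, (sy - top) * chosen),
    }


def enhanced_to_screen(px, py, geometry):
    """Map a pixel in the enhanced image back to absolute screen pixels."""
    left, top, _, _ = geometry["source_box"]
    scale = geometry["scale"]
    return left + px // scale, top + py // scale


geometry = enhance_geometry(500, 300, 1920, 1080)
print(geometry)
print(enhanced_to_screen(*geometry["target_pixel"], geometry))
```

Near a screen edge the crop is clipped, so the crosshair is no longer centered in the enhanced image; mapping back through `source_box` and `scale` still recovers the requested coordinate.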
--- a/src/agent.py
+++ b/src/agent.py
@@ -9,7 +9,7 @@ import traceback
 from typing import Any, Callable

 from openai import OpenAI
-from PIL import Image, ImageEnhance, ImageFilter, ImageOps
+from PIL import Image, ImageDraw, ImageEnhance, ImageFilter, ImageOps

 from .models import AgentResult, RunArtifacts, RuntimeOptions, UsageSummary
 from .pricing import estimate_cost_usd
@@ -34,7 +34,8 @@ Rules:
   - launching apps or running terminal checks
 3) For UI tasks, inspect with see_screen before clicking/typing.
 4) Coordinates are absolute screen pixels (x, y) from top-left.
-5) Use enhance(coordinate) when text/UI is unclear.
+5) Use enhance before risky clicks: small buttons/icons, dense UI, or whenever you are not highly confident about the target.
+5a) For tiny controls use enhance(coordinate, region="small", mode="ui"). For tiny text use mode="text".
 6) For keyboard-heavy interactions, prefer press_key for special keys.
 6a) For key combinations, call press_key once with combo syntax (example: "win+r", "ctrl+shift+esc"). Do not split modifier combos across separate calls.
 7) You may call multiple tools in one step. If needed, do click then sleep.
@@ -76,11 +77,14 @@ class ScreenJobAgent:
        self.final_data: Any | None = None
        self.previous_response_id: str | None = None
        self.usage = UsageSummary()
+        self.objective = ""

        self.last_screen_data_url: str | None = None
        self.last_screen_meta: dict[str, Any] | None = None
        self.click_history: list[tuple[int, int, float]] = []
        self.disabled_tools = {tool.strip() for tool in (options.disable_tools or set()) if tool.strip()}
+        self.recent_tool_summaries: list[str] = []
+        self.last_context_compact_step = 0

    def _emit(self, event_type: str, payload: dict[str, Any]) -> None:
        if self.event_callback is None:
@@ -192,7 +196,10 @@ class ScreenJobAgent:
            {
                "type": "function",
                "name": "enhance",
-                "description": "Create enhanced zoom around a coordinate for readability.",
+                "description": (
+                    "Create enhanced zoom around a coordinate for readability and precise targeting. "
+                    "Prefer this before clicking tiny or ambiguous UI targets."
+                ),
                "parameters": {
                    "type": "object",
                    "properties": {
@@ -204,7 +211,19 @@ class ScreenJobAgent:
                            },
                            "required": ["x", "y"],
                            "additionalProperties": False,
-                        }
+                        },
+                        "region": {
+                            "type": "string",
+                            "enum": ["small", "medium", "large"],
+                        },
+                        "mode": {
+                            "type": "string",
+                            "enum": ["ui", "text"],
+                        },
+                        "scale": {
+                            "type": ["integer", "string"],
+                            "description": "Zoom factor from 2 to 6. Default depends on region: small=4, medium=3, large=2.",
+                        },
                    },
                    "required": ["coordinate"],
                    "additionalProperties": False,
@@ -352,6 +371,23 @@ class ScreenJobAgent:
            sec = max_seconds
        return sec

+    def _parse_int(self, value: Any, default: int = 0) -> int:
+        if value is None:
+            return default
+        if isinstance(value, bool):
+            return int(value)
+        if isinstance(value, int):
+            return value
+        if isinstance(value, float):
+            return int(round(value))
+        text = str(value).strip()
+        if not text:
+            return default
+        try:
+            return int(round(float(text)))
+        except Exception:  # noqa: BLE001
+            return default
+
    def _tool_see_screen(self, _: dict[str, Any]) -> dict[str, Any]:
        image, meta = self._capture_screen(with_grid=True)
        out_path = self.artifacts.shots_dir / f"screen_step_{self.step:03d}.png"
@@ -369,34 +405,106 @@ class ScreenJobAgent:

    def _tool_enhance(self, args: dict[str, Any]) -> dict[str, Any]:
        coord = args.get("coordinate") or {}
-        x = int(coord.get("x", 0))
-        y = int(coord.get("y", 0))
+        requested_x = self._parse_int(coord.get("x", 0), default=0)
+        requested_y = self._parse_int(coord.get("y", 0), default=0)
+        region = str(args.get("region", "small") or "small").strip().lower()
+        mode = str(args.get("mode", "ui") or "ui").strip().lower()
+        if region not in {"small", "medium", "large"}:
+            region = "small"
+        if mode not in {"ui", "text"}:
+            mode = "ui"
+
+        region_half_by_preset = {
+            "small": 96,
+            "medium": 160,
+            "large": 240,
+        }
+        default_scale_by_region = {
+            "small": 4,
+            "medium": 3,
+            "large": 2,
+        }
+        raw_scale = self._parse_int(args.get("scale"), default=0)
+        scale = raw_scale if raw_scale > 0 else default_scale_by_region[region]
+        scale = clamp(scale, 2, 6)
+
        base, base_meta = self._capture_screen(with_grid=False)
        width, height = base.size

-        region_half = 180
-        left = clamp(x - region_half, 0, width - 1)
-        top = clamp(y - region_half, 0, height - 1)
-        right = clamp(x + region_half, left + 1, width)
-        bottom = clamp(y + region_half, top + 1, height)
+        source_x = clamp(requested_x, 0, max(0, width - 1))
+        source_y = clamp(requested_y, 0, max(0, height - 1))
+        region_half = region_half_by_preset[region]
+        left = clamp(source_x - region_half, 0, width - 1)
+        top = clamp(source_y - region_half, 0, height - 1)
+        right = clamp(source_x + region_half, left + 1, width)
+        bottom = clamp(source_y + region_half, top + 1, height)

        crop = base.crop((left, top, right, bottom))
-        upscaled = crop.resize((crop.width * 2, crop.height * 2), Image.Resampling.BICUBIC)
-        enhanced = ImageOps.autocontrast(upscaled)
-        enhanced = ImageEnhance.Sharpness(enhanced).enhance(2.0)
-        enhanced = ImageEnhance.Contrast(enhanced).enhance(1.25)
-        enhanced = enhanced.filter(ImageFilter.UnsharpMask(radius=1.8, percent=180, threshold=2))
+        out_w = max(2, crop.width * scale)
+        out_h = max(2, crop.height * scale)
+        upscaled = crop.resize((out_w, out_h), Image.Resampling.LANCZOS)

-        out_path = self.artifacts.enhance_dir / f"enhance_step_{self.step:03d}_{x}_{y}.png"
+        if mode == "text":
+            text_view = ImageOps.grayscale(upscaled)
+            text_view = ImageOps.autocontrast(text_view, cutoff=1)
+            text_view = ImageOps.equalize(text_view)
+            text_view = ImageEnhance.Contrast(text_view).enhance(1.35)
+            text_view = ImageEnhance.Sharpness(text_view).enhance(2.1)
+            processed = text_view.filter(ImageFilter.UnsharpMask(radius=1.2, percent=160, threshold=1)).convert("RGB")
+        else:
+            ui_view = ImageOps.autocontrast(upscaled, cutoff=1)
+            ui_view = ImageEnhance.Contrast(ui_view).enhance(1.2)
+            ui_view = ImageEnhance.Sharpness(ui_view).enhance(1.8)
+            processed = ui_view.filter(ImageFilter.UnsharpMask(radius=1.4, percent=150, threshold=2)).convert("RGB")
+
+        edges = upscaled.convert("L").filter(ImageFilter.FIND_EDGES)
+        edges = ImageOps.autocontrast(edges, cutoff=4)
+        edge_overlay = ImageOps.colorize(edges, black=(0, 0, 0), white=(60, 220, 255))
+        enhanced = Image.blend(processed, edge_overlay, alpha=0.18)
+
+        cx = clamp((source_x - left) * scale, 0, max(0, enhanced.width - 1))
+        cy = clamp((source_y - top) * scale, 0, max(0, enhanced.height - 1))
+        draw = ImageDraw.Draw(enhanced)
+        draw.rectangle([0, 0, enhanced.width - 1, enhanced.height - 1], outline=(255, 80, 80), width=2)
+        ring_radius = max(10, int(6 * scale / 2))
+        arm_len = max(14, int(9 * scale / 2))
+        gap = max(4, int(2 * scale / 2))
+        line_width = max(2, int(scale / 2))
+        draw.ellipse(
+            [cx - ring_radius, cy - ring_radius, cx + ring_radius, cy + ring_radius],
+            outline=(255, 80, 80),
+            width=line_width,
+        )
+        draw.line([(max(0, cx - arm_len), cy), (max(0, cx - gap), cy)], fill=(255, 80, 80), width=line_width)
+        draw.line(
+            [(min(enhanced.width - 1, cx + gap), cy), (min(enhanced.width - 1, cx + arm_len), cy)],
+            fill=(255, 80, 80),
+            width=line_width,
+        )
+        draw.line([(cx, max(0, cy - arm_len)), (cx, max(0, cy - gap))], fill=(255, 80, 80), width=line_width)
+        draw.line(
+            [(cx, min(enhanced.height - 1, cy + gap)), (cx, min(enhanced.height - 1, cy + arm_len))],
+            fill=(255, 80, 80),
+            width=line_width,
+        )
+
+        out_path = self.artifacts.enhance_dir / (
+            f"enhance_step_{self.step:03d}_{source_x}_{source_y}_{region}_{mode}_x{scale}.png"
+        )
        self._save_image(enhanced, out_path)
        data_url = image_to_data_url(enhanced, "PNG")

        meta = {
            "captured_at": utc_now_iso(),
-            "source_coord": {"x": x, "y": y},
+            "requested_coord": {"x": requested_x, "y": requested_y},
+            "source_coord": {"x": source_x, "y": source_y},
            "source_box": {"left": left, "top": top, "right": right, "bottom": bottom},
-            "scale": 2,
+            "region": region,
+            "mode": mode,
+            "scale": scale,
            "path": str(out_path.resolve()),
+            "size": {"width": enhanced.width, "height": enhanced.height},
+            "target_pixel": {"x": cx, "y": cy},
            "screen_size": {"width": width, "height": height},
            "base_capture_meta": base_meta,
        }
@@ -628,6 +736,9 @@ class ScreenJobAgent:
            return {"_raw": raw}

    def _call_model(self, input_items: list[dict[str, Any]]) -> Any:
+        effort = str(self.options.reasoning_effort or "medium").strip().lower()
+        if effort not in {"low", "medium", "high"}:
+            effort = "medium"
        return self.client.responses.create(
            model=self.options.model,
            instructions=SYSTEM_PROMPT,
@@ -636,9 +747,85 @@ class ScreenJobAgent:
            previous_response_id=self.previous_response_id,
            parallel_tool_calls=True,
            max_tool_calls=8,
+            reasoning={"effort": effort},
        )

+    def _record_tool_summary(self, tool_name: str, result: dict[str, Any]) -> None:
+        ok = bool(result.get("ok"))
+        status = "ok" if ok else "fail"
+        summary = f"step={self.step} tool={tool_name} status={status}"
+        if tool_name == "click":
+            clicked = result.get("clicked") if isinstance(result.get("clicked"), dict) else {}
+            x = clicked.get("x")
+            y = clicked.get("y")
+            if isinstance(x, int) and isinstance(y, int):
+                summary = f"{summary} at=({x},{y})"
+        elif tool_name == "type":
+            typed_length = int(result.get("typed_length", 0) or 0)
+            summary = f"{summary} typed_length={typed_length}"
+        elif tool_name == "press_key":
+            key = str(result.get("key") or "").strip()
+            if key:
+                summary = f"{summary} key={key}"
+        elif tool_name == "execute_command":
+            exit_code = result.get("exit_code")
+            if exit_code is not None:
+                summary = f"{summary} exit_code={exit_code}"
+        elif tool_name in {"see_screen", "enhance"}:
+            meta = result.get("meta") if isinstance(result.get("meta"), dict) else {}
+            path = str(meta.get("path") or result.get("path") or "").strip()
+            if path:
+                summary = f"{summary} image={path}"
+        if not ok:
+            error_text = str(result.get("error") or "").strip()
+            if error_text:
+                summary = f"{summary} error={error_text[:140]}"
+        self.recent_tool_summaries.append(summary)
+        self.recent_tool_summaries = self.recent_tool_summaries[-20:]
+
+    def _should_compact_context(self) -> bool:
+        interval = max(0, int(self.options.screen_context_decay_steps or 0))
+        if interval <= 0:
+            return False
+        if self.previous_response_id is None:
+            return False
+        return (self.step - self.last_context_compact_step) >= interval
+
+    def _build_compacted_pending_input(self) -> list[dict[str, Any]]:
+        recent = self.recent_tool_summaries[-8:]
+        lines = "\n".join(f"- {line}" for line in recent) if recent else "- No recent tool activity."
+        content = (
+            "Context compaction activated to decay stale screenshots and reduce token usage.\n"
+            f"JOB: {self.objective}\n"
+            f"Current step: {self.step}\n"
+            "Recent tool activity:\n"
+            f"{lines}\n"
+            "Continue execution from the latest screen state. "
+            "Use tools only, and finish with task_complete when done."
+        )
+        compacted_input: list[dict[str, Any]] = [
+            {
+                "role": "user",
+                "content": [
+                    {
+                        "type": "input_text",
+                        "text": content,
+                    }
+                ],
+            }
+        ]
+        if self.last_screen_data_url and self.last_screen_meta:
+            compacted_input.append(
+                self._build_visual_message(
+                    "Current screen after context compaction",
+                    self.last_screen_data_url,
+                    self.last_screen_meta,
+                )
+            )
+        return compacted_input
+
    def run(self, job: str) -> AgentResult:
+        self.objective = job
        started_at = time.time()
        self.logger.info("Starting run_id=%s model=%s", self.artifacts.run_id, self.options.model)
        self.logger.info("Job: %s", job)
@@ -648,6 +835,8 @@ class ScreenJobAgent:
            {
                "run_id": self.artifacts.run_id,
                "model": self.options.model,
+                "reasoning_effort": self.options.reasoning_effort,
+                "screen_context_decay_steps": self.options.screen_context_decay_steps,
                "objective": job,
                "disabled_tools": sorted(self.disabled_tools),
            },
@@ -664,6 +853,8 @@ class ScreenJobAgent:
                            f"JOB: {job}\n"
                            "You are in an action loop. Prefer execute_command for deterministic actions. "
                            "For modifier shortcuts, use a single press_key combo (example: win+r). "
+                            "Before clicking tiny buttons/icons or dense UI areas, call enhance first "
+                            "(use region='small'; use mode='text' for tiny text labels). "
                            "You can return multiple tool calls in one step (example: click then sleep). "
                            "When done call task_complete(return=..., data=...). "
                            "Before task_complete, verify the screen content is what was expected "
@@ -692,6 +883,19 @@ class ScreenJobAgent:
            self.step += 1
            self.logger.info("---- Agent step %d/%d ----", self.step, self.options.max_steps)
            self._emit("step_started", {"step": self.step, "max_steps": self.options.max_steps})
+            if self._should_compact_context():
+                self.previous_response_id = None
+                pending_input = self._build_compacted_pending_input()
+                self.last_context_compact_step = self.step
+                self.logger.info("Compacted model context at step %d.", self.step)
+                self._emit(
+                    "context_compacted",
+                    {
+                        "step": self.step,
+                        "decay_steps": self.options.screen_context_decay_steps,
+                        "recent_tool_summaries": self.recent_tool_summaries[-8:],
+                    },
+                )
            try:
                response = self._call_model(pending_input)
                self._register_usage(response)
@@ -720,6 +924,8 @@ class ScreenJobAgent:
                                "text": (
                                    "No function call was returned. Continue by using tools. "
                                    "Use one press_key call for key combos like win+r. "
+                                    "Prefer enhance before clicking small/unclear targets "
+                                    "(region='small', mode='ui' or 'text'). "
                                    "You may call multiple tools in one step. "
                                    "Before task_complete, verify expected screen content with see_screen/enhance "
                                    "and include observed_result in data. "
@@ -763,6 +969,7 @@ class ScreenJobAgent:
                    name,
                    json.dumps(result, ensure_ascii=False)[:2500],
                )
+                self._record_tool_summary(name, result)
                self._emit("tool_result", {"step": self.step, "tool": name, "result": result})
                next_input.append(
                    {
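To make the cadence of `_should_compact_context` easier to reason about, here is a small simulation of which steps would trigger compaction for a given `screen_context_decay_steps`. It assumes that `previous_response_id` is populated after every model call, including the call made right after a compaction; that assignment is not visible in this diff. `compaction_steps` is a hypothetical name used only for this sketch.

```python
def compaction_steps(max_steps: int, decay_steps: int) -> list[int]:
    """Return the step numbers at which context would be compacted."""
    interval = max(0, int(decay_steps or 0))
    compacted: list[int] = []
    last_compact_step = 0
    has_previous_response = False  # assumption: set after each model call
    for step in range(1, max_steps + 1):
        due = (step - last_compact_step) >= interval
        if interval > 0 and has_previous_response and due:
            compacted.append(step)
            last_compact_step = step
        # The model is called on every step, compacted or not.
        has_previous_response = True
    return compacted


print(compaction_steps(12, 4))
print(compaction_steps(10, 0))
```

Under these assumptions the default of 4 compacts on every fourth step, and 0 disables compaction entirely, which matches the `--screen-context-decay-steps` help text.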
--- a/src/cli.py
+++ b/src/cli.py
@@ -28,6 +28,18 @@ def build_parser() -> argparse.ArgumentParser:
    parser.add_argument("--command-timeout", type=int, default=45, help="Timeout in seconds for execute_command.")
    parser.add_argument("--type-interval", type=float, default=0.02, help="Seconds between typed characters.")
    parser.add_argument("--click-pause", type=float, default=0.10, help="Mouse move duration before click.")
+    parser.add_argument(
+        "--reasoning-effort",
+        choices=["low", "medium", "high"],
+        default="medium",
+        help="Reasoning effort passed to the model.",
+    )
+    parser.add_argument(
+        "--screen-context-decay-steps",
+        type=int,
+        default=4,
+        help="Compact model context every N steps to decay old screenshots (0 disables).",
+    )
    parser.add_argument("--disable-tool", action="append", default=[], help="Disable a tool by name.")
    parser.add_argument("--skip-safety-check", action="store_true", help="Bypass pre-flight safety check.")
    parser.add_argument("--no-failsafe", action="store_true", help="Disable PyAutoGUI fail-safe.")
@@ -78,6 +90,8 @@ def main(argv: list[str] | None = None) -> int:
        command_timeout=args.command_timeout,
        type_interval=args.type_interval,
        click_pause=args.click_pause,
+        reasoning_effort=args.reasoning_effort,
+        screen_context_decay_steps=max(0, int(args.screen_context_decay_steps)),
        disable_tools=set(disabled_tools),
    )
    try:
--- a/src/models.py
+++ b/src/models.py
@@ -58,4 +58,6 @@ class RuntimeOptions:
    command_timeout: int = 45
    type_interval: float = 0.02
    click_pause: float = 0.10
+    reasoning_effort: str = "medium"
+    screen_context_decay_steps: int = 4
    disable_tools: set[str] | None = None
--- a/src/server.py
+++ b/src/server.py
@@ -16,6 +16,7 @@ from .config import AppConfig, load_app_config
 from .storage import HistoryDB
 from .task_manager import JobManager
 from .ui import monitoring_js_path, monitoring_page_html
+from .utils import utc_now_iso


 class CreateJobRequest(BaseModel):
@@ -25,6 +26,8 @@ class CreateJobRequest(BaseModel):
    command_timeout: int = Field(45, ge=1, le=600)
    type_interval: float = Field(0.02, ge=0.0, le=1.0)
    click_pause: float = Field(0.10, ge=0.0, le=2.0)
+    reasoning_effort: str = Field("medium", pattern="^(low|medium|high)$")
+    screen_context_decay_steps: int = Field(4, ge=0, le=50)
    disabled_tools: list[str] = Field(default_factory=list)
    safety_override: bool = False
    no_failsafe: bool = False
@@ -301,6 +304,8 @@ def create_app(config: AppConfig | None = None) -> FastAPI:
            command_timeout=payload.command_timeout,
            type_interval=payload.type_interval,
            click_pause=payload.click_pause,
+            reasoning_effort=payload.reasoning_effort,
+            screen_context_decay_steps=payload.screen_context_decay_steps,
            disabled_tools=payload.disabled_tools,
            safety_override=payload.safety_override,
            no_failsafe=payload.no_failsafe,
@@ -382,6 +387,12 @@ def create_app(config: AppConfig | None = None) -> FastAPI:
    def stats(_: None = Depends(require_token)) -> dict[str, Any]:
        return manager.stats()

+    @app.get("/api/analytics")
+    def analytics(_: None = Depends(require_token)) -> dict[str, Any]:
+        payload = manager.analytics()
+        payload["generated_at"] = utc_now_iso()
+        return payload
+
    if not app_config.disable_ui:
        @app.get("/", response_class=HTMLResponse)
        def ui_root() -> str:
--- a/src/storage.py
+++ b/src/storage.py
@@ -7,6 +7,39 @@ from pathlib import Path
 from typing import Any


+_TERMINAL_STATUSES = {"completed", "failed", "cancelled"}
+_CATEGORY_RULES: tuple[tuple[str, tuple[str, ...]], ...] = (
+    (
+        "Browser / web",
+        ("browser", "website", "webpage", "chrome", "url", "amazon", "google", "login", "shopping", "checkout", "orders"),
+    ),
+    (
+        "Files / terminal",
+        ("file", "folder", "directory", "terminal", "shell", "command", "cli", "script", "git", "repo", "install", "pip", "npm", "powershell", "bash"),
+    ),
+    (
+        "Writing / docs",
+        ("write", "summary", "summarize", "document", "docs", "report", "email", "message", "readme", "markdown", "note", "proposal"),
+    ),
+    (
+        "Data / analysis",
+        ("data", "analysis", "analyze", "csv", "spreadsheet", "sheet", "table", "chart", "dashboard", "metric", "metrics", "sql"),
+    ),
+    (
+        "Development / ops",
+        ("code", "bug", "fix", "test", "debug", "api", "backend", "frontend", "database", "deploy", "docker", "service", "build"),
+    ),
+)
+
+
+def _objective_category(objective: str) -> str:
+    text = objective.lower()
+    for category, keywords in _CATEGORY_RULES:
+        if any(keyword in text for keyword in keywords):
+            return category
+    return "Other"
+
+
 class HistoryDB:
    def __init__(self, db_path: Path) -> None:
        self.db_path = db_path
@@ -184,6 +217,131 @@ class HistoryDB:
            ).fetchone()
        return dict(totals) if totals else {}

+    def analytics(self) -> dict[str, Any]:
+        with self._connect() as conn:
+            rows = conn.execute(
+                """
+                SELECT job_id, objective, status, steps, estimated_cost_usd, created_at
+                FROM jobs
+                ORDER BY created_at ASC, job_id ASC
+                """
+            ).fetchall()
+
+        total_jobs = 0
+        finished_jobs = 0
+        completed_jobs = 0
+        failed_jobs = 0
+        cancelled_jobs = 0
+        steps_sum = 0
+        steps_count = 0
+        cost_sum = 0.0
+        cost_count = 0
+        by_category: dict[str, dict[str, Any]] = {}
+        by_day: dict[str, dict[str, Any]] = {}
+
+        def _bucket(target: dict[str, dict[str, Any]], key: str) -> dict[str, Any]:
+            bucket = target.setdefault(
+                key,
+                {
+                    "label": key,
+                    "total_jobs": 0,
+                    "finished_jobs": 0,
+                    "completed_jobs": 0,
+                    "failed_jobs": 0,
+                    "cancelled_jobs": 0,
+                    "steps_sum": 0,
+                    "steps_count": 0,
+                    "cost_sum": 0.0,
+                    "cost_count": 0,
+                },
+            )
+            return bucket
+
+        for row in rows:
+            total_jobs += 1
+            status = str(row["status"] or "")
+            finished = status in _TERMINAL_STATUSES
+            completed = status == "completed"
+            objective = str(row["objective"] or "")
+            category = _objective_category(objective)
+            created_at = str(row["created_at"] or "")
+            day = created_at[:10] if len(created_at) >= 10 else created_at or "unknown"
+
+            category_bucket = _bucket(by_category, category)
+            day_bucket = _bucket(by_day, day)
+            for bucket in (category_bucket, day_bucket):
+                bucket["total_jobs"] += 1
+
+            if not finished:
+                continue
+
+            finished_jobs += 1
+            if completed:
+                completed_jobs += 1
+            elif status == "failed":
+                failed_jobs += 1
+            elif status == "cancelled":
+                cancelled_jobs += 1
+
+            steps = row["steps"]
+            if steps is not None:
+                step_value = int(steps)
+                steps_sum += step_value
+                steps_count += 1
+                for bucket in (category_bucket, day_bucket):
+                    bucket["steps_sum"] += step_value
+                    bucket["steps_count"] += 1
+
+            estimated_cost = row["estimated_cost_usd"]
+            if estimated_cost is not None:
+                cost_value = float(estimated_cost)
+                cost_sum += cost_value
+                cost_count += 1
+                for bucket in (category_bucket, day_bucket):
+                    bucket["cost_sum"] += cost_value
+                    bucket["cost_count"] += 1
+
+            for bucket in (category_bucket, day_bucket):
+                bucket["finished_jobs"] += 1
+                if completed:
+                    bucket["completed_jobs"] += 1
+                elif status == "failed":
+                    bucket["failed_jobs"] += 1
+                elif status == "cancelled":
+                    bucket["cancelled_jobs"] += 1
+
+        def _finalize(bucket: dict[str, Any]) -> dict[str, Any]:
+            finished = bucket["finished_jobs"]
+            return {
+                "label": bucket["label"],
+                "total_jobs": bucket["total_jobs"],
+                "finished_jobs": finished,
+                "completed_jobs": bucket["completed_jobs"],
+                "failed_jobs": bucket["failed_jobs"],
+                "cancelled_jobs": bucket["cancelled_jobs"],
+                "success_rate": round((bucket["completed_jobs"] / finished) * 100, 2) if finished else 0.0,
+                "avg_steps": round(bucket["steps_sum"] / bucket["steps_count"], 2) if bucket["steps_count"] else None,
+                "avg_cost_usd": round(bucket["cost_sum"] / bucket["cost_count"], 6) if bucket["cost_count"] else None,
+            }
+
+        category_rows = [_finalize(bucket) for bucket in by_category.values()]
+        category_rows.sort(key=lambda item: (-item["success_rate"], item["label"]))
+        day_rows = [_finalize(bucket) for bucket in by_day.values()]
+        day_rows.sort(key=lambda item: item["label"])
+
+        return {
+            "total_jobs": total_jobs,
+            "finished_jobs": finished_jobs,
+            "completed_jobs": completed_jobs,
+            "failed_jobs": failed_jobs,
+            "cancelled_jobs": cancelled_jobs,
+            "success_rate": round((completed_jobs / finished_jobs) * 100, 2) if finished_jobs else 0.0,
+            "avg_steps": round(steps_sum / steps_count, 2) if steps_count else None,
+            "avg_cost_usd": round(cost_sum / cost_count, 6) if cost_count else None,
+            "by_category": category_rows,
+            "timeline": day_rows,
+        }
+
    def _row_to_job(self, row: sqlite3.Row) -> dict[str, Any]:
        disabled_tools: list[str] = []
        try:
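A short illustration of how the keyword rules in `_objective_category` resolve, using an abridged copy of the first two entries of `_CATEGORY_RULES`. The names `RULES` and `objective_category` are local stand-ins for this sketch. Two properties are worth keeping in mind when reading the dashboard: the first matching category wins, and matching is by substring rather than by whole word.

```python
# Abridged copy of the first two rules from _CATEGORY_RULES in this diff.
RULES = (
    ("Browser / web", ("browser", "website", "chrome", "url", "login")),
    ("Files / terminal", ("file", "folder", "terminal", "command", "cli", "git")),
)


def objective_category(objective: str) -> str:
    text = objective.lower()
    for category, keywords in RULES:
        # Substring test: "cli" also matches inside "click".
        if any(keyword in text for keyword in keywords):
            return category
    return "Other"


for objective in (
    "Open Chrome and search for flights",
    "Open the browser and download a file",
    "Click the Save button",
    "Play some music",
):
    print(f"{objective!r} -> {objective_category(objective)}")
```

If whole-word matching matters for the analytics, one possible refinement would be to compare against word tokens or use a regular expression with word boundaries; that is a suggestion, not something this change implements.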
--- a/src/task_manager.py
+++ b/src/task_manager.py
@@ -48,6 +48,8 @@ class JobManager:
        command_timeout: int = 45,
        type_interval: float = 0.02,
        click_pause: float = 0.10,
+        reasoning_effort: str = "medium",
+        screen_context_decay_steps: int = 4,
        disabled_tools: list[str] | None = None,
        safety_override: bool = False,
        no_failsafe: bool = False,
@@ -93,6 +95,8 @@ class JobManager:
                "command_timeout": command_timeout,
                "type_interval": type_interval,
                "click_pause": click_pause,
+                "reasoning_effort": reasoning_effort,
+                "screen_context_decay_steps": screen_context_decay_steps,
                "no_failsafe": no_failsafe,
                "cancel_event": cancel_event,
            },
@@ -121,6 +125,8 @@ class JobManager:
        command_timeout: int,
        type_interval: float,
        click_pause: float,
+        reasoning_effort: str,
+        screen_context_decay_steps: int,
        no_failsafe: bool,
        cancel_event: threading.Event,
    ) -> None:
@@ -218,6 +224,8 @@ class JobManager:
            command_timeout=command_timeout,
            type_interval=type_interval,
            click_pause=click_pause,
+            reasoning_effort=reasoning_effort,
+            screen_context_decay_steps=max(0, int(screen_context_decay_steps)),
            disable_tools=set(disabled_tools),
        )
        try:
@@ -343,6 +351,9 @@ class JobManager:
            stats["live_running_threads"] = sum(1 for job in self._running.values() if job.thread.is_alive())
        return stats

+    def analytics(self) -> dict[str, Any]:
+        return self.db.analytics()
+
    def _normalize_job_payload(self, job: dict[str, Any]) -> dict[str, Any]:
        response = job.get("response")
        if not isinstance(response, dict):
--- a/src/ui_assets/monitoring.html
+++ b/src/ui_assets/monitoring.html
@@ -21,6 +21,30 @@

    <section class="grid grid-cols-2 md:grid-cols-6 gap-3" id="stats"></section>

+    <section class="space-y-3">
+      <div class="flex items-center justify-between gap-3">
+        <h2 class="font-semibold">Analytics</h2>
+        <div id="analyticsMeta" class="text-[11px] text-slate-400"></div>
+      </div>
+      <div id="analyticsSummary" class="grid grid-cols-2 md:grid-cols-4 gap-3"></div>
+      <div class="grid grid-cols-1 xl:grid-cols-2 gap-4">
+        <div class="bg-slate-900/70 border border-slate-800 rounded-xl p-4 space-y-3">
+          <div class="flex items-center justify-between gap-3">
+            <h3 class="font-semibold text-sm">Success by Objective Category</h3>
+            <div id="analyticsCategorySummary" class="text-[11px] text-slate-400"></div>
+          </div>
+          <div id="analyticsCategories" class="space-y-3"></div>
+        </div>
+        <div class="bg-slate-900/70 border border-slate-800 rounded-xl p-4 space-y-3">
+          <div class="flex items-center justify-between gap-3">
+            <h3 class="font-semibold text-sm">Avg Steps / Cost Over Time</h3>
+            <div id="analyticsTrendSummary" class="text-[11px] text-slate-400"></div>
+          </div>
+          <div id="analyticsTrends" class="space-y-4"></div>
+        </div>
+      </div>
+    </section>
+
    <section class="grid grid-cols-1 lg:grid-cols-5 gap-4">
      <div class="lg:col-span-2 bg-slate-900/70 border border-slate-800 rounded-xl p-4">
        <div class="flex items-center justify-between mb-3">
--- a/src/ui_assets/monitoring.js
+++ b/src/ui_assets/monitoring.js
@@ -17,6 +17,12 @@ const replayPrevBtn = document.getElementById("replayPrevBtn");
 const replayNextBtn = document.getElementById("replayNextBtn");
 const replaySpeedEl = document.getElementById("replaySpeed");
 const replaySeekEl = document.getElementById("replaySeek");
+const analyticsMetaEl = document.getElementById("analyticsMeta");
+const analyticsSummaryEl = document.getElementById("analyticsSummary");
+const analyticsCategorySummaryEl = document.getElementById("analyticsCategorySummary");
+const analyticsCategoriesEl = document.getElementById("analyticsCategories");
+const analyticsTrendSummaryEl = document.getElementById("analyticsTrendSummary");
+const analyticsTrendsEl = document.getElementById("analyticsTrends");

 const state = {
  token: localStorage.getItem("screenjob_token") || "",
@@ -35,6 +41,7 @@ const state = {
  }
 };
 const manuallyClosedSockets = new WeakSet();
+const analyticsRefreshEvents = new Set(["job_finished", "job_failed", "job_rejected"]);
 tokenInput.value = state.token;

 function authHeaders() {
@@ -66,6 +73,197 @@ function renderStats(stats) {
  `).join("");
 }

+function escapeHtml(value) {
+  return String(value ?? "").replace(/[&<>"']/g, (ch) => ({
+    "&": "&amp;",
+    "<": "&lt;",
+    ">": "&gt;",
+    '"': "&quot;",
+    "'": "&#39;"
+  })[ch]);
+}
+
+function formatNumber(value, digits = 2) {
+  const num = Number(value);
+  return Number.isFinite(num) ? num.toFixed(digits) : "—";
+}
+
+function formatCurrency(value, digits = 6) {
+  const num = Number(value);
+  return Number.isFinite(num) ? `$${num.toFixed(digits)}` : "—";
+}
+
+function formatPercent(value) {
+  const num = Number(value);
+  return Number.isFinite(num) ? `${num.toFixed(1)}%` : "—";
+}
+
+function formatDateLabel(value) {
+  const dt = new Date(value);
+  if (Number.isNaN(dt.getTime())) return String(value || "—");
+  return dt.toLocaleDateString(undefined, { month: "short", day: "numeric" });
+}
+
+function renderMetricCard(label, value) {
+  return `
+    <div class="bg-slate-950 border border-slate-800 rounded-xl p-3">
+      <div class="text-[11px] uppercase tracking-wide text-slate-400">${escapeHtml(label)}</div>
+      <div class="text-xl font-semibold mt-1">${escapeHtml(value)}</div>
+    </div>
+  `;
+}
+
+function renderLineChart(title, points, options = {}) {
+  const color = options.color || "#22d3ee";
+  const valueLabel = options.valueLabel || "";
+  const sourcePoints = Array.isArray(points)
+    ? points.filter((point) => Number.isFinite(Number(point.value)))
+    : [];
+
+  if (!sourcePoints.length) {
+    return `
+      <div class="rounded-lg border border-slate-800 bg-slate-950/70 p-3">
+        <div class="flex items-center justify-between gap-3">
+          <div>
+            <div class="text-xs text-slate-400">${escapeHtml(title)}</div>
+            <div class="text-sm text-slate-200 font-semibold">No data yet</div>
+          </div>
+        </div>
+      </div>
+    `;
+  }
+
+  const width = 640;
+  const height = 220;
+  const margin = { top: 20, right: 18, bottom: 34, left: 44 };
+  const values = sourcePoints.map((point) => Number(point.value));
+  const minValue = Math.min(...values);
+  const maxValue = Math.max(...values);
+  const span = maxValue - minValue || 1;
+  const chartWidth = width - margin.left - margin.right;
+  const chartHeight = height - margin.top - margin.bottom;
+  const xStep = sourcePoints.length > 1 ? chartWidth / (sourcePoints.length - 1) : 0;
+  const coords = sourcePoints.map((point, index) => ({
+    x: margin.left + (index * xStep),
+    y: margin.top + ((maxValue - Number(point.value)) / span) * chartHeight,
+  }));
+  const linePath = coords.map((point, index) => `${index === 0 ? "M" : "L"} ${point.x} ${point.y}`).join(" ");
+  const baseline = height - margin.bottom;
+  const midIndex = Math.floor(sourcePoints.length / 2);
+  const xLabels = [
+    { index: 0, label: sourcePoints[0].label },
+    { index: midIndex, label: sourcePoints[midIndex].label },
+    { index: sourcePoints.length - 1, label: sourcePoints[sourcePoints.length - 1].label },
+  ].filter((item, index, array) => item.label && array.findIndex((candidate) => candidate.index === item.index) === index);
+  const minLabel = options.formatValue ? options.formatValue(minValue) : formatNumber(minValue, 2);
+  const maxLabel = options.formatValue ? options.formatValue(maxValue) : formatNumber(maxValue, 2);
+  const latest = sourcePoints[sourcePoints.length - 1];
+  const latestValue = options.formatValue ? options.formatValue(latest.value) : formatNumber(latest.value, 2);
+
+  return `
+    <div class="rounded-lg border border-slate-800 bg-slate-950/70 p-3 space-y-2">
+      <div class="flex items-center justify-between gap-3">
+        <div>
+          <div class="text-xs text-slate-400">${escapeHtml(title)}</div>
+          <div class="text-sm text-slate-200 font-semibold">${escapeHtml(latestValue)}${valueLabel ? ` <span class="text-slate-500 font-normal">${escapeHtml(valueLabel)}</span>` : ""}</div>
+        </div>
+        <div class="text-[11px] text-slate-400 text-right">
+          <div>${escapeHtml(sourcePoints.length)} points</div>
+          <div>${escapeHtml(minLabel)} - ${escapeHtml(maxLabel)}</div>
+        </div>
+      </div>
+      <svg viewBox="0 0 ${width} ${height}" class="w-full h-56">
+        ${Array.from({ length: 4 }, (_, idx) => {
+          const y = margin.top + (chartHeight / 3) * idx;
+          return `<line x1="${margin.left}" y1="${y}" x2="${width - margin.right}" y2="${y}" stroke="rgba(51, 65, 85, 0.7)" stroke-width="1" />`;
+        }).join("")}
+        <line x1="${margin.left}" y1="${baseline}" x2="${width - margin.right}" y2="${baseline}" stroke="rgba(71, 85, 105, 0.8)" stroke-width="1.5" />
+        <path d="${linePath}" fill="none" stroke="${color}" stroke-width="3" stroke-linecap="round" stroke-linejoin="round" />
+        ${coords.map((point) => `
+          <circle cx="${point.x}" cy="${point.y}" r="4.5" fill="${color}" />
+        `).join("")}
+        <text x="${margin.left - 8}" y="${margin.top + 4}" text-anchor="end" class="fill-slate-400 text-[10px]">${escapeHtml(maxLabel)}</text>
+        <text x="${margin.left - 8}" y="${baseline}" text-anchor="end" class="fill-slate-400 text-[10px]">${escapeHtml(minLabel)}</text>
+        ${xLabels.map((item) => `
+          <text x="${coords[item.index].x}" y="${height - 10}" text-anchor="middle" class="fill-slate-500 text-[10px]">${escapeHtml(formatDateLabel(item.label))}</text>
+        `).join("")}
+      </svg>
+    </div>
+  `;
+}
+
+function renderAnalytics(payload) {
+  const analytics = payload || {};
+  const categories = Array.isArray(analytics.by_category) ? analytics.by_category : [];
+  const timeline = Array.isArray(analytics.timeline) ? analytics.timeline : [];
+  const finishedCategories = categories.filter((row) => Number(row.finished_jobs || 0) > 0);
+
+  if (analyticsMetaEl) {
+    analyticsMetaEl.textContent = analytics.generated_at
+      ? `Updated ${new Date(analytics.generated_at).toLocaleString()}`
+      : "Historical snapshot";
+  }
+
+  analyticsSummaryEl.innerHTML = [
+    renderMetricCard("Finished Jobs", analytics.finished_jobs || 0),
+    renderMetricCard("Success Rate", formatPercent(analytics.success_rate)),
+    renderMetricCard("Avg Steps", formatNumber(analytics.avg_steps, 1)),
+    renderMetricCard("Avg Cost", formatCurrency(analytics.avg_cost_usd)),
+  ].join("");
+
+  analyticsCategorySummaryEl.textContent = finishedCategories.length
+    ? `${finishedCategories.length} categories`
+    : "No finished jobs yet";
+
+  if (finishedCategories.length) {
+    analyticsCategoriesEl.innerHTML = finishedCategories.map((row) => {
+      const successRate = Number(row.success_rate || 0);
+      const completed = Number(row.completed_jobs || 0);
+      const finished = Number(row.finished_jobs || 0);
+      const total = Number(row.total_jobs || 0);
+      const avgSteps = row.avg_steps == null ? "—" : formatNumber(row.avg_steps, 1);
+      const avgCost = row.avg_cost_usd == null ? "—" : formatCurrency(row.avg_cost_usd);
+      return `
+        <div class="rounded-lg border border-slate-800 bg-slate-950/70 p-3 space-y-2">
+          <div class="flex items-start justify-between gap-3">
+            <div>
+              <div class="font-medium">${escapeHtml(row.label || "Other")}</div>
+              <div class="text-[11px] text-slate-400">${finished} finished · ${completed} completed · ${total} total</div>
+            </div>
+            <div class="text-right">
+              <div class="text-base font-semibold">${formatPercent(successRate)}</div>
+              <div class="text-[11px] text-slate-500">success rate</div>
+            </div>
+          </div>
+          <div class="h-2 rounded bg-slate-800 overflow-hidden">
+            <div class="h-full rounded bg-cyan-400" style="width: ${Math.max(0, Math.min(successRate, 100))}%"></div>
+          </div>
+          <div class="grid grid-cols-2 gap-2 text-[11px] text-slate-300">
+            <div>Avg steps: ${escapeHtml(avgSteps)}</div>
+            <div>Avg cost: ${escapeHtml(avgCost)}</div>
+          </div>
+        </div>
+      `;
+    }).join("");
+  } else {
+    analyticsCategoriesEl.innerHTML = `
+      <div class="rounded-lg border border-dashed border-slate-800 bg-slate-950/70 p-4 text-sm text-slate-400">
+        No finished jobs yet.
+      </div>
+    `;
+  }
+
+  analyticsTrendSummaryEl.textContent = timeline.length ? `${timeline.length} days` : "No daily data yet";
+  analyticsTrendsEl.innerHTML = [
+    renderLineChart("Average steps per day", timeline.map((row) => ({ label: row.label, value: row.avg_steps })), { color: "#38bdf8" }),
+    renderLineChart("Average cost per day", timeline.map((row) => ({ label: row.label, value: row.avg_cost_usd })), {
+      color: "#34d399",
+      valueLabel: "USD",
+      formatValue: (value) => formatCurrency(value),
+    }),
+  ].join("");
+}
+
 function renderJobs() {
  jobListEl.innerHTML = state.jobs.map((job) => {
    const active = job.job_id === state.selectedJobId;
@@ -310,6 +508,11 @@ async function refreshStats() {
  renderStats(payload);
 }

+async function refreshAnalytics() {
+  const payload = await api("/api/analytics");
+  renderAnalytics(payload);
+}
+
 async function refreshJobDetail() {
  if (!state.selectedJobId) return;
  const [job, events, replay] = await Promise.all([
@@ -345,6 +548,9 @@ function connectWs() {
      }
      await refreshJobs();
      await refreshStats();
+      if (analyticsRefreshEvents.has(payload.event_type)) {
+        await refreshAnalytics();
+      }
    } catch (err) {
      console.error(err);
    }
@@ -362,6 +568,7 @@ function connectWs() {
 async function fullRefresh() {
  await refreshJobs();
  await refreshStats();
+  await refreshAnalytics();
  await refreshJobDetail();
 }

--- a/tests/test_agent_tools.py
+++ b/tests/test_agent_tools.py
@@ -91,6 +91,41 @@ def test_click_supports_directional_offsets(tmp_path: Path, monkeypatch) -> None
    assert click_result["clicked"] == {"x": 110, "y": 102}


+def test_enhance_defaults_to_small_ui_preset(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    result = agent._tool_enhance({"coordinate": {"x": 100, "y": 120}})
+
+    assert result["ok"] is True
+    meta = result["meta"]
+    assert meta["region"] == "small"
+    assert meta["mode"] == "ui"
+    assert meta["scale"] == 4
+    assert Path(meta["path"]).exists()
+    assert meta["target_pixel"]["x"] >= 0
+    assert meta["target_pixel"]["y"] >= 0
+
+
+def test_enhance_supports_text_mode_and_scale_clamp(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    result = agent._tool_enhance(
+        {
+            "coordinate": {"x": -99, "y": 9999},
+            "region": "medium",
+            "mode": "text",
+            "scale": 99,
+        }
+    )
+
+    assert result["ok"] is True
+    meta = result["meta"]
+    assert meta["region"] == "medium"
+    assert meta["mode"] == "text"
+    assert meta["scale"] == 6
+    assert meta["requested_coord"] == {"x": -99, "y": 9999}
+    assert meta["source_coord"] == {"x": 0, "y": 719}
+    assert Path(meta["path"]).exists()
+
+
 def test_press_key_supports_hotkey_combo(tmp_path: Path, monkeypatch) -> None:
    agent = _build_agent(tmp_path, monkeypatch)
    result = agent._tool_press_key({"key": "meta+r"})
@@ -98,3 +133,21 @@ def test_press_key_supports_hotkey_combo(tmp_path: Path, monkeypatch) -> None:
    assert result["key"] == "win+r"
    assert result["message"] == "Key combo executed."
    assert agent_module.pyautogui.last_hotkey == ("win", "r")
+
+
+def test_context_compaction_trigger_and_payload(tmp_path: Path, monkeypatch) -> None:
+    agent = _build_agent(tmp_path, monkeypatch)
+    agent.objective = "Open settings app"
+    agent.previous_response_id = "resp_123"
+    agent.step = 4
+    agent.last_context_compact_step = 0
+    agent.options.screen_context_decay_steps = 4
+    agent.recent_tool_summaries = ["step=1 tool=see_screen status=ok"]
+    agent.last_screen_data_url = "data:image/png;base64,abc"
+    agent.last_screen_meta = {"width": 1280, "height": 720, "path": "C:/tmp/frame.png"}
+
+    assert agent._should_compact_context() is True
+    compacted = agent._build_compacted_pending_input()
+    assert len(compacted) == 2
+    assert "Context compaction activated" in compacted[0]["content"][0]["text"]
+    assert "Open settings app" in compacted[0]["content"][0]["text"]
--- a/tests/test_cli.py
+++ b/tests/test_cli.py
@@ -29,7 +29,10 @@ def test_cli_emits_structured_return_and_data(monkeypatch: Any, capsys, tmp_path
    def fake_assess_task_safety(*_args, **_kwargs):
        return True, "safe", {"safe": True}

+    captured_kwargs: dict[str, Any] = {}
+
    def fake_run_job(*_args, **_kwargs):
+        captured_kwargs.update(_kwargs)
        result = AgentResult(
            completed=True,
            result="Done",
@@ -66,3 +69,5 @@ def test_cli_emits_structured_return_and_data(monkeypatch: Any, capsys, tmp_path
    assert payload["response"]["data"] == "file1.txt\nfile2.txt"
    assert payload["return"] == "Task completed successfully"
    assert payload["data"] == "file1.txt\nfile2.txt"
+    assert captured_kwargs["options"].reasoning_effort == "medium"
+    assert captured_kwargs["options"].screen_context_decay_steps == 4
--- a/tests/test_server_api.py
+++ b/tests/test_server_api.py
@@ -9,6 +9,24 @@ import src.server as server_module
 from src.config import AppConfig


+_TERMINAL_STATUSES = {"completed", "failed", "cancelled"}
+
+
+def _objective_category(objective: str) -> str:
+    text = objective.lower()
+    if any(keyword in text for keyword in ("browser", "website", "amazon", "google", "login", "shopping", "checkout", "orders")):
+        return "Browser / web"
+    if any(keyword in text for keyword in ("file", "folder", "directory", "terminal", "shell", "command", "cli", "script", "git", "repo", "install", "pip", "npm")):
+        return "Files / terminal"
+    if any(keyword in text for keyword in ("write", "summary", "document", "docs", "report", "email", "message", "readme", "markdown")):
+        return "Writing / docs"
+    if any(keyword in text for keyword in ("data", "analysis", "csv", "spreadsheet", "sheet", "table", "chart", "dashboard", "metric", "sql")):
+        return "Data / analysis"
+    if any(keyword in text for keyword in ("code", "bug", "fix", "test", "debug", "api", "backend", "frontend", "database", "deploy", "docker", "service", "build")):
+        return "Development / ops"
+    return "Other"
+
+
 class FakeJobManager:
    def __init__(self, *, config: AppConfig, db: Any, broadcast: Any = None) -> None:
        self.config = config
@@ -26,6 +44,8 @@ class FakeJobManager:
        command_timeout: int = 45,
        type_interval: float = 0.02,
        click_pause: float = 0.10,
+        reasoning_effort: str = "medium",
+        screen_context_decay_steps: int = 4,
        disabled_tools: list[str] | None = None,
        safety_override: bool = False,
        no_failsafe: bool = False,
@@ -37,6 +57,7 @@ class FakeJobManager:
        artifacts_dir.mkdir(parents=True, exist_ok=True)
        screenshot_path = artifacts_dir / "screen_step_001.png"
        screenshot_path.write_bytes(b"not-a-real-png")
+        created_at = f"2026-05-27T00:00:{self._counter:02d}Z"
        self.last_submit_payload = {
            "objective": objective,
            "model": selected_model,
@@ -46,6 +67,8 @@ class FakeJobManager:
            "command_timeout": command_timeout,
            "type_interval": type_interval,
            "click_pause": click_pause,
+            "reasoning_effort": reasoning_effort,
+            "screen_context_decay_steps": screen_context_decay_steps,
            "no_failsafe": no_failsafe,
        }
        self._jobs[job_id] = {
@@ -53,6 +76,10 @@ class FakeJobManager:
            "objective": objective,
            "model": selected_model,
            "status": "running",
+            "created_at": created_at,
+            "started_at": created_at,
+            "ended_at": None,
+            "steps": 1,
            "result": "Running",
            "response": {"return": "Running", "data": None},
            "return": "Running",
@@ -145,6 +172,114 @@ class FakeJobManager:
            "live_running_threads": 0,
        }

+    def analytics(self) -> dict[str, Any]:
+        by_category: dict[str, dict[str, Any]] = {}
+        by_day: dict[str, dict[str, Any]] = {}
+
+        def bucket(target: dict[str, dict[str, Any]], key: str) -> dict[str, Any]:
+            return target.setdefault(
+                key,
+                {
+                    "label": key,
+                    "total_jobs": 0,
+                    "finished_jobs": 0,
+                    "completed_jobs": 0,
+                    "failed_jobs": 0,
+                    "cancelled_jobs": 0,
+                    "steps_sum": 0,
+                    "steps_count": 0,
+                    "cost_sum": 0.0,
+                    "cost_count": 0,
+                },
+            )
+
+        total_jobs = 0
+        finished_jobs = 0
+        completed_jobs = 0
+        failed_jobs = 0
+        cancelled_jobs = 0
+        steps_sum = 0
+        steps_count = 0
+        cost_sum = 0.0
+        cost_count = 0
+
+        for job in self._jobs.values():
+            total_jobs += 1
+            status = str(job.get("status") or "")
+            finished = status in _TERMINAL_STATUSES
+            category = _objective_category(str(job.get("objective") or ""))
+            day = str(job.get("created_at") or "")[:10] or "unknown"
+
+            category_bucket = bucket(by_category, category)
+            day_bucket = bucket(by_day, day)
+            for item in (category_bucket, day_bucket):
+                item["total_jobs"] += 1
+
+            if not finished:
+                continue
+
+            finished_jobs += 1
+            if status == "completed":
+                completed_jobs += 1
+            elif status == "failed":
+                failed_jobs += 1
+            elif status == "cancelled":
+                cancelled_jobs += 1
+
+            steps_raw = job.get("steps")
+            if steps_raw is not None:
+                steps = int(steps_raw)
+                steps_sum += steps
+                steps_count += 1
+                for item in (category_bucket, day_bucket):
+                    item["steps_sum"] += steps
+                    item["steps_count"] += 1
+
+            estimated_cost_raw = (job.get("usage") or {}).get("estimated_cost_usd")
+            if estimated_cost_raw is not None:
+                estimated_cost = float(estimated_cost_raw)
+                cost_sum += estimated_cost
+                cost_count += 1
+                for item in (category_bucket, day_bucket):
+                    item["cost_sum"] += estimated_cost
+                    item["cost_count"] += 1
+
+            for item in (category_bucket, day_bucket):
+                item["finished_jobs"] += 1
+                if status == "completed":
+                    item["completed_jobs"] += 1
+                elif status == "failed":
+                    item["failed_jobs"] += 1
+                elif status == "cancelled":
+                    item["cancelled_jobs"] += 1
+
+        def finalize(item: dict[str, Any]) -> dict[str, Any]:
+            finished = item["finished_jobs"]
+            return {
+                "label": item["label"],
+                "total_jobs": item["total_jobs"],
+                "finished_jobs": finished,
+                "completed_jobs": item["completed_jobs"],
+                "failed_jobs": item["failed_jobs"],
+                "cancelled_jobs": item["cancelled_jobs"],
+                "success_rate": round((item["completed_jobs"] / finished) * 100, 2) if finished else 0.0,
+                "avg_steps": round(item["steps_sum"] / item["steps_count"], 2) if item["steps_count"] else None,
+                "avg_cost_usd": round(item["cost_sum"] / item["cost_count"], 6) if item["cost_count"] else None,
+            }
+
+        return {
+            "total_jobs": total_jobs,
+            "finished_jobs": finished_jobs,
+            "completed_jobs": completed_jobs,
+            "failed_jobs": failed_jobs,
+            "cancelled_jobs": cancelled_jobs,
+            "success_rate": round((completed_jobs / finished_jobs) * 100, 2) if finished_jobs else 0.0,
+            "avg_steps": round(steps_sum / steps_count, 2) if steps_count else None,
+            "avg_cost_usd": round(cost_sum / cost_count, 6) if cost_count else None,
+            "by_category": sorted((finalize(item) for item in by_category.values()), key=lambda item: (-item["success_rate"], item["label"])),
+            "timeline": sorted((finalize(item) for item in by_day.values()), key=lambda item: item["label"]),
+        }
+

 def _build_app(tmp_path: Path, monkeypatch: Any, disable_ui: bool = False):
    monkeypatch.setattr(server_module, "JobManager", FakeJobManager)
@@ -189,6 +324,8 @@ def test_create_job_returns_only_job_id_and_defaults_model(tmp_path: Path, monke
    manager = app.state.manager
    assert manager.last_submit_payload["model"] == "gpt-5.4-mini"
    assert manager.last_submit_payload["disabled_tools"] == ["click"]
+    assert manager.last_submit_payload["reasoning_effort"] == "medium"
+    assert manager.last_submit_payload["screen_context_decay_steps"] == 4

    status_res = client.get(f"/api/jobs/{job_id}/status", headers=headers)
    assert status_res.status_code == 200
@@ -270,12 +407,67 @@ def test_replay_endpoint_skips_visual_paths_outside_artifacts(tmp_path: Path, mo
    assert payload["total_frames"] == 1


+def test_analytics_endpoint_groups_by_category_and_time(tmp_path: Path, monkeypatch: Any) -> None:
+    app, _ = _build_app(tmp_path, monkeypatch, disable_ui=False)
+    manager = app.state.manager
+    client = TestClient(app)
+    headers = {"Authorization": "Bearer test_token"}
+
+    browser_completed = client.post("/api/jobs", headers=headers, json={"job": "Open amazon.de and checkout"}).json()["job_id"]
+    browser_failed = client.post("/api/jobs", headers=headers, json={"job": "Open website and login"}).json()["job_id"]
+    terminal_completed = client.post("/api/jobs", headers=headers, json={"job": "Run a shell command to inspect files"}).json()["job_id"]
+
+    manager._jobs[browser_completed].update(
+        status="completed",
+        ended_at="2026-05-27T00:10:00Z",
+        steps=4,
+        created_at="2026-05-27T00:00:01Z",
+        usage={**manager._jobs[browser_completed]["usage"], "estimated_cost_usd": 0.12},
+    )
+    manager._jobs[browser_failed].update(
+        status="failed",
+        ended_at="2026-05-28T00:10:00Z",
+        steps=6,
+        created_at="2026-05-28T00:00:01Z",
+        usage={**manager._jobs[browser_failed]["usage"], "estimated_cost_usd": 0.24},
+    )
+    manager._jobs[terminal_completed].update(
+        status="completed",
+        ended_at="2026-05-28T00:15:00Z",
+        steps=10,
+        created_at="2026-05-28T00:00:02Z",
+        usage={**manager._jobs[terminal_completed]["usage"], "estimated_cost_usd": 0.05},
+    )
+
+    analytics = client.get("/api/analytics", headers=headers)
+    assert analytics.status_code == 200
+    payload = analytics.json()
+
+    assert payload["total_jobs"] == 3
+    assert payload["finished_jobs"] == 3
+    assert payload["completed_jobs"] == 2
+    assert payload["failed_jobs"] == 1
+    assert payload["success_rate"] == 66.67
+    assert payload["avg_steps"] == 6.67
+    assert payload["avg_cost_usd"] == 0.136667
+
+    browser = next(row for row in payload["by_category"] if row["label"] == "Browser / web")
+    terminal = next(row for row in payload["by_category"] if row["label"] == "Files / terminal")
+    assert browser["finished_jobs"] == 2
+    assert browser["success_rate"] == 50.0
+    assert browser["avg_steps"] == 5.0
+    assert terminal["success_rate"] == 100.0
+
+    assert [row["label"] for row in payload["timeline"]] == ["2026-05-27", "2026-05-28"]
+
+
 def test_ui_toggle(tmp_path: Path, monkeypatch: Any) -> None:
    app_enabled, _ = _build_app(tmp_path / "enabled", monkeypatch, disable_ui=False)
    client_enabled = TestClient(app_enabled)
    root_enabled = client_enabled.get("/")
    assert root_enabled.status_code == 200
    assert "ScreenJob Monitor" in root_enabled.text
+    assert "Success by Objective Category" in root_enabled.text
    js_enabled = client_enabled.get("/ui/monitoring.js")
    assert js_enabled.status_code == 200
    assert "const tokenInput" in js_enabled.text
--- a/tests/test_storage.py
+++ b/tests/test_storage.py
@@ -72,3 +72,55 @@ def test_storage_response_fallback_uses_result_when_json_missing(tmp_path: Path)
    assert job is not None
    assert job["response"]["return"] == "Legacy result string"
    assert job["response"]["data"] is None
+
+
+def test_history_db_analytics_groups_by_category_and_day(tmp_path: Path) -> None:
+    db = HistoryDB(tmp_path / "screenjob_test_analytics.db")
+
+    db.create_job(
+        job_id="job_browser_ok",
+        objective="Open amazon.de and checkout",
+        model="gpt-5.4-mini",
+        created_at="2026-05-27T00:00:01Z",
+        safety_override=False,
+        disabled_tools=[],
+    )
+    db.update_job("job_browser_ok", status="completed", steps=4, estimated_cost_usd=0.12)
+
+    db.create_job(
+        job_id="job_browser_fail",
+        objective="Open website and login",
+        model="gpt-5.4-mini",
+        created_at="2026-05-28T00:00:01Z",
+        safety_override=False,
+        disabled_tools=[],
+    )
+    db.update_job("job_browser_fail", status="failed", steps=6, estimated_cost_usd=0.24)
+
+    db.create_job(
+        job_id="job_terminal_ok",
+        objective="Run a shell command to inspect files",
+        model="gpt-5.4-mini",
+        created_at="2026-05-28T00:00:02Z",
+        safety_override=False,
+        disabled_tools=[],
+    )
+    db.update_job("job_terminal_ok", status="completed", steps=10, estimated_cost_usd=0.05)
+
+    analytics = db.analytics()
+    assert analytics["total_jobs"] == 3
+    assert analytics["finished_jobs"] == 3
+    assert analytics["completed_jobs"] == 2
+    assert analytics["failed_jobs"] == 1
+    assert analytics["success_rate"] == 66.67
+    assert analytics["avg_steps"] == 6.67
+    assert analytics["avg_cost_usd"] == 0.136667
+
+    browser = next(row for row in analytics["by_category"] if row["label"] == "Browser / web")
+    terminal = next(row for row in analytics["by_category"] if row["label"] == "Files / terminal")
+    assert browser["finished_jobs"] == 2
+    assert browser["success_rate"] == 50.0
+    assert browser["avg_steps"] == 5.0
+    assert terminal["success_rate"] == 100.0
+
+    assert [row["label"] for row in analytics["timeline"]] == ["2026-05-27", "2026-05-28"]
--- a/todo.md
+++ b/todo.md
@@ -4,12 +4,12 @@
 - [Bug] Enforce single active desktop-control run (or a strict queue) so concurrent jobs cannot fight over the same mouse/keyboard/screen session.
 - [Bug] Fix run artifact collisions in `setup_artifacts()` (`run_id` is second-granularity, so two jobs in the same second can share/overwrite the same directory).
 - [Bug] Remove global logger handler clobbering in `setup_logger()` (`logging.getLogger("screenjob").handlers.clear()` breaks concurrent runs and can redirect logs to the wrong file).
-- [Bug] More consistent clicks and more uses of enhance images.
+- [x] More consistent clicks and more uses of enhance images.

 ## P1
-- [Idea] Move ui.py into a seperate html file and js file.
-- [Idea] Think harder using effort "medium" by default.
-- [Idea] Decay old screenshots after 3 to 5 steps to save (1) tokens and (2) brain fuck in the agents.
+- [x] Move ui.py into a seperate html file and js file.
+- [x] Think harder using effort "medium" by default.
+- [x] Decay old screenshots after 3 to 5 steps to save (1) tokens and (2) brain fuck in the agents.
 - [Bug] Validate `disabled_tools` against an allowlist and disallow disabling critical completion flow (`task_complete`) to avoid guaranteed step-limit failures.
 - [Bug] Improve `execute_command` cancellation/timeout handling to terminate full process trees, not only the parent shell process.

@@ -20,4 +20,4 @@

 ## P3
 - [x] Add Replay Mode; Ability to replay a session by reconstructing the screen from screenshots and overlaying tool calls and click and type events.
-- [Idea] Add lightweight analytics dashboards (success rate by objective category, avg steps/cost over time).
+- [x] Add lightweight analytics dashboards (success rate by objective category, avg steps/cost over time).
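
The aggregate values asserted in the analytics tests (`66.67`, `6.67`, `0.136667`) follow from the three fixture jobs and the rounding visible in the diff: two decimals for success rate and average steps, six for average cost. Below is a minimal sketch that reproduces that arithmetic. The `summarize` helper and its tuple layout are hypothetical names chosen for illustration; this is not the project's `HistoryDB.analytics()` implementation, and it assumes every finished job reports both steps and cost.

```python
from typing import Optional

# Hypothetical helper: mirrors only the rounding rules shown in the diff.
def summarize(jobs: list[tuple[str, int, float]]) -> dict[str, Optional[float]]:
    """Aggregate finished jobs given as (status, steps, estimated_cost_usd)."""
    finished = len(jobs)
    completed = sum(1 for status, _, _ in jobs if status == "completed")
    steps_sum = sum(steps for _, steps, _ in jobs)
    cost_sum = sum(cost for _, _, cost in jobs)
    return {
        # Success rate is completed / finished, as a percentage.
        "success_rate": round(completed / finished * 100, 2) if finished else 0.0,
        # Averages are None when there is nothing to average.
        "avg_steps": round(steps_sum / finished, 2) if finished else None,
        "avg_cost_usd": round(cost_sum / finished, 6) if finished else None,
    }

# Same statuses, steps, and costs as the fixtures in the tests above.
fixture = [("completed", 4, 0.12), ("failed", 6, 0.24), ("completed", 10, 0.05)]
summary = summarize(fixture)
print(summary)
```

Returning `None` rather than `0` for empty averages matches the payload in the diff, and lets the dashboard's `formatNumber` and `formatCurrency` helpers fall back to their placeholder instead of showing a misleading zero.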
Author	SHA1	Message	Date
Space-Banane	8126b57404	Add lightweight analytics dashboard Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-27 22:34:26 +02:00
Space-Banane	cceed18cf1	feat: (literally) "enhance" functionality with new parameters and improved image processing	2026-05-27 22:14:32 +02:00
Space-Banane	880468ef02	Mark completed P1 TODO items as done	2026-05-27 22:05:57 +02:00
Space-Banane	b05a7be668	Compact screenshot context every 4 steps by default	2026-05-27 22:04:15 +02:00
Space-Banane	0c019474af	Default model reasoning effort to medium	2026-05-27 22:02:20 +02:00
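
The `enhance` tests in this change set pin down two clamping behaviours: a requested `scale` of 99 comes back as 6, and a requested coordinate of `(-99, 9999)` comes back as `(0, 719)`. The sketch below is one way to get those results. It assumes a 1280x720 screen (inferred from the `719` in the test) and the 2-6 scale range from the README. The function names are hypothetical and the agent's real `_tool_enhance` may be structured differently.

```python
def clamp(value: int, low: int, high: int) -> int:
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(value, high))


# Hypothetical helper; the width/height defaults are an assumption, not project config.
def clamp_enhance_request(
    x: int, y: int, scale: int, width: int = 1280, height: int = 720
) -> tuple[dict[str, int], int]:
    """Clamp a requested coordinate to the screen and the scale to 2..6."""
    source_coord = {
        "x": clamp(x, 0, width - 1),   # valid pixel columns are 0..width-1
        "y": clamp(y, 0, height - 1),  # valid pixel rows are 0..height-1
    }
    return source_coord, clamp(scale, 2, 6)


coord, scale = clamp_enhance_request(-99, 9999, 99)
print(coord, scale)
```

Clamping instead of rejecting out-of-range input means an `enhance` call with a slightly off-screen coordinate still succeeds; the test's separate `requested_coord` and `source_coord` fields show that the original request is kept alongside the clamped value.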