First MVP

2026-05-22 19:25:57 +02:00
parent 673f70b32a
commit 860ccb731d
40 changed files with 2336 additions and 0 deletions
--- a/Idea.md
+++ b/Idea.md
@@ -0,0 +1,245 @@
+Architecture:
+
+```text
+Gitea
+  └─ webhook: pull_request_comment / issue_comment
+       └─ gitea-codex-bot API
+            ├─ verifies X-Gitea-Signature
+            ├─ checks body starts with @codex review
+            ├─ queues review job
+            └─ worker:
+                ├─ clones repo / fetches PR branches
+                ├─ builds git diff + context
+                ├─ runs codex headless
+                ├─ parses JSON findings
+                └─ posts review comment as codex-bot
+```
+
+Use a real Gitea user, e.g. `codex-bot`. Give it a token with minimum access: read repo, read PRs/issues, write comments. Do not use your personal admin token. Gitea exposes Swagger/OpenAPI per instance at `/api/swagger` and `/swagger.v1.json`, so you can wire against your actual server version instead of guessing endpoints. ([Gitea Documentation][3])
+
+MVP behavior:
+
+```text
+User comments:
+@codex review
+
+Bot replies:
+👀 Codex review queued for commit abc123...
+
+Later edits/posts:
+## Codex Review
+
+Verdict: patch mostly correct
+Confidence: 0.78
+
+Findings:
+1. src/auth.ts:42-55
+   Token validation accepts expired tokens in one path.
+
+2. api/users.ts:88
+   Missing permission check before update.
+
+No blocking issues found in tests.
+```
+
+For v1, post one normal PR timeline comment and skip inline comments. Gitea has PR review webhook events, but the line-level diff review API can be version-sensitive and awkward, and recent reports suggest API-token support for diff-level review comments is still unclear. ([Gitea Documentation][1]) Summary comments are reliable and still useful.
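+
+A minimal sketch of rendering that summary comment, assuming a result object with `verdict`, `confidence`, and `findings` fields like the JSON in the prompt shape further down. `renderSummary` and the interface names are hypothetical, not part of any Gitea or Codex API:
+
+```ts
+// Hypothetical shapes; field names follow the JSON in the prompt section.
+interface Finding {
+  file: string;
+  line_start: number;
+  line_end: number;
+  title: string;
+}
+
+interface ReviewResult {
+  verdict: string;
+  confidence: number;
+  findings: Finding[];
+}
+
+// Renders one PR timeline comment in the layout of the MVP example.
+function renderSummary(result: ReviewResult): string {
+  const lines: string[] = [
+    "## Codex Review",
+    "",
+    `Verdict: ${result.verdict}`,
+    `Confidence: ${result.confidence.toFixed(2)}`,
+    "",
+  ];
+  if (result.findings.length === 0) {
+    lines.push("No findings.");
+  } else {
+    lines.push("Findings:");
+    result.findings.forEach((f, i) => {
+      // Collapse single-line ranges to "file:88" instead of "file:88-88".
+      const range = f.line_start === f.line_end ? `${f.line_start}` : `${f.line_start}-${f.line_end}`;
+      lines.push(`${i + 1}. ${f.file}:${range}`, `   ${f.title}`, "");
+    });
+  }
+  return lines.join("\n").trimEnd();
+}
+```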
+
+Core trigger logic:
+
+```ts
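+if (event !== "pull_request_comment" && event !== "issue_comment") return;
+// Only react to comments on pull requests, not plain issues.
+if (!payload.is_pull && !payload.pull_request) return;
+// Ignore the bot's own comments to avoid feedback loops.
+if (payload.sender.username === "codex-bot") return;
+if (!payload.comment.body.trim().startsWith("@codex review")) return;
+// Assumption: issue_comment payloads may carry the number on `issue` instead of `pull_request`.
+const prNumber = payload.pull_request?.number ?? payload.issue?.number;
+if (prNumber === undefined) return;
+enqueueReview(payload.repository.full_name, prNumber);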
+```
+
+Job flow:
+
+```text
+1. Verify webhook HMAC.
+2. Dedupe by delivery ID/comment ID.
+3. Parse command:
+   @codex review
+   @codex review security
+   @codex review tests
+   @codex review --full
+4. Create “queued” comment.
+5. Clone/fetch repo into isolated temp dir.
+6. Checkout PR head.
+7. Generate:
+   git diff base...head
+   changed file list
+   optional full changed-file content
+   optional test output
+8. Run Codex headless with JSON schema.
+9. Validate JSON.
+10. Post/update review comment.
+```
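+
+Step 1 could look roughly like this. It is a sketch that assumes `X-Gitea-Signature` carries a bare hex HMAC-SHA256 digest of the raw request body, as the Gitea webhook docs describe; verify that against your server version. `verifyGiteaSignature` is a hypothetical helper name:
+
+```ts
+import { createHmac, timingSafeEqual } from "node:crypto";
+
+// Returns true when signatureHex matches the HMAC-SHA256 of the raw body.
+// Assumption: the header value is a plain hex digest with no prefix.
+function verifyGiteaSignature(rawBody: string, signatureHex: string, secret: string): boolean {
+  const expected = createHmac("sha256", secret).update(rawBody, "utf8").digest();
+  const received = Buffer.from(signatureHex, "hex");
+  // timingSafeEqual throws on length mismatch, so compare lengths first.
+  if (received.length !== expected.length) return false;
+  return timingSafeEqual(received, expected);
+}
+```
+
+Verify against the raw bytes of the request, before any JSON parsing, since re-serialized JSON will not hash to the same value.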
+
+Use SQLite first:
+
+```sql
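+-- Column types and constraints are assumptions; adjust as needed.
+CREATE TABLE IF NOT EXISTS reviews (
+  id INTEGER PRIMARY KEY,
+  repo TEXT NOT NULL,
+  pr_number INTEGER NOT NULL,
+  head_sha TEXT NOT NULL,
+  trigger_comment_id INTEGER,
+  status TEXT NOT NULL,
+  requested_by TEXT,
+  created_at TEXT NOT NULL,
+  updated_at TEXT NOT NULL,
+  result_json TEXT
+);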
+```
+
+Suggested service stack:
+
+```text
+Backend: Python FastAPI or Node/TS Fastify
+Queue: SQLite jobs first, Redis later
+Runner: Docker worker container
+Storage: /var/lib/gitea-codex-bot
+Auth: bot PAT + webhook secret
+Deployment: docker compose
+```
+
+Config:
+
+```env
+GITEA_BASE_URL=https://git.example.com
+GITEA_TOKEN=...
+GITEA_BOT_USERNAME=codex-bot
+GITEA_WEBHOOK_SECRET=...
+OPENAI_API_KEY=...
+WORKDIR=/var/lib/gitea-codex-bot/worktrees
+MAX_DIFF_BYTES=200000
+MAX_REVIEW_MINUTES=10
+CONCURRENCY=1
+```
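+
+One way to load these settings with validation, as a sketch. `loadConfig`, `BotConfig`, and the fallback defaults (taken from the example values above) are assumptions, and secrets such as `GITEA_TOKEN` are left out for brevity:
+
+```ts
+type Env = Record<string, string | undefined>;
+
+interface BotConfig {
+  giteaBaseUrl: string;
+  botUsername: string;
+  maxDiffBytes: number;
+  maxReviewMinutes: number;
+  concurrency: number;
+}
+
+function requireVar(env: Env, key: string): string {
+  const value = env[key];
+  if (!value) throw new Error(`Missing required setting: ${key}`);
+  return value;
+}
+
+// Parses a positive integer, falling back when the variable is unset.
+function intVar(env: Env, key: string, fallback: number): number {
+  const raw = env[key];
+  if (raw === undefined || raw === "") return fallback;
+  const parsed = Number.parseInt(raw, 10);
+  if (!Number.isFinite(parsed) || parsed <= 0) {
+    throw new Error(`${key} must be a positive integer`);
+  }
+  return parsed;
+}
+
+function loadConfig(env: Env): BotConfig {
+  return {
+    // Strip trailing slashes so URL joins stay predictable.
+    giteaBaseUrl: requireVar(env, "GITEA_BASE_URL").replace(/\/+$/, ""),
+    botUsername: env["GITEA_BOT_USERNAME"] ?? "codex-bot",
+    maxDiffBytes: intVar(env, "MAX_DIFF_BYTES", 200000),
+    maxReviewMinutes: intVar(env, "MAX_REVIEW_MINUTES", 10),
+    concurrency: intVar(env, "CONCURRENCY", 1),
+  };
+}
+```
+
+In a Node process this would be called as `loadConfig(process.env)` at startup, so a bad value fails fast instead of mid-review.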
+
+Good commands to support later:
+
+```text
+@codex review
+@codex review security
+@codex review performance
+@codex review tests
+@codex review --full
+@codex explain
+@codex fix
+@codex fix --branch
+@codex ignore
+@codex rerun
+```
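+
+A small parser covers all of these forms. This is a sketch; `parseCommand` and the `Command` shape are hypothetical, and it only looks at the first line of the comment:
+
+```ts
+interface Command {
+  action: string;  // e.g. "review", "fix"
+  args: string[];  // e.g. ["security"]
+  flags: string[]; // e.g. ["--full"]
+}
+
+// Returns null when the comment is not addressed to the bot.
+function parseCommand(body: string): Command | null {
+  const firstLine = body.trim().split(/\r?\n/)[0] ?? "";
+  const [mention, action, ...rest] = firstLine.split(/\s+/).filter((t) => t.length > 0);
+  if (mention !== "@codex" || action === undefined) return null;
+  return {
+    action,
+    args: rest.filter((t) => !t.startsWith("--")),
+    flags: rest.filter((t) => t.startsWith("--")),
+  };
+}
+```
+
+Matching whole tokens is slightly stricter than a `startsWith("@codex review")` check, which would also accept `@codex reviewer`.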
+
+Best v2 feature: persistent review comment. Instead of spamming new comments, the bot finds its previous comment on that PR and edits it:
+
+```text
+<!-- codex-review:head_sha=abc123 -->
+## Codex Review
+...
+```
+
+Then reruns replace the same block.
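+
+A sketch of the marker handling, assuming comments have already been fetched and flattened into a simple shape. `PrComment` and the helper names are hypothetical, and the hex-only SHA pattern is an assumption:
+
+```ts
+// Matches the hidden marker and captures the reviewed commit SHA.
+const MARKER_PATTERN = /<!-- codex-review:head_sha=([0-9a-f]+) -->/;
+
+interface PrComment {
+  id: number;
+  author: string;
+  body: string;
+}
+
+function buildReviewBody(headSha: string, markdown: string): string {
+  return `<!-- codex-review:head_sha=${headSha} -->\n${markdown}`;
+}
+
+// Returns the bot's existing review comment, if any, so it can be edited in place.
+function findPreviousReview(comments: PrComment[], botUsername: string): PrComment | undefined {
+  return comments.find((c) => c.author === botUsername && MARKER_PATTERN.test(c.body));
+}
+
+// Extracts the reviewed commit from an existing comment, or null if unmarked.
+function reviewedSha(body: string): string | null {
+  const match = MARKER_PATTERN.exec(body);
+  return match?.[1] ?? null;
+}
+```
+
+Checking the author as well as the marker keeps a user from hijacking the slot by pasting the marker into their own comment.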
+
+Best v3 feature: fixes. User comments:
+
+```text
+@codex fix finding 2
+```
+
+Bot creates a branch:
+
+```text
+codex/pr-42-fix-permission-check
+```
+
+Then opens a PR or pushes to the existing PR branch only if allowed. Keep this disabled by default. Review-only is safer.
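+
+Branch names in that style can be derived from the PR number and the finding title. A sketch, with `fixBranchName` and the 40-character slug cap as assumptions:
+
+```ts
+// Builds "codex/pr-<number>-<slug>" from a finding title.
+function fixBranchName(prNumber: number, findingTitle: string, maxSlugLength = 40): string {
+  const slug = findingTitle
+    .toLowerCase()
+    .replace(/[^a-z0-9]+/g, "-") // collapse anything unsafe into single dashes
+    .replace(/^-+|-+$/g, "")     // trim leading and trailing dashes
+    .slice(0, maxSlugLength)
+    .replace(/-+$/, "");         // slicing may leave a trailing dash
+  return `codex/pr-${prNumber}-${slug || "fix"}`;
+}
+```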
+
+Security rules that matter:
+
+```text
+- Verify X-Gitea-Signature.
+- Ignore bot’s own comments.
+- Allowlist repos/orgs.
+- Never run on untrusted fork PRs unless sandboxed hard.
+- No Docker socket mount.
+- No host filesystem mount except temp workdir.
+- Timeout every job.
+- Limit diff size.
+- Redact .env, secrets, keys.
+- Use bot token, not admin token.
+- Log prompt + result, but not secrets.
+```
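+
+The redaction rule can be applied to the diff before it reaches the prompt. This is a sketch: the pattern list is an illustrative assumption, not a complete secret scanner, and paths containing spaces are not handled:
+
+```ts
+// Illustrative patterns only; extend for your repos.
+const SENSITIVE_PATTERNS: RegExp[] = [/(^|\/)\.env(\..*)?$/, /\.(pem|key)$/, /(^|\/)id_rsa$/];
+
+function isSensitivePath(path: string): boolean {
+  return SENSITIVE_PATTERNS.some((p) => p.test(path));
+}
+
+// Splits a unified diff into per-file sections and blanks out sensitive ones.
+function redactDiff(diff: string): string {
+  const sections = diff.split(/^(?=diff --git )/m);
+  return sections
+    .map((section) => {
+      const match = /^diff --git a\/(\S+) b\/(\S+)/.exec(section);
+      const oldPath = match?.[1];
+      const newPath = match?.[2];
+      if (
+        oldPath !== undefined &&
+        newPath !== undefined &&
+        (isSensitivePath(oldPath) || isSensitivePath(newPath))
+      ) {
+        return `diff --git a/${oldPath} b/${newPath}\n[redacted: sensitive file]\n`;
+      }
+      return section;
+    })
+    .join("");
+}
+```
+
+Keeping the file header but dropping the hunk still tells the reviewer that the file changed without leaking its contents.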
+
+Prompt shape for Codex:
+
+```text
+You are reviewing a Gitea pull request.
+
+Focus only on issues introduced by this PR.
+Prioritize correctness, security, data loss, broken behavior, bad migrations, and missing tests.
+Avoid style nitpicks.
+
+Return JSON:
+{
+  "verdict": "correct" | "has_issues",
+  "confidence": 0.0-1.0,
+  "summary": "...",
+  "findings": [
+    {
+      "severity": "low|medium|high|critical",
+      "file": "...",
+      "line_start": 1,
+      "line_end": 1,
+      "title": "...",
+      "body": "...",
+      "suggestion": "..."
+    }
+  ]
+}
+```
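+
+Model output should be validated before anything is posted. A hand-rolled sketch that checks the shape above; `parseReviewResult` is a hypothetical name, and treating `suggestion` as optional is an assumption:
+
+```ts
+interface Finding {
+  severity: string;
+  file: string;
+  line_start: number;
+  line_end: number;
+  title: string;
+  body: string;
+  suggestion?: string;
+}
+
+interface ReviewResult {
+  verdict: "correct" | "has_issues";
+  confidence: number;
+  summary: string;
+  findings: Finding[];
+}
+
+const SEVERITIES: string[] = ["low", "medium", "high", "critical"];
+
+function isRecord(v: unknown): v is Record<string, unknown> {
+  return typeof v === "object" && v !== null && !Array.isArray(v);
+}
+
+function isFinding(v: unknown): v is Finding {
+  if (!isRecord(v)) return false;
+  const severity = v["severity"];
+  const suggestion = v["suggestion"];
+  return (
+    typeof severity === "string" && SEVERITIES.includes(severity) &&
+    typeof v["file"] === "string" &&
+    Number.isInteger(v["line_start"]) && Number.isInteger(v["line_end"]) &&
+    typeof v["title"] === "string" && typeof v["body"] === "string" &&
+    (suggestion === undefined || typeof suggestion === "string")
+  );
+}
+
+// Throws on malformed output so the job can be marked failed instead of posting garbage.
+function parseReviewResult(raw: string): ReviewResult {
+  const data: unknown = JSON.parse(raw);
+  if (!isRecord(data)) throw new Error("result must be a JSON object");
+  const { verdict, confidence, summary, findings } = data;
+  if (verdict !== "correct" && verdict !== "has_issues") throw new Error("invalid verdict");
+  if (typeof confidence !== "number" || confidence < 0 || confidence > 1) throw new Error("invalid confidence");
+  if (typeof summary !== "string") throw new Error("invalid summary");
+  if (!Array.isArray(findings) || !findings.every(isFinding)) throw new Error("invalid findings");
+  return { verdict, confidence, summary, findings };
+}
+```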
+
+Practical build order:
+
+```text
+1. Make bot account + token.
+2. Add webhook receiver.
+3. Verify signature + parse @codex review.
+4. Post “queued” comment.
+5. Clone repo and generate diff.
+6. Run Codex headless.
+7. Post one summary comment.
+8. Add dedupe + SQLite.
+9. Add per-repo config file.
+10. Add optional inline comments/fix branches later.
+```
+
+Per-repo config idea:
+
+```yaml
+# .codex-review.yml
+enabled: true
+review:
+  default_mode: summary
+  max_diff_bytes: 200000
+  include_tests: true
+  focus:
+    - correctness
+    - security
+    - maintainability
+ignore:
+  - "dist/**"
+  - "pnpm-lock.yaml"
+  - "*.min.js"
+commands:
+  allow_fix: false
+```
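+
+The `ignore` patterns could be applied to the changed-file list with a small matcher. A sketch supporting only `*` and `**`; matching slash-free patterns against the file name in any directory is a gitignore-like assumption, and both function names are hypothetical:
+
+```ts
+// Converts a simple glob to a RegExp. Supports "**", "*", and literal characters only.
+function globToRegExp(glob: string): RegExp {
+  const source = glob
+    .replace(/[.+^${}()|[\]\\?]/g, "\\$&") // escape regex metacharacters except "*"
+    .replace(/\*\*/g, "\u0000")            // protect "**" before handling "*"
+    .replace(/\*/g, "[^/]*")               // "*" stays within one path segment
+    .replace(/\u0000/g, ".*");             // "**" may cross directories
+  return new RegExp(`^${source}$`);
+}
+
+function isIgnored(path: string, patterns: string[]): boolean {
+  const basename = path.split("/").pop() ?? path;
+  return patterns.some((pattern) => {
+    const re = globToRegExp(pattern);
+    // Patterns without a slash are tested against the file name only.
+    return pattern.includes("/") ? re.test(path) : re.test(basename);
+  });
+}
+```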
+
+Final recommendation: external webhook bot, summary comments first, bot account + token, Codex headless JSON, SQLite queue. Inline review comments and auto-fix branches can wait for later versions. Trying to make the first version a “full GitHub Copilot Reviews clone” is how this becomes annoying trash.
+
+[1]: https://docs.gitea.com/usage/repository/webhooks "Webhooks | Gitea Documentation"
+[2]: https://developers.openai.com/cookbook/examples/codex/build_code_review_with_codex_sdk "Build Code Review with the Codex SDK"
+[3]: https://docs.gitea.com/development/api-usage "API Usage | Gitea Documentation"