gitea-codex/Idea.md
Space-Banane d8956b309d
First MVP
2026-05-22 19:25:57 +02:00

Architecture:
```text
Gitea
└─ webhook: pull_request_comment / issue_comment
   └─ gitea-codex-bot API
      ├─ verifies X-Gitea-Signature
      ├─ checks body starts with @codex review
      ├─ queues review job
      └─ worker:
         ├─ clones repo / fetches PR branches
         ├─ builds git diff + context
         ├─ runs codex headless
         ├─ parses JSON findings
         └─ posts review comment as codex-bot
```
Use a real Gitea user, e.g. `codex-bot`. Give it a token with minimum access: read repo, read PRs/issues, write comments. Do not use your personal admin token. Gitea exposes Swagger/OpenAPI per instance at `/api/swagger` and `/swagger.v1.json`, so you can wire against your actual server version instead of guessing endpoints. ([Gitea Documentation][3])
MVP behavior:
```text
User comments:
  @codex review

Bot replies:
  👀 Codex review queued for commit abc123...

Later edits/posts:
  ## Codex Review
  Verdict: patch mostly correct
  Confidence: 0.78

  Findings:
  1. src/auth.ts:42-55
     Token validation accepts expired tokens in one path.
  2. api/users.ts:88
     Missing permission check before update.

  No blocking issues found in tests.
```
For v1, post one normal PR timeline comment and skip inline comments. Gitea has webhook events for PR reviews, but API support for line-level diff comments can vary by version and be awkward to use, and there are still recent reports that API-token support for diff-level review comments is unclear. ([Gitea Documentation][1]) Summary comments are reliable and still useful.
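A minimal sketch of what posting that single summary comment could look like. It assumes the issue-comment endpoint `POST /api/v1/repos/{owner}/{repo}/issues/{index}/comments` (pull requests share the issue index) and `token`-style authorization; both should be confirmed against your instance's `/api/swagger`. `buildCommentRequest` is a hypothetical helper name, not part of any Gitea SDK.

```ts
// Hypothetical helper: builds the HTTP request for a PR timeline comment.
// Assumes Gitea's issue-comment endpoint; confirm the path in your instance's /api/swagger.
interface CommentRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildCommentRequest(
  baseUrl: string,
  token: string,
  repoFullName: string, // "owner/name", as in payload.repository.full_name
  prNumber: number,
  markdown: string,
): CommentRequest {
  const [owner, repo] = repoFullName.split("/");
  const base = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  return {
    url: `${base}/api/v1/repos/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}/issues/${prNumber}/comments`,
    init: {
      method: "POST",
      headers: {
        Authorization: `token ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ body: markdown }),
    },
  };
}

// Usage (performs a network call, so it is left commented out):
// const { url, init } = buildCommentRequest(baseUrl, botToken, "acme/api", 42, "👀 Codex review queued");
// const res = await fetch(url, init);
```

Keeping request construction separate from `fetch` makes the URL and payload easy to unit-test without a live Gitea server.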
Core trigger logic:
```ts
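// Sketch of the trigger filter; check field names against your Gitea version's payload.
if (event !== "pull_request_comment" && event !== "issue_comment") return;
if (payload.action !== "created") return; // skip edited/deleted comments
if (!payload.is_pull && !payload.pull_request) return; // plain issue, not a PR
if (payload.sender.username === "codex-bot") return; // never react to the bot's own comments
if (!payload.comment.body.trim().startsWith("@codex review")) return;
// Comment payloads may carry the PR number on `issue` rather than `pull_request`.
const prNumber = payload.pull_request?.number ?? payload.issue?.number;
if (prNumber === undefined) return;
enqueueReview(payload.repository.full_name, prNumber);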
```
Job flow:
```text
1. Verify webhook HMAC.
2. Dedupe by delivery ID/comment ID.
3. Parse command:
   @codex review
   @codex review security
   @codex review tests
   @codex review --full
4. Create “queued” comment.
5. Clone/fetch repo into isolated temp dir.
6. Checkout PR head.
7. Generate:
   git diff base...head
   changed file list
   optional full changed-file content
   optional test output
8. Run Codex headless with JSON schema.
9. Validate JSON.
10. Post/update review comment.
```
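Step 3 of the flow above can be sketched as a small pure function. `parseReviewCommand` and the `ReviewCommand` shape are hypothetical names for illustration. The sketch assumes the command sits on the first line of the comment, and it treats the first non-flag argument as the focus area.

```ts
// Hypothetical parser for the "@codex review ..." command line.
interface ReviewCommand {
  focus: string | null; // e.g. "security" or "tests"; null for a plain review
  full: boolean;        // true when --full is present
}

function parseReviewCommand(commentBody: string): ReviewCommand | null {
  // Assumption: only the first line of the comment is treated as the command.
  const firstLine = commentBody.trim().split(/\r?\n/)[0].trim();
  const tokens = firstLine.split(/\s+/);
  // Token comparison rejects near-misses such as "@codex reviewer".
  if (tokens[0] !== "@codex" || tokens[1] !== "review") return null;

  const args = tokens.slice(2);
  const full = args.includes("--full");
  const focus = args.find((a) => !a.startsWith("--")) ?? null;
  return { focus, full };
}
```

Comparing whole tokens is slightly stricter than a `startsWith("@codex review")` check, which would also accept `@codex reviewer`.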
Use SQLite first:
```sql
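-- Column types and constraints are a suggested starting point, not a fixed schema.
CREATE TABLE IF NOT EXISTS reviews (
  id                 INTEGER PRIMARY KEY,
  repo               TEXT    NOT NULL,          -- "owner/name"
  pr_number          INTEGER NOT NULL,
  head_sha           TEXT    NOT NULL,
  trigger_comment_id INTEGER NOT NULL UNIQUE,   -- assumed dedupe key
  status             TEXT    NOT NULL DEFAULT 'queued',
  requested_by       TEXT    NOT NULL,
  created_at         TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
  updated_at         TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
  result_json        TEXT                       -- NULL until the review finishes
);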
```
Suggested service stack:
```text
Backend: Python FastAPI or Node/TS Fastify
Queue: SQLite jobs first, Redis later
Runner: Docker worker container
Storage: /var/lib/gitea-codex-bot
Auth: bot PAT + webhook secret
Deployment: docker compose
```
Config:
```env
GITEA_BASE_URL=https://git.example.com
GITEA_TOKEN=...
GITEA_BOT_USERNAME=codex-bot
GITEA_WEBHOOK_SECRET=...
OPENAI_API_KEY=...
WORKDIR=/var/lib/gitea-codex-bot/worktrees
MAX_DIFF_BYTES=200000
MAX_REVIEW_MINUTES=10
CONCURRENCY=1
```
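One way to load these variables is to validate them once at startup. This is a sketch under assumptions: `loadConfig` and `BotConfig` are hypothetical names, the numeric defaults simply mirror the example values above, and `OPENAI_API_KEY` is not parsed here on the assumption that the Codex process reads it from its own environment.

```ts
// Hypothetical typed view of the env config above.
interface BotConfig {
  giteaBaseUrl: string;
  giteaToken: string;
  botUsername: string;
  webhookSecret: string;
  workdir: string;
  maxDiffBytes: number;
  maxReviewMinutes: number;
  concurrency: number;
}

function loadConfig(env: Record<string, string | undefined>): BotConfig {
  // Fail fast on missing secrets or URLs instead of failing mid-job.
  const required = (name: string): string => {
    const value = env[name];
    if (!value) throw new Error(`Missing required env var: ${name}`);
    return value;
  };
  // Numeric limits fall back to the example values when unset (an assumption).
  const positiveInt = (name: string, fallback: number): number => {
    const raw = env[name];
    if (raw === undefined || raw === "") return fallback;
    const n = Number(raw);
    if (!Number.isInteger(n) || n <= 0) {
      throw new Error(`${name} must be a positive integer, got "${raw}"`);
    }
    return n;
  };

  return {
    giteaBaseUrl: required("GITEA_BASE_URL"),
    giteaToken: required("GITEA_TOKEN"),
    botUsername: required("GITEA_BOT_USERNAME"),
    webhookSecret: required("GITEA_WEBHOOK_SECRET"),
    workdir: required("WORKDIR"),
    maxDiffBytes: positiveInt("MAX_DIFF_BYTES", 200000),
    maxReviewMinutes: positiveInt("MAX_REVIEW_MINUTES", 10),
    concurrency: positiveInt("CONCURRENCY", 1),
  };
}
```

In a Node process this would typically be called as `loadConfig(process.env)`.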
Good commands to support later:
```text
@codex review
@codex review security
@codex review performance
@codex review tests
@codex review --full
@codex explain
@codex fix
@codex fix --branch
@codex ignore
@codex rerun
```
Best v2 feature: persistent review comment. Instead of spamming new comments, the bot finds its previous comment on that PR and edits it:
```text
<!-- codex-review:head_sha=abc123 -->
## Codex Review
...
```
Then reruns replace the same block.
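The find-and-replace step could be sketched like this. `TimelineComment` is a hypothetical, simplified shape for comments already fetched from the PR, and the helper names are illustrative. How the edit itself is sent depends on the comment-edit endpoint your instance exposes in its Swagger spec.

```ts
// Hypothetical, simplified view of a PR timeline comment.
interface TimelineComment {
  id: number;
  author: string;
  body: string;
}

const MARKER_PREFIX = "<!-- codex-review:";

// Prefix the review with the hidden marker so later runs can find it.
function renderMarkedBody(headSha: string, reviewMarkdown: string): string {
  return `${MARKER_PREFIX}head_sha=${headSha} -->\n${reviewMarkdown}`;
}

// Returns the bot's earlier review comment, or undefined if this is the first run.
function findBotReviewComment(
  comments: TimelineComment[],
  botUsername: string,
): TimelineComment | undefined {
  return comments.find(
    (c) => c.author === botUsername && c.body.trimStart().startsWith(MARKER_PREFIX),
  );
}
```

If `findBotReviewComment` returns a comment, the bot edits that comment with the new marked body; otherwise it posts a fresh one. Checking the author as well as the marker stops other users from spoofing the block.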
Best v3 feature: fixes. User comments:
```text
@codex fix finding 2
```
Bot creates a branch:
```text
codex/pr-42-fix-permission-check
```
Then opens a PR or pushes to the existing PR branch only if allowed. Keep this disabled by default. Review-only is safer.
Security rules that matter:
```text
- Verify X-Gitea-Signature.
- Ignore the bot's own comments.
- Allowlist repos/orgs.
- Never run on untrusted fork PRs unless sandboxed hard.
- No Docker socket mount.
- No host filesystem mount except temp workdir.
- Timeout every job.
- Limit diff size.
- Redact .env, secrets, keys.
- Use bot token, not admin token.
- Log prompt + result, but not secrets.
```
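For the first rule, a minimal verification sketch using Node's built-in `crypto` module. It assumes `X-Gitea-Signature` carries the hex-encoded HMAC-SHA256 of the raw request body keyed with the webhook secret, which should be confirmed against the webhook documentation for your Gitea version. `verifyGiteaSignature` is a hypothetical helper name.

```ts
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: the header value is hex(HMAC-SHA256(secret, rawBody)).
function verifyGiteaSignature(
  rawBody: string | Buffer,
  signatureHex: string | undefined,
  secret: string,
): boolean {
  // Reject missing or malformed headers before doing any comparison.
  if (!signatureHex || !/^[0-9a-fA-F]{64}$/.test(signatureHex)) return false;

  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // Constant-time comparison avoids leaking how many bytes matched.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

The HMAC has to be computed over the exact bytes received. Re-serializing a parsed JSON body can change whitespace or key order and make valid deliveries fail verification.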
Prompt shape for Codex:
```text
You are reviewing a Gitea pull request.
Focus only on issues introduced by this PR.
Prioritize correctness, security, data loss, broken behavior, bad migrations, and missing tests.
Avoid style nitpicks.

Return JSON:
{
  "verdict": "correct" | "has_issues",
  "confidence": 0.0-1.0,
  "summary": "...",
  "findings": [
    {
      "severity": "low|medium|high|critical",
      "file": "...",
      "line_start": 1,
      "line_end": 1,
      "title": "...",
      "body": "...",
      "suggestion": "..."
    }
  ]
}
```
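Model output should not be trusted to match the requested shape, so validating it before posting is worthwhile. The sketch below hand-rolls the checks for the JSON shape in the prompt; the type and function names are hypothetical, and a schema library could replace it. It assumes every finding field is required, which the prompt does not state explicitly.

```ts
type Severity = "low" | "medium" | "high" | "critical";

interface Finding {
  severity: Severity;
  file: string;
  line_start: number;
  line_end: number;
  title: string;
  body: string;
  suggestion: string;
}

interface ReviewResult {
  verdict: "correct" | "has_issues";
  confidence: number;
  summary: string;
  findings: Finding[];
}

const SEVERITIES: readonly string[] = ["low", "medium", "high", "critical"];

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function isFinding(v: unknown): v is Finding {
  if (!isRecord(v)) return false;
  return (
    typeof v.severity === "string" && SEVERITIES.includes(v.severity) &&
    typeof v.file === "string" &&
    Number.isInteger(v.line_start) && Number.isInteger(v.line_end) &&
    (v.line_start as number) <= (v.line_end as number) &&
    typeof v.title === "string" &&
    typeof v.body === "string" &&
    typeof v.suggestion === "string"
  );
}

// Throws on malformed output so the job can be marked failed instead of posting garbage.
function parseReviewResult(raw: string): ReviewResult {
  const data: unknown = JSON.parse(raw);
  if (!isRecord(data)) throw new Error("review result must be a JSON object");
  const { verdict, confidence, summary, findings } = data;
  if (verdict !== "correct" && verdict !== "has_issues") throw new Error("invalid verdict");
  if (typeof confidence !== "number" || confidence < 0 || confidence > 1) {
    throw new Error("invalid confidence");
  }
  if (typeof summary !== "string") throw new Error("invalid summary");
  if (!Array.isArray(findings) || !findings.every(isFinding)) throw new Error("invalid findings");
  return { verdict, confidence, summary, findings };
}
```

A failed parse is a natural place to set the review row's `status` to a failure state and tell the requester to rerun.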
Practical build order:
```text
1. Make bot account + token.
2. Add webhook receiver.
3. Verify signature + parse @codex review.
4. Post “queued” comment.
5. Clone repo and generate diff.
6. Run Codex headless.
7. Post one summary comment.
8. Add dedupe + SQLite.
9. Add per-repo config file.
10. Add optional inline comments/fix branches later.
```
Per-repo config idea:
```yaml
# .codex-review.yml
enabled: true
review:
  default_mode: summary
  max_diff_bytes: 200000
  include_tests: true
  focus:
    - correctness
    - security
    - maintainability
  ignore:
    - "dist/**"
    - "pnpm-lock.yaml"
    - "*.min.js"
commands:
  allow_fix: false
```
Final recommendation: external webhook bot, summary comments first, bot account + token, Codex headless JSON, SQLite queue. Inline review comments and auto-fix branches are v2/v3. Trying to make the first version a full GitHub Copilot Reviews clone is how this becomes annoying trash.
[1]: https://docs.gitea.com/usage/repository/webhooks "Webhooks | Gitea Documentation"
[2]: https://developers.openai.com/cookbook/examples/codex/build_code_review_with_codex_sdk "Build Code Review with the Codex SDK"
[3]: https://docs.gitea.com/development/api-usage "API Usage | Gitea Documentation"