
# AGENTS.md

Guidance for autonomous/code-assist agents working in this repository.

## Mission

Build and maintain a webhook-driven Gitea PR review bot that:

1. verifies webhook authenticity,
2. parses `@codex` commands,
3. queues and executes review jobs,
4. posts/updates PR comments with structured findings.

Primary implementation lives under `src/gitea_codex_bot`.

## Tech Stack

- Python `>=3.11`
- FastAPI + Uvicorn
- SQLAlchemy + Alembic
- MariaDB by default; SQLite is supported by setting `DATABASE_URL`
- Pytest for tests
- Docker-based review runner with host fallback

## Repository Map

- `src/gitea_codex_bot/main.py`
  - FastAPI app, `/healthz`, `/webhook/gitea`, lifespan worker boot.
- `src/gitea_codex_bot/config.py`
  - Environment-backed settings and DB URL composition.
- `src/gitea_codex_bot/db.py`
  - Engine/session factory and dependency session provider.
- `src/gitea_codex_bot/models.py`
  - ORM models: `WebhookEvent`, `ReviewJob`, `ReviewRun`, `BotComment`.
- `src/gitea_codex_bot/services/commands.py`
  - `@codex` command parsing.
- `src/gitea_codex_bot/services/jobs.py`
  - Event dedupe, queue transitions, cooldown logic.
- `src/gitea_codex_bot/services/security.py`
  - HMAC signature verification and payload digest.
- `src/gitea_codex_bot/services/gitea.py`
  - Gitea API client wrapper.
- `src/gitea_codex_bot/services/reviewer.py`
  - PR checkout, diff collection, prompt building, OpenAI call, fallback, and fix helpers.
- `src/gitea_codex_bot/services/review_format.py`
  - Outbound comment formatting.
- `src/gitea_codex_bot/services/comments.py`
  - Persistent summary comment id tracking.
- `src/gitea_codex_bot/workers/dispatcher.py`
  - Job polling and orchestration.
- `src/gitea_codex_bot/workers/container_runner.py`
  - Docker review execution + fallback.
- `alembic/` + `alembic.ini`
  - Database migrations.
- `tests/`
  - Unit and light integration tests covering config, security, jobs, webhook, and migrations.
- `.gitea/workflows/ci.yml`
  - CI test + publish workflow.

## Runtime Flow

1. Gitea sends a webhook to `POST /webhook/gitea`.
2. The signature is validated (`X-Gitea-Signature` header, HMAC-SHA256).
3. Unsupported events are ignored.
4. PR context and command are extracted.
5. Repo allowlist and dedupe checks run.
6. Job is enqueued (`review_jobs`).
7. Background worker claims queued jobs.
8. For review/rerun:
   - fetch PR context,
   - run review in ephemeral container if possible,
   - fall back to host execution on failure,
   - post or edit persistent PR summary comment.
9. Job/run status transitions are persisted.

## Supported Commands

- `@codex review [security|performance|tests] [--full]`
- `@codex rerun`
- `@codex explain`
- `@codex fix [--branch ...]` (gated by `ENABLE_FIX_COMMANDS`)
- `@codex ignore`

## Local Development

Install and run:

```bash
python -m pip install -e .[dev]
alembic upgrade head
uvicorn gitea_codex_bot.main:app --host 0.0.0.0 --port 8000
```

Run tests:

```bash
pytest
```

Docker compose:

```bash
docker compose up --build
```

## Environment Contract

Required:

- `GITEA_BASE_URL`
- `GITEA_TOKEN`
- `GITEA_BOT_USERNAME`
- `GITEA_WEBHOOK_SECRET`
- `ALLOWED_REPOS`
- `DB_HOST`, `DB_PORT`, `DB_NAME`, `DB_USER`, `DB_PASSWORD`

Common optional:

- `DATABASE_URL` (overrides the individual `DB_*` settings)
- `OPENAI_API_KEY` (required when `CODEX_AUTH_MODE=api_key`)
- `OPENAI_PROJECT_ID`, `OPENAI_ORG_ID`
- `OPENAI_REVIEW_MODEL`
- `OPENAI_REASONING_EFFORT`
- `CODEX_AUTH_MODE` (`api_key` default, `chatgpt` supported)
- `CODEX_AUTH_JSON_PATH` (custom path to `auth.json` for `chatgpt` mode)
- `WORKDIR`, `MAX_DIFF_BYTES`, `MAX_REVIEW_MINUTES`, `CONCURRENCY`
- `REVIEW_RUNNER_IMAGE`
- `ENABLE_FIX_COMMANDS`
- `ALLOW_UNTRUSTED_FORKS`
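
The relationship between `DATABASE_URL` and the `DB_*` settings can be illustrated with the sketch below. It is not the code in `config.py`: the function name `build_database_url` is hypothetical, and the `mysql+pymysql` driver prefix is an assumption chosen only to make the example concrete.

```python
from collections.abc import Mapping
from urllib.parse import quote_plus

DB_PARTS = ("DB_HOST", "DB_PORT", "DB_NAME", "DB_USER", "DB_PASSWORD")


def build_database_url(env: Mapping[str, str]) -> str:
    """Return DATABASE_URL if set, otherwise compose a URL from the DB_* parts."""
    explicit = env.get("DATABASE_URL", "").strip()
    if explicit:
        # An explicit URL wins over the individual parts.
        return explicit
    missing = [key for key in DB_PARTS if not env.get(key)]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")
    # Percent-encode credentials so characters like "@" or "/" cannot
    # be misread as URL delimiters.
    user = quote_plus(env["DB_USER"])
    password = quote_plus(env["DB_PASSWORD"])
    # Assumed driver prefix; the project may use a different one.
    return f"mysql+pymysql://{user}:{password}@{env['DB_HOST']}:{env['DB_PORT']}/{env['DB_NAME']}"
```

Passing the environment as a mapping (for example `os.environ`) rather than reading it globally keeps the function easy to test with plain dictionaries.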

## Database and Migrations

- SQLAlchemy models are authoritative for runtime behavior.
- Alembic migrations in `alembic/versions` must track schema changes.
- If the model schema changes, add a migration and keep the migration tests passing.
- CI runs `alembic upgrade head` before pytest.

## Testing Expectations

Before opening/merging changes:

1. run `pytest`,
2. if DB/model changes were made, ensure migration test still passes,
3. for webhook/queue logic, add or update focused tests in `tests/`.

Current tests rely on `tests/conftest.py` to inject default environment variables and DB URL behavior.

## Change Guardrails

- Preserve webhook security checks and allowlist semantics.
- Preserve dedupe constraints (`delivery_id`, `repo+comment_id`, `repo+trigger_comment_id`).
- Keep bot self-comment ignore behavior.
- Keep persistent comment update behavior (avoid comment spam regressions).
- Be explicit when changing runner isolation/fallback behavior; this is a security-sensitive area.
- Keep response payloads and command parsing backward compatible unless intentionally versioned.
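
To show why the dedupe constraints matter, here is a small sketch that enforces them with database unique constraints, using the standard-library `sqlite3` module instead of the project's SQLAlchemy models. The table layout, the `webhook_events` table name, the column placement, and the `record_and_enqueue` helper are all assumptions for illustration; only the three constraint keys and the `review_jobs` table name come from this document.

```python
import sqlite3

# Hypothetical schema: the real tables are defined by the ORM models and migrations.
SCHEMA = """
CREATE TABLE webhook_events (
    id INTEGER PRIMARY KEY,
    delivery_id TEXT NOT NULL UNIQUE,
    repo TEXT NOT NULL,
    comment_id INTEGER,
    UNIQUE (repo, comment_id)
);
CREATE TABLE review_jobs (
    id INTEGER PRIMARY KEY,
    repo TEXT NOT NULL,
    trigger_comment_id INTEGER NOT NULL,
    status TEXT NOT NULL DEFAULT 'queued',
    UNIQUE (repo, trigger_comment_id)
);
"""


def record_and_enqueue(conn: sqlite3.Connection, delivery_id: str, repo: str, comment_id: int) -> bool:
    """Record the event and enqueue a job; return False for a duplicate."""
    try:
        with conn:  # commits on success, rolls back if an insert fails
            conn.execute(
                "INSERT INTO webhook_events (delivery_id, repo, comment_id) VALUES (?, ?, ?)",
                (delivery_id, repo, comment_id),
            )
            conn.execute(
                "INSERT INTO review_jobs (repo, trigger_comment_id) VALUES (?, ?)",
                (repo, comment_id),
            )
    except sqlite3.IntegrityError:
        # A unique constraint fired: this delivery or comment was already handled.
        return False
    return True
```

Letting the database reject duplicates, rather than checking with a prior `SELECT`, stays correct when a webhook is redelivered while the first delivery is still being processed.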

## Known Risks / Active Gaps

See `TODO.md` for priority backlog, especially:

- stronger isolated runner flow,
- stricter host fallback controls,
- end-to-end integration coverage.

Treat these as high-sensitivity areas when modifying worker/runner paths.
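
As one way to reason about fallback controls, the sketch below makes host fallback an explicit, opt-in decision. It does not describe `workers/container_runner.py`: `run_review`, `RunResult`, and the `allow_host_fallback` flag are hypothetical names, and the policy shown is only an example.

```python
import logging
from collections.abc import Callable
from dataclasses import dataclass

logger = logging.getLogger("runner_sketch")


@dataclass(frozen=True)
class RunResult:
    output: str
    runner: str  # "container" or "host"


def run_review(
    container_run: Callable[[], str],
    host_run: Callable[[], str],
    *,
    allow_host_fallback: bool,
) -> RunResult:
    """Try the isolated runner first; use the host only when explicitly allowed."""
    try:
        return RunResult(container_run(), "container")
    except Exception as exc:
        if not allow_host_fallback:
            # Fail closed: surface the container error instead of
            # silently running untrusted code on the host.
            raise
        logger.warning("container run failed (%s); falling back to host execution", exc)
        return RunResult(host_run(), "host")
```

Recording which runner produced a result makes it possible to audit how often the less isolated path is taken.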

## Recommended Workflow for Agents

1. Read the touched service and its corresponding tests first.
2. Make minimal cohesive changes.
3. Add/update tests with behavior changes.
4. Run `pytest`.
5. Summarize impact, risks, and follow-ups in PR/commit notes.