Decision Log
This file records key decisions for diffsan. Keep entries short and factual.
When changing a decision, add a new entry (do not rewrite history).
D001 — Tool scope and primary goal¶
Decision: diffsan focuses on AI review of MR diffs in GitLab CI, prioritizing correctness/security findings first.
Why: Highest value and aligns with “do not block merges” human-review workflow.
Implications: Output schema and prompting emphasize correctness/security; other categories exist but are secondary.
D002 — Monolithic CLI architecture¶
Decision: Build as a single Python CLI (monolith) with internal module boundaries and contracts. Why: Small project, fastest to ship/debug, simple pipx distribution, minimal ops overhead. Implications: Clear internal contracts (Pydantic models + error codes + events) to prevent a “ball of mud.”
D003 — Distribution via pipx¶
Decision: Users install/run diffsan primarily via pipx.
Why: Simple, isolated installs; good for CI usage.
Implications: Keep dependencies reasonable; avoid heavy system requirements; ensure CLI entrypoint is stable.
D004 — Primary execution mode is GitLab CI¶
Decision: CI mode is the primary path; standalone mode is minimal and not a priority. Why: Main use case is MR pipelines. Implications: Standalone prints to stdout only; CI mode handles GitLab context, posting, skip logic, artifacts.
D005 — Agent support MVP: Cursor CLI only¶
Decision: MVP supports Cursor CLI headless agent only; Codex CLI is future work. Why: Reduces scope and accelerates MVP delivery. Implications: Design agent runner with an adapter seam to add Codex later.
D006 — Cursor output handling: strict JSON + repair retries¶
Decision: Require the agent to output only JSON matching schema; implement repair retries because Cursor is unstructured. Why: Posting requires structured data; Cursor may emit extra text. Implications: Pydantic validation drives retry loop; raw outputs are stored as artifacts; invalid outputs never posted to MR.
D007 — Review output schema: Pydantic as source of truth¶
Decision: Define review output and internal contracts with Pydantic models. Why: Strong typing and validation; easy to serialize artifacts; clearer contracts for agents. Implications: Any contract change must update both docs and Pydantic models and example JSON fixtures.
D008 — Findings minimum fields¶
Decision: Each finding includes: severity, category, path, line_start, line_end, body_markdown, optional suggested_patch.
Why: Minimum needed to create discussions and be useful.
Implications: Agents must supply line ranges; if positions cannot be computed, findings still appear in summary and artifacts.
D009 — Fingerprinting: sha256(raw diff text)¶
Decision: Fingerprint is sha256 of the raw diff text.
Why: Simple and deterministic for MVP.
Implications: Whitespace-only changes or ordering changes alter fingerprint; normalization may be added later.
D010 — Prior context: compact digest, avoid repeats¶
Decision: Store prior fingerprint in the summary note; feed agent a compact digest of prior findings and instruct it to avoid repeating and not re-assert unresolved issues. Why: Prevent spam and improve signal-to-noise. Implications: Digest parsing must be tolerant; agent prompting must explicitly discourage repetition.
D011 — Secret handling: best-effort redaction + MR warning¶
Decision: Perform best-effort regex-based secret detection and redaction before prompting; if secrets found, log high severity and include a warning in MR summary. Why: Prevent sensitive data leakage; raise awareness when secrets are in diffs. Implications: Never store raw secrets; store only hashes/lengths. MR warning must not reveal secret content.
D012 — Truncation: configurable hard limits, disclose in summary¶
Decision: Apply configurable hard limits (max_diff_chars, max_files, max_hunks_per_file). If truncation occurs, proceed with review but clearly disclose truncation in summary, including collapsible details of what was omitted.
Why: Context limits and cost control; transparency to reviewers.
Implications: Preprocessor must emit truncation report; formatter must include a truncation <details> section.
D013 — Skip rules MVP: auto-merge only, silent skip¶
Decision: MVP skip condition is auto-merge enabled. When skipping, do not post to MR; only print to stdout. Why: Start simple; avoid noise in auto-merge flows. Implications: GitLab integration must reliably detect auto-merge; if uncertain, fail open (do not skip).
D014 — Posting target: GitLab only (MVP)¶
Decision: Post to GitLab MR as:
- summary as a note
- findings as discussions when position is computable Why: Primary environment is GitLab; keep scope tight. Implications: GitLab API client is required; positioning errors must degrade gracefully.
D015 — Posting identity: dedicated bot user + project access token¶
Decision: Use a dedicated bot user via project access token. Why: Clear attribution and permissions management. Implications: Token must be provided via env var; auth failures are non-retryable.
D016 — Posting behavior: default new summary note; idempotent optional¶
Decision: Default is posting a new summary note each time; allow idempotent mode via config to update/replace. Why: Default is simple; idempotency helps reduce clutter for some teams. Implications: Idempotent mode requires locating prior note via tag/marker.
D017 — Failure behavior: non-zero exit, artifacts always¶
Decision: Return non-zero on errors; CI job can be configured allow-failure. Always write artifacts (prompt, raw output, review where possible) even on failure. Why: Don’t block merges, but do surface failures; preserve debuggability. Implications: Orchestrator must be careful to persist artifacts early and often.
D018 — Standalone mode: stdout only¶
Decision: Standalone mode prints to stdout and writes artifacts, but does not post to GitLab. Why: Keep MVP small; CI is the priority. Implications: CLI flags should make CI mode explicit; standalone accepts simplest diff source.
D019 — Skip on same fingerprint enabled by default¶
Decision: Add skip_on_same_fingerprint=true by default; skip when current diff fingerprint equals latest prior diffsan summary fingerprint.
Why: Reduce duplicate reviews/spam on re-runs with unchanged diff content.
Implications: Summary notes include a machine-readable fingerprint marker; skip logic compares current fingerprint against latest prior note context.
D020 — Agent support: Cursor default plus Codex structured path¶
Decision: Support both Cursor CLI and Codex CLI, with Cursor as default.
Why: Keep backward compatibility while enabling a structured-output backend.
Implications: Config must select agent kind explicitly (cursor or codex); orchestrator dispatches to agent-specific runners.
D021 — Prompting and retry policy are agent-specific¶
Decision: Keep schema-in-prompt + JSON repair retries for Cursor, but omit those sections and retries for Codex.
Why: Cursor is unstructured and needs repair loops; Codex supports schema-constrained output files.
Implications: Prompt builder and run pipeline branch by agent; output still passes through shared parse_and_validate before posting.