Configuration¶
This page documents the runtime inputs that affect diffsan today:
- TOML config file values
- `DIFFSAN_*` environment overrides
- CLI flags that override part of the config
- required CI and auth environment variables that are not part of `AppConfig`
The canonical config schema lives in src/diffsan/contracts/models.py. This page reflects the current implementation in src/diffsan/core/config.py, src/diffsan/run.py, and related runtime modules.
Precedence¶
diffsan resolves configuration in this order:
- CLI overrides
- `DIFFSAN_*` environment variables
- Config file
- Built-in defaults
Current CLI overrides:
- `--ci/--no-ci` overrides `mode.ci`
- `--agent <cursor|codex>` overrides `agent.agent`
- `--proxy-url <url>` overrides `agent.proxy_url` (Codex only)
- `--config <path>` selects the TOML config file to load
workdir and note_timezone are config values, but the current public CLI does not expose dedicated flags for them.
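The precedence order in this section amounts to a layered merge in which higher layers win key by key. The following is a minimal sketch under that assumption, not diffsan's actual loader: `deep_merge` and `resolve_config` are hypothetical helper names, and each layer is represented as a plain dictionary.

```python
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict in which values from `override` win over `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested tables such as [mode] or [agent] key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_config(defaults, config_file, env, cli):
    """Apply layers from lowest to highest precedence."""
    resolved = defaults
    for layer in (config_file, env, cli):
        resolved = deep_merge(resolved, layer)
    return resolved


resolved = resolve_config(
    defaults={"workdir": ".diffsan", "mode": {"ci": False}, "agent": {"agent": "cursor"}},
    config_file={"mode": {"ci": True}},
    env={"agent": {"agent": "codex"}},
    cli={"mode": {"ci": False}},
)
print(resolved)
```

Here the CLI layer turns `mode.ci` back off even though the config file enabled it, while the environment layer's agent choice survives because no higher layer touches it.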
Config File Discovery¶
diffsan looks for a TOML config file in this order:
- `--config <path>`
- `DIFFSAN_CONFIG_FILE`
- `.diffsan.toml` in the current working directory
- no config file
Rules:
- The selected path must exist.
- The selected path must be a file, not a directory.
- The file must contain a top-level TOML table/object.
- Unknown config keys are rejected during validation.
If config parsing or validation fails, diffsan exits with CONFIG_PARSE_ERROR.
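The lookup order and the first two rules can be sketched as follows. This is an illustration only: `find_config_file` is a hypothetical name, the sketch raises built-in exceptions rather than producing `CONFIG_PARSE_ERROR`, and it assumes that a missing `.diffsan.toml` in the working directory simply means "no config file".

```python
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def find_config_file(
    cli_path: Optional[str],
    env: Mapping[str, str],
    cwd: Path,
) -> Optional[Path]:
    """Pick the config file using the documented lookup order."""
    explicit = cli_path or env.get("DIFFSAN_CONFIG_FILE")
    if explicit:
        path = Path(explicit)
        # An explicitly selected path must exist and must be a regular file.
        if not path.exists():
            raise FileNotFoundError(f"config file not found: {path}")
        if not path.is_file():
            raise IsADirectoryError(f"config path is not a file: {path}")
        return path
    default = cwd / ".diffsan.toml"
    # Assumption: an absent default file is not an error.
    return default if default.is_file() else None


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    cwd = Path(tmp)
    no_config = find_config_file(None, {}, cwd)
    (cwd / ".diffsan.toml").write_text('workdir = ".diffsan"\n')
    found_name = find_config_file(None, {}, cwd).name
```

In real use the caller would pass `os.environ` and `Path.cwd()` instead of the demonstration values.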
Environment Variable Mapping¶
diffsan uses the DIFFSAN_ prefix for config overrides.
- Top-level keys map directly:
  - `DIFFSAN_WORKDIR=.ai-review`
  - `DIFFSAN_NOTE_TIMEZONE=Asia/Singapore`
- Nested keys use `__`:
  - `DIFFSAN_MODE__CI=true`
  - `DIFFSAN_LIMITS__MAX_FILES=80`
  - `DIFFSAN_GITLAB__SUMMARY_NOTE_TAG=security-bot`
Notes:
- Booleans should be passed as `true`/`false`.
- Integers should be passed as numeric strings.
- Lists and other structured values should be passed as JSON strings.
- `DIFFSAN_CONFIG_FILE` is special: it selects the config file path and is not part of `AppConfig`.
Examples:
export DIFFSAN_WORKDIR=".ai-review"
export DIFFSAN_MODE__CI="true"
export DIFFSAN_AGENT__AGENT="codex"
export DIFFSAN_AGENT__PROXY_URL="https://proxy.example.com/v1"
export DIFFSAN_TRUNCATION__INCLUDE_EXTENSIONS='[".py",".ts"]'
export DIFFSAN_SECRETS__EXTRA_PATTERNS='["ghp_[A-Za-z0-9]{36}"]'
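To make the naming rule concrete, here is a rough sketch of how `DIFFSAN_*` variables could be turned into a nested override dictionary. It is not the project's implementation: `env_overrides` and `parse_value` are hypothetical names, and the sketch decodes values with a plain JSON attempt, whereas the real loader presumably coerces values against the schema types.

```python
import json
from typing import Any, Mapping

PREFIX = "DIFFSAN_"
SPECIAL = {"DIFFSAN_CONFIG_FILE"}  # selects the file; not an AppConfig key


def parse_value(raw: str) -> Any:
    """Decode JSON where possible; otherwise keep the raw string."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw


def env_overrides(env: Mapping[str, str]) -> dict[str, Any]:
    overrides: dict[str, Any] = {}
    for name, raw in env.items():
        if not name.startswith(PREFIX) or name in SPECIAL:
            continue
        # "MODE__CI" -> ["mode", "ci"]; a single "_" stays inside the key name.
        path = name[len(PREFIX):].lower().split("__")
        node = overrides
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = parse_value(raw)
    return overrides


overrides = env_overrides(
    {
        "DIFFSAN_WORKDIR": ".ai-review",
        "DIFFSAN_MODE__CI": "true",
        "DIFFSAN_LIMITS__MAX_FILES": "80",
        "DIFFSAN_TRUNCATION__INCLUDE_EXTENSIONS": '[".py",".ts"]',
        "DIFFSAN_CONFIG_FILE": "ci.toml",
        "PATH": "/usr/bin",
    }
)
print(overrides)
```

One limitation of the JSON shortcut: a string setting whose value happens to look like a number would be decoded as a number, which is why schema-driven coercion is the safer design.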
CLI Flags¶
Current diffsan CLI flags:
| Flag | Effect |
|---|---|
| `--ci/--no-ci` | Overrides mode.ci |
| `--agent <cursor\|codex>` | Overrides agent.agent |
| `--proxy-url <url>` | Overrides agent.proxy_url for Codex runs |
| `--config <path>` | Selects the TOML config file |
| `--dry-run` | Runs the no-op harness and writes run artifacts without executing the review pipeline |
| `--version` | Prints the CLI version and exits |
--dry-run is runtime behavior, not part of the persisted config schema.
Full Config Reference¶
Top-Level Keys¶
| Key | Type | Default | Notes |
|---|---|---|---|
| `workdir` | `str` | `.diffsan` | Artifact directory. Created early in the run. |
| `note_timezone` | `str` | system local timezone | Used when rendering timestamps in MR summary metadata. See timezone notes below. |
mode¶
| Key | Type | Default | Notes |
|---|---|---|---|
| `mode.ci` | `bool` | `false` | Enables CI mode. Real diff acquisition currently only works in CI mode; mode.ci = false is not yet a full standalone review path. |
limits¶
| Key | Type | Default | Notes |
|---|---|---|---|
| `limits.max_diff_chars` | `int` | `200000` | Hard cap on prepared diff length sent to the agent. |
| `limits.max_files` | `int` | `60` | Maximum number of diff files kept after ranking/filtering. |
| `limits.max_hunks_per_file` | `int` | `40` | Maximum hunks retained per file before truncation. |
truncation¶
| Key | Type | Default | Notes |
|---|---|---|---|
| `truncation.priority_extensions` | `list[str]` | `[".py", ".js", ".ts", ".go", ".java", ".rb", ".php", ".rs"]` | Extensions ranked earlier when truncation is needed. |
| `truncation.depriority_extensions` | `list[str]` | `[".md", ".rst", ".txt", ".lock"]` | Extensions ranked later when truncation is needed. |
| `truncation.include_extensions` | `list[str] \| null` | `null` | If set, only files with these extensions are kept. |
| `truncation.ignore_globs` | `list[str]` | `["docs/**", "**/*.generated.*"]` | Paths dropped before prompt construction. |
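As a rough illustration of how these four settings could interact, the sketch below drops ignored paths, applies the optional allow-list, ranks the remainder, and caps the result. It is an assumption-laden sketch rather than diffsan's pipeline: `select_files` is a hypothetical name, and glob matching uses Python's `fnmatch`, whose `*` also matches `/`, so the real glob semantics may differ.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

PRIORITY = {".py", ".js", ".ts", ".go", ".java", ".rb", ".php", ".rs"}
DEPRIORITY = {".md", ".rst", ".txt", ".lock"}
IGNORE_GLOBS = ["docs/**", "**/*.generated.*"]


def select_files(paths, include_extensions=None, max_files=60):
    """Drop ignored paths, apply the optional allow-list, rank, then cap."""
    kept = []
    for path in paths:
        if any(fnmatch(path, glob) for glob in IGNORE_GLOBS):
            continue
        ext = PurePosixPath(path).suffix
        if include_extensions is not None and ext not in include_extensions:
            continue
        kept.append(path)

    def rank(path):
        ext = PurePosixPath(path).suffix
        if ext in PRIORITY:
            return 0
        if ext in DEPRIORITY:
            return 2
        return 1

    # sorted() is stable, so files keep their original order within a rank.
    return sorted(kept, key=rank)[:max_files]


paths = [
    "README.md",
    "docs/guide.md",
    "src/app.py",
    "Cargo.lock",
    "src/api.generated.ts",
    "config.yaml",
]
print(select_files(paths))
```

With the default lists, source files come first, unlisted extensions sit in the middle, and documentation and lock files are the first candidates to be cut when `max_files` is reached.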
secrets¶
| Key | Type | Default | Notes |
|---|---|---|---|
| `secrets.enabled` | `bool` | `true` | Enables secret scanning and redaction before prompting. |
| `secrets.extra_patterns` | `list[str]` | `[]` | Extra regex patterns to redact in addition to built-in detectors. |
| `secrets.post_warning_to_mr` | `bool` | `true` | When redaction occurs, include a warning section in the summary note. Despite the name, the current implementation does not post a separate warning note. |
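A small sketch of what pattern-based redaction with `secrets.extra_patterns` might look like. The `redact` helper and the `[REDACTED]` placeholder are assumptions for illustration; the built-in detectors and the actual replacement text are not described on this page.

```python
import re

# Example patterns of the kind you might put in secrets.extra_patterns.
EXTRA_PATTERNS = [r"ghp_[A-Za-z0-9]{36}", r"glpat-[A-Za-z0-9_-]{20,}"]


def redact(text, patterns, placeholder="[REDACTED]"):
    """Replace every match of any pattern and report how many were replaced."""
    total = 0
    for pattern in patterns:
        text, count = re.subn(pattern, placeholder, text)
        total += count
    return text, total


# Build a fake token so no real-looking secret appears in the example.
diff_line = "+TOKEN = 'ghp_" + "a" * 36 + "'"
redacted, hits = redact(diff_line, EXTRA_PATTERNS)
print(redacted, hits)
```

A non-zero hit count is the kind of signal that would trigger the warning section controlled by `secrets.post_warning_to_mr`.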
skip¶
| Key | Type | Default | Notes |
|---|---|---|---|
| `skip.skip_on_auto_merge` | `bool` | `true` | Skip posting when MR auto-merge is detected. |
| `skip.skip_on_same_fingerprint` | `bool` | `true` | Skip posting when the current diff fingerprint matches the latest prior diffsan fingerprint. |
agent¶
| Key | Type | Default | Notes |
|---|---|---|---|
| `agent.agent` | `"cursor" \| "codex"` | `cursor` | Selects the agent backend. |
| `agent.cursor_command` | `str \| null` | `null` | Custom Cursor command. If omitted, diffsan uses the built-in default. |
| `agent.codex_command` | `str \| null` | `null` | Custom Codex command. If omitted, diffsan uses the built-in default. |
| `agent.proxy_url` | `str \| null` | `null` | Codex-only proxy URL. When set, diffsan updates ~/.codex/config.toml before invoking Codex. |
| `agent.max_json_retries` | `int` | `3` | Maximum Cursor parse/repair attempts. Ignored for Codex runs. |
| `agent.json_repair_prompt` | `str` | `Return ONLY valid JSON that matches the schema.` | Prefix text used when building Cursor repair prompts. Ignored for Codex runs. |
| `agent.verbosity` | `"low" \| "medium" \| "high"` | `medium` | Passed into prompt guidance. |
| `agent.skills` | `list[str]` | `[]` | Passed into prompt guidance as lightweight review hints. |
| `agent.prompt_template` | `str \| null` | `null` | Reserved in the schema, but not currently used by the prompt builder. |
Validation rule:
- `agent.proxy_url` is only valid when `agent.agent = "codex"`.
gitlab¶
| Key | Type | Default | Notes |
|---|---|---|---|
| `gitlab.enabled` | `bool` | `true` | Controls GitLab prior-context fetches and posting. When false, diffsan still prints the summary to stdout. |
| `gitlab.base_url` | `str` | `https://gitlab.com` | Base GitLab URL. Can be a site root or an /api/v4 URL. In CI, CI_API_V4_URL takes precedence when present. |
| `gitlab.project_id` | `str \| null` | `null` | GitLab project identifier. If unset, diffsan falls back to CI_PROJECT_ID. |
| `gitlab.mr_iid` | `int \| null` | `null` | Merge request IID. If unset, diffsan falls back to CI_MERGE_REQUEST_IID. |
| `gitlab.token_env` | `str` | `GITLAB_TOKEN` | Name of the environment variable that contains the GitLab token. |
| `gitlab.idempotent_summary` | `bool` | `false` | Included in post planning metadata, but current posting still creates a new summary note each run. |
| `gitlab.summary_note_tag` | `str` | `ai-reviewer` | Marker used to locate prior diffsan summary notes. |
| `gitlab.retry_max` | `int` | `3` | Maximum GitLab API attempts for retryable failures such as 429, 5xx, and transient network errors. |
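To illustrate what `gitlab.retry_max` bounds, here is a generic retry loop. Everything beyond "at most `retry_max` attempts for 429, 5xx, and transient network errors" is an assumption: `call_with_retries` is a hypothetical helper, `ConnectionError` stands in for transient network failures, and the exponential backoff is illustrative since this page does not describe diffsan's delay policy.

```python
import time


def is_retryable(status):
    """429 and any 5xx status are treated as retryable."""
    return status == 429 or 500 <= status <= 599


def call_with_retries(send, retry_max=3, base_delay=0.0):
    """Call `send()` up to `retry_max` times; `send` returns an HTTP status.

    Returns (last_status, attempts_used). last_status is None when the
    final attempt ended in a network error.
    """
    status = None
    for attempt in range(1, retry_max + 1):
        try:
            status = send()
        except ConnectionError:
            status = None  # transient network error: eligible for retry
        else:
            if not is_retryable(status):
                return status, attempt
        if attempt < retry_max:
            # Assumed backoff: base_delay, 2x, 4x, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
    return status, retry_max


responses = iter([503, 429, 200])
outcome = call_with_retries(lambda: next(responses), retry_max=3)
print(outcome)
```

Note that `retry_max` counts total attempts in this sketch, matching the table's wording, so `retry_max = 3` means one initial call plus at most two retries.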
logging¶
| Key | Type | Default | Notes |
|---|---|---|---|
| `logging.level` | `"error" \| "warn" \| "info" \| "debug"` | `info` | Present in the schema, but not currently wired into runtime logging behavior. |
| `logging.structured` | `bool` | `true` | Present in the schema, but current runs always emit structured events.jsonl. |
Timezone Values¶
note_timezone controls how MR summary timestamps are rendered.
Supported values in the current formatter:
- IANA timezone names such as `UTC` or `Asia/Singapore`
- `LOCAL` to use the runner's local timezone at render time
- `SGT` as an alias for `Asia/Singapore`
- UTC offsets such as `+08:00` or `-05:30`
If the configured value is invalid, summary-note rendering falls back to Asia/Singapore (SGT).
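The accepted value forms can be sketched with the standard library. This is not the project's formatter: `resolve_note_timezone` is a hypothetical name, and the fallback is modelled here as a fixed `+08:00` offset labelled `SGT` so the sketch does not depend on the system tz database, whereas the documented fallback is the `Asia/Singapore` zone itself.

```python
import re
from datetime import datetime, timedelta, timezone, tzinfo
from zoneinfo import ZoneInfo

ALIASES = {"SGT": "Asia/Singapore"}
OFFSET_RE = re.compile(r"^([+-])([01]\d|2[0-3]):([0-5]\d)$")
# Stand-in for the documented Asia/Singapore fallback (assumption, see above).
FALLBACK = timezone(timedelta(hours=8), "SGT")


def resolve_note_timezone(value: str) -> tzinfo:
    """Map a note_timezone string to a tzinfo, falling back on bad input."""
    match = OFFSET_RE.match(value)
    if match:
        sign, hours, minutes = match.groups()
        delta = timedelta(hours=int(hours), minutes=int(minutes))
        return timezone(-delta if sign == "-" else delta)
    if value == "LOCAL":
        # The runner's local timezone at the moment of the call.
        return datetime.now().astimezone().tzinfo
    try:
        return ZoneInfo(ALIASES.get(value, value))
    except (KeyError, ValueError, OSError):
        # Unknown or malformed zone names end up here.
        return FALLBACK


stamp = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
print(stamp.astimezone(resolve_note_timezone("+08:00")).isoformat())
```

Checking the offset pattern first keeps values such as `-05:30` from ever being passed to the zone database lookup.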
Agent Command Behavior¶
Cursor¶
If agent.cursor_command is unset, diffsan runs:
If CURSOR_API_KEY is set, diffsan inserts:
If you provide a custom agent.cursor_command and it does not already include one of these trust flags, diffsan appends `--trust`:
- `--trust`
- `--yolo`
- `-f`
Sensitive argument values such as API keys are redacted from persisted error context.
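The trust-flag rule for custom Cursor commands can be expressed in a few lines. This is a sketch of the documented behavior only: `ensure_trust_flag` is a hypothetical helper, and `my-cursor-wrapper` and its `--print` argument are placeholder values, not real Cursor CLI invocations.

```python
import shlex

TRUST_FLAGS = {"--trust", "--yolo", "-f"}


def ensure_trust_flag(command: str) -> list[str]:
    """Append --trust unless the command already carries a trust flag."""
    argv = shlex.split(command)
    if not TRUST_FLAGS.intersection(argv):
        argv.append("--trust")
    return argv


print(ensure_trust_flag("my-cursor-wrapper --print"))
print(ensure_trust_flag("my-cursor-wrapper --yolo"))
```

Splitting with `shlex` before checking avoids false positives from substrings, for example a path that happens to contain `-f`.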
Codex¶
If agent.codex_command is unset, diffsan runs:
codex exec --output-schema <workdir>/codex-output-schema.json --output-last-message <workdir>/codex-output.json --sandbox read-only
Current Codex behavior:
- The prompt is passed on stdin.
- diffsan always writes the output schema to `<workdir>/codex-output-schema.json`.
- diffsan always reads structured JSON from `<workdir>/codex-output.json`.
- If a custom command already includes `--output-schema` or `--output-last-message`, diffsan rewrites those flags to point at the workdir artifacts.
- If a custom command does not include a sandbox value, diffsan inserts `--sandbox read-only`.
- If a custom command already provides a sandbox value, diffsan preserves it.
- If `agent.proxy_url` is set, diffsan rewrites `~/.codex/config.toml` before invoking Codex:
  - sets top-level `model_provider = "proxy"`
  - writes a single `[model_providers.proxy]` block with the supplied `base_url`
  - sets `env_key = "DIFFSAN_OPENAI_API_KEY"`
- When proxy mode is used, diffsan prints a reminder to set `DIFFSAN_OPENAI_API_KEY`.
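The flag-handling rules for custom Codex commands can be sketched as a small argv rewrite. This is an approximation under stated assumptions, not diffsan's code: `normalize_codex_command` is a hypothetical name, the sketch only recognizes the `--flag value` and `--flag=value` spellings, and it rewrites the two output flags only when they are already present, as the rules describe.

```python
import shlex


def normalize_codex_command(command: str, workdir: str) -> list[str]:
    """Point output flags at workdir artifacts and default the sandbox."""
    targets = {
        "--output-schema": f"{workdir}/codex-output-schema.json",
        "--output-last-message": f"{workdir}/codex-output.json",
    }
    argv = shlex.split(command)
    result: list[str] = []
    has_sandbox = False
    i = 0
    while i < len(argv):
        arg = argv[i]
        name, sep, _ = arg.partition("=")
        if name in targets:
            # Rewrite both "--flag value" and "--flag=value" spellings.
            if sep:
                result.append(f"{name}={targets[name]}")
            else:
                result.extend([name, targets[name]])
                i += 1  # skip the caller-supplied value
        else:
            if name == "--sandbox":
                has_sandbox = True  # preserve whatever value follows
            result.append(arg)
        i += 1
    if not has_sandbox:
        result.extend(["--sandbox", "read-only"])
    return result


print(normalize_codex_command("codex exec --output-schema /tmp/s.json", ".diffsan"))
```

Working on a parsed argv list rather than the raw string keeps quoting intact and makes the "preserve an existing sandbox value" rule a simple membership check.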
Runtime Environment Outside DIFFSAN_*¶
Some runtime inputs are not part of the config schema but are still required for a working CI run.
GitLab token¶
diffsan reads the GitLab token from the environment variable named by gitlab.token_env.
By default, that variable is `GITLAB_TOKEN`.
GitLab CI variables¶
Current diff fetching requires these CI variables:
- `CI_MERGE_REQUEST_TARGET_BRANCH_NAME`
- `CI_COMMIT_SHA`
These are used when available and are required unless you override with config where noted:
- `CI_PROJECT_ID`
  - required for GitLab API calls unless `gitlab.project_id` is configured
- `CI_MERGE_REQUEST_IID`
  - required for GitLab API calls unless `gitlab.mr_iid` is configured
- `CI_API_V4_URL`
  - optional; overrides `gitlab.base_url` resolution when present
- `CI_MERGE_REQUEST_SOURCE_BRANCH_NAME`
  - optional; stored in diff metadata
- `CI_MERGE_REQUEST_DIFF_BASE_SHA`
  - optional; used as part of diff metadata and positioning
- `CI_PIPELINE_ID`
  - optional; shown in summary-note metadata
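The fallback rules for the project, merge request, and API URL can be summarized in code. This is a sketch, not the project's resolver: `resolve_gitlab_target` is a hypothetical name, and the way `gitlab.base_url` is normalized to an `/api/v4` URL is an assumption based on the note that either a site root or an `/api/v4` URL is accepted.

```python
from typing import Mapping, Optional


def resolve_gitlab_target(
    project_id: Optional[str],
    mr_iid: Optional[int],
    base_url: str,
    env: Mapping[str, str],
) -> dict:
    """Combine config values with the GitLab CI fallbacks."""
    project = project_id or env.get("CI_PROJECT_ID")
    raw_iid = mr_iid if mr_iid is not None else env.get("CI_MERGE_REQUEST_IID")
    if project is None or raw_iid is None:
        raise ValueError("project id and MR IID are required for GitLab API calls")
    # CI_API_V4_URL wins when present; otherwise normalize base_url so that
    # a site root and an /api/v4 URL end up in the same form.
    api_url = env.get("CI_API_V4_URL")
    if not api_url:
        root = base_url.rstrip("/")
        api_url = root if root.endswith("/api/v4") else f"{root}/api/v4"
    return {"project_id": project, "mr_iid": int(raw_iid), "api_url": api_url}


target = resolve_gitlab_target(
    None,
    None,
    "https://gitlab.com",
    {"CI_PROJECT_ID": "42", "CI_MERGE_REQUEST_IID": "7"},
)
print(target)
```

Configured values take priority over the CI variables for the project and MR, while the API URL is the one case where the CI variable overrides the config value.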
Agent authentication¶
- Cursor default command optionally reads `CURSOR_API_KEY`
- Codex authentication is handled by the `codex` CLI itself unless `agent.proxy_url` is set
If agent.proxy_url is set, diffsan configures Codex with:
[model_providers.proxy]
name = "proxy"
base_url = "<your proxy url>"
env_key = "DIFFSAN_OPENAI_API_KEY"
For proxy-backed Codex runs, you must provide the `DIFFSAN_OPENAI_API_KEY` environment variable.
If agent.proxy_url is not set, any non-proxy Codex authentication still depends on your existing Codex CLI setup.
Example .diffsan.toml¶
Minimal GitLab CI config using Codex¶
workdir = ".diffsan"
note_timezone = "UTC"
[mode]
ci = true
[agent]
agent = "codex"
verbosity = "medium"
[gitlab]
enabled = true
summary_note_tag = "ai-reviewer"
More opinionated config with filtering and secret rules¶
workdir = ".ai-review"
note_timezone = "Asia/Singapore"
[mode]
ci = true
[limits]
max_diff_chars = 250000
max_files = 80
max_hunks_per_file = 60
[truncation]
include_extensions = [".py", ".ts", ".tsx"]
ignore_globs = ["docs/**", "vendor/**", "**/*.generated.*"]
[secrets]
enabled = true
extra_patterns = [
"ghp_[A-Za-z0-9]{36}",
"glpat-[A-Za-z0-9_-]{20,}",
]
post_warning_to_mr = true
[skip]
skip_on_auto_merge = true
skip_on_same_fingerprint = true
[agent]
agent = "cursor"
verbosity = "high"
skills = ["security", "testing"]
max_json_retries = 3
[gitlab]
enabled = true
base_url = "https://gitlab.example.com"
token_env = "GITLAB_TOKEN"
summary_note_tag = "team-diffsan"
retry_max = 5
Practical Recommendations¶
- Set `mode.ci = true` for real MR reviews.
- Keep `workdir` inside the repository workspace so CI artifacts are easy to collect.
- Prefer TOML for stable repo defaults, then use `DIFFSAN_*` env vars for CI-specific overrides.
- Use `gitlab.project_id` and `gitlab.mr_iid` only when you cannot rely on GitLab CI variables.
- Keep `gitlab.summary_note_tag` stable once you start using diffsan on a project, or prior-review detection will fragment.
- Treat `agent.prompt_template`, `logging.level`, `logging.structured`, and `gitlab.idempotent_summary` as forward-looking knobs until their runtime behavior is implemented.