Introduction
bomdrift is a CLI and multi-SCM action that diffs two SBOMs and surfaces supply-chain risk signals on every changed dependency. It flags new CVEs (with EPSS + CISA KEV scoring), typosquats across eight ecosystems, multi-major version jumps, young-maintainer takeovers, registry-metadata signals (recently-published, deprecated, maintainer-set-changed), and license-policy violations, and renders a report ready to drop into a PR comment on GitHub, GitLab, Bitbucket, or Azure DevOps.
What problem does it solve?
The most actionable supply-chain question on a pull request is:
What changed in this diff’s dependencies that I should worry about?
— not “what’s in my SBOM?”. Plenty of tools answer the second question (OSV-scanner, Grype, Trivy). bomdrift answers the first.
Recent incidents bomdrift would have surfaced
axios npm compromise (Mar 31, 2026)
A maintainer was socially engineered (fake Slack/Teams call attributed to
North Korean UNC1069), and axios@1.14.1 + axios@0.30.4 shipped briefly
with a malicious runtime dep plain-crypto-js@4.2.1 that dropped the
WAVESHAPER.V2 RAT on Windows, macOS, and Linux.
Three of bomdrift’s signals would have fired in the diff that pulled the compromised release:
- Added — a brand-new transitive dependency plain-crypto-js@4.2.1 appears.
- Typosquat — plain-crypto-js scores 0.95 against the legitimate crypto-js via the suffix-containment boost rule.
- Vulnerabilities — OSV.dev returns the published advisory IDs (MAL-2026-2306, GHSA-3p68-rc4w-qgx5, etc.) on both versions, with EPSS / KEV badges where applicable.
See examples/axios-incident/
for the SBOM pair and the rendered output.
Shai-Hulud worm (npm, Nov 2025)
More than 700 packages were compromised by a self-replicating worm. Diff-time review of newly added transitive deps and version bumps was the only pre-merge defense. bomdrift’s “added components + CVE enrichment + recently-published registry signal” combination surfaces this class of attack at PR time.
xz-utils backdoor (CVE-2024-3094, Mar 2024)
A 2.6-year social-engineering campaign culminating in a backdoor shipped
in xz 5.6.0/5.6.1. The “Jia Tan” maintainer’s first commit was recent
relative to the release — exactly the maintainer-age heuristic bomdrift
implements via the GitHub REST API. The threshold is tunable via
--young-maintainer-days (default 90; v0.9.6+).
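As a rough illustration of the maintainer-age heuristic, the shell sketch below compares a maintainer's first-commit timestamp with the release timestamp. The function name is_young_maintainer and the use of Unix timestamps are assumptions made for this example; bomdrift itself queries the GitHub REST API and may compute the age differently. Only the 90-day default is taken from the text above.

```sh
#!/bin/sh
# Hypothetical helper, not bomdrift's implementation.
# $1 = first-commit time, $2 = release time (Unix seconds),
# $3 = threshold in days (defaults to 90, the documented default).
is_young_maintainer() {
  first_commit=$1
  release=$2
  threshold_days=${3:-90}
  age_days=$(( (release - first_commit) / 86400 ))
  [ "$age_days" -lt "$threshold_days" ]
}

# A maintainer whose first commit landed 30 days before the release is flagged.
release=1711670400
if is_young_maintainer $(( release - 30 * 86400 )) "$release"; then
  echo "flag: young maintainer"
fi
```

Raising the third argument widens the window, in the same spirit as passing a larger value to --young-maintainer-days.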
Sustained PyPI typosquat campaigns (2024–2026)
Hundreds of malicious packages disguised by single-character substitutions.
Jaro-Winkler similarity against top-N catalogs catches these reliably; see
the Typosquat detection chapter for the full
algorithm and the --typosquat-similarity-threshold knob (v0.9.6+).
Design ethos
- Small dep tree, no Docker, single binary. ~3.4 MB stripped + LTO. No tokio, no chrono, no semver crate, no octocrab — the constraint is load-bearing.
- Best-effort enrichers. Network failures (OSV, EPSS, KEV, GitHub, registries), plugin failures, and attestation-verify failures all warn-and-continue. A PR review is still useful without one signal, and the offline change-shape signals always work.
- Byte-deterministic output. Identical inputs render to byte-identical Markdown / JSON / SARIF / VEX every time, honoring SOURCE_DATE_EPOCH, so PR-comment upserts via peter-evans/create-or-update-comment patch in place instead of accumulating duplicate comments.
- Cosign-signed releases. Every archive carries a Sigstore signature via GitHub OIDC. Action defaults to verifying signatures; opt-out via verify-signatures: false for trusted mirrors. As of v0.9.6, the same cosign machinery can verify the input SBOMs themselves via --before-attestation / --after-attestation.
- OSS-first, no telemetry, no account. Apache-2.0; no daemon, no hosted UI, no signup.
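One way to spot-check the byte-determinism claim locally is to render the same inputs twice and compare the bytes. The helper below is a sketch with a hypothetical name (renders_identically); it is not part of bomdrift, and the commented invocation assumes you have the binary and two SBOM files at hand.

```sh
#!/bin/sh
# Hypothetical helper: run a command twice and compare the bytes it prints.
# Returns 0 when both runs match, 1 when they differ or the command fails.
renders_identically() {
  first=$(mktemp) || return 2
  second=$(mktemp) || { rm -f "$first"; return 2; }
  rc=1
  if "$@" > "$first" && "$@" > "$second" && cmp -s "$first" "$second"; then
    rc=0
  fi
  rm -f "$first" "$second"
  return "$rc"
}

# Intended use (left commented because it needs bomdrift and two SBOMs):
#   export SOURCE_DATE_EPOCH=1700000000
#   renders_identically bomdrift diff before.json after.json --output json
renders_identically printf 'same bytes\n' && echo "deterministic"
```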
Where to next?
Getting started
- New here? Start with the Quickstart.
- Wiring up the GitHub Action? See GitHub Action.
- On another forge? See GitLab CI, Bitbucket Pipelines, or Azure DevOps Pipelines.
- Looking up a specific flag? See CLI reference.
Suppressing findings
- Baseline & suppression — JSON snapshots, in-comment /bomdrift suppress, expires + reason.
- License policy — SPDX expression evaluation with allow/deny + per-exception granularity.
- VEX — OpenVEX 0.2.0 + CycloneDX VEX 1.6 consume / emit.
Output
- Output formats — markdown / terminal / JSON.
- SARIF + Code Scanning — stable rule IDs, fingerprints, Code Scanning ingestion.
Per-signal deep dives
- Enrichers overview — the contract every enricher honors, plus pointers into each chapter.
Advanced
- OCI attestation — fetch SBOMs as cosign verify-attestation-verified OCI artifacts (v0.9.6+).
- Plugins — external-process plugin protocol for custom rules (v0.9.6+).
- Architecture — module map, pipeline, determinism contract.
Quickstart
In a GitHub workflow (recommended)
The most common way to run bomdrift is the composite Action — drop it into
a pull_request workflow and let the action handle checkout, Syft install,
SBOM generation, diffing, and PR-comment posting:
# .github/workflows/sbom-diff.yml
name: SBOM diff
on: pull_request
permissions:
  contents: read
  pull-requests: write # to upsert the diff comment
jobs:
  diff:
    runs-on: ubuntu-latest
    steps:
      - uses: Metbcy/bomdrift@v1
        # Optional:
        # with:
        #   fail-on: critical-cve # exit 2 on HIGH/CRITICAL advisories
        #   path: services/api # scan a monorepo subdirectory
The @v1 mutable tag tracks the latest v0.x release. Pin to a specific
version (@v0.9.9) if you prefer reproducible builds. See
GitHub Action for every input.
If you prefer a checked-in policy file, install the binary and run
bomdrift init once. It writes .bomdrift.toml plus the SBOM-diff and
comment-suppression workflows, so future policy tweaks happen in TOML
instead of workflow YAML.
Locally with the binary
Three install paths are supported.
Via cargo (v0.9.9+)
cargo install --locked bomdrift
bomdrift --version
Via Docker / OCI (v0.9.9+)
docker run --rm ghcr.io/metbcy/bomdrift:latest --version
# Pin to a specific version for reproducible CI:
docker run --rm ghcr.io/metbcy/bomdrift:v0.9.9 --version
The image is multi-arch (linux/amd64, linux/arm64), distroless
(gcr.io/distroless/cc-debian13:nonroot), and runs as a non-root user.
Verify the inline SLSA attestation with
gh attestation verify --owner Metbcy oci://ghcr.io/metbcy/bomdrift:v0.9.9.
Via release archive (cosign-signed)
Pre-built binaries cover Linux x86_64 + aarch64, macOS aarch64, and Windows x86_64. Each archive is cosign-signed via Sigstore + GitHub OIDC and ships a SLSA build provenance attestation (v0.9.9+).
VERSION=v0.9.9
TARGET=x86_64-unknown-linux-gnu
curl -sSL -o bomdrift.tar.gz \
"https://github.com/Metbcy/bomdrift/releases/download/${VERSION}/bomdrift-${VERSION}-${TARGET}.tar.gz"
tar -xzf bomdrift.tar.gz
./bomdrift-${VERSION}-${TARGET}/bomdrift --version
# Diff two SBOMs
./bomdrift-${VERSION}-${TARGET}/bomdrift diff before.json after.json
# Emit SARIF to a file (no fragile YAML > redirection)
./bomdrift-${VERSION}-${TARGET}/bomdrift diff before.json after.json \
--output sarif --output-file bomdrift.sarif
To verify the archive’s signature before you trust the binary, see Release signing.
From source
cargo install --locked --git https://github.com/Metbcy/bomdrift --tag v0.9.9 bomdrift
Requires Rust 1.85+ (the project uses edition 2024). Prefer
cargo install bomdrift (above) unless you specifically want to
track an unreleased commit.
First diff
The repository ships four runnable example scenarios under examples/.
After cloning + cargo build --release:
./target/release/bomdrift diff \
examples/axios-incident/before.json \
examples/axios-incident/after.json \
--no-osv --no-maintainer-age
The output is GitHub-Flavored Markdown ready for PR-comment posting.
What’s next?
- Wire it up: GitHub Action · GitLab CI · Bitbucket · Azure DevOps.
- Reference: CLI reference (every flag with introduced-in annotations) · Output formats · SARIF + Code Scanning.
- Suppress noise: Baseline & suppression lets a team adopt bomdrift on a project with pre-existing findings without drowning the first PR.
- License gating: License policy — SPDX expression evaluation with allow/deny + per-exception granularity.
- VEX: VEX — record exploitability decisions in OpenVEX 0.2.0 / CycloneDX VEX 1.6, suppress on subsequent diffs.
- Advanced (v0.9.6+): OCI attestation · Plugins for custom rules.
- Internals: Architecture · Contributing.
GitHub Action
The Metbcy/bomdrift action is a composite action (no Docker), which
keeps PR-comment latency low — typically 5–10s on a warm runner versus
30s+ for a Docker container action.
Quick start (zero-config, v0.5+)
On a pull_request workflow, the action defaults to comparing the PR’s
base branch against the PR’s head SHA — no checkout step, no Syft step,
no SBOM-path wiring needed:
on: pull_request
permissions:
  contents: read
  pull-requests: write
jobs:
  diff:
    runs-on: ubuntu-latest
    steps:
      - uses: Metbcy/bomdrift@v1
That’s it. The action checks out both refs into opaque sibling paths, generates CycloneDX-JSON SBOMs via Syft (installed automatically and cached across job runs), and posts the rendered diff as an upserted PR comment.
For a repo-owned policy, run bomdrift init once and commit the generated
.bomdrift.toml plus workflows. The action auto-loads .bomdrift.toml
from the repo root when present, or you can pass
config: .bomdrift.toml explicitly.
If you already produce SBOMs through a non-Syft toolchain — Trivy,
SPDX-tools, an in-house generator — supply the file paths via the
before-sbom / after-sbom inputs instead. The advanced flow below
documents that path; both flows continue to be supported in v1.
Inputs
The action exposes the full bomdrift CLI surface as inputs (v0.9.7+ input parity, current as of v0.9.9). For the canonical flag semantics see CLI reference; the tables below document only the action-side wrapper. Empty defaults mean “don’t pass the flag” — bomdrift then uses its own CLI/config defaults.
What’s new in v0.9.7
These inputs are newly exposed (the underlying CLI flags shipped earlier):
- VEX: vex, emit-vex, vex-author, vex-default-justification
- License policy: allow-licenses, deny-licenses, allow-exception, deny-exception, allow-ambiguous-licenses
- Enrichment toggles: no-epss, no-kev, no-registry
- Failure thresholds: fail-on-epss
- Calibration knobs: recently-published-days, typosquat-similarity-threshold, young-maintainer-days, cache-ttl-hours, multi-major-delta (new CLI flag in v0.9.7)
- Attestation: before-attestation, after-attestation, cosign-identity, cosign-issuer, require-attestation
- Plugins: plugin
Before v0.9.7 these had to be driven through .bomdrift.toml or a direct
cargo install invocation. The config-file path remains supported and is
still preferred for repo-wide policy.
Core: refs, paths, SBOMs
| Input | Type | Default | Description |
|---|---|---|---|
| before-ref | string | ${{ github.event.pull_request.base.ref }} | Git ref / SHA to check out as the “before” side. Default works on pull_request events. |
| after-ref | string | ${{ github.event.pull_request.head.sha }} | Git ref / SHA for the “after” side. |
| path | string | . | Subdirectory of the checked-out ref to scan with Syft (monorepos: path: services/api). |
| before-sbom | string (path) | '' | Pre-generated “before” SBOM. Bypasses the in-action Syft invocation. |
| after-sbom | string (path) | '' | Pre-generated “after” SBOM. |
| format | enum | auto | Force input format: auto/cdx/spdx/syft. Maps to --format. |
Output
| Input | Type | Default | Description |
|---|---|---|---|
| output | enum | markdown | Output format: terminal/markdown/json/sarif. PR comments require markdown. Maps to --output. |
| comment-on-pr | bool | true | Post the rendered diff as a PR comment on pull_request events. |
| comment-size-limit | number | 60000 | Bytes. Above this size, the PR-comment body is re-rendered with --summary-only. 0 disables the fallback. |
| findings-only | bool | false | Markdown-only. Maps to --findings-only. |
| upload-to-code-scanning | bool | false | Upload SARIF to GitHub Code Scanning. Requires output: sarif. |
| github-token | string | ${{ github.token }} | Token used to post PR comments. |
Suppression and policy
| Input | Type | Default | Description |
|---|---|---|---|
| config | string (path) | '' | Path to .bomdrift.toml. Empty auto-loads from the repo root when present. Maps to --config. |
| baseline | string (path) | '' | Pre-captured bomdrift diff --output json snapshot to suppress against. Maps to --baseline. |
| vex | string (multi-line paths) | '' | OpenVEX documents to consume; one path per line, each becomes a repeated --vex. |
| emit-vex | string (path) | '' | Path to write a freshly emitted OpenVEX document. Maps to --emit-vex. |
| vex-author | string | '' | Author identity for the emitted OpenVEX. Maps to --vex-author. |
| vex-default-justification | string | '' | OpenVEX not_affected justification ID applied by default. Maps to --vex-default-justification. |
License policy
| Input | Type | Default | Description |
|---|---|---|---|
| allow-licenses | string (comma list) | '' | SPDX expressions to allow. Maps to --allow-licenses. |
| deny-licenses | string (comma list) | '' | SPDX expressions to deny. Maps to --deny-licenses. |
| allow-exception | string (comma list) | '' | SPDX exception identifiers to allow inside WITH clauses. v0.9.7 refines compound-expression inheritance. Maps to --allow-exception. |
| deny-exception | string (comma list) | '' | SPDX exception identifiers to deny. Maps to --deny-exception. |
| allow-ambiguous-licenses | bool | false | Treat unresolved license expressions as allowed. Maps to --allow-ambiguous-licenses. |
Enrichment toggles
| Input | Type | Default | Description |
|---|---|---|---|
| no-epss | bool | false | Disable EPSS exploit-likelihood enrichment. Maps to --no-epss. |
| no-kev | bool | false | Disable CISA KEV enrichment. Maps to --no-kev. |
| no-registry | bool | false | Disable registry-metadata enrichment (no network calls to package registries). Maps to --no-registry. |
Calibration knobs
| Input | Type | Default | Description |
|---|---|---|---|
| recently-published-days | number | '' | Window, in days, within which a release counts as “recently published”. Maps to --recently-published-days. |
| typosquat-similarity-threshold | number (0.0–1.0) | '' | Jaro-Winkler similarity threshold for typosquat detection. Maps to --typosquat-similarity-threshold. |
| young-maintainer-days | number | '' | Age below which a maintainer is flagged as “young”. Maps to --young-maintainer-days. |
| cache-ttl-hours | number | '' | TTL for the on-disk enrichment cache. Maps to --cache-ttl-hours. |
| multi-major-delta | number (≥1) | '' | Major-version delta at or above which an upgrade is flagged as multi-major (CLI default 2). Maps to --multi-major-delta. New in v0.9.7. |
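To make the multi-major-delta rule concrete, here is a minimal shell sketch of "major-version delta at or above the threshold". The helper names major_of and is_multi_major are hypothetical, and the parsing (strip a leading v, take everything before the first dot) is a simplification that assumes plain numeric majors; bomdrift's own version parsing may handle more cases.

```sh
#!/bin/sh
# Hypothetical helpers, not bomdrift's implementation.
# Extract the major component from versions like "1.4.0" or "v1.4.0".
major_of() {
  version=${1#v}
  printf '%s\n' "${version%%.*}"
}

# $1 = old version, $2 = new version, $3 = delta (defaults to 2).
# Succeeds when the upgrade spans at least that many major versions.
is_multi_major() {
  old_major=$(major_of "$1")
  new_major=$(major_of "$2")
  delta=${3:-2}
  [ $(( new_major - old_major )) -ge "$delta" ]
}

is_multi_major 1.4.0 3.0.0 && echo "1.4.0 -> 3.0.0: multi-major jump"
```

With the default delta of 2, an upgrade from 1.x to 2.x is not flagged; passing 1 as the third argument would flag it.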
Failure thresholds
| Input | Type | Default | Description |
|---|---|---|---|
| fail-on | enum | none | Trip exit 2 on findings of the configured kind: none/cve/critical-cve/typosquat/license-change/any. The PR comment is still posted on a tripped run. |
| fail-on-epss | number (0.0–1.0) | '' | Trip exit 2 when any new advisory has an EPSS score at or above this value. Maps to --fail-on-epss. |
| max-added | number | '' | Exit 2 when more than this many dependencies are added. |
| max-removed | number | '' | Exit 2 when more than this many dependencies are removed. |
| max-version-changed | number | '' | Exit 2 when more than this many dependencies change version. |
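The threshold semantics in this table can be sketched in a few lines of shell. The functions check_budget and epss_trips are hypothetical names for illustration only; they mirror the documented rules (a budget trips when the count is strictly greater than the limit, the EPSS gate trips at or above the threshold, and an empty value means the check is off) without claiming to match bomdrift's internals.

```sh
#!/bin/sh
# Hypothetical helpers, not bomdrift's implementation.
# $1 = observed count, $2 = budget ('' means no budget configured).
# Returns 2 when the count exceeds the budget, 0 otherwise.
check_budget() {
  count=$1
  max=$2
  if [ -z "$max" ]; then
    return 0
  fi
  if [ "$count" -gt "$max" ]; then
    return 2
  fi
  return 0
}

# $1 = EPSS score, $2 = threshold ('' means the gate is disabled).
# Succeeds when the score is at or above the threshold.
epss_trips() {
  score=$1
  threshold=$2
  if [ -z "$threshold" ]; then
    return 1
  fi
  awk -v s="$score" -v t="$threshold" 'BEGIN { exit !(s + 0 >= t + 0) }'
}

rc=0
check_budget 30 25 || rc=$?
echo "30 added against max-added 25: exit code $rc"
```

awk handles the floating-point comparison because POSIX shell arithmetic is integer-only.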
OCI attestation
| Input | Type | Default | Description |
|---|---|---|---|
| before-attestation | string (OCI ref) | '' | OCI reference for the cosign attestation covering the “before” SBOM. Maps to --before-attestation. |
| after-attestation | string (OCI ref) | '' | OCI reference for the “after” SBOM attestation. Maps to --after-attestation. |
| cosign-identity | string (regex) | '' | Regex matched against the cosign certificate identity (--certificate-identity-regexp). Maps to --cosign-identity. |
| cosign-issuer | string (URL) | '' | OIDC issuer URL for keyless cosign verification. Maps to --cosign-issuer. |
| require-attestation | bool | false | Fail when either side is missing a verified attestation. Maps to --require-attestation. |
For air-gapped / self-hosted Sigstore deployments, see Air-gapped / self-hosted Sigstore.
Plugins
| Input | Type | Default | Description |
|---|---|---|---|
| plugin | string (multi-line paths) | '' | Plugin manifests to load; one path per line, each becomes a repeated --plugin. See Plugins. |
Release verification
| Input | Type | Default | Description |
|---|---|---|---|
| verify-signatures | bool | true | Install cosign and verify the bomdrift release archive’s Sigstore signature. Set false on trusted mirrors / cached runners (saves ~15s). When true and cosign is missing, the action fails loudly. |
Outputs
The action does not declare formal outputs. Its side effects are:
- The rendered diff is written to stdout (visible in the workflow run log under the Run bomdrift step).
- When output == markdown and GITHUB_STEP_SUMMARY is set, the rendered diff is appended to the step summary so reviewers can see it even when the job lacks permission to post PR comments.
- On pull_request events with comment-on-pr: true, the rendered diff is upserted into a single PR comment marked <!-- bomdrift:diff -->. Subsequent pushes update the same comment instead of accumulating new ones (peter-evans/create-or-update-comment-style upsert).
- When fail-on or a diff budget trips, the action exits with code 2 — but only after the PR comment has been posted, so reviewers see the findings even when the workflow step fails.
Common patterns
Repo policy file
Use .bomdrift.toml when you want the policy in version control instead
of repeated YAML inputs:
[diff]
fail_on = "critical-cve"
baseline = ".bomdrift/baseline.json"
findings_only = true
max_added = 25
max_version_changed = 10
- uses: Metbcy/bomdrift@v1
  with:
    config: .bomdrift.toml
Explicit action inputs still override the config-backed defaults for one-off workflows.
Bring your own SBOMs (advanced / pre-v0.5 flow)
When the SBOMs come from a non-Syft toolchain (Trivy, SPDX-tools, proprietary scanners) or you already generate them in an earlier job step, supply both paths explicitly. The action skips the in-action Syft invocation entirely:
- uses: actions/checkout@v4
- uses: anchore/sbom-action@v0
  with: { path: ., output-file: after.json }
- uses: actions/checkout@v4
  # Quoted: a bare ${{ ... }} expression is not valid inside a YAML flow mapping.
  with: { ref: "${{ github.event.pull_request.base.ref }}", path: base }
- uses: anchore/sbom-action@v0
  with: { path: base, output-file: before.json }
- uses: Metbcy/bomdrift@v1
  with: { before-sbom: before.json, after-sbom: after.json }
This is the v0.4-era “manual” pattern. It still works in v0.5 — the
before-sbom / after-sbom inputs were required: true in v0.4 and
became required: false in v0.5; nothing else changed about how they
behave. Existing v0.4 workflows continue to function unchanged after a
@v1 tag bump.
Block the merge on critical findings
- uses: Metbcy/bomdrift@v1
  with:
    before-sbom: before.json
    after-sbom: after.json
    fail-on: critical-cve
critical-cve filters on severity >= High per the OSV-fetched severity
(see OSV.dev CVE lookup). typosquat,
license-change, and any are also accepted thresholds — see
--fail-on.
Self-hosted / trusted-mirror runners
- uses: Metbcy/bomdrift@v1
  with:
    before-sbom: before.json
    after-sbom: after.json
    verify-signatures: false # ~15s faster, skips cosign-installer
This is appropriate when:
- You’re running on self-hosted runners with a hardened image you control.
- You’ve pre-pinned the bomdrift archive in your Nexus/Artifactory mirror and verified its signature once at mirror time.
- You’re running in a network-restricted environment where the public Sigstore endpoints aren’t reachable.
When verify-signatures: true and cosign isn’t installed (or the .sig/
.pem aren’t on the release), the action fails loudly rather than
silently degrading — that’s the whole point of the explicit opt-out.
Big monorepo with massive SBOMs
If bomdrift diff rendered output exceeds GitHub’s 65,536-char comment-body
cap, the v0.3 size fallback re-renders with --summary-only for the PR
comment and keeps the full body in the workflow step summary:
- uses: Metbcy/bomdrift@v1
  with:
    before-sbom: before.json
    after-sbom: after.json
    comment-size-limit: 60000 # default; tune for GHE with raised limits
Set comment-size-limit: 0 to disable the fallback entirely and let
GitHub return a 422 on oversized comments (rarely what you want).
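The fallback decision itself is small enough to sketch in shell. The function pick_comment_body and its file-based interface are assumptions for illustration; the action's real plumbing is not shown in this chapter. The rules encoded here are the documented ones: post the summary when the full body is larger than the limit, and treat a limit of 0 as "fallback disabled".

```sh
#!/bin/sh
# Hypothetical helper, not the action's implementation.
# $1 = file holding the full rendered diff
# $2 = file holding the --summary-only rendering
# $3 = size limit in bytes (defaults to 60000; 0 disables the fallback)
# Prints the path of the file that should be posted as the PR comment.
pick_comment_body() {
  body_file=$1
  summary_file=$2
  limit=${3:-60000}
  size=$(( $(wc -c < "$body_file") ))
  if [ "$limit" -ne 0 ] && [ "$size" -gt "$limit" ]; then
    printf '%s\n' "$summary_file"
  else
    printf '%s\n' "$body_file"
  fi
}

full=$(mktemp)
summary=$(mktemp)
printf 'a long rendered diff body\n' > "$full"
printf 'summary\n' > "$summary"
# With an artificially small limit of 10 bytes the summary file is chosen.
pick_comment_body "$full" "$summary" 10
rm -f "$full" "$summary"
```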
Diff-only (no PR comment)
Useful for SARIF uploads, third-party comment posting, or when you just want the diff in the step summary:
- uses: Metbcy/bomdrift@v1
  with:
    before-sbom: before.json
    after-sbom: after.json
    output: sarif
    comment-on-pr: false
- uses: github/codeql-action/upload-sarif@v3
  with: { sarif_file: bomdrift.sarif }
The output: sarif produces SARIF v2.1.0 with stable rule IDs (see
Output formats).
Comment-driven suppression bridges (other forges)
The comment-suppress companion sub-action is GitHub-only — it relies
on the issue_comment workflow event. For GitLab, Bitbucket Cloud,
and Azure DevOps, bomdrift ships parallel Cloudflare Worker bridges
that listen on each forge’s webhook, validate the trigger, and dispatch
the equivalent bomdrift baseline add --from-comment "<body>" run on
the underlying CI:
- examples/gitlab-ci/comment-bridge/ (v0.9+)
- examples/bitbucket-pipelines/comment-bridge/ (v0.9.5+)
- examples/azure-devops/comment-bridge/ (v0.9.5+)
Each bridge enforces the same five guards: webhook secret /
HMAC verification, event-type filter, repo / project allowlist,
commenter-permission check, and a PR-context guard. The
/bomdrift suppress <ID> [reason: …] grammar is identical across all
four SCMs and shares a single regex (scripts/parse-suppress-comment.sh)
so behavior cannot drift. See the per-forge chapters
GitLab CI · Bitbucket ·
Azure DevOps for setup.
Action permissions
pull-requests: write is required when comment-on-pr: true (the
default). Without it, the comment-upsert step fails with a 403; the
action’s exit code remains the bomdrift exit (so a fail-on or budget
trip still fails the workflow correctly).
contents: read is required so the action’s internal actions/checkout
steps (zero-config flow) can fetch both refs. In the bring-your-own-SBOMs
flow it’s still required by whichever step generates the SBOMs upstream.
What the action does (v0.5+)
When the zero-config flow runs (no explicit before-sbom / after-sbom):
- Two sibling checkouts of before-ref and after-ref into ${{ github.workspace }}/__bomdrift_before and __bomdrift_after. Both with fetch-depth: 1 and persist-credentials: false. Skipped for whichever side has a pre-supplied SBOM path.
- Syft installed via anchore/sbom-action/download-syft@v0. Cached across job runs in the runner’s tool cache.
- syft scan dir:... against each checkout’s ${path} subtree, producing CycloneDX-JSON into a tempfile under $RUNNER_TEMP. The bomdrift parser drops Ecosystem::Other("file") pseudo-components that Syft’s directory cataloger emits — set --include-file-components (CLI) or pass a pre-generated SBOM via before-sbom / after-sbom to bypass.
- bomdrift diff runs as in the v0.4 flow, and the upsert + step summary plumbing is unchanged.
The new behavior costs about 30 MB of one-time tool cache and 3–5s of cold-cache wall time per first invocation. Subsequent runs in the same job (or in repos that share the runner’s tool cache) reuse Syft.
Monorepo setup
When a single repo owns N services with independent dependency trees
(services/api, services/worker, apps/web, …), running one
bomdrift job per service gives each PR a focused, per-service comment
without merging unrelated diff churn into a single 65k-char wall.
Pattern A — path: per matrix entry
The simplest setup uses a job matrix and the action’s path input:
on: pull_request
permissions:
  contents: read
  pull-requests: write
jobs:
  diff:
    strategy:
      fail-fast: false
      matrix:
        service: [api, worker, web]
    runs-on: ubuntu-latest
    steps:
      - uses: Metbcy/bomdrift@v1
        with:
          path: services/${{ matrix.service }}
          fail-on: critical-cve
Each matrix leg posts (or upserts) its own PR comment, distinguished
by the rendered title (e.g. “SBOM diff — services/api”). The
<!-- bomdrift:diff --> upsert marker is namespaced internally by
path:, so leg N’s comment doesn’t clobber leg N-1’s.
fail-fast: false is recommended: a vulnerability in worker shouldn’t
hide an emergent api finding from the same PR.
Pattern B — share a baseline across services
Most monorepos do want one shared exception list (the same false positive will show up in any service that depends on the same package). Point each leg at the same file:
- uses: Metbcy/bomdrift@v1
  with:
    path: services/${{ matrix.service }}
    baseline: .bomdrift/baseline.json
The baseline file is keyed by (purl_with_version, advisory_id) — see
Match keys — so a suppression for
pkg:npm/colour-print@2.1.0 covers every service that pulls in that
exact version. New versions still surface (intentional; that’s the
point of the version-pinned key).
When services pin different versions of the same dep, you’ll get per-version baseline entries. That’s working-as-intended — a known-fine finding at v1.0.0 should still get a fresh review at v1.1.0.
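The version-pinned match key can be illustrated with a deliberately simplified shell sketch. It assumes a flat text file with one "purl advisory-id" pair per line, which is not bomdrift's real baseline format (that is a JSON snapshot); the function name is_suppressed and the placeholder advisory ID are likewise hypothetical.

```sh
#!/bin/sh
# Hypothetical helper over a simplified baseline: one
# "<purl-with-version> <advisory-id>" pair per line.
# Succeeds only on an exact match of both parts of the key.
is_suppressed() {
  baseline_file=$1
  purl=$2
  advisory=$3
  grep -Fxq -- "$purl $advisory" "$baseline_file"
}

baseline=$(mktemp)
printf '%s\n' 'pkg:npm/colour-print@2.1.0 GHSA-xxxx-yyyy-zzzz' > "$baseline"
for version in 2.1.0 2.2.0; do
  if is_suppressed "$baseline" "pkg:npm/colour-print@$version" GHSA-xxxx-yyyy-zzzz; then
    echo "$version: suppressed"
  else
    echo "$version: surfaces again"
  fi
done
rm -f "$baseline"
```

Because the version is part of the key, the entry for 2.1.0 does not silence the same advisory on 2.2.0.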
Pattern C — per-service .bomdrift.toml
When the policy itself differs (worker has a stricter fail-on,
docs-site has a generous max-added), drop a .bomdrift.toml per
service:
- uses: Metbcy/bomdrift@v1
  with:
    path: services/${{ matrix.service }}
    config: services/${{ matrix.service }}/.bomdrift.toml
The auto-discovery only checks the repo root, so an explicit
config: is required for nested files.
What to scope per service vs. globally
| Setting | Scope | Why |
|---|---|---|
| fail-on, max-* budgets | Per-service | Worker’s risk surface ≠ web’s |
| baseline | Shared | Same false positives across services |
| comment-on-pr, output | Per-service | Diff-only legs vs. PR-comment legs |
| verify-signatures | Global | Runner-image property, not service property |
Action-broke troubleshooting checklist
When a previously-working bomdrift action job starts failing — typically right after a merge to your default branch, a token rotation, or a runner-image upgrade — work through these in order. Each row is one symptom, one fix so you can grep your job log for the symptom and land on the recipe.
| Symptom (in the job log) | Likely cause | Fix |
|---|---|---|
| 403 Resource not accessible by integration on the comment-upsert step | pull-requests: write permission missing on the workflow / job | Add permissions: { pull-requests: write, contents: read } at the workflow or job level. PR comments need pull-requests: write; the action’s internal checkouts need contents: read. |
| Forks cannot post PR comments warning, exit 0 | PR is from a fork; default GITHUB_TOKEN on pull_request events is read-only | Switch the trigger to pull_request_target (and harden — see GitHub’s guidance), or accept that fork PRs only get the workflow step summary, not a PR comment. |
| Could not find SBOM at services/api after a green earlier run | Default branch protection bumped the merge-base; before-ref now points at a commit that predates the services/api directory | Either move the path: value to match the new layout, or pin before-ref explicitly to a known-good ref (for example before-ref: main). |
| cosign: signature verification failed after a release-archive rotation | Cached release archive in the runner’s tool cache is stale and predates a rotation | Bump to the latest patch tag (e.g. Metbcy/bomdrift@v1 re-resolves to the floating tag), or set verify-signatures: false on a self-hosted runner you’ve pinned manually. |
| path: services/api warning + empty SBOM | The path doesn’t exist post-checkout — typo, or the directory was renamed in before-ref only | bomdrift v0.7+ surfaces an actionable error pointing at this exact case. See the monorepo section for the matrix recipe; double-check ${{ matrix.service }} substitution. |
| “Comment exceeds 65,536 characters” 422 from GitHub | A massive diff blew past the size cap; the v0.3 fallback to --summary-only was disabled (comment-size-limit: 0) | Re-enable the fallback (drop comment-size-limit to use the default, or set it to 60000). The full body is preserved in the workflow step summary. |
| Action runs, no PR comment appears, exit 0 | Workflow event isn’t pull_request (the comment path is gated on PR events), or comment-on-pr: false was set explicitly | For push/schedule events, the comment path is intentionally skipped — use the step summary or upload the markdown as an artifact. |
If you hit a failure mode not in the table above, please open an issue with the failing job log — the troubleshooting table grows from real reports.
GitLab CI
bomdrift v0.7+ ships first-class GitLab support via a documented
.gitlab-ci.yml template plus a --platform gitlab CLI flag that
swaps the rendered footer to the GitLab MR-note shape. The template
lives in examples/gitlab-ci/;
this chapter walks through the moving parts.
Why a template instead of a custom action
GitLab CI doesn’t have a “marketplace action” model; the unit of
reusability is a YAML snippet. A composite GitHub-Action-style binary
would still need a YAML wrapper, so v0.7 ships the YAML directly. You
can include: it from a shared CI repo if you run bomdrift across
many projects:
include:
  - project: 'platform/ci-templates'
    file: '/bomdrift/diff.gitlab-ci.yml'
    ref: main
Quickstart (zero-config, v0.7+)
On an MR pipeline, the template defaults to comparing the merge-base SHA against the MR head SHA — no manual SBOM wiring needed:
- Copy examples/gitlab-ci/.gitlab-ci.yml to your project root.
- Add BOMDRIFT_API_TOKEN as a masked CI/CD variable. The token must be a Project Access Token with the api scope; CI_JOB_TOKEN doesn’t work (it’s read-only on most instances).
- Push an MR. The bomdrift:diff job runs Syft on both refs, renders the markdown diff, and posts/upserts an MR note marked <!-- bomdrift:diff -->.
That’s it. No .bomdrift.toml required for the default flow; add one
only when you want a repo-pinned policy.
What the job does
Step-by-step (matches the bash <<'BOMDRIFT' block in the template):
- Detects arch (x86_64 / aarch64) and downloads the matching bomdrift-${VERSION}-...musl.tar.gz from GitHub Releases.
- Optionally cosign-verifies the archive when cosign is on PATH and BOMDRIFT_VERIFY_SIGNATURES=true (default). Falls back to a warning when cosign isn’t installed; set BOMDRIFT_VERIFY_SIGNATURES=false to silence the warning on a runner image you’ve pinned manually.
- Installs Syft via the upstream install.sh.
- Creates two git worktrees — one at the merge-base SHA (CI_MERGE_REQUEST_DIFF_BASE_SHA), one at the MR head (CI_COMMIT_SHA). Worktrees share the active checkout’s .git, so this is cheap.
- Generates CycloneDX-JSON SBOMs for both worktrees with syft scan dir:....
- Runs bomdrift diff with --platform gitlab, which renders the GitLab-shaped footer (/-/issues/new?... plus bomdrift baseline add hint instead of the GitHub /bomdrift suppress comment-driven flow).
- Posts/upserts the MR note via the GitLab REST API — finds the existing note by the <!-- bomdrift:diff --> marker and updates it in place (GitLab’s notes API uses PUT for updates), otherwise POSTs a new one.
The full markdown body is also kept as a job artifact (diff.md)
with a 7-day retention so reviewers can recover it after the MR
merges.
Tokens & permissions
| Token | Scope | Used for |
|---|---|---|
| BOMDRIFT_API_TOKEN | api | Posting / updating MR notes |
| BOMDRIFT_PUSH_TOKEN (optional) | api + write_repository | Suppression job’s commit-back-to-MR-branch step |
Splitting the two tokens means the diff path keeps working even if the suppression token is rotated, and you can give the diff token a narrower blast radius. Mark both as Masked and as Protected when your default branch is the only place suppression commits should land.
CI_JOB_TOKEN is intentionally not used for the comment path: on
most GitLab instances its scope is read-only, and even where it can
post comments the surface area is wider than what bomdrift needs.
CLI auto-detection
bomdrift diff auto-detects GitLab CI from the environment:
- GITLAB_CI=true → flips --platform to gitlab (unless overridden).
- CI_PROJECT_URL → used as repo_url (footer link target) when --repo-url and BOMDRIFT_REPO_URL are both unset.
Explicit flags always win; the env detection only fills in unset
values. To force GitHub-shape output from a GitLab runner (rare —
mostly useful when cross-posting to a mirror), pass
--platform github explicitly.
Suppressions
For v0.7, GitLab suppressions are manual or job-driven, not comment-driven. Two paths:
Path 1 — CLI
The same bomdrift baseline add <ID> command works in any GitLab
job or local shell:
bomdrift baseline add GHSA-xxxx-yyyy-zzzz
bomdrift baseline add CVE-2026-12345 --path custom/baseline.json
Commit .bomdrift/baseline.json to your MR branch and the next
bomdrift:diff run sees the finding as suppressed. See
Baseline & suppression for match-key semantics and
the worked false-positive example.
Path 2 — manual GitLab job
Copy examples/gitlab-ci/suppress.gitlab-ci.yml
to your project (or merge its job into your main .gitlab-ci.yml).
The job is when: manual — invisible until a reviewer triggers it
from the MR’s pipeline view with a BOMDRIFT_SUPPRESS_ID variable.
On trigger it runs bomdrift baseline add and pushes the result back
to the MR branch using BOMDRIFT_PUSH_TOKEN.
Comment-driven suppression on GitLab (v0.9+)
In-comment /bomdrift suppress <ID> is supported on GitLab as of v0.9
via the Cloudflare Worker bridge.
GitLab’s note webhook fires on every comment on every MR with no
command-prefix filter, so the bridge enforces five guards (webhook
secret, event-type filter, repo allowlist, commenter-permission check,
PR-context guard) before invoking bomdrift baseline add --from-comment "<body>" against the underlying CI. The grammar is identical to the
GitHub comment-suppress sub-action; both share the
scripts/parse-suppress-comment.sh regex so behavior cannot drift.
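As a rough illustration of what extracting the directive from a comment body involves, here is a minimal Python sketch. It assumes the /bomdrift suppress <ID> [reason: ...] form that the CLI reference gives for --from-comment; the regular expression and the parse_suppress helper are hypothetical and are not the regex shipped in scripts/parse-suppress-comment.sh.

```python
import re

# Hypothetical pattern for "/bomdrift suppress <ID> [reason: ...]" on its own line.
# The shipped regex lives in scripts/parse-suppress-comment.sh and may differ.
DIRECTIVE = re.compile(
    r"^/bomdrift suppress[ \t]+(?P<id>\S+)"
    r"(?:[ \t]+reason:[ \t]*(?P<reason>.+?))?[ \t]*$",
    re.MULTILINE,
)


def parse_suppress(body):
    """Return (advisory_id, reason_or_None), or None when no directive is present."""
    match = DIRECTIVE.search(body)
    if match is None:
        return None
    return match.group("id"), match.group("reason")
```

A caller that mirrors the documented behaviour would exit non-zero when parse_suppress returns None, so that a webhook never silently no-ops.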
Self-Managed GitLab
The template uses CI_API_V4_URL (auto-populated on every job)
instead of hardcoding gitlab.com/api/v4, so it works against
Self-Managed instances unchanged. Two things to watch:
- Outbound reachability. The job downloads the bomdrift archive from GitHub Releases and Syft from the upstream install script. If your runners can’t reach those, mirror them to your internal Nexus / Artifactory and override the BOMDRIFT_RELEASE_BASE_URL variable shown in the example README.
- Cosign + Sigstore. Keyless verification needs OIDC connectivity to oauth2.sigstore.dev. On air-gapped runners, set BOMDRIFT_VERIFY_SIGNATURES=false; bomdrift fails loudly rather than silently skipping when the env var is absent and cosign isn’t reachable, so the explicit opt-out is the right escape hatch.
Troubleshooting
See the examples README troubleshooting table for the most common failure modes (token scoping, signature verification on locked-down runners, push-back-to-protected-branch permissions).
What’s the same vs. the GitHub Action
| Feature | GitHub Action | GitLab template |
|---|---|---|
| Zero-config flow | ✅ | ✅ |
| Syft auto-install | ✅ | ✅ |
| MR/PR comment upsert | ✅ | ✅ |
| --summary-only size fallback | ✅ (65k cap) | n/a (1MB cap is rarely hit) |
| Cosign verification of release archive | ✅ | ✅ |
| Per-service monorepo support | ✅ matrix | ✅ matrix (parallel keyword) |
| In-comment suppression | ✅ | ✅ v0.9+ (via the Cloudflare Worker bridge) |
| Manual suppression job | n/a | ✅ |
| <!-- bomdrift:diff --> marker | ✅ | ✅ (same shape, so cross-platform tooling can grep one shape) |
Comment-driven suppression (advanced)
Trade-off up front. Comment-driven suppression turns a reviewer comment like /bomdrift suppress GHSA-... into an automatic baseline edit. To wire it up safely you need to operate a small public webhook handler. The manual suppression job documented above is supported and lower-risk; reach for the bridge only when the zero-click UX is worth running a service.
The GitHub flow ships out-of-the-box (comment-suppress sub-action
fronted by the existing webhook). GitLab requires a webhook handler
because GitLab’s Note Hook doesn’t include a command-prefix filter.
Bridge
examples/gitlab-ci/comment-bridge/ ships a Cloudflare Worker
reference implementation that enforces five security guards:
- Webhook secret verification (constant-time X-Gitlab-Token).
- Event-type filter (Note Hook only).
- Project-ID allowlist.
- Commenter access_level >= 30 (Developer+ on the project).
- MR-context guard (rejects fork-MR comment exfiltration).
When the guards pass, the worker triggers the GitLab pipeline with
BOMDRIFT_NOTE_BODY set to the raw comment body. The
bomdrift:suppress job in suppress.gitlab-ci.yml then runs
bomdrift baseline add --from-comment "$BOMDRIFT_NOTE_BODY" to
extract the directive and update .bomdrift/baseline.json.
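The five guards can be sketched as a single check function. This is a minimal Python illustration, not the Worker's code: the passes_guards name, the reason strings, and the idea of passing in an already-looked-up access_level are assumptions, and the payload field names (project.id, merge_request.source_project_id, merge_request.target_project_id) follow GitLab's documented Note Hook payload as I understand it.

```python
import hmac

DEVELOPER = 30  # GitLab access_level for the Developer role


def passes_guards(headers, payload, access_level, secret, allowed_project_ids):
    """Apply the five guards in order; return (ok, reason)."""
    # 1. Webhook secret, compared in constant time.
    token = headers.get("X-Gitlab-Token", "")
    if not hmac.compare_digest(token.encode(), secret.encode()):
        return False, "bad secret"
    # 2. Event-type filter: only Note Hook deliveries are considered.
    if headers.get("X-Gitlab-Event") != "Note Hook":
        return False, "wrong event"
    # 3. Project-ID allowlist.
    if payload.get("project", {}).get("id") not in allowed_project_ids:
        return False, "project not allowed"
    # 4. Commenter must be Developer or above (level looked up by the caller).
    if access_level < DEVELOPER:
        return False, "insufficient access"
    # 5. MR-context guard: the note must sit on a same-project (non-fork) MR.
    mr = payload.get("merge_request")
    if not mr or mr.get("source_project_id") != mr.get("target_project_id"):
        return False, "not a same-project MR"
    return True, "ok"
```

Ordering the cheap, secret-based check first means unauthenticated traffic is rejected before any payload field is trusted.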
The threat model is documented in
examples/gitlab-ci/comment-bridge/README.md.
The same logic ports to Vercel / Netlify / AWS Lambda — see
vercel-equivalent.md.
How notes are upserted
bomdrift posts the diff as a single MR note, not as a Discussion. The lifecycle is:
- First run: POST /projects/:id/merge_requests/:iid/notes creates the note. The response carries an integer id which bomdrift records implicitly by re-finding the note via the <!-- bomdrift:diff --> marker on subsequent runs.
- Subsequent runs: PUT /projects/:id/merge_requests/:iid/notes/:note_id modifies the existing note’s body in place.
Concretely the upsert:
- Modifies the note body in place. The note ID is stable across pipeline runs, so any permalink to the note (right-click → Copy link on the timestamp) keeps working for the lifetime of the MR.
- Does not regenerate the note. GitLab does not delete-and-recreate on PUT; the comment’s position in the MR timeline does not move.
- Does not re-fire Note Hook webhooks for unchanged content. GitLab fires Note Hook on note creation but not on body-only edits, so a comment-bridge wired to Note Hook will not loop on bomdrift’s own upserts. (The bridge’s event-type filter is a defence-in-depth here, not the primary guard.)
- Does not affect threaded replies. GitLab’s data model puts notes and replies under a parent discussion; replies attached to bomdrift’s note (e.g. a reviewer typing “ack, accepting this”) remain attached to the same discussion thread regardless of how many times bomdrift edits the parent body. This matches the GitHub-side behaviour where reviewer threaded replies under the bot comment survive each upsert.
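The find-by-marker lifecycle above can be sketched as a small planning function. This is an illustrative Python sketch, not the template's script: plan_upsert is a hypothetical helper, and existing_notes stands for the already-fetched list of notes on the MR, each a dict with id and body keys.

```python
MARKER = "<!-- bomdrift:diff -->"


def plan_upsert(existing_notes, project_id, mr_iid):
    """Choose the REST call for this run: PUT if a marked note exists, else POST."""
    base = f"/projects/{project_id}/merge_requests/{mr_iid}/notes"
    for note in existing_notes:
        # The marker, not a stored ID, is what ties runs to the same note.
        if MARKER in note.get("body", ""):
            return "PUT", f"{base}/{note['id']}"
    return "POST", base
```

Because the note is rediscovered on every run, nothing needs to be persisted between pipelines.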
bomdrift deliberately uses the Notes API, not the Discussions API, for the diff template. The Discussions API creates a thread root that is awkward to update (you’d be editing the first note of a discussion, with subtly different permission semantics), and the diff comment isn’t trying to start a structured conversation — it’s a single living status comment that reviewers may reply to. Other reviewers can still reply to the bot’s note and GitLab will create a discussion implicitly around their reply; bomdrift just doesn’t seed the discussion itself.
Author and signing
The note’s author is whatever identity owns BOMDRIFT_API_TOKEN
(typically a Project Access Token, which surfaces as a bot user on
the project). On every PUT, GitLab updates the note’s updated_at
and last_edited_by_id fields to point at that same bot identity —
not the original MR author. This is expected and matches the
GitHub equivalent’s behaviour with a bot token: edits show up under
the bot’s identity, while the original commit/MR authorship is
untouched. If your review process audits comment-edit history
(unusual but legitimate on regulated projects), give the token a
descriptive name (e.g. bomdrift-ci-bot) so the audit trail reads
clearly.
Recommended hosting
Cloudflare Workers — the reference. The free tier covers most
webhook traffic. wrangler tail makes live debugging easy.
Vercel / Netlify Edge Functions are equally good if your team
already operates on those platforms.
Bitbucket Pipelines
bomdrift runs in Bitbucket Cloud Pipelines and posts a single upserted PR comment per pull request, mirroring the GitHub Action and GitLab template flow.
Quickstart
Copy examples/bitbucket-pipelines/bitbucket-pipelines.yml
to your repo root and add a Repository Variable named
BOMDRIFT_API_TOKEN containing a Bitbucket App Password with the
pullrequest:write scope.
What the job does
- Installs Syft and bomdrift in a rust:1.88 container.
- Generates a CycloneDX SBOM for the PR target branch and the PR head via syft dir:.
- Renders the diff to markdown with bomdrift diff --platform bitbucket.
- Looks up the existing bomdrift comment on the PR (by the <!-- bomdrift:diff --> marker) and either creates a new comment or updates the existing one.
Tokens & permissions
| Variable | Scope | Why |
|---|---|---|
| BOMDRIFT_API_TOKEN | App Password, pullrequest:write | Posting / updating PR comments. |
The job never auto-pushes to your branch. Suppression is the manual
bomdrift baseline add flow plus a commit on your branch.
CLI auto-detection
Setting BITBUCKET_BUILD_NUMBER in the environment auto-selects
--platform bitbucket when the flag is omitted. The Pipelines
runner sets this variable on every build.
BITBUCKET_GIT_HTTP_ORIGIN is honored as a --repo-url fallback,
so the markdown footer’s “Report this finding” link works without
plumbing.
Suppressions
The supported, no-infrastructure-required flow is the manual baseline edit:
bomdrift baseline add GHSA-... --reason "audit complete (PR #42)"
git add .bomdrift/baseline.json
git commit -m "baseline: suppress GHSA-..."
Comment-driven suppression (advanced, v0.9.5+)
Trade-off up front. Comment-driven suppression turns a reviewer comment like /bomdrift suppress GHSA-... into an automatic baseline edit. To wire it up safely you need to operate a small public webhook handler. The manual flow above is supported and lower-risk; reach for the bridge only when the zero-click UX is worth running a service.
examples/bitbucket-pipelines/comment-bridge/ ships a Cloudflare
Worker reference implementation that enforces five security guards:
- Webhook HMAC verification (X-Hub-Signature: sha256=… against the byte-exact request body).
- Event-type filter (pullrequest:comment_created only).
- Repo-full-name allowlist.
- Commenter-permission lookup (write / admin / owner only).
- PR-context guard (rejects fork-PR comment-suppress).
When the guards pass, the worker triggers the
bomdrift-comment-suppress custom pipeline (defined in the example
bitbucket-pipelines.yml) with BOMDRIFT_NOTE_BODY set to the raw
comment body. The pipeline runs
bomdrift baseline add --from-comment "$BOMDRIFT_NOTE_BODY" and
pushes the resulting baseline edit back to the PR’s source branch.
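The first guard, HMAC verification over the byte-exact body, can be sketched with Python's standard library. The verify_signature helper is hypothetical and the reference Worker is not written in Python; the point is only that the digest is computed over the raw bytes received, before any JSON parsing, and compared in constant time.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, raw_body: bytes, header_value: str) -> bool:
    """Check an 'X-Hub-Signature: sha256=<hex>' header against the raw request body."""
    prefix = "sha256="
    if not header_value.startswith(prefix):
        return False
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # Compare as bytes so a non-ASCII header cannot raise inside compare_digest.
    return hmac.compare_digest(expected.encode(), header_value[len(prefix):].encode())
```

Re-serializing the parsed JSON before hashing would change whitespace or key order and break the match, which is why the raw body is used.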
The full threat model and deployment guide live in
examples/bitbucket-pipelines/comment-bridge/README.md.
The same logic ports to Vercel / Netlify / AWS Lambda — see
vercel-equivalent.md.
Troubleshooting
See examples/bitbucket-pipelines/README.md.
Azure DevOps Pipelines
bomdrift runs in Azure Pipelines and posts a single upserted PR thread per pull request.
Quickstart
Copy examples/azure-devops/azure-pipelines.yml
to your repo root and add a secret pipeline variable named
BOMDRIFT_API_TOKEN containing a PAT with the Code (Read & Write)
scope.
What the job does
- Installs Rust + bomdrift + Syft on the ubuntu-latest agent.
- Generates a CycloneDX SBOM for the PR target branch and the PR head.
- Renders the diff to markdown with bomdrift diff --platform azure-devops.
- Looks up the existing bomdrift PR thread (by the <!-- bomdrift:diff --> marker) and either creates a new thread or updates the existing comment.
Tokens & permissions
| Variable | Scope | Why |
|---|---|---|
| BOMDRIFT_API_TOKEN | PAT, Code (Read & Write) | Creating / updating PR threads. |
The default System.AccessToken is not used because most
organizations don’t grant it permission to create PR threads.
CLI auto-detection
Setting TF_BUILD=true (Azure Pipelines sets this on every job)
auto-selects --platform azure-devops when the flag is omitted.
BUILD_REPOSITORY_URI is honored as a --repo-url fallback. Note
that this variable is empty for some local debug runs; passing
--repo-url explicitly is fine.
Suppressions
The supported, no-infrastructure-required flow is the manual baseline
edit: run bomdrift baseline add locally and commit the result to
your PR branch.
Comment-driven suppression (advanced, v0.9.5+)
Trade-off up front. Comment-driven suppression turns a reviewer comment like /bomdrift suppress GHSA-... into an automatic baseline edit. To wire it up safely you need to operate a small public webhook handler. The manual flow above is supported and lower-risk; reach for the bridge only when the zero-click UX is worth running a service.
examples/azure-devops/comment-bridge/ ships a Cloudflare Worker
reference implementation that enforces five security guards:
- Webhook secret verification (X-Bomdrift-Bridge-Secret custom header, constant-time compare).
- Event-type filter (ms.vss-code.git-pullrequest-comment-event only).
- Project-UUID allowlist.
- Commenter-permission lookup (Contributors team membership).
- PR-context guard (active PR targeting the protected main branch).
When the guards pass, the worker POSTs to
/_apis/pipelines/{id}/runs with BOMDRIFT_NOTE_BODY as a template
parameter. The example azure-pipelines.yml defines a conditional
bomdrift_suppress stage gated on that parameter; it runs
bomdrift baseline add --from-comment "$BOMDRIFT_NOTE_BODY" and
pushes the resulting baseline edit back to the PR’s source branch.
Normal PR-build runs leave the parameter empty so the suppress stage
is skipped.
The full threat model and deployment guide live in
examples/azure-devops/comment-bridge/README.md.
The same logic ports to Vercel / Netlify / AWS Lambda — see
vercel-equivalent.md.
Troubleshooting
See examples/azure-devops/README.md.
CLI reference
This page documents every bomdrift subcommand and flag. The authoritative
help text is bomdrift --help / bomdrift <subcommand> --help; this page
groups the same information by behavior so it’s easier to look up. Each
flag carries an introduced-in annotation so future readers can reason
about which version a feature first shipped in.
Subcommands
bomdrift diff <BEFORE> <AFTER> [OPTIONS]
bomdrift init [--config-only] [--force]
bomdrift baseline add [<ID>] [--path <PATH>] [--expires <YYYY-MM-DD>] [--reason <TEXT>] [--from-comment <BODY>]
bomdrift refresh-typosquat [--ecosystem <ECOSYSTEM>]
| Subcommand | Purpose |
|---|---|
| diff | Diff two SBOMs and emit findings. The everything-flag. |
| init | Scaffold .bomdrift.toml + GitHub workflows. |
| baseline add | Append an advisory ID to a baseline file. |
| refresh-typosquat | Re-fetch the bundled top-package lists. |
bomdrift diff
Diff two SBOMs and surface supply-chain risk signals on changed components.
Positional arguments
- <BEFORE>: path to the “before” SBOM (CycloneDX 1.5/1.6, SPDX 2.3, or Syft JSON). Optional when --before-attestation is set instead.
- <AFTER>: path to the “after” SBOM. Optional when --after-attestation is set instead.
Output formats
--output <FORMAT>
Introduced in v0.1.
Output format. One of:
- terminal: ANSI-colored tree-style output. Default when stdout is a TTY.
- markdown: GitHub-Flavored Markdown ready for PR-comment posting. Default when stdout is piped/redirected.
- json: pretty-printed {"changes": ..., "enrichment": ...} graph.
- sarif: SARIF v2.1.0 for GitHub Code Scanning ingestion.
Config key: [diff] output.
--output-file <PATH>
Introduced in v0.8.
Write the chosen --output format to this path instead of stdout. Useful
for SARIF (--output sarif --output-file bomdrift.sarif), where quoting a
shell > redirection inside YAML is fragile in CI templates.
--format <FORMAT>
Introduced in v0.1.
Force input format detection. auto (default) / cdx / spdx / syft.
auto looks at the JSON top-level fields to dispatch.
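To make the auto dispatch concrete, here is a minimal Python sketch of top-level-field detection. Which fields bomdrift actually inspects is not stated here, so the checks below are assumptions based on each format's well-known top-level keys (bomFormat for CycloneDX, spdxVersion for SPDX, artifacts plus descriptor for Syft JSON); detect_sbom_format is a hypothetical name.

```python
import json


def detect_sbom_format(text: str) -> str:
    """Guess cdx / spdx / syft from top-level JSON keys (assumed heuristics)."""
    doc = json.loads(text)
    if not isinstance(doc, dict):
        raise ValueError("SBOM must be a JSON object")
    if doc.get("bomFormat") == "CycloneDX":
        return "cdx"
    if "spdxVersion" in doc:
        return "spdx"
    if "artifacts" in doc and "descriptor" in doc:
        return "syft"
    raise ValueError("unrecognized SBOM format; pass --format explicitly")
```

When detection is ambiguous for a given file, passing --format cdx, spdx, or syft sidesteps the heuristic entirely.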
--summary-only
Introduced in v0.3. Markdown-only.
Emits just the summary table + a footer pointing at the full output. Used by the action’s comment-size fallback when the full diff exceeds GitHub’s 65,536-char comment-body cap.
--findings-only
Introduced in v0.6. Markdown-only.
Keeps the summary table and risk-bearing sections (vulnerabilities, typosquats, version jumps, young maintainers, license changes, registry-metadata findings) but omits raw Added / Removed / Version changed detail tables. Useful when a PR intentionally updates a large lockfile and reviewers only want the actionable findings inline.
--include-file-components
Introduced in v0.5.
Keep Ecosystem::Other("file") pseudo-components emitted by Syft’s
directory cataloger. Off by default — those produce phantom Added/Removed
pairs that drown real package changes.
Repo policy config
--config <PATH>
Introduced in v0.6.
Load defaults from a .bomdrift.toml policy file. When omitted, an
existing .bomdrift.toml in the current working directory is loaded
automatically; missing default config is ignored. An explicit --config
path must exist and parse.
CLI flags override config values for one-off runs.
Example .bomdrift.toml:
[diff]
fail_on = "critical-cve"
baseline = ".bomdrift/baseline.json"
findings_only = true
max_added = 25
max_version_changed = 10
# Calibration knobs (v0.9.6+)
typosquat_similarity_threshold = 0.92
young_maintainer_days = 90
cache_ttl_hours = 24
[license]
allow = ["Apache-2.0", "MIT", "BSD-*"]
deny = ["GPL-*", "AGPL-*"]
allow_exceptions = ["LLVM-exception", "Classpath-exception-2.0"]
Suppression
--baseline <PATH>
Introduced in v0.5.
Path to a JSON snapshot whose findings should be suppressed from this
run’s output (and from the --fail-on trip evaluation). Match keys are
conservative — a finding at a different version than baseline still
surfaces. See Baseline & suppression for the schema and
match-key semantics.
--vex <PATH>
Introduced in v0.9. Repeatable.
Path(s) to VEX (Vulnerability Exploitability eXchange) files to consume.
Each file is auto-detected as either OpenVEX 0.2.0 or CycloneDX VEX 1.6.
Statements with status not_affected / fixed suppress matching
findings; under_investigation annotates without suppressing;
affected annotates as a no-op badge. See VEX for the
finding-id matching rules including the synthetic-id convention.
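The four statuses reduce to two behaviours: suppress or annotate. The Python sketch below illustrates that split under a simplifying assumption of exact ID matching; the real matching rules, including synthetic IDs, are covered on the VEX page, and apply_vex is a hypothetical helper.

```python
SUPPRESSING = {"not_affected", "fixed"}
ANNOTATING = {"under_investigation", "affected"}


def apply_vex(finding_ids, vex_status_by_id):
    """Return (visible_ids, annotations) after applying VEX statuses by exact ID."""
    visible, annotations = [], {}
    for finding_id in finding_ids:
        status = vex_status_by_id.get(finding_id)
        if status in SUPPRESSING:
            continue  # statement says the finding does not apply: drop it
        visible.append(finding_id)
        if status in ANNOTATING:
            annotations[finding_id] = status  # keep the finding, add a badge
    return visible, annotations
```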
--emit-vex <PATH>
Introduced in v0.9.
Emit a single OpenVEX 0.2.0 doc covering every finding in the
post-baseline diff. Baseline-suppressed entries inherit their
vex_status from the baseline entry (defaulting to
under_investigation); un-suppressed findings emit as affected.
--vex-author <STRING>
Introduced in v0.9.
VEX author for --emit-vex. Falls back to repo_url, then to
"bomdrift".
--vex-default-justification <STRING>
Introduced in v0.9.
Default OpenVEX justification written into emitted statements when
the source baseline entry doesn’t supply one. Defaults to
"vulnerable_code_not_in_execute_path".
Enrichment toggles
Each of these disables one enricher entirely (no network, no cache lookup). All default to on.
| Flag | Disables | Introduced |
|---|---|---|
| --no-osv | OSV.dev CVE lookup | v0.1 |
| --no-osv-cache | The 24h on-disk OSV severity cache only (OSV itself stays enabled) | v0.3 |
| --no-maintainer-age | GitHub-REST maintainer-age enricher | v0.2 |
| --no-epss | FIRST.org EPSS enricher | v0.8 |
| --no-kev | CISA KEV enricher | v0.8 |
| --no-registry | Registry-metadata enrichers (npm/PyPI/crates.io) | v0.9 |
--recently-published-days <N>
Introduced in v0.9.
Recently-published threshold in days for the registry enricher
(default 14). Set to 0 to disable that specific kind without
disabling the other registry checks.
Calibration
bomdrift’s heuristic enrichers ship with conservative defaults that work
for most repos. When the defaults aren’t right at scale, every threshold
is tunable either through [diff] keys in .bomdrift.toml or via the
matching CLI flag. CLI flags override config values for one-off runs.
--typosquat-similarity-threshold <FLOAT>
Introduced in v0.9.6.
Type: float in [0.0, 1.0]. Default: 0.92.
Config key: [diff] typosquat_similarity_threshold.
Minimum normalized edit-distance similarity between a candidate package
name and a top-list entry before bomdrift flags it as a possible
typosquat. Higher values = stricter (fewer false positives, more false
negatives). Lowering to 0.85 catches softer near-misses; raising to
0.95 only catches one- or two-character swaps on short names.
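To show how a similarity threshold gates a candidate, here is a Python sketch using plain Levenshtein distance normalized by the longer name's length. That normalization is an assumption: bomdrift's exact metric and its boost rules (such as suffix containment) are not reproduced here, so scores from this sketch will not match bomdrift's.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer name."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def is_possible_typosquat(name, top_names, threshold=0.92):
    """Flag a name that is near, but not equal to, any top-list entry."""
    return any(name != top and similarity(name, top) >= threshold for top in top_names)
```

Under this particular normalization, raising the threshold shrinks the set of flagged names, which is the trade-off the flag exposes.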
--young-maintainer-days <N>
Introduced in v0.9.6.
Type: positive integer (days). Default: 90.
Config key: [diff] young_maintainer_days.
A package’s top contributor is flagged as a young-maintainer signal when
their oldest commit is newer than this many days. The default is roughly
one quarter; raise to 180 to flag more maintainers (stricter), or lower
to 30 to flag only very recent ones.
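The gating rule is a single date comparison. A minimal Python sketch, assuming the top contributor's oldest commit date has already been fetched; is_young_maintainer is a hypothetical name:

```python
from datetime import date, timedelta


def is_young_maintainer(oldest_commit: date, today: date, young_maintainer_days: int = 90) -> bool:
    """True when the oldest commit is newer than the configured window."""
    return oldest_commit > today - timedelta(days=young_maintainer_days)
```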
--cache-ttl-hours <N>
Introduced in v0.9.6.
Type: positive integer (hours). Default: 24.
Config key: [diff] cache_ttl_hours.
Time-to-live for the OSV / EPSS / KEV / registry-metadata caches under
<XDG_CACHE_HOME>/bomdrift/. The same TTL applies to all four caches
(v0.9.6 unified the previously duplicated constants). Lower to 1 for
fast-changing security feeds in long-running self-hosted runners; raise
to 168 (one week) when running offline.
--multi-major-delta <N>
Introduced in v0.9.7.
Type: positive integer (>= 1). Default: 2.
Config key: [diff] multi_major_delta.
Major-version delta at or above which the version-jump enricher classifies
an upgrade as a multi-major jump. With the default of 2, an upgrade
from 1.x to 2.x is a single-major bump (treated normally), while
1.x → 3.x (delta = 2) trips the multi-major signal. Lower to 1 to
flag every cross-major bump as multi-major; raise to 3 or higher to
quiet noisy ecosystems that release majors aggressively.
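The delta rule can be worked through in a few lines of Python. This sketch assumes versions start with a numeric major component (an optional leading v is stripped); how bomdrift handles non-semver strings is not covered, and both function names are hypothetical.

```python
def major_of(version: str) -> int:
    """Extract the leading major number, e.g. 'v3.1.0' -> 3."""
    return int(version.lstrip("vV").split(".", 1)[0])


def is_multi_major_jump(before: str, after: str, multi_major_delta: int = 2) -> bool:
    """True when the major version rises by at least the configured delta."""
    return major_of(after) - major_of(before) >= multi_major_delta
```

With the default of 2, 1.4.0 to 2.0.0 is not flagged and 1.4.0 to 3.1.0 is; with a delta of 1, both are.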
This flag closes the last hardcoded calibration threshold: pre-v0.9.7
the multi-major boundary lived as a const in the version-jump
enricher. With the knob exposed, every gating decision in --debug-calibration
output emits the active threshold rather than the const default — so
debug rows for the version-jump kind are now portable across repos
with different calibrations.
License policy
--allow-licenses <LIST> / --deny-licenses <LIST>
Introduced in v0.8. Comma-separated, repeatable.
SPDX license identifiers (or *-suffix globs) permitted / forbidden by
policy. Deny wins when a license matches both. CLI flag takes precedence
over [license] allow / deny in .bomdrift.toml (override, not
merge). v0.9 adds full SPDX expression evaluation via the spdx crate
so compound expressions like (MIT OR GPL-3.0) evaluate correctly.
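For single license identifiers, the deny-wins rule with suffix globs can be sketched with Python's fnmatch. This covers simple identifiers only, not compound SPDX expressions or WITH clauses, and the "unlisted" outcome is a placeholder of this sketch rather than documented bomdrift behaviour.

```python
from fnmatch import fnmatchcase


def license_verdict(license_id, allow, deny):
    """Evaluate one SPDX identifier against allow/deny glob lists; deny wins."""
    if any(fnmatchcase(license_id, pattern) for pattern in deny):
        return "deny"
    if any(fnmatchcase(license_id, pattern) for pattern in allow):
        return "allow"
    return "unlisted"  # placeholder: policy for unmatched licenses is not modeled
```

Checking the deny list first is what makes a license that matches both lists resolve to a violation.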
--allow-exception <LIST> / --deny-exception <LIST>
Introduced in v0.9.5. Comma-separated, repeatable.
SPDX exception identifiers (e.g. LLVM-exception,
Classpath-exception-2.0) permitted / forbidden as the right-hand side
of a WITH clause. When set, Apache-2.0 WITH <other> violates policy
even if Apache-2.0 is on the base allow list. Empty lists preserve
v0.9 behavior (exception treated as informational).
--allow-ambiguous-licenses
Introduced in v0.8.
When set, compound SPDX expressions like (MIT OR GPL-3.0) are
permitted. Off by default — fail-closed.
See License policy for the full evaluation semantics.
Failure thresholds
--fail-on <THRESHOLD>
Introduced in v0.2; expanded across v0.4 / v0.8 / v0.9.
Exit with code 2 when findings of the configured threshold surface. One of:
- none: never trips (default).
- cve: trips on any CVE / GHSA / MAL advisory finding.
- critical-cve: trips when at least one finding has severity >= High per the OSV-fetched severity. (Naming kept for back-compat; it covers the HIGH-or-CRITICAL bucket, and HIGH alone is the common actively-exploited case.)
- typosquat: trips on any typosquat finding.
- license-change: trips on same-version license changes.
- kev: trips on any advisory in the CISA KEV catalog (v0.8+).
- recently-published / deprecated: registry-metadata finding gates (v0.9+).
- any: trips on any finding.
The PR-comment body is written to stdout before exit-2 — the
action’s tee + PIPESTATUS wrapper relies on this so the comment
posts even when the workflow step fails.
--fail-on-epss <FLOAT>
Introduced in v0.8.
Trip exit-2 when any advisory’s EPSS score is >= this threshold
(0.0 – 1.0). Recommended starting point: 0.5 (top decile of
actively-exploited CVEs).
Diff budgets
--max-added <N>, --max-removed <N>, and --max-version-changed <N>
fail the run with exit code 2 when a diff exceeds the configured
dependency-churn budget. Introduced in v0.4. The rendered body is still
written before exit, just like --fail-on.
Forge integration
--platform <PLATFORM>
Introduced in v0.7; expanded in v0.9 (Bitbucket / Azure DevOps).
github (default), gitlab, bitbucket, or azure-devops. Drives
the rendered markdown comment’s footer:
- github: /issues/new?... URL shape, /bomdrift suppress <ID> comment-driven flow (requires the comment-suppress sub-action).
- gitlab: /-/issues/new?issuable_template=false-positive URL shape; manual bomdrift baseline add <ID> flow or the optional Cloudflare Worker bridge for in-comment suppression. See GitLab CI.
- bitbucket: /issues/new URL shape; comment-bridge in v0.9.5+. See Bitbucket.
- azure-devops: /_workitems/create?templateName=false-positive URL shape; comment-bridge in v0.9.5+. See Azure DevOps.
When the flag is omitted, bomdrift auto-detects from CI environment
variables in this order: GITLAB_CI=true → GitLab,
BITBUCKET_BUILD_NUMBER → Bitbucket, TF_BUILD → Azure DevOps,
otherwise GitHub. The explicit flag always wins. Also configurable via
[diff] platform = "<value>" in .bomdrift.toml.
Set in lockstep with --repo-url (or BOMDRIFT_REPO_URL, or — on
GitLab CI — CI_PROJECT_URL). Without a URL the footer is omitted
entirely; the platform flag controls only the footer’s shape.
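The detection order described above can be written out as a short Python function. It models only the explicit flag and the environment variables; the [diff] platform config key is left out, and detect_platform is a hypothetical name.

```python
def detect_platform(env, explicit=None):
    """Resolve the platform: explicit flag first, then CI env vars in order."""
    if explicit:
        return explicit
    if env.get("GITLAB_CI") == "true":
        return "gitlab"
    if env.get("BITBUCKET_BUILD_NUMBER"):
        return "bitbucket"
    if env.get("TF_BUILD"):
        return "azure-devops"
    return "github"
```

Passing os.environ as env reproduces the lookup against the real process environment.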
--repo-url <URL>
Introduced in v0.5.
Repository URL used to render the markdown comment’s
action-affordance footer. Falls back to BOMDRIFT_REPO_URL env var.
Attestation (OCI-fetched SBOMs)
All flags in this section introduced in v0.9.6. See OCI attestation for end-to-end usage.
--before-attestation <REFERENCE>
Fetch the “before” SBOM from an attestation attached to an OCI artifact,
verified with cosign verify-attestation, instead of reading a local
file. Mutually exclusive with the positional before argument.
Requires --cosign-identity and --cosign-issuer.
--after-attestation <REFERENCE>
Same, for the “after” side. Mutually exclusive with the positional
after argument.
--cosign-identity <REGEX>
Regex passed to cosign verify-attestation --certificate-identity-regexp. Required when either
--before-attestation or --after-attestation is set. Example:
https://github.com/owner/.+.
--cosign-issuer <URL>
URL passed to cosign verify-attestation --certificate-oidc-issuer.
Required alongside --cosign-identity. Example:
https://token.actions.githubusercontent.com.
--require-attestation
Refuse to fall back to local-file SBOMs: both sides MUST come from a
verified OCI attestation. Requires both --before-attestation and
--after-attestation to be set.
Plugins
--plugin <MANIFEST>
Introduced in v0.9.6. Repeatable.
Path to a plugin manifest TOML. Each plugin is an external executable invoked once per added / version-changed component with JSON over stdin/stdout. Plugin failures (timeout, non-zero exit, malformed JSON) drop their findings without failing the diff. See Plugins for the protocol reference and a worked example.
Diagnostics
--debug-calibration
Introduced in v0.7.
Off by default. When set, bomdrift diff writes one row to stderr per
finding it considers, with the schema:
kind|key|score|threshold
kind is one of typosquat, version-jump, maintainer-age, cve,
recently-published, deprecated, maintainer-set-changed. key is
a stable identifier (the package purl, advisory ID, etc.). score and
threshold are the numeric inputs to the gating decision.
The flag is purely diagnostic — it doesn’t change which findings get rendered. Pipe to a file:
bomdrift diff old.cdx.json new.cdx.json --debug-calibration 2> calibration.tsv
--debug-calibration-format <FORMAT>
Introduced in v0.8.
pipe (default, back-compat with v0.7) emits kind|key|score|threshold
per line; jsonl emits one JSON object per line for downstream tooling
that doesn’t want to maintain a custom CSV-ish parser.
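For the default pipe format, a downstream parser can be very small. The Python sketch below splits from both ends so that a key containing a | character would still parse; the sample row in the usage note is made up for illustration and is not captured bomdrift output.

```python
def parse_calibration_row(line: str) -> dict:
    """Parse one 'kind|key|score|threshold' row from --debug-calibration stderr."""
    kind, rest = line.rstrip("\n").split("|", 1)
    # Split the numeric tail from the right so the key may itself contain '|'.
    key, score, threshold = rest.rsplit("|", 2)
    return {"kind": kind, "key": key, "score": float(score), "threshold": float(threshold)}
```

For a hypothetical row such as typosquat|pkg:npm/plain-crypto-js@4.2.1|0.95|0.92, this yields the kind, the purl as key, and the two floats. The jsonl format avoids the parsing step altogether.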
bomdrift init
Introduced in v0.6.
Scaffold a copy-paste adoption setup in the current repository:
bomdrift init
Writes:
- .bomdrift.toml
- .github/workflows/sbom-diff.yml
- .github/workflows/bomdrift-suppress.yml
Flags:
- --config-only: write only .bomdrift.toml.
- --force: overwrite existing generated files. Without --force, existing files are preserved and the command fails loudly.
bomdrift baseline add
Introduced in v0.5; --expires/--reason v0.8; --from-comment v0.9.
Append an advisory ID to a baseline file’s suppressed_advisories list.
The file is created if missing; existing fields are preserved. Idempotent
(re-adding an existing ID is a no-op).
bomdrift baseline add GHSA-xxxx-yyyy-zzzz
bomdrift baseline add CVE-2026-12345 --path custom/baseline.json
bomdrift baseline add GHSA-evil-1234 \
--expires 2026-12-31 \
--reason "Awaiting upstream patch (issue #42)"
Flags
- <ID>: advisory identifier (GHSA/CVE/MAL/OSV). Optional when --from-comment is supplied.
- --path <PATH>: baseline file path. Default .bomdrift/baseline.json.
- --expires <YYYY-MM-DD>: strict-format expiry; bomdrift refuses malformed dates (no silent never-expiring entries).
- --reason <TEXT>: free-form rationale; surfaces in VEX exports + expiry warnings.
- --from-comment <BODY>: parse a /bomdrift suppress <ID> [reason: ...] directive from a forge-issued comment body. Used by the GitLab / Bitbucket / Azure DevOps comment-bridge Workers. Exits non-zero on no-match so a webhook never silently no-ops.
bomdrift refresh-typosquat
Introduced in v0.4.
Refresh the bundled typosquat top-package lists from upstream sources.
bomdrift refresh-typosquat # all wired-up ecosystems
bomdrift refresh-typosquat --ecosystem pypi # one specific list
--ecosystem <ECOSYSTEM>
Which ecosystem’s list to refresh. One of all (default), npm,
pypi, cargo, nuget, maven, go, gem, composer. The first four
ecosystems (npm, pypi, cargo, nuget) fetch from canonical upstream
feeds; the remaining four (maven, go, gem, composer) are curated
data/<eco>-top*.txt snapshots, and --ecosystem <name> for those emits
a notice rather than fetching.
Refreshed lists are written to
<XDG_CACHE_HOME>/bomdrift/typosquat/<eco>.txt via temp-file + atomic
rename. The typosquat enricher prefers cache files over the embedded
snapshot when present and parseable.
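The temp-file plus atomic-rename pattern mentioned above looks like this in Python. It is a generic sketch of the technique, not bomdrift's implementation; write_atomically is a hypothetical helper. The temporary file is created in the destination directory so the final rename stays on one filesystem.

```python
import os
import tempfile


def write_atomically(path: str, text: str) -> None:
    """Write text to path so readers see either the old file or the new one."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        os.replace(tmp, path)  # atomic when source and target share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)  # do not leave a half-written temp file behind
        raise
```

A reader that opens the cache file mid-refresh therefore never sees a partially written list.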
Exit codes
| Code | Meaning |
|---|---|
| 0 | Success. |
| 1 | bomdrift internal error (parse failure, network mishap not gated by best-effort path, etc.). |
| 2 | --fail-on threshold or diff budget tripped. The body is still on stdout — the action posts it before propagating the exit code. |
| (clap 2) | Usage error from clap (unknown flag, missing required argument). Distinguishable from --fail-on-driven exit 2 by stderr containing error: ... rather than the rendered body. |
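A wrapper script that needs to tell the two exit-2 cases apart can follow the stderr hint from the table. The Python sketch below is an assumption-laden illustration: it treats any stderr line starting with error: as a clap usage error, and classify_exit and its labels are hypothetical.

```python
def classify_exit(code: int, stderr: str) -> str:
    """Map a bomdrift exit code plus captured stderr to a coarse outcome label."""
    if code == 0:
        return "success"
    if code == 1:
        return "internal-error"
    if code == 2:
        # clap usage errors also exit 2 but print "error: ..." on stderr.
        if any(line.startswith("error:") for line in stderr.splitlines()):
            return "usage-error"
        return "threshold-tripped"
    return "unknown"
```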
Environment variables
| Variable | Purpose | Introduced |
|---|---|---|
| GITHUB_TOKEN | Bumps the GitHub REST rate limit from 60/hr unauth to 5000/hr authenticated, used by the maintainer-age enricher. | v0.2 |
| BOMDRIFT_REPO_URL | Fallback for --repo-url when the flag isn’t passed. | v0.5 |
| GITLAB_CI | When true, auto-selects --platform gitlab (unless overridden). | v0.7 |
| BITBUCKET_BUILD_NUMBER | When set, auto-selects --platform bitbucket. | v0.9 |
| TF_BUILD | When set, auto-selects --platform azure-devops. | v0.9 |
| CI_PROJECT_URL | On GitLab CI, used as a final fallback for --repo-url after BOMDRIFT_REPO_URL. | v0.7 |
| XDG_CACHE_HOME | Cache root for OSV / EPSS / KEV / registry caches and refreshed typosquat lists. Defaults to ~/.cache on Linux. | v0.1 |
| SOURCE_DATE_EPOCH | When set, used as “now” for byte-deterministic output (baseline-expiry comparisons, VEX timestamps, etc.). | v0.9 |
| NO_COLOR | Honored by the terminal renderer. | v0.1 |
| CLICOLOR_FORCE | Forces ANSI even on a non-TTY. | v0.1 |
| BOMDRIFT_DEBUG | When 1, enables verbose stderr notes from best-effort enrichers. | v0.8 |
Output formats
bomdrift writes one rendered representation of a diff per invocation. The shape is deterministic — identical inputs produce byte-identical output — which is what the PR-comment upsert mechanism in the action relies on.
--output selects the format. Default is terminal when stdout is a TTY,
markdown otherwise.
Markdown
The default for piped/redirected output, designed to drop into a GitHub PR comment. Renders the diff as a summary table at the top, followed by per-section tables for each change category and finding type.
## SBOM diff
| Change | Count |
|---|---:|
| Added | 1 |
| Removed | 1 |
| Version changed | 1 |
| License changed | 0 |
| Possible typosquats | 1 |
### Added
| Ecosystem | Name | Version |
|---|---|---|
| npm | plain-crypto-js | 4.2.1 |
### Possible typosquats
| Ecosystem | Name | Version | Similar to | Similarity |
|---|---|---|---|---:|
| npm | plain-crypto-js | 4.2.1 | crypto-js | 0.95 |
When OSV.dev enrichment is enabled, an additional Vulnerabilities section lists each affected component with its advisory IDs sorted highest-severity-first within a component (ties broken by ID, so output stays byte-deterministic).
--summary-only emits only the summary table + a footer line. Used by
the action’s comment-size fallback for big-PR survival.
Terminal
ANSI-colored, tree-style output. Default when stdout is a TTY. Falls back
to markdown when stdout is piped/redirected (so action workflows that
capture stdout always see safe markdown). Honors NO_COLOR (skip ANSI)
and CLICOLOR_FORCE (force ANSI even on a non-TTY).
Findings are rendered with bracketed prefixes:
| Prefix | Meaning |
|---|---|
| [ADD] | Added component |
| [REM] | Removed component |
| [VER] | Version changed |
| [LIC] | License changed (same version) |
| [CVE] | OSV.dev advisory |
| [SQT] | Typosquat |
| [JMP] | Multi-major version jump |
| [YNG] | Young maintainer |
No emojis — bomdrift’s renderers stay strictly bracketed-prefix per project convention, both for terminal accessibility and for grepability of CI logs.
JSON
Pretty-printed {"changes": ChangeSet, "enrichment": Enrichment} graph
for downstream tooling, baselines, debugging.
{
"changes": {
"added": [ ... Component objects ... ],
"removed": [ ... ],
"version_changed": [[ before, after ], ... ],
"license_changed": [[ before, after ], ... ]
},
"enrichment": {
"vulns": { "<purl>": [{ "id": "...", "severity": "..." }, ...] },
"typosquats": [ ... ],
"version_jumps": [ ... ],
"maintainer_age": [ ... ]
}
}
The Enrichment.vulns shape is per-purl, per-advisory severity-tagged
as of v0.3. v0.2 emitted a flat Vec<String> of advisory IDs without
severity — consumers parsing v0.2 output need to migrate. See the
CHANGELOG
for the migration note.
JSON output is the canonical format for --baseline snapshots: capture
once with bomdrift diff --output json > baseline.json, replay with
bomdrift diff --baseline baseline.json on subsequent runs to suppress
already-triaged findings.
SARIF v2.1.0
Suitable for ingestion by GitHub Code Scanning, GitLab Vulnerability Reports, and any other consumer that speaks SARIF.
Stable rule IDs
These IDs surface in Code Scanning’s UI and are the join key for suppressions, so they’re load-bearing public API once any consumer has seen a finding. Renaming any of them is a breaking change.
| Rule ID | Source | Maps to |
|---|---|---|
| bomdrift.cve | enrichment.vulns | one result per (component, advisory_id) |
| bomdrift.typosquat | enrichment.typosquats | one per typosquat finding |
| bomdrift.version-jump | enrichment.version_jumps | one per multi-major bump |
| bomdrift.young-maintainer | enrichment.maintainer_age | one per young-maintainer finding |
| bomdrift.license-change | cs.license_changed | one per license-changed-without-version-bump |
Every rule bomdrift defines is always emitted in tool.driver.rules, even when the
current diff has zero findings of that kind. Code Scanning consumers
enumerate rules independently of results, so omitting unused rules
confuses the suppression UI.
Severity mapping
result.level maps from the OSV-fetched severity:
- Critical / High → level: "error"
- Medium / Low / None → level: "warning"
This is intentionally separate from --fail-on critical-cve’s
threshold (which also fires on High); SARIF’s three-level model
(error/warning/note) doesn’t map 1:1 to OSV’s four severity
labels, so the renderer collapses High+Critical into error and
everything else into warning.
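The collapse described above amounts to a two-arm match. A minimal sketch, assuming a five-variant `Severity` enum (the variant names follow the labels used in this chapter; the function name is illustrative):

```rust
/// Severity labels as used in this chapter. `None` covers advisories
/// that carry no severity information at all.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Critical,
    High,
    Medium,
    Low,
    None,
}

/// Maps a severity label onto the SARIF `result.level` string.
fn sarif_level(severity: Severity) -> &'static str {
    match severity {
        // High and Critical collapse into "error".
        Severity::Critical | Severity::High => "error",
        // Everything else, including unlabeled advisories, is "warning".
        Severity::Medium | Severity::Low | Severity::None => "warning",
    }
}
```

SARIF's third level, note, is deliberately unused in this mapping.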
Locations
SARIF requires locations on every result. Since SBOM-derived findings
have no source line numbers, all results project onto a synthetic
physicalLocation.artifactLocation.uri = "sbom", matching the
convention used by trivy.
Determinism
Enrichment.vulns is a HashMap and its iteration order is
non-deterministic. The SARIF renderer sorts the keys before emission.
Other finding collections are already deterministically ordered Vecs
(their enrichers iterate the BTreeMap-derived ChangeSet order), so
they need no extra sorting. The render-twice-byte-equal regression
test in src/render/sarif.rs::tests::render_is_pure_byte_deterministic
guards against future regressions of this contract.
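The sort-before-emit step could look roughly like this. The value type is simplified to a list of advisory IDs for illustration (the real entries also carry severity), and `sorted_vulns` is a hypothetical name:

```rust
use std::collections::HashMap;

/// Returns the (purl, advisories) entries of a HashMap in sorted-key
/// order, so a renderer iterating the result emits identical bytes
/// regardless of the map's internal iteration order.
fn sorted_vulns(vulns: &HashMap<String, Vec<String>>) -> Vec<(&String, &Vec<String>)> {
    let mut entries: Vec<_> = vulns.iter().collect();
    // Keys are unique, so comparing keys alone gives a total order.
    entries.sort_by(|a, b| a.0.cmp(b.0));
    entries
}
```

Sorting a borrowed view keeps the map itself untouched, which matters if other renderers read the same `Enrichment` value.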
SARIF + GitHub Code Scanning
bomdrift can emit findings in SARIF v2.1.0 for ingestion by GitHub Code Scanning, GitLab Vulnerability Reports, and any other consumer that speaks SARIF.
bomdrift diff before.cdx.json after.cdx.json \
--output sarif \
--output-file bomdrift.sarif
Rule taxonomy
bomdrift emits the following stable rule IDs (load-bearing — never renamed
across releases). All rules are present in tool.driver.rules even when
the current diff has zero results of that kind, so Code Scanning UI
suppression flows can enumerate them upfront.
| Rule ID | Surfaces | SARIF level |
|---|---|---|
| bomdrift.cve | OSV.dev advisory ID(s) for the component | error for High/Critical, else warning |
| bomdrift.typosquat | Component name similar to a popular package | warning |
| bomdrift.version-jump | Multi-major version bump | warning |
| bomdrift.young-maintainer | Top GitHub contributor’s first commit < 90 days ago | warning |
| bomdrift.license-change | License changed at the same version | warning |
| bomdrift.license-violation | Component license violates configured allow/deny policy | warning |
Fingerprint stability
Each result carries partialFingerprints.primaryHash/v1 — a SHA-256 digest
of a stable identity tuple per rule:
| Rule | Identity |
|---|---|
bomdrift.cve | `ruleId |
bomdrift.typosquat | `ruleId |
bomdrift.version-jump | `ruleId |
bomdrift.young-maintainer | `ruleId |
bomdrift.license-change | `ruleId |
bomdrift.license-violation | `ruleId |
The /v1 suffix on the fingerprint key lets bomdrift evolve identity
schemes in future releases without GitHub re-opening every existing alert.
Two distinct CVEs on the same purl produce distinct fingerprints; the
same finding produced across two runs produces a byte-equal fingerprint.
Wire up GitHub Code Scanning
Set the new action input upload-to-code-scanning: 'true' and ensure your
workflow has the security-events: write permission. The composite action
runs github/codeql-action/upload-sarif@v3 after bomdrift writes
${{ github.workspace }}/bomdrift.sarif.
permissions:
  contents: read
  security-events: write   # required for SARIF upload
  pull-requests: write     # only if you also want PR comments
jobs:
  bomdrift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: Metbcy/bomdrift@v1
        with:
          output: sarif
          upload-to-code-scanning: 'true'
Direct CLI use (any CI)
When integrating with GitLab Vulnerability Reports, Bitbucket, or any
arbitrary SARIF consumer, prefer --output-file over shell redirection:
bomdrift diff before.json after.json \
--output sarif \
--output-file bomdrift.sarif
The --output-file form is YAML-quoting-safe (no > redirection) and
keeps stdout free for human-readable progress logging.
Determinism
Renderer output is byte-deterministic across runs for identical inputs.
HashMap-keyed advisory lists are sorted by purl key before emission;
license arrays are sorted before fingerprinting. The
SOURCE_DATE_EPOCH environment variable is honored everywhere bomdrift
emits a timestamp (the SARIF document itself currently carries no
timestamps, but related VEX emission in v0.9 will).
Troubleshooting
- Alerts don’t appear in the Security tab. Confirm permissions.security-events: write on the calling workflow AND upload-to-code-scanning: 'true' on the action input. Check the “Upload SARIF to Code Scanning” step in the job log for the API response.
- Same finding appears twice after a re-run. This is a fingerprint bug: file an issue with the SARIF artifact and the inputs that produced it. Fingerprints should remain byte-equal across runs.
- Severity wrong / missing. bomdrift maps GHSA’s database_specific.severity text label. Advisories without a label surface at SARIF warning and the properties.severity field reads NONE.
Enrichers overview
An enricher runs over the ChangeSet produced by the diff core and
adds risk-signal metadata to the rendered output without modifying the
ChangeSet itself. Each is independent, has its own opt-out flag, and
follows a best-effort contract: any failure (network, rate-limit,
upstream API change) is logged once to stderr and the diff renders
without that enricher’s findings.
Shipping enrichers
| Enricher | Source | Network? | Default | Opt-out flag | Calibration |
|---|---|---|---|---|---|
| OSV.dev CVE lookup | OSV.dev /v1/querybatch + /v1/vulns/{id} | yes | on | --no-osv | --cache-ttl-hours (v0.9.6) |
| EPSS | FIRST.org /api/v1/epss | yes | on | --no-epss | --cache-ttl-hours; --fail-on-epss <0.0–1.0> |
| CISA KEV | CISA known-exploited catalog | yes | on | --no-kev | --cache-ttl-hours; --fail-on kev |
| Typosquat | Embedded top-N lists, optional XDG cache | no | on | (none — pure compute) | --typosquat-similarity-threshold (v0.9.6) |
| Multi-major version jump | The diff itself | no | on | (none; pure compute) | --multi-major-delta (v0.9.7; default 2) |
| Maintainer age | GitHub REST /repos/.../contributors + /commits | yes | on | --no-maintainer-age | --young-maintainer-days (v0.9.6) |
| Registry metadata | npm / PyPI / crates.io public APIs | yes | on (v0.9+) | --no-registry | --recently-published-days; --cache-ttl-hours |
| License policy | SBOM licenses field + SPDX expression eval | no | on | (configured by allow/deny lists) | --allow-licenses, --deny-licenses, --allow-exception, --deny-exception |
| Plugins | External-process plugins (v0.9.6+) | varies | off (opt-in) | (don’t pass --plugin) | (per-plugin manifest) |
Best-effort contract
Every enricher that touches the network honors the same contract:
- Per-request timeout (15s for OSV, 15s for GitHub) so a misbehaving upstream can’t hang a CI job.
- Errors warn, never block. A failed enricher logs one line to stderr (the warning is the same key every time, so it dedupes reasonably) and the diff renders without that enricher’s contributions.
- Rate-limit awareness. OSV’s /v1/querybatch is unauthenticated; the GitHub REST API honors GITHUB_TOKEN for the 5000/hr cap. On a 403 + X-RateLimit-Remaining: 0, the maintainer-age enricher returns whatever was already collected and warns once.
- Per-component caching within a single run. Repeated cs.added entries from the same project (e.g. monorepo subpackages sharing a GitHub repo) don’t multiply HTTP requests.
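The "errors warn, never block" rule from the list above can be pictured as a small wrapper around each enricher's result. This is a sketch under assumptions: `best_effort`, the `warned` set, and the warning text are all hypothetical, not bomdrift's actual internals.

```rust
use std::collections::HashSet;
use std::fmt::Display;

/// Hypothetical wrapper: returns the enricher's findings on success, or
/// warns once per enricher on stderr and falls back to an empty
/// (default) contribution so the diff still renders.
fn best_effort<T: Default, E: Display>(
    enricher: &str,
    result: Result<T, E>,
    warned: &mut HashSet<String>,
) -> T {
    match result {
        Ok(findings) => findings,
        Err(err) => {
            // `insert` returns true only the first time a key is seen,
            // so repeated failures of one enricher log a single line.
            if warned.insert(enricher.to_string()) {
                eprintln!("warning: {enricher} enrichment skipped: {err}");
            }
            T::default()
        }
    }
}
```

Because the fallback is `T::default()`, a failed enricher is indistinguishable from one that found nothing, which is exactly what the contract asks of the renderers.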
Determinism
Each enricher’s output is structured into the Enrichment graph
(vulns: HashMap<...>, typosquats: Vec<...>, version_jumps: Vec<...>,
maintainer_age: Vec<...>). Renderers iterate these in deterministic
order — Vecs in their natural BTreeMap-derived order from the
ChangeSet, the vulns HashMap with its keys sorted before emission.
This is the contract that lets peter-evans/create-or-update-comment
upsert PR comments in place: identical inputs render to byte-identical
output, so the comment body is patched only when the diff genuinely
changes.
Why these signals?
The enricher set was chosen because each maps to a real, recent, high-impact incident class:
- OSV.dev CVE lookup: published advisories, the broadest signal.
- EPSS: probability of exploitation in next 30 days; dampens false-urgency on Critical-CVSS-but-low-exploitation advisories.
- CISA KEV: known-exploited; the highest-confidence “act now” filter.
- Typosquat: malicious packages mimicking popular ones (the plain-crypto-js axios dropper, the PyPI campaigns 2024–2026).
- Multi-major version jump: takeover swaps, namespace reuse.
- Maintainer age: long-game social-engineering campaigns (xz / Jia Tan).
- Registry metadata: recently-published, deprecated, maintainer-set-changed — the npm Shai-Hulud-style worm precursors.
- License policy: not a malicious-code signal but a policy gate that the same diff-time reviewer is best positioned to enforce.
For organizations with environment-specific rules outside this list, the v0.9.6 Plugins protocol lets you layer custom enrichers on top without forking bomdrift.
See also
- CLI reference — Enrichment toggles
- CLI reference — Calibration
- Architecture — Best-effort enricher contract
OSV.dev CVE lookup
bomdrift’s CVE enricher queries the Open Source Vulnerability database for added and version-bumped components, populating the Vulnerabilities section in the rendered output with advisory IDs (CVE, GHSA, MAL, etc.) and per-advisory severity.
Two-stage lookup
Stage 1: /v1/querybatch
A single batched POST returns advisory IDs for every queried component in one round-trip. bomdrift batches up to 1000 queries per request (the documented cap); larger diffs chunk into multiple batches.
Each query is a package@version keyed by purl ecosystem (npm, PyPI,
crates.io, Maven, etc.). Components without a parseable purl are
skipped silently.
Stage 2: /v1/vulns/{id}
For each unique advisory ID returned by stage 1, bomdrift issues a follow-up GET to populate severity. Severity is sourced (in order):
- GHSA’s database_specific.severity text label (LOW | MODERATE | HIGH | CRITICAL). This is the most consistent shape across the OSV corpus.
- Highest CVSS_V3 vector score from the severity[] array, mapped to a label by the standard CVSS-v3 severity rating (Critical ≥ 9.0, High ≥ 7.0, Medium ≥ 4.0, Low ≥ 0.1).
- Severity::None when neither shape is present. These advisories render with a none severity label and don’t trip --fail-on critical-cve.
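The precedence above can be sketched as follows. Several things here are assumptions: the function names are illustrative, CVSS base scores are taken as already computed (OSV's severity[] holds vector strings, and scoring a vector is out of scope), MODERATE is mapped to Medium, and an unrecognized GHSA label falls through to the CVSS path.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Critical,
    High,
    Medium,
    Low,
    None,
}

/// Step 1: GHSA text label. Returns Option::None for unknown labels.
fn from_ghsa_label(label: &str) -> Option<Severity> {
    match label.to_ascii_uppercase().as_str() {
        "CRITICAL" => Some(Severity::Critical),
        "HIGH" => Some(Severity::High),
        "MODERATE" => Some(Severity::Medium),
        "LOW" => Some(Severity::Low),
        _ => None,
    }
}

/// Step 2: standard CVSS v3 qualitative rating bands.
fn from_cvss_v3(score: f64) -> Severity {
    if score >= 9.0 {
        Severity::Critical
    } else if score >= 7.0 {
        Severity::High
    } else if score >= 4.0 {
        Severity::Medium
    } else if score >= 0.1 {
        Severity::Low
    } else {
        Severity::None
    }
}

/// Applies the sources in order: GHSA label, highest CVSS score, None.
fn resolve_severity(ghsa_label: Option<&str>, cvss_v3_scores: &[f64]) -> Severity {
    if let Some(severity) = ghsa_label.and_then(from_ghsa_label) {
        return severity;
    }
    if cvss_v3_scores.is_empty() {
        // Step 3: neither shape is present.
        return Severity::None;
    }
    let highest = cvss_v3_scores.iter().copied().fold(f64::MIN, f64::max);
    from_cvss_v3(highest)
}
```

Note that the GHSA label wins even when a CVSS score would rate the advisory higher, which follows from the stated ordering.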
On-disk severity cache
Stage-2 lookups are N+1 in the worst case — one query per unique
advisory ID. bomdrift caches stage-2 responses on disk at
<XDG_CACHE_HOME>/bomdrift/osv/<advisory_id>.json with a 24h TTL.
~/.cache/bomdrift/osv/
├── CVE-2025-12345.json
├── GHSA-3p68-rc4w-qgx5.json
└── MAL-2026-2306.json
Each cache file looks like:
{
"fetched_at": 1745878800,
"severity": "Critical",
"raw": { ... full /vulns/{id} response ... }
}
Cache behavior
- Cache hits log nothing. A successful 24h-fresh hit is silent.
- Cache misses are silent too. Each miss issues a network fetch and writes the result on success.
- End-of-run summary. A single line goes to stderr like osv: 18/22 severities served from cache so CI logs show the cache hit ratio without per-file noise.
- Atomic writes. Cache files are written to <id>.json.tmp then renamed, mirroring the temp-file + rename pattern used by bomdrift refresh-typosquat.
- Stale TTL. 24h is a deliberate balance between rerun friction (a CI job re-running 30 minutes after the last one wants the cache) and stale-severity risk (a published severity correction after 24h is rare and the renderer’s contract is “best effort”).
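The freshness check and the temp-file + rename write from the list above can be sketched with the standard library alone. The function names are hypothetical, and whether an entry aged exactly 24h counts as fresh is an assumption of this sketch (here it does not).

```rust
use std::fs;
use std::io;
use std::path::Path;

/// A cache entry is fresh while its age is strictly below the TTL.
/// `fetched_at` and `now` are Unix timestamps in seconds.
fn is_fresh(fetched_at: u64, now: u64, ttl_hours: u64) -> bool {
    // saturating_sub treats a clock that moved backwards as age zero.
    now.saturating_sub(fetched_at) < ttl_hours * 3600
}

/// Writes `<id>.json.tmp` next to the final file, then renames it into
/// place so readers never observe a half-written cache entry.
fn write_atomic(dir: &Path, advisory_id: &str, body: &str) -> io::Result<()> {
    let tmp = dir.join(format!("{advisory_id}.json.tmp"));
    let dest = dir.join(format!("{advisory_id}.json"));
    fs::write(&tmp, body)?;
    fs::rename(&tmp, &dest)
}
```

The rename is only atomic when the temp file and the destination live on the same filesystem, which is why the temp file sits in the cache directory itself.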
--no-osv-cache
For paranoid reruns where you want fresh fetches even within the 24h window:
bomdrift diff before.json after.json --no-osv-cache
The cache itself is purely an optimization — the bypass flag always works, it just costs N+1 fetches per run. Use sparingly.
--no-osv (offline mode)
Skip the entire OSV pipeline (both stages, no cache writes). Use for:
- Tests and example scenarios where determinism matters more than freshness.
- Air-gapped CI environments.
- Quick smoke tests of the change-shape signals without the network latency.
bomdrift diff before.json after.json --no-osv
Severity → --fail-on mapping
| Threshold | Trips when… |
|---|---|
| none | Never. |
| cve | Any vuln finding present (regardless of severity). |
| critical-cve | Any finding with severity >= High (covers HIGH and CRITICAL). |
| typosquat | Any typosquat finding; OSV findings do not trip it. |
| license-change | Any same-version license change; OSV findings do not trip it. |
| any | Any finding of any kind, plus license-changed-without-version-bump. |
The critical-cve name covers HIGH-or-CRITICAL because CRITICAL alone
is rare in the GHSA tagging and many actively-exploited advisories ship
as HIGH. The threshold name stays stable; the threshold value covers
the actionable bucket.
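Read as code, the table is a single match over the threshold. The `Findings` summary struct and all names below are hypothetical; the sketch only covers the thresholds listed in the table above.

```rust
/// Declaration order doubles as severity order for the derived `Ord`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    None,
    Low,
    Medium,
    High,
    Critical,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FailOn {
    None,
    Cve,
    CriticalCve,
    Typosquat,
    LicenseChange,
    Any,
}

/// Hypothetical per-diff summary of findings.
struct Findings {
    vuln_severities: Vec<Severity>,
    typosquats: usize,
    license_changes: usize,
    other: usize, // version jumps, young maintainers, and so on
}

/// Returns true when the diff should exit with code 2.
fn trips(threshold: FailOn, f: &Findings) -> bool {
    match threshold {
        FailOn::None => false,
        FailOn::Cve => !f.vuln_severities.is_empty(),
        // critical-cve covers HIGH as well as CRITICAL.
        FailOn::CriticalCve => f.vuln_severities.iter().any(|s| *s >= Severity::High),
        FailOn::Typosquat => f.typosquats > 0,
        FailOn::LicenseChange => f.license_changes > 0,
        FailOn::Any => {
            !f.vuln_severities.is_empty()
                || f.typosquats > 0
                || f.license_changes > 0
                || f.other > 0
        }
    }
}
```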
Network behavior
- Per-request timeout: 15 seconds.
- No authentication: OSV.dev’s /v1/querybatch and /v1/vulns/{id} endpoints are both unauthenticated public APIs.
- User-Agent: bomdrift/<version> so the OSV team can attribute traffic if needed.
- Failures warn and continue: a network mishap (DNS, timeout, 5xx) emits a single stderr warning and the diff renders without the Vulnerabilities section. The exit code remains 0 unless --fail-on was set and a previously-cached vuln tripped it.
Why OSV.dev specifically?
- Cross-ecosystem unification. OSV merges npm advisories from GHSA, PyPI advisories from PyPA, Cargo advisories from RustSec, Maven advisories from GHSA, etc. into a single API, so bomdrift doesn’t need ecosystem-specific clients.
- Open API, no key required. Every consumer of the /v1/querybatch endpoint gets the same data without registration overhead.
- Public schema. The response shape is documented at ossf.github.io/osv-schema/, so bomdrift can reason about the shape without depending on an API client crate that drags in tokio.
EPSS
bomdrift queries the Exploit Prediction Scoring System (EPSS) from FIRST.org for every CVE-aliased advisory and surfaces the per-CVE score (0.0 – 1.0) in markdown / terminal / SARIF output.
EPSS estimates the probability that a given CVE will be exploited in the next 30 days. Combined with severity it gives reviewers a sharper signal than CVSS alone — a Critical CVE with EPSS 0.01 is far less urgent than a Medium CVE with EPSS 0.85.
Output
- Markdown: per-advisory badge EPSS 0.87 after the severity label.
- Terminal: same badge, no markup.
- JSON: enrichment.vulns[purl][i].epss_score numeric field.
- SARIF: properties.epssScore on bomdrift.cve results.
When an advisory is keyed by GHSA but has CVE aliases, the score is the max across all CVE aliases so a GHSA covering two CVEs surfaces the worse of the two.
Threshold gating
bomdrift diff before.json after.json --fail-on-epss 0.5
Exits 2 when any advisory has score ≥ 0.5. 0.5 is roughly the top decile of actively-exploited CVEs; tune for your team’s risk appetite.
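The max-across-aliases rule and the threshold gate can be sketched in a few lines. The function names are illustrative, and representing an unscored CVE as `None` is an assumption of this sketch.

```rust
/// Returns the highest EPSS score among an advisory's CVE aliases.
/// `None` entries stand for CVEs that have no score yet.
fn advisory_epss(alias_scores: &[Option<f64>]) -> Option<f64> {
    alias_scores
        .iter()
        .flatten()
        .copied()
        .fold(None, |best: Option<f64>, s| Some(best.map_or(s, |b| b.max(s))))
}

/// Returns true when any advisory score meets or exceeds the
/// --fail-on-epss threshold. Unscored advisories never trip the gate.
fn trips_epss_gate(advisory_scores: &[Option<f64>], threshold: f64) -> bool {
    advisory_scores.iter().flatten().any(|s| *s >= threshold)
}
```

An advisory with only unscored aliases yields `None` rather than 0.0, which keeps "not scored yet" distinguishable from "scored as very unlikely".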
Calibration
- --cache-ttl-hours <N> (v0.9.6+): overrides the default 24h disk cache TTL for the EPSS scores cache.
- --fail-on-epss <FLOAT>: threshold gate; see Threshold gating.
Disabling
bomdrift diff before.json after.json --no-epss
or in .bomdrift.toml:
[diff]
no_epss = true
Both forms skip the FIRST.org HTTP call AND the disk cache lookup.
Caching
24h TTL at <XDG_CACHE_HOME>/bomdrift/epss/<cve>.json. Negative results
(CVEs FIRST.org returned no score for) are cached to avoid re-querying
recently-published CVEs that haven’t been scored yet.
Best-effort
Like every bomdrift enricher, EPSS is best-effort: a network failure or
a malformed response surfaces a BOMDRIFT_DEBUG=1 stderr note and the
diff renders with empty epss_score fields. EPSS being unreachable is
never a reason to block a PR review.
CISA KEV
bomdrift downloads the CISA Known Exploited Vulnerabilities catalog and
flips a KEV flag on every advisory whose primary id or aliases include a
CVE listed in the catalog.
CISA KEV is the highest-confidence “actively exploited in the wild” signal
available — CISA only adds CVEs to the catalog after observing real-world
exploitation. It’s a tighter filter than --fail-on critical-cve (which
fires on CVSS High or above regardless of exploitation evidence).
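The flag-flipping rule is a membership test over the advisory's primary ID and its aliases. A minimal sketch, assuming the catalog has already been reduced to a set of CVE IDs and that IDs match exactly (both are assumptions; `is_kev` is a hypothetical name):

```rust
use std::collections::HashSet;

/// Returns true when the advisory's primary ID or any of its aliases
/// appears in the set of CVE IDs taken from the KEV catalog.
fn is_kev(primary_id: &str, aliases: &[&str], catalog: &HashSet<String>) -> bool {
    std::iter::once(primary_id)
        .chain(aliases.iter().copied())
        .any(|id| catalog.contains(id))
}
```

Checking aliases matters because many advisories are keyed by a GHSA or MAL ID and only reference the CVE as an alias.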
Output
- Markdown: bold **KEV** badge after the severity / EPSS label.
- Terminal: plain KEV token.
- JSON: enrichment.vulns[purl][i].kev boolean field.
- SARIF: properties.kev: true on bomdrift.cve results when set.
Threshold gating
bomdrift diff before.json after.json --fail-on kev
Exits 2 when any advisory has its KEV flag set. --fail-on any also
includes KEV.
Calibration
--cache-ttl-hours <N> (v0.9.6+)
The 24h TTL for the catalog file is now configurable via the unified cache-TTL knob. Lower for faster CISA-update propagation in long-running self-hosted runners; raise when running offline or against archived SBOMs.
Disabling
bomdrift diff before.json after.json --no-kev
or in .bomdrift.toml:
[diff]
no_kev = true
Caching
24h TTL on the bulk catalog JSON at
<XDG_CACHE_HOME>/bomdrift/kev/catalog.json. Once-daily refresh matches CISA’s
publication cadence.
Best-effort
Network failure logs at BOMDRIFT_DEBUG=1 and the diff renders with KEV
flags absent. A stale catalog (within the 24h window) is preferred over
re-fetching on every run.
Typosquat detection
The typosquat enricher flags newly added components whose names are suspiciously close to a popular package in the same ecosystem. v0.4 covers npm, PyPI, Cargo, Maven, Go, RubyGems, NuGet, and Composer with rules tuned per ecosystem.
The signal
Typosquatting is a real and recurring supply-chain attack pattern:
- The 2024 PyPI campaign that registered colorama-0.4.7 (note the trailing zero) to drop a credential stealer.
- The Mar 2026 axios incident’s plain-crypto-js@4.2.1, a prefixed lookalike of the legitimate crypto-js, used to drop the WAVESHAPER.V2 RAT.
- Sustained npm lodash lookalikes (loadash, loadsh, loadshes) through 2024–2026.
The pattern is consistent across ecosystems: a candidate name with high
visual / phonetic similarity to a popular package, often with a single
character substitution / insertion / deletion, sometimes with an added
prefix or suffix. The defender’s task is to flag the candidate at PR
review time, before npm install or pip install runs the malicious
code.
Algorithm
The core scoring is Jaro-Winkler similarity with a suffix-containment
boost for the textbook prefix-add pattern (plain-crypto-js).
Threshold: 0.92 for a finding to surface. Maven is the exception (see
below).
Per-ecosystem rules
| Ecosystem | Canonicalization | Separators | Scoring |
|---|---|---|---|
| npm | lowercase | -, _, ., / | Jaro-Winkler + suffix boost |
| PyPI | PEP 503 (lowercase, -/_/. collapse) | -, _, . | Jaro-Winkler + suffix boost |
| Cargo | lowercase | - | Jaro-Winkler + suffix boost |
| Maven | lowercase | (n/a) | Levenshtein ≤ 2 on artifactId only |
| Go | lowercase | -, / | Jaro-Winkler on last path segment |
| Gem | lowercase | -, _ | Jaro-Winkler + suffix boost |
| NuGet | lowercase (case-insensitive per spec) | . | Jaro-Winkler + suffix boost |
| Composer | lowercase | -, / | Jaro-Winkler on package portion |
Filtering rules (npm / PyPI / Cargo)
- Exact match (case-insensitive after canonicalization) → skip. The candidate IS a popular package, not a squat.
- Likely-legit ecosystem extension → skip. When the candidate starts with the legit name followed by a separator, this matches the well-established convention for extension packages (react-router, axios-retry, eslint-plugin-react, pytest-asyncio). The structural rule is keyed on ecosystem-specific separator sets so PyPI’s -/_/. interchange doesn’t leak into npm’s wider set.
- Suffix containment with a substantial added prefix → boost. When the candidate ends with the legit name (length ≥ 5) AND the added prefix is longer than 3 characters, the score is boosted to at least 0.95. This catches the deceptive plain-crypto-js pattern that pure JW alone misses (the long prefix kills base similarity).
- Otherwise: plain Jaro-Winkler. Threshold 0.92 catches single-character drift like cross-env → crossenv (~0.98) or express → expresss (~0.97), while react → react-router (~0.88) stays below the threshold.
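The four rules above compose into one decision function. In this sketch the Jaro-Winkler score is passed in as `base_similarity` rather than computed, canonicalization is reduced to lowercasing (PEP 503 collapsing is not modeled), and every name is hypothetical.

```rust
/// Applies the structural rules to one (candidate, popular) pair.
/// Returns Some(score) when a finding should surface, None otherwise.
/// `base_similarity` stands in for the Jaro-Winkler score, which this
/// sketch does not compute.
fn typosquat_score(
    candidate: &str,
    popular: &str,
    separators: &[char],
    base_similarity: f64,
    threshold: f64,
) -> Option<f64> {
    let candidate = candidate.to_lowercase();
    let popular = popular.to_lowercase();

    // Rule 1: an exact match IS the popular package, not a squat.
    if candidate == popular {
        return None;
    }

    // Rule 2: "<popular><separator>..." looks like an extension package.
    if let Some(rest) = candidate.strip_prefix(popular.as_str()) {
        if rest.chars().next().map_or(false, |c| separators.contains(&c)) {
            return None;
        }
    }

    // Rule 3: "<long prefix><popular>" gets boosted to at least 0.95.
    let mut score = base_similarity;
    if let Some(prefix) = candidate.strip_suffix(popular.as_str()) {
        if popular.chars().count() >= 5 && prefix.chars().count() > 3 {
            score = score.max(0.95);
        }
    }

    // Rule 4: whatever score remains is compared against the threshold.
    if score >= threshold {
        Some(score)
    } else {
        None
    }
}
```

Passing the similarity in keeps the sketch focused on rule ordering: the extension skip runs before the suffix boost, so a name that matches both shapes is treated as an extension.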
Match-form rules (Go and Composer)
Go and Composer share an additional structural rule: the user-visible
coordinate has a stable, long prefix (Go’s host/owner/, Composer’s
vendor/) that’s duplicated across many legitimate packages. Including
the prefix in Jaro-Winkler scoring would inflate similarity past
anything useful: every Symfony package would score 0.95+ against every
other Symfony package, and Go modules sharing a host/owner/ prefix
would behave the same way.
Both ecosystems extract a match form from the canonicalized coordinate before scoring:
- Go: the last path segment of host/owner/repo (e.g. github.com/spf13/cobra → cobra).
- Composer: the package portion of vendor/package (e.g. symfony/console → console).
Comparison happens on match forms. When two distinct full coordinates
collapse to the same match form (github.com/spf13/cobra and
github.com/myorg/cobra), they’re treated as legitimate forks and
not flagged. Only typo’d match forms (cobraa vs cobra) trip the
JW similarity threshold.
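Match-form extraction is plain string slicing. The helper names below are hypothetical, and the sketch does not handle Go's /v2-style major-version path suffixes.

```rust
/// Go: the last path segment of host/owner/repo.
fn go_match_form(coordinate: &str) -> &str {
    coordinate
        .trim_end_matches('/')
        .rsplit('/')
        .next()
        .unwrap_or(coordinate)
}

/// Composer: the package portion of vendor/package.
fn composer_match_form(coordinate: &str) -> &str {
    match coordinate.split_once('/') {
        Some((_vendor, package)) => package,
        None => coordinate,
    }
}

/// Two distinct Go coordinates that collapse to the same match form are
/// treated as forks and skipped rather than scored.
fn is_go_fork(a: &str, b: &str) -> bool {
    a != b && go_match_form(a) == go_match_form(b)
}
```

Only pairs that survive the fork check would go on to similarity scoring on their match forms.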
Maven rules
Maven coordinates are groupId:artifactId. The shared groupId prefix
is often very long (org.springframework.boot:,
com.fasterxml.jackson.core:) and would inflate Jaro-Winkler past
anything useful — every Spring artifact would score 0.95+ against
every other Spring artifact. The Maven path skips JW + suffix-containment
entirely and uses Levenshtein distance ≤ 2 on the
artifactId portion only.
commons-lng3 differs from commons-lang3 by Levenshtein 1
(insert a), so it fires regardless of whether the groupId matches.
A different-groupId republish of an exact commons-lang3 artifact
does not fire — that’s a legitimate fork / republish, not a typo.
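The Maven path can be sketched with a textbook two-row Levenshtein. `artifact_id` and `maven_flags` are hypothetical names; treating distance 0 as "not flagged" follows the fork / republish rule described above.

```rust
/// Classic dynamic-programming edit distance over Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between a[..i] and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            let value = (prev[j] + cost) // substitution or match
                .min(prev[j + 1] + 1) // deletion
                .min(curr[j] + 1); // insertion
            curr.push(value);
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Extracts the artifactId from a groupId:artifactId coordinate.
fn artifact_id(coordinate: &str) -> &str {
    match coordinate.split_once(':') {
        Some((_group, artifact)) => artifact,
        None => coordinate,
    }
}

/// Fires on an artifactId within edit distance 1 or 2 of a popular one.
/// Distance 0 is an exact name, which is a republish rather than a typo.
fn maven_flags(candidate: &str, popular: &str) -> bool {
    let distance = levenshtein(
        &artifact_id(candidate).to_lowercase(),
        &artifact_id(popular).to_lowercase(),
    );
    (1..=2).contains(&distance)
}
```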
Reputational care
The renderer wording is intentional:
X is similar to Y
— never X is a typosquat of Y. Flagging a legitimate package as a malicious squat in a public PR comment is real reputational harm to the package author. The structural similarity is observable; intent is not. The human reviewing the PR is the analyst making the determination.
The CLI / Action exit code reflects this: typosquat findings are
always informational. --fail-on typosquat exists for projects that
want to gate on the structural signal explicitly, but it’s never the
default.
Reference lists
Embedded snapshots ship in the binary:
| File | Source | Size |
|---|---|---|
| data/npm-top1k.txt | anvaka/npmrank | 1000 |
| data/pypi-top200.txt | hugovk/top-pypi-packages | 200 |
| data/cargo-top200.txt | crates.io API ?sort=downloads | 200 |
| data/maven-top100.txt | mvnrepository.com Most Popular (curated) | ~100 |
| data/go-top200.txt | pkg.go.dev + awesome-go (curated) | ~180 |
| data/gem-top200.txt | rubygems.org popular gems (curated) | ~245 |
| data/nuget-top200.txt | nuget.org v3 search API ?orderby=totalDownloads | 200 |
| data/composer-top200.txt | packagist.org popular categories (curated) | ~190 |
v0.7 expanded the curated Go, Composer, and Gem lists — the
ship-with-binary snapshots now cover the CNCF / HashiCorp / gRPC-ecosystem
corners of Go, the Symfony / Laravel / Doctrine /
testing / Packagist-popular tail of Composer, and the Rails /
dry-rb / serializer / search corners of RubyGems. Each top-up is
grouped under a # --- v0.7 top-up: <category> (source: ...) ---
header in the data file so future curators can see provenance.
Lists for the ecosystems added in v0.2 and v0.4 are intentionally smaller
than npm-top1k.txt: the algorithm is identical across
ecosystems, so a smaller seed still proves the signal end-to-end. Lists
grow in subsequent releases without code changes; only the embedded
snapshot changes.
Refreshing
bomdrift refresh-typosquat # all eight ecosystems
bomdrift refresh-typosquat --ecosystem npm
bomdrift refresh-typosquat --ecosystem pypi
bomdrift refresh-typosquat --ecosystem cargo
bomdrift refresh-typosquat --ecosystem nuget
Refreshed lists are written to
<XDG_CACHE_HOME>/bomdrift/typosquat/<ecosystem>.txt via temp-file +
atomic rename. The enricher prefers cache files over the embedded
snapshot when present and parseable.
--ecosystem maven|go|gem|composer are accepted but emit a notice:
Maven Central, pkg.go.dev, RubyGems, and Packagist all lack stable
public popularity feeds (or have had ones that went through breaking
changes). The curated lists shipped in the binary remain the source
of truth; refreshing those means editing data/<eco>-top*.txt and
rebuilding bomdrift. PRs adding names to the curated lists are
welcome.
Calibration
--typosquat-similarity-threshold <FLOAT> (v0.9.6+)
Default 0.92, range [0.0, 1.0]. Configurable via CLI flag or
[diff] typosquat_similarity_threshold = <float> in .bomdrift.toml.
The threshold applies to the JW + suffix-boost path (npm, PyPI, Cargo, RubyGems, NuGet, Go, Composer). The Maven Levenshtein-≤-2 path is hardcoded — Levenshtein distance and JW similarity aren’t directly comparable, so a single threshold flag would either over- or under-suppress on Maven.
Recommended ranges:
- 0.95: very strict; only catches near-perfect matches. Good for tightening down false positives in monorepos with many internally forked dependencies.
- 0.92 (default): calibrated against the top-1000-of-each-ecosystem test corpus to produce zero false positives there.
- 0.85: lenient; catches softer near-misses at the cost of more false positives. Useful for paranoid security review of brand-new PyPI / npm packages.
The threshold also appears in --debug-calibration rows so collected
samples can guide tuning:
typosquat|<purl>|<similarity_score>|0.92
False-positive management
The structural rules + thresholds aim for “no false positives on the top 1000 of each ecosystem.” If you discover a false positive in the wild:
- Add a regression test in src/enrich/typosquat.rs::tests showing the false positive doesn’t fire.
- Open a PR. Tightening the rule (rather than special-casing the package name) is preferred because it drives a cleaner heuristic.
Disabling
Pure compute, no network. There is no --no-typosquat flag — disabling
the typosquat enricher would defeat its primary purpose. To suppress
specific false-positive findings, hand-curate a per-component baseline
entry; see Baseline & suppression — Worked example.
To gate exit code on typosquat findings, use --fail-on typosquat.
See also
- CLI reference: --typosquat-similarity-threshold
- bomdrift refresh-typosquat
- Baseline: false-positive triage
Multi-major version jumps
Pure-compute, no network, no new dependencies. The version-jump
heuristic flags dependency upgrades that cross two or more major
versions in a single diff (e.g. 1.x → 4.x).
Why it’s a useful signal
A single major bump (1 → 2) is the standard SemVer signal reviewers
already pay attention to — bomdrift does not flag it. Two or more
majors at once is the unusual case worth a closer look:
- Takeover swaps: a maintainer transition followed by a major-version rename to “reset” the package identity (the xz pattern, scaled down).
- Namespace reuse: an unrelated package republished at a higher major under the same name, intentionally or after an account compromise.
- “Cleaned up the dep tree” PRs: legitimate but high-risk refactors that silently jump several majors at once and bypass the usual SemVer guard-rails.
Always informational severity — never trips --fail-on thresholds
narrower than any.
Major-version extraction
Hand-rolled, ~5 lines. We deliberately avoid the semver crate: full
SemVer parsing is unnecessary when only the major number is consulted,
and pulling the dep would add transitive weight for no functional gain.
Accepted forms (each yields a Some(major))
- 1.2.3 → 1
- v1.0.0 → 1 (leading v tolerated)
- 2.5.3-beta.1 → 2 (pre-release suffix ignored)
- 3.0.0+build.123 → 3 (build metadata ignored)
- 4 / 4-rc.1 → 4 (no minor required)
Rejected forms (yield None, the pair is skipped — never flagged)
- empty string
- non-numeric (latest, nightly, main)
- leading-zero numbers (01.2.3): ambiguous and almost always a sign of a non-SemVer scheme; safer to skip than misinterpret.
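The accepted and rejected forms above translate into a short parser. This is a sketch rather than bomdrift's own code: the function names are illustrative, and treating downgrades as never flagged is an assumption.

```rust
/// Extracts the major version number, or None when the string does not
/// start with a plain decimal major component.
fn major(version: &str) -> Option<u64> {
    // A single leading 'v' is tolerated.
    let v = version.strip_prefix('v').unwrap_or(version);
    // The major component ends at the first '.', '-' or '+'.
    let head = v
        .split(|c: char| c == '.' || c == '-' || c == '+')
        .next()?;
    if head.is_empty() || !head.bytes().all(|b| b.is_ascii_digit()) {
        return None; // empty or non-numeric, e.g. "latest"
    }
    if head.len() > 1 && head.starts_with('0') {
        return None; // leading zero, e.g. "01.2.3"
    }
    head.parse().ok()
}

/// Flags an upgrade that crosses at least `min_delta` major versions.
/// Pairs where either side fails to parse are skipped, never flagged.
fn is_multi_major_jump(before: &str, after: &str, min_delta: u64) -> bool {
    match (major(before), major(after)) {
        (Some(b), Some(a)) => a.saturating_sub(b) >= min_delta,
        _ => false,
    }
}
```

With `min_delta` set to 2, a 1.x → 4.x bump is flagged while 1.x → 2.x is not.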
Calibration
The multi-major delta threshold is exposed as
--multi-major-delta <N>
(introduced in v0.9.7) with the matching [diff] multi_major_delta
config key. Default 2; minimum 1.
Raising the threshold to 3 or higher quiets noisy ecosystems that
release majors aggressively (some npm web frameworks ship a major every
few months). The signal still fires for genuinely unusual jumps but
stops competing with everyday upgrades for reviewer attention.
Lowering to 1 is supported but discouraged: it duplicates the
standard SemVer-bump signal reviewers already see on every PR, and
drowns the multi-major signal’s actual purpose (catching the xz pattern
and namespace-reuse swaps). bomdrift validates >= 1 so 0 is
rejected at the clap layer rather than silently disabling the enricher.
For per-component carve-outs use a baseline entry instead of dropping the global threshold; see Baseline — When the bump is the false positive.
Disabling
There is no --no-version-jump flag: the check is pure compute with no
network cost, so there is nothing worth disabling. Version-jump findings
are informational and affect the exit code only under --fail-on any. To suppress a specific bump as known-acceptable, write a
per-component baseline entry — see
Baseline — When the bump is the false positive.
Examples
| Before | After | Flagged? |
|---|---|---|
| 1.0.0 | 4.17.21 | yes (1 → 4) |
| 2.34.0 | 4.5.0 | yes (2 → 4) |
| 1.0.0 | 2.0.0 | no (single major bump) |
| 1.0.0 | 1.99.0 | no (no major bump) |
| latest | nightly | no (skipped, non-numeric) |
| 01.2.3 | 04.0.0 | no (skipped, leading-zero ambiguity) |
See examples/version-jumps/
for a runnable scenario.
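The decision behind the table can be sketched as a single comparison against the configured delta. The function name is hypothetical, and treating only upward jumps as findings is an assumption of this sketch; the source does not say how downgrades are handled.

```rust
/// Decide whether a before/after pair counts as a multi-major jump.
/// `None` means the version could not be parsed, so the pair is skipped.
fn is_multi_major_jump(before: Option<u64>, after: Option<u64>, delta: u64) -> bool {
    match (before, after) {
        // Assumption: only upward jumps of at least `delta` majors are flagged.
        (Some(b), Some(a)) => a > b && a - b >= delta,
        // Unparseable on either side: never flagged.
        _ => false,
    }
}
```

With the default delta of 2, the pair (1, 4) is flagged and (1, 2) is not; raising the delta to 3 also quiets (2, 4).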
Maintainer age signal
Flag newly added GitHub-hosted dependencies whose top contributor’s first commit is suspiciously recent. The xz/Jia Tan pattern.
Why it matters
The xz-utils backdoor (CVE-2024-3094, Mar 2024) was the work of “Jia Tan”, a GitHub identity that started contributing roughly two years before landing the malicious payload. The pattern — a brand-new account becoming the de facto sole maintainer of a low-traffic but widely-depended-upon package — is a leading indicator of long-game supply-chain takeovers.
We can’t catch Jia Tan in retrospect, but we can flag the next one earlier in their arc by surfacing “this package’s top contributor made their first commit less than 90 days ago” at the moment a new dep is added.
Threshold
90 days by default. Intentionally aggressive: most legitimate new packages will trip this on initial introduction. That’s fine — a human reviewer can dismiss “the package is brand-new and the author is its only maintainer” trivially.
The expensive miss is the silent takeover of an existing package by
a recently-arrived contributor, which is what the 90-day window
captures. Tune for your environment via --young-maintainer-days <N>
or [diff] young_maintainer_days = <N> (v0.9.6+); see
Calibration below.
How it works
For each cs.added component with a GitHub source_url:
- GET /repos/{owner}/{repo}/contributors?per_page=1: top contributor login.
- GET /repos/{owner}/{repo}/contributors to count contributors. Skip if > 50: “top contributor joined recently” loses meaning when 200 people have committed (Linux, Kubernetes, React, etc.).
- GET /repos/{owner}/{repo}/commits?author=<login>&per_page=1 to find the most recent commit by that author.
- Paginate to the last page to find their first commit. The “first commit by author” pagination trick is slow on prolific contributors (last page can be page 50+) but is correct without needing the GraphQL API.
- Compare against the SBOM-after timestamp (or clock::now() when the SBOM lacks a metadata timestamp). Flag when the first commit is younger than YOUNG_MAINTAINER_DAYS (default 90; tunable via --young-maintainer-days <N> in v0.9.6+).
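Once the requests above have returned, the final step is a pure comparison. A minimal sketch of that decision, assuming timestamps are already Unix seconds; the enum, the function name, and the constant names are hypothetical and not taken from bomdrift's source.

```rust
#[derive(Debug, PartialEq)]
enum MaintainerAge {
    /// More than 50 contributors: the signal is not meaningful.
    Skipped,
    /// First commit is at least `threshold_days` old.
    Established,
    /// First commit is younger than the threshold; this is the finding.
    Young { days: u64 },
}

const MAX_CONTRIBUTORS: usize = 50;
const SECS_PER_DAY: u64 = 86_400;

/// `reference_epoch` is the SBOM-after timestamp, or "now" when it is missing.
fn classify(
    contributors: usize,
    first_commit_epoch: u64,
    reference_epoch: u64,
    threshold_days: u64,
) -> MaintainerAge {
    if contributors > MAX_CONTRIBUTORS {
        return MaintainerAge::Skipped;
    }
    // saturating_sub guards against a first commit dated after the reference.
    let days = reference_epoch.saturating_sub(first_commit_epoch) / SECS_PER_DAY;
    if days < threshold_days {
        MaintainerAge::Young { days }
    } else {
        MaintainerAge::Established
    }
}
```

Keeping the decision free of network calls makes it easy to test against fixed timestamps.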
Skipped cases
- Components without a source_url (CycloneDX externalReferences with no vcs entry, etc.): silently skipped.
- Non-github.com source URLs: silently skipped (GitLab / Codeberg / etc. would need per-host clients; out of scope for v0).
- Repositories with > 50 contributors: skipped because the “top contributor’s first commit” loses meaning on monorepos and multi-vendor projects.
- Repositories returning 404 or 403 — skipped, warned once.
Per-repo results are cached within a single bomdrift run so repeated
cs.added entries from the same project don’t re-issue the same three
requests.
Network behavior
- Per-request timeout: 15 seconds.
- GITHUB_TOKEN honored: bumps the unauthenticated 60/hr cap to the authenticated 5000/hr cap. Without a token, large diffs (~30+ added GitHub deps) will hit rate-limiting; this surfaces as a warning, partial results render, and the exit code stays 0.
- No octocrab: the octocrab crate would pull in tokio + ~70 transitive crates. Hand-rolled ureq GETs + a 25-line ISO-8601 parser keep the bomdrift binary under our 5 MB target.
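To give a sense of what a small hand-rolled ISO-8601 parser involves, here is a sketch that handles only the fixed YYYY-MM-DDTHH:MM:SSZ shape. It is not bomdrift's parser: the function names are hypothetical, and the accepted shape is an assumption about what the commit timestamps look like.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date
/// (the well-known "days from civil" algorithm).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = (if y >= 0 { y } else { y - 399 }) / 400;
    let yoe = y - era * 400; // year of era, 0..=399
    let mp = (m + 9) % 12; // March = 0, so leap days fall at year end
    let doy = (153 * mp + 2) / 5 + d - 1; // day of (shifted) year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Parse `YYYY-MM-DDTHH:MM:SSZ` into Unix seconds; anything else is `None`.
fn parse_iso8601_utc(s: &str) -> Option<i64> {
    let b = s.as_bytes();
    if b.len() != 20
        || b[4] != b'-' || b[7] != b'-' || b[10] != b'T'
        || b[13] != b':' || b[16] != b':' || b[19] != b'Z'
    {
        return None;
    }
    // Parse one all-digit field; `get` also rejects non-ASCII boundaries.
    let num = |range: std::ops::Range<usize>| -> Option<i64> {
        let part = s.get(range)?;
        if part.bytes().all(|c| c.is_ascii_digit()) {
            part.parse().ok()
        } else {
            None
        }
    };
    let (y, mo, d) = (num(0..4)?, num(5..7)?, num(8..10)?);
    let (h, mi, sec) = (num(11..13)?, num(14..16)?, num(17..19)?);
    if !(1..=12).contains(&mo) || !(1..=31).contains(&d) || h > 23 || mi > 59 || sec > 60 {
        return None;
    }
    Some(days_from_civil(y, mo, d) * 86_400 + h * 3_600 + mi * 60 + sec)
}
```

The day-of-month check is deliberately loose (it accepts 02-31); a stricter parser would validate against the month length.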
Calibration
--young-maintainer-days <N> (CLI; v0.9.6+) or [diff] young_maintainer_days = <N> in .bomdrift.toml overrides the 90-day
default. Must be >= 1.
Recommended ranges:
- 30–60 for paranoid security-sensitive monorepos.
- 90 (default) for general-purpose use; the calibration target for the xz pattern.
- 180 for ecosystems with high contributor churn where the default surfaces too many legitimate first-time-author packages.
The threshold also appears in --debug-calibration rows so collected
samples can guide tuning:
maintainer-age|<purl>|<days_since_first_commit>|90
Disabling
--no-maintainer-age skips the entire enricher (no GitHub API calls).
Required for:
- Offline runs and tests.
- CI environments where GITHUB_TOKEN is unset and the unauthenticated rate limit (60/hr) is too low for the diff being analyzed.
- Smoke tests of the deterministic offline signals.
bomdrift diff before.json after.json --no-maintainer-age
Severity
Always informational. The maintainer-age signal never trips
--fail-on critical-cve; it surfaces only under --fail-on any. The
intent is for human review, not gating: many legitimate packages have
brand-new authors, and the threshold is calibrated to surface the
xz-style pattern, not to fail the build automatically.
Calibration roadmap (v0.9.6+ status)
Past calibration backlog and how each item resolved:
- Tunable threshold flag: shipped in v0.9.6 as --young-maintainer-days <N>. See Calibration above.
- Multi-signal fusion: combine maintainer-age with the registry enricher’s “recently-published” or “maintainer-set-changed” findings to narrow the false-positive rate. The signals all surface in the same diff today; explicit fusion in a single composite finding is a v1.0 follow-up.
- GraphQL pagination: decided not to pursue. Adds a token requirement (the GraphQL endpoint always wants auth) for one saved round-trip per component. The last-page REST trick is documented as the canonical approach; see the module doc-comment in src/enrich/maintainer.rs for the rationale.
See Roadmap for the current backlog.
Registry-metadata enrichers (npm / PyPI / crates.io)
bomdrift queries package registries for each newly-added component (plus npm version-changed components for the maintainer-set check) and surfaces three kinds of finding:
- Recently published: the publish timestamp is within --recently-published-days (default 14 days). Recent publishes correlate with takeover swaps and namespace-reuse attacks.
- Deprecated: the package or version is flagged deprecated on npm, yanked on PyPI / crates.io, or carries an “Inactive” PyPI classifier.
- Maintainer set changed (npm only) — the maintainer set listed for the new version differs from the maintainer set listed for the old version. Classic xz / Jia Tan precursor.
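The maintainer-set check boils down to a set difference between the two versions' maintainer lists. A minimal sketch, assuming the maintainer logins have already been extracted from the registry response; the function name and tuple return shape are hypothetical.

```rust
use std::collections::BTreeSet;

/// Returns (added, removed) maintainers between the old and new version.
/// Both lists come back sorted because BTreeSet iterates in order.
fn maintainer_changes<'a>(old: &[&'a str], new: &[&'a str]) -> (Vec<&'a str>, Vec<&'a str>) {
    let old_set: BTreeSet<&'a str> = old.iter().copied().collect();
    let new_set: BTreeSet<&'a str> = new.iter().copied().collect();
    // Present on the new version only.
    let added = new_set.difference(&old_set).copied().collect();
    // Present on the old version only.
    let removed = old_set.difference(&new_set).copied().collect();
    (added, removed)
}
```

A finding would fire when either list is non-empty; reordering the same maintainers produces no change.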
Sources
| Ecosystem | URL | Headers |
|---|---|---|
| npm | https://registry.npmjs.org/<pkg> (URL-encoded @scope/name) | User-Agent: bomdrift/<version> |
| PyPI | https://pypi.org/pypi/<pkg>/json | — |
| crates.io | https://crates.io/api/v1/crates/<name> | User-Agent: bomdrift/0.9.0 (https://github.com/Metbcy/bomdrift) (required by crates.io) |
Disk cache
Per ecosystem under <XDG_CACHE>/bomdrift/registry/<eco>/<pkg>.json,
24-hour TTL, atomic temp-file + rename writes. Mirrors the OSV / EPSS
/ KEV cache shape.
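The cache shape described here (temp-file write, rename into place, TTL check on read) can be sketched with the standard library alone. The function names and the .tmp suffix are assumptions for illustration, not bomdrift's actual code.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{Duration, SystemTime};

/// Write `bytes` to `path` via a sibling temp file plus rename, so a
/// reader never observes a half-written cache entry.
fn write_atomic(path: &Path, bytes: &[u8]) -> io::Result<()> {
    if let Some(dir) = path.parent() {
        fs::create_dir_all(dir)?;
    }
    // Hypothetical temp name: foo.json becomes foo.json.tmp.
    let tmp = path.with_extension("json.tmp");
    fs::write(&tmp, bytes)?;
    fs::rename(&tmp, path)
}

/// A cache entry is fresh when it exists and its mtime is younger than `ttl`.
fn is_fresh(path: &Path, ttl: Duration) -> bool {
    fs::metadata(path)
        .and_then(|meta| meta.modified())
        .ok()
        // An mtime in the future makes duration_since fail: treat as stale.
        .and_then(|mtime| SystemTime::now().duration_since(mtime).ok())
        .map_or(false, |age| age < ttl)
}
```

The rename is atomic only when the temp file lives on the same filesystem as the target, which is why the temp file is a sibling rather than something under a system temp directory.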
Best-effort
A registry timeout, parse error, or unsupported ecosystem returns
Ok with no findings. Diff rendering NEVER blocks on registry
responses.
Calibration
- --recently-published-days <N>: override the default 14-day threshold. Set --recently-published-days 0 to disable that check while keeping deprecation / maintainer-set-changed.
- --cache-ttl-hours <N> (v0.9.6+): overrides the default 24h disk cache TTL for the per-ecosystem registry caches.
Disabling
bomdrift diff before.json after.json --no-registry
Disables all three checks at once. Equivalent to [diff] no_registry = true in .bomdrift.toml.
Flags
- --no-registry: skip all three checks.
- --recently-published-days <N>: see Calibration.
- --fail-on recently-published, --fail-on deprecated: exit-2 thresholds.
Output
- Markdown: three new sections — “Recently published”, “Deprecated upstream”, “Maintainer set changed (npm)” — in the per-category area.
- JSON: enrichment.recently_published, enrichment.deprecated, enrichment.maintainer_set_changed.
- SARIF: rules bomdrift.recently-published, bomdrift.deprecated, bomdrift.maintainer-set-changed with stable partialFingerprints.primaryHash/v1.
- Calibration rows (--debug-calibration): recently-published|<purl>|<days>|14, deprecated|<purl>|<message>|any, maintainer-set-changed|<purl>|<changes>|1.
Why npm-only for maintainer-set-changed?
PyPI and crates.io don’t expose a clean “maintainers per version” view in their public REST API:
- PyPI: the info.maintainer and info.author fields are free-text and inconsistent across releases. There’s no historical record per release.
- crates.io: owners is package-level, not version-level, so we can’t tell which owners had publish rights at the time of an individual version.
When the upstream APIs gain a per-version maintainer view, we’ll extend the enricher to those ecosystems; this is a future-version follow-up.
Baseline & suppression
The --baseline <path> flag suppresses findings that are already present
in a previously captured bomdrift diff --output json snapshot. It exists
to make adopting bomdrift on a project with pre-existing findings
practical — the first PR shouldn’t drown in noise that’s already been
reviewed and accepted.
How it works
- Capture a baseline once, after a maintainer has reviewed and accepted the current state of findings as known acceptable:
    bomdrift diff before.json after.json --output json > .bomdrift-baseline.json
  Commit .bomdrift-baseline.json to the repo.
- On subsequent runs, pass --baseline:
    bomdrift diff before.json after.json --baseline .bomdrift-baseline.json
- Findings whose match key is already present in the baseline are dropped from the rendered output and from the --fail-on trip evaluation. New findings (at a new component, a new version of a known component, or a new advisory ID) surface normally.
Match keys
Match keys are intentionally conservative. A finding at a different version than baseline still surfaces — version drift is exactly the case where a known-acceptable finding becomes an unknown one, so suppressing across versions would defeat the point.
| Finding type | Match key |
|---|---|
| Vulnerability (CVE / GHSA / MAL) | (purl_with_version, advisory_id) |
| Typosquat | (purl_with_version) |
| Multi-major version jump | (purl_with_version) (the after-version) |
| Young maintainer | (purl_with_version) |
Notes:
- License-changed-without-version-bump pairs are part of the ChangeSet, not the enrichment. --baseline suppresses findings, not the diff itself, so license changes always surface in the rendered output. This is intentional: a license change at a known version is still a change worth a reviewer’s eye.
- Vulnerabilities use the advisory ID in the key, so a new GHSA against an already-known component still fires.
- Typosquats use the after-version in the key, so a typo’d foo@1.0.0 in the baseline doesn’t suppress a typo’d foo@2.0.0.
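The match-key table can be modeled as a hashable key per finding kind, with suppression reduced to a set lookup. This is a sketch under assumed names (MatchKey, retain_new); it mirrors the table, not bomdrift's internal types.

```rust
use std::collections::HashSet;

/// One variant per row of the match-key table. Every purl includes the
/// version, so a finding at a different version never matches.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum MatchKey {
    Vulnerability { purl: String, advisory_id: String },
    Typosquat { purl: String },
    VersionJump { purl: String }, // keyed by the after-version
    YoungMaintainer { purl: String },
}

/// Keep only the findings whose key is absent from the baseline.
fn retain_new(findings: Vec<MatchKey>, baseline: &HashSet<MatchKey>) -> Vec<MatchKey> {
    findings
        .into_iter()
        .filter(|key| !baseline.contains(key))
        .collect()
}
```

Because the advisory ID is part of the vulnerability key, a second advisory on an already-baselined component produces a key the set has never seen, so it surfaces.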
Forward compatibility
The baseline parser is intentionally forgiving about missing fields.
v0.2 baselines can suppress a vuln by (purl, advisory_id) even when
the v0.3+ enrichment has populated severity, just with reduced
precision. Regenerate baselines under v0.3+ to capture the full match
shape.
As of v0.4, the action ships a baseline: input that plumbs straight
through to --baseline — no need for a custom step calling the
binary directly.
In-comment suppression (v0.5+)
Editing .bomdrift/baseline.json by hand on every accepted finding is
friction. v0.5 ships a comment-driven flow: a reviewer comments
/bomdrift suppress <ADVISORY-ID> on a PR, and a companion sub-action
appends the ID to the baseline file and commits it to the PR’s head
branch. The next bomdrift run on the same PR sees the finding as
suppressed.
Setup
Add a second workflow alongside your normal bomdrift one:
# .github/workflows/bomdrift-suppress.yml
name: bomdrift suppress
on:
issue_comment:
types: [created]
permissions:
contents: write # to commit the baseline file
pull-requests: write # to react on the trigger comment
jobs:
suppress:
if: |
github.event.issue.pull_request &&
startsWith(github.event.comment.body, '/bomdrift suppress ')
runs-on: ubuntu-latest
steps:
- uses: Metbcy/bomdrift/comment-suppress@v1
The if: filter is conservative — it gates on both
github.event.issue.pull_request (so issue comments don’t trigger)
and the comment-body prefix. The sub-action also re-validates both
internally and exits cleanly on non-matching events, so the filter is
defense-in-depth, not load-bearing.
What it does
- Parses the comment body for /bomdrift suppress <id>. The ID must match a GHSA / CVE / MAL pattern.
- Reacts to acknowledge that the command was accepted.
- Resolves the PR’s head ref via the GitHub API.
- Downloads the latest bomdrift release archive and (by default) verifies its cosign signature.
- Clones the PR’s head branch into a sibling worktree.
- Runs bomdrift baseline add <id> --path <baseline-path>, which appends the ID to the suppressed_advisories array in the baseline file (creating the file if missing).
- Commits + pushes the baseline change with message chore(bomdrift): suppress <id>.
- Reacts on the trigger comment to show success or failure.
What it suppresses
The v0.5 in-comment flow uses a wildcard advisory match: the specified ID is suppressed across all components, not just the one the comment was attached to. This is intentional — the typical case is “this advisory is a known false positive in our environment regardless of which dep pulls it in.” For per-component suppression, hand-edit the baseline using the existing diff-output JSON shape (see Match keys above) — both shapes coexist in the same file.
CLI equivalent
The same operation is available from the command line for users who want to curate a baseline outside CI:
bomdrift baseline add GHSA-xxxx-yyyy-zzzz
bomdrift baseline add CVE-2026-12345 --path custom/baseline.json
The command is idempotent — re-adding an existing ID is a no-op.
--from-comment (v0.9+)
When the GitLab comment-suppress bridge (or any other webhook
handler) hands you a raw note body, pass it via --from-comment
and let bomdrift extract the directive:
bomdrift baseline add --from-comment "Looks fine. /bomdrift suppress GHSA-mwcw-c2x4-8c55 reason: vendor PR #42 already merged"
The flag accepts the entire comment body. bomdrift parses the first
/bomdrift suppress <ID>[ reason: <text>] line, validates the ID
shape, and either appends the entry (writing object-form when a
reason is present) or exits non-zero with a clear stderr message
when no directive is found. The grammar is identical to the GitHub
comment-suppress sub-action — the two parsers are deliberately
kept in lockstep.
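A sketch of how such a directive could be pulled out of a comment body. It handles only the same-line reason: form shown in the example above; the struct, the function names, and the loose ID shape check are assumptions, not the real parser's rules.

```rust
#[derive(Debug, PartialEq)]
struct Directive {
    id: String,
    reason: Option<String>,
}

/// Loose shape check (assumption): a GHSA-/CVE-/MAL- prefix followed by
/// alphanumerics and dashes. The real validation may be stricter.
fn looks_like_advisory_id(id: &str) -> bool {
    ["GHSA-", "CVE-", "MAL-"].iter().any(|prefix| {
        id.strip_prefix(*prefix).map_or(false, |rest| {
            !rest.is_empty() && rest.chars().all(|c| c.is_ascii_alphanumeric() || c == '-')
        })
    })
}

/// Find the first `/bomdrift suppress <ID>[ reason: <text>]` in the body.
fn parse_directive(body: &str) -> Result<Directive, String> {
    const COMMAND: &str = "/bomdrift suppress ";
    for line in body.lines() {
        let start = match line.find(COMMAND) {
            Some(index) => index,
            None => continue,
        };
        let rest = line[start + COMMAND.len()..].trim();
        // The ID is the first whitespace-delimited token after the command.
        let (id, tail) = match rest.split_once(char::is_whitespace) {
            Some((id, tail)) => (id, tail.trim_start()),
            None => (rest, ""),
        };
        if !looks_like_advisory_id(id) {
            return Err(format!("invalid advisory ID: {:?}", id));
        }
        let reason = tail
            .strip_prefix("reason:")
            .map(|text| text.trim().to_string())
            .filter(|text| !text.is_empty());
        return Ok(Directive { id: id.to_string(), reason });
    }
    Err("no `/bomdrift suppress <ID>` directive found".to_string())
}
```

Returning an error for both "no directive" and "bad ID" maps naturally onto the non-zero exit described above.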
Workflow integration
A typical CI pattern commits the baseline alongside the source code and refreshes it after a maintainer reviews and accepts new noise as known acceptable:
- uses: Metbcy/bomdrift@v1
with:
before-sbom: before.json
after-sbom: after.json
baseline: .bomdrift/baseline.json
fail-on: critical-cve
When this fails on a new finding, the maintainer either:
- Fixes the finding (upgrade the dep, replace the typosquat) — no baseline change needed.
- Accepts the finding as known acceptable, regenerates the baseline at the path the workflow reads, and commits it:
    bomdrift diff before.json after.json --output json > .bomdrift/baseline.json
    git add .bomdrift/baseline.json
  Reviewers see the diff against the previous baseline in the same PR and decide whether the new entry is acceptable.
When NOT to use a baseline
- For a fresh project. If you can fix every finding before merging the bomdrift integration PR, do that — the baseline is technical debt, even if it’s debt with a clear purpose.
- For severity-bucket gating. Use --fail-on critical-cve to gate the merge on actionable severity instead of suppressing everything under that severity. Baselines are for “we know about this, it’s fine for now”, not “ignore this entire class”.
- For findings you’ll fix in the next PR. A baseline is a long-lived artifact; for one-PR exceptions, just upgrade the dep.
Worked example: triaging a false positive
Real-world false positives are the most common reason adopters reach for the baseline. A typical case looks like this on a PR:
🚨 Typosquat candidate — new dependency
colour-print is within Levenshtein distance 1 of well-known package colorprint. Review for impersonation.
In our example, colour-print is a deliberate British-English spelling
maintained by a long-trusted internal team — this is the canonical
“signal that’s true in the abstract, wrong for our codebase” case. The
Levenshtein heuristic should fire on this; what’s wrong is the
verdict, not the detection. Suppressing the whole typosquat class
(via --fail-on cve) loses coverage on actually-malicious squats; a
wildcard config field would over-suppress; what we want is exactly
this finding suppressed.
Step 1 — capture the current finding shape
Before deciding what to suppress, see what bomdrift saw. Run with
--output json and pull out the typosquat finding:
bomdrift diff before.json after.json --output json \
| jq '.enrichment.typosquat[] | select(.purl | contains("colour-print"))'
Output:
{
"purl": "pkg:npm/colour-print@2.1.0",
"candidate_for": "colorprint",
"distance": 1,
"ecosystem": "npm"
}
The purl_with_version here is pkg:npm/colour-print@2.1.0 — the
match key for the typosquat entry per the table above.
Step 2 — write a per-component baseline entry
Edit .bomdrift/baseline.json (the file bomdrift init scaffolds, or
whatever path you pass to --baseline). The diff-output JSON shape
takes precedence, so a hand-written entry uses the same fields the
JSON output produces:
{
"suppressed_advisories": [],
"findings": {
"typosquat": [
{
"purl": "pkg:npm/colour-print@2.1.0",
"candidate_for": "colorprint",
"ecosystem": "npm",
"_note": "British-English spelling, owned by team-foo since 2019. Re-evaluate on major-version bump."
}
]
}
}
The _note field is an underscore-prefixed extension; bomdrift
preserves unknown fields verbatim on round-trip and never reads them
back, so it’s a safe place to capture the why. Future maintainers
who read the baseline see the rationale without spelunking through
git blame.
Step 3 — verify the suppression takes effect
Re-run the diff with the baseline applied:
bomdrift diff before.json after.json \
--baseline .bomdrift/baseline.json
The colour-print finding is gone; everything else (including any
other typosquat candidate that shows up the same week) still
surfaces. That’s the trade-off: a precise hand-written entry beats a
wildcard or a class-wide opt-out, because the next typosquat against a
new package still trips the gate.
Why a hand-edited entry beats --fail-on tuning
It’s tempting to “just” loosen --fail-on typosquat to --fail-on critical-cve. Don’t:
- The typosquat enricher is your earliest signal for malicious packages: a real squat (colorize impersonating colorise) is caught here before the OSV.dev advisory exists.
- A baseline entry is auditable: git log .bomdrift/baseline.json shows when this exception was made and by whom.
- A wildcard config setting (e.g., a hypothetical [diff.typosquat] allow_distance_1 = true) would also suppress unrelated future squats. Per-component is the smallest possible exception that still fixes this one PR.
When the bump is the false positive
Sometimes the finding is a multi-major version jump on a package you
expect to leap (a calver-style release schedule, a coordinated
ecosystem-wide bump). The same per-component recipe works — replace
the typosquat array with version_jump, key by the after-version’s
purl. Update the entry on the next jump.
Schema reference
The unified BaselineEntry shape (introduced in v0.9.5; v0.5 string
entries continue to parse as the back-compat case):
| Field | Type | Required | Introduced | Description |
|---|---|---|---|---|
| id | string | yes (when not the bare-string form) | v0.5 | Advisory identifier: GHSA-…, CVE-…, MAL-…, or OSV-…. |
| purl | string | no | v0.5 | Restrict the suppression to a specific component (otherwise wildcards across all components). May be versionless (pkg:npm/foo) or version-pinned (pkg:npm/foo@1.2.3). |
| expires | string YYYY-MM-DD | no | v0.8 | Strict-format expiry date. After this date the entry surfaces a warning and stops suppressing. Malformed dates fail loudly; there are no silent never-expiring entries. |
| reason | string | no | v0.8 | Free-form rationale; surfaces in the expiry warning and as the OpenVEX statement_text in --emit-vex output. |
| vex_status | string | no | v0.9 | One of OpenVEX’s vocabulary: not_affected, affected, fixed, under_investigation. Drives --emit-vex output. Defaults to under_investigation so --emit-vex doesn’t fabricate not_affected claims. |
| vex_justification | string | no | v0.9 | OpenVEX justification when vex_status = not_affected. E.g., vulnerable_code_not_in_execute_path, component_not_present. |
Cross-link: vex_status and vex_justification are passthrough to the
VEX emit format. The
License policy
chapter covers using baseline entries to suppress LicenseViolation
findings (the same id / purl / reason schema applies; license
violations key by a synthetic ID bomdrift.license-violation:<purl>).
Two valid shapes per entry
The suppressed_advisories array accepts either form per entry:
{
"suppressed_advisories": [
"GHSA-old-school",
{
"id": "GHSA-evil-1234",
"purl": "pkg:npm/foo",
"expires": "2026-12-31",
"reason": "Awaiting upstream patch (issue #42)",
"vex_status": "under_investigation"
}
]
}
Bare strings remain in the file for v0.5 compatibility; bomdrift baseline add --reason … always emits the object form.
Time-boxed suppressions (expires + reason)
v0.8 adds two optional fields on each suppressed_advisories entry:
{
"suppressed_advisories": [
{
"id": "GHSA-evil-1234",
"purl": "pkg:npm/foo",
"expires": "2026-12-31",
"reason": "Awaiting upstream patch (issue #42)"
},
"GHSA-old-school"
]
}
Both fields are optional. String entries (the v0.5 form) keep working — the array is a union of both shapes.
Behavior
- Active entry (expires is today or in the future, OR no expires): finding is suppressed as before.
- Expired entry (expires is strictly before today): finding surfaces, and bomdrift prints one warning line per expired entry to stderr:
    warning: baseline entry GHSA-evil-1234 (pkg:npm/foo) expired 2026-04-29; finding will surface in this run — was: Awaiting upstream patch (issue #42)
- Malformed expires (e.g. 2026/12/31): bomdrift refuses to load the baseline rather than silently treating it as never-expiring. Use strict YYYY-MM-DD zero-padded.
The “today” comparison honors SOURCE_DATE_EPOCH so reproducible-build
contexts stay deterministic.
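The three behaviors above reduce to a strict date parse plus one comparison. A minimal sketch, with "today" passed in by the caller (for instance derived from SOURCE_DATE_EPOCH); the type alias and function names are hypothetical, and the day-of-month check is intentionally loose.

```rust
/// (year, month, day). Tuples compare field by field, so `<` is date order.
type Date = (u32, u32, u32);

/// Accept only zero-padded `YYYY-MM-DD`; everything else is an error so a
/// malformed date can never become a silent never-expiring entry.
fn parse_strict_date(s: &str) -> Result<Date, String> {
    let b = s.as_bytes();
    let well_formed = b.len() == 10
        && b.iter().enumerate().all(|(i, c)| match i {
            4 | 7 => *c == b'-',
            _ => c.is_ascii_digit(),
        });
    if !well_formed {
        return Err(format!("malformed expires {:?}: expected YYYY-MM-DD", s));
    }
    // All bytes in these ranges are ASCII digits, so the fold cannot fail.
    let num = |range: std::ops::Range<usize>| {
        b[range].iter().fold(0u32, |acc, c| acc * 10 + u32::from(c - b'0'))
    };
    let date = (num(0..4), num(5..7), num(8..10));
    if !(1..=12).contains(&date.1) || !(1..=31).contains(&date.2) {
        return Err(format!("malformed expires {:?}: month or day out of range", s));
    }
    Ok(date)
}

/// Expired means strictly before today; an entry expiring today is active.
fn is_expired(expires: Date, today: Date) -> bool {
    expires < today
}
```

Passing today in as a parameter is what keeps the check deterministic under reproducible builds.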
CLI
bomdrift baseline add GHSA-evil-1234 \
--expires 2026-12-31 \
--reason "Awaiting upstream patch (issue #42)"
The comment-suppress companion action also picks up an optional
reason: <text> line in the triggering comment body:
/bomdrift suppress GHSA-evil-1234
reason: Awaiting upstream patch (issue #42)
Worked rotation example
Six months ago the team accepted GHSA-evil-1234 with a 6-month expiry. Today the warning fires:
warning: baseline entry GHSA-evil-1234 expired 2026-04-29 …
The reviewer either renews the suppression (new PR, new expiry + reason) or removes the entry and merges the upstream patch. Suppressions become reviewed work-items, not silent forever-state.
VEX (Vulnerability Exploitability eXchange)
bomdrift consumes and emits VEX statements so reviewers can record exploitability decisions next to their SBOMs and have those decisions suppress noise on subsequent diffs.
Two formats are supported on input (auto-detected per file):
- OpenVEX 0.2.0 — see https://github.com/openvex/spec.
- CycloneDX VEX 1.6: analysis.state is mapped onto the OpenVEX vocabulary (not_affected / resolved → not_affected, exploitable → affected, in_triage → under_investigation).
OpenVEX is bomdrift’s preferred output format on emission (--emit-vex)
because the standalone JSON-LD doc is the smallest interop surface.
Consuming VEX (--vex <path>)
The flag is repeatable. Each file is auto-detected by its top-level
shape. Statements match findings by (vuln_id_or_alias, product_purl).
| VEX status | Effect on the matching finding |
|---|---|
| not_affected | Suppresses (counted in “Suppressed by VEX”) |
| fixed | Suppresses |
| under_investigation | Annotates with VEX:under_investigation |
| affected | Annotates with VEX:affected |
A VEX statement’s products[] may be either purl strings or
{"@id": "pkg:..."} objects. A versionless statement
(pkg:npm/foo) matches every versioned finding-product
(pkg:npm/foo@1.2.3); a versioned statement only matches the exact
purl.
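The versionless-versus-versioned rule can be sketched as a string comparison on purls. This simplified version ignores purl qualifiers and subpaths, and both function names are hypothetical.

```rust
/// Drop the `@version` suffix, if any. The version separator is looked
/// for only after the last `/`, so an encoded npm scope is left alone.
fn strip_version(purl: &str) -> &str {
    let name_start = purl.rfind('/').map_or(0, |index| index + 1);
    match purl[name_start..].find('@') {
        Some(at) => &purl[..name_start + at],
        None => purl,
    }
}

/// A versioned statement matches only the exact purl; a versionless
/// statement matches every version of the same package.
fn statement_matches(statement_purl: &str, finding_purl: &str) -> bool {
    if statement_purl == finding_purl {
        return true;
    }
    let statement_is_versionless = strip_version(statement_purl) == statement_purl;
    statement_is_versionless && strip_version(finding_purl) == statement_purl
}
```

Comparing the full stripped purl (rather than a prefix) is what stops pkg:npm/foo from matching pkg:npm/foobar@1.0.0.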
Synthetic finding IDs
bomdrift emits non-CVE findings (typosquats, version-jumps, maintainer-age, license-violations). To author VEX statements that suppress them, use the synthetic ID convention:
| Finding kind | Synthetic ID format |
|---|---|
| Typosquat | bomdrift.typosquat:<purl>:<closest> |
| Version-jump | bomdrift.version-jump:<purl>:<before_major>-><after_major> |
| Maintainer-age | bomdrift.young-maintainer:<purl>:<top_contributor> |
| License-violation | bomdrift.license-violation:<purl>:<license_string> |
Example OpenVEX statement suppressing a typosquat finding:
{
"vulnerability": { "name": "bomdrift.typosquat:pkg:npm/plain-crypto-js@4.2.1:crypto-js" },
"products": [ { "@id": "pkg:npm/plain-crypto-js@4.2.1" } ],
"status": "not_affected",
"justification": "vulnerable_code_not_present",
"status_notes": "verified the package is a re-export and not impersonating crypto-js"
}
Multiple files
--vex first.json --vex second.json is processed left-to-right.
Statements with the same (vuln_id, product) are first-write-wins:
later files do NOT override earlier ones. Pass project-level VEX first
and policy-level VEX second so the project-level entries take
precedence over the policy defaults. (Or pass them in the reverse order
if you want the opposite precedence.)
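First-write-wins merging is a one-liner with a map's entry API. A sketch assuming each file has already been parsed into (vuln_id, product, status) triples; the function name and data shape are illustrative only.

```rust
use std::collections::HashMap;

/// Merge statements from several files, left to right. The first status
/// recorded for a (vuln_id, product) pair is kept; later ones are ignored.
fn merge_first_wins(files: &[Vec<(&str, &str, &str)>]) -> HashMap<(String, String), String> {
    let mut merged = HashMap::new();
    for file in files {
        for (vuln_id, product, status) in file {
            merged
                .entry((vuln_id.to_string(), product.to_string()))
                // or_insert_with only runs when the key is not present yet.
                .or_insert_with(|| status.to_string());
        }
    }
    merged
}
```

Whichever file should take precedence therefore has to come first on the command line.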
Verifying with vexctl
If you have vexctl installed:
vexctl filter --vex bomdrift.openvex.json sbom.cdx.json
verifies the VEX doc is well-formed and that statements match a known purl in your SBOM.
Emitting VEX (--emit-vex <path>)
Writes a single OpenVEX 0.2.0 document covering every finding in the post-baseline diff.
- Baseline-suppressed findings inherit their vex_status from the baseline entry, defaulting to under_investigation. Baseline ≠ “not affected”: baseline often means “accepted in PR review” or “temporarily ignored”, so emitting not_affected by default would publish a false claim. Opt in by adding vex_status: "not_affected" to the baseline entry:
    {
      "id": "GHSA-x-y-z",
      "purl": "pkg:npm/foo",
      "expires": "2026-12-31",
      "reason": "Awaiting upstream patch (issue #42)",
      "vex_status": "not_affected",
      "vex_justification": "vulnerable_code_not_present"
    }
- Un-suppressed findings emit as affected with status_notes describing the bomdrift finding kind. The justification field falls back to the configured [diff] vex_default_justification (default vulnerable_code_not_in_execute_path).
The doc’s timestamp honors SOURCE_DATE_EPOCH, so --emit-vex
output is byte-deterministic in CI when the env is set.
Configuration keys
[diff]
vex_author = "https://example.com/security"
vex_default_justification = "vulnerable_code_not_in_execute_path"
vex_author falls back to repo_url when unset; falls back to
"bomdrift" when both are missing.
Justification vocabulary
bomdrift uses the OpenVEX 0.2.0 spec’s standard justification values
verbatim: component_not_present, vulnerable_code_not_present,
vulnerable_code_not_in_execute_path,
vulnerable_code_cannot_be_controlled_by_adversary, and
inline_mitigations_already_exist. The spec defines these for
not_affected statements.
Richer justification vocabularies (per-organization tags,
custom-reason strings, tool-specific extensions) are out of scope —
authoring against a single canonical enum keeps --emit-vex output
interoperable with any OpenVEX consumer. If the OpenVEX spec evolves
to add new justifications, bomdrift follows the spec; non-spec
justifications won’t be invented here.
Worked rotation example
- Run a diff that surfaces GHSA-evil on pkg:npm/foo@1.0.0.
- Investigate, conclude the vulnerable function is not on your execute path.
- Add the entry to .bomdrift/baseline.json with VEX status:
    {
      "schema_version": 1,
      "suppressed_advisories": [
        {
          "id": "GHSA-evil",
          "purl": "pkg:npm/foo@1.0.0",
          "expires": "2027-01-01",
          "reason": "Function is unreachable per audit (PR #123)",
          "vex_status": "not_affected",
          "vex_justification": "vulnerable_code_not_in_execute_path"
        }
      ]
    }
- Re-run with --emit-vex bomdrift.openvex.json to produce a publishable exploitability statement that downstream consumers can ingest with their own --vex flag.
License policy
bomdrift can enforce a license allow/deny policy on every newly added or
version-changed component. Distinct from the License changed finding
(which detects same-version license drift), this is “the configured
policy says this license isn’t allowed.”
Configuration
In .bomdrift.toml:
[license]
allow = ["MIT", "Apache-2.0", "BSD-3-Clause", "ISC"]
deny = ["GPL-3.0-only", "AGPL-*"]
allow_ambiguous = false
Or via CLI flags (override the config block when set, matching the GitHub Dependency Review Action flag names exactly):
bomdrift diff before.json after.json \
--allow-licenses MIT,Apache-2.0,BSD-3-Clause \
--deny-licenses 'GPL-3.0-only,AGPL-*'
Both flags accept comma-separated values and may be repeated.
Matching rules (v0.8 — fail-closed)
| Input | With allow_ambiguous=false | With allow_ambiguous=true |
|---|---|---|
| Atomic license on allow | permit | permit |
| Atomic license on deny | deny | deny |
| Atomic license matching *-suffix glob in deny (AGPL-* ↔ AGPL-3.0-only) | deny | deny |
| Atomic license not on allow (when allow is non-empty) | not-allowed | not-allowed |
| Compound expression (MIT OR GPL-3.0) | ambiguous | permit |
| NOASSERTION / OTHER / empty | ambiguous | permit |
Deny wins when a license matches both allow and deny.
Compound SPDX expression evaluation ((MIT OR Apache-2.0) against
allow={Apache-2.0} resolves to permit) lands in v0.9 via the spdx
crate. v0.8 fails closed on every compound expression unless
allow_ambiguous=true is set explicitly.
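The v0.8 table can be read as a short decision function. The sketch below is an interpretation of that table, not bomdrift's code: the Verdict enum and function names are hypothetical, compound detection is a simple scan for AND / OR tokens, and globbing is applied to the deny list only, since that is all the table shows for v0.8.

```rust
#[derive(Debug, PartialEq)]
enum Verdict {
    Permit,
    Deny,
    NotAllowed,
    Ambiguous,
}

/// `*`-suffix glob: `AGPL-*` matches any identifier starting with `AGPL-`.
fn glob_match(pattern: &str, license: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => license.starts_with(prefix),
        None => pattern == license,
    }
}

/// Compound expressions and placeholder values cannot be matched atomically.
fn is_ambiguous(license: &str) -> bool {
    let trimmed = license.trim();
    trimmed.is_empty()
        || trimmed == "NOASSERTION"
        || trimmed == "OTHER"
        || trimmed.split_whitespace().any(|token| token == "AND" || token == "OR")
}

fn check(license: &str, allow: &[&str], deny: &[&str], allow_ambiguous: bool) -> Verdict {
    if is_ambiguous(license) {
        // Fail closed unless the operator opted in explicitly.
        return if allow_ambiguous { Verdict::Permit } else { Verdict::Ambiguous };
    }
    // Deny is checked first, so it wins when a license is on both lists.
    if deny.iter().any(|pattern| glob_match(pattern, license)) {
        return Verdict::Deny;
    }
    // An empty allow list means "anything not denied".
    if !allow.is_empty() && !allow.iter().any(|entry| *entry == license) {
        return Verdict::NotAllowed;
    }
    Verdict::Permit
}
```

Checking deny before allow is what implements "deny wins" without any extra bookkeeping.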
Threshold gating
bomdrift diff before.json after.json --fail-on license-violation
Exits 2 when any violation is present. --fail-on any also includes
license violations.
Output
- Markdown: new “License violations” section before “License changed”, with ecosystem / name / version / license / matched-rule columns.
- Terminal: [LIC] tag + matched rule per finding.
- JSON: enrichment.license_violations top-level array.
- SARIF: bomdrift.license-violation rule + per-finding result with stable partialFingerprints.primaryHash/v1. See SARIF + Code Scanning.
Suppression
License violations honor the standard --baseline machinery via the
v0.5 suppressed_advisories field. Use a fully-qualified license
identifier (or the SPDX expression as written by the SBOM) as the
suppression key. The v0.8 expires + reason fields work the same
way.
SPDX expression evaluation (v0.9+)
bomdrift evaluates each license string as a full SPDX expression via
the spdx crate. Evaluation outcomes:
| Expression | Allow | Deny | Outcome |
|---|---|---|---|
| MIT | [MIT] | (none) | Permitted (allow exact match) |
| (MIT OR Apache-2.0) | [MIT] | (none) | Permitted (one branch allowed) |
| (MIT AND GPL-3.0-only) | [MIT] | [GPL-3.0-only] | Violation (deny wins) |
| (GPL-3.0-only OR MIT) AND BSD-3-Clause | [MIT, BSD-3-Clause] | [GPL-3.0-only] | Violation (denial path could resolve to GPL) |
| Apache-2.0 WITH LLVM-exception | [Apache-2.0] | (none) | Permitted (base license allowed; exception identity is currently informational only) |
| Custom (non-SPDX) | [MIT] | (none) | Falls back to atomic match → not in allow list |
| NOASSERTION / OTHER / empty | [MIT] | (none) | Ambiguous → violation (fail-closed) |
Precedence
- Deny wins: any required atomic on the deny list (including any OR-branch) trips a violation, because the resolved license could be the denied alternative.
- Glob: * suffix patterns work in both lists (e.g. AGPL-* matches every AGPL-*-only family member).
- Allow: when the allow list is non-empty, the SPDX expression must evaluate to true under a closure that returns true for allow-listed atomics.
- Non-SPDX strings: fall through to the v0.8 atomic-string matcher so vendor-specific license strings keep working.
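The atomic-level part of these rules (glob matching, deny before allow) can be sketched in a few lines of Python. This is an illustration only, not bomdrift's Rust implementation: the function names are hypothetical, matching is assumed to be case-sensitive, and fnmatchcase accepts more glob syntax than the * suffix patterns described above.

```python
from fnmatch import fnmatchcase


def matches_any(license_id, patterns):
    """True when license_id matches at least one exact or glob pattern."""
    return any(fnmatchcase(license_id, pattern) for pattern in patterns)


def classify_atomic(license_id, allow, deny):
    """Classify one atomic license identifier; the deny list is checked first."""
    if matches_any(license_id, deny):
        return "denied"
    # An empty allow list means "no allow-list restriction".
    if allow and not matches_any(license_id, allow):
        return "not-allowed"
    return "permitted"
```

Because deny is checked first, an identifier that appears on both lists resolves to "denied", which mirrors the "Deny wins" rule.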
Deprecated: allow_ambiguous
The v0.8 allow_ambiguous flag flipped fail-closed behavior on
compound expressions. v0.9’s evaluator handles compounds correctly,
so the flag is now a no-op when SPDX parsing succeeds. A one-time
deprecation warning is printed to stderr per run when the flag is
set. The flag still works on the fallback path (non-SPDX strings) for
back-compat; it will be removed in v1.0.
WITH (exception) granularity
Per-exception allow/deny is configured with
--allow-exception / --deny-exception
(or [license] allow_exceptions / deny_exceptions in .bomdrift.toml).
When either list is non-empty, the right-hand side of every WITH
clause is evaluated against it: Apache-2.0 WITH LLVM-exception is
permitted iff Apache-2.0 passes the base policy AND LLVM-exception
is on the allow list (or absent from a non-empty deny list). Empty
exception lists preserve v0.9 behavior — exceptions are informational
only.
Compound-expression inheritance (v0.9.7)
v0.9.7 refines how exception decisions propagate through compound expressions. The rules:
- AND inherits: (X WITH ex) AND (Y) denies if either sub-clause would deny on its own. A denied exception in any conjunct denies the whole expression: every required atomic must be satisfiable, so a poisoned WITH clause poisons the conjunction.
- OR does not poison: (X WITH ex_a) OR (X WITH ex_b) is permitted when at least one branch is permitted. A denied exception on one branch doesn’t sink the expression as long as another branch resolves cleanly.
- Bare exception lookup: WITH <exception> without an allow/deny exception list configured falls through to v0.9 behavior (informational; the base license alone gates).
- Deny still wins atomically: a base license on the deny list denies regardless of the exception attached.
Worked examples
Assume [license] allow = ["Apache-2.0", "MIT"],
allow_exceptions = ["LLVM-exception"],
deny_exceptions = ["Classpath-exception-2.0"].
| Expression | Resolution | Why |
|---|---|---|
Apache-2.0 WITH LLVM-exception | permit | base allowed, exception allowed |
Apache-2.0 WITH Classpath-exception-2.0 | deny | exception on deny list |
Apache-2.0 WITH Some-other-exception | deny | base allowed, but exception not on the non-empty allow list |
(Apache-2.0 WITH LLVM-exception) AND BSD-3-Clause | deny | AND inherits — BSD-3-Clause not on allow list, denies the conjunction even though the WITH half is fine |
(Apache-2.0 WITH LLVM-exception) AND MIT | permit | both conjuncts pass independently |
(Apache-2.0 WITH Classpath-exception-2.0) AND MIT | deny | denied exception poisons the AND |
(Apache-2.0 WITH Classpath-exception-2.0) OR (Apache-2.0 WITH LLVM-exception) | permit | OR doesn’t poison — the LLVM branch resolves cleanly |
(Apache-2.0 WITH Classpath-exception-2.0) OR (GPL-3.0-only) | deny | both branches denied (one by exception, one by missing-from-allow) |
The runtime evaluator constructs a closure over the allow / deny
exception sets and lets the spdx crate’s expression-evaluation walk
the tree; the rules above describe the closure’s per-leaf decision.
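To make the inheritance rules concrete, here is a minimal Python sketch that reproduces the worked-examples table. It is not the spdx-crate-based evaluator: expressions are hand-built tuples rather than parsed strings, glob patterns are ignored, and the names Policy, leaf_permitted, permitted, and resolve are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    allow: frozenset = frozenset()
    deny: frozenset = frozenset()
    allow_exceptions: frozenset = frozenset()
    deny_exceptions: frozenset = frozenset()


def leaf_permitted(policy, license_id, exception=None):
    """Per-leaf decision for `license_id [WITH exception]`."""
    if license_id in policy.deny:
        return False
    if policy.allow and license_id not in policy.allow:
        return False
    if exception is None:
        return True
    # No exception lists configured: the exception is informational only.
    if not policy.allow_exceptions and not policy.deny_exceptions:
        return True
    if exception in policy.deny_exceptions:
        return False
    if policy.allow_exceptions and exception not in policy.allow_exceptions:
        return False
    return True


def permitted(policy, node):
    """node is ("lic", id[, exception]), ("and", ...) or ("or", ...)."""
    kind = node[0]
    if kind == "lic":
        return leaf_permitted(policy, *node[1:])
    if kind == "and":  # AND inherits: every conjunct must pass
        return all(permitted(policy, child) for child in node[1:])
    if kind == "or":  # OR does not poison: one clean branch is enough
        return any(permitted(policy, child) for child in node[1:])
    raise ValueError(f"unknown node kind: {kind!r}")


def base_licenses(node):
    """Yield every base license identifier mentioned in the expression."""
    if node[0] == "lic":
        yield node[1]
    else:
        for child in node[1:]:
            yield from base_licenses(child)


def resolve(policy, node):
    # Deny wins atomically: a denied base license anywhere trips a violation.
    if any(lic in policy.deny for lic in base_licenses(node)):
        return "deny"
    return "permit" if permitted(policy, node) else "deny"
```

With the policy from the worked examples, resolve(policy, ("or", ("lic", "Apache-2.0", "Classpath-exception-2.0"), ("lic", "Apache-2.0", "LLVM-exception"))) resolves to permit, while swapping "or" for "and" resolves to deny.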
OCI artifact attestation
bomdrift can verify that the SBOMs it diffs were signed by your build system before any drift signal is computed. This closes the “who produced this SBOM?” gap: you already trust the binary you shipped through SLSA-style signing — the SBOM that describes that binary’s supply chain deserves the same scrutiny.
Shipped in v0.9.6. The verification path is opt-in per flag;
existing file-based diffs (bomdrift diff before.json after.json)
are unaffected unless you explicitly pass attestation flags.
Overview
An OCI attestation is a signed in-toto envelope, stored next to a
container image in an OCI registry, that asserts a claim about that
image. bomdrift consumes attestations whose predicate type is
cyclonedx: the predicate body is a CycloneDX SBOM, which bomdrift
then diffs against another (also-attested) SBOM.
bomdrift does not ship a Sigstore client. It shells out to
cosign, which handles:
- in-toto envelope signature verification,
- certificate-chain validation against Fulcio,
- transparency-log inclusion proof (Rekor),
- certificate-identity matching against your supplied regex/issuer.
bomdrift trusts cosign’s verdict. If cosign exits 0, bomdrift parses the verified predicate and feeds it to the diff core. If cosign exits non-zero, bomdrift surfaces the cosign stderr verbatim and exits 1.
Threat model gap NOT addressed
bomdrift does not implement Sigstore protocol verification itself.
You are trusting cosign’s implementation, the cosign binary on
PATH, and whichever Sigstore instance cosign is configured against
(public-good by default; see Self-managed Sigstore).
Prerequisites
- Install cosign. Follow https://docs.sigstore.dev/system_config/installation/. v0.9.6 was developed and tested against cosign 2.x. Pin to a specific cosign version in your CI image so signature-verification semantics don’t drift across runs.
- Push your SBOMs as cyclonedx attestations on the same OCI reference as the binary they describe (see next section).
Generating attestations
The canonical guide is the Sigstore docs; this section is a sketch.
# Produce the SBOM however you do today (Syft, etc.).
syft <oci-ref> -o cyclonedx-json > sbom.cdx.json
# Sign it as an attestation against the same digest.
cosign attest \
--predicate sbom.cdx.json \
--type cyclonedx \
ghcr.io/myorg/myapp@sha256:abc...
The --type cyclonedx flag is the predicate-type matcher bomdrift
filters on. Other predicate types (SPDX, SLSA provenance, custom)
are ignored — see What’s NOT in v0.9.6.
Verifying with bomdrift
Pass an OCI reference instead of a local file path via the attestation flags:
bomdrift diff \
--before-attestation oci://ghcr.io/myorg/myapp@sha256:abc... \
--after-attestation oci://ghcr.io/myorg/myapp@sha256:def... \
--cosign-identity '^https://github.com/myorg/.+@refs/tags/v.+$' \
--cosign-issuer https://token.actions.githubusercontent.com
--before-attestation <OCI-REF>
OCI reference (with oci:// scheme) of the “before” image whose
attached cyclonedx attestation is the “before” SBOM. Mutually
exclusive with the positional <BEFORE> argument; pass one or the
other.
--after-attestation <OCI-REF>
Same as above, for the “after” SBOM.
--cosign-identity <REGEX>
Required when any --*-attestation flag is set. RE2-syntax regex
that the certificate’s Subject Alternative Name (SAN) must
match. For GitHub Actions OIDC, this is typically the workflow URL
plus a refs constraint, e.g.
^https://github.com/myorg/myapp/.github/workflows/release\.yml@refs/tags/v.+$.
bomdrift passes this to cosign as --certificate-identity-regexp.
--cosign-issuer <URL>
Required when any --*-attestation flag is set. The OIDC issuer
that minted the signing certificate. For GitHub Actions, this is
https://token.actions.githubusercontent.com.
bomdrift passes this to cosign as --certificate-oidc-issuer.
--require-attestation
Hard-mode flag. When set:
- Both --before-attestation and --after-attestation must be provided.
- Positional <BEFORE> and <AFTER> file arguments are rejected (clap conflict).
- Any cosign verification failure exits 1; there is no fallback to unverified file inputs.
Use this on the production-CI gate that blocks releases. In dev
loops where you sometimes diff a local file against a published
attestation, leave --require-attestation off and let the operator
mix file inputs with attestation inputs.
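A --cosign-identity regex is easy to get subtly wrong, so it can help to test it against sample SAN strings before wiring it into CI. The sketch below uses Python's re module, which is not RE2, but the two agree for simple patterns like this one (no backreferences or lookaround). The helper name and the sample SANs are illustrative.

```python
import re

# The example pattern from the --cosign-identity description above.
IDENTITY_REGEX = (
    r"^https://github.com/myorg/myapp/.github/workflows/"
    r"release\.yml@refs/tags/v.+$"
)


def identity_matches(pattern, san):
    """True when the certificate SAN satisfies the identity regex."""
    return re.search(pattern, san) is not None
```

Note that an unescaped . matches any character, so this pattern would also accept a host such as githubXcom. Escaping the literal dots (github\.com, \.github) tightens the match.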
What bomdrift trusts
The trust boundaries, made explicit:
- bomdrift trusts cosign to verify the in-toto envelope’s signature, certificate chain, and Rekor inclusion proof.
- bomdrift trusts cosign to enforce the certificate identity regex and OIDC issuer match.
- bomdrift does not independently re-verify the Sigstore transparency log. That is cosign verify-attestation’s job.
- bomdrift assumes the predicate-type filter (--type=cyclonedx) is honored by cosign. It is, but the assumption is documented here so future cosign behavior changes are visible to auditors.
- bomdrift parses the verified predicate as CycloneDX JSON. Anything cosign hands back that doesn’t parse as CycloneDX exits bomdrift with a parse error.
Self-managed Sigstore instances
If you run your own Sigstore stack (private Fulcio + Rekor), cosign honors the standard Sigstore env vars:
| Variable | Purpose |
|---|---|
COSIGN_REKOR_URL / SIGSTORE_REKOR_URL | Override the public-good Rekor instance. |
COSIGN_FULCIO_URL / SIGSTORE_FULCIO_URL | Override Fulcio. |
COSIGN_OIDC_ISSUER | Override the default OIDC issuer probed during signing. |
SIGSTORE_ROOT_FILE | Pin a custom Sigstore TUF root for verification. |
bomdrift inherits the parent process environment when shelling out
to cosign, so exporting these before invoking bomdrift diff is
sufficient. No bomdrift-side flags are needed.
export SIGSTORE_REKOR_URL=https://rekor.internal.example.com
bomdrift diff --before-attestation ... --after-attestation ... ...
Air-gapped / self-hosted Sigstore
Regulated environments — finance, defense, healthcare on-prem, government
cloud — frequently can’t reach the public-good Sigstore instance
(rekor.sigstore.dev, fulcio.sigstore.dev, tuf-repo-cdn.sigstore.dev).
The org runs its own Sigstore stack inside the trust boundary, with its
own TUF root, Fulcio CA, and Rekor transparency log. bomdrift supports
this without any bomdrift-side configuration: the attestation module
shells out to cosign and does not scrub or modify the calling
environment, so every Sigstore env var cosign respects flows through
unchanged.
Environment variables
| Variable | Purpose |
|---|---|
SIGSTORE_REKOR_URL / COSIGN_REKOR_URL | Transparency-log endpoint (your private Rekor). |
SIGSTORE_FULCIO_URL / COSIGN_FULCIO_URL | Short-lived cert issuer (your private Fulcio). |
SIGSTORE_OIDC_ISSUER / COSIGN_OIDC_ISSUER | OIDC issuer used by the keyless flow. In a true air-gap you’ll likely use key-based attestations instead — see below. |
SIGSTORE_ROOT_FILE | Path to a custom Sigstore TUF root JSON (root.json). |
TUF_ROOT | Directory containing TUF metadata (root + targets). |
COSIGN_REPOSITORY | Alternate cosign-data registry, when attestations are stored separately from the artifact’s registry. |
bomdrift forwards the unchanged process environment to every cosign invocation, so exporting the variables on the workflow / shell that invokes bomdrift is enough — no bomdrift flag is needed.
Worked example: GitHub Actions against a private Sigstore
- uses: Metbcy/bomdrift@v1
  with:
    before-attestation: oci://registry.internal.example/myapp@sha256:abc...
    after-attestation: oci://registry.internal.example/myapp@sha256:def...
    cosign-identity: '^https://github.example.internal/.+$'
    cosign-issuer: https://oidc.internal.example
    require-attestation: 'true'
  env:
    SIGSTORE_REKOR_URL: https://internal-rekor.example
    COSIGN_FULCIO_URL: https://internal-fulcio.example
    SIGSTORE_OIDC_ISSUER: https://oidc.internal.example
    TUF_ROOT: ${{ github.workspace }}/.sigstore/tuf
    SIGSTORE_ROOT_FILE: ${{ github.workspace }}/.sigstore/tuf/root.json
The action’s composite step inherits this env: block, propagates it to
the bomdrift binary, and bomdrift propagates it again to cosign. No
input on the action surface is needed for any of these — they are
cosign’s own contract.
Key-based (non-keyless) attestations
In a true air-gap, the OIDC keyless flow may not be reachable: there’s no public-good Fulcio CA to mint short-lived certificates, and your internal OIDC issuer may not be wired up to your internal Fulcio yet. cosign’s fallback is key-based attestation:
cosign attest --key cosign.key --predicate sbom.cdx.json \
--type cyclonedx registry.internal.example/myapp@sha256:abc...
For verification, cosign auto-detects a cosign.pub in the working
directory or honors the COSIGN_PUBLIC_KEY env var. bomdrift’s current
--cosign-identity / --cosign-issuer flags target the keyless flow;
for the key-based flow, leave them empty (or pass identity values that
match how cosign records key-based attestations) and rely on env-var
passthrough:
export COSIGN_PUBLIC_KEY=$PWD/cosign.pub
bomdrift diff \
--before-attestation oci://registry.internal.example/myapp@sha256:abc... \
--after-attestation oci://registry.internal.example/myapp@sha256:def...
cosign reads COSIGN_PUBLIC_KEY directly when no certificate-identity
flags are present. bomdrift forwards the env unchanged, so no
bomdrift-side configuration is required.
Troubleshooting checklist
When verification fails in an air-gapped setup, walk this list:
- “Error: updating local metadata and targets”: TUF can’t reach the configured TUF repo. Verify TUF_ROOT points at a directory pre-populated with your org’s TUF metadata, and that SIGSTORE_ROOT_FILE references a valid root.json.
- “Error: getting Rekor public keys”: Rekor URL is unreachable from the runner. Run curl -v "$SIGSTORE_REKOR_URL/api/v1/log/publicKey" from the same runner identity to confirm network reachability.
- “x509: certificate signed by unknown authority”: your private Fulcio’s intermediate CA isn’t in the system trust store. Either install it on the runner image, or set SSL_CERT_FILE to a bundle that includes it.
- “Error: no matching signatures” with key-based attestations: cosign found the attestation but the public key didn’t match. Confirm COSIGN_PUBLIC_KEY resolves to the same key that signed the attestation, and that no --cosign-identity / --cosign-issuer values are present (those force the keyless code path).
- “Error: dial tcp: lookup rekor.sigstore.dev”: cosign fell back to the public-good defaults because one of the SIGSTORE_* env vars wasn’t actually exported into bomdrift’s process. On GitHub Actions, double-check the env: block lives on the same step as the action (or a parent jobs.<id>.env: block), not on a different step.
- Verification works locally but not in CI: the runner image lacks cosign, or cosign was installed but PATH isn’t propagated to the composite-action subshell. The verify-signatures: true codepath already installs cosign for release signature verification; reuse that install or pin a known cosign version explicitly.
The air-gapped path uses cosign’s own contract, so any deeper diagnosis
is a cosign problem, not a bomdrift problem. Reproduce with cosign verify-attestation --type cyclonedx ... directly, with the same env
vars exported, before opening a bomdrift issue.
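Several checklist items can be caught before cosign ever runs. The following Python sketch is a hypothetical preflight helper, not part of bomdrift: it only inspects an environment mapping and the local filesystem, using the variable names from the table above.

```python
import os
import shutil


def preflight(env):
    """Return human-readable problems found in env (a str -> str mapping)."""
    problems = []
    if shutil.which("cosign", path=env.get("PATH", "")) is None:
        problems.append("cosign not found on PATH")
    tuf_root = env.get("TUF_ROOT")
    if tuf_root and not os.path.isdir(tuf_root):
        problems.append(f"TUF_ROOT is not a directory: {tuf_root}")
    root_file = env.get("SIGSTORE_ROOT_FILE")
    if root_file and not os.path.isfile(root_file):
        problems.append(f"SIGSTORE_ROOT_FILE does not exist: {root_file}")
    if not (env.get("SIGSTORE_REKOR_URL") or env.get("COSIGN_REKOR_URL")):
        problems.append("no Rekor URL set; cosign will use public-good defaults")
    return problems
```

Calling preflight(dict(os.environ)) in the same step that invokes bomdrift shows what the child process will actually inherit.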
Troubleshooting
executable file not found in $PATH: cosign
bomdrift couldn’t find cosign on PATH. Install per
Prerequisites, or set PATH so the cosign binary
is reachable from the bomdrift process.
Error: no matching signatures
The cosign verification rejected every attached signature. Most
common cause: --cosign-identity regex doesn’t match the actual
certificate SAN. Debug with cosign directly first:
cosign verify-attestation \
--type cyclonedx \
--certificate-identity-regexp '<your-regex>' \
--certificate-oidc-issuer '<your-issuer>' \
ghcr.io/myorg/myapp@sha256:abc...
If cosign fails the same way when run directly, you’ve isolated the problem outside bomdrift, and cosign’s own output is usually more revealing.
predicate type mismatch / no attestations of the requested type
The OCI reference has attestations, but none of type cyclonedx.
bomdrift only consumes CycloneDX SBOM attestations in v0.9.6 — see
the next section.
Error: parsing CycloneDX: ...
cosign verified the envelope but bomdrift couldn’t parse the
predicate body as CycloneDX. Inspect the raw predicate by running
the cosign command above with -o json and look at
payload.predicate.
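For a quick look at the predicate, the envelope can be decoded with a few lines of Python. This sketch assumes cosign's JSON output is a DSSE-style envelope whose payload field is a base64-encoded in-toto statement; check this against the output of your cosign version. The function names are hypothetical.

```python
import base64
import json


def extract_predicate(envelope_json):
    """Decode one envelope and return (predicateType, predicate)."""
    envelope = json.loads(envelope_json)
    statement = json.loads(base64.b64decode(envelope["payload"]))
    return statement.get("predicateType"), statement.get("predicate")


def looks_like_cyclonedx(predicate):
    """Cheap sanity check: CycloneDX JSON documents declare bomFormat."""
    return isinstance(predicate, dict) and predicate.get("bomFormat") == "CycloneDX"
```

If looks_like_cyclonedx returns False for the decoded predicate, the attestation was most likely generated from a non-CycloneDX file or wrapped in another structure.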
What’s NOT in v0.9.6
- SPDX SBOM attestations. Only CycloneDX. SPDX-attestation support is a future ask; file an issue if you need it. The predicate parser is the only piece that needs to grow.
- Direct Rekor verification. Deferred to cosign. bomdrift will not grow a Sigstore client implementation.
- Air-gapped Sigstore. Documented as a first-class flow via cosign-respected env-var passthrough; see Air-gapped / self-hosted Sigstore.
- In-process attestation (no shell-out). Pulling in a full-fat Sigstore Rust SDK contradicts the OSS-first / small-dep-tree design constraint. Revisit once a minimal, audited Rust Sigstore client exists.
Related
- Plugins — for verifying additional org-specific signals on attested SBOMs.
- Output formats — verified diffs render identically to file-based diffs.
- Roadmap — for the broader v0.9.6 dispositions.
Plugins
bomdrift’s enricher set is intentionally curated — typosquats, maintainer age, registry metadata, OSV/EPSS/KEV. Org-specific signals (banned packages, license-tier policies, internal package allowlists) don’t belong in the binary, but they need a first-class extension point. v0.9.6 ships that extension point as external-process plugins.
Overview
A plugin is an executable on the filesystem (any language, any shape of dependencies) that reads a JSON envelope from stdin and writes a JSON envelope to stdout. bomdrift invokes it once per matching component during a diff. Findings the plugin emits are merged into bomdrift’s output across every render path: terminal, markdown, JSON, SARIF.
Plugins are not a sandbox. They run as your CI user with the same filesystem and network access bomdrift itself has. Treat plugin source the same way you’d treat any external CI script.
Why external-process and not WASM
The original v0.4 sketch on the roadmap floated WASM. v0.9.6 deliberately picks shell-out instead:
- Smaller dep tree. No wasmtime / wasmer pulled into the bomdrift binary. The dep-tree audit is a real OSS-first constraint.
- Any language. Plugin authors can use Bash, Python, Go, Rust, or anything else that produces an executable. WASM would force a per-language toolchain.
- Sandboxing is the user’s environment. CI runners already isolate per-job. Adding WASM-level sandboxing inside an already isolated container is duplicate effort for marginal value.
- Failure isolation is cheap. A child-process crash can’t take bomdrift down; we already get that for free from the OS.
WASM may be revisited in v1.0+ if a clear need materializes (in-browser diffing, multi-tenant CI without per-job isolation). For now, the shell-out model wins on simplicity and dep cost.
Manifest format
A plugin manifest is a TOML file pointed at by --plugin <path>.
The flag is repeatable — bomdrift loads each manifest in
declaration order and runs all matching plugins per component.
[plugin]
name = "my-plugin"
description = "What this plugin checks for"
exec = "./run.sh"
timeout_ms = 5000
invoke_on = ["added", "version-changed"]
Fields
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
name | string | yes | — | Unique within a single bomdrift run. Used in error messages and SARIF rule IDs. |
description | string | no | — | Free-form. Surfaced when bomdrift logs plugin failures. |
exec | string | yes | — | Path to the executable, resolved relative to the manifest directory. Use ./ prefix to make this explicit. Absolute paths are accepted. |
timeout_ms | integer | no | 5000 | Wall-clock timeout per invocation. After expiry the process is killed and the invocation’s findings are dropped. |
invoke_on | string list | yes | — | Subset of ["added", "version-changed"]. Future versions may add removed, license-changed, maintainer-changed. Unknown values are rejected at load time. |
exec must be marked executable on disk. bomdrift does not
auto-chmod +x; this would mask permission bugs.
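The load-time checks in the fields table can be sketched as follows. This is a hypothetical Python re-statement of the documented rules, not bomdrift's loader: it takes the already-parsed [plugin] table as a dict (for example from tomllib.loads on Python 3.11+) plus the manifest's directory.

```python
import os

KNOWN_EVENTS = {"added", "version-changed"}
DEFAULT_TIMEOUT_MS = 5000


def validate_manifest(plugin, manifest_dir):
    """Validate a parsed [plugin] table; raise ValueError on the first problem."""
    for field in ("name", "exec", "invoke_on"):
        if field not in plugin:
            raise ValueError(f"missing required field: {field}")
    unknown = set(plugin["invoke_on"]) - KNOWN_EVENTS
    if unknown:
        raise ValueError(f"unknown invoke_on values: {sorted(unknown)}")
    # Relative paths resolve against the manifest directory; absolute paths win.
    exec_path = os.path.join(manifest_dir, plugin["exec"])
    if not os.path.isfile(exec_path):
        raise ValueError(f"exec not found: {exec_path}")
    if not os.access(exec_path, os.X_OK):
        raise ValueError(f"exec is not executable: {exec_path}")
    return {
        "name": plugin["name"],
        "description": plugin.get("description"),
        "exec": exec_path,
        "timeout_ms": plugin.get("timeout_ms", DEFAULT_TIMEOUT_MS),
        "invoke_on": list(plugin["invoke_on"]),
    }
```

Running a check like this in a pre-commit hook surfaces a missing chmod +x before the manifest reaches CI.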
Protocol — stdin/stdout JSON shape
bomdrift writes one JSON object on the plugin’s stdin, closes stdin, and reads exactly one JSON object from stdout (parsing the last complete JSON object on stdout — earlier output is treated as plugin log noise and discarded silently, but plugins shouldn’t rely on this). The plugin should write its findings JSON and exit promptly.
Stdin
{
"component": {
"purl": "pkg:npm/foo@1.2.3",
"name": "foo",
"version": "1.2.3",
"licenses": ["MIT"]
},
"event": "added",
"before": null
}
- component: the after component. Always present.
- event: "added" or "version-changed". Matches the manifest’s invoke_on filter.
- before: null for added, the before component (same shape as component) for version-changed.
Unknown fields may appear in future bomdrift versions. Plugins must ignore unknown fields on stdin and not assume the input shape is closed.
Stdout
Exactly one JSON object, preferably on a single line (a trailing newline is fine; multi-line pretty-printed JSON is also accepted as long as it’s a single value):
{
"findings": [
{
"kind": "your-finding-tag",
"message": "human-readable description",
"severity": "info",
"rule_id": "stable.id.for.this.kind"
}
]
}
| Field | Type | Required | Notes |
|---|---|---|---|
kind | string | yes | Free-text tag. Surfaced in the markdown/terminal renderers as the finding category. Keep it short and stable. |
message | string | yes | One-line human-readable description. |
severity | string | yes | One of "info", "warning", "error". Maps to SARIF level as note / warning / error. |
rule_id | string | yes | Stable identifier for this class of finding. Used in SARIF partialFingerprints; should be the same across runs for the same logical finding so dedup works. |
An empty findings array is the no-match path:
{"findings": []}
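Putting the stdin and stdout shapes together, a plugin can be as small as the Python sketch below. The banned prefix, kind, and rule_id values are hypothetical placeholders. The script defines main() without calling it, so a real plugin file would end with a main() call and carry the executable bit.

```python
import json
import sys

# Hypothetical purl prefixes an organization wants to flag.
BANNED_PREFIXES = ("pkg:npm/banned-example@",)


def evaluate(envelope):
    """Map one stdin envelope to the stdout findings object."""
    # Read only the fields we need; unknown fields are ignored by design.
    purl = envelope.get("component", {}).get("purl", "")
    findings = []
    for prefix in BANNED_PREFIXES:
        if purl.startswith(prefix):
            findings.append({
                "kind": "banned-package",
                "message": f"{purl} is on the banned-packages list",
                "severity": "error",
                "rule_id": "org.banned-packages",
            })
    return {"findings": findings}


def main():
    envelope = json.load(sys.stdin)
    json.dump(evaluate(envelope), sys.stdout)
    sys.stdout.write("\n")
```

Keeping evaluate separate from the I/O in main makes the plugin easy to unit-test without spawning a process.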
SARIF mapping
All plugin findings render under a single SARIF rule:
bomdrift.plugin. The plugin’s rule_id is threaded into the
SARIF result’s partialFingerprints so that GitHub Code Scanning
and similar consumers can dedup runs of the same finding.
Failure semantics
Plugins are best-effort. Their runtime failures never fail the bomdrift diff; the one exception is a manifest whose exec is missing, which is rejected at load time:
| Failure mode | bomdrift response |
|---|---|
| Plugin exits non-zero | Drop findings from this invocation. Log warning if BOMDRIFT_DEBUG=1. |
Wall-clock timeout (timeout_ms) | Kill the process. Drop findings. Log warning if BOMDRIFT_DEBUG=1. |
| Stdout is not parseable JSON | Drop findings. Log warning if BOMDRIFT_DEBUG=1. |
Stdout JSON is missing findings | Drop findings. Log warning if BOMDRIFT_DEBUG=1. |
findings[i].severity is unknown | Drop that finding. Other findings in the same invocation pass through. |
| Plugin exec is missing on disk | Manifest load fails fast (before any diff work). Exit 1. |
The contract: the rest of the bomdrift report still renders. A bad
plugin is a noisy plugin, not a broken pipeline. Run with
BOMDRIFT_DEBUG=1 while authoring a plugin to see why findings are
being dropped.
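The drop-on-failure behavior in the table can be approximated from the host side as below. This is a Python sketch of the semantics, not bomdrift's Rust code. It simplifies in one respect: it parses the whole of stdout as a single JSON value rather than picking the last complete object. The name run_plugin is hypothetical.

```python
import json
import subprocess

VALID_SEVERITIES = {"info", "warning", "error"}


def run_plugin(argv, envelope, timeout_ms=5000):
    """Invoke one plugin and return its usable findings ([] on any failure)."""
    try:
        proc = subprocess.run(
            argv,
            input=json.dumps(envelope),
            capture_output=True,
            text=True,
            timeout=timeout_ms / 1000,  # run() kills the child on expiry
        )
    except subprocess.TimeoutExpired:
        return []
    if proc.returncode != 0:
        return []
    try:
        findings = json.loads(proc.stdout)["findings"]
    except (ValueError, KeyError, TypeError):
        return []  # unparseable stdout or missing "findings"
    if not isinstance(findings, list):
        return []
    # Unknown severities drop only that finding; the rest pass through.
    return [
        f for f in findings
        if isinstance(f, dict) and f.get("severity") in VALID_SEVERITIES
    ]
```

Every failure path returns an empty list instead of raising, which is what keeps a misbehaving plugin from breaking the surrounding report.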
Windows note
On Windows, Child::kill() has known quirks where killed
processes may leave orphan grandchildren. bomdrift kills the direct
child cleanly; if your plugin spawns sub-processes, ensure it
forwards the timeout signal itself. Plugin timeouts on Windows are
best-effort in v0.9.6.
Worked example: banned-packages
The reference implementation lives in
examples/plugins/banned-packages/:
examples/plugins/banned-packages/
├── README.md # how to adapt for your org
├── plugin.toml # the manifest below
├── check-banned.sh # bash + jq implementation
└── banned.txt # purl prefixes to flag
plugin.toml:
[plugin]
name = "banned-packages"
description = "Flag dependencies on the org-maintained banned-packages list"
exec = "./check-banned.sh"
timeout_ms = 5000
invoke_on = ["added", "version-changed"]
Invocation:
bomdrift diff before.cdx.json after.cdx.json \
--plugin examples/plugins/banned-packages/plugin.toml
See the example’s README for adaptation guidance, performance characteristics, and security notes.
Performance
bomdrift invokes plugins sequentially, once per matching
component. With N Added/VersionChanged components and P
plugins, you’ll see N × P invocations. Implications:
- Process-startup cost matters. A bash plugin that forks jq ten times costs ~30 ms of fork + interpreter warmup per call. At N = 200, P = 3 that’s ~18 s of pure startup overhead. Compile to a static Go/Rust binary if hot-path performance matters.
- Tune timeout_ms. The default (5000) is generous for pure-CPU plugins; a plugin that hits a network endpoint per component might need 30000. A plugin that’s intermittently slow ruins your CI cycle time; consider sampling inside the plugin (return early for components that don’t match its scope).
- No parallelism in v0.9.6. Concurrent plugin execution is on the table for v1.0 if a meaningful workload demands it. File an issue with timing data if you hit this.
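The arithmetic behind these numbers is simple enough to script when budgeting CI time. The helper names below are illustrative; the worst-case bound assumes every sequential invocation runs all the way to its timeout.

```python
def startup_overhead_seconds(components, plugins, per_call_ms):
    """Pure startup cost of N x P sequential invocations, in seconds."""
    return components * plugins * per_call_ms / 1000


def worst_case_seconds(components, plugins, timeout_ms):
    """Upper bound when every invocation hits its wall-clock timeout."""
    return components * plugins * timeout_ms / 1000


print(startup_overhead_seconds(200, 3, 30))  # 18.0
print(worst_case_seconds(200, 3, 5000))      # 3000.0
```

The second figure (50 minutes at the default timeout) is why a plugin that is slow on every component is worth fixing before raising timeout_ms.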
Security
bomdrift does not sandbox plugins:
- Plugins run as the bomdrift parent’s user.
- Plugins inherit the parent’s environment (including secret-bearing env vars like GITHUB_TOKEN, NPM_TOKEN, etc.).
- Plugins inherit the parent’s filesystem and network access.
- Plugins can spawn arbitrary sub-processes.
Treat plugin source like any external CI script:
- Vet what you ship. Read the plugin source, including any binary dependencies it pulls in.
- Pin to a commit / tag. Don’t curl ... | bash an always-latest plugin executable.
- Minimize the env. If a plugin doesn’t need a secret, don’t let it inherit one. env -i bomdrift diff ... strips the environment; manually re-export only what bomdrift itself needs.
- Mirror internally. For high-trust pipelines, vendor the plugin into your own repo or internal artifact store rather than pulling from a public registry on every CI run.
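When bomdrift is launched from a wrapper script rather than a shell, the same env -i idea can be expressed as an allowlist. A minimal Python sketch follows; the helper names and the default allowlist are assumptions, so extend keep with whatever bomdrift and cosign genuinely need in your setup.

```python
import os
import subprocess


def minimal_env(keep=("PATH", "HOME")):
    """Copy only allowlisted variables from the current environment."""
    return {name: os.environ[name] for name in keep if name in os.environ}


def run_with_minimal_env(argv, keep=("PATH", "HOME")):
    """Run argv with a stripped environment; plugins inherit only `keep`."""
    return subprocess.run(argv, env=minimal_env(keep), check=False)
```

For example, run_with_minimal_env(["bomdrift", "diff", "before.json", "after.json"]) keeps GITHUB_TOKEN and NPM_TOKEN out of every plugin's environment.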
Stability promise
The plugin protocol’s stdin/stdout JSON shape is best-effort stable in v0.9.6:
- We may add fields to the stdin envelope in a future minor release. Plugins must ignore unknown fields.
- We will not remove or rename documented stdin or stdout fields without a major version bump.
- The stdout findings schema is the public contract; treat kind, message, severity, rule_id as semver-stable.
- The TOML manifest schema may grow new optional fields; existing fields stay.
If the protocol needs a breaking change for v1.0, a deprecation
window with a protocol_version envelope field will land at least
one minor release before the break.
CI integration
A typical GitHub Actions job that wires in a plugin:
jobs:
  bomdrift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Make sure jq is available if your plugin needs it.
      - run: sudo apt-get install -y jq
      - uses: Metbcy/bomdrift@v1
        with:
          before-sbom: before.cdx.json
          after-sbom: after.cdx.json
          extra-args: --plugin examples/plugins/banned-packages/plugin.toml
For multiple plugins, repeat --plugin in extra-args:
extra-args: >-
  --plugin .bomdrift/plugins/banned-packages/plugin.toml
  --plugin .bomdrift/plugins/license-tier/plugin.toml
Related
- examples/plugins/banned-packages/: worked reference.
- SARIF + Code Scanning: how bomdrift.plugin findings appear in Code Scanning.
- Roadmap: design rationale for shipping plugins in v0.9.6.
Release signing
Every bomdrift release archive is signed with cosign keyless via Sigstore + GitHub OIDC. This means:
- The signing key is not stored in the repo or in any GitHub Secret.
- Each signature is bound to the GitHub Actions workflow run that produced it, with the OIDC issuer (token.actions.githubusercontent.com) acting as the identity provider.
- The signing transparency log is the public Sigstore Rekor instance.
Verifying a release manually
VERSION=v0.9.6
TARGET=x86_64-unknown-linux-gnu
ARCHIVE=bomdrift-${VERSION}-${TARGET}.tar.gz
# Download the archive + signature + certificate
BASE="https://github.com/Metbcy/bomdrift/releases/download/${VERSION}"
curl -fsSL -O "${BASE}/${ARCHIVE}"
curl -fsSL -O "${BASE}/${ARCHIVE}.sig"
curl -fsSL -O "${BASE}/${ARCHIVE}.pem"
# Verify
cosign verify-blob \
--certificate-identity "https://github.com/Metbcy/bomdrift/.github/workflows/release.yml@refs/tags/${VERSION}" \
--certificate-oidc-issuer https://token.actions.githubusercontent.com \
--certificate "${ARCHIVE}.pem" \
--signature "${ARCHIVE}.sig" \
"${ARCHIVE}"
A successful verification prints Verified OK. Anything else means the
archive has been tampered with (or the certificate’s identity doesn’t
match the expected workflow run — same outcome, do not trust).
What the certificate identity proves
The --certificate-identity argument pins the verification to the
exact workflow file that produced the signature, including the tag
ref. As long as release.yml is the only workflow that ever signs
bomdrift archives (it is, and the file is reviewed in PRs), an
attacker who can’t push to the bomdrift repo can’t produce a verifiable
signature.
The --certificate-oidc-issuer pins to GitHub’s OIDC issuer.
Substituting a different IDP-backed signature wouldn’t pass.
Action-side verification (default)
The Metbcy/bomdrift action calls cosign verify-blob automatically
on every download (when verify-signatures: true, the default). When
verification fails, the action exits non-zero before running
bomdrift, so a tampered binary never executes.
To skip verification (saves ~15s by also skipping the cosign-installer step), set:
- uses: Metbcy/bomdrift@v1
  with:
    before-sbom: before.json
    after-sbom: after.json
    verify-signatures: false
This is appropriate when:
- You’re running on self-hosted runners with a hardened image you control.
- You’ve pre-pinned the bomdrift archive in your Nexus/Artifactory mirror and verified its signature once at mirror time.
- You’re running in a network-restricted environment where the public Sigstore endpoints aren’t reachable.
When verify-signatures: true and cosign isn’t installed (or the
.sig / .pem aren’t on the release), the action fails loudly
rather than silently degrading — that’s the whole point of the
explicit opt-out.
Why keyless?
The traditional alternative is a long-lived signing key stored as a GitHub Secret. That’s:
- A single credential that, if leaked, lets an attacker sign forever.
- A rotation problem — every key rotation breaks all consumers who pinned the verifying public key.
- An audit gap — there’s no public record of who signed what when.
Keyless cosign moves the trust to the GitHub OIDC issuer + the Sigstore Rekor transparency log: every signature has a public, queryable record of the exact GitHub Actions workflow run that produced it, and the signing certificate is short-lived (10 minutes).
SHA-256 checksums
In addition to cosign, every archive ships with a .sha256 file for
old-school checksum verification:
curl -fsSL -O "${BASE}/${ARCHIVE}.sha256"
sha256sum -c "${ARCHIVE}.sha256" # GNU
shasum -a 256 -c "${ARCHIVE}.sha256" # macOS
Checksums alone don’t authenticate the archive (an attacker who can
modify the .tar.gz can also modify the .sha256); cosign is the
authoritative verification path. The checksums exist for older toolchains
and for quick local-rerun checks.
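Where neither sha256sum nor shasum is available, the same check is a few lines of Python. This sketch assumes the .sha256 file uses the sha256sum layout (hex digest as the first whitespace-separated token); the function names are illustrative.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Stream the file so large archives don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(archive_path, checksum_path):
    """Compare the archive's digest with the first token of the .sha256 file."""
    with open(checksum_path, encoding="utf-8") as fh:
        expected = fh.read().split()[0].lower()
    return sha256_of(archive_path) == expected
```

As with the shell commands, a passing checksum only shows the download is intact; it does not authenticate the archive.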
SLSA build provenance (v0.9.9+)
In addition to the cosign-keyless signature on each archive, the release pipeline produces a SLSA build provenance attestation covering both the per-target archives and the multi-arch ghcr.io image. The two are complementary, not redundant:
- cosign proves “the bomdrift maintainer’s GitHub OIDC identity signed this artifact.” It binds the artifact to the human (or workflow run) holding the signing identity at sign time.
- SLSA provenance proves “this artifact was produced by the public release.yml workflow on tag v0.9.9 in this repo, against this commit SHA.” It binds the artifact to the build itself, including the source ref, the workflow file, and the ephemeral runner identity.
Both verifications must pass for the release to be trustworthy. An
attacker who compromised the maintainer’s signing identity (cosign
verifies) but couldn’t push to Metbcy/bomdrift (SLSA fails) would
trip SLSA. Conversely, an attacker who pushed a malicious workflow
to a fork (SLSA verifies for the fork) wouldn’t have the
maintainer’s OIDC identity (cosign fails).
Verifying SLSA provenance — gh (recommended)
The simplest path uses gh, which calls into the SLSA verifier with
the right defaults for GitHub-hosted attestations:
VERSION=v0.9.9
TARGET=x86_64-unknown-linux-gnu
ARCHIVE=bomdrift-${VERSION}-${TARGET}.tar.gz
BASE="https://github.com/Metbcy/bomdrift/releases/download/${VERSION}"
curl -fsSL -O "${BASE}/${ARCHIVE}"
gh attestation verify --owner Metbcy "${ARCHIVE}"
A successful verification prints
Loaded ... attestation(s) ... verified. Pin the source ref by
adding --source-ref refs/tags/${VERSION} if you want to reject
attestations from other tags.
Verifying SLSA provenance — slsa-verifier
For air-gapped or non-GitHub environments where gh isn’t
available:
slsa-verifier verify-artifact \
--provenance-path "${ARCHIVE}.intoto.jsonl" \
--source-uri github.com/Metbcy/bomdrift \
--source-tag ${VERSION} \
"${ARCHIVE}"
The .intoto.jsonl file is downloaded automatically by gh attestation download, or you can fetch it directly from the
release’s attestation manifest at
https://github.com/Metbcy/bomdrift/attestations.
Verifying the ghcr.io image attestation
The multi-arch image carries an inline attestation (pushed by the
build job’s push-to-registry: true):
gh attestation verify --owner Metbcy oci://ghcr.io/metbcy/bomdrift:${VERSION}
Architecture
bomdrift is a single-binary Rust CLI with three logical layers: parse, diff, enrich + render. Every layer is pure (no shared mutable state) so the same input produces byte-identical output every time — the upsert contract.
Module layout
src/
├── main.rs — clap entry point; dispatches to lib::run
├── lib.rs — top-level wiring: load_sbom -> diff -> enrich -> render
├── cli.rs — clap derive types: DiffArgs, RefreshArgs, FailOn, etc.
├── config.rs — `.bomdrift.toml` policy (de)serialization + merge
├── clock.rs — single source of truth for "now" (honors SOURCE_DATE_EPOCH)
├── attestation.rs — `cosign verify-attestation` shell-out (v0.9.6)
├── plugin.rs — external-process plugin loader (v0.9.6)
├── vex.rs — VEX consume (OpenVEX 0.2.0, CycloneDX VEX 1.6) + emit (OpenVEX)
├── baseline.rs — `--baseline` snapshot suppression + `expires`/`reason`/`vex_status`
├── refresh.rs — `bomdrift refresh-typosquat` subcommand
├── model/ — unified component / SBOM types
│ ├── component.rs — Component, Ecosystem, Hash, Relationship
│ └── sbom.rs — Sbom, SbomFormat
├── parse/ — format-specific parsers
│ ├── cyclonedx.rs — CDX 1.5/1.6 JSON
│ ├── spdx.rs — SPDX 2.3 JSON
│ └── syft.rs — Syft JSON
├── diff/ — pair-by-version ChangeSet computation
│ ├── mod.rs — diff(), ChangeSet
│ └── key.rs — ComponentKey (purl-without-version | (eco, name))
├── enrich/ — risk-signal enrichers
│ ├── osv.rs — OSV.dev /v1/querybatch + /v1/vulns/{id}
│ ├── epss.rs — FIRST.org EPSS per-CVE scores (v0.8)
│ ├── kev.rs — CISA KEV catalog (v0.8)
│ ├── registry.rs — npm / PyPI / crates.io metadata (v0.9)
│ ├── license.rs — SPDX expression evaluation + allow/deny + per-exception (v0.8 / v0.9 / v0.9.5)
│ ├── typosquat.rs — Jaro-Winkler + suffix boost / Levenshtein / last-segment / package-portion
│ ├── version_jump.rs — major-delta >= 2 heuristic
│ ├── maintainer.rs — GitHub REST contributor-age (the xz pattern)
│ ├── cache.rs — single source of truth for CACHE_TTL_SECS (v0.9.6 unified)
│ └── mod.rs — Enrichment graph aggregating findings
└── render/ — output formatters
├── markdown.rs — GFM PR-comment body
├── term.rs — TTY-aware ANSI
├── json.rs — pretty-printed serde graph
└── sarif.rs — SARIF v2.1.0 with stable rule IDs + partialFingerprints
The pipeline
OSV.dev /querybatch + /vulns/{id}
|
v
SBOM file --[parse::*]--> Sbom --+ /Enrichment\
| | - vulns | -- typosquat (pure)
SBOM file --[parse::*]--> Sbom --+--+ - typosq's | -- version_jump (pure)
| | - jumps | -- maintainer (GitHub API)
v | - main_age |
ChangeSet --------/
|
v
(--baseline applies here, suppresses findings)
|
v
render::*
|
v
markdown / term / json / sarif
parse layer
Each parser is hand-rolled (~150 LOC). We deliberately avoid the
cyclonedx-bom and spdx-rs crates: their dep trees are heavy
relative to the parsing surface we actually use, and the SBOM JSON
shapes are stable enough that hand-rolling is low maintenance.
The unified model::Component
carries:
- name, version, ecosystem (parsed from purl when available, fallback to the source SBOM’s hint)
- purl (Option<String>), bom_ref (Option<String>)
- licenses: Vec<String> (canonicalized to SPDX expressions when possible)
- hashes: Vec<Hash>, supplier: Option<String>, source_url: Option<String>, relationship
SbomFormat::auto_detect looks at top-level JSON fields to dispatch:
bomFormat: "CycloneDX" → CDX, spdxVersion: "..." → SPDX, schema: {name: "Syft"} → Syft. --format <FORMAT> overrides detection.
diff layer
The diff core groups components by ComponentKey and computes per-key:
B = group_by_key(before.components)
A = group_by_key(after.components)
for K in keys(B) ∪ keys(A):
versions in A[K] \ B[K] → ChangeSet::added
versions in B[K] \ A[K] → ChangeSet::removed
versions in A[K] ∩ B[K] with differing licenses → ChangeSet::license_changed
legacy single-version case (|B[K]| = |A[K]| = 1, versions differ)
→ ChangeSet::version_changed (folds in license-changes-with-version-bumps)
ComponentKey is Purl(string-without-version) when the component
has a parseable purl, else NameTuple(Ecosystem, name). This is what
makes cross-format diffs work: a CDX SBOM diffed against an SPDX SBOM
of the same project keys consistently across the two formats.
The BTreeMap-based grouping is what gives the diff its byte-deterministic
ordering. No timestamps leak in, no insertion-order leakage. The
is_deterministic integration test guards the contract.
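The grouping and set-difference steps above can be sketched with the standard library alone. This is a minimal illustration, not bomdrift's actual code: the Component, ComponentKey, and ChangeSet types below are simplified, hypothetical stand-ins, the purl version strip is deliberately naive, and the version_changed / license_changed buckets are omitted.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, simplified stand-in for the real key type.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub enum ComponentKey {
    /// purl with the trailing `@version` part removed.
    Purl(String),
    /// Fallback when no purl is available: (ecosystem, name).
    NameTuple(String, String),
}

/// Hypothetical, simplified component record.
#[derive(Debug, Clone)]
pub struct Component {
    pub ecosystem: String,
    pub name: String,
    pub version: String,
    pub purl: Option<String>,
}

/// Only the added/removed buckets are modeled in this sketch.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct ChangeSet {
    pub added: Vec<(ComponentKey, String)>,
    pub removed: Vec<(ComponentKey, String)>,
}

pub fn key_of(c: &Component) -> ComponentKey {
    match &c.purl {
        // Naive strip (assumption): keep everything before the last '@'.
        Some(p) if p.starts_with("pkg:") => {
            let base = p.rsplit_once('@').map_or(p.as_str(), |(head, _)| head);
            ComponentKey::Purl(base.to_string())
        }
        _ => ComponentKey::NameTuple(c.ecosystem.clone(), c.name.clone()),
    }
}

fn group_by_key(components: &[Component]) -> BTreeMap<ComponentKey, BTreeSet<String>> {
    let mut groups: BTreeMap<ComponentKey, BTreeSet<String>> = BTreeMap::new();
    for c in components {
        groups.entry(key_of(c)).or_default().insert(c.version.clone());
    }
    groups
}

pub fn diff(before: &[Component], after: &[Component]) -> ChangeSet {
    let b = group_by_key(before);
    let a = group_by_key(after);
    let empty = BTreeSet::new();
    let mut cs = ChangeSet::default();
    // Union of keys, visited in sorted order so the output is deterministic.
    let keys: BTreeSet<&ComponentKey> = b.keys().chain(a.keys()).collect();
    for k in keys {
        let bv = b.get(k).unwrap_or(&empty);
        let av = a.get(k).unwrap_or(&empty);
        for v in av.difference(bv) {
            cs.added.push((k.clone(), v.clone()));
        }
        for v in bv.difference(av) {
            cs.removed.push((k.clone(), v.clone()));
        }
    }
    cs
}
```

Because both the key map and the per-key version sets are ordered collections, the resulting Vecs come out in the same order on every run regardless of the order components appear in the input.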
enrich layer
Enrichers are independent. Each takes a &ChangeSet, returns its
specific finding type (Vec<TyposquatFinding>,
Vec<VersionJumpFinding>, etc.), and the lib’s run_diff aggregates
them into a single Enrichment graph.
Best-effort contract:
- Per-request timeout (15s).
- Errors warn once, never block.
- Per-component caching within a single run.
The OSV enricher is the only one that touches a persistent on-disk
cache (<XDG_CACHE_HOME>/bomdrift/osv/). All other enrichers are
either pure-compute or only cache within a single process.
render layer
Renderers are pure functions: (ChangeSet, Enrichment) → String. The
markdown renderer is the canonical “PR comment” path; terminal is the
TTY default; JSON is the downstream-tooling shape; SARIF is for Code
Scanning ingestion.
Determinism is the upsert contract:
- Enrichment::vulns is a HashMap (the OSV enricher fills it via unordered batch responses). Renderers that emit it (markdown, JSON, SARIF) sort the keys before emission.
- Enrichment::typosquats / version_jumps / maintainer_age are Vecs populated in cs.added / cs.version_changed iteration order — which is BTreeMap-derived, so stable.
- ChangeSet::added / removed / version_changed / license_changed are Vecs populated in BTreeMap<ComponentKey, ...> iteration order.
Result: identical inputs render to byte-identical output every time,
which is what peter-evans/create-or-update-comment relies on for the
upsert behavior in the action.
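As a rough illustration of the "sort the keys before emission" rule, the sketch below renders a HashMap in a stable order. The function name, the map shape (component to advisory IDs), and the line format are assumptions made for the example, not bomdrift's real renderer.

```rust
use std::collections::HashMap;

/// Render a vulnerability map (component -> advisory IDs) in a stable order.
/// Hypothetical output format; only the sorting discipline is the point.
pub fn render_vulns(vulns: &HashMap<String, Vec<String>>) -> String {
    // HashMap iteration order is unspecified, so sort the keys first.
    let mut keys: Vec<&String> = vulns.keys().collect();
    keys.sort();
    let mut out = String::new();
    for key in keys {
        // Advisory IDs may also arrive in arbitrary order; sort a copy.
        let mut ids = vulns[key].clone();
        ids.sort();
        out.push_str(&format!("[CVE] {}: {}\n", key, ids.join(", ")));
    }
    out
}
```

Two maps built with different insertion orders produce the same string, which is the property an in-place comment upsert depends on.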
Best-effort enricher contract
Every enricher — network (OSV / EPSS / KEV / GitHub / registries), shell-out (cosign attestation), or external process (plugins) — honors the same fail-soft contract:
- Per-request timeout so a misbehaving upstream can’t hang a CI job.
- Errors warn once to stderr (deduped by key) and the diff renders without that source’s findings.
- Per-component caching within a single run so monorepo subpackages sharing a parent project don’t multiply HTTP requests.
- Best-effort never blocks the diff render. Exit code stays 0 from the enricher itself; the only way an enricher influences exit code is indirectly via --fail-on thresholds tripping on findings it produced.
src/enrich/osv.rs is the canonical pattern; new enrichers MUST mirror
its Result<Vec<Finding>>-where-Err-is-warned-not-propagated shape.
The attestation.rs and plugin.rs modules apply the same contract to
non-network shell-outs: a missing cosign binary, a plugin timeout, or
a malformed plugin response all warn and continue.
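The fail-soft shape can be sketched as follows. This is a simplified illustration under assumptions: the Finding struct, the enrich signature, and the injected fetch closure are hypothetical, and warnings are collected into a Vec here (instead of being written to stderr) so the behavior is easy to assert in a unit test.

```rust
use std::collections::BTreeSet;

/// Hypothetical finding type for the sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct Finding {
    pub component: String,
    pub detail: String,
}

/// Hypothetical enricher: `fetch` is injected so tests can simulate failures.
/// Errors are recorded once per distinct message and never propagated.
pub fn enrich<F>(components: &[&str], fetch: F, warnings: &mut Vec<String>) -> Vec<Finding>
where
    F: Fn(&str) -> Result<String, String>,
{
    let mut warned: BTreeSet<String> = BTreeSet::new();
    let mut findings = Vec::new();
    for &component in components {
        match fetch(component) {
            Ok(detail) => findings.push(Finding {
                component: component.to_string(),
                detail,
            }),
            Err(err) => {
                // Deduplicate by error key, then keep going with the next component.
                if warned.insert(err.clone()) {
                    warnings.push(format!("warning: enricher skipped: {err}"));
                }
            }
        }
    }
    findings
}
```

With a fetcher that always fails, the function returns an empty Vec and exactly one warning, so the caller can still render the diff.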
Byte-determinism contract
Identical inputs MUST render to byte-identical outputs across every
format. This is what peter-evans/create-or-update-comment relies on
to upsert a PR comment in place rather than accumulating duplicates,
and what makes SARIF / VEX / JSON safe to commit to git.
Concretely:
- All HashMaps emitted into output are sorted by key first.
- All Vecs populated from cs.added / version_changed iteration inherit the diff core’s BTreeMap-derived order.
- Every “now” reference goes through clock::now(), which honors SOURCE_DATE_EPOCH for reproducible-build contexts and for tests.
- VEX @id UUIDs and CycloneDX VEX bom-ref strings are deterministic hashes of the finding tuple, never random.
Tests that mutate SOURCE_DATE_EPOCH MUST acquire clock::test_env_lock()
to serialize across the crate’s parallel test threads — a v0.9.5
discovery during the release/v0.9.5 cleanup. See
Contributing for the recipe.
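A minimal sketch of a clock that honors SOURCE_DATE_EPOCH might look like the following. The function names and the fallback behavior for an unparseable value are assumptions for illustration; the real clock.rs may differ. Splitting the pure decision from the environment read keeps most tests free of env-var mutation.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Pure core: prefer a parseable SOURCE_DATE_EPOCH value, else the wall clock.
/// Falling back on an unparseable value is an assumption of this sketch.
pub fn resolve_now(source_date_epoch: Option<&str>, wall_clock_secs: u64) -> u64 {
    source_date_epoch
        .and_then(|raw| raw.trim().parse::<u64>().ok())
        .unwrap_or(wall_clock_secs)
}

/// Thin wrapper that reads the environment and the system clock.
pub fn now() -> u64 {
    let wall = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    let env = std::env::var("SOURCE_DATE_EPOCH").ok();
    resolve_now(env.as_deref(), wall)
}
```

Only tests that exercise the now() wrapper need to touch the process environment; resolve_now can be tested with plain arguments.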
Why no async / tokio?
bomdrift is intentionally synchronous. The single-binary CLI runs to completion in seconds; concurrent network requests would shave maybe 1–2 seconds off the OSV enricher path on diffs with > 100 unique CVEs, at the cost of:
- ~70 transitive crates (tokio, mio, futures, …).
- A panic-on-blocking-call class of bug that’s a constant trap for contributors.
- A bigger, slower-to-build, slower-to-link binary.
The OSV /v1/querybatch endpoint already batches (1000 queries per
request), so the parallelism we’d want is mostly already there. The
N+1 stage-2 /v1/vulns/{id} calls are gated by the on-disk severity
cache, which makes reruns within the configured TTL essentially free.
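To make the two-stage shape concrete, here is a small sketch of request batching and cache gating. The batch size comes from the "1000 queries per request" figure above; the function names, the cache representation (advisory ID to fetch timestamp), and the staleness rule are assumptions for illustration only.

```rust
use std::collections::HashMap;

/// Batch size taken from the "1000 queries per request" figure in the text.
pub const BATCH_SIZE: usize = 1000;

/// Split queries into request-sized batches, preserving input order.
pub fn batches<T>(queries: &[T]) -> Vec<&[T]> {
    queries.chunks(BATCH_SIZE).collect()
}

/// Return the advisory IDs that still need a stage-2 detail fetch:
/// those with no cache entry, or an entry at least `ttl_secs` old.
pub fn ids_needing_fetch<'a>(
    ids: &[&'a str],
    cache_fetched_at: &HashMap<String, u64>,
    now_secs: u64,
    ttl_secs: u64,
) -> Vec<&'a str> {
    ids.iter()
        .copied()
        .filter(|id| match cache_fetched_at.get(*id) {
            Some(&fetched_at) => now_secs.saturating_sub(fetched_at) >= ttl_secs,
            None => true,
        })
        .collect()
}
```

On a rerun inside the TTL window, ids_needing_fetch returns an empty list, so the per-ID calls are skipped entirely.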
Plugin processes (v0.9.6+) are also invoked synchronously: at most one external child at a time, with a per-component timeout. Parallel plugin execution would re-introduce the tokio dependency cost without solving a measured bottleneck.
Why no chrono / no semver / no octocrab?
Same reasoning. We need:
- One ISO-8601 timestamp shape (the canonical YYYY-MM-DDTHH:MM:SSZ GitHub always emits). Hand-rolled parser is ~25 LOC, lives in clock.rs.
- The major version of a SemVer string. Hand-rolled extractor is ~5 LOC in enrich/version_jump.rs.
- GitHub REST: a small set of endpoints (contributors, commits) hand-rolled atop ureq. octocrab would pull in tokio.
All three pulls would add transitive weight for no functional gain. The constraint is documented at the top of each affected file so future contributors don’t reflexively reach for the popular crate.
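For a sense of scale, a hand-rolled major-version extractor can be this small. The sketch below is an assumption about the general shape, not a copy of enrich/version_jump.rs; in particular, treating downgrades as "no jump" and taking the delta as a parameter are choices made for the example.

```rust
/// Extract the leading major version number, tolerating a `v` prefix.
/// Returns None when the string does not start with ASCII digits.
pub fn extract_major(version: &str) -> Option<u64> {
    let s = version.strip_prefix('v').unwrap_or(version);
    // ASCII digits are one byte each, so `end` is always a char boundary.
    let end = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
    if end == 0 {
        return None;
    }
    s[..end].parse().ok()
}

/// True when the major version increased by at least `delta`.
/// Downgrades and unparseable versions are treated as "no jump" (assumption).
pub fn is_multi_major_jump(before: &str, after: &str, delta: u64) -> bool {
    match (extract_major(before), extract_major(after)) {
        (Some(b), Some(a)) => a.saturating_sub(b) >= delta,
        _ => false,
    }
}
```

Anything beyond the major component (minor, patch, pre-release, build metadata) is ignored, which is why a full SemVer parser is not needed for this heuristic.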
Approved dependencies
As of v0.9.6:
| Crate | Purpose | Notes |
|---|---|---|
| clap | CLI parsing | derive feature only |
| serde, serde_json | (de)serialization | parse + render |
| anyhow, thiserror | error types | |
| ureq | HTTP | sync, rustls — no tokio |
| strsim | typosquat scoring | Jaro-Winkler + Levenshtein |
| owo-colors, supports-color | terminal renderer | |
| directories | XDG paths | |
| toml | .bomdrift.toml parsing | |
| time = "0.3.47" | timestamp formatting | minimal feature set |
| sha2 = "0.10" | partialFingerprint hashes (SARIF), VEX @id | |
| spdx = "=0.10.9" | exact-pinned SPDX expression evaluation | License-policy semantics shift on minor list updates; pin exactly |
| base64 = "0.22" | OCI attestation payload decoding (v0.9.6) | |
| wait-timeout = "0.2" | bounded plugin-process wait on Windows (v0.9.7) | sidesteps Command::kill()’s Windows quirks; tiny dep, no transitive weight |
Forbidden by policy: tokio, chrono, semver, octocrab,
async-trait, anything pulling rustls + ring + tokio transitively
beyond what ureq already brings.
Binary size budget
- Target: ≤ 5 MB stripped + LTO on Linux x86_64.
- Current (v0.9.6): ~3.4 MB.
- Audit: run cargo bloat --release --crates -n 20 periodically to confirm no unexpected dep-tree growth.
Contributing
Thanks for considering a contribution! bomdrift is intentionally small and the contribution loop is fast.
Looking for somewhere to start?
Issues labeled good first issue
are scoped for first-time contributors:
- Add a name to one of the typosquat top-N lists (data/<eco>-top*.txt — see the comment header in any of those files).
- Fix a doc typo (mdBook in docs/, README, or any module-level //! comment).
- Improve an error message (bomdrift’s anyhow chains can usually be more specific about what failed).
- Refresh a curated typosquat list from its upstream source (snapshot date is in the file header).
For larger changes (a new enricher, a new ecosystem, an output-format addition), open a discussion or issue first so we can talk through the design before you sink time into a PR.
Development loop
git clone https://github.com/Metbcy/bomdrift
cd bomdrift
cargo check --all-targets # fast feedback while editing
cargo test --release # full test suite (~420 tests as of v0.9.6)
rustup run 1.88 cargo clippy --all-targets --all-features -- -D warnings
cargo fmt --all -- --check # MUST pass; run `cargo fmt --all` to fix
Rust 1.88+ required (the project uses edition 2024; CI is pinned to
1.88 to keep clippy lints stable across releases — see
Cargo.toml’s rust-version field).
Project conventions
Commits
- feat(scope): add X — new feature
- fix(scope): Y — bug fix
- docs(scope): Z — documentation only
- chore: W — maintenance with no behavioral change
Commit bodies should explain why, not what — git diff shows the
what. Multi-line commit messages are fine; use the heredoc
git commit -m "$(cat <<'EOF' ... EOF)" pattern for readability.
Commit signing on main
main enforces required_signatures via the repository ruleset.
This does NOT mean PR contributors need GPG/SSH signing keys
configured. Here’s how it actually shakes out:
| You’re a… | Do you need to sign? |
|---|---|
| Contributor opening a PR from a fork or feature branch | No. Push commits as-is. The maintainer chooses the merge method. |
| Maintainer merging via gh pr merge --merge | No. GitHub’s web-UI key signs the merge commit; it counts as verified. |
| Maintainer merging via gh pr merge --squash | No. Same — GitHub signs the squash commit. |
| Maintainer merging via gh pr merge --rebase | Yes. Rebase replays your PR commits verbatim onto main, so they must already be signed. |
| Anyone pushing directly to main | Yes (and the ruleset blocks it via pull_request anyway, so this only matters for emergency bypass). |
Practical rule of thumb for contributors: don’t worry about it. The maintainer will pick the right merge method.
If you’d like your commits to land verbatim on main for git-blame
attribution (and want to use rebase-merge), set up local signing once:
# SSH-key signing (simplest, no GPG headache)
git config --global gpg.format ssh
git config --global user.signingkey ~/.ssh/id_ed25519.pub
git config --global commit.gpgsign true
Then add the same SSH public key to your GitHub account under SSH and GPG keys → Signing keys.
Branch model
Single-purpose feature branches off main, merged via merge-commits
(git merge --no-ff) so the fan-out graph stays readable. Push the
feature branch alongside the merge to preserve the history visually
on the GitHub network graph.
No emojis in code or rendered output
Strictly bracketed-prefix everything ([ADD], [CVE], [SQT], etc.).
This is for terminal accessibility, grepability of CI logs, and to
keep the markdown PR comment readable in monospace fonts.
No Co-authored-by: <yourself> lines
The Co-authored-by trailer is reserved for collaborators who
genuinely co-authored the commit. The project’s CI tooling adds its
own trailer; don’t duplicate.
Where to put new code
| If you’re adding… | Put it in… |
|---|---|
| A new SBOM format parser | src/parse/<format>.rs + parse::SbomFormat::auto_detect |
| A new enricher | src/enrich/<name>.rs + add to Enrichment struct |
| A new output format | src/render/<format>.rs + OutputFormat clap enum |
| A new diff-core algorithm | src/diff/ (rare; please open an issue first) |
| A new typosquat ecosystem | data/<eco>-topN.txt + SupportedEcosystem enum |
| A new CLI flag | src/cli.rs + wire through lib.rs::run_diff |
| Documentation | docs/src/<chapter>.md + add to docs/src/SUMMARY.md |
Tests
Three layers, all run by cargo test --release:
- Unit tests (#[cfg(test)] mod tests inside each src/<module>.rs): test the smallest unit. Mock at the function-argument boundary (e.g. inject a fake fn fetcher(url) -> Result<Vec<u8>> for network enrichers).
- CLI tests (tests/cli.rs): spawn the actual bomdrift binary via CARGO_BIN_EXE_bomdrift and assert on stdout/stderr/exit code. These are end-to-end and slower; reserve them for user-visible surface (flags, output shape).
- Integration tests (tests/integration.rs): exercise the library API directly without spawning the binary. Faster than CLI tests, since they avoid spinning up the full process.
Network-touching enrichers should have a unit test for the network-
failure path (fake fetcher returns Err) — the best-effort contract
matters and silently breaking it would be an easy regression.
Coverage (v0.9.8+)
CI runs cargo llvm-cov on every PR and posts a sticky comment with
the overall line coverage %. The full lcov report is uploaded as the
coverage-lcov workflow artifact. The artifact name intentionally
avoids the standard lcov-output filename: email/feed renderers that
strip Markdown backticks autolink anything ending in a TLD, and that
filename’s extension resolves to a real, unrelated parked domain.
The job is informational for now; there is no --fail-under-lines
threshold yet. The plan is to add a ratchet in v0.9.9 once 2–3
releases have made the baseline visible. Until then, the report is a
nudge, not a gate: PRs that move coverage in the wrong direction
without justification will get a review comment, not a red check.
Test conventions (v0.9.5+)
Tests that mutate SOURCE_DATE_EPOCH (directly or indirectly via
bomdrift::clock::*) MUST acquire clock::test_env_lock() to serialize
across the crate’s parallel test threads. Without the lock, two tests
running in parallel can read each other’s mutated env var and
intermittently fail in ways that look format-deterministic but aren’t.
#[test]
fn baseline_expiry_relative_to_source_date_epoch() {
    let _lock = bomdrift::clock::test_env_lock();
    // SAFETY: serialized by _lock above.
    unsafe { std::env::set_var("SOURCE_DATE_EPOCH", "1735689600") }; // 2025-01-01
    // ... test body ...
}
The lock is a std::sync::Mutex<()>. It is not re-entrant, so acquire it
once per test and hold the guard for the whole body; a test that panics
while holding the guard poisons it. If you see “PoisonError” in CI but
not locally, a previous test panicked while holding the lock: fix the
panicking test, not the poison handling.
Adding a new enricher
The shortest viable PR shape, mirroring how enrich::epss was added in
v0.8 and enrich::registry in v0.9:
- src/enrich/<name>.rs — pure enrich(cs: &ChangeSet, ...) -> Vec<<Name>Finding> with a fail-soft fetcher boundary. Mirror the shape of src/enrich/osv.rs.
- Wire into Enrichment — add a field to the bomdrift::enrich::Enrichment struct in src/enrich/mod.rs; have lib.rs::run_diff populate it.
- Add a --no-<name> flag to src/cli.rs::DiffArgs, plumb through the [diff] no_<name> config key.
- Renderers — add a section to render::markdown, render::term, render::json. For SARIF, add a stable rule ID (bomdrift.<name>), a partialFingerprints.primaryHash/v1 identity tuple, and a fingerprint-stability test.
- --debug-calibration row — emit one <kind>|<key>|<score>|<threshold> line per finding considered.
- Docs — add docs/src/enrichers/<name>.md and link it from docs/src/SUMMARY.md and docs/src/enrichers/overview.md.
- CHANGELOG — ## [Unreleased] entry under ### Added.
Adding a new finding kind
When a new finding kind is purely a rendering layer (e.g., a new synthetic ID for VEX export or a new SARIF rule for an existing enricher), the recipe is shorter:
- Synthetic-id grammar — extend bomdrift::vex::SyntheticFindingKind and the parse_synthetic_id parser. Round-trip must be exact.
- SARIF rule — add the rule descriptor to render::sarif::ALL_RULES so it appears in tool.driver.rules even with zero results, then a partialFingerprints identity tuple for the new rule.
- Markdown / terminal / JSON sections — mirror the existing per-finding sections.
- Determinism test — round-trip the rendered SARIF / VEX through the parser and assert byte-for-byte equality with the input.
Documentation
When you add a CLI flag / action input / enricher, update:
- The relevant chapter in docs/src/.
- The CHANGELOG entry under ## [Unreleased].
- The README’s Features list (only for user-visible surface).
- Module doc comment explaining why (//! ... at the top of the file).
mdBook builds with cd docs && mdbook build. The output renders to
docs/book/; check that locally before pushing.
Reporting issues
For false positives / negatives in the heuristic enrichers (typosquat, version-jump, maintainer-age), the most useful issue includes:
- The component name + version that fired (or should have).
- The expected behavior + observed behavior.
- A minimal SBOM pair if possible (synthetic CDX 1.5 JSON works).
Open an issue at https://github.com/Metbcy/bomdrift/issues.
Security disclosures
For supply-chain bugs in bomdrift itself — particularly anything that could let bomdrift run untrusted input as code — please report privately via GitHub Security Advisories rather than a public issue.
Benchmarks
bomdrift uses criterion for benchmarking the four hot paths: parse, diff, typosquat, and render. Benchmarks are not run in CI (the variance on shared GitHub runners is ±20%, which buries any real signal); they’re a tool for validating perf-relevant changes on hardware you control.
Running
# Run all benchmarks
cargo bench
# Run just one harness
cargo bench --bench parse
cargo bench --bench diff
cargo bench --bench typosquat
cargo bench --bench render
# Filter inside a harness
cargo bench --bench typosquat -- npm_batch
Criterion writes an HTML report to target/criterion/report/index.html
on each run with throughput plots, distribution histograms, and
diff-against-previous-run charts.
What each harness measures
parse — SBOM parser layer
For each of the three fixture formats (CycloneDX, SPDX, Syft):
- json_value: cost of serde_json::from_str to a Value only. Captures the JSON-deserialization floor independent of bomdrift’s parser.
- full_pipeline: cost of from_str + parse_with_format to a normalized model::Sbom. The delta vs json_value is bomdrift’s parsing overhead.
A regression in the delta is the signal worth investigating — a
regression in json_value is a serde_json change.
diff — diff core
- axios_fixture_pair: realistic small-PR shape (~3 components per side). The lower bound for any diff invocation.
- synth_monorepo_200: 200 components per side, half of them version-changed. The realistic monorepo upper bound for a single PR.
- synth_self_diff_200: same input on both sides. Worst case for the BTreeMap-intersection path with no resulting work to do.
A regression on synth_monorepo_200 likely indicates a hot-loop
change in diff_one_key; a regression on synth_self_diff_200 likely
indicates a ComponentKey::Ord change.
typosquat — Jaro-Winkler scoring
- one_npm_typosquat_axios: a single candidate (plain-crypto-js) scored against the embedded npm top-1k list. The typosquat enricher’s per-candidate cost.
- npm_batch10/50/100: a batch of N candidates, exercising the per-candidate cost amortized.
- mixed_three_ecosystems: one candidate per ecosystem (npm + PyPI + Cargo), exercising the per-ecosystem dispatch and embedded-list load cost (after the OnceLock has been hit).
The first invocation of each ecosystem in a process pays the legit-list
parsing + canonicalization cost (~1ms for the npm 1k list); subsequent
invocations are hot. Criterion’s iter() measures the cached path.
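The "pay once, then hot" behavior described above is what std::sync::OnceLock provides. The sketch below is illustrative only: the inline list stands in for an embedded data file, and the '#' comment syntax and lowercasing step are assumptions about the list format and canonicalization.

```rust
use std::collections::BTreeSet;
use std::sync::OnceLock;

/// Stand-in for an embedded list; a real build might use include_str! on a data file.
const NPM_TOP: &str = "axios\ncrypto-js\nexpress\n# comment line\nreact\n";

static NPM_LEGIT: OnceLock<BTreeSet<String>> = OnceLock::new();

/// Parse and canonicalize the list on first use; later calls return the cached set.
pub fn npm_legit() -> &'static BTreeSet<String> {
    NPM_LEGIT.get_or_init(|| {
        NPM_TOP
            .lines()
            .map(str::trim)
            // Skipping blank lines and '#' comments is an assumed file format.
            .filter(|l| !l.is_empty() && !l.starts_with('#'))
            .map(|l| l.to_ascii_lowercase())
            .collect()
    })
}
```

A benchmark loop that calls npm_legit() repeatedly measures only the cached lookup after the first iteration, which matches the note that criterion's iter() sees the hot path.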
render — output renderers
For each of markdown / JSON / SARIF / terminal, with a synthetic ChangeSet shaped like a moderate PR (50 added / 20 removed / 30 version-changed / 5 license-changed + 10 typosquats + 15 CVEs).
A regression on one renderer specifically is usually a string-formatting change in that file. A regression across all renderers is usually a ChangeSet shape change that propagated.
Suggested workflow for perf-relevant PRs
- On a clean main, run cargo bench and let criterion record the baseline.
- Switch to your branch, make the change, and run cargo bench again.
- Criterion’s HTML report shows a “Change vs previous” column with confidence intervals. ±5% is noise on most hardware; ±10%+ is worth looking at; statistical significance markers (criterion’s “Performance has improved” / “Performance has regressed” lines) are the first-class signal.
- If the change is intentional (e.g. a feature that adds a new pass), note the new baseline in the PR description so reviewers know to compare against the post-change number, not the pre-change one.
Why no CI integration?
- Shared GitHub runners have ±20% variance run-over-run on these benchmarks. Real regressions are smaller than the noise floor.
- Self-hosted runners with pinned hardware would solve that, but the project doesn’t have that infrastructure (and the operational cost isn’t worth it at the project’s scale).
- For now, run benchmarks locally on a quiet machine; a future contributor can wire up a self-hosted bench runner if the project grows enough to justify it.
Property-based testing
bomdrift uses proptest for
property-based tests of the parser, diff core, typosquat canonicalization,
and version-jump extractor. Property tests run as part of cargo test
alongside the unit tests — there’s no separate harness.
What’s tested
Parser layer
Hypothesis: feeding arbitrary bytes through serde_json::from_slice followed by parse_with_format must NEVER panic. Errors are fine; panics are bugs.
Tests in src/parse/mod.rs::tests:
- parse_pipeline_does_not_panic_on_arbitrary_bytes — 1024 random byte sequences (0–2048 bytes each). Most error at JSON parse; the few valid-JSON-but-not-an-SBOM cases exercise the parser’s error paths.
- parse_pipeline_does_not_panic_on_arbitrary_json — 1024 random serde_json::Value trees up to depth 3. Far more efficient at exploring the parser’s behavior on well-formed-JSON-but-not-an-SBOM than random bytes.
- parse_pipeline_does_not_panic_with_format_hint — same as above but with each SbomFormat hint forced. Catches per-parser panics that auto-detect would have routed away from.
- ecosystem_from_purl_does_not_panic — arbitrary unicode strings through the purl-type extractor.
- hash_alg_does_not_panic — arbitrary algorithm strings through the hash-algorithm normalizer.
Typosquat canonicalization
Tests in src/enrich/typosquat.rs::tests:
- pep503_normalize_does_not_panic — arbitrary unicode through the PyPI normalizer. Output invariants asserted: lowercase only, no leading/trailing dashes.
- last_path_segment_returns_substring — arbitrary unicode through the Go/Composer match-form extractor. The result must be a substring of the input and must contain no /.
- enrich_does_not_panic_on_arbitrary_components — random ChangeSets with up to 32 added components of varying ecosystems must go through the full enrich() path without panicking.
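To show what those invariants constrain, here is a sketch of two canonicalizers with the same names as the tests. Both bodies are assumptions: the first follows the PEP 503 rule (lowercase, runs of '-', '_', '.' collapse to a single '-') and adds a dash trim to satisfy the stated "no leading/trailing dashes" invariant; the second simply takes the text after the last '/'.

```rust
/// PEP 503-style normalization: lowercase, and collapse runs of `-`, `_`, `.`
/// into one `-`. The final trim is an extra step assumed for this sketch.
pub fn pep503_normalize(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    let mut prev_dash = false;
    for ch in name.chars() {
        if matches!(ch, '-' | '_' | '.') {
            if !prev_dash {
                out.push('-');
            }
            prev_dash = true;
        } else {
            out.extend(ch.to_lowercase());
            prev_dash = false;
        }
    }
    out.trim_matches('-').to_string()
}

/// Return the text after the last `/` (the whole input when there is none).
/// The result borrows from the input, so it is a substring by construction.
pub fn last_path_segment(name: &str) -> &str {
    name.rsplit('/').next().unwrap_or(name)
}
```

Returning a borrowed slice from last_path_segment makes the "must be a substring of the input" property hold by construction, leaving the property test to guard against panics and stray separators.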
Diff core
Tests in src/diff/mod.rs::tests:
- diff_self_is_empty — for any Sbom, diff(a, a) produces an empty ChangeSet. The strongest invariant; catches parser non-determinism that other tests miss.
- diff_swap_roles_when_inputs_swapped — diff(a, b) and diff(b, a) swap added/removed cardinalities and preserve version_changed and license_changed cardinalities. Catches asymmetric bugs in the per-key dispatch.
- diff_is_deterministic — two calls on the same input produce byte-equal ChangeSet structures. The upsert contract for the PR-comment renderer relies on this.
Version-jump extractor
Tests in src/enrich/version_jump.rs::tests:
- extract_major_does_not_panic — arbitrary strings through extract_major().
- extract_major_round_trips_well_formed_numerics — for any major version 1..10000, the function round-trips the bare form, the v-prefixed form, and the pre-release suffix form.
- extract_major_handles_unicode_without_panic — arbitrary unicode prefix + a well-formed version number. The function should treat the prefix as garbage (return None) but never panic.
Why property-based, not cargo-fuzz?
- Stable Rust. proptest works on the stable toolchain; cargo-fuzz requires nightly via the libFuzzer LLVM coverage instrumentation.
- Runs as part of cargo test. No separate harness, no cross-build complexity, no CI configuration delta. Every PR runs the property tests automatically.
- Counterexample shrinking. When a property fails, proptest shrinks the failing input toward a minimal reproduction. The resulting test failure is much easier to debug than a 2KB random byte sequence from libFuzzer’s corpus.
The trade-off is corpus persistence — proptest doesn’t accumulate a crash corpus the way libFuzzer does. For a tool of bomdrift’s size that’s a fair trade; if the project grows to need long-running fuzz campaigns, a future contributor can wire up cargo-fuzz alongside proptest.
Running
# Run all tests including property tests
cargo test --release
# Just one property test
cargo test --release diff_self_is_empty
# Increase case count for thorough exploration
PROPTEST_CASES=10000 cargo test --release diff_self_is_empty
The default case counts (512–2048 per property) are calibrated so the
full test suite finishes in ~2 seconds. Bump PROPTEST_CASES for
deeper exploration on a release machine.
When a property test fails
- proptest prints a minimized counterexample. Copy it verbatim into a new unit test in the same module.
- Add a #[test] that exercises the counterexample directly. This becomes a regression guard; the property test’s randomness alone isn’t sufficient long-term coverage for a known-bad input.
- Fix the bug.
- Both the property test and the new unit test should now pass.
Real-world SBOM regression corpus
In addition to property tests, bomdrift ships a corpus of real-world
SBOMs in tests/fixtures/real-world/ (sourced from the official
CycloneDX and SPDX example repos). The regression tests in
tests/real_world.rs exercise:
- Every fixture parses without error.
- Every fixture has at least one component.
- Components with known purl types (pkg:npm/, pkg:pypi/, etc.) resolve to the canonical Ecosystem variant — not to Ecosystem::Other(_).
- Diffing two unrelated real-world SBOMs doesn’t panic.
- Self-diffing a real-world SBOM produces an empty ChangeSet.
- All four renderers produce non-empty output on a real diff.
The corpus is kept small (~2.7 MB total) so test runtime stays sub-second. Refresh it by re-fetching from upstream.
Roadmap
What’s planned, what’s deliberately out of scope, and what the acceptance criteria for new contributions look like.
Shipped (v0.9.9 — distribution)
The “distribution release.” No source-code feature work; every install path now works in one command.
- cargo install bomdrift — published to crates.io. Cargo metadata + [package.metadata.docs.rs] + an exclude list trimming the published crate to 220 KiB compressed. New publish-dry-run PR-time CI guard.
- docker run ghcr.io/metbcy/bomdrift:v0.9.9 — multi-arch (linux/amd64, linux/arm64) distroless image on every release. Tag matrix: vX.Y.Z, :vX.Y, :vX, :latest.
- SLSA build provenance on every release archive AND the ghcr.io image, via actions/attest-build-provenance@v2. Verify with gh attestation verify or slsa-verifier. Complementary to the existing cosign keyless signatures — see Release signing.
- Automated v1 major-tag retag — release.yml force-pushes the major-version tag (v1 today; v${major} once v1.0.0 ships) to point at the latest release on every tag.
- Manual recovery workflow — new rebuild-docker.yml lets a maintainer rebuild + push the docker image for any past tag without re-cutting the release. Reads Dockerfile from main so future fixes apply backwards.
- README + Marketplace polish — crates.io / docs.rs / Marketplace badges; rewritten Marketplace listing description leading with the axios narrative.
Shipped (v0.9.8 — code-review-driven hardening)
- Continuous parser fuzzing via cargo-fuzz against CycloneDX, SPDX, and Syft JSON parsers. PR-time short pass + weekly long scheduled run. See Continuous fuzzing.
- CI coverage report via cargo-llvm-cov with a sticky PR comment. Informational; --fail-under-lines will be added once coverage is visible across 2–3 releases.
- Production code audited for unwrap/expect/panic/todo/unimplemented. Crate-root clippy::* warns enforce going forward. Zero production .unwrap() remain; remaining .expect() sites carry rationale comments.
- All unsafe blocks documented with // SAFETY: comments, with clippy::undocumented_unsafe_blocks enforcing the contract.
- src/lib.rs 47 KB → 31 lines — run_diff orchestration extracted to src/run.rs. Public API surface preserved byte-for-byte.
Shipped (v0.9.7 — milestone follow-ups)
- SPDX WITH-chain exception inheritance: (X WITH ex) AND (Y) / (X WITH ex_a) OR (X WITH ex_b) now evaluate per-leaf with proper AND/OR semantics. AND inherits a denied exception; OR doesn’t poison if another branch is permitted.
- --multi-major-delta <N>: last hardcoded calibration threshold lifted. Default 2; tunable via flag or [diff] multi_major_delta config key.
- Windows plugin timeout (first-class): replaced manual Child::try_wait() polling with the wait-timeout crate. Behavior unchanged on Unix; first-class on Windows.
- action.yml input parity: twenty-five new inputs map every v0.7-v0.9.7 CLI flag to an action input.
- Air-gapped / self-hosted Sigstore docs: documents env-var passthrough (SIGSTORE_REKOR_URL, COSIGN_FULCIO_URL, etc.) and key-based attestation fallback.
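The AND/OR rule in the WITH-chain item above can be sketched as a small recursive evaluator. This is only an illustration under stated assumptions: the Expr type, the leaf helper, and the permitted function are hypothetical names, not bomdrift's real AST or API, and the sketch checks exceptions only (base-license policy is left out).

```rust
use std::collections::HashSet;

/// Hypothetical, simplified license-expression tree (not bomdrift's real type).
#[derive(Debug, Clone)]
pub enum Expr {
    /// One license, optionally qualified by `WITH <exception>`.
    Leaf { license: String, exception: Option<String> },
    And(Vec<Expr>),
    Or(Vec<Expr>),
}

/// Hypothetical convenience constructor for a leaf.
pub fn leaf(license: &str, exception: Option<&str>) -> Expr {
    Expr::Leaf {
        license: license.to_string(),
        exception: exception.map(str::to_string),
    }
}

/// Evaluates each leaf on its own, then combines the results:
/// AND needs every branch permitted, OR needs at least one.
pub fn permitted(expr: &Expr, denied_exceptions: &HashSet<String>) -> bool {
    match expr {
        Expr::Leaf { exception, .. } => match exception {
            // A leaf is rejected only when its exception is on the deny list.
            Some(ex) => !denied_exceptions.contains(ex),
            None => true,
        },
        // One denied branch makes the whole AND denied.
        Expr::And(children) => children.iter().all(|c| permitted(c, denied_exceptions)),
        // A denied branch does not poison an OR if another branch passes.
        Expr::Or(children) => children.iter().any(|c| permitted(c, denied_exceptions)),
    }
}
```

With ex_a on the deny list, (X WITH ex_a) AND (Y) is rejected, while (X WITH ex_a) OR (X WITH ex_b) is still permitted through its second branch.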
Shipped (v0.9.6 — finish the roadmap)
- OCI artifact attestation verification: --before-attestation, --after-attestation, --cosign-identity, --cosign-issuer, and --require-attestation. bomdrift shells out to cosign verify-attestation --type=cyclonedx and consumes the verified CycloneDX SBOM payload. See Attestation.
- Custom rules / plugin system: external-process plugins via repeatable --plugin <manifest.toml>. JSON over stdin/stdout, best-effort failures, new bomdrift.plugin SARIF rule. See Plugins.
- Calibration knobs: --typosquat-similarity-threshold, --young-maintainer-days, --cache-ttl-hours flags plus matching [diff] config keys. Every previously hardcoded threshold is now configurable.
- Cache-TTL unification: internal refactor consolidating the four duplicated CACHE_TTL_SECS constants behind a single cache::ttl() helper. No user-visible change.
Shipped (v0.9.5 — polish + multi-SCM parity)
- Per-exception SPDX allow/deny via [license] allow_exceptions / deny_exceptions and --allow-exception / --deny-exception CLI flags. Apache-2.0 WITH LLVM-exception etc. are now evaluated at the exception level, not just the base license.
- Bitbucket + Azure DevOps comment-driven suppression bridges: Cloudflare Worker references with the same five guards as the GitLab bridge. bomdrift now has comment-driven suppression parity across all four major SCMs.
- bomdrift::vex::parse_synthetic_id public helper: round-trips bomdrift’s synthetic finding IDs back to a structured kind. Lets external VEX tooling identify which finding a statement targets.
- spdx crate exact-pinned (=0.10.9) so license-list updates can’t silently change policy semantics.
- BaselineEntry / ExpiredEntry unified internally; public alias preserved.
- CI Rust toolchain pinned to MSRV 1.88; bumps are deliberate.
- Single source of truth for the suppress-comment grammar (scripts/parse-suppress-comment.sh + CI sync guard).
- GitLab note upsert + threading semantics documented.
Shipped (v0.9 — interoperability + breadth)
- VEX consume: --vex <path> accepts OpenVEX 0.2.0 + CycloneDX VEX 1.6 statements; not_affected / fixed suppress findings, under_investigation annotates.
- VEX emit: --emit-vex <path> emits an OpenVEX 0.2.0 document with explicit per-entry vex_status (default under_investigation, never auto-promoted).
- Full SPDX expression evaluator via the spdx crate. Deprecates allow_ambiguous.
- Bitbucket Pipelines + Azure DevOps Pipelines templates with auto-detection (BITBUCKET_BUILD_NUMBER, TF_BUILD) and per-platform footer shapes.
- Registry-metadata enrichers: npm/PyPI/crates.io. New kinds: recently-published, deprecated, maintainer-set-changed (npm only).
- GitLab comment-driven suppression via a security-reviewed Cloudflare Worker reference bridge (five guards).
- Explicit non-goals + pair-with recommendations in README and STATUS.
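The VEX consume rule in the list above (not_affected and fixed suppress, under_investigation annotates) amounts to a small status-to-action mapping. The sketch below is illustrative only: VexStatus, Disposition, parse_status, and disposition are hypothetical names. Leaving an affected statement's finding untouched is an assumption, since the text above doesn't say how that status is handled.

```rust
/// The four OpenVEX status values. Hypothetical enum for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VexStatus {
    NotAffected,
    Affected,
    Fixed,
    UnderInvestigation,
}

/// What happens to a finding that a VEX statement targets (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Disposition {
    /// Drop the finding from the rendered diff.
    Suppress,
    /// Keep the finding and attach the VEX note to it.
    Annotate,
    /// Keep the finding unchanged.
    Keep,
}

/// Parses the status string as it appears in an OpenVEX statement.
pub fn parse_status(raw: &str) -> Option<VexStatus> {
    match raw {
        "not_affected" => Some(VexStatus::NotAffected),
        "affected" => Some(VexStatus::Affected),
        "fixed" => Some(VexStatus::Fixed),
        "under_investigation" => Some(VexStatus::UnderInvestigation),
        _ => None,
    }
}

/// Maps a status to the action described in the release notes.
pub fn disposition(status: VexStatus) -> Disposition {
    match status {
        VexStatus::NotAffected | VexStatus::Fixed => Disposition::Suppress,
        VexStatus::UnderInvestigation => Disposition::Annotate,
        // Assumption: an `affected` statement leaves the finding as-is.
        VexStatus::Affected => Disposition::Keep,
    }
}
```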
Shipped (v0.8 — supply-chain hardening)
- SARIF + GitHub Code Scanning with stable per-result fingerprints and one-line action opt-in (upload-to-code-scanning: true).
- EPSS scoring on every CVE-aliased advisory; --fail-on-epss.
- CISA KEV flagging of known-exploited advisories; --fail-on kev.
- License allow/deny policy with *-suffix glob matching and fail-closed compound-expression handling. New bomdrift.license-violation SARIF rule.
- Baseline expires + reason with stderr warnings on expiry.
- time crate + clock module honoring SOURCE_DATE_EPOCH.
- OSV CVE aliases threaded through VulnRef.
- --debug-calibration-format jsonl and --output-file <PATH>.
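As a rough illustration of the v0.8 license policy item above, the sketch below combines a trailing-* glob with fail-closed handling of compound expressions. All names (Verdict, glob_match, is_compound, evaluate) are hypothetical. Two behaviors are assumptions: deny takes precedence over allow, and an empty allow list permits anything not denied.

```rust
/// Hypothetical policy outcome for one license string.
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Allowed,
    Denied,
}

/// A trailing `*` means "starts with"; anything else must match exactly.
pub fn glob_match(pattern: &str, license: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => license.starts_with(prefix),
        None => pattern == license,
    }
}

/// Detects compound SPDX expressions by their operators or parentheses.
pub fn is_compound(expr: &str) -> bool {
    expr.contains('(')
        || expr
            .split_whitespace()
            .any(|tok| matches!(tok, "AND" | "OR" | "WITH"))
}

/// Applies the policy to a single license string.
pub fn evaluate(license: &str, allow: &[&str], deny: &[&str]) -> Verdict {
    // Fail closed: an expression this sketch cannot reason about is a violation.
    if is_compound(license) {
        return Verdict::Denied;
    }
    // Assumption: deny patterns take precedence over allow patterns.
    if deny.iter().any(|p| glob_match(p, license)) {
        return Verdict::Denied;
    }
    // Assumption: an empty allow list permits anything that was not denied.
    if allow.is_empty() || allow.iter().any(|p| glob_match(p, license)) {
        Verdict::Allowed
    } else {
        Verdict::Denied
    }
}
```

Failing closed means an unparsed MIT OR GPL-3.0-only is reported as a violation even though one branch would pass; the later evaluator work listed under v0.9 and v0.9.7 addresses exactly that case.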
Investigated and decided
- GraphQL maintainer-age: investigated again for v0.9.6 and rejected. GitHub’s GraphQL history() connection doesn’t expose ascending-date ordering, so finding the oldest contributor commit still requires walking the cursor backward from the most recent commit. REST’s GET /repos/{o}/{r}/commits?author=X&per_page=1 plus Link-header parsing for the last page lets bomdrift fetch a single author’s oldest commit in two requests. Decided: REST stays. This item is closed permanently; re-open only if GitHub adds ASC ordering to the GraphQL history connection.
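The two-request REST approach depends on reading the rel="last" page number out of the Link response header: with per_page=1, that page holds the author's oldest commit. A minimal sketch of the parsing step follows. The last_page function is a hypothetical name, not bomdrift's implementation, and the sketch assumes that link targets contain no commas.

```rust
/// Extracts the `page` query parameter from the `rel="last"` entry of a
/// GitHub-style Link header. Returns None when there is no such entry.
pub fn last_page(link_header: &str) -> Option<u64> {
    // Entries look like: <https://...?page=57>; rel="last"
    link_header.split(',').find_map(|entry| {
        let (target, params) = entry.split_once(';')?;
        let is_last = params.split(';').any(|p| p.trim() == "rel=\"last\"");
        if !is_last {
            return None;
        }
        // Strip the angle brackets around the URL, then walk its query string.
        let url = target.trim().strip_prefix('<')?.strip_suffix('>')?;
        let (_, query) = url.split_once('?')?;
        query.split('&').find_map(|pair| {
            let (key, value) = pair.split_once('=')?;
            if key == "page" {
                value.parse().ok()
            } else {
                None
            }
        })
    })
}
```

The first request is made only for its Link header; the second asks for the page number this function returns. A response with no Link header at all means the author has a single page of results, so the first response already contains the oldest commit.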
Calibration
All calibration thresholds are configurable via .bomdrift.toml and
CLI flags. Tune [diff] typosquat_similarity_threshold,
young_maintainer_days, recently_published_days, cache_ttl_hours.
See CLI reference for flag forms.
Blocked on upstream
- PyPI / crates.io maintainer-set-changed. The npm enricher (shipped v0.9) compares maintainer sets for VersionChanged components by reading registry.npmjs.org’s per-version maintainers[] array. PyPI’s https://pypi.org/pypi/<pkg>/json returns repository-level maintainers but no per-version history. Crates.io’s https://crates.io/api/v1/crates/<name> returns repository-level crate.owners but no per-version published_by history. If either ecosystem ships a per-version maintainer endpoint, bomdrift adds the enricher in a future minor release.
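The comparison the npm enricher performs needs two per-version maintainer lists, which is why repository-level data from PyPI and crates.io is not enough. A minimal sketch of that set comparison follows. MaintainerDelta and maintainer_delta are hypothetical names, and the inputs are assumed to be maintainer names already extracted from the registry response.

```rust
use std::collections::BTreeSet;

/// Hypothetical result of comparing two versions' maintainer lists.
#[derive(Debug, PartialEq, Eq)]
pub struct MaintainerDelta {
    pub added: Vec<String>,
    pub removed: Vec<String>,
}

/// Compares the maintainer sets of the old and new version.
/// Returns None when the sets are equal (order and duplicates are ignored).
pub fn maintainer_delta(before: &[&str], after: &[&str]) -> Option<MaintainerDelta> {
    let before: BTreeSet<&str> = before.iter().copied().collect();
    let after: BTreeSet<&str> = after.iter().copied().collect();
    if before == after {
        return None;
    }
    Some(MaintainerDelta {
        // BTreeSet iteration is sorted, so the output order is stable.
        added: after.difference(&before).map(|m| m.to_string()).collect(),
        removed: before.difference(&after).map(|m| m.to_string()).collect(),
    })
}
```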
Future candidates (not committed)
Candidates that could land in a future release if maintainer time and adoption signal warrant:
- Homebrew tap (Metbcy/homebrew-tap): brew install Metbcy/tap/bomdrift. macOS adopters reach for brew install first; this closes the macOS install gap.
- nix flake, AUR PKGBUILD, winget + Scoop manifests: the Linux power-user and Windows-package-manager install paths.
- README diet: move the comparison table to a dedicated compare.md page, shorten the README to a one-screen pitch.
- asciinema demo recorded against examples/axios-incident/, embedded in the README and the docs landing page.
- Comparison docs (deep): compare/socket.md, compare/snyk.md, compare/trivy-grype-osv.md. Neutral-tone pages that explain when to pick bomdrift vs. each competitor.
Non-goals
These are explicit non-goals. Don’t open a PR for them — it’ll be declined.
SBOM generation
bomdrift only consumes SBOMs. Use Syft to generate them — it’s already excellent and bomdrift’s contribution would be net-negative.
Replacing your SCA scanner
OSV-scanner, Grype, Trivy all have richer vulnerability databases and broader package metadata than bomdrift. bomdrift’s CVE enrichment is change-focused: only on what’s new in this diff. If you want “what’s in my SBOM right now?”, run an SCA scanner. If you want “what changed in this PR’s deps that I should worry about?”, that’s bomdrift’s question.
Reachability / call-graph analysis
Determining whether the vulnerable function in a flagged advisory is actually invoked from your application’s entry points is a fundamentally different analysis than diff-level supply-chain risk. It requires whole-program call-graph construction, language-specific runtime modeling (dynamic dispatch, reflection, eval), and an ever-growing per-CVE vulnerable-symbol database. The vendors who do this well — Endor Labs, Snyk Reachability — invest at a scale OSS bomdrift can’t match, and the per-CVE symbol curation is the moat, not the call-graph engine itself. Pair bomdrift with Endor or Snyk for reachability; bomdrift answers “what changed”, they answer “does the change reach prod code”.
Dependency-tree visualization
cargo tree, pnpm why, and ecosystem-specific equivalents handle this well. bomdrift’s diff core could in principle walk the dependencies / relationships arrays from the source SBOM, but it’s outside the “what’s risky” scope.
Per-language deep parsing
bomdrift treats SBOMs as the source of truth for what’s installed.
Walking package-lock.json / Pipfile.lock / Cargo.lock directly
would let us catch things SBOMs miss (lockfile drift), but doubles
the parser surface for marginal signal — and the SBOM-generation
ecosystem is converging fast enough that this won’t matter in 18
months.
Web UI / dashboard
bomdrift is intentionally a CI tool. Long-running stateful dashboards (org-wide vuln tracking, exception management UI) are better served by tools designed for that — Anchore Enterprise, Snyk, etc. The PR comment is the UX.
Contribution acceptance criteria
A new enricher / output format / parser PR should:
- Pass cargo clippy --all-targets --all-features -- -D warnings on its own. The codebase is clippy-clean and we keep it that way.
- Add unit tests in src/<your-module>/tests covering the happy path + at least one edge case. Best-effort enrichers should test the network-failure path (via fake fetcher injection).
- Add an end-to-end test in tests/cli.rs if it’s CLI-visible, or tests/integration.rs if it’s library-internal.
- Document its rationale in a module doc comment at the top of the file. The “why” is more interesting than the “what”: future contributors lift the rationale, not just the implementation.
- Stay best-effort. Network or filesystem failures must not block the diff from rendering. The contract is “render whatever we got”, not “all-or-nothing”.
- Not pull in tokio / chrono / semver / octocrab without strong justification. The dep-tree audit is real; see Architecture.
See Contributing for the development loop.
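The “best-effort” and “fake fetcher injection” criteria above can be illustrated together. The sketch below is not bomdrift's code: the Fetcher trait, both fake fetchers, the enrich_deprecated function, and the registry.example.test URL are all hypothetical stand-ins for whatever abstraction the real enrichers use.

```rust
/// Hypothetical network abstraction that an enricher depends on.
pub trait Fetcher {
    fn fetch(&self, url: &str) -> Result<String, String>;
}

/// Fake that always fails, for exercising the network-failure path.
pub struct FailingFetcher;

impl Fetcher for FailingFetcher {
    fn fetch(&self, _url: &str) -> Result<String, String> {
        Err("simulated network failure".to_string())
    }
}

/// Fake that returns a canned response body, for the happy path.
pub struct CannedFetcher(pub String);

impl Fetcher for CannedFetcher {
    fn fetch(&self, _url: &str) -> Result<String, String> {
        Ok(self.0.clone())
    }
}

/// Best-effort enricher: a fetch error yields no findings plus a warning on
/// stderr, so the caller can still render the rest of the diff.
pub fn enrich_deprecated(fetcher: &dyn Fetcher, package: &str) -> Vec<String> {
    // Placeholder URL; a real enricher would query the package's registry.
    let url = format!("https://registry.example.test/{package}");
    match fetcher.fetch(&url) {
        Ok(body) if body.contains("deprecated") => vec![format!("{package}: deprecated")],
        Ok(_) => Vec::new(),
        Err(err) => {
            eprintln!("warning: enrichment skipped for {package}: {err}");
            Vec::new()
        }
    }
}
```

A unit test for the failure path then passes FailingFetcher and asserts that the enricher returns an empty list instead of an error, which is the “render whatever we got” contract in miniature.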