Introduction

bomdrift is a CLI and multi-SCM action that diffs two SBOMs and surfaces supply-chain risk signals on every changed dependency — flags new CVEs (with EPSS + CISA KEV scoring), typosquats across eight ecosystems, multi-major version jumps, young-maintainer takeovers, registry-metadata signals (recently-published, deprecated, maintainer-set-changed), and license-policy violations — ready to drop into a PR comment on GitHub, GitLab, Bitbucket, or Azure DevOps.

What problem does it solve?

The most actionable supply-chain question on a pull request is:

What changed in this diff’s dependencies that I should worry about?

— not “what’s in my SBOM?”. Plenty of tools answer the second question (OSV-scanner, Grype, Trivy). bomdrift answers the first.

Recent incidents bomdrift would have surfaced

axios npm compromise (Mar 31, 2026)

A maintainer was socially engineered (fake Slack/Teams call attributed to North Korean UNC1069), and axios@1.14.1 + axios@0.30.4 shipped briefly with a malicious runtime dep plain-crypto-js@4.2.1 that dropped the WAVESHAPER.V2 RAT on Windows, macOS, and Linux.

Three of bomdrift’s signals would have fired in the diff that pulled the compromised release:

  1. Added: a brand-new transitive dependency plain-crypto-js@4.2.1 appears.
  2. Typosquat: plain-crypto-js scores 0.95 against the legitimate crypto-js via the suffix-containment boost rule.
  3. Vulnerabilities: OSV.dev returns the published advisory IDs (MAL-2026-2306, GHSA-3p68-rc4w-qgx5, etc.) on both versions, with EPSS / KEV badges where applicable.

See examples/axios-incident/ for the SBOM pair and the rendered output.

Shai-Hulud worm (npm, Nov 2025)

700+ packages compromised by a self-replicating worm. Diff-time review of newly added transitive deps and version bumps was the only pre-merge defense. bomdrift’s “added components + CVE enrichment + recently-published registry signal” combination surfaces this class of attack at PR time.

xz-utils backdoor (CVE-2024-3094, Mar 2024)

A 2.6-year social-engineering campaign culminating in a backdoor shipped in xz 5.6.0/5.6.1. The “Jia Tan” maintainer’s first commit was recent relative to the release — exactly the maintainer-age heuristic bomdrift implements via the GitHub REST API. The threshold is tunable via --young-maintainer-days (default 90; v0.9.6+).

Sustained PyPI typosquat campaigns (2024–2026)

Hundreds of malicious packages disguised by single-character substitutions. Jaro-Winkler similarity against top-N catalogs catches these reliably; see the Typosquat detection chapter for the full algorithm and the --typosquat-similarity-threshold knob (v0.9.6+).
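
As a rough illustration of the similarity idea, the sketch below implements textbook Jaro-Winkler and checks a candidate name against a tiny catalog. It is not bomdrift's implementation: the function names, the 0.9 threshold, and the catalog are assumptions made for this example, and the suffix-containment boost mentioned for the axios incident is not modeled.

```python
def jaro(a: str, b: str) -> float:
    """Plain Jaro similarity in [0.0, 1.0]."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit = [False] * len(a)
    b_hit = [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        lo, hi = max(0, i - window), min(len(b), i + window + 1)
        for j in range(lo, hi):
            if not b_hit[j] and b[j] == ch:
                a_hit[i] = b_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order count as half a transposition each.
    a_seq = [c for c, hit in zip(a, a_hit) if hit]
    b_seq = [c for c, hit in zip(b, b_hit) if hit]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(a: str, b: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted by a shared prefix of up to four characters."""
    base = jaro(a, b)
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1.0 - base)


def closest_match(name, catalog, threshold=0.9):
    """Return (score, known_name) for the best near-miss, or None.

    The threshold is a hypothetical value chosen for this sketch.
    """
    if name in catalog:
        return None  # an exact catalog hit is the legitimate package
    best = max(((jaro_winkler(name, known), known) for known in catalog),
               default=(0.0, None))
    return best if best[0] >= threshold else None
```

A transposed name such as reqeusts scores well above 0.9 against requests here, while an unrelated name such as flask falls far below the cutoff.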

Design ethos

  • Small dep tree, no Docker, single binary. ~3.4 MB stripped + LTO. No tokio, no chrono, no semver crate, no octocrab — the constraint is load-bearing.
  • Best-effort enrichers. Network failures (OSV, EPSS, KEV, GitHub, registries), plugin failures, and attestation-verify failures all warn-and-continue. A PR review is still useful without one signal, and the offline change-shape signals always work.
  • Byte-deterministic output. Identical inputs render to byte-identical Markdown / JSON / SARIF / VEX every time, honoring SOURCE_DATE_EPOCH, so PR-comment upserts via peter-evans/create-or-update-comment patch in place instead of accumulating duplicate comments.
  • Cosign-signed releases. Every archive carries a Sigstore signature via GitHub OIDC. Action defaults to verifying signatures; opt-out via verify-signatures: false for trusted mirrors. As of v0.9.6, the same cosign machinery can verify the input SBOMs themselves via --before-attestation / --after-attestation.
  • OSS-first, no telemetry, no account. Apache-2.0; no daemon, no hosted UI, no signup.
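
The byte-determinism bullet can be pictured with a small sketch. It assumes a hypothetical render_report function and finding shape (purl, id); only the SOURCE_DATE_EPOCH convention itself comes from the text above. The idea is that every source of variation (clock, ordering, key order, whitespace) is pinned.

```python
import json
import os
import time


def render_report(findings, environ=os.environ):
    """Render findings as JSON that is byte-identical for identical inputs.

    The report shape and field names are assumptions for this sketch.
    """
    # Take the timestamp from SOURCE_DATE_EPOCH when set, so reruns do not drift.
    epoch = int(environ.get("SOURCE_DATE_EPOCH", time.time()))
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(epoch))
    # Sort findings and keys so input order never leaks into the output.
    ordered = sorted(findings, key=lambda f: (f["purl"], f["id"]))
    return json.dumps({"generated": stamp, "findings": ordered},
                      sort_keys=True, separators=(",", ":"))
```

With the epoch pinned, two renders of the same findings in a different input order produce the same bytes, which is what lets a comment upsert patch in place.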

Where to next?

Getting started

Suppressing findings

  • Baseline & suppression — JSON snapshots, in-comment /bomdrift suppress, expires + reason.
  • License policy — SPDX expression evaluation with allow/deny + per-exception granularity.
  • VEX — OpenVEX 0.2.0 + CycloneDX VEX 1.6 consume / emit.

Output

Per-signal deep dives

  • Enrichers overview — the contract every enricher honors, plus pointers into each chapter.

Advanced

  • OCI attestation — fetch SBOMs as cosign verify-attestation-verified OCI artifacts (v0.9.6+).
  • Plugins — external-process plugin protocol for custom rules (v0.9.6+).
  • Architecture — module map, pipeline, determinism contract.

Quickstart

The most common way to run bomdrift is the composite Action — drop it into a pull_request workflow and let the action handle checkout, Syft install, SBOM generation, diffing, and PR-comment posting:

# .github/workflows/sbom-diff.yml
name: SBOM diff
on: pull_request
permissions:
  contents: read
  pull-requests: write       # to upsert the diff comment
jobs:
  diff:
    runs-on: ubuntu-latest
    steps:
      - uses: Metbcy/bomdrift@v1
        # Optional:
        # with:
        #   fail-on: critical-cve   # exit 2 on HIGH/CRITICAL advisories
        #   path: services/api      # scan a monorepo subdirectory

The @v1 mutable tag tracks the latest v0.x release. Pin to a specific version (@v0.9.9) if you prefer reproducible builds. See GitHub Action for every input.

If you prefer a checked-in policy file, install the binary and run bomdrift init once. It writes .bomdrift.toml plus the SBOM-diff and comment-suppression workflows, so future policy tweaks happen in TOML instead of workflow YAML.

Locally with the binary

Three install paths are supported.

Via cargo (v0.9.9+)

cargo install --locked bomdrift
bomdrift --version

Via Docker / OCI (v0.9.9+)

docker run --rm ghcr.io/metbcy/bomdrift:latest --version
# Pin to a specific version for reproducible CI:
docker run --rm ghcr.io/metbcy/bomdrift:v0.9.9 --version

The image is multi-arch (linux/amd64, linux/arm64), distroless (gcr.io/distroless/cc-debian13:nonroot), and runs as a non-root user. Verify the inline SLSA attestation with gh attestation verify --owner Metbcy oci://ghcr.io/metbcy/bomdrift:v0.9.9.

Via release archive (cosign-signed)

Pre-built binaries cover Linux x86_64 + aarch64, macOS aarch64, and Windows x86_64. Each archive is cosign-signed via Sigstore + GitHub OIDC and ships a SLSA build provenance attestation (v0.9.9+).

VERSION=v0.9.9
TARGET=x86_64-unknown-linux-gnu
curl -sSL -o bomdrift.tar.gz \
  "https://github.com/Metbcy/bomdrift/releases/download/${VERSION}/bomdrift-${VERSION}-${TARGET}.tar.gz"
tar -xzf bomdrift.tar.gz
./bomdrift-${VERSION}-${TARGET}/bomdrift --version

# Diff two SBOMs
./bomdrift-${VERSION}-${TARGET}/bomdrift diff before.json after.json

# Emit SARIF to a file (no fragile YAML > redirection)
./bomdrift-${VERSION}-${TARGET}/bomdrift diff before.json after.json \
    --output sarif --output-file bomdrift.sarif

To verify the archive’s signature before you trust the binary, see Release signing.

From source

cargo install --locked --git https://github.com/Metbcy/bomdrift --tag v0.9.9 bomdrift

Requires Rust 1.85+ (the project uses edition 2024). Prefer cargo install bomdrift (above) unless you specifically want to track an unreleased commit.

First diff

The repository ships four runnable example scenarios under examples/. After cloning + cargo build --release:

./target/release/bomdrift diff \
  examples/axios-incident/before.json \
  examples/axios-incident/after.json \
  --no-osv --no-maintainer-age

The output is GitHub-Flavored Markdown ready for PR-comment posting.
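
To make the change-shape part of a diff concrete, here is a minimal sketch that compares the components arrays of two CycloneDX JSON documents by purl. The function names are hypothetical, the axios 1.14.0 "before" version is invented for the example, and a package present at several versions in one SBOM is collapsed to a single entry, which a real tool would need to handle.

```python
def split_purl(purl):
    """Split 'pkg:npm/axios@1.14.1' into ('pkg:npm/axios', '1.14.1')."""
    base = purl.split("?", 1)[0].split("#", 1)[0]  # drop qualifiers and subpath
    name, sep, version = base.rpartition("@")
    return (name, version) if sep else (base, "")


def index_by_package(sbom):
    """Map versionless purl -> version for every component that has a purl."""
    return dict(split_purl(c["purl"])
                for c in sbom.get("components", []) if c.get("purl"))


def diff_components(before, after):
    """Classify packages as added, removed, or version-changed."""
    old, new = index_by_package(before), index_by_package(after)
    return {
        "added": sorted(f"{n}@{new[n]}" for n in new.keys() - old.keys()),
        "removed": sorted(f"{n}@{old[n]}" for n in old.keys() - new.keys()),
        "version_changed": sorted((n, old[n], new[n])
                                  for n in old.keys() & new.keys()
                                  if old[n] != new[n]),
    }
```

Applied to an SBOM pair shaped like the axios scenario, the new transitive dependency lands in added and the axios bump in version_changed; the enrichers then have a short list to look up instead of the whole SBOM.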

What’s next?

GitHub Action

The Metbcy/bomdrift action is a composite action (no Docker), which keeps PR-comment latency low — typically 5–10s on a warm runner versus 30s+ for a Docker container action.

Quick start (zero-config, v0.5+)

On a pull_request workflow, the action defaults to comparing the PR’s base branch against the PR’s head SHA — no checkout step, no Syft step, no SBOM-path wiring needed:

on: pull_request
permissions:
  contents: read
  pull-requests: write
jobs:
  diff:
    runs-on: ubuntu-latest
    steps:
      - uses: Metbcy/bomdrift@v1

That’s it. The action checks out both refs into opaque sibling paths, generates CycloneDX-JSON SBOMs via Syft (installed automatically and cached across job runs), and posts the rendered diff as an upserted PR comment.

For a repo-owned policy, run bomdrift init once and commit the generated .bomdrift.toml plus workflows. The action auto-loads .bomdrift.toml from the repo root when present, or you can pass config: .bomdrift.toml explicitly.

If you already produce SBOMs through a non-Syft toolchain — Trivy, SPDX-tools, an in-house generator — supply the file paths via the before-sbom / after-sbom inputs instead. The advanced flow below documents that path; both flows continue to be supported in v1.

Inputs

The action exposes the full bomdrift CLI surface as inputs (v0.9.7+ input parity, current as of v0.9.9). For the canonical flag semantics see CLI reference; the tables below document only the action-side wrapper. Empty defaults mean “don’t pass the flag” — bomdrift then uses its own CLI/config defaults.

What’s new in v0.9.7

These inputs are newly exposed (the underlying CLI flags shipped earlier):

  • VEX: vex, emit-vex, vex-author, vex-default-justification
  • License policy: allow-licenses, deny-licenses, allow-exception, deny-exception, allow-ambiguous-licenses
  • Enrichment toggles: no-epss, no-kev, no-registry
  • Failure thresholds: fail-on-epss
  • Calibration knobs: recently-published-days, typosquat-similarity-threshold, young-maintainer-days, cache-ttl-hours, multi-major-delta (new CLI flag in v0.9.7)
  • Attestation: before-attestation, after-attestation, cosign-identity, cosign-issuer, require-attestation
  • Plugins: plugin

Before v0.9.7 these had to be driven through .bomdrift.toml or a direct cargo install invocation. The config-file path remains supported and is still preferred for repo-wide policy.

Core: refs, paths, SBOMs

| Input | Type | Default | Description |
| --- | --- | --- | --- |
| before-ref | string | ${{ github.event.pull_request.base.ref }} | Git ref / SHA to check out as the “before” side. Default works on pull_request events. |
| after-ref | string | ${{ github.event.pull_request.head.sha }} | Git ref / SHA for the “after” side. |
| path | string | . | Subdirectory of the checked-out ref to scan with Syft (monorepos: path: services/api). |
| before-sbom | string (path) | '' | Pre-generated “before” SBOM. Bypasses the in-action Syft invocation. |
| after-sbom | string (path) | '' | Pre-generated “after” SBOM. |
| format | enum | auto | Force input format: auto/cdx/spdx/syft. Maps to --format. |

Output

| Input | Type | Default | Description |
| --- | --- | --- | --- |
| output | enum | markdown | Output format: terminal/markdown/json/sarif. PR comments require markdown. Maps to --output. |
| comment-on-pr | bool | true | Post the rendered diff as a PR comment on pull_request events. |
| comment-size-limit | number | 60000 | Bytes. Above this size, the PR-comment body is re-rendered with --summary-only. 0 disables the fallback. |
| findings-only | bool | false | Markdown-only. Maps to --findings-only. |
| upload-to-code-scanning | bool | false | Upload SARIF to GitHub Code Scanning. Requires output: sarif. |
| github-token | string | ${{ github.token }} | Token used to post PR comments. |

Suppression and policy

| Input | Type | Default | Description |
| --- | --- | --- | --- |
| config | string (path) | '' | Path to .bomdrift.toml. Empty auto-loads from the repo root when present. Maps to --config. |
| baseline | string (path) | '' | Pre-captured bomdrift diff --output json snapshot to suppress against. Maps to --baseline. |
| vex | string (multi-line paths) | '' | OpenVEX documents to consume; one path per line, each becomes a repeated --vex. |
| emit-vex | string (path) | '' | Path to write a freshly emitted OpenVEX document. Maps to --emit-vex. |
| vex-author | string | '' | Author identity for the emitted OpenVEX. Maps to --vex-author. |
| vex-default-justification | string | '' | OpenVEX not_affected justification ID applied by default. Maps to --vex-default-justification. |

License policy

| Input | Type | Default | Description |
| --- | --- | --- | --- |
| allow-licenses | string (comma list) | '' | SPDX expressions to allow. Maps to --allow-licenses. |
| deny-licenses | string (comma list) | '' | SPDX expressions to deny. Maps to --deny-licenses. |
| allow-exception | string (comma list) | '' | SPDX exception identifiers to allow inside WITH clauses. v0.9.7 refines compound-expression inheritance. Maps to --allow-exception. |
| deny-exception | string (comma list) | '' | SPDX exception identifiers to deny. Maps to --deny-exception. |
| allow-ambiguous-licenses | bool | false | Treat unresolved license expressions as allowed. Maps to --allow-ambiguous-licenses. |

Enrichment toggles

| Input | Type | Default | Description |
| --- | --- | --- | --- |
| no-epss | bool | false | Disable EPSS exploit-likelihood enrichment. Maps to --no-epss. |
| no-kev | bool | false | Disable CISA KEV enrichment. Maps to --no-kev. |
| no-registry | bool | false | Disable registry / maintainer-age enrichment (no network calls to package registries). Maps to --no-registry. |

Calibration knobs

| Input | Type | Default | Description |
| --- | --- | --- | --- |
| recently-published-days | number | '' | “Recently published” maintainer-age window. Maps to --recently-published-days. |
| typosquat-similarity-threshold | number (0.0–1.0) | '' | Damerau-Levenshtein threshold for typosquat detection. Maps to --typosquat-similarity-threshold. |
| young-maintainer-days | number | '' | Age below which a maintainer is flagged as “young”. Maps to --young-maintainer-days. |
| cache-ttl-hours | number | '' | TTL for the on-disk enrichment cache. Maps to --cache-ttl-hours. |
| multi-major-delta | number (≥1) | '' | Major-version delta at or above which an upgrade is flagged as multi-major (CLI default 2). Maps to --multi-major-delta. New in v0.9.7. |
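
The multi-major-delta knob amounts to a comparison on the leading version number. The sketch below is an assumption about how such a check could look without a semver library; the function names are hypothetical, and versions whose major component cannot be read are simply not flagged.

```python
def major(version):
    """Return the leading integer of a version string, or None if there is none."""
    head = version.lstrip("vV").split(".", 1)[0]
    digits = ""
    for ch in head:
        if not ch.isdigit():
            break
        digits += ch
    return int(digits) if digits else None


def is_multi_major(old, new, delta=2):
    """Flag an upgrade whose major version rises by at least `delta`."""
    a, b = major(old), major(new)
    if a is None or b is None:
        return False  # unparseable versions are left to other signals
    return b - a >= delta
```

With the default of 2, a jump from 1.4.0 to 3.0.0 is flagged and 1.4.0 to 2.0.0 is not; lowering the delta to 1 flags every major bump.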

Failure thresholds

| Input | Type | Default | Description |
| --- | --- | --- | --- |
| fail-on | enum | none | Trip exit 2 on findings of the configured kind: none/cve/critical-cve/typosquat/license-change/any. The PR comment is still posted on a tripped run. |
| fail-on-epss | number (0.0–1.0) | '' | Trip exit 2 when any new advisory has an EPSS score at or above this value. Maps to --fail-on-epss. |
| max-added | number | '' | Exit 2 when more than this many dependencies are added. |
| max-removed | number | '' | Exit 2 when more than this many dependencies are removed. |
| max-version-changed | number | '' | Exit 2 when more than this many dependencies change version. |
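
The budget and EPSS rows share one pattern: compare a count or score against an optional limit and exit 2 when it trips. A minimal sketch under assumed names follows (the summary dictionary and its keys are hypothetical, not bomdrift's JSON schema). Budgets trip on "more than", the EPSS threshold on "at or above", matching the wording in the table.

```python
def exit_code(summary, max_added=None, max_removed=None,
              max_version_changed=None, fail_on_epss=None):
    """Return 2 when any configured threshold trips, else 0."""
    budgets = [
        ("added", max_added),
        ("removed", max_removed),
        ("version_changed", max_version_changed),
    ]
    for key, limit in budgets:
        # An unset limit (None) means the budget is not enforced.
        if limit is not None and summary.get(key, 0) > limit:
            return 2
    scores = summary.get("epss_scores", [])
    if fail_on_epss is not None and any(s >= fail_on_epss for s in scores):
        return 2
    return 0
```

Leaving every limit unset mirrors the empty defaults in the table: nothing is enforced and the exit code stays 0.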

OCI attestation

| Input | Type | Default | Description |
| --- | --- | --- | --- |
| before-attestation | string (OCI ref) | '' | OCI reference for the cosign attestation covering the “before” SBOM. Maps to --before-attestation. |
| after-attestation | string (OCI ref) | '' | OCI reference for the “after” SBOM attestation. Maps to --after-attestation. |
| cosign-identity | string (regex) | '' | Regex matched against the cosign certificate identity (--certificate-identity-regexp). Maps to --cosign-identity. |
| cosign-issuer | string (URL) | '' | OIDC issuer URL for keyless cosign verification. Maps to --cosign-issuer. |
| require-attestation | bool | false | Fail when either side is missing a verified attestation. Maps to --require-attestation. |

For air-gapped / self-hosted Sigstore deployments, see Air-gapped / self-hosted Sigstore.

Plugins

| Input | Type | Default | Description |
| --- | --- | --- | --- |
| plugin | string (multi-line paths) | '' | Plugin manifests to load; one path per line, each becomes a repeated --plugin. See Plugins. |

Release verification

| Input | Type | Default | Description |
| --- | --- | --- | --- |
| verify-signatures | bool | true | Install cosign and verify the bomdrift release archive’s Sigstore signature. Set false on trusted mirrors / cached runners (saves ~15s). When true and cosign is missing, the action fails loudly. |

Outputs

The action does not declare formal outputs. Its side effects are:

  1. The rendered diff is written to stdout (visible in the workflow run log under the Run bomdrift step).
  2. When output == markdown and GITHUB_STEP_SUMMARY is set, the rendered diff is appended to the step summary so reviewers can see it without a PR-comment posting permission.
  3. On pull_request events with comment-on-pr: true, the rendered diff is upserted into a single PR comment marked <!-- bomdrift:diff -->. Subsequent pushes update the same comment instead of accumulating new ones (peter-evans/create-or-update-comment-style upsert).
  4. When fail-on or a diff budget trips, the action exits with code 2 — but only after the PR comment has been posted, so reviewers see the findings even when the workflow step fails.
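
The marker-based upsert in item 3 can be sketched as a pure decision function. This is an illustration only: the comment dictionaries with id and body keys and the plan_upsert name are assumptions, and the actual API calls are left out.

```python
MARKER = "<!-- bomdrift:diff -->"


def plan_upsert(existing_comments, body, marker=MARKER):
    """Decide whether to update an existing comment or create a new one.

    Returns (action, comment_id, body_with_marker).
    """
    if marker not in body:
        body = f"{marker}\n{body}"  # make sure the next run can find this comment
    for comment in existing_comments:
        if marker in comment.get("body", ""):
            return ("update", comment["id"], body)
    return ("create", None, body)
```

Because the first comment carrying the marker is always reused, repeated pushes to the same PR keep editing one comment rather than stacking new ones.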

Common patterns

Repo policy file

Use .bomdrift.toml when you want the policy in version control instead of repeated YAML inputs:

[diff]
fail_on = "critical-cve"
baseline = ".bomdrift/baseline.json"
findings_only = true
max_added = 25
max_version_changed = 10

- uses: Metbcy/bomdrift@v1
  with:
    config: .bomdrift.toml

Explicit action inputs still override the config-backed defaults for one-off workflows.

Bring your own SBOMs (advanced / pre-v0.5 flow)

When the SBOMs come from a non-Syft toolchain (Trivy, SPDX-tools, proprietary scanners) or you already generate them in an earlier job step, supply both paths explicitly. The action skips the in-action Syft invocation entirely:

- uses: actions/checkout@v4
- uses: anchore/sbom-action@v0
  with: { path: ., output-file: after.json }
- uses: actions/checkout@v4
  with: { ref: "${{ github.event.pull_request.base.ref }}", path: base }
- uses: anchore/sbom-action@v0
  with: { path: base, output-file: before.json }
- uses: Metbcy/bomdrift@v1
  with: { before-sbom: before.json, after-sbom: after.json }

This is the v0.4-era “manual” pattern. It still works in v0.5 — the before-sbom / after-sbom inputs were required: true in v0.4 and became required: false in v0.5; nothing else changed about how they behave. Existing v0.4 workflows continue to function unchanged after a @v1 tag bump.

Block the merge on critical findings

- uses: Metbcy/bomdrift@v1
  with:
    before-sbom: before.json
    after-sbom:  after.json
    fail-on:     critical-cve

critical-cve filters on severity >= High per the OSV-fetched severity (see OSV.dev CVE lookup). typosquat, license-change, and any are also accepted thresholds — see --fail-on.

Self-hosted / trusted-mirror runners

- uses: Metbcy/bomdrift@v1
  with:
    before-sbom: before.json
    after-sbom:  after.json
    verify-signatures: false   # ~15s faster, skips cosign-installer

This is appropriate when:

  • You’re running on self-hosted runners with a hardened image you control.
  • You’ve pre-pinned the bomdrift archive in your Nexus/Artifactory mirror and verified its signature once at mirror time.
  • You’re running in a network-restricted environment where the public Sigstore endpoints aren’t reachable.

When verify-signatures: true and cosign isn’t installed (or the .sig / .pem files aren’t on the release), the action fails loudly rather than silently degrading; that’s the whole point of the explicit opt-out.

Big monorepo with massive SBOMs

If the rendered bomdrift diff output exceeds GitHub’s 65,536-character comment-body cap, the size fallback (introduced in v0.3) re-renders the PR comment with --summary-only and keeps the full body in the workflow step summary:

- uses: Metbcy/bomdrift@v1
  with:
    before-sbom: before.json
    after-sbom:  after.json
    comment-size-limit: 60000   # default; tune for GHE with raised limits

Set comment-size-limit: 0 to disable the fallback entirely and let GitHub return a 422 on oversized comments (rarely what you want).
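
The fallback rule is small enough to state as code. The sketch below assumes both renderings are already available as strings; the function name is hypothetical, and the size is measured in UTF-8 bytes because the input is documented in bytes.

```python
def choose_comment_body(full, summary, limit=60000):
    """Pick the PR-comment body: full render unless it exceeds the byte limit.

    A limit of 0 disables the fallback and always returns the full render.
    """
    if limit == 0:
        return full
    return summary if len(full.encode("utf-8")) > limit else full
```

The default of 60000 leaves headroom below the 65,536-character cap, so a body that passes this check should also be accepted by GitHub.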

Diff-only (no PR comment)

Useful for SARIF uploads, third-party comment posting, or when you just want the diff in the step summary:

- uses: Metbcy/bomdrift@v1
  with:
    before-sbom:    before.json
    after-sbom:     after.json
    output:         sarif
    comment-on-pr:  false

- uses: github/codeql-action/upload-sarif@v3
  with: { sarif_file: bomdrift.sarif }

The output: sarif produces SARIF v2.1.0 with stable rule IDs (see Output formats).

Comment-driven suppression bridges (other forges)

The comment-suppress companion sub-action is GitHub-only because it relies on the issue_comment workflow event. For GitLab, Bitbucket Cloud, and Azure DevOps, bomdrift ships parallel Cloudflare Worker bridges that listen on each forge’s webhook, validate the trigger, and dispatch the equivalent bomdrift baseline add --from-comment "<body>" run on the underlying CI.

Each bridge enforces the same five guards: webhook secret / HMAC verification, event-type filter, repo / project allowlist, commenter-permission check, and a PR-context guard. The /bomdrift suppress <ID> [reason: …] grammar is identical across all four SCMs and shares a single regex (scripts/parse-suppress-comment.sh) so behavior cannot drift. See the per-forge chapters GitLab CI · Bitbucket · Azure DevOps for setup.
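
For a feel of the /bomdrift suppress <ID> [reason: …] grammar, here is a sketch of a parser. The regular expression is an assumption written for this example and is not the one in scripts/parse-suppress-comment.sh; in particular, the set of characters allowed in an ID is a guess.

```python
import re

# Hypothetical pattern: the command must start a line, the ID is one token,
# and an optional "reason:" clause runs to the end of that line.
SUPPRESS_RE = re.compile(
    r"^/bomdrift[ \t]+suppress[ \t]+"
    r"(?P<id>[A-Za-z0-9][A-Za-z0-9._:-]*)"
    r"(?:[ \t]+reason:[ \t]*(?P<reason>.+?))?[ \t]*$",
    re.MULTILINE,
)


def parse_suppress(comment):
    """Return (advisory_id, reason_or_None) for the first command, else None."""
    match = SUPPRESS_RE.search(comment)
    if match is None:
        return None
    return (match["id"], match["reason"])
```

Anchoring the command to the start of a line means a quoted or mid-sentence mention of the command does not trigger a suppression.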

Action permissions

pull-requests: write is required when comment-on-pr: true (the default). Without it, the comment-upsert step fails with a 403; the action’s exit code remains the bomdrift exit (so a fail-on or budget trip still fails the workflow correctly).

contents: read is required so the action’s internal actions/checkout steps (zero-config flow) can fetch both refs. In the bring-your-own-SBOMs flow it’s still required by whichever step generates the SBOMs upstream.

What the action does (v0.5+)

When the zero-config flow runs (no explicit before-sbom / after-sbom):

  1. Two sibling checkouts of before-ref and after-ref into ${{ github.workspace }}/__bomdrift_before and __bomdrift_after. Both with fetch-depth: 1 and persist-credentials: false. Skipped for whichever side has a pre-supplied SBOM path.
  2. Syft installed via anchore/sbom-action/download-syft@v0. Cached across job runs in the runner’s tool cache.
  3. syft scan dir:... against each checkout’s ${path} subtree, producing CycloneDX-JSON into a tempfile under $RUNNER_TEMP. The bomdrift parser drops Ecosystem::Other("file") pseudo-components that Syft’s directory cataloger emits — set --include-file-components (CLI) or pass a pre-generated SBOM via before-sbom / after-sbom to bypass.
  4. bomdrift diff runs as in the v0.4 flow, and the upsert + step summary plumbing is unchanged.

The new behavior costs about 30 MB of one-time tool cache and 3–5s of cold-cache wall time per first invocation. Subsequent runs in the same job (or in repos that share the runner’s tool cache) reuse Syft.

Monorepo setup

When a single repo owns N services with independent dependency trees (services/api, services/worker, apps/web, …), running one bomdrift job per service gives each PR a focused, per-service comment without merging unrelated diff churn into a single 65k-char wall.

Pattern A — path: per matrix entry

The simplest setup uses a job matrix and the action’s path input:

on: pull_request
permissions:
  contents: read
  pull-requests: write
jobs:
  diff:
    strategy:
      fail-fast: false
      matrix:
        service: [api, worker, web]
    runs-on: ubuntu-latest
    steps:
      - uses: Metbcy/bomdrift@v1
        with:
          path: services/${{ matrix.service }}
          fail-on: critical-cve

Each matrix leg posts (or upserts) its own PR comment, distinguished by the rendered title (e.g. “SBOM diff — services/api”). The <!-- bomdrift:diff --> upsert marker is namespaced internally by path:, so leg N’s comment doesn’t clobber leg N-1’s.

fail-fast: false is recommended: a vulnerability in worker shouldn’t hide an emergent api finding from the same PR.

Pattern B — share a baseline across services

Most monorepos do want one shared exception list (the same false positive will show up in any service that depends on the same package). Point each leg at the same file:

- uses: Metbcy/bomdrift@v1
  with:
    path: services/${{ matrix.service }}
    baseline: .bomdrift/baseline.json

The baseline file is keyed by (purl_with_version, advisory_id) — see Match keys — so a suppression for pkg:npm/colour-print@2.1.0 covers every service that pulls in that exact version. New versions still surface (intentional; that’s the point of the version-pinned key).

When services pin different versions of the same dep, you’ll get per-version baseline entries. That’s working-as-intended — a known-fine finding at v1.0.0 should still get a fresh review at v1.1.0.
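
The version-pinned match key can be illustrated with a short sketch. The entry and finding field names (purl, advisory_id) are assumptions for this example rather than the baseline file's actual schema, and the advisory ID and second version are placeholders.

```python
def apply_baseline(findings, baseline):
    """Split findings into (active, suppressed) using (purl, advisory_id) keys."""
    keys = {(entry["purl"], entry["advisory_id"]) for entry in baseline}
    active, suppressed = [], []
    for finding in findings:
        bucket = suppressed if (finding["purl"], finding["advisory_id"]) in keys else active
        bucket.append(finding)
    return active, suppressed


# Placeholder data: one suppression for colour-print@2.1.0.
baseline = [{"purl": "pkg:npm/colour-print@2.1.0", "advisory_id": "GHSA-xxxx-yyyy-zzzz"}]
findings = [
    {"purl": "pkg:npm/colour-print@2.1.0", "advisory_id": "GHSA-xxxx-yyyy-zzzz"},
    {"purl": "pkg:npm/colour-print@2.2.0", "advisory_id": "GHSA-xxxx-yyyy-zzzz"},
]
active, suppressed = apply_baseline(findings, baseline)
```

Here the 2.1.0 finding is suppressed in every service that shares the baseline, while the same advisory at 2.2.0 stays active and gets a fresh review.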

Pattern C — per-service .bomdrift.toml

When the policy itself differs (worker has a stricter fail-on, docs-site has a generous max-added), drop a .bomdrift.toml per service:

- uses: Metbcy/bomdrift@v1
  with:
    path:   services/${{ matrix.service }}
    config: services/${{ matrix.service }}/.bomdrift.toml

The auto-discovery only checks the repo root, so an explicit config: is required for nested files.

What to scope per service vs. globally

| Setting | Scope | Why |
| --- | --- | --- |
| fail-on, max-* budgets | Per-service | Worker’s risk surface ≠ web’s |
| baseline | Shared | Same false positives across services |
| comment-on-pr, output | Per-service | Diff-only legs vs. PR-comment legs |
| verify-signatures | Global | Runner-image property, not service property |

Action-broke troubleshooting checklist

When a previously-working bomdrift action job starts failing — typically right after a merge to your default branch, a token rotation, or a runner-image upgrade — work through these in order. Each row is one symptom, one fix so you can grep your job log for the symptom and land on the recipe.

| Symptom (in the job log) | Likely cause | Fix |
| --- | --- | --- |
| 403 Resource not accessible by integration on the comment-upsert step | pull-requests: write permission missing on the workflow / job | Add permissions: { pull-requests: write, contents: read } at the workflow or job level. PR comments need pull-requests: write; the action’s internal checkouts need contents: read. |
| Forks cannot post PR comments warning, exit 0 | PR is from a fork; default GITHUB_TOKEN on pull_request events is read-only | Switch the trigger to pull_request_target (and harden; see GitHub’s guidance), or accept that fork PRs only get the workflow step summary, not a PR comment. |
| Could not find SBOM at services/api after a green earlier run | Default branch protection bumped the merge-base; before-ref now points at a commit that predates the services/api directory | Either move the path: value to match the new layout, or pin before-ref explicitly to a known-good commit (before-ref: main). |
| cosign: signature verification failed after a release-archive rotation | Cached release archive in the runner’s tool cache is stale and predates a rotation | Bump to the latest patch tag (e.g. Metbcy/bomdrift@v1 re-resolves to the floating tag), or set verify-signatures: false on a self-hosted runner you’ve pinned manually. |
| path: services/api warning + empty SBOM | The path doesn’t exist post-checkout: a typo, or the directory was renamed in before-ref only | bomdrift v0.7+ surfaces an actionable error pointing at this exact case. See the monorepo section for the matrix recipe; double-check ${{ matrix.service }} substitution. |
| “Comment exceeds 65,536 characters” 422 from GitHub | A massive diff blew past the size cap; the v0.3 fallback to --summary-only was disabled (comment-size-limit: 0) | Re-enable the fallback (drop comment-size-limit to use the default, or set it to 60000). The full body is preserved in the workflow step summary. |
| Action runs, no PR comment appears, exit 0 | Workflow event isn’t pull_request (the comment path is gated on PR events), or comment-on-pr: false was set explicitly | For push/schedule events, the comment path is intentionally skipped; use the step summary or upload the markdown as an artifact. |

If you hit a failure mode not in the table above, please open an issue with the failing job log — the troubleshooting table grows from real reports.

GitLab CI

bomdrift v0.7+ ships first-class GitLab support via a documented .gitlab-ci.yml template plus a --platform gitlab CLI flag that swaps the rendered footer to the GitLab MR-note shape. The template lives in examples/gitlab-ci/; this chapter walks through the moving parts.

Why a template instead of a custom action

GitLab CI doesn’t have a “marketplace action” model; the unit of reusability is a YAML snippet. A composite GitHub-Action-style binary would still need a YAML wrapper, so v0.7 ships the YAML directly. You can include: it from a shared CI repo if you run bomdrift across many projects:

include:
  - project: 'platform/ci-templates'
    file:    '/bomdrift/diff.gitlab-ci.yml'
    ref:     main

Quickstart (zero-config, v0.7+)

On an MR pipeline, the template defaults to comparing the merge-base SHA against the MR head SHA — no manual SBOM wiring needed:

  1. Copy examples/gitlab-ci/.gitlab-ci.yml to your project root.
  2. Add BOMDRIFT_API_TOKEN as a masked CI/CD variable. The token must be a Project Access Token with the api scope; CI_JOB_TOKEN doesn’t work (it’s read-only on most instances).
  3. Push an MR. The bomdrift:diff job runs Syft on both refs, renders the markdown diff, and posts/upserts an MR note marked <!-- bomdrift:diff -->.

That’s it. No .bomdrift.toml required for the default flow; add one only when you want a repo-pinned policy.

What the job does

Step-by-step (matches the bash <<'BOMDRIFT' block in the template):

  1. Detects arch (x86_64 / aarch64) and downloads the matching bomdrift-${VERSION}-...musl.tar.gz from GitHub Releases.
  2. Optionally cosign-verifies the archive when cosign is on PATH and BOMDRIFT_VERIFY_SIGNATURES=true (default). Falls back to a warning when cosign isn’t installed; set BOMDRIFT_VERIFY_SIGNATURES=false to silence the warning on a runner image you’ve pinned manually.
  3. Installs Syft via the upstream install.sh.
  4. Creates two git worktrees — one at the merge-base SHA (CI_MERGE_REQUEST_DIFF_BASE_SHA), one at the MR head (CI_COMMIT_SHA). Worktrees share the active checkout’s .git, so this is cheap.
  5. Generates CycloneDX-JSON SBOMs for both worktrees with syft scan dir:....
  6. Runs bomdrift diff with --platform gitlab, which renders the GitLab-shaped footer (/-/issues/new?... plus bomdrift baseline add hint instead of the GitHub /bomdrift suppress comment-driven flow).
  7. Posts/upserts the MR note via the GitLab REST API — finds the existing note by the <!-- bomdrift:diff --> marker and PATCHes it, otherwise POSTs a new one.

The full markdown body is also kept as a job artifact (diff.md) with a 7-day retention so reviewers can recover it after the MR merges.

Tokens & permissions

| Token | Scope | Used for |
| --- | --- | --- |
| BOMDRIFT_API_TOKEN | api | Posting / updating MR notes |
| BOMDRIFT_PUSH_TOKEN (optional) | api + write_repository | Suppression job’s commit-back-to-MR-branch step |

Splitting the two tokens means the diff path keeps working even if the suppression token is rotated, and you can give the diff token a narrower blast radius. Mark both as Masked and as Protected when your default branch is the only place suppression commits should land.

CI_JOB_TOKEN is intentionally not used for the comment path: on most GitLab instances its scope is read-only, and even where it can post comments the surface area is wider than what bomdrift needs.

CLI auto-detection

bomdrift diff auto-detects GitLab CI from the environment:

  • GITLAB_CI=true → flips --platform to gitlab (unless overridden).
  • CI_PROJECT_URL → used as repo_url (footer link target) when --repo-url and BOMDRIFT_REPO_URL are both unset.

Explicit flags always win; the env detection only fills in unset values. To force GitHub-shape output from a GitLab runner (rare — mostly useful when cross-posting to a mirror), pass --platform github explicitly.
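
The precedence described above (explicit flag, then environment, then default) can be sketched in a few lines. The function names are hypothetical, and the "github" fallback when nothing is detected is an assumption of this sketch, since the text does not state the default platform.

```python
def resolve_platform(flag, env):
    """An explicit --platform wins; otherwise GITLAB_CI=true selects gitlab."""
    if flag:
        return flag
    # Assumption: fall back to "github" when no CI environment is detected.
    return "gitlab" if env.get("GITLAB_CI") == "true" else "github"


def resolve_repo_url(flag, env):
    """--repo-url, then BOMDRIFT_REPO_URL, then GitLab's CI_PROJECT_URL."""
    return flag or env.get("BOMDRIFT_REPO_URL") or env.get("CI_PROJECT_URL")
```

Passing the environment in as a plain mapping keeps the resolution logic easy to test without touching the real process environment.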

Suppressions

For v0.7, GitLab suppressions are manual or job-driven, not comment-driven. Two paths:

Path 1 — CLI

The same bomdrift baseline add <ID> command works in any GitLab job or local shell:

bomdrift baseline add GHSA-xxxx-yyyy-zzzz
bomdrift baseline add CVE-2026-12345 --path custom/baseline.json

Commit .bomdrift/baseline.json to your MR branch and the next bomdrift:diff run sees the finding as suppressed. See Baseline & suppression for match-key semantics and the worked false-positive example.

Path 2 — manual GitLab job

Copy examples/gitlab-ci/suppress.gitlab-ci.yml to your project (or merge its job into your main .gitlab-ci.yml). The job is when: manual — invisible until a reviewer triggers it from the MR’s pipeline view with a BOMDRIFT_SUPPRESS_ID variable. On trigger it runs bomdrift baseline add and pushes the result back to the MR branch using BOMDRIFT_PUSH_TOKEN.

Comment-driven suppression on GitLab (v0.9+)

In-comment /bomdrift suppress <ID> is supported on GitLab as of v0.9 via the Cloudflare Worker bridge. GitLab’s note webhook fires on every comment on every MR with no command-prefix filter, so the bridge enforces five guards (webhook secret, event-type filter, repo allowlist, commenter-permission check, PR-context guard) before invoking bomdrift baseline add --from-comment "<body>" against the underlying CI. The grammar is identical to the GitHub comment-suppress sub-action; both share the scripts/parse-suppress-comment.sh regex so behavior cannot drift.
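
The five guards can be pictured as a chain of early returns over an incoming GitLab note webhook. The sketch below is not the bridge's code: the function name, the return labels, and the has_write_access callback are assumptions, while the X-Gitlab-Token header and the object_kind, project.path_with_namespace, user.username, and object_attributes.noteable_type payload fields follow GitLab's webhook format.

```python
import hmac


def accept_note_hook(headers, payload, secret, allowed_projects, has_write_access):
    """Return "ok" when every guard passes, else a label naming the failed guard."""
    # 1. Webhook secret, compared in constant time.
    token = headers.get("X-Gitlab-Token", "")
    if not hmac.compare_digest(token.encode(), secret.encode()):
        return "bad-secret"
    # 2. Event-type filter: only comment (note) events are relevant.
    if payload.get("object_kind") != "note":
        return "wrong-event"
    # 3. Repo / project allowlist.
    project = payload.get("project", {}).get("path_with_namespace")
    if project not in allowed_projects:
        return "project-not-allowed"
    # 4. Commenter-permission check, delegated to a caller-supplied lookup.
    if not has_write_access(payload.get("user", {}).get("username", "")):
        return "commenter-not-permitted"
    # 5. PR-context guard: the note must sit on a merge request.
    if payload.get("object_attributes", {}).get("noteable_type") != "MergeRequest":
        return "not-a-merge-request"
    return "ok"
```

Only a request that returns "ok" would go on to have its comment body parsed for a suppress command; every other outcome is dropped before any CI run is dispatched.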

Self-Managed GitLab

The template uses CI_API_V4_URL (auto-populated on every job) instead of hardcoding gitlab.com/api/v4, so it works against Self-Managed instances unchanged. Two things to watch:

  • Outbound reachability. The job downloads the bomdrift archive from GitHub Releases and Syft from the upstream install script. If your runners can’t reach those, mirror them to your internal Nexus / Artifactory and override the BOMDRIFT_RELEASE_BASE_URL variable shown in the example README.
  • Cosign + Sigstore. Keyless verification needs OIDC connectivity to oauth2.sigstore.dev. On air-gapped runners, set BOMDRIFT_VERIFY_SIGNATURES=false — bomdrift fails loudly rather than silently skipping when the env var is absent and cosign isn’t reachable, so the explicit opt-out is the right escape hatch.

Troubleshooting

See the examples README troubleshooting table for the most common failure modes (token scoping, signature verification on locked-down runners, push-back-to-protected-branch permissions).

What’s the same vs. the GitHub Action

FeatureGitHub ActionGitLab template
Zero-config flow
Syft auto-install
MR/PR comment upsert
--summary-only size fallback✅ (65k cap)n/a (1MB cap is rarely hit)
Cosign verification of release archive
Per-service monorepo support✅ matrix✅ matrix (parallel keyword)
In-comment suppressionv0.8
Manual suppression jobn/a
<!-- bomdrift:diff --> marker✅ (same shape — cross-platform tooling can grep one shape)

Comment-driven suppression (advanced)

Trade-off up front. Comment-driven suppression turns a reviewer comment like /bomdrift suppress GHSA-... into an automatic baseline edit. To wire it up safely you need to operate a small public webhook handler. The manual suppression job documented above is supported and lower-risk; reach for the bridge only when the zero-click UX is worth running a service.

The GitHub flow ships out-of-the-box (comment-suppress sub-action fronted by the existing webhook). GitLab requires a webhook handler because GitLab’s Note Hook doesn’t include a command-prefix filter.

Bridge

examples/gitlab-ci/comment-bridge/ ships a Cloudflare Worker reference implementation that enforces five security guards:

  1. Webhook secret verification (constant-time X-Gitlab-Token).
  2. Event-type filter (Note Hook only).
  3. Project-ID allowlist.
  4. Commenter access_level >= 30 (Developer+ on the project).
  5. MR-context guard (rejects fork-MR comment exfiltration).

When the guards pass, the worker triggers the GitLab pipeline with BOMDRIFT_NOTE_BODY set to the raw comment body. The bomdrift:suppress job in suppress.gitlab-ci.yml then runs bomdrift baseline add --from-comment "$BOMDRIFT_NOTE_BODY" to extract the directive and update .bomdrift/baseline.json.
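The five guards can be read as one short-circuiting check. The following is a minimal Python sketch of that ordering, not the shipped Worker code: the function name `should_trigger` is hypothetical, the `merge_request.source_project_id` / `target_project_id` comparison is an assumed way to detect fork MRs, and the commenter's `access_level` is assumed to have been looked up by the caller.

```python
import hmac

DEVELOPER = 30  # GitLab access_level value for the Developer role


def should_trigger(headers, payload, access_level, *, secret, allowed_project_ids):
    """Return (ok, reason); ok is True only when all five guards pass."""
    # 1. Webhook secret, compared in constant time.
    token = headers.get("X-Gitlab-Token", "")
    if not hmac.compare_digest(token.encode(), secret.encode()):
        return False, "bad webhook secret"
    # 2. Event-type filter: Note Hook only.
    if headers.get("X-Gitlab-Event") != "Note Hook":
        return False, "not a Note Hook event"
    # 3. Project-ID allowlist.
    if payload.get("project", {}).get("id") not in allowed_project_ids:
        return False, "project not allowlisted"
    # 4. Commenter must be Developer or above (level supplied by the caller).
    if access_level < DEVELOPER:
        return False, "commenter lacks Developer access"
    # 5. MR-context guard (assumed shape): the note must sit on a same-project MR.
    mr = payload.get("merge_request") or {}
    source, target = mr.get("source_project_id"), mr.get("target_project_id")
    if source is None or source != target:
        return False, "not a same-project merge request"
    return True, "ok"
```

Ordering the cheap, unauthenticated checks first means a request with a bad secret never reaches the permission lookup.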

The threat model is documented in examples/gitlab-ci/comment-bridge/README.md. The same logic ports to Vercel / Netlify / AWS Lambda — see vercel-equivalent.md.

How notes are upserted

bomdrift posts the diff as a single MR note, not as a Discussion. The lifecycle is:

  • First run: POST /projects/:id/merge_requests/:iid/notes creates the note. The response carries an integer id, but bomdrift does not store it; subsequent runs re-find the note via the <!-- bomdrift:diff --> marker.
  • Subsequent runs: PUT /projects/:id/merge_requests/:iid/notes/:note_id modifies the existing note’s body in place.
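The create-or-update decision can be sketched in a few lines of Python. This is an illustration of the lifecycle above rather than bomdrift's own code: `plan_upsert` is a hypothetical helper, and it assumes the caller has already fetched the MR's notes as a decoded JSON list (for example from GET /projects/:id/merge_requests/:iid/notes).

```python
MARKER = "<!-- bomdrift:diff -->"


def plan_upsert(project_id, mr_iid, existing_notes):
    """Pick the HTTP method and path for posting the rendered diff.

    existing_notes is a list of note dicts, each with "id" and "body" keys.
    """
    base = f"/projects/{project_id}/merge_requests/{mr_iid}/notes"
    for note in existing_notes:
        if MARKER in note.get("body", ""):
            # A previous run already posted: edit that note in place.
            return "PUT", f"{base}/{note['id']}"
    # No marked note yet: this is the first run on the MR.
    return "POST", base
```

Because the marker is the only lookup key, no state has to survive between pipeline runs.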

Concretely the upsert:

  • Modifies the note body in place. The note ID is stable across pipeline runs, so any permalink to the note (right-click → Copy link on the timestamp) keeps working for the lifetime of the MR.
  • Does not regenerate the note. GitLab does not delete-and-recreate on PUT; the comment’s position in the MR timeline does not move.
  • Does not re-fire Note Hook webhooks for unchanged content. GitLab fires Note Hook on note creation but not on body-only edits, so a comment-bridge wired to Note Hook will not loop on bomdrift’s own upserts. (The bridge’s event-type filter is a defence-in-depth here, not the primary guard.)
  • Does not affect threaded replies. GitLab’s data model puts notes and replies under a parent discussion; replies attached to bomdrift’s note (e.g. a reviewer typing “ack — accepting this”) remain attached to the same discussion thread regardless of how many times bomdrift edits the parent body. This matches the GitHub-side behaviour where reviewer threaded replies under the bot comment survive each upsert.

bomdrift deliberately uses the Notes API, not the Discussions API, for the diff template. The Discussions API creates a thread root that is awkward to update (you’d be editing the first note of a discussion, with subtly different permission semantics), and the diff comment isn’t trying to start a structured conversation — it’s a single living status comment that reviewers may reply to. Other reviewers can still reply to the bot’s note and GitLab will create a discussion implicitly around their reply; bomdrift just doesn’t seed the discussion itself.

Author and signing

The note’s author is whatever identity owns BOMDRIFT_API_TOKEN (typically a Project Access Token, which surfaces as a bot user on the project). On every PUT, GitLab updates the note’s updated_at and last_edited_by_id fields to point at that same bot identity — not the original MR author. This is expected and matches the GitHub equivalent’s behaviour with a bot token: edits show up under the bot’s identity, while the original commit/MR authorship is untouched. If your review process audits comment-edit history (unusual but legitimate on regulated projects), give the token a descriptive name (e.g. bomdrift-ci-bot) so the audit trail reads clearly.

Cloudflare Workers — the reference. The free tier covers most webhook traffic. wrangler tail makes live debugging easy. Vercel / Netlify Edge Functions are equally good if your team already operates on those platforms.

Bitbucket Pipelines

bomdrift runs in Bitbucket Cloud Pipelines and posts a single upserted PR comment per pull request, mirroring the GitHub Action and GitLab template flow.

Quickstart

Copy examples/bitbucket-pipelines/bitbucket-pipelines.yml to your repo root and add a Repository Variable named BOMDRIFT_API_TOKEN containing a Bitbucket App Password with the pullrequest:write scope.

What the job does

  1. Installs Syft and bomdrift in a rust:1.88 container.
  2. Generates a CycloneDX SBOM for the PR target branch and the PR head via syft dir:.
  3. Renders the diff to markdown with bomdrift diff --platform bitbucket.
  4. Looks up the existing bomdrift comment on the PR (by the <!-- bomdrift:diff --> marker) and either creates a new comment or updates the existing one.

Tokens & permissions

| Variable | Scope | Why |
|---|---|---|
| BOMDRIFT_API_TOKEN | App Password, pullrequest:write | Posting / updating PR comments. |

The job never auto-pushes to your branch. Suppression is the manual bomdrift baseline add flow plus a commit on your branch.

CLI auto-detection

Setting BITBUCKET_BUILD_NUMBER in the environment auto-selects --platform bitbucket when the flag is omitted. The Pipelines runner sets this variable on every build.

BITBUCKET_GIT_HTTP_ORIGIN is honored as a --repo-url fallback, so the markdown footer’s “Report this finding” link works without plumbing.

Suppressions

The supported, no-infrastructure-required flow is the manual baseline edit:

bomdrift baseline add GHSA-... --reason "audit complete (PR #42)"
git add .bomdrift/baseline.json
git commit -m "baseline: suppress GHSA-..."

Comment-driven suppression (advanced, v0.9.5+)

Trade-off up front. Comment-driven suppression turns a reviewer comment like /bomdrift suppress GHSA-... into an automatic baseline edit. To wire it up safely you need to operate a small public webhook handler. The manual flow above is supported and lower-risk; reach for the bridge only when the zero-click UX is worth running a service.

examples/bitbucket-pipelines/comment-bridge/ ships a Cloudflare Worker reference implementation that enforces five security guards:

  1. Webhook HMAC verification (X-Hub-Signature: sha256=… against the byte-exact request body).
  2. Event-type filter (pullrequest:comment_created only).
  3. Repo-full-name allowlist.
  4. Commenter-permission lookup (write / admin / owner only).
  5. PR-context guard (rejects fork-PR comment-suppress).

When the guards pass, the worker triggers the bomdrift-comment-suppress custom pipeline (defined in the example bitbucket-pipelines.yml) with BOMDRIFT_NOTE_BODY set to the raw comment body. The pipeline runs bomdrift baseline add --from-comment "$BOMDRIFT_NOTE_BODY" and pushes the resulting baseline edit back to the PR’s source branch.
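The first guard, HMAC verification against the byte-exact body, is the easiest one to get subtly wrong (for example by re-serializing the JSON before hashing). A minimal Python sketch of the check, assuming the header value has the `sha256=<hex>` form quoted above; `signature_is_valid` is a hypothetical name and this is not the Worker's actual code:

```python
import hashlib
import hmac


def signature_is_valid(secret: bytes, raw_body: bytes, header_value: str) -> bool:
    """Check an 'sha256=<hex>' signature against the byte-exact request body."""
    scheme, _, received = header_value.partition("=")
    if scheme != "sha256" or not received:
        return False
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), received.lower().encode())
```

The body must be hashed exactly as received on the wire; parsing and re-encoding it first can change whitespace or key order and invalidate an otherwise correct signature.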

The full threat model and deployment guide live in examples/bitbucket-pipelines/comment-bridge/README.md. The same logic ports to Vercel / Netlify / AWS Lambda — see vercel-equivalent.md.

Troubleshooting

See examples/bitbucket-pipelines/README.md.

Azure DevOps Pipelines

bomdrift runs in Azure Pipelines and posts a single upserted PR thread per pull request.

Quickstart

Copy examples/azure-devops/azure-pipelines.yml to your repo root and add a secret pipeline variable named BOMDRIFT_API_TOKEN containing a PAT with the Code (Read & Write) scope.

What the job does

  1. Installs Rust + bomdrift + Syft on the ubuntu-latest agent.
  2. Generates a CycloneDX SBOM for the PR target branch and the PR head.
  3. Renders the diff to markdown with bomdrift diff --platform azure-devops.
  4. Looks up the existing bomdrift PR thread (by the <!-- bomdrift:diff --> marker) and either creates a new thread or updates the existing comment.

Tokens & permissions

| Variable | Scope | Why |
|---|---|---|
| BOMDRIFT_API_TOKEN | PAT, Code (Read & Write) | Creating / updating PR threads. |

The default System.AccessToken is not used because most organizations don’t grant it permission to create PR threads.

CLI auto-detection

Setting TF_BUILD=true (Azure Pipelines sets this on every job) auto-selects --platform azure-devops when the flag is omitted.

BUILD_REPOSITORY_URI is honored as a --repo-url fallback. Note that this variable is empty for some local debug runs; passing --repo-url explicitly is fine.

Suppressions

The supported, no-infrastructure-required flow is the manual baseline edit: run bomdrift baseline add locally and commit the result to your PR branch.

Comment-driven suppression (advanced, v0.9.5+)

Trade-off up front. Comment-driven suppression turns a reviewer comment like /bomdrift suppress GHSA-... into an automatic baseline edit. To wire it up safely you need to operate a small public webhook handler. The manual flow above is supported and lower-risk; reach for the bridge only when the zero-click UX is worth running a service.

examples/azure-devops/comment-bridge/ ships a Cloudflare Worker reference implementation that enforces five security guards:

  1. Webhook secret verification (X-Bomdrift-Bridge-Secret custom header, constant-time compare).
  2. Event-type filter (ms.vss-code.git-pullrequest-comment-event only).
  3. Project-UUID allowlist.
  4. Commenter-permission lookup (Contributors team membership).
  5. PR-context guard (active PR targeting the protected main branch).

When the guards pass, the worker POSTs to /_apis/pipelines/{id}/runs with BOMDRIFT_NOTE_BODY as a template parameter. The example azure-pipelines.yml defines a conditional bomdrift_suppress stage gated on that parameter; it runs bomdrift baseline add --from-comment "$BOMDRIFT_NOTE_BODY" and pushes the resulting baseline edit back to the PR’s source branch. Normal PR-build runs leave the parameter empty so the suppress stage is skipped.

The full threat model and deployment guide live in examples/azure-devops/comment-bridge/README.md. The same logic ports to Vercel / Netlify / AWS Lambda — see vercel-equivalent.md.

Troubleshooting

See examples/azure-devops/README.md.

CLI reference

This page documents every bomdrift subcommand and flag. The authoritative help text is bomdrift --help / bomdrift <subcommand> --help; this page groups the same information by behavior so it’s easier to look up. Each flag carries an introduced-in annotation so future readers can reason about which version a feature first shipped in.

Subcommands

bomdrift diff <BEFORE> <AFTER> [OPTIONS]
bomdrift init [--config-only] [--force]
bomdrift baseline add [<ID>] [--path <PATH>] [--expires <YYYY-MM-DD>] [--reason <TEXT>] [--from-comment <BODY>]
bomdrift refresh-typosquat [--ecosystem <ECOSYSTEM>]

| Subcommand | Purpose |
|---|---|
| diff | Diff two SBOMs and emit findings. The everything-flag. |
| init | Scaffold .bomdrift.toml + GitHub workflows. |
| baseline add | Append an advisory ID to a baseline file. |
| refresh-typosquat | Re-fetch the bundled top-package lists. |

bomdrift diff

Diff two SBOMs and surface supply-chain risk signals on changed components.

Positional arguments

  • <BEFORE> — path to the “before” SBOM (CycloneDX 1.5/1.6, SPDX 2.3, or Syft JSON). Optional when --before-attestation is set instead.
  • <AFTER> — path to the “after” SBOM. Optional when --after-attestation is set instead.

Output formats

--output <FORMAT>

Introduced in v0.1.

Output format. One of:

  • terminal — ANSI-colored tree-style output. Default when stdout is a TTY.
  • markdown — GitHub-Flavored Markdown ready for PR-comment posting. Default when stdout is piped/redirected.
  • json — pretty-printed {"changes": ..., "enrichment": ...} graph.
  • sarif — SARIF v2.1.0 for GitHub Code Scanning ingestion.

Config key: [diff] output.

--output-file <PATH>

Introduced in v0.8.

Write the chosen --output format to this path instead of stdout. Useful for SARIF (--output sarif --output-file bomdrift.sarif), where quoting a > redirection inside YAML is fragile in CI templates.

--format <FORMAT>

Introduced in v0.1.

Force input format detection. auto (default) / cdx / spdx / syft. auto looks at the JSON top-level fields to dispatch.

--summary-only

Introduced in v0.3. Markdown-only.

Emits just the summary table + a footer pointing at the full output. Used by the action’s comment-size fallback when the full diff exceeds GitHub’s 65,536-char comment-body cap.

--findings-only

Introduced in v0.6. Markdown-only.

Keeps the summary table and risk-bearing sections (vulnerabilities, typosquats, version jumps, young maintainers, license changes, registry-metadata findings) but omits raw Added / Removed / Version changed detail tables. Useful when a PR intentionally updates a large lockfile and reviewers only want the actionable findings inline.

--include-file-components

Introduced in v0.5.

Keep Ecosystem::Other("file") pseudo-components emitted by Syft’s directory cataloger. Off by default — those produce phantom Added/Removed pairs that drown real package changes.

Repo policy config

--config <PATH>

Introduced in v0.6.

Load defaults from a .bomdrift.toml policy file. When omitted, an existing .bomdrift.toml in the current working directory is loaded automatically; missing default config is ignored. An explicit --config path must exist and parse.

CLI flags override config values for one-off runs.

Example .bomdrift.toml:

[diff]
fail_on = "critical-cve"
baseline = ".bomdrift/baseline.json"
findings_only = true
max_added = 25
max_version_changed = 10

# Calibration knobs (v0.9.6+)
typosquat_similarity_threshold = 0.92
young_maintainer_days = 90
cache_ttl_hours = 24

[license]
allow = ["Apache-2.0", "MIT", "BSD-*"]
deny = ["GPL-*", "AGPL-*"]
allow_exceptions = ["LLVM-exception", "Classpath-exception-2.0"]

Suppression

--baseline <PATH>

Introduced in v0.5.

Path to a JSON snapshot whose findings should be suppressed from this run’s output (and from the --fail-on trip evaluation). Match keys are conservative — a finding at a different version than baseline still surfaces. See Baseline & suppression for the schema and match-key semantics.

--vex <PATH>

Introduced in v0.9. Repeatable.

Path(s) to VEX (Vulnerability Exploitability eXchange) files to consume. Each file is auto-detected as either OpenVEX 0.2.0 or CycloneDX VEX 1.6. Statements with status not_affected / fixed suppress matching findings; under_investigation annotates without suppressing; affected annotates as a no-op badge. See VEX for the finding-id matching rules including the synthetic-id convention.

--emit-vex <PATH>

Introduced in v0.9.

Emit a single OpenVEX 0.2.0 doc covering every finding in the post-baseline diff. Baseline-suppressed entries inherit their vex_status from the baseline entry (defaulting to under_investigation); un-suppressed findings emit as affected.

--vex-author <STRING>

Introduced in v0.9.

VEX author for --emit-vex. Falls back to repo_url, then to "bomdrift".

--vex-default-justification <STRING>

Introduced in v0.9.

Default OpenVEX justification written into emitted statements when the source baseline entry doesn’t supply one. Defaults to "vulnerable_code_not_in_execute_path".

Enrichment toggles

Each of these disables one enricher entirely (no network, no cache lookup). All default to on.

| Flag | Disables | Introduced |
|---|---|---|
| --no-osv | OSV.dev CVE lookup | v0.1 |
| --no-osv-cache | The 24h on-disk OSV severity cache only; keeps OSV enabled | v0.3 |
| --no-maintainer-age | GitHub-REST maintainer-age enricher | v0.2 |
| --no-epss | FIRST.org EPSS enricher | v0.8 |
| --no-kev | CISA KEV enricher | v0.8 |
| --no-registry | Registry-metadata enrichers (npm/PyPI/crates.io) | v0.9 |

--recently-published-days <N>

Introduced in v0.9.

Recently-published threshold in days for the registry enricher (default 14). Set to 0 to disable that specific kind without disabling the other registry checks.

Calibration

bomdrift’s heuristic enrichers ship with conservative defaults that work for most repos. When the defaults aren’t right at scale, every threshold is tunable either through [diff] keys in .bomdrift.toml or via the matching CLI flag. CLI flags override config values for one-off runs.

--typosquat-similarity-threshold <FLOAT>

Introduced in v0.9.6.

Type: float in [0.0, 1.0]. Default: 0.92. Config key: [diff] typosquat_similarity_threshold.

Minimum normalized edit-distance similarity between a candidate package name and a top-list entry before bomdrift flags it as a possible typosquat. Higher values = stricter (fewer false positives, more false negatives). Lowering to 0.85 catches softer near-misses; raising to 0.95 only catches one- or two-character swaps on short names.

--young-maintainer-days <N>

Introduced in v0.9.6.

Type: positive integer (days). Default: 90. Config key: [diff] young_maintainer_days.

A package is flagged with the young-maintainer signal when its top contributor’s oldest commit is newer than this many days. The default is one quarter; raise to 180 to flag more maintainers in ecosystems that warrant stricter review, or lower to 30 to flag only the newest maintainers.

--cache-ttl-hours <N>

Introduced in v0.9.6.

Type: positive integer (hours). Default: 24. Config key: [diff] cache_ttl_hours.

Time-to-live for the OSV / EPSS / KEV / registry-metadata caches under <XDG_CACHE_HOME>/bomdrift/. The same TTL applies to all four caches (v0.9.6 unified the previously duplicated constants). Lower to 1 for fast-changing security feeds in long-running self-hosted runners; raise to 168 (one week) when running offline.

--multi-major-delta <N>

Introduced in v0.9.7.

Type: positive integer (>= 1). Default: 2. Config key: [diff] multi_major_delta.

Major-version delta at or above which the version-jump enricher classifies an upgrade as a multi-major jump. With the default of 2, an upgrade from 1.x to 2.x is a single-major bump (treated normally), while 1.x → 3.x (delta = 2) trips the multi-major signal. Lower to 1 to flag every cross-major bump as multi-major; raise to 3 or higher to quiet noisy ecosystems that release majors aggressively.

This flag closes the last hardcoded calibration threshold: pre-v0.9.7 the multi-major boundary lived as a const in the version-jump enricher. With the knob exposed, every gating decision in --debug-calibration output emits the active threshold rather than the const default — so debug rows for the version-jump kind are now portable across repos with different calibrations.
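The delta rule is small enough to state as code. A minimal Python sketch, assuming versions start with a numeric major component (optionally prefixed with `v`); `major_of` and `is_multi_major` are hypothetical names, and the real enricher may parse versions differently per ecosystem:

```python
def major_of(version: str) -> int:
    """Leading numeric component of a version string, e.g. 'v3.1.0' gives 3."""
    head = version.lstrip("vV").split(".", 1)[0]
    return int(head)


def is_multi_major(before: str, after: str, delta: int = 2) -> bool:
    """True when the upgrade crosses at least `delta` major versions."""
    return major_of(after) - major_of(before) >= delta
```

With the default delta, `is_multi_major("1.4.0", "2.0.0")` is false and `is_multi_major("1.4.0", "3.0.0")` is true; passing `delta=1` makes the first call true as well.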

License policy

--allow-licenses <LIST> / --deny-licenses <LIST>

Introduced in v0.8. Comma-separated, repeatable.

SPDX license identifiers (or *-suffix globs) permitted / forbidden by policy. Deny wins when a license matches both. CLI flag takes precedence over [license] allow / deny in .bomdrift.toml (override, not merge). v0.9 adds full SPDX expression evaluation via the spdx crate so compound expressions like (MIT OR GPL-3.0) evaluate correctly.
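For a single SPDX identifier, the glob matching and the deny-wins rule can be sketched as follows. This Python example is illustrative only: it does not evaluate compound expressions, `classify` is a hypothetical name, and the "unlisted" outcome for licenses that match neither list is an assumption about how such cases might be reported.

```python
from fnmatch import fnmatchcase


def classify(license_id, allow, deny):
    """Classify one SPDX identifier; a deny match always wins."""
    if any(fnmatchcase(license_id, pattern) for pattern in deny):
        return "denied"
    if any(fnmatchcase(license_id, pattern) for pattern in allow):
        return "allowed"
    return "unlisted"
```

Using the lists from the example .bomdrift.toml, `BSD-3-Clause` matches the `BSD-*` allow glob and `GPL-3.0-only` matches the `GPL-*` deny glob. Patterns match the whole identifier, so `GPL-*` does not catch `LGPL-3.0-only`.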

--allow-exception <LIST> / --deny-exception <LIST>

Introduced in v0.9.5. Comma-separated, repeatable.

SPDX exception identifiers (e.g. LLVM-exception, Classpath-exception-2.0) permitted / forbidden as the right-hand side of a WITH clause. When set, Apache-2.0 WITH <other> violates policy even if Apache-2.0 is on the base allow list. Empty lists preserve v0.9 behavior (exception treated as informational).

--allow-ambiguous-licenses

Introduced in v0.8.

When set, compound SPDX expressions like (MIT OR GPL-3.0) are permitted. Off by default — fail-closed.

See License policy for the full evaluation semantics.

Failure thresholds

--fail-on <THRESHOLD>

Introduced in v0.2; expanded across v0.4 / v0.8 / v0.9.

Exit with code 2 when findings of the configured threshold surface. One of:

  • none — never trips (default).
  • cve — trips on any CVE / GHSA / MAL advisory finding.
  • critical-cve — trips when at least one finding has severity >= High per the OSV-fetched severity. (Naming kept for back-compat — covers the HIGH-or-CRITICAL bucket; HIGH alone is the common actively-exploited case.)
  • typosquat — trips on any typosquat finding.
  • license-change — trips on same-version license changes.
  • kev — trips on any advisory in the CISA KEV catalog (v0.8+).
  • recently-published / deprecated — registry-metadata finding gates (v0.9+).
  • any — trips on any finding.

The PR-comment body is written to stdout before exit-2 — the action’s tee + PIPESTATUS wrapper relies on this so the comment posts even when the workflow step fails.

--fail-on-epss <FLOAT>

Introduced in v0.8.

Trip exit-2 when any advisory’s EPSS score is >= this threshold (0.0 to 1.0). Recommended starting point: 0.5 (top decile of actively-exploited CVEs).

Diff budgets

--max-added <N>, --max-removed <N>, and --max-version-changed <N> fail the run with exit code 2 when a diff exceeds the configured dependency-churn budget. Introduced in v0.4. The rendered body is still written before exit, just like --fail-on.

Forge integration

--platform <PLATFORM>

Introduced in v0.7; expanded in v0.9 (Bitbucket / Azure DevOps).

github (default), gitlab, bitbucket, or azure-devops. Drives the rendered markdown comment’s footer:

  • github: /issues/new?... URL shape, /bomdrift suppress <ID> comment-driven flow (requires the comment-suppress sub-action).
  • gitlab: /-/issues/new?issuable_template=false-positive URL shape; manual bomdrift baseline add <ID> flow or the optional Cloudflare Worker bridge for in-comment suppression. See GitLab CI.
  • bitbucket: /issues/new URL shape; comment-bridge in v0.9.5+. See Bitbucket.
  • azure-devops: /_workitems/create?templateName=false-positive URL shape; comment-bridge in v0.9.5+. See Azure DevOps.

When the flag is omitted, bomdrift auto-detects from CI environment variables in this order: GITLAB_CI=true → GitLab, BITBUCKET_BUILD_NUMBER → Bitbucket, TF_BUILD → Azure DevOps, otherwise GitHub. The explicit flag always wins. Also configurable via [diff] platform = "<value>" in .bomdrift.toml.
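The detection order reads naturally as a chain of checks. A minimal Python sketch of that precedence, assuming "set" means the variable is present in the environment; `detect_platform` is a hypothetical name, and the [diff] platform config key is left out because its position relative to the environment checks is not stated here:

```python
def detect_platform(env, flag=None):
    """Resolve the platform: explicit flag first, then CI environment, then GitHub."""
    if flag:
        return flag
    if env.get("GITLAB_CI") == "true":
        return "gitlab"
    if "BITBUCKET_BUILD_NUMBER" in env:
        return "bitbucket"
    if "TF_BUILD" in env:
        return "azure-devops"
    return "github"
```

Calling `detect_platform(os.environ)` from a GitLab runner would yield `gitlab`, while `detect_platform(os.environ, flag="github")` keeps the GitHub footer shape regardless of the runner.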

Set in lockstep with --repo-url (or BOMDRIFT_REPO_URL, or — on GitLab CI — CI_PROJECT_URL). Without a URL the footer is omitted entirely; the platform flag controls only the footer’s shape.

--repo-url <URL>

Introduced in v0.5.

Repository URL used to render the markdown comment’s action-affordance footer. Falls back to BOMDRIFT_REPO_URL env var.

Attestation (OCI-fetched SBOMs)

All flags in this section introduced in v0.9.6. See OCI attestation for end-to-end usage.

--before-attestation <REFERENCE>

Fetch the “before” SBOM from an attestation attached to an OCI artifact, verified with cosign verify-attestation, instead of reading a local file. Mutually exclusive with the positional before argument. Requires --cosign-identity and --cosign-issuer.

--after-attestation <REFERENCE>

Same, for the “after” side. Mutually exclusive with the positional after argument.

--cosign-identity <REGEX>

Regex passed to cosign verify-attestation --certificate-identity-regexp. Required when either --before-attestation or --after-attestation is set. Example: https://github.com/owner/.+.

--cosign-issuer <URL>

URL passed to cosign verify-attestation --certificate-oidc-issuer. Required alongside --cosign-identity. Example: https://token.actions.githubusercontent.com.

--require-attestation

Refuse to fall back to local-file SBOMs: both sides MUST come from a verified OCI attestation. Implies --before-attestation and --after-attestation are both set.

Plugins

--plugin <MANIFEST>

Introduced in v0.9.6. Repeatable.

Path to a plugin manifest TOML. Each plugin is an external executable invoked once per added / version-changed component with JSON over stdin/stdout. Plugin failures (timeout, non-zero exit, malformed JSON) drop their findings without failing the diff. See Plugins for the protocol reference and a worked example.

Diagnostics

--debug-calibration

Introduced in v0.7.

Off by default. When set, bomdrift diff writes one row to stderr per finding it considers, with the schema:

kind|key|score|threshold

kind is one of typosquat, version-jump, maintainer-age, cve, recently-published, deprecated, maintainer-set-changed. key is a stable identifier (the package purl, advisory ID, etc.). score and threshold are the numeric inputs to the gating decision.

The flag is purely diagnostic — it doesn’t change which findings get rendered. Pipe to a file:

bomdrift diff old.cdx.json new.cdx.json --debug-calibration 2> calibration.tsv

--debug-calibration-format <FORMAT>

Introduced in v0.8.

pipe (default, back-compat with v0.7) emits kind|key|score|threshold per line; jsonl emits one JSON object per line for downstream tooling that doesn’t want to maintain a custom CSV-ish parser.
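For the default pipe format, a parser only needs to split on the delimiter. A small Python sketch, assuming one row per line in the kind|key|score|threshold schema; `parse_row` is a hypothetical helper, and it splits from both ends so a stray `|` inside the key would not shift the numeric columns:

```python
def parse_row(line):
    """Parse one kind|key|score|threshold row from --debug-calibration output."""
    kind, rest = line.rstrip("\r\n").split("|", 1)
    key, score, threshold = rest.rsplit("|", 2)
    return {
        "kind": kind,
        "key": key,
        "score": float(score),
        "threshold": float(threshold),
    }
```

Comparing `score` against `threshold` across many rows shows how close findings sit to the gate, which is the information needed before changing a calibration knob.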

bomdrift init

Introduced in v0.6.

Scaffold a copy-paste adoption setup in the current repository:

bomdrift init

Writes:

  • .bomdrift.toml
  • .github/workflows/sbom-diff.yml
  • .github/workflows/bomdrift-suppress.yml

Flags:

  • --config-only — write only .bomdrift.toml.
  • --force — overwrite existing generated files. Without --force, existing files are preserved and the command fails loudly.

bomdrift baseline add

Introduced in v0.5; --expires/--reason v0.8; --from-comment v0.9.

Append an advisory ID to a baseline file’s suppressed_advisories list. The file is created if missing; existing fields are preserved. Idempotent (re-adding an existing ID is a no-op).

bomdrift baseline add GHSA-xxxx-yyyy-zzzz
bomdrift baseline add CVE-2026-12345 --path custom/baseline.json
bomdrift baseline add GHSA-evil-1234 \
    --expires 2026-12-31 \
    --reason "Awaiting upstream patch (issue #42)"

Flags

  • <ID> — advisory identifier (GHSA/CVE/MAL/OSV). Optional when --from-comment is supplied.
  • --path <PATH> — baseline file path. Default .bomdrift/baseline.json.
  • --expires <YYYY-MM-DD> — strict-format expiry; bomdrift refuses malformed dates (no silent never-expiring entries).
  • --reason <TEXT> — free-form rationale; surfaces in VEX exports + expiry warnings.
  • --from-comment <BODY> — parse a /bomdrift suppress <ID> [reason: ...] directive from a forge-issued comment body. Used by the GitLab / Bitbucket / Azure DevOps comment-bridge Workers. Exits non-zero on no-match so a webhook never silently no-ops.
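To make the --from-comment behaviour concrete, here is a Python sketch of a directive parser. It is not the regex from scripts/parse-suppress-comment.sh: the pattern below is an assumption that reads the square brackets in the grammar as "optional", and `parse_directive` is a hypothetical name.

```python
import re

# Assumed grammar: "/bomdrift suppress <ID>" on its own line,
# optionally followed by "reason: <free text>".
DIRECTIVE = re.compile(
    r"^/bomdrift[ \t]+suppress[ \t]+(?P<id>[A-Za-z0-9][A-Za-z0-9._-]*)"
    r"(?:[ \t]+reason:[ \t]*(?P<reason>.+?))?[ \t\r]*$",
    re.MULTILINE,
)


def parse_directive(body):
    """Return (advisory_id, reason_or_None), or None when no directive is present."""
    match = DIRECTIVE.search(body)
    if match is None:
        return None
    return match.group("id"), match.group("reason")
```

A caller that mirrors the documented behaviour would treat a `None` result as a failure and exit non-zero instead of silently doing nothing.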

bomdrift refresh-typosquat

Introduced in v0.4.

Refresh the bundled typosquat top-package lists from upstream sources.

bomdrift refresh-typosquat                     # all wired-up ecosystems
bomdrift refresh-typosquat --ecosystem pypi    # one specific list

--ecosystem <ECOSYSTEM>

Which ecosystem’s list to refresh. One of all (default), npm, pypi, cargo, nuget, maven, go, gem, composer. The first four fetch from canonical upstream feeds; the remaining four are curated data/<eco>-top*.txt snapshots and --ecosystem <name> for those emits a notice rather than fetching.

Refreshed lists are written to <XDG_CACHE_HOME>/bomdrift/typosquat/<eco>.txt via temp-file + atomic rename. The typosquat enricher prefers cache files over the embedded snapshot when present and parseable.
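The temp-file plus atomic-rename pattern is what keeps a concurrent reader from ever seeing a half-written list. A minimal Python sketch of the general technique (bomdrift itself is not written in Python; `write_atomically` is a hypothetical name):

```python
import os
import tempfile


def write_atomically(path, text):
    """Write text so readers see either the old file or the new one, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # The temp file lives in the destination directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # atomic on POSIX when both paths share a filesystem
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Creating the temp file next to its destination matters: a rename across filesystems degrades to copy-and-delete and loses the atomicity guarantee.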

Exit codes

| Code | Meaning |
|---|---|
| 0 | Success. |
| 1 | bomdrift internal error (parse failure, network mishap not gated by best-effort path, etc.). |
| 2 | --fail-on threshold or diff budget tripped. The body is still on stdout; the action posts it before propagating the exit code. |
| (clap 2) | Usage error from clap (unknown flag, missing required argument). Distinguishable from --fail-on-driven exit 2 by stderr containing error: ... rather than the rendered body. |
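A wrapper script that needs to tell the two exit-2 cases apart can apply the stderr heuristic from the table. A Python sketch under that assumption; `classify_exit` and its outcome labels are hypothetical, and the check relies only on a stderr line beginning with `error:`:

```python
def classify_exit(code, stderr):
    """Map a bomdrift exit code plus captured stderr to a coarse outcome."""
    if code == 0:
        return "success"
    if code == 1:
        return "internal-error"
    if code == 2:
        # Usage errors and tripped gates both exit 2; usage errors report "error: ..." on stderr.
        is_usage = any(line.startswith("error:") for line in stderr.splitlines())
        return "usage-error" if is_usage else "gate-tripped"
    return "unexpected"
```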

Environment variables

| Variable | Purpose | Introduced |
|---|---|---|
| GITHUB_TOKEN | Bumps the GitHub REST rate limit from 60/hr unauth to 5000/hr authenticated, used by the maintainer-age enricher. | v0.2 |
| BOMDRIFT_REPO_URL | Fallback for --repo-url when the flag isn’t passed. | v0.5 |
| GITLAB_CI | When true, auto-selects --platform gitlab (unless overridden). | v0.7 |
| BITBUCKET_BUILD_NUMBER | When set, auto-selects --platform bitbucket. | v0.9 |
| TF_BUILD | When set, auto-selects --platform azure-devops. | v0.9 |
| CI_PROJECT_URL | On GitLab CI, used as a final fallback for --repo-url after BOMDRIFT_REPO_URL. | v0.7 |
| XDG_CACHE_HOME | Cache root for OSV / EPSS / KEV / registry caches and refreshed typosquat lists. Defaults to ~/.cache on Linux. | v0.1 |
| SOURCE_DATE_EPOCH | When set, used as “now” for byte-deterministic output (baseline-expiry comparisons, VEX timestamps, etc.). | v0.9 |
| NO_COLOR | Honored by the terminal renderer. | v0.1 |
| CLICOLOR_FORCE | Forces ANSI even on a non-TTY. | v0.1 |
| BOMDRIFT_DEBUG | When 1, enables verbose stderr notes from best-effort enrichers. | v0.8 |
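The SOURCE_DATE_EPOCH convention is easiest to see in a small clock helper. The Python sketch below illustrates the idea; `current_time` and `is_expired` are hypothetical names, and the rule that an entry counts as expired only once the date is past its --expires value is an assumption, not documented behaviour.

```python
import os
import time
from datetime import date, datetime, timezone


def current_time(env=os.environ):
    """Use SOURCE_DATE_EPOCH as "now" when set, otherwise the wall clock."""
    epoch = env.get("SOURCE_DATE_EPOCH")
    seconds = int(epoch) if epoch else time.time()
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def is_expired(expires, env=os.environ):
    """expires is a strict YYYY-MM-DD string, as accepted by baseline add --expires."""
    return current_time(env).date() > date.fromisoformat(expires)
```

Pinning the clock this way lets two runs on different days produce identical output for identical inputs, which is what the determinism guarantee depends on.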

Output formats

bomdrift writes one rendered representation of a diff per invocation. The shape is deterministic — identical inputs produce byte-identical output — which is what the PR-comment upsert mechanism in the action relies on.

--output selects the format. Default is terminal when stdout is a TTY, markdown otherwise.

Markdown

The default for piped/redirected output, designed to drop into a GitHub PR comment. Renders the diff as a summary table at the top, followed by per-section tables for each change category and finding type.

## SBOM diff

| Change | Count |
|---|---:|
| Added | 1 |
| Removed | 1 |
| Version changed | 1 |
| License changed | 0 |
| Possible typosquats | 1 |

### Added
| Ecosystem | Name | Version |
|---|---|---|
| npm | plain-crypto-js | 4.2.1 |

### Possible typosquats
| Ecosystem | Name | Version | Similar to | Similarity |
|---|---|---|---|---:|
| npm | plain-crypto-js | 4.2.1 | crypto-js | 0.95 |

When OSV.dev enrichment is enabled, an additional Vulnerabilities section lists each affected component with its advisory IDs sorted highest-severity-first within a component (ties broken by ID, so output stays byte-deterministic).
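The ordering rule (severity first, advisory ID as tie-breaker) maps directly onto a composite sort key. A Python sketch assuming advisory objects shaped like the JSON output's `{"id": ..., "severity": ...}` entries; the rank table and the name `sort_advisories` are illustrative, and unknown or missing severities are assumed to sort last:

```python
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def sort_advisories(advisories):
    """Highest severity first; ties broken by advisory ID for stable output."""
    def key(advisory):
        label = str(advisory.get("severity", "")).lower()
        return SEVERITY_RANK.get(label, len(SEVERITY_RANK)), advisory["id"]

    return sorted(advisories, key=key)
```

Because every advisory gets a total order that does not depend on input order, the rendered section stays byte-identical across runs.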

--summary-only emits only the summary table + a footer line. Used by the action’s comment-size fallback for big-PR survival.

Terminal

ANSI-colored, tree-style output. Default when stdout is a TTY. Falls back to markdown when stdout is piped/redirected (so action workflows that capture stdout always see safe markdown). Honors NO_COLOR (skip ANSI) and CLICOLOR_FORCE (force ANSI even on a non-TTY).

Findings are rendered with bracketed prefixes:

| Prefix | Meaning |
|---|---|
| [ADD] | Added component |
| [REM] | Removed component |
| [VER] | Version changed |
| [LIC] | License changed (same version) |
| [CVE] | OSV.dev advisory |
| [SQT] | Typosquat |
| [JMP] | Multi-major version jump |
| [YNG] | Young maintainer |

No emojis — bomdrift’s renderers stay strictly bracketed-prefix per project convention, both for terminal accessibility and for grepability of CI logs.

JSON

Pretty-printed {"changes": ChangeSet, "enrichment": Enrichment} graph for downstream tooling, baselines, debugging.

{
  "changes": {
    "added":           [ ... Component objects ... ],
    "removed":         [ ... ],
    "version_changed": [[ before, after ], ... ],
    "license_changed": [[ before, after ], ... ]
  },
  "enrichment": {
    "vulns":          { "<purl>": [{ "id": "...", "severity": "..." }, ...] },
    "typosquats":     [ ... ],
    "version_jumps":  [ ... ],
    "maintainer_age": [ ... ]
  }
}

As of v0.3, Enrichment.vulns maps each purl to a list of advisories, each tagged with its severity. v0.2 emitted a flat Vec<String> of advisory IDs without severity, so consumers parsing v0.2 output need to migrate. See the CHANGELOG for the migration note.

JSON output is the canonical format for --baseline snapshots: capture once with bomdrift diff --output json > baseline.json, replay with bomdrift diff --baseline baseline.json on subsequent runs to suppress already-triaged findings.

SARIF v2.1.0

Suitable for ingestion by GitHub Code Scanning, GitLab Vulnerability Reports, and any other consumer that speaks SARIF.

Stable rule IDs

These IDs surface in Code Scanning’s UI and are the join key for suppressions, so they’re load-bearing public API once any consumer has seen a finding. Renaming any of them is a breaking change.

| Rule ID | Source | Maps to |
|---|---|---|
| bomdrift.cve | enrichment.vulns | one result per (component, advisory_id) |
| bomdrift.typosquat | enrichment.typosquats | one per typosquat finding |
| bomdrift.version-jump | enrichment.version_jumps | one per multi-major bump |
| bomdrift.young-maintainer | enrichment.maintainer_age | one per young-maintainer finding |
| bomdrift.license-change | cs.license_changed | one per license-changed-without-version-bump |

All five rules are always emitted in tool.driver.rules, even when the current diff has zero findings of that kind — Code Scanning consumers enumerate rules independently of results, so omitting unused rules confuses the suppression UI.

Severity mapping

result.level maps from the OSV-fetched severity:

  • Critical / High → level: "error"
  • Medium / Low / None → level: "warning"

This is intentionally separate from --fail-on critical-cve’s threshold (which also fires on High); SARIF’s three-level model (error/warning/note) doesn’t map 1:1 to OSV’s four severity labels, so the renderer collapses High+Critical into error and everything else into warning.
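
The collapse described above amounts to a two-arm match. The sketch below uses a hypothetical `Severity` enum and function name; only the mapping itself comes from the text.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    None,
    Low,
    Medium,
    High,
    Critical,
}

/// Map a severity label to a SARIF `result.level` string.
/// High and Critical collapse into "error"; everything else is "warning".
fn sarif_level(severity: Severity) -> &'static str {
    match severity {
        Severity::Critical | Severity::High => "error",
        Severity::Medium | Severity::Low | Severity::None => "warning",
    }
}
```

Writing the match exhaustively (no `_` arm) means adding a new severity label later forces a compile-time decision about its SARIF level.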

Locations

SARIF requires locations on every result. Since SBOM-derived findings have no source line numbers, all results project onto a synthetic physicalLocation.artifactLocation.uri = "sbom", matching the convention used by Trivy.

Determinism

Enrichment.vulns is a HashMap and its iteration order is non-deterministic. The SARIF renderer sorts the keys before emission. Other finding collections are already deterministically ordered Vecs (their enrichers iterate the BTreeMap-derived ChangeSet order), so they need no extra sorting. The render-twice-byte-equal regression test in src/render/sarif.rs::tests::render_is_pure_byte_deterministic guards against future regressions of this contract.
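
A minimal sketch of the sort-keys-before-emission idea, assuming a simplified `HashMap<String, Vec<String>>` from purl to advisory IDs (the real Enrichment.vulns value type carries more fields) and a hypothetical line-based output format:

```rust
use std::collections::HashMap;

/// Render a purl -> advisory-ID map in a stable order.
/// `HashMap` iteration order varies between runs, so the keys are
/// collected and sorted first; the IDs under each key are sorted too.
fn render_vulns(vulns: &HashMap<String, Vec<String>>) -> String {
    let mut purls: Vec<&String> = vulns.keys().collect();
    purls.sort();

    let mut out = String::new();
    for purl in purls {
        let mut ids = vulns[purl].clone();
        ids.sort();
        out.push_str(purl);
        out.push_str(": ");
        out.push_str(&ids.join(", "));
        out.push('\n');
    }
    out
}
```

Two maps built with different insertion orders render to the same bytes, which is the property the regression test named above is guarding.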

SARIF + GitHub Code Scanning

bomdrift can emit findings in SARIF v2.1.0 for ingestion by GitHub Code Scanning, GitLab Vulnerability Reports, and any other consumer that speaks SARIF.

bomdrift diff before.cdx.json after.cdx.json \
    --output sarif \
    --output-file bomdrift.sarif

Rule taxonomy

bomdrift emits the following stable rule IDs (load-bearing — never renamed across releases). All rules are present in tool.driver.rules even when the current diff has zero results of that kind, so Code Scanning UI suppression flows can enumerate them upfront.

| Rule ID | Surfaces | SARIF level |
|---|---|---|
| bomdrift.cve | OSV.dev advisory ID(s) for the component | error for High/Critical, else warning |
| bomdrift.typosquat | Component name similar to a popular package | warning |
| bomdrift.version-jump | Multi-major version bump | warning |
| bomdrift.young-maintainer | Top GitHub contributor’s first commit < 90 days ago | warning |
| bomdrift.license-change | License changed at the same version | warning |
| bomdrift.license-violation | Component license violates configured allow/deny policy | warning |

Fingerprint stability

Each result carries partialFingerprints.primaryHash/v1 — a SHA-256 digest of a stable identity tuple per rule:

RuleIdentity
bomdrift.cve`ruleId
bomdrift.typosquat`ruleId
bomdrift.version-jump`ruleId
bomdrift.young-maintainer`ruleId
bomdrift.license-change`ruleId
bomdrift.license-violation`ruleId

The /v1 suffix on the fingerprint key lets bomdrift evolve identity schemes in future releases without GitHub re-opening every existing alert. Two distinct CVEs on the same purl produce distinct fingerprints; the same finding produced across two runs produces a byte-equal fingerprint.

Wire up GitHub Code Scanning

Set the new action input upload-to-code-scanning: 'true' and ensure your workflow has the security-events: write permission. The composite action runs github/codeql-action/upload-sarif@v3 after bomdrift writes ${{ github.workspace }}/bomdrift.sarif.

permissions:
  contents: read
  security-events: write   # required for SARIF upload
  pull-requests: write     # only if you also want PR comments

jobs:
  bomdrift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: Metbcy/bomdrift@v1
        with:
          output: sarif
          upload-to-code-scanning: 'true'

Direct CLI use (any CI)

When integrating with GitLab Vulnerability Reports, Bitbucket, or any arbitrary SARIF consumer, prefer --output-file over shell redirection:

bomdrift diff before.json after.json \
    --output sarif \
    --output-file bomdrift.sarif

The --output-file form is YAML-quoting-safe (no > redirection) and keeps stdout free for human-readable progress logging.

Determinism

Renderer output is byte-deterministic across runs for identical inputs. HashMap-keyed advisory lists are sorted by purl key before emission; license arrays are sorted before fingerprinting. The SOURCE_DATE_EPOCH environment variable is honored everywhere bomdrift emits a timestamp (the SARIF document itself currently carries no timestamps, but related VEX emission in v0.9 will).

Troubleshooting

  • Alerts don’t appear in the Security tab. Confirm permissions.security-events: write on the calling workflow AND upload-to-code-scanning: 'true' on the action input. Check the “Upload SARIF to Code Scanning” step in the job log for the API response.
  • Same finding appears twice after a re-run. This is a fingerprint bug — file an issue with the SARIF artifact and the inputs that produced it. Fingerprints should remain byte-equal across runs.
  • Severity wrong / missing. bomdrift maps GHSA’s database_specific.severity text label. Advisories without a label surface at SARIF warning and the properties.severity field reads NONE.

Enrichers overview

An enricher runs over the ChangeSet produced by the diff core and adds risk-signal metadata to the rendered output without modifying the ChangeSet itself. Each is independent, has its own opt-out flag, and follows a best-effort contract: any failure (network, rate-limit, upstream API change) is logged once to stderr and the diff renders without that enricher’s findings.

Shipping enrichers

| Enricher | Source | Network? | Default | Opt-out flag | Calibration |
|---|---|---|---|---|---|
| OSV.dev CVE lookup | OSV.dev /v1/querybatch + /v1/vulns/{id} | yes | on | --no-osv | --cache-ttl-hours (v0.9.6) |
| EPSS | FIRST.org /api/v1/epss | yes | on | --no-epss | --cache-ttl-hours; --fail-on-epss <0.0–1.0> |
| CISA KEV | CISA known-exploited catalog | yes | on | --no-kev | --cache-ttl-hours; --fail-on kev |
| Typosquat | Embedded top-N lists, optional XDG cache | no | on | (none; pure compute) | --typosquat-similarity-threshold (v0.9.6) |
| Multi-major version jump | The diff itself | no | on | (none; pure compute) | --multi-major-delta (v0.9.7; default 2) |
| Maintainer age | GitHub REST /repos/.../contributors + /commits | yes | on | --no-maintainer-age | --young-maintainer-days (v0.9.6) |
| Registry metadata | npm / PyPI / crates.io public APIs | yes | on (v0.9+) | --no-registry | --recently-published-days; --cache-ttl-hours |
| License policy | SBOM licenses field + SPDX expression eval | no | on | (configured by allow/deny lists) | --allow-licenses, --deny-licenses, --allow-exception, --deny-exception |
| Plugins | External-process plugins (v0.9.6+) | varies | off (opt-in) | (don’t pass --plugin) | (per-plugin manifest) |

Best-effort contract

Every enricher that touches the network honors the same contract:

  1. Per-request timeout (15s for OSV, 15s for GitHub) so a misbehaving upstream can’t hang a CI job.
  2. Errors warn, never block. A failed enricher logs one line to stderr (the warning is the same key every time, so it dedupes reasonably) and the diff renders without that enricher’s contributions.
  3. Rate-limit awareness. OSV’s /v1/querybatch is unauthenticated; the GitHub REST API honors GITHUB_TOKEN for the 5000/hr cap. On a 403 + X-RateLimit-Remaining: 0, the maintainer-age enricher returns whatever was already collected and warns once.
  4. Per-component caching within a single run. Repeated cs.added entries from the same project (e.g. monorepo subpackages sharing a GitHub repo) don’t multiply HTTP requests.
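
Points 2 and 4 of the contract can be illustrated together with a small per-run memo. This is only a sketch: `RunCache` and `get_or_fetch` are hypothetical names, the fetch closure stands in for an HTTP call, and caching failures for the rest of the run is an assumption made here to get the "warn once" behavior.

```rust
use std::collections::HashMap;

/// Per-run memo keyed by, for example, a repository URL.
/// `None` records a lookup that already failed during this run.
struct RunCache<T> {
    entries: HashMap<String, Option<T>>,
}

impl<T: Clone> RunCache<T> {
    fn new() -> Self {
        RunCache { entries: HashMap::new() }
    }

    /// Return the cached value for `key`, or run `fetch` exactly once.
    /// A failure warns on stderr and yields `None`; it never aborts the run.
    fn get_or_fetch<F>(&mut self, key: &str, fetch: F) -> Option<T>
    where
        F: FnOnce() -> Result<T, String>,
    {
        if let Some(cached) = self.entries.get(key) {
            return cached.clone();
        }
        let value = match fetch() {
            Ok(v) => Some(v),
            Err(err) => {
                eprintln!("warning: lookup for {key} failed: {err}");
                None
            }
        };
        self.entries.insert(key.to_string(), value.clone());
        value
    }
}
```

Monorepo subpackages that share one repository hit the `entries` map after the first lookup, so the number of requests scales with distinct repositories rather than with added components.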

Determinism

Each enricher’s output is structured into the Enrichment graph (vulns: HashMap<...>, typosquats: Vec<...>, version_jumps: Vec<...>, maintainer_age: Vec<...>). Renderers iterate these in deterministic order — Vecs in their natural BTreeMap-derived order from the ChangeSet, the vulns HashMap with its keys sorted before emission.

This is the contract that lets peter-evans/create-or-update-comment upsert PR comments in place: identical inputs render to byte-identical output, so the comment body is patched only when the diff genuinely changes.

Why these signals?

The enricher set was chosen because each maps to a real, recent, high-impact incident class:

  • OSV.dev CVE lookup: published advisories, the broadest signal.
  • EPSS: probability of exploitation in next 30 days; dampens false-urgency on Critical-CVSS-but-low-exploitation advisories.
  • CISA KEV: known-exploited; the highest-confidence “act now” filter.
  • Typosquat: malicious packages mimicking popular ones (the plain-crypto-js axios dropper, the PyPI campaigns 2024–2026).
  • Multi-major version jump: takeover swaps, namespace reuse.
  • Maintainer age: long-game social-engineering campaigns (xz / Jia Tan).
  • Registry metadata: recently-published, deprecated, maintainer-set-changed — the npm Shai-Hulud-style worm precursors.
  • License policy: not a malicious-code signal but a policy gate that the same diff-time reviewer is best positioned to enforce.

For organizations with environment-specific rules outside this list, the v0.9.6 Plugins protocol lets you layer custom enrichers on top without forking bomdrift.

OSV.dev CVE lookup

bomdrift’s CVE enricher queries the Open Source Vulnerability database for added and version-bumped components, populating the Vulnerabilities section in the rendered output with advisory IDs (CVE, GHSA, MAL, etc.) and per-advisory severity.

Two-stage lookup

Stage 1: /v1/querybatch

A single batched POST returns advisory IDs for every queried component in one round-trip. bomdrift batches up to 1000 queries per request (the documented cap); larger diffs chunk into multiple batches.

Each query is a package@version keyed by purl ecosystem (npm, PyPI, crates.io, Maven, etc.). Components without a parseable purl are skipped silently.
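
The chunking step could look like the following sketch. The constant and function names are hypothetical; only the 1000-query cap comes from the text above.

```rust
/// Maximum queries per /v1/querybatch request, per the cap quoted above.
const OSV_BATCH_CAP: usize = 1000;

/// Split a list of queries into request-sized batches.
/// The last batch may be shorter; an empty input yields no batches.
fn batches<T>(queries: &[T]) -> Vec<&[T]> {
    queries.chunks(OSV_BATCH_CAP).collect()
}
```

A diff with 2500 queryable components would therefore issue three POSTs (1000, 1000, and 500 queries), and a diff with none issues no request at all.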

Stage 2: /v1/vulns/{id}

For each unique advisory ID returned by stage 1, bomdrift issues a follow-up GET to populate severity. Severity is sourced (in order):

  1. GHSA’s database_specific.severity text label (LOW|MODERATE|HIGH|CRITICAL). This is the most consistent shape across the OSV corpus.
  2. Highest CVSS_V3 vector score from the severity[] array, mapped to a label by the standard CVSS-v3 severity rating (Critical ≥ 9.0, High ≥ 7.0, Medium ≥ 4.0, Low ≥ 0.1).
  3. Severity::None when neither shape is present. These advisories render with a none severity label and don’t trip --fail-on critical-cve.
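
The three-step fallback above can be sketched as follows. The function names are hypothetical, and the sketch assumes CVSS base scores have already been computed from the vector strings, a step that is elided here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    None,
    Low,
    Medium,
    High,
    Critical,
}

/// Step 1: GHSA text label (LOW | MODERATE | HIGH | CRITICAL).
fn from_ghsa_label(label: &str) -> Option<Severity> {
    match label.to_ascii_uppercase().as_str() {
        "LOW" => Some(Severity::Low),
        "MODERATE" => Some(Severity::Medium),
        "HIGH" => Some(Severity::High),
        "CRITICAL" => Some(Severity::Critical),
        _ => None,
    }
}

/// Step 2: standard CVSS v3 rating bands.
fn from_cvss_score(score: f64) -> Severity {
    if score >= 9.0 {
        Severity::Critical
    } else if score >= 7.0 {
        Severity::High
    } else if score >= 4.0 {
        Severity::Medium
    } else if score >= 0.1 {
        Severity::Low
    } else {
        Severity::None
    }
}

/// Resolve in order: GHSA label, then highest CVSS score, then None.
/// Assumption: `cvss_scores` holds base scores already derived from vectors.
fn resolve(ghsa_label: Option<&str>, cvss_scores: &[f64]) -> Severity {
    if let Some(severity) = ghsa_label.and_then(from_ghsa_label) {
        return severity;
    }
    if cvss_scores.is_empty() {
        return Severity::None; // Step 3: neither shape present.
    }
    let highest = cvss_scores.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    from_cvss_score(highest)
}
```

Note that the label wins even when a CVSS score would rate higher, which follows the source order given in the list.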

On-disk severity cache

Stage-2 lookups are N+1 in the worst case — one query per unique advisory ID. bomdrift caches stage-2 responses on disk at <XDG_CACHE_HOME>/bomdrift/osv/<advisory_id>.json with a 24h TTL.

~/.cache/bomdrift/osv/
├── CVE-2025-12345.json
├── GHSA-3p68-rc4w-qgx5.json
└── MAL-2026-2306.json

Each cache file looks like:

{
  "fetched_at": 1745878800,
  "severity": "Critical",
  "raw": { ... full /vulns/{id} response ... }
}

Cache behavior

  • Cache hits log nothing. A successful 24h-fresh hit is silent.
  • Cache misses are silent too. Each miss issues a network fetch and writes the result on success.
  • End-of-run summary. A single line goes to stderr like osv: 18/22 severities served from cache so CI logs show the cache hit ratio without per-file noise.
  • Atomic writes. Cache files are written to <id>.json.tmp then renamed, mirroring the temp-file + rename pattern used by bomdrift refresh-typosquat.
  • Stale TTL. 24h is a deliberate balance between rerun friction (a CI job re-running 30 minutes after the last one wants the cache) and stale-severity risk (a published severity correction after 24h is rare and the renderer’s contract is “best effort”).
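
The TTL check and the temp-file + rename write can be sketched with the standard library alone. Function names are hypothetical, and treating an entry as stale exactly at the TTL boundary is an assumption of this sketch.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// 24 hours, expressed in seconds.
const TTL_SECS: u64 = 24 * 60 * 60;

/// An entry is fresh while its age is strictly below the TTL.
/// `saturating_sub` keeps a `fetched_at` slightly in the future
/// (clock skew) from underflowing.
fn is_fresh(fetched_at: u64, now: u64, ttl_secs: u64) -> bool {
    now.saturating_sub(fetched_at) < ttl_secs
}

/// Write `<path>.tmp` first, then rename over `path`, so a reader never
/// observes a half-written cache file.
fn write_atomic(path: &Path, contents: &str) -> io::Result<()> {
    let mut tmp = path.as_os_str().to_owned();
    tmp.push(".tmp");
    fs::write(&tmp, contents)?;
    fs::rename(&tmp, path)
}
```

The rename is atomic only when the temp file and the target live on the same filesystem, which holds here because both sit in the same cache directory.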

--no-osv-cache

For paranoid reruns where you want fresh fetches even within the 24h window:

bomdrift diff before.json after.json --no-osv-cache

The cache itself is purely an optimization — the bypass flag always works, it just costs N+1 fetches per run. Use sparingly.

--no-osv (offline mode)

Skip the entire OSV pipeline (both stages, no cache writes). Use for:

  • Tests and example scenarios where determinism matters more than freshness.
  • Air-gapped CI environments.
  • Quick smoke tests of the change-shape signals without the network latency.

bomdrift diff before.json after.json --no-osv

Severity → --fail-on mapping

| Threshold | Trips when… |
|---|---|
| none | Never. |
| cve | Any vuln finding present (regardless of severity). |
| critical-cve | Any finding with severity >= High (covers HIGH and CRITICAL). |
| typosquat | Any typosquat finding; OSV findings do not trip it. |
| license-change | Any same-version license change; OSV findings do not trip it. |
| any | Any finding of any kind, plus license-changed-without-version-bump. |

The critical-cve name covers HIGH-or-CRITICAL because CRITICAL alone is rare in the GHSA tagging and many actively-exploited advisories ship as HIGH. The threshold name stays stable; the threshold value covers the actionable bucket.
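
Read as code, the threshold table is a single match. The types below (`FailOn`, `Findings`) are hypothetical stand-ins for this sketch, with `other` lumping together version jumps, young-maintainer findings, and similar signals.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    None,
    Low,
    Medium,
    High,
    Critical,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FailOn {
    None,
    Cve,
    CriticalCve,
    Typosquat,
    LicenseChange,
    Any,
}

/// Hypothetical summary of one diff's findings.
#[derive(Default)]
struct Findings {
    vuln_severities: Vec<Severity>,
    typosquats: usize,
    license_changes: usize,
    other: usize, // version jumps, young maintainers, ...
}

/// Whether the chosen threshold should produce a failing exit code.
fn trips(threshold: FailOn, f: &Findings) -> bool {
    match threshold {
        FailOn::None => false,
        FailOn::Cve => !f.vuln_severities.is_empty(),
        // "critical-cve" covers HIGH and CRITICAL.
        FailOn::CriticalCve => f.vuln_severities.iter().any(|s| *s >= Severity::High),
        FailOn::Typosquat => f.typosquats > 0,
        FailOn::LicenseChange => f.license_changes > 0,
        FailOn::Any => {
            !f.vuln_severities.is_empty()
                || f.typosquats > 0
                || f.license_changes > 0
                || f.other > 0
        }
    }
}
```

The `>= Severity::High` comparison relies on the enum's declaration order, so the variants must stay sorted from least to most severe.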

Network behavior

  • Per-request timeout: 15 seconds.
  • No authentication: OSV.dev’s /v1/querybatch and /v1/vulns/{id} endpoints are both unauthenticated public APIs.
  • User-Agent: bomdrift/<version> so the OSV team can attribute traffic if needed.
  • Failures warn and continue: a network mishap (DNS, timeout, 5xx) emits a single stderr warning and the diff renders without the Vulnerabilities section. The exit code remains 0 unless --fail-on was set and a previously-cached vuln tripped it.

Why OSV.dev specifically?

  • Cross-ecosystem unification. OSV merges npm advisories from GHSA, PyPI advisories from PyPA, Cargo advisories from RustSec, Maven advisories from GHSA, etc. into a single API, so bomdrift doesn’t need ecosystem-specific clients.
  • Open API, no key required. Every consumer of the /v1/querybatch endpoint gets the same data without registration overhead.
  • Public schema. The response shape is documented at ossf.github.io/osv-schema/, so bomdrift can reason about the shape without depending on an API client crate that drags in tokio.

EPSS

bomdrift queries the Exploit Prediction Scoring System (EPSS) from FIRST.org for every CVE-aliased advisory and surfaces the per-CVE score (0.0 – 1.0) in markdown / terminal / SARIF output.

EPSS estimates the probability that a given CVE will be exploited in the next 30 days. Combined with severity it gives reviewers a sharper signal than CVSS alone — a Critical CVE with EPSS 0.01 is far less urgent than a Medium CVE with EPSS 0.85.

Output

  • Markdown: per-advisory badge EPSS 0.87 after the severity label.
  • Terminal: same badge, no markup.
  • JSON: enrichment.vulns[purl][i].epss_score numeric field.
  • SARIF: properties.epssScore on bomdrift.cve results.

When an advisory is keyed by GHSA but has CVE aliases, the score is the max across all CVE aliases so a GHSA covering two CVEs surfaces the worse of the two.
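
A sketch of the max-across-aliases rule and the threshold comparison, under the assumption that each CVE alias has already been looked up and yields either a score or nothing. The function names are hypothetical.

```rust
/// Pick the worst (highest) EPSS score among an advisory's CVE aliases.
/// `None` entries are aliases FIRST.org returned no score for.
fn epss_for_advisory(alias_scores: &[Option<f64>]) -> Option<f64> {
    alias_scores
        .iter()
        .flatten()
        .copied()
        .fold(None::<f64>, |best, score| match best {
            Some(b) if b >= score => Some(b),
            _ => Some(score),
        })
}

/// Gate check in the spirit of --fail-on-epss: unscored advisories never trip it.
fn fails_epss_gate(score: Option<f64>, threshold: f64) -> bool {
    score.map_or(false, |s| s >= threshold)
}
```

An advisory whose aliases are all unscored stays `None` rather than defaulting to 0.0, so "no data" remains distinguishable from "very unlikely to be exploited".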

Threshold gating

bomdrift diff before.json after.json --fail-on-epss 0.5

Exits 2 when any advisory has a score ≥ 0.5. A threshold of 0.5 is roughly the top decile of actively-exploited CVEs; tune for your team’s risk appetite.

Calibration

  • --cache-ttl-hours <N> (v0.9.6+) — overrides the default 24h disk cache TTL for the EPSS scores cache.
  • --fail-on-epss <FLOAT> — threshold gate; see Threshold gating.

Disabling

bomdrift diff before.json after.json --no-epss

or in .bomdrift.toml:

[diff]
no_epss = true

Both forms skip the FIRST.org HTTP call AND the disk cache lookup.

Caching

24h TTL at <XDG_CACHE_HOME>/bomdrift/epss/<cve>.json. Negative results (CVEs FIRST.org returned no score for) are cached to avoid re-querying recently-published CVEs that haven’t been scored yet.

Best-effort

Like every bomdrift enricher, EPSS is best-effort: a network failure or a malformed response surfaces a BOMDRIFT_DEBUG=1 stderr note and the diff renders with empty epss_score fields. EPSS being unreachable is never a reason to block a PR review.

CISA KEV

bomdrift downloads the CISA Known Exploited Vulnerabilities catalog and flips a KEV flag on every advisory whose primary id or aliases include a CVE listed in the catalog.

CISA KEV is the highest-confidence “actively exploited in the wild” signal available — CISA only adds CVEs to the catalog after observing real-world exploitation. It’s a tighter filter than --fail-on critical-cve (which fires on CVSS High or above regardless of exploitation evidence).
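
The flagging rule reduces to a set-membership test over the advisory's primary ID and its aliases. In this sketch the catalog is assumed to be already parsed into a set of CVE IDs, and `is_kev` is a hypothetical name.

```rust
use std::collections::HashSet;

/// True when the advisory's primary ID or any alias appears in the
/// KEV catalog (represented here as a set of CVE IDs).
fn is_kev(primary_id: &str, aliases: &[&str], catalog: &HashSet<String>) -> bool {
    std::iter::once(primary_id)
        .chain(aliases.iter().copied())
        .any(|id| catalog.contains(id))
}
```

Checking aliases matters because OSV often keys an advisory by its GHSA ID, while the KEV catalog lists CVE IDs only.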

Output

  • Markdown: bold **KEV** badge after the severity / EPSS label.
  • Terminal: plain KEV token.
  • JSON: enrichment.vulns[purl][i].kev boolean field.
  • SARIF: properties.kev: true on bomdrift.cve results when set.

Threshold gating

bomdrift diff before.json after.json --fail-on kev

Exits 2 when any advisory has its KEV flag set. --fail-on any also includes KEV.

Calibration

--cache-ttl-hours <N> (v0.9.6+)

The 24h TTL for the catalog file is now configurable via the unified cache-TTL knob. Lower for faster CISA-update propagation in long-running self-hosted runners; raise when running offline or against archived SBOMs.

Disabling

bomdrift diff before.json after.json --no-kev

or in .bomdrift.toml:

[diff]
no_kev = true

Caching

24h TTL on the bulk catalog JSON at <XDG_CACHE_HOME>/bomdrift/kev/catalog.json. Once-daily refresh matches CISA’s publication cadence.

Best-effort

Network failure logs at BOMDRIFT_DEBUG=1 and the diff renders with KEV flags absent. A stale catalog (within the 24h window) is preferred over re-fetching on every run.

Typosquat detection

The typosquat enricher flags newly added components whose names are suspiciously close to a popular package in the same ecosystem. v0.4 covers npm, PyPI, Cargo, Maven, Go, RubyGems, NuGet, and Composer with rules tuned per ecosystem.

The signal

Typosquatting is a real and recurring supply-chain attack pattern:

  • The 2024 PyPI campaign that registered colorama-0.4.7 — note the trailing zero — to drop a credential stealer.
  • The Mar 2026 axios incident’s plain-crypto-js@4.2.1, a lookalike of the legitimate crypto-js, used to drop the WAVESHAPER.V2 RAT.
  • Sustained npm lodash lookalikes (loadash, loadsh, loadshes) through 2024–2026.

The pattern is consistent across ecosystems: a candidate name with high visual / phonetic similarity to a popular package, often with a single character substitution / insertion / deletion, sometimes with an added prefix or suffix. The defender’s task is to flag the candidate at PR review time, before npm install or pip install runs the malicious code.

Algorithm

The core scoring is Jaro-Winkler similarity with a suffix-containment boost for the textbook prefix-add pattern (plain-crypto-js). Threshold: 0.92 for a finding to surface. Maven is the exception (see below).

Per-ecosystem rules

| Ecosystem | Canonicalization | Separators | Scoring |
|---|---|---|---|
| npm | lowercase | -, _, ., / | Jaro-Winkler + suffix boost |
| PyPI | PEP 503 (lowercase, -/_/. collapse) | -, _, . | Jaro-Winkler + suffix boost |
| Cargo | lowercase | - | Jaro-Winkler + suffix boost |
| Maven | lowercase | (n/a) | Levenshtein ≤ 2 on artifactId only |
| Go | lowercase | -, / | Jaro-Winkler on last path segment |
| Gem | lowercase | -, _ | Jaro-Winkler + suffix boost |
| NuGet | lowercase (case-insensitive per spec) | . | Jaro-Winkler + suffix boost |
| Composer | lowercase | -, / | Jaro-Winkler on package portion |

Filtering rules (npm / PyPI / Cargo)

  1. Exact match (case-insensitive after canonicalization) → skip. The candidate IS a popular package, not a squat.
  2. Likely-legit ecosystem extension → skip. When the candidate starts with the legit name followed by a separator, this matches the well-established convention for extension packages (react-router, axios-retry, eslint-plugin-react, pytest-asyncio). The structural rule is keyed on ecosystem-specific separator sets so PyPI’s -/_/. interchange doesn’t leak into npm’s wider set.
  3. Suffix containment with a substantial added prefix → boost. When the candidate ends with the legit name (length ≥ 5) AND the added prefix is longer than 3 characters, the score is boosted to at least 0.95. This catches the deceptive plain-crypto-js pattern that pure JW alone misses (the long prefix kills base similarity).
  4. Otherwise: plain Jaro-Winkler. Threshold 0.92 catches single-character drift like cross-env → crossenv (~0.98) or express → expresss (~0.97), while react → react-router (~0.88) stays below the threshold.
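
The four rules can be sketched end to end with a textbook Jaro-Winkler (prefix scale 0.1, prefix capped at 4 characters). This is an illustration under stated assumptions, not bomdrift's implementation: all function names are hypothetical, canonicalization is reduced to lowercasing, and the Winkler boost is applied unconditionally.

```rust
/// Textbook Jaro similarity over Unicode scalar values.
fn jaro(a: &str, b: &str) -> f64 {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    if a.is_empty() && b.is_empty() {
        return 1.0;
    }
    if a.is_empty() || b.is_empty() {
        return 0.0;
    }
    // Characters match only within this distance of each other.
    let window = (a.len().max(b.len()) / 2).saturating_sub(1);
    let mut b_used = vec![false; b.len()];
    let mut a_matched: Vec<char> = Vec::new();
    for (i, &ca) in a.iter().enumerate() {
        let lo = i.saturating_sub(window);
        let hi = (i + window + 1).min(b.len());
        for j in lo..hi {
            if !b_used[j] && b[j] == ca {
                b_used[j] = true;
                a_matched.push(ca);
                break;
            }
        }
    }
    if a_matched.is_empty() {
        return 0.0;
    }
    // Transpositions: matched characters that appear in a different order.
    let b_matched = b.iter().zip(&b_used).filter(|(_, used)| **used).map(|(c, _)| *c);
    let out_of_order = a_matched.iter().zip(b_matched).filter(|(x, y)| **x != *y).count();
    let m = a_matched.len() as f64;
    let t = (out_of_order / 2) as f64;
    (m / a.len() as f64 + m / b.len() as f64 + (m - t) / m) / 3.0
}

/// Jaro-Winkler: boost by the length of the common prefix (at most 4).
fn jaro_winkler(a: &str, b: &str) -> f64 {
    let j = jaro(a, b);
    let prefix = a.chars().zip(b.chars()).take(4).take_while(|(x, y)| x == y).count();
    j + prefix as f64 * 0.1 * (1.0 - j)
}

const THRESHOLD: f64 = 0.92;
const BOOSTED: f64 = 0.95;

/// Returns the score when `candidate` should be surfaced against `legit`.
fn typosquat_score(candidate: &str, legit: &str, separators: &[char]) -> Option<f64> {
    let candidate = candidate.to_lowercase();
    let legit = legit.to_lowercase();
    // Rule 1: the candidate is the popular package itself.
    if candidate == legit {
        return None;
    }
    // Rule 2: "<legit><separator>..." is the extension-package convention.
    if let Some(rest) = candidate.strip_prefix(legit.as_str()) {
        if rest.starts_with(|c: char| separators.contains(&c)) {
            return None;
        }
    }
    // Rule 4 baseline, then Rule 3: suffix containment with a long added prefix.
    let mut score = jaro_winkler(&candidate, &legit);
    if let Some(added) = candidate.strip_suffix(legit.as_str()) {
        if legit.chars().count() >= 5 && added.chars().count() > 3 {
            score = score.max(BOOSTED);
        }
    }
    if score >= THRESHOLD {
        Some(score)
    } else {
        None
    }
}
```

With npm's separator set, `typosquat_score("plain-crypto-js", "crypto-js", &['-', '_', '.', '/'])` is lifted to the boosted floor by rule 3, while `react-router` against `react` is skipped by rule 2 before any scoring happens.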

Match-form rules (Go and Composer)

Go and Composer share an additional structural rule: the user-visible coordinate has a stable, long prefix (Go’s host/owner/, Composer’s vendor/) that’s duplicated across many legitimate packages. Including the prefix in Jaro-Winkler scoring would inflate similarity past anything useful: every module under the same Go host/owner/ prefix would score 0.95+ against its siblings, and every Symfony package against every other Symfony package.

Both ecosystems extract a match form from the canonicalized coordinate before scoring:

  • Go: the last path segment of host/owner/repo (e.g. github.com/spf13/cobra → cobra).
  • Composer: the package portion of vendor/package (e.g. symfony/console → console).

Comparison happens on match forms. When two distinct full coordinates collapse to the same match form (github.com/spf13/cobra and github.com/myorg/cobra), they’re treated as legitimate forks and not flagged. Only typo’d match forms (cobraa vs cobra) trip the JW similarity threshold.
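
Match-form extraction is a short string operation in both cases. The helper names below are hypothetical, and the sketch does not handle Go major-version path suffixes such as /v2, which would need stripping before taking the last segment.

```rust
/// Go: last path segment of the lowercased module path.
fn go_match_form(module_path: &str) -> String {
    let lower = module_path.to_lowercase();
    lower.rsplit('/').next().unwrap_or("").to_string()
}

/// Composer: the package portion of `vendor/package`.
/// A coordinate without a slash is returned unchanged (lowercased).
fn composer_match_form(coordinate: &str) -> String {
    let lower = coordinate.to_lowercase();
    match lower.split_once('/') {
        Some((_vendor, package)) => package.to_string(),
        None => lower.clone(),
    }
}
```

Two coordinates with equal match forms, such as the two cobra paths above, compare as identical and are skipped as forks; only differing match forms go on to similarity scoring.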

Maven rules

Maven coordinates are groupId:artifactId. The shared groupId prefix is often very long (org.springframework.boot:, com.fasterxml.jackson.core:) and would inflate Jaro-Winkler past anything useful: every Spring artifact would score 0.95+ against every other Spring artifact. The Maven path skips JW + suffix-containment entirely and uses Levenshtein distance ≤ 2 on the artifactId portion only.

commons-lng3 differs from commons-lang3 by Levenshtein 1 (insert a), so it fires regardless of whether the groupId matches. A different-groupId republish of an exact commons-lang3 artifact does not fire — that’s a legitimate fork / republish, not a typo.
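
A sketch of the Maven path using the classic two-row Levenshtein recurrence. The function names are hypothetical, and treating distance 0 as "not flagged" reflects the exact-match republish case described above.

```rust
/// Classic Levenshtein edit distance (insert, delete, substitute).
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the first i chars of `a` and first j of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            let best = (prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1);
            cur.push(best);
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Extract the artifactId from `groupId:artifactId[:version]`.
fn artifact_id(coordinate: &str) -> &str {
    coordinate.split(':').nth(1).unwrap_or(coordinate)
}

/// Flag when the artifactIds differ by one or two edits.
/// Distance 0 (same artifactId, any groupId) is treated as a republish.
fn maven_flags(candidate: &str, legit: &str) -> bool {
    let c = artifact_id(candidate).to_lowercase();
    let l = artifact_id(legit).to_lowercase();
    let distance = levenshtein(&c, &l);
    distance >= 1 && distance <= 2
}
```

Because the groupId is discarded before comparison, the check behaves the same whether or not the candidate reuses the legitimate group.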

Reputational care

The renderer wording is intentional:

X is similar to Y

— never X is a typosquat of Y. Flagging a legitimate package as a malicious squat in a public PR comment is real reputational harm to the package author. The structural similarity is observable; intent is not. The human reviewing the PR is the analyst making the determination.

The CLI / Action exit code reflects this: typosquat findings are informational by default and do not change the exit code on their own. --fail-on typosquat exists for projects that want to gate on the structural signal explicitly, but it’s never the default.

Reference lists

Embedded snapshots ship in the binary:

| File | Source | Size |
|---|---|---|
| data/npm-top1k.txt | anvaka/npmrank | 1000 |
| data/pypi-top200.txt | hugovk/top-pypi-packages | 200 |
| data/cargo-top200.txt | crates.io API ?sort=downloads | 200 |
| data/maven-top100.txt | mvnrepository.com Most Popular (curated) | ~100 |
| data/go-top200.txt | pkg.go.dev + awesome-go (curated) | ~180 |
| data/gem-top200.txt | rubygems.org popular gems (curated) | ~245 |
| data/nuget-top200.txt | nuget.org v3 search API ?orderby=totalDownloads | 200 |
| data/composer-top200.txt | packagist.org popular categories (curated) | ~190 |

v0.7 expanded the curated Go, Composer, and Gem lists: the ship-with-binary snapshots now cover the CNCF / HashiCorp / gRPC-ecosystem corners of Go, the Symfony / Laravel / Doctrine / testing / Packagist-popular tail of Composer, and the Rails / dry-rb / serializer / search corners of RubyGems. Each top-up is grouped under a # --- v0.7 top-up: <category> (source: ...) --- header in the data file so future curators can see provenance.

Lists for the ecosystems added in the multi-ecosystem releases (v0.2 + v0.4) are intentionally smaller than npm-top1k.txt: the algorithm is identical across ecosystems, so a smaller seed still proves the signal end-to-end. Lists can grow in subsequent releases without code changes; only the embedded snapshot changes.

Refreshing

bomdrift refresh-typosquat                    # all eight ecosystems
bomdrift refresh-typosquat --ecosystem npm
bomdrift refresh-typosquat --ecosystem pypi
bomdrift refresh-typosquat --ecosystem cargo
bomdrift refresh-typosquat --ecosystem nuget

Refreshed lists are written to <XDG_CACHE_HOME>/bomdrift/typosquat/<ecosystem>.txt via temp-file + atomic rename. The enricher prefers cache files over the embedded snapshot when present and parseable.

--ecosystem maven|go|gem|composer are accepted but emit a notice: Maven Central, pkg.go.dev, RubyGems, and Packagist all lack stable public popularity feeds (or have had ones that went through breaking changes). The curated lists shipped in the binary remain the source of truth; refreshing those means editing data/<eco>-top*.txt and rebuilding bomdrift. PRs adding names to the curated lists are welcome.

Calibration

--typosquat-similarity-threshold <FLOAT> (v0.9.6+)

Default 0.92, range [0.0, 1.0]. Configurable via CLI flag or [diff] typosquat_similarity_threshold = <float> in .bomdrift.toml.

The threshold applies to the JW + suffix-boost path (npm, PyPI, Cargo, RubyGems, NuGet, Go, Composer). The Maven Levenshtein-≤-2 path is hardcoded — Levenshtein distance and JW similarity aren’t directly comparable, so a single threshold flag would either over- or under-suppress on Maven.

Recommended ranges:

  • 0.95 — very strict; only catches near-perfect matches. Good for tightening down false positives in monorepos with many internally forked dependencies.
  • 0.92 (default) — calibrated against the top-1000-of-each-ecosystem test corpus to produce zero false positives there.
  • 0.85 — lenient; catches softer near-misses at the cost of more false positives. Useful for paranoid security review of brand-new PyPI / npm packages.

The threshold also appears in --debug-calibration rows so collected samples can guide tuning:

typosquat|<purl>|<similarity_score>|0.92

False-positive management

The structural rules + thresholds aim for “no false positives on the top 1000 of each ecosystem.” If you discover a false positive in the wild:

  1. Add a regression test in src/enrich/typosquat.rs::tests showing the false positive doesn’t fire.
  2. Open a PR. Tightening the rule (rather than special-casing the package name) is preferred — drives a cleaner heuristic.

Disabling

Pure compute, no network. There is no --no-typosquat flag — disabling the typosquat enricher would defeat its primary purpose. To suppress specific false-positive findings, hand-curate a per-component baseline entry; see Baseline & suppression — Worked example.

To gate exit code on typosquat findings, use --fail-on typosquat.

Multi-major version jumps

Pure-compute, no network, no new dependencies. The version-jump heuristic flags dependency upgrades that cross two or more major versions in a single diff (e.g. 1.x → 4.x).

Why it’s a useful signal

A single major bump (1 → 2) is the standard SemVer signal reviewers already pay attention to — bomdrift does not flag it. Two or more majors at once is the unusual case worth a closer look:

  • Takeover swaps: a maintainer transition followed by a major-version rename to “reset” the package identity (the xz pattern, scaled down).
  • Namespace reuse: an unrelated package republished at a higher major under the same name, intentionally or after an account compromise.
  • “Cleaned up the dep tree” PRs: legitimate but high-risk refactors that silently jump several majors at once and bypass the usual SemVer guard-rails.

Always informational severity — never trips --fail-on thresholds narrower than any.

Major-version extraction

Hand-rolled, ~5 lines. We deliberately avoid the semver crate: full SemVer parsing is unnecessary when only the major number is consulted, and pulling the dep would add transitive weight for no functional gain.

Accepted forms (each yields a Some(major))

  • 1.2.3 → 1
  • v1.0.0 → 1 (leading v tolerated)
  • 2.5.3-beta.1 → 2 (pre-release suffix ignored)
  • 3.0.0+build.123 → 3 (build metadata ignored)
  • 4 / 4-rc.1 → 4 (no minor required)

Rejected forms (yield None, the pair is skipped — never flagged)

  • empty string
  • non-numeric (latest, nightly, main)
  • leading-zero numbers (01.2.3) — ambiguous and almost always a sign of a non-SemVer scheme; safer to skip than misinterpret.
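
The accepted and rejected forms above can be reproduced with a small hand-rolled parser. This is a sketch, not the project's actual function: the names are hypothetical, requiring the digits to be followed by `.`, `-`, `+`, or end of string is an assumption, and so is flagging upgrades only (a downgrade yields a delta of zero).

```rust
/// Extract the major version, or `None` for forms that should be skipped.
fn major(version: &str) -> Option<u64> {
    // Tolerate a leading `v` (v1.0.0).
    let v = version.strip_prefix('v').unwrap_or(version);
    // Leading run of ASCII digits.
    let end = v.find(|c: char| !c.is_ascii_digit()).unwrap_or(v.len());
    let (digits, rest) = v.split_at(end);
    // Reject empty / non-numeric input and leading-zero numbers like 01.
    if digits.is_empty() || (digits.len() > 1 && digits.starts_with('0')) {
        return None;
    }
    // Assumption: the major must end at '.', '-', '+', or end of string.
    if !(rest.is_empty() || rest.starts_with(&['.', '-', '+'][..])) {
        return None;
    }
    digits.parse().ok()
}

/// True when the upgrade crosses at least `min_delta` majors.
/// Unparseable versions on either side are skipped, never flagged.
fn is_multi_major_jump(before: &str, after: &str, min_delta: u64) -> bool {
    match (major(before), major(after)) {
        (Some(b), Some(a)) => a.saturating_sub(b) >= min_delta,
        _ => false,
    }
}
```

With the default delta of 2, `is_multi_major_jump("1.0.0", "4.17.21", 2)` is true and `is_multi_major_jump("1.0.0", "2.0.0", 2)` is false.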

Calibration

The multi-major delta threshold is exposed as --multi-major-delta <N> (introduced in v0.9.7) with the matching [diff] multi_major_delta config key. Default 2; minimum 1.

Raising the threshold to 3 or higher quiets noisy ecosystems that release majors aggressively (some npm web frameworks ship a major every few months). The signal still fires for genuinely unusual jumps but stops competing with everyday upgrades for reviewer attention.

Lowering to 1 is supported but discouraged: it duplicates the standard SemVer-bump signal reviewers already see on every PR, and drowns the multi-major signal’s actual purpose (catching the xz pattern and namespace-reuse swaps). bomdrift validates >= 1 so 0 is rejected at the clap layer rather than silently disabling the enricher.

For per-component carve-outs use a baseline entry instead of dropping the global threshold; see Baseline — When the bump is the false positive.

Disabling

There is no --no-version-jump flag: the check is pure compute with zero cost. There is also no dedicated --fail-on threshold for version jumps; the only threshold that includes them is --fail-on any. To suppress a specific bump as known-acceptable, write a per-component baseline entry; see Baseline — When the bump is the false positive.

Examples

| Before | After | Flagged? |
|---|---|---|
| 1.0.0 | 4.17.21 | yes (1 → 4) |
| 2.34.0 | 4.5.0 | yes (2 → 4) |
| 1.0.0 | 2.0.0 | no (single major bump) |
| 1.0.0 | 1.99.0 | no (no major bump) |
| latest | nightly | no (skipped: non-numeric) |
| 01.2.3 | 04.0.0 | no (skipped: leading-zero ambiguity) |

See examples/version-jumps/ for a runnable scenario.

Maintainer age signal

Flag newly added GitHub-hosted dependencies whose top contributor’s first commit is suspiciously recent. The xz/Jia Tan pattern.

Why it matters

The xz-utils backdoor (CVE-2024-3094, Mar 2024) was the work of “Jia Tan”, a GitHub identity that started contributing roughly two years before landing the malicious payload. The pattern — a brand-new account becoming the de facto sole maintainer of a low-traffic but widely-depended-upon package — is a leading indicator of long-game supply-chain takeovers.

We can’t catch Jia Tan in retrospect, but we can flag the next one earlier in their arc by surfacing “this package’s top contributor made their first commit less than 90 days ago” at the moment a new dep is added.

Threshold

90 days by default. Intentionally aggressive: most legitimate new packages will trip this on initial introduction. That’s fine — a human reviewer can dismiss “the package is brand-new and the author is its only maintainer” trivially.

The expensive miss is the silent takeover of an existing package by a recently-arrived contributor, which is what the 90-day window captures. Tune for your environment via --young-maintainer-days <N> or [diff] young_maintainer_days = <N> (v0.9.6+); see Calibration below.

How it works

For each cs.added component with a GitHub source_url:

  1. GET /repos/{owner}/{repo}/contributors?per_page=1 — top contributor login.
  2. GET /repos/{owner}/{repo}/contributors to count contributors. Skip if > 50 — “top contributor joined recently” loses meaning when 200 people have committed (Linux, Kubernetes, React, etc.).
  3. GET /repos/{owner}/{repo}/commits?author=<login>&per_page=1 to find the most recent commit by that author.
  4. Paginate to the last page to find their first commit. The “first commit by author” pagination trick is slow on prolific contributors (last page can be page 50+) but is correct without needing the GraphQL API.
  5. Compare against the SBOM-after timestamp (or clock::now() when the SBOM lacks a metadata timestamp). Flag when the first commit is younger than YOUNG_MAINTAINER_DAYS (default 90; tunable via --young-maintainer-days <N> in v0.9.6+).
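
Steps 4 and 5 can be sketched offline. The snippet below is a minimal illustration under stated assumptions, not the code in src/enrich/maintainer.rs: it assumes the commits endpoint is queried with per_page=1 (so the rel="last" page number in GitHub's Link header points at the oldest commit), and the helper names are hypothetical.

```python
import re
from datetime import datetime
from typing import Optional
from urllib.parse import parse_qs, urlparse

_LAST_LINK = re.compile(r'<([^>]+)>;\s*rel="last"')


def last_page(link_header: Optional[str]) -> int:
    """Page number of the rel="last" link, or 1 when the response is a single page."""
    match = _LAST_LINK.search(link_header or "")
    if match is None:
        return 1
    pages = parse_qs(urlparse(match.group(1)).query).get("page", ["1"])
    return int(pages[0])


def is_young_maintainer(first_commit: datetime, reference: datetime,
                        threshold_days: int = 90) -> bool:
    """True when the first commit is younger than the threshold at `reference`."""
    return (reference - first_commit).days < threshold_days
```

Here `reference` stands in for the SBOM-after timestamp, or the current time when the SBOM carries none.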

Skipped cases

  • Components without a source_url (CycloneDX externalReferences with no vcs entry, etc.) — silently skipped.
  • Non-github.com source URLs — silently skipped (GitLab / Codeberg / etc. would need per-host clients; out of scope for v0).
  • Repositories with > 50 contributors — skipped because the “top contributor’s first commit” loses meaning on monorepos and multi-vendor projects.
  • Repositories returning 404 or 403 — skipped, warned once.

Per-repo results are cached within a single bomdrift run so repeated cs.added entries from the same project don’t re-issue the same three requests.

Network behavior

  • Per-request timeout: 15 seconds.
  • GITHUB_TOKEN honored: bumps the unauthenticated 60/hr cap to the authenticated 5000/hr cap. Without a token, large diffs (~30+ added GitHub deps) will hit rate limiting; this surfaces as a warning, partial results still render, and the exit code stays 0.
  • No octocrab: the octocrab crate would pull in tokio + ~70 transitive crates. Hand-rolled ureq GETs + a 25-line ISO-8601 parser keep the bomdrift binary under our 5 MB target.

Calibration

--young-maintainer-days <N> (CLI; v0.9.6+) or [diff] young_maintainer_days = <N> in .bomdrift.toml overrides the 90-day default. Must be >= 1.

Recommended ranges:

  • 30-60 for paranoid security-sensitive monorepos.
  • 90 (default) for general-purpose use; the calibration target for the xz pattern.
  • 180 for ecosystems with high contributor churn where the default surfaces too many legitimate first-time-author packages.

The threshold also appears in --debug-calibration rows so collected samples can guide tuning:

maintainer-age|<purl>|<days_since_first_commit>|90
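
Collected rows can be replayed against candidate thresholds before changing the setting. The sketch below is illustrative: the helper names are hypothetical, and it assumes rows are emitted for samples above the threshold as well as below it.

```python
from typing import Dict, Iterable, List


def maintainer_age_days(rows: Iterable[str]) -> List[int]:
    """Extract <days_since_first_commit> from maintainer-age calibration rows."""
    days = []
    for row in rows:
        fields = row.strip().split("|")
        if len(fields) == 4 and fields[0] == "maintainer-age":
            days.append(int(fields[2]))
    return days


def flagged_counts(days: List[int], candidates: Iterable[int]) -> Dict[int, int]:
    """For each candidate threshold, count the samples it would flag."""
    return {t: sum(1 for d in days if d < t) for t in candidates}
```

Comparing the counts for 30, 90, and 180 against a hand-labeled subset shows how much extra review load a wider window would add.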

Disabling

--no-maintainer-age skips the entire enricher (no GitHub API calls). Required for:

  • Offline runs and tests.
  • CI environments where GITHUB_TOKEN is unset and the unauthenticated rate limit (60/hr) is too low for the diff being analyzed.
  • Smoke tests of the deterministic offline signals.

bomdrift diff before.json after.json --no-maintainer-age

Severity

Always informational. The maintainer-age signal never trips --fail-on critical-cve; it surfaces only under --fail-on any. The intent is for human review, not gating: many legitimate packages have brand-new authors, and the threshold is calibrated to surface the xz-style pattern, not to fail the build automatically.

Calibration roadmap (v0.9.6+ status)

Past calibration backlog and how each item resolved:

  • Tunable threshold flag: shipped in v0.9.6 as --young-maintainer-days <N>. See Calibration above.
  • Multi-signal fusion — combine maintainer-age with the registry enricher’s “recently-published” or “maintainer-set-changed” findings to narrow the false-positive rate. The signals all surface in the same diff today; explicit fusion in a single composite finding is a v1.0 follow-up.
  • GraphQL pagination: decided not to pursue. Adds a token requirement (the GraphQL endpoint always wants auth) for one saved round-trip per component. The last-page REST trick is documented as the canonical approach; see the module doc-comment in src/enrich/maintainer.rs for the rationale.

See Roadmap for the current backlog.

Registry-metadata enrichers (npm / PyPI / crates.io)

bomdrift queries package registries for each newly-added component (plus npm version-changed components for the maintainer-set check) and surfaces three kinds of finding:

  • Recently published — the publish timestamp is within --recently-published-days (default 14 days). Recent publishes correlate with takeover swaps and namespace-reuse attacks.
  • Deprecated — the package or version is flagged deprecated on npm, yanked on PyPI / crates.io, or carries an “Inactive” PyPI classifier.
  • Maintainer set changed (npm only) — the maintainer set listed for the new version differs from the maintainer set listed for the old version. Classic xz / Jia Tan precursor.
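
Two of these checks reduce to small pure functions. The following is a hedged sketch with hypothetical names, not bomdrift's code; it assumes the window boundary is inclusive and treats a window of 0 as "check disabled", matching the --recently-published-days 0 behavior.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, Set


def is_recently_published(published: datetime, now: datetime,
                          window_days: int = 14) -> bool:
    """True when the publish time falls inside the window; 0 disables the check."""
    if window_days == 0:
        return False
    return now - published <= timedelta(days=window_days)


def maintainer_set_changes(old: Iterable[str], new: Iterable[str]) -> Dict[str, Set[str]]:
    """Difference between two maintainer sets, split into added and removed."""
    old_set, new_set = set(old), set(new)
    return {"added": new_set - old_set, "removed": old_set - new_set}
```

A finding would be raised when either set in the result is non-empty.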

Sources

| Ecosystem | URL | Headers |
| --- | --- | --- |
| npm | https://registry.npmjs.org/<pkg> (URL-encoded @scope/name) | User-Agent: bomdrift/<version> |
| PyPI | https://pypi.org/pypi/<pkg>/json | |
| crates.io | https://crates.io/api/v1/crates/<name> | User-Agent: bomdrift/0.9.0 (https://github.com/Metbcy/bomdrift) (required by crates.io) |

Disk cache

Per ecosystem under <XDG_CACHE>/bomdrift/registry/<eco>/<pkg>.json, 24-hour TTL, atomic temp-file + rename writes. Mirrors the OSV / EPSS / KEV cache shape.
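
The "temp-file + rename" write and the TTL check follow a common pattern, sketched here in Python for illustration. The function names are hypothetical and the freshness test uses file mtime, which is an assumption about how entry age is tracked.

```python
import json
import os
import tempfile
import time
from pathlib import Path
from typing import Any, Optional

TTL_SECONDS = 24 * 60 * 60  # the documented 24-hour default


def write_atomic(path: Path, payload: Any) -> None:
    """Write JSON to a temp file in the target directory, then rename into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
        os.replace(tmp, path)  # atomic when source and target share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def read_fresh(path: Path, ttl_seconds: int = TTL_SECONDS) -> Optional[Any]:
    """Return the cached JSON, or None when the entry is missing or older than the TTL."""
    try:
        age = time.time() - path.stat().st_mtime
    except FileNotFoundError:
        return None
    if age > ttl_seconds:
        return None
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

Creating the temp file in the same directory as the target keeps the rename on one filesystem, so a concurrent reader sees either the old entry or the new one, never a partial write.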

Best-effort

A registry timeout, parse error, or unsupported ecosystem returns Ok with no findings. Diff rendering NEVER blocks on registry responses.

Calibration

  • --recently-published-days <N> — override the default 14-day threshold. Set --recently-published-days 0 to disable that check while keeping deprecation / maintainer-set-changed.
  • --cache-ttl-hours <N> (v0.9.6+) — overrides the default 24h disk cache TTL for the per-ecosystem registry caches.

Disabling

bomdrift diff before.json after.json --no-registry

Disables all three checks at once. Equivalent to [diff] no_registry = true in .bomdrift.toml.

Flags

  • --no-registry — skip all three checks.
  • --recently-published-days <N> — see Calibration.
  • --fail-on recently-published, --fail-on deprecated — exit-2 thresholds.

Output

  • Markdown: three new sections — “Recently published”, “Deprecated upstream”, “Maintainer set changed (npm)” — in the per-category area.
  • JSON: enrichment.recently_published, enrichment.deprecated, enrichment.maintainer_set_changed.
  • SARIF: rules bomdrift.recently-published, bomdrift.deprecated, bomdrift.maintainer-set-changed with stable partialFingerprints.primaryHash/v1.
  • Calibration rows (--debug-calibration): recently-published|<purl>|<days>|14, deprecated|<purl>|<message>|any, maintainer-set-changed|<purl>|<changes>|1.

Why npm-only for maintainer-set-changed?

PyPI and crates.io don’t expose a clean “maintainers per version” view in their public REST API:

  • PyPI: the info.maintainer and info.author fields are free-text and inconsistent across releases. There’s no historical record per release.
  • crates.io: owners is package-level, not version-level, so we can’t tell which owners had publish rights at the time of an individual version.

When the upstream APIs gain a per-version maintainer view we’ll extend the enricher; that is a follow-up for a future version.

Baseline & suppression

The --baseline <path> flag suppresses findings that are already present in a previously captured bomdrift diff --output json snapshot. It exists to make adopting bomdrift on a project with pre-existing findings practical — the first PR shouldn’t drown in noise that’s already been reviewed and accepted.

How it works

  1. Capture a baseline once, after a maintainer has reviewed and accepted the current state of findings as known acceptable:

    bomdrift diff before.json after.json --output json > .bomdrift-baseline.json
    

    Commit .bomdrift-baseline.json to the repo.

  2. On subsequent runs, pass --baseline:

    bomdrift diff before.json after.json --baseline .bomdrift-baseline.json
    
  3. Findings whose match key is already present in the baseline are dropped from the rendered output and from the --fail-on trip evaluation. New findings — either at a new component, a new version of a known component, or a new advisory ID — surface normally.

Match keys

Match keys are intentionally conservative. A finding at a different version than baseline still surfaces — version drift is exactly the case where a known-acceptable finding becomes an unknown one, so suppressing across versions would defeat the point.

| Finding type | Match key |
| --- | --- |
| Vulnerability (CVE / GHSA / MAL) | (purl_with_version, advisory_id) |
| Typosquat | (purl_with_version) |
| Multi-major version jump | (purl_with_version) (the after-version) |
| Young maintainer | (purl_with_version) |

Notes:

  • License-changed-without-version-bump pairs are part of the ChangeSet, not the enrichment. --baseline suppresses findings, not the diff itself, so license changes always surface in the rendered output. This is intentional — a license change at a known version is still a change worth a reviewer’s eye.
  • Vulnerabilities use the advisory ID in the key, so a new GHSA against an already-known component still fires.
  • Typosquats use the after-version in the key, so a typo’d foo@1.0.0 in the baseline doesn’t suppress a typo’d foo@2.0.0.
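
As a rough model of the matching described above, the filter can be pictured as a set lookup on the match key. This is a sketch only; the field names purl and advisory_id and both function names are assumptions made for illustration, not the baseline file's confirmed schema.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, ...]


def match_key(kind: str, finding: Dict[str, str]) -> Key:
    """Build the conservative match key for one finding."""
    if kind == "vulnerability":
        return (kind, finding["purl"], finding["advisory_id"])
    return (kind, finding["purl"])  # typosquat, version jump, young maintainer


def drop_baselined(kind: str, current: List[Dict[str, str]],
                   baseline: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the findings whose key is absent from the baseline."""
    known = {match_key(kind, f) for f in baseline}
    return [f for f in current if match_key(kind, f) not in known]
```

Because the purl carries the version, a bumped component or a new advisory ID produces a key the baseline has never seen, so the finding surfaces again.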

Forward compatibility

The baseline parser is intentionally forgiving about missing fields. v0.2 baselines can suppress a vuln by (purl, advisory_id) even when the v0.3+ enrichment has populated severity, just with reduced precision. Regenerate baselines under v0.3+ to capture the full match shape.

As of v0.4, the action ships a baseline: input that plumbs straight through to --baseline — no need for a custom step calling the binary directly.

In-comment suppression (v0.5+)

Editing .bomdrift/baseline.json by hand on every accepted finding is friction. v0.5 ships a comment-driven flow: a reviewer comments /bomdrift suppress <ADVISORY-ID> on a PR, and a companion sub-action appends the ID to the baseline file and commits it to the PR’s head branch. The next bomdrift run on the same PR sees the finding as suppressed.

Setup

Add a second workflow alongside your normal bomdrift one:

# .github/workflows/bomdrift-suppress.yml
name: bomdrift suppress
on:
  issue_comment:
    types: [created]

permissions:
  contents: write       # to commit the baseline file
  pull-requests: write  # to react on the trigger comment

jobs:
  suppress:
    if: |
      github.event.issue.pull_request &&
      startsWith(github.event.comment.body, '/bomdrift suppress ')
    runs-on: ubuntu-latest
    steps:
      - uses: Metbcy/bomdrift/comment-suppress@v1

The if: filter is conservative — it gates on both github.event.issue.pull_request (so issue comments don’t trigger) and the comment-body prefix. The sub-action also re-validates both internally and exits cleanly on non-matching events, so the filter is defense-in-depth, not load-bearing.

What it does

  1. Parses the comment body for /bomdrift suppress <id>. The ID must match a GHSA / CVE / MAL pattern.
  2. Reacts to acknowledge that the command was accepted.
  3. Resolves the PR’s head ref via the GitHub API.
  4. Downloads the latest bomdrift release archive and (by default) verifies its cosign signature.
  5. Clones the PR’s head branch into a sibling worktree.
  6. Runs bomdrift baseline add <id> --path <baseline-path>, which appends the ID to the suppressed_advisories array in the baseline file (creating the file if missing).
  7. Commits + pushes the baseline change with message chore(bomdrift): suppress <id>.
  8. Reacts on the trigger comment to show success or failure.

What it suppresses

The v0.5 in-comment flow uses a wildcard advisory match: the specified ID is suppressed across all components, not just the one the comment was attached to. This is intentional — the typical case is “this advisory is a known false positive in our environment regardless of which dep pulls it in.” For per-component suppression, hand-edit the baseline using the existing diff-output JSON shape (see Match keys above) — both shapes coexist in the same file.

CLI equivalent

The same operation is available from the command line for users who want to curate a baseline outside CI:

bomdrift baseline add GHSA-xxxx-yyyy-zzzz
bomdrift baseline add CVE-2026-12345 --path custom/baseline.json

The command is idempotent — re-adding an existing ID is a no-op.

--from-comment (v0.9+)

When the GitLab comment-suppress bridge (or any other webhook handler) hands you a raw note body, pass it via --from-comment and let bomdrift extract the directive:

bomdrift baseline add --from-comment "Looks fine. /bomdrift suppress GHSA-mwcw-c2x4-8c55 reason: vendor PR #42 already merged"

The flag accepts the entire comment body. bomdrift parses the first /bomdrift suppress <ID>[ reason: <text>] line, validates the ID shape, and either appends the entry (writing object-form when a reason is present) or exits non-zero with a clear stderr message when no directive is found. The grammar is identical to the GitHub comment-suppress sub-action — the two parsers are deliberately kept in lockstep.
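
A parser for this grammar might look like the sketch below. It is not bomdrift's parser: the ID pattern (a GHSA, CVE, or MAL prefix followed by dash-separated alphanumerics) is an assumed approximation of the real ID-shape validation, and the names are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed ID shape: known prefix, then two or more dash-separated alphanumeric groups.
_DIRECTIVE = re.compile(
    r"/bomdrift suppress (?P<id>(?:GHSA|CVE|MAL)-[A-Za-z0-9]+(?:-[A-Za-z0-9]+)+)"
    r"(?:\s+reason:\s*(?P<reason>.+))?"
)


class Directive(NamedTuple):
    advisory_id: str
    reason: Optional[str]


def parse_directive(body: str) -> Optional[Directive]:
    """Return the first suppress directive found in a comment body, if any."""
    for line in body.splitlines():
        match = _DIRECTIVE.search(line)
        if match:
            reason = match.group("reason")
            return Directive(match.group("id"), reason.strip() if reason else None)
    return None
```

A caller would exit non-zero when the function returns None, and write the object form of the entry when a reason is present.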

Workflow integration

A typical CI pattern commits the baseline alongside the source code and refreshes it after a maintainer reviews and accepts new noise as known acceptable:

- uses: Metbcy/bomdrift@v1
  with:
    before-sbom: before.json
    after-sbom:  after.json
    baseline:    .bomdrift/baseline.json
    fail-on:     critical-cve

When this fails on a new finding, the maintainer either:

  1. Fixes the finding (upgrade the dep, replace the typosquat) — no baseline change needed.
  2. Accepts the finding as known acceptable — regenerates the baseline and commits it:
    bomdrift diff before.json after.json --output json > .bomdrift/baseline.json
    git add .bomdrift/baseline.json
    
    Reviewers see the diff against the previous baseline in the same PR and decide whether the new entry is acceptable.

When NOT to use a baseline

  • For a fresh project. If you can fix every finding before merging the bomdrift integration PR, do that — the baseline is technical debt, even if it’s debt with a clear purpose.
  • For severity-bucket gating. Use --fail-on critical-cve to gate the merge on actionable severity instead of suppressing everything under that severity. Baselines are for “we know about this, it’s fine for now”, not “ignore this entire class”.
  • For findings you’ll fix in the next PR. A baseline is a long-lived artifact; for one-PR exceptions, just upgrade the dep.

Worked example: triaging a false positive

Real-world false positives are the most common reason adopters reach for the baseline. A typical case looks like this on a PR:

🚨 Typosquat candidate — new dependency colour-print is within Levenshtein distance 1 of well-known package colorprint. Review for impersonation.

In our example, colour-print is a deliberate British-English spelling maintained by a long-trusted internal team — this is the canonical “signal that’s true in the abstract, wrong for our codebase” case. The Levenshtein heuristic should fire on this; what’s wrong is the verdict, not the detection. Suppressing the whole typosquat class (via --fail-on cve) loses coverage on actually-malicious squats; a wildcard config field would over-suppress; what we want is exactly this finding suppressed.

Step 1 — capture the current finding shape

Before deciding what to suppress, see what bomdrift saw. Run with --output json and pull out the typosquat finding:

bomdrift diff before.json after.json --output json \
  | jq '.enrichment.typosquat[] | select(.purl | contains("colour-print"))'

Output:

{
  "purl": "pkg:npm/colour-print@2.1.0",
  "candidate_for": "colorprint",
  "distance": 1,
  "ecosystem": "npm"
}

The purl_with_version here is pkg:npm/colour-print@2.1.0 — the match key for the typosquat entry per the table above.

Step 2 — write a per-component baseline entry

Edit .bomdrift/baseline.json (the file bomdrift init scaffolds, or whatever path you pass to --baseline). The diff-output JSON shape takes precedence, so a hand-written entry uses the same fields the JSON output produces:

{
  "suppressed_advisories": [],
  "findings": {
    "typosquat": [
      {
        "purl": "pkg:npm/colour-print@2.1.0",
        "candidate_for": "colorprint",
        "ecosystem": "npm",
        "_note": "British-English spelling, owned by team-foo since 2019. Re-evaluate on major-version bump."
      }
    ]
  }
}

The _note field is an underscore-prefixed extension; bomdrift preserves unknown fields verbatim on round-trip and never reads them back, so it’s a safe place to capture the why. Future maintainers who read the baseline see the rationale without spelunking through git blame.

Step 3 — verify the suppression takes effect

Re-run the diff with the baseline applied:

bomdrift diff before.json after.json \
  --baseline .bomdrift/baseline.json

The colour-print finding is gone; everything else (including any other typosquat candidate that shows up the same week) still surfaces. That’s the trade-off: a precise hand-written entry beats a wildcard or a class-wide opt-out, because the next typosquat against a new package still trips the gate.

Why a hand-edited entry beats --fail-on tuning

It’s tempting to “just” loosen --fail-on typosquat to --fail-on critical-cve. Don’t:

  • The typosquat enricher is your earliest signal for malicious packages — a real squat (colorize impersonating colorise) is caught here before the OSV.dev advisory exists.
  • A baseline entry is auditable: git log .bomdrift/baseline.json shows when this exception was made and by whom.
  • A wildcard config setting (e.g., a hypothetical [diff.typosquat] allow_distance_1 = true) would also suppress unrelated future squats. Per-component is the smallest possible exception that still fixes this one PR.

When the bump is the false positive

Sometimes the finding is a multi-major version jump on a package you expect to leap (a calver-style release schedule, a coordinated ecosystem-wide bump). The same per-component recipe works — replace the typosquat array with version_jump, key by the after-version’s purl. Update the entry on the next jump.

Schema reference

The unified BaselineEntry shape (introduced in v0.9.5; v0.5 string entries continue to parse as the back-compat case):

| Field | Type | Required | Introduced | Description |
| --- | --- | --- | --- | --- |
| id | string | yes (when not the bare-string form) | v0.5 | Advisory identifier: GHSA-…, CVE-…, MAL-…, or OSV-…. |
| purl | string | no | v0.5 | Restrict the suppression to a specific component (otherwise wildcards across all components). May be versionless (pkg:npm/foo) or version-pinned (pkg:npm/foo@1.2.3). |
| expires | string YYYY-MM-DD | no | v0.8 | Strict-format expiry date. After this date the entry surfaces a warning and stops suppressing. Malformed dates fail loudly: no silent never-expiring entries. |
| reason | string | no | v0.8 | Free-form rationale; surfaces in the expiry warning and as the OpenVEX statement_text in --emit-vex output. |
| vex_status | string | no | v0.9 | One of OpenVEX’s vocabulary: not_affected, affected, fixed, under_investigation. Drives --emit-vex output. Defaults to under_investigation so --emit-vex doesn’t fabricate not_affected claims. |
| vex_justification | string | no | v0.9 | OpenVEX justification when vex_status = not_affected. E.g., vulnerable_code_not_in_execute_path, component_not_present. |

Cross-link: vex_status and vex_justification are passthrough to the VEX emit format. The License policy chapter covers using baseline entries to suppress LicenseViolation findings (the same id / purl / reason schema applies; license violations key by a synthetic ID bomdrift.license-violation:<purl>).

Two valid shapes per entry

The suppressed_advisories array accepts either form per entry:

{
  "suppressed_advisories": [
    "GHSA-old-school",
    {
      "id": "GHSA-evil-1234",
      "purl": "pkg:npm/foo",
      "expires": "2026-12-31",
      "reason": "Awaiting upstream patch (issue #42)",
      "vex_status": "under_investigation"
    }
  ]
}

Bare strings remain in the file for v0.5 compatibility; bomdrift baseline add --reason … always emits the object form.

Time-boxed suppressions (expires + reason)

v0.8 adds two optional fields on each suppressed_advisories entry:

{
  "suppressed_advisories": [
    {
      "id": "GHSA-evil-1234",
      "purl": "pkg:npm/foo",
      "expires": "2026-12-31",
      "reason": "Awaiting upstream patch (issue #42)"
    },
    "GHSA-old-school"
  ]
}

Both fields are optional. String entries (the v0.5 form) keep working — the array is a union of both shapes.

Behavior

  • Active entry (expires is today or in the future, OR no expires): finding is suppressed as before.

  • Expired entry (expires is strictly before today): finding surfaces, and bomdrift prints one warning line per expired entry to stderr:

    warning: baseline entry GHSA-evil-1234 (pkg:npm/foo) expired 2026-04-29; finding will surface in this run — was: Awaiting upstream patch (issue #42)
    
  • Malformed expires (e.g. 2026/12/31): bomdrift refuses to load the baseline rather than silently treating it as never-expiring. Use strict YYYY-MM-DD zero-padded.

The “today” comparison honors SOURCE_DATE_EPOCH so reproducible-build contexts stay deterministic.
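
The three behaviors can be condensed into one predicate. The sketch below is illustrative and uses hypothetical names; it assumes "today" is computed in UTC, which the text above does not specify.

```python
import os
import re
import time
from datetime import date, datetime, timezone
from typing import Optional, Union

_STRICT_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def today() -> date:
    """Current UTC date, pinned by SOURCE_DATE_EPOCH when that variable is set."""
    epoch = int(os.environ.get("SOURCE_DATE_EPOCH", time.time()))
    return datetime.fromtimestamp(epoch, tz=timezone.utc).date()


def is_active(entry: Union[str, dict], on: Optional[date] = None) -> bool:
    """True while the entry still suppresses; raises ValueError on a malformed date."""
    expires = None if isinstance(entry, str) else entry.get("expires")
    if expires is None:
        return True  # bare strings and entries without `expires` never expire
    if not _STRICT_DATE.fullmatch(expires):
        raise ValueError(f"malformed expires {expires!r}; expected YYYY-MM-DD")
    return date.fromisoformat(expires) >= (on or today())
```

Raising on a malformed date, rather than returning True, is what keeps a typo such as 2026/12/31 from becoming a silent never-expiring entry.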

CLI

bomdrift baseline add GHSA-evil-1234 \
  --expires 2026-12-31 \
  --reason "Awaiting upstream patch (issue #42)"

The comment-suppress companion action also picks up an optional reason: <text> line in the triggering comment body:

/bomdrift suppress GHSA-evil-1234
reason: Awaiting upstream patch (issue #42)

Worked rotation example

Six months ago the team accepted GHSA-evil-1234 with a 6-month expiry. Today the warning fires:

warning: baseline entry GHSA-evil-1234 expired 2026-04-29 …

The reviewer either renews the suppression (new PR, new expiry + reason) or removes the entry and merges the upstream patch. Suppressions become reviewed work-items, not silent forever-state.

VEX (Vulnerability Exploitability eXchange)

bomdrift consumes and emits VEX statements so reviewers can record exploitability decisions next to their SBOMs and have those decisions suppress noise on subsequent diffs.

Two formats are supported on input (auto-detected per file):

  • OpenVEX 0.2.0 — see https://github.com/openvex/spec.
  • CycloneDX VEX 1.6: analysis.state is mapped onto the OpenVEX vocabulary (not_affected / resolved → not_affected, exploitable → affected, in_triage → under_investigation).

OpenVEX is bomdrift’s preferred output format on emission (--emit-vex) because the standalone JSON-LD doc is the smallest interop surface.

Consuming VEX (--vex <path>)

The flag is repeatable. Each file is auto-detected by its top-level shape. Statements match findings by (vuln_id_or_alias, product_purl).

| VEX status | Effect on the matching finding |
| --- | --- |
| not_affected | Suppresses (counted in “Suppressed by VEX”) |
| fixed | Suppresses |
| under_investigation | Annotates with VEX:under_investigation |
| affected | Annotates with VEX:affected |

A VEX statement’s products[] may be either purl strings or {"@id": "pkg:..."} objects. A versionless statement (pkg:npm/foo) matches every versioned finding-product (pkg:npm/foo@1.2.3); a versioned statement only matches the exact purl.
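
The product-matching rule can be sketched as follows. This is a simplified illustration with hypothetical function names; it ignores purl qualifiers and subpaths, which a real purl parser would need to handle.

```python
from typing import Iterable, List, Union

Product = Union[str, dict]


def product_purls(products: Iterable[Product]) -> List[str]:
    """Accept both bare purl strings and {"@id": "pkg:..."} objects."""
    return [p if isinstance(p, str) else p["@id"] for p in products]


def has_version(purl: str) -> bool:
    """A purl is versioned when its final path segment carries an @version."""
    return "@" in purl.rsplit("/", 1)[-1]


def strip_version(purl: str) -> str:
    """Drop the trailing @version, leaving versionless purls untouched."""
    return purl.rsplit("@", 1)[0] if has_version(purl) else purl


def statement_matches(statement_purl: str, finding_purl: str) -> bool:
    """Versionless statements match any version; versioned ones match exactly."""
    if has_version(statement_purl):
        return statement_purl == finding_purl
    return strip_version(finding_purl) == statement_purl
```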

Synthetic finding IDs

bomdrift emits non-CVE findings (typosquats, version-jumps, maintainer-age, license-violations). To author VEX statements that suppress them, use the synthetic ID convention:

| Finding kind | Synthetic ID format |
| --- | --- |
| Typosquat | bomdrift.typosquat:<purl>:<closest> |
| Version-jump | bomdrift.version-jump:<purl>:<before_major>-><after_major> |
| Maintainer-age | bomdrift.young-maintainer:<purl>:<top_contributor> |
| License-violation | bomdrift.license-violation:<purl>:<license_string> |

Example OpenVEX statement suppressing a typosquat finding:

{
  "vulnerability": { "name": "bomdrift.typosquat:pkg:npm/plain-crypto-js@4.2.1:crypto-js" },
  "products": [ { "@id": "pkg:npm/plain-crypto-js@4.2.1" } ],
  "status": "not_affected",
  "justification": "vulnerable_code_not_present",
  "status_notes": "verified the package is a re-export and not impersonating crypto-js"
}

Multiple files

--vex first.json --vex second.json is processed left-to-right. Statements with the same (vuln_id, product) are first-write-wins: later files do NOT override earlier ones. Pass project-level VEX first and policy-level VEX second so the project-level entries take precedence over the policy defaults. (Or pass them in the reverse order if you want the opposite precedence.)
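
First-write-wins merging is easy to model with a dictionary. In this sketch the flattened statement shape (vuln_id, product, status keys) and the function name are hypothetical, chosen only to show the precedence rule.

```python
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, str]  # (vuln_id, product_purl)


def merge_first_write_wins(files: Iterable[List[dict]]) -> Dict[Key, dict]:
    """Merge statements file by file; an existing key is never overwritten."""
    merged: Dict[Key, dict] = {}
    for statements in files:  # left-to-right, as passed on the command line
        for statement in statements:
            key = (statement["vuln_id"], statement["product"])
            merged.setdefault(key, statement)
    return merged
```

With two files that both cover the same (vuln_id, product) pair, the statement from the file listed first is the one that survives.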

Verifying with vexctl

If you have vexctl installed:

vexctl filter --vex bomdrift.openvex.json sbom.cdx.json

verifies the VEX doc is well-formed and that statements match a known purl in your SBOM.

Emitting VEX (--emit-vex <path>)

Writes a single OpenVEX 0.2.0 document covering every finding in the post-baseline diff.

  • Baseline-suppressed findings inherit their vex_status from the baseline entry, defaulting to under_investigation. Baseline ≠ “not affected” — baseline often means “accepted in PR review” or “temporarily ignored”, so emitting not_affected by default would publish a false claim. Opt in by adding vex_status: "not_affected" to the baseline entry:

    {
      "id": "GHSA-x-y-z",
      "purl": "pkg:npm/foo",
      "expires": "2026-12-31",
      "reason": "Awaiting upstream patch (issue #42)",
      "vex_status": "not_affected",
      "vex_justification": "vulnerable_code_not_present"
    }
    
  • Un-suppressed findings emit as affected with status_notes describing the bomdrift finding kind. The justification field falls back to the configured [diff] vex_default_justification (default vulnerable_code_not_in_execute_path).

The doc’s timestamp honors SOURCE_DATE_EPOCH, so --emit-vex output is byte-deterministic in CI when the env is set.

Configuration keys

[diff]
vex_author = "https://example.com/security"
vex_default_justification = "vulnerable_code_not_in_execute_path"

vex_author falls back to repo_url when unset; falls back to "bomdrift" when both are missing.

Justification vocabulary

bomdrift uses the OpenVEX 0.2.0 spec’s standard justification values verbatim: component_not_present, vulnerable_code_not_present, vulnerable_code_not_in_execute_path, vulnerable_code_cannot_be_controlled_by_adversary, and inline_mitigations_already_exist. Richer justification vocabularies (per-organization tags, custom-reason strings, tool-specific extensions) are out of scope: authoring against a single canonical enum keeps --emit-vex output interoperable with any OpenVEX consumer. If the OpenVEX spec evolves to add new justifications, bomdrift follows the spec; non-spec justifications won’t be invented here.

Worked rotation example

  1. Run a diff that surfaces GHSA-evil on pkg:npm/foo@1.0.0.

  2. Investigate, conclude the vulnerable function is not on your execute path.

  3. Add the entry to .bomdrift/baseline.json with VEX status:

    {
      "schema_version": 1,
      "suppressed_advisories": [
        {
          "id": "GHSA-evil",
          "purl": "pkg:npm/foo@1.0.0",
          "expires": "2027-01-01",
          "reason": "Function is unreachable per audit (PR #123)",
          "vex_status": "not_affected",
          "vex_justification": "vulnerable_code_not_in_execute_path"
        }
      ]
    }
    
  4. Re-run with --emit-vex bomdrift.openvex.json to produce a publishable exploitability statement that downstream consumers can ingest with their own --vex flag.

License policy

bomdrift can enforce a license allow/deny policy on every newly added or version-changed component. Distinct from the License changed finding (which detects same-version license drift), this is “the configured policy says this license isn’t allowed.”

Configuration

In .bomdrift.toml:

[license]
allow = ["MIT", "Apache-2.0", "BSD-3-Clause", "ISC"]
deny  = ["GPL-3.0-only", "AGPL-*"]
allow_ambiguous = false

Or via CLI flags (override the config block when set, matching the GitHub Dependency Review Action flag names exactly):

bomdrift diff before.json after.json \
    --allow-licenses MIT,Apache-2.0,BSD-3-Clause \
    --deny-licenses 'GPL-3.0-only,AGPL-*'

Both flags accept comma-separated values and may be repeated.

Matching rules (v0.8 — fail-closed)

| Input | With allow_ambiguous=false | With allow_ambiguous=true |
| --- | --- | --- |
| Atomic license on allow | permit | permit |
| Atomic license on deny | deny | deny |
| Atomic license matching *-suffix glob in deny (AGPL-* → AGPL-3.0-only) | deny | deny |
| Atomic license not on allow (when allow is non-empty) | not-allowed | not-allowed |
| Compound expression (MIT OR GPL-3.0) | ambiguous | permit |
| NOASSERTION / OTHER / empty | ambiguous | permit |

Deny wins when a license matches both allow and deny.

Compound SPDX expression evaluation ((MIT OR Apache-2.0) against allow={Apache-2.0} resolves to permit) lands in v0.9 via the spdx crate. v0.8 fails closed on every compound expression unless allow_ambiguous=true is set explicitly.
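
The v0.8 table can be read as a small decision function. The sketch below is an approximation with hypothetical names: it uses Python's fnmatchcase for globbing (which accepts more than a trailing *), and it treats only AND / OR as compound markers, since the v0.8 handling of WITH is not stated here.

```python
from fnmatch import fnmatchcase
from typing import Sequence

AMBIGUOUS_ATOMS = {"", "NOASSERTION", "OTHER"}


def matches(license_id: str, patterns: Sequence[str]) -> bool:
    """True when the license equals, or glob-matches, any pattern."""
    return any(fnmatchcase(license_id, p) for p in patterns)


def classify(license_id: str, allow: Sequence[str], deny: Sequence[str],
             allow_ambiguous: bool = False) -> str:
    """Return permit, deny, not-allowed, or ambiguous for one license string."""
    text = license_id.strip()
    compound = any(op in text.split() for op in ("AND", "OR"))
    if text in AMBIGUOUS_ATOMS or compound:
        return "permit" if allow_ambiguous else "ambiguous"
    if matches(text, deny):
        return "deny"  # deny wins over allow
    if allow and not matches(text, allow):
        return "not-allowed"
    return "permit"
```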

Threshold gating

bomdrift diff before.json after.json --fail-on license-violation

Exits 2 when any violation is present. --fail-on any also includes license violations.

Output

  • Markdown: new “License violations” section before “License changed”, with ecosystem / name / version / license / matched-rule columns.
  • Terminal: [LIC] tag + matched rule per finding.
  • JSON: enrichment.license_violations top-level array.
  • SARIF: bomdrift.license-violation rule + per-finding result with stable partialFingerprints.primaryHash/v1. See SARIF + Code Scanning.

Suppression

License violations honor the standard --baseline machinery via the v0.5 suppressed_advisories field. Use a fully-qualified license identifier (or the SPDX expression as written by the SBOM) as the suppression key. The v0.8 expires + reason fields work the same way.

SPDX expression evaluation (v0.9+)

bomdrift evaluates each license string as a full SPDX expression via the spdx crate. Evaluation outcomes:

| Expression | Allow | Deny | Outcome |
| --- | --- | --- | --- |
| MIT | [MIT] | | Permitted (allow exact match) |
| (MIT OR Apache-2.0) | [MIT] | | Permitted (one branch allowed) |
| (MIT AND GPL-3.0-only) | [MIT] | [GPL-3.0-only] | Violation (deny wins) |
| (GPL-3.0-only OR MIT) AND BSD-3-Clause | [MIT, BSD-3-Clause] | [GPL-3.0-only] | Violation (denial path could resolve to GPL) |
| Apache-2.0 WITH LLVM-exception | [Apache-2.0] | | Permitted (base license allowed; exception identity is currently informational only) |
| Custom (non-SPDX) | [MIT] | | Falls back to atomic match → not in allow list |
| NOASSERTION / OTHER / empty | [MIT] | | Ambiguous → violation (fail-closed) |

Precedence

  1. Deny wins — any required atomic on the deny list (including any OR-branch) trips a violation, because the resolved license could be the denied alternative.
  2. Glob: * suffix patterns work in both lists (e.g. AGPL-* matches every AGPL-*-only family member).
  3. Allow — when the allow list is non-empty, the SPDX expression must evaluate to true under a closure that returns true for allow-listed atomics.
  4. Non-SPDX strings — fall through to the v0.8 atomic-string matcher so vendor-specific license strings keep working.
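
Rules 1 to 3 can be modeled on a pre-parsed expression tree. The sketch below skips parsing entirely (bomdrift delegates that to the spdx crate) and represents expressions as nested tuples; the tuple encoding and function names are assumptions made for illustration.

```python
from fnmatch import fnmatchcase
from typing import Iterator, Sequence, Tuple, Union

Expr = Union[str, Tuple[str, "Expr", "Expr"]]  # leaf, or ("AND" | "OR", left, right)


def atoms(expr: Expr) -> Iterator[str]:
    """Yield every license identifier in the tree, including all OR branches."""
    if isinstance(expr, str):
        yield expr
    else:
        _, left, right = expr
        yield from atoms(left)
        yield from atoms(right)


def listed(license_id: str, patterns: Sequence[str]) -> bool:
    return any(fnmatchcase(license_id, p) for p in patterns)


def satisfied(expr: Expr, allow: Sequence[str]) -> bool:
    """Evaluate the tree with allow-listed atomics counting as true."""
    if isinstance(expr, str):
        return listed(expr, allow)
    op, left, right = expr
    if op not in ("AND", "OR"):
        raise ValueError(f"unknown operator {op!r}")
    results = (satisfied(left, allow), satisfied(right, allow))
    return all(results) if op == "AND" else any(results)


def evaluate(expr: Expr, allow: Sequence[str], deny: Sequence[str]) -> str:
    if any(listed(a, deny) for a in atoms(expr)):
        return "violation"  # deny wins, even on an OR branch
    if allow and not satisfied(expr, allow):
        return "violation"
    return "permitted"
```

The deny check runs over every atomic before the allow evaluation, which is why an expression with one allowed OR branch still fails when the other branch is denied.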

Deprecated: allow_ambiguous

The v0.8 allow_ambiguous flag flipped fail-closed behavior on compound expressions. v0.9’s evaluator handles compounds correctly, so the flag is now a no-op when SPDX parsing succeeds. A one-time deprecation warning is printed to stderr per run when the flag is set. The flag still works on the fallback path (non-SPDX strings) for back-compat; it will be removed in v1.0.

WITH (exception) granularity

Per-exception allow/deny is configured with --allow-exception / --deny-exception (or [license] allow_exceptions / deny_exceptions in .bomdrift.toml). When either list is non-empty, the right-hand side of every WITH clause is evaluated against it: Apache-2.0 WITH LLVM-exception is permitted iff Apache-2.0 passes the base policy AND LLVM-exception is on the allow list (or absent from a non-empty deny list). Empty exception lists preserve v0.9 behavior — exceptions are informational only.

Compound-expression inheritance (v0.9.7)

v0.9.7 refines how exception decisions propagate through compound expressions. The rules:

  1. AND inherits: (X WITH ex) AND (Y) denies if either sub-clause would deny on its own. A denied exception in any conjunct denies the whole expression — every required atomic must be satisfiable, so a poisoned WITH clause poisons the conjunction.
  2. OR does not poison: (X WITH ex_a) OR (X WITH ex_b) is permitted when at least one branch is permitted. A denied exception on one branch doesn’t sink the expression as long as another branch resolves cleanly.
  3. Bare exception lookup: WITH <exception> without an allow/deny exception list configured falls through to v0.9 behavior (informational; the base license alone gates).
  4. Deny still wins atomically: a base license on the deny list denies regardless of the exception attached.
Worked examples

Assume [license] allow = ["Apache-2.0", "MIT"], allow_exceptions = ["LLVM-exception"], deny_exceptions = ["Classpath-exception-2.0"].

| Expression | Resolution | Why |
|---|---|---|
| Apache-2.0 WITH LLVM-exception | permit | base allowed, exception allowed |
| Apache-2.0 WITH Classpath-exception-2.0 | deny | exception on deny list |
| Apache-2.0 WITH Some-other-exception | deny | base allowed, but exception not on the non-empty allow list |
| (Apache-2.0 WITH LLVM-exception) AND BSD-3-Clause | deny | AND inherits: BSD-3-Clause not on allow list, denies the conjunction even though the WITH half is fine |
| (Apache-2.0 WITH LLVM-exception) AND MIT | permit | both conjuncts pass independently |
| (Apache-2.0 WITH Classpath-exception-2.0) AND MIT | deny | denied exception poisons the AND |
| (Apache-2.0 WITH Classpath-exception-2.0) OR (Apache-2.0 WITH LLVM-exception) | permit | OR doesn’t poison: the LLVM branch resolves cleanly |
| (Apache-2.0 WITH Classpath-exception-2.0) OR (GPL-3.0-only) | deny | both branches denied (one by exception, one by missing-from-allow) |

The runtime evaluator constructs a closure over the allow / deny exception sets and lets the spdx crate’s expression-evaluation walk the tree; the rules above describe the closure’s per-leaf decision.
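
To make the per-leaf decision concrete, here is a small Python sketch that reproduces the worked-examples table. It is an illustration under stated assumptions, not the Rust evaluator: the function name `evaluate` and its parameters are hypothetical, the parser handles only `AND`, `OR`, `WITH`, and parentheses, and no base deny list is modeled (the worked examples don't configure one).

```python
import re

def evaluate(expr, allow, allow_exc, deny_exc):
    tokens = re.findall(r"\(|\)|[^\s()]+", expr)
    pos = 0

    def leaf(base, exc):
        # Per-leaf decision: base policy first, then the exception lists.
        if base not in allow:
            return False
        if exc is None or not (allow_exc or deny_exc):
            return True  # bare license, or exceptions are informational
        if exc in deny_exc:
            return False
        return not allow_exc or exc in allow_exc

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def atom():
        if peek() == "(":
            take()
            val = disjunction()
            take()  # closing ")"
            return val
        base = take()
        exc = None
        if peek() == "WITH":
            take()
            exc = take()
        return leaf(base, exc)

    def conjunction():
        val = atom()
        while peek() == "AND":
            take()
            rhs = atom()
            val = val and rhs  # AND inherits: one denied conjunct denies all
        return val

    def disjunction():
        val = conjunction()
        while peek() == "OR":
            take()
            rhs = conjunction()
            val = val or rhs  # OR does not poison: one clean branch suffices
        return val

    return "permit" if disjunction() else "deny"
```

Called with `allow=["Apache-2.0", "MIT"]`, `allow_exc=["LLVM-exception"]`, and `deny_exc=["Classpath-exception-2.0"]`, the sketch gives the same resolution as each row of the table.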

OCI artifact attestation

bomdrift can verify that the SBOMs it diffs were signed by your build system before any drift signal is computed. This closes the “who produced this SBOM?” gap: you already trust the binary you shipped through SLSA-style signing — the SBOM that describes that binary’s supply chain deserves the same scrutiny.

Shipped in v0.9.6. The verification path is opt-in per flag; existing file-based diffs (bomdrift diff before.json after.json) are unaffected unless you explicitly pass attestation flags.

Overview

An OCI attestation is a signed in-toto envelope, stored next to a container image in an OCI registry, that asserts a claim about that image. bomdrift consumes attestations whose predicate type is cyclonedx: the predicate body is a CycloneDX SBOM, which bomdrift then diffs against another (also-attested) SBOM.

bomdrift does not ship a Sigstore client. It shells out to cosign, which handles:

  • in-toto envelope signature verification,
  • certificate-chain validation against Fulcio,
  • transparency-log inclusion proof (Rekor),
  • certificate-identity matching against your supplied regex/issuer.

bomdrift trusts cosign’s verdict. If cosign exits 0, bomdrift parses the verified predicate and feeds it to the diff core. If cosign exits non-zero, bomdrift surfaces the cosign stderr verbatim and exits 1.

Threat model gap NOT addressed

bomdrift does not implement Sigstore protocol verification itself. You are trusting cosign’s implementation, the cosign binary on PATH, and whichever Sigstore instance cosign is configured against (public-good by default; see Self-managed Sigstore).

Prerequisites

  1. Install cosign. Follow https://docs.sigstore.dev/system_config/installation/. v0.9.6 was developed and tested against cosign 2.x. Pin to a specific cosign version in your CI image so signature-verification semantics don’t drift across runs.
  2. Push your SBOMs as cyclonedx attestations on the same OCI reference as the binary they describe (see next section).

Generating attestations

The canonical guide is the sigstore docs; this section is a sketch.

# Produce the SBOM however you do today (Syft, etc.).
syft <oci-ref> -o cyclonedx-json > sbom.cdx.json

# Sign it as an attestation against the same digest.
cosign attest \
  --predicate sbom.cdx.json \
  --type     cyclonedx \
  ghcr.io/myorg/myapp@sha256:abc...

The --type cyclonedx flag is the predicate-type matcher bomdrift filters on. Other predicate types (SPDX, SLSA provenance, custom) are ignored — see What’s NOT in v0.9.6.

Verifying with bomdrift

Pass an OCI reference instead of a local file path via the attestation flags:

bomdrift diff \
  --before-attestation oci://ghcr.io/myorg/myapp@sha256:abc... \
  --after-attestation  oci://ghcr.io/myorg/myapp@sha256:def... \
  --cosign-identity '^https://github.com/myorg/.+@refs/tags/v.+$' \
  --cosign-issuer    https://token.actions.githubusercontent.com

--before-attestation <OCI-REF>

OCI reference (with oci:// scheme) of the “before” image whose attached cyclonedx attestation is the “before” SBOM. Mutually exclusive with the positional <BEFORE> argument; pass one or the other.

--after-attestation <OCI-REF>

Same as above, for the “after” SBOM.

--cosign-identity <REGEX>

Required when any --*-attestation flag is set. RE2-syntax regex that the certificate’s Subject Alternative Name (SAN) must match. For GitHub Actions OIDC, this is typically the workflow URL plus a refs constraint, e.g. ^https://github.com/myorg/myapp/.github/workflows/release\.yml@refs/tags/v.+$.

bomdrift passes this to cosign as --certificate-identity-regexp.

--cosign-issuer <URL>

Required when any --*-attestation flag is set. The OIDC issuer that minted the signing certificate. For GitHub Actions, this is https://token.actions.githubusercontent.com.

bomdrift passes this to cosign as --certificate-oidc-issuer.

--require-attestation

Hard-mode flag. When set:

  • Both --before-attestation and --after-attestation must be provided.
  • Positional <BEFORE> and <AFTER> file arguments are rejected (clap conflict).
  • Any cosign verification failure exits 1; there is no fallback to unverified file inputs.

Use this on the production-CI gate that blocks releases. In dev loops where you sometimes diff a local file against a published attestation, leave --require-attestation off and let the operator mix file inputs with attestation inputs.

What bomdrift trusts

The trust boundaries, made explicit:

  • bomdrift trusts cosign to verify the in-toto envelope’s signature, certificate chain, and Rekor inclusion proof.
  • bomdrift trusts cosign to enforce the certificate identity regex and OIDC issuer match.
  • bomdrift does not independently re-verify the Sigstore transparency log. That is cosign verify-attestation’s job.
  • bomdrift assumes the predicate-type filter (--type=cyclonedx) is honored by cosign. It is, but the assumption is documented here so future cosign behavior changes are visible to auditors.
  • bomdrift parses the verified predicate as CycloneDX JSON. Anything cosign hands back that doesn’t parse as CycloneDX exits bomdrift with a parse error.
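
The shell-out described above can be approximated with the following Python sketch. It is not bomdrift's code. The function names are hypothetical, stripping the `oci://` scheme before handing the reference to cosign is an assumption, and the output handling assumes cosign prints one JSON envelope per verified attestation whose `payload` field is a base64-encoded in-toto statement.

```python
import base64
import json
import subprocess

def cosign_argv(oci_ref, identity_regex, issuer):
    # Assumption: cosign wants the bare image reference, without "oci://".
    image = oci_ref[len("oci://"):] if oci_ref.startswith("oci://") else oci_ref
    return [
        "cosign", "verify-attestation",
        "--type", "cyclonedx",
        "--certificate-identity-regexp", identity_regex,
        "--certificate-oidc-issuer", issuer,
        image,
    ]

def predicate_from_output(stdout):
    # Take the first envelope and decode its in-toto statement.
    first = next(line for line in stdout.splitlines() if line.strip())
    envelope = json.loads(first)
    statement = json.loads(base64.b64decode(envelope["payload"]))
    return statement["predicate"]  # the CycloneDX SBOM body

def verified_sbom(oci_ref, identity_regex, issuer):
    proc = subprocess.run(
        cosign_argv(oci_ref, identity_regex, issuer),
        capture_output=True, text=True,
    )
    if proc.returncode != 0:
        # Surface cosign's stderr and exit 1; no fallback to unverified input.
        raise SystemExit(proc.stderr)
    return predicate_from_output(proc.stdout)
```

Because `subprocess.run` inherits the parent environment by default, any Sigstore variables exported before the call reach cosign unchanged.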

Self-managed Sigstore instances

If you run your own Sigstore stack (private Fulcio + Rekor), cosign honors the standard Sigstore env vars:

| Variable | Purpose |
|---|---|
| COSIGN_REKOR_URL / SIGSTORE_REKOR_URL | Override the public-good Rekor instance. |
| COSIGN_FULCIO_URL / SIGSTORE_FULCIO_URL | Override Fulcio. |
| COSIGN_OIDC_ISSUER | Override the default OIDC issuer probed during signing. |
| SIGSTORE_ROOT_FILE | Pin a custom Sigstore TUF root for verification. |

bomdrift inherits the parent process environment when shelling out to cosign, so exporting these before invoking bomdrift diff is sufficient. No bomdrift-side flags are needed.

export SIGSTORE_REKOR_URL=https://rekor.internal.example.com
bomdrift diff --before-attestation ... --after-attestation ... ...

Air-gapped / self-hosted Sigstore

Regulated environments — finance, defense, healthcare on-prem, government cloud — frequently can’t reach the public-good Sigstore instance (rekor.sigstore.dev, fulcio.sigstore.dev, tuf-repo-cdn.sigstore.dev). The org runs its own Sigstore stack inside the trust boundary, with its own TUF root, Fulcio CA, and Rekor transparency log. bomdrift supports this without any bomdrift-side configuration: the attestation module shells out to cosign and does not scrub or modify the calling environment, so every Sigstore env var cosign respects flows through unchanged.

Environment variables

| Variable | Purpose |
|---|---|
| SIGSTORE_REKOR_URL / COSIGN_REKOR_URL | Transparency-log endpoint (your private Rekor). |
| SIGSTORE_FULCIO_URL / COSIGN_FULCIO_URL | Short-lived cert issuer (your private Fulcio). |
| SIGSTORE_OIDC_ISSUER / COSIGN_OIDC_ISSUER | OIDC issuer used by the keyless flow. In a true air-gap you’ll likely use key-based attestations instead; see below. |
| SIGSTORE_ROOT_FILE | Path to a custom Sigstore TUF root JSON (root.json). |
| TUF_ROOT | Directory containing TUF metadata (root + targets). |
| COSIGN_REPOSITORY | Alternate cosign-data registry, when attestations are stored separately from the artifact’s registry. |

bomdrift forwards the unchanged process environment to every cosign invocation, so exporting the variables on the workflow / shell that invokes bomdrift is enough — no bomdrift flag is needed.

Worked example: GitHub Actions against a private Sigstore

- uses: Metbcy/bomdrift@v1
  with:
    before-attestation: oci://registry.internal.example/myapp@sha256:abc...
    after-attestation:  oci://registry.internal.example/myapp@sha256:def...
    cosign-identity:    '^https://github.example.internal/.+$'
    cosign-issuer:      https://oidc.internal.example
    require-attestation: 'true'
  env:
    SIGSTORE_REKOR_URL:  https://internal-rekor.example
    COSIGN_FULCIO_URL:   https://internal-fulcio.example
    SIGSTORE_OIDC_ISSUER: https://oidc.internal.example
    TUF_ROOT:            ${{ github.workspace }}/.sigstore/tuf
    SIGSTORE_ROOT_FILE:  ${{ github.workspace }}/.sigstore/tuf/root.json

The action’s composite step inherits this env: block, propagates it to the bomdrift binary, and bomdrift propagates it again to cosign. No input on the action surface is needed for any of these — they are cosign’s own contract.

Key-based (non-keyless) attestations

In a true air-gap, the OIDC keyless flow may not be reachable: there’s no public-good Fulcio CA to mint short-lived certificates, and your internal OIDC issuer may not be wired up to your internal Fulcio yet. cosign’s fallback is key-based attestation:

cosign attest --key cosign.key --predicate sbom.cdx.json \
  --type cyclonedx registry.internal.example/myapp@sha256:abc...

For verification, cosign auto-detects a cosign.pub in the working directory or honors the COSIGN_PUBLIC_KEY env var. bomdrift’s current --cosign-identity / --cosign-issuer flags target the keyless flow; for the key-based flow, leave them empty (or pass identity values that match how cosign records key-based attestations) and rely on env-var passthrough:

export COSIGN_PUBLIC_KEY=$PWD/cosign.pub
bomdrift diff \
  --before-attestation oci://registry.internal.example/myapp@sha256:abc... \
  --after-attestation  oci://registry.internal.example/myapp@sha256:def...

cosign reads COSIGN_PUBLIC_KEY directly when no certificate-identity flags are present. bomdrift forwards the env unchanged, so no bomdrift-side configuration is required.

Troubleshooting checklist

When verification fails in an air-gapped setup, walk this list:

  1. Error: updating local metadata and targets — TUF can’t reach the configured TUF repo. Verify TUF_ROOT points at a directory pre-populated with your org’s TUF metadata, and that SIGSTORE_ROOT_FILE references a valid root.json.
  2. Error: getting Rekor public keys — Rekor URL is unreachable from the runner. curl -v "$SIGSTORE_REKOR_URL/api/v1/log/publicKey" from the same runner identity to confirm network reachability.
  3. x509: certificate signed by unknown authority — your private Fulcio’s intermediate CA isn’t in the system trust store. Either install it on the runner image, or set SSL_CERT_FILE to a bundle that includes it.
  4. Error: no matching signatures with key-based attestations — cosign found the attestation but the public key didn’t match. Confirm COSIGN_PUBLIC_KEY resolves to the same key that signed the attestation, and that no --cosign-identity / --cosign-issuer values are present (those force the keyless code path).
  5. Error: dial tcp: lookup rekor.sigstore.dev — cosign fell back to the public-good defaults because one of the SIGSTORE_* env vars wasn’t actually exported into bomdrift’s process. On GitHub Actions, double-check the env: block lives on the same step as the action (or a parent jobs.<id>.env: block), not on a different step.
  6. Verification works locally but not in CI — the runner image lacks cosign, or cosign was installed but PATH isn’t propagated to the composite-action subshell. The verify-signatures: true codepath already installs cosign for release signature verification; reuse that install or pin a known cosign version explicitly.

The air-gapped path uses cosign’s own contract, so any deeper diagnosis is a cosign problem, not a bomdrift problem. Reproduce with cosign verify-attestation --type cyclonedx ... directly, with the same env vars exported, before opening a bomdrift issue.

Troubleshooting

executable file not found in $PATH: cosign

bomdrift couldn’t find cosign on PATH. Install per Prerequisites, or set PATH so the cosign binary is reachable from the bomdrift process.

Error: no matching signatures

The cosign verification rejected every attached signature. Most common cause: --cosign-identity regex doesn’t match the actual certificate SAN. Debug with cosign directly first:

cosign verify-attestation \
  --type cyclonedx \
  --certificate-identity-regexp '<your-regex>' \
  --certificate-oidc-issuer    '<your-issuer>' \
  ghcr.io/myorg/myapp@sha256:abc...

If the direct cosign invocation fails the same way, you’ve isolated the problem outside bomdrift, and cosign’s own output is usually the more revealing of the two.

predicate type mismatch / no attestations of the requested type

The OCI reference has attestations, but none of type cyclonedx. bomdrift only consumes CycloneDX SBOM attestations in v0.9.6 — see the next section.

Error: parsing CycloneDX: ...

cosign verified the envelope but bomdrift couldn’t parse the predicate body as CycloneDX. Inspect the raw predicate by running the cosign command above with -o json: the envelope’s payload field is base64-encoded, so decode it and look at the predicate field of the decoded in-toto statement.

What’s NOT in v0.9.6

  • SPDX SBOM attestations. Only CycloneDX. SPDX-attestation support is a future ask; file an issue if you need it. The predicate parser is the only piece that needs to grow.
  • Direct Rekor verification. Deferred to cosign. bomdrift will not grow a Sigstore client implementation.
  • bomdrift-side air-gapped Sigstore configuration. None is needed: the flow is documented as first-class via cosign-respected env-var passthrough; see Air-gapped / self-hosted Sigstore.
  • In-process attestation (no shell-out). Pulling in a full-fat Sigstore Rust SDK contradicts the OSS-first / small-dep-tree design constraint. Revisit once a minimal, audited Rust Sigstore client exists.

See also

  • Plugins: for verifying additional org-specific signals on attested SBOMs.
  • Output formats: verified diffs render identically to file-based diffs.
  • Roadmap: for the broader v0.9.6 dispositions.

Plugins

bomdrift’s enricher set is intentionally curated — typosquats, maintainer age, registry metadata, OSV/EPSS/KEV. Org-specific signals (banned packages, license-tier policies, internal package allowlists) don’t belong in the binary, but they need a first-class extension point. v0.9.6 ships that extension point as external-process plugins.

Overview

A plugin is an executable on the filesystem (any language, any shape of dependencies) that reads a JSON envelope from stdin and writes a JSON envelope to stdout. bomdrift invokes it once per matching component during a diff. Findings the plugin emits are merged into bomdrift’s output across every render path: terminal, markdown, JSON, SARIF.

Plugins are not a sandbox. They run as your CI user with the same filesystem and network access bomdrift itself has. Treat plugin source the same way you’d treat any external CI script.

Why external-process and not WASM

The original v0.4 sketch on the roadmap floated WASM. v0.9.6 deliberately picks shell-out instead:

  • Smaller dep tree. No wasmtime / wasmer pulled into the bomdrift binary. The dep-tree audit is a real OSS-first constraint.
  • Any language. Plugin authors can write Bash, Python, Go, Rust, or anything else that produces an executable. WASM would force a per-language toolchain.
  • Sandboxing is the user’s environment. CI runners already isolate per-job. Adding WASM-level sandboxing inside an already isolated container is duplicate effort for marginal value.
  • Failure isolation is cheap. A child-process crash can’t take bomdrift down; we already get that for free from the OS.

WASM may be revisited in v1.0+ if a clear need materializes (in-browser diffing, multi-tenant CI without per-job isolation). For now, the shell-out model wins on simplicity and dep cost.

Manifest format

A plugin manifest is a TOML file pointed at by --plugin <path>. The flag is repeatable — bomdrift loads each manifest in declaration order and runs all matching plugins per component.

[plugin]
name        = "my-plugin"
description = "What this plugin checks for"
exec        = "./run.sh"
timeout_ms  = 5000
invoke_on   = ["added", "version-changed"]

Fields

| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| name | string | yes | | Unique within a single bomdrift run. Used in error messages and SARIF rule IDs. |
| description | string | no | | Free-form. Surfaced when bomdrift logs plugin failures. |
| exec | string | yes | | Path to the executable, resolved relative to the manifest directory. Use ./ prefix to make this explicit. Absolute paths are accepted. |
| timeout_ms | integer | no | 5000 | Wall-clock timeout per invocation. After expiry the process is killed and the invocation’s findings are dropped. |
| invoke_on | string list | yes | | Subset of ["added", "version-changed"]. Future versions may add removed, license-changed, maintainer-changed. Unknown values are rejected at load time. |

exec must be marked executable on disk. bomdrift does not auto-chmod +x; this would mask permission bugs.
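
The field rules above translate into a short load-time check. The following Python sketch is illustrative only. `load_plugin` is a hypothetical name, and it takes an already-parsed manifest (a dict shaped like the TOML above) rather than parsing TOML itself.

```python
import os

KNOWN_EVENTS = {"added", "version-changed"}

def load_plugin(manifest, manifest_dir):
    plugin = manifest.get("plugin", {})
    for field in ("name", "exec", "invoke_on"):
        if field not in plugin:
            raise ValueError(f"missing required field: {field}")

    unknown = set(plugin["invoke_on"]) - KNOWN_EVENTS
    if unknown:
        # Unknown events are rejected at load time, not silently ignored.
        raise ValueError(f"unknown invoke_on values: {sorted(unknown)}")

    # Relative paths resolve against the manifest directory;
    # os.path.join leaves an absolute exec path untouched.
    exec_path = os.path.join(manifest_dir, plugin["exec"])
    if not os.path.isfile(exec_path):
        raise ValueError(f"plugin exec not found: {exec_path}")
    if not os.access(exec_path, os.X_OK):
        # No auto-chmod: a missing executable bit is reported, not fixed.
        raise ValueError(f"plugin exec is not executable: {exec_path}")

    return {
        "name": plugin["name"],
        "description": plugin.get("description", ""),
        "exec": exec_path,
        "timeout_ms": plugin.get("timeout_ms", 5000),
        "invoke_on": list(plugin["invoke_on"]),
    }
```

Failing fast here, before any diff work, matches the failure-semantics table later in this chapter: a missing exec is a manifest error, not a dropped finding.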

Protocol — stdin/stdout JSON shape

bomdrift writes one JSON object to the plugin’s stdin, closes stdin, and reads one JSON object from stdout. If stdout contains more than that, bomdrift parses the last complete JSON object and silently discards earlier output as plugin log noise; plugins shouldn’t rely on this. The plugin should write its findings JSON and exit promptly.

Stdin

{
  "component": {
    "purl":     "pkg:npm/foo@1.2.3",
    "name":     "foo",
    "version":  "1.2.3",
    "licenses": ["MIT"]
  },
  "event":  "added",
  "before": null
}
  • component: the after component. Always present.
  • event: "added" or "version-changed". Matches the manifest’s invoke_on filter.
  • before: null for added, the before component (same shape as component) for version-changed.

Unknown fields may appear in future bomdrift versions. Plugins must ignore unknown fields on stdin and not assume the input shape is closed.

Stdout

Exactly one JSON value. A single newline-terminated line is the simplest form; multi-line pretty-printed JSON is also accepted as long as it’s a single value:

{
  "findings": [
    {
      "kind":     "your-finding-tag",
      "message":  "human-readable description",
      "severity": "info",
      "rule_id":  "stable.id.for.this.kind"
    }
  ]
}
| Field | Type | Required | Notes |
|---|---|---|---|
| kind | string | yes | Free-text tag. Surfaced in the markdown/terminal renderers as the finding category. Keep it short and stable. |
| message | string | yes | One-line human-readable description. |
| severity | string | yes | One of "info", "warning", "error". Maps to SARIF level as note / warning / error. |
| rule_id | string | yes | Stable identifier for this class of finding. Used in SARIF partialFingerprints; should be the same across runs for the same logical finding so dedup works. |

An empty findings array is the no-match path:

{"findings": []}
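
As a sketch of the plugin side of this protocol, the Python below reads the stdin envelope, ignores fields it doesn't know, and emits a findings object. It is not the reference `check-banned.sh` implementation. The banned prefix, the `kind` tag, and the `rule_id` value are hypothetical placeholders.

```python
import json
import sys

# Hypothetical org policy: purl prefixes that should never appear.
BANNED_PREFIXES = ["pkg:npm/left-pad"]

def check(envelope, banned=BANNED_PREFIXES):
    # Only the documented fields are read; unknown fields are ignored.
    purl = envelope.get("component", {}).get("purl", "")
    findings = [
        {
            "kind": "banned-package",
            "message": f"{purl} matches banned prefix {prefix}",
            "severity": "error",
            "rule_id": "banned-packages.prefix-match",
        }
        for prefix in banned
        if purl.startswith(prefix)
    ]
    return {"findings": findings}  # empty list is the no-match path

def main():
    raw = sys.stdin.read()
    envelope = json.loads(raw) if raw.strip() else {}
    json.dump(check(envelope), sys.stdout)
    sys.stdout.write("\n")
```

To use it as a plugin, save it as an executable script with a Python shebang, add a final `main()` call, and point the manifest's `exec` at it. Keeping `check` separate from the I/O makes the matching logic easy to unit-test.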

SARIF mapping

All plugin findings render under a single SARIF rule: bomdrift.plugin. The plugin’s rule_id is threaded into the SARIF result’s partialFingerprints so that GitHub Code Scanning and similar consumers can dedup runs of the same finding.
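
A rough picture of that mapping, as a Python sketch: one plugin finding becomes one SARIF result under the shared rule. `ruleId`, `level`, `message.text`, and `partialFingerprints` are standard SARIF result properties. The fingerprint key name `pluginRuleId` is a hypothetical placeholder, since this chapter doesn't state which key bomdrift uses.

```python
# severity -> SARIF level, as documented in the stdout field table
LEVELS = {"info": "note", "warning": "warning", "error": "error"}

def to_sarif_result(finding):
    return {
        "ruleId": "bomdrift.plugin",  # single shared rule for all plugins
        "level": LEVELS[finding["severity"]],
        "message": {"text": finding["message"]},
        # Key name is an assumption; the value is the plugin's stable rule_id.
        "partialFingerprints": {"pluginRuleId": finding["rule_id"]},
    }
```

Because the fingerprint is built from `rule_id` alone in this sketch, changing a finding's `message` between runs would not change how a consumer deduplicates it.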

Failure semantics

Plugins are best-effort. Their failures never fail the bomdrift diff:

| Failure mode | bomdrift response |
|---|---|
| Plugin exits non-zero | Drop findings from this invocation. Log warning if BOMDRIFT_DEBUG=1. |
| Wall-clock timeout (timeout_ms) | Kill the process. Drop findings. Log warning if BOMDRIFT_DEBUG=1. |
| Stdout is not parseable JSON | Drop findings. Log warning if BOMDRIFT_DEBUG=1. |
| Stdout JSON is missing findings | Drop findings. Log warning if BOMDRIFT_DEBUG=1. |
| findings[i].severity is unknown | Drop that finding. Other findings in the same invocation pass through. |
| Plugin exec is missing on disk | Manifest load fails fast (before any diff work). Exit 1. |

The contract: the rest of the bomdrift report still renders. A bad plugin is a noisy plugin, not a broken pipeline. Run with BOMDRIFT_DEBUG=1 while authoring a plugin to see why findings are being dropped.
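
The host side of these semantics can be sketched as follows. This is an approximation in Python, not bomdrift's Rust code. `run_plugin` is a hypothetical name, and for simplicity the sketch parses stdout as a single JSON value rather than locating the last complete object.

```python
import json
import os
import subprocess
import sys

SEVERITIES = {"info", "warning", "error"}

def _debug(msg):
    if os.environ.get("BOMDRIFT_DEBUG") == "1":
        print(f"plugin warning: {msg}", file=sys.stderr)

def run_plugin(argv, envelope, timeout_ms=5000):
    """Run one plugin invocation; every failure mode yields no findings."""
    try:
        proc = subprocess.run(
            argv, input=json.dumps(envelope),
            capture_output=True, text=True, timeout=timeout_ms / 1000,
        )
    except subprocess.TimeoutExpired:
        _debug("timed out")  # subprocess.run kills the child on timeout
        return []
    if proc.returncode != 0:
        _debug(f"exited with status {proc.returncode}")
        return []
    try:
        findings = json.loads(proc.stdout)["findings"]
    except (ValueError, KeyError, TypeError):
        _debug("stdout is not a JSON object with a findings key")
        return []
    if not isinstance(findings, list):
        _debug("findings is not a list")
        return []
    # An unknown severity drops that finding only; the rest pass through.
    return [f for f in findings
            if isinstance(f, dict) and f.get("severity") in SEVERITIES]
```

Every failure path returns an empty list instead of raising, which is what lets the rest of the report render when a plugin misbehaves.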

Windows note

On Windows, Command::kill() has known quirks where killed processes may leave orphan grandchildren. bomdrift kills the direct child cleanly; if your plugin spawns sub-processes, ensure it forwards the timeout signal itself. Plugin timeouts on Windows are best-effort in v0.9.6.

Worked example: banned-packages

The reference implementation lives in examples/plugins/banned-packages/:

examples/plugins/banned-packages/
├── README.md          # how to adapt for your org
├── plugin.toml        # the manifest below
├── check-banned.sh    # bash + jq implementation
└── banned.txt         # purl prefixes to flag

plugin.toml:

[plugin]
name        = "banned-packages"
description = "Flag dependencies on the org-maintained banned-packages list"
exec        = "./check-banned.sh"
timeout_ms  = 5000
invoke_on   = ["added", "version-changed"]

Invocation:

bomdrift diff before.cdx.json after.cdx.json \
  --plugin examples/plugins/banned-packages/plugin.toml

See the example’s README for adaptation guidance, performance characteristics, and security notes.

Performance

bomdrift invokes plugins sequentially, once per matching component. With N Added/VersionChanged components and P plugins, you’ll see N × P invocations. Implications:

  • Process-startup cost matters. A bash plugin that forks jq ten times costs ~30 ms of fork + interpreter warmup per call. At N = 200, P = 3 that’s ~18 s of pure startup overhead. Compile to a static Go/Rust binary if hot-path performance matters.
  • Tune timeout_ms. The default (5000) is generous for pure-CPU plugins; a plugin that hits a network endpoint per component might need 30000. A plugin that’s intermittently slow ruins your CI cycle time — consider sampling inside the plugin (return early for components that don’t match its scope).
  • No parallelism in v0.9.6. Concurrent plugin execution is on the table for v1.0 if a meaningful workload demands it. File an issue with timing data if you hit this.
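
The startup-overhead figure above is simple arithmetic, and it is worth rerunning with your own numbers. A tiny Python sketch (the function name is hypothetical; the inputs are the ones quoted in the first bullet):

```python
def startup_overhead_seconds(components, plugins, per_call_ms):
    # Sequential execution: one invocation per (component, plugin) pair.
    invocations = components * plugins
    return invocations * per_call_ms / 1000

# N = 200 components, P = 3 plugins, ~30 ms startup per invocation
overhead = startup_overhead_seconds(200, 3, 30)  # 600 invocations, 18.0 s
```

Since execution is sequential, the cost grows linearly in both N and P; halving the per-call startup cost halves the total.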

Security

bomdrift does not sandbox plugins:

  • Plugins run as the bomdrift parent’s user.
  • Plugins inherit the parent’s environment (including secret-bearing env vars like GITHUB_TOKEN, NPM_TOKEN, etc.).
  • Plugins inherit the parent’s filesystem and network access.
  • Plugins can spawn arbitrary sub-processes.

Treat plugin source like any external CI script:

  • Vet what you ship. Read the plugin source, including any binary dependencies it pulls in.
  • Pin to a commit / tag. Don’t curl ... | bash an always-latest plugin executable.
  • Minimize the env. If a plugin doesn’t need a secret, don’t let it inherit one. env -i PATH="$PATH" bomdrift diff ... starts from an empty environment; pass through only what bomdrift itself needs (PATH at minimum, so the bomdrift and cosign binaries can still be found).
  • Mirror internally. For high-trust pipelines, vendor the plugin into your own repo or internal artifact store rather than pulling from a public registry on every CI run.

Stability promise

The plugin protocol’s stdin/stdout JSON shape is best-effort stable in v0.9.6:

  • We may add fields to the stdin envelope in a future minor release. Plugins must ignore unknown fields.
  • We will not remove or rename documented stdin or stdout fields without a major version bump.
  • The stdout findings schema is the public contract; treat kind, message, severity, rule_id as semver-stable.
  • The TOML manifest schema may grow new optional fields; existing fields stay.

If the protocol needs a breaking change for v1.0, a deprecation window with a protocol_version envelope field will land at least one minor release before the break.

CI integration

A typical GitHub Actions job that wires in a plugin:

jobs:
  bomdrift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Make sure jq is available if your plugin needs it.
      - run: sudo apt-get install -y jq

      - uses: Metbcy/bomdrift@v1
        with:
          before-sbom: before.cdx.json
          after-sbom:  after.cdx.json
          extra-args:  --plugin examples/plugins/banned-packages/plugin.toml

For multiple plugins, repeat --plugin in extra-args:

extra-args: >-
  --plugin .bomdrift/plugins/banned-packages/plugin.toml
  --plugin .bomdrift/plugins/license-tier/plugin.toml

Release signing

Every bomdrift release archive is signed with cosign keyless via Sigstore + GitHub OIDC. This means:

  • The signing key is not stored in the repo or in any GitHub Secret.
  • Each signature is bound to the GitHub Actions workflow run that produced it, with the OIDC issuer (token.actions.githubusercontent.com) acting as the identity provider.
  • The signing transparency log is the public Sigstore Rekor instance.

Verifying a release manually

VERSION=v0.9.6
TARGET=x86_64-unknown-linux-gnu
ARCHIVE=bomdrift-${VERSION}-${TARGET}.tar.gz

# Download the archive + signature + certificate
BASE="https://github.com/Metbcy/bomdrift/releases/download/${VERSION}"
curl -fsSL -O "${BASE}/${ARCHIVE}"
curl -fsSL -O "${BASE}/${ARCHIVE}.sig"
curl -fsSL -O "${BASE}/${ARCHIVE}.pem"

# Verify
cosign verify-blob \
  --certificate-identity "https://github.com/Metbcy/bomdrift/.github/workflows/release.yml@refs/tags/${VERSION}" \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  --certificate "${ARCHIVE}.pem" \
  --signature  "${ARCHIVE}.sig" \
  "${ARCHIVE}"

A successful verification prints Verified OK. Anything else means the archive has been tampered with (or the certificate’s identity doesn’t match the expected workflow run — same outcome, do not trust).

What the certificate identity proves

The --certificate-identity argument pins the verification to the exact workflow file that produced the signature, including the tag ref. As long as release.yml is the only workflow that ever signs bomdrift archives (it is, and the file is reviewed in PRs), an attacker who can’t push to the bomdrift repo can’t produce a verifiable signature.

The --certificate-oidc-issuer pins to GitHub’s OIDC issuer. Substituting a different IDP-backed signature wouldn’t pass.

Action-side verification (default)

The Metbcy/bomdrift action calls cosign verify-blob automatically on every download (when verify-signatures: true, the default). When verification fails, the action exits non-zero before running bomdrift, so a tampered binary never executes.

To skip verification (saves ~15s by also skipping the cosign-installer step), set:

- uses: Metbcy/bomdrift@v1
  with:
    before-sbom: before.json
    after-sbom:  after.json
    verify-signatures: false

This is appropriate when:

  • You’re running on self-hosted runners with a hardened image you control.
  • You’ve pre-pinned the bomdrift archive in your Nexus/Artifactory mirror and verified its signature once at mirror time.
  • You’re running in a network-restricted environment where the public Sigstore endpoints aren’t reachable.

When verify-signatures: true and cosign isn’t installed (or the .sig / .pem aren’t on the release), the action fails loudly rather than silently degrading — that’s the whole point of the explicit opt-out.

Why keyless?

The traditional alternative is a long-lived signing key stored as a GitHub Secret. That’s:

  • A single credential that, if leaked, lets an attacker sign forever.
  • A rotation problem — every key rotation breaks all consumers who pinned the verifying public key.
  • An audit gap — there’s no public record of who signed what when.

Keyless cosign moves the trust to the GitHub OIDC issuer + the Sigstore Rekor transparency log: every signature has a public, queryable record of the exact GitHub Actions workflow run that produced it, and the signing certificate is short-lived (10 minutes).

SHA-256 checksums

In addition to cosign, every archive ships with a .sha256 file for old-school checksum verification:

curl -fsSL -O "${BASE}/${ARCHIVE}.sha256"
sha256sum -c "${ARCHIVE}.sha256"   # GNU
shasum -a 256 -c "${ARCHIVE}.sha256"  # macOS

Checksums alone don’t authenticate the archive (an attacker who can modify the .tar.gz can also modify the .sha256); cosign is the authoritative verification path. The checksums exist for older toolchains and for quick local-rerun checks.
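
Where neither sha256sum nor shasum is available, the same check is a few lines of Python. This sketch assumes the `.sha256` file uses the usual `<hex digest>  <filename>` layout that `sha256sum -c` expects; the function names are hypothetical.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    # Stream the file so large archives don't need to fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_checksum(archive, checksum_file):
    # First whitespace-separated field of the .sha256 file is the digest.
    expected = Path(checksum_file).read_text().split()[0].lower()
    return sha256_of(archive) == expected
```

As with the shell commands, a matching digest only shows the download is intact; it does not authenticate the archive.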

SLSA build provenance (v0.9.9+)

In addition to the cosign-keyless signature on each archive, the release pipeline produces a SLSA build provenance attestation covering both the per-target archives and the multi-arch ghcr.io image. The two are complementary, not redundant:

  • cosign proves “the bomdrift maintainer’s GitHub OIDC identity signed this artifact.” It binds the artifact to the human (or workflow run) holding the signing identity at sign time.
  • SLSA provenance proves “this artifact was produced by the public release.yml workflow on tag v0.9.9 in this repo, against this commit SHA.” It binds the artifact to the build itself — including the source ref, the workflow file, and the ephemeral runner identity.

Both verifications must pass for the release to be trustworthy. An attacker who compromised the maintainer’s signing identity but couldn’t push to Metbcy/bomdrift would pass the cosign check and fail the SLSA check. Conversely, an attacker who pushed a malicious workflow to a fork would get SLSA provenance that verifies only for the fork, and wouldn’t have the maintainer’s OIDC identity, so the cosign check fails.

The simplest path uses gh, which calls into the SLSA verifier with the right defaults for GitHub-hosted attestations:

VERSION=v0.9.9
TARGET=x86_64-unknown-linux-gnu
ARCHIVE=bomdrift-${VERSION}-${TARGET}.tar.gz
BASE="https://github.com/Metbcy/bomdrift/releases/download/${VERSION}"

curl -fsSL -O "${BASE}/${ARCHIVE}"
gh attestation verify --owner Metbcy "${ARCHIVE}"

A successful verification prints Loaded ... attestation(s) ... verified. Pin the source ref by adding --source-ref refs/tags/${VERSION} if you want to reject attestations from other tags.

Verifying SLSA provenance — slsa-verifier

For air-gapped or non-GitHub environments where gh isn’t available:

slsa-verifier verify-artifact \
  --provenance-path "${ARCHIVE}.intoto.jsonl" \
  --source-uri github.com/Metbcy/bomdrift \
  --source-tag ${VERSION} \
  "${ARCHIVE}"

The .intoto.jsonl file is downloaded automatically by gh attestation download, or you can fetch it directly from the release’s attestation manifest at https://github.com/Metbcy/bomdrift/attestations.

Verifying the ghcr.io image attestation

The multi-arch image carries an inline attestation (pushed by the build job’s push-to-registry: true):

gh attestation verify --owner Metbcy oci://ghcr.io/metbcy/bomdrift:${VERSION}

Architecture

bomdrift is a single-binary Rust CLI with three logical layers: parse, diff, enrich + render. Every layer is pure (no shared mutable state) so the same input produces byte-identical output every time — the upsert contract.
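
To illustrate what a pure diff layer looks like, here is a minimal Python sketch. It is not the Rust implementation. It keys components by purl-without-version (the primary form of ComponentKey listed in the module layout below), omits the (ecosystem, name) fallback, and sorts its output so repeated runs on the same input produce identical results. The function names are hypothetical.

```python
def component_key(purl):
    # "pkg:npm/foo@1.2.3?arch=x" -> "pkg:npm/foo"
    base = purl.split("?", 1)[0].split("#", 1)[0]
    head, sep, tail = base.rpartition("@")
    # Only treat "@" as the version separator if nothing path-like follows.
    return head if sep and "/" not in tail else base

def diff(before, after):
    old = {component_key(p): p for p in before}
    new = {component_key(p): p for p in after}
    return {
        "added": sorted(new[k] for k in new.keys() - old.keys()),
        "removed": sorted(old[k] for k in old.keys() - new.keys()),
        "version_changed": sorted(
            (old[k], new[k])
            for k in old.keys() & new.keys()
            if old[k] != new[k]
        ),
    }
```

The function reads no clock, environment, or global state, so its output depends only on the two input lists.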

Module layout

src/
├── main.rs            — clap entry point; dispatches to lib::run
├── lib.rs             — top-level wiring: load_sbom -> diff -> enrich -> render
├── cli.rs             — clap derive types: DiffArgs, RefreshArgs, FailOn, etc.
├── config.rs          — `.bomdrift.toml` policy (de)serialization + merge
├── clock.rs           — single source of truth for "now" (honors SOURCE_DATE_EPOCH)
├── attestation.rs     — `cosign verify-attestation` shell-out (v0.9.6)
├── plugin.rs          — external-process plugin loader (v0.9.6)
├── vex.rs             — VEX consume (OpenVEX 0.2.0, CycloneDX VEX 1.6) + emit (OpenVEX)
├── baseline.rs        — `--baseline` snapshot suppression + `expires`/`reason`/`vex_status`
├── refresh.rs         — `bomdrift refresh-typosquat` subcommand
├── model/             — unified component / SBOM types
│   ├── component.rs   — Component, Ecosystem, Hash, Relationship
│   └── sbom.rs        — Sbom, SbomFormat
├── parse/             — format-specific parsers
│   ├── cyclonedx.rs   — CDX 1.5/1.6 JSON
│   ├── spdx.rs        — SPDX 2.3 JSON
│   └── syft.rs        — Syft JSON
├── diff/              — pair-by-version ChangeSet computation
│   ├── mod.rs         — diff(), ChangeSet
│   └── key.rs         — ComponentKey (purl-without-version | (eco, name))
├── enrich/            — risk-signal enrichers
│   ├── osv.rs         — OSV.dev /v1/querybatch + /v1/vulns/{id}
│   ├── epss.rs        — FIRST.org EPSS per-CVE scores (v0.8)
│   ├── kev.rs         — CISA KEV catalog (v0.8)
│   ├── registry.rs    — npm / PyPI / crates.io metadata (v0.9)
│   ├── license.rs     — SPDX expression evaluation + allow/deny + per-exception (v0.8 / v0.9 / v0.9.5)
│   ├── typosquat.rs   — Jaro-Winkler + suffix boost / Levenshtein / last-segment / package-portion
│   ├── version_jump.rs — major-delta >= 2 heuristic
│   ├── maintainer.rs  — GitHub REST contributor-age (the xz pattern)
│   ├── cache.rs       — single source of truth for CACHE_TTL_SECS (v0.9.6 unified)
│   └── mod.rs         — Enrichment graph aggregating findings
└── render/            — output formatters
    ├── markdown.rs    — GFM PR-comment body
    ├── term.rs        — TTY-aware ANSI
    ├── json.rs        — pretty-printed serde graph
    └── sarif.rs       — SARIF v2.1.0 with stable rule IDs + partialFingerprints

The pipeline

                          OSV.dev /querybatch + /vulns/{id}
                                      |
                                      v
SBOM file --[parse::*]--> Sbom --+   /Enrichment\
                                 |  | - vulns    | -- typosquat (pure)
SBOM file --[parse::*]--> Sbom --+--+ - typosq's | -- version_jump (pure)
                                 |  | - jumps    | -- maintainer (GitHub API)
                                 v  | - main_age |
                              ChangeSet  --------/
                                 |
                                 v
                            (--baseline applies here, suppresses findings)
                                 |
                                 v
                              render::*
                                 |
                                 v
                       markdown / term / json / sarif

parse layer

Each parser is hand-rolled (~150 LOC). We deliberately avoid the cyclonedx-bom and spdx-rs crates: their dep trees are heavy relative to the parsing surface we actually use, and the SBOM JSON shapes are stable enough that hand-rolling is low maintenance.

The unified model::Component carries:

  • name, version, ecosystem (parsed from purl when available, fallback to the source SBOM’s hint)
  • purl (Option<String>), bom_ref (Option<String>)
  • licenses: Vec<String> (canonicalized to SPDX expressions when possible)
  • hashes: Vec<Hash>, supplier: Option<String>, source_url: Option<String>, relationship

SbomFormat::auto_detect looks at top-level JSON fields to dispatch: bomFormat: "CycloneDX" → CDX, spdxVersion: "..." → SPDX, schema: {name: "Syft"} → Syft. --format <FORMAT> overrides detection.

diff layer

The diff core groups components by ComponentKey and computes per-key:

B = group_by_key(before.components)
A = group_by_key(after.components)

for K in keys(B) ∪ keys(A):
    versions in A[K] \ B[K] → ChangeSet::added
    versions in B[K] \ A[K] → ChangeSet::removed
    versions in A[K] ∩ B[K] with differing licenses → ChangeSet::license_changed
    legacy single-version case (|B[K]| = |A[K]| = 1, versions differ)
        → ChangeSet::version_changed (folds in license-changes-with-version-bumps)

ComponentKey is Purl(string-without-version) when the component has a parseable purl, else NameTuple(Ecosystem, name). This is what makes cross-format diffs work: a CDX SBOM diffed against an SPDX SBOM of the same project keys consistently across the two formats.

The BTreeMap-based grouping is what gives the diff its byte-deterministic ordering. No timestamps leak in, no insertion-order leakage. The is_deterministic integration test guards the contract.
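A minimal sketch of the per-key grouping described above, assuming simplified stand-ins for the real types: the `Component`, `ComponentKey`, and `ChangeSet` shapes below are illustrative, license changes are omitted, and purl qualifiers are ignored. It shows how `BTreeMap` / `BTreeSet` grouping yields the same ordering on every run.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for bomdrift's ComponentKey.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub enum ComponentKey {
    /// purl with the version stripped, e.g. "pkg:npm/axios".
    Purl(String),
    /// (ecosystem, name) fallback when no purl is available.
    NameTuple(String, String),
}

/// Simplified component: only the fields this sketch needs.
#[derive(Debug, Clone)]
pub struct Component {
    pub ecosystem: String,
    pub name: String,
    pub version: String,
    pub purl: Option<String>,
}

#[derive(Debug, Default, PartialEq, Eq)]
pub struct ChangeSet {
    pub added: Vec<(ComponentKey, String)>,
    pub removed: Vec<(ComponentKey, String)>,
    /// (key, before version, after version)
    pub version_changed: Vec<(ComponentKey, String, String)>,
}

pub fn key_of(c: &Component) -> ComponentKey {
    match &c.purl {
        // Drop everything from the last '@' on (assumption: no qualifiers).
        Some(p) => {
            let base = p.rsplit_once('@').map(|(base, _)| base).unwrap_or(p);
            ComponentKey::Purl(base.to_string())
        }
        None => ComponentKey::NameTuple(c.ecosystem.clone(), c.name.clone()),
    }
}

fn group_by_key(components: &[Component]) -> BTreeMap<ComponentKey, BTreeSet<String>> {
    let mut groups: BTreeMap<ComponentKey, BTreeSet<String>> = BTreeMap::new();
    for c in components {
        groups.entry(key_of(c)).or_default().insert(c.version.clone());
    }
    groups
}

pub fn diff(before: &[Component], after: &[Component]) -> ChangeSet {
    let b = group_by_key(before);
    let a = group_by_key(after);
    let empty: BTreeSet<String> = BTreeSet::new();
    // Union of keys, visited in sorted order.
    let keys: BTreeSet<&ComponentKey> = b.keys().chain(a.keys()).collect();

    let mut cs = ChangeSet::default();
    for key in keys {
        let bv = b.get(key).unwrap_or(&empty);
        let av = a.get(key).unwrap_or(&empty);
        // Single version on each side and they differ: a version change.
        if bv.len() == 1 && av.len() == 1 && bv != av {
            if let (Some(from), Some(to)) = (bv.first(), av.first()) {
                cs.version_changed.push((key.clone(), from.clone(), to.clone()));
            }
            continue;
        }
        cs.added.extend(av.difference(bv).map(|v| (key.clone(), v.clone())));
        cs.removed.extend(bv.difference(av).map(|v| (key.clone(), v.clone())));
    }
    cs
}
```

Diffing an input against itself returns `ChangeSet::default()` in this sketch, which mirrors the self-diff invariant the test suite checks.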

enrich layer

Enrichers are independent. Each takes a &ChangeSet, returns its specific finding type (Vec<TyposquatFinding>, Vec<VersionJumpFinding>, etc.), and the lib’s run_diff aggregates them into a single Enrichment graph.

Best-effort contract:

  1. Per-request timeout (15s).
  2. Errors warn once, never block.
  3. Per-component caching within a single run.

The OSV enricher is the only one that touches a persistent on-disk cache (<XDG_CACHE_HOME>/bomdrift/osv/). All other enrichers are either pure-compute or only cache within a single process.

render layer

Renderers are pure functions: (ChangeSet, Enrichment) → String. The markdown renderer is the canonical “PR comment” path; terminal is the TTY default; JSON is the downstream-tooling shape; SARIF is for Code Scanning ingestion.

Determinism is the upsert contract:

  • Enrichment::vulns is a HashMap (the OSV enricher fills it via unordered batch responses). Renderers that emit it (markdown, JSON, SARIF) sort the keys before emission.
  • Enrichment::typosquats / version_jumps / maintainer_age are Vecs populated in cs.added / cs.version_changed iteration order — which is BTreeMap-derived, so stable.
  • ChangeSet::added / removed / version_changed / license_changed are Vecs populated in BTreeMap<ComponentKey, ...> iteration order.

Result: identical inputs render to byte-identical output every time, which is what peter-evans/create-or-update-comment relies on for the upsert behavior in the action.
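As an illustration of the sort-before-emit rule, here is a small sketch. The `render_vulns` function and its line format are hypothetical, not the real renderer; the point is only that `HashMap` iteration order never reaches the output.

```rust
use std::collections::HashMap;
use std::fmt::Write;

/// Render a component -> advisory-ID map with keys and IDs sorted first.
/// Hypothetical helper; the output format is made up for this example.
pub fn render_vulns(vulns: &HashMap<String, Vec<String>>) -> String {
    let mut keys: Vec<&String> = vulns.keys().collect();
    keys.sort();

    let mut out = String::new();
    for key in keys {
        let mut ids = vulns[key].clone();
        ids.sort();
        // Writing to a String cannot fail, so the result is ignored.
        let _ = writeln!(out, "[CVE] {key}: {}", ids.join(", "));
    }
    out
}
```

Two maps built with different insertion orders render to the same string, which is the property the upsert behavior depends on.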

Best-effort enricher contract

Every enricher — network (OSV / EPSS / KEV / GitHub / registries), shell-out (cosign attestation), or external process (plugins) — honors the same fail-soft contract:

  1. Per-request timeout so a misbehaving upstream can’t hang a CI job.
  2. Errors warn once to stderr (deduped by key) and the diff renders without that source’s findings.
  3. Per-component caching within a single run so monorepo subpackages sharing a parent project don’t multiply HTTP requests.
  4. Best-effort never blocks the diff render. Exit code stays 0 from the enricher itself; the only way an enricher influences exit code is indirectly via --fail-on thresholds tripping on findings it produced.

src/enrich/osv.rs is the canonical pattern; new enrichers MUST mirror its Result<Vec<Finding>>-where-Err-is-warned-not-propagated shape. The attestation.rs and plugin.rs modules apply the same contract to non-network shell-outs: a missing cosign binary, a plugin timeout, or a malformed plugin response all warn and continue.
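The fail-soft shape can be sketched roughly as follows. This is not the code in src/enrich/osv.rs: the `Finding` type, the `enrich` signature, and the `warnings` sink are assumptions made so the example is self-contained (a real enricher would presumably write warnings to stderr).

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical finding type for an illustrative enricher.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Finding {
    pub component: String,
    pub detail: String,
}

/// Fail-soft enrichment over an injected fetcher.
/// Errors are recorded once per distinct message and never propagated;
/// results (including failures) are cached per component within this call.
pub fn enrich<F>(components: &[&str], mut fetch: F, warnings: &mut Vec<String>) -> Vec<Finding>
where
    F: FnMut(&str) -> Result<Vec<String>, String>,
{
    let mut cache: HashMap<String, Vec<String>> = HashMap::new();
    let mut warned: HashSet<String> = HashSet::new();
    let mut findings = Vec::new();

    for &component in components {
        if !cache.contains_key(component) {
            let details = match fetch(component) {
                Ok(details) => details,
                Err(err) => {
                    // Warn once per error key, then continue with no findings.
                    if warned.insert(err.clone()) {
                        warnings.push(format!("warning: enricher unavailable: {err}"));
                    }
                    Vec::new()
                }
            };
            cache.insert(component.to_string(), details);
        }
        for detail in &cache[component] {
            findings.push(Finding {
                component: component.to_string(),
                detail: detail.clone(),
            });
        }
    }
    findings
}
```

Injecting the fetcher as a closure also makes the failure path testable without a network: a fake fetcher that returns `Err` should yield zero findings and exactly one warning.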

Byte-determinism contract

Identical inputs MUST render to byte-identical outputs across every format. This is what peter-evans/create-or-update-comment relies on to upsert a PR comment in place rather than accumulating duplicates, and what makes SARIF / VEX / JSON safe to commit to git.

Concretely:

  • All HashMaps emitted into output are sorted by key first.
  • All Vecs populated from cs.added / version_changed iteration inherit the diff core’s BTreeMap-derived order.
  • Every “now” reference goes through clock::now(), which honors SOURCE_DATE_EPOCH for reproducible-build contexts and for tests.
  • VEX @id UUIDs and CycloneDX VEX bom-ref strings are deterministic hashes of the finding tuple, never random.

Tests that mutate SOURCE_DATE_EPOCH MUST acquire clock::test_env_lock() to serialize across the crate’s parallel test threads — a v0.9.5 discovery during the release/v0.9.5 cleanup. See Contributing for the recipe.
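For orientation, a clock that honors SOURCE_DATE_EPOCH might look like the sketch below. The function names and the fallback behavior on an unparsable value are assumptions; the real clock.rs may differ. Splitting out a pure `resolve_now` keeps most of the logic testable without touching the process environment.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Pure core: use SOURCE_DATE_EPOCH when it parses as an integer,
/// otherwise fall back to the supplied wall-clock value.
/// (Falling back on a malformed value is an assumption of this sketch.)
pub fn resolve_now(source_date_epoch: Option<&str>, wall_clock_secs: u64) -> u64 {
    source_date_epoch
        .and_then(|raw| raw.trim().parse::<u64>().ok())
        .unwrap_or(wall_clock_secs)
}

/// Seconds since the Unix epoch, honoring SOURCE_DATE_EPOCH when set.
pub fn now() -> u64 {
    let wall = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    let env = std::env::var("SOURCE_DATE_EPOCH").ok();
    resolve_now(env.as_deref(), wall)
}
```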

Why no async / tokio?

bomdrift is intentionally synchronous. The single-binary CLI runs to completion in seconds; concurrent network requests would shave maybe 1–2 seconds off the OSV enricher path on diffs with > 100 unique CVEs, at the cost of:

  • ~70 transitive crates (tokio, mio, futures, …).
  • A panic-on-blocking-call class of bug that’s a constant trap for contributors.
  • A bigger, slower-to-build, slower-to-link binary.

The OSV /v1/querybatch endpoint already batches (1000 queries per request), so the parallelism we’d want is mostly already there. The N+1 stage-2 /v1/vulns/{id} calls are gated by the on-disk severity cache, which makes reruns within the configured TTL essentially free.

Plugin processes (v0.9.6+) are also invoked synchronously: at most one external child at a time, with a per-component timeout. Parallel plugin execution would re-introduce the tokio dependency cost without solving a measured bottleneck.

Why no chrono / no semver / no octocrab?

Same reasoning. We need:

  • One ISO-8601 timestamp shape (the canonical YYYY-MM-DDTHH:MM:SSZ GitHub always emits). Hand-rolled parser is ~25 LOC, lives in clock.rs.
  • The major version of a SemVer string. Hand-rolled extractor is ~5 LOC in enrich/version_jump.rs.
  • GitHub REST: a small set of endpoints (contributors, commits) hand-rolled atop ureq. octocrab would pull in tokio.

All three pulls would add transitive weight for no functional gain. The constraint is documented at the top of each affected file so future contributors don’t reflexively reach for the popular crate.
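To give a sense of scale, a fixed-shape `YYYY-MM-DDTHH:MM:SSZ` parser fits in a couple of dozen lines. The sketch below is not the code in clock.rs; the function names are hypothetical, and it uses the well-known days-from-civil calendar arithmetic. Day-of-month validation is deliberately loose (it accepts 31 for any month).

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date.
fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    // Treat January and February as months 10 and 11 of the previous year.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, 0..=399
    let mp = if month > 2 { month - 3 } else { month + 9 }; // March = 0
    let doy = (153 * mp + 2) / 5 + day - 1; // day of (shifted) year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Parse exactly `YYYY-MM-DDTHH:MM:SSZ` into Unix seconds.
pub fn parse_iso8601_utc(s: &str) -> Option<i64> {
    let b = s.as_bytes();
    // Fixed shape: 20 bytes with separators at known offsets.
    if b.len() != 20
        || b[4] != b'-' || b[7] != b'-' || b[10] != b'T'
        || b[13] != b':' || b[16] != b':' || b[19] != b'Z'
    {
        return None;
    }
    let num = |range: std::ops::Range<usize>| -> Option<i64> {
        let digits = &b[range];
        if !digits.iter().all(u8::is_ascii_digit) {
            return None;
        }
        std::str::from_utf8(digits).ok()?.parse::<i64>().ok()
    };
    let (y, mo, d) = (num(0..4)?, num(5..7)?, num(8..10)?);
    let (h, mi, sec) = (num(11..13)?, num(14..16)?, num(17..19)?);
    if !(1..=12).contains(&mo) || !(1..=31).contains(&d) || h > 23 || mi > 59 || sec > 59 {
        return None;
    }
    Some(days_from_civil(y, mo, d) * 86_400 + h * 3_600 + mi * 60 + sec)
}
```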

Approved dependencies

As of v0.9.6:

Crate | Purpose | Notes
clap | CLI parsing | derive feature only
serde, serde_json | (de)serialization | parse + render
anyhow, thiserror | error types |
ureq | HTTP | sync, rustls; no tokio
strsim | typosquat scoring | Jaro-Winkler + Levenshtein
owo-colors, supports-color | terminal renderer |
directories | XDG paths |
toml | .bomdrift.toml parsing |
time = "0.3.47" | timestamp formatting | minimal feature set
sha2 = "0.10" | partialFingerprint hashes (SARIF), VEX @id |
spdx = "=0.10.9" | exact-pinned SPDX expression evaluation | License-policy semantics shift on minor list updates; pin exactly
base64 = "0.22" | OCI attestation payload decoding (v0.9.6) |
wait-timeout = "0.2" | bounded plugin-process wait on Windows (v0.9.7) | sidesteps Command::kill()’s Windows quirks; tiny dep, no transitive weight

Forbidden by policy: tokio, chrono, semver, octocrab, async-trait, anything pulling rustls + ring + tokio transitively beyond what ureq already brings.

Binary size budget

  • Target: ≤ 5 MB stripped + LTO on Linux x86_64.
  • Current (v0.9.6): ~3.4 MB.
  • Audit: cargo bloat --release --crates -n 20 periodically to confirm no unexpected dep-tree growth.

Contributing

Thanks for considering a contribution! bomdrift is intentionally small and the contribution loop is fast.

Looking for somewhere to start?

Issues labeled good first issue are scoped for first-time contributors:

  • Add a name to one of the typosquat top-N lists (data/<eco>-top*.txt — see the comment header in any of those files).
  • Fix a doc typo (mdBook in docs/, README, or any module-level //! comment).
  • Improve an error message (bomdrift’s anyhow chains can usually be more specific about what failed).
  • Refresh a curated typosquat list from its upstream source (snapshot date is in the file header).

For larger changes (a new enricher, a new ecosystem, an output-format addition), open a discussion or issue first so we can talk through the design before you sink time into a PR.

Development loop

git clone https://github.com/Metbcy/bomdrift
cd bomdrift

cargo check --all-targets       # fast feedback while editing
cargo test --release            # full test suite (~420 tests as of v0.9.6)
rustup run 1.88 cargo clippy --all-targets --all-features -- -D warnings
cargo fmt --all -- --check      # MUST pass; run `cargo fmt --all` to fix

Rust 1.88+ required (the project uses edition 2024; CI is pinned to 1.88 to keep clippy lints stable across releases — see Cargo.toml’s rust-version field).

Project conventions

Commits

Conventional Commits:

  • feat(scope): add X — new feature
  • fix(scope): Y — bug fix
  • docs(scope): Z — documentation only
  • chore: W — maintenance with no behavioral change

Commit bodies should explain why, not what; git diff shows the what. Multi-line commit messages are fine; use the heredoc git commit -m "$(cat <<'EOF' ... EOF)" pattern for readability.

Commit signing on main

main enforces required_signatures via the repository ruleset. This does NOT mean PR contributors need GPG/SSH signing keys configured. Here’s how it actually shakes out:

You’re a… | Do you need to sign?
Contributor opening a PR from a fork or feature branch | No. Push commits as-is. The maintainer chooses the merge method.
Maintainer merging via gh pr merge --merge | No. GitHub’s web-UI key signs the merge commit; it counts as verified.
Maintainer merging via gh pr merge --squash | No. Same: GitHub signs the squash commit.
Maintainer merging via gh pr merge --rebase | Yes. Rebase replays your PR commits verbatim onto main, so they must already be signed.
Anyone pushing directly to main | Yes (and the ruleset blocks it via pull_request anyway, so this only matters for emergency bypass).

Practical rule of thumb for contributors: don’t worry about it. The maintainer will pick the right merge method.

If you’d like your commits to land verbatim on main for git-blame attribution (and want to use rebase-merge), set up local signing once:

# SSH-key signing (simplest, no GPG headache)
git config --global gpg.format ssh
git config --global user.signingkey ~/.ssh/id_ed25519.pub
git config --global commit.gpgsign true

Then add the same SSH public key to your GitHub account under SSH and GPG keys → Signing keys.

Branch model

Single-purpose feature branches off main, merged via merge-commits (git merge --no-ff) so the fan-out graph stays readable. Push the feature branch alongside the merge to preserve the history visually on the GitHub network graph.

No emojis in code or rendered output

Strictly bracketed-prefix everything ([ADD], [CVE], [SQT], etc.). This is for terminal accessibility, grepability of CI logs, and to keep the markdown PR comment readable in monospace fonts.

No Co-authored-by: <yourself> lines

The Co-authored-by trailer is reserved for collaborators who genuinely co-authored the commit. The project’s CI tooling adds its own trailer; don’t duplicate.

Where to put new code

If you’re adding… | Put it in…
A new SBOM format parser | src/parse/<format>.rs + parse::SbomFormat::auto_detect
A new enricher | src/enrich/<name>.rs + add to Enrichment struct
A new output format | src/render/<format>.rs + OutputFormat clap enum
A new diff-core algorithm | src/diff/ (rare; please open an issue first)
A new typosquat ecosystem | data/<eco>-topN.txt + SupportedEcosystem enum
A new CLI flag | src/cli.rs + wire through lib.rs::run_diff
Documentation | docs/src/<chapter>.md + add to docs/src/SUMMARY.md

Tests

Three layers, all run by cargo test --release:

  • Unit tests (#[cfg(test)] mod tests inside each src/<module>.rs): test the smallest unit. Mock at the function-argument boundary (e.g. inject a fake fn fetcher(url) -> Result<Vec<u8>> for network enrichers).
  • CLI tests (tests/cli.rs): spawn the actual bomdrift binary via CARGO_BIN_EXE_bomdrift and assert on stdout/stderr/exit code. These are end-to-end and slower; reserve them for user-visible surface (flags, output shape).
  • Integration tests (tests/integration.rs): exercise the library API directly without spawning the binary. Faster and cheaper than CLI tests because no process is spun up.

Network-touching enrichers should have a unit test for the network-failure path (fake fetcher returns Err). The best-effort contract matters, and silently breaking it would be an easy regression.

Coverage (v0.9.8+)

CI runs cargo llvm-cov on every PR and posts a sticky comment with the overall line coverage %. The full lcov report is uploaded as the coverage-lcov workflow artifact. The artifact name intentionally avoids the standard lcov-output filename: email/feed renderers that strip Markdown backticks autolink anything ending in a TLD, and that filename’s extension resolves to a real, unrelated parked domain. The job is informational for now; there is no --fail-under-lines threshold yet. The plan is to add a ratchet in v0.9.9 once 2–3 releases have made the baseline visible. Until then, the report is a nudge, not a gate: PRs that move coverage in the wrong direction without justification will get a review comment, not a red check.

Test conventions (v0.9.5+)

Tests that mutate SOURCE_DATE_EPOCH (directly or indirectly via bomdrift::clock::*) MUST acquire clock::test_env_lock() to serialize across the crate’s parallel test threads. Without the lock, two tests running in parallel can read each other’s mutated env var and intermittently fail in ways that look format-deterministic but aren’t.

#[test]
fn baseline_expiry_relative_to_source_date_epoch() {
    let _lock = bomdrift::clock::test_env_lock();
    // SAFETY: serialized by _lock above.
    unsafe { std::env::set_var("SOURCE_DATE_EPOCH", "1735689600") }; // 2025-01-01
    // ... test body ...
}

The lock is a std::sync::Mutex<()>. std’s Mutex is not re-entrant, so acquire it once per test and hold the guard for the whole test body; locking it again on a thread that already holds the guard can deadlock or panic. A panic while the guard is held poisons the lock. If you see “PoisonError” in CI but not locally, a previous test panicked while holding the guard: fix the panicking test, not the poison handling.
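A process-wide env lock of this kind can be sketched with a `static` mutex. This is a hypothetical shape, not the actual `clock::test_env_lock` implementation; the `with_env_lock` helper is added here purely for illustration.

```rust
use std::sync::{Mutex, MutexGuard};

/// One lock for the whole test process (Mutex::new is const, so a plain
/// static is enough).
static ENV_LOCK: Mutex<()> = Mutex::new(());

/// Hypothetical equivalent of a test-only env lock. Panics on a poisoned
/// lock so the original failure stays visible rather than being masked.
pub fn test_env_lock() -> MutexGuard<'static, ()> {
    ENV_LOCK
        .lock()
        .expect("a test panicked while holding the env lock")
}

/// Run `body` while holding the lock; the guard is released when this
/// function returns, even on early return from `body`.
pub fn with_env_lock<T>(body: impl FnOnce() -> T) -> T {
    let _guard = test_env_lock();
    body()
}
```

Binding the guard to a named variable such as `_guard` or `_lock` matters: binding it to a bare `_` would drop it immediately and release the lock before the test body runs.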

Adding a new enricher

The shortest viable PR shape, mirroring how enrich::epss was added in v0.8 and enrich::registry in v0.9:

  1. src/enrich/<name>.rs — pure enrich(cs: &ChangeSet, ...) -> Vec<<Name>Finding> with a fail-soft fetcher boundary. Mirror the shape of src/enrich/osv.rs.
  2. Wire into Enrichment — add a field to the bomdrift::enrich::Enrichment struct in src/enrich/mod.rs; have lib.rs::run_diff populate it.
  3. Add a --no-<name> flag to src/cli.rs::DiffArgs, plumb through the [diff] no_<name> config key.
  4. Renderers — add a section to render::markdown, render::term, render::json. For SARIF, add a stable rule ID (bomdrift.<name>), a partialFingerprints.primaryHash/v1 identity tuple, and a fingerprint-stability test.
  5. --debug-calibration row — emit one <kind>|<key>|<score>|<threshold> line per finding considered.
  6. Docs — add docs/src/enrichers/<name>.md and link it from docs/src/SUMMARY.md and docs/src/enrichers/overview.md.
  7. CHANGELOG: add a ## [Unreleased] entry under ### Added.

Adding a new finding kind

When a new finding kind is purely a rendering layer (e.g., a new synthetic ID for VEX export or a new SARIF rule for an existing enricher), the recipe is shorter:

  1. Synthetic-id grammar — extend bomdrift::vex::SyntheticFindingKind and the parse_synthetic_id parser. Round-trip must be exact.
  2. SARIF rule — add the rule descriptor to render::sarif::ALL_RULES so it appears in tool.driver.rules even with zero results, then a partialFingerprints identity tuple for the new rule.
  3. Markdown / terminal / JSON sections — mirror the existing per-finding sections.
  4. Determinism test — round-trip the rendered SARIF / VEX through the parser and assert byte-for-byte equality with the input.

Documentation

When you add a CLI flag / action input / enricher, update:

  1. The relevant chapter in docs/src/.
  2. The CHANGELOG entry under ## [Unreleased].
  3. The README’s Features list (only for user-visible surface).
  4. Module doc comment explaining why (//! ... at the top of the file).

mdBook builds with cd docs && mdbook build. The output renders to docs/book/; check that locally before pushing.

Reporting issues

For false positives / negatives in the heuristic enrichers (typosquat, version-jump, maintainer-age), the most useful issue includes:

  1. The component name + version that fired (or should have).
  2. The expected behavior + observed behavior.
  3. A minimal SBOM pair if possible (synthetic CDX 1.5 JSON works).

Open an issue at https://github.com/Metbcy/bomdrift/issues.

Security disclosures

For supply-chain bugs in bomdrift itself — particularly anything that could let bomdrift run untrusted input as code — please report privately via GitHub Security Advisories rather than a public issue.

Benchmarks

bomdrift uses criterion for benchmarking the four hot paths: parse, diff, typosquat, and render. Benchmarks are not run in CI (the variance on shared GitHub runners is ±20%, which buries any real signal); they’re a tool for validating perf-relevant changes on hardware you control.

Running

# Run all benchmarks
cargo bench

# Run just one harness
cargo bench --bench parse
cargo bench --bench diff
cargo bench --bench typosquat
cargo bench --bench render

# Filter inside a harness
cargo bench --bench typosquat -- npm_batch

Criterion writes an HTML report to target/criterion/report/index.html on each run with throughput plots, distribution histograms, and diff-against-previous-run charts.

What each harness measures

parse — SBOM parser layer

For each of the three fixture formats (CycloneDX, SPDX, Syft):

  • json_value: cost of serde_json::from_str to a Value only. Captures the JSON-deserialization floor independent of bomdrift’s parser.
  • full_pipeline: cost of from_str + parse_with_format to a normalized model::Sbom. The delta vs json_value is bomdrift’s parsing overhead.

A regression in the delta is the signal worth investigating — a regression in json_value is a serde_json change.

diff — diff core

  • axios_fixture_pair: realistic small-PR shape (~3 components per side). The lower bound for any diff invocation.
  • synth_monorepo_200: 200 components per side, half of them version-changed. The realistic monorepo upper bound for a single PR.
  • synth_self_diff_200: same input on both sides. Worst case for the BTreeMap-intersection path with no resulting work to do.

A regression on synth_monorepo_200 likely indicates a hot-loop change in diff_one_key; a regression on synth_self_diff_200 likely indicates a ComponentKey::Ord change.

typosquat — Jaro-Winkler scoring

  • one_npm_typosquat_axios: a single candidate (plain-crypto-js) scored against the embedded npm top-1k list. The typosquat enricher’s per-candidate cost.
  • npm_batch 10/50/100: a batch of N candidates, exercising the per-candidate cost amortized.
  • mixed_three_ecosystems: one candidate per ecosystem (npm + PyPI + Cargo), exercising the per-ecosystem dispatch and embedded-list load cost (after the OnceLock has been hit).

The first invocation of each ecosystem in a process pays the legit-list parsing + canonicalization cost (~1ms for the npm 1k list); subsequent invocations are hot. Criterion’s iter() measures the cached path.
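The pay-once behavior follows from lazily initializing the list behind a `OnceLock`. A minimal sketch, assuming an embedded newline-separated list with `#` comment lines (the `NPM_TOP` contents, the comment syntax, and the lowercase canonicalization are all stand-ins, not the real data format):

```rust
use std::collections::BTreeSet;
use std::sync::OnceLock;

/// Stand-in for an embedded top-N list; the real data files are not shown.
const NPM_TOP: &str = "# snapshot header comment\naxios\ncrypto-js\nLodash\n\nreact\n";

static NPM_LEGIT: OnceLock<BTreeSet<String>> = OnceLock::new();

/// Parse and canonicalize the list on first use; later calls reuse it.
pub fn npm_legit_names() -> &'static BTreeSet<String> {
    NPM_LEGIT.get_or_init(|| {
        NPM_TOP
            .lines()
            .map(str::trim)
            // Skip blank lines and header comments.
            .filter(|line| !line.is_empty() && !line.starts_with('#'))
            .map(str::to_ascii_lowercase)
            .collect()
    })
}
```

Every call after the first returns the same `&'static` set, which is why a benchmark loop mostly observes the cached path.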

render — output renderers

One benchmark per renderer (markdown / JSON / SARIF / terminal), each fed a synthetic ChangeSet shaped like a moderate PR (50 added / 20 removed / 30 version-changed / 5 license-changed + 10 typosquats + 15 CVEs).

A regression on one renderer specifically is usually a string-formatting change in that file. A regression across all renderers is usually a ChangeSet shape change that propagated.

Suggested workflow for perf-relevant PRs

  1. On a clean main, run cargo bench and let criterion record the baseline.
  2. Switch to your branch, make the change, and run cargo bench again.
  3. Criterion’s HTML report shows a “Change vs previous” column with confidence intervals. ±5% is noise on most hardware; ±10%+ is worth looking at; statistical significance markers (criterion’s “Performance has improved” / “Performance has regressed” lines) are the first-class signal.
  4. If the change is intentional (e.g. a feature that adds a new pass), note the new baseline in the PR description so reviewers know to compare against the post-change number, not the pre-change one.

Why no CI integration?

  • Shared GitHub runners have ±20% variance run-over-run on these benchmarks. Real regressions are smaller than the noise floor.
  • Self-hosted runners with pinned hardware would solve that, but the project doesn’t have that infrastructure (and the operational cost isn’t worth it at the project’s scale).
  • For now, run benchmarks locally on a quiet machine; a future contributor can wire up a self-hosted bench runner if the project grows enough to justify it.

Property-based testing

bomdrift uses proptest for property-based tests of the parser, diff core, typosquat canonicalization, and version-jump extractor. Property tests run as part of cargo test alongside the unit tests — there’s no separate harness.

What’s tested

Parser layer

Hypothesis: feeding arbitrary bytes through serde_json::from_slice followed by parse_with_format must NEVER panic. Errors are fine; panics are bugs.

Tests in src/parse/mod.rs::tests:

  • parse_pipeline_does_not_panic_on_arbitrary_bytes — 1024 random byte sequences (0–2048 bytes each). Most error at JSON parse; the few valid-JSON-but-not-an-SBOM cases exercise the parser’s error paths.
  • parse_pipeline_does_not_panic_on_arbitrary_json — 1024 random serde_json::Value trees up to depth 3. Far more efficient at exploring the parser’s behavior on well-formed-JSON-but-not-an-SBOM than random bytes.
  • parse_pipeline_does_not_panic_with_format_hint — same as above but with each SbomFormat hint forced. Catches per-parser panics that auto-detect would have routed away from.
  • ecosystem_from_purl_does_not_panic — arbitrary unicode strings through the purl-type extractor.
  • hash_alg_does_not_panic — arbitrary algorithm strings through the hash-algorithm normalizer.

Typosquat canonicalization

Tests in src/enrich/typosquat.rs::tests:

  • pep503_normalize_does_not_panic — arbitrary unicode through the PyPI normalizer. Output invariants asserted: lowercase only, no leading/trailing dashes.
  • last_path_segment_returns_substring — arbitrary unicode through the Go/Composer match-form extractor. The result must be a substring of the input and must contain no /.
  • enrich_does_not_panic_on_arbitrary_components — random ChangeSets with up to 32 added components of varying ecosystems must go through the full enrich() path without panicking.
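The two canonicalizers are small enough to sketch. These are illustrative implementations written to satisfy the invariants listed above, not the code in src/enrich/typosquat.rs; in particular, stripping leading and trailing dashes goes beyond plain PEP 503 and is assumed here only because the property test asserts it.

```rust
/// PEP 503-style normalization: lowercase, with runs of '-', '_', and '.'
/// collapsed into a single '-'. Leading and trailing separators are dropped
/// (an assumption taken from the invariants listed above).
pub fn pep503_normalize(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    let mut pending_dash = false;
    for ch in name.chars() {
        if matches!(ch, '-' | '_' | '.') {
            pending_dash = true;
            continue;
        }
        // Emit one dash for the whole separator run, but never at the start.
        if pending_dash && !out.is_empty() {
            out.push('-');
        }
        pending_dash = false;
        out.extend(ch.to_lowercase());
    }
    out
}

/// Last '/'-separated segment of a module or package path. The result is
/// always a substring of the input and never contains '/'.
pub fn last_path_segment(path: &str) -> &str {
    path.rsplit('/').next().unwrap_or(path)
}
```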

Diff core

Tests in src/diff/mod.rs::tests:

  • diff_self_is_empty: for any Sbom, diff(a, a) produces an empty ChangeSet. The strongest invariant; catches parser non-determinism that other tests miss.
  • diff_swap_roles_when_inputs_swapped: diff(a, b) and diff(b, a) swap added/removed cardinalities and preserve version_changed and license_changed cardinalities. Catches asymmetric bugs in the per-key dispatch.
  • diff_is_deterministic: two calls on the same input produce byte-equal ChangeSet structures. The upsert contract for the PR-comment renderer relies on this.

Version-jump extractor

Tests in src/enrich/version_jump.rs::tests:

  • extract_major_does_not_panic — arbitrary strings through extract_major().
  • extract_major_round_trips_well_formed_numerics — for any major version 1..10000, the function round-trips the bare form, the v-prefixed form, and the pre-release suffix form.
  • extract_major_handles_unicode_without_panic — arbitrary unicode prefix + a well-formed version number. The function should treat the prefix as garbage (return None) but never panic.
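A hand-rolled extractor consistent with these properties might look like the sketch below. It is not the code in enrich/version_jump.rs: the exact separator set, the treatment of downgrades, and the `is_multi_major_jump` helper are assumptions made for illustration.

```rust
/// Extract the leading major version number, tolerating a `v` prefix and
/// anything after the first '.', '-', or '+'. Returns None for any other
/// shape, including a non-numeric or unicode prefix.
pub fn extract_major(version: &str) -> Option<u64> {
    let rest = version.strip_prefix('v').unwrap_or(version);
    let end = rest
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(rest.len());
    let (digits, tail) = rest.split_at(end);
    // Require at least one digit, followed by end-of-string or a separator.
    let tail_ok = tail.is_empty() || tail.starts_with(|c: char| matches!(c, '.' | '-' | '+'));
    if digits.is_empty() || !tail_ok {
        return None;
    }
    digits.parse().ok()
}

/// Flag upgrades that cross at least `threshold` major versions.
/// Downgrades and unparsable versions are not flagged in this sketch.
pub fn is_multi_major_jump(before: &str, after: &str, threshold: u64) -> bool {
    match (extract_major(before), extract_major(after)) {
        (Some(b), Some(a)) => a.saturating_sub(b) >= threshold,
        _ => false,
    }
}
```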

Why property-based, not cargo-fuzz?

  • Stable Rust. proptest works on the stable toolchain; cargo-fuzz requires nightly via the libFuzzer LLVM coverage instrumentation.
  • Runs as part of cargo test. No separate harness, no cross-build complexity, no CI configuration delta. Every PR runs the property tests automatically.
  • Counterexample shrinking. When a property fails, proptest shrinks the failing input toward a minimal reproduction. The resulting test failure is much easier to debug than a 2KB random byte sequence from libFuzzer’s corpus.

The trade-off is corpus persistence — proptest doesn’t accumulate a crash corpus the way libFuzzer does. For a tool of bomdrift’s size that’s a fair trade; if the project grows to need long-running fuzz campaigns, a future contributor can wire up cargo-fuzz alongside proptest.

Running

# Run all tests including property tests
cargo test --release

# Just one property test
cargo test --release diff_self_is_empty

# Increase case count for thorough exploration
PROPTEST_CASES=10000 cargo test --release diff_self_is_empty

The default case counts (512–2048 per property) are calibrated so the full test suite finishes in ~2 seconds. Bump PROPTEST_CASES for deeper exploration on a release machine.

When a property test fails

  1. proptest prints a minimized counterexample. Copy it verbatim into a new unit test in the same module.
  2. Add a #[test] that exercises the counterexample directly. This becomes a regression guard; the property test’s randomness alone isn’t sufficient long-term coverage for a known-bad input.
  3. Fix the bug.
  4. Both the property test and the new unit test should now pass.

Real-world SBOM regression corpus

In addition to property tests, bomdrift ships a corpus of real-world SBOMs in tests/fixtures/real-world/ (sourced from the official CycloneDX and SPDX example repos). The regression tests in tests/real_world.rs exercise:

  • Every fixture parses without error.
  • Every fixture has at least one component.
  • Components with known purl types (pkg:npm/, pkg:pypi/, etc.) resolve to the canonical Ecosystem variant — not to Ecosystem::Other(_).
  • Diffing two unrelated real-world SBOMs doesn’t panic.
  • Self-diffing a real-world SBOM produces an empty ChangeSet.
  • All four renderers produce non-empty output on a real diff.

The corpus is kept small (~2.7 MB total) so test runtime stays sub-second. Refresh it by re-fetching from upstream.

Roadmap

What’s planned, what’s deliberately out of scope, and what the acceptance criteria for new contributions look like.

Shipped (v0.9.9 — distribution)

The “distribution release.” No source-code feature work; every install path now works in one command.

  • cargo install bomdrift — published to crates.io. Cargo metadata + [package.metadata.docs.rs] + an exclude list trimming the published crate to 220 KiB compressed. New publish-dry-run PR-time CI guard.
  • docker run ghcr.io/metbcy/bomdrift:v0.9.9 — multi-arch (linux/amd64, linux/arm64) distroless image on every release. Tag matrix :vX.Y.Z, :vX.Y, :vX, :latest.
  • SLSA build provenance on every release archive AND the ghcr.io image, via actions/attest-build-provenance@v2. Verify with gh attestation verify or slsa-verifier. Complementary to the existing cosign keyless signatures — see Release signing.
  • Automated v1 major-tag retag: release.yml force-pushes the major-version tag (v1 today; v${major} once v1.0.0 ships) to point at the latest release on every tag.
  • Manual recovery workflow — new rebuild-docker.yml lets a maintainer rebuild + push the docker image for any past tag without re-cutting the release. Reads Dockerfile from main so future fixes apply backwards.
  • README + Marketplace polish — crates.io / docs.rs / Marketplace badges; rewritten Marketplace listing description leading with the axios narrative.

Shipped (v0.9.8 — code-review-driven hardening)

  • Continuous parser fuzzing via cargo-fuzz against CycloneDX, SPDX, and Syft JSON parsers. PR-time short pass + weekly long scheduled run. See Continuous fuzzing.
  • CI coverage report via cargo-llvm-cov with a sticky PR comment. Informational; --fail-under-lines will be added once coverage is visible across 2–3 releases.
  • Production code audited for unwrap/expect/panic/todo/unimplemented. Crate-root clippy::* warn-level lints enforce this going forward. Zero production .unwrap() calls remain; the remaining .expect() sites carry rationale comments.
  • All unsafe blocks documented with // SAFETY: comments, with clippy::undocumented_unsafe_blocks enforcing the contract.
  • src/lib.rs 47 KB → 31 lines: run_diff orchestration extracted to src/run.rs. Public API surface preserved byte-for-byte.

Shipped (v0.9.7 — milestone follow-ups)

  • SPDX WITH-chain exception inheritance: (X WITH ex) AND (Y) / (X WITH ex_a) OR (X WITH ex_b) now evaluate per-leaf with proper AND/OR semantics. AND inherits a denied exception; OR doesn’t poison if another branch is permitted.
  • --multi-major-delta <N> — last hardcoded calibration threshold lifted. Default 2; tunable via flag or [diff] multi_major_delta config key.
  • Windows plugin timeout (first-class) — replaced manual Child::try_wait() polling with the wait-timeout crate. Behavior unchanged on Unix; first-class on Windows.
  • action.yml input parity — twenty-five new inputs map every v0.7-v0.9.7 CLI flag to an action input.
  • Air-gapped / self-hosted Sigstore docs — documents env-var passthrough (SIGSTORE_REKOR_URL, COSIGN_FULCIO_URL, etc.) and key-based attestation fallback.
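
The per-leaf AND/OR semantics described for SPDX WITH-chains above can be illustrated with a minimal sketch. The `Expr` type, `permitted`, and `leaf` are hypothetical names for illustration only; bomdrift's actual evaluator is built on the spdx crate and may be structured differently. The sketch considers denied exceptions only and ignores base-license policy.

```rust
/// Hypothetical expression tree, for illustration only.
#[derive(Debug, Clone)]
enum Expr {
    /// A single license, optionally carrying a `WITH <exception>` clause.
    Leaf { license: String, exception: Option<String> },
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

/// Returns true when the expression is permitted given a list of denied
/// exception identifiers. Each leaf is judged on its own exception first.
fn permitted(expr: &Expr, denied_exceptions: &[&str]) -> bool {
    match expr {
        Expr::Leaf { exception, .. } => match exception {
            Some(ex) => !denied_exceptions.contains(&ex.as_str()),
            None => true,
        },
        // AND: every branch applies, so one denied leaf denies the whole.
        Expr::And(l, r) => permitted(l, denied_exceptions) && permitted(r, denied_exceptions),
        // OR: a denied branch does not poison a permitted sibling.
        Expr::Or(l, r) => permitted(l, denied_exceptions) || permitted(r, denied_exceptions),
    }
}

/// Small constructor to keep examples readable.
fn leaf(license: &str, exception: Option<&str>) -> Expr {
    Expr::Leaf {
        license: license.to_string(),
        exception: exception.map(str::to_string),
    }
}
```

With `ex` denied, `(X WITH ex) AND (Y)` evaluates as denied, while `(X WITH ex_a) OR (X WITH ex_b)` stays permitted as long as only one of the two exceptions is on the deny list.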

Shipped (v0.9.6 — finish the roadmap)

  • OCI artifact attestation verification: --before-attestation, --after-attestation, --cosign-identity, --cosign-issuer, and --require-attestation. bomdrift shells out to cosign verify-attestation --type=cyclonedx and consumes the verified CycloneDX SBOM payload. See Attestation.
  • Custom rules / plugin system — external-process plugins via repeatable --plugin <manifest.toml>. JSON over stdin/stdout, best-effort failures, new bomdrift.plugin SARIF rule. See Plugins.
  • Calibration knobs: --typosquat-similarity-threshold, --young-maintainer-days, --cache-ttl-hours flags plus matching [diff] config keys. Every previously hardcoded threshold is now configurable.
  • Cache-TTL unification — internal refactor consolidating the four duplicated CACHE_TTL_SECS constants behind a single cache::ttl() helper. No user-visible change.

Shipped (v0.9.5 — polish + multi-SCM parity)

  • Per-exception SPDX allow/deny via [license] allow_exceptions / deny_exceptions and --allow-exception / --deny-exception CLI flags. Apache-2.0 WITH LLVM-exception etc. now evaluated at the exception level, not just the base license.
  • Bitbucket + Azure DevOps comment-driven suppression bridges — Cloudflare Worker references with the same five guards as the GitLab bridge. bomdrift now has comment-driven suppression parity across all four major SCMs.
  • bomdrift::vex::parse_synthetic_id public helper — round-trips bomdrift’s synthetic finding IDs back to a structured kind. Lets external VEX tooling identify which finding a statement targets.
  • spdx crate exact-pinned (=0.10.9) so license-list updates can’t silently change policy semantics.
  • BaselineEntry / ExpiredEntry unified internally; public alias preserved.
  • CI Rust toolchain pinned to MSRV 1.88; bumps are deliberate.
  • Single source of truth for the suppress-comment grammar (scripts/parse-suppress-comment.sh + CI sync guard).
  • GitLab note upsert + threading semantics documented.

Shipped (v0.9 — interoperability + breadth)

  • VEX consume: --vex <path> accepts OpenVEX 0.2.0 + CycloneDX VEX 1.6 statements; not_affected / fixed suppress findings, under_investigation annotates.
  • VEX emit: --emit-vex <path> emits an OpenVEX 0.2.0 document with explicit per-entry vex_status (default under_investigation, never auto-promoted).
  • Full SPDX expression evaluator via the spdx crate. Deprecates allow_ambiguous.
  • Bitbucket Pipelines + Azure DevOps Pipelines templates with auto-detection (BITBUCKET_BUILD_NUMBER, TF_BUILD) and per-platform footer shapes.
  • Registry-metadata enrichers — npm/PyPI/crates.io. New kinds: recently-published, deprecated, maintainer-set-changed (npm only).
  • GitLab comment-driven suppression via a security-reviewed Cloudflare Worker reference bridge (five guards).
  • Explicit non-goals + pair-with recommendations in README and STATUS.
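
The VEX consume rule above (not_affected and fixed suppress a finding, under_investigation annotates it) amounts to a small status-to-outcome mapping. The sketch below is illustrative: `VexStatus`, `Outcome`, `parse_status`, and `apply_vex` are hypothetical names, and the handling of `affected` (keep the finding as-is) is an assumption, since the roadmap does not describe it.

```rust
/// The four OpenVEX status values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VexStatus {
    NotAffected,
    Affected,
    Fixed,
    UnderInvestigation,
}

/// What happens to a finding once a matching VEX statement is applied.
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Suppress,
    Annotate,
    Keep,
}

/// Parses the status string as it appears in a VEX statement.
fn parse_status(s: &str) -> Option<VexStatus> {
    match s {
        "not_affected" => Some(VexStatus::NotAffected),
        "affected" => Some(VexStatus::Affected),
        "fixed" => Some(VexStatus::Fixed),
        "under_investigation" => Some(VexStatus::UnderInvestigation),
        _ => None,
    }
}

/// Maps a status to an outcome, following the rule stated in the roadmap.
fn apply_vex(status: VexStatus) -> Outcome {
    match status {
        VexStatus::NotAffected | VexStatus::Fixed => Outcome::Suppress,
        VexStatus::UnderInvestigation => Outcome::Annotate,
        // Assumption: an explicit `affected` statement leaves the finding untouched.
        VexStatus::Affected => Outcome::Keep,
    }
}
```

Keeping the parse step separate from the policy step means an unknown status string can be reported without silently suppressing anything.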

Shipped (v0.8 — supply-chain hardening)

  • SARIF + GitHub Code Scanning with stable per-result fingerprints and one-line action opt-in (upload-to-code-scanning: true).
  • EPSS scoring on every CVE-aliased advisory; --fail-on-epss.
  • CISA KEV flagging of known-exploited advisories; --fail-on kev.
  • License allow/deny policy with *-suffix glob matching and fail-closed compound-expression handling. New bomdrift.license-violation SARIF rule.
  • Baseline expires + reason with stderr warnings on expiry.
  • time crate + clock module honoring SOURCE_DATE_EPOCH.
  • OSV CVE aliases threaded through VulnRef.
  • --debug-calibration-format jsonl and --output-file <PATH>.
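
For readers unfamiliar with SOURCE_DATE_EPOCH: it is an environment variable holding a Unix timestamp in seconds, which lets time-dependent output be reproduced exactly. A minimal sketch of a clock that honors it follows. It uses only the standard library, whereas bomdrift's clock module uses the time crate; `now_from` and `now` are hypothetical names, and falling back to the system clock on an unparsable value is an assumption.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Resolves "now" from an optional SOURCE_DATE_EPOCH value.
/// Taking the value as a parameter keeps the function testable without
/// mutating the process environment.
fn now_from(source_date_epoch: Option<&str>) -> SystemTime {
    match source_date_epoch.and_then(|v| v.trim().parse::<u64>().ok()) {
        // Pinned clock: seconds since the Unix epoch.
        Some(secs) => UNIX_EPOCH + Duration::from_secs(secs),
        // Assumption: unset or unparsable values fall back to the real clock.
        None => SystemTime::now(),
    }
}

/// Convenience wrapper that reads the real environment variable.
fn now() -> SystemTime {
    now_from(std::env::var("SOURCE_DATE_EPOCH").ok().as_deref())
}
```

Routing every age calculation (baseline expiry, recently-published windows) through one such function is what makes a pinned timestamp effective across the whole run.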

Investigated and decided

  • GraphQL maintainer-age — investigated again for v0.9.6 and rejected. GitHub’s GraphQL history() connection doesn’t expose ascending-date ordering, so finding the oldest contributor commit still requires walking the cursor backward from the most recent commit. REST’s GET /repos/{o}/{r}/commits?author=X&per_page=1 plus Link-header parsing for the last page lets bomdrift fetch a single author’s oldest commit in two requests. Decided: REST stays. Closing this one off the roadmap permanently — re-open only if GitHub adds ASC ordering to the GraphQL history connection.
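
The two-request REST approach relies on reading the page number from the rel="last" entry of the Link response header: with per_page=1 and commits listed newest first, the last page holds the author's oldest commit. A minimal parsing sketch follows. `last_page` is a hypothetical helper rather than bomdrift's actual code, and it assumes URLs in the header contain no commas.

```rust
/// Extracts the `page` query parameter from the `rel="last"` entry of a
/// Link header. Returns None when there is no such entry, which is the
/// case when the result set fits on a single page.
fn last_page(link_header: &str) -> Option<u32> {
    // Entries are comma-separated: `<url>; rel="next", <url>; rel="last"`.
    link_header.split(',').find_map(|entry| {
        let (url_part, params) = entry.split_once(';')?;
        if !params.split(';').any(|p| p.trim() == r#"rel="last""#) {
            return None;
        }
        let url = url_part.trim().strip_prefix('<')?.strip_suffix('>')?;
        let query = url.split_once('?')?.1;
        // Match `page=` exactly so that `per_page=1` is not picked up.
        query
            .split('&')
            .find_map(|kv| kv.strip_prefix("page=")?.parse().ok())
    })
}
```

The first request returns the header; the second request repeats the same query with `page` set to the extracted number. A missing rel="last" entry means the first response already contains the only commit.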

Calibration

All calibration thresholds are configurable via .bomdrift.toml and CLI flags. Tune [diff] typosquat_similarity_threshold, young_maintainer_days, recently_published_days, cache_ttl_hours. See CLI reference for flag forms.

Blocked on upstream

  • PyPI / crates.io maintainer-set-changed. The npm enricher (shipped v0.9) compares maintainer sets for VersionChanged components by reading registry.npmjs.org’s per-version maintainers[] array. PyPI’s https://pypi.org/pypi/<pkg>/json returns repository-level maintainers but no per-version history. Crates.io’s https://crates.io/api/v1/crates/<name> returns repository-level crate.owners but no per-version published_by history. If either ecosystem ships a per-version maintainer endpoint, bomdrift adds the enricher in a future minor release.

Future candidates (not committed)

Candidates that could land in a future release if maintainer time and adoption signal warrant:

  • Homebrew tap (Metbcy/homebrew-tap) — brew install Metbcy/tap/bomdrift. macOS adopters reach for brew install first; this closes the macOS install gap.
  • nix flake, AUR PKGBUILD, winget + Scoop manifests — the Linux power-user and Windows-package-manager install paths.
  • README diet — move the comparison table to a dedicated compare.md page, shorten the README to a one-screen pitch.
  • asciinema demo recorded against examples/axios-incident/, embedded in the README and the docs landing page.
  • Comparison docs (deep): compare/socket.md, compare/snyk.md, compare/trivy-grype-osv.md — neutral-tone pages that explain when to pick bomdrift vs. each competitor.

Non-goals

These are explicit non-goals. Don’t open a PR for them — it’ll be declined.

SBOM generation

bomdrift only consumes SBOMs. Use Syft to generate them — it’s already excellent and bomdrift’s contribution would be net-negative.

Replacing your SCA scanner

OSV-scanner, Grype, Trivy all have richer vulnerability databases and broader package metadata than bomdrift. bomdrift’s CVE enrichment is change-focused: only on what’s new in this diff. If you want “what’s in my SBOM right now?”, run an SCA scanner. If you want “what changed in this PR’s deps that I should worry about?”, that’s bomdrift’s question.

Reachability / call-graph analysis

Determining whether the vulnerable function in a flagged advisory is actually invoked from your application’s entry points is a fundamentally different analysis than diff-level supply-chain risk. It requires whole-program call-graph construction, language-specific runtime modeling (dynamic dispatch, reflection, eval), and an ever-growing per-CVE vulnerable-symbol database. The vendors who do this well — Endor Labs, Snyk Reachability — invest at a scale OSS bomdrift can’t match, and the per-CVE symbol curation is the moat, not the call-graph engine itself. Pair bomdrift with Endor or Snyk for reachability; bomdrift answers “what changed”, they answer “does the change reach prod code”.

Dependency-tree visualization

cargo tree, pnpm why, and ecosystem-specific equivalents handle this well. bomdrift’s diff core could in principle walk the dependencies / relationships arrays from the source SBOM, but it’s outside the “what’s risky” scope.

Per-language deep parsing

bomdrift treats SBOMs as the source of truth for what’s installed. Walking package-lock.json / Pipfile.lock / Cargo.lock directly would let us catch things SBOMs miss (lockfile drift), but doubles the parser surface for marginal signal — and the SBOM-generation ecosystem is converging fast enough that this won’t matter in 18 months.

Web UI / dashboard

bomdrift is intentionally a CI tool. Long-running stateful dashboards (org-wide vuln tracking, exception management UI) are better served by tools designed for that — Anchore Enterprise, Snyk, etc. The PR comment is the UX.

Contribution acceptance criteria

A new enricher / output format / parser PR should:

  1. Pass cargo clippy --all-targets --all-features -- -D warnings on its own. The codebase is clippy-clean and we keep it that way.
  2. Add unit tests in src/<your-module>/tests covering the happy path + at least one edge case. Best-effort enrichers should test the network-failure path (via fake fetcher injection).
  3. Add an end-to-end test in tests/cli.rs if it’s CLI-visible, or tests/integration.rs if it’s library-internal.
  4. Document its rationale in a module doc comment at the top of the file. The “why” is more interesting than the “what” — future contributors lift the rationale, not just the implementation.
  5. Stay best-effort. Network or filesystem failures must not block the diff from rendering. The contract is “render whatever we got”, not “all-or-nothing”.
  6. Not pull in tokio / chrono / semver / octocrab without strong justification. The dep-tree audit is real — see Architecture.
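
Criteria 2 and 5 fit together: injecting the fetcher makes the network-failure path testable, and the enricher turns a failure into a warning instead of an error. The sketch below shows the general shape under stated assumptions. `Fetcher`, `FailingFetcher`, `CannedFetcher`, `Enrichment`, `enrich`, and the registry.example.test URL are all hypothetical; bomdrift's real traits and types may differ.

```rust
/// Hypothetical abstraction over "fetch a URL and return the body".
trait Fetcher {
    fn fetch(&self, url: &str) -> Result<String, String>;
}

/// Test double that always fails, simulating a network outage.
struct FailingFetcher;

impl Fetcher for FailingFetcher {
    fn fetch(&self, _url: &str) -> Result<String, String> {
        Err("connection refused".to_string())
    }
}

/// Test double that returns a fixed body.
struct CannedFetcher(&'static str);

impl Fetcher for CannedFetcher {
    fn fetch(&self, _url: &str) -> Result<String, String> {
        Ok(self.0.to_string())
    }
}

/// Result of an enrichment pass: findings plus non-fatal warnings.
#[derive(Debug, Default, PartialEq)]
struct Enrichment {
    notes: Vec<String>,
    warnings: Vec<String>,
}

/// Best-effort enricher: it never returns an error, so the diff can always
/// render with whatever data was obtained.
fn enrich(fetcher: &dyn Fetcher, package: &str) -> Enrichment {
    let url = format!("https://registry.example.test/{package}");
    match fetcher.fetch(&url) {
        Ok(body) => Enrichment { notes: vec![body], warnings: vec![] },
        Err(e) => Enrichment {
            notes: vec![],
            warnings: vec![format!("enrichment skipped for {package}: {e}")],
        },
    }
}
```

Because `enrich` returns a plain value rather than a `Result`, callers cannot accidentally propagate a network failure into a failed render.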

See Contributing for the development loop.