Automating Runtime Parity Checks Between Local and Staging
Automate runtime parity checks between local and staging. Diff OS, runtime versions, and lockfile hashes in a script and gate merges before drift hits production.
A change passes every local test, then fails the moment it reaches staging because the staging runtime is 18.20.1 and the laptop runs 18.19.0 — silent drift that no test asserted against. This walkthrough builds a scriptable parity check that diffs OS, runtime version, and lockfile hash between local and staging, then gates merges on the result. It is the production-ready companion to runtime parity frameworks under onboarding architecture and friction mapping.
Diagnostic
Diff the environment fingerprints across both contexts:
#!/usr/bin/env bash
set -euo pipefail
env | grep -E '^(NODE_VERSION|PATH|LD_LIBRARY_PATH|AWS_REGION)' | sort > local.env
ssh staging "env | grep -E '^(NODE_VERSION|PATH|LD_LIBRARY_PATH|AWS_REGION)' | sort" > staging.env
diff -u local.env staging.env
Expected BAD output — the runtime and library path diverge:
--- local.env
+++ staging.env
@@ -1,4 +1,4 @@
AWS_REGION=us-west-2
-LD_LIBRARY_PATH=/usr/local/lib/node
-NODE_VERSION=18.19.0
+LD_LIBRARY_PATH=/opt/staging/lib
+NODE_VERSION=18.20.1
PATH=/usr/local/bin:/usr/bin:/bin
Root Cause
Floating base-image tags (node:18, python:3.11) and dynamic environment resolution let local and staging diverge between rebuilds. When npm or pip falls back to the network registry instead of a cached, lockfile-pinned tarball, transitive dependencies resolve to different versions on each host. Nothing pins the contract, so nothing catches the divergence until runtime. The danger is that each axis of drift is individually plausible: a one-patch Node bump, a slightly different LD_LIBRARY_PATH, a transitive dependency that floated forward. None of them trips a test in isolation, but together they produce behavior that exists only on staging — the textbook "works on my machine" report, just inverted. A parity check is valuable precisely because it asserts on the axes tests ignore: the runtime version, the architecture, and the lockfile hash, all of which are normally invisible to a passing green build. This is the same symptom that debugging "works on my machine" runtime drift triages from the developer's side.
Resolution
- Pin immutable digests in every
Dockerfile:FROM node@sha256:0000000000000000000000000000000000000000000000000000000000000000 - Lock the host toolchain so package operations cannot trigger implicit upgrades:
# .tool-versions nodejs 18.19.0 - Write a strict parity script that compares runtime metadata and lockfile integrity:
#!/usr/bin/env bash set -euo pipefail LOCAL_RUNTIME=$(node -p 'JSON.stringify({v:process.version,arch:process.arch,platform:process.platform})') STAGING_RUNTIME=$(ssh staging "node -p 'JSON.stringify({v:process.version,arch:process.arch,platform:process.platform})'") LOCAL_HASH=$(sha256sum package-lock.json | awk '{print $1}') STAGING_HASH=$(ssh staging "sha256sum package-lock.json | awk '{print \$1}'") [ "$LOCAL_RUNTIME" = "$STAGING_RUNTIME" ] || { echo "RUNTIME_DRIFT: $LOCAL_RUNTIME != $STAGING_RUNTIME" >&2; exit 1; } [ "$LOCAL_HASH" = "$STAGING_HASH" ] || { echo "LOCKFILE_DRIFT" >&2; exit 1; } echo "PARITY_CHECK_PASSED" - Make it executable and run it before every push:
#!/usr/bin/env bash set -euo pipefail chmod +x scripts/parity-check.sh ./scripts/parity-check.sh
Expected Output
When local and staging share an identical contract, the script reports each axis green and exits 0:
[✓] Runtime: v18.19.0 (local) == v18.19.0 (staging)
[✓] Architecture: x86_64 matches
[✓] Lockfile integrity: SHA256 match confirmed
PARITY_CHECK_PASSED
Prevention
- Wire the script into a pre-push hook and a required CI step so divergence blocks the merge:
# .github/workflows/parity.yml on: [pull_request] jobs: parity: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - run: ./scripts/parity-check.sh --mode ci - Cache dependency trees keyed on the lockfile hash to keep the check fast across matrix jobs.
- Schedule a weekly drift audit (
cron: "0 3 * * 1") that opens a tracking issue when staging moves out from under the pinned baseline.
CI runners: GitHub Actions defaults to AMD64; add
docker/setup-qemu-actionto also validate ARM64 runtime behavior. WSL2: absolute Windows paths break in CI; convert withwslpath -uin local hooks before committing. Apple Silicon (ARM64): includeprocess.archin the fingerprint so an arm64 laptop is never compared as equal to an amd64 staging host.
Rollback
#!/usr/bin/env bash
set -euo pipefail
git revert --no-edit HEAD # undo the digest pin that broke the build
docker build -t app:rollback .