Reproduce a customer bug without production access

When a bug only reproduces against one customer's data, the shortest path to a fix is access to that data. Access to production isn't on the table. Shared staging is stale and probably doesn't contain that tenant anyway. The DBA export will arrive next week. Meanwhile the ticket sits.

This guide covers a specific alternative: a one-button CI job that reads from your DR warm standby, subsets to one tenant, masks PII, and hands back a connection string to a clean repro database inside the VPC. No production access. No manual export. No six-week-old data.

The schema you're actually debugging

A typical B2B billing repro touches organisations, users, subscriptions, invoices, line items, payment methods, and an audit_events table with JSONB payloads. Miss one hop in the FK graph and your repro lies to you.

CREATE TABLE organizations (
  id           bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  tenant_slug  text NOT NULL UNIQUE,
  legal_name   text NOT NULL,
  created_at   timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE users (
  id              bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  organization_id bigint NOT NULL REFERENCES organizations(id),
  email           text NOT NULL,
  full_name       text NOT NULL,
  role            text NOT NULL DEFAULT 'member'
);

CREATE TABLE subscriptions (
  id              bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  organization_id bigint NOT NULL REFERENCES organizations(id),
  plan_code       text NOT NULL,
  status          text NOT NULL
);

CREATE TABLE invoices (
  id              bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  subscription_id bigint NOT NULL REFERENCES subscriptions(id),
  issued_at       timestamptz NOT NULL,
  total_cents     bigint NOT NULL,
  currency        char(3) NOT NULL DEFAULT 'USD'
);

CREATE TABLE invoice_line_items (
  id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  invoice_id  bigint NOT NULL REFERENCES invoices(id),
  sku         text NOT NULL,
  quantity    int NOT NULL,
  unit_cents  bigint NOT NULL
);

CREATE TABLE payment_methods (
  id              bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  user_id         bigint NOT NULL REFERENCES users(id),
  last_four       char(4) NOT NULL,
  provider_token  text NOT NULL
);

CREATE TABLE audit_events (
  id              bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  organization_id bigint NOT NULL REFERENCES organizations(id),
  occurred_at     timestamptz NOT NULL,
  payload         jsonb NOT NULL
);

The bug lives in the join between invoices and invoice_line_items, but the smoking gun is often in audit_events.payload — nested JSON with actor emails and support notes. Regex over payload::text corrupts the JSON and burns an afternoon. A proper JSONB masker handles this at the path level.

Why a manual pg_dump script breaks

The usual workaround: dump organizations WHERE tenant_slug = 'acme-corp', manually add users, forget invoice_line_items, run sed over emails, restore to staging. Monday morning the engineer reproduces something, but totals still don't match because half the child rows never made the trip — and the payment_methods for those users weren't masked consistently with their user records.

PrivaCI Commercial subsetting starts from a root predicate and walks foreign keys transitively: subscriptions because they reference the org, invoices because they reference subscriptions, line items because they reference invoices, and so on. The masking is deterministic from your salt — the same real value always produces the same fake within a given field, and across fields when you configure a shared seed_alias. Reruns with the same salt stay stable.

Read from the DR warm standby, not primary

SOURCE_DB_URL points at the warm standby in another AZ — the same instance you'd fail over to during an outage, streaming replication caught up to within seconds. You're not adding read load to primary, and you're not asking a DBA for a one-off export. The CI job runs inside the VPC with credentials that already exist for DR drills; no rows leave your network.

One practical note: if replication lag spikes during a bad deploy, treat it like any other DR concern and pause repro jobs until the standby catches up. You're debugging billing, not chasing ghosts from stale WAL.

Subsetting to organisation 451 (Acme)

Predicates are operator-authored SQL, not free-form input. Validate the org ID is numeric before it touches any config.

# commercial-extensions.yaml
version: "1.0"
subset:
  - table: public.organizations
    predicate: "id = 451"
  - table: public.audit_events
    predicate: "organization_id = 451 AND occurred_at >= now() - interval '90 days'"

The organizations root pulls subscriptions, invoices, line items, users, and payment methods via FK expansion automatically. audit_events has a FK to organizations too, so it would also be included — but without a root predicate of its own, the closure would pull Acme's entire event history. Declaring it as a second root lets you replace that with a bounded predicate. The organization_id = 451 condition isn't optional: a predicate with only the time bound would scope the closure to all tenants' events in that window, which is the opposite of what you want.

Masking scalars and JSONB paths

Column rules live in mask-rules.yaml; JSONB path rules live in commercial-extensions.yaml alongside subsetting.

# mask-rules.yaml (excerpt)
version: "1.0"
global_salt: env://ANONYMIZATION_SALT
tables:
  public.users:
    strategy: transform
    columns:
      email:     { action: fake, provider: email, seed_alias: user_email }
      full_name: { action: fake, provider: full_name }
  public.payment_methods:
    strategy: transform
    columns:
      provider_token: { action: hash }
      last_four:      { action: static, value: "9999" }

provider_token is hashed deterministically — the masked value is stable across reruns with the same salt, which matters if your repro spans multiple tables referencing the same token. last_four is a display-only char(4) field; there's no business logic that depends on it being a realistic digit string, so static is the right call over inventing a fake that might suggest false precision.

# commercial-extensions.yaml (json_mask section)
json_mask:
  - column: public.audit_events.payload
    paths:
      - path: $.actor.email
        action: fake
        provider: email
        seed_alias: user_email
      - path: $.actor.ip
        action: hash
      - path: $.internal.support_notes
        action: remove

Shared seed_alias: user_email on the scalar column and the JSON path keeps users.email and payload.actor.email in sync when the audit log copies the same normalized string. Alignment assumes the scalar rule is not applying uniqueness suffixes ( is_unique) — JSON paths have no equivalent — and that casing matches between column and payload ( Jane@Acme.com vs jane@acme.com diverge).

You only need that alignment when the debugging workflow compares email strings across tables. Most billing repros trace by numeric ID — invoice totals against line items via invoice_id, payment state via user_id, org scope via organization_id. If your payload stores actor.user_id, join audit events to users on that key and skip email correlation entirely. Use seed_alias when the payload carries a copied email (or username, token, etc.) with no stable FK key in the JSON.

support_notes is removed entirely: free-text fields in audit logs frequently contain customer-identifying context that a fake provider can't reliably neutralise.

Wiring it as a self-service job

Scaffold a starting workflow with privaci generate-ci, then add the dispatch input and input validation. The engineer on call fills in one field; the rest runs unattended.

# .github/workflows/support-repro-db.yml
name: support-repro-db
on:
  workflow_dispatch:
    inputs:
      organization_id:
        description: "Numeric organization id (no slug, no quotes)"
        required: true
        type: string

jobs:
  build-repro:
    runs-on: ubuntu-latest
    env:
      SOURCE_DB_URL:  ${{ secrets.DR_WARM_STANDBY_URL }}
      TARGET_DB_URL:  ${{ secrets.SUPPORT_REPRO_DB_URL }}
      ANONYMIZATION_SALT: ${{ secrets.ANONYMIZATION_SALT }}
      PRIVACI_MARKETPLACE_PRODUCT_CODE: ${{ secrets.PRIVACI_MARKETPLACE_PRODUCT_CODE }}

    steps:
      - uses: actions/checkout@v4

      - name: Validate org id
        run: |
          [[ "${{ inputs.organization_id }}" =~ ^[0-9]+$ ]] || \
            { echo "organization_id must be numeric"; exit 1; }

      - name: Render subset predicate
        run: |
          sed -i "s/id = 451/id = ${{ inputs.organization_id }}/g" \
            commercial-extensions.yaml
          sed -i "s/organization_id = 451/organization_id = ${{ inputs.organization_id }}/g" \
            commercial-extensions.yaml

      - name: Mask tenant slice
        run: |
          docker run --rm \
            -e SOURCE_DB_URL -e TARGET_DB_URL -e ANONYMIZATION_SALT \
            -e PRIVACI_MARKETPLACE_PRODUCT_CODE \
            -v "$PWD/commercial-extensions.yaml:/config/commercial-extensions.yaml:ro" \
            -v "$PWD/mask-rules.yaml:/config/mask-rules.yaml:ro" \
            ghcr.io/boundarylogic/privaci-commercial:1.0.1 \
            privaci run \
              --config /config/mask-rules.yaml \
              --commercial-extensions /config/commercial-extensions.yaml

      - name: Post connection string to Slack
        if: success()
        env:
          SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
        run: |
          curl -s -X POST https://slack.com/api/chat.postMessage \
            -H "Authorization: Bearer $SLACK_BOT_TOKEN" \
            -H "Content-Type: application/json" \
            -d '{
              "channel": "#support-repro",
              "text": "Repro DB ready for org ${{ inputs.organization_id }}. DSN in 1Password → support-repro. TTL: 48h."
            }'

A few things to note in this config: the image is the Commercial build (ghcr.io/boundarylogic/privaci-commercial:1.0.1 — pin the version tag from your Marketplace subscription in production). The sed substitution is blunt but effective for a trusted internal tool — the numeric validation above it is what makes it safe. The connection string itself goes to a private Slack channel with a 48-hour TTL reminder; the DSN lives in 1Password, not in the message. Before merging any change to mask-rules.yaml, run a policy-diff gate in PR CI with strict_autodetect: true so new columns need an explicit rule. Commercial v1.0.1+ writes the diff JSON before exiting non-zero — see the CI ephemeral guide for a full check-masking-policy job.

privaci preview --config mask-rules.yaml --policy-diff /tmp/policy-diff.json

What the engineer gets

The target database starts empty. After the job completes: something like 40,000 rows instead of 400 million. The engineer connects with whatever SQL client they use day to day. The invoice totals still disagree with the line items — good, that's the bug.

-- FK integrity on the target before handoff
SELECT COUNT(*) AS orphan_line_items
FROM   invoice_line_items li
LEFT   JOIN invoices i ON i.id = li.invoice_id
WHERE  i.id IS NULL;

-- The bug: invoice totals vs line items (trace by ID — no email needed)
SELECT i.id,
       i.total_cents,
       SUM(li.quantity * li.unit_cents) AS computed_cents
FROM   invoices i
JOIN   subscriptions s ON s.id = i.subscription_id
JOIN   invoice_line_items li ON li.invoice_id = i.id
WHERE  s.organization_id = $1
GROUP  BY i.id, i.total_cents
HAVING i.total_cents <> SUM(li.quantity * li.unit_cents);

-- Optional: email alignment when payload copies actor.email (needs seed_alias)
SELECT u.email,
       ae.payload #>> '{actor,email}' AS payload_actor_email
FROM   audit_events ae
JOIN   users u ON u.email = ae.payload #>> '{actor,email}'
WHERE  ae.organization_id = $1
LIMIT  10;

The first query should return zero — a quick FK orphan check on the target before you post "repro ready". It catches incomplete closure (a root predicate that didn't match, for example). privaci verify is built for full-database runs where source and target row counts should match; on a single-tenant slice from a multi-tenant primary, use this orphan query instead of verify as your pass/fail gate.

The second query is usually where the ticket lives: mismatched totals traced entirely through foreign keys. The third is optional — run it only when your workflow joins audit payloads to user rows by email string and you've configured matching seed_alias on both rules (see JSONB masking — seed_alias).

How long does it actually take?

For a mid-size tenant — a few thousand invoices, a year of audit events — expect 10–20 minutes end to end, most of which is the closure query walking the FK graph and the masked write. A whale tenant with dense JSONB audit history can run longer; the engine streams rather than buffers, so memory stays flat regardless. The job is slower than "query prod directly" and faster than waiting until Tuesday for a DBA export. That is the correct comparison.

What PrivaCI doesn't handle here

Provisioning the empty target. The engine expects an empty writable Postgres database and writes the masked graph into it. Spinning that up — an RDS snapshot restore, an ephemeral Cloud SQL instance, a Postgres pod in a preview namespace — is Terraform or your platform team's domain.

Teardown after TTL. Masked data accumulates if you don't enforce expiry. Auto-drop the repro database after 48 hours. This is policy, not product.

Composite FK edge cases. Multi-column foreign keys traverse normally in both directions. The one exception is pulling root-table rows referenced by a composite FK on a non-root table — that path is skipped and logged as a warning, so check job logs if a root table referenced through a composite key looks underpopulated. See the Commercial subsetting docs.

What you need

PrivaCI OSS engine — masking, streaming, verify, deterministic salt, audit log on the target database.
PrivaCI Commercial — FK-aware subsetting and JSONB path masking. Required for this workflow.
Network path — CI runner to DR standby (read) and to the empty repro target (write), both inside the VPC.
Secrets — ANONYMIZATION_SALT (32+ random bytes, treated as a production credential), DR read DSN, repro target DSN, Marketplace product code.

Commercial plans · Engine quickstart · Full-database staging mask · Ephemeral CI test database