Correctness sandbox

Before you trust us to run payroll, test whether a payroll answer can stand up to scrutiny.

The sandbox is not free payroll. It is a correctness-first workspace for cleanup, switch reconstruction, correction questions, notice evidence, advisor review, and agent-readable payroll packets.

Bring a payroll answer that needs to be reconstructed or defended: a wrong check, late fact, provider switch, tax notice, CPA question, or agent workflow.

Test a scenario

See a sample packet

Sandbox packet: Correctness record

Input: Hours, wages, dates

Context: Taxes, deductions, YTD

Review: Missing evidence

Output: Explain, correct, verify

Facts, Rules, Deltas, Seal

What you can test.

Start with a small payroll scenario. The goal is to show what a defensible record would need, not to run live payroll or replace tax advice.

Input

Payroll facts

Employee wages, hours, pay period dates, gross-to-net details, taxes, benefits, deductions, and YTD context.

Input

Messy moments

Wrong checks, late raises, mid-year switches, local-tax questions, notice issues, and provider export gaps.

Output

Correctness record

What changed, what was calculated, what assumptions were used, and which rule/content version applied.

Output

Evidence checklist

What needs human review, whether the record is internally consistent, and what evidence would defend or correct it.
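As a rough illustration, the inputs and outputs above could be modeled as plain data types. This is a minimal sketch, not the sandbox's actual schema: every class and field name is hypothetical, and the consistency check assumes the simplest case, where net pay equals gross pay minus taxes and deductions.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Dict, List


@dataclass(frozen=True)
class PayrollFacts:
    """Input side. Field names are hypothetical, not a published schema."""
    employee: str
    period_start: str  # ISO date, e.g. "2024-01-01"
    period_end: str
    hours: Decimal
    gross: Decimal
    net: Decimal
    taxes: Dict[str, Decimal] = field(default_factory=dict)
    deductions: Dict[str, Decimal] = field(default_factory=dict)


@dataclass(frozen=True)
class CorrectnessRecord:
    """Output side: what changed, what was calculated, and under which rules."""
    what_changed: List[str]
    calculated: Dict[str, Decimal]
    assumptions: List[str]
    rule_content_version: str


@dataclass(frozen=True)
class EvidenceChecklist:
    """Output side: review items and the evidence still needed."""
    needs_human_review: List[str]
    internally_consistent: bool
    evidence_needed: List[str]


def is_internally_consistent(facts: PayrollFacts) -> bool:
    # Assumed rule for this sketch: gross minus all withholdings equals net.
    withheld = sum(facts.taxes.values(), Decimal("0")) + sum(
        facts.deductions.values(), Decimal("0")
    )
    return facts.gross - withheld == facts.net
```

Amounts are held as `Decimal` rather than floats so that cent-level comparisons stay exact.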

See the packet before sending a case.

Marlow Goods is a test-backed demo sample showing a late rate change, a derived correction, and a notice reconciliation.

Marlow Goods proof packet

The original run stays filed. The later rate fact creates a $58.50 gross correction. The notice claim reconciles to $15.14, and the residual is $0.00.
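The residual arithmetic can be sketched in a few lines. This assumes that "residual" means the notice's claimed amount minus the amount the record can explain, which the sample does not define in detail. Only the $15.14 and $0.00 figures come from the sample; the function names are hypothetical. Amounts are kept as integer cents to avoid floating-point drift.

```python
from typing import List


def residual_cents(claim_cents: int, explained_cents: List[int]) -> int:
    """Claimed amount minus everything the record can account for."""
    return claim_cents - sum(explained_cents)


def format_usd(cents: int) -> str:
    """Render integer cents as a dollar string, e.g. 1514 -> "$15.14"."""
    sign = "-" if cents < 0 else ""
    dollars, rem = divmod(abs(cents), 100)
    return f"{sign}${dollars}.{rem:02d}"


# Sample figures: the notice claim reconciles to $15.14 in full.
residual = residual_cents(1514, [1514])
print(format_usd(residual))  # $0.00
```

A non-zero residual would be the part of the claim that still needs evidence or a correction.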

Open sample packet

Why this helps quantify the market.

Government stats can size the payroll compliance surface. The sandbox measures the correctness and reconstruction pain inside that surface.

Start with public evidence.

Use IRS, BLS, Census/SUSB, SBA, state DOR, workforce-agency, and practitioner data to size employers, payroll frequency, filings, notices, penalties, and compliance volume.

Separate correctness pain.

Measure the narrower question: how often payroll records cannot answer what changed, which rule applied, what evidence exists, and what correction is required.

Track dollars and time.

For each case, capture hours already spent, dollars at stake, penalties or fees, advisor effort, missing evidence, and whether the packet is worth paying for.

The discipline

Public stats prove payroll compliance is large and consequential. They do not prove buyers will pay for Runbook. The sandbox has to show that our record model changes outcomes.

Agent-safe by design.

Agents should be able to inspect and prepare payroll records. They should not calculate payroll math, approve payroll, move money, or file taxes.

Validate packet

Check whether the supplied payroll facts are structurally complete enough to review.

Explain result

Trace a number back to source facts, rule content, assumptions, and calculation output.

Detect gaps

Identify missing evidence before a human sends a packet to a CPA, agency, worker, or provider.

Draft response

Prepare notice and correction drafts for human review without recording payroll facts.

Compare runs

Show what changed across payroll versions, corrections, or provider imports.

Verify artifact

Check a Quittance or sample proof packet without trusting a dashboard screenshot.
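Two of these operations, validating a packet and comparing runs, are simple enough to sketch. This is an illustrative sketch under stated assumptions, not the sandbox's API: the required field list, the function names, and the sample values are all hypothetical. Both functions only inspect records; neither calculates pay, approves anything, or writes payroll facts.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical minimum set of facts a packet needs before review.
REQUIRED_FIELDS = ("employee", "period_start", "period_end", "hours", "gross")


def validate_packet(packet: Dict[str, Any]) -> List[str]:
    """Return the required fields that are missing or empty."""
    return [name for name in REQUIRED_FIELDS if packet.get(name) in (None, "")]


def compare_runs(
    original: Dict[str, Any], later: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {field: (old, new)} for every field whose value differs."""
    keys = sorted(set(original) | set(later))
    return {
        key: (original.get(key), later.get(key))
        for key in keys
        if original.get(key) != later.get(key)
    }


# Made-up example values, stored as strings so no arithmetic happens here.
run_1 = {"employee": "E-1", "period_start": "2024-01-01",
         "period_end": "2024-01-14", "hours": "80", "gross": "1600.00"}
run_2 = dict(run_1, gross="1640.00")

print(validate_packet({"employee": "E-1", "hours": ""}))
print(compare_runs(run_1, run_2))
```

Returning gaps and deltas as data, rather than acting on them, leaves the decision about what to correct with a human reviewer.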

Test the record before trusting the provider.

Use early access to bring a payroll scenario that needs explanation, correction, or evidence.

Test a scenario