02 Federal Civilian
Federal benefits systems that survive their first OIG review.
Federal civilian programs deliver trillions of dollars in benefits each year through decisioning systems that were not designed to be audited at the granularity OMB now requires. M-24-10, M-25-21, and the OIG community have shifted what counts as defensible. We help agencies build the engineering reality the new policy describes.
- Annual federal benefits: $3.4T+
- M-24-10 minimum practices: Multiple
- Improper-payment exposure: $200B+ / yr
01 What we keep seeing
Three failure modes that pre-date the AI conversation.
01 AI inventory is a spreadsheet, not a system
M-24-10 reads as a request for a data product. Most agency responses are documents. Compliance is technically met; capability is not.
02 Impact assessments are filed, not run
Pre-deployment evaluation is treated as a paperwork milestone. Without a runnable harness, the next model update can't be re-evaluated, and the agency has no real check on the vendor.
03 Decision reasoning is not retrievable
The system persists the decision. It does not persist the reasoning. When the FOIA request, the appeal, or the OIG audit arrives, the reasoning has to be reconstructed — which usually means it can't be.
02 How we work the seam
Specific practices. Specific outcomes. No platitudes.
Practice 01: AI inventory as a data product
Outcome: Six weeks to a queryable, governed inventory
Replace the SharePoint spreadsheet with a versioned, workflow-backed inventory mapped to underlying systems. Becomes the control plane for everything downstream — impact assessments, minimum-practice runbooks, OMB reporting.
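A minimal sketch of the difference between a document and a data product, in code. Everything here is an assumption for the example — the field names, the InventoryEntry record, and the SQLite store — not the OMB reporting schema.

```python
# Illustrative only: a versioned AI use-case inventory record, not the official schema.
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json
import sqlite3

@dataclass
class InventoryEntry:
    use_case_id: str                 # stable key referenced by assessments and runbooks
    name: str
    owning_office: str
    systems: list[str]               # underlying systems the use case touches
    rights_impacting: bool           # drives which minimum practices apply
    minimum_practices: list[str]     # e.g. ["pre-deployment-eval", "ongoing-monitoring"]
    status: str = "operational"      # or "planned", "retired"
    recorded_at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

def record(conn: sqlite3.Connection, entry: InventoryEntry) -> None:
    """Append a new version of the entry; prior versions are never overwritten."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS inventory (use_case_id TEXT, recorded_at TEXT, payload TEXT)"
    )
    conn.execute(
        "INSERT INTO inventory VALUES (?, ?, ?)",
        (entry.use_case_id, entry.recorded_at, json.dumps(asdict(entry))),
    )
    conn.commit()
```

Because it is queryable and versioned, the same store can answer OMB reporting, schedule impact assessments, and feed the runbook framework described below.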
Practice 02: Impact-assessment harness
Outcome: Eight weeks to one passing harness; reusable across systems
Build a runnable evaluation suite for one priority rights-impacting use case. Demographic-group breakouts, distribution-shift tests, fairness criteria named explicitly. Becomes the template for every subsequent assessment.
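A minimal sketch of one re-runnable check, assuming scored decisions are available offline. The four-fifths-style selection-rate threshold is one illustrative fairness criterion, not a mandated standard; group labels and function names are placeholders.

```python
# Illustrative harness check: approval-rate parity across demographic groups.
# Threshold and group labels are placeholders; real criteria come from the program office.
from collections import defaultdict

MIN_SELECTION_RATE_RATIO = 0.80  # illustrative four-fifths-style threshold

def selection_rates(decisions: list[dict]) -> dict[str, float]:
    """decisions: [{"group": "A", "approved": True}, ...] -> approval rate per group."""
    totals, approved = defaultdict(int), defaultdict(int)
    for d in decisions:
        totals[d["group"]] += 1
        approved[d["group"]] += int(d["approved"])
    return {g: approved[g] / totals[g] for g in totals}

def test_group_parity(decisions: list[dict]) -> None:
    rates = selection_rates(decisions)
    best = max(rates.values())
    assert best > 0, f"No approvals in any group: {rates}"
    worst_ratio = min(r / best for r in rates.values())
    assert worst_ratio >= MIN_SELECTION_RATE_RATIO, (
        f"Selection-rate ratio {worst_ratio:.2f} below {MIN_SELECTION_RATE_RATIO}: {rates}"
    )
```

The point is that the same test runs against any scored decision batch, so the agency can re-run it the day a vendor ships a model update.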
Practice 03: Minimum-practice runbook framework
Outcome: Each minimum practice has an on-call + SLA + remediation path
For each minimum practice that applies to the system, define detector, on-call, remediation SLA, and notification path. Without these four, a minimum practice is a wish — not a practice.
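The same four fields, sketched as declarative config so a missing one is machine-checkable. Practice names, detector names, addresses, and SLA values are illustrative, not prescribed by M-24-10.

```python
# Illustrative runbook entries: one record per applicable minimum practice.
# A practice missing any of the four fields fails validation instead of silently becoming a wish.
RUNBOOK = {
    "ongoing-monitoring": {
        "detector": "drift_check_daily",           # job or alert that notices the violation
        "on_call": "benefits-ml-oncall@agency.gov",
        "remediation_sla_hours": 72,
        "notification_path": ["program-office", "CAIO", "OIG-liaison"],
    },
    "pre-deployment-eval": {
        "detector": "harness_gate_in_ci",
        "on_call": "model-release-manager@agency.gov",
        "remediation_sla_hours": 24,
        "notification_path": ["program-office", "CAIO"],
    },
}

REQUIRED_FIELDS = {"detector", "on_call", "remediation_sla_hours", "notification_path"}

def validate(runbook: dict) -> list[str]:
    """Return the practices missing any of the four required fields."""
    return [p for p, spec in runbook.items() if not REQUIRED_FIELDS <= spec.keys()]
```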
Practice 04: Decision-provenance retrofit
Outcome: Per-decision reconstructibility for OIG defense
Add append-only event-log capture for every decision touchpoint, with content-addressed pointers to features, models, prompts, and retrieved documents. The bar: any decision is replayable with the same inputs, on demand.
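A minimal sketch of that capture path, assuming the underlying artifacts are stored elsewhere and only their hashes travel with the event. File layout and field names are illustrative.

```python
# Illustrative append-only provenance log: one JSON line per decision,
# with content-addressed (SHA-256) pointers to the inputs that bound it.
import hashlib
import json
from datetime import datetime, timezone

def content_address(payload: bytes) -> str:
    return "sha256:" + hashlib.sha256(payload).hexdigest()

def log_decision(log_path: str, decision_id: str, outcome: str,
                 features: bytes, model_artifact: bytes,
                 prompt: bytes, retrieved_docs: list[bytes]) -> None:
    event = {
        "decision_id": decision_id,
        "outcome": outcome,
        "decided_at": datetime.now(timezone.utc).isoformat(),
        "features": content_address(features),
        "model_artifact": content_address(model_artifact),
        "prompt": content_address(prompt),
        "retrieved_docs": [content_address(d) for d in retrieved_docs],
    }
    with open(log_path, "a", encoding="utf-8") as f:   # append-only by convention
        f.write(json.dumps(event) + "\n")
```

Replay then means resolving each hash back to the stored artifact and re-running the same model version; the log proves exactly which inputs to fetch.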
03 Market scale
The size, growth, and obtainable share of this market.
Every benefits-decisioning segment shares the same pattern: program dollars in the hundreds of billions, services spend in the single-digit billions, and modernization and detection investment that compounds across the program spend.
- TAM: $3.4T (program dollars in scope)
- SAM: $22.0B (annual services spend)
- 5-yr CAGR: 8.5% (rolling SAM growth)
- 3-yr SOM: $320M (Vardr-obtainable)
Vardr captures a small but compounding share of the addressable services spend; the underlying program dollars dwarf the modernization budget.
Sources: OMB historical tables + GAO improper-payment series. SAM = federal civilian IT modernization + advisory (NAICS 541511/512/611/690). SOM = 3-yr Vardr-obtainable.
5-year SAM growth, Federal Civilian: $22.0B → $33.1B
04 Programs we focus on
Federal civilian benefits and decisioning systems.
USDA Food and Nutrition Service / state pass-through
SNAP modernization
Eligibility, intake, recertification automation; benefits-issuance fraud signal at intake.
DOL / state UI agency
Unemployment Insurance integrity
Synthetic-identity defense, fictitious-employer detection, post-PUA program-integrity rebuilds.
Department of Veterans Affairs
VA claims processing
Adjudication copilots, evidence retrieval, appeal-overturn analytics.
Department of the Treasury / IRS
IRS refund-fraud defense
Refund-fraud rings, false-filing detection, identity-protection-PIN workflow integration.
Social Security Administration
SSA disability decisioning
DDS-level decisioning support, medical-evidence retrieval, citation-grounded adjudication.
05 Questions worth asking
Open these with us — or with anyone else.
We bring these to every program-office conversation. Use them whether or not we end up working together.
- 01. Is your AI use-case inventory a data product or a document?
- 02. If a model update shipped tomorrow, could the agency re-run the impact assessment without the vendor?
- 03. For each minimum practice that applies to your system: who is paged when it's violated?
- 04. Can you reconstruct, for a specific decision from 18 months ago, the exact features, model artifact, and policy text that bound it?
06 Related insights
Working in federal civilian and want our take?
45-minute principal-level briefing. Bring the program, the constraint, the deadline.