02 Federal Civilian
Federal benefits systems that survive their first OIG review.
Federal civilian programs deliver trillions of dollars in benefits each year through decisioning systems that were not designed to be audited at the granularity OMB now requires. M-24-10, M-25-21, and the OIG community have shifted what counts as defensible. We help agencies build the engineering reality the new policy describes.
- Annual federal benefits: $3.4T+
- M-24-10 minimum practices: Multiple
- Improper-payment exposure: $200B+ / yr
01 What we keep seeing
Three failure modes that pre-date the AI conversation.
01 AI inventory is a spreadsheet, not a system
M-24-10 reads as a request for a data product. Most agency responses are documents. Compliance is technically met; capability is not.
02 Impact assessments are filed, not run
Pre-deployment evaluation is treated as a paperwork milestone. Without a runnable harness, the next model update can't be re-evaluated, and the agency has no real check on the vendor.
03 Decision reasoning is not retrievable
The system persists the decision. It does not persist the reasoning. When the FOIA request, the appeal, or the OIG audit arrives, the reasoning has to be reconstructed — which usually means it can't be.
02 How we work the seam
Specific practices. Specific outcomes. No platitudes.
Practice 01: AI inventory as a data product
Outcome: Six weeks to a queryable, governed inventory
Replace the SharePoint spreadsheet with a versioned, workflow-backed inventory mapped to underlying systems. Becomes the control plane for everything downstream — impact assessments, minimum-practice runbooks, OMB reporting.
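A minimal sketch of the difference between a document and a data product, in code. Everything here is an assumption for the example — the field names, the InventoryEntry record, and the SQLite store — not the OMB reporting schema.

```python
# Illustrative only: a versioned AI use-case inventory record, not the official schema.
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json
import sqlite3

@dataclass
class InventoryEntry:
    use_case_id: str                 # stable key referenced by assessments and runbooks
    name: str
    owning_office: str
    systems: list[str]               # underlying systems the use case touches
    rights_impacting: bool           # drives which minimum practices apply
    minimum_practices: list[str]     # e.g. ["pre-deployment-eval", "ongoing-monitoring"]
    status: str = "operational"      # or "planned", "retired"
    recorded_at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

def record(conn: sqlite3.Connection, entry: InventoryEntry) -> None:
    """Append a new version of the entry; prior versions are never overwritten."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS inventory (use_case_id TEXT, recorded_at TEXT, payload TEXT)"
    )
    conn.execute(
        "INSERT INTO inventory VALUES (?, ?, ?)",
        (entry.use_case_id, entry.recorded_at, json.dumps(asdict(entry))),
    )
    conn.commit()
```

Because it is queryable and versioned, the same store can answer OMB reporting, schedule impact assessments, and feed the runbook framework described below.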
Practice 02: Impact-assessment harness
Outcome: Eight weeks to one passing harness; reusable across systems
Build a runnable evaluation suite for one priority rights-impacting use case. Demographic-group breakouts, distribution-shift tests, fairness criteria named explicitly. Becomes the template for every subsequent assessment.
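A minimal sketch of one re-runnable check, assuming scored decisions are available offline. The four-fifths-style selection-rate threshold is one illustrative fairness criterion, not a mandated standard; group labels and function names are placeholders.

```python
# Illustrative harness check: approval-rate parity across demographic groups.
# Threshold and group labels are placeholders; real criteria come from the program office.
from collections import defaultdict

MIN_SELECTION_RATE_RATIO = 0.80  # illustrative four-fifths-style threshold

def selection_rates(decisions: list[dict]) -> dict[str, float]:
    """decisions: [{"group": "A", "approved": True}, ...] -> approval rate per group."""
    totals, approved = defaultdict(int), defaultdict(int)
    for d in decisions:
        totals[d["group"]] += 1
        approved[d["group"]] += int(d["approved"])
    return {g: approved[g] / totals[g] for g in totals}

def test_group_parity(decisions: list[dict]) -> None:
    rates = selection_rates(decisions)
    best = max(rates.values())
    assert best > 0, f"No approvals in any group: {rates}"
    worst_ratio = min(r / best for r in rates.values())
    assert worst_ratio >= MIN_SELECTION_RATE_RATIO, (
        f"Selection-rate ratio {worst_ratio:.2f} below {MIN_SELECTION_RATE_RATIO}: {rates}"
    )
```

The point is that the same test runs against any scored decision batch, so the agency can re-run it the day a vendor ships a model update.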
Practice 03: Minimum-practice runbook framework
Outcome: Each minimum practice has an on-call + SLA + remediation path
For each minimum practice that applies to the system, define detector, on-call, remediation SLA, and notification path. Without these four, a minimum practice is a wish — not a practice.
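The same four fields, sketched as declarative config so a missing one is machine-checkable. Practice names, detector names, addresses, and SLA values are illustrative, not prescribed by M-24-10.

```python
# Illustrative runbook entries: one record per applicable minimum practice.
# A practice missing any of the four fields fails validation instead of silently becoming a wish.
RUNBOOK = {
    "ongoing-monitoring": {
        "detector": "drift_check_daily",           # job or alert that notices the violation
        "on_call": "benefits-ml-oncall@agency.gov",
        "remediation_sla_hours": 72,
        "notification_path": ["program-office", "CAIO", "OIG-liaison"],
    },
    "pre-deployment-eval": {
        "detector": "harness_gate_in_ci",
        "on_call": "model-release-manager@agency.gov",
        "remediation_sla_hours": 24,
        "notification_path": ["program-office", "CAIO"],
    },
}

REQUIRED_FIELDS = {"detector", "on_call", "remediation_sla_hours", "notification_path"}

def validate(runbook: dict) -> list[str]:
    """Return the practices missing any of the four required fields."""
    return [p for p, spec in runbook.items() if not REQUIRED_FIELDS <= spec.keys()]
```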
Practice 04: Decision-provenance retrofit
Outcome: Per-decision reconstructibility for OIG defense
Add append-only event-log capture for every decision touchpoint, with content-addressed pointers to features, models, prompts, and retrieved documents. The bar: any decision is replayable with the same inputs, on demand.
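A minimal sketch of that capture path, assuming the underlying artifacts are stored elsewhere and only their hashes travel with the event. File layout and field names are illustrative.

```python
# Illustrative append-only provenance log: one JSON line per decision,
# with content-addressed (SHA-256) pointers to the inputs that bound it.
import hashlib
import json
from datetime import datetime, timezone

def content_address(payload: bytes) -> str:
    return "sha256:" + hashlib.sha256(payload).hexdigest()

def log_decision(log_path: str, decision_id: str, outcome: str,
                 features: bytes, model_artifact: bytes,
                 prompt: bytes, retrieved_docs: list[bytes]) -> None:
    event = {
        "decision_id": decision_id,
        "outcome": outcome,
        "decided_at": datetime.now(timezone.utc).isoformat(),
        "features": content_address(features),
        "model_artifact": content_address(model_artifact),
        "prompt": content_address(prompt),
        "retrieved_docs": [content_address(d) for d in retrieved_docs],
    }
    with open(log_path, "a", encoding="utf-8") as f:   # append-only by convention
        f.write(json.dumps(event) + "\n")
```

Replay then means resolving each hash back to the stored artifact and re-running the same model version; the log proves exactly which inputs to fetch.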
03 Market scale
The size, growth, and obtainable share of this market.
Every benefits-decisioning segment shares the same pattern: program dollars in the hundreds of billions, services spend in the single-digit billions, and modernization and detection investment that compounds across the program spend.
- TAM: $3.4T (program dollars in scope)
- SAM: $22.0B (annual services spend)
- 5-yr CAGR: 8.5% (rolling SAM growth)
- 3-yr SOM: $320M (Vardr-obtainable)
Vardr captures a small but compounding share of the addressable services spend; the underlying program dollars dwarf the modernization budget.
Sources: OMB historical tables + GAO improper-payment series. SAM = federal civilian IT modernization + advisory (NAICS 541511/512/611/690). SOM = 3-yr Vardr-obtainable.
5-year SAM growth, Federal Civilian: $22.0B → $33.1B
04 Programs we focus on
Federal civilian benefits and decisioning systems.
USDA Food and Nutrition Service / state pass-through
SNAP modernization
Eligibility, intake, recertification automation; benefits-issuance fraud signal at intake.
DOL / state UI agency
Unemployment Insurance integrity
Synthetic-identity defense, fictitious-employer detection, post-PUA program-integrity rebuilds.
Department of Veterans Affairs
VA claims processing
Adjudication copilots, evidence retrieval, appeal-overturn analytics.
Department of the Treasury / IRS
IRS refund-fraud defense
Refund-fraud rings, false-filing detection, identity-protection-PIN workflow integration.
Social Security Administration
SSA disability decisioning
DDS-level decisioning support, medical-evidence retrieval, citation-grounded adjudication.
05 Questions worth asking
Open these with us — or with anyone else.
We bring these to every program-office conversation. Use them whether or not we end up working together.
- 01. Is your AI use-case inventory a data product or a document?
- 02. If a model update shipped tomorrow, could the agency re-run the impact assessment without the vendor?
- 03. For each minimum practice that applies to your system: who is paged when it's violated?
- 04. Can you reconstruct, for a specific decision from 18 months ago, the exact features, model artifact, and policy text that bound it?
06 Related insights
Working in federal civilian and want our take?
45-minute principal-level briefing. Bring the program, the constraint, the deadline.