Reading the OMB AI memos as an engineer: what M-24-10 and its successors actually require you to build
Most agency responses to the OMB AI memos are policy documents. The memos quietly demand specific engineering artifacts. Here is what they are.
By Frank Speiser · March 25, 2026
The OMB memos on AI use across federal agencies — beginning with M-24-10 — have generated a great deal of paperwork. Most of that paperwork is downstream of treating the memos as a policy compliance exercise. Read closely, the memos require something more interesting: specific engineering artifacts, with specific properties, that most agencies do not currently produce.
We read these documents as engineers. Below is what we believe each memo actually requires you to build, separately from what it requires you to write down.
The AI inventory is a data product, not a spreadsheet
Most agencies treat the AI use-case inventory as a list of items in a SharePoint spreadsheet. The memo's actual requirements — periodic update, versioning, public reporting, role-segregated visibility — describe a data product. A correctly built inventory comprises:
- A schema with versioned entries.
- A workflow for proposing, reviewing, and publishing entries.
- An audit trail of who changed what and when.
- A mapping from inventory entries to the underlying systems they describe, with that mapping itself versioned.
A spreadsheet meets none of these requirements at scale. In our experience, building the inventory as a data product takes roughly the same effort as building it as a spreadsheet that no one ever returns to.
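To make this concrete, here is a minimal sketch of a versioned inventory entry. The `InventoryEntry` model, the field names, and the review states are our assumptions about a reasonable shape, not anything M-24-10 prescribes.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from enum import Enum


class ReviewStatus(Enum):
    PROPOSED = "proposed"
    IN_REVIEW = "in_review"
    PUBLISHED = "published"


@dataclass(frozen=True)
class InventoryEntry:
    """One versioned record in the AI use-case inventory (illustrative schema)."""
    use_case_id: str             # stable identifier; survives revisions
    version: int                 # incremented on every published change
    title: str
    rights_impacting: bool
    safety_impacting: bool
    system_ids: tuple[str, ...]  # mapping to underlying systems, versioned with the entry
    status: ReviewStatus
    changed_by: str              # audit trail: who
    changed_at: datetime         # audit trail: when


def publish_revision(prev: InventoryEntry, editor: str, **changes) -> InventoryEntry:
    """Create the next immutable version of an entry; prior versions are retained."""
    return replace(
        prev,
        version=prev.version + 1,
        status=ReviewStatus.PUBLISHED,
        changed_by=editor,
        changed_at=datetime.now(timezone.utc),
        **changes,
    )
```

Because entries are immutable and every revision carries its editor and timestamp, the audit trail and the versioned system mapping fall out of the data model rather than being bolted on afterward.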
Impact assessments are tests, not documents
The required pre-deployment impact assessment is sometimes filed as a Word document. Read as engineering work, the assessment is a set of tests:
- A baseline performance test against the population the system will operate on, broken out by the demographic groups the memo requires.
- A robustness test against the kinds of distribution shift the system will see in production.
- A fairness test with named criteria — equality of opportunity, calibration, or equalized odds, depending on the use case.
- A test of the human-review pathway, run with realistic claim volumes, not toy data.
A document is a description of these tests. A passing assessment is the evidence of having run them. Agencies that conflate the two ship systems that fail their first real review.
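To make the distinction concrete: the baseline and fairness items above can be ordinary automated checks. Below is an equality-of-opportunity style sketch; the thresholds and the array layout are illustrative assumptions, not numbers from the memos.

```python
import numpy as np

MIN_RECALL = 0.90      # illustrative floor; agencies set their own
MAX_RECALL_GAP = 0.05  # illustrative max recall gap between any two groups


def check_recall_by_group(preds: np.ndarray,
                          labels: np.ndarray,
                          groups: np.ndarray) -> dict:
    """Equality-of-opportunity check: recall must clear a floor and must
    not diverge across the demographic groups the memo requires you to
    break out. `preds` and `labels` are 0/1 arrays; `groups` holds a
    group label per record.
    """
    recalls = {}
    for group in np.unique(groups):
        positives = (groups == group) & (labels == 1)
        recalls[str(group)] = float(preds[positives].mean())
    assert min(recalls.values()) >= MIN_RECALL, f"recall floor violated: {recalls}"
    assert max(recalls.values()) - min(recalls.values()) <= MAX_RECALL_GAP, (
        f"recall gap across groups too wide: {recalls}"
    )
    return recalls
```

Run against a frozen, representative evaluation sample before deployment, this is evidence. Filed as prose, it is a description.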
Minimum practices are runbooks
The memos enumerate minimum practices for rights-impacting and safety-impacting AI. These are sometimes copied verbatim into policy. The version that survives contact with a deployed system is a runbook:
- For each minimum practice, who is on-call when it is violated.
- How the violation is detected.
- The remediation SLA.
- The notification path to affected individuals if the system has already produced a determination.
A minimum practice without an on-call, a detector, an SLA, and a notification path is not a practice. It is a wish.
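One way to keep a minimum practice from decaying into a wish is to encode those four operational properties as data that paging and reporting tools can read. Everything named below, including the rotation and the alert, is a hypothetical example, not a prescribed format.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class MinimumPracticeRunbook:
    """Operational wrapper for one minimum practice: the memos enumerate
    the practices; these four fields are what make each one enforceable."""
    practice: str            # the minimum practice, stated as an invariant
    oncall_rotation: str     # who is paged when the invariant is violated
    detector: str            # the monitor that detects the violation
    remediation_sla: timedelta
    notification_path: str   # how affected individuals are told, if a
                             # determination has already gone out


HUMAN_REVIEW = MinimumPracticeRunbook(
    practice="no adverse determination without a human reviewer sign-off",
    oncall_rotation="adjudication-platform-oncall",  # hypothetical rotation
    detector="alert on determinations issued with no reviewer ID attached",
    remediation_sla=timedelta(hours=24),
    notification_path="written notice via the existing appeals channel",
)
```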
What we recommend agencies build, in order
In the engagements where we help agencies move from policy compliance to engineering reality, the order is consistent:
- The inventory as a data product. Six weeks. Establishes the universe of work.
- The impact-assessment harness for one priority use case. Eight weeks. Reusable for every subsequent use case.
- The minimum-practice runbook framework. Four weeks. Templatable across use cases.
- The evaluation re-run cadence. Ongoing. The artifact that distinguishes an agency that wrote a policy from an agency that operates one (a sketch follows this list).
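A sketch of what that cadence can look like, assuming the assessment harness exposes a single entry point like the `check_recall_by_group` function sketched earlier; the scheduler, data loader, and notifier are all assumptions.

```python
import logging
from datetime import date

log = logging.getLogger("ai-eval-cadence")


def rerun_assessment(run_checks, load_eval_data, page_owner) -> bool:
    """Re-run the impact-assessment harness on fresh production data.
    Meant to be invoked on a fixed schedule by the agency's existing job
    runner; all three callables are injected so the cadence stays
    decoupled from any one use case.
    """
    preds, labels, groups = load_eval_data()
    try:
        results = run_checks(preds, labels, groups)
    except AssertionError as regression:
        # A failed re-run is an operational event, not a paperwork event:
        # it pages the owning team and starts the remediation SLA clock.
        log.error("assessment regression on %s: %s", date.today(), regression)
        page_owner(str(regression))
        return False
    log.info("assessment passed on %s: %s", date.today(), results)
    return True
```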
We are not the right partner for an agency that needs help writing the documents. We are a good partner for an agency that has the documents and is ready to build the engineering reality the documents describe.
If this resonates with a program you're working on, we'd be glad to talk.