
Agent message provenance: a field guide in eleven signals

A client called the morning after the Pentagon AI-propaganda story broke. How does her vendor tell our agent's invoices from a fake? These are the eleven signals we attach.

Jacob Molkenboer · Founder · A Brand New Company · 6 Jun 2026 · 9 min


A client called us the morning after the Pentagon AI-propaganda story hit Hacker News. She runs accounts payable for a wholesaler in Rotterdam. Her question was concrete: how does the vendor receiving an email from our invoice-chasing agent verify its provenance, and tell it apart from a fabricated message claiming to come from her company?

She did not want a treatise on AI ethics. She wanted a checklist.

By the end of that week we had rewritten the message envelope every one of our fourteen production agents uses to send mail, post to Slack, or speak on the phone. Eleven provenance signals, attached to every outbound message, designed so a moderately technical recipient can confirm provenance in under thirty seconds, and a fully automated downstream system can verify it in milliseconds.

What follows is the field guide. We use it internally. You can copy it.

Why we rewrote the envelope in 48 hours

Three things changed in the last twelve months. Public agents now write on behalf of brands at meaningful volume. Voice cloning is good enough that a phone callback to the same number is no longer a verification step. And state actors are running production-scale generation pipelines, which means the cost of fabricating a convincing email from your agent has fallen to roughly zero.

Most of our clients already trust the agents we built for them. The new problem is the asymmetry. A recipient on the other end of a transactional message has no good way to establish provenance, and so no good way to distinguish a real agent's output from a fake one. The footer is the only surface available. So the footer has to do real work.

What an outbound message looks like now

Here is the actual footer block attached to messages from the invoice-agent we built for a Dutch wholesaler. Names changed.

--
Sent by:    invoice-agent v2.4.1 (deployed 2026-05-12)
Run ID:     run_01HVZQ2K3M4N5P6Q7R8S9T
Review:     auto-sent under whitelist policy AP-2026-03
Sources:    invoice/INV-2026-0411.pdf [sha256:a1b2c3...d4]
            contract/MSA-Acme-2025.pdf [sha256:e5f6a7...b8]
Confidence: high (template match + vendor on file since 2021)
Intent:     transactional, commitment-bearing (references payment)
Signature:  ed25519:Mn0pQrSt... (verify https://wholesaler.example/keys)
Reply:      reply here OR verify at https://wholesaler.example/agent/INV-0411
Scope:      to ap@vendor.example only, do not forward without re-verifying

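Eleven signals in total. The sample above shows nine of them as footer lines; tool calls and attachment content credentials do not appear in it. Each one earns its place. Below is the rationale.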
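For the automated side, a footer in this shape is easy to read back into fields. The sketch below is one possible parser, not the one we run in production: the function name parse_footer and the SAMPLE constant are illustrative, and it assumes "Key: value" lines where an indented line continues the previous key (as the second Sources line does).

```python
# Hypothetical parser for a plain-text provenance footer.
# Assumption: one "Key: value" per line; indented lines continue the previous key.

SAMPLE = """\
--
Sent by:    invoice-agent v2.4.1 (deployed 2026-05-12)
Run ID:     run_01HVZQ2K3M4N5P6Q7R8S9T
Review:     auto-sent under whitelist policy AP-2026-03
Sources:    invoice/INV-2026-0411.pdf [sha256:a1b2c3...d4]
            contract/MSA-Acme-2025.pdf [sha256:e5f6a7...b8]
Intent:     transactional, commitment-bearing (references payment)
"""


def parse_footer(text: str) -> dict[str, list[str]]:
    """Return a mapping of field name to the list of values found for it."""
    fields: dict[str, list[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line or line == "--":
            continue  # skip blank lines and the footer delimiter
        if raw[0].isspace() and current is not None:
            # Indented line: another value for the field we are already in.
            fields[current].append(line.strip())
            continue
        # Split on the first colon only, so URLs and "ed25519:..." stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a field line
        current = key.strip()
        fields.setdefault(current, []).append(value.strip())
    return fields


if __name__ == "__main__":
    for name, values in parse_footer(SAMPLE).items():
        print(name, "->", values)
```

Every value is kept as a list so that multi-line fields such as Sources do not need a special case.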

The eleven provenance signals

1. Sender pedigree

Agent name, version, deploy date. Not the underlying model family. That changes too often and is rarely useful to a recipient. The pedigree tells you which of our agents wrote this, and lets a recipient match against a known list of agents we told them to expect.

2. Run ID

A globally unique trace identifier. Every tool call, every prompt input, every retrieved document is tagged with the same ID inside our logs. If something goes wrong, this is the one string a customer can quote to us at 2am to get the full chain back. Use a sortable format (ULID, KSUID) so support staff can also infer time and ordering at a glance.
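To make the "sortable" point concrete, here is a minimal ULID-style generator using only the standard library. It is a sketch under assumptions: the run_ prefix mirrors the sample footer, the function names are ours for illustration, and a real deployment would more likely use a maintained ULID or KSUID library.

```python
import os
import time

# Crockford base32 alphabet used by ULID. Its characters are in ascending
# ASCII order, so fixed-length IDs sort lexicographically by time.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_run_id(now_ms=None, entropy=None):
    """Build 'run_' + 26 chars: 48-bit millisecond timestamp, then 80 random bits."""
    ts = int(time.time() * 1000) if now_ms is None else now_ms
    rnd = os.urandom(10) if entropy is None else entropy  # 10 bytes = 80 bits
    if len(rnd) != 10 or not 0 <= ts < 2**48:
        raise ValueError("need 10 entropy bytes and a 48-bit timestamp")
    value = (ts << 80) | int.from_bytes(rnd, "big")
    chars = []
    for _ in range(26):  # 26 characters of 5 bits each
        chars.append(CROCKFORD[value & 31])
        value >>= 5
    return "run_" + "".join(reversed(chars))


def run_id_timestamp_ms(run_id):
    """Recover the millisecond timestamp from the first 10 characters."""
    body = run_id[len("run_"):]
    value = 0
    for ch in body[:10]:
        value = value * 32 + CROCKFORD.index(ch)
    return value


if __name__ == "__main__":
    rid = new_run_id()
    print(rid, run_id_timestamp_ms(rid))
```

Because the timestamp sits in the leading characters, sorting IDs as plain strings orders them by creation time, which is what lets support staff read ordering at a glance.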

3. Review state

One of three values. auto-sent means no human touched the message before it left. human-reviewed means a person approved the exact final text. human-drafted means the agent only routed it. We treat this as the single most important signal for recipients. A human-reviewed message carries different weight than an auto-sent one, and recipients deserve to know which they are reading.

4. Cited sources

For any retrieval-augmented response, the documents the agent pulled. Path plus SHA-256 hash. The hash matters because it lets the recipient (or their own systems) verify the document has not been swapped underneath. Two lines of metadata turn a black-box reply into something auditable.
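Producing and checking the hash takes a few lines. The sketch below assumes the cited document is available as a local file; the helper names (sha256_of_file, source_line, source_matches) are illustrative, and the output format simply mirrors the Sources line in the sample footer.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large PDFs do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def source_line(label, path):
    """Format one footer entry, e.g. 'invoice/X.pdf [sha256:<hex>]'."""
    return f"{label} [sha256:{sha256_of_file(path)}]"


def source_matches(path, expected_hex):
    """Recipient side: does the local copy match the hash quoted in the footer?"""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A recipient can only run this check against the full digest, so an abbreviated hash in the visible footer needs the complete value to be available somewhere machine-readable.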

5. Tool calls

Side effects the agent took or proposes to take, summarised in a line. For example: "Updated invoice status to SENT. Scheduled follow-up for 2026-06-13." A reader should not have to guess what the agent did in their account.

6. Confidence band

High, medium, low. With a short reason. We resisted this one for a while because confidence scores from language models are famously unreliable. The right framing is not the model's internal logit. It is whether our policy pipeline found any reason to flag this message. Template match plus known vendor is high. Novel claim with thin retrieval is low.
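A rule-based band along these lines can be sketched as below. Everything here is an assumption made for illustration: the PolicyChecks fields, the threshold of two retrieved sources, and the reason strings are not our actual policy pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyChecks:
    """Hypothetical outputs of a policy pipeline, not model logits."""
    template_match: bool      # message follows a known template
    vendor_known: bool        # recipient is an established vendor on file
    novel_claim: bool         # message asserts something not seen before
    retrieved_sources: int    # number of documents backing the message


def confidence_band(checks: PolicyChecks) -> tuple[str, str]:
    """Return (band, short reason) from policy checks alone."""
    if checks.template_match and checks.vendor_known and not checks.novel_claim:
        return "high", "template match + vendor on file"
    # Assumed threshold: fewer than two sources counts as thin retrieval.
    if checks.novel_claim and checks.retrieved_sources < 2:
        return "low", "novel claim with thin retrieval"
    return "medium", "no policy flag, but not a known template and vendor pair"


if __name__ == "__main__":
    print(confidence_band(PolicyChecks(True, True, False, 2)))
```

The point of the design is that every band can be traced back to a named check, which a model-reported score cannot offer.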

7. Cryptographic signature

An ed25519 signature over the canonicalised message body, with the public key reachable at a stable URL on the sender's domain. This is the field that lets a fully automated downstream system verify provenance without human help. It is also the one most likely to be ignored by humans, and that is fine. It exists for machines.

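If you have done DKIM, this will feel familiar. The difference is scope. DKIM signs selected mail headers plus a hash of the body on behalf of the sending domain, so it is bound to one email as it passed through one mail server. This signs the semantic content the agent actually generated, so the same check still works when the text is posted to Slack, quoted in another email, or shown on a dashboard. See the DKIM specification (RFC 6376) for the email analogue, and apply the same canonicalisation discipline.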
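Canonicalisation is the part that tends to go wrong, so here is a minimal sketch of it. The rules chosen (Unicode NFC, LF line endings, trailing whitespace stripped, trailing blank lines dropped) are an assumption loosely modelled on DKIM's body canonicalisation, not a published scheme. The ed25519 signing step itself is left out because it needs a third-party cryptography library; it would sign the bytes this function returns.

```python
import hashlib
import unicodedata


def canonicalise_body(body: str) -> bytes:
    """Reduce a message body to one stable byte form before signing."""
    text = unicodedata.normalize("NFC", body)               # one Unicode form
    text = text.replace("\r\n", "\n").replace("\r", "\n")   # LF line endings only
    lines = [line.rstrip(" \t") for line in text.split("\n")]
    while lines and lines[-1] == "":                        # drop trailing blank lines
        lines.pop()
    return ("\n".join(lines) + "\n").encode("utf-8")


def body_digest(body: str) -> str:
    """SHA-256 of the canonical form, handy for comparing two copies."""
    return hashlib.sha256(canonicalise_body(body)).hexdigest()


if __name__ == "__main__":
    sent = "Invoice INV-2026-0411 is due.\nThank you.\n"
    received = "Invoice INV-2026-0411 is due.  \r\nThank you.\r\n\r\n"
    print(body_digest(sent) == body_digest(received))
```

Sender and verifier must apply exactly the same rules, otherwise a mail gateway that rewrites line endings will make a genuine message fail verification.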

8. Content credentials on attachments

Any image, PDF, or audio file the agent attaches carries a C2PA content credential manifest. C2PA is the open standard developed by Adobe, Microsoft, BBC, and others for cryptographically signed provenance metadata on media files. For an invoice PDF, the manifest records that the file was generated by our invoice-agent on a given date from a given template. A spoofed invoice will not carry a valid manifest, and a downstream verifier can refuse to act on unsigned attachments.

9. Intent tag

One of: informational, transactional, commitment-bearing. The third is the dangerous one. Any message that creates an obligation (a payment, a contract, a deadline) gets flagged so both the recipient and their downstream automation know to treat it with care. Most phishing attempts dress themselves as commitment-bearing. Tagging your own legitimate ones makes the false ones easier to spot by contrast.
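On the receiving side, downstream automation can treat the tag as a routing decision. The sketch below is hypothetical: the Intent enum, the parsing of a trailing parenthesised reason, and the rule that commitment-bearing messages need a verified signature are all assumptions. It accepts a comma-separated list because the sample footer carries two tags on one line.

```python
from enum import Enum


class Intent(Enum):
    INFORMATIONAL = "informational"
    TRANSACTIONAL = "transactional"
    COMMITMENT_BEARING = "commitment-bearing"


def parse_intents(field: str) -> set[Intent]:
    """Parse e.g. 'transactional, commitment-bearing (references payment)'."""
    tags_part = field.split("(", 1)[0]  # ignore the free-text reason
    # Intent(...) raises ValueError on an unknown tag, so the check fails closed.
    return {Intent(tag.strip()) for tag in tags_part.split(",") if tag.strip()}


def needs_manual_check(field: str, signature_verified: bool) -> bool:
    """Assumed policy: never auto-process an unverified commitment."""
    return Intent.COMMITMENT_BEARING in parse_intents(field) and not signature_verified


if __name__ == "__main__":
    tag = "transactional, commitment-bearing (references payment)"
    print(needs_manual_check(tag, signature_verified=False))
```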

10. Reply-channel verification

A URL on a known domain where the recipient can re-verify the message. Not a "click to confirm" link. Those train people to click links in emails, which is the original sin. A link to a dashboard the recipient can navigate to independently, where the same message appears with the same Run ID. If the URL in the email and the URL the recipient already knows do not match, something is wrong.
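The "do the URLs match" check is worth automating, because lookalike hosts are easy to miss by eye. A minimal sketch, assuming the recipient keeps its own list of hosts it was told to expect (the function name and the known_hosts set are illustrative):

```python
from urllib.parse import urlsplit


def reply_url_trusted(url: str, known_hosts: set[str]) -> bool:
    """Accept only https URLs whose exact host is already known to the recipient."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return (
        parts.scheme == "https"
        and parts.username is None   # reject 'https://trusted@evil.test/' tricks
        and host in known_hosts      # exact match, not a suffix or substring test
    )


if __name__ == "__main__":
    known = {"wholesaler.example"}
    print(reply_url_trusted("https://wholesaler.example/agent/INV-0411", known))
    print(reply_url_trusted("https://wholesaler.example.evil.test/agent", known))
```

Comparing the parsed host exactly, rather than searching the URL string for the expected domain, is what defeats the lookalike in the second call.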

11. Recipient scope

Who the message is intended for, and whether it is safe to forward. Most transactional agent messages are addressed to a specific role at a specific company. If someone forwards it, the new recipient should at minimum re-verify. This signal is informational for humans and enforceable by downstream automation that respects it.
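Enforcement by downstream automation can be as small as comparing the address a message actually arrived at with the addresses named in the scope line. The sketch below is an assumption about how a recipient's system might do that; the function names are illustrative and the address pattern is deliberately loose, not a full RFC 5322 parser.

```python
import re

# Loose pattern for illustration only: local part, '@', domain.
ADDRESS = re.compile(r"[\w.+-]+@[\w.-]+")


def scoped_recipients(scope_field: str) -> set[str]:
    """Extract the addresses named in a scope line, lowercased."""
    return {m.rstrip(".").lower() for m in ADDRESS.findall(scope_field)}


def in_scope(scope_field: str, delivered_to: str) -> bool:
    """True if the mailbox that received the message is named in the scope."""
    return delivered_to.lower() in scoped_recipients(scope_field)


if __name__ == "__main__":
    scope = "to ap@vendor.example only, do not forward without re-verifying"
    print(in_scope(scope, "ap@vendor.example"))
    print(in_scope(scope, "cfo@vendor.example"))
```

A message that fails this check is not necessarily fake. It has been forwarded, which is exactly the case where the new recipient should re-verify.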

Takeaway

The footer is not a disclaimer. It is a contract surface. Eleven fields, each one answering a specific question a recipient (or their downstream system) would ask if they suspected the message was fake.

What it costs

The honest answer: more than we expected, and less than you would think.

The cost in tokens is small. The footer adds about 180 tokens to every outbound message. At our current volume that is a rounding error.

The cost in deliverability is real. The first day we shipped the new footer for the Rotterdam wholesaler, inbox placement dropped, because spam filters do not love long structured footers from previously-quiet sender domains. We moved most of the metadata into a structured header (analogous to List-Unsubscribe) and kept the human-readable footer to two lines. Inbox placement recovered within 36 hours.
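The header-plus-short-footer split can be sketched with the standard library's email package. The header name X-Agent-Provenance and the key=value layout are assumptions made for this example, not a registered header or the exact format we ship.

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, body, provenance, visible_footer):
    """Put the full metadata in one header and keep the visible footer short."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # Hypothetical header carrying the machine-readable signals.
    msg["X-Agent-Provenance"] = "; ".join(f"{k}={v}" for k, v in provenance.items())
    msg.set_content(f"{body}\n\n--\n{visible_footer}\n")
    return msg


if __name__ == "__main__":
    msg = build_message(
        "agent@wholesaler.example",
        "ap@vendor.example",
        "Invoice INV-2026-0411",
        "Please find the invoice details below.",
        {"agent": "invoice-agent v2.4.1", "run": "run_01HVZQ2K3M4N5P6Q7R8S9T",
         "review": "auto-sent", "intent": "commitment-bearing"},
        "Sent by: invoice-agent v2.4.1 | Run ID: run_01HVZQ2K3M4N5P6Q7R8S9T\n"
        "Review: auto-sent",
    )
    print(msg.as_string())
```

Humans see two lines; a verifier reads the header, which mail clients do not display by default.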

The cost in engineering is the keys. Rotating ed25519 keys, publishing them at a stable URL, building a verifier endpoint, writing the C2PA manifest pipeline for attachments. Two weeks of work for one of our engineers. Reusable across every subsequent agent we ship.

Warning

Do not publish your verification keys in a JSON file behind your marketing CDN. We almost did. Serve the public keys from a dedicated subdomain, set strict cache headers, and rotate quarterly. Keep the private signing keys off anything web-facing. A leaked private key, or a published key an attacker can swap, is worse than no signature at all, because it creates the illusion of provenance verification where none exists.

Where the provenance model maps to existing standards

None of these signals are novel inventions. They are a reapplication of standards that already exist for adjacent problems:

The signature field applies the DKIM and ARC signing pattern to agent-generated content, independent of the mail transport.
The content-credentials field is C2PA, used unchanged.
The cited-sources field is a thinner version of the W3C Verifiable Credentials data model, scoped to retrieval evidence.
The intent tag borrows from email categorisation work done by mail providers since the 2010s.
The reply-channel verification is the same pattern as out-of-band MFA confirmations.

The work was not invention. The work was deciding which eleven signals to bring together, and committing to attach all of them on every message rather than picking and choosing per use case.

Where to start tomorrow morning

If your team runs even one outbound agent (an email responder, an invoice chaser, a Slack reply bot), the smallest useful thing you can do this week is add three of the eleven signals: sender pedigree, run ID, and review state. That is a two-line footer and a logging change. It will not stop a determined attacker, but it gives every recipient a way to ask a specific question when something looks off, and it gives you the trace ID to answer.
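As a starting point, the three-signal version might look like the sketch below. The names (ReviewState, minimal_footer, the agent.outbound logger) are illustrative; the three review values are the ones defined under signal 3.

```python
import logging
from enum import Enum

log = logging.getLogger("agent.outbound")  # hypothetical logger name


class ReviewState(Enum):
    AUTO_SENT = "auto-sent"
    HUMAN_REVIEWED = "human-reviewed"
    HUMAN_DRAFTED = "human-drafted"


def minimal_footer(agent: str, version: str, run_id: str, review: ReviewState) -> str:
    """Return the two-line footer and log the run ID so it can be traced later."""
    # The logging change: every outbound message records its run ID and review state.
    log.info("outbound message", extra={"run_id": run_id, "review": review.value})
    return f"Sent by: {agent} v{version} | Run ID: {run_id}\nReview: {review.value}"


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(minimal_footer("invoice-agent", "2.4.1",
                         "run_01HVZQ2K3M4N5P6Q7R8S9T", ReviewState.AUTO_SENT))
```

The footer is only useful if the same run ID is attached to the agent's tool calls and retrievals in your logs, so the logging half matters as much as the two visible lines.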

The rest of the eleven can come later, one per sprint, in the order your threat model demands.

When we built the invoice-agent for the Rotterdam wholesaler, the gotcha was that adding a signature footer doubled the spam-folder rate on day one. We solved it by moving most metadata into a structured header and keeping the visible footer to two lines. That kind of friction shows up everywhere in this work. If you want to talk through how to apply provenance signals to your own AI agents, the contact link in the footer of this page reaches us directly.

Key takeaway

The agent footer is a contract surface, not a disclaimer. Eleven fields, each answering one specific question a sceptical recipient would ask first.

FAQ

Do I need all eleven signals from day one?

No. Start with sender pedigree, run ID, and review state. Those three are a two-line footer and a logging change, and they cover the most common what-is-this-message question without significant engineering investment.

Will a long structured footer hurt email deliverability?

Yes, initially. Move most metadata into a structured header, analogous to List-Unsubscribe, and keep the visible footer to two lines. Inbox placement typically recovers within 36 to 72 hours once sender reputation rebuilds.

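Why sign the message body when DKIM already signs the message?

DKIM signs selected headers and a hash of the body on behalf of the sending domain, which protects the message on its way through the mail system. A separate signature over the semantic body ties the content to the agent that generated it, so it can still be verified after the text is quoted in a different envelope or delivered over a channel that is not email.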

Is C2PA worth implementing just for invoice PDFs?

If your agent ever attaches a file that creates a payment obligation, yes. C2PA lets a downstream verifier refuse to act on unsigned attachments, which closes the easiest spoofing path against agent-issued documents.
