Email agent postmortem: 184 condolences, one stale CRM id

Monday 09:14. A Groningen funeral firm's email agent queued 184 condolence drafts, each carrying the wrong bereaved name. Here's what broke and how we caught it.

Jacob Molkenboer · Founder · A Brand New Company · 21 Jun 2026 · 10 min

09:14 on a Monday. The night queue at a 23-person Groningen funeral company (an uitvaartonderneming) had built up 184 condolence drafts over the weekend. The agent we wrote handles intake from the funeral director's tablet, picks the right template (Catholic, Reformed, secular, two regional dialects), fills the merge fields, and lines the mails up for the family liaison to release at 09:30.

The liaison opened the first draft. The deceased on the header was Mevrouw V., who had passed away on Saturday night. The body of the letter spoke about Meneer K., who had been buried in March. The condolences were addressed to V.'s children. The funeral arrangements, the church, the catering — all K.'s.

She opened the second draft. Same V. on the header. Same K. in the body. By the seventh draft she pulled the SMTP plug from her desk and ran to the IT closet. None of the 184 mails had left, because release was set for 09:30, not 09:00. We had sixteen minutes.

What actually shipped to the queue

The agent's job is small. It pulls a fresh case from Plotbox (a cemetery and crematorium management platform widely used across the Benelux), pairs it with the family contact sheet, picks one of fourteen templates (Catholic, Reformed, secular, with regional dialect overrides for Drents, Gronings and standard Nederlands), fills the merge fields, and stages a draft in the liaison's outbox. A human always releases. The "never auto-send a condolence" rule is non-negotiable in this domain — we built it that way on day one.

The failure was upstream of the agent. In Plotbox the primary key on a dossier is dossier_id, an integer auto-incremented per tenant. When this firm merged with a smaller one in late 2024, the migration script remapped the acquired firm's dossier ids into the surviving instance by offsetting them — typically +500000 — so that no collisions could occur on the main table. The migration was clean in the sense that no row was lost. It was unclean in the sense that the offset wasn't applied to every secondary reference table. A handful of audit-log rows still pointed at the un-offset id, and a back-office cleanup six months later silently "healed" those rows by writing the un-offset id back onto the main dossier table. So dossier_id = 41822 had existed on both sides, and after the cleanup, did again.

Over the weekend a back-office user re-opened the orphaned record to settle an unpaid 2024 invoice. Plotbox flipped the soft-delete back. Our agent's nightly sync, which pulls "all dossiers with activity since the last poll", happily picked up 41822. The activity was on the 2024 case. The new case file created on Saturday night also carried 41822. The merge field renderer asked "which dossier is 41822" and got the older one in 7 of 10 lookups — Plotbox returns records in update order and the older row had just been touched.

The template renderer didn't notice it was stitching together two people. {{ deceased.name }} resolved against the new case. {{ service.location }} resolved against the old. The agent saw a fully populated draft and queued it. Run that 184 times, once per bereaved contact in the active address book, and you have Monday morning.

Why the agent trusted a stale foreign key

Two assumptions failed at once. The first: that the CRM's primary key was unique over time. It is, on paper. It wasn't in practice, because the merger import privileged data preservation over key uniqueness, and a later cleanup undid the offset that was protecting it. The second: that our template renderer would fail loudly if a merge field came back ambiguous. It didn't. The renderer is Jinja2 fed by a thin SQLAlchemy adapter, and the adapter treated "two rows for one id" as "use the first row". SQL itself returns both rows, and without an explicit ORDER BY their order is not guaranteed; silently taking the first one is the kind of convenience everyone forgets about until it bites. We had an eval suite that ran golden-template fixtures through the renderer nightly. None of the fixtures exercised the case where the same id resolved to two rows. Why would they? The schema said it couldn't.
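A minimal sketch of the loud failure the renderer lacked. It uses the standard-library sqlite3 module for illustration and is not the adapter we ran; the table and column names and the helper `fetch_exactly_one` are hypothetical. In SQLAlchemy, `Result.one()` gives comparable behaviour by raising when a query returns zero rows or more than one.

```python
import sqlite3


class AmbiguousDossier(Exception):
    """Raised when a dossier_id does not resolve to exactly one row."""


def fetch_exactly_one(conn: sqlite3.Connection, dossier_id: int) -> tuple:
    # LIMIT 2 is enough to tell "exactly one" apart from "more than one".
    rows = conn.execute(
        "SELECT dossier_id, deceased_name FROM dossier "
        "WHERE dossier_id = ? LIMIT 2",
        (dossier_id,),
    ).fetchall()
    if len(rows) != 1:
        found = "no row" if not rows else "more than one row"
        raise AmbiguousDossier(f"dossier_id {dossier_id}: {found}")
    return rows[0]


# Hypothetical table without a UNIQUE constraint, mimicking a reused key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dossier (dossier_id INTEGER, deceased_name TEXT)")
conn.executemany(
    "INSERT INTO dossier VALUES (?, ?)",
    [(41822, "K."), (41822, "V."), (41823, "J.")],
)
```

With a lookup like this, the duplicate id stops the render instead of quietly picking a row, and a golden fixture with two rows for one id becomes a one-line test.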

Warning

If your agent reads from a CRM with a history of merger imports, don't assume the integer primary key identifies anything stable. Treat it as a hint and re-derive identity from the fields a human would use to recognise the person.

The per-deceased UUID namespace gate

The fix is small and almost embarrassing to write down. Before any condolence mail enters the SMTP tunnel, we compute a deterministic UUID v5 for the deceased using a namespace UUID we control and the fields a human uses to identify a person in this domain: full legal name, date of birth, date of death, and the canonical name of the funeral service location. We then check that the same UUID appears on (a) the dossier the agent pulled, (b) the contact sheet, (c) the template's merge context, and (d) the recipient row.

If any of those four UUIDs differ, the mail is held, the dossier is flagged, and a human is paged. No exceptions. No "soft" mode where we let it through with a warning.

The code looks like this:

import uuid
from dataclasses import dataclass

# Namespace UUID, generated once with `uuid.uuid4()` and committed to the repo.
# It is not a secret, but rotating it changes every derived UUID: treat it as permanent.
NS_DECEASED = uuid.UUID("8f1c5e2a-1d4a-4bd6-9c0b-6a8a3e9e7f10")

@dataclass(frozen=True)
class DeceasedIdentity:
    legal_name: str        # "Achternaam, Voornaam Tweede"
    birth_date: str        # ISO 8601, no time
    death_date: str        # ISO 8601, no time
    service_location: str  # canonical name from a controlled list

    def uuid(self) -> uuid.UUID:
        # lowercase + strip to avoid case/whitespace splits
        key = "|".join([
            self.legal_name.strip().lower(),
            self.birth_date.strip(),
            self.death_date.strip(),
            self.service_location.strip().lower(),
        ])
        return uuid.uuid5(NS_DECEASED, key)


class IdentityMismatch(Exception):
    pass


def gate(dossier, contact, render_ctx, recipient) -> uuid.UUID:
    ids = {
        "dossier":   dossier.identity().uuid(),
        "contact":   contact.identity().uuid(),
        "render":    render_ctx.identity().uuid(),
        "recipient": recipient.identity().uuid(),
    }
    if len(set(ids.values())) != 1:
        raise IdentityMismatch(ids)
    return next(iter(ids.values()))

The exception holds the mail and posts the four UUIDs, plus the source fields they were derived from, into the on-call channel. The liaison sees within ten seconds which two records are getting stitched and which two agree. She doesn't have to read 184 drafts to spot the one wrong line.
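One way to produce that "which two agree" view is to group the four source labels by the UUID each one yielded. This is a sketch, assuming the `ids` dict built inside `gate` above; `group_by_identity` and `format_mismatch` are hypothetical helpers, and the namespace and names in the example are placeholders.

```python
import uuid
from collections import defaultdict


def group_by_identity(ids: dict[str, uuid.UUID]) -> list[list[str]]:
    """Group source labels by the UUID they produced, largest group first."""
    groups: dict[uuid.UUID, list[str]] = defaultdict(list)
    for source, identity in ids.items():
        groups[identity].append(source)
    return sorted(groups.values(), key=len, reverse=True)


def format_mismatch(ids: dict[str, uuid.UUID]) -> str:
    """One line per distinct UUID, followed by the sources that agree on it."""
    lines = []
    for sources in group_by_identity(ids):
        lines.append(f"{ids[sources[0]]}  <- {', '.join(sources)}")
    return "\n".join(lines)


# Placeholder namespace and names, for the example only.
NS_EXAMPLE = uuid.UUID("00000000-0000-4000-8000-000000000001")
old_case = uuid.uuid5(NS_EXAMPLE, "person-k")
new_case = uuid.uuid5(NS_EXAMPLE, "person-v")
ids = {
    "dossier": old_case,
    "contact": new_case,
    "render": old_case,
    "recipient": new_case,
}
report = format_mismatch(ids)
```

Here the report has two lines: one UUID shared by dossier and render, another shared by contact and recipient, which is the stitching pattern from that Monday.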

Why a name plus dates beats a CRM id

A skeptic reads the code above and asks: why not just trust Plotbox's GUID? Plotbox does in fact emit a UUID per record. We don't use it. The reason is that the GUID identifies the row, not the deceased. The merger import created new rows for cases that already existed, with new GUIDs, and pointed old foreign keys at them. The row identity moved. The person didn't.

By hashing the four fields that a funeral director would use to confirm "yes, this is the person we're burying", we get an identifier that survives any CRM-internal restructure. If the same person is represented in two systems with different row ids, the UUID is the same. If two different people end up in the same row by accident, the UUID is different and the gate fires. The trade-off: if a real-world field changes (a misspelled surname is corrected, a service is moved to a different location), the UUID changes too. We accept that. A held draft on the day a typo is fixed is cheap; a misrouted condolence is not.
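Both halves of that trade-off are easy to see in a few lines. The sketch below repeats the key construction from the gate code with a placeholder namespace; the person, dates and location are invented for the example.

```python
import uuid

# Placeholder namespace for this example only, not the production value.
NS_EXAMPLE = uuid.UUID("00000000-0000-4000-8000-000000000001")


def identity_uuid(legal_name: str, birth: str, death: str, location: str) -> uuid.UUID:
    # Same normalisation as the gate: strip everywhere, lowercase the text fields.
    key = "|".join([
        legal_name.strip().lower(),
        birth.strip(),
        death.strip(),
        location.strip().lower(),
    ])
    return uuid.uuid5(NS_EXAMPLE, key)


# The same invented person as two systems might store them.
from_crm = identity_uuid("Jansen, Anna Maria", "1941-02-03", "2025-01-11", "Voorbeeldkerk")
from_sheet = identity_uuid("  JANSEN, Anna Maria ", "1941-02-03", "2025-01-11", "voorbeeldkerk")

# The same person after a surname correction.
after_fix = identity_uuid("Janssen, Anna Maria", "1941-02-03", "2025-01-11", "Voorbeeldkerk")

same_person_matches = from_crm == from_sheet
typo_fix_changes_id = from_crm != after_fix
```

Case and whitespace differences collapse to one UUID, while the corrected surname produces a new one, so a record fixed in only one system trips the gate until the others catch up.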

The choice of UUID v5, rather than v4 or a plain SHA-256, is deliberate. UUID v5 is specified in RFC 4122 §4.3 (since carried forward in RFC 9562, which obsoletes it), it's deterministic given a namespace and a name, and every store we touch (Postgres, the Plotbox export, the SMTP audit log) already has a native UUID column type. The gate composes with the existing schema without a single migration on the CRM side.

Holding the SMTP tunnel

The gate runs in the same process that signs and queues the outbound mail. The SMTP relay (Postfix in front of a transactional provider) refuses to accept any envelope that doesn't carry a header named X-ABN-Deceased-Id. That header is the UUID the gate returned. There is no path from agent to wire that bypasses it.
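The relay-side rule itself lives in the mail server configuration and isn't shown here. As an illustration of what "carries the header" can mean, here is a sketch of the same check in Python using the standard-library email package; `has_valid_identity_header` is a hypothetical helper, and requiring exactly one header whose value parses as a version 5 UUID is our assumption about what a strict check would accept.

```python
import uuid
from email.message import EmailMessage

HEADER = "X-ABN-Deceased-Id"


def has_valid_identity_header(msg: EmailMessage) -> bool:
    """True only if the message carries exactly one well-formed v5 UUID header."""
    values = msg.get_all(HEADER, [])
    if len(values) != 1:
        return False  # missing, or duplicated by something downstream
    try:
        parsed = uuid.UUID(str(values[0]).strip())
    except ValueError:
        return False  # not a UUID at all
    return parsed.version == 5


# Placeholder namespace and name, for the example only.
NS_EXAMPLE = uuid.UUID("00000000-0000-4000-8000-000000000001")
draft = EmailMessage()
draft[HEADER] = str(uuid.uuid5(NS_EXAMPLE, "example"))
draft.set_content("Condolence text goes here.")
accepted = has_valid_identity_header(draft)
```

A check like this only proves the header is present and well-formed. That it belongs to the right person is the gate's job, which is why the header value is the gate's return value and nothing else.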

The header is also written into the message body as an invisible HTML comment, so that if a draft is later forwarded or exported with its HTML intact, the identity is preserved. We keep a 90-day rolling log of (header UUID, recipient hash, template version) for audit. We don't keep message bodies: there is no operational reason for ABN to hold them, and a funeral company's mail is among the most sensitive a Dutch business processes.
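A sketch of what one such log record and its retention rule could look like. The field names, the keyed HMAC-SHA-256 recipient hash and the `prune` helper are assumptions made for the example, not a description of the production log.

```python
import hashlib
import hmac
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=90)


@dataclass(frozen=True)
class AuditRecord:
    deceased_id: str        # the UUID from the X-ABN-Deceased-Id header
    recipient_hash: str     # keyed hash, never the address itself
    template_version: str
    logged_at: datetime


def hash_recipient(address: str, key: bytes) -> str:
    # A keyed hash, so the log cannot be reversed by hashing guessed addresses.
    normalized = address.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


def prune(records: list[AuditRecord], now: datetime) -> list[AuditRecord]:
    """Keep only records inside the 90-day window."""
    return [r for r in records if now - r.logged_at <= RETENTION]


# Example values only.
example_key = b"example-key-not-for-production"
now = datetime(2025, 6, 1, tzinfo=timezone.utc)
digest = hash_recipient("Family@Example.org ", example_key)
records = [
    AuditRecord("example-id", digest, "v1", now - timedelta(days=91)),
    AuditRecord("example-id", digest, "v1", now - timedelta(days=10)),
]
kept = prune(records, now)
```

Nothing in the record lets a reader reconstruct the message or the address, which is the point of logging a hash rather than the recipient.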

When the gate fires, the on-call channel gets a structured Slack message: the four UUIDs in a diff-friendly format, the source fields that differ, and a deep link to the dossier in Plotbox. The liaison's standard operating procedure on a mismatch is to call the family directly rather than try to apologise by mail: the channel that caused the incident is exactly the wrong tool to fix it. We page during business hours and queue overnight, never the other way around. After the gate shipped, we added a passive monitor that flags any dossier_id touched by two different deceased UUIDs within ninety days. It hasn't fired in fourteen months.
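The passive monitor reduces to a small function over the audit trail. A sketch under the assumption that each event is a (dossier_id, deceased UUID, date) tuple; `reused_dossier_ids` is a hypothetical name and the events below are invented.

```python
import uuid
from collections import defaultdict
from datetime import date, timedelta

WINDOW = timedelta(days=90)


def reused_dossier_ids(events) -> set:
    """Return dossier_ids seen with two different deceased UUIDs within 90 days.

    events: iterable of (dossier_id, deceased_uuid, day) tuples.
    """
    by_dossier = defaultdict(list)
    for dossier_id, deceased, day in events:
        by_dossier[dossier_id].append((day, deceased))

    flagged = set()
    for dossier_id, seen in by_dossier.items():
        seen.sort(key=lambda item: item[0])  # oldest first
        for i, (day_a, uuid_a) in enumerate(seen):
            for day_b, uuid_b in seen[i + 1:]:
                if day_b - day_a > WINDOW:
                    break  # later events are even further away
                if uuid_a != uuid_b:
                    flagged.add(dossier_id)
    return flagged


# Invented events, for the example only.
NS_EXAMPLE = uuid.UUID("00000000-0000-4000-8000-000000000001")
person_a = uuid.uuid5(NS_EXAMPLE, "a")
person_b = uuid.uuid5(NS_EXAMPLE, "b")
events = [
    (41822, person_a, date(2025, 1, 6)),
    (41822, person_b, date(2025, 1, 20)),   # same id, other person, 14 days apart
    (100, person_a, date(2025, 1, 1)),
    (100, person_a, date(2025, 2, 1)),      # same id, same person
    (200, person_a, date(2024, 1, 1)),
    (200, person_b, date(2025, 1, 1)),      # other person, but outside the window
]
flagged = reused_dossier_ids(events)
```

Only 41822 is flagged here. The pairwise scan is quadratic per dossier, which is fine when a dossier has a handful of events and would want an index if it had thousands.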

What this incident says about agentic systems

Most public writing about agent reliability focuses on retries, evals, and observability. Those matter. But the failure mode we just walked through has nothing to do with the model. The agent did its job correctly given the inputs. The inputs were two people stitched into one identity by a database operation from eighteen months earlier.

Agentic reliability, in our experience running fourteen agents in production, lives at the boundary between the agent and the systems it reads from — not inside the model. The questions worth asking before shipping a customer-facing agent are: what is the unit of identity in this domain, who controls the key, and what happens when the key changes without a corresponding change in the underlying entity. If you can't answer those in one sentence each, the agent will eventually do exactly what ours did on that Monday: produce a plausible, well-formatted, completely wrong artefact, at scale.

What you can do in the next hour

Three things, in order. First, list every CRM your agent reads from and ask the DBA when the last bulk import or merger happened and whether the primary key was reassigned during it. The answer is usually "yes, a couple of years ago" and nobody remembers the offset table. Second, write down on one piece of paper the four or five fields a human in your domain uses to recognise a customer, a case, a patient, or an order. Those are your namespace inputs. Don't ask the engineering team; ask the front-desk person who's been on the phone confirming identities for years. Third, gate the outbound side. The cheapest, almost embarrassingly simple place to catch an identity mismatch is in the mail server, ten milliseconds before the envelope hits the wire. Adding a header check at the relay is a one-afternoon job and it will save you the kind of Monday morning we had in Groningen.

When we built the email agent for the Groningen firm, the thing we ran into was exactly the silent CRM key re-use described above. We solved it with a per-deceased UUID gate on the SMTP edge that re-derives identity from the four fields a funeral director would already write on a paper checklist. Most of what we ship as AI agent work looks like this: agent in the middle, deterministic gate on the boundary, human release.

Key takeaway

Treat your CRM's primary key as a hint, not as identity. Re-derive identity from the fields a human would use, and gate the outbound side on it.

FAQ

Why UUID v5 instead of UUID v4?

v5 is deterministic from a namespace and a name, so the same person yields the same UUID anywhere in the pipeline. v4 is random and gives you no way to compare an identity across systems.

Why not just trust the CRM's own GUID?

A GUID identifies a row. Mergers and re-imports create new rows for the same person, with new GUIDs. The person's identity has to survive the row's identity, so we derive it from human-readable fields.

What if the funeral service location isn't set yet?

The mail is held. Service location is a required namespace input; if it's empty, the gate refuses to release and pages a human. Better a held draft than a misrouted condolence.
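An empty string still hashes to a perfectly valid UUID, so this hold has to be an explicit check before the hash is computed. A sketch of one way to do it; `require_identity_fields` and `MissingIdentityField` are hypothetical names, and the sample values are invented.

```python
from typing import Optional

REQUIRED = ("legal_name", "birth_date", "death_date", "service_location")


class MissingIdentityField(Exception):
    """Raised when a field needed to derive the deceased UUID is empty."""


def require_identity_fields(fields: dict[str, Optional[str]]) -> dict[str, str]:
    # Treat None, "" and whitespace-only values as missing.
    missing = [name for name in REQUIRED if not (fields.get(name) or "").strip()]
    if missing:
        raise MissingIdentityField(", ".join(missing))
    return {name: fields[name].strip() for name in REQUIRED}


# Invented example: the service location has not been decided yet.
incomplete = {
    "legal_name": "Jansen, Anna Maria",
    "birth_date": "1941-02-03",
    "death_date": "2025-01-11",
    "service_location": "  ",
}
try:
    require_identity_fields(incomplete)
    held_reason = None
except MissingIdentityField as exc:
    held_reason = str(exc)
```

The exception message names the empty fields, so the page that goes to a human says what to fill in rather than just that something is wrong.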

Could a smarter model have caught this?

No. The model received fully populated, well-formed merge fields. The mismatch was between records, not inside one record. Deterministic gates beat prompts for identity checks.

Does this slow the outbound pipeline down?

The gate adds roughly a millisecond per mail. The hash is cheap and the four lookups are already in memory by the time the draft is staged. SMTP relay latency dominates by three orders of magnitude.
