Voice agent for a Tilburg uitvaartonderneming: playbook

A bereaved daughter calls a Tilburg funeral home at 23:14. The voice agent has 45 seconds to decide if this needs a human now, and a 36-hour legal clock has already started ticking.

Jacob Molkenboer · Founder · A Brand New Company · 19 Jun 2026 · 11 min

It is 23:14 on a Tuesday in November. A woman in Goirle calls the central line of an uitvaartonderneming (funeral home) in Tilburg. Her father died forty minutes ago in the hospice. She does not know what to do next, only that someone told her to call this number. The intake employee went home at 18:00. The on-call uitvaartverzorger (funeral director) is driving back from a service in Eindhoven and will not pick up for another seven minutes.

This is where the voice agent answers.

The constraint stack

Five constraints set the shape of the build before we wrote a line of code.

First, the legal clock. The Dutch Wet op de lijkbezorging puts a hard ceiling of 36 hours between death and the start of arrangements for transport and care of the body, with extension procedures that nobody wants to file at midnight. That clock starts ticking before the family has called us. By the time the phone rings, we may already be six or eight hours in.

Second, the volume. The firm we built this for handles 940 nabestaande-gesprekken (conversations with bereaved relatives) in an average week. About 31% come in outside office hours. The night shift used to be one duty phone, one tired uitvaartverzorger, and a handover note the next morning.

Third, the dossier system. Uitvaartsoftware.nl, a 13-year-old SaaS that runs most of the smaller Dutch funeral homes, with a SOAP-style API written when SOAP still felt modern. Read access is fine. Writing a new dossier requires three round-trips and a confirmation token that expires in 90 seconds.

Fourth, the archive. Every condoleance (a message of sympathy the family receives in the days after a death) sits in a homegrown Exchange 2016 public folder structure that the firm built in 2017 and that nobody has touched since the IT lead left. EWS still works. OAuth does not.

Fifth, regional speech. Tilburg and West-Brabant share Brabants dialect markers that a default Dutch speech-to-text model misclassifies eight to twelve percent of the time. The agent has to handle "unnen" and "hedde" and the soft G that comes back when a caller drops their guard, without ever asking the family to repeat themselves under the worst circumstances of their life. We fine-tuned the ASR on roughly 60 hours of redacted recordings from the firm's archive before we let it answer a single live call.

The triage decision

The first rule we wrote down is the only one that actually matters: this is a voice agent that knows when to stop being a voice agent.

We classify every call within the first 12 seconds. The agent's opening utterance is deliberately bland: a name, a confirmation that this is the right number, and a single open question. While it speaks, the audio stream is already going through a parallel classifier looking for three things.

One, acute distress markers: sustained crying, breathing irregularity, sentences that stop mid-word and do not resume. Two, suicide-context signals: explicit mention by the caller, ambiguous phrasing around how the death occurred, the word "zelf" in proximity to "gedaan" or "gevonden", any reference to a current threat to the caller's own safety. Three, time-critical operational requests: the caller is at the location of the death and needs immediate guidance on what not to touch before the schouwarts (the doctor who examines the body and certifies the death) arrives.

Hit any of those and the agent stops trying to triage. It says one sentence: "Ik verbind u nu door met een uitvaartverzorger, blijft u alstublieft aan de lijn." ("I am connecting you to a funeral director now, please stay on the line.") The call is on a human's headset within 45 seconds, including the time to ring through the on-call rotation.

The 45-second budget is not arbitrary. It is the longest we observed before a distressed caller in our pilot week hung up because they thought no one was coming.
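The proximity rule on "zelf" from the signal list above can be illustrated with a token-window check. This is a minimal sketch rather than the production classifier: the function name `near` and the five-token window are assumptions, since the text only says the words occur "in proximity".

```python
import re

# Hypothetical constants for illustration; the window size is an assumption.
ANCHOR = "zelf"
TARGETS = {"gedaan", "gevonden"}


def near(transcript: str, window: int = 5) -> bool:
    """Return True if ANCHOR occurs within `window` tokens of any target word."""
    tokens = re.findall(r"\w+", transcript.lower())
    anchor_positions = [i for i, tok in enumerate(tokens) if tok == ANCHOR]
    target_positions = [i for i, tok in enumerate(tokens) if tok in TARGETS]
    return any(
        abs(a - t) <= window
        for a in anchor_positions
        for t in target_positions
    )
```

A rule like this is deliberately loose. It will fire on harmless sentences, which fits a design that prefers an unnecessary handoff over a missed one.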

The prosody features feeding the classifier are deliberately narrow. We extract pitch variance over a rolling four-second window, intra-word pause length, and a crude breathing-irregularity score from the high-pass filtered residual. We do not run sentiment analysis. We do not run emotion detection in the marketing sense of those words. Both do worse than chance when the speaker is in shock. The signals that work are mechanical: how the voice moves, not what a model thinks the voice feels.
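To make "pitch variance over a rolling four-second window" concrete, here is a minimal sketch. It assumes an upstream pitch tracker already supplies timestamped pitch estimates in Hz for voiced frames; the class name `RollingPitchVariance` is hypothetical and the real feature extractor may differ.

```python
from collections import deque
from statistics import pvariance


class RollingPitchVariance:
    """Population variance of pitch estimates over the last `window_s` seconds."""

    def __init__(self, window_s: float = 4.0):
        self.window_s = window_s
        self._samples = deque()  # (timestamp_s, pitch_hz) pairs, oldest first

    def add(self, timestamp_s: float, pitch_hz: float) -> float:
        """Record one pitch estimate and return the variance of the current window."""
        self._samples.append((timestamp_s, pitch_hz))
        # Evict samples that have fallen out of the rolling window.
        while timestamp_s - self._samples[0][0] > self.window_s:
            self._samples.popleft()
        pitches = [pitch for _, pitch in self._samples]
        # Variance of a single sample is defined here as 0.0.
        return pvariance(pitches) if len(pitches) > 1 else 0.0
```

The feature is purely mechanical: it describes how much the voice moves, with no label attached to what that movement means.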

Here is the classification call, simplified. The production version routes through three parallel models for redundancy.

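async def classify_turn(audio_chunk, transcript_so_far, call_state):
    # One input per turn: running transcript, elapsed call time, prosody features.
    classifier_input = {
        "transcript": transcript_so_far,
        "elapsed_ms": call_state.elapsed_ms,
        "prosody": extract_prosody(audio_chunk),
    }
    verdict = await classifier.run(classifier_input)
    # Suicide-context or acute-distress signals: hand the call to a human immediately.
    if verdict.suicide_context or verdict.acute_distress:
        await escalate_to_uitvaartverzorger(
            call_state,
            reason=verdict.primary_signal,
            transcript=transcript_so_far,
            urgency="immediate",
        )
        return TurnAction.HANDOFF
    # Caller is at the location of the death: hand off for operational guidance.
    if verdict.scene_present:
        return TurnAction.HANDOFF_OPERATIONAL
    # No signal on this turn: the agent keeps the conversation.
    return TurnAction.CONTINUE

Warning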

The harder problem is the false negative: a caller who is composed, articulate, businesslike, describing a death that we should have flagged. We bias the classifier toward over-escalation and accept a 4% false-positive rate, because the tolerance for missed cases is zero. The uitvaartverzorger team agreed to this trade before we shipped.
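One simple way to bias redundant models toward over-escalation is to escalate as soon as any one of them raises a flag. The sketch below illustrates that idea only. The text does not say how the production system combines its models, and the `Verdict` dataclass and `combine` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Verdict:
    """Hypothetical per-model output, mirroring the three signal families."""
    suicide_context: bool = False
    acute_distress: bool = False
    scene_present: bool = False


def combine(verdicts: List[Verdict]) -> Verdict:
    """Logical OR across models: a single positive is enough to set a flag."""
    return Verdict(
        suicide_context=any(v.suicide_context for v in verdicts),
        acute_distress=any(v.acute_distress for v in verdicts),
        scene_present=any(v.scene_present for v in verdicts),
    )
```

An OR rule raises the false-positive rate with every model added, which is the trade described above: more unnecessary handoffs in exchange for fewer missed cases.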

Our handoff metadata also includes a pointer to the relevant national context. For Dutch callers, the on-call uitvaartverzorger sees a one-line reference to 113 protocols (113 is the Dutch national suicide-prevention line) as part of the briefing card, so the human starting the conversation has the right framing ready before they speak.

Wiring into the 13-year-old dossier system

The Uitvaartsoftware.nl API is fine if you treat it like a 2012 ERP and not a 2026 service. Four things we learned the hard way.

The session token expires fast. It claims thirty minutes. In practice it dies somewhere between eight and twelve. We keep a refresh loop that runs every four minutes, regardless of whether we are using the connection.
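A refresh loop of this kind can be sketched with asyncio. This is an illustration under assumptions: `refresh` stands in for whatever coroutine renews the session, and `keep_session_alive` is a hypothetical name, not part of the Uitvaartsoftware.nl API.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def keep_session_alive(
    refresh: Callable[[], Awaitable[None]],
    interval_s: float = 240.0,  # four minutes, well under the observed 8 to 12
    stop: Optional[asyncio.Event] = None,
) -> None:
    """Call `refresh()` every `interval_s` seconds until `stop` is set."""
    stop = stop or asyncio.Event()
    while not stop.is_set():
        await refresh()
        try:
            # Sleep for the interval, but wake immediately if asked to stop.
            await asyncio.wait_for(stop.wait(), timeout=interval_s)
        except asyncio.TimeoutError:
            pass  # Interval elapsed without a stop request: refresh again.
```

Refreshing on a fixed schedule, rather than on demand, means the first write after a quiet hour never pays for a dead token.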

Dossier creation is two-phase. You POST a draft with the deceased's name and date of death, then PATCH the rest. If the PATCH fails, the draft sits there forever, polluting the search index. We wrap every create in a try/except that deletes the orphaned draft on failure and re-raises the error.

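async def create_dossier(payload):
    # Phase 1: POST a minimal draft (name and date of death only).
    draft = await uitvaart.post_draft({
        "naam_overledene": payload.naam,
        "datum_overlijden": payload.datum_overlijden,
    })
    try:
        # Phase 2: PATCH the remaining fields onto the draft.
        return await uitvaart.patch_dossier(draft.id, payload.full())
    except Exception:
        # The PATCH failed: delete the orphaned draft, then re-raise the original error.
        await uitvaart.delete_draft(draft.id)
        raise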

The third gotcha is that the BSN field (the burgerservicenummer, the Dutch citizen service number), when present, is encrypted at rest with a key that rotates every quarter. Reading it back requires the current key version. We never ask the voice agent to read a BSN aloud or to confirm one. If the family wants to share it, the agent says: "Stuur het via de beveiligde link die u zo per SMS ontvangt." ("Send it via the secure link you will receive by SMS in a moment.") A separate one-time-link service handles capture, outside the agent's transcript path.
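The core of a one-time-link service is a token that can be redeemed once and then expires. The sketch below shows that mechanism only. The class name `OneTimeLinks` and the 15-minute lifetime are assumptions, and the real service's storage and SMS delivery are not described in the text.

```python
import secrets
import time


class OneTimeLinks:
    """Issue single-use tokens that expire after `ttl_s` seconds."""

    def __init__(self, ttl_s: float = 900.0, clock=time.time):
        self._ttl_s = ttl_s
        self._clock = clock
        self._expiry = {}  # token -> expiry timestamp

    def issue(self) -> str:
        """Create an unguessable token to embed in the link sent by SMS."""
        token = secrets.token_urlsafe(32)
        self._expiry[token] = self._clock() + self._ttl_s
        return token

    def redeem(self, token: str) -> bool:
        """Return True once for a valid, unexpired token; False otherwise."""
        # pop() removes the token, so a second redeem of the same link fails.
        expires_at = self._expiry.pop(token, None)
        return expires_at is not None and self._clock() <= expires_at
```

Because the token never passes through the call audio, nothing the caller types into the linked form can end up in the agent's transcript.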

The fourth gotcha cost us a week to find. After the eleventh write in 60 seconds, the server starts returning 200 OK with an empty body and silently drops the payload. We learned this by writing a smoke test that pushed 50 records and checked retrieval; eight were missing with no error in the response. We now queue dossier writes through a token bucket capped at six per minute, and we accept the latency. Under burst load the queue depth peaks around fourteen; nothing about a 90-second wait to commit a dossier line hurts an after-hours call.
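A token bucket that refills at six tokens per minute can be sketched as follows. This is an illustration, not the production queue: the class name, the burst capacity of one, and the injectable `clock` (there to make the behaviour testable) are all assumptions.

```python
import asyncio
import time


class TokenBucket:
    """Refill at `rate_per_min` tokens per minute, holding at most `capacity` tokens."""

    def __init__(self, rate_per_min: float = 6.0, capacity: int = 1, clock=time.monotonic):
        self.rate_per_s = rate_per_min / 60.0
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def _refill(self) -> None:
        # Add the tokens earned since the last check, capped at capacity.
        now = self._clock()
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate_per_s)
        self._last = now

    def wait_time(self) -> float:
        """Seconds until one token is available (0.0 if one is available now)."""
        self._refill()
        return max(0.0, (1.0 - self._tokens) / self.rate_per_s)

    def take(self) -> bool:
        """Consume one token if available; never blocks."""
        self._refill()
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False

    async def acquire(self) -> None:
        """Wait until a token is available, then consume it."""
        while not self.take():
            await asyncio.sleep(self.wait_time())
```

A writer would `await bucket.acquire()` before each dossier write. With a capacity of one, at most seven writes can land in any 60-second window, which stays under the point where the server starts dropping payloads.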

The Exchange 2016 condoleance archive

This was the part we wanted to refuse and ended up building. Every funeral generates dozens of condoleance messages: emails, contact-form submissions, transcribed voicemails. The family wants them archived and printed in a small book after the service. The existing system shoves all of them into Exchange public folders, organised by dossier number, with no consistent metadata.

We did not migrate. The lead engineer at the firm has been there for eleven years and asked us, plainly, not to. They have institutional muscle memory for where things live and a Drupal-era control panel that maps to the folder tree.

So the voice agent writes to Exchange via EWS over basic auth, scoped to a single service account whose only right is "create item in named public folder". Microsoft's documentation is still online if you can find your way through it.

The trick is that EWS rejects HTML mail bodies with non-MIME-clean characters, which the agent occasionally produces when transcribing dialect. We sanitise on the way in.

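import re

def sanitise_for_exchange(html_body: str) -> str:
    # Control characters are ASCII, so the encode step alone would let them through.
    cleaned = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", html_body)
    cleaned = cleaned.encode("ascii", "xmlcharrefreplace").decode("ascii")
    cleaned = cleaned.replace("\r\n", "\n").replace("\r", "\n").replace("\n", "<br/>")
    return f"<html><body>{cleaned}</body></html>"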

Unglamorous code that prevents the entire ingest pipeline from stopping at 02:00 when a caller from West-Brabant says something the speech-to-text model interprets as a control character.

The other Exchange gotcha is throttling. Write from a single connection at full speed and the server drops new requests after roughly 60 messages in 90 seconds, with a misleading "too many concurrent connections" error; the real limit is throughput per service account per source IP. We rotate between two service accounts and accept a 200ms backoff per message. Slower than we would like. Reliable. The condoleance archive does not need real-time write latency; it needs every message to land exactly once in the right folder, which is what we got.
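The rotation itself is simple enough to sketch. The example below assumes a `send(account, message)` callable that performs the actual EWS write; that callable, the function name `send_all`, and the injectable `sleep` are hypothetical, and retry handling is left out.

```python
import itertools
import time


def send_all(messages, accounts, send, backoff_s: float = 0.2, sleep=time.sleep):
    """Send each message once, alternating service accounts, pausing after each write."""
    rotation = itertools.cycle(accounts)
    for message in messages:
        # Round-robin: each account carries roughly half of the throughput.
        send(next(rotation), message)
        sleep(backoff_s)
```

At 200ms per message the writer stays far below 60 messages in 90 seconds on either account, at the cost of a few extra seconds for a large batch.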

Tone, language, and the things a voice agent must not say

Half the build was prompt and tone work. Three rules we landed on, and one we explicitly chose not to write.

The agent does not use the word "spijt". Dutch families told us, in interview after interview, that "het spijt me" ("I am sorry") from a voice they suspect is automated reads as performative. We use "ik begrijp dat dit een zwaar moment is" ("I understand that this is a difficult moment") or, often better, a longer pause before the next question.

The agent never repeats back the cause of death. If the caller says their father died in a car accident, the agent does not say "uw vader is overleden bij een auto-ongeluk" ("your father died in a car accident"). It says "dank u, dat heb ik genoteerd" ("thank you, I have noted that"). Repeating a traumatic detail with synthetic prosody is the single fastest way to lose trust.

The agent always offers, in its first thirty seconds, the option to speak to a human. Even when the caller is composed. Even when the agent is doing fine. The offer itself is the safety valve.

The rule we considered and did not codify: how to handle the long silence after a difficult question. Early in the build we shipped a version that filled the gap with "ik luister naar u" ("I am listening to you") after four seconds. Two callers complained, separately, that the prompt felt like a chatbot polling. We removed it. The agent now waits up to twelve seconds in silence before asking, gently, whether the caller is still on the line. Some moments are not for filling.
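The twelve-second wait maps onto a plain timeout. In this minimal sketch, `speech_detected` is assumed to be an `asyncio.Event` set by the audio pipeline when the caller starts speaking; both that event and the function name are hypothetical.

```python
import asyncio


async def wait_for_caller(speech_detected: asyncio.Event, max_silence_s: float = 12.0) -> bool:
    """Return True if the caller speaks within the window, False if the silence runs out."""
    try:
        await asyncio.wait_for(speech_detected.wait(), timeout=max_silence_s)
        return True
    except asyncio.TimeoutError:
        return False
```

Only a `False` result would trigger the gentle check that the caller is still on the line. Nothing is said while the timer runs.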

What we measured after 90 days

Numbers from the first quarter of operation, taken from the firm's internal review.

After-hours pickup rate went from 71% to 99.4%. The remaining 0.6% are mostly carrier issues on the inbound side.

Median time to a human, for calls flagged for escalation, is 28 seconds. The 45-second budget is comfortable.

The uitvaartverzorger team is taking 38% fewer cold-start middle-of-the-night calls, because the agent has already collected the basics (name of the deceased, location, whether a schouwarts has been called) and pasted them into the dossier before the human picks up.

Average call duration before handoff fell from 4:12 to 1:38 on the happy path, because the agent stops asking once the dossier has what it needs. Dossier completeness on first save, measured against the post-funeral audit done by the firm's intake lead, sits at 94%. The remaining 6% is almost always a missing address-of-death or a misheard surname; both fail safely into the staff's morning review queue rather than into the family's experience.

False-positive escalation rate sits at 4.1%. The team's stated tolerance was 8%. They have asked us not to tighten it further.

Zero false negatives in the suicide-context category, across roughly 12,200 classified calls. We do not believe this number will hold forever. We have an external review of flagged-versus-actual scheduled every quarter, conducted by a clinician who is not part of the build team.

The smallest thing to do today

If you operate any after-hours phone line that touches grief, crisis, or medical urgency, do this before you do anything else: write down the list of conversational signals that mean "stop the agent, get a human, now". Not the happy path. The off-ramp. If you cannot name those signals in plain language, you cannot build a voice agent that handles them, and you should not buy one either.

When we built this for the Tilburg uitvaartonderneming, the hardest part was not the speech stack or the EWS plumbing. It was sitting in a room with the uitvaartverzorgers and writing the escalation list line by line, in Dutch, until everyone in the room agreed. That conversation, more than the code, is what made the voice agent safe to put on the line.

Key takeaway

Build the off-ramp before the happy path. If you cannot name the signals that should kill the agent and call a human, you cannot ship one.

FAQ

How fast does the agent escalate a flagged call to a human?

Within 45 seconds end-to-end, including ring-through on the on-call rotation. The median observed in production, for calls flagged for escalation, is 28 seconds; those flags come out of roughly 12,200 classified calls.

Does the voice agent ever discuss suicide directly with callers?

No. If signals are detected, it stops triage immediately and hands off to a human, with relevant context written into the dossier. The agent is not a crisis counsellor and is not built to act as one.

Why keep Exchange 2016 instead of migrating to Microsoft 365?

The funeral home's lead engineer asked us not to. Eleven years of institutional memory live in that public folder tree. The voice agent writes via EWS into the same folders the staff already trusts.

How do you handle the BSN field from Uitvaartsoftware.nl?

The agent never reads a BSN aloud or asks the caller to confirm one. Capture happens through a separate one-time secure link sent by SMS after the call, handled outside the agent's transcript path.

