Voice agent on .NET 4.5: a Rotterdam zorgbroker playbook
How we shipped a Dutch and Papiamentu voice agent against a 14-year-old .NET 4.5 polisadministratie, with code-rood escalations under 60 seconds.

09:14 on a Tuesday. Maria is calling from Charlois. Her daughter just turned 18, the aanvullende tandartsdekking needs to move to her own polis, and there are fourteen minutes left before her tram pulls into Rotterdam Centraal. The medewerker on the line was hired three weeks ago and has not yet been taught how to find a polis when the BSN is given in a Papiamentu accent.
This is what the broker had every Tuesday morning. Not difficult work. Repetitive work. The kind that lives in the gap between a polishouder who needs forty seconds and a CRM that needs four minutes.
The brief from the directeur was short: build a voice agent that takes the easy calls off the wachtrij, handles Dutch and Papiamentu, and never gets a code-rood wrong.
The intake we walked into
Thirty-three people, including the directeur. Two phone lines, one general and one zakelijk, carrying 2,180 inbound calls a week. About 64% of those calls touched a single polis. Around 18% were aanvullende-pakket questions that nobody on the eerste lijn was authorized to commit. A small slice, which we measured at 1.1%, were overlijdensmeldingen that demanded a senior medewerker within minutes, not hours.
The policy data lived in a custom .NET 4.5 polisadministratie commissioned in 2012 by a developer who has since emigrated to Australia. SQL Server 2012 on a Hyper-V host in a Rotterdam datacenter. WCF endpoints over basicHttpBinding, the kind of WCF where every method returns a 200 OK and the actual error code is buried inside the SOAP envelope.
.NET 4.5 reached end of support in 2016, which means there is nobody to call when the WCF host has an existential crisis. The broker had a half-Romanian, half-Dutch sysadmin who knew where the bodies were. We did not have permission to refactor. We had permission to integrate.
The constraints that shaped the design
Three constraints did most of the work.
One: the polisadministratie could not be modified. Any database write went through the WCF facade or through the medewerker UI. No new endpoints, no "let us just add one stored procedure." The acceptant team had spent four years cleaning data inside that schema and they were not going to let a voice agent dirty it up.
Two: the call had to feel like a Rotterdam call. Not a Brussels call, not a Hilversum call. The dialect matters. Most polishouders in the broker's book live in Rotterdam-Zuid. A non-trivial slice speaks Papiamentu at home and prefers it when the topic is geld or gezondheid. ASR that mishears their BSN is worse than no ASR at all.
Three: AVG, the Dutch name for the GDPR. The compliance officer wanted audio retention down to seven days and full erasure on request, with structured PII redaction in the transcripts. The pipeline had to assume every call contained a BSN, an IBAN, and a medische verklaring.
The voice stack
We ended up with five moving parts.
A SIP trunk from the broker's existing provider, terminated into LiveKit Agents running on a Hetzner box in Falkenstein. LiveKit handled the realtime audio plumbing and the half-duplex turn-taking. We did not want to pay per-minute media costs to a US vendor for what is essentially a Rotterdam-to-Rotterdam call.
For ASR, Deepgram Nova-2 multilingual on the Dutch path, and a fallback Whisper-large-v3 deployment on a single A100 for Papiamentu. Deepgram does not officially list Papiamentu; in practice it transcribed around 71% of Papiamentu utterances as garbled Dutch, which is worse than useless. Whisper handles Papiamentu well enough for intent classification but not well enough for BSN capture, which we will come back to.
For the brain, an LLM agent with a tight tool surface: six tools, all of them wrappers around the WCF facade or the queue system. For TTS, ElevenLabs with a Dutch female voice cloned from a 12-minute sample of one of the broker's existing senior medewerkers. The polishouder needs to recognize the voice. That recognition does more for trust than any onboarding script we could write.
For the queue, a small Node service on the same Hetzner box that holds three RabbitMQ queues: triage, acceptant, and code-rood.
// tools/lookupPolis.ts
// Thin wrapper around the WCF facade. Returns a normalised polis object.
// WCF_URL, buildSoapEnvelope, parseFault, parseEnvelope and normalise are not shown here.
type LookupInput = { identifier: string; kind: "bsn" | "polisnummer" };

export const lookupPolis = {
  name: "lookup_polis",
  description: "Find a polis by BSN or polisnummer. Read-only.",
  input_schema: {
    type: "object",
    properties: {
      identifier: { type: "string", description: "BSN (9 digits) or polisnummer" },
      kind: { type: "string", enum: ["bsn", "polisnummer"] }
    },
    required: ["identifier", "kind"]
  },
  handler: async ({ identifier, kind }: LookupInput) => {
    const soap = buildSoapEnvelope("GetPolisByIdentifier", { identifier, kind });
    const res = await fetch(WCF_URL, {
      method: "POST",
      headers: { "Content-Type": "text/xml", SOAPAction: "GetPolisByIdentifier" },
      body: soap
    });
    const body = await res.text();
    // This facade returns 200 OK even on faults. Parse the envelope, do not trust the status.
    if (body.includes("<faultcode>")) {
      return { ok: false, reason: parseFault(body) };
    }
    return normalise(parseEnvelope(body));
  }
};
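This facade returns HTTP 200 OK on most server-side faults. That is a property of how this service was written, not of WCF in general: a stock basicHttpBinding endpoint sends SOAP faults with HTTP 500. Either way, always parse the SOAP envelope before deciding the call succeeded. An agent that treats a 200 as success will cheerfully commit garbage into a 14-year-old schema.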
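The parse-before-trust rule can be sketched in a few lines of Python. This is a minimal illustration, not the production wrapper: the function name `parse_soap_response` and the shape of the returned dict are assumptions, and it only covers SOAP 1.1 envelopes, which is what basicHttpBinding speaks.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace, the version used by basicHttpBinding.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def parse_soap_response(body: str) -> dict:
    """Inspect the envelope itself; the HTTP status is deliberately ignored.

    Hypothetical helper: returns {"ok": True, "body": Element} on success,
    or {"ok": False, "reason": str} when the envelope is broken or a Fault.
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError as exc:
        return {"ok": False, "reason": f"unparseable envelope: {exc}"}

    soap_body = root.find(f"{{{SOAP_NS}}}Body")
    if soap_body is None:
        return {"ok": False, "reason": "no SOAP Body element"}

    fault = soap_body.find(f"{{{SOAP_NS}}}Fault")
    if fault is not None:
        # In SOAP 1.1 the faultcode and faultstring children are unqualified.
        code = fault.findtext("faultcode", default="unknown")
        text = fault.findtext("faultstring", default="")
        return {"ok": False, "reason": f"{code}: {text}"}

    return {"ok": True, "body": soap_body}
```

Parsing the XML rather than searching for a substring also catches truncated or malformed responses, which a string match would wave through as success.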
Dutch and Papiamentu without a Babel fish
Language detection on the first 1.2 seconds of audio is unreliable, especially with code-switching. Polishouders move between Dutch and Papiamentu inside a single sentence, and the model that handles that switch cleanly does not yet exist for our budget.
We did three things.
We let the polishouder choose before the agent picks up: "press 1 voor Nederlands, 2 pa Papiamentu." About 38% press 2. Of those, around 19% switch back to Dutch within the first thirty seconds. That is fine. The agent runs a second-pass language detector every 800 ms and switches the ASR pipeline when confidence stays above 0.84 for three consecutive turns.
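The switching rule is a small piece of hysteresis, and it can be sketched as follows. The class name, the language codes `nl` and `pap`, and the choice to reset the streak on any weak or contradicting detection are assumptions for illustration; only the 0.84 threshold and the three-in-a-row requirement come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LanguageSwitcher:
    """Switch the ASR language only after sustained high-confidence detections."""

    active: str = "nl"
    threshold: float = 0.84
    required_streak: int = 3
    _candidate: Optional[str] = None
    _streak: int = 0

    def observe(self, detected: str, confidence: float) -> str:
        """Feed one detector result; return the language the ASR should use."""
        if detected == self.active or confidence <= self.threshold:
            # Agreement with the current language, or a weak signal: start over.
            self._candidate, self._streak = None, 0
            return self.active
        if detected == self._candidate:
            self._streak += 1
        else:
            self._candidate, self._streak = detected, 1
        if self._streak >= self.required_streak:
            # Enough consecutive confident detections: commit the switch.
            self.active = detected
            self._candidate, self._streak = None, 0
        return self.active
```

The point of the streak is that one confident Dutch word inside a Papiamentu sentence does not flip the pipeline, which matters when callers code-switch mid-sentence.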
We isolated BSN capture. The agent never tries to capture a BSN from free speech. It asks the polishouder to enter the BSN on the keypad, reads it back digit by digit in their chosen language, and waits for ja or sí before continuing. DTMF beats ASR on a nine-digit string, every time, in any language.
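Keypad entry also makes it cheap to sanity-check the number before reading it back. The sketch below applies the standard elfproef (the weighted mod-11 check that valid BSNs satisfy). Whether the production agent runs this check is not stated here, so treat it as an optional extra; both function names are hypothetical.

```python
# Elfproef weights for a 9-digit BSN: the last digit is weighted -1.
BSN_WEIGHTS = (9, 8, 7, 6, 5, 4, 3, 2, -1)


def is_plausible_bsn(digits: str) -> bool:
    """Return True when the DTMF input is 9 ASCII digits that pass the elfproef."""
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        return False
    total = sum(weight * int(d) for weight, d in zip(BSN_WEIGHTS, digits))
    # All zeros would trivially pass the modulus, so reject it explicitly.
    return total != 0 and total % 11 == 0


def spell_out(digits: str) -> str:
    """Space the digits so the TTS reads them one by one instead of as a number."""
    return " ".join(digits)
```

A failed check is a reason to ask the caller to key the number in again before any lookup is attempted. It does not prove the BSN belongs to anyone, only that it is not an obvious typo.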
We logged every Papiamentu turn for human review during week one. The acceptant team flagged 41 mistranscriptions in the first 600 Papiamentu calls. We used those 41 as a finetune set for a small intent classifier that sits between Whisper and the LLM. Intent accuracy on Papiamentu went from 84% to 96% over two weeks.
The routing rules
The agent does not make policy decisions. It moves calls into the right queue, and it does the read-only work that does not need a human.
The triage queue handles polis-wijzigingen the agent can complete end to end: address changes, IBAN changes, adding a known polishouder to an existing aanvullend pakket within the standard window. The agent commits these through the WCF facade and reads back the confirmation. About 47% of inbound calls land here.
The acceptant queue handles every aanvullende-pakket-vraag that needs underwriting. The agent collects the polishouder's intent, the gewenste pakket, and the gezondheidsverklaring answers when the polishouder is willing to give them by phone, then drops a card into a kanban board the acceptanten work from. Average time from call-end to acceptant first touch dropped from 19 hours to 41 minutes.
The code-rood queue handles overlijdensmeldingen. The classification is deliberately generous. Any mention of overleden, gestorven, weggegaan, or the Papiamentu morto pa muri triggers it, plus any sentence where the caller identifies themselves as nabestaande or echtgenoot of an existing polishouder. False positives go to a senior medewerker who closes them in under a minute. False negatives would be catastrophic.
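A deliberately generous trigger like this can be approximated with plain keyword matching. The sketch below is a simplification under stated assumptions: it treats the trigger words named above as single tokens, it flags the relationship words on their own rather than checking that the caller is describing themselves, and the function name is hypothetical. The production classifier is not shown in this article.

```python
import re

# Trigger words taken from the routing rule; treating the Papiamentu terms
# as separate tokens is an assumption.
DEATH_TERMS = {"overleden", "gestorven", "weggegaan", "morto", "muri"}
RELATION_TERMS = {"nabestaande", "echtgenoot"}


def is_code_rood(transcript: str) -> bool:
    """Return True on any trigger word; false positives are accepted by design."""
    tokens = set(re.findall(r"\w+", transcript.lower()))
    return bool(tokens & (DEATH_TERMS | RELATION_TERMS))
```

Erring toward recall is the whole design: a false positive costs a senior medewerker under a minute, while a missed overlijdensmelding has no cheap recovery.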
Sixty seconds for code-rood
The SLA the directeur asked for was simple. From the moment the agent classifies a call as code-rood, a senior medewerker is on the line within 60 seconds.
We did not hit that with a webhook. Webhooks fan out across a CRM and a notification service and a router, and 60 seconds is not enough budget for a chain of HTTP calls when any one of them can hang.
What works: a dedicated WebSocket channel that the seniors keep open on their workstation, with a small Mac menu bar app that drops everything to full-screen on a code-rood. The agent stays on the line, plays a short pre-recorded message in the chosen language ("een collega komt zo voor u op de lijn"), and warm-transfers when the senior picks up. Median handoff time across the first 90 days: 23 seconds. Worst case in 90 days: 51 seconds.
# escalation.py
# When the classifier returns code_rood, we do not POST. We push.
async def on_intent(intent: Intent, call: Call):
    if intent.label == "code_rood":
        await audio.play(call, prompts.warm_hold[call.lang])
        seniors = ws_registry.online("senior")
        if not seniors:
            # Fallback: kantoor general line, never a voicemail.
            return await call.transfer(KANTOOR_LINE)
        winner = await race_first_accept(seniors, timeout_s=20)
        if not winner:
            return await call.transfer(KANTOOR_LINE)
        await call.warm_transfer(winner.sip_uri)
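The excerpt does not show `race_first_accept`, so here is one way such a helper could be written with `asyncio`. It is a sketch under assumptions: the `offer` callback (which would push the prompt over a senior's WebSocket and resolve to `True` on accept) is hypothetical, and passing it as a parameter differs from the two-argument call in the excerpt above.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence, TypeVar

S = TypeVar("S")


async def race_first_accept(
    seniors: Sequence[S],
    offer: Callable[[S], Awaitable[bool]],
    timeout_s: float,
) -> Optional[S]:
    """Offer the call to every senior at once; return the first who accepts.

    Returns None when everyone declines or nobody answers within timeout_s.
    """
    tasks = {asyncio.ensure_future(offer(senior)): senior for senior in seniors}
    pending = set(tasks)
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_s
    winner: Optional[S] = None
    try:
        while pending and winner is None:
            remaining = deadline - loop.time()
            if remaining <= 0:
                break
            done, pending = await asyncio.wait(
                pending, timeout=remaining, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                # A crashed or declined offer is skipped, not treated as fatal.
                accepted = task.exception() is None and task.result()
                if accepted and winner is None:
                    winner = tasks[task]
    finally:
        # Withdraw the prompt from everyone who has not answered yet.
        for task in pending:
            task.cancel()
    return winner
```

Sharing one deadline across the whole race, rather than giving each offer its own timeout, keeps the worst case bounded at `timeout_s` no matter how many seniors are online.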
What we logged and what we did not
AVG forced a few decisions we would have made anyway. Full audio: 7 days, encrypted at rest with a key the broker rotates. Transcripts with structured PII redaction: 90 days. BSN, IBAN, and any string matching a medical condition vocabulary get replaced with a token at write time. The acceptant team works from the redacted transcripts and can request a one-time unmask through a four-eyes approval flow.
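Redaction at write time can be sketched with a few regular expressions. Everything specific in this example is an assumption: the token names, the patterns (a bare nine-digit run for BSN, Dutch IBANs only), and the idea of passing the medical vocabulary in as a plain set of terms. A production redactor would need a broader IBAN pattern and a curated vocabulary.

```python
import re
from typing import Callable, Iterable

# Assumed patterns: any standalone 9-digit run, and NL IBANs with optional spaces.
BSN_RE = re.compile(r"\b\d{9}\b")
IBAN_RE = re.compile(r"\bNL\d{2}\s?[A-Z]{4}(?:\s?\d){10}\b", re.IGNORECASE)


def build_redactor(medical_terms: Iterable[str]) -> Callable[[str], str]:
    """Return a function that replaces BSN, IBAN and medical terms with tokens."""
    # Longest terms first so a multi-word condition wins over a shorter prefix.
    terms = sorted({t.lower() for t in medical_terms}, key=len, reverse=True)
    medical_re = (
        re.compile(r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b", re.IGNORECASE)
        if terms
        else None
    )

    def redact(text: str) -> str:
        text = IBAN_RE.sub("[IBAN]", text)
        text = BSN_RE.sub("[BSN]", text)
        if medical_re is not None:
            text = medical_re.sub("[MEDISCH]", text)
        return text

    return redact
```

Running the redactor before the transcript is written means the 90-day store never holds the raw values, so an unmask request has to go back to a separate, access-controlled source.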
We do not log the polishouder's tone, sentiment, or emotional state. The model can route on it inside the call, but it is never persisted. The compliance officer cared about this, and so did the directeur. We cared too, partly on principle and partly because nobody wants to be the studio whose voice agent told an AVG auditor that Maria sounded anxious last Tuesday.
What surprised us
The 14-year-old .NET 4.5 backend was not the problem we expected. The WCF facade was stable, idempotent, and well-named, because the original developer cared. The medewerker UI was the problem. It was the only path for certain writes, and the seniors had keyboard shortcuts the agent could not replicate without scraping the DOM of a thick client. We left those flows alone and routed them to humans.
The other surprise: the polishouders preferred the agent for the boring calls. We expected resistance and got NPS gains. The hypothesis the directeur offered: people do not want to feel they are wasting a human's time on a five-minute change.
The work is in the seams
When we built this voice agent for the Rotterdam zorgbroker, the hard part was not the AI. It was the WCF facade returning 200 OK on every fault, the Papiamentu BSN capture, and the 60-second SLA on code-rood. We solved those by parsing every SOAP envelope before trusting it, falling back to DTMF for any nine-digit string, and replacing the webhook chain with a single persistent socket.
The smallest thing you can do today: open one week of your call recordings, set a 90-minute timer, and tag each call with the smallest piece of information the agent would need to complete it. If 40% of your calls need fewer than four data points, you have a voice agent project. If they need twelve, you have a CRM project first.
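Once the calls are tagged, the decision rule is one line of arithmetic. A minimal sketch, assuming each call has been reduced to a count of required data points; the function name is hypothetical.

```python
from typing import Sequence


def share_below(data_points_per_call: Sequence[int], limit: int = 4) -> float:
    """Fraction of calls that need fewer than `limit` data points."""
    if not data_points_per_call:
        return 0.0
    simple = sum(1 for n in data_points_per_call if n < limit)
    return simple / len(data_points_per_call)
```

If the result for your tagged week comes out at 0.4 or higher, the calls are simple enough to be worth scoping an agent around.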
Key takeaway
The work in a voice agent is rarely the language model. It is the integration seams, the routing rules, and the escalation path the SLA actually lives on.
FAQ
Why not replace the .NET 4.5 polisadministratie first?
Because a working voice agent against a stable WCF facade ships in weeks. A platform migration ships in quarters and carries data-integrity risk the broker did not want to absorb at the same time as a new channel.
Does Whisper actually handle Papiamentu?
Well enough for intent classification and free-form conversation, not well enough for nine-digit strings like a BSN. We use DTMF keypad entry for any number capture, in both Dutch and Papiamentu calls.
How do you measure the 60-second code-rood SLA?
From the timestamp of the intent classification event in the agent log to the timestamp of the senior medewerker pressing accept on the WebSocket prompt. Both are stamped server-side on the same Hetzner box.
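A sketch of that measurement, assuming the two timestamps have already been pulled from the agent log and the WebSocket accept event as `datetime` pairs. The function names and the report fields are illustrative, not the production reporting code.

```python
from datetime import datetime
from statistics import median
from typing import Dict, Iterable, Tuple


def handoff_seconds(classified_at: datetime, accepted_at: datetime) -> float:
    """Seconds between the code-rood classification and the senior's accept."""
    return (accepted_at - classified_at).total_seconds()


def sla_report(
    pairs: Iterable[Tuple[datetime, datetime]], sla_s: float = 60.0
) -> Dict[str, float]:
    """Summarise handoffs as median, worst case, and number of SLA breaches."""
    durations = [handoff_seconds(start, end) for start, end in pairs]
    if not durations:
        return {"median_s": 0.0, "worst_s": 0.0, "breaches": 0}
    return {
        "median_s": median(durations),
        "worst_s": max(durations),
        "breaches": sum(1 for d in durations if d > sla_s),
    }
```

Because both timestamps come from the same machine, no clock-skew correction is needed before subtracting them.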
What happens if no senior is online when a code-rood comes in?
The agent transfers to the kantoor general line, never to voicemail. The fallback path is exercised about twice a month, which is more than zero but not enough to redesign around.
How much human review did week one need?
Two acceptanten spent about 90 minutes a day reviewing flagged transcripts during week one. By week three that dropped to roughly 20 minutes a day, mostly on Papiamentu edge cases.