Voice agent case study: dental recalls in Dutch and Arabic
A Den Bosch dental group needed to recall 1,820 patients a week in Dutch and Arabic without exposing a single BSN to any model. Here is the architecture we shipped.

The recall list nobody wanted
It was a Friday in February. The practice manager at a four-clinic dental group in Den Bosch printed her weekly recall list at 16:00, walked it to the front desk, and watched the receptionists' faces. The list was 1,800 patients long. Cleaning recalls, six-month checkups, orthodontic follow-ups, denture fittings. Each one a phone call. Each call thirty seconds of conversation and two minutes of typing it into Exquise. She had budget for a voice agent. She did not yet have a way to point one at her patient database without risking a BSN leak.
She had already done the math. At two and a half minutes per call, the list ate forty hours a week between the four front desks. Half the calls went to voicemail. A quarter of the patients had moved, switched insurance, or did not speak conversational Dutch at home. The receptionists hated it. The dentists complained that the chair sat empty for ninety minutes on a Tuesday because nobody had got through to confirm.
That is the project we picked up.
Exquise, ODBC, and a 16-year-old schema
The first call with the IT lead was twenty minutes of context. The patient-record system is Exquise, a Dutch dental practice management product that has been around since the late nineties. Their installation went live in 2009 on a Windows Server with a SQL Server backend. Sixteen years of patient history, appointment notes, treatment plans, insurance claims. The maintenance contract with the vendor is active, but the API surface is limited and the schema is, frankly, organic.
We did the only sensible thing. We did not ask for a new API. We asked for a read-only ODBC connection.
ODBC is older than most of the people on this project. It is also boring, well-understood, and supported by every database that matters. With a read-only DSN pointed at the Exquise SQL Server, we could query for the recall list, the patient's preferred language, their phone number, their appointment history, and their consent flags. We could not modify a single row. The DBA could revoke the account in one statement.
Read-only was non-negotiable for two reasons. The vendor's support contract is voided if a third party writes to the database. And the practice manager wanted a kill switch that did not depend on us being awake.
Here is what the DSN entry ended up looking like, with the credentials stripped.
    [exquise_recall_ro]
    Driver = ODBC Driver 18 for SQL Server
    Server = tcp:exq-sql.internal,1433
    Database = ExquiseProd
    UID = svc_recall_ro
    Encrypt = yes
    TrustServerCertificate = no
    ApplicationIntent = ReadOnly

The ApplicationIntent=ReadOnly setting was the bit that made the DBA relax. It pointed our sessions at a read-only replica in the SQL Server Always On availability group, so we were not even on the primary node.
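For a worker that cannot rely on a named DSN, the same settings can be carried in a connection string. The sketch below is a minimal illustration, not the code that runs at the practice: the helper name build_connection_string is ours, and the password is supplied at call time because the credentials are stripped from the listing above.

```python
# Hypothetical helper: expresses the DSN settings above as an ODBC
# connection string. Nothing here opens a connection.

DSN_SETTINGS = {
    "Driver": "{ODBC Driver 18 for SQL Server}",
    "Server": "tcp:exq-sql.internal,1433",
    "Database": "ExquiseProd",
    "UID": "svc_recall_ro",
    "Encrypt": "yes",
    "TrustServerCertificate": "no",
    "ApplicationIntent": "ReadOnly",
}


def build_connection_string(settings, password):
    """Join key=value pairs with semicolons, as ODBC expects."""
    parts = [f"{key}={value}" for key, value in settings.items()]
    # Wrap the password in braces so semicolons inside it are safe;
    # a literal closing brace is escaped by doubling it.
    parts.append("PWD={" + password.replace("}", "}}") + "}")
    return ";".join(parts) + ";"
```

The resulting string would be handed to whatever ODBC binding the query worker uses. Read-only behaviour still depends on the account's permissions and the replica it lands on, not on the string alone.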
The redaction layer that sees the BSN so the model never does
This is the part that took the longest in design and the shortest in code.
In the Netherlands, every patient record carries a BSN, the Burgerservicenummer. It is the single identifier that ties a person to their health insurance, their tax records, and their municipal registration. Sharing it casually is not just unprofessional. The Autoriteit Persoonsgegevens treats avoidable BSN exposure as a reportable data breach under the AVG.
The brief was simple. The voice agent must never see a BSN, never log one, never embed one in a prompt, never ship one to any model provider. Not for performance reasons, not for legal coverage. For the patient.
The architecture we landed on has three layers.
- A query worker on the dental group's own infrastructure that runs the read-only ODBC query and produces a JSON record per patient.
- A redaction proxy that strips the BSN out of the record before it crosses the building's firewall, replacing it with a short opaque token we generate per call.
- The voice agent, hosted on our side, which only ever sees the opaque token, the patient's first name, the appointment slot on offer, and a language hint.
When the patient confirms a slot, the agent sends back a record keyed by the opaque token. The redaction proxy resolves the token back to the BSN locally and writes the confirmation into the booking system through a separate, narrowly scoped write path (more on that below). The model provider never sees the resolved value.
    # redactor, runs on-prem
    import secrets

    def redact(patient_record):
        """Swap the BSN for an opaque per-call token before the record leaves the building."""
        token = secrets.token_urlsafe(12)
        # BSN_VAULT is the on-prem vault; the token-to-BSN mapping stays on this box.
        # The 24-hour TTL on each token is not shown here.
        BSN_VAULT[token] = patient_record["bsn"]
        return {
            "patient_token": token,
            "first_name": patient_record["voornaam"],
            "language_hint": patient_record["taal"] or "nl",
            "last_visit": patient_record["laatste_bezoek"],
            "recall_reason": patient_record["recall_type"],
            "offered_slots": next_open_slots(patient_record["voorkeur_arts"]),
        }

That BSN_VAULT is a Redis instance on the on-prem box with a 24-hour TTL on every token. After 24 hours the patient is recalled fresh or not at all. The vault has no outbound network route.
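To make the expiry behaviour concrete, here is a minimal sketch of a token vault with a 24-hour TTL. It uses an in-memory dictionary as a stand-in for the Redis instance, and the class name InMemoryVault and its methods are hypothetical, chosen for illustration only.

```python
import secrets
import time

TOKEN_TTL_SECONDS = 24 * 60 * 60  # the 24-hour TTL described above


class InMemoryVault:
    """Illustrative stand-in for the on-prem vault (assumption, not the real one)."""

    def __init__(self, clock=time.time):
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # token -> (bsn, expires_at)

    def put(self, bsn, ttl=TOKEN_TTL_SECONDS):
        """Store a BSN and return the opaque token that stands in for it."""
        token = secrets.token_urlsafe(12)
        self._entries[token] = (bsn, self._clock() + ttl)
        return token

    def resolve(self, token):
        """Return the BSN for a live token, or None if unknown or expired."""
        entry = self._entries.get(token)
        if entry is None:
            return None
        bsn, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[token]  # expired: forget the mapping
            return None
        return bsn
```

With Redis the same effect would come from storing each key with an expiry, so the vault forgets tokens on its own rather than relying on a cleanup job.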
If a model never sees an identifier, no provider policy change can ever expose it. Redaction at the boundary beats trust at the contract layer.
We did not invent this approach. It is roughly the same shape as a payments tokenization vault. What is new is the urgency. Model providers change retention policies on their own timetable, not yours. A practice manager who reads a Tuesday-morning news story about a thirty-day log retention window does not want to spend Wednesday explaining it to her DPO. Even if you trust the provider, you do not get to make that decision unilaterally for 14,000 patients. Better to make sure the question never comes up.
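A boundary like this is easier to trust when something checks it. The following is a sketch of an egress guard, offered as an assumption about how one could be built rather than a description of what we shipped: it scans an outgoing payload for nine-digit runs that pass the BSN elfproef (the 11-test) and refuses to send if it finds one. The function names are hypothetical.

```python
import re

_NINE_DIGITS = re.compile(r"(?<![0-9])[0-9]{9}(?![0-9])")
_WEIGHTS = (9, 8, 7, 6, 5, 4, 3, 2, -1)  # BSN variant of the 11-test


def passes_elfproef(candidate):
    """True if a 9-digit ASCII string satisfies the BSN 11-test."""
    if len(candidate) != 9 or not (candidate.isascii() and candidate.isdigit()):
        return False
    total = sum(int(d) * w for d, w in zip(candidate, _WEIGHTS))
    return total % 11 == 0


def find_bsn_like(value):
    """Recursively collect BSN-shaped strings from a JSON-like payload."""
    if isinstance(value, dict):
        return [hit for v in value.values() for hit in find_bsn_like(v)]
    if isinstance(value, (list, tuple)):
        return [hit for v in value for hit in find_bsn_like(v)]
    return [m for m in _NINE_DIGITS.findall(str(value)) if passes_elfproef(m)]


def assert_safe_to_send(payload):
    """Raise if the payload contains anything that looks like a BSN."""
    hits = find_bsn_like(payload)
    if hits:
        # Report only the count, never the value itself.
        raise ValueError(f"payload blocked: {len(hits)} BSN-like value(s)")
    return payload
```

The check is deliberately conservative. It will flag some nine-digit numbers that are not BSNs, and it will miss a BSN written with spaces or dots, so it complements the redaction step and does not replace it.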
Dutch first, Arabic second, code-switching in between
About 18% of the patient base in this practice speaks Arabic at home, mostly Levantine and Moroccan dialects. The receptionists had long since given up on a clean script. They would open in Dutch, get a hesitant response, and switch. Sometimes mid-sentence. Sometimes the patient's adult child would take the phone.
We could not pretend that was a clean two-language problem.
What we shipped is a voice agent that opens in the patient's recorded language preference, listens to the first three seconds of response, and re-routes to the other language stack if the spoken language does not match. The detection is intentionally cheap and intentionally biased toward Dutch. False positives toward Arabic were worse for trust than false positives toward Dutch.
The mechanism is a small voice-activity gate followed by a single short audio clip sent to a multilingual classifier. The classifier returns a Dutch-or-Arabic label and a confidence score. Below 0.7 confidence the agent stays on Dutch and waits for a second turn. The first version of this layer had four hand-rolled heuristics chained behind the classifier and was less accurate than the classifier alone. We deleted them.
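The routing decision itself fits in a few lines. The sketch below assumes the classifier hands back a language label and a confidence score, and the names Routing and route_language are hypothetical. One behaviour is our assumption and not spelled out above: when a call opened in Arabic and the classifier is unsure, the sketch holds the current language instead of forcing Dutch.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.7   # threshold from the text
SUPPORTED = ("nl", "ar")


@dataclass(frozen=True)
class Routing:
    language: str            # language stack to use for the next turn
    needs_second_turn: bool  # True when the classifier was not confident


def route_language(current, detected, confidence, threshold=CONFIDENCE_THRESHOLD):
    """Switch stacks only on a confident, supported label."""
    if detected not in SUPPORTED or confidence < threshold:
        # Not confident enough to switch: hold the current stack and listen again.
        return Routing(language=current, needs_second_turn=True)
    return Routing(language=detected, needs_second_turn=False)
```

For a call that opened in Dutch, route_language("nl", "ar", 0.65) keeps the agent on Dutch and asks for a second turn, which matches the low-confidence rule described above.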
For Arabic, the voice model handles Modern Standard Arabic gracefully and stumbles politely on dialect. We did not try to fix that. The fallback is honest. The agent says, in Dutch and then in Arabic, that it is an automated assistant and offers to schedule a callback from a human receptionist if the patient prefers. Roughly 4% of Arabic-language calls take that exit and we consider that a feature.
A quiet design choice: the agent never uses the patient's last name. Pronunciation is hard, mispronunciation is rude, and the first name plus a confirmation question, together with the date of birth the patient supplies, is enough to verify identity.
Writing confirmations back without write access
The vendor contract forbids third-party writes. The practice still needs confirmed appointments to land in Exquise without a human retyping anything. These two facts had to coexist.
The answer was a small on-prem service we wrote, owned by the dental group's IT lead, which exposes a single HTTP endpoint, POST /confirm. It accepts a payload signed by the redaction proxy. It validates the signature, resolves the token back to a BSN locally, and calls the official Exquise booking module the same way the receptionists' desktop client does, through the vendor's documented integration surface. We are not writing to the database. We are driving the supported booking flow with the patient's resolved record.
If the booking call fails, the service writes a row to a local SQLite queue and fires a webhook into the practice manager's Teams channel. Failures are rare but loud.
    POST /confirm HTTP/1.1
    Content-Type: application/json
    X-Signature: ed25519:...

    {
      "patient_token": "Hk2v7Tg5pQ8x",
      "slot_id": "2026-06-18T10:20+02:00:arts_3",
      "language_used": "ar",
      "agent_run_id": "vra_01HZJ7Q5",
      "confirmed_at": "2026-06-12T11:42:18+02:00"
    }

The agent_run_id is the only identifier we keep on our side for a call, and it is keyed only by the opaque token. If a patient complaint arrives a week later, the practice manager hands us the run id and we can pull the call audio (stored on our side, encrypted, 30-day retention) without ever having to map it back to a BSN ourselves. The audit chain on the practice side is the inverse: their booking log holds the BSN and the run id, our log holds the run id and the audio, and the two only ever meet inside the redaction proxy under signed lookup.
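The order of operations in the confirm service matters more than any single line of it: check the signature first, book second, and park failures locally. The sketch below illustrates that order under stated assumptions. The real endpoint uses an Ed25519 signature, which the Python standard library does not provide, so the sketch swaps in HMAC-SHA256 purely as a stand-in. The function names, the table layout, and the book_slot callback are hypothetical, and the Teams webhook is left out.

```python
import hashlib
import hmac
import json
import sqlite3


def open_queue(path=":memory:"):
    """Open the local failure queue (in-memory by default for the sketch)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS failed_confirmations ("
        "id INTEGER PRIMARY KEY, agent_run_id TEXT, body TEXT, error TEXT)"
    )
    return conn


def signature_is_valid(body, signature_hex, key):
    """Stand-in check: HMAC-SHA256 over the raw request body."""
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


def handle_confirm(body, signature_hex, key, book_slot, queue):
    """Return an HTTP-style status code for one POST /confirm request."""
    if not signature_is_valid(body, signature_hex, key):
        return 401  # reject before touching the token or the vault
    payload = json.loads(body)
    try:
        book_slot(payload["patient_token"], payload["slot_id"])
    except Exception as exc:
        # Booking failed: park the request for a human. Store only the
        # exception type so no resolved identifier can leak into the queue.
        with queue:
            queue.execute(
                "INSERT INTO failed_confirmations (agent_run_id, body, error) "
                "VALUES (?, ?, ?)",
                (payload.get("agent_run_id"), body.decode("utf-8"), type(exc).__name__),
            )
        return 502
    return 200
```

Here book_slot stands for the step that resolves the token locally and drives the vendor's booking flow. Because the queued body carries only the opaque token, the failure queue holds no BSN either.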
What 1,820 weekly callbacks look like in practice
The agent has been live since late January. The cadence settled around 1,820 outbound calls per week across four clinics. About 71% connect on the first attempt. Of those, 58% confirm a slot in the same call. The rest either reschedule, request a callback from a human, or politely cancel the recall. The agent gives up after three attempts across two days and writes the patient back to the human queue. Nobody gets called eleven times.
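Those percentages are easier to reason about as head counts. The short calculation below applies the two rates to the weekly call volume. It covers first-attempt connects only, so confirmations that arrive on a second or third attempt are not included.

```python
WEEKLY_CALLS = 1820
CONNECT_RATE = 0.71   # connect on the first attempt
CONFIRM_RATE = 0.58   # share of connected calls that confirm a slot


def weekly_funnel(calls=WEEKLY_CALLS, connect=CONNECT_RATE, confirm=CONFIRM_RATE):
    """Return (connected calls, confirmed slots), rounded to whole calls."""
    connected = calls * connect
    confirmed = connected * confirm
    return round(connected), round(confirmed)


print(weekly_funnel())  # (1292, 749)
```

So roughly 1,290 patients pick up on the first attempt in a typical week, and about 750 of them leave the call with a confirmed slot.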
The front desks now spend their recall hours on the exception queue. The exception queue is the only place a human voice is still needed: complaints, complex rescheduling, patients who specifically asked for a person, and the 4% Arabic-language fallback. The practice manager told us the desks feel quieter even though the throughput is up.
The numbers we watch weekly:
- Connect rate, currently 71%, was 64% before we tuned the dial window to skip 12:00 to 13:30.
- First-call confirmation rate, 58%.
- Slots filled per chair per week, up 14% versus the same month last year, though some of that is seasonal.
- Complaints per thousand calls, three, all human-resolvable, none about identity exposure.
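The attempt cap and the lunchtime dial window can be expressed as one small policy function. This is a sketch under assumptions: the name may_dial is hypothetical, timestamps are naive local datetimes, and ordinary business-hours limits are left out because they are not described here.

```python
from datetime import datetime, time, timedelta

MAX_ATTEMPTS = 3                    # give up after three attempts
ATTEMPT_WINDOW = timedelta(days=2)  # spread across two days
QUIET_START = time(12, 0)           # skip 12:00 to 13:30
QUIET_END = time(13, 30)


def may_dial(now, previous_attempts):
    """Decide whether the agent may call this patient at `now`.

    previous_attempts is a list of datetimes of earlier calls.
    """
    if len(previous_attempts) >= MAX_ATTEMPTS:
        return False  # hand the patient back to the human queue
    if previous_attempts and now - min(previous_attempts) > ATTEMPT_WINDOW:
        return False  # the two-day window has closed
    if QUIET_START <= now.time() < QUIET_END:
        return False  # inside the lunchtime window
    return True
```

A scheduler would call may_dial before each outbound attempt and move the patient to the exception queue once it can no longer return True for that patient.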
We do not publish the cost per call externally. Internally it is roughly an order of magnitude under the loaded cost of a receptionist minute, which is what we needed for the business case.
What we would build differently next time
Two things.
The first is the language-detection stack. We over-engineered it in the first sprint and had to rip half of it out. A cheap voice activity detector plus a short clip sent to a multilingual classifier was more accurate than the chained heuristics we tried to build. Next time we start cheap, measure for a week, and only add layers when the confusion matrix demands it.
The second is the consent surface. We added an opt-out by SMS late in the project because two patients asked for it. It should have been in the first sprint. Voice agents at this scale need a one-line escape hatch from minute one, not after the first complaint. The SMS line now also doubles as a language-preference correction channel, which we did not anticipate and which removed about a third of the language-mismatch incidents in week two.
If your voice agent has no SMS or web opt-out from the first call, you will rebuild that surface under pressure later. Build it on day one.
The boring boundary is the interesting part
The interesting work in voice agents this year is not the voice model. It is the boring boundary between the model and the systems that already run the business. Read-only ODBC, an opaque token vault, a signed write endpoint into a documented integration surface. None of it is novel. All of it is what makes a 16-year-old patient-record system safe to point a voice agent at.
The pattern travels. Replace Exquise with a sixteen-year-old logistics planner, a tile-trade ERP, or a Drupal 7 membership database, and the same three pieces still hold. A read-only door into the system of record. A boundary that strips the identifier the model has no business seeing. A signed, narrow write path back through whatever interface the vendor will still support in writing. The voice model in the middle becomes the easy part.
When we built this voice agent for the Den Bosch group, the hardest constraint was the vendor's contract clause forbidding third-party writes to the database. We solved it by driving the vendor's own booking flow from an on-prem service the practice owns, so the database integrity guarantees stay with the people who signed for them.
If you have a recall list, a legacy system, and a BSN-shaped problem, the smallest thing you can do today is open your database manual and check whether your DBA can hand you a read-only ODBC user. Everything else is downstream of that.
Key takeaway
If the model never sees the identifier, no provider policy change can ever expose it. Redact at the boundary, do not trust at the contract.
FAQ
Why ODBC instead of a modern REST API?
The vendor's API surface for this 2009 install is limited, and a read-only ODBC DSN was the fastest path to safe, revocable access without voiding the support contract.
How do you keep the BSN away from the model provider?
An on-prem redaction proxy strips the BSN before any record crosses the firewall and swaps it for a short opaque token. The token resolves back to the BSN locally, never in the model.
What happens if a patient prefers a human?
The agent offers a human-callback path in Dutch and Arabic. About 4% of Arabic-language calls take that exit, and they land in the same exception queue as complaints and complex reschedules.
Can this work for non-dental practices?
Yes. The shape (read-only ODBC, on-prem token vault, signed write endpoint into a documented booking flow) transfers to any legacy system with patient or customer identifiers that should never leave the building.