Voice agents for dental clinics: an Eindhoven case study

A 22-chair dental group in Eindhoven replaced its Monday no-show backlog with a Dutch voice agent that books, reschedules, and hands off in 90 seconds.

Jacob Molkenboer · Founder · A Brand New Company · 6 Jun 2026 · 8 min

The first call of the week at the dental group we will call Tandheelkundig Centrum Strijp (the practice name is a pseudonym; the numbers are not) lands at 07:51. By 08:14, three of the five reception staff are mid-conversation, two lines are on hold, and one more call has rolled to voicemail. The practice opens at 08:00. A patient is already at the desk. The hygienist on chair four is buzzing reception to ask whether her 08:30 confirmed last night or not.

This is not a story about understaffing. The reception rota is fully staffed, on time, and competent. It is a story about what reception staff are forced to do at the same time, every Monday, while a queue of patients watches them do it.

The math of a five-person rota

The group runs twelve chairs in Eindhoven and ten in Veldhoven, with one shared agenda across both buildings. Reception covers both on a single phone system. Before we got involved, the practice was logging roughly 1,100 inbound calls in an average week. About 60% of those were short and procedural: confirming an appointment, moving an appointment by a day, asking what time a child's check-up was, asking whether the practice accepted a specific insurer. The remaining 40% needed a person on the other end: triage questions, price questions about implants, complaints, billing.

The Monday peak was brutal. Patients who had thought about their teeth over the weekend all called between 08:00 and 10:00. Reception spent the first two hours of the week stacking the agenda, then spent the next two hours unstacking it because the dentists' actual capacity that week did not match what reception had promised.

The no-show rate sat between 7% and 9% depending on the month. That is not unusual for Dutch dental practices. The KNMT has been publishing guidance on no-show fees and reminders for years, and most practices have adopted SMS reminders. SMS reminders help. They do not solve the problem, because a patient who gets a reminder at 18:00 the day before still has to call the practice in the morning to reschedule, and by then reception is full.

What we actually built

We deployed a Dutch-language voice agent that answers every inbound call on the practice's main number. It greets the caller, asks why they are calling in plain Dutch, and routes from there. For the four highest-volume intents (confirm, reschedule, cancel, ask about opening hours) it handles the call end-to-end and writes back to the practice management system. For everything else, it gathers the caller's name, the patient's date of birth, and a one-sentence summary of the reason for calling, then hands off to a human within 90 seconds.

The telephony layer is a SIP trunk pointed at a Twilio Programmable Voice endpoint. We use Twilio's Media Streams to pipe raw audio to a server we run in eu-central-1. The speech-to-text is a Dutch-tuned Whisper variant running on a GPU we keep warm. The intent layer and dialogue manager are a small custom state machine, not an LLM in the call path. We learned the hard way that you do not want a 1.2-second model round-trip sitting between "ik wil een afspraak verzetten" ("I want to move an appointment") and "naar welke dag?" ("to which day?").
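To make the "state machine, not an LLM" point concrete, here is a minimal sketch of one reschedule flow. It is an illustration only, not the production dialogue manager: the state names, the `step` function, and the keyword matching are all assumptions made for this example.

```python
from enum import Enum, auto
from typing import Tuple


class State(Enum):
    ASK_REASON = auto()
    ASK_NEW_DAY = auto()
    CONFIRM = auto()
    HANDOFF = auto()


WEEKDAYS = {"maandag", "dinsdag", "woensdag", "donderdag", "vrijdag"}
HANDOFF_LINE = "Ik verbind je door met een medewerker."


def step(state: State, utterance: str) -> Tuple[State, str]:
    """Return the next state and the agent's reply for one caller turn."""
    words = set(utterance.lower().replace("?", "").replace(".", "").split())
    if state is State.ASK_REASON:
        if "verzetten" in words:
            return State.ASK_NEW_DAY, "Naar welke dag?"
        return State.HANDOFF, HANDOFF_LINE
    if state is State.ASK_NEW_DAY:
        # Pick the first weekday mentioned (alphabetical, for determinism).
        day = next(iter(sorted(words & WEEKDAYS)), None)
        if day:
            return State.CONFIRM, f"Ik zet je afspraak op {day}. Klopt dat?"
        return State.HANDOFF, HANDOFF_LINE
    # Any state without a transition falls through to a human.
    return State.HANDOFF, HANDOFF_LINE
```

Every transition is a dictionary lookup or a set intersection, so the reply is ready as soon as the transcript is. That is the latency argument in miniature.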

The text-to-speech is a Dutch voice we cloned with the practice manager's permission, from her own voice, with a written consent form filed under our AVG record. The voice sounds like the practice. That matters more than the underlying model: callers who hear a generic Dutch TTS voice hang up at roughly twice the rate of callers who hear a familiar one.

The write-back is into Exquise, the practice management system the group runs. There is no public API. We wrote a thin adapter that uses the same database connection Exquise uses, in a read-mostly, write-narrow pattern, with a feature flag that lets reception turn the agent's write access off in one click if anything goes wrong. The first week, that flag got flipped twice. After the third week, it stopped getting flipped.
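The write-narrow pattern with a kill switch can be sketched as follows. This is a hypothetical stand-in, not the real adapter: `AgendaWriter`, the operation names, and the in-memory lists are assumptions, and nothing here reflects Exquise's actual schema or connection.

```python
from dataclasses import dataclass, field

# Only these operations may ever be written by the agent (hypothetical names).
ALLOWED_WRITES = {"confirm", "reschedule", "cancel"}


@dataclass
class AgendaWriter:
    writes_enabled: bool = True  # the one-click flag reception controls
    committed: list = field(default_factory=list)
    queued_for_human: list = field(default_factory=list)

    def write(self, operation: str, appointment_id: str, **details) -> bool:
        """Commit a narrow write, or queue it for a human. True if committed."""
        change = {"op": operation, "appointment_id": appointment_id, **details}
        if not self.writes_enabled or operation not in ALLOWED_WRITES:
            self.queued_for_human.append(change)
            return False
        self.committed.append(change)  # stand-in for the real database write
        return True
```

When the flag is off, nothing is lost: the proposed change lands in a queue that a receptionist can apply by hand.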

The 90-second handoff

The single most important number in this project was 90 seconds. That is the maximum time the voice agent is allowed to spend with a caller before it must either solve the call or hand it off to a human with enough context that the human does not have to start over.

The handoff is not "sorry, one moment, I'll transfer you." A handoff like that costs the caller an extra 45 seconds of repeating themselves, and costs reception a cold start. Our handoff is a one-line summary, in Dutch, written to a sticky note on the receptionist's screen before the call rings through. The payload looks like this:

{
  "patient": "Anneke de Wit",
  "geboortedatum": "1978-03-14",
  "intent": "klacht_over_kroon",
  "summary": "Kroon links onder, geplaatst 12 mei, schuurt langs tong sinds gisteren.",
  "agent_attempted": ["bevestigen", "verzetten"],
  "attempted_outcome": "niet_van_toepassing",
  "ringing_to": "reception_eindhoven",
  "elapsed_seconds": 47
}

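The receptionist picks up already knowing who is on the line, what they are calling about, and what the agent has already tried. The hello becomes "Hallo Anneke, vervelend dat je kroon schuurt. Ik zie dat hij op 12 mei geplaatst is. Klopt dat?" ("Hi Anneke, sorry to hear your crown is rubbing. I see it was placed on 12 May. Is that right?") instead of "Tandheelkundig Centrum Strijp, met wie spreek ik?" ("Tandheelkundig Centrum Strijp, who am I speaking with?")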
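A sticky note like the one above could be assembled along these lines. The field names come from the payload shown earlier; the `HandoffNote` class, `build_note`, and `must_hand_off` are hypothetical names for this sketch, and timestamps are assumed to be epoch seconds.

```python
import json
from dataclasses import asdict, dataclass, field

HANDOFF_LIMIT_SECONDS = 90


@dataclass
class HandoffNote:
    patient: str
    geboortedatum: str
    intent: str
    summary: str
    ringing_to: str
    elapsed_seconds: int
    agent_attempted: list = field(default_factory=list)
    attempted_outcome: str = "niet_van_toepassing"


def build_note(call_started: float, now: float, **fields) -> str:
    """Serialise the sticky note; both timestamps are epoch seconds."""
    note = HandoffNote(elapsed_seconds=int(now - call_started), **fields)
    # ensure_ascii=False keeps Dutch characters such as "ë" readable on screen.
    return json.dumps(asdict(note), ensure_ascii=False)


def must_hand_off(call_started: float, now: float) -> bool:
    """True once the call has used up its 90-second budget."""
    return now - call_started >= HANDOFF_LIMIT_SECONDS
```

The note is built before the transfer starts ringing, so the elapsed time it records is the agent's share of the call, not the receptionist's.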

Takeaway

A voice agent earns its keep on the handoff, not the deflection. The number of calls it solves end-to-end matters less than how warm it leaves the call when it cannot.

Dutch is not English with extra letters

The hardest part of the build was not the agent design. It was Dutch.

Dutch speech-to-text accuracy in the wild, on a noisy phone line, with a 67-year-old caller from Brabant, is not what the public Whisper benchmarks suggest. The closer you get to a regional accent, the more the model drifts. "Ik wou even bellen" ("I just wanted to call") gets transcribed as "ik kou even bellen" or "ik vouw even bellen". None of that matters for prose-style transcription. All of it matters when the next step is a state machine that expects "bellen" to be a verb.

We solved this in three places. First, we constrained the speech recognizer with a vocabulary list of every dentist's name in the practice, every common Dutch dental term (kroon, vulling, brug, wortelkanaal, gebitsreiniging, mondhygiëniste), and the names of every street within five kilometres of the two buildings. Second, we biased the intent classifier toward the four high-volume intents and made it ask "Bedoel je X of Y?" ("Do you mean X or Y?") any time confidence dropped below 0.7. Third, we made the handoff threshold aggressive. If the agent does not understand a caller twice in a row, it hands off. No third try.

The aggression on handoff is the thing that took the longest to get right. The first version of the agent was too patient. It would ask a confused caller to repeat themselves three or four times. Callers hated this more than any other failure mode. The second version was too quick to hand off, and reception complained that the agent was forwarding calls it should have solved. The third version, the one in production, hands off after two failed turns. We watch the handoff rate per intent every week and tune.
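The two rules (clarify below 0.7, hand off after two failed turns) fit in a few lines. This sketch assumes that a low-confidence turn counts as a failed turn and that the classifier returns intents ranked best first; `TurnTracker` and the action strings are hypothetical.

```python
from dataclasses import dataclass

CONFIDENCE_FLOOR = 0.7  # below this the agent asks "Bedoel je X of Y?"
MAX_FAILED_TURNS = 2    # two misunderstandings in a row trigger a handoff


@dataclass
class TurnTracker:
    failed_in_a_row: int = 0

    def next_action(self, ranked: list) -> str:
        """`ranked` holds (intent, confidence) pairs, best first."""
        if ranked and ranked[0][1] >= CONFIDENCE_FLOOR:
            self.failed_in_a_row = 0  # a clear turn resets the counter
            return f"proceed:{ranked[0][0]}"
        self.failed_in_a_row += 1
        if self.failed_in_a_row >= MAX_FAILED_TURNS:
            return "handoff"
        if len(ranked) >= 2:
            return f"ask:Bedoel je {ranked[0][0]} of {ranked[1][0]}?"
        return "ask:Kun je dat herhalen?"
```

Keeping both thresholds as named constants makes the weekly tuning a one-line change per intent rather than a code review.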

AVG, recordings, and the dentist's chair

Dutch dental practices fall under both the AVG (the Dutch name for the GDPR) and the medical-data rules layered on top. Call recordings count as medical data the moment a patient describes a symptom. We do not record calls by default. We store the transcript, hashed against patient ID, for fourteen days. We store the structured outcome (booked, rescheduled, handed off, with intent label) indefinitely and with the patient ID stripped, because at that point we treat it as no longer personal data.
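One way to read "hashed against patient ID, for fourteen days" is a store keyed by a keyed hash of the patient ID, with a periodic purge. The sketch below takes that reading as an assumption: the HMAC-SHA256 choice, the secret key, and the in-memory dictionary are illustrative, not a description of the production store.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=14)


def pseudonym(patient_id: str, secret: bytes) -> str:
    """Keyed hash, so the stored key cannot be recomputed without the secret."""
    return hmac.new(secret, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def store(db: dict, patient_id: str, transcript: str, secret: bytes, now: datetime) -> None:
    """File the transcript under the pseudonym, with its timestamp."""
    db.setdefault(pseudonym(patient_id, secret), []).append((now, transcript))


def purge_expired(db: dict, now: datetime) -> int:
    """Drop transcripts older than the retention window; return how many went."""
    removed = 0
    for key in list(db):
        kept = [(ts, text) for ts, text in db[key] if now - ts < RETENTION]
        removed += len(db[key]) - len(kept)
        if kept:
            db[key] = kept
        else:
            del db[key]
    return removed
```

Running `purge_expired` from a daily job keeps the fourteen-day promise enforceable rather than aspirational.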

The Autoriteit Persoonsgegevens has been explicit about consent for voice cloning and AI processing in healthcare contexts. We treat that as a hard constraint. The cloned voice has written consent. The agent identifies itself as an AI on every call, in the first sentence, in Dutch: "Hallo, je spreekt met de digitale assistent van de praktijk." ("Hello, you are speaking with the practice's digital assistant.") Callers who would rather speak to a human can say "mens" (human) or "medewerker" (staff member), or press 0, and the agent transfers immediately without trying to solve the call.

About 4% of callers do this. We watch that number. If it climbs, something is wrong with the agent that the dashboard is not telling us.
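The opt-out check itself is deliberately simple. A minimal sketch, assuming the transcript arrives as text and keypad presses arrive as a string of digits; `wants_human` is a hypothetical helper name.

```python
import re

OPT_OUT_WORDS = {"mens", "medewerker"}
OPT_OUT_KEY = "0"


def wants_human(transcript: str = "", dtmf: str = "") -> bool:
    """True if the caller said an opt-out word or pressed 0 on the keypad."""
    if OPT_OUT_KEY in dtmf:
        return True
    # Match whole words only, so "mensen" does not trigger a transfer.
    tokens = re.findall(r"[a-zà-ÿ]+", transcript.lower())
    return any(token in OPT_OUT_WORDS for token in tokens)
```

This check runs before intent classification on every turn, so a caller who asks for a person never gets routed into a booking flow first.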

Warning

If you build a voice agent for a healthcare practice in the Netherlands, do not skip the consent script and do not skip the opt-out word. They are not optional, and they are not just polite. The AP has fined practices for less.

What changed after eight weeks

We ran the agent in shadow mode for the first two weeks. It answered, transcribed, and proposed actions, but a human pressed the button to commit anything into the agenda. Then we let it write to the agenda for the four high-volume intents only. Then we widened to seven intents.
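The staged rollout reduces to one routing decision per proposed action. In this sketch the `Rollout` class, the mode names, and the intent labels are assumptions for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    SHADOW = "shadow"  # agent proposes, a human commits
    LIVE = "live"      # agent commits the allowed intents itself


@dataclass
class Rollout:
    mode: Mode = Mode.SHADOW
    live_intents: set = field(default_factory=set)

    def route(self, intent: str) -> str:
        """Decide who commits the action the agent proposes for this intent."""
        if self.mode is Mode.LIVE and intent in self.live_intents:
            return "agent_commits"
        return "human_approves"
```

Widening from four intents to seven is then a change to `live_intents`, with every other intent still defaulting to human approval.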

After eight weeks in full production, the reception team's inbound call volume dropped from roughly 1,100 a week to about 430 a week. The other 670 are handled end-to-end by the agent or are resolved in a self-service flow ("onze openingstijden zijn 08:00 tot 17:30, maandag tot en met vrijdag, wil je nu een afspraak plannen?"). Reception is no longer overloaded between 08:00 and 10:00 on Mondays. The first cup of coffee gets drunk warm.

No-shows fell from the 7% to 9% band to between 4% and 5%. Most of that gain does not come from the voice agent's headline trick. It comes from reschedules that happen on the evening before the appointment, by patients who would previously have called the next morning, found the line busy, and not bothered. The agent picks up at 21:34. It books the replacement slot. The original slot is back in the agenda before reception logs in.

The 90-second handoff target is met on about 94% of handed-off calls. The 6% that overshoot are mostly calls where the patient took a long time to find their date of birth, which we now ask for later in the flow rather than earlier.

The handoff taxonomy we now reuse

Every voice agent we have built since this project uses the same five-bucket handoff classifier. There are five legitimate reasons for a voice agent to hand off: the caller asked for a human; the caller's intent does not match a solvable intent; the intent matches but a precondition failed (the agenda is full, the patient is not in the system, the slot is reserved for a dentist who is on holiday); the agent did not understand the caller twice in a row; or the call has crossed the 90-second limit.

Each reason produces a different sticky note. "Asked for a human" gets a polite, low-context note. "Intent does not match" gets a longer summary because the receptionist will have to dig. "Precondition failed" tells the receptionist exactly which precondition, so the human knows where the workaround lives. That taxonomy turned out to be more valuable than any single piece of model tuning.
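The five buckets map naturally onto an enum plus one function. The precedence order below (explicit request first, then comprehension, time, intent, precondition) is an assumption of this sketch, as are the function and parameter names; the post only fixes the five reasons.

```python
from enum import Enum
from typing import Optional


class HandoffReason(Enum):
    ASKED_FOR_HUMAN = "asked_for_human"
    NO_MATCHING_INTENT = "no_matching_intent"
    PRECONDITION_FAILED = "precondition_failed"
    NOT_UNDERSTOOD_TWICE = "not_understood_twice"
    TIME_LIMIT = "time_limit"


def classify(asked_for_human: bool, intent_solvable: bool,
             failed_precondition: Optional[str], failed_turns: int,
             elapsed_seconds: float) -> Optional[HandoffReason]:
    """Return the handoff reason, or None if the agent should keep going."""
    if asked_for_human:
        return HandoffReason.ASKED_FOR_HUMAN
    if failed_turns >= 2:
        return HandoffReason.NOT_UNDERSTOOD_TWICE
    if elapsed_seconds >= 90:
        return HandoffReason.TIME_LIMIT
    if not intent_solvable:
        return HandoffReason.NO_MATCHING_INTENT
    if failed_precondition is not None:
        return HandoffReason.PRECONDITION_FAILED
    return None
```

Because each reason is a distinct value, the sticky-note template can be chosen with a plain lookup on the returned enum member.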

When we built the voice agent for the Eindhoven group, the thing that took longest was not the speech recognition or the dialogue state machine. It was getting the handoff right, in Dutch, for a building full of receptionists who had every reason to be sceptical. The fastest way we have found to do that on later projects is to ship the handoff before you ship the deflection.

If you run a multi-location practice and want to see whether your own reception load looks like this one's, pull last Monday's call log between 08:00 and 10:00, count the calls that asked one of your four highest-volume questions, and divide by the total. That fraction is a rough ceiling on the share of peak-hour calls a voice agent could lift off your reception team's desk.
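If your call log can be exported as time-and-reason pairs, the calculation is a few lines. The intent labels and the `deflection_ceiling` name are placeholders; substitute whatever your own four highest-volume questions are.

```python
from datetime import time

# Placeholder labels for the four highest-volume questions.
HIGH_VOLUME = {"confirm", "reschedule", "cancel", "opening_hours"}


def deflection_ceiling(calls, start=time(8, 0), end=time(10, 0)) -> float:
    """`calls` is an iterable of (time_of_day, intent_label) pairs from one Monday."""
    window = [intent for t, intent in calls if start <= t < end]
    if not window:
        return 0.0
    return sum(intent in HIGH_VOLUME for intent in window) / len(window)
```

For example, a window with four calls of which two are a reschedule and a confirmation gives 0.5, meaning at most half of that peak could be taken off the desk.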

Key takeaway

A voice agent earns its keep on the handoff, not the deflection. Ship the warm transfer before you ship the first deflected intent.

FAQ

Why not just use SMS reminders?

SMS reminders cut some no-shows but cannot accept the reschedule. The patient still has to call the practice the next morning, when reception is busiest. A voice agent picks up the reschedule the same evening.

Does the voice agent handle Dutch dialects?

With tuning, yes. We add a vocabulary list of dental terms, local street names, and dentist names, and we hand off after two failed turns rather than asking a confused caller to repeat themselves a third time.

Is it compliant with AVG and medical-data rules?

Yes, if you do it on purpose. We do not record calls by default, store transcripts for fourteen days hashed against patient ID, and the agent identifies itself as AI in the first sentence of every call.

How long does a build like this take?

Going from signed proposal to shadow mode in production took about six weeks. Full production with write access to the agenda took ten. The slow part is the handoff design, not the AI.

