Voice agents
Dental voice agent: holding the line through summer leave
Three receptionists on summer leave, a phone line that should have collapsed, and a voice agent that booked 547 dental appointments without human help.

The phone line, mid-July
It is a Tuesday morning in July at a dental group with four branches across the Randstad. Two receptionists are on holiday in Greece. One has a sick child. The fourth is the manager, and she is on the phone with a lab in Eindhoven about a crown that did not arrive. The voicemail prompt fires for the fifth time in twenty minutes.
Before this summer, that meant lost appointments, frustrated patients, and a backlog of voicemails someone would chew through on Thursday afternoon, often with a half-eaten broodje at the keyboard. This year, the line did not go dead. It did not even slow down. A voice agent picked up at the first ring, in Dutch, and wrote appointments straight into the practice management system at each branch.
What dental phone work actually looks like
Before we wrote a line of code, we sat next to receptionists for two full days at the busiest branch. The phone is not a single workflow. It is five workflows braided together.
- New patient registrations, with a check on whether the branch is taking new patients at all that month.
- Rescheduling, which is roughly forty percent of inbound traffic.
- Cancellations, which need to free a slot fast so the next caller can take it.
- Pain calls, which need triage and a same-day or same-week slot.
- Admin questions, including insurance, invoices, and "did the dentist receive my x-ray".
Anyone who has built a chat agent knows the trap. If you write a single prompt that tries to handle all five, the agent does none of them well. The receptionists themselves run a state machine in their heads. Our agent had to as well, but ours had to declare its state in code.
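A minimal sketch of what "declaring state in code" can look like, assuming a Python harness. The five intents mirror the workflows listed above. The step names and the transition table are hypothetical and only illustrate the idea of rejecting any move the table does not list.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Intent(Enum):
    # One member per workflow observed at the reception desk.
    NEW_PATIENT = auto()
    RESCHEDULE = auto()
    CANCEL = auto()
    PAIN = auto()
    ADMIN = auto()


class Step(Enum):
    # Hypothetical conversation steps, for illustration only.
    IDENTIFY_INTENT = auto()
    LOOKUP_PATIENT = auto()
    OFFER_SLOT = auto()
    CONFIRM = auto()
    DONE = auto()
    HANDOFF = auto()


# Allowed transitions: anything not listed here is rejected.
TRANSITIONS = {
    Step.IDENTIFY_INTENT: {Step.LOOKUP_PATIENT, Step.HANDOFF},
    Step.LOOKUP_PATIENT: {Step.OFFER_SLOT, Step.HANDOFF},
    Step.OFFER_SLOT: {Step.OFFER_SLOT, Step.CONFIRM, Step.HANDOFF},
    Step.CONFIRM: {Step.OFFER_SLOT, Step.DONE, Step.HANDOFF},
    Step.DONE: set(),
    Step.HANDOFF: set(),
}


@dataclass
class CallState:
    step: Step = Step.IDENTIFY_INTENT
    intent: Optional[Intent] = None

    def advance(self, new_step: Step) -> None:
        """Move to new_step, or raise if the table does not allow it."""
        if new_step not in TRANSITIONS[self.step]:
            raise ValueError(f"illegal transition {self.step.name} -> {new_step.name}")
        self.step = new_step
```

The point of the table is that the model cannot talk its way from intent detection straight to a finished booking. Every call has to pass through lookup, offer, and confirmation, or leave through a handoff.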
The constraints we wrote down first
Dutch dental practice management software is its own world. The group runs Simplex at three branches and a custom layer over Exquise at the fourth. Neither vendor publishes a real public API. Both expose enough through a desktop client and an internal HTTP endpoint that we could write an adapter without breaching either vendor's terms.
Three constraints shaped the rest of the build.
- Sub-second latency. We aimed to respond in under a second, because once the silence after the caller stops talking passes 1.5 seconds, they repeat themselves or hang up. We measured this on real recordings before we promised anything.
- Dutch with Flemish and German edges. The patient mix at this group includes Dutch native speakers, German cross-border patients from the Achterhoek, and a Flemish minority. The agent had to handle all three without forcing a language switch.
- No fabrication, ever. A wrong appointment is worse than a missed call. The agent must read availability from the actual calendar, never guess, and confirm every booking out loud before writing it.
Architecture
We did not invent anything novel. We composed pieces that have been production-ready for the last twelve months and put a strict harness around them.
The call enters through the practice's existing SIP trunk. We did not replace the phone provider, which would have been the kind of political fight that delays a healthcare project by a quarter. We added a SIP fork to a Twilio Programmable Voice endpoint that streams audio bidirectionally to a small Node service running on a Hetzner box in Falkenstein. That service holds conversation state per call.
The pipeline per turn:
- Streaming speech-to-text via Deepgram Nova-3 in Dutch, with the German acoustic model warm in parallel for cross-border callers.
- Voice activity detection tuned to a 220 millisecond end-of-turn threshold, which is aggressive but pays off in perceived snappiness.
- A tool-calling LLM that holds the conversation. It cannot speak its own answers about availability. It can only call the calendar tool, the patient lookup tool, and the booking tool.
- Streaming text-to-speech in a Dutch voice we licensed for exclusive use across the four branches, so callers across locations recognise the same speaker.
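The end-of-turn threshold in the second step can be illustrated with a small sketch. It assumes the voice activity detector emits one speech or no-speech flag per audio frame. The 220 millisecond threshold comes from the list above. The 20 millisecond frame length and the function name are assumptions made for the example.

```python
from typing import Iterable, Optional

END_OF_TURN_MS = 220   # threshold from the text
FRAME_MS = 20          # assumed frame length; not stated in the source


def end_of_turn_index(frames: Iterable[bool],
                      frame_ms: int = FRAME_MS,
                      threshold_ms: int = END_OF_TURN_MS) -> Optional[int]:
    """Return the index of the frame at which the turn is declared over.

    `frames` holds one boolean per audio frame: True if it contains speech.
    The turn ends once speech has been heard and the trailing silence
    reaches the threshold. Returns None if that never happens.
    """
    heard_speech = False
    silence_ms = 0
    for i, is_speech in enumerate(frames):
        if is_speech:
            heard_speech = True
            silence_ms = 0          # any speech resets the silence counter
        elif heard_speech:
            silence_ms += frame_ms
            if silence_ms >= threshold_ms:
                return i
    return None
```

A pause shorter than the threshold resets the counter as soon as speech resumes, so a caller who hesitates mid-sentence is not cut off. The trade-off of a low threshold is that slow speakers are occasionally interrupted, which is why the interruption handling matters as much as the threshold itself.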
The calendar tool talks to a small Python adapter that wraps both the Simplex and Exquise endpoints. Each branch has its own adapter instance with its own credentials and its own rate limit, because Simplex throttles aggressively and we did not want one busy branch starving another.
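One way to give each branch its own rate limit is a token bucket per adapter instance. This is a sketch of that general technique, not the adapter's actual code. The class, the branch names in the usage note, and the rate and capacity values are all hypothetical.

```python
import time
from typing import Callable, Dict, Iterable


class TokenBucket:
    """Token bucket: `rate` requests per second, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False instead of blocking."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def make_limiters(branches: Iterable[str], rate: float, capacity: int,
                  clock: Callable[[], float] = time.monotonic) -> Dict[str, TokenBucket]:
    # One independent bucket per branch, so a busy branch
    # cannot drain another branch's request budget.
    return {b: TokenBucket(rate, capacity, clock) for b in branches}
```

With `make_limiters(["branch_a", "branch_b"], rate=1.0, capacity=2)`, exhausting the bucket for `branch_a` leaves `branch_b` untouched, which is the isolation described above.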
The voice agent never says a time aloud that did not come from a tool call in the last 800 milliseconds. That single rule killed almost every category of failure we feared.
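A guard of this kind can be sketched in a few lines. The sketch assumes the check runs on the text handed to TTS, that times appear there in HH:MM form, and that the harness records when each time was last returned by a tool call. The function names and the `tool_times` structure are hypothetical. Only the 800 millisecond window comes from the text.

```python
import re
from typing import Dict, Set

MAX_AGE_MS = 800  # freshness window from the text

# Matches clock times such as 9:15, 09:15 or 14:30.
TIME_PATTERN = re.compile(r"\b([01]?\d|2[0-3]):([0-5]\d)\b")


def times_in(text: str) -> Set[str]:
    """Extract HH:MM clock times, normalised to two-digit hours."""
    return {f"{int(h):02d}:{m}" for h, m in TIME_PATTERN.findall(text)}


def may_speak(utterance: str, tool_times: Dict[str, float], now_ms: float) -> bool:
    """Allow the utterance only if every time in it came from a recent tool call.

    `tool_times` maps an HH:MM string to the timestamp in milliseconds
    of the tool call that last returned it.
    """
    for t in times_in(utterance):
        seen_at = tool_times.get(t)
        if seen_at is None or now_ms - seen_at > MAX_AGE_MS:
            return False
    return True
```

An utterance without any time passes untouched. An utterance with a stale or unknown time is blocked, and the harness can force a fresh calendar lookup before the agent tries again.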
The interruption problem
Half of voice-agent demos break the moment a caller talks over the bot. Real callers always do. They cough. They say "ja, ja" while you are still talking. They ask a follow-up before you have finished the sentence. The polite ones say "sorry, even iets" and then ask anyway.
We handle interruption with two layers. The first is voice activity on the inbound stream that cancels the outbound TTS immediately, mid-syllable if needed. The second is a cheap classifier on the partial transcript that decides whether the interruption is a backchannel (a "hm", a "ja") or a real new turn. Backchannels are ignored. Real turns reset the state and the agent answers the new question instead of finishing its old sentence.
This is the difference between an agent that feels human and one that feels like a phone tree with extra steps.
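The second layer can be approximated with a very small rule-based classifier. This is a sketch under stated assumptions: the source does not describe the production classifier, and the backchannel word list and the three-word limit here are hypothetical.

```python
from enum import Enum

# Hypothetical backchannel vocabulary, for illustration only.
BACKCHANNELS = {"ja", "hm", "hmm", "mhm", "ok", "oké"}


class Interruption(Enum):
    BACKCHANNEL = "backchannel"
    NEW_TURN = "new_turn"


def classify(partial_transcript: str, max_backchannel_words: int = 3) -> Interruption:
    """Short utterances made only of backchannel tokens are ignored.

    An empty transcript (a cough, line noise) also counts as a backchannel,
    because there is nothing to answer. Everything else is a new turn.
    """
    words = [w.strip(".,!?").lower() for w in partial_transcript.split()]
    words = [w for w in words if w]
    if len(words) <= max_backchannel_words and all(w in BACKCHANNELS for w in words):
        return Interruption.BACKCHANNEL
    return Interruption.NEW_TURN
```

With this sketch, "ja, ja" is classified as a backchannel, while "sorry, even iets, kan het ook op vrijdag?" is a new turn and resets the conversation state.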
The calendar is not a calendar
Dental scheduling has rules that look obvious only when you watch the receptionists work. We wrote down 37 of them in two days.
- A check-up is twenty minutes, but only with the patient's own dentist, and only on days that dentist works that branch.
- A filling needs forty minutes and a treatment room with the right equipment for the relevant quadrant.
- A pain call can break the twenty-minute rule. It goes into an emergency slot that the schedule pretends does not exist but every receptionist knows about.
- Children under twelve are not booked after 15:30 unless a parent insists, in which case the receptionist notes it.
- If a hygienist slot is empty within four weeks of a check-up, offer it. The patients almost always say yes.
None of this lives in the calendar software in any structured form. It lives in the receptionists' heads. We turned each rule into a small function that runs after the calendar lookup and before the agent is allowed to read a slot back to the caller. The agent does not know the rules. The adapter does.
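The pattern of one small function per rule, applied between the calendar lookup and the read-back, can be sketched as follows. Two of the rules above are shown. The `Slot` and `Request` types, the field names, and the treatment label `"checkup"` are hypothetical, and the sketch assumes a slot starting at exactly 15:30 is still allowed.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Callable, List


@dataclass(frozen=True)
class Slot:
    start: datetime
    dentist: str
    branch: str


@dataclass(frozen=True)
class Request:
    treatment: str
    patient_age: int
    own_dentist: str
    parent_insists: bool = False


Rule = Callable[[Slot, Request], bool]


def own_dentist_for_checkup(slot: Slot, req: Request) -> bool:
    # Check-ups are only booked with the patient's own dentist.
    return req.treatment != "checkup" or slot.dentist == req.own_dentist


def no_late_slots_for_children(slot: Slot, req: Request) -> bool:
    # Children under twelve are not booked after 15:30 unless a parent insists.
    if req.patient_age >= 12 or req.parent_insists:
        return True
    return slot.start.time() <= time(15, 30)


RULES: List[Rule] = [own_dentist_for_checkup, no_late_slots_for_children]


def allowed_slots(slots: List[Slot], req: Request) -> List[Slot]:
    """Keep only slots that pass every rule; runs after the calendar lookup."""
    return [s for s in slots if all(rule(s, req) for rule in RULES)]
```

Because each rule is an ordinary function, adding another one means appending to `RULES`, and the conversational model never has to see it.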
Confirmation, out loud
The agent confirms three things before any booking is written.
- The patient's full name and date of birth, read back from the lookup.
- The treatment type in plain Dutch, not jargon. "Controle", not "periodiek mondonderzoek".
- The day, date, time, and which dentist at which branch.
Only after a clear "ja" or "klopt" does the booking tool fire. If the caller corrects anything, the agent re-reads the corrected slot. This adds about eight seconds per booking. Patients do not mind. The error rate dropped to a level the manager described, dryly, as "lower than what we did by hand".
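The confirmation gate can be sketched as a function that refuses to call the booking tool unless the reply is an unambiguous yes. The two affirmatives come from the text. The filler word list, the function names, and the callback shape are assumptions for the example.

```python
from typing import Callable, Optional

AFFIRMATIVE = {"ja", "klopt"}   # the two confirmations named in the text
FILLER = {"dat", "hoor"}        # hypothetical filler words tolerated around a yes


def is_clear_yes(reply: str) -> bool:
    """True only if the reply consists of affirmatives, optionally with filler."""
    words = [w.strip(".,!?").lower() for w in reply.split()]
    words = [w for w in words if w]
    has_yes = any(w in AFFIRMATIVE for w in words)
    only_known = all(w in AFFIRMATIVE or w in FILLER for w in words)
    return has_yes and only_known


def confirm_then_book(reply: str, book: Callable[[], str]) -> Optional[str]:
    """Fire the booking tool only after a clear yes.

    Returns the booking result, or None so the caller can re-read the slot.
    """
    if is_clear_yes(reply):
        return book()
    return None
```

A reply such as "ja maar liever later" contains a yes but is not a clear one, so the booking tool does not fire and the agent goes back to offering slots.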
Handoff, designed in
Every voice agent eventually meets a call it should not finish. Ours flags any of the following and transfers to the on-duty receptionist with a one-sentence written summary on her screen.
- Three or more failed name lookups in a row. Usually a hearing-impaired caller or a noisy background.
- Any mention of bleeding, swelling that affects breathing, or trauma. The agent does not triage these.
- A complaint, defined by sentiment plus a short keyword list. Complaints go to humans within the same call.
- Anything about invoices over 250 euros or insurance disputes.
The handoff is warm. The receptionist sees who is on the line, what they came for, and what the agent already confirmed, before she even says "goedemorgen". The whole transition takes under three seconds.
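The four triggers can be expressed as one function that returns a reason for the handoff, or nothing. This is a sketch: the `CallFacts` fields and the reason strings are hypothetical, and the urgent-term list is an invented, deliberately over-inclusive stand-in for whatever the production system matches on.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical substrings; they err on the side of handing off.
URGENT_TERMS = ("bloed", "adem", "trauma")


@dataclass
class CallFacts:
    failed_name_lookups: int = 0
    transcript: str = ""
    is_complaint: bool = False               # result of the sentiment plus keyword check
    invoice_amount_eur: Optional[float] = None
    insurance_dispute: bool = False


def handoff_reason(facts: CallFacts) -> Optional[str]:
    """Return a short reason if the call must go to a human, else None."""
    text = facts.transcript.lower()
    if any(term in text for term in URGENT_TERMS):
        return "urgent symptoms mentioned"
    if facts.failed_name_lookups >= 3:
        return "repeated failed name lookups"
    if facts.is_complaint:
        return "complaint"
    over_limit = facts.invoice_amount_eur is not None and facts.invoice_amount_eur > 250
    if over_limit or facts.insurance_dispute:
        return "invoice or insurance question"
    return None
```

Urgent symptoms are checked first so that a caller who mentions bleeding is never held up by a lower-priority rule. The returned reason can seed the one-sentence summary shown to the receptionist.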
The summer test, in numbers
The agent went live in late May 2026. The first real test was the second week of July, when three of the four receptionists were off at the same time.
Across that week, the agent handled 1,184 inbound calls. Of those, 612 were booking or rescheduling intents. The agent completed 547 of them without human help. Another 41 were handed off to the on-duty receptionist for reasons the agent flagged. The remaining 24 were dropped by the caller before a booking completed, which is roughly the same drop rate we measured on staffed weeks. No booking was written that a patient later disputed.
Average handle time for a booking was 1 minute 47 seconds, against a baseline of 2 minutes 30 seconds in a March week with full staff. The agent is faster because it never puts a caller on hold to check the calendar in a second window. It has the calendar already.
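The figures above can be cross-checked with a few lines of arithmetic. Every input below comes from the text, and the percentages are derived from them.

```python
calls_total = 1184
booking_intents = 612
completed = 547
handed_off = 41
dropped = 24

# The three outcomes account for every booking or rescheduling intent.
assert completed + handed_off + dropped == booking_intents

booking_share = booking_intents / calls_total      # share of calls that were bookings
completion_rate = completed / booking_intents      # completed without human help
handoff_rate = handed_off / booking_intents
drop_rate = dropped / booking_intents

agent_handle_s = 1 * 60 + 47       # 107 seconds
baseline_handle_s = 2 * 60 + 30    # 150 seconds
saved_s = baseline_handle_s - agent_handle_s
saved_share = saved_s / baseline_handle_s

print(f"booking share of calls: {booking_share:.1%}")
print(f"completed without help: {completion_rate:.1%}")
print(f"handed off:             {handoff_rate:.1%}")
print(f"dropped by caller:      {drop_rate:.1%}")
print(f"time saved per booking: {saved_s} s ({saved_share:.1%})")
```

In round numbers, about 89 percent of booking intents were completed without a human, about 7 percent were handed off, about 4 percent were dropped, and each booking took 43 seconds less than the staffed baseline.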
Cost, and why it is not the headline
A voice agent has a per-minute cost. STT, LLM, and TTS together come to about 0.11 euros per minute of conversation at this group's volume. The question we put to the manager was not whether the agent paid for itself in raw call minutes. It was what an hour of a trained receptionist's time is worth on the busiest morning of the year, and how many of those hours the line would have lost without it.
Her answer was direct. The agent costs less per month than the temp she used to hire for the August week. The savings were not the reason she signed off, though. The reason was that the line stops ringing into voicemail. Patients who reach a human, even a synthetic one, do not call the competing practice two streets over.
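For scale, the per-minute figure can be turned into a rough per-booking cost. This is a back-of-the-envelope sketch. It assumes the 0.11 euro rate applies evenly to the average booking handle time reported for the test week, and it ignores non-booking calls, telephony, and hosting.

```python
COST_PER_MINUTE_EUR = 0.11          # STT + LLM + TTS, from the text
AVG_BOOKING_SECONDS = 1 * 60 + 47   # average handle time for a booking
COMPLETED_BOOKINGS = 547            # bookings completed in the test week

# Model cost of one average booking conversation.
cost_per_booking = COST_PER_MINUTE_EUR * AVG_BOOKING_SECONDS / 60

# Model cost of all completed bookings, if each took the average time.
week_cost_completed = cost_per_booking * COMPLETED_BOOKINGS

print(f"per booking: {cost_per_booking:.2f} EUR")
print(f"completed bookings, test week: {week_cost_completed:.0f} EUR")
```

Under those assumptions, a booking costs about 0.20 euros in model usage, and the completed bookings of the test week come to roughly 107 euros.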
What we would do differently
Two things, if we were starting over.
First, we underbuilt the German support in v1. The cross-border patients are a small share of volume but a vocal one, and we should have had German-acoustic STT live on day one instead of week six.
Second, we should have shipped a daily summary email to the practice manager from day one. We added it in week three after she asked. It lists every call the agent took, every handoff, every flagged transcript, with a link to the audio. It is the single feature that built trust faster than anything else, including the booking numbers.
Do not ship a voice agent to a healthcare client without a per-call transcript log the practice owner can read. Trust collapses the first time a patient calls back and the receptionist cannot say what was promised on the previous call.
What you can do today
If you run a practice, an agency, or any operation with a phone line, the smallest useful thing is a five-minute audit. Pull last month's call log. Count voicemails. Count missed calls during lunch. Count calls that came in during the school holiday week. If the number is higher than you thought, the case for a voice agent on the front line is already written. You do not need to replace your receptionists. You need to stop losing the calls they cannot get to.
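If the phone provider can export the call log as CSV, the audit is a short script. The sketch below assumes two hypothetical columns, `started_at` (ISO timestamp) and `outcome` (`answered`, `missed`, or `voicemail`), and a lunch window of 12:00 to 13:30. Real exports will use different names, and the sample rows are invented for illustration.

```python
import csv
import io
from datetime import datetime, time
from typing import Dict, Iterable


def audit(rows: Iterable[Dict[str, str]],
          lunch_start: time = time(12, 0),
          lunch_end: time = time(13, 30)) -> Dict[str, int]:
    """Count voicemails, missed calls, and missed calls in the lunch window."""
    counts = {"voicemail": 0, "missed": 0, "missed_during_lunch": 0}
    for row in rows:
        outcome = row["outcome"]
        if outcome == "voicemail":
            counts["voicemail"] += 1
        elif outcome == "missed":
            counts["missed"] += 1
            started = datetime.fromisoformat(row["started_at"])
            if lunch_start <= started.time() < lunch_end:
                counts["missed_during_lunch"] += 1
    return counts


# Invented sample rows; replace with the provider's export.
SAMPLE = """started_at,outcome
2026-06-02T09:14:00,answered
2026-06-02T12:20:00,missed
2026-06-02T12:45:00,voicemail
2026-06-02T15:05:00,missed
"""

print(audit(csv.DictReader(io.StringIO(SAMPLE))))
```

The same loop can be extended with a date range to count calls during a school holiday week.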
When we built this for the dental group, the gnarly part was not the speech model. It was the calendar adapter that had to obey rules nobody had ever written down. That is the pattern in almost every project we run: the interesting work is the institutional knowledge, not the AI. We do this kind of build for clients across the Netherlands and Thailand under the heading of AI agents, and it always starts with two days at a real desk, watching real work.
Key takeaway
A voice agent that never speaks a time it did not just read from the calendar will outperform a phone tree and survive the holiday week your receptionists cannot.
FAQ
How does the voice agent handle a caller who talks over it?
Inbound voice activity cancels the outbound speech mid-syllable. A small classifier then decides whether the interruption was a backchannel like "ja" or a real new turn, and only real turns reset the conversation state.
What happens if the agent cannot understand the caller?
After three failed name lookups, or any mention of bleeding, breathing trouble or trauma, the call is warm-transferred to the on-duty receptionist with a one-sentence written summary already on her screen.
How accurate is Dutch speech recognition for medical terms?
Streaming Deepgram Nova-3 in Dutch handled the everyday booking vocabulary well. We keep a small custom term list for treatment names and dentist surnames, and we never read back jargon. Confirmation is always in plain Dutch.
How long did the build take from kickoff to go-live?
Roughly nine weeks. Two for discovery at the busiest branch, four for the conversational core and calendar adapters, and three for hardening, German support, the daily summary email and supervised pilot calls.
Does the agent ever invent an appointment time?
No. The agent is not allowed to speak any time aloud unless that time came from a tool call to the live calendar in the last 800 milliseconds. That rule is enforced in the harness, not just the prompt.