Dutch voice agent: 1,640 weekly calls on Navision 2009
It is Tuesday at 06:14. A Polish driver near Tilburg wants to know whether his pallets of paksoi are on this run or the next. The voice agent picks up on the first ring, and the median order-status call takes 35 seconds.

The 06:14 phone problem
It is Tuesday morning, 06:14, on a wet day in May. A Polish driver in a refrigerated truck is parked outside a Carrefour distribution centre near Tilburg. He needs to know whether the two pallets of paksoi listed under reference 41229 are on his run today or the next one, because his route planner has already moved him forward by forty minutes. He calls the wholesaler in Breda. The line rings exactly once.
What used to happen at that moment: nobody picked up. The night shift had handed off at 06:00, the day team was on the warehouse floor, and the call rolled to voicemail. The driver left a message in Polish. Forty minutes later someone called him back, in Dutch, after asking around the office for help. By then he was halfway to Eindhoven.
What happens now: the line is picked up by a voice agent, in Polish, on the first ring. The agent reads back the order reference, confirms the two pallets, names the loading dock, and tells him the route slot is 07:15 to 07:45. Total call length: 41 seconds.
That is one call. There are about 1,640 like it every week.
The wholesaler and the constraint
The company is a 26-person fresh-produce wholesaler in Breda. They move leafy greens, brassicas, herbs, and a stable rotation of out-of-season fruit between Dutch growers, Polish growers, and around 90 mid-tier buyers across the Benelux. A serious business with old infrastructure.
Their ledger lives in Microsoft Dynamics NAV 2009. NAV 2009 went out of mainstream support in 2015 and out of extended support in 2020; Microsoft's own lifecycle page is still up if you want to read the dates yourself.
That sounds like a story about replacing the ERP. It is not. The wholesaler has tried twice over the past eight years to migrate to Business Central. Both times they pulled the plug at week eleven because the route planner, the scale-house integration, and three custom margin reports were never going to come along cleanly. The CFO is honest about it. He calls NAV 2009 "the heart attack we never had". The number to call is still 076.
What they could not do, though, was keep handling order-status calls the way they were. They were losing roughly four hours of warehouse-floor time every day to ringing handsets.
The shape of the build
We built a voice agent on top of NAV without touching NAV. That is the entire architecture in one sentence. Everything else is detail.
The agent answers every inbound call to one specific number, identifies the caller against the customer record (CLI plus a fallback verification by company name and postcode), and handles three intents:
- Order status, the boring case, "where is my paksoi"
- Route slot confirmation, "is my pallet on the 07:15 run"
- Quality complaint, the kwaliteitsklacht case, which gets routed to a human within 30 seconds
Everything else, including new orders, escalates to a planner. We were deliberate about not adding more intents in the first build. A focused agent that answers two questions cold and hands the third to a human is far more useful than a wide one that hedges on six.
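The identification step (CLI first, then company name plus postcode) can be sketched roughly as below. This is a minimal illustration, not the production code: the `Customer` record, its fields, and the phone-number normalisation rule (folding a Dutch national number into country-code form) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Customer:
    # Hypothetical record shape; the real customer table is not shown in this article.
    code: str
    name: str
    postcode: str
    phone_numbers: tuple[str, ...]


def normalise_phone(raw: str) -> str:
    """Keep digits only; fold '00' prefixes and Dutch national '0' prefixes (assumed rule)."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    if digits.startswith("00"):
        return digits[2:]
    if digits.startswith("0"):
        return "31" + digits[1:]
    return digits


def normalise_text(raw: str) -> str:
    """Lower-case and drop spaces and punctuation so '4811 AB' matches '4811ab'."""
    return "".join(ch for ch in raw.lower() if ch.isalnum())


def identify_caller(
    cli: Optional[str],
    customers: list[Customer],
    spoken_name: Optional[str] = None,
    spoken_postcode: Optional[str] = None,
) -> Optional[Customer]:
    # First try: match the calling line identity against known numbers.
    if cli:
        wanted = normalise_phone(cli)
        for customer in customers:
            if any(normalise_phone(p) == wanted for p in customer.phone_numbers):
                return customer
    # Fallback: both company name and postcode must match.
    if spoken_name and spoken_postcode:
        name = normalise_text(spoken_name)
        postcode = normalise_text(spoken_postcode)
        for customer in customers:
            if normalise_text(customer.name) == name and normalise_text(customer.postcode) == postcode:
                return customer
    return None
```

Requiring both name and postcode in the fallback keeps a withheld or unknown number from being matched on a company name alone. A caller who fails both paths would go to a planner.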
The agent talks to NAV through a read-only SQL view we built on the side, plus a small queue table for writes. We do not touch the production NAV instance. The view is refreshed every 60 seconds from a mirrored read replica, which sounds slow until you remember that pallet ETAs change in 15-minute increments, not 15-second ones.
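To make the read-view-plus-write-queue pattern concrete, here is a small sketch that uses an in-memory SQLite database as a stand-in. Everything named in it is an assumption for illustration: the table `replica_order_lines`, the view `agent_order_status`, the queue table `agent_write_queue`, the column names, and the demo row (including the dock) are invented, and the real system sits on NAV's own database rather than SQLite.

```python
import sqlite3
from typing import Optional


def build_demo_db() -> sqlite3.Connection:
    """Create a stand-in for the mirrored replica, the read view, and the write queue."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE replica_order_lines (
            order_ref TEXT, customer_code TEXT, item TEXT, pallets INTEGER, dock TEXT
        );
        -- The agent only ever selects from this view.
        CREATE VIEW agent_order_status AS
            SELECT order_ref, customer_code, item, pallets, dock
            FROM replica_order_lines;
        -- Writes never go to the ledger tables; they wait here to be processed.
        CREATE TABLE agent_write_queue (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            order_ref TEXT NOT NULL,
            note TEXT NOT NULL,
            processed INTEGER NOT NULL DEFAULT 0
        );
        """
    )
    # Made-up demo row.
    db.execute("INSERT INTO replica_order_lines VALUES ('41229', 'C001', 'paksoi', 2, 'dock 3')")
    db.commit()
    return db


def lookup_order(db: sqlite3.Connection, order_ref: str) -> Optional[dict]:
    """Read path: a parameterised SELECT against the view, nothing else."""
    row = db.execute(
        "SELECT item, pallets, dock FROM agent_order_status WHERE order_ref = ?",
        (order_ref,),
    ).fetchone()
    if row is None:
        return None
    item, pallets, dock = row
    return {"item": item, "pallets": pallets, "dock": dock}


def enqueue_note(db: sqlite3.Connection, order_ref: str, note: str) -> int:
    """Write path: append to the queue table and return the queue id."""
    cursor = db.execute(
        "INSERT INTO agent_write_queue (order_ref, note) VALUES (?, ?)",
        (order_ref, note),
    )
    db.commit()
    return cursor.lastrowid
```

The point of the split is that a bug on the agent side can, at worst, produce a bad row in the queue. It cannot alter a ledger entry.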
The route-planner integration is older than the ERP. It is a third-party Windows service from 2011 that the company will not part with, because it understands one specific quirk about how their drivers handle Belgian rural drops. We talk to it over a polling file drop, the same way the route planner has always talked to NAV. The agent reads the same file format. Nothing about the planner had to change.
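A polling file drop can be read with very little code. The sketch below is an assumption-heavy illustration: the planner's actual file format is not described here, so the semicolon-separated layout (`order_ref;slot_start;slot_end`), the `.txt` extension, and the function names are all hypothetical.

```python
import csv
from pathlib import Path


def read_route_file(path: Path) -> dict[str, dict[str, str]]:
    """Parse one drop file in the assumed 'order_ref;slot_start;slot_end' layout."""
    slots: dict[str, dict[str, str]] = {}
    with path.open(newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle, delimiter=";"):
            if len(row) != 3:
                continue  # skip blank or malformed lines rather than fail the whole file
            order_ref, slot_start, slot_end = (field.strip() for field in row)
            slots[order_ref] = {"start": slot_start, "end": slot_end}
    return slots


def poll_once(drop_dir: Path, seen: dict[str, float]) -> dict[str, dict[str, str]]:
    """Read only files that are new or modified since the previous poll.

    `seen` maps file name to the modification time observed last time,
    and is updated in place.
    """
    updates: dict[str, dict[str, str]] = {}
    for path in sorted(drop_dir.glob("*.txt")):
        modified = path.stat().st_mtime
        if seen.get(path.name) == modified:
            continue
        updates.update(read_route_file(path))
        seen[path.name] = modified
    return updates
```

A caller would run `poll_once` on a timer and merge the result into an in-memory map of order reference to slot. The reader never writes to the drop directory, which is what lets the planner stay unchanged.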
That is the philosophy. The new system fits the contours of the old one.
Dutch and Polish in one voice stack
The two-language requirement was the one design choice that everyone underestimated, including us.
About 38% of the wholesaler's drivers are Polish-speaking. A smaller share of their growers are. The buyers are almost entirely Dutch-speaking. Code-switching mid-call is normal. A planner might say "tak, oczywiście, wij sturen 'm om kwart over zeven". You cannot build for that with a single-language ASR model.
We run two recognisers in parallel for the first 2.5 seconds of every call, score them by confidence on the leading utterance, and lock the call to whichever wins. From there on the agent stays in that language unless the caller explicitly switches. There is no "press 1 for Dutch" menu. The agent simply answers in the language the caller used.
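The lock-to-the-winner step can be sketched as follows. This is a simplified illustration under stated assumptions: the `Hypothesis` type, the `Recogniser` callable signature, and the Dutch default for an empty recogniser list are invented for the example, and a real recogniser would stream audio rather than take one byte string.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Hypothesis:
    language: str      # e.g. "nl" or "pl"
    text: str          # transcript of the leading utterance
    confidence: float  # recogniser's own score, assumed to be comparable across models


# Hypothetical interface: each recogniser takes the first seconds of audio
# and returns its best hypothesis.
Recogniser = Callable[[bytes], Hypothesis]


def lock_language(audio: bytes, recognisers: list[Recogniser], default: str = "nl") -> str:
    """Run all recognisers concurrently on the same audio and keep the most confident one."""
    if not recognisers:
        return default
    with ThreadPoolExecutor(max_workers=len(recognisers)) as pool:
        hypotheses = list(pool.map(lambda recognise: recognise(audio), recognisers))
    return max(hypotheses, key=lambda h: h.confidence).language
```

One caveat worth stating: comparing raw confidence scores only works if the two models are calibrated against each other. If they are not, the scores need rescaling before the `max`.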
Polish was harder. Dutch ASR is in good shape; the public corpora are deep, including projects like Mozilla's Common Voice, which carries enough validated Dutch audio to cover most regional accents. Polish ASR is fine on a clean signal, but Polish ASR for a driver calling from a moving truck cab with diesel idle and a Bluetooth headset is a different problem. We dropped the model's word-confidence threshold for Polish to 0.62 and added a deterministic fallback: if the agent cannot get a reference number out of three attempts, it offers to text the caller a short link they can tap to confirm. About 4% of Polish calls fall through to that path. About 0.7% of Dutch calls do.
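The three-attempts-then-text fallback is a small piece of deterministic logic. A sketch, with assumptions labelled: only the Polish threshold of 0.62 comes from the text above; the default threshold for other languages, the function name, and the shape of the return value are hypothetical.

```python
from typing import Optional

POLISH_THRESHOLD = 0.62    # from the build described above
DEFAULT_THRESHOLD = 0.75   # assumption: the Dutch threshold is not stated in this article
MAX_ATTEMPTS = 3


def capture_reference(
    attempts: list[tuple[str, float]], language: str
) -> tuple[str, Optional[str]]:
    """Decide between a usable reference number and the text-message fallback.

    `attempts` holds (transcript, confidence) pairs, one per try, in order.
    Returns ("reference", digits) on success or ("sms_link", None) on fallback.
    """
    threshold = POLISH_THRESHOLD if language == "pl" else DEFAULT_THRESHOLD
    for transcript, confidence in attempts[:MAX_ATTEMPTS]:
        digits = "".join(ch for ch in transcript if ch.isdigit())
        if digits and confidence >= threshold:
            return ("reference", digits)
    # No attempt within the limit was good enough: offer the tap-to-confirm link.
    return ("sms_link", None)
```

Because the fallback is a fixed rule rather than a model decision, it behaves the same way on every call, which makes the 4% and 0.7% fall-through rates straightforward to measure.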
We also pre-loaded the recogniser with the wholesaler's product list, in both languages, as a custom vocabulary. The single biggest accuracy lift on day one came from teaching the model that "paksoi" is a word, that "pak choi" is the same word, and that "Chinese kool" might be the same word in context. None of those are exotic. All of them broke the stock vocabulary.
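Alongside the recogniser's custom vocabulary, the transcript still has to be mapped to one canonical item. A minimal sketch of that mapping, using only the three spellings mentioned above; the dictionary layout, the "needs confirmation" flag for context-dependent names, and the function name are assumptions for illustration.

```python
import re
from typing import Optional

# Spellings that always mean the same item.
EXACT_ALIASES = {
    "paksoi": "paksoi",
    "pak choi": "paksoi",
}

# Spellings that only "might" mean the item, depending on context,
# so the agent should confirm before acting on them.
AMBIGUOUS_ALIASES = {
    "chinese kool": "paksoi",
}


def normalise(phrase: str) -> str:
    """Lower-case, turn hyphens into spaces, and collapse repeated whitespace."""
    return re.sub(r"\s+", " ", phrase.lower().replace("-", " ")).strip()


def canonical_product(phrase: str) -> tuple[Optional[str], bool]:
    """Return (canonical item or None, whether the caller should be asked to confirm)."""
    key = normalise(phrase)
    if key in EXACT_ALIASES:
        return EXACT_ALIASES[key], False
    if key in AMBIGUOUS_ALIASES:
        return AMBIGUOUS_ALIASES[key], True
    return None, False
```

Keeping the ambiguous names in a separate table means a new alias can be added without silently widening what the agent treats as certain.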
The 35-second budget
The thing the operations director cared about most was the wall-clock. He had been measuring his old human-answered calls, and the median was 2 minutes 11 seconds, because most of that time was the planner walking to the second screen to pull up the route file. He wanted the agent under 40 seconds end to end.
We came in at 35 seconds for the median order-status call, including the greeting, the verification, the lookup, the spoken answer, and the close. That number matters because it is the difference between the driver staying on the line and the driver hanging up to call dispatch directly.
Where the 35 seconds goes:
- greeting + language lock: 2.4 s
- CLI match + verify (if needed): 6.8 s
- NAV view lookup (cached path): 0.9 s
- route-planner file read: 1.6 s
- TTS answer (with pallet + ETA): 11.0 s
- caller confirmation + close: 12.3 s
The two long lines on that list are the human ones, not the machine ones. The lookup and the file read together are under 3 seconds. Most of the budget goes to people speaking and listening, which is fine, because that is the part the agent cannot rush.
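The arithmetic behind that claim is easy to check. The snippet below uses the six figures from the breakdown; the grouping into "machine" and "spoken" steps follows the paragraph above, and the variable names are just for illustration.

```python
# Median order-status call, in seconds, per step.
BUDGET_SECONDS = {
    "greeting + language lock": 2.4,
    "CLI match + verify (if needed)": 6.8,
    "NAV view lookup (cached path)": 0.9,
    "route-planner file read": 1.6,
    "TTS answer (with pallet + ETA)": 11.0,
    "caller confirmation + close": 12.3,
}

MACHINE_STEPS = ("NAV view lookup (cached path)", "route-planner file read")
SPOKEN_STEPS = ("TTS answer (with pallet + ETA)", "caller confirmation + close")


def subtotal(steps) -> float:
    """Sum the named steps, rounded to one decimal to avoid float noise."""
    return round(sum(BUDGET_SECONDS[step] for step in steps), 1)


total = subtotal(BUDGET_SECONDS)   # all six steps
machine = subtotal(MACHINE_STEPS)  # lookup + file read
spoken = subtotal(SPOKEN_STEPS)    # agent speaking + caller responding
spoken_share = spoken / total

print(f"total {total}s, machine {machine}s, spoken {spoken}s ({spoken_share:.0%} of the call)")
```

The steps add up to the 35-second median, the two data steps account for 2.5 seconds, and the two spoken steps take about two thirds of the call.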
The constraint that mattered was wall-clock, not accuracy. A 96% accurate agent that answers in 35 seconds beats a 99% accurate one that answers in 90.
Routing the kwaliteitsklacht
The one intent we routed to humans, always, was quality complaints.
In the Dutch fresh-produce trade, a "kwaliteitsklacht" is the call where the buyer says the lettuce arrived warm, or the strawberries arrived bruised, or the herbs are showing condensation in the punnet. These calls can become credit notes, return shipments, or damaged grower relationships, and they are not the kind of conversation you put through a voice agent in 2026.
The agent listens for a small set of triggers: words like klacht, kapot, warm, bruin, schimmel, plus their Polish equivalents (skarga, zepsute, ciepło, brązowy, pleśń), plus an intent classifier that catches the more polite phrasings. On a trigger, the call routes to the logistics planner currently on duty, with a short structured handoff: caller name, customer code, order reference, the agent's best guess at the complaint type, and a clean recording of the last 20 seconds.
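The keyword half of that trigger, plus the structured handoff, can be sketched as below. The word lists are the ones given above; the `Handoff` field names, the function names, and the whole-word matching rule are assumptions for illustration. The intent classifier is deliberately left out, since nothing about it is described here.

```python
import re
from dataclasses import dataclass
from typing import Optional

TRIGGER_WORDS = {
    "nl": {"klacht", "kapot", "warm", "bruin", "schimmel"},
    "pl": {"skarga", "zepsute", "ciepło", "brązowy", "pleśń"},
}


@dataclass(frozen=True)
class Handoff:
    # Hypothetical field names mirroring the handoff contents listed above.
    caller_name: str
    customer_code: str
    order_ref: str
    complaint_guess: str
    recording_seconds: int = 20


def find_trigger(transcript: str, language: str) -> Optional[str]:
    """Return the first trigger word found as a whole word, or None."""
    triggers = TRIGGER_WORDS.get(language, set())
    for word in re.findall(r"\w+", transcript.lower()):
        if word in triggers:
            return word
    return None


def build_handoff(
    transcript: str, language: str, caller_name: str, customer_code: str, order_ref: str
) -> Optional[Handoff]:
    """Build the planner handoff if the transcript contains a trigger word."""
    trigger = find_trigger(transcript, language)
    if trigger is None:
        return None
    return Handoff(caller_name, customer_code, order_ref, complaint_guess=trigger)
```

Whole-word matching misses inflected forms such as plurals or Polish case endings, which is one reason a keyword list alone is not enough and a classifier sits next to it.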
The planner picks up in around 8 seconds on average. The caller never has to repeat themselves. The recording plays in the planner's headset, not on speaker, so they hear it while they are still in the conversation. If you build a voice agent for any business with real liability exposure, the clever move is not making the agent handle complaints. The clever move is making the handoff to a human invisible.
What 90 days of production looked like
Numbers from the first 90 days, all measured against the prior 90 days as baseline:
- 1,640 calls per week answered by the agent, against 1,420 per week previously answered by humans (the agent picks up calls that used to roll to voicemail)
- Median call length: 35 seconds (was 131 seconds, or 2 minutes 11 seconds)
- Human-routed share: 11% (target was under 15%)
- Kwaliteitsklacht calls routed to a planner within 30 seconds: 94% of cases
- Polish-language calls handled end-to-end: 27% of total (was 0%, because nobody on the day team spoke fluent Polish)
- Warehouse-floor time recovered: roughly 17 hours per week across the team
The number that surprised the CFO was none of these. It was that the night shift, which previously handled around 80 calls between 22:00 and 06:00, stopped handling any. The agent took them. The night shift now uses that time to pre-stage the morning's outbound pallets, which moves the 06:30 loading start to 06:18. Twelve minutes earlier is twelve minutes more sleep for every driver in the route.
What this is not
A few things this build did not do, and did not try to do.
It did not replace NAV. It will not. The wholesaler will run NAV 2009 until the day they switch to whatever comes next, which they have not chosen yet. There is a separate piece of work on the table for that, and it is not this one.
It did not modernise their operations. It answered the phone. That is one workflow. The reason it works is that the workflow is narrow and the integration is honest.
It did not save anyone's job. The two part-time call handlers were already retiring. The agent let the company not backfill those roles, and let the rest of the team stop being interrupted every six minutes during the morning push.
The smallest thing you can do today
Look at the inbound call log for one of your numbers, for one week. Count how many of the calls are one of three things: a status question, a slot confirmation, or a complaint. If that share is over half, you have the same shape of problem this wholesaler had.
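If the call log can be exported with one label per call, the count is a few lines of code. In this sketch the label names (`status`, `slot`, `complaint`) and the function name are assumptions; labelling the calls, by hand or otherwise, is the real work.

```python
from collections import Counter

# Hypothetical labels for the three call types named above.
AGENT_SHAPED = ("status", "slot", "complaint")


def agent_shaped_share(call_labels: list[str]) -> float:
    """Fraction of calls that are a status question, a slot confirmation, or a complaint."""
    if not call_labels:
        return 0.0
    counts = Counter(call_labels)
    return sum(counts[label] for label in AGENT_SHAPED) / len(call_labels)


# Example week: ten calls, eight of them in the three categories.
week = ["status"] * 5 + ["slot"] * 2 + ["complaint"] + ["new_order"] * 2
share = agent_shaped_share(week)
print(f"{share:.0%} of calls fit the pattern; over half: {share > 0.5}")
```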
When we built the voice agent for this Breda wholesaler, the part that surprised us was not the speech model or the language switching. It was the route-planner file drop from 2011, and how cleanly it held up once a new caller was on the other end. If you want to see how we approach that kind of work, the AI agents page is the closest one to this story.
Key takeaway
A voice agent that answers an old phone line cleanly beats a new ERP that rewrites the whole stack. Fit the contour of the legacy system; do not fight it.
FAQ
How do you connect a modern voice agent to an old ERP like Navision 2009?
Through a read-only SQL view mirrored from a read replica, plus a small write queue. You never touch the production NAV instance. The agent reads the view; planners write back through the queue.
Why two ASR models instead of language detection?
Dutch-Polish code-switching is common in fresh-produce calls. Running two recognisers for the first 2.5 seconds and locking to the winner is more reliable than asking a single model to detect language mid-utterance.
What stays human in this build?
Quality complaints (kwaliteitsklachten), all new orders, and anything where the agent's confidence dips. The handoff carries the customer code, order reference, a complaint guess, and the last 20 seconds of audio.
How long did this take to ship?
About nine weeks from kickoff to first production call, then four weeks of tuning. Most of the work was the NAV side and the route-planner file format, not the voice stack itself.