Voice agents for logistics: Transwide and Oracle 11g playbook
The planner team in Antwerp was drowning in 2,480 weekly 'where's my truck' calls. The TMS was 14 years old. Here is how the voice agent shipped anyway.

Tuesday, 06:42, the planner's desk in Antwerp. Three landlines blinking. One is a Slovak chauffeur outside the terminal at Liefkenshoek wanting to know which gate. One is a French shipper asking why a 13-tonne pallet sat in Zeebrugge for two days. The third is the customer's customer, who already called yesterday, also at 06:42.
Our client is a 31-person logistiek-dienstverlener in the port of Antwerp. They run roughly 110 trucks a day across the Benelux, Germany, and northern France, with a steady douane corridor to and from the UK after Brexit. They are not a big name. They are good at their job. They do not have a software team.
Their dispatch handled 2,480 inbound calls a week. Our four-week sample showed 71 percent were status calls: where is my truck, what is my ETA, has the douane stop cleared yet. The remaining 29 percent were the calls a planner actually needs to take: new orders, exceptions, claims, a chauffeur in real trouble.
Average handle time was 2 minutes 40 seconds. Multiply that across the status calls alone and the planner team was losing roughly 78 hours a week to looking up information it already had. That is two full-time planners' worth of attention, spent answering questions whose answers were already sitting in two different systems.
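The 78-hour figure can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the figure covers only the 71 percent of calls that were status calls; the function and constant names are ours, the numbers come from the four-week sample.

```python
# Figures from the four-week sample described above.
WEEKLY_CALLS = 2480
STATUS_SHARE = 0.71           # share of inbound calls that were status calls
HANDLE_TIME_S = 2 * 60 + 40   # average handle time: 2 min 40 s


def weekly_status_hours(calls: int, share: float, handle_time_s: int) -> float:
    """Hours per week spent on status calls at the given average handle time."""
    return calls * share * handle_time_s / 3600


hours = weekly_status_hours(WEEKLY_CALLS, STATUS_SHARE, HANDLE_TIME_S)
print(f"{hours:.1f} hours per week")
```

Run against all 2,480 calls instead of the status share, the same formula gives about 110 hours, so the 78 hours is the slice a voice agent can realistically take over.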
That was the brief from the operations lead: build a voice agent that owned the 71 percent and handed the other 29 percent back to people, fast, without breaking a TMS that no developer alive wanted to touch.
The shape of the stack we inherited
The TMS was Transwide. Not last year's Transwide. The 2012 build, never upgraded, with a custom EDI layer one of the founders wrote himself the year Belgian truckers blocked the E40.
Underneath, an Oracle 11g database with the actual rittenplanning. A homegrown PL/SQL job ran every twenty minutes to push ride state into Transwide. Driver scans came in by SMS via a Belgian carrier nobody had a contract printout for.
Oracle 11g went out of Premier Support in 2015 and Extended Support in 2020. The instance was still running. Nobody had touched its init.ora since 2017. The DBA had retired.
If you are reading this and recognising your own stack, you are not unusual. The question is whether the voice agent on top has to inherit the legacy fragility or can route around it. Ours routed around.
Why a voice agent, not a chat widget
The obvious answer is that drivers and shippers were calling, not typing. The real answer is that 38 percent of the inbound calls came from chauffeurs at the wheel, often with hands on the wheel and a German road in front of them. A chat widget loses to a phone every time on the cab side. On the shipper side, status-call habits are decades old. People who book transport pick up the phone. We could replace the channel, but not the muscle memory.
We picked a voice stack we could reason about: ElevenLabs for synthesis (Dutch, French, German voices), Whisper-large-v3 for transcription, tuned on a six-hour internal corpus of clipped chauffeur Dutch, and our own dialog state machine. We did not use a single end-to-end voice model. The cost at this volume was untenable in mid-2025, and the latency on Benelux carrier routes was too high.
The data plumbing nobody wanted to touch
You cannot answer a status question if you cannot see the status. So before we wrote a line of dialog, we built two read paths.
The first was a 60-second materialised view over Oracle 11g, exposed as a tiny REST service running on a separate VM. The view joined RIT, STOP, T1_DOC, and CHAUFFEUR_SCAN into a flat row keyed by the public order reference.

    -- Flat status row per order. Refreshed on a 60-second schedule rather than
    -- ON COMMIT, so the refresh never runs inside the planners' own commits.
    CREATE MATERIALIZED VIEW mv_rit_status
      BUILD IMMEDIATE
      REFRESH COMPLETE
      START WITH SYSDATE NEXT SYSDATE + 1/1440  -- 1/1440 of a day = 60 seconds
    AS
    SELECT r.ref_extern         AS order_ref,
           r.status             AS rit_status,
           s.eta_klant          AS eta_planned,
           s.werkelijk_aankomst AS eta_actual,
           t.t1_nummer          AS t1_doc,
           t.t1_aangemaakt      AS t1_created_at,
           c.laatste_scan       AS last_scan_at,
           c.laatste_locatie    AS last_loc
    FROM rit r
    JOIN stop s ON s.rit_id = r.id AND s.volgorde = 1  -- first stop only
    LEFT JOIN t1_doc t ON t.rit_id = r.id              -- not every ride has a T1
    LEFT JOIN chauffeur_scan c
           ON c.chauffeur_id = r.chauffeur_id
          AND c.actief = 'J';                          -- active scan record only

The second was a polite, rate-limited pull against the Transwide HTTP API for everything Oracle did not own (shipper EDI events, planned customs windows, transhipment confirmations). Transwide's API is fine if you respect it. We used a 2-call-per-second cap per integration token and a 24-hour SQLite cache for slow-changing fields.
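The cap and the cache fit in a few dozen lines. What follows is a sketch under stated assumptions, not the production code: `MinIntervalLimiter`, `SqliteTtlCache`, and `cached_fetch` are names invented for illustration, and `fetch` stands in for whatever function makes the real Transwide HTTP call, which is not shown here.

```python
import sqlite3
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Spaces calls evenly so that at most `rate` calls start per second."""

    def __init__(self, rate: float, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = 1.0 / rate
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> None:
        now = self._clock()
        if now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self._min_interval


class SqliteTtlCache:
    """Key/value cache in SQLite; entries older than `ttl_s` count as misses."""

    def __init__(self, path: str = ":memory:", ttl_s: float = 24 * 3600,
                 clock=time.time):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(key TEXT PRIMARY KEY, value TEXT, stored_at REAL)"
        )
        self._ttl_s = ttl_s
        self._clock = clock

    def get(self, key: str) -> Optional[str]:
        row = self._db.execute(
            "SELECT value, stored_at FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None or self._clock() - row[1] > self._ttl_s:
            return None
        return row[0]

    def put(self, key: str, value: str) -> None:
        self._db.execute(
            "INSERT OR REPLACE INTO cache (key, value, stored_at) VALUES (?, ?, ?)",
            (key, value, self._clock()),
        )
        self._db.commit()


def cached_fetch(key: str, cache: SqliteTtlCache, limiter: MinIntervalLimiter,
                 fetch: Callable[[str], str]) -> str:
    """Serve from cache when fresh; otherwise make one rate-limited upstream call."""
    hit = cache.get(key)
    if hit is not None:
        return hit
    limiter.wait()          # only real upstream calls count against the cap
    value = fetch(key)      # `fetch` is a placeholder for the HTTP request
    cache.put(key, value)
    return value
```

One limiter per integration token, constructed as `MinIntervalLimiter(rate=2)`, would mirror the 2-call-per-second cap. Cache hits skip the limiter entirely, which is the point of caching slow-changing fields.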
The voice agent never talks to Oracle directly. It calls our REST layer, which has SLAs, logging, and circuit-breakers the 2012 stack does not. If Oracle is having a moment, and Oracle 11g has moments, the agent degrades to 'I'm checking with planning, can I call you back in twelve minutes' instead of dropping the call.
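The degrade-instead-of-drop behaviour is a standard circuit breaker around the status lookup. The sketch below is illustrative only: the class names, the failure threshold of three, and the 30-second cool-down are assumptions, and `fetch_status` stands in for the call to the REST layer.

```python
import time
from typing import Callable, Optional

FALLBACK = "I'm checking with planning, can I call you back in twelve minutes?"


class CircuitOpen(Exception):
    """Raised while the breaker is refusing calls."""


class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after_s: float = 30.0,
                 clock=time.monotonic):
        self._max_failures = max_failures
        self._reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable, *args):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after_s:
                raise CircuitOpen()
            # Cool-down elapsed: allow one trial call; a failure re-opens at once.
            self._opened_at = None
            self._failures = self._max_failures - 1
        try:
            result = fn(*args)
        except Exception:
            self._failures += 1
            if self._failures >= self._max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        return result


def answer_status(order_ref: str, breaker: CircuitBreaker,
                  fetch_status: Callable) -> str:
    """Return a spoken status line, or the callback offer if the lookup fails."""
    try:
        row = breaker.call(fetch_status, order_ref)
    except Exception:
        return FALLBACK
    return f"Order {order_ref}: status {row['rit_status']}."
```

While the breaker is open the agent does not touch the backend at all, so a struggling Oracle instance gets room to recover instead of a retry storm.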
One non-obvious rule: do not let your voice agent issue arbitrary SQL against the production database, however good the LLM-to-SQL demos look. One bad join in a 14-year-old schema will lock a table the planning UI depends on, and the planner you've just inconvenienced is the one who has to vouch for your project at the next management meeting. The agent reads materialised views. Always.
The T1 escalation rule
Of the 2,480 weekly calls, the most expensive class, measured in money lost rather than minutes spent, was customs trouble. A T1 transit document is the EU paper that lets goods move under customs control without paying duty at every border. If a truck arrives at the Vrasene customs office (the standard release point on the way to and from the UK on the E34) with a T1 that has been open for more than 72 hours, the chance of an additional inspection rises sharply. An inspection costs the client two to four hours of chauffeur time and a queue slot in the planner's day.
The rule we implemented, in one paragraph: any caller asking about an order that has an open T1 document older than 72 hours, where the planned route crosses Vrasene within the next 90 minutes, is taken off the voice agent immediately and dropped into a hot planner queue. Not voicemail. A live transfer with the full context preloaded into the planner's screen.

    from datetime import datetime, timedelta, timezone

    def should_escalate(call_context, order):
        """True when an open T1 is 72+ hours old and Vrasene is within 90 minutes."""
        if not order.t1_doc:
            return False
        # created_at is assumed to be a timezone-aware UTC datetime
        if (datetime.now(timezone.utc) - order.t1_doc.created_at) < timedelta(hours=72):
            return False
        # crosses_vrasene_within() is the route helper, defined elsewhere
        if not crosses_vrasene_within(order, minutes=90):
            return False
        return True

    # in the dialog state machine's turn handler
    if should_escalate(ctx, order):
        transfer_to(queue="planning_hot", context=ctx)  # live transfer, context preloaded
        return

The 90-minute window is not arbitrary. It is the time it takes a chauffeur to get from the Liefkenshoektunnel exit to the Vrasene release point in average port-area traffic. If the planner gets the call before that window closes, they can rebook the customs slot, push a corrected T1 to the douane portal, or, in two cases out of five, flag the truck to a different release point entirely. After Vrasene, the options collapse.
The point of a voice agent in logistics is not to answer 100 percent of calls. It is to be ruthless about which 8 percent it must not answer, and to hand those off with the full context before the truck reaches the point of no return.
What the planner team's Monday looked like after four weeks
Eight weeks after go-live, the four-week rolling average, as a share of all inbound calls, looked like this:
- 71 percent were status calls fully resolved by the agent, no human transfer.
- 9 percent transferred at the caller's request ('I want to speak to someone').
- 8 percent escalated by rule. T1 + Vrasene proximity was the largest class, joined by smaller tripwires we added for German Maut violations and a French SIVEP fish-import window.
- 12 percent the agent could not handle: new orders, claims, complaints, calls in Polish (we shipped Dutch, French, German, and English).
The planner team's call volume dropped from 2,480 to 740. Handle time on the calls they did take went up, to 4 minutes 10 seconds, because the calls left for them were the calls worth taking. Two planners moved from the phone to the customs and exceptions desk full-time. Nobody was fired. The chauffeurs noticed first; one of them sent the office a box of Liège waffles.
The carrier was the hard part
The model was not the bottleneck. Whisper-large-v3 transcribed Dutch chauffeur dialect well enough on a desk handset in a quiet office. The bottleneck was the audio that actually reached the model. A driver on speakerphone in a moving Volvo at 130 km/h on the A2, roaming on a German MVNO, sounds nothing like the clean phone audio in any public benchmark.
The first week of real traffic, the agent misheard order references in roughly half the calls. We almost rebuilt the transcription layer. Instead, we added a single confirmation turn after every order reference: 'I have order 4-4-1-7-2, is that correct?'. Misreads dropped from 51 percent to under 3 percent. The cost was one extra second per call and another small dent in the 'natural' feel we had already given up on.
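The confirmation turn itself is small. The sketch below shows one way to build the read-back prompt and interpret the reply; the function names and the yes/no word lists are hypothetical, and a production version would need a fuller vocabulary per language.

```python
from typing import Optional

# Hypothetical reply vocabularies; a real deployment would cover more variants.
YES_WORDS = {"yes", "ja", "oui", "correct", "klopt"}
NO_WORDS = {"no", "nee", "non", "nein", "not", "niet", "nicht"}


def read_back(order_ref: str) -> str:
    """Spell the reference out character by character so a misheard one stands out."""
    spelled = "-".join(ch for ch in order_ref if ch.isalnum())
    return f"I have order {spelled}, is that correct?"


def interpret_reply(reply: str) -> Optional[bool]:
    """True = confirmed, False = rejected, None = unclear, so ask again."""
    words = set(reply.lower().replace(",", " ").replace(".", " ").split())
    if words & NO_WORDS:      # any negation wins, e.g. "not correct"
        return False
    if words & YES_WORDS:
        return True
    return None


print(read_back("44172"))
```

Checking for negations first is deliberate: treating "not correct" as a confirmation would quote a status for the wrong order, which is worse than asking again.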
The second carrier problem was uglier. Two days after go-live, the agent started dropping calls from a single SIM range at the 17-second mark. The TTS provider had renegotiated to a codec the carrier silently downgraded, and after seventeen seconds the audio frame buffer collapsed. We pinned the codec on our side and added a heartbeat test against four Belgian carriers that runs every fifteen minutes. The dashboards for that test hang on the planner room wall now, because the planners care more about codecs than we do.
What we would not do again
Three things, in order of regret.
First, we spent two weeks tuning prompts to make the agent sound 'natural'. The drivers did not want natural. They wanted predictable. They wanted the agent to say the order number back, say the ETA, say the next checkpoint, and let them go. We deleted half the conversational scaffolding and the satisfaction score went up.
Second, we built our own escalation queue UI before checking what the planners already used. They lived in the Transwide planning board. We rebuilt our queue as a row inside that board, colour-coded by escalation rule. Adoption went from 'forced' to 'they ask for new escalation classes now'.
Third, we underspent on observability for the first month. Voice agents fail in ways chat agents do not: a dialog state lock, a TTS provider hiccup, a carrier renegotiating codecs mid-call. We now log every turn with audio, transcript, dialog state, and downstream API timings. A failed call is a 90-second clip a planner can listen to, not a row in a log table.
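A per-turn record along those lines might look like the sketch below. The field names and the JSON-lines format are assumptions for illustration, not the schema actually in use.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass
class TurnLog:
    call_id: str
    turn: int
    audio_path: str       # where the audio clip for this turn is stored
    transcript: str       # what the transcription layer heard
    dialog_state: str     # state machine node when the turn ended
    api_timings_ms: Dict[str, float] = field(default_factory=dict)


def to_json_line(entry: TurnLog) -> str:
    """Serialise one turn as a single JSON line, keeping non-ASCII text readable."""
    return json.dumps(asdict(entry), ensure_ascii=False, sort_keys=True)


def slow_calls(entry: TurnLog, threshold_ms: float) -> Dict[str, float]:
    """Downstream calls in this turn that took longer than the threshold."""
    return {name: ms for name, ms in entry.api_timings_ms.items() if ms > threshold_ms}
```

Keeping the audio path next to the transcript and dialog state is what lets a failed call be replayed as a clip rather than reconstructed from log rows.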
The smallest thing you can do this week
Sit with your dispatch team for two hours on a normal Tuesday. Tally the inbound calls into three buckets: status, exception, new business. If the status bucket is over half, you have a voice-agent problem, not a staffing problem. The next question, which 8 percent must never go to the agent, is the one that decides whether your project ships or sits in a slide deck.
When we built the voice agent for the Antwerp client, we kept running into the same problem: the 14-year-old Transwide schema and the Oracle 11g instance underneath it had each been patched by different people over a decade, and nothing was documented. We ended up writing a one-page glossary of fields by name and meaning before any dialog work began, and that glossary became the contract between the voice agent and the planner team. If you are looking at similar work on a legacy stack, our notes on shipping AI agents against systems older than the team running them are written from this kind of week.
Key takeaway
A voice agent in logistics isn't about answering every call. It's about being ruthless on which calls must escalate before the next border.
FAQ
How do you keep a voice agent from hallucinating an ETA?
Anchor every ETA in a named database field with a timestamp. If the field is missing or stale beyond a threshold, the agent transfers to planning instead of inventing one. Never let the model generate a number.
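In code, that rule is a guard in front of the text-to-speech step. The sketch below is illustrative: the function name, the return shape, and the 20-minute staleness threshold are assumptions, not values from the deployment.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

MAX_AGE = timedelta(minutes=20)  # hypothetical staleness threshold


def eta_or_transfer(eta: Optional[datetime], updated_at: Optional[datetime],
                    now: datetime,
                    max_age: timedelta = MAX_AGE) -> Tuple[str, Optional[str]]:
    """Return ("say", "HH:MM") for a fresh ETA, otherwise ("transfer", None)."""
    if eta is None or updated_at is None:
        return ("transfer", None)      # missing field: never invent a number
    if now - updated_at > max_age:
        return ("transfer", None)      # stale field: hand over to planning
    return ("say", eta.strftime("%H:%M"))


now = datetime(2025, 6, 3, 6, 42, tzinfo=timezone.utc)
fresh = eta_or_transfer(datetime(2025, 6, 3, 9, 15, tzinfo=timezone.utc),
                        now - timedelta(minutes=5), now)
print(fresh)
```

The spoken time is formatted straight from the database value, so the language model only chooses the wording around the number and never the number itself.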
Can a voice agent talk to a 14-year-old TMS?
Yes, but not directly. Build a small read-layer over a materialised view or a cache, and let the agent talk to that. Never let an LLM-driven agent issue ad-hoc queries against a legacy production database.
What's the ROI math on a logistics voice agent?
Take weekly inbound call count, multiply by average handle time, multiply by dispatcher loaded cost. If 60 to 80 percent of calls are status lookups, that is the number you reclaim, minus the agent's run cost.
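Written out as a function, the calculation looks like this. It is a sketch with names of our choosing; the loaded hourly cost and the agent's run cost are inputs you supply, and no values for them are given in this article.

```python
def weekly_net_saving(calls: int, handle_time_s: float, loaded_cost_per_hour: float,
                      status_share: float, agent_run_cost: float) -> float:
    """Weekly dispatcher cost reclaimed from status calls, minus agent run cost."""
    total_hours = calls * handle_time_s / 3600        # all inbound calls
    total_cost = total_hours * loaded_cost_per_hour   # what those hours cost
    return total_cost * status_share - agent_run_cost
```

With this article's volumes you would pass `calls=2480`, `handle_time_s=160`, and `status_share=0.71`, then fill in your own cost figures.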
Which languages should a Benelux voice agent ship with?
Dutch, French, German, and English cover the chauffeur and shipper population for most Belgian operators. Polish is the next one. Many drivers in northwestern Europe speak it as a first language.
What's the single biggest gotcha?
Not the LLM. The carrier. Audio quality and codec negotiation on Benelux mobile networks vary by handset and by 4G/5G fallback. Test on real driver phones, not on a desk handset.