Storings-intake routing: scoring voice agent, desk, hybrid

A CV-ketel down in Voorburg at 06:47 on a Sunday. Three storings-intake models on the table for a sub-€8M installatiebedrijf — here is how we score them.

Jacob Molkenboer · Founder · A Brand New Company · 21 Jun 2026 · 7 min

It is 06:47 on a Sunday in November. A CV-ketel in Voorburg won't ignite. The owner has a six-year-old who has been crying since 06:20. The storings-intake rotation at her installatiebedrijf does not start until Monday 07:00. The voicemail loops her to an app. The app loops her to a form. By 09:00 you get the call from her partner, who sits on the bestuur of the VvE.

You are the directeur. Every quarter the operations lead asks you a different version of the same question: do we put Claude on a Plantronics headset and let it take the call, do we keep the three-person rotation we have always had, or do we run hybrid — agent picks up, escalates to a werkvoorbereider on the calls that warrant it.

Each option has a defender in the room. Each defender has a number that flatters their case. We use a four-line scoring method for this decision. It does not pick the winner for you. It makes the answer defensible: to the directie, to the verzekeraar, and to the Nederlandse Arbeidsinspectie (the former Inspectie SZW) if someone ever asks why a spoed sat in a queue.

The three configurations on the table

A — Claude voice agent on a Plantronics headset

A 24/7 SIP line runs into a Twilio or Voys trunk and is transcribed live. The agent triages against the customer's installation history, classifies on NEN 1010 fault categories, identifies the risico-categorie (gas, elektra, water), and either books a maintenance slot or pages the spoed-monteur on roulatie. The Plantronics piece is not a metaphor. The agent runs on the same headset the werkvoorbereider would have worn, so when the agent does hand off, the audio context is already in the room. No model can hear what a werkvoorbereider hears on the third sentence of a panic call. What it can do is transcribe everything, every time.

B — Three-person service desk rotation

Two werkvoorbereiders and one junior planner, rotating ATV-diensten across the week, 07:00–19:00 weekdays, 08:00–14:00 Saturday, voicemail-to-telefoonaanname on Sunday. This is the model your concurrenten in Naaldwijk and Bodegraven still run, and the model most installatiebedrijven in this revenue band quietly fall back to when the agent conversation gets too political. It works. It is also the model that produced your Voorburg incident above.

C — Hybrid agent-first with werkvoorbereider hand-off

Agent answers first. It collects the eight fields the werkvoorbereider would have asked for — adres, ketel-merk, type-melding, foutcode op het display, tijdsduur, water uit ketel ja/nee, gasgeur ja/nee, contactvoorkeur — classifies risico, and only routes to a human when the call meets one of seven escalation rules. The werkvoorbereider sees the transcript before the call hits her headset. She does not start cold.
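The eight intake fields map naturally onto a small record. The sketch below is one hypothetical way to hold them and to see which ones the agent still has to ask for before a hand-off. The class name, field names, and types are assumptions for illustration, not the schema of any real system.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class IntakeRecord:
    """Hypothetical container for the eight intake fields named above."""
    adres: Optional[str] = None
    ketel_merk: Optional[str] = None
    type_melding: Optional[str] = None
    foutcode_display: Optional[str] = None
    tijdsduur_minuten: Optional[int] = None
    water_uit_ketel: Optional[bool] = None
    gasgeur: Optional[bool] = None
    contactvoorkeur: Optional[str] = None

    def missing_fields(self) -> list[str]:
        """Names of the fields the caller has not answered yet."""
        # An explicit False ("no gas smell") counts as answered; only None is missing.
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


record = IntakeRecord(adres="Voorburg", water_uit_ketel=True, gasgeur=False)
print(record.missing_fields())
```

Keeping "not asked yet" (None) separate from "asked and answered no" (False) matters for the two yes/no fields, because an unanswered gasgeur question is not the same as a caller who smells nothing.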

Per-melding cost at 2,640 monthly storingen

2,640 storingen per maand is 88 per dag, 3.7 per uur averaged. The curve is brutal. Monday 07:00–10:00 carries roughly 22% of the weekly volume. Sunday afternoon carries about 1.8%. That asymmetry kills the pure-rotation model on cost and makes the agent-only model look cheaper than it is on the spreadsheet.
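The averages and the peak can be reproduced in a few lines. This is a sketch that assumes a 30-day month and seven equal calendar days per week; the constant names are ours, and the percentages are the rough shares quoted above.

```python
MELDINGEN_PER_MONTH = 2640
DAYS_PER_MONTH = 30            # assumption: a flat 30-day month
MONDAY_PEAK_SHARE = 0.22       # Monday 07:00-10:00, share of weekly volume
MONDAY_PEAK_HOURS = 3
SUNDAY_AFTERNOON_SHARE = 0.018

per_day = MELDINGEN_PER_MONTH / DAYS_PER_MONTH
per_hour_average = per_day / 24
per_week = per_day * 7

# Calls per hour during the Monday peak versus the flat average.
monday_peak_per_hour = per_week * MONDAY_PEAK_SHARE / MONDAY_PEAK_HOURS
sunday_afternoon_total = per_week * SUNDAY_AFTERNOON_SHARE

print(f"per day: {per_day:.0f}")
print(f"per hour, averaged: {per_hour_average:.1f}")
print(f"Monday peak per hour: {monday_peak_per_hour:.1f}")
print(f"Sunday afternoon, whole block: {sunday_afternoon_total:.1f}")
```

The gap between the averaged hourly figure and the Monday peak figure is the asymmetry that staffing has to absorb.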

The cost line we put on the one-pager for the directie:

                       A: agent      B: rotation   C: hybrid
Per-melding cost       ~€0.94        ~€4.10        ~€1.62
Wall-clock to slot     11 min        14 min        9 min
Spoed escalation       agent⇒pager   human direct  agent⇒wvb
Sunday 06:00–10:00     covered       voicemail     covered
Monday 07:00–10:00     queue ok      overflow      queue ok
Gas-cat audit trail    transcript    handwritten   transcript + wvb sign-off

The numbers are what a Dutch installatiebedrijf at this revenue band tends to land at after about six months: telephony (Voys or Twilio), model inference (Sonnet on the caller-facing flow, Haiku on transcription cleanup and classification), plus roughly 0.3 FTE werkvoorbereider for the hand-off side of the hybrid model. Your numbers will differ by ±30%. The method does not.
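To turn the per-melding line into a monthly figure, a minimal sketch follows. It assumes the per-melding cost stays flat across the whole 2,640-melding volume, which the table does not promise, and it applies the ±30% spread as a simple band. Function and constant names are illustrative.

```python
MELDINGEN_PER_MONTH = 2640
COST_PER_MELDING_EUR = {"A": 0.94, "B": 4.10, "C": 1.62}  # from the table above
BAND = 0.30  # the +/-30% spread quoted in the text


def monthly_cost(model: str) -> float:
    """Monthly intake cost in euros, assuming a flat per-melding cost."""
    return MELDINGEN_PER_MONTH * COST_PER_MELDING_EUR[model]


def monthly_range(model: str) -> tuple[float, float]:
    """Low and high monthly cost after applying the +/-30% band."""
    base = monthly_cost(model)
    return base * (1 - BAND), base * (1 + BAND)


def share_of_saving_kept(model: str, baseline: str = "B", cheapest: str = "A") -> float:
    """Fraction of the baseline-to-cheapest saving that `model` still delivers."""
    full_saving = COST_PER_MELDING_EUR[baseline] - COST_PER_MELDING_EUR[cheapest]
    kept = COST_PER_MELDING_EUR[baseline] - COST_PER_MELDING_EUR[model]
    return kept / full_saving


for model in COST_PER_MELDING_EUR:
    low, high = monthly_range(model)
    print(f"{model}: EUR {monthly_cost(model):,.0f} per month (range {low:,.0f} to {high:,.0f})")
print(f"C keeps {share_of_saving_kept('C'):.0%} of the saving A makes over B")
```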

A — voice agent only — looks like the obvious win on per-melding. It is, on per-melding. It isn't, on total cost of being wrong. The next section is why.

NEN 1010 defensibility under the Arbo-RI&E

This is the part the cost spreadsheet does not say out loud. A storings-intake on a residential installation falls under the customer's verantwoordelijkheid, but the moment your monteur arrives, the Arbeidsomstandighedenwet expects your RI&E to anticipate the risk he or she will face. NEN 1010 governs the installation. Your RI&E governs how your people approach it. The intake is the bridge — the place where you decide what your monteur should expect when the van door slides shut.

When the intake gets the risk classification wrong (the agent decides it is a Cat. II melding, the monteur arrives on a Sunday afternoon expecting a routine reset and walks into a smouldering hoofdverdeler), defensibility comes down to three questions:

(1) Was the intake auditable end-to-end? Did you keep the transcript and the classification path the system took?
(2) Was the classification rule itself documented and versioned at the moment the call came in? Or did the rule change last Tuesday and nobody wrote it down?
(3) Was a human in the loop at the decision point that mattered, with enough context to disagree with the model?

Model A fails question 3, the human in the loop, on the cases that hurt most. Model B passes question 3 on every call but quietly fails questions 1 and 2: rotation staff classify in their heads, the rule lives in a Slack thread from 2024, and nothing is logged in a way a verzekeraar would accept. Model C passes all three if you wire it right.
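The three questions can be read as three checks on what was stored for a single call. The sketch below is a hypothetical audit record; the field names and the three sample records are invented for illustration and simply mirror how the three models are described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IntakeAudit:
    """What was kept for one melding (hypothetical schema)."""
    transcript: Optional[str]
    classification_path: tuple[str, ...]  # steps the classifier took, in order
    rule_version: Optional[str]           # version of the rule live at call time
    human_reviewer: Optional[str]         # who could have overruled the model

    def auditable(self) -> bool:          # question 1
        return bool(self.transcript) and len(self.classification_path) > 0

    def rule_versioned(self) -> bool:     # question 2
        return bool(self.rule_version)

    def human_in_loop(self) -> bool:      # question 3
        return bool(self.human_reviewer)

    def defensible(self) -> bool:
        return self.auditable() and self.rule_versioned() and self.human_in_loop()


# Invented sample records, one per model.
model_a = IntakeAudit("caller: ...", ("risico=elektra", "cat=II"), "rules-v12", None)
model_b = IntakeAudit(None, (), None, "rotating planner")
model_c = IntakeAudit("caller: ...", ("risico=elektra", "cat=II"), "rules-v12", "wvb on roulatie")

for name, audit in [("A", model_a), ("B", model_b), ("C", model_c)]:
    print(name, audit.auditable(), audit.rule_versioned(), audit.human_in_loop())
```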

Hybrid is the only configuration where you can hand the Nederlandse Arbeidsinspectie a versioned prompt, a full transcript, and the werkvoorbereider's note, all within about thirty seconds of the request landing in your inbox.

Warning

Pure voice-agent intake on gas-related storingen is defensible only if the agent escalates 100% of gas-cat calls to a human before dispatch. We have watched one Dutch installatiebedrijf set this rule preemptively, three set it after a near-miss, and zero set it before there was an agent in the loop at all.

Sunday morning ownership when the agent compromises

Ownership of the prompt repo is a named role or it is no role at all. Anthropic's own writeup on building effective agents keeps landing on the same point: the humans around an agent have to own its decisions, not just its outputs. Whoever owns the prompt is also the person who owns the rollback when the agent compromises on a Sunday morning. If you cannot name that person on a Sunday at 07:00, you do not have an agent in production. You have an unowned voice in a queue.

The compromise we care about is not the model hallucinating a part number. That is annoying and easy to catch. The compromise we care about is the model picking the wrong urgency tier. Cat. III treated as Cat. II is a customer complaint. Cat. II treated as Cat. III is a coroner. The whole point of the scoring method is to keep your bedrijf out of the second sentence.

Ownership has to be named before you go live. Two roles, on a single page:

Model A: the directeur owns the prompt and the dispatch, because there is nobody else. This does not survive one near-miss.
Model B: the rotating planner owns the call and the dispatch. Defensible. Slow on volume spikes. Falls over on Sundays.
Model C: the werkvoorbereider on roulatie owns the dispatch. The technisch directeur owns the prompt. Two named humans, two distinct responsibilities, both written into the kwaliteitshandboek before the first call lands.
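The ownership rule can be made into a go-live gate. The following is a minimal sketch, assuming the two roles are stored as plain names in some config; the class, function, and example names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ownership:
    prompt_owner: str    # who versions and rolls back the prompt
    dispatch_owner: str  # who decides on sending the spoed-monteur


def go_live_problems(ownership: Ownership) -> list[str]:
    """Return the reasons this setup is not ready for production, if any."""
    prompt = ownership.prompt_owner.strip()
    dispatch = ownership.dispatch_owner.strip()
    problems = []
    if not prompt:
        problems.append("no named prompt owner")
    if not dispatch:
        problems.append("no named dispatch owner")
    if prompt and prompt == dispatch:
        problems.append("prompt and dispatch are owned by the same person")
    return problems


# Model A as described above: the directeur owns both.
print(go_live_problems(Ownership("directeur", "directeur")))
# Model C: two named humans, two distinct responsibilities.
print(go_live_problems(Ownership("technisch directeur", "werkvoorbereider on roulatie")))
```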

Agents are reliable to the extent that the humans around them have a fast, clear, documented escalation path. Tooling does not save you from an unowned spoed. There is no temporary Cloudflare account for an installatiebedrijf that forgot to name who picks up.

The scoring sheet we hand the directie

This is the one page that goes to the directievergadering. Weights are ours and we revisit them every twelve months. Yours should reflect your spoed-mix and your verzekering.

Dimension                          Weight   A     B     C
Per-melding cost                   25%      9     4     8
Wall-clock to slot                 15%      7     5     9
Sunday/holiday coverage            15%      9     2     9
NEN 1010 audit trail               20%      4     5     9
Arbo-RI&E defensibility            15%      3     7     9
Ownership clarity on spoed         10%      4     7     8
                                            ---   ---   ---
Weighted total                              6.30  4.80  8.65

Column A wins on cost and coverage and loses everything that ends up in front of a rechter. Column B wins on simplicity and loses on volume and weekend. Column C is the answer for almost every installatiebedrijf in this revenue band — not because it is shiny, but because it is the only configuration that scores above 7 on the two rows that survive a near-miss.
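The weighted total is each score multiplied by its row weight, summed per column. The sketch below recomputes it from the rows of the sheet so you can swap in your own weights; the variable names are ours, and weights are kept as whole percentages to avoid floating-point noise.

```python
# (weight in %, score A, score B, score C), copied from the scoring sheet.
SHEET = {
    "Per-melding cost":          (25, 9, 4, 8),
    "Wall-clock to slot":        (15, 7, 5, 9),
    "Sunday/holiday coverage":   (15, 9, 2, 9),
    "NEN 1010 audit trail":      (20, 4, 5, 9),
    "Arbo-RI&E defensibility":   (15, 3, 7, 9),
    "Ownership clarity on spoed": (10, 4, 7, 8),
}


def weighted_totals(sheet: dict[str, tuple[int, int, int, int]]) -> dict[str, float]:
    """Weighted score per model on the same 0-10 scale as the rows."""
    total_weight = sum(row[0] for row in sheet.values())
    if total_weight != 100:
        raise ValueError(f"weights sum to {total_weight}%, expected 100%")
    totals = {}
    for index, model in enumerate(("A", "B", "C"), start=1):
        totals[model] = sum(row[0] * row[index] for row in sheet.values()) / 100
    return totals


for model, total in weighted_totals(SHEET).items():
    print(f"{model}: {total:.2f}")
```

Rerunning this after every change to the weights keeps the bottom row of the sheet in step with the rows above it.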

If your scoring sheet gives you a different answer — say B wins on a 2,000-melding floor because your spoed-mix is 60% planned-maintenance follow-up rather than echte storingen — that is also fine. The method is the method. The point is that the directie can see why.

When we built the storings-intake agent for a klimaat-installatiebedrijf near Zoetermeer, the thing we kept running into was the gas-cat escalation rule. The agent was good at calming the caller down — too good. The werkvoorbereider was getting fewer Sunday escalations than she should have. We solved it by writing the gas-cat escalation as a hard, non-negotiable routing rule in the prompt, not as a model judgement. If you are sketching the same decision for your own bedrijf, our voice agents page has the architecture sketch and the seven escalation rules we now run by default.
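A hard rule differs from a model judgement in where it sits in the decision order. The project described above put the rule in the prompt; the sketch below shows the same idea as routing code instead, a swap made purely for illustration. The function and enum names are hypothetical, and treating an unanswered gas question as a gas call is our assumption.

```python
from enum import Enum
from typing import Optional


class Route(Enum):
    HUMAN = "werkvoorbereider"
    AGENT = "agent"


def route_call(
    gasgeur: Optional[bool],
    risico_categorie: str,
    model_suggests_escalation: bool,
) -> Route:
    """Decide who handles the call; the gas rule is checked before the model's opinion."""
    # Hard rule: any gas signal goes to a human, whatever the model thinks.
    # Assumption: an unanswered gas question (None) is treated as a gas call.
    if gasgeur is not False or risico_categorie == "gas":
        return Route.HUMAN
    # Only non-gas calls are left to model judgement.
    return Route.HUMAN if model_suggests_escalation else Route.AGENT


# A calm caller who smells gas still reaches a human.
print(route_call(gasgeur=True, risico_categorie="gas", model_suggests_escalation=False))
```

Because the gas check returns before the model's suggestion is even read, a caller who has been calmed down cannot be talked out of an escalation.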

Open a spreadsheet. List your last 60 storingen. For each one, mark whether a werkvoorbereider would have classified it differently than the intake did. The ratio you get is your weight for line 4 of the scoring sheet above, the NEN 1010 audit trail row. That is the smallest defensible thing you can do this week, and it is enough to start the conversation in the next directievergadering.
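If the spreadsheet is exported as pairs of classifications, the ratio is a one-liner. A minimal sketch, with made-up data standing in for your last 60 storingen; the function name and category labels are assumptions.

```python
def disagreement_ratio(pairs: list[tuple[str, str]]) -> float:
    """Share of meldingen where the werkvoorbereider would have classified differently.

    Each pair is (classification at intake, classification by the werkvoorbereider).
    """
    if not pairs:
        raise ValueError("need at least one melding")
    disagreements = sum(1 for intake, reviewer in pairs if intake != reviewer)
    return disagreements / len(pairs)


# Made-up sample: 51 meldingen where both agree, 9 where they differ.
sample = [("Cat. III", "Cat. III")] * 51 + [("Cat. III", "Cat. II")] * 9
print(f"{disagreement_ratio(sample):.0%} of {len(sample)} meldingen classified differently")
```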

Key takeaway

Hybrid agent-first intake wins because it is the only configuration that gives you a transcript, a versioned rule, and a named human on the calls that end up in court.

FAQ

What does a voice agent for storings-intake actually cost at 2,640 meldingen a month?

Per-melding tends to land near €0.94 for agent-only and €1.62 for hybrid at that volume after about six months. Trunk choice and model mix shift these by roughly 30%.

How does NEN 1010 apply to an AI voice agent?

NEN 1010 governs the installation, not the intake. The intake falls under your RI&E. Keep transcripts, version your classification rules, and route 100% of gas-cat calls to a human before dispatch.

Who should own the spoed-monteur dispatch if the agent compromises?

Name two people in the kwaliteitshandboek before going live: the werkvoorbereider on roulatie owns the dispatch, and the technisch directeur owns the prompt. Two signatures, one page.

Why hybrid rather than pure agent or pure rotation?

Pure agent is cheapest per melding and indefensible on gas-cat. Pure rotation falls over on weekends and volume spikes. Hybrid keeps roughly 78% of the cost win over rotation (€2.48 of the €3.16 saved per melding) and puts a human on the calls that actually matter.
