
Customer-support stack: a 4,200-ticket decision method

It's 23 November 2026. You run a €4M Dutch e-com brand, you handled 4,200 support tickets last month, and the board wants a number for next year.

Jacob Molkenboer · Founder · A Brand New Company · 15 Apr 2026 · 8 min


It's 23 November 2026. You run a €4M Dutch e-commerce brand. Your support inbox took 4,200 tickets last month and your two existing seats are quitting in March. The board wants a number for next year's support stack on Monday, and an explanation that survives the audit conversation in April.

There are three real choices in front of you, and the founders who pick badly tend to pick on vibes. Here is the method we walk our clients through, with the actual numbers.

The three options on the table

Pretend we already know your domain. Sub-€6M Dutch e-commerce, mid-quality SKU mix, ~12% return rate, peak load in the eight weeks before Sinterklaas and Kerst. The realistic shortlist:

A. Pure agent. A Claude-driven chat agent on your site and in your inbox, RAG-backed against your help centre and order DB, escalates only by email tag.
B. Manila BPO, two-person rotation. Two trained agents covering English and basic Dutch on a follow-the-sun shift, working off your Zendesk macros.
C. Hybrid. Agent-first triage, a Dutch tier-2 hand-off for anything the agent flags or that lands on a refund or complaint pattern.

You will hear vendors claim there is a fourth or fifth option. There is not, at €500k–€6M revenue. You do not have the spend for a 15-seat NL contact centre, and you should not run an agent without a human escape hatch.

Per-ticket cost at 4,200 monthly contacts

The way to get this number wrong is to compare API spend to BPO salaries. The way to get it right is to amortise the build cost, count the supervision hours, and divide by the ticket volume you actually see — not the volume you wish you saw.

Working numbers from six Dutch e-com builds in the last fourteen months, fully loaded in euros per month:

// 4,200 tickets/month, EUR, fully loaded
const month = {
  A_pure_agent: {
    api_spend:       420,   // Claude Sonnet, ~€0.10/ticket avg incl. retries
    monitoring:      350,   // PostHog + Langfuse + on-call rotation
    supervision:     600,   // 8h/week founder or ops lead review
    amortised_build: 2_080, // €25k build over 12 months
    total: 3_450,
  },
  B_manila_bpo_2p: {
    seats:    4_400,        // 2 FTE @ €2,200 loaded incl. BPO margin
    nl_qa:      400,        // one NL-speaker QA review pass weekly
    tooling:    250,        // Zendesk seats + translation tooling
    total: 5_050,
  },
  C_hybrid: {
    api_spend:      300,    // 70% deflection → ~2,940 tickets
    nl_tier_2:    3_200,    // 1 part-time NL agent, ~24h/week
    monitoring:     350,
    amortised_build: 1_500, // €18k build over 12 months
    total: 5_350,
  },
}

// per ticket
// A: 3450 / 4200 = €0.82
// B: 5050 / 4200 = €1.20
// C: 5350 / 4200 = €1.27
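The per-ticket figures can be recomputed from the line items instead of typed in by hand, which makes it easier to swap in your own quotes and volume. A minimal sketch using the numbers above; the helper names `monthlyTotal` and `perTicket` are illustrative and not part of any library.

```javascript
// Line items copied from the working numbers above (EUR per month).
const lineItems = {
  A_pure_agent: { api_spend: 420, monitoring: 350, supervision: 600, amortised_build: 2080 },
  B_manila_bpo_2p: { seats: 4400, nl_qa: 400, tooling: 250 },
  C_hybrid: { api_spend: 300, nl_tier_2: 3200, monitoring: 350, amortised_build: 1500 },
};

// Sum every line item for one option.
const monthlyTotal = (items) =>
  Object.values(items).reduce((sum, eur) => sum + eur, 0);

// Fully loaded cost per ticket, rounded to whole cents.
const perTicket = (items, tickets) =>
  Math.round((monthlyTotal(items) / tickets) * 100) / 100;

const tickets = 4200; // use the volume you actually see
for (const [option, items] of Object.entries(lineItems)) {
  console.log(option, monthlyTotal(items), perTicket(items, tickets));
}
```

Changing `tickets` to a quieter month shows how quickly the amortised build cost starts to dominate option A's per-ticket number.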

Pure agent wins the spreadsheet. It is not close. But the spreadsheet is one of three columns on the board, not the only one.

Defensibility under the Wet oneerlijke handelspraktijken

The Dutch Wet oneerlijke handelspraktijken — the local implementation of EU Directive 2005/29/EC — sets the rules for misleading or aggressive practice toward consumers. The ACM enforces it. The thing it cares about, in plain language: did you tell the customer something that materially changed their decision, and was it true.

An agent that hallucinates a 30-day return window when your policy is 14 days is the textbook violation. The defence is not "the model said it" — it is the audit trail you can put in front of an ACM caseworker.

What that trail has to contain, for each ticket:

The exact prompt and context the agent received, versioned.
The retrieved knowledge-base chunks it grounded on, with hashes.
The full conversation, including any tool calls the agent made (refund issued, label generated, voucher created).
The reviewer signature if a human touched it.
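One way to picture the four items above is as a single record written per ticket. The sketch below assumes a Node.js backend and uses the built-in `node:crypto` module for SHA-256 hashes; every field name (`ticketId`, `promptVersion`, `recordHash` and so on) is hypothetical, and where the record gets stored is left out.

```javascript
const { createHash } = require("node:crypto");

// SHA-256 hex digest of a string.
const sha256 = (text) => createHash("sha256").update(text, "utf8").digest("hex");

// Build one audit record for a ticket. Field names are illustrative only.
function buildAuditRecord({ ticketId, promptVersion, prompt, chunks, conversation, toolCalls, reviewer }) {
  const record = {
    ticketId,
    promptVersion,                       // version label of the prompt template
    prompt,                              // exact prompt and context the agent received
    chunks: chunks.map((c) => ({ id: c.id, hash: sha256(c.text) })), // retrieved KB chunks
    conversation,                        // full message list
    toolCalls,                           // e.g. refund issued, label generated
    reviewer: reviewer ?? null,          // set only if a human touched the ticket
  };
  // Digest over the whole record, so a later edit changes the hash.
  return { ...record, recordHash: sha256(JSON.stringify(record)) };
}

const example = buildAuditRecord({
  ticketId: "T-1",
  promptVersion: "v1",
  prompt: "You are a support agent.",
  chunks: [{ id: "returns-policy", text: "Returns are accepted within 14 days." }],
  conversation: [{ role: "customer", text: "Can I return this?" }],
  toolCalls: [],
});
console.log(example.recordHash);
```

A hash only helps if the record is kept somewhere the agent itself cannot rewrite; that storage choice is outside this sketch.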

Option A can produce this trail if you build it that way from day one. We have done it. The cost is in the architecture, not in compute.

Option B produces a different shape of trail — Zendesk macros, agent IDs, QA scorecards — that the ACM is already comfortable reading. There is twenty years of precedent for "human agent made a mistake."

Option C gives you the strongest position: the agent has a tightly scoped lane, and anything that smells like a complaint, a refund above €X, a delivery dispute, or a vulnerable-consumer flag (Dutch keywords: "klacht", "ACM", "advocaat", "ziek", "ouder", "rouw") is routed to a named NL human who signs off.
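The routing rule for option C could look roughly like the sketch below. It takes the keyword list from the paragraph above and matches whole words only, so "klacht" matches but "klachten" does not; a real build would need stemming or a classifier. The function name `routeTicket`, the ticket fields, and the route labels are assumptions, and the refund threshold is deliberately a required argument because the text leaves it as €X.

```javascript
// Keywords from the text; compared case-insensitively as whole words.
const ESCALATION_KEYWORDS = new Set(["klacht", "acm", "advocaat", "ziek", "ouder", "rouw"]);

// Decide whether a ticket stays with the agent or goes to the NL tier-2 human.
// refundThresholdEur is the "€X" from the text: you have to choose it yourself.
function routeTicket({ text, refundEur = 0, isDeliveryDispute = false }, refundThresholdEur) {
  if (!Number.isFinite(refundThresholdEur)) {
    throw new Error("refundThresholdEur must be set explicitly");
  }
  // Split on anything that is not a letter, then drop empty fragments.
  const words = text.toLowerCase().split(/[^\p{L}]+/u).filter(Boolean);
  const hit = words.find((w) => ESCALATION_KEYWORDS.has(w));
  if (hit) return { route: "nl_tier_2", reason: `keyword:${hit}` };
  if (refundEur > refundThresholdEur) return { route: "nl_tier_2", reason: "refund_above_threshold" };
  if (isDeliveryDispute) return { route: "nl_tier_2", reason: "delivery_dispute" };
  return { route: "agent", reason: "in_lane" };
}
```

Returning a `reason` alongside the route means the escalation decision itself can be stored in the audit trail.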

Warning

European regulators are not waiting. The EU AI Act is already in force on a staged timeline, and the ACM has been publishing dark-pattern enforcement work since 2022. Do not assume "no one is watching." Build the audit trail before you ship, not after the first complaint.

Refund-budget ownership on a Christmas weekend

The scenario that decides this for most founders is the one nobody puts in the deck. It is 21 December, your warehouse is closed, the agent is live, and the model is in a generous mood. A customer pushes hard, the agent issues a €180 store credit it should not have issued, and by Sunday evening 47 customers have done the same trick. You wake up Monday to an €8,460 hole.

The question is not "could this happen." It is "whose name is on the invoice."

Pure agent: yours. There is no second party to blame. Mitigation has to live inside the agent: a hard cap per ticket (€25 store credit max without escalation), a daily refund budget (€200/day before the agent loses refund authority), and a kill-switch the on-call ops lead can hit from their phone.
Manila BPO: mixed. Standard BPO contracts cap individual agent authority and put consequential overspend back on the BPO if it breaches policy. Read the SLA. Most are weaker than founders think.
Hybrid: cleanest. The agent has a €25 lane and a €200/day pool. Anything above goes to the NL tier-2 who has a personal cap of €75 and escalates to the founder above that. You can write this on a single page and an ACM caseworker can read it in two minutes.
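The hybrid caps fit in a few lines of code, which is part of why they fit on a single page. The sketch below uses the limits from the text (€25 per ticket and €200 per day for the agent, €75 for the NL tier-2); the names `createRefundGuard`, `decide`, and `kill` are hypothetical, and state is kept in memory only, which a production build would replace with a shared store.

```javascript
// Limits from the text above (EUR).
const LIMITS = { agentPerTicket: 25, agentDailyPool: 200, tier2PerTicket: 75 };

// Tracks the agent's spend for the day and whether it still has refund authority.
function createRefundGuard(limits = LIMITS) {
  let spentToday = 0;
  let killed = false;
  return {
    kill() { killed = true; },        // on-call kill-switch
    resetDay() { spentToday = 0; },   // call at midnight; does not undo kill()
    spent: () => spentToday,
    // Returns who is allowed to approve a refund of this size.
    decide(amountEur) {
      if (!(amountEur > 0)) throw new Error("amountEur must be positive");
      if (amountEur > limits.tier2PerTicket) return "founder";
      if (amountEur > limits.agentPerTicket) return "nl_tier_2";
      // Within the agent's lane, but only while the pool and the switch allow it.
      if (killed || spentToday + amountEur > limits.agentDailyPool) return "nl_tier_2";
      spentToday += amountEur;
      return "agent";
    },
  };
}

const guard = createRefundGuard();
console.log(guard.decide(180)); // a €180 store credit is above every delegated cap
```

Under these limits, each €180 request from the weekend scenario would be routed to the founder, and the agent's own exposure stays within the €200 daily pool.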

The hybrid number on the spreadsheet (€1.27/ticket) is €0.45 more than pure agent. That delta buys you a refund-budget structure that does not depend on a frontier model staying in a good mood on a peak weekend.

The scoring sheet

Score each option 1–5 on the three axes, weighted to the founder you actually are.

// Weights — move these for your risk appetite
const w = { cost: 0.35, defensibility: 0.30, refund_ownership: 0.35 }

const score = {
  A_pure_agent: { cost: 5, defensibility: 3, refund_ownership: 2 },
  B_manila_bpo: { cost: 3, defensibility: 4, refund_ownership: 4 },
  C_hybrid:     { cost: 3, defensibility: 5, refund_ownership: 5 },
}

const weighted = (s) =>
  s.cost * w.cost +
  s.defensibility * w.defensibility +
  s.refund_ownership * w.refund_ownership

// A: 3.35   B: 3.65   C: 4.30

Move the weights. If you are a 28-year-old founder with a high risk appetite and a healthy current account, option A might come out on top. If you are a CFO on a board with two non-execs from a Dutch bank, you will not get A past the audit committee even if it wins on paper.
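To see how far the weights have to move before the ranking flips, the scores can be re-ranked under a different weight set. A small sketch reusing the 1–5 scores from the sheet above; the `costHeavy` weights are a made-up example of a cost-driven founder, not a recommendation, and `rank` is an illustrative helper.

```javascript
const scores = {
  A_pure_agent: { cost: 5, defensibility: 3, refund_ownership: 2 },
  B_manila_bpo: { cost: 3, defensibility: 4, refund_ownership: 4 },
  C_hybrid: { cost: 3, defensibility: 5, refund_ownership: 5 },
};

// Weighted score per option for the given weights, sorted best first.
function rank(weights) {
  return Object.entries(scores)
    .map(([option, s]) => {
      const raw = Object.keys(weights).reduce((sum, axis) => sum + s[axis] * weights[axis], 0);
      return { option, score: Number(raw.toFixed(2)) }; // round away float noise
    })
    .sort((a, b) => b.score - a.score);
}

// Hypothetical cost-heavy weighting; the three weights still sum to 1.
const costHeavy = { cost: 0.7, defensibility: 0.15, refund_ownership: 0.15 };
console.log(rank(costHeavy));
```

With these scores, and the other two weights split evenly, option A only overtakes hybrid once cost carries more than about 0.56 of the total weight. That is a useful sanity check on how strong a cost preference you would need to justify it.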

Takeaway

Our default recommendation for sub-€6M Dutch e-com at 3,000–6,000 monthly contacts is hybrid, with a 90-day window to harden the agent and then re-score.

What this looks like in production

When we built the hybrid stack for a Dutch home-goods brand at ~5,000 tickets/month last spring, the deflection rate sat at 71% after the first month and 78% after we rewrote the refund-policy chunks in the knowledge base. Per-ticket cost landed at €1.18 fully loaded. The thing we ran into was that the NL tier-2 person quietly stopped using the escalation form after week three, which broke the audit trail. We fixed it by making the form the only way to issue a refund above €40 — the corresponding Zendesk macro was disabled. The way we structure these AI agents for European e-com follows the same recipe.

Open a spreadsheet. Put your last 30 days of ticket volume, three real BPO quotes, and the weights above into three columns. The answer falls out in twenty minutes.

Key takeaway

At sub-€6M Dutch e-com, hybrid usually wins — not on per-ticket cost, but on ACM defensibility and on whose name is on the refund invoice.

FAQ

Why not run the Claude agent without a human escape hatch?

Because the first time the model issues an unauthorised refund on a peak weekend, your founder phone is the one that rings. Cap the authority, route the escalations, and write the policy on a single page.

What deflection rate is realistic for a Dutch e-com chat agent?

60–80% after a month of tuning, depending on SKU complexity, return-policy clarity, and how clean your help-centre content is. Any claim above 85% in month one reflects either selective sampling or a different ticket mix.

Will the ACM accept agent logs as an audit trail?

They will accept a complete, hashed, versioned record of prompt, retrieved context, conversation, and tool calls. They will not accept 'we trust the model.' Architect the trail before you ship the agent.

Is a Manila BPO realistic for Dutch-language support?

Yes, with a thicker QA layer and thinner Dutch nuance. Most sub-€6M founders end up with English-first BPO and a small NL pool for complaints, which is a hybrid in everything but name.

