Chat agent triage: 1,420 quotes a week at a Breda landscaper
It's 06:48 on a Tuesday in March. The seizoens-planning at the Etten-Leur loods locks at 07:30, and 87 quote requests are still sitting untriaged in the company inbox. The meewerkend-voorman — a 51-year-old who has been laying gardens since 1994 — is on his third coffee, scrolling through emails on a Samsung tablet flecked with dried potting compost. Each request needs a yes-or-no on whether his crew can quote it, and how much of an existing beplantingsplan it can reuse.
That was the situation at the hoveniersbedrijf in Breda before we shipped their chat agent in late 2024. They had grown from 8 staff to 22 in six years, mostly on word-of-mouth in the Brabant region, and the inbox had grown faster than the people who could read it.
The volume problem nobody mentioned in the discovery call
The brief we got was "we need a chatbot on the website." What the data actually showed, once we mirrored their inbox and the WhatsApp Business number for a week, was 1,420 quote requests in seven days during the March-to-May peak. Roughly 60% came through the website form, 25% through WhatsApp, and the rest through email forwarded from various intake addresses (info@, offerte@, planning@, plus three personal addresses of staff who had been the contact point in 2017 and never got removed from the website).
Of those 1,420, about 78% were sub-100m² — terrace replants, hedge trims, a row of buxus replacements after the rupsplaag in the neighborhood. Those follow a well-trodden path: the catalogus in Exact Online has them priced as standard packages, the foreman doesn't need to look at them, and the apprentice can quote them within an hour of receiving the request.
The 22% that needs human judgment is the entire problem. Anything over 400m² touches drainage, soil composition, sometimes vergunningen, and almost always reuses or extends an existing beplantingsplan from the PostgreSQL archive — twelve years of designs drawn in a homegrown tool the previous IT volunteer built in 2013 before he moved to Eindhoven.
The two systems the agent had to talk to
Two data sources, neither of which had been designed for a chat agent to query.
The first was their Exact Online tenant, where the tuinplan-catalogus lived as Items with a tangle of custom division fields: surface-area band, soil-type band, sun band, average install hours, and a free-text "notitie voor planner." The fields had been added one at a time between 2012 and 2024 by three different bookkeepers, so naming is inconsistent — m2_band on some records, OppervlakteBand on others, OPP on a handful from 2015.
The second was the beplantingsplan archive: a PostgreSQL 14 database on a Hetzner box, with about 4,200 plans, each with a polygon (PostGIS), a plant list, and a scan of the original hand-drawn sketch. The plant list uses Latin names roughly half the time and Dutch the other half — Buxus sempervirens in one row, "lavendel" in the next.
If your knowledge base has been touched by three people over twelve years, the agent will mirror their disagreements back at the customer unless you normalise upstream. We learned this in week one, when an early test reply quoted both a Latin and a Dutch name for the same shrub in adjacent sentences.
Triage in 90 seconds, not a quote in real time
The agent does not try to quote anything. It triages. Every incoming aanvraag is classified into one of four buckets within 90 seconds of arrival:
- Standaard onder 100m² — auto-drafted quote from the Exact catalogus, sent to the apprentice's queue.
- Standaard 100–400m² — same, but flagged for a sanity check before sending.
- Boven 400m² — parked in the meewerkend-voorman queue with an attached "closest match" from the beplantingsplan archive.
- Onduidelijk — agent replies asking for one specific clarification (almost always: surface area or postcode), then re-runs.
The 90-second target is not a marketing number. The seizoens-planning at the loods locks at 07:30. Anything that lands in the foreman's queue after 07:30 slips a day. During peak he needs to clear the overnight queue between 06:30 and 07:30. He has roughly 60 minutes for what used to be 90 minutes of reading. The agent buys him that half-hour by pre-sorting and pre-matching.
How the 400m² rule actually gets enforced
Surface area is the single most important field, and the customer almost never gives it cleanly. They say "ongeveer een halve voetbalveld" or "achtertuin van een rijtjeshuis in Princenhage." The agent does three things in sequence:
- Asks once for a number, accepting m², a length × width, or a postcode + huisnummer.
- If a postcode + huisnummer arrives, queries the BAG for the parcel and computes a rough perceel-oppervlakte minus the building footprint.
- If the result is within 50m² of the 400m² threshold, escalates to the foreman regardless — we'd rather waste two minutes of his time than miss a job that needed his eye.
The "ask once" rule matters. The agent never asks the customer the same question twice. If the second turn still doesn't have a number, it asks for the postcode instead. Three turns and no number, it parks the conversation in the onduidelijk bucket and writes a one-line note for the human. Customers will abandon a chat that feels like a form.
def classify_size(payload: dict) -> SizeBucket:
    # SizeBucket, bag_perceel_m2 and bag_pand_m2 are assumed to be defined elsewhere.
    m2 = payload.get("m2_explicit")
    if m2 is None and (lw := payload.get("length_x_width")):
        m2 = lw[0] * lw[1]  # length × width
    if m2 is None and (pc := payload.get("postcode_huisnummer")):
        # Rough garden area: parcel surface minus building footprint.
        m2 = bag_perceel_m2(pc) - bag_pand_m2(pc)
    if m2 is None:
        return SizeBucket.UNKNOWN
    # Within 50m² of the 400m² threshold: escalate to the foreman regardless.
    if 350 <= m2 <= 450:
        return SizeBucket.FOREMAN_BORDERLINE
    if m2 > 400:
        return SizeBucket.FOREMAN
    if m2 >= 100:
        return SizeBucket.APPRENTICE_CHECK
    return SizeBucket.APPRENTICE_AUTO
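The "ask once" rule described above can be sketched as a small decision function. This is a minimal illustration, not the production code: the names NextStep, Conversation and next_step are hypothetical, and it assumes the conversation state records which questions have already been put to the customer.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class NextStep(Enum):
    ASK_SURFACE = auto()       # first clarification: ask for a number
    ASK_POSTCODE = auto()      # second clarification: postcode + huisnummer
    PARK_ONDUIDELIJK = auto()  # stop asking and leave a note for the human
    CLASSIFY = auto()          # enough information to size the job


@dataclass
class Conversation:
    has_size: bool = False
    asked: set = field(default_factory=set)  # questions already asked


def next_step(conv: Conversation) -> NextStep:
    """Pick the next action without ever repeating a question."""
    if conv.has_size:
        return NextStep.CLASSIFY
    for question in (NextStep.ASK_SURFACE, NextStep.ASK_POSTCODE):
        if question not in conv.asked:
            conv.asked.add(question)
            return question
    # Both questions are used up and there is still no size: park it.
    return NextStep.PARK_ONDUIDELIJK
```

Keeping the asked questions in a set makes the "never twice" guarantee structural: a question that has been asked cannot be selected again, whatever the customer replies.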
Matching against twelve years of beplantingsplannen
When a job lands in the foreman's queue, the agent attaches the three closest matches from the PostgreSQL archive. "Closest" is a weighted score across surface area, postcode distance (driving time matters more than crow-flies for a hoveniersbedrijf), soil band, and plant-style tags. We backfilled the soil band from a public soil-type map and pre-computed the centroid of each archived polygon so the score runs in under 200ms.
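-- $1 = longitude, $2 = latitude, $3 = requested area in m², $4 = style tags.
-- similarity() comes from the pg_trgm extension.
SELECT plan_id, meters, m2_delta, soil_band, style_sim
FROM (
    SELECT
        plan_id,
        ST_Distance(centroid, ST_SetSRID(ST_MakePoint($1, $2), 4326)::geography) AS meters,
        abs(area_m2 - $3) AS m2_delta,
        soil_band,
        similarity(style_tags, $4) AS style_sim
    FROM beplantingsplannen
    WHERE area_m2 BETWEEN $3 * 0.5 AND $3 * 2.0
) AS candidates
-- The subquery is needed because PostgreSQL does not allow output aliases
-- inside ORDER BY expressions, only as standalone sort keys.
ORDER BY
    (0.4 * (m2_delta / 1000.0))
  + (0.3 * (meters / 20000.0))
  + (0.3 * (1 - style_sim))
LIMIT 3;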
The foreman opens the queue, sees the aanvraag, and immediately has three prior plans to look at. In roughly 70% of cases he picks one of those three as the basis for the new quote. In the other 30% he either draws fresh or asks the agent for a wider search — both supported.
What broke in the first three weeks
Two things failed that we didn't see coming.
First, the Exact Online API rate limit. The tenant had a daily ceiling and we were burning through it by mid-afternoon during peak. The fix was a Redis-backed read-through cache for the catalogus, refreshed nightly at 03:00. Items don't change often enough to need real-time freshness — the bookkeeper edits prices once a quarter. Exact's rate-limit documentation is worth reading before you go to production against their API; the limits are per-endpoint, not flat per-tenant, which is the kind of detail you discover the hard way.
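A read-through cache of this kind can be sketched in a few lines. The version below is an illustration under stated assumptions: it uses an in-memory dict as a stand-in for Redis so it runs without a server, and the loader argument represents whatever function calls the Exact Online API (hypothetical here). The class and method names are ours, not the project's.

```python
import time
from typing import Callable, Iterable


class ReadThroughCache:
    """Serve from the store; on a miss, call the loader once and keep the result."""

    def __init__(self, loader: Callable[[str], dict], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: dict = {}  # key -> (expires_at, value); stand-in for Redis

    def get(self, key: str) -> dict:
        entry = self._store.get(key)
        now = self._clock()
        if entry is not None and now < entry[0]:
            return entry[1]            # hit: no upstream API call spent
        value = self._loader(key)      # miss or expired: one upstream call
        self._store[key] = (now + self._ttl, value)
        return value

    def refresh(self, keys: Iterable[str]) -> None:
        """Nightly job: reload the given keys regardless of expiry."""
        now = self._clock()
        for key in keys:
            self._store[key] = (now + self._ttl, self._loader(key))
```

With a TTL slightly longer than 24 hours and refresh() scheduled at 03:00, daytime traffic should only hit the upstream API for items that were never cached.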
Second, the foreman didn't trust the agent's "closest match" suggestions until we showed him why. We added a one-line explanation under each suggestion: "same soil band, 1.2km from the job, comparable surface area." After that, his pick-rate from the suggestions roughly doubled in a week. The explanation is generated from the same fields the score uses, so it's never wrong — it just narrates the score.
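A sketch of how such an explanation line could be derived from the score's own inputs follows. The Match fields mirror the columns the matching query returns; the same_soil_band flag and the 50m² cut-off for "comparable" are assumptions made for the example, as are the names Match and explain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    plan_id: int
    meters: float         # distance from the job
    m2_delta: float       # absolute difference in surface area
    same_soil_band: bool  # assumed to be computed when the match is scored


def explain(match: Match, comparable_m2: float = 50.0) -> str:
    """Narrate the score inputs; nothing is stated that the score did not use."""
    parts = []
    if match.same_soil_band:
        parts.append("same soil band")
    parts.append(f"{match.meters / 1000:.1f}km from the job")
    if match.m2_delta <= comparable_m2:
        parts.append("comparable surface area")
    return ", ".join(parts)
```

For example, explain(Match(plan_id=1, meters=1200.0, m2_delta=30.0, same_soil_band=True)) yields "same soil band, 1.2km from the job, comparable surface area". Because each phrase is gated on a field value, a phrase only appears when the underlying number supports it.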
An agent doesn't earn trust by being right. It earns trust by showing its work in a sentence the operator can verify in two seconds.
The numbers six months in
By April 2025 the agent was handling all 1,420 weekly aanvragen during peak. The foreman's morning queue was clearing by 07:18 on average, twelve minutes before the lock. The apprentice was sending around 740 standard quotes a week without the foreman ever seeing them, work that used to land on his desk and sit there until Thursday. The onduidelijk bucket runs at roughly 4% of volume, which matches the rate of genuinely odd requests (existing klant asking for an unrelated klus, someone testing the bot, the occasional sales pitch from a leverancier).
Conversion on the auto-drafted quotes is within two percentage points of the human-drafted ones from the same catalogus the year before. The foreman's queue has a higher conversion than it did pre-agent, which we attribute to faster response time — sub-400m² aanvragen now get a price within an hour during business hours, where previously the lead time was two to three days and the customer had already called three other hoveniers.
What this didn't fix
The agent didn't fix the catalogus. It still has m2_band, OppervlakteBand, and OPP fields competing for the same data. We wrote a normalisation layer that reads all three and writes back to a fourth field the agent owns, but the underlying mess remains. The bookkeeper will clean it up "when there's time," which there isn't. This is the honest version of every AI-on-top-of-legacy story: the agent papers over the mess; it does not remove it.
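The core of a normalisation layer like this is a coalescing read across the three legacy fields. The sketch below is illustrative: the three legacy names come from the catalogus, but the preference order and the agent-owned field name agent_m2_band are assumptions, and items are modelled as plain dicts rather than Exact Online API objects.

```python
from typing import Optional

# Legacy field names, checked in this order (the order is an assumption).
LEGACY_FIELDS = ("m2_band", "OppervlakteBand", "OPP")
CANONICAL_FIELD = "agent_m2_band"  # hypothetical name for the agent-owned field


def normalise_band(item: dict) -> Optional[str]:
    """Return the first non-empty surface-area band found on the item."""
    for name in LEGACY_FIELDS:
        raw = item.get(name)
        if raw is not None and str(raw).strip():
            return str(raw).strip()
    return None


def with_canonical_band(item: dict) -> dict:
    """Copy the item and add the canonical field; legacy fields stay untouched."""
    return {**item, CANONICAL_FIELD: normalise_band(item)}
```

Leaving the legacy fields in place keeps the bookkeepers' data intact, and the agent only ever reads the one field it owns.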
The PostgreSQL archive also still mixes Dutch and Latin plant names. We built a synonym table with about 180 entries, which covers the long tail well enough, but every six months someone adds a new vaste plant in only one of the two languages and it takes a week to notice.
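A synonym lookup of this kind, plus a check that surfaces names missing from the table, might look like the sketch below. The two entries are illustrative placeholders rather than rows from the real table, and the function names to_latin and unmapped are hypothetical.

```python
from typing import Iterable, Optional

# Illustrative entries only; the real table has about 180 rows.
SYNONYMS = {
    "lavendel": "Lavandula angustifolia",
    "buxus": "Buxus sempervirens",
}
CANONICAL = set(SYNONYMS.values())


def to_latin(raw: str) -> Optional[str]:
    """Map a Dutch or Latin plant name to its canonical Latin form."""
    name = " ".join(raw.split())  # collapse stray whitespace
    for latin in CANONICAL:
        if name.lower() == latin.lower():
            return latin              # already Latin, fix the casing only
    return SYNONYMS.get(name.lower())  # Dutch name, or None if unknown


def unmapped(names: Iterable[str]) -> list:
    """Names with no entry yet: candidates for a new synonym row."""
    return sorted({n for n in names if to_latin(n) is None})
```

Running unmapped() over newly added plant lists on a schedule is one way to shorten the week it currently takes to notice a missing entry.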
The smallest audit you can run today
Open your inbox folder for the last seven days. Count the requests. Bucket them by surface area or job size — or, if you're not a hoveniersbedrijf, by whatever proxy distinguishes the standard work from the bespoke. If 70% or more are mechanically similar, an agent will pay for itself before the next seizoen. If they're all bespoke, you don't have an agent problem — you have a pricing problem, and a bot won't help.
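The audit itself is a few lines of arithmetic. A minimal sketch, assuming you have already reduced each request to a single job-size number; the 100m² cut-off is the one used in this project and the 70% threshold is the rule of thumb above, so swap in your own proxy for "standard work".

```python
from typing import Iterable


def standard_share(job_sizes_m2: Iterable[float],
                   standard_max_m2: float = 100.0) -> float:
    """Fraction of requests that fall under the 'standard work' cut-off."""
    sizes = list(job_sizes_m2)
    if not sizes:
        return 0.0
    standard = sum(1 for size in sizes if size < standard_max_m2)
    return standard / len(sizes)


def worth_automating(share: float, threshold: float = 0.70) -> bool:
    """Rule of thumb: 70% or more mechanically similar requests."""
    return share >= threshold
```

With the Breda numbers, a share of 0.78 clears the threshold comfortably.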
When we built this chat agent for the Breda hoveniersbedrijf, the part that took longest wasn't the language model. It was the BAG lookup, the Exact cache, and convincing twelve years of inconsistent custom fields to behave like one schema. That is the work in most AI agent projects: not the model, the plumbing around it.
Key takeaway
An agent earns trust by narrating its score in a sentence the operator can verify in two seconds — that single line roughly doubled the foreman's pick-rate.
FAQ
Why a 400m² threshold and not square-meter pricing across the board?
Because below 400m² the work is mostly catalogus-driven and the apprentice can quote it. Above 400m², drainage, soil and beplantingsplan reuse matter, and only the foreman has the judgment to weigh them.
What happens if the customer never gives a surface area?
The agent asks once for a number, once for a postcode + huisnummer, then parks the chat in the onduidelijk bucket with a note. It will not pester. Three turns is the cap.
Could you skip Exact Online and store the catalogus directly?
You could, but the bookkeeper lives in Exact. Two sources of truth for prices is how you end up quoting yesterday's price. The Redis cache gives you speed without forking the data.
How long did the project take end-to-end?
About ten weeks from kickoff to the agent handling 100% of peak volume. Two weeks of inbox mirroring, four weeks of build, four weeks of supervised rollout where every agent reply was double-checked.
Does the agent ever quote on its own, without a human?
Yes, for the sub-100m² bucket where the catalogus has a standard package. About 740 of the 1,420 weekly aanvragen now go out auto-drafted. Everything above 100m² still gets a human glance before sending.