Email automation
Credit-control email agent: €212k recovered in one quarter
A 29-person B2B distributor in Breda swapped a four-person credit-control desk for an email agent. It chases 1,800 invoices a month and recovered €212,000 of aged AR in Q1.

The credit-control lead opens her spreadsheet on a Monday at 08:30. 1,840 open invoices. 312 of them older than 60 days. 47 sitting north of 120. She has three colleagues and a coffee that has already gone cold. By Friday the four of them will have made roughly 320 chase calls and sent maybe 600 emails. The DSO needle will move down by two days, maybe none.
This is the situation a 29-person B2B distributor in Breda walked into our office with last September. They sell industrial fasteners and abrasives to about 4,000 active business customers across the Benelux. Annual revenue around €38M. DSO sitting at 68 days. The credit-control desk cost roughly €240,000 a year in salary and overhead. Three of the four people on it were actively looking for other jobs. The work, they said, was the worst part of their week.
Nine months later, the desk has one person on it. She is the team lead, she earns more than she did before, and she spends most of her day handling disputes and the occasional escalation. Everything else, all 1,800-odd monthly chases, runs through an email agent we built around their Exact Online ledger. In Q1 of this year, the agent helped recover €212,000 of aged accounts receivable. DSO came down to 49 days. We want to tell you how it actually works, because the part that surprised us was not the technology.
The shape of the work
Before we wrote a line of code, we sat with the credit-control desk for four days. Two days of shadowing, two days of timing. The interesting finding was that almost none of the work was thinking work. Of the eight hours each clerk spent per day, around 70 minutes was judgment (negotiating a payment plan, deciding when to halt deliveries, deciding when to send to collections). The other six-plus hours was retrieval, formatting, and waiting.
Retrieval looked like: open Exact, find the customer card, find the invoice, copy the number, copy the amount, copy the due date. Formatting looked like: paste it into an Outlook template, change the salutation, attach the PDF copy of the invoice. Waiting looked like: send the email, wait two to four days, check whether the customer replied, check whether they paid, decide whether to call. And then the spreadsheet, always, the spreadsheet, with its 1,800 rows and its colour coding and its lost comments.
The thing we tell clients in this position: if your team's work is mostly retrieval and formatting, you do not have a credit-control problem. You have a context-assembly problem. The chase is the last 5%.
The bottleneck in credit control is rarely the chase itself. It is the time spent assembling the context needed to write a sensible chase email.
What we replaced and what we kept
We did not try to automate the judgment work. We automated everything around it. The agent owns four jobs: pulling the open-invoice list from Exact Online, classifying each invoice into one of six chase states, composing and sending the right email in the right tone in the customer's language, and reading replies to decide whether the matter needs a human or can be answered automatically.
The team lead owns the rest: disputes, payment plans, the call to halt shipments, the call to write off, the call to file with collections. She gets a daily digest at 09:00 with the ten items that need her judgment. The other 1,790 are already moving.
How the agent reads the ledger
Exact Online has a clean REST API. The agent pulls the open-invoice list once an hour, plus the matched-payments feed. For each invoice it computes age in days, age bucket, customer history (average days late over the last 12 months), customer relationship state (active orders, recent disputes), and the most recent contact, on any channel, with that customer.
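A minimal sketch of two of those per-invoice fields, age in days and average days late over the last 12 months. The `Invoice` shape, the field names, and the 365-day reading of "12 months" are our assumptions for illustration; they are not the Exact Online schema.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean
from typing import Optional


@dataclass
class Invoice:
    # Hypothetical record shape, not the Exact Online data model.
    number: str
    customer_id: str
    amount_eur: float
    due_date: date
    paid_date: Optional[date] = None  # None while the invoice is still open


def days_late(invoice: Invoice, today: date) -> int:
    """Days past the due date; a negative value means not yet due."""
    return (today - invoice.due_date).days


def average_days_late(
    history: list[Invoice], today: date, window_days: int = 365
) -> Optional[float]:
    """Mean lateness of invoices paid within the window; None if no history."""
    recent = [
        (inv.paid_date - inv.due_date).days
        for inv in history
        if inv.paid_date is not None
        and (today - inv.paid_date).days <= window_days
    ]
    return mean(recent) if recent else None


# Usage with made-up data.
today = date(2026, 3, 2)
open_invoice = Invoice("INV-1001", "C-42", 1250.0, date(2026, 1, 15))
paid_history = [
    Invoice("INV-0900", "C-42", 800.0, date(2025, 11, 1), date(2025, 11, 11)),
    Invoice("INV-0950", "C-42", 950.0, date(2025, 12, 1), date(2025, 12, 21)),
]
age = days_late(open_invoice, today)               # 46
typical = average_days_late(paid_history, today)   # 15
```

Returning `None` for a customer with no payment history keeps "no data" distinct from "always pays on time", which matters when the value later feeds the tone of the email.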
That last field matters. The agent will not send a chase email if a human at the company has emailed or called the customer within the last 48 hours. We learned this the hard way in week two of the pilot, when the sales rep for a large account had a friendly call with the customer on a Friday and the agent followed up with a stiff Dutch reminder on Monday morning. The customer was not amused. Now the agent reads the shared Microsoft 365 mailbox and the Aircall call detail records (CDRs) before it composes anything.
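The check itself can be very small once the contact timestamps are in one place. A sketch, assuming the mailbox and call-log reads have already been merged into a plain list of timestamps per customer (that merging step, and the function name, are hypothetical):

```python
from datetime import datetime, timedelta

CONTACT_WINDOW = timedelta(hours=48)


def human_touched_recently(
    contact_times: list[datetime],
    now: datetime,
    window: timedelta = CONTACT_WINDOW,
) -> bool:
    """True if any human email or call with the customer falls inside the window.

    Timestamps in the future (clock skew, bad data) are ignored.
    """
    return any(timedelta(0) <= now - t <= window for t in contact_times)


# Usage with made-up timestamps: a Friday 15:00 call, a Monday 09:00 send.
friday_call = datetime(2026, 2, 27, 15, 0)
monday_send = datetime(2026, 3, 2, 9, 0)
blocked_48h = human_touched_recently([friday_call], monday_send)
blocked_72h = human_touched_recently(
    [friday_call], monday_send, window=timedelta(hours=72)
)
```

One design point worth checking on your own data: a plain 48-hour clock does not cover a Friday-afternoon call followed by a Monday-morning send, because 66 hours have passed. Whether the window should count business hours or be widened over weekends is a choice the sketch leaves open by making `window` a parameter.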
The classification is rule-based, not LLM-driven. Six buckets:
- Pre-due: due in 0 to 3 days. Soft nudge, friendly tone, no demand.
- Just past due: 1 to 14 days late. First reminder, neutral.
- Aged: 15 to 45 days late. Firmer, restates terms, attaches statement.
- Stuck: 46 to 90 days late. Formal demand, references statutory interest under the EU Late Payment Directive 2011/7/EU.
- Pre-collections: 91 to 120 days late. Final notice, payment plan offered, halt of deliveries flagged.
- Escalation: over 120 days. Agent does not send. Lands in the human queue.
We kept the classification rule-based on purpose. An LLM is good at writing the email. It is unnecessary and unsafe for deciding when to threaten an account. That decision needs to be auditable and predictable. A junior auditor or the customer's own bookkeeper can read the rules and tell you which tier an invoice is in.
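The six buckets fit in a handful of comparisons, which is the point. A sketch, assuming lateness is expressed as days past due (negative before the due date) and that invoices more than three days from due are simply not chased; both conventions are ours, and the names are illustrative:

```python
from enum import Enum
from typing import Optional


class Tier(Enum):
    PRE_DUE = "pre-due"
    JUST_PAST_DUE = "just past due"
    AGED = "aged"
    STUCK = "stuck"
    PRE_COLLECTIONS = "pre-collections"
    ESCALATION = "escalation"


def classify(days_late: int) -> Optional[Tier]:
    """Map days past due to a chase tier; None means no chase yet."""
    if days_late < -3:
        return None  # assumption: nothing is sent earlier than 3 days before due
    if days_late <= 0:
        return Tier.PRE_DUE          # due in 0 to 3 days
    if days_late <= 14:
        return Tier.JUST_PAST_DUE    # 1 to 14 days late
    if days_late <= 45:
        return Tier.AGED             # 15 to 45 days late
    if days_late <= 90:
        return Tier.STUCK            # 46 to 90 days late
    if days_late <= 120:
        return Tier.PRE_COLLECTIONS  # 91 to 120 days late
    return Tier.ESCALATION           # over 120 days


def agent_may_send(tier: Optional[Tier]) -> bool:
    """The agent never sends for escalations; those go to the human queue."""
    return tier is not None and tier is not Tier.ESCALATION
```

Because the function has no hidden state and no model call, the same input always yields the same tier, and the boundaries can be tested one by one.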
How the agent writes the email
This is the part we expected to be hard. It was not. The agent has six templates, one per tier, with placeholders for customer name, invoice numbers, amounts, due dates, and the payment link. The LLM only touches the email in three places:
- Tone adjustment based on customer history. A 12-year customer who pays on day 38 every month gets warmer language than a new customer with two missed payments in a row.
- Language. Dutch, French, German, English, based on the customer's master record. If the master record is wrong, the customer's reply is routed to the team lead.
- Reference handling. If the customer wrote "please use our PO 4477123 on the next invoice" three months ago, that PO appears in the chase.
The agent sends from a real mailbox (creditcontrol@), with a real reply-to. We never spoof a human's address. The signature names the team lead, with a line that says "this reminder was prepared by our automated credit-control assistant." That sentence was the team lead's idea. She wanted customers to know they could call her if the email felt wrong. In nine months, two have called. Both calls turned into payment plans.
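To show how little machinery the template layer needs, here is a sketch of one tier's template using Python's standard `string.Template`. The wording of the reminder is invented for illustration; only the placeholder fields and the disclosure line come from the description above. The LLM steps (tone, language, references) would run on the rendered text and are not shown.

```python
from string import Template

# Hypothetical wording for the "just past due" tier.
FIRST_REMINDER = Template(
    "Dear $customer_name,\n\n"
    "Our records show invoice $invoice_number for EUR $amount, "
    "due on $due_date, is still open.\n"
    "You can pay here: $payment_link\n\n"
    "Kind regards,\n"
    "$team_lead\n"
    "This reminder was prepared by our automated credit-control assistant."
)


def render(template: Template, **fields: str) -> str:
    """Fill every placeholder; raises KeyError if one is missing."""
    return template.substitute(**fields)


# Usage with made-up values.
body = render(
    FIRST_REMINDER,
    customer_name="Acme BV",
    invoice_number="INV-1001",
    amount="1,250.00",
    due_date="15 January 2026",
    payment_link="https://pay.example/abc",
    team_lead="Credit Control Team Lead",
)
```

Using `substitute` rather than `safe_substitute` is deliberate in this sketch: a missing field stops the send instead of mailing a customer a literal `$amount`.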
The reply loop
About 38% of chases get a reply. The agent reads every reply and classifies it into one of five intents: paid (will reconcile), promise-to-pay (with a date), dispute, copy-request, other.
Paid replies cross-check against the Exact matched-payments feed. If the payment shows up, the thread closes. If it does not show up within seven days of the promised date, the agent re-opens the chase one tier higher. Promise-to-pay sets a reminder. If the promised date passes without payment, the agent escalates one tier with a polite "we agreed on X, we have not received payment, can we agree on Y."
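Both follow-up rules reduce to the same decision: has the payment been matched, and if not, has the deadline (plus any grace period) passed? A sketch under that reading, with tiers as plain strings and a seven-day grace for "paid" replies versus none for promises; the function names and the tuple return shape are ours:

```python
from datetime import date, timedelta

TIERS = [
    "pre-due",
    "just past due",
    "aged",
    "stuck",
    "pre-collections",
    "escalation",
]


def one_tier_higher(tier: str) -> str:
    """Next tier up; escalation is the ceiling."""
    i = TIERS.index(tier)
    return TIERS[min(i + 1, len(TIERS) - 1)]


def next_action(
    tier: str,
    expected_by: date,
    today: date,
    payment_matched: bool,
    grace: timedelta = timedelta(0),
) -> tuple[str, str]:
    """Return (action, tier) where action is 'close', 'wait' or 'rechase'."""
    if payment_matched:
        return ("close", tier)
    if today > expected_by + grace:
        return ("rechase", one_tier_higher(tier))
    return ("wait", tier)


# Promise-to-pay: no grace once the promised date has passed.
promise = next_action("aged", date(2026, 3, 10), date(2026, 3, 12), False)
# "Paid" reply: allow seven days for the payment to show up in the feed.
paid = next_action(
    "aged", date(2026, 3, 10), date(2026, 3, 12), False, grace=timedelta(days=7)
)
```

Note that moving one tier up from pre-collections lands on escalation, where the agent stops sending and the case goes to the human queue.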
Copy-request triggers automatic resend of the invoice PDF and statement. Around 11% of replies are copy-requests. Before the agent, this category alone consumed about 90 minutes a day of clerk time.
Disputes and "other" route to the team lead with a short summary of what the customer wrote and the agent's recommendation. She accepts the recommendation about 80% of the time and writes the reply herself for the other 20%.
If you let an agent reply to disputes automatically, you will eventually agree to a credit you never meant to grant. Replies to disputes must route to a human. There is no good shortcut here.
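The routing of the five intents can be written as a lookup table, which makes the "disputes always go to a human" rule hard to break by accident. A sketch; the enum and route names are hypothetical, and the step that assigns an intent to a reply is out of scope here:

```python
from enum import Enum


class Intent(Enum):
    PAID = "paid"
    PROMISE_TO_PAY = "promise-to-pay"
    DISPUTE = "dispute"
    COPY_REQUEST = "copy-request"
    OTHER = "other"


class Route(Enum):
    RECONCILE = "cross-check against matched payments"
    SET_REMINDER = "set reminder for promised date"
    RESEND_DOCUMENTS = "resend invoice PDF and statement"
    HUMAN_QUEUE = "summarise and send to team lead"


ROUTES = {
    Intent.PAID: Route.RECONCILE,
    Intent.PROMISE_TO_PAY: Route.SET_REMINDER,
    Intent.COPY_REQUEST: Route.RESEND_DOCUMENTS,
    Intent.DISPUTE: Route.HUMAN_QUEUE,
    Intent.OTHER: Route.HUMAN_QUEUE,
}


def route(intent: Intent) -> Route:
    """Anything not explicitly mapped falls back to the human queue."""
    return ROUTES.get(intent, Route.HUMAN_QUEUE)
```

The fallback matters more than the table: if a new intent is added later and nobody updates `ROUTES`, the reply reaches the team lead instead of being answered automatically.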
The Q1 numbers, with the messy parts
We do not love the marketing version of these numbers, so here is the honest version.
Recovered aged AR: €212,000. This is the sum of payments received in Q1 against invoices that were over 60 days old when the quarter started. It is not pure agent attribution; some of those would have been collected anyway. The Q1 2025 figure (same period, before the agent) was €74,000. The €138,000 delta is the conservative way to read the impact.
DSO: 68 to 49 days. Steady decline across the quarter, with a flatter curve toward the end as the easy-to-collect aged tail ran out.
Headcount: 4 to 1. Three colleagues moved to other roles inside the company (warehouse coordination, customer service, internal sales). No layoffs, which was the founder's hard requirement at the start.
Annualised savings: around €185,000. Salary and overhead saved, minus the agent's running cost (about €1,400 a month, mostly LLM API spend and our retainer).
Customer satisfaction: not worse, possibly better. The client runs a quarterly NPS. It moved from 31 to 38. We do not credit the agent for that. We suspect it is the fact that copy-requests now come back in 90 seconds instead of three days.
For the macro context, the latest Atradius Payment Practices Barometer puts the average B2B DSO in Western Europe in the high 50s. Our client moved from above average to comfortably below it. They are not unique. The work is unglamorous and reproducible.
The things we did not automate
This list is shorter than the automated one, but it is the whole game.
We do not automate the decision to halt deliveries. That is a commercial call with consequences for the sales relationship. The agent flags candidates; the team lead and the sales rep decide together.
We do not automate write-offs. Writing off is a CFO-level call.
We do not automate calls. We tested a voice agent for follow-up calls in March. It worked technically. The team lead asked us to retire it after two weeks. Her reasoning: "if a customer is 90 days late and we are calling them, the relationship needs a human voice, even if the message is firm." We agreed. Some moments are not for agents.
We do not automate replies to disputes. See the warning above.
What we would do differently
Two things.
First, we would build the human-contact-window check before sending a single test email. We only built it in week two, after the incident with the large account, and the delay cost goodwill we did not need to spend.
Second, we would integrate the payment link earlier. The agent now sends a customer-specific payment link in every chase email, generated against their open invoices. About 14% of paid replies now come via that link instead of bank transfer. Customers who use the link pay, on average, two days faster than the ones who do not. We did not have this in pilot. We should have.
The smallest version you could build this week
You do not need an LLM to start. You need three things: a clean read of your open-invoice list, a way to check whether a human at your company has touched that customer in the last 48 hours, and a single, well-written template per chase tier.
If you can do those three things, you can run a credit-control assistant from a script and a cron job. The LLM is what makes it good. The plumbing is what makes it safe.
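As a rough illustration of that no-LLM starting point, the three pieces can sit in one planning function that a scheduled job calls. Everything here is a sketch under our own assumptions: invoices arrive as plain dictionaries, the last human contact per customer is already known, and the one-line templates stand in for properly written ones. Sending is left out on purpose.

```python
from datetime import date, datetime, timedelta

# Upper bound in days late for each tier the script may send for.
TIER_LIMITS = [
    (0, "pre-due"),
    (14, "just past due"),
    (45, "aged"),
    (90, "stuck"),
    (120, "pre-collections"),
]

# Placeholder wording; a real desk would write one careful template per tier.
TEMPLATES = {
    name: "[" + name + "] Invoice {number} for EUR {amount:.2f} was due on {due}."
    for _, name in TIER_LIMITS
}


def tier_for(days_late: int):
    """Return the tier name, 'escalation' past 120 days, or None if too early."""
    if days_late < -3:
        return None
    for limit, name in TIER_LIMITS:
        if days_late <= limit:
            return name
    return "escalation"


def plan_chases(invoices, last_human_contact, now):
    """Split open invoices into emails to send and cases for the human queue."""
    to_send, for_human = [], []
    for inv in invoices:
        tier = tier_for((now.date() - inv["due"]).days)
        if tier is None:
            continue
        if tier == "escalation":
            for_human.append(inv["number"])
            continue
        last = last_human_contact.get(inv["customer"])
        if last is not None and now - last <= timedelta(hours=48):
            continue  # a human was in touch recently; retry on the next run
        to_send.append((inv["number"], TEMPLATES[tier].format(**inv)))
    return to_send, for_human


# Usage with made-up data.
now = datetime(2026, 3, 2, 9, 0)
invoices = [
    {"number": "A", "customer": "C1", "amount": 100.0, "due": date(2026, 2, 20)},
    {"number": "B", "customer": "C2", "amount": 200.0, "due": date(2025, 10, 1)},
    {"number": "C", "customer": "C3", "amount": 300.0, "due": date(2026, 2, 20)},
]
contacts = {"C3": now - timedelta(hours=24)}
to_send, for_human = plan_chases(invoices, contacts, now)
```

Keeping the planning step separate from the sending step means you can run it for a week in dry-run mode and read the output before a single customer hears from it.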
When we built the email agent for the Breda distributor, the hard part was not the chase emails. It was reading the shared mailbox, the call log, and the order book before deciding to send anything. We solved it with a context-assembly layer that all of our other AI agent builds now use by default. If your credit-control desk is drowning, the first move is not to write better chase emails. It is to make sure the agent knows what your humans have already done.
Tomorrow morning, before you do anything else: open your AR aging report, sort by age, and count how many of the rows over 60 days have a copy-request as their last interaction. If that number is over 10%, you have already paid for the agent.
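If the aging report exports to rows, the count is a few lines. A sketch, assuming each row carries an age in days and a label for the last interaction; both column names are hypothetical and will differ per ERP export:

```python
def copy_request_share(rows, min_age_days=60):
    """Share of rows older than min_age_days whose last interaction was a copy-request."""
    aged = [r for r in rows if r["age_days"] > min_age_days]
    if not aged:
        return 0.0
    hits = sum(1 for r in aged if r["last_interaction"] == "copy-request")
    return hits / len(aged)


# Usage with made-up rows.
rows = [
    {"age_days": 75, "last_interaction": "copy-request"},
    {"age_days": 90, "last_interaction": "promise-to-pay"},
    {"age_days": 130, "last_interaction": "none"},
    {"age_days": 61, "last_interaction": "dispute"},
    {"age_days": 20, "last_interaction": "copy-request"},  # too young, ignored
]
share = copy_request_share(rows)  # 1 of 4 aged rows
over_threshold = share > 0.10
```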
Key takeaway
The bottleneck in credit control is rarely the chase itself. It is the context-assembly work that has to happen before a sensible chase can be written.
FAQ
Can a four-person credit-control desk really be reduced to one?
Yes, when the work is mostly context assembly and templated chases. Keep one human for disputes, payment plans, halt-of-delivery calls, and write-offs. Everything else automates cleanly.
Does the agent send firmer emails based on customer history?
Yes. Six rule-based tiers set the structure; the LLM adjusts tone. A long-term customer who pays a few days late every month gets warmer language than a new customer with two recent misses.
What does the agent never do?
It never halts deliveries, writes off invoices, places phone calls, or replies to disputes. Those decisions stay with a human. The agent prepares context and recommends; humans commit.
Which ERP does this work with?
We built it against Exact Online's REST API. The same pattern works for AFAS, Twinfield, Sage, NetSuite, or any ERP with an open-invoice export and a matched-payments feed.
How long does a build like this take?
Roughly six weeks from kickoff to pilot, plus four weeks of supervised running before the human desk shrinks. The four shadowing days at the start are not optional.