Email automation: 2,180 briefs a week, zero conflicts

How a 29-person Amsterdam reclamebureau routes 2,180 weekly briefs through an 11-year-old Teamleader CRM and a Notion-archief, with every DPG conflict caught in four minutes.

Jacob Molkenboer · Founder · A Brand New Company · 13 Apr 2026 · 9 min

Image: Brass mail-sorting rack with cream envelopes and linen dividers on an ivory desk, one tied with a green ribbon.

On a Tuesday morning in March, the operations lead at a 29-person Amsterdam reclamebureau (advertising agency) opened her laptop to 412 unread emails in the shared briefing inbox. By 5pm there would be roughly 540. Across a normal week the inbox swallows 2,180 briefing mails, and until last summer every one of them was triaged by hand.

The agency works for the kind of clients you have heard of. That is the problem, not the boast. Several of those clients compete with brands owned by DPG Media, and the agency's contract with one of its largest accounts says, in plain Dutch, that no briefing mentioning a DPG-owned title may be opened, archived, or replied to without a documented conflict check first. Miss that step once and you lose the account.

This is the story of the agent we built to handle that inbox.

The stack we walked into

Two systems, the older with eleven years of muscle memory baked in.

The first is Teamleader Focus, a Belgian CRM the agency has used since 2014. Custom fields have accreted across three account managers and two operations leads. The "client" entity has 41 fields. Roughly seven of them are filled in consistently. The rest are folklore.

The second is a Notion-archief the team built themselves in 2019 when they realised the CRM could not hold the texture of a creative brief. Every brief that ever shipped, every pitch, every kill-fee, every "we passed on this for capacity reasons" note, lives in Notion, organised by client, then by year, then by brand. About 7,400 pages, growing by maybe 30 a week.

Any agent that wanted to help with the inbox had to read both systems and respect both worldviews. Teamleader holds the contractual truth. Notion holds the institutional memory. Neither one knows what the other knows.

The four-minute clock

We set one hard SLA before writing any code: from the moment a briefing email lands, the agent has four minutes to decide whether it can touch it at all. That number is not arbitrary. The agency's account managers check the inbox roughly every fifteen minutes during the working day. If the agent can sort, classify, and either park or draft within four minutes, an account manager will only ever see emails that are already in the correct lane.

The pipeline looks like this:

inbox webhook
  → classifier (is this a briefing? a quote? a newsletter?)
  → entity extraction (sender domain, brand mentions, deadlines)
  → CRM match (Teamleader client_id, or null)
  → archive search (last 18 months of Notion briefs for this brand)
  → conflict check (DPG brand graph + contract exclusions)
  → route: queue | draft | escalate

The classifier is a small fine-tuned model running on Dutch subject lines and the first 300 characters of the body. It is right about 96% of the time, which sounds fine until you remember that 4% of 2,180 is roughly 87 wrongly classified mails a week. So nothing the classifier touches goes anywhere autonomously without a second check.
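The arithmetic behind that number is worth keeping as a one-liner, because it changes whenever volume or accuracy does. A minimal sketch; the function name is ours, for illustration only.

```python
def expected_misclassified(weekly_volume: int, accuracy: float) -> int:
    """Expected number of wrongly classified mails per week, rounded."""
    return round(weekly_volume * (1 - accuracy))


print(expected_misclassified(2180, 0.96))  # 87
```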

The conflict-of-interest queue

This is the part that earns its keep.

DPG Media owns somewhere north of forty Dutch and Belgian titles. De Volkskrant. Het Parool. AD. Trouw. Tweakers. Independer. A lot of brands a marketing person would not, at a glance, connect to the same parent. The agency's exclusion clause does not say "no DPG". It says "no titles in the DPG portfolio at the time of contract signing, plus any subsequently acquired titles within 90 days of public disclosure of the acquisition." That sentence has a date in it. It needs a brand graph that knows when each acquisition was publicly disclosed.

We maintain that graph as a small versioned JSON file in the repo, updated quarterly by a paralegal at the agency. The agent does a deterministic lookup against that file before any LLM call touches the brief. If the sender domain, any URL in the body, or any of the brand names extracted from the brief matches a DPG-portfolio entry, the email goes straight into a conflict-of-interest queue and the account manager gets a Slack ping.
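A minimal sketch of what a date-aware lookup like this could look like. Everything specific here is an assumption for illustration: the file layout (`contract_signed`, `titles`, `domains`, `disclosed`), the function names, the made-up entries, and above all the reading of the 90-day clause, which this sketch takes to mean "excluded once 90 days have passed since disclosure". Setting `grace_days=0` gives the more conservative reading.

```python
import json
import re
from datetime import date, timedelta
from urllib.parse import urlparse

# Hypothetical file layout with made-up entries; not real titles or dates.
GRAPH_JSON = """
{
  "contract_signed": "2022-01-01",
  "titles": [
    {"name": "Voorbeeld Krant", "domains": ["voorbeeldkrant.example"], "disclosed": "2015-06-01"},
    {"name": "Nieuwe Titel", "domains": ["nieuwetitel.example"], "disclosed": "2024-03-01"}
  ]
}
"""

URL_RE = re.compile(r"https?://[^\s<>\[\]]+")


def load_graph(raw: str) -> dict:
    """Parse the JSON graph and convert ISO date strings to date objects."""
    graph = json.loads(raw)
    graph["contract_signed"] = date.fromisoformat(graph["contract_signed"])
    for title in graph["titles"]:
        title["disclosed"] = date.fromisoformat(title["disclosed"])
    return graph


def is_excluded(title: dict, contract_signed: date, on: date, grace_days: int = 90) -> bool:
    """Assumed reading of the clause: excluded if already in the portfolio at
    signing, or once grace_days have passed since the acquisition was disclosed."""
    if title["disclosed"] <= contract_signed:
        return True
    return on >= title["disclosed"] + timedelta(days=grace_days)


def find_conflict(graph, sender, body, brands, on):
    """Return a human-readable reason for the first match, or None."""
    # Collect the sender's domain plus the host of every URL in the body.
    hosts = {sender.rsplit("@", 1)[-1].lower()}
    for url in URL_RE.findall(body):
        host = urlparse(url.rstrip(".,;)")).hostname
        if host:
            hosts.add(host)
    names = {b.casefold() for b in brands}
    for title in graph["titles"]:
        if not is_excluded(title, graph["contract_signed"], on):
            continue
        if title["name"].casefold() in names:
            return f"brand mention: {title['name']}"
        for domain in title["domains"]:
            # Match the domain itself and any subdomain of it.
            if any(h == domain or h.endswith("." + domain) for h in hosts):
                return f"domain match: {domain}"
    return None
```

Because the function returns a reason string rather than a bare boolean, the Slack ping and the audit trail can both say why an email was parked.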

Median time from email arrival to queue placement, measured over the last 90 days: three minutes and forty-two seconds. Every email so far has been placed within the four-minute SLA.
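A median on its own cannot show that nothing breached the SLA, so it helps to report the worst case and a breach count next to it. A minimal sketch, assuming arrival-to-queue latencies are logged in seconds; the function and field names are hypothetical.

```python
from statistics import median

SLA_SECONDS = 4 * 60  # the four-minute SLA


def sla_report(latencies_s):
    """Summarise arrival-to-queue latencies (in seconds) against the SLA."""
    if not latencies_s:
        raise ValueError("no latencies to summarise")
    return {
        "median_s": median(latencies_s),
        "worst_s": max(latencies_s),
        "breaches": sum(1 for s in latencies_s if s > SLA_SECONDS),
    }
```

A report with `breaches == 0` is the claim that matters to the account manager; the median only says what a typical email experiences.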

Warning

If your conflict check depends on an LLM "knowing" which brands belong to which parent, you will eventually ship a wrong answer. Brand portfolios change. Models are stale. Conflict logic belongs in code you can grep, not in a prompt you can vibe-check.

Drafting the capaciteits-bevestiging

For briefings that clear the conflict check and match a known client, the agent drafts a capaciteits-bevestiging: a short Dutch reply that confirms receipt, names the senior creative who will lead the project, and gives a realistic first-draft date based on current studio load.

The draft must comply with the DDMA-gedragscode, which sets out, among other things, how Dutch marketing communications must handle personal data, opt-outs, and the identity of the sender. For our purposes the relevant clauses are about transparency: the reply has to make clear that a human is responsible for the project even when an agent wrote the first version of the email. Every draft ends with the named account manager's full name and direct phone number. Nothing is sent autonomously.
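The sign-off requirement is the kind of rule that can be checked in code before a draft ever reaches the outbox. A minimal sketch, assuming the sign-off sits in the last few lines of the draft; the `AccountManager` type, the function name, and the four-line window are our assumptions, not part of the DDMA-gedragscode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountManager:
    full_name: str
    direct_phone: str


def has_human_signoff(draft: str, manager: AccountManager) -> bool:
    """True if the draft's closing lines carry the manager's full name and direct phone number."""
    closing = "\n".join(draft.strip().splitlines()[-4:])
    return manager.full_name in closing and manager.direct_phone in closing
```

A draft that fails the check can be regenerated or routed to a human with the reason attached, rather than landing in the outbox unsigned.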

The draft sits in the account manager's outbox as a pre-filled reply. She reads it, edits if she wants, hits send. Over the first eight weeks in production, roughly 81% of drafts were sent with no edits at all, another 14% with cosmetic edits, and the remaining 5% rewritten from scratch. The rewrites cluster on briefs from new clients whose tone the agent has not learned yet.

The first version we threw out

The first version of the agent did everything in one big LLM call. Read the email, look up the client, check the brand graph, decide the route, draft the reply. It worked in the demo. It failed in production within a week.

The failures were the kind you would expect. The model would confidently route a brief from a small fashion label into the conflict queue because one paragraph mentioned "Tweakers" as a target audience. It would draft a capaciteits-bevestiging using a project lead who had left the agency three months ago, because that lead's name appeared in the most recent Notion page for that client. Plausible answers, wrong answers.

The shape we ended up with separates the deterministic and the generative cleanly. Conflict checks, CRM lookups, capacity calculations: code. Classification of ambiguous subject lines, generation of the draft reply, summarising the brief for the account manager: model. Each step writes to a structured log so we can replay any decision the agent made.

# sketch of the routing step; the helper functions are defined elsewhere
def route(email):
    parsed = extract_entities(email)           # deterministic
    client = teamleader.match(parsed)          # deterministic; None if no CRM match
    conflict = conflict_check(parsed, client)  # deterministic, against the graph
    if conflict is not None:                   # a conflict always wins over any other route
        return queue("conflict-of-interest", reason=conflict.reason)
    if client is None:                         # unknown sender: summarise, do not draft
        return queue("unknown-sender", draft=summary_for_human(email))
    return queue("ready-to-confirm", draft=draft_confirmation(email, client))
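The structured log that makes decisions replayable can be as simple as one JSON line per step. A minimal sketch, not the production logger: the record fields, the `kind` labels, and the function names are assumptions chosen for illustration.

```python
import io
import json
from datetime import datetime, timezone

KINDS = {"code", "model"}  # deterministic step vs generative step


def log_step(log, email_id, step, kind, output):
    """Append one decision record as a JSON line and return it."""
    if kind not in KINDS:
        raise ValueError(f"unknown step kind: {kind!r}")
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "email_id": email_id,
        "step": step,
        "kind": kind,
        "output": output,
    }
    log.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def replay(lines, email_id):
    """Return the recorded steps for one email, in the order they were written."""
    records = (json.loads(line) for line in lines if line.strip())
    return [r for r in records if r["email_id"] == email_id]


# Usage with an in-memory log; a real deployment would append to durable storage.
log = io.StringIO()
log_step(log, "msg-1", "conflict_check", "code", {"conflict": False})
log_step(log, "msg-1", "route", "code", {"queue": "ready-to-confirm"})
log_step(log, "msg-2", "classifier", "model", {"label": "newsletter"})
steps = [r["step"] for r in replay(log.getvalue().splitlines(), "msg-1")]
print(steps)  # ['conflict_check', 'route']
```

Tagging each record as `code` or `model` keeps the deterministic and generative halves distinguishable when a decision is questioned later.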

Scope before capability

The conflict-of-interest queue exists because there is a class of email this agent must not touch. Not because the model could not handle it. Because the cost of being wrong, even once, is a lost account and a regulator-shaped problem. The discipline is not "what can the agent do." It is "what must the agent never do, and how do we make that constraint visible in the code."

Every agent we have shipped has had a scope question that mattered more than any capability question. Answering the scope question first is what keeps the capability question honest. In an inbox where one wrong open costs you a client, scope is not a footnote on the architecture diagram. It is the architecture diagram.

Takeaway

Conflict logic in code, draft logic in the model, and a four-minute SLA the inbox owner can feel. Nothing the agent decides should be invisible to the human who owns the consequence.

What changed for the team

Before: the operations lead spent the first 90 minutes of every day triaging the inbox. Account managers got briefs in batches, twice a day, often after the prospective client had already gone home. Conflicts were caught most of the time. "Most of the time" was the source of two late-night conversations a year that nobody wanted to have.

After: the inbox is essentially empty by 9am because everything has already been sorted. Account managers see briefs as they land, with a draft reply attached. The operations lead spends those 90 minutes on actual operations. Over the first sixteen weeks in production the agent flagged 23 conflicts the team would otherwise have had to catch by hand. Three of those were ones a human almost certainly would have missed.

The team did not get smaller. Nobody was hired or fired because of this. The work shifted from "open every email" to "decide on every email," which is what the senior people were there to do in the first place.

The smallest version worth shipping

You do not need to build the whole thing to learn whether it would work for you. Pick the one decision in your inbox that costs you the most when it goes wrong (conflict checks, supplier-vs-client disambiguation, urgent-vs-routine) and write it as a rule, not a prompt. Run it against last month's archive. Count the disagreements between the rule and what your team actually did. That gap is the brief for the agent.
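Counting those disagreements takes very little code. A minimal sketch, assuming you can export last month's mails alongside the decision your team actually made; the rule, the archive rows, and the field names below are invented for illustration.

```python
from collections import Counter


def count_disagreements(archive, rule):
    """archive: iterable of (mail, human_decision) pairs."""
    tally = Counter((rule(mail), human) for mail, human in archive)
    disagreements = sum(n for (ruled, human), n in tally.items() if ruled != human)
    return disagreements, tally


# Hypothetical rule: a subject containing "vandaag" (Dutch for "today") is urgent.
def urgent_rule(mail):
    return "urgent" if "vandaag" in mail["subject"].lower() else "routine"


# Made-up archive rows, purely for illustration.
archive = [
    ({"subject": "Briefing: live vandaag"}, "urgent"),
    ({"subject": "Nieuwsbrief maart"}, "routine"),
    ({"subject": "Spoed, deadline morgen"}, "urgent"),
]
disagreements, tally = count_disagreements(archive, urgent_rule)
print(disagreements)  # 1
```

The tally keeps every (rule, human) pair, so you can see which direction the rule errs in before deciding whether to tighten it.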

When we built the email agent for this reclamebureau, we found that the conflict-of-interest logic could not live in the same place as the draft-generation logic without one corrupting the other. We solved it by splitting the agent into a deterministic router and a generative drafter, and by writing the brand graph into a file the agency's own paralegal could edit. If you are looking at a similar inbox, that is the shape we now reach for first in our AI agent work.

Key takeaway

Conflict logic in code you can grep, not a prompt you can vibe-check. Irreversible decisions belong where the human who owns the consequence can see them.

FAQ

Why put the conflict check in code instead of letting the model handle it?

Brand portfolios change and model knowledge is stale. A versioned JSON file the agency's paralegal updates is auditable, greppable, and wrong only when a human gets it wrong.

How does the agent handle a brief from a brand it has never seen before?

It still runs the conflict check against the DPG brand graph, then routes the email to an unknown-sender queue with a one-paragraph summary for the account manager rather than drafting a reply.

Is auto-drafting a Dutch reply compliant with the DDMA-gedragscode?

Drafting is fine. Sending without a named human is not. Every draft sits in the account manager's outbox with her name and phone number on it, and she hits send.

What happens when Teamleader Focus and Notion disagree about who the client is?

Teamleader wins for contractual fields like client_id and exclusion clauses. Notion wins for tone, project history, and which creative led the last brief. The agent never overwrites either system.
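That precedence can be written down as field ownership rather than left to a prompt. A minimal sketch: apart from `client_id`, the field names are hypothetical stand-ins for whatever the two systems actually expose.

```python
# Which system owns which field; names other than client_id are illustrative.
TEAMLEADER_FIELDS = {"client_id", "exclusion_clauses"}
NOTION_FIELDS = {"tone", "project_history", "last_lead_creative"}


def resolve_client_view(teamleader: dict, notion: dict) -> dict:
    """Build a merged view for the agent; neither input record is modified."""
    view = {field: teamleader.get(field) for field in TEAMLEADER_FIELDS}
    view.update({field: notion.get(field) for field in NOTION_FIELDS})
    return view
```

Fields that neither owner supplies come back as `None`, which makes a gap visible instead of silently borrowing a value from the other system.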
