Email automation

Email automation for a law firm: the Kleos and NOvA playbook

A Haarlem advocatenkantoor gets 1,820 client emails a week. Triage, conflict-check, draft, archive — without one reply that could breach NOvA rule 5.

Jacob Molkenboer · Founder · A Brand New Company · 31 Dec 2025 · 10 min
Monday, 08:14, Haarlem

The managing partner opens Outlook. 1,820 unread. That is a week's worth of mail at a 28-person advocatenkantoor that does corporate, employment, family, and a fair amount of pro deo. Of those 1,820, roughly 60 are urgent. Around 30 turn out to be the same client mailing two associates and a partner on the same matter. Six are new prospects. Twelve are from opposing counsel. One, and this is the one, is a request from a party the firm is already acting against on behalf of another client. That is a NOvA rule 5 problem before any associate has had her first coffee.

This is the playbook for shipping an email-automation agent into exactly that situation. We will not tell you what to buy. We will tell you the order to build it in, the gates the agent must never cross, and the three things that went wrong before it worked.

The intake problem, in numbers

Before any model touched any inbox, we spent two weeks reading mail. Not the bodies: that would breach the geheimhoudingsplicht that binds the firm under NOvA gedragsregel 3. We pulled headers, addresses, sizes, and subject patterns through a script the partners reviewed and signed off on.

Of the 1,820 weekly mails:

  • 1,140 were already in a known dossier — in-Kleos contact, dossier-number in the subject or body.
  • 410 were known clients but the dossier was ambiguous: one client, four matters.
  • 180 were new — prospect intake, opposing counsel, court, or supplier.
  • 90 were noise: newsletters, calendar spam, the inevitable LinkedIn replies.

That breakdown decides the architecture. Everything that follows is a consequence of those four numbers.
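
The four buckets can be tallied from header-level records alone. The sketch below is illustrative only: the field names (`sender_known`, `dossier_candidates`, `is_bulk`) are assumptions made for this example, not the fields of the firm's actual script.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class HeaderRecord:
    """Header-level facts about one mail; no body text is stored."""
    sender_known: bool       # hypothetical: sender matches a known contact
    dossier_candidates: int  # hypothetical: dossiers this mail could belong to
    is_bulk: bool            # hypothetical: newsletter or notification traffic


def bucket(rec: HeaderRecord) -> str:
    """Assign one mail to one of the four intake categories."""
    if rec.is_bulk:
        return "noise"
    if not rec.sender_known:
        return "new"
    if rec.dossier_candidates == 1:
        return "known_dossier"
    return "ambiguous_dossier"


def tally(records) -> Counter:
    """Count mails per category."""
    return Counter(bucket(r) for r in records)


# The weekly counts reported above should add up to the full inbox.
WEEKLY = {"known_dossier": 1140, "ambiguous_dossier": 410, "new": 180, "noise": 90}
assert sum(WEEKLY.values()) == 1820
```

Roughly 63% of the volume lands in the first bucket, which is why the known-dossier path is the one worth automating first.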

Why we did not start with the agent

The instinct, when a partner asks for “AI in the inbox”, is to plug a model into Microsoft Graph, point it at the shared mailbox, and let it draft. We did not. The first thing we built was the conflict-of-interest gate, and it ran for six weeks without drafting a single reply.

The reason is regulatory and the reason is also boring. Under NOvA gedragsregel 5, the advocate must not act for two parties whose interests conflict, or against a former client in a related matter. A draft reply — even an ontvangstbevestiging — that goes out to the wrong party is, in the eyes of the deken, the firm acting. We needed to know we could trust the conflict check before the agent was allowed to speak.

Warning

An automated acknowledgment to a new prospect is not a neutral act. If that prospect is already opposing party in a dossier two floors up, you have just opened a complaint file with the local deken. Build the conflict gate before the drafter.

The Kleos bridge that did not exist

Kleos is the Wolters Kluwer dossiermanagementsysteem that runs a sizeable share of mid-sized Dutch firms. This one had been on it for fourteen years. That is a feature and a problem. The feature is fourteen years of party data, dossier history, and known conflicts. The problem is that the firm was still on a legacy on-premise version with no public API surface beyond an aging SOAP endpoint and a SQL replica the IT manager had configured for the accountant.

We did not push to upgrade. The firm had bigger fish to fry. Instead we built a read-only sync from the SQL replica into a Postgres mirror (parties, dossiers, status, opposing parties), refreshed every fifteen minutes. The agent never touched Kleos directly. It queried the mirror.

-- The query at the heart of the conflict gate.
-- Given a sender + extracted entities, are any of them
-- on the opposing side of any open dossier?

SELECT d.dossier_nr, d.matter_type, p.role, p.display_name
FROM   dossier_parties p
JOIN   dossiers d ON d.id = p.dossier_id
WHERE  d.status IN ('open', 'in_behandeling')
AND    (
         p.email           = ANY($1::text[])
      OR p.kvk_number      = ANY($2::text[])
      OR p.normalised_name = ANY($3::text[])
       )
AND    p.role IN ('wederpartij', 'tegenpartij_advocaat');

Three keys: email, KvK number, normalised name. The third earned its keep. Dutch corporate parties are routinely spelled four ways across a single dossier. We normalised on lowercase, stripped legal-entity suffixes, removed diacritics, and ran a tri-gram match. False positives are fine here. False negatives end careers.
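
A minimal sketch of that normalisation and match, assuming a short illustrative suffix list and a Jaccard score over padded character trigrams (in the spirit of PostgreSQL's pg_trgm, not a reimplementation of it). The function names and the suffix set are ours for illustration; the production list would be longer and reviewed by the firm.

```python
import re
import unicodedata

# Illustrative only: a real suffix list would be longer.
LEGAL_SUFFIXES = {"bv", "nv", "vof", "cv"}


def normalise_name(name: str) -> str:
    """Lowercase, remove diacritics and punctuation, drop legal-entity suffixes."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    lowered = no_marks.lower().replace(".", "")  # "b.v." becomes "bv"
    tokens = re.sub(r"[^a-z0-9]+", " ", lowered).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def trigrams(text: str) -> set[str]:
    """Character trigrams per word, padded so word boundaries count."""
    grams: set[str] = set()
    for word in text.split():
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the trigram sets of two normalised names."""
    ta, tb = trigrams(normalise_name(a)), trigrams(normalise_name(b))
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# Two spellings of the same party collapse to the same normalised form.
assert normalise_name("Café Müller B.V.") == normalise_name("CAFE MULLER BV")
```

Because the score is compared against a threshold rather than tested for equality, a misspelled party name still produces a hit that a human can dismiss, which is the direction of error the gate is designed to prefer.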

The conflict gate, end to end

Every inbound mail runs through the same six steps before any draft is composed:

  1. Strip and parse headers: from, to, cc, references, in-reply-to.
  2. Extract entities from the body — names, company names, KvK numbers, BSN patterns (and immediately redact those).
  3. Look up the sender against the Kleos mirror.
  4. Run the conflict query above against every extracted entity, including historical parties pulled from the archive index.
  5. If any hit: park the mail in the partner queue, no draft generated, no auto-reply, no read receipt.
  6. If clean: pass to the drafter with the matched dossier context.
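
The steps above can be sketched as a single function with the lookups injected, so the decision logic can be tested without a database. Everything here is a hedged illustration: `run_gate`, `GateResult`, and the two callables are hypothetical names, and the BSN redaction uses the standard elfproef (weights 9 down to 2, then -1, sum divisible by 11) to decide which nine-digit runs to mask.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Iterable, Optional

BSN_CANDIDATE = re.compile(r"\b\d{9}\b")


def passes_elfproef(digits: str) -> bool:
    """Checksum test for a nine-digit BSN candidate."""
    weights = (9, 8, 7, 6, 5, 4, 3, 2, -1)
    return sum(int(d) * w for d, w in zip(digits, weights)) % 11 == 0


def redact_bsn(text: str) -> str:
    """Step 2: mask nine-digit runs that pass the elfproef."""
    return BSN_CANDIDATE.sub(
        lambda m: "[BSN]" if passes_elfproef(m.group()) else m.group(), text
    )


@dataclass
class GateResult:
    action: str                       # "park" or "draft"
    dossier_nr: Optional[str] = None  # only set when the mail is clean
    hits: list = field(default_factory=list)


def run_gate(
    sender: str,
    entities: Iterable[str],
    find_dossier: Callable[[str], Optional[str]],  # hypothetical mirror lookup
    find_conflicts: Callable[[list], list],        # hypothetical conflict query
) -> GateResult:
    """Steps 3 to 6: look up the sender, check every key, then park or pass on."""
    dossier_nr = find_dossier(sender)
    hits = find_conflicts([sender, *entities])
    if hits:
        # Any hit parks the mail: no draft, no auto-reply, no read receipt.
        return GateResult(action="park", hits=hits)
    return GateResult(action="draft", dossier_nr=dossier_nr)
```

Keeping the gate a pure function of its inputs means the same code can be replayed against a week of historical headers before it is allowed near a live mailbox.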

The partner queue is not a folder. It is a small web view with three buttons — clear, conflict confirmed, needs deken consult — and an audit log the firm can hand to a tuchtrechter on request. The audit log is the deliverable. The dashboard is the by-product.
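
One way to model the three buttons and the log they feed is an enum plus an append-only list of JSON lines. This is a sketch under assumptions: the field names, the JSON format, and the `record` helper are hypothetical, and the firm's real log may store more than this.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Decision(str, Enum):
    CLEAR = "clear"
    CONFLICT_CONFIRMED = "conflict_confirmed"
    NEEDS_DEKEN_CONSULT = "needs_deken_consult"


@dataclass(frozen=True)
class AuditEntry:
    message_id: str
    decision: Decision
    decided_by: str
    decided_at: str  # ISO 8601 timestamp in UTC


def record(
    log: list,
    message_id: str,
    decision: str,
    decided_by: str,
    now: Optional[datetime] = None,
) -> AuditEntry:
    """Append one decision to the log; unknown decisions raise ValueError."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    entry = AuditEntry(message_id, Decision(decision), decided_by, stamp)
    log.append(json.dumps({
        "message_id": entry.message_id,
        "decision": entry.decision.value,
        "decided_by": entry.decided_by,
        "decided_at": entry.decided_at,
    }, sort_keys=True))
    return entry
```

Restricting the decision to a closed set of values is what makes the log readable by someone outside the project: every line is one of three outcomes, with a name and a time attached.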

Drafting the ontvangstbevestiging

Only after six weeks of conflict-gate-only operation did we let the agent draft. Even then, the first draft was a reception confirmation. Nothing substantive. Nothing that could be read as legal advice. The template:

Geachte {aanhef} {achternaam},

Wij hebben uw bericht in goede orde ontvangen en hebben
dit toegevoegd aan dossier {dossier_nr}. {behandelend_advocaat}
neemt binnen {sla_uren} werkuren contact met u op.

Mocht uw bericht spoedeisend zijn, dan kunt u bellen op
{telefoon_dossier}.

Met vriendelijke groet,
namens {behandelend_advocaat}
{firm_name}

Seven variables, one tone register, no opinion. We logged every send. The model only filled slots; it did not write prose. That choice is deliberate. Drafting free prose for a tuchtrechtelijk gereguleerd beroep is a problem you do not need in week one. Maybe ever. The general principle is small and durable: the smallest reply the agent can send is the safest one. Slot-filling beats free-text generation for any regulated profession, especially in the first ninety days.
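
A minimal sketch of strict slot-filling for the template above, using only the standard library. The `fill` helper and its validation rules (no missing slots, no extra slots, no empty or multi-line values) are assumptions for illustration; the slot names are taken from the template itself.

```python
from string import Formatter

TEMPLATE = (
    "Geachte {aanhef} {achternaam},\n\n"
    "Wij hebben uw bericht in goede orde ontvangen en hebben\n"
    "dit toegevoegd aan dossier {dossier_nr}. {behandelend_advocaat}\n"
    "neemt binnen {sla_uren} werkuren contact met u op.\n\n"
    "Mocht uw bericht spoedeisend zijn, dan kunt u bellen op\n"
    "{telefoon_dossier}.\n\n"
    "Met vriendelijke groet,\n"
    "namens {behandelend_advocaat}\n"
    "{firm_name}\n"
)

# Derive the required slots from the template so the two cannot drift apart.
REQUIRED = {name for _, name, _, _ in Formatter().parse(TEMPLATE) if name}


def fill(slots: dict) -> str:
    """Fill the template, refusing anything that is not an exact, clean slot set."""
    missing = REQUIRED - slots.keys()
    extra = slots.keys() - REQUIRED
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    for key, value in slots.items():
        # Single-line, non-empty strings only: a slot must not smuggle in prose.
        if not isinstance(value, str) or not value.strip() or "\n" in value:
            raise ValueError(f"invalid value for slot {key!r}")
    return TEMPLATE.format(**slots)
```

If a slot cannot be filled from the matched dossier, the call fails and nothing is sent, which keeps the failure mode the same as the conflict gate's: silence rather than a guess.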

The Outlook 2016 archive problem

The firm had a homegrown archive: every sent and received mail since 2016 in a network share of PST files, one per associate per year, indexed by a now-retired IT lead's Python script. The archive mattered because conflict checks needed historical context. Was this sender ever a party in a closed dossier?

We did not migrate the PSTs to a modern store. The firm did not have the appetite for an Exchange Online migration in the same quarter. We did this instead:

  1. One-shot extraction. We mounted the PSTs read-only and used libpff to pull headers, party lists, and message-IDs into Postgres. Bodies stayed where they were.
  2. A nightly delta. New mail since the last extraction was pulled at 03:00 via the same script. The IT manager owns the cron.
  3. Hash-only lookup for the agent. The conflict gate queries header fields and party hashes — never message body — against the archive index.
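
The hash-only lookup in step 3 can be illustrated with a keyed hash over a canonicalised address. The scheme below (HMAC-SHA256 over the lowercased, trimmed address) and the in-memory `ArchiveIndex` class are assumptions for this sketch; the text does not specify how the party hashes were actually computed or stored.

```python
import hashlib
import hmac


def party_hash(address: str, key: bytes) -> str:
    """Keyed hash of a canonicalised address, so the index holds no clear text."""
    canonical = address.strip().lower()
    return hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


class ArchiveIndex:
    """Maps party hashes to message-IDs; bodies are never loaded."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._by_party: dict[str, set[str]] = {}

    def add(self, address: str, message_id: str) -> None:
        digest = party_hash(address, self._key)
        self._by_party.setdefault(digest, set()).add(message_id)

    def seen_before(self, address: str) -> bool:
        return party_hash(address, self._key) in self._by_party

    def message_ids(self, address: str) -> set[str]:
        # Return a copy so callers cannot mutate the index.
        return set(self._by_party.get(party_hash(address, self._key), set()))
```

The gate only needs a yes or no on whether a party has appeared before, plus the message-IDs a human can pull up if the answer is yes.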

That decision saved roughly four months of project time and one career-shortening migration. We are aware that mounting PSTs is a long-known fragile operation; Microsoft has documented the file-size and corruption issues on .pst stores for years. We ran the extraction on a single VM, never in parallel, and copied the files before reading them.

Tuning the false-positive budget

By week four the gate was parking around 61 mails a week. The secretariaat — two people — cleared that queue inside an hour every morning. That hour was the cost of the system. We did not try to lower it. We did try to make every park explicable: each entry carried the three keys that triggered it, the dossier numbers involved, and the tri-gram score that pushed the match over the line. False positives are fine if they take five seconds to dismiss. They are corrosive if they take ten minutes to reason about.

We tuned the match threshold exactly once, in week four, from 0.78 down to 0.72. That move shifted about eleven mails a week from the auto-clear lane into the partner queue. The secretariaat was annoyed with us for three working days. Then, in week eight, the gate caught a former-client conflict that would have walked through the old threshold. Nobody raised the workload again. The third thing we got wrong was assuming we could pick the threshold from a sample. You cannot. The only honest way is to run loose for two weeks and tighten on the misses.
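
A small sketch of how a threshold turns a score into a lane, and of what an explicable park entry might carry. The 0.72 and 0.78 values come from the text; the `ParkEntry` fields and the `decide` helper are hypothetical names for illustration.

```python
from dataclasses import dataclass

MATCH_THRESHOLD = 0.72  # value after the week-four tuning; it was 0.78 before


@dataclass(frozen=True)
class ParkEntry:
    triggered_keys: tuple  # which of email, KvK number, name matched
    dossier_nrs: tuple     # dossiers involved in the match
    score: float           # similarity score that crossed the threshold

    def explain(self) -> str:
        """One line a reviewer can read and dismiss in seconds."""
        keys = ", ".join(self.triggered_keys)
        dossiers = ", ".join(self.dossier_nrs)
        return f"matched on {keys} in {dossiers} (score {self.score:.2f})"


def decide(score: float, threshold: float = MATCH_THRESHOLD) -> str:
    """Scores at or above the threshold go to a human; the rest auto-clear."""
    return "partner_queue" if score >= threshold else "auto_clear"
```

A score of 0.75 shows the effect of the change: `decide(0.75, 0.78)` auto-clears, while `decide(0.75)` parks the mail for review.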

Twelve weeks in: what we measured

The conflict gate went live in week one. Drafting went live in week seven. At week twelve we sat down with the managing partner and looked at the numbers.

  • 1,820 weekly mails resolved to 1,790 unique threads after the deduper ate the same-thread cc traffic.
  • 61 mails per week parked in the partner queue. Over the twelve weeks, 4 of those parks were confirmed conflicts that would otherwise have received an auto-reply. The rest were precaution.
  • Reception confirmations sent within 6 minutes median, down from 4.5 hours.
  • Associate hours recovered: the firm logged 11 hours per week, mostly from secretariaat triage and from associates not having to acknowledge mail before lunch.
  • Zero tuchtklachten related to the system. The audit log was checked once by the firm's own compliance officer; that was the only review.

Four conflicts in twelve weeks is not a marketing number. It is the number that justified the project to the partners. One of those four was a former client of a partner who had retired in 2019; the only place that party still existed was the PST archive. Without the archive index, the agent would have answered.

What you can do this afternoon

If you are running an inbox at a regulated firm and you are reading this thinking maybe: pull two weeks of mail headers, count the four categories above, and answer one question — how would you know an auto-reply went to the wrong party? If you cannot answer that in one sentence, you are not ready to draft. You are ready to build the conflict gate.

When we built this for the Haarlem firm, the part we underestimated was the Kleos party-name normalisation: Dutch legal entities are spelled four ways across a single dossier, and the agent fails closed on every spelling it has not seen. We solved it with the tri-gram match and a weekly review of near-misses by the secretariaat. If your team is sitting on the same problem, that is the kind of work we do in our AI agents practice.

Key takeaway

Build the conflict-of-interest gate first and run it alone for six weeks before the agent drafts a single reply — the audit log is the real deliverable.

FAQ

Why not let the agent draft replies from day one?

NOvA rule 5 makes a wrong-party draft a regulatory event. Running the conflict gate for six weeks first meant the firm trusted the audit log before any reply went out.

How do you connect to a legacy on-premise Kleos installation?

We did not connect to Kleos directly. We synced read-only from its SQL replica into a Postgres mirror every fifteen minutes, and the agent queries the mirror.

What did you do with the Outlook 2016 PST archive?

Headers and party lists were extracted with libpff into a Postgres index. Bodies stayed in place. The conflict gate queries hashes, never message bodies.

What was the smallest first draft the agent was allowed to send?

A slot-filled ontvangstbevestiging: salutation, name, dossier number, handling lawyer, SLA hours, urgent phone line, firm name. Seven variables, no opinion, no prose generation.

How many real conflicts did the gate catch?

Four in twelve weeks, out of 61 weekly partner-queue parks. One came only from the PST archive index — a former client of a partner who retired in 2019.

email automation · ai agents · automation · case study · integrations · workflow
