Email agent for a Leiden publisher: ISBN-conflict triage

A 21-person Leiden publisher gets 1,180 author emails a week. Half land in Klopotek, half in an Exchange archive nobody trusts. Here is what we built.

Jacob Molkenboer · Founder · A Brand New Company · 22 Jun 2026 · 8 min

Fifteen-fifty on a Tuesday in Leiden

It is 15:50 on a Tuesday, and the hoofdredacteur (editor-in-chief) of a 21-person Leiden uitgeverij (publishing house) has ten minutes before the CB Logistics aanlever-window, the delivery window, closes at 16:00. Her inbox shows 47 unread author messages since lunch. Two of them are about ISBNs that conflict with assignments already in Klopotek, the title-management system the publisher has run since 2014. She does not know which two.

If she misses the window, three titles slip a week in the Dutch trade supply chain. Three launch dates move. A campaign that is already running in the wild stops being true.

Before we built the email-agent, this was a normal Tuesday.

The shape of 1,180 weekly emails

The publisher's redactie processes about 1,180 author emails per week. Roughly four hundred are metadata: ISBN questions, BISAC and Thema-code requests, blurb edits, contributor changes. Three hundred are manuscript drafts and revisions, attached as Word documents and InDesign IDML packages. Two hundred are contract and royalty queries that have to be routed to a separate team. The rest is everything else: holiday auto-replies, congratulations on the new house, the chapter the author already sent yesterday and would like to resend "to be sure".

Of those 1,180, between twelve and twenty-five contain an ISBN-toekenningsconflict, an ISBN assignment conflict: an author has been told a number by an external party (a co-publisher, a distributor, a previous editor) that does not match what is in Klopotek. Every such conflict, if missed before 16:00, can become a logistics issue at CB later that week.

Two systems that don't talk

Klopotek runs on-prem on a Windows Server cluster that has not been meaningfully touched since 2014. It exposes a SOAP API that was state-of-the-art in 2009 and a more recent REST gateway that covers roughly 60% of the SOAP surface. The redactie-archief, thirteen years of author correspondence going back to 2013, lives on Exchange 2016, on-prem, behind a hardened reverse proxy.

Neither system is going anywhere. The publisher's CFO has priced a Klopotek migration twice in five years, and both times the spreadsheet did not survive contact with reality. So whatever we built had to read from both, write to neither (at first), and never, ever block the Outlook clients on the desks.

We did not propose to replace anything. We proposed a sidecar.

The sidecar

The agent pulls mail from a dedicated EWS impersonation account that has read access to the four shared redactie mailboxes. Every 25 seconds it asks Exchange for messages received since the last cursor, using the standard FindItems EWS call against the inbox folder. We chose EWS over Microsoft Graph for a boring reason: Exchange 2016 on-prem does not speak Graph, and the publisher's network team had spent four months hardening the EWS endpoint. We were not going to ask them to undo that work.
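
The cursor logic is independent of the EWS client, so it can be sketched without one. In the sketch below, `fetch_since` is a hypothetical stand-in for whatever wraps the real EWS call, and `MailRef` and `poll_once` are illustrative names, not taken from the production agent. It assumes the fetch returns messages received at or after the cursor, so a set of seen ids is used to drop repeats.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class MailRef:
    item_id: str          # assumed: a stable id per message
    received: datetime


def poll_once(
    fetch_since: Callable[[datetime], Iterable[MailRef]],  # hypothetical EWS wrapper
    cursor: datetime,
    seen: Set[str],
) -> Tuple[List[MailRef], datetime]:
    """Fetch messages at or after the cursor, drop ones already seen, advance the cursor."""
    fresh = [m for m in fetch_since(cursor) if m.item_id not in seen]
    seen.update(m.item_id for m in fresh)
    # If nothing new arrived, the cursor stays where it was.
    new_cursor = max((m.received for m in fresh), default=cursor)
    return fresh, new_cursor
```

A real deployment would also need to prune `seen` over time; that is left out here to keep the sketch short.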

Each new message goes through a three-step pipeline: extract, classify, route. Extraction is deterministic: parse headers, decode the body, strip the signature using a set of regexes tuned to the publisher's own footer patterns. Classification is the agent itself, a small model with tool access to Klopotek (read-only) and to a vector index of the last 18 months of correspondence. Routing is deterministic again, because routing is where you do not want a model to be creative.
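
To make "deterministic routing" concrete, here is a minimal sketch in which the route is a plain table lookup on the label the classifier produced. The label strings and queue names are assumptions for illustration; the article does not list the real ones.

```python
from dataclasses import dataclass

# Hypothetical labels and queue names, chosen only for this sketch.
ROUTES = {
    "isbn_conflict": "hoofdredacteur-queue",
    "metadata": "redactie-metadata",
    "manuscript": "redactie-manuscripts",
    "contract": "contracts-team",
}
DEFAULT_QUEUE = "redactie-general"


@dataclass(frozen=True)
class Classified:
    message_id: str
    label: str  # produced by the model in the classify step


def route(item: Classified) -> str:
    """Deterministic routing: a table lookup with a safe default for unknown labels."""
    return ROUTES.get(item.label, DEFAULT_QUEUE)
```

The point of the table is that a surprising label from the model can only ever land in the default queue; it cannot invent a destination.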

The whole pipeline runs in under forty seconds from EWS arrival to queue placement. Most of those seconds are spent waiting for the next EWS poll, not processing.

The ISBN-conflict path

The interesting path is the ISBN one. When the classifier flags a probable assignment conflict, three things happen: extraction first, then two lookups in parallel.

First, the agent extracts every ISBN-13 candidate from the message body and attachments. ISBNs are easy to find and easy to miss: a thirteen-digit string that starts with 978 or 979 and ends in a weighted mod-10 check digit, frequently written with hyphens. We wrote that checksum verification ourselves rather than ask the model to do it, because the model will happily hallucinate a valid checksum on a string that does not appear in the email at all.

Second, it queries the Klopotek REST gateway for each extracted ISBN and records the current assignment status: assigned, reserved, free, or stuck — a Klopotek state we have learned to treat as a fourth category because it tends to mean someone's manual workflow is half-finished.

Third, it does a semantic match against the redactie-archief vector index to see whether the same conflict has already been discussed in the last 90 days. If it has, the agent attaches the prior thread to the queue entry, so the editor does not start from zero.

If any of the extracted ISBNs is in a state inconsistent with what the email asserts, the message lands in the hoofdredacteur queue within forty seconds of arriving in EWS. Everything else routes to the normal redactie inboxes with a suggested label.
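
The extraction step can be sketched in a few lines of standard-library Python. This is an illustration of the technique, not the production extractor: the regex, the function names, and the choice to accept single hyphens or spaces between digits are assumptions. The check-digit rule itself is the standard ISBN-13 one: weight the digits 1, 3, 1, 3, and so on, and the total must be a multiple of 10.

```python
import re
from typing import List, Tuple

# 13 digits starting with 978 or 979, allowing one hyphen or space before each digit.
ISBN13_CANDIDATE = re.compile(r"(?<!\d)97[89](?:[- ]?\d){10}(?!\d)")


def isbn13_is_valid(digits: str) -> bool:
    """Verify the ISBN-13 check digit using alternating weights 1 and 3."""
    if len(digits) != 13 or not digits.isdigit():
        return False
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0


def extract_isbn13(text: str) -> List[Tuple[int, str]]:
    """Return (offset, normalized ISBN) for every checksum-valid candidate, in order."""
    found = []
    for match in ISBN13_CANDIDATE.finditer(text):
        digits = re.sub(r"[- ]", "", match.group())
        if isbn13_is_valid(digits):
            found.append((match.start(), digits))
    return found
```

Because every returned ISBN comes from a regex match on the text, with its offset, the extractor cannot report a number that does not appear in the email.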

GET /api/titles?isbn=9789403621847
Authorization: Bearer …

200 OK
{
  "isbn": "9789403621847",
  "title": "De stille kant van de stad",
  "status": "assigned",
  "assigned_to_project": "P-2025-0411",
  "last_modified": "2026-06-19T11:22:14+02:00"
}
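
How a response like this becomes a conflict flag depends on rules the article does not spell out. The sketch below makes them explicit as assumptions: "stuck" always goes to a human, an "assigned" ISBN conflicts when the email names a different project, and a "free" ISBN conflicts when the email treats it as already taken. The function name and the `claimed_project` parameter are hypothetical; only the JSON field names come from the response above.

```python
import json
from typing import Optional


def is_conflict(gateway_body: str, claimed_project: Optional[str]) -> bool:
    """Flag a lookup whose Klopotek state contradicts what the email claims.

    The rules here are assumptions made for this sketch, not the agent's real logic.
    """
    record = json.loads(gateway_body)
    status = record["status"]
    if status == "stuck":
        return True  # half-finished manual workflow: always show to an editor
    if status == "assigned":
        return (
            claimed_project is not None
            and record.get("assigned_to_project") != claimed_project
        )
    if status == "free":
        return claimed_project is not None  # email says taken, Klopotek says free
    return False  # "reserved" is left to the normal queue in this sketch
```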

Why forty seconds, and what it costs

Forty seconds is not magic. It is a budget. EWS push notifications could give us sub-second latency, but only on a network we do not fully control, and they break in ways that take days to debug. Polling every 25 seconds plus a 15-second processing budget gave us a worst-case of forty seconds, a best-case of ten, and a failure mode (a missed poll) that is loud and obvious instead of silent.

The cost of forty seconds is that the queue can occasionally bunch up around the 16:00 window. We mitigated this by adding a second poller that runs at a 5-second cadence between 15:30 and 16:05 and falls back to 25 seconds for the rest of the day. That is not elegant. It works.
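
The budget arithmetic is simple enough to write down. The sketch below models the cadence as one function of the time of day rather than as two separate pollers, which is a simplification for illustration; it also assumes the fast window ends just before 16:05. The intervals and the 15-second processing budget are the figures from the text.

```python
from datetime import time

FAST_START = time(15, 30)
FAST_END = time(16, 5)        # assumed exclusive: 16:05:00 is already "normal"
FAST_INTERVAL_S = 5
NORMAL_INTERVAL_S = 25
PROCESSING_BUDGET_S = 15


def poll_interval(now: time) -> int:
    """Seconds to wait before the next poll at this time of day."""
    return FAST_INTERVAL_S if FAST_START <= now < FAST_END else NORMAL_INTERVAL_S


def worst_case_latency(now: time) -> int:
    """A message that just missed a poll waits one full interval, then gets processed."""
    return poll_interval(now) + PROCESSING_BUDGET_S
```

Outside the window this gives 25 + 15 = 40 seconds; inside it, 5 + 15 = 20.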

Takeaway

A forty-second SLA is a budget, not a limit. Pick the slowest acceptable number, then design backwards from it so your failure modes get loud instead of silent.

The Klopotek cache

The hardest part to build was not the classifier. It was the cache layer between Klopotek's REST gateway and the agent. The gateway covers roughly 60% of Klopotek's data model and rate-limits anything heavier than a hundred reads per minute per client. At 1,180 emails a week, with two to four lookups per email, we would have starved the gateway by Tuesday lunch.

We built a thin Postgres cache that mirrors title, ISBN, contributor, and assignment data, refreshed in 90-second windows via the REST endpoints we trust, and via direct read-only SOAP calls for the 40% the REST gateway does not cover. The cache holds about 240,000 rows and stays under 600MB on disk. The agent reads from it. The editors' Klopotek-search bookmarklet also reads from it, which they did not ask for but quietly stopped removing.
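
The shape of that cache, a mirror that readers query and a refresher that reloads it on a fixed window, can be sketched without a database. In the sketch below an in-memory dict stands in for the Postgres table, `fetch_all` is a hypothetical bulk loader that would sit in front of the REST and SOAP calls, and the class and method names are illustrative. Only the 90-second window comes from the text.

```python
import time
from typing import Callable, Dict, Optional

REFRESH_WINDOW_S = 90  # refresh window from the article


class TitleCache:
    """In-memory stand-in for the Postgres mirror: readers never call the gateway."""

    def __init__(
        self,
        fetch_all: Callable[[], Dict[str, dict]],   # hypothetical bulk loader
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_all = fetch_all
        self._clock = clock
        self._rows: Dict[str, dict] = {}
        self._loaded_at: Optional[float] = None

    def refresh_if_stale(self) -> bool:
        """Reload the mirror if the last load is older than the refresh window."""
        now = self._clock()
        if self._loaded_at is not None and now - self._loaded_at < REFRESH_WINDOW_S:
            return False
        self._rows = self._fetch_all()
        self._loaded_at = now
        return True

    def get(self, isbn: str) -> Optional[dict]:
        """Read from the mirror only; a miss returns None instead of hitting Klopotek."""
        return self._rows.get(isbn)
```

Keeping `get` free of any gateway call is what bounds the load: the gateway sees one refresh per window no matter how many lookups the agent or the editors make.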

We ran gateway-direct and cache-backed reads in parallel for two weeks before flipping the agent to the cache. The gateway now only feeds the cache, and the agent reads only from the cache. The lookup time on a title dropped from "a few seconds, sometimes longer" to under 200 ms, and the editors noticed that change first. The classifier doing the right thing in the background was the less visible win.

What broke first

Three things broke in the first six weeks, and they are the three things that always break.

The first was authentication. The EWS impersonation account had a 90-day password rotation policy; nobody had told us. On day 91 the agent stopped reading mail, the redactie did not notice for two hours because Outlook still worked, and we got the call at 14:30 on a Friday. We moved to a service-account exemption with a documented quarterly review and a calendar reminder owned by two people.

The second was attachments. Authors send 80 MB InDesign files. The classifier did not need them, but our extractor pulled them down anyway, twice, because we had copied a re-fetch pattern from a 2017 EWS streaming sample. We started rejecting anything over 25 MB at the extraction stage and re-issuing a metadata-only read.
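
The size guard amounts to deciding, before any download, which attachments to fetch and which to keep as metadata only. A minimal sketch, assuming the size is known from the attachment listing and reading the 25 MB limit as 25 × 1024 × 1024 bytes; the type and function names are illustrative.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

MAX_ATTACHMENT_BYTES = 25 * 1024 * 1024  # 25 MB limit from the article


@dataclass(frozen=True)
class AttachmentInfo:
    name: str
    size_bytes: int  # assumed to be available without downloading the content


def plan_downloads(
    attachments: Iterable[AttachmentInfo],
) -> Tuple[List[AttachmentInfo], List[AttachmentInfo]]:
    """Split attachments into (download, metadata_only) by size, preserving order."""
    download: List[AttachmentInfo] = []
    metadata_only: List[AttachmentInfo] = []
    for attachment in attachments:
        if attachment.size_bytes <= MAX_ATTACHMENT_BYTES:
            download.append(attachment)
        else:
            metadata_only.append(attachment)
    return download, metadata_only
```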

The third was the model being too confident. In one early case, a co-publisher had written "ISBN 978-94-036-2184-7" in a thread, but the actual conflict was about a different ISBN further down the message, in a quoted reply. The classifier flagged the wrong one, the hoofdredacteur trusted the flag, and a title shipped with the wrong cover for two weeks before anyone noticed.

We changed two things. The agent now attaches every ISBN it finds, in order, with its position in the thread. And it refuses to commit a confidence judgement when there is more than one candidate: the queue entry simply says "two ISBNs, please review", and the editor decides. The same principle is argued more generally in Anthropic's "Building Effective Agents": prefer simple, composable patterns over end-to-end autonomy. We agree, expensively.
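
Both changes fit in one small function. The sketch below is an illustration under assumed names (`IsbnMention`, `summarize`, and the dict keys are hypothetical): it lists every ISBN in thread order and only names a single flagged ISBN when there is exactly one candidate.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class IsbnMention:
    isbn: str
    position: int          # order of appearance in the thread, 1-based
    in_quoted_reply: bool


def summarize(mentions: List[IsbnMention]) -> dict:
    """Build the ISBN part of a queue entry without guessing between candidates."""
    ordered = sorted(mentions, key=lambda m: m.position)
    flagged: Optional[str] = ordered[0].isbn if len(ordered) == 1 else None
    if not ordered:
        note = "no ISBN found"
    elif len(ordered) == 1:
        note = "one ISBN"
    else:
        note = f"{len(ordered)} ISBNs, please review"
    return {"isbns": ordered, "flagged": flagged, "note": note}
```

With two or more candidates `flagged` stays `None`, so there is no single highlighted number for an editor to trust by default.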

Where humans still own the call

The agent does not write to Klopotek. It does not reply to authors. It does not move messages between folders. Everything it produces is a queue entry, with a link back to the original message in Outlook and a structured summary the editor can paste into Klopotek if she chooses.

This is partly a trust decision and partly a regulatory one. Author correspondence at a Dutch publisher includes contract negotiation, which means GDPR Article 6 lawful-basis arguments we did not want to have. By keeping every outbound action human-initiated, we kept the agent in a category the legal team understood: a search tool that pre-sorts.

It also turned out to be the right product decision. The editors trusted the queue once it stopped trying to be clever. They never trusted auto-replies and were never going to.

What it changed

Six months in, the publisher reports three numbers that matter. The hoofdredacteur queue averages 18 items per day, and the median time-to-decision on those items is six minutes. The number of titles that missed the 16:00 CB window because of a metadata issue went from "roughly one a fortnight" to "two in six months, both of them ours". Author email is still 1,180 messages a week. It just stopped being the bottleneck.

The publisher has not asked us to add write paths. We have not pushed them to either. The queue is enough, and "enough" is a perfectly acceptable destination.

When we built the email-agent for this Leiden publisher, the thing we kept running into was the gap between Klopotek's REST gateway and what the redactie actually needed to know. We solved it with the thin caching layer described above, which both the agent and the human editors query. It is the kind of AI agent work that sits between an old system of record and the people who live with it, and that is most of what we ship in 2026.

If you want the smallest version of this you could try tomorrow: open your CRM or title system, count the inbound emails per week that require touching it, and time how long the average lookup takes. Multiply the two. If the result is bigger than your team's tolerance for it, you have the same problem this publisher had.
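
As a back-of-the-envelope version of that exercise, the function below turns the counts into hours per week. The function name is illustrative, and in the usage note the five-second lookup is an assumed reading of "a few seconds", not a measured figure.

```python
def weekly_lookup_hours(
    emails_per_week: int,
    lookups_per_email: float,
    seconds_per_lookup: float,
) -> float:
    """Hours per week spent waiting on lookups in the system of record."""
    return emails_per_week * lookups_per_email * seconds_per_lookup / 3600
```

For example, `weekly_lookup_hours(1180, 3, 5)` uses this publisher's 1,180 emails, the midpoint of two to four lookups per email, and the assumed five seconds per lookup; rerunning it with 0.2 seconds shows what a 200 ms cache changes.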

FAQ

How does the agent read from Exchange 2016 on-prem?

Through a dedicated EWS impersonation account with read access to the four shared redactie mailboxes. It polls every 25 seconds, and every 5 seconds between 15:30 and 16:05, around the 16:00 CB Logistics window.

Why a 40-second SLA instead of real-time?

EWS push notifications break in ways that take days to debug on locked-down networks. Polling at 25 seconds plus a 15-second processing budget gives a loud failure mode and a predictable load on Exchange.

Does the agent write back into Klopotek?

No. Every action it produces is a queue entry with a structured summary an editor can paste into Klopotek. We kept writes human-initiated for GDPR reasons and because the redactie trusted the system more once it stopped trying to be clever.

What was the hardest part to build?

Not the classifier. It was the Postgres cache between Klopotek's REST gateway and the agent. The gateway covers about 60% of Klopotek's data model and rate-limits anything heavier than a hundred reads per minute per client, so the cache became the de facto search interface for the catalogue.
