Email automation against legacy SAP EDI: a four-eyes playbook

At 23:00 in Eindhoven, 740 supplier confirmations sit unread in a purchase desk inbox. The 14-year-old SAP behind it cannot read them. Here is the playbook we used to ship the agent that can.

Jacob Molkenboer · Founder · A Brand New Company · 18 Jun 2026 · 10 min

It is 23:00 in Eindhoven and the purchase desk inbox has 740 unread items. Every one of them is a supplier confirming an inkooporder (purchase order): yes we will ship, here is our date, here is our price. Some are PDFs. Some are Word attachments with the customer reference buried in the footer. A few are plain text in German. Behind that inbox sits a 14-year-old SAP ECC 6.0, a custom EDI broker bolted on in 2014, and an AS2 gateway that has not been touched since the engineer who built it left in 2019. By Friday this volume will be 3,640 confirmations. By Monday morning the operations lead wants to know which ones slipped.

This is the playbook we used to ship an email agent against exactly that stack, for a 32-person semicon-toeleverancier (semiconductor supplier) near the High Tech Campus. No rip-and-replace, no SAP upgrade, no AS2 surgery. Six weeks from kickoff to first production run.

The constraint that defines the project

You cannot send the EDI-997 functional acknowledgement back to the customer before a human has approved any delivery-date slip larger than five working days. That is the rule. Everything else (the parser, the SAP write, the queue UX) is downstream of that single constraint.

The reason is contractual. Once the 997 leaves the AS2 gateway, the supplier has formally accepted the order at the confirmed date. If that date later slips and nobody flagged it, the buyer can charge back for line-down hours. In semicon, line-down is five-figures-per-hour money. So the inkoop (purchasing) team had been doing this check by hand: open the email, find the date, compare it to the line in SAP, decide if the slip is acceptable, click send.

Doing that 3,640 times a week is what we were asked to replace. Not the deciding. The opening, finding, comparing, and the eighty percent of confirmations where the answer is obvious.

What we built, in one sentence

An email agent that reads supplier confirmations into a structured record, compares against the open purchase order in SAP via the existing EDI broker, writes back the confirmed date through an IDoc the broker already knew how to consume, and holds the 997 acknowledgement in a queue when (and only when) the slip exceeds five working days.

Flow diagram: supplier email arrives, parser extracts fields, agent diffs against open PO in SAP via EDI broker, writes IDoc, decides whether to release the 997 or park in the four-eyes queue.
The agent sits between the inbox and the existing AS2 gateway. SAP does not see anything new. It sees IDocs.

Parsing the confirmation

Roughly two-thirds of the inbound traffic was PDF, about a quarter was Excel attachments, and the rest was free-form text. We did not build one parser. We built a router and three specialists:

  • A layout-anchored PDF extractor for the eight suppliers that account for 71% of weekly volume. Each has a stable template and we lock onto known coordinates first, fall back to an LLM only on mismatch.
  • A spreadsheet reader for two suppliers that send Excel attachments. Schema-on-read against a YAML hint file per supplier.
  • A general-purpose LLM extractor for the long tail, with a strict JSON schema and a refusal path when confidence drops below a threshold.
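The router described in the list above can be sketched as follows. This is a minimal illustration, not the production router: the supplier IDs, the `Inbound` type and the specialist names are hypothetical, and the real supplier lists would presumably come from configuration.

```python
from dataclasses import dataclass
from pathlib import PurePath

# Hypothetical supplier IDs, for illustration only.
TEMPLATE_SUPPLIERS = {"SUP-001", "SUP-002"}     # stable PDF templates
SPREADSHEET_SUPPLIERS = {"SUP-101", "SUP-102"}  # Excel senders with a hint file


@dataclass(frozen=True)
class Inbound:
    supplier_id: str
    attachment_names: tuple[str, ...]


def route(msg: Inbound) -> str:
    """Return the name of the specialist that should handle this message."""
    suffixes = {PurePath(name).suffix.lower() for name in msg.attachment_names}
    if ".pdf" in suffixes and msg.supplier_id in TEMPLATE_SUPPLIERS:
        return "layout_pdf"
    if suffixes & {".xls", ".xlsx"} and msg.supplier_id in SPREADSHEET_SUPPLIERS:
        return "spreadsheet"
    # Everything else, including unknown suppliers, goes to the long-tail extractor.
    return "llm_general"
```

Routing on supplier identity as well as file type keeps an unknown supplier's PDF away from a template that was never built for it.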

The schema is small: customer_po_number, line_item, confirmed_quantity, confirmed_delivery_date, supplier_reference, currency, unit_price. Anything else gets attached as raw context for the four-eyes reviewer but does not influence the routing decision. If the parser cannot fill those seven fields with confidence above a threshold we calibrated against three weeks of human-labelled mail, the confirmation goes to the queue with the raw email next to the partial extraction.
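As a rough sketch, the seven-field record and the confidence gate could look like this. The field names come from the schema above; the per-field confidence dictionary and the threshold value of 0.9 are assumptions, since the calibrated figure is not stated here.

```python
from dataclasses import dataclass, fields
from datetime import date
from decimal import Decimal
from typing import Optional

CONFIDENCE_THRESHOLD = 0.9  # placeholder value, not the calibrated one


@dataclass
class ConfirmationRecord:
    customer_po_number: Optional[str] = None
    line_item: Optional[str] = None
    confirmed_quantity: Optional[Decimal] = None
    confirmed_delivery_date: Optional[date] = None
    supplier_reference: Optional[str] = None
    currency: Optional[str] = None
    unit_price: Optional[Decimal] = None


def needs_review(record: ConfirmationRecord, confidence: dict[str, float]) -> bool:
    """True if any of the seven fields is missing or below the threshold."""
    for f in fields(record):
        if getattr(record, f.name) is None:
            return True
        # A field with no reported confidence is treated as zero confidence.
        if confidence.get(f.name, 0.0) < CONFIDENCE_THRESHOLD:
            return True
    return False
```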

If your parser writes anything to SAP based on an LLM extraction without a deterministic check against the open PO, you will eventually book a price change against the wrong line. We learned this on a Saturday in week two when an LLM picked up the supplier's internal reference number instead of our PO number and matched it against the only open line that happened to share two digits. The diff caught it because the unit price did not match within tolerance. We added the tolerance check before we wrote any production IDoc.
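A deterministic check of that kind might be sketched like this. The `Line` type, the reason codes and the 2% relative price tolerance are assumptions made for the example; the point is that the PO number and line must match exactly and only the price gets a tolerance.

```python
from dataclasses import dataclass
from decimal import Decimal

PRICE_TOLERANCE = Decimal("0.02")  # assumed 2% relative tolerance


@dataclass(frozen=True)
class Line:
    po_number: str
    line_item: str
    quantity: Decimal
    unit_price: Decimal


def diff_against_po(extracted: Line, open_lines: list[Line]) -> list[str]:
    """Return mismatch reasons; an empty list means the IDoc may be written."""
    match = next(
        (
            line
            for line in open_lines
            if line.po_number == extracted.po_number
            and line.line_item == extracted.line_item
        ),
        None,
    )
    if match is None:
        # Exact match only: no fuzzy matching on partially shared digits.
        return ["no_open_line"]
    reasons = []
    if extracted.quantity != match.quantity:
        reasons.append("quantity")
    if abs(extracted.unit_price - match.unit_price) > PRICE_TOLERANCE * match.unit_price:
        reasons.append("unit_price")
    return reasons
```

Any non-empty result would send the confirmation to the queue instead of to SAP.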

Talking to a 14-year-old SAP

SAP's mainstream maintenance for ECC 6.0 runs through the end of 2027, so the customer was not going to fund an S/4HANA migration as part of this project. Good. We did not need one.

The EDI broker already accepts ORDERS, ORDRSP and DELFOR IDocs. The agent does not log into SAP at all. It builds an ORDRSP.ORDERS05 IDoc, signs it for the broker, and drops it into the same inbound directory the broker has been watching since 2014. The broker, in turn, calls the same BAPI it always called. SAP cannot tell the difference between our IDoc and one written by a human via ME22N.

This is the single most important architectural choice in the playbook: treat the legacy integration surface as the API. Do not invent a new one. Do not ask for a new RFC user. Do not ask the SAP team for anything except read access to a single MARA/EKKO/EKPO view, which they granted in 40 minutes because we asked for nothing else.

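def build_ordrsp_idoc(po: PurchaseOrder, conf: SupplierConfirmation) -> bytes:
    """Build an ORDRSP (basic type ORDERS05) IDoc for one supplier confirmation."""
    seg = IDocBuilder("ORDRSP", basictype="ORDERS05")
    # Control record: sender and receiver logical systems plus the message type.
    seg.control(sender=AGENT_LS, receiver=SAP_LS, mestyp="ORDRSP")
    # E1EDK01: document header with the PO number and the confirmation currency.
    seg.e1edk01(belnr=po.number, currency=conf.currency)
    for line in conf.lines:
        # E1EDP01: one item segment per confirmed PO line.
        seg.e1edp01(
            posex=line.po_line,
            menge=line.confirmed_qty,
            preis=line.unit_price,
        )
        # E1EDP20: schedule line carrying the confirmed date as YYYYMMDD.
        seg.e1edp20(edatu=line.confirmed_date.strftime("%Y%m%d"))
    return seg.serialize()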
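Dropping a file into a directory that another process polls raises one practical question: the watcher must never see a half-written file. A common way to handle that, sketched below, is to write into a staging directory and move the finished file into place in one step. The function name and directory layout are hypothetical, and the sketch assumes both directories sit on the same filesystem, which is what makes the final move atomic on POSIX systems.

```python
import os
import tempfile
from pathlib import Path


def drop_idoc(payload: bytes, staging_dir: Path, inbound_dir: Path, filename: str) -> Path:
    """Write the IDoc to staging, then move it into the watched directory."""
    fd, tmp_name = tempfile.mkstemp(dir=staging_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())  # make sure the bytes are on disk before the move
        target = inbound_dir / filename
        # os.replace overwrites an existing target, so filenames should be unique.
        os.replace(tmp_name, target)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target
```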

The four-eyes queue

Now the constraint. For every confirmed line, we compute slip_days = working_days_between(po.requested_date, conf.confirmed_date). Working days, not calendar days, against a Dutch holiday calendar plus a supplier-country calendar for the top eight. If slip_days > 5, the line is parked. The 997 is not sent.
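A minimal version of the slip calculation, assuming the holiday calendar arrives as a set of dates (the merged Dutch and supplier-country calendars are not reproduced here), might look like this. Counting starts the day after the requested date, so a confirmation on the requested date is a slip of zero.

```python
from datetime import date, timedelta

SLIP_LIMIT_DAYS = 5


def working_days_between(start: date, end: date, holidays: frozenset = frozenset()) -> int:
    """Working days after `start` up to and including `end`; negative if early."""
    if end < start:
        return -working_days_between(end, start, holidays)
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        # Monday is 0, so weekday() < 5 means Monday to Friday.
        if current.weekday() < 5 and current not in holidays:
            days += 1
    return days


def is_parked(requested: date, confirmed: date, holidays: frozenset = frozenset()) -> bool:
    """True when the slip exceeds five working days and the line must be held."""
    return working_days_between(requested, confirmed, holidays) > SLIP_LIMIT_DAYS
```

With a requested date of Monday 1 June 2026, a confirmation for Monday 8 June is a five-day slip and passes, while Tuesday 9 June is six days and is parked.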

Parked confirmations land in a queue that two people can see: the inkoop lead and one of three supply-chain planners. Both must click approve or reject within their respective roles. The UI is one screen. It shows the original email, the parser's extracted record, the open PO, the slip in working days, and a Slack-style comment thread.
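The approval rule can be sketched as a small state object. The role names and the policy that a single rejection rejects the whole confirmation are assumptions for the example; the part taken from the text is that one decision is needed per role.

```python
from dataclasses import dataclass, field

REQUIRED_ROLES = frozenset({"inkoop_lead", "planner"})  # assumed role names


@dataclass
class ParkedConfirmation:
    confirmation_id: str
    # Maps role -> (user, approved)
    decisions: dict = field(default_factory=dict)

    def decide(self, user: str, role: str, approved: bool) -> None:
        if role not in REQUIRED_ROLES:
            raise ValueError(f"unknown role: {role}")
        for other_role, (other_user, _) in self.decisions.items():
            if other_role != role and other_user == user:
                # Four eyes means two different people.
                raise PermissionError("one person cannot decide in both roles")
        self.decisions[role] = (user, approved)

    @property
    def status(self) -> str:
        if any(not approved for _, approved in self.decisions.values()):
            return "rejected"
        if set(self.decisions) == REQUIRED_ROLES:
            return "approved"
        return "pending"
```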

What we explicitly did not do: route the four-eyes queue through email. We had a long debate about it. Email-driven approval would have been faster to build but it loses the audit trail and it allows reply-all chaos. We picked a small Next.js app on the same domain as the agent, with magic-link auth tied to the customer's Entra ID. Six weeks to ship includes that app.

Timing the EDI-997

The functional acknowledgement is the trickiest piece. AS2 partners typically expect the 997 within a defined SLA. For most of this supplier's customers, that is two hours. Holding it for a four-eyes review can push past that.

We split the 997 into two timed events. The technical 997 (we received and parsed the message) is sent immediately and unconditionally. The business 997 (we accept the confirmed dates) is sent only after the four-eyes queue clears. Most customer EDI implementations accept this split because it matches the X12 spec; a handful did not, and for those we shortened the parking window to four hours and escalated harder when a planner missed it.
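In code, the split amounts to two send events with different triggers. The sketch below is illustrative only: `AckState`, the `send` callback and the event names are hypothetical, and rejection handling and the shortened parking window are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable

Send = Callable[[str, str], None]  # (kind, message_id)


@dataclass
class AckState:
    message_id: str
    parked_lines: set = field(default_factory=set)
    business_sent: bool = False


def _maybe_release(state: AckState, send: Send) -> None:
    # The business acknowledgement goes out once, and only with an empty queue.
    if not state.parked_lines and not state.business_sent:
        send("business", state.message_id)
        state.business_sent = True


def on_parsed(state: AckState, parked_lines: Iterable[str], send: Send) -> None:
    """Called once the message is parsed: the technical receipt is unconditional."""
    send("technical", state.message_id)
    state.parked_lines = set(parked_lines)
    _maybe_release(state, send)


def on_line_approved(state: AckState, line: str, send: Send) -> None:
    """Called when the four-eyes queue clears one parked line."""
    state.parked_lines.discard(line)
    _maybe_release(state, send)
```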

Takeaway

If your agent has to decide whether to send an irreversible message, split the message into a technical receipt and a business commitment. Send the receipt; queue the commitment.

Idempotency, replay, and the boring middle layer

Supplier inboxes are messy. The same confirmation arrives twice, sometimes from two different addresses at the same supplier. We key every inbound on a tuple of (supplier_id, customer_po, line, confirmed_date_hash). A second arrival with the same tuple is logged and dropped. A second arrival with a different date is treated as a revision and re-runs the slip check.
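One way to express that rule, as a sketch: look up the first three elements of the tuple and compare the stored date hash. The in-memory dictionary stands in for the persistent store, and the function names and hash length are illustrative choices.

```python
import hashlib
from datetime import date


def date_hash(d: date) -> str:
    """Stable short hash of the confirmed date."""
    return hashlib.sha256(d.isoformat().encode("ascii")).hexdigest()[:16]


def classify(seen: dict, supplier_id: str, customer_po: str, line: str, confirmed: date) -> str:
    """Return 'new', 'duplicate' or 'revision' and record the latest date hash."""
    key = (supplier_id, customer_po, line)
    current = date_hash(confirmed)
    previous = seen.get(key)
    if previous == current:
        return "duplicate"  # same tuple: log and drop
    seen[key] = current
    # A changed date on a known line is a revision and re-runs the slip check.
    return "new" if previous is None else "revision"
```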

The replay log is in Postgres, not in the agent's runtime memory. If the agent dies, the next instance picks up at the last unacknowledged email. If a planner approves a confirmation and we then discover the parser misread the date, we can roll forward by writing a corrective ORDRSP. Never roll back, because SAP has no concept of "undo my last IDoc".

Shadow mode before cutover

We did not switch on writes to the broker directory on day one. For the first two weeks the agent ran in shadow mode: every inbound email went through the full pipeline but the generated IDoc landed in a staging directory the broker did not watch. Each morning at 08:30 the inkoop lead opened a one-screen diff: every confirmation the agent had parsed in the last 24 hours, the IDoc it would have written, and the IDoc a human actually wrote. Agreement counted; disagreement got a one-click reason code.
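The morning diff boils down to a field-by-field comparison of two sets of records. A minimal sketch, assuming both the agent's staged IDocs and the human-written ones have already been flattened into dictionaries keyed by a shared confirmation ID (that flattening step is not shown):

```python
def shadow_agreement(agent: dict, human: dict) -> tuple:
    """Compare records present on both sides.

    Returns (agreement_rate, {confirmation_id: [differing field names]}).
    """
    disagreements = {}
    compared = 0
    for cid in agent.keys() & human.keys():
        compared += 1
        a, h = agent[cid], human[cid]
        differing = sorted(k for k in a.keys() | h.keys() if a.get(k) != h.get(k))
        if differing:
            disagreements[cid] = differing
    rate = (compared - len(disagreements)) / compared if compared else 0.0
    return rate, disagreements
```

Listing the differing field names per confirmation is what makes a one-click reason code practical: the reviewer sees where the two versions diverge before deciding why.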

The shadow run surfaced seventeen edge cases we had not seen in the design conversations. A supplier in Singapore who confirms in local time but stamps the PDF in UTC, off by seven hours across a midnight cutoff. Two consecutive PO numbers that differ only by a zero-width space slipped in by one supplier's PDF generator. A planner who routinely accepts a one-day slip on lithography spares without recording it, which would have flooded our queue with confirmations the team in practice does not care about. Each one became a parser rule, a normaliser, or a per-supplier override before we let the agent write a single live IDoc.
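Two of those edge cases reduce to small normalisers. The sketch below is an illustration under assumptions: the function names are made up, the set of invisible characters is a reasonable guess, not the rule set actually shipped, and the supplier's UTC offset is passed in by the caller as a fixed number of hours.

```python
from datetime import date, datetime, timedelta, timezone

# Zero-width space, non-joiner, joiner, word joiner, and byte-order mark.
_INVISIBLE = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))


def normalise_po_number(raw: str) -> str:
    """Strip invisible characters and surrounding whitespace from a PO number."""
    return raw.translate(_INVISIBLE).strip()


def local_date(stamp_utc: datetime, utc_offset_hours: int) -> date:
    """Calendar date at the supplier's site for a UTC timestamp."""
    if stamp_utc.tzinfo is None:
        # Naive timestamps are assumed to be UTC already.
        stamp_utc = stamp_utc.replace(tzinfo=timezone.utc)
    supplier_tz = timezone(timedelta(hours=utc_offset_hours))
    return stamp_utc.astimezone(supplier_tz).date()
```

A fixed offset only works for sites without daylight saving time; anywhere else a proper time zone database lookup would be the safer choice.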

The bits we deliberately did not build

We did not build a supplier portal. The team wanted one. The instinct is reasonable: if suppliers fill in a structured form, the parser disappears. But the suppliers who matter most are tier-one fabs and EMS shops that already have their own EDI stacks and zero interest in logging into a customer portal to retype data they already sent. Building the portal would have moved the integration problem onto the wrong side of the relationship.

We also did not auto-escalate to the customer when a supplier missed an SLA. The team wanted that too. We pushed back because the agent does not know which customer relationships can absorb a polite nudge and which cannot. That stays with the planners, who get a daily digest of upcoming slips.

What it looks like in production

At the time of writing the agent has been live for eleven weeks. Weekly traffic landed at 3,640 confirmations on average with peaks above 4,200 during quarter-end. 81% of confirmations clear the agent without ever touching the four-eyes queue. The remaining 19% (slips greater than five working days, parser low-confidence, or new suppliers) average 11 minutes from arrival to planner decision, down from a previous median of "next morning". The inkoop-team went from three FTE on inbox triage to one FTE on queue review and exception handling.

The first interesting failure mode, which we did not anticipate, was suppliers who reply to the original PO email with a one-line confirmation in the body. The router missed those for the first week because it keyed on attachment presence. We added a body-parser fallback and the recovery rate on the long tail jumped from 78% to 96%.

The second was a supplier whose mail server rewrote PDF attachments through a third-party scan-and-rebuild proxy. The rebuilt PDFs broke our layout-anchored extractor because every coordinate had shifted by two points. We added a per-supplier sentinel that compares a known field's bounding box against the template; if it drifts more than half a millimetre we fall back to the LLM extractor and log it for the next template refresh. That one took a day to diagnose and an hour to fix, because shadow mode had already conditioned us to trust the diff more than the parser.
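The sentinel check is mostly unit conversion: PDF coordinates are in points, and one point is 1/72 of an inch, or about 0.353 mm. A sketch, assuming bounding boxes arrive as (x0, y0, x1, y1) tuples in points and that drift is measured as the largest corner displacement (both are assumptions for the example):

```python
import math

MM_PER_POINT = 25.4 / 72  # one PDF point is 1/72 inch
DRIFT_LIMIT_MM = 0.5


def drift_mm(expected: tuple, observed: tuple) -> float:
    """Largest displacement of either box corner, in millimetres."""
    ex0, ey0, ex1, ey1 = expected
    ox0, oy0, ox1, oy1 = observed
    drift_points = max(
        math.hypot(ox0 - ex0, oy0 - ey0),
        math.hypot(ox1 - ex1, oy1 - ey1),
    )
    return drift_points * MM_PER_POINT


def choose_extractor(expected: tuple, observed: tuple) -> str:
    """Fall back to the LLM extractor when the sentinel field has moved too far."""
    if drift_mm(expected, observed) > DRIFT_LIMIT_MM:
        return "llm_fallback"
    return "layout_pdf"
```

A two-point shift works out to roughly 0.71 mm, which is over the half-millimetre limit, so the rebuilt PDFs described above would trip the sentinel.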

The smallest thing you can do today

If you run an EDI flow against a legacy ERP, open the inbound directory your broker watches. Look at the last week of IDocs. Count how many were written by a human transcribing an email. That number is your business case.

When we built the email agent for this Eindhoven supplier, the hardest part was not the LLM or the SAP integration. It was earning the right to hold an EDI-997 for a human. We earned it by splitting the acknowledgement into a technical receipt and a business commitment, and by giving the four-eyes reviewers a screen that respected their five seconds of attention.

Key takeaway

Split the EDI acknowledgement into a technical receipt and a business commitment, then queue the commitment until a human approves the slip.

FAQ

Why not upgrade to S/4HANA as part of the project?

The customer was not going to fund an ERP upgrade alongside an automation project, and they did not need to. SAP ECC 6.0 mainstream maintenance runs through the end of 2027, which gave us a multi-year runway on the existing stack.

What stops the LLM from writing the wrong date to SAP?

A deterministic diff against the open PO. The agent checks PO number, line, quantity and unit price within tolerance before any IDoc is written. Anything that fails the diff goes to the four-eyes queue with the raw email attached.

Why hold the EDI-997 instead of sending an exception message later?

Most AS2 partners treat the 997 as commitment. Splitting it into a technical 997 (sent immediately) and a business 997 (sent after queue clears) preserves the SLA without overpromising on dates the supplier has not approved.

Does the same pattern work for EDIFACT instead of X12?

Yes. The EDIFACT equivalent of the 997 is CONTRL. The same split applies: send the technical CONTRL on receipt, hold the business ORDRSP until the four-eyes queue clears.

How long did the build take with a 32-person customer?

Six weeks from kickoff to first production run, including the Next.js queue UI, the IDoc builder, the parser router, and the AS2 timing split. One engineer from A Brand New Company (ABN) plus part-time access to the customer's SAP and EDI leads.
