Mollie, Buckaroo, Adyen webhook quirks: SEPA storno traps

A Zaandam subscription-box retailer found €4,247 of SEPA payments marked successful that the bank never settled. The webhooks all returned 200 OK. The storno landed somewhere else.

Jacob Molkenboer · Founder · A Brand New Company · 12 Aug 2025 · 8 min

On a Tuesday in May, the finance lead at a 32-person subscription-box retailer in Zaandam opened her reconciliation tab and saw €4,247 of successful SEPA payments that the bank had never settled. The PSP webhook logs were clean. Every endpoint returned a 200 OK. The storno had landed somewhere else.

She did not know it yet, but the agent we had shipped two weeks earlier was processing notifications from three providers (Mollie, Buckaroo and Adyen) across two merchant identifiers, and the same code path that confirmed a payment was eating the reversal. The webhook contract was satisfied. The accounting was wrong.

What follows is the cheatsheet we built after that morning. Seventeen quirks across three Dutch-relevant PSPs, ranked by how quietly they drop a SEPA storno on a multi-merchant tenant. If you operate a reconciliation pipeline that touches any combination of these providers, you have probably hit some of these. The ones you have not hit yet are the dangerous ones.

Why SEPA storno is the silent failure mode

A SEPA direct debit can be reversed up to eight weeks after collection without any reason given, and up to thirteen months if the mandate itself is contested. By the time the storno arrives, your warehouse has shipped, your books have closed, and the customer has eaten the box of organic muesli.

Card refunds tend to be loud. They come back through the same channel that authorised the payment, with the same merchant identifier, on the same business day. SEPA reversals are quiet. They arrive late, they arrive on a different event type than the one that confirmed the payment, and on a multi-merchant tenant they often arrive on a webhook payload that does not name the merchant clearly.

The pattern is consistent: a webhook fires, your endpoint returns 200, and the storno is logged against either the wrong tenant or no tenant at all. The PSP considers the message delivered. There is no retry.

Mollie quirks that return 200 and forget you

Mollie's webhook contract is famously minimal. The POST body is a single field, id, and you are expected to call back to GET /v2/payments/{id} to learn what happened. That terseness is fine for a single-profile shop and dangerous for a multi-profile tenant.

The webhook never tells you what the event is. A chargeback notification posts the same shape as a successful payment. If your handler treats "I have seen this id before" as idempotent and short-circuits, you will drop the chargeback. Always re-fetch and compare the status field.
The chargeback is its own resource. The payment status stays paid even after a SEPA reversal. The reversal lives under /payments/{id}/chargebacks. If you only read the payment, you will never see it.
Test and live share the URL. The only discriminator is the mode field on the payment object, not on the webhook payload. We saw a tenant route a live chargeback into the test ledger because the agent assumed the URL implied the mode.
Profile binding is on the payment, not the webhook. If you have multiple Mollie profiles under one organisation account, the webhook payload itself is profile-agnostic. You must fetch the payment, read profileId, then route from there.
A 200 with no side effect is a permanent loss. Mollie retries on non-200 responses for roughly a day (its documentation describes ten attempts spread over about 26 hours). A 200 returned before your write commits is treated as a successful delivery. There is no second chance.
Refund and chargeback look similar but settle differently. A customer-initiated SEPA reversal arrives as a chargeback (cost: roughly €12.50), not a refund. Because the webhook carries no event type, the only reliable discriminator is the resource you fetched: refund or chargeback. If your agent classifies by amount sign instead, it will not record the fee.
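
Taken together, the six points above suggest one handler shape: treat the posted id as a prompt to re-fetch, take mode and profileId from the payment object, and always read the chargebacks sub-resource. A minimal sketch, assuming a hypothetical `fetch` helper that performs an authenticated GET against the Mollie API and returns the parsed JSON body; `LedgerEntry` is an illustrative record, not part of any Mollie client library.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical helper type: takes an API path, returns the parsed JSON body.
Fetch = Callable[[str], dict]


@dataclass(frozen=True)
class LedgerEntry:
    mode: str          # "live" or "test", read from the payment object
    profile_id: str    # tenant boundary for multi-profile organisations
    payment_id: str
    kind: str          # "payment" or "chargeback"
    resource_id: str
    amount: str        # decimal string, as the API returns it


def resolve_mollie_webhook(payment_id: str, fetch: Fetch) -> list[LedgerEntry]:
    """Re-fetch everything the webhook did not say; never short-circuit on a known id."""
    payment = fetch(f"/v2/payments/{payment_id}")
    mode, profile_id = payment["mode"], payment["profileId"]

    entries = [LedgerEntry(mode, profile_id, payment_id, "payment",
                           payment_id, payment["amount"]["value"])]

    # The payment can still read "paid" after a reversal, so always look here.
    listing = fetch(f"/v2/payments/{payment_id}/chargebacks")
    for cb in listing.get("_embedded", {}).get("chargebacks", []):
        entries.append(LedgerEntry(mode, profile_id, payment_id, "chargeback",
                                   cb["id"], cb["amount"]["value"]))
    return entries
```

Writing each entry under a unique key on resource_id keeps repeated webhooks idempotent without skipping the re-fetch that surfaces a new chargeback.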

Buckaroo quirks where the signature lies

Buckaroo's push notifications come in two flavours that coexist on the same merchant account: the legacy form-encoded "Push" and the JSON-encoded "Push v2". They sign differently, they nest differently, and the same SEPA reversal can fire under either format depending on which integration created the original transaction.

Form-encoded pushes sign the URL but not the query string. If you forward webhooks through a router that appends a tenant hint as a query parameter, the signature validates and the payload still looks right. The tenant hint sits outside the signed content, so a wrong or missing hint goes unnoticed.
SEPA reversal arrives as StatusCode 890. That is "cancelled by user", not the failure code 491 you might wire your retry logic around. A handler that branches on StatusCode === 491 will treat the storno as a no-op.
Additional service payloads nest under different keys per version. A SEPA chargeback under v1 lives at AdditionalServices.Service[0].ResponseParameter. Under v2 it is AdditionalServices[0].Parameters. Same event, different shape.
WebsiteKey, not transaction key, is your tenant boundary. Multi-website Buckaroo setups share a single endpoint. The WebsiteKey header tells you which storefront the push belongs to. The transaction key is unique but tells you nothing about ownership.
Retries on the JSON push are limited. A Buckaroo v2 push is fired with a much shorter retry window than v1. If your endpoint returns 502 because a downstream queue was full at 04:00, that storno is often gone by the time the queue drains. The dashboard will mark it delivered.
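
Two of the points above lend themselves to a small normalisation layer: read the additional-service parameters from either push shape, and classify status codes so that nothing falls through to a silent no-op. The sketch below is hedged accordingly. The two key paths are the ones described in this list and should be checked against real payloads, the 490/491/492 failure codes follow Buckaroo's published status list, and `already_paid` is a hypothetical flag looked up from your own ledger.

```python
from typing import Any, Optional

SUCCESS = 190
CANCELLED_BY_USER = 890
FAILURE_CODES = (490, 491, 492)   # failed, validation failure, technical failure


def additional_service_params(push: dict) -> Optional[Any]:
    """Return the additional-service parameters from either push shape.

    Both key paths are taken from the text above; treat them as assumptions.
    """
    services = push.get("AdditionalServices")
    if isinstance(services, dict):                 # legacy shape: {"Service": [...]}
        items = services.get("Service") or []
        return items[0].get("ResponseParameter") if items else None
    if isinstance(services, list) and services:    # v2 shape: [{...}]
        return services[0].get("Parameters")
    return None


def classify_push(status_code: int, already_paid: bool) -> str:
    """Map a status code to a ledger action; unknown codes go to a human."""
    if status_code == SUCCESS:
        return "paid"
    if status_code == CANCELLED_BY_USER:
        # A "cancelled" status on an already-paid debit is how the storno showed up.
        return "reversal" if already_paid else "cancelled"
    if status_code in FAILURE_CODES:
        return "failed"
    return "needs_review"
```

The design choice is the last line: an unrecognised code is recorded and flagged rather than ignored, so a new reversal path shows up as work for a person instead of as a gap in the ledger.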

Adyen quirks on multi-merchant tenants

Adyen is the strictest of the three and also the easiest to misconfigure. Its notifications are batched, signed per merchant account, and routed through a single endpoint that has to fan out to many tenants.

NOTIFICATION_OF_CHARGEBACK is not the chargeback. It is the heads-up. The actual CHARGEBACK event comes later, sometimes days later, with a different PSP reference. Both must be processed, and the timing window for defending the dispute starts on the first event.
SEPA SDD reversals arrive as REFUND_WITH_DATA with success: false. The 200 is still required. If you return non-200 because success is false, Adyen redelivers the same notification on a backoff schedule: after 5, 30 and 60 minutes, then at exponentially longer intervals. You will get the same storno four times. If your idempotency key is based on event id only, you will write it four times.
HMAC signing keys are per merchant account. A single webhook URL serving five merchant accounts needs five keys. If you validate every notification against one configured key, the signatures from the other four accounts will not match, and whether that fails open or fails closed depends on your library.
notificationItems is a batch. Up to roughly twenty events arrive in one POST. There is no partial-success contract. If item 7 of 20 throws, you either accept the whole batch or none of it, and Adyen redelivers the whole batch.
The live flag lives at the root, not per item. A test transaction and a live transaction cannot share a batch, but the discriminator is one field that is easy to miss when you log only the per-item payload.
merchantAccountCode is the only tenant boundary. The PSP reference is unique, but it does not tell you who owns it. Multi-merchant tenants must route on merchantAccountCode first, then by PSP reference.
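
The batch, key and routing points above can be combined into one acceptance step. This is a sketch under stated assumptions, not Adyen's library: the signing string is the colon-joined field list Adyen documents for standard webhooks, with no escaping of special characters; `keys_by_merchant` and `seen` are hypothetical stores you would back with configuration and a database.

```python
import base64
import binascii
import hashlib
import hmac


def expected_signature(item: dict, hex_key: str) -> str:
    """Base64 HMAC-SHA256 over the documented fields (sketch: no escaping)."""
    amount = item.get("amount", {})
    fields = [
        item.get("pspReference", ""), item.get("originalReference", ""),
        item.get("merchantAccountCode", ""), item.get("merchantReference", ""),
        str(amount.get("value", "")), amount.get("currency", ""),
        item.get("eventCode", ""), item.get("success", ""),
    ]
    digest = hmac.new(binascii.unhexlify(hex_key),
                      ":".join(fields).encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def accept_batch(body: dict, keys_by_merchant: dict, seen: set) -> list:
    """Return the items to write; raise on any bad item so the caller can
    reject the whole batch, since there is no partial-success contract."""
    live = body.get("live") == "true"      # root-level flag, sent as a string
    accepted, new_keys = [], set()
    for wrapper in body["notificationItems"]:
        item = wrapper["NotificationRequestItem"]
        merchant = item["merchantAccountCode"]
        key = keys_by_merchant.get(merchant)
        if key is None:
            raise ValueError(f"no HMAC key for {merchant}")   # fail closed
        got = item.get("additionalData", {}).get("hmacSignature", "")
        want = expected_signature(item, key)
        if not hmac.compare_digest(got.encode("utf-8"), want.encode("utf-8")):
            raise ValueError(f"bad signature for {item['pspReference']}")
        # Redeliveries repeat these fields, so they form a stable dedupe key.
        dedupe = (merchant, item["pspReference"], item["eventCode"], item["success"])
        if dedupe in seen or dedupe in new_keys:
            continue
        new_keys.add(dedupe)
        accepted.append({"live": live, "merchant": merchant, "item": item})
    seen.update(new_keys)   # only once the whole batch has validated
    return accepted
```

The dedupe keys are added only after every item has passed, so a batch rejected at item 7 does not cause items 1 to 6 to be skipped when Adyen redelivers it.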

Warning

A 200 OK is a contract. It tells the PSP "I have durably accepted this notification, you can move on." If your handler returns 200 before the write commits and the write then fails, you have created a silent data-loss event that no dashboard will ever show you. Acknowledge after commit, not before.
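
A minimal sketch of acknowledge-after-commit, using SQLite from the standard library so it runs as-is. The table layout and the `handle_notification` function are illustrative assumptions; the point is only the ordering: the status code is decided after the transaction has committed or rolled back.

```python
import sqlite3

# Hypothetical ledger schema; the UNIQUE constraint makes redeliveries harmless.
SCHEMA = """
CREATE TABLE IF NOT EXISTS webhook_ledger (
    psp           TEXT NOT NULL,
    psp_reference TEXT NOT NULL,
    event_type    TEXT NOT NULL,
    raw_payload   TEXT NOT NULL,
    UNIQUE (psp, psp_reference, event_type)
)
"""


def handle_notification(conn: sqlite3.Connection, psp: str, reference: str,
                        event_type: str, raw_payload: str) -> int:
    """Return the HTTP status to send back: 200 only after a durable write."""
    try:
        with conn:  # commits on success, rolls back if the block raises
            conn.execute(
                "INSERT OR IGNORE INTO webhook_ledger "
                "(psp, psp_reference, event_type, raw_payload) VALUES (?, ?, ?, ?)",
                (psp, reference, event_type, raw_payload),
            )
    except sqlite3.Error:
        return 500   # nothing stored: let the PSP retry
    return 200       # stored, or already stored by an earlier delivery
```

A duplicate delivery still returns 200 because the row is already durable, while any database error returns 500 so that the provider's retry becomes the recovery path.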

The reconciliation invariant that catches all seventeen

After we hit the first six of these in production, we stopped patching individual quirks and rewrote the agent around one invariant: every webhook is a hypothesis, and the source of truth is the daily settlement file.

The settlement file (Mollie's settlements.csv, Buckaroo's transaction export, Adyen's SettlementDetailsReport) is what your bank actually sees. The webhook is a real-time hint that lets you ship sooner. If the hint and the file disagree, the file wins.

In practice that means three things: a ledger of every webhook, an import of every settlement file, and a daily diff between the two.

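-- daily diff, one row per discrepancy
-- (PostgreSQL syntax; assumes event_type and settlement_status share one
--  normalised vocabulary, e.g. 'paid' / 'reversed')
select
  l.psp,
  l.merchant_code,
  l.psp_reference,
  l.event_type        as webhook_said,
  s.settlement_status as bank_said,
  l.amount_cents,
  l.received_at
from webhook_ledger l
left join settlement_lines s          -- keep webhooks with no settlement row
  on  s.psp           = l.psp
  and s.psp_reference = l.psp_reference
where l.received_at >= now() - interval '8 weeks'   -- no-questions-asked storno window
  and (s.settlement_status is null                  -- the bank never saw it
       or s.settlement_status <> l.event_type)      -- the bank disagrees
order by l.received_at desc;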

That query is the spine of the agent. Every webhook write goes into webhook_ledger with the raw payload, the resolved PSP, the resolved tenant, and the timestamp. Every settlement file import populates settlement_lines. Discrepancies older than 24 hours go to a human, not back to the agent.

Most of the seventeen quirks above resolve into the same one-line discrepancy in the daily diff: webhook said paid, settlement says reversed. Once that signal is visible, the actual fix is usually a five-line classifier change. The hard part was making the signal visible.

The reason we like this shape is that it inverts the usual "wire up the webhook, hope it works" model. The webhook is now allowed to be wrong. We build the agent to expect it to be wrong, and we close the loop on the file the bank produces.

Where this ran in production

When we built the payments-reconciliation AI agent for the Zaandam subscription-box retailer, the part that surprised us was not the webhook contracts themselves. It was how easy it is to write a handler that satisfies all three vendor docs and still loses €4,000 of SEPA reversals across a multi-merchant tenant. We solved it by treating webhooks as untrusted hints and reconciling against the daily settlement file, which is now the default shape we ship for anyone running more than one PSP.

Pull last week's settlement file from your PSP. Diff it against your own paid-orders table. Count the rows where the file says reversed and your table says paid. If that number is greater than zero, you have a webhook quirk in production. Find which provider, which event type, and which tenant. Fix that one. Ship the diff job before you fix the next one.
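
The check described above fits in a few lines. A minimal sketch, assuming both files have already been exported to CSV with the illustrative columns `psp_reference` and `status`; real settlement exports differ per provider and need mapping to that shape first.

```python
import csv
import io


def reversed_but_paid(settlement_csv: str, paid_orders_csv: str) -> list[str]:
    """PSP references the settlement file marks reversed while we still say paid.

    Column names and status values are assumptions for illustration.
    """
    reversed_refs = {
        row["psp_reference"]
        for row in csv.DictReader(io.StringIO(settlement_csv))
        if row["status"].strip().lower() == "reversed"
    }
    return sorted(
        row["psp_reference"]
        for row in csv.DictReader(io.StringIO(paid_orders_csv))
        if row["status"].strip().lower() == "paid"
        and row["psp_reference"] in reversed_refs
    )
```

An empty list means the two sources agree for that week. Anything else is a list of references to trace back to a provider, an event type and a tenant.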

Key takeaway

Every webhook is a hypothesis. The settlement file is the truth. Build your reconciliation agent to expect the webhook to be wrong.

FAQ

Why does my webhook handler return 200 but still lose SEPA reversals?

Because the 200 commits delivery. If your write fails after the 200, the PSP never retries. Acknowledge after the database commit, not before, and classify by event type rather than amount sign.

Can one webhook endpoint serve multiple Adyen merchant accounts safely?

Yes, but each merchant account needs its own HMAC key, and each notificationItems entry must be validated against the key for its own account. Use merchantAccountCode to pick the key, verify the signature, then route by PSP reference.

Is the Mollie payment status enough to detect a SEPA storno?

No. The payment status stays paid after a SEPA chargeback. The reversal lives under /payments/{id}/chargebacks and must be fetched separately whenever the webhook fires for that payment id.

What is the smallest reconciliation check I can run today?

Pull last week's PSP settlement file, left-join it against your paid-orders table on the PSP reference, and count rows where the file shows reversed and your table shows paid. Non-zero means you have a webhook quirk live.
