Customs agent incident: 940 declarations, wrong EORI

Between 09:02 and 09:14 on a Thursday in June, an Almere customs broker's automation agent filed 940 entry declarations, 916 of them under a sister tenant's EORI. Here is the post-mortem.

Jacob Molkenboer · Founder · A Brand New Company · 4 Feb 2026 · 8 min

At 09:14 on a Thursday in June, the operations lead at a 31-person logistics broker in Almere watched 940 entry declarations turn red in her dashboard. Every one of them was rejected by DMS Inbreng for the same reason: the EORI on the declaration did not match the importer named in the cargo manifest. The broker had been filing entries through our automation agent for fourteen months. In twelve minutes the queue had moved through more wrong declarations than a human team could file in a week.

She called us at 09:23. By 09:51 the agent was paused, the Douane account manager was on the phone, and we were reading the API logs from the previous fifteen minutes. This is what we found, what we sent to Customs, and the gate we now run on every customs integration we ship.

What hit the queue between 09:02 and 09:14

The agent ingests cargo manifests from the broker's TMS, normalises them, and files the entry declaration through DMS Inbreng. Each declaration carries an EORI, the European customs identifier that says which company is importing the goods. EORIs are not a routing detail. They are the legal identity on the declaration. Filing under the wrong EORI is the customs equivalent of signing someone else's name on a tax return.

Between 09:02 and 09:14 the agent filed 940 declarations. 916 of them carried the EORI of a different broker on the same tenant cluster, a sister tenant we onboarded six weeks earlier. Twenty-four were filed correctly, and those are the ones that told us when the bug started. The last correct declaration had a request timestamp of 09:02:11. After that, the agent silently switched identities.

The cached jurisdiction token

The agent calls a thin internal wrapper around the AGS-style submission endpoint. The wrapper carries two tokens: a customer auth token that proves the broker is who they say they are, and a jurisdiction token that authorises submissions for a specific EORI under a specific declarant. The jurisdiction token is short-lived, fifteen minutes, and the wrapper caches it in Redis so it does not re-issue on every call.

The cache key was jurisdiction:{declarant_id}. It should have been jurisdiction:{tenant_id}:{declarant_id}.

The sister tenant ran the same wrapper version, and its declarant_id resolved to the same cache key as our broker's. Their token was issued first that morning, while the cache was still cold, so it was stored under that shared key. At 09:02:17 our broker's agent asked for a jurisdiction token, the wrapper checked Redis under jurisdiction:declarant-NL-0042, found a fresh entry, and returned it. The token was valid. It was just valid for the wrong company.
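The collision can be reproduced in a few lines. This is a minimal sketch, not the wrapper's code: a Map stands in for Redis, and the tenant names, EORI strings, token shape, and the getOrIssue helper are all hypothetical.

```typescript
// Illustrative only: an in-memory Map stands in for Redis, and the
// tenant names, EORI values, and token shape are hypothetical.
interface JurisdictionToken {
  boundTenantId: string;
  boundEori: string;
}

const cache = new Map<string, JurisdictionToken>();

// The key as it was: declarant only.
const legacyKey = (declarantId: string): string =>
  `jurisdiction:${declarantId}`;

// The key as it should have been: tenant and declarant.
const scopedKey = (tenantId: string, declarantId: string): string =>
  `jurisdiction:${tenantId}:${declarantId}`;

// Return the cached token if present, otherwise issue and store one.
function getOrIssue(
  key: string,
  issue: () => JurisdictionToken,
): JurisdictionToken {
  const hit = cache.get(key);
  if (hit !== undefined) return hit; // whoever wrote the key first wins
  const fresh = issue();
  cache.set(key, fresh);
  return fresh;
}

const sisterToken = (): JurisdictionToken => ({
  boundTenantId: "sister-tenant",
  boundEori: "EORI-SISTER",
});
const brokerToken = (): JurisdictionToken => ({
  boundTenantId: "broker-tenant",
  boundEori: "EORI-BROKER",
});

// Legacy key: the sister tenant warms the cache, then the broker asks.
getOrIssue(legacyKey("declarant-NL-0042"), sisterToken);
const brokerLegacy = getOrIssue(legacyKey("declarant-NL-0042"), brokerToken);

// Scoped key: same sequence, but each tenant gets its own entry.
cache.clear();
getOrIssue(scopedKey("sister-tenant", "declarant-NL-0042"), sisterToken);
const brokerScoped = getOrIssue(
  scopedKey("broker-tenant", "declarant-NL-0042"),
  brokerToken,
);

console.log(brokerLegacy.boundTenantId); // the sister tenant's token
console.log(brokerScoped.boundTenantId); // the broker's own token
```

With the legacy key the broker's lookup returns the sister tenant's token without ever calling its own issuer. With the scoped key the two tenants cannot share an entry, whatever their declarant ids are.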

Warning

If a cache key for an auth artefact does not contain the tenant id, you do not have a cache. You have a same-key collision waiting for the second tenant.

Why the retry loop did not catch it

The Claude tool-use loop made things worse. The agent's submit tool returns structured errors. If the customs endpoint rejects with an authentication or jurisdiction error, the loop is supposed to surface a hard stop. But at 09:02:11 a routine AGS call failed with a transient network error. The wrapper retried automatically. The retry path did not re-read the token from the auth service. It pulled from the Redis cache, which is exactly what caches are for.

The retry succeeded. The model saw a 200, marked the tool call resolved, and moved on. So did the next 915 calls.

The retry was not the root cause. The cache key was. But the retry loop is what turned a slow failure into a fast one. Anthropic's tool-use guidance is explicit that the model trusts what the tool returns. If your tool says ok, the agent will not second-guess you.

What we sent to Douane

By 10:30 the broker's customs account manager had a list of all 916 affected MRNs and the corrected EORI mapping. The standard route for a declaration error in DMS is a regularisation request to the customs office that accepted it. Dutch customs accepts these in writing with a corrected declaration appended. The broker filed regularisations the same afternoon. Goods that had not yet been released were held under bond and refiled under the correct EORI. Goods that had been released triggered a post-clearance correction.

None of the cargo was seized. Two import-VAT lines moved from one company's books to the other's, which both finance teams felt about as you would expect. The Almere customs office had seen something like this before, never at this scale, and was matter-of-fact about it.

What we did not get away with was the audit trail. Every wrong declaration carried a valid signature from the sister tenant's jurisdiction token. The sister tenant had to confirm in writing that they had not authorised any of those 916 entries. That paperwork took three weeks.

The per-tenant token binding gate

Two things went wrong. The cache key missed a dimension. The retry loop trusted the wrapper. The fix has to address both, and it adds a third safeguard in the tool schema so the agent cannot carry on past a mismatch.

First, the cache key is tenant-scoped, and the wrapper refuses to issue a request unless the token's bound tenant matches the call's tenant. The check runs on every call, not just on cache miss:

// runs on every customs-API call, not only on token issue
function bindTokenToTenant(token: JurisdictionToken, ctx: CallContext): void {
  if (token.boundTenantId !== ctx.tenantId) {
    throw new TokenTenantMismatch({
      tokenTenant: token.boundTenantId,
      callTenant: ctx.tenantId,
      declarantId: ctx.declarantId,
    });
  }
  if (token.boundEori !== ctx.eori) {
    throw new TokenEoriMismatch({
      tokenEori: token.boundEori,
      callEori: ctx.eori,
    });
  }
}

async function submitDeclaration(ctx: CallContext, payload: DmsPayload) {
  const token = await jurisdictionTokens.get(
    // key now contains both tenant and declarant
    `jurisdiction:${ctx.tenantId}:${ctx.declarantId}`
  );
  bindTokenToTenant(token, ctx); // hard fail before the wire call
  return dmsClient.submit(token, payload);
}

Second, the retry path no longer pulls from cache. A retry forces a fresh token. The reasoning: if the previous call failed, we treat the cached token as suspect until proven otherwise. The cost is one extra round-trip to the auth service on transient errors. The benefit is that a stale or cross-bound token cannot survive a retry.
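A retry path with that property might look like the sketch below. It assumes a hypothetical TokenSource with separate cached and fresh paths, and a transient flag on errors. None of these names come from the real wrapper, and backoff is left out to keep the example short.

```typescript
// Hypothetical interfaces: the real wrapper's token store and auth client
// are not shown in the post.
interface Token {
  value: string;
}

interface TokenSource {
  fromCache(): Promise<Token>; // may return a cached token
  issueFresh(): Promise<Token>; // always goes to the auth service
}

// Errors are tagged rather than subclassed so the check does not depend
// on instanceof behaviour across compile targets.
function transientError(message: string): Error & { transient: true } {
  return Object.assign(new Error(message), { transient: true as const });
}

function isTransient(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { transient?: unknown }).transient === true
  );
}

async function submitWithRetry<T>(
  tokens: TokenSource,
  send: (token: Token) => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // The first attempt may use the cache; every retry bypasses it.
    const token =
      attempt === 1 ? await tokens.fromCache() : await tokens.issueFresh();
    try {
      return await send(token);
    } catch (err) {
      if (!isTransient(err)) throw err; // only transient failures retry
      lastError = err;
    }
  }
  throw lastError;
}
```

The design choice is that the retry branch has no code path back to the cache, so a cross-bound entry can be used for at most one attempt per call.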

Third, the agent's tool now returns a typed error when a tenant mismatch is detected, and the tool-use loop is configured to halt the entire batch on that error class. This is the part that needed work in the prompt and in the tool schema. The agent was previously told to retry auth-shaped errors with backoff. The new instruction is explicit: a TokenTenantMismatch is never retryable, never logged-and-continued, and never summarised in a post-batch report. It pages a human.

// tool result schema: the wrapper sets retryable, the model cannot reclassify it
type SubmitResult =
  | { ok: true; mrn: string }
  | { ok: false; retryable: true;  category: "transient_network" | "rate_limited" }
  | { ok: false; retryable: false; category:
      | "tenant_mismatch"
      | "eori_mismatch"
      | "schema"
      | "rejected" };

The retryable flag is set by our wrapper, not inferred by the model. We learned the hard way that giving a model latitude to decide what counts as "the same problem, try again" is fine for an idempotent GET and dangerous for anything that creates a legally binding record.
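On the loop side, the schema lets the batch runner decide what happens next from the result alone. The sketch below is one possible mapping, not the production one: the BatchAction names are hypothetical, and treating eori_mismatch like tenant_mismatch, and schema and rejected as per-declaration failures, are assumptions.

```typescript
type SubmitResult =
  | { ok: true; mrn: string }
  | { ok: false; retryable: true; category: "transient_network" | "rate_limited" }
  | {
      ok: false;
      retryable: false;
      category: "tenant_mismatch" | "eori_mismatch" | "schema" | "rejected";
    };

// Hypothetical action names for the batch runner.
type BatchAction = "continue" | "retry" | "skip_and_report" | "halt_and_page";

function nextAction(result: SubmitResult): BatchAction {
  if (result.ok) return "continue";
  if (result.retryable) return "retry";
  // Identity mismatches stop the whole batch and page a human.
  // Assumption: EORI mismatches are treated like tenant mismatches.
  const identityMismatch =
    result.category === "tenant_mismatch" ||
    result.category === "eori_mismatch";
  return identityMismatch ? "halt_and_page" : "skip_and_report";
}

// Walk a batch of results and stop at the first halt.
function runBatch(results: SubmitResult[]): { filed: string[]; haltedAt: number | null } {
  const filed: string[] = [];
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    if (result.ok) filed.push(result.mrn);
    if (nextAction(result) === "halt_and_page") return { filed, haltedAt: i };
  }
  return { filed, haltedAt: null };
}
```

Because nextAction takes only the typed result, there is no free-text error message for the model to read as retryable.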

The checklist we run on every customs integration now

Every customs integration we ship now passes through the same five gates before it goes live:

1. Cache keys for auth artefacts include the tenant id. Always. We grep for this in CI.
2. The auth artefact carries its bound tenant and bound EORI in the payload, not just in the lookup key. The wrapper checks both on every call.
3. Retry paths re-fetch auth artefacts. Caches are for the happy path.
4. The agent's tool schema distinguishes retryable from non-retryable errors. The model is not allowed to reclassify.
5. Tenant-mismatch errors page on first occurrence. They do not roll up into a 5%-failure-rate dashboard.

The fifth point is the one we underrated. The agent's batch dashboard showed 100% success between 09:02 and 09:14 because, from the wrapper's perspective, every call succeeded. The customs system rejected the declarations downstream, but those rejections came in over the inbox channel, not the API channel. The dashboard the on-call engineer was watching was the wrong dashboard.
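One way to avoid watching the wrong dashboard is to compute batch health from both channels. The sketch below is an assumption about how that reconciliation could look; the ApiSubmission and InboxMessage shapes and their field names are hypothetical and do not come from the DMS interface.

```typescript
// Hypothetical shapes: neither the field names nor the two-channel model
// below are taken from the DMS interface itself.
interface ApiSubmission {
  declarationId: string;
  apiOk: boolean; // what the wrapper saw on the API channel
}

interface InboxMessage {
  declarationId: string;
  status: "accepted" | "rejected"; // what customs reported later via the inbox
}

interface BatchHealth {
  apiSuccessRate: number; // share of submissions the API call reported as ok
  netSuccessRate: number; // share that were ok and not rejected afterwards
  rejectedIds: string[]; // ok on the API channel, rejected in the inbox
}

function reconcile(
  submissions: ApiSubmission[],
  inbox: InboxMessage[],
): BatchHealth {
  const rejected = new Set(
    inbox.filter((m) => m.status === "rejected").map((m) => m.declarationId),
  );
  const apiOk = submissions.filter((s) => s.apiOk);
  const rejectedAfterOk = apiOk.filter((s) => rejected.has(s.declarationId));
  const total = submissions.length;
  return {
    apiSuccessRate: total === 0 ? 1 : apiOk.length / total,
    netSuccessRate:
      total === 0 ? 1 : (apiOk.length - rejectedAfterOk.length) / total,
    rejectedIds: rejectedAfterOk.map((s) => s.declarationId),
  };
}
```

A gap between apiSuccessRate and netSuccessRate is the signal that was missing that morning: every call looked fine on the API channel while the inbox said otherwise.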

What this incident was not

This was not a Claude bug. The model did what it was told to do with the tools it was given. The cache key was wrong, the retry path was wrong, and the tool schema was too permissive. Each of those is a thing a careful engineer can write, ship, and not notice for fourteen months as long as only one tenant uses the system.

It was also not a multi-tenant SaaS problem in the usual sense of that phrase. The cross-tenant leak did not come from a missing WHERE clause in a database query. It came from a cache that pre-dated multi-tenancy and was never re-keyed when the second tenant arrived. That is the more common shape of this class of bug in our experience: the database is fine, the application code is fine, and the infrastructure picks up an old assumption that nobody re-checked.

The smallest thing you can do today

When we rebuilt the broker's customs agent wrapper, we found that tenant-scoping the cache key was a one-line change, while tenant-binding the token payload was a two-week change involving three vendors. The cache key is the cheap fix. The payload binding is the one that saves you the next time someone forgets the cache key. When we ship customs agents now, we ship both.

Open a terminal. Grep your auth-related caches for the substring tenant. If a key that touches an auth or jurisdiction artefact does not contain a tenant identifier, you have today's problem. Fix that one first.
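If you keep your cache key templates in code, the same check can run as a script. This is a rough sketch: the hint words and the example templates are assumptions, and a substring match will miss keys that scope by tenant under a different name.

```typescript
// Hypothetical hint words for "touches an auth or jurisdiction artefact".
const AUTH_HINTS = ["auth", "token", "jurisdiction"];

// Return the key templates that look auth-related but never mention a tenant.
function findUnscopedAuthKeys(keyTemplates: string[]): string[] {
  return keyTemplates.filter((key) => {
    const lower = key.toLowerCase();
    const touchesAuth = AUTH_HINTS.some((hint) => lower.includes(hint));
    return touchesAuth && !lower.includes("tenant");
  });
}

// Example templates: only the first two appear in this post.
const suspects = findUnscopedAuthKeys([
  "jurisdiction:{declarant_id}",
  "jurisdiction:{tenant_id}:{declarant_id}",
  "rates:{currency}",
]);

console.log(suspects);
```

Anything the function returns is a key to fix first. An empty result only means the substring check passed, not that the tokens themselves are tenant-bound.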

Key takeaway

If a cache key for an auth token does not include the tenant id, you do not have a cache. You have a same-key collision waiting for the second tenant.

FAQ

What is DMS Inbreng?

DMS Inbreng is the Dutch customs entry-declaration channel under the Declaration Management System. It is the system through which import entries are filed and accepted by Douane.

How did a token from another tenant end up on these declarations?

The Redis cache key for the jurisdiction token was scoped to declarant id only, not tenant id. A sister tenant's token was already cached under the shared key when our broker's agent asked for one.

Why did the Claude tool-use loop not catch the wrong EORI?

The wrapper returned a 200 success after the silent retry. The model trusts the tool's response. Without a structured tenant-mismatch error, there was nothing for the loop to halt on.

What is a per-tenant token-binding gate?

A check that runs before every customs-API call and verifies the auth token's bound tenant and bound EORI match the call context. It fails hard if either is off, instead of trusting the cache.

How did the broker correct the 916 wrong declarations?

Through regularisation requests to the accepting customs office, with corrected declarations appended. Goods not yet released were refiled under bond. Released goods triggered post-clearance corrections.
