
Process automation incident: 940 aangiften, one stale TARIC

The aangifte queue was green. Customs called at 11:47 anyway. Forty-three of the morning's declarations had cleared with a tariff code that stopped existing the day before.

Jacob Molkenboer · Founder · A Brand New Company · 19 Apr 2026 · 11 min


What broke

11:47 on Thursday 2 April. The operations manager at a 28-person haven-expediteur in the Waalhaven gets a call from Douane. Forty-three of the morning's declarations have a TARIC code that stopped existing the day before. By the time we get the message and start looking, the agent has shipped 940 aangiften since midnight. At that point any one of them could be carrying a stale code on at least one line.

The agent did not crash. The Douane-tunnel did not reject. The control queue was green. The customer's invoicing was running off the same code, so internal reconciliation also looked fine. The whole point of an autonomous process automation is that the system tells you when something is off, and the system told no one. What we were dealing with was a silent-correct pipeline that had gone silent-wrong overnight.

Between 11:47 and 12:30 we and the customer broker pulled the morning's file list, sampled twelve declarations against the current TARIC, and confirmed the same versie-skew on every affected line. By 13:00 we had a count: 940 aangiften shipped, 412 of them with at least one line on a TARIC code that had been retired at midnight. None of the 412 had triggered an exception. The agent's structural validator had passed each one, the Descartes lookup had returned what it considered a valid measure, and the Douane-tunnel had handed back the green acknowledgement. Three different systems had agreed the morning was normal.

The setup

The agent does what most haven-expediteur back-offices do, just without the people. A booking comes in over EDI or email. The agent extracts the goederencode, weight, country of origin, partij-waarde and procedural details, validates them against the up-to-date TARIC measure for that classification, fills the AGS-aangifte, and pushes it through the Douane-tunnel via the customer's Descartes integration. The broker reviews exceptions; everything else flows.

That validation step is what is supposed to catch a code typo, a transposed digit, a classification that no longer fits the commodity. It is not designed to catch the case where the code lookup itself is reading from a version of the world that ended yesterday. The agent has no way to tell that the measure it received is the correct shape for the wrong moment in time. From inside the request loop, a stale cache and a current cache look identical.

The TARIC lookups go through Descartes. Descartes maintains a local mirror of the EU TARIC database that is kept current against the Commission's daily update. That mirror is one of the reasons you pay for Descartes. The whole point is that you trust the cache.

That trust is the bug.

The 1 April publicatievenster

The TARIC is the EU's integrated tariff. The Commission publishes daily delta files and a master version that turns over at the start of each quarter. 1 April is one of the four hard cutovers in the year. Codes get retired, new codes get introduced, duty rates change, and a non-trivial number of measures see their validity window end at 23:59:59 on 31 March and a successor measure start at 00:00:00 on 1 April.

The deltas land on the Commission's distribution server in a window the vendors call the publicatievenster. In practice that window is loose. Some quarters it lands in the small hours of the cutover day; some quarters it does not land until late morning. Vendors with their own mirror pull on a schedule and stamp their cache with the new versie when they have it.

What Descartes did on the morning of 1 April was hold its cached Q1-2026 versie through the publicatievenster, serve lookups against it, and only flip to Q2-2026 in the afternoon. There is a hint of this in the Descartes admin console, a small versie tag at the top of the TARIC pane, but our agent did not read that tag. It read the measure, got a valid measure back, and proceeded.
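Reading that tag is cheap. One way to use it, sketched here under assumptions rather than lifted from production, is to derive the quarter an entry datum falls in and compare it with the versie the lookup was served from. The sketch assumes the versie string starts with a `YYYYQn` prefix, as in the `2026Q2-04-01` example in the gate snippet; `expectedQuarter` and `versieMatchesEntry` are hypothetical helper names.

```typescript
// Sketch only: assumes versie strings start with 'YYYYQn', e.g. '2026Q2-04-01'.
// expectedQuarter and versieMatchesEntry are hypothetical helpers.

/** Returns the quarter label for an ISO date string such as '2026-04-01'. */
export function expectedQuarter(entryDatum: string): string {
  const match = /^(\d{4})-(\d{2})-\d{2}/.exec(entryDatum);
  if (!match) throw new Error(`not an ISO date: ${entryDatum}`);
  const year = match[1];
  const month = Number(match[2]);
  if (month < 1 || month > 12) throw new Error(`invalid month in: ${entryDatum}`);
  const quarter = Math.floor((month - 1) / 3) + 1;
  return `${year}Q${quarter}`;
}

/** True only when the cache versie belongs to the quarter of the entry datum. */
export function versieMatchesEntry(versie: string, entryDatum: string): boolean {
  return versie.startsWith(expectedQuarter(entryDatum));
}
```

A check like this only catches a cache that admits it is stale. It narrows the problem; it does not replace comparing against a second source.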

How it went unnoticed for a working day

For the first hour or so of 1 April this looked indistinguishable from a normal morning. Most of the codes the customer files against were unaffected by the Q2 update. The aangiften with affected codes still validated structurally: the cache returned a measure, the measure was internally consistent, the system was happy. The Douane backend accepted the declaration on a TARIC code that did not, as of the entry datum, legally exist for that classification.

The reason the Douane backend accepted it is the same reason this kind of bug survives anywhere: both sides of the transaction trusted the same upstream. The Douane validation rules for a given measure follow the same TARIC publication cycle. If the master at Douane's end happens to lag a few hours in the same direction as your vendor's mirror, the wrong code looks correct on both sides. By the time the lag closes, you have already filed.

The fault re-surfaces when someone downstream (a controlling officer, an audit batch, a post-clearance check) reads the same aangifte against the now-current TARIC. That is the call we got at 11:47. If your automation reads a tariff or regulatory cache by code and never by versie, a cache that silently stays on yesterday's reality is invisible until a human downstream notices. The error is not in the code path that failed. The error is in the absence of a code path that compares.

The dual-version diff-gate

We were not going to remove the Descartes cache. The cache exists for a good reason, and the alternative, hitting the Commission's TARIC consultation per lookup, has its own failure modes and rate limits. What we needed was a way for the agent to refuse to file an aangifte when the cache and the world disagreed on what the code meant today.

The fix is unglamorous. Before any aangifte enters the tunnel, the agent fetches the same measure from two places: the Descartes cache as before, and a second, independently kept TARIC mirror with its own publication-window discipline. We pull a minimal set of fields (measure ID, duty expression, additional codes, validity start and end) and we diff them.

Picking the right fields took longer than the code. The TARIC publishes more than 200 attributes per measure, most of which never change between quarters and most of which an aangifte does not depend on. We narrowed it to what the Douane backend would actually score the declaration against: measure ID, duty expression in a canonical string form, additional codes, the validity start and end, and the footnote codes that affect calculation. Anything else we let drift. The narrower the diff, the lower the false-positive rate, and on this kind of system a false positive is a parked aangifte and a broker phone call we cannot afford to make on noise. The afternoon we spent picking fields was the afternoon that decided whether the gate would survive contact with production.
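The narrowed comparison can be sketched roughly as follows. The field list comes from the paragraph above; the `footnoteCodes` property name, the helpers `sameCodes` and `canonicalDuty`, and the sample data shapes are illustrative assumptions, and real canonicalisation of a duty expression is more involved than collapsing whitespace.

```typescript
// Sketch only: field list taken from the text above; property names and the
// canonicalisation rules are assumptions, not the production implementation.
type ComparedFields = {
  measureId: string;
  dutyExpr: string;
  additionalCodes: string[];
  footnoteCodes: string[];
  validFrom: string;        // ISO date
  validTo: string | null;   // null = open-ended
};

type Drift = { fields: (keyof ComparedFields)[] };

// Order-insensitive comparison: two mirrors may list the same codes in a different order.
function sameCodes(x: string[], y: string[]): boolean {
  if (x.length !== y.length) return false;
  const sortedX = [...x].sort();
  const sortedY = [...y].sort();
  return sortedX.every((value, i) => value === sortedY[i]);
}

// Collapse whitespace so formatting differences do not register as drift.
function canonicalDuty(expr: string): string {
  return expr.trim().replace(/\s+/g, ' ');
}

// Returns the names of the fields on which the two sources disagree.
export function diff(a: ComparedFields, b: ComparedFields): Drift {
  const fields: (keyof ComparedFields)[] = [];
  if (a.measureId !== b.measureId) fields.push('measureId');
  if (canonicalDuty(a.dutyExpr) !== canonicalDuty(b.dutyExpr)) fields.push('dutyExpr');
  if (!sameCodes(a.additionalCodes, b.additionalCodes)) fields.push('additionalCodes');
  if (!sameCodes(a.footnoteCodes, b.footnoteCodes)) fields.push('footnoteCodes');
  if (a.validFrom !== b.validFrom) fields.push('validFrom');
  if (a.validTo !== b.validTo) fields.push('validTo');
  return { fields };
}
```

Everything outside `ComparedFields` is ignored by construction, which is what keeps the false-positive rate down.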

If the two sources agree, the aangifte goes out. If they disagree, the agent does three things: it parks the aangifte in a hold-queue, it pings the customer broker with the diff inline, and it stamps the incident with the two source versies so the operator can see which side is lagging. The hold-queue has a 20-minute SLA. In normal operation it is empty.

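// libs/customs/diff-gate.ts
type TaricSnapshot = {
  source: 'descartes' | 'mirror-b';
  versie: string;           // e.g. '2026Q2-04-01'
  measureId: string;
  dutyExpr: string;         // canonicalised
  additionalCodes: string[];
  validFrom: string;        // ISO
  validTo: string | null;   // null = open-ended
};

// Names of the fields on which the two sources disagree; empty means they agree.
type Drift = { fields: string[] };

type GateResult =
  | { ok: true; snapshot: TaricSnapshot }
  | { ok: false; reason: 'taric-drift'; drift: Drift; sources: { a: string; b: string } };

// The source adapters and the field diff live elsewhere in the codebase.
// They are declared here (shapes assumed) only so the excerpt type-checks.
interface TaricSource {
  lookup(code: string, entryDatum: string): Promise<TaricSnapshot>;
}
declare const descartes: TaricSource;
declare const mirrorB: TaricSource;
declare function diff(a: TaricSnapshot, b: TaricSnapshot): Drift;

export async function gate(code: string, entryDatum: string): Promise<GateResult> {
  // Ask both sources for the same measure in parallel.
  const [a, b] = await Promise.all([
    descartes.lookup(code, entryDatum),
    mirrorB.lookup(code, entryDatum),
  ]);

  // No drift: the aangifte may go out on the Descartes snapshot.
  const drift = diff(a, b);
  if (drift.fields.length === 0) return { ok: true, snapshot: a };

  // Any drift: refuse, and report both versies so the operator can see which side lags.
  return {
    ok: false,
    reason: 'taric-drift',
    drift,
    sources: { a: a.versie, b: b.versie },
  };
}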

The diff function is deliberately strict. We do not try to decide which side is right. We refuse to file. A broker decides.
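What happens after a refusal can be sketched as a small in-memory hold-queue. The 20-minute SLA comes from the description above; the `HoldQueue` class, its method names, and the injected `notifyBroker` callback are hypothetical, and a real queue would be persisted rather than held in memory.

```typescript
// Sketch only: HoldQueue, park, resolve, overdue and notifyBroker are hypothetical names.
type HeldAangifte = {
  aangifteId: string;
  driftFields: string[];
  sources: { a: string; b: string };  // versie reported by each source
  parkedAt: number;                   // epoch milliseconds
};

const HOLD_SLA_MS = 20 * 60 * 1000;   // 20-minute SLA for a broker decision

export class HoldQueue {
  private held: HeldAangifte[] = [];
  private notifyBroker: (item: HeldAangifte) => void;

  constructor(notifyBroker: (item: HeldAangifte) => void) {
    this.notifyBroker = notifyBroker;
  }

  // Park the aangifte, stamp it with both versies, and ping the broker with the diff.
  park(
    aangifteId: string,
    driftFields: string[],
    sources: { a: string; b: string },
    now: number,
  ): HeldAangifte {
    const item: HeldAangifte = { aangifteId, driftFields, sources, parkedAt: now };
    this.held.push(item);
    this.notifyBroker(item);
    return item;
  }

  // Remove an item once the broker has decided; returns false if it was not held.
  resolve(aangifteId: string): boolean {
    const before = this.held.length;
    this.held = this.held.filter((item) => item.aangifteId !== aangifteId);
    return this.held.length < before;
  }

  // Items that have waited longer than the SLA at time `now`.
  overdue(now: number): HeldAangifte[] {
    return this.held.filter((item) => now - item.parkedAt > HOLD_SLA_MS);
  }

  get size(): number {
    return this.held.length;
  }
}
```

Passing `now` in explicitly keeps the SLA check testable without waiting twenty minutes.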

What we did not do

We did not add a TTL to the Descartes cache. The cache has a TTL. The TTL was respected on 1 April. The cache had simply not received the new versie yet. Cache invalidation is not the bug here.

We did not write a trust score for tariff sources. We considered it for an afternoon. The shape of the system rewards a hard veto: any diff, any size, blocks the aangifte. A trust-weighted vote sounds intelligent and would, on the day this incident happened, have voted to file the wrong code.
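A toy comparison makes the difference concrete. The trust weights below are invented for illustration, as are the function names and the placeholder answers; the only point is that a vote has to pick a winner while a veto does not.

```typescript
// Sketch only: weights, answers and function names are illustrative assumptions.
type SourceAnswer = { source: string; trust: number; measureId: string };

// Trust-weighted vote: returns the answer with the highest total trust.
export function weightedVote(answers: SourceAnswer[]): string {
  const totals = new Map<string, number>();
  for (const answer of answers) {
    totals.set(answer.measureId, (totals.get(answer.measureId) ?? 0) + answer.trust);
  }
  let winner = '';
  let best = -Infinity;
  totals.forEach((total, measureId) => {
    if (total > best) {
      best = total;
      winner = measureId;
    }
  });
  return winner;
}

// Hard veto: returns an answer only when every source gives the same one.
export function hardVeto(answers: SourceAnswer[]): string | null {
  if (answers.length === 0) return null;
  const first = answers[0].measureId;
  return answers.every((answer) => answer.measureId === first) ? first : null;
}

// The long-trusted cache is stale; the newer mirror is current.
export const incidentDay: SourceAnswer[] = [
  { source: 'descartes', trust: 0.9, measureId: 'retired-measure' },
  { source: 'mirror-b', trust: 0.6, measureId: 'successor-measure' },
];
```

With these inputs `weightedVote(incidentDay)` returns `'retired-measure'`, the wrong answer filed with confidence, while `hardVeto(incidentDay)` returns `null` and the aangifte is parked.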

We did not bolt on an ML anomaly detector. A model that flags an unusual duty rate would not have caught this. The duty rate was not unusual; it was simply assigned to a code that no longer existed. The signal we needed was not in the data the agent had. It had to be fetched, and it had to be fetched from somewhere the cache could not reach. The class of failure here is not statistical, it is bibliographic, and you do not detect a citation error by clustering the citations.

We did not switch vendors. Descartes was not negligent in any way we can prove, and on some future morning the second mirror will be the lagging one. The point of the gate is that either source can be wrong; only matching answers go out.

What the 940 cost

Every affected aangifte had to be corrected with a verbeterverzoek through the Douane portal. The customer broker, three controlling officers, and two of our engineers spent the following Friday and Monday on it. No goods were physically held. Two consignments had to be re-coded for duty, and the duty delta was small in both directions. The customer did not lose a shipment. They lost a weekend, the trust of a controlling officer for about three weeks, and the price of a small kitchen renovation in engineering time on both sides.

The real cost is the one we keep in the post-mortem doc: for a working day the system stamped legally meaningful documents in the name of a regulated entity with a code that did not exist. The duty was almost right. The audit trail was completely wrong.

What we tell other clients now

Every regulatory-data input to an automation agent (TARIC, REACH, the EORI registry, sanctions lists, VAT-number validation, anything published by a public authority on a schedule) gets a versie field surfaced into the agent's working memory and a second independent source. If you cannot find a second independent source for a given list, that is itself a finding: it tells you the agent should not be filing autonomously against that input.
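As a rough sketch, that rule can live in a small registry the agent consults before acting. The feed entries, source names and the `mayFileAutonomously` function are hypothetical; the shape matters more than the names.

```typescript
// Sketch only: feed entries, source names and mayFileAutonomously are hypothetical.
type FeedSource = { name: string; versie: string | null };  // null = source exposes no versie

type RegulatoryFeed = { feed: string; sources: FeedSource[] };

type Decision = { autonomous: boolean; reason: string };

// Autonomous filing needs at least two distinct sources, each exposing a versie.
export function mayFileAutonomously(feed: RegulatoryFeed): Decision {
  const versioned = feed.sources.filter((source) => source.versie !== null);
  const distinctNames = new Set(versioned.map((source) => source.name));
  if (distinctNames.size < 2) {
    return { autonomous: false, reason: `${feed.feed}: fewer than two versioned sources` };
  }
  return { autonomous: true, reason: `${feed.feed}: ${distinctNames.size} versioned sources` };
}

export const taricFeed: RegulatoryFeed = {
  feed: 'TARIC',
  sources: [
    { name: 'descartes', versie: '2026Q2-04-01' },
    { name: 'mirror-b', versie: '2026Q2-04-01' },
  ],
};

export const singleSourcedFeed: RegulatoryFeed = {
  feed: 'example-list',
  sources: [{ name: 'only-vendor', versie: '2026-04-01' }],
};
```

Whether two sources are genuinely independent (and not two front ends on the same upstream) is a judgement the registry cannot make for you; it only records the answer.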

The agent does not need to understand the regulation. The agent needs to refuse confidently when two upstreams disagree about what the regulation says today.

Takeaway

Stale regulatory caches do not fail loudly. The cache will return a perfectly shaped wrong answer all day. Diff against a second source; refuse to act on a mismatch.

A note on the AI-in-school debate

The week of this incident, the Norwegian government's near-ban on AI in elementary school was on every front page. The framing in most of the coverage was about classrooms. The substance is older than that: do not let an opaque automation do legally binding work without a second pair of eyes that can disagree with it. Schools are one instance. A haven-expediteur's customs queue is another. The pattern is the same. Build the disagreement in.

What you can do this afternoon

Pick one regulatory-data feed your automation reads. List the fields it consumes. Find one other place, a vendor mirror, a Commission portal, an open-data dump, where you can fetch the same fields. Wire a five-minute job that pulls both, diffs the fields you care about, and emails you when they disagree. You do not have to ship a gate today. You only need to know whether your feed has, in the last 90 days, been silently wrong.
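A minimal sketch of such a job, with the fetchers and the notifier injected so that nothing here pretends to know your vendor's API. `checkFeed`, `FieldMap` and the callback names are hypothetical; schedule it with whatever you already use for recurring jobs.

```typescript
// Sketch only: checkFeed and its callbacks are hypothetical; plug in your own fetchers.
type FieldMap = Record<string, string>;

type FeedCheck = {
  fetchPrimary: () => Promise<FieldMap>;
  fetchSecondary: () => Promise<FieldMap>;
  fields: string[];                       // only the fields you care about
  notify: (message: string) => void;      // e.g. send an email
};

// Pulls both sources once, compares the chosen fields, and notifies on any mismatch.
export async function checkFeed(check: FeedCheck): Promise<string[]> {
  const [primary, secondary] = await Promise.all([
    check.fetchPrimary(),
    check.fetchSecondary(),
  ]);
  const mismatched = check.fields.filter((field) => primary[field] !== secondary[field]);
  if (mismatched.length > 0) {
    const lines = mismatched.map(
      (field) => `${field}: primary=${primary[field]} secondary=${secondary[field]}`,
    );
    check.notify(`Feed drift detected\n${lines.join('\n')}`);
  }
  return mismatched;
}
```

Keep the returned list in a log. After a few publication cutovers it tells you how often your two sources disagree, and for how long.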

When we built the customs-automation agent for the Rotterdam haven-expediteur, the gap we ran into was exactly this: a cache the vendor and the Douane backend both believed was up to date. We ended up solving it by refusing to file on any unresolved diff, and we leave that gate on by default for every process automation we ship now.

Key takeaway

If a regulatory cache returns a perfectly shaped wrong answer, diffing against a second source is the only signal you have that it lied.

FAQ

Why didn't the Douane backend reject the aangifte?

TARIC publication propagates on the same schedule on both sides. For a few hours around a quarterly cutover, your vendor's lag and the Douane mirror's lag can point in the same direction. The wrong code looks valid on both ends.

Was Descartes at fault?

No. Descartes ran the cache within its documented update window. The bug is that an autonomous agent filed legally meaningful documents against a single source without surfacing the source's versie. That is an integration design choice, not a vendor failure.

Why not hit the EU TARIC consultation directly?

Rate limits, latency, and intermittent availability. The public consultation is designed for human lookups, not transaction-rate automation. Use it as a diff source against your vendor cache; do not make it the only source.

How big a code change was the diff-gate?

About 200 lines plus a second source adapter and a hold-queue UI for the broker. The work is operational, not algorithmic. Most of the time went into agreeing on what 'same measure' actually means across two mirrors.

