Bol.com to NetSuite agent: 11 hours of €0.01 invoices

At 06:14 Amsterdam time, an order-import agent had been posting €0.01 line items into NetSuite for eleven hours. Here is what changed, what we missed, and the gate we should have had.

Jacob Molkenboer · Founder · A Brand New Company · 30 Aug 2024 · 9 min

At 06:14 Amsterdam time on a Wednesday in May, the AR controller at a mid-sized Dutch consumer-brand client opened NetSuite to clear the previous night's Bol.com reconciliation. The order-import agent had been running since 19:30 the evening before. She saw 1,847 sales invoices. She saw a subtotal of €37.42. She closed the tab, opened it again, and called us at 06:18.

For the next four minutes we shared screens and watched the same number tick up. The agent had spent eleven hours posting €0.01 line items into NetSuite. Real customer orders, real SKUs, real shipping addresses, one-cent prices. The kill-switch we had inherited from the previous vendor was sitting at zero failures.

This is what changed upstream, what the mapper did with it, and what the alarm should have caught in ninety seconds.

The schema change Bol.com shipped

Bol.com publishes a Retailer API used by sellers to pull orders, ship items, and write back tracking. Version 10 has been stable for over a year. On the Tuesday afternoon before the incident they pushed an additive change to the orders endpoint: the per-item price moved from a flat decimal into a structured pricing object.

Before:

{
  "orderItem": "BOL-1234567890-001",
  "ean": "8710398502537",
  "unitPrice": 24.95,
  "quantity": 2
}

After, on the same endpoint, no version bump:

{
  "orderItem": "BOL-1234567890-001",
  "ean": "8710398502537",
  "pricing": {
    "unitPrice": { "amount": 24.95, "currency": "EUR" },
    "vatRate": 21.0
  },
  "quantity": 2
}

Bol's release note called this "richer pricing metadata for tax handling". The old unitPrice field stayed in the response for a 30-day deprecation window. Bol's Open Retailer API documentation covered both shapes during the transition. Our agent was reading the old field, and the old field was still there.

The catch: the old field was not always there. For any order that carried a promotional discount or a marketplace correction, Bol populated only the new pricing.unitPrice.amount. The flat unitPrice came back as null. Our mapper got null. Our mapper had a fallback. The fallback was the bug.

Bol communicates schema changes through a public changelog and a weekly newsletter for connected sellers. Both had described this one as additive and low risk. That description was correct from the wire perspective and wrong from the integration perspective. Additive means new keys appear. It does not mean old keys keep their old values. The implicit promise of stability lives one level deeper than the schema diff.

How the mapper fell to one cent

Three years ago, when this client first connected to Bol.com, NetSuite was rejecting any sales-order line with a unit price of zero. The accounting team did not want freebies showing as invoices. Reasonable. But the agent occasionally received legitimate giveaway orders, the import would halt, and the original integration engineer (not us) added what looked like a harmless guard:

function mapLinePrice(item) {
  const raw = item.unitPrice;
  // NetSuite rejects zero-value sales lines; use a cent
  // so the line posts and finance can correct it manually.
  return (raw === null || raw === undefined || raw === 0) ? 0.01 : raw;
}

For three years that fallback fired roughly twice a month on genuine promo orders. Finance flagged the cent lines in a Monday review, corrected them by hand, and the system stayed quiet.

When Bol's schema shifted, every order with a promotional discount started returning null for the flat unitPrice. The mapper saw null and returned 0.01. The agent posted a valid sales invoice with a one-cent line. NetSuite accepted it. The agent logged 200 OK. Eleven hours of orders flowed through the same pipe.
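To make the failure concrete, here is a small sketch that runs the legacy mapper against both payload shapes. The function name mapLinePriceLegacy is ours, and the promo-order payload is an illustration built from the description in this post (flat unitPrice set to null, real price only under pricing.unitPrice.amount), not a captured Bol response.

```javascript
// Legacy mapper, as quoted above (renamed for this sketch).
function mapLinePriceLegacy(item) {
  const raw = item.unitPrice;
  return (raw === null || raw === undefined || raw === 0) ? 0.01 : raw;
}

// Shape before the change: flat decimal price.
const beforeChange = {
  orderItem: 'BOL-1234567890-001',
  ean: '8710398502537',
  unitPrice: 24.95,
  quantity: 2,
};

// Illustrative promo order after the change (assumed shape): the flat
// field is null and the real price lives only in the pricing object.
const promoAfterChange = {
  orderItem: 'BOL-1234567890-001',
  ean: '8710398502537',
  unitPrice: null,
  pricing: {
    unitPrice: { amount: 24.95, currency: 'EUR' },
    vatRate: 21.0,
  },
  quantity: 2,
};

const legacyResults = {
  before: mapLinePriceLegacy(beforeChange),
  promoAfter: mapLinePriceLegacy(promoAfterChange),
};
console.log(legacyResults); // { before: 24.95, promoAfter: 0.01 }
```

The second call never throws and never logs anything unusual. The correct price is sitting in the payload, one level down, and the mapper never looks at it.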

By the time we got the call, 1,847 line items had landed at €0.01 against an average of €23.40. The exposure was roughly €43,200 (1,847 × €23.40) of mispriced invoices waiting to leave NetSuite in the next morning's Mollie payment batch.

This pattern, defaulting a missing input to a sentinel so the downstream write does not block, sits in a lot of legacy integrations. It survives because it works under one specific assumption: that the input is missing rarely and conspicuously. The day either of those stops being true, the sentinel becomes the most common value in the column.

Warning

A sentinel default that lets the row post so a human can fix it is a tripwire. The day the rare branch becomes the common one, the human never gets the signal.

The kill-switch that slept eleven hours

Every agent we ship has a circuit breaker. The pattern is standard: if N failures occur inside a rolling window, halt the run, alert Slack and on-call email, and require a manual acknowledgement to resume. Ninety seconds from breach to alert is the design target.
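A minimal sketch of that pattern, for readers who have not built one. This is not the production breaker: the class name, the option names (maxFailures, windowMs, onTrip) and the thresholds are all illustrative, and the alert is a plain callback standing in for Slack and email.

```javascript
// Hypothetical sketch of a rolling-window circuit breaker.
// maxFailures, windowMs and onTrip are illustrative names and values.
class CircuitBreaker {
  constructor({ maxFailures, windowMs, onTrip }) {
    this.maxFailures = maxFailures;
    this.windowMs = windowMs;
    this.onTrip = onTrip;
    this.failureTimes = [];
    this.halted = false;
  }

  // Record one failure; trip when the window holds maxFailures or more.
  recordFailure(reason, now = Date.now()) {
    this.failureTimes.push(now);
    // Drop failures that have aged out of the rolling window.
    this.failureTimes = this.failureTimes.filter(t => now - t < this.windowMs);
    if (!this.halted && this.failureTimes.length >= this.maxFailures) {
      this.halted = true;
      this.onTrip({ reason, failures: this.failureTimes.length });
    }
  }

  // The run loop checks this before every write.
  canRun() {
    return !this.halted;
  }

  // Manual acknowledgement by a human is the only way to resume.
  acknowledge() {
    this.halted = false;
    this.failureTimes = [];
  }
}

const trips = [];
const breaker = new CircuitBreaker({
  maxFailures: 3,
  windowMs: 60_000,
  onTrip: info => trips.push(info), // stand-in for the Slack and email alert
});

breaker.recordFailure('netsuite_5xx', 0);
breaker.recordFailure('netsuite_5xx', 10_000);
breaker.recordFailure('mapper_exception', 20_000);
console.log(breaker.canRun(), trips.length); // false 1
```

The breaker only knows about whatever gets passed to recordFailure. What counts as a failure is decided elsewhere, and that decision is where this incident lived.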

The agent we inherited had one too. It watched three things: HTTP non-2xx responses from Bol.com, HTTP non-2xx responses from NetSuite, and uncaught exceptions inside the mapper. All three counters stayed at zero through the incident. Bol returned 200. NetSuite returned 200. The mapper never threw. It returned 0.01, cleanly, eight times a minute.

The kill-switch was guarding the wrong layer. It was a transport alarm in a business incident. The financial signal, the cliff from €23 to €0.01 in average line value, was visible in NetSuite the whole time and visible to no monitor.

This is the failure mode worth naming. AI agents and dumb integration scripts share it. They are judged at the protocol boundary, and the business meaning of the payload is treated as somebody else's problem. The recent Hacker News thread asking whether an AI assistant increased bugs in rsync arrives at the same place from another angle. If you cannot tell a correct-looking change from a wrong one without running it against reality, your gate is not strict enough.

From discovery to halt in 44 minutes

06:18: client calls. We open the agent's last 200 log lines and the NetSuite saved search she had on screen.

06:23: we confirm the cent pattern is not isolated. Every invoice posted after 19:31 the previous evening carries at least one €0.01 line.

06:31: we pause the agent's cron, revoke the integration role's NetSuite write permission, and freeze the overnight Mollie batch by extending the hold window on the payment-export job.

07:02: agent halted, NetSuite locked, no customer notification has gone out. The damage is contained to 1,847 internal records.

The thing we got right in those 44 minutes was the freeze on the Mollie batch. NetSuite holds invoices for an export window. If the call had landed at 08:30 instead of 06:18, several thousand one-cent payment requests would have hit Mollie and landed in customers' inboxes as confusing order confirmations. Reversing invoices internally is a database job. Apologising to 1,800 customers is a brand job.

What we shipped before lunch

Three things had to land before the next Bol poll at 08:00.

First, the mapper. We replaced the sentinel with an explicit read from the new field, falling back to the legacy field, falling back to a hard refusal:

function mapLinePrice(item) {
  const newPrice = item.pricing?.unitPrice?.amount;
  const legacyPrice = item.unitPrice;
  const price = newPrice ?? legacyPrice;

  if (price === null || price === undefined) {
    throw new MappingError('NO_UNIT_PRICE', {
      orderItem: item.orderItem,
      ean: item.ean,
    });
  }
  if (price < 0.10) {
    throw new MappingError('SUSPICIOUS_UNIT_PRICE', {
      orderItem: item.orderItem,
      price,
    });
  }
  return price;
}

A thrown MappingError now routes the order into a needs_review queue instead of NetSuite. Two genuine giveaway orders surfaced inside the first day. Both correctly flagged. Accounting prefers the queue to the silent cent.
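The routing can be sketched roughly like this. Everything outside the mapper logic is an assumption made for illustration: the MappingError class body, the in-memory array standing in for the needs_review queue, and the postToNetSuite callback are ours, not the production code.

```javascript
// Hypothetical sketch: MappingError, the in-memory queue, and postToNetSuite
// are stand-ins for illustration, not the production code.
class MappingError extends Error {
  constructor(code, context) {
    super(code);
    this.name = 'MappingError';
    this.code = code;
    this.context = context;
  }
}

// Condensed version of the mapper shown above.
function mapLinePriceStrict(item) {
  const price = item.pricing?.unitPrice?.amount ?? item.unitPrice;
  if (price === null || price === undefined) {
    throw new MappingError('NO_UNIT_PRICE', { orderItem: item.orderItem });
  }
  if (price < 0.10) {
    throw new MappingError('SUSPICIOUS_UNIT_PRICE', { orderItem: item.orderItem, price });
  }
  return price;
}

function importItems(items, postToNetSuite) {
  const needsReview = [];
  for (const item of items) {
    try {
      postToNetSuite({ orderItem: item.orderItem, unitPrice: mapLinePriceStrict(item) });
    } catch (err) {
      // Only mapping problems go to the queue; anything else is a real
      // failure and should still reach the transport kill-switch.
      if (!(err instanceof MappingError)) throw err;
      needsReview.push({ code: err.code, ...err.context });
    }
  }
  return needsReview;
}

const posted = [];
const reviewQueue = importItems(
  [
    { orderItem: 'A-001', pricing: { unitPrice: { amount: 24.95, currency: 'EUR' } } },
    { orderItem: 'A-002', unitPrice: null }, // no price anywhere
    { orderItem: 'A-003', unitPrice: 0 },    // giveaway order
  ],
  line => posted.push(line),
);
console.log(posted.length, reviewQueue.map(r => r.code));
// 1 [ 'NO_UNIT_PRICE', 'SUSPICIOUS_UNIT_PRICE' ]
```

One bad line no longer halts the whole import, which was the original reason for the sentinel, and it no longer posts either.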

Second, the reversal. We pulled the 1,847 affected invoices from NetSuite, marked the corresponding Bol order_id values as un-imported in our state table, and re-ran the import against the new mapper. Pricing came back clean from pricing.unitPrice.amount. The Mollie batch released on time the next morning. No customer ever saw a one-cent invoice.

Third, the audit trail. NetSuite gives you internal IDs but not always the upstream order context. We added a custom field custbody_bol_payload_hash (SHA-256 of the raw Bol response) on every invoice the agent writes. If we ever need to argue with a customer about what Bol sent versus what NetSuite posted, the link is one query away.
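Computing the hash is a few lines in Node.js. The sketch below uses the built-in crypto module; the helper name payloadHash and the sample body are illustrative, and how the value is written to the NetSuite record depends on the client library in use, so that part is left out.

```javascript
const { createHash } = require('crypto');

// Hash the raw response body exactly as received. Hashing a re-serialized
// object would change the digest whenever key order or whitespace differs.
function payloadHash(rawBody) {
  return createHash('sha256').update(rawBody, 'utf8').digest('hex');
}

// Illustrative body, not a captured Bol response.
const rawBody = '{"orderItem":"BOL-1234567890-001","unitPrice":null}';
const invoiceFields = {
  // Field ID from this post; setting it on the invoice is not shown here.
  custbody_bol_payload_hash: payloadHash(rawBody),
};
console.log(invoiceFields.custbody_bol_payload_hash.length); // 64
```

The hash only proves anything if the raw body is stored somewhere as well, keyed by the same digest.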

The new gate, in business units

The transport kill-switch stayed. We added two more, both running inside the same process as the agent, both wired to the same 90-second alert path.

A rolling-window value floor:

-- runs every 60s against the agent's own write log
WITH recent AS (
  SELECT line_amount_eur
  FROM agent_writes
  WHERE source = 'bol_orders'
    AND written_at > now() - interval '15 minutes'
)
SELECT
  COUNT(*)               AS n,
  AVG(line_amount_eur)   AS avg_eur,
  MIN(line_amount_eur)   AS min_eur
FROM recent
HAVING COUNT(*) >= 50
   AND AVG(line_amount_eur) < 5.00;

If the average line value across the last fifteen minutes drops below €5 over fifty or more lines, the agent halts and pages on-call. €5 sits below the cheapest SKU this client sells (€7.95) and far above any sentinel. The window is short enough to catch a fresh incident before a quarter of a working day burns.
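The same rule is easy to express in JavaScript, which helps when you want to unit-test a threshold before wiring it to the alert path. This is a sketch under assumptions: the function name, the field names (writtenAt, lineAmountEur) and the options object are illustrative and mirror the SQL above, not the production monitor.

```javascript
// Hypothetical in-memory version of the SQL check above. The field names
// (writtenAt, lineAmountEur) and the options are illustrative.
function checkValueFloor(writes, now, opts = {}) {
  const { windowMs = 15 * 60 * 1000, minLines = 50, floorEur = 5.0 } = opts;
  const recent = writes.filter(w => now - w.writtenAt < windowMs);
  if (recent.length < minLines) {
    return { breach: false, n: recent.length }; // too few lines to judge
  }
  const total = recent.reduce((sum, w) => sum + w.lineAmountEur, 0);
  const avgEur = total / recent.length;
  return { breach: avgEur < floorEur, n: recent.length, avgEur };
}

const now = Date.now();
// 60 one-cent lines, one per ten seconds, all inside the window.
const centLines = Array.from({ length: 60 }, (_, i) => ({
  writtenAt: now - i * 10_000,
  lineAmountEur: 0.01,
}));
// 60 normal lines at the average value quoted in this post.
const normalLines = centLines.map(w => ({ ...w, lineAmountEur: 23.40 }));

console.log(checkValueFloor(centLines, now).breach);   // true
console.log(checkValueFloor(normalLines, now).breach); // false
```

The minimum line count matters as much as the floor. Without it, a quiet night with three cheap orders would page on-call.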

A field-presence assertion against the source schema, running once per poll, before any orders are mapped:

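const sample = orders.slice(0, 10);
// Count orders whose first item carries a populated price in the new shape.
// `!= null` rejects both undefined and null, so a key that is present
// but empty does not count as populated.
const newFieldPresent = sample.filter(
  o => o.orderItems?.[0]?.pricing?.unitPrice?.amount != null
).length;

if (newFieldPresent < sample.length * 0.5) {
  await alert('bol.com schema may have shifted', {
    sample: sample.slice(0, 2),
  });
}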

This one watches Bol from our side. If the shape of their payload changes again, we hear about it from our own monitor, not from accounting at 06:14. Google's SRE book chapter on monitoring makes the same argument from a different vocabulary: the alarms that matter measure the user-visible thing, not the layer underneath.

The kill-switch was not broken. It was watching the wrong thing. Every integration agent should carry at least one alarm that reads in the units of the business: euros per minute, refunds per hour, contracts per day. Protocol-level alarms tell you the pipe is connected. Business-level alarms tell you the right thing is flowing through.

This is true of any agent, not only the integration kind. A chat agent that answers in two seconds with the wrong policy passes every transport check. An invoice-chase agent that emails polite reminders to the wrong customers passes every transport check. The alarm has to read the thing the business actually cares about.

When we took over this Bol.com to NetSuite integration for the client, the gap we ran into was exactly that: the difference between healthy traffic and correct traffic. We closed it by adding a second class of monitor that knows what an order is worth, not just whether it posted. This kind of AI agent work tends to look uneventful when it is working, which is the whole point.

The smallest thing you can do today: open whichever integration agent you trust the most, find the alerting block, and ask whether it would fire if every line item it wrote suddenly cost a cent.

Key takeaway

A protocol-layer kill-switch tells you the pipe is connected. A business-layer kill-switch tells you the right thing is flowing through. Every integration agent needs both.

FAQ

Why did the agent return 200 OK on every failure?

Because nothing failed at the protocol level. Bol returned valid JSON, the mapper produced a valid number (0.01), and NetSuite accepted the invoice. Every transport checkpoint was healthy. Only the business value was wrong.

How did you decide on €5 as the value-floor threshold?

Below the cheapest SKU this client sells (€7.95) and well above any legacy sentinel. Pick a floor that lives between your real minimum and your synthetic defaults. Recalibrate quarterly if the catalogue changes.

Should we trust deprecation windows from upstream APIs?

Treat them as warnings, not guarantees. A field can stay in the response and still go null for a subset of records. Read the field, but also assert it is populated against a sample before you map a full batch.

Would a human code review have caught the sentinel?

Maybe at the time it was written, but it had been quiet for three years. Reviews catch new code, not dormant assumptions. Periodic chaos-style audits against integration mappers find these better than line-by-line reading.
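One hedged sketch of what such an audit could look like: take a known-good fixture, null out one field at a time, and flag any case where the mapper quietly returns a different price instead of failing. The harness name auditMapper and the fixture are illustrative, and a real audit would also cover missing keys, wrong types, and nested fields.

```javascript
// Hypothetical audit harness. For each field of a known-good fixture, set
// it to null and check that the mapper either throws or keeps the price.
// A mapper that returns a different price without throwing is flagged.
function auditMapper(mapper, fixture, expectedPrice) {
  const findings = [];
  for (const field of Object.keys(fixture)) {
    const mutated = { ...fixture, [field]: null };
    try {
      const price = mapper(mutated);
      if (price !== expectedPrice) {
        findings.push({ field, price }); // silent substitution
      }
    } catch (err) {
      // A loud failure is the behaviour we want; nothing to record.
    }
  }
  return findings;
}

// Legacy mapper quoted earlier in this post.
const legacyMapper = item => {
  const raw = item.unitPrice;
  return (raw === null || raw === undefined || raw === 0) ? 0.01 : raw;
};

const fixture = {
  orderItem: 'BOL-1234567890-001',
  ean: '8710398502537',
  unitPrice: 24.95,
  quantity: 2,
};
const auditFindings = auditMapper(legacyMapper, fixture, 24.95);
console.log(auditFindings); // [ { field: 'unitPrice', price: 0.01 } ]
```

Run against the legacy mapper, the audit surfaces the sentinel in one pass, with no need for anyone to remember why the guard was added.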
