PHP
PHP 5.6 ERP migration: hex weights, a silent 32k cap
A custom PHP 5.6 ERP migration to Directus stalled fourteen days when 1998-era hex weights silently truncated at 32.767 kg — and broke 4,200 cheese batches.

On day fifteen of a fourteen-day migration window, our project lead Marije was at a warehouse desk in Gouda watching a touchscreen scale read 47.2 kg. The new ERP — freshly cut over from a custom PHP 5.6 + MySQL 5.1 portal to a Directus + Nuxt stack — recorded the same pallet write-off as 32.767 kg. Same number, every time, no matter what they put on the scale above 32 kilos.
The client is a 29-person kaasgroothandel shipping Gouda, Edam, and Boerenkaas to retailers across the Benelux. Their old portal had run since 1998. The migration was supposed to take two weeks. We were now two weeks and one day in, and 4,200 partij-traceabiliteit records were silently wrong.
The 1998 inventory ledger
The portal's voorraad_correctie table — stock corrections, the ledger that records every gram added or removed from a batch between intake and shipment — had been designed by someone who knew exactly what they were doing in 1998 and exactly nothing about what 2026 would need.
The schema, untouched since the first commit:
CREATE TABLE voorraad_correctie (
id INT(11) NOT NULL AUTO_INCREMENT,
partij_id INT(11) NOT NULL,
delta_g CHAR(8) NOT NULL, -- gram delta as signed hex
reden VARCHAR(40),
user_id INT(11),
ts DATETIME,
PRIMARY KEY (id)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
That CHAR(8) holding a signed hex value is the kind of decision you make when disk is expensive, when your PHP team is sharing a 4 MB shared host, and when you've decided — for reasons that made sense at the time — to encode the sign bit by hand. A typical row looked like:
delta_g = '0000FFEC' -- -20 grams (signed 16-bit two's-complement, sign-extended)
delta_g = '00007FFF' -- 32,767 grams
delta_g = 'FFFF8001' -- -32,767 grams
Where 32,767 quietly comes from
The original PHP wrote those values with a one-liner that's lived in the codebase since the Wim Kok cabinet:
// 1998-era encoder, still in use
function pack_delta(int $grams): string {
// pack as signed short, hex-encode, left-pad to 8 chars
return str_pad(bin2hex(pack('s', $grams)), 8, 'F', STR_PAD_LEFT);
}
The format flag 's' on pack() is a signed short — a 16-bit integer. Range: −32,768 to +32,767. The CHAR(8) column hides this completely: from the outside it looks like eight hex digits with 32 bits of headroom, so you'd assume corrections up to ±2 billion grams were possible. They never were. The upper four hex digits were always either 0000 (positive) or FFFF (negative). The actual data ceiling was 32.767 kg.
For a cheese wholesaler this almost always works. A wheel of Gouda is 4 to 12 kg. A snij-verlies is 30 to 200 grams. The hot path of the business — wheel-level corrections — fits inside the 16-bit window with room to spare.
The cold path doesn't. Pallet-level write-offs (a forklift drops a stack, a customer returns a 48 kg pallet that went off, the cold storage thermostat trips and they cull 80 kg of Boerenkaas) blow straight past the ceiling.
If you can't see your type system, you can't reason about its ceilings. Hex-in-a-CHAR is type erasure with extra steps, and the next person reading the schema will trust it for far more than it can hold.
Why operations never complained
This is the part of the war story that took us longest to understand. The bug had been live for 28 years. Nobody had ever filed a ticket. How?
Three reasons, in descending order of how cheerful they are:
- The original PHP did
pack('s', $grams)with no overflow check. PHP, before strict types, would happily wrap a 50,000-gram correction to−15,536grams. The system accepted it. - The application layer above it ran a sanity rule: any correction whose sign didn't match the operator's reason code (negative delta on a
BIJBOEKENstock-add, for instance) was kicked back to a paper spreadsheet maintained by the warehouse foreman. He kept those numbers in his head and reconciled them at month-end against the dairy supplier's CMR notes. - The foreman retired in March 2024. Nobody told us.
So when our migration script — written against the schema, not against thirty years of institutional workaround — ran hexdec() on the CHAR(8) values and stored them as SMALLINT in Directus (because that's what 32,767 fits in), it faithfully reproduced the silent truncation. Day fifteen, on the warehouse floor, was the first time anyone had ever held the new ERP up against a real scale at pallet scale.
The traceability hit
Of 41,000 corrections in the live table, 4,200 had been silently clamped. Roughly 10%. Most were minor. Forty-six were material — pallet-scale write-offs of single-origin Boerenkaas where the recorded delta differed from the true delta by 15 to 60 kg.
Under Regulation (EC) No 178/2002 — the EU's General Food Law — Article 18 obliges every food business operator to be able to trace any batch one step forward and one step back. For a wholesaler that means: given a partij number, you must be able to reconstruct exactly which wheels left the building, in what condition, to whom, at what weight. With 4,200 records where the recorded stock-correction was wrong, the audit trail didn't reconcile against the outgoing CMR weights. The NVWA — the Dutch food safety authority — was scheduled for a routine inspection in eight weeks.
Food business operators shall be able to identify any person from whom they have been supplied with a food, a feed, a food-producing animal, or any substance intended to be, or expected to be, incorporated into a food or feed.
Regulation (EC) No 178/2002, Article 18
Reconstructing the deltas without rewriting history
The fix was not "edit the bad rows". Audit trails get worse, not better, when you silently rewrite them. The fix was:
- Re-parse every CHAR(8) value as a signed 32-bit integer, not a signed 16-bit one. (For rows written before the bug existed, the upper 16 bits had been correctly sign-extended; the encoder was right, the application's mental model of it was wrong.)
- For each row, compare the parsed delta against the closing stock total recorded in the sibling
partij_voorraad_dagtable for that day. If they agreed, the row was fine. - If they disagreed, the closing stock total — written by a separate code path that read the scale directly — was the ground truth. Compute the corrected delta as
stock_after − stock_before. - Write the correction as a new row in
voorraad_correctiewith reason codeMIGRATIE_HERSTEL_2026, referencing the original row'sid. The original row stays. History stays.
The reconstruction query, after we'd convinced ourselves the schema told only part of the truth:
INSERT INTO voorraad_correctie_v2
(partij_id, delta_g, reden, original_id, ts)
SELECT
v.partij_id,
(d.stock_after_g - d.stock_before_g) - parsed.delta_int AS delta_g,
'MIGRATIE_HERSTEL_2026',
v.id,
NOW()
FROM voorraad_correctie v
JOIN partij_voorraad_dag d
ON d.partij_id = v.partij_id
AND d.dag = DATE(v.ts)
JOIN LATERAL (
SELECT CAST(CONV(v.delta_g, 16, 10) AS SIGNED) AS delta_int
) parsed ON TRUE
WHERE ABS(parsed.delta_int) BETWEEN 32700 AND 32767;
That BETWEEN 32700 AND 32767 clause is the smell. Real-world weight data does not cluster against a numeric boundary unless the boundary is doing something to it. When we ran a histogram of ABS(parsed.delta_int) across the full table, the distribution was log-normal up to about 30,000 grams and then had a vertical spike at exactly 32,767. That spike was the bug, visible in two minutes of SELECT and a bar chart. We should have plotted it on day one.
The audit we should have run on day zero
A schema is a claim. A migration script that trusts the claim and doesn't profile the data is a script that will silently propagate every lie the schema has ever told.
The pre-migration audit that would have caught this in ten minutes:
-- For every numeric or numeric-encoded column,
-- bucket the values by power of two and look for a cliff.
SELECT
FLOOR(LOG2(ABS(CAST(CONV(delta_g, 16, 10) AS SIGNED)) + 1)) AS bit_bucket,
COUNT(*) AS n
FROM voorraad_correctie
WHERE delta_g <> '00000000'
GROUP BY bit_bucket
ORDER BY bit_bucket;
If the histogram has a cliff at bucket 15 (215 = 32,768), you have a 16-bit-coded column. A cliff at bucket 31, a 32-bit-coded column. Either way: that cliff tells you what the data thinks it is, regardless of what the column type says it is.
Before you migrate a legacy database, plot a histogram of every numeric or numeric-encoded column. A cliff at a power-of-two boundary names the bug for you, no source code required.
What we'd do differently on day zero
Three changes, in the order we now do them on every migration:
- Talk to the foreman before the schema. The retired warehouse foreman would have told us about the parallel spreadsheet in twenty minutes. Institutional workarounds are the negative space of every legacy bug.
- Profile every column that looks like a number, even if it's a string. CHAR, VARCHAR, even ENUMs sometimes hide integers. Histogram them. Look for cliffs.
- Write the rollback path before the forward path. Audit-trail repairs must be additive — new rows, not rewrites — and the rollback has to preserve the corrected history.
When we built the Directus + Nuxt portal for this Gouda kaasgroothandel, the thing we ran into was a 1998 hex encoding the schema couldn't see. We solved it by treating the old database as evidence rather than as source-of-truth — a legacy migration approach where we profile first, migrate second, and never edit the audit trail in place.
If you have a PHP 5.x or 7.x portal you're about to move, the smallest thing you can do today is one query per numeric-ish column: a histogram bucketed by power of two. Ten minutes. It will tell you which columns are lying.
Key takeaway
Before you migrate a legacy database, histogram every numeric column by power of two. A cliff at bucket 15 or 31 names the type-width bug for you.
FAQ
Why did the bug go undetected for 28 years?
The original PHP wrapped over-range values silently, and the warehouse foreman maintained a paper spreadsheet for pallet-scale corrections. He retired in March 2024 and the workaround left with him.
Why not just edit the wrong rows in place?
Food traceability law treats the original record as evidence. We added correction rows referencing the originals so the audit trail stays intact under EU 178/2002 Article 18.
How do you find type-width bugs before a migration?
Histogram every numeric or numeric-encoded column bucketed by power of two. A vertical spike at bucket 15 or 31 means a 16- or 32-bit ceiling is silently clamping the data.
Does Directus handle large integer columns by default?
Directus respects whatever type you declare in the underlying database. For weight deltas above a kilogram, use INT or BIGINT — never SMALLINT, and never a string-encoded integer.