WordPress to Hydrogen migration: the ACF repeater trap

A Venlo tyre wholesaler's dealer portal was supposed to go live on day six. We shipped on day eighteen. Here is the ACF Repeater field that ate the gap.

Jacob Molkenboer · Founder · A Brand New Company · 28 May 2026 · 11 min

Day twelve. Venlo, 23:40. The operations manager of a twenty-five-person truck-tyre wholesaler is on a video call, sharing his screen, refreshing the dealer portal we were supposed to ship last Friday. He types in 295/80 R22.5, a Goodyear Marathon LHT, and the dealer-price tier should show €387. It shows €0,00. He tries another size. €0,00. He tries the price for a forty-tyre bulk order on Michelin X Multi D. €0,00. Three out of every four tier prices in the new system are wrong, the system goes live to 340 dealers in seventy-two hours, and we are twelve days into a WordPress-to-Hydrogen migration we estimated at six.

This is the story of an ACF Repeater field set up in 2017 that ate twelve days of our migration sprint, and the audit step we have been running on every legacy migration since.

The stack we walked into

The client ran their dealer portal on WordPress 4.9, WooCommerce 3.5 and PHP 7.0. Hosted at a Dutch VPS shop, kept alive by a freelancer who had moved to Berlin in 2019 and stopped answering email in 2021. Around 1,200 truck-tyre SKUs. Each SKU carried a Staffel, a price tier block, that held, per tyre size, the DOT code, the EU tyre label data (rolling resistance, wet grip, noise) and three or four price brackets depending on order volume. Total: roughly 14,800 individual price tiers.

The price tiers were the product. Without them the portal is a brochure.

The new target stack: Shopify Hydrogen, a Remix-based storefront framework, with the catalogue and price tiers held as Shopify Metaobjects. Hydrogen for the dealer-facing UI, Shopify Admin for the wholesaler's team, Metaobjects for everything WooCommerce had been bending its variants to model.

The first thousand products imported fine

We built the importer the obvious way. Walk every WooCommerce post, pull every meta row out of wp_postmeta, map it through a normaliser, push it to the Shopify Admin GraphQL API as Metaobject entries linked to product variants. We tested on a sample of fifty SKUs. Green. We tested on two hundred. Green. We ran the full 1,200 and watched the logs scroll. No errors. Shopify accepted everything. Validation passed.

The next morning, our front-end engineer wired up the Hydrogen price formatter, pulled up a product page, and saw €0,00 where a price tier should be. We assumed a formatter bug. The formatter was fine.

What ACF Repeater actually stores

If you have never touched ACF Pro's Repeater field directly in the database, this is the part worth knowing. ACF does not give you one tidy row per repeater entry. It gives you:

  • One row holding the repeater count (e.g. staffels set to 6)
  • One row per sub-field per index, with keys like staffels_0_dot_code, staffels_0_eu_label_noise, staffels_1_dot_code, all the way up to staffels_5_…

This is fine. It is how the plugin has worked since launch. Our importer handled it correctly. We unpicked the indexed keys, grouped by repeater, and emitted clean Metaobject entries.
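For readers who have not written this grouping step before, here is a minimal sketch of it. The MetaRow shape and the groupRepeater name are our illustration, not the importer's actual code, and the sketch assumes the repeater name contains no regex special characters.

```typescript
// Hypothetical shape of one wp_postmeta row, already filtered to a single product.
type MetaRow = { key: string; value: string };

// Group flattened repeater rows ("staffels_0_dot_code", ...) into one object
// per repeater index. `field` is the repeater name, e.g. "staffels".
function groupRepeater(rows: MetaRow[], field: string): Record<string, string>[] {
  // The row named exactly like the repeater holds the entry count.
  const countRow = rows.find((r) => r.key === field);
  const count = countRow ? Number.parseInt(countRow.value, 10) : 0;
  if (!Number.isInteger(count) || count < 0) {
    throw new Error(`bad repeater count for ${field}`);
  }

  const entries: Record<string, string>[] = Array.from({ length: count }, () => ({}));
  // Anchored at the start, so underscore-prefixed keys never match.
  const pattern = new RegExp(`^${field}_(\\d+)_(.+)$`);

  for (const row of rows) {
    const match = pattern.exec(row.key);
    if (!match) continue;
    const [, indexText, subField] = match;
    if (indexText === undefined || subField === undefined) continue;
    const entry = entries[Number(indexText)];
    if (!entry) continue; // stale row beyond the current count
    entry[subField] = row.value;
  }
  return entries;
}
```

Because the pattern is anchored, a key such as _staffels_snapshot falls through this function untouched. Whatever happens to it next is decided elsewhere in the pipeline.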

What our importer did not expect was the second set of meta rows.

The bomb the freelancer planted in 2017

Buried inside the theme's functions.php was a helper the original freelancer had written so the in-house team could export a flat snapshot of every Staffel for their accountant's Excel sheet. The helper called get_field('staffels', $post_id), took the resolved nested array, and stored it back into wp_postmeta as a single key, _staffels_snapshot. It ran on every product save, for nine years. In simplified form:

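// Runs on every product save.
add_action('save_post_product', function ($post_id) {
    // get_field() returns the repeater as a nested PHP array.
    $staffels = get_field('staffels', $post_id);
    if (!empty($staffels)) {
        // update_post_meta() serializes arrays, so this lands in wp_postmeta as "a:N:{…}".
        update_post_meta($post_id, '_staffels_snapshot', $staffels);
    }
});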

WordPress's update_post_meta() silently runs any non-scalar value through PHP's serialize() before insertion. A developer reading that line in 2017 sees "store the array". The database sees a string starting with a:6:{ that no SQL tool will parse, no other plugin will respect, and that nobody will think to look for with a LIKE 'a:%' query for nine more years.

So every product carried two representations of the same data: the indexed ACF rows, and a serialized PHP-array snapshot. The snapshot was never read at runtime by the live site. It was a dormant write-only meta key. The accountant had stopped using the export in 2019.

Our importer read every meta key on every product. It did not know the snapshot key was a backup. As far as the importer was concerned, _staffels_snapshot was just another field that needed a home in Shopify.

The normaliser swallowed it as a number

PHP's serialize() emits output that looks like this:

a:6:{i:0;a:4:{s:8:"dot_code";s:4:"2419";s:5:"price";d:387.5;…

That leading a:6: means "array, six elements". Our normaliser had a guard that asked, in effect, "if this looks numeric, treat it as a price." The check was parseFloat(value) followed by an isNaN test. parseFloat("a:6:{…}") returns NaN, which would have been fine. But the data had been piped through a cleanup step that stripped a small set of non-ASCII and low-ASCII prefixes left over from a 2018 CSV round-trip, and the leading a: had been chewed off, leaving 6:{i:0;a:4:{…. parseFloat("6:{i:0;a:4:{…") returns 6.
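The difference is easy to reproduce. The sketch below contrasts the loose parseFloat guard with a strict whole-string check. The function names are ours and the sample string is a shortened, hypothetical version of the chewed payload.

```typescript
// A shortened stand-in for the snapshot after the cleanup step ate the "a:".
const chewed = '6:{i:0;a:4:{s:8:"dot_code";s:4:"2419";}}';

// Loose guard, similar in spirit to the one described above:
// parseFloat reads as many leading numeric characters as it can and stops.
function looseLooksNumeric(value: string): boolean {
  return !Number.isNaN(parseFloat(value));
}

// Strict alternative: the whole string must be a plain decimal number.
function strictDecimal(value: string): number | null {
  const text = value.trim();
  return /^-?\d+(\.\d+)?$/.test(text) ? Number(text) : null;
}

console.log(parseFloat('a:6:{i:0;}')); // NaN, the intact payload is rejected
console.log(parseFloat(chewed));        // 6, the chewed payload passes as a number
console.log(strictDecimal(chewed));     // null
console.log(strictDecimal('387.5'));    // 387.5
```

A strict check like this would have turned the chewed snapshot into a rejected value rather than a price of 6.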

The normaliser handed 6 to the Metaobject importer as the variant price tier value. Shopify accepted it. The actual price tier rows, written by the same importer one millisecond earlier, were overwritten by the snapshot key, which arrived alphabetically later. 14,800 tiers became small integers between zero and twenty.

Hydrogen's price formatter then rounded most of them to €0,00 because the truncated snapshot values started with 0: or unserialised to an empty leading element. A handful of products showed €6,00 or €4,00, which is how we noticed the pattern at all.

Warning

Any migration off a legacy WordPress install with ACF Pro should grep wp_postmeta for serialized payloads before the importer is written. The query is SELECT meta_key, COUNT(*) FROM wp_postmeta WHERE meta_value LIKE 'a:%' GROUP BY meta_key;. If it returns rows, you have hidden serialized data and your importer needs an explicit unserialize stage.

How we found it

Twelve days in, after rewriting the Hydrogen price component twice, swapping currency configurations, adding debug logs to the Metaobject importer, and rolling the whole import back and re-running it on a Saturday, our backend engineer ran one query against the source WordPress database:

SELECT meta_key, LEFT(meta_value, 40) AS preview
FROM wp_postmeta
WHERE meta_value LIKE 'a:%'
LIMIT 20;

The first row came back: _staffels_snapshot | a:6:{i:0;a:4:{s:8:"dot_code"…. He pinged the team channel with a single word: oh.

The fix

Three changes shipped on day fourteen.

First, we added an explicit allowlist to the importer. Every meta key being read had to be on a documented list, sourced from the client's product team, not from SELECT DISTINCT meta_key. The snapshot key was not on the list and was skipped.
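A minimal sketch of what such an allowlist gate can look like. The entries shown are illustrative only: _sku and _price are standard WooCommerce keys, the staffels sub-field names are taken from the examples earlier in this post, and the real list was longer.

```typescript
type MetaRow = { key: string; value: string };

// Illustrative allowlist: exact keys plus patterns for indexed repeater rows.
const ALLOWED_EXACT: ReadonlySet<string> = new Set(['staffels', '_sku', '_price']);
const ALLOWED_PATTERNS: readonly RegExp[] = [
  /^staffels_\d+_(dot_code|price|eu_label_noise)$/,
];

function isAllowed(key: string): boolean {
  return ALLOWED_EXACT.has(key) || ALLOWED_PATTERNS.some((p) => p.test(key));
}

// Split rows into those the importer may read and those it must skip.
// Skipped keys are returned rather than dropped so they can be logged and reviewed.
function partitionByAllowlist(rows: MetaRow[]): { accepted: MetaRow[]; skipped: string[] } {
  const accepted: MetaRow[] = [];
  const skipped: string[] = [];
  for (const row of rows) {
    if (isAllowed(row.key)) accepted.push(row);
    else skipped.push(row.key);
  }
  return { accepted, skipped };
}
```

Returning the skipped keys keeps the decision visible: a key that shows up there and should have been imported becomes a conversation with the product team, not a silent gap.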

Second, we added an unserialize stage for any meta value matching /^a:\d+:\{/, routed to a separate "ACF snapshot" handler that compared the snapshot to the structured ACF rows for that product and logged a mismatch if they disagreed. They never did, but we wanted the audit trail.
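To give a feel for the unserialize stage, here is a small parser for the subset of PHP's serialize() format that appears in this post: arrays, integers, doubles, strings, booleans and null. It is a sketch under stated assumptions, not our production handler. In particular it treats string lengths as character counts, which only matches PHP's byte counts for ASCII data, and it leaves out the comparison against the structured rows.

```typescript
type PhpValue = null | boolean | number | string | { [key: string]: PhpValue };

function unserializePhp(input: string): PhpValue {
  let pos = 0;

  // Read up to (and consume) the next occurrence of `stop`.
  const readUntil = (stop: string): string => {
    const end = input.indexOf(stop, pos);
    if (end === -1) throw new Error(`expected "${stop}" after offset ${pos}`);
    const text = input.slice(pos, end);
    pos = end + stop.length;
    return text;
  };

  // Consume a fixed token or fail.
  const expectToken = (token: string): void => {
    if (!input.startsWith(token, pos)) {
      throw new Error(`expected "${token}" at offset ${pos}`);
    }
    pos += token.length;
  };

  const readValue = (): PhpValue => {
    const tag = input.slice(pos, pos + 2);
    pos += 2;
    switch (tag) {
      case 'N;':
        return null;
      case 'b:':
        return readUntil(';') === '1';
      case 'i:':
      case 'd:':
        return Number(readUntil(';'));
      case 's:': {
        const length = Number(readUntil(':'));
        expectToken('"');
        const text = input.slice(pos, pos + length); // ASCII assumption
        pos += length;
        expectToken('";');
        return text;
      }
      case 'a:': {
        const count = Number(readUntil(':'));
        expectToken('{');
        const result: { [key: string]: PhpValue } = {};
        for (let i = 0; i < count; i++) {
          const key = readValue();
          result[String(key)] = readValue();
        }
        expectToken('}');
        return result;
      }
      default:
        throw new Error(`unsupported tag "${tag}" at offset ${pos - 2}`);
    }
  };

  const value = readValue();
  if (pos !== input.length) throw new Error(`trailing data at offset ${pos}`);
  return value;
}
```

PHP arrays come back as plain objects keyed by index, so a snapshot entry is reachable as result['0'].dot_code. Objects (the O: tag) are deliberately unsupported and throw.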

Third, the normaliser stopped coercing values to numbers without an explicit field-type contract. A meta value is a price only if the field definition says it is a price. parseFloat is no longer trusted to detect a price. In code, the new normaliser looks roughly like this:

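// Each meta key is paired with a contract that says how its value may be parsed.
type FieldContract =
  | { kind: 'price'; currency: 'EUR' }
  | { kind: 'string'; maxLength: number }
  | { kind: 'enum'; values: readonly string[] }
  | { kind: 'snapshot'; ignore: true };

function decodeMeta(key: string, raw: string, contract: FieldContract): number | string | null {
  // Known snapshot keys are skipped outright.
  if (contract.kind === 'snapshot') return null;
  // Any other PHP-serialized array is a surprise: fail instead of guessing.
  if (/^a:\d+:\{/.test(raw)) {
    throw new Error(`unexpected PHP serialized value at ${key}`);
  }
  if (contract.kind === 'price') {
    const text = raw.trim();
    // Number('') is 0, so an empty value must be rejected before conversion.
    const n = text === '' ? NaN : Number(text.replace(',', '.'));
    if (!Number.isFinite(n)) {
      throw new Error(`expected price at ${key}, got ${raw.slice(0, 40)}`);
    }
    return n;
  }
  // string and enum paths follow the same pattern: contract dictates parser
  return raw;
}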

Every key in the importer manifest now has a kind. New keys cannot enter the pipeline without one. The throw on unexpected serialized values means that if a future plugin reintroduces the snapshot pattern, the next import fails loudly instead of silently overwriting prices.

We re-ran the import. The 14,800 tiers landed correctly. Day eighteen, the portal went live. The operations manager texted us a screenshot of his dealer pricing page at 06:12 with the message: klopt nu ("it's correct now").

We left both systems running side by side for the next two weeks. Every night a job dumped a diff between the WooCommerce database and the Shopify catalogue. The first three nights surfaced eleven small discrepancies, mostly trailing whitespace inside DOT codes, that the original migration had carried through faithfully. None affected pricing. By night ten the diff was empty and the in-house team turned the old VPS off.
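The core of such a nightly diff is small. The sketch below assumes both sides have already been exported to maps keyed by a stable identifier (something like SKU plus tier index); the key format, the Discrepancy type and the diffTiers name are hypothetical.

```typescript
type Discrepancy =
  | { key: string; kind: 'missing-in-target' | 'missing-in-source' }
  | { key: string; kind: 'whitespace-only' | 'value-changed'; source: string; target: string };

// Compare two exports of the same data, keyed identically on both sides.
function diffTiers(
  source: ReadonlyMap<string, string>,
  target: ReadonlyMap<string, string>,
): Discrepancy[] {
  const found: Discrepancy[] = [];

  source.forEach((sourceValue, key) => {
    const targetValue = target.get(key);
    if (targetValue === undefined) {
      found.push({ key, kind: 'missing-in-target' });
    } else if (targetValue !== sourceValue) {
      // Separate cosmetic differences from real ones so pricing issues stand out.
      const kind = targetValue.trim() === sourceValue.trim() ? 'whitespace-only' : 'value-changed';
      found.push({ key, kind, source: sourceValue, target: targetValue });
    }
  });

  target.forEach((_value, key) => {
    if (!source.has(key)) found.push({ key, kind: 'missing-in-source' });
  });

  return found;
}
```

Classifying whitespace-only differences separately is a design choice that keeps the morning read short: anything marked value-changed deserves attention first.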

What this taught us about legacy WordPress migrations

The mistake was not the freelancer's helper. The helper was reasonable code, in context, in 2017. The mistake was reading every meta key on every product without asking what each key was for. WordPress's wp_postmeta is an open dumping ground. Nine years of plugins, themes and one-off helpers write to it. By 2026 most production WordPress databases we audit carry a meaningful chunk of meta rows that nothing on the live site ever reads. We have seen sites where it is the majority.

The fastest audit step we know is the serialized-payload grep above. The second-fastest is asking the in-house team for a list of meta keys they actually care about and treating everything else as suspect.

An inventory query that takes ninety seconds

Before any import code is written, we now run one query against the source database and read the result top to bottom:

SELECT
  meta_key,
  COUNT(*)                              AS total_rows,
  ROUND(AVG(LENGTH(meta_value)))        AS avg_bytes,
  MAX(LENGTH(meta_value))               AS max_bytes,
  SUM(meta_value LIKE 'a:%')            AS serialized_rows
FROM wp_postmeta
GROUP BY meta_key
ORDER BY total_rows DESC
LIMIT 50;

The Venlo install held 312 distinct keys in total; the LIMIT 50 only keeps the first read short. The top twenty accounted for 88% of the rows. Three of those top twenty, including _staffels_snapshot, were write-only: nothing in the active theme or any installed plugin ever read them. We confirmed by grepping the codebase for each key. The other two turned out to be a legacy Yoast SEO cache from a plugin removed in 2020 and a half-finished CSV import staging field from 2019. Both could be dropped without affecting the live site.

The recent Hacker News discussion on building reliable agentic systems landed on a similar principle for AI pipelines: do not trust the input shape, constrain it. A migration importer is an agent of a different kind, but the rule transfers. Validate at the boundary. Default to refuse. Make the safe path explicit.

The smoke test that caught the next one

We folded a step into our standard playbook after this project. After every import, the migration runner pulls fifty random SKUs end to end. Shopify Admin, Hydrogen storefront, dealer-portal render. It asserts that the price tier displayed in the browser equals the price tier in the source WordPress database. Not the source CSV. Not the source GraphQL response. The browser, against the original DB.
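The browser automation itself is out of scope here, but the two pieces around it are simple enough to sketch: picking the sample, and comparing what the page shows with what the source database holds. All names below are hypothetical, and the parser assumes Dutch-style formatting (dot for thousands, comma for decimals), as in €387,50.

```typescript
// Pick `size` SKUs without replacement (partial Fisher-Yates shuffle).
// `random` is injectable so a run can be reproduced.
function sampleSkus(
  skus: readonly string[],
  size: number,
  random: () => number = Math.random,
): string[] {
  const pool = skus.slice();
  const n = Math.min(size, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(random() * (pool.length - i));
    const picked = pool[j] as string;
    pool[j] = pool[i] as string;
    pool[i] = picked;
  }
  return pool.slice(0, n);
}

// Turn rendered text such as "€ 1.234,50" into 1234.5.
function parseDisplayedEuro(text: string): number {
  const cleaned = text.replace(/[^\d.,-]/g, ''); // drop currency sign and spaces
  const normalised = cleaned.replace(/\./g, '').replace(',', '.');
  if (!/^-?\d+(\.\d+)?$/.test(normalised)) {
    throw new Error(`cannot read a price from "${text}"`);
  }
  return Number(normalised);
}

// Compare source prices with the text scraped from the rendered page, in cents.
function findMismatches(
  sample: readonly string[],
  sourcePrices: ReadonlyMap<string, number>,
  displayed: ReadonlyMap<string, string>,
): string[] {
  const failures: string[] = [];
  for (const sku of sample) {
    const expected = sourcePrices.get(sku);
    const shown = displayed.get(sku);
    if (expected === undefined || shown === undefined) {
      failures.push(`${sku}: missing on one side`);
      continue;
    }
    const actual = parseDisplayedEuro(shown);
    if (Math.round(expected * 100) !== Math.round(actual * 100)) {
      failures.push(`${sku}: source ${expected}, browser shows ${shown}`);
    }
  }
  return failures;
}
```

A run would call sampleSkus(allSkus, 50), collect the rendered price text for each SKU with whatever browser tooling is in use, and fail the migration if findMismatches returns anything.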

Two weeks after the Venlo portal went live, that test caught a second ACF Repeater on a different field (banden_certificeringen) where the same freelancer had written the same snapshot pattern. The fix took an hour because we already had the unserialize handler.

When we rebuilt that dealer portal for the tyre wholesaler, the real win was not Hydrogen or Remix. It was the audit step. We have added the serialized-payload grep and the end-to-end smoke test to every legacy migration we quote. If your stack has a wp_postmeta table older than three years, run the grep tonight before you do anything else.

Key takeaway

Before writing a single line of WordPress importer code, grep wp_postmeta for serialized arrays. The bomb in your migration is almost always a write-only key no one remembers.

FAQ

Why does ACF Pro store Repeater data as separate meta rows?

ACF flattens the repeater on save: one row holds the row count, and each sub-field at each index gets its own row in wp_postmeta. It has worked this way since launch, which is why migration importers need to group keys by index.

How do I detect serialized PHP arrays in a WordPress database?

Run SELECT meta_key, COUNT(*) FROM wp_postmeta WHERE meta_value LIKE 'a:%' GROUP BY meta_key against the source DB. Any rows returned are serialized payloads your importer must handle explicitly.

Is this issue specific to Shopify Hydrogen?

No. The same misread happens with any importer that treats wp_postmeta as a flat key-value store. Hydrogen only made it visible because the price formatter rounded the corrupted values to €0,00 on screen.

How long does a typical WooCommerce-to-Shopify Hydrogen migration take?

For a 1,000 to 2,000 SKU catalogue with custom fields, plan three to six weeks of engineering before launch, plus a two-week parallel-running period for the in-house team to verify pricing and reporting.

Should we keep ACF Pro alive on the new stack or rebuild the field model?

Rebuild it. The Repeater shape works because WordPress is permissive; in a typed Metaobject world you want explicit field definitions, validation at the boundary, and no dormant write-only columns from old export helpers.

wordpress · migration · legacy sites · php · case study · e-commerce
