Migration

PHP 5.4 to Payload CMS: a seven-week shadow cutover

A 16-year-old PHP 5.4 portal, 14,200 archive pages, a Mollie iDEAL mandate handshake, and a hard deadline. Here is the seven-week shadow cutover that landed it.

Jacob Molkenboer · Founder · A Brand New Company · 18 Jun 2026 · 10 min


The mysqldump came in at 4.1 GB. Eighty-seven tables, three of them spelled with a typo that someone in 2011 had decided not to fix because "queries already reference it." Passwords stored in unsalted SHA-1. A Mollie integration built against the v1 API, which Mollie themselves had told everyone to stop using somewhere around 2018. The host had given the client a hard cutoff: the FreeBSD 9 jail running PHP 5.4 would be powered off the last Friday of Q2.

The client is a 25-person maritime publisher (a scheepvaart-uitgever) in Groningen: three centuries of harbour news, fourteen thousand archive pages, and roughly eight thousand paying subscribers whose monthly direct debit pays the mortgage on the office above the Noorderhaven. The portal had been running on the same custom PHP stack since 2010. Nobody had touched the auth layer in nine years. Two of the three original developers were dead, and the third had moved to Norway and politely declined to talk about it.

Seven weeks. New stack: Payload CMS on Postgres, Next.js 15 in front, Mollie's modern recurring-mandaat flow at the till. No re-authorization for the existing subscribers. No URL changes. No archive losses. Here is what the seven weeks actually looked like.

Week 0: read before you write

The hardest week of any old-PHP migration is the one you do before Week 1, when you read the codebase and the database and resist the urge to start writing the new one. We did this for six working days. By the end we had a printed dependency graph, a list of every cron, and a one-page document we called the surface: the contract the legacy system actually exposed to the world.

URL surface: 14,200 archive URLs of the form /zeebrieven/{jaar}/{nummer}/{slug}.html, all indexed by Google and linked from a dozen maritime forums.
Auth surface: a session cookie called SHPSESS, mapped against a sessies table that had not been TRUNCATEd since 2017.
Payments surface: a single Mollie mandate per subscriber, stored as mandaat_id on the abonnees table, plus a nightly cron that reconciled the previous day's SEPA file.
Reading-history surface: an INSERT on every page view into paginabezoek. 91 million rows. No primary key worth the name. We will come back to this.

We took the surface document to the client and asked one question: "Which of these are we allowed to change, and which not?" The answer matters more than any architecture diagram. We were allowed to change SHPSESS (nobody depended on it). We were not allowed to change the archive URLs (forty percent of organic traffic). We were definitely not allowed to ask anyone to re-do iDEAL.

Week 1: schema mapping to Payload collections

Payload's data model is collections of typed documents, not normalised joins. Mapping 87 MySQL tables down to a workable set of collections is the step where most teams overthink. We did it in one whiteboard session.

The rules we followed:

Anything the editorial team touches becomes a collection: zeebrieven, auteurs, rubrieken, uitgaven.
Anything billing-related becomes its own collection with a strict access policy: abonnees, mandaten, incasso-runs.
Anything append-only and high-volume, i.e. the 91-million-row reading history, does not live in Payload. It goes to a separate Postgres schema with proper indexes, queried via a custom Payload endpoint.
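The three rules above can be read as a small classification function. This is a minimal sketch, not the project's code: the `TableProfile` shape, the `review-by-hand` fallback, and the ten-million-row cutoff for "high-volume" are all assumptions made for illustration.

```typescript
// Hypothetical description of one legacy MySQL table.
type TableProfile = {
  name: string
  editedByEditors: boolean // does the editorial team touch it?
  billingRelated: boolean  // subscribers, mandates, debit runs
  appendOnly: boolean      // rows are inserted, never updated
  approxRows: number
}

type Destination =
  | 'editorial-collection'
  | 'billing-collection'
  | 'side-schema'
  | 'review-by-hand'

// Assumed cutoff for "high-volume"; pick your own.
const HIGH_VOLUME_ROWS = 10_000_000

function destinationFor(t: TableProfile): Destination {
  // Rule 3 is checked first: append-only, high-volume data stays out of Payload.
  if (t.appendOnly && t.approxRows >= HIGH_VOLUME_ROWS) return 'side-schema'
  // Rule 2: billing data gets its own collection with a strict access policy.
  if (t.billingRelated) return 'billing-collection'
  // Rule 1: anything the editors touch becomes a collection.
  if (t.editedByEditors) return 'editorial-collection'
  // Sessions, caches, and dead tables need a human decision.
  return 'review-by-hand'
}
```

Run over the full table list, a function like this turns the whiteboard session into a checklist: every table either has a destination or lands in the review pile.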

Seven collections. One side table. One service. We wrote the Payload config and seeded with a thousand rows of representative data on the second day, then ran the editorial team through a click-test by Friday. The senior editor pointed at exactly one missing field (a vlagstaat enum for ship registry) and we added it in ten minutes. That meeting paid for itself ten times over in week six.

Week 2: the Mollie mandaat handshake

This is the part that scares everyone, and rightly so. The portal had 7,914 active recurring subscriptions paid through Mollie's iDEAL-into-SEPA-mandate flow. If we asked any of them to re-authorize, we would lose between five and fifteen percent of the file overnight. The CFO had been very clear about this.

The good news, which is not advertised loudly enough: a Mollie mandate is portable across your own API integrations. The mandate_id is stable. The bank-side mandate (the SEPA-mandaat reference you actually present to the customer's bank) lives at Mollie, not in your code. You can rebuild the entire integration on top of the same set of mandates as long as your customer IDs still match.

So we did exactly that. We read every mandaat_id out of the old database, matched it against Mollie's v2 customer record (the IDs were unchanged; the v1 deprecation never deleted them), and stored both on the new abonnees collection.

// payload/collections/Abonnees.ts
import type { CollectionConfig } from 'payload'
import { isOwnerOrEditor } from '../access/isOwnerOrEditor' // project helper; import path assumed

export const Abonnees: CollectionConfig = {
  slug: 'abonnees',
  // Subscribers see only their own record; editors see all of them.
  access: { read: isOwnerOrEditor, update: isOwnerOrEditor },
  fields: [
    { name: 'email', type: 'email', required: true, unique: true },
    // Both Mollie IDs are carried over unchanged from the legacy database.
    { name: 'mollieCustomerId', type: 'text', required: true, index: true },
    { name: 'mollieMandateId',  type: 'text', required: true, index: true },
    { name: 'mandaatStatus', type: 'select', options: ['valid', 'pending', 'invalid'] },
    { name: 'legacyAbonneeNr', type: 'number', index: true }, // for the 301s
    { name: 'incassoIntervalMaanden', type: 'number', defaultValue: 1 },
  ],
}

Before we wrote a single new charge, we ran a one-off script that called GET /v2/customers/:id/mandates for all 7,914 customers and checked whether each stored mandate came back with status: "valid". Forty-one did not: cancelled bank accounts, deceased subscribers, and one mandate that had never been signed properly in 2014. We flagged those forty-one for the customer-service team to handle by hand. Everyone else continued paying without ever knowing the back end had changed.
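A read-back check along those lines might look like the sketch below. It is not the script we ran: the function names and the `FetchLike` stand-in are assumptions, and it reads only the first page of results, which is enough when a customer has a handful of mandates but would need pagination handling otherwise.

```typescript
// Only the mandate fields this check needs.
type MollieMandate = { id: string; status: 'valid' | 'pending' | 'invalid' }

// Structural stand-in for fetch, so the sketch does not depend on a runtime.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>

// Lists a customer's mandates via GET /v2/customers/:id/mandates.
async function fetchMandates(
  http: FetchLike,
  apiKey: string,
  customerId: string,
): Promise<MollieMandate[]> {
  const res = await http(
    `https://api.mollie.com/v2/customers/${customerId}/mandates`,
    { headers: { Authorization: `Bearer ${apiKey}` } },
  )
  if (!res.ok) throw new Error(`Mollie returned ${res.status} for ${customerId}`)
  const body = (await res.json()) as { _embedded?: { mandates?: MollieMandate[] } }
  return body._embedded?.mandates ?? []
}

// True only if the mandate ID we stored is present and currently valid.
function isStoredMandateValid(
  storedMandateId: string,
  mandates: MollieMandate[],
): boolean {
  return mandates.some((m) => m.id === storedMandateId && m.status === 'valid')
}
```

Calling `fetchMandates(fetch, apiKey, abonnee.mollieCustomerId)` and passing the result to `isStoredMandateValid` gives one boolean per subscriber; the false ones form the list for customer service.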

Warning

Do not call Mollie's POST /v2/payments with sequenceType: recurring until you have read the mandate back and confirmed status: valid. A failed first recurring charge costs you €0,30, a customer-service email, and a confidence hit you do not need in week two.
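One way to make that rule hard to break is to refuse to build the request body unless a freshly read status is passed in. The sketch below assumes amounts are kept in euro cents and that the caller has just read the mandate back; `buildRecurringPayment` and its parameter names are hypothetical.

```typescript
type MandateStatus = 'valid' | 'pending' | 'invalid'

// Subset of the POST /v2/payments body used for a recurring charge.
type RecurringPaymentBody = {
  amount: { currency: 'EUR'; value: string } // string with two decimals
  description: string
  sequenceType: 'recurring'
  customerId: string
  mandateId: string
}

function buildRecurringPayment(
  abonnee: { mollieCustomerId: string; mollieMandateId: string },
  freshStatus: MandateStatus, // status read back from Mollie just before charging
  amountCents: number,
  description: string,
): RecurringPaymentBody {
  // Refuse to produce a request at all for anything but a valid mandate.
  if (freshStatus !== 'valid') {
    throw new Error(`Mandate ${abonnee.mollieMandateId} is ${freshStatus}; not charging`)
  }
  if (!Number.isInteger(amountCents) || amountCents <= 0) {
    throw new Error(`Invalid amount in cents: ${amountCents}`)
  }
  return {
    amount: { currency: 'EUR', value: (amountCents / 100).toFixed(2) },
    description,
    sequenceType: 'recurring',
    customerId: abonnee.mollieCustomerId,
    mandateId: abonnee.mollieMandateId,
  }
}
```

Because the status is a required argument, a charge job cannot be written without someone deciding where that status comes from.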

Week 3: the archive import and URL preservation

14,200 archive pages, all flat HTML on disk, all crawlable, all linked from elsewhere. We had three jobs: extract the content cleanly, store it in Payload, and serve the same URLs from Next.js with the same response codes.

Extraction was a Node script using node-html-parser against the flat-file directory. We mapped each file to one zeebrieven document, captured { jaar, nummer, slug, titel, publicatiedatum, lichaam, auteur }, and uploaded inline images to the Payload media collection. The body was stored as Lexical JSON, not HTML, because we wanted the editors to be able to edit historical pieces and we did not want two formats in the same field.
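The first step of such a script is turning a file path into the key fields. A minimal sketch, assuming the directory layout on disk mirrors the public URL form /zeebrieven/{jaar}/{nummer}/{slug}.html; the function name and the example slug in the comment are made up.

```typescript
type ArchiefSleutel = { jaar: number; nummer: number; slug: string }

// Matches e.g. "/zeebrieven/1998/12/storm-op-de-eems.html" (hypothetical file).
// The leading slash and the .html extension are both optional.
const ARCHIEF_PAD = /^\/?zeebrieven\/(\d{4})\/(\d+)\/([^\/]+?)(?:\.html)?$/

// Returns the lookup key for a path, or null if the path is not an archive page.
function parseArchiefPad(pad: string): ArchiefSleutel | null {
  const m = ARCHIEF_PAD.exec(pad)
  if (!m) return null
  return { jaar: Number(m[1]), nummer: Number(m[2]), slug: m[3] }
}
```

Files that return null are worth logging rather than skipping silently: they are either not archive pages or they break the naming pattern, and both cases deserve a look before import.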

The URL rule was a single route in the Next.js app router:

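// app/zeebrieven/[jaar]/[nummer]/[slug]/page.tsx
import { notFound } from 'next/navigation'
import { getPayload } from 'payload'
import config from '@payload-config'
import { Archiefartikel } from '@/components/Archiefartikel' // project component; path assumed

type Params = { jaar: string; nummer: string; slug: string }

// In Next.js 15, params is a Promise and must be awaited.
export default async function ArchiefPagina({ params }: { params: Promise<Params> }) {
  const { jaar, nummer, slug } = await params
  const payload = await getPayload({ config })
  const res = await payload.find({
    collection: 'zeebrieven',
    where: {
      and: [
        { jaar:   { equals: Number(jaar) } },
        { nummer: { equals: Number(nummer) } },
        // The indexed URLs end in .html; strip it before the lookup.
        { slug:   { equals: slug.replace(/\.html$/, '') } },
      ],
    },
    limit: 1,
  })
  if (!res.docs[0]) notFound()
  return <Archiefartikel doc={res.docs[0]} />
}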

The trailing .html was important. Google had the .html indexed. We kept it. The Next.js route accepted the trailing extension and stripped it before the database lookup. No 301 chain, no rewrite rules at the edge: one route, one query, identical URLs.

Week 4: reading history without the 91-million-row table

The legacy paginabezoek table was unindexed, ungovernable, and growing at roughly four million rows a month. The product team used it for exactly one thing: "show me what I read last."

We did not migrate it. We started a new table in a separate Postgres schema, partitioned by month, with a composite index on (abonnee_id, gelezen_op DESC). We wrote a Payload endpoint at /api/leesgeschiedenis that returned the most recent 50 reads. We archived the old data to S3 as monthly Parquet files and pointed the editorial team to a Metabase dashboard in case they ever asked.
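For readers who have not set up declarative partitioning before, the shape is roughly as follows. This is a sketch under assumptions: the schema name `leesgeschiedenis`, the table name `gelezen`, the `zeebrief_id` column, and the helper function are all hypothetical; only the partition key and the composite index come from the description above.

```typescript
// Parent table plus the composite index, as a DDL string to run once.
const PARENT_DDL = `
CREATE TABLE leesgeschiedenis.gelezen (
  abonnee_id   integer     NOT NULL,
  zeebrief_id  integer     NOT NULL,
  gelezen_op   timestamptz NOT NULL
) PARTITION BY RANGE (gelezen_op);

CREATE INDEX gelezen_abonnee_recent
  ON leesgeschiedenis.gelezen (abonnee_id, gelezen_op DESC);
`

// Builds the DDL for one monthly partition; maand is 1-12.
function maandPartitieDDL(jaar: number, maand: number): string {
  if (!Number.isInteger(maand) || maand < 1 || maand > 12) {
    throw new Error(`maand must be 1-12, got ${maand}`)
  }
  const pad = (n: number) => String(n).padStart(2, '0')
  const van = `${jaar}-${pad(maand)}-01`
  // December rolls over into January of the next year.
  const tot = maand === 12 ? `${jaar + 1}-01-01` : `${jaar}-${pad(maand + 1)}-01`
  return (
    `CREATE TABLE leesgeschiedenis.gelezen_${jaar}_${pad(maand)} ` +
    `PARTITION OF leesgeschiedenis.gelezen ` +
    `FOR VALUES FROM ('${van}') TO ('${tot}');`
  )
}
```

A monthly job can call `maandPartitieDDL` for the upcoming month so the partition exists before the first read lands in it. The "last 50 reads" query then filters on abonnee_id and orders by gelezen_op DESC, which is the access pattern the composite index is built for.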

No one has asked.

Week 5: shadow traffic on a single edge

Shadow traffic is the technique where you send a fraction of real production traffic to the new system, compare its responses against the old, and write the difference to a log. You do not show the new response to the user. You are only testing the new system against reality.

We did it at the CDN edge with a small worker:

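// edge worker (Cloudflare-style)
export default {
  async fetch(req, env, ctx) {
    const url = new URL(req.url)
    const path = url.pathname + url.search // keep the query string
    const oldResp = await fetch('https://legacy.scheepspublisher.nl' + path, req)
    // Shadow only body-less GETs, and only a 10% sample of them.
    if (req.method === 'GET' && Math.random() < 0.10) {
      // Clone now: once oldResp is returned, its body is streamed to the user.
      const oldCopy = oldResp.clone()
      ctx.waitUntil((async () => {
        const newResp = await fetch('https://new.scheepspublisher.nl' + path, req)
        await env.SHADOW_LOG.put(crypto.randomUUID(), JSON.stringify({
          path,
          oldStatus: oldCopy.status,
          newStatus: newResp.status,
          oldHash: await sha1(await oldCopy.text()),
          newHash: await sha1(await newResp.text()),
        }))
      })())
    }
    return oldResp // the user only ever sees the legacy response
  },
}

// Hex-encoded SHA-1 of a string, via Web Crypto.
async function sha1(text) {
  const digest = await crypto.subtle.digest('SHA-1', new TextEncoder().encode(text))
  return [...new Uint8Array(digest)].map((b) => b.toString(16).padStart(2, '0')).join('')
}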

By the end of week 5 we had logged 1.4 million shadow requests. 99.6% matched on status code. The 0.4% that did not split into three groups: malformed legacy URLs that the old PHP routed through a 200-and-error page (we returned a clean 404), Googlebot probing for /wp-login.php (we returned 404, the legacy returned a custom error template at 200), and a handful of paginated archive listings where the legacy had off-by-one pagination. The third group was the only one that mattered, and we fixed it in an afternoon.
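Getting from raw log entries to a figure like that takes only a small aggregation. A sketch, assuming the entries have the shape the worker writes; the function name and the "old->new" grouping key are illustrative choices, not our tooling.

```typescript
// Shape of one record written by the shadow worker.
type ShadowEntry = {
  path: string
  oldStatus: number
  newStatus: number
  oldHash: string
  newHash: string
}

// Computes the status-code match rate and counts mismatches per transition.
function summariseShadowLog(entries: ShadowEntry[]) {
  const mismatches = new Map<string, number>() // e.g. "200->404" -> count
  let matched = 0
  for (const e of entries) {
    if (e.oldStatus === e.newStatus) {
      matched++
      continue
    }
    const key = `${e.oldStatus}->${e.newStatus}`
    mismatches.set(key, (mismatches.get(key) ?? 0) + 1)
  }
  // An empty log counts as fully matching rather than dividing by zero.
  const matchRate = entries.length === 0 ? 1 : matched / entries.length
  return { total: entries.length, matchRate, mismatches }
}
```

Grouping by transition is what makes the mismatches readable: a pile of "200->404" entries points at legacy soft-error pages, while anything ending in a 5xx is the new system's problem.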

Week 6: canary, on real users this time

With the shadow logs clean, we flipped the same edge worker to canary mode: 5% of real users actually got the new response, the rest stayed on legacy. We watched Mollie webhook events, login success rate, and time-to-first-byte. After 72 hours at 5% with no anomalies, we went to 25%. After another 48 hours, 50%.
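One design choice worth considering for the canary is sticky assignment, so a given visitor does not bounce between old and new on every request. The sketch below is an assumption about how that could be done, not a description of our worker: it hashes a stable visitor identifier (a cookie value, say) with FNV-1a over the string's UTF-16 code units and compares the bucket to the current percentage.

```typescript
// FNV-1a 32-bit hash: small, dependency-free, and deterministic.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5 // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i)
    hash = Math.imul(hash, 0x01000193) >>> 0 // multiply by FNV prime, keep unsigned
  }
  return hash
}

// True if this visitor falls inside the canary cohort at the given percentage.
function inCanary(visitorId: string, percent: number): boolean {
  if (percent <= 0) return false
  if (percent >= 100) return true
  // Bucket 0-99; visitors below the threshold get the new system.
  return fnv1a(visitorId) % 100 < percent
}
```

Because the bucket for a visitor never changes, raising the percentage from 5 to 25 to 50 only adds people to the cohort: everyone who was already on the new system stays there.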

The only real-world surprise was at 25%: a single Sectigo TLS intermediate had been pinned in an old Android app belonging to the editorial team. Nobody had told us about the app. We rolled the canary back to 0 for that user agent, set up a stub on a separate hostname for the app to keep working, and pushed the canary back up. Two hours of degraded service for four people, none of them subscribers.

Week 7: cutover, and what you do at 23:00

At 23:00 on the last Tuesday we set the edge worker to 100% new. By 23:05 we had confirmed login, paid-content access, mandate lookup, and archive routing were all clean against synthetic monitors. The legacy box stayed online, in read-only mode, behind a private hostname, for another seven days. Then the host pulled the plug.

The morning after, the editor-in-chief sent one Slack message: "De zoekfunctie is sneller. Niemand heeft gebeld." In English: "The search is faster. Nobody has called." That, in publishing, is the only success metric that matters.

Takeaway

The Mollie mandate is portable. The URL surface is sacred. The reading history can be quietly archived. Everything else is just a Payload collection and a Next.js route.

Three things we would do differently

We spent too long on the editorial UI in week 1. Payload's defaults are good enough that the editors were happy on day one. We polished fields nobody used until week three. Next time we will ship the stock Payload admin first and only customise based on real complaints.

We should have written the SEPA reconciliation tests before the Mollie webhook handler. We wrote them after, when a single batch return file with an unusual R-transaction code caught us out. Mollie's recurring payments documentation has the full list. Read it before you write a line of webhook code.

We should have brought the customer-service lead into Week 0. She knew the forty-one bad mandates already. She just had not been asked.

The five-minute audit you can run today

If you are sitting on your own old-PHP-and-MySQL portal and the host has not yet handed you a deadline, the first thing to do is not to choose a stack. It is to write your own surface document. Open one page. List your URL surface, your auth surface, your payments surface, and your highest-volume data table. For each, write a single sentence: "Are we allowed to change it?" That document is what tells you how much work the migration is. Everything before that is speculation.

When we did this legacy migration for the Groningen team, the hardest problem was the Mollie v1-to-v2 mandate carry-over. We solved it by reading every mandate back from Mollie before creating a single charge, and by treating the forty-one invalid ones as a customer-service task, not a code task.

Key takeaway

The Mollie mandate ID is portable across your own integrations: read it back, verify it, and rebuild around it without forcing a single re-authorization.

FAQ

Why Payload CMS rather than WordPress or Strapi for a subscription portal?

Payload's typed collections and code-first access rules make it straightforward to lock down paid content. WordPress's plugin ecosystem becomes your security surface; we wanted to own that ourselves.

Can you really preserve Mollie recurring mandates across an integration rebuild?

Yes. The mandate ID is stable on Mollie's side. Read it back via the v2 customers API, confirm status is valid, and existing subscribers never see a re-authorization prompt.

What if legacy URLs include a trailing .html or odd casing?

Keep them. Match the exact form in your Next.js route and strip the extension in code. A 301 chain costs crawl budget; a literal route costs one line.

Is shadow traffic worth the extra two weeks compared to a hard switch?

Yes if you are disciplined about it. The shadow phase finds the routes where the legacy returned 200 for nonsense. Skip it and those become your first real-user incident.
