PHP 5.6 to Strapi + Astro: a seven-week parallel cutover

The newsroom CMS (the redactie-CMS) was 16 years old, PHP 5.6 had been dead for years, and the NDP-feed to 22 dagbladen (daily newspapers) could not stop for a weekend. Here is the seven-week cutover.

Jacob Molkenboer · Founder · A Brand New Company · 22 Jun 2026 · 10 min

The hoofdredacteur (editor-in-chief) sent us the screenshot at 23:14 on a Tuesday: the redactie-CMS had hung again on a single publish action. The piece was a 600-word column for the print edition that closed Thursday morning. The MySQL slow log showed a sixteen-second JOIN across artikelen, auteurs and embargo_geschiedenis. The CMS was sixteen years old. PHP 5.6 had been end-of-life since December 2018. Two of the 22 dagbladen on the NDP-feed had complained that morning that a story had skipped a beat. The publisher could not stop printing for a single day, and the editors could not learn a new tool in a single sprint.

This is the playbook we used to move them off it over seven weeks, with not one missed print edition and not one broken syndication push.

The constraint set

A 31-person vakbladuitgever (trade-magazine publisher) in Hilversum. Four titles, weekly print, daily web, a paywall the in-house dev team had hand-rolled in 2014. The CMS held 142,000 articles going back to 2010, plus the per-redacteur embargo-geschiedenis (embargo history) the publisher needed under the Auteurswet, the Dutch Copyright Act: who signed off on publication, on whose authority, against which embargo time. Lose that log and you lose your defence in a copyright dispute.

On top of that, an XML-feed pushed to 22 dagbladen via the NDP, on a schedule the receiving systems had been parsing for nine years. The schema was not documented anywhere. Break a field name, break a national newspaper's front page on a Sunday.

The legacy stack:

PHP 5.6.40 on Debian 8, on a single VM at a Dutch hoster.
MySQL 5.6 with three custom my.cnf tweaks no one remembered the reason for.
jQuery 1.7 in the editor. TinyMCE 3. SCP-deploy from a developer laptop.
No staging. No tests. The previous lead developer had left in 2019.

The publisher wanted a modern stack but, more than that, wanted to not panic. The brief was: keep publishing, keep the NDP feed alive, keep the embargo log intact and admissible.

Why Strapi and Astro, not WordPress

We auditioned three replacements: WordPress with Advanced Custom Fields, a Sanity + Next.js combination, and Strapi + Astro. WordPress lost on the embargo trail — the audit hooks are doable but the legal team wanted the log table sitting next to the article, owned by the publisher, queryable in plain SQL. Sanity lost on data sovereignty: the publisher's lawyers wanted the database in the EU on hardware they could point at.

Strapi gave us a typed content model, a real Postgres database we could run in Amsterdam, lifecycle hooks for the embargo log, and an editor UI close enough to the old one that the redacteurs would not riot on day one. Astro gave us a reader site that builds to static HTML by default, with islands where we needed dynamic behaviour (paywall, comments). The marketing team kept their existing CDN.

Week 1: schema archaeology

Two developers, full week, no code written. We mapped every column in every table to a target field in Strapi or to a tombstone. The old artikelen table had 71 columns. Twelve were actually used. Three were duplicated under different names. One — publish_state_v2 — held 47 distinct values across sixteen years, including typos and free-text. We grouped them into nine real states (draft, in_review, embargoed, scheduled, published, retracted, archived, spiked, syndicated_only) and built a CSV the legacy team signed off on, line by line.
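A minimal sketch of how a free-text column like publish_state_v2 can be folded into the nine states. The legacy spellings below are invented for illustration; the real mapping was the CSV the legacy team signed off on. The function name normaliseState is ours, not something from the project. The useful property is that an unmapped value fails loudly instead of silently becoming a draft.

```javascript
// The nine target states named above.
const STATES = new Set([
  'draft', 'in_review', 'embargoed', 'scheduled', 'published',
  'retracted', 'archived', 'spiked', 'syndicated_only',
]);

// Hypothetical legacy spellings (including a typo); the real list is the signed-off CSV.
const LEGACY_TO_STATE = new Map([
  ['concept', 'draft'],
  ['gepubliceerd', 'published'],
  ['gepubliseerd', 'published'],
  ['embargo', 'embargoed'],
  ['alleen ndp', 'syndicated_only'],
]);

function normaliseState(raw) {
  // Lower-case, trim, and treat spaces, underscores and hyphens as the same separator.
  const key = String(raw ?? '').trim().toLowerCase().replace(/[\s_-]+/g, ' ');
  const asState = key.replace(/ /g, '_');
  if (STATES.has(asState)) return asState;
  const mapped = LEGACY_TO_STATE.get(key);
  if (!mapped) {
    throw new Error(`unmapped publish_state_v2 value: ${JSON.stringify(raw)}`);
  }
  return mapped;
}
```

Running this over a SELECT DISTINCT of the column before any import gives a complete list of values that still need a decision.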

The deliverable that mattered most was not a database diagram. It was a written list of the twelve business rules the embargo trail had to preserve. Things like "if an article is retracted, the original publish event must remain in the log, never deleted." We made the publisher's legal counsel sign the document. That signature became the spec.

Week 2: Strapi modeling and the embargo log

We modelled three content types: Article, Author, EmbargoLog. The first two are obvious. The third is the spine of the migration. Every state change on an article writes an immutable row.

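// src/api/article/content-types/article/lifecycles.js
module.exports = {
  async beforeUpdate(event) {
    const { data, where } = event.params;
    // Bulk updates carry no single id; there is nothing to compare against.
    if (!where?.id) return;
    const before = await strapi.entityService.findOne(
      'api::article.article',
      where.id,
      { fields: ['status', 'embargo_at'] }
    );
    if (!before) return;
    // A partial update omits untouched fields; fall back to the stored values.
    const toStatus = data.status ?? before.status;
    const toEmbargo = 'embargo_at' in data ? data.embargo_at : before.embargo_at;
    // Compare instants, not string renderings of a Date vs. an ISO string.
    const ms = (v) => (v == null ? null : new Date(v).getTime());
    const changed =
      before.status !== toStatus || ms(before.embargo_at) !== ms(toEmbargo);
    if (!changed) return;
    // event.state is scratch space shared between before/after hooks; the
    // signed-in user comes from the request context (null outside a request).
    const actor = strapi.requestContext.get()?.state?.user?.id ?? null;
    await strapi.entityService.create('api::embargo-log.embargo-log', {
      data: {
        article: where.id,
        actor,
        from_status: before.status,
        to_status: toStatus,
        embargo_at: toEmbargo,
        recorded_at: new Date().toISOString(),
      },
    });
  },
};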

The embargo_log table is append-only at the database level. We revoked UPDATE and DELETE on it for the Strapi role, and gave the redacteur role no direct access at all. The only writer is the lifecycle hook. The only reader through the UI is a custom Strapi plugin that renders, per article, the full chain from draft to whatever state it is in today.

Warning

If you migrate an embargo or audit log into a new system, never let the new ORM be the thing that decides whether a row is mutable. Take the privilege away at the database role level. ORMs change. Auditors do not.
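As a rough sketch of what taking the privilege away can look like in PostgreSQL, the helper below generates the two statements for an append-only table. The table name embargo_logs and the role name strapi_app are placeholders, not the names used on this project, and the helper itself is hypothetical. One caveat we are sure of: in PostgreSQL the owner of a table can always grant privileges back to itself, so the application role should not own the log table.

```javascript
// Builds PostgreSQL privilege statements that leave a role able to read and
// append to a table, but not to UPDATE, DELETE or TRUNCATE it.
// Table and role names passed in are placeholders chosen by the caller.
function appendOnlyGrants(table, writerRole) {
  const ident = (name) => {
    // Accept only plain lower-case identifiers so nothing needs escaping.
    if (!/^[a-z_][a-z0-9_]*$/.test(name)) {
      throw new Error(`unsafe identifier: ${name}`);
    }
    return `"${name}"`;
  };
  const t = ident(table);
  const r = ident(writerRole);
  return [
    `REVOKE ALL ON ${t} FROM ${r};`,
    `GRANT SELECT, INSERT ON ${t} TO ${r};`,
  ];
}

// Example: print the statements for review before a DBA runs them.
for (const sql of appendOnlyGrants('embargo_logs', 'strapi_app')) {
  console.log(sql);
}
```

Generating the statements rather than hand-typing them keeps the same two lines in the migration, the runbook and the audit dossier.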

Week 3: ETL dry run on a copy

We took a Friday-evening dump of the production MySQL, restored it on a dev box, and ran the full ETL against it. The script is a single Node process that streams articles in chunks of 500, maps fields, and writes them into Strapi with the lifecycle hooks disabled (in the first pass, one REST API call per row). We do not want 142,000 rows in the embargo log for the import: those events did not happen on the new platform.

Instead, we materialise the historical embargo trail directly: we read embargo_geschiedenis, normalise sixteen years of timestamps into UTC, attach each row to its target article id, and bulk-insert into the embargo log with a flag imported_from_legacy: true. We can prove, in court if it ever comes to that, which rows were re-created from a 2010 record and which were captured live by the new system.
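The timestamp normalisation is the fiddly part, so here is a sketch of one deterministic way to do it. It assumes the legacy DATETIME values were written in Dutch local time (Europe/Amsterdam) in the form YYYY-MM-DD HH:MM:SS. The function names, the legacy column name gewijzigd_op and the tie-break rules for the two awkward hours per year are our assumptions for illustration, not the project's actual resolver.

```javascript
// Renders any instant as Europe/Amsterdam wall-clock parts.
const amsterdam = new Intl.DateTimeFormat('en-US', {
  timeZone: 'Europe/Amsterdam', hourCycle: 'h23',
  year: 'numeric', month: '2-digit', day: '2-digit',
  hour: '2-digit', minute: '2-digit', second: '2-digit',
});

// Offset of Europe/Amsterdam from UTC, in milliseconds, at a UTC instant.
function offsetAt(utcMs) {
  const p = {};
  for (const { type, value } of amsterdam.formatToParts(new Date(utcMs))) {
    p[type] = Number(value);
  }
  const wallAsUtc = Date.UTC(p.year, p.month - 1, p.day, p.hour % 24, p.minute, p.second);
  return wallAsUtc - utcMs;
}

// Converts a naive local DATETIME string to a UTC ISO string.
function legacyToUtc(naive) {
  const m = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})$/.exec(naive);
  if (!m) throw new Error(`unparseable DATETIME: ${naive}`);
  const [y, mo, d, h, mi, s] = m.slice(1).map(Number);
  const wall = Date.UTC(y, mo - 1, d, h, mi, s);
  // Try the CEST reading (+2h) first, then the CET reading (+1h), and keep
  // whichever maps back to the same wall-clock time.
  const candidates = [wall - 2 * 3600e3, wall - 3600e3]
    .filter((utc) => utc + offsetAt(utc) === wall);
  // Autumn overlap: both match, take the earlier (CEST) instant.
  // Spring gap: neither matches, fall back to the CET reading.
  const utc = candidates.length ? candidates[0] : wall - 3600e3;
  return new Date(utc).toISOString();
}

// Shapes one legacy history row for the new log. Column names on the legacy
// side are hypothetical; only imported_from_legacy is named in the article.
function toImportedLogRow(legacyRow, articleId) {
  return {
    article: articleId,
    to_status: legacyRow.status,
    recorded_at: legacyToUtc(legacyRow.gewijzigd_op),
    imported_from_legacy: true,
  };
}
```

Whatever rule you pick for the ambiguous autumn hour, write it into the signed mapping document, because it decides which of two instants ends up in the legal record.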

First dry run took 11 hours. By the third pass it was 73 minutes. The difference: bulk inserts instead of per-row HTTP, and dropping the search indexes during import and rebuilding them after.
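To make the bulk-insert change concrete, here is a sketch of the batching, assuming the target is PostgreSQL and the statement is handed to a driver that takes a query text plus a values array (node-postgres accepts that shape). The helper names and the example table and columns are placeholders. With batches of 500, keep rows times columns well under PostgreSQL's limit of 65,535 bind parameters per statement.

```javascript
// Yields consecutive batches of at most `size` rows.
function* chunks(rows, size = 500) {
  for (let i = 0; i < rows.length; i += size) {
    yield rows.slice(i, i + size);
  }
}

// Builds one parameterised multi-row INSERT with $1, $2, ... placeholders.
// Table and column names must come from trusted code, never from row data.
function bulkInsert(table, columns, rows) {
  const values = [];
  const tuples = rows.map((row, r) => {
    const slots = columns.map((col, c) => {
      values.push(row[col] ?? null); // missing fields become NULL
      return `$${r * columns.length + c + 1}`;
    });
    return `(${slots.join(', ')})`;
  });
  const text =
    `INSERT INTO ${table} (${columns.join(', ')}) VALUES ${tuples.join(', ')}`;
  return { text, values };
}
```

One statement per 500 rows replaces 500 HTTP round trips, which is where most of the gap between the first and third dry run would come from in a setup like this.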

Week 4: parallel publish

This is the part you cannot rush. The editors kept publishing in the legacy CMS. Every write went through a thin PHP shim we added to the legacy publish_artikel() function:

<?php
// Must stay PHP 5.6-compatible: return types and Throwable are PHP 7+.
function publish_artikel(array $row) {
    legacy_insert($row); // the legacy write always happens first
    if (getenv('STRAPI_DUAL_WRITE') === '1') {
        try {
            strapi_post('/api/articles', map_legacy_to_strapi($row));
        } catch (Exception $e) {
            error_log("strapi mirror failed for {$row['id']}: " . $e->getMessage());
            // never block the legacy write
        }
    }
}

A nightly diff job pulled both sides and reported any drift to a Slack channel the migration lead actually read. In the first two days we found 38 mismatches, all of them in the date handling around midnight (CET vs. UTC). By day five the diff was clean. The editors did not know the dual-write existed. They published as normal. We watched.
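The core of such a diff job is small. The sketch below assumes both sides have already been loaded into arrays and normalised to comparable values (for example, timestamps as UTC ISO strings), and that each mirrored Strapi row carries the legacy primary key in a field we call legacy_id here. That field name and the function name are assumptions for illustration.

```javascript
// Compares legacy rows against their mirrored counterparts, field by field.
// Returns ids missing from the mirror, ids with differing fields, and ids
// that exist only in the mirror.
function diffSides(legacyRows, strapiRows, fields) {
  const byLegacyId = new Map(strapiRows.map((r) => [r.legacy_id, r]));
  const report = { missing: [], mismatched: [], extra: [] };
  for (const old of legacyRows) {
    const mirrored = byLegacyId.get(old.id);
    if (!mirrored) {
      report.missing.push(old.id);
      continue;
    }
    byLegacyId.delete(old.id);
    // String() so that 5 and '5' compare equal across drivers.
    const bad = fields.filter((f) => String(old[f]) !== String(mirrored[f]));
    if (bad.length) report.mismatched.push({ id: old.id, fields: bad });
  }
  report.extra = [...byLegacyId.keys()];
  return report;
}
```

Reporting the field names, not just the ids, is what makes a pattern like "everything published around midnight" visible at a glance.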

Week 5: NDP syndication cutover

The NDP-koppeling was the scariest part of the project, because the consumers are not us. 22 dagbladen, each with their own ingestion pipeline, expecting the exact same XML schema the publisher had been producing since 2017. One missing attribute, one renamed tag, and a real newspaper has a hole on the front page on a Sunday morning.

We did not refactor the feed. We re-implemented it byte-for-byte. The new Astro site serves a route that produces the legacy XML, validated against a sample of 200 archived feeds. We ran the new feed in parallel for a week on a staging URL, with one of the 22 papers pointed at it on the NDP side as a canary. When that paper printed three clean editions on the new feed, we cut the other 21 over on a Sunday afternoon.

One thing we learned the hard way: a few of the receiving systems were sensitive to the order of XML attributes. The XML spec says attribute order is not significant. Real-world parsers disagree. We froze attribute order to match the legacy output and added a CI test that re-checks it on every deploy. When you replace a feed other systems consume, treat the wire format as the contract: reproduce the bytes, not the intent.
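Freezing attribute order mostly means never letting a generic XML library decide it. A sketch of the idea, with an invented element name and attribute list (the real schema is not reproduced here) and a deliberately minimal escaping rule that would need to match whatever the legacy feed actually emitted:

```javascript
// Frozen attribute order per element. Element and attribute names here are
// hypothetical; in practice they are copied from the legacy output.
const ATTRIBUTE_ORDER = {
  artikel: ['id', 'status', 'embargo'],
};

const escapeAttr = (v) => String(v)
  .replace(/&/g, '&amp;')
  .replace(/</g, '&lt;')
  .replace(/"/g, '&quot;');

// Renders an opening tag with attributes in the frozen order, regardless of
// the key order of the object passed in.
function openTag(name, attrs) {
  const order = ATTRIBUTE_ORDER[name];
  if (!order) throw new Error(`no frozen attribute order for <${name}>`);
  const unknown = Object.keys(attrs).filter((k) => !order.includes(k));
  if (unknown.length) {
    throw new Error(`unexpected attribute(s) on <${name}>: ${unknown.join(', ')}`);
  }
  const rendered = order
    .filter((k) => attrs[k] !== undefined)
    .map((k) => ` ${k}="${escapeAttr(attrs[k])}"`)
    .join('');
  return `<${name}${rendered}>`;
}
```

The matching CI check is then a plain string comparison between the generated feed and an archived legacy feed for the same articles, with no XML parsing in between.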

Week 6: the reader site on Astro

This was the easy week. Astro builds the public site from the Strapi content API, mostly static, with islands for the paywall check and the comments. The previous reader site averaged 2.1 seconds to first contentful paint on a 4G connection. The new one averages 0.4 seconds. We did not write a paragraph about performance in the launch announcement: the readers noticed, the redacteurs noticed, and that was enough.

The marketing team migrated their tracking and SEO redirects in two days. We kept every legacy URL alive with a 301 from Astro's redirect map. /artikel/12345-titel-slug still resolves, and six years of inbound links still work.
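A redirect map of that size is generated, not typed. The sketch below builds one from exported article records. The field names legacyId, legacySlug and newPath are assumptions, as is the new URL shape in the example; the output shape (source path mapped to a status and destination) is modelled on our understanding of Astro's redirects option, so check it against the Astro version and hosting adapter you run, since static builds and adapters can handle status codes differently.

```javascript
// Builds a source -> { status, destination } map for legacy article URLs of
// the form /artikel/<id>-<slug>. Field names on the input are hypothetical.
function buildRedirects(articles) {
  const redirects = {};
  for (const { legacyId, legacySlug, newPath } of articles) {
    const source = `/artikel/${legacyId}-${legacySlug}`;
    if (redirects[source]) {
      throw new Error(`duplicate legacy URL: ${source}`);
    }
    if (!newPath || !newPath.startsWith('/')) {
      throw new Error(`bad destination for ${source}: ${newPath}`);
    }
    redirects[source] = { status: 301, destination: newPath };
  }
  return redirects;
}

// Example with an invented destination path.
const example = buildRedirects([
  { legacyId: 12345, legacySlug: 'titel-slug', newPath: '/nieuws/titel-slug' },
]);
console.log(example['/artikel/12345-titel-slug']);
```

Failing on duplicates and malformed destinations at build time is cheaper than finding a broken inbound link in the search console a month later.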

Week 7: sunset

On the last Sunday we froze the legacy CMS to read-only. The redacteurs had been publishing exclusively in Strapi for nine days already; they did not notice the freeze. We left the legacy MySQL running on a private VM as a read-only archive for 90 days, then snapshotted it to cold storage. The cold storage costs the publisher €4 per month. The signed dossier from the legal counsel lives next to the snapshot. If a journalist or a lawyer ever needs to confirm what the 2014 embargo state of an article was, we can hand it to them in under an hour.

What we would do differently

Two things. First, we underestimated the timezone work. Sixteen years of DATETIME columns with no timezone column means inferring CET vs. CEST per row. We wrote a deterministic resolver in week three; we should have written it in week one and tested it against a known sample of 100 articles before touching anything else.

Second, we should have built the byte-for-byte XML feed before the content model. The XML schema turned out to be the hardest external contract. Once we knew exactly what it required, the Strapi content model fell out of it. We did it the other way around and lost two days renaming fields.

When we built the parallel-publish shim for this publisher, the thing that saved us the most time was the small, dull nightly diff job between old and new: boring, but it caught every regression before an editor did. That is the kind of unglamorous engineering that makes a legacy migration arrive on schedule instead of on fire.

If you have a CMS still running on a PHP version that has been dead for years, do not start with the new stack. Start with one query: count the rows in the table that holds your legal audit trail, and read the first ten and the last ten by hand. If you do not have that table at all, that is the first sprint, not the migration.

Key takeaway

Migrate the legal audit trail first, dual-write for a week, and rebuild the syndication feed byte-for-byte. Everything else is just CMS work.

FAQ

How long does a full PHP 5.6 to Strapi migration take?

For a publisher with around 142,000 articles and an active syndication feed, expect six to eight weeks of calendar time with two senior engineers. Smaller corpora and no syndication: three to four weeks.

Can we run the new and old CMS in parallel without confusing editors?

Yes. Editors keep publishing in the legacy tool while a write-through shim mirrors every save into the new system. They only switch UIs after a nightly diff job runs clean for a full week.

How do you migrate an audit log without losing legal admissibility?

Tag every historical row as imported_from_legacy, revoke UPDATE and DELETE at the database role level so the ORM cannot mutate it, and keep a signed field-mapping document with your legal counsel.

Why not WordPress for a publisher with strict audit requirements?

WordPress works for many publishers. For an Auteurswet embargo log, the custom plugin path is fragile across upgrades. A typed content model in Strapi or similar survives plugin churn better.
