

Joomla 3.10 to Payload CMS: the 7-week shadow cutover

A Leiden academic publisher, 18,600 DOIs, fifteen years of peer-review history, and a Joomla 3.10 stack on borrowed time. Here is how the seven-week cutover actually ran.

Jacob Molkenboer · Founder · A Brand New Company · 19 Jun 2026 · 9 min

It was a Tuesday in March when the editor-in-chief sent us a screenshot. The Joomla 3.10 admin showed a yellow banner she had been ignoring since 2023, plus a fresh email from the hosting provider: PHP 7.1 would stop shipping security backports at the end of August. The portal ran 18,600 active DOI resolvers, a per-author peer-review archive going back to 2011, and a nightly job that pushed metadata to Crossref and DataCite. None of that could miss a beat.

The publisher is a 22-person operation in Leiden. Two journals, four book series, an open-access mandate that auditors check, and an editorial board that includes people you do not want to email twice in a week. They had been quoted a six-figure rebuild by a vendor that wanted to start by deleting the DOI table and "redesigning the URL structure." We disagreed.

This is the actual seven-week playbook we ran. Names changed, numbers real.

The constraints we wrote on the wall

Before any code, we taped a sheet of A3 to the office wall with four lines on it:

  • Every DOI that resolves today must resolve to the same article tomorrow, with the same canonical URL.
  • No peer-review record is lost; the chain of reviewer to manuscript to decision is preserved with original timestamps.
  • The Crossref and DataCite nightly deposit jobs keep running. No missed nights.
  • The editorial team works in the live system every single day of the migration.

The fourth constraint is the one that kills most big-bang replatforms. If you ask a small editorial team to freeze the system for two weekends, you will get a strike. We had to do this with shadow traffic.

Why Payload CMS, and not another headless

The portal is content-heavy but not content-simple. Articles have nested metadata (corresponding author, ORCID, funder, license, DOI prefix, supplementary material) and the editorial workflow needs custom collections that look nothing like a blog post. We needed a CMS where the admin UI is something you can shape per-collection without buying a SaaS seat per editor.

Payload runs as a Node service in front of PostgreSQL or MongoDB. The admin panel is React, the data layer is your own, and access-control rules are TypeScript that maps cleanly to the publisher's role matrix (managing editor, section editor, copy editor, reviewer, author). We picked Postgres for the journal data and kept the file blobs on S3-compatible storage in Frankfurt for GDPR comfort. Next.js handled the public reader-facing site.

We also considered Strapi, Directus, and a Sanity-plus-Next combination. The deciding factor was that Payload lets us colocate the admin UI, the deposit jobs, and the DOI resolver in one repo with shared types. For a 22-person publisher, one repo means one on-call rotation and one deploy.

Week 1: the parallel read replica

We stood up the Joomla database as a read-only replica on the new host and started writing import scripts against it, not against production. The Joomla schema is what fifteen years of plugin authors leave behind: half of the article body in com_content, custom DOI metadata in three different K2 extension tables, peer-review history in a bespoke jos_reviews table the previous developer designed in 2014.

We did not try to clean it on the way in. The first ingest job was deliberately stupid: pull every row, dump it into a legacy_* schema on the new Postgres, never delete. Clean-up is a second pass, not a first one. If a transform breaks, you re-run it against the legacy schema in seconds, not against the live Joomla site over a thin VPN.

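-- Postgres side: raw landing zone, overwritten on re-import, never deleted
CREATE SCHEMA IF NOT EXISTS legacy;
CREATE TABLE IF NOT EXISTS legacy.articles (
  id          integer PRIMARY KEY,  -- ON CONFLICT (id) needs a unique key
  raw         jsonb NOT NULL,       -- column types here are assumed
  imported_at timestamptz NOT NULL
);

-- Import script (Node, mysql2 -> pg), shown here as the equivalent SQL.
-- joomla_replica.jos_content stands for the replica's rows as seen from Postgres.
INSERT INTO legacy.articles (id, raw, imported_at)
SELECT j.id, to_jsonb(j), now()
FROM joomla_replica.jos_content j
ON CONFLICT (id) DO UPDATE
  SET raw = EXCLUDED.raw, imported_at = now();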

By Friday of week one, the entire Joomla content tree was queryable as JSONB on the new host. The editorial team did not notice we existed.
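With the raw rows parked as JSONB, the clean-up pass can be a pure function that is cheap to re-run. A minimal sketch, assuming the standard jos_content columns (title, alias, introtext, fulltext, created) and assuming created is stored in UTC; the CleanArticle shape and the function name are illustrative, not the production transform:

```typescript
// One legacy.articles.raw value. Field names follow Joomla's jos_content
// table; columns the transform does not need stay untyped.
interface RawJoomlaArticle {
  id: number
  title: string
  alias: string
  introtext: string
  fulltext: string
  created: string // 'YYYY-MM-DD HH:MM:SS', assumed to be UTC
  [column: string]: unknown
}

// Hypothetical target shape for the cleaned article.
interface CleanArticle {
  legacyId: number
  title: string
  slug: string
  bodyHtml: string
  createdAt: string // ISO 8601, UTC
}

function transformArticle(raw: RawJoomlaArticle): CleanArticle {
  // Joomla splits the body around the "read more" marker; rejoin the halves.
  const bodyHtml = [raw.introtext, raw.fulltext]
    .filter((part) => part.trim() !== '')
    .join('\n')

  return {
    legacyId: raw.id,
    title: raw.title.trim(),
    slug: raw.alias,
    bodyHtml,
    // Keep the original timestamp; only the notation changes.
    createdAt: raw.created.replace(' ', 'T') + 'Z',
  }
}
```

Because the function never touches the database, a broken transform can be fixed and replayed over the whole legacy schema without going near the live Joomla site.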

Week 2: the DOI resolver moves first

The DOI resolver is the single most important URL on the site. Crossref points at https://publisher.example/doi/10.xxxxx/journal.2018.0142, and that URL has to land on the article page forever. There are 18,600 of them, indexed by Google Scholar, cited in PDFs, embedded in other journals' reference lists.

We built the new resolver as a Next.js route handler before we built the article page. Why backwards? Because the resolver is the contract. The article page can change layout; the resolver cannot change behaviour.

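// app/doi/[...slug]/route.ts
import { resolveDoi } from '@/lib/resolve-doi' // illustrative path

// Next.js 15+ passes params as a Promise. On 14 and earlier it is a plain
// object, and awaiting it is harmless.
type Context = { params: Promise<{ slug: string[] }> }

export async function GET(_req: Request, { params }: Context) {
  const { slug } = await params
  const doi = slug.join('/') // a DOI contains a slash, hence the catch-all
  const target = await resolveDoi(doi) // hits Postgres, never Joomla

  if (!target) return new Response('Not found', { status: 404 })

  // canonicalUrl is expected to be an absolute URL
  return Response.redirect(target.canonicalUrl, 301)
}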

We ran the new resolver against the legacy schema for a week, with the old Joomla site still authoritative. The shadow check: for every inbound DOI request, hit the new resolver in parallel and log the diff. By Friday of week two we had three diffs, all caused by trailing-slash inconsistencies in the old Joomla rewrite rules. We codified the old behaviour. We did not "fix" it.
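The comparison step of that shadow check can be sketched as a small pure function. This is an illustration under assumptions, not the code we ran: the Resolution and ShadowDiff shapes and the function name are hypothetical, and how the two answers are collected (a proxy, a log replay) is left out.

```typescript
// What either resolver answered for one DOI request.
interface Resolution {
  status: number          // e.g. 301 or 404
  location: string | null // redirect target, if any
}

interface ShadowDiff {
  doi: string
  field: 'status' | 'location'
  legacy: string
  candidate: string
}

// Compare the authoritative (Joomla) answer with the new resolver's answer.
// No normalisation is applied, so a trailing-slash difference is reported
// as a diff instead of being silently forgiven.
function diffResolution(doi: string, legacy: Resolution, candidate: Resolution): ShadowDiff[] {
  const diffs: ShadowDiff[] = []
  if (legacy.status !== candidate.status) {
    diffs.push({ doi, field: 'status', legacy: String(legacy.status), candidate: String(candidate.status) })
  }
  if (legacy.location !== candidate.location) {
    diffs.push({ doi, field: 'location', legacy: String(legacy.location), candidate: String(candidate.location) })
  }
  return diffs
}
```

Comparing the strings byte for byte is the point: the check should be stricter than a browser would be, because the goal is identical behaviour, not equivalent behaviour.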

Warning

If your old URLs have quirks (trailing slashes, mixed case, double-encoded characters) preserve the quirks. SEO and inbound citations do not care that they are ugly. They care that they resolve.

Week 3: peer-review history, with timestamps intact

The reviews table was the part nobody else had quoted on. Fifteen years of reviewer_id, manuscript_id, decision, date, plus free-text reviewer comments in a Dutch-English mix, plus a separate jos_reviews_files table pointing at PDF attachments on the old filesystem.

Two principles we held to:

  • Original timestamps win. Never now() on import. If a review was submitted at 2014-09-11 22:47 CEST, that is its created_at in the new database.
  • Reviewer identity is sacred. The publisher's COPE compliance depends on being able to prove who reviewed what, when. We mapped reviewer accounts one-to-one, with a fallback imported_reviewer_legacy_id field for accounts that had no email on record.
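Both principles fit in one mapping function. A minimal sketch, assuming the legacy date has already been given an explicit UTC offset and that a reviewer email was joined onto each row; reviewer_email, ImportedReview and mapLegacyReview are illustrative names, not the real import code:

```typescript
interface LegacyReviewRow {
  id: number
  reviewer_id: number
  manuscript_id: number
  decision: string
  date: string // ISO 8601 with offset, e.g. '2014-09-11T22:47:00+02:00'
  reviewer_email: string | null // hypothetical: joined from the legacy users table
}

interface ImportedReview {
  legacyId: number
  manuscriptLegacyId: number
  decision: string
  submittedAt: string                        // UTC instant of the original submission
  reviewer: string | null                    // new user id, when the account was matched
  imported_reviewer_legacy_id: number | null // fallback when it was not
}

function mapLegacyReview(row: LegacyReviewRow, userIdByEmail: Map<string, string>): ImportedReview {
  const submitted = new Date(row.date)
  if (Number.isNaN(submitted.getTime())) {
    // Refuse to guess: a bad timestamp is fixed in the legacy data, never replaced by "now".
    throw new Error(`Review ${row.id} has an unparseable date: ${row.date}`)
  }

  const email = row.reviewer_email ? row.reviewer_email.toLowerCase() : null
  const reviewer = email !== null ? userIdByEmail.get(email) ?? null : null

  return {
    legacyId: row.id,
    manuscriptLegacyId: row.manuscript_id,
    decision: row.decision,
    submittedAt: submitted.toISOString(),
    reviewer,
    imported_reviewer_legacy_id: reviewer === null ? row.reviewer_id : null,
  }
}
```

The function has no access to the current time at all, which makes "never now() on import" a property of the code and not just a convention.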

Payload's collections made the schema readable in a way the old PHP never was. The Reviews collection has explicit relationships to Manuscripts and Users, access controls that let only the managing editor see reviewer identities for double-blind submissions, and a hook that prevents anyone (including us) from mutating a historical decision row.

// collections/Reviews.ts
import type { CollectionConfig } from 'payload' // 'payload/types' on Payload 2.x

// isHistorical (not shown) must return a boolean, not a Promise:
// negating a Promise is always false and would block every update.

export const Reviews: CollectionConfig = {
  slug: 'reviews',
  access: {
    // Managing editors read everything; everyone else only their own reviews.
    read: ({ req }) => req.user?.role === 'managingEditor'
      ? true
      : { reviewer: { equals: req.user?.id } },
    update: ({ req, id }) => !isHistorical(id, req),
  },
  fields: [
    { name: 'manuscript', type: 'relationship', relationTo: 'manuscripts', required: true },
    { name: 'reviewer', type: 'relationship', relationTo: 'users', required: true },
    { name: 'decision', type: 'select', options: ['accept','minor','major','reject'] },
    { name: 'submittedAt', type: 'date', required: true },
    { name: 'legacyId', type: 'number', admin: { readOnly: true } },
  ],
}
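The config above shows the access rules; the immutability hook itself is not shown. As a rough sketch of what such a guard could look like, here is a function shaped like a Payload beforeChange hook. The local types stand in for Payload's own hook argument types, and treating "has a legacyId" as the definition of historical is an assumption made for the example:

```typescript
// Local stand-ins for the data, originalDoc and operation arguments that a
// collection beforeChange hook receives; only the fields used here are typed.
interface ReviewDoc {
  decision?: string
  submittedAt?: string
  legacyId?: number | null
}

interface BeforeChangeArgs {
  operation: 'create' | 'update'
  data: ReviewDoc
  originalDoc?: ReviewDoc
}

// Fields that must never change once a review has been imported.
const FROZEN_FIELDS = ['decision', 'submittedAt', 'legacyId'] as const

function guardHistoricalReview({ operation, data, originalDoc }: BeforeChangeArgs): ReviewDoc {
  // Assumption: any row carrying a legacyId came from the Joomla import.
  if (operation === 'update' && originalDoc && originalDoc.legacyId != null) {
    for (const field of FROZEN_FIELDS) {
      if (field in data && data[field] !== originalDoc[field]) {
        throw new Error(`Field "${field}" is frozen on imported review ${originalDoc.legacyId}`)
      }
    }
  }
  // Hooks of this kind return the (possibly modified) data to be saved.
  return data
}
```

Throwing from the guard aborts the write, so the rule holds for scripts and API calls as well as for edits made in the admin UI.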

Week 4: the Crossref and DataCite deposit jobs

The publisher had been depositing metadata to Crossref via direct XML deposit for over a decade. The DataCite jobs were younger, added in 2019 for a dataset series that needed DataCite's REST API. Both ran nightly at 02:30 from a cron on the old VPS. Both worked. Neither was documented.

We did three things, in this order:

  1. Read the actual XML the old system was producing. Saved a week of guessing.
  2. Re-implemented the deposit as a Payload scheduled job that hits the same Crossref deposit endpoint, with identical credentials, against the new Postgres data.
  3. Ran both jobs in parallel for ten nights, diffed the output XML, fixed the four fields where the old system was silently truncating author affiliations.
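The nightly diff in step 3 can be approximated with a line-based comparison. A minimal sketch, assuming both systems serialise elements in the same order with one element per line, and ignoring the doi_batch_id and timestamp elements of the Crossref head, which change on every run; the function names are illustrative:

```typescript
// Elements that legitimately differ between two runs of a Crossref deposit.
const VOLATILE = /<(doi_batch_id|timestamp)>[^<]*<\/\1>/g

// Blank out volatile values, then split into trimmed, non-empty lines.
function normaliseDeposit(xml: string): string[] {
  return xml
    .replace(VOLATILE, '<$1/>')
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line !== '')
}

interface LineDiff {
  line: number // 1-based position after normalisation
  legacy: string
  candidate: string
}

// Positional comparison: line N of the old output against line N of the new.
function diffDeposits(legacyXml: string, candidateXml: string): LineDiff[] {
  const a = normaliseDeposit(legacyXml)
  const b = normaliseDeposit(candidateXml)
  const diffs: LineDiff[] = []
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const legacy = a[i] ?? ''
    const candidate = b[i] ?? ''
    if (legacy !== candidate) diffs.push({ line: i + 1, legacy, candidate })
  }
  return diffs
}
```

A positional diff like this is crude, but it is enough to surface a truncated field: the same element appears on both sides with a shorter value on one of them.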

The new job runs as a single Node script in the same repo as the CMS. No separate cron VPS, no shell scripts on a server nobody can SSH into anymore. It is just pnpm tsx scripts/deposit-nightly.ts, run from a scheduled task in the hosting provider's UI.

Week 5: shadow traffic on the reader-facing site

By week five, the new Next.js front-end could render every article. The editorial team was still working in Joomla. We pointed 5% of inbound traffic at the new site via a Cloudflare Worker rule, with a response header that suppressed indexing and a banner that read "Preview of the new portal. The classic site is at the original URL."
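The routing decision behind a split like this is small. A sketch under assumptions: it buckets visitors by hashing a stable key so the same visitor keeps seeing the same site, which is a design choice made for this example and not something the setup above is known to have done. The function names and the RouteDecision shape are hypothetical; X-Robots-Tag: noindex is the standard header for suppressing indexing.

```typescript
// FNV-1a style hash over UTF-16 code units, reduced to a bucket in [0, 100).
function bucketFor(key: string): number {
  let hash = 0x811c9dc5
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i)
    hash = Math.imul(hash, 0x01000193) >>> 0 // keep it an unsigned 32-bit value
  }
  return hash % 100
}

interface RouteDecision {
  origin: 'legacy' | 'preview'
  extraResponseHeaders: Record<string, string>
}

// previewPercent = 5 sends roughly one visitor in twenty to the new site.
function routeRequest(visitorKey: string, previewPercent: number): RouteDecision {
  if (bucketFor(visitorKey) < previewPercent) {
    // Preview responses must never be indexed alongside the classic site.
    return { origin: 'preview', extraResponseHeaders: { 'X-Robots-Tag': 'noindex' } }
  }
  return { origin: 'legacy', extraResponseHeaders: {} }
}
```

In a Worker, the visitor key could come from a request header such as CF-Connecting-IP, and the cutover itself becomes a one-number change from 5 to 100.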

What we watched:

  • Server-Timing headers from the new site versus the old. Median response time dropped from 1.4s on Joomla to 280ms on the new site, mostly because Joomla was rendering a logged-out menu tree on every request.
  • The DOI resolver hit rate. We expected ~3% of inbound to be DOI traffic. It was 11%, mostly from Google Scholar.
  • Editor feedback. Three of the four section editors logged into the new admin to try it. Two found a bug in the manuscript-assignment form. We fixed it before week six.

Week 6: editorial team moves over

This is the week the wall sheet's fourth constraint paid for itself. We did not ask anyone to "freeze and migrate." We asked the managing editor to start new manuscript submissions in the new Payload admin on Monday. The old Joomla admin stayed read-write for active manuscripts already mid-review.

For two weeks the systems ran in parallel for editorial work. Every night, a sync job pulled changes from the old jos_* tables into the new Postgres, flagged anything that had been edited in both places (three such conflicts in the whole period; one was a typo fix in an author bio), and pushed the union into the deposit pipeline.
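The conflict flagging in that sync job reduces to one question per record: did it change on both sides since the last sync? A minimal sketch, assuming each side exposes a modification timestamp and a comparable serialised payload; Revision, SyncPlan and planSync are illustrative names:

```typescript
interface Revision {
  id: number
  modifiedAt: string // ISO 8601, UTC
  payload: string    // serialised content, used only for equality checks
}

interface SyncPlan {
  applyFromLegacy: number[] // changed only in Joomla: safe to copy over
  conflicts: number[]       // changed in both places: needs a human
}

function planSync(legacy: Revision[], current: Revision[], lastSyncAt: string): SyncPlan {
  const since = Date.parse(lastSyncAt)
  const currentById = new Map<number, Revision>(
    current.map((r): [number, Revision] => [r.id, r]),
  )
  const plan: SyncPlan = { applyFromLegacy: [], conflicts: [] }

  for (const old of legacy) {
    const legacyChanged = Date.parse(old.modifiedAt) > since
    if (!legacyChanged) continue // nothing to pull for this record

    const cur = currentById.get(old.id)
    const newChanged = cur !== undefined && Date.parse(cur.modifiedAt) > since

    if (cur !== undefined && newChanged) {
      // Edited on both sides; identical content is not worth flagging.
      if (cur.payload !== old.payload) plan.conflicts.push(old.id)
    } else {
      plan.applyFromLegacy.push(old.id)
    }
  }
  return plan
}
```

Conflicts are only listed, never resolved automatically. With a handful of them over two weeks, a person looking at each one is cheaper than a merge strategy.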

Week 7: cutover, then the old box stays up

The actual cutover was a DNS change on a Saturday morning. The shadow traffic split went from 5% to 100%. The DOI resolver had been running in the new app for five weeks at that point; nothing in that codepath changed at cutover. Crossref and DataCite jobs were already running off Postgres. The launch felt boring, which is what you want.

We left the old Joomla VPS up, read-only, for ninety days. Twice in those ninety days we needed it, both times to verify the original raw HTML of a long-form editorial that had picked up a stray formatting character during the JSONB-to-Markdown pass. Both times it took five minutes.

What we would do differently

Two things. First, we underestimated the indexing rebuild cost on the new Postgres. The full-text search on fifteen years of article bodies needed a tsvector column and a GIN index we had not budgeted for. Cost us a day in week five. Second, we should have asked the editor-in-chief to write down the deposit credentials in week one. We had to dig them out of an old phpMyAdmin session in week four, which is exactly the kind of bus-factor moment you want to avoid.
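For reference, the search index amounts to two statements. The sketch below builds them as strings from a migration script; the table and column names are placeholders, the 'simple' text search configuration is an assumption (it avoids picking one stemmer for mixed-language text), and generated columns need PostgreSQL 12 or later:

```typescript
// Returns the statements for a generated tsvector column and its GIN index.
function searchIndexMigration(table: string, bodyColumn: string, config = 'simple'): string[] {
  // Names are interpolated into SQL, so accept plain lowercase identifiers only.
  const identifier = /^[a-z_][a-z0-9_]*$/
  for (const name of [table, bodyColumn, config]) {
    if (!identifier.test(name)) throw new Error(`Unsafe identifier: ${name}`)
  }

  return [
    `ALTER TABLE ${table} ADD COLUMN search_vector tsvector ` +
      `GENERATED ALWAYS AS (to_tsvector('${config}', coalesce(${bodyColumn}, ''))) STORED`,
    `CREATE INDEX ${table}_search_idx ON ${table} USING GIN (search_vector)`,
  ]
}
```

Adding a stored generated column rewrites the table and building the GIN index takes its own time on top, which is the kind of cost that is easy to leave out of a plan.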

The work itself, migrating a legacy publishing portal off PHP 7.1 without dropping a DOI, is not glamorous. It is reading other people's SQL at 11pm and choosing not to "improve" the parts that already work. One thing we ran into while building the deposit pipeline: Crossref's older direct-deposit endpoint silently accepts XML that fails its own schema. We solved it by validating against the schema ourselves before every push, which caught two malformed records in the first month.
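The validate-before-push rule is easiest to keep when the push cannot be reached without passing the validator. A sketch of that gate, with the validator and the uploader injected as functions; both are placeholders here, since the XSD validation and the HTTP call depend on the libraries you choose:

```typescript
// Returns a list of schema errors; an empty list means the XML is valid.
type Validator = (xml: string) => string[]
// Sends the XML to the deposit endpoint.
type Pusher = (xml: string) => Promise<void>

interface DepositOutcome {
  pushed: boolean
  errors: string[]
}

async function depositIfValid(xml: string, validate: Validator, push: Pusher): Promise<DepositOutcome> {
  const errors = validate(xml)
  if (errors.length > 0) {
    // Never rely on the endpoint to reject bad XML; stop here and report.
    return { pushed: false, errors }
  }
  await push(xml)
  return { pushed: true, errors: [] }
}
```

The nightly job can then log or alert on any outcome where pushed is false, so a malformed record shows up the next morning and not months later.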

If you are looking at a Joomla 3.10 site and a calendar full of PHP deprecation emails, the smallest useful thing you can do today is open your nightly cron list and write down what each job actually does. That document is worth more than any architecture diagram you will draw later.

Key takeaway

A cutover should be a non-event. Move the DOI resolver first, run shadow traffic for weeks, and never freeze the editorial team.

FAQ

How do you preserve DOIs during a CMS migration without breaking inbound citations?

Build the new resolver first, run it in parallel against the legacy database for at least a week, and codify the old URL quirks instead of fixing them. Trailing slashes and case sensitivity matter more than aesthetics.

Why Payload CMS over Strapi or Directus for an academic publisher?

Payload colocates the admin UI, deposit jobs and DOI resolver in one TypeScript repo with shared types and Postgres. For a small editorial team that means one deploy, one on-call rotation, and access rules that match real editor roles.

Can you really run a CMS migration without a content freeze?

Yes, with shadow traffic and a parallel-write window. We kept Joomla live for editorial work through week seven and synced changes nightly. Three conflicts in two weeks, all trivial. No freeze, no strike from the editorial team.

What does Crossref deposit need that most migrations get wrong?

Crossref's older direct-deposit endpoint silently accepts XML that fails its own schema. Validate against the schema yourself before sending. Otherwise you find out months later when a citation graph tool flags missing records.

Tags: joomla, php, migration, legacy sites, case study, architecture
