Joomla 3.9 to Directus + Remix: a six-week SCORM cutover
It is February in Deventer. A 14-year-old Joomla 3.9 portal serves 22,400 SCORM lespakketten to Dutch schools. Six weeks later it runs on Directus and Remix.

It is a Tuesday morning in February in Deventer. The product owner at an onderwijsuitgever (educational publisher) watches her staging build fail for the fourth time this sprint. The Joomla 3.x line reached end-of-life in August 2023. PHP 7.3 had already lost security support in December 2021. The portal that serves 22,400 SCORM 1.2 lespakketten (lesson packages) to schools across the Netherlands still runs on both. Their hosting provider sends a calm email every Friday asking when they plan to do something about it.
This is the story of the next six weeks.
The portal we inherited
Built in 2012. Joomla 3.9 with eighteen extensions, four of them custom and unmaintained. A bespoke PHP 7.3 layer for SCORM playback, voortgangsregistratie (progress tracking), and the EduStandaard ECK-koppeling (ECK integration) to Basispoort. About 4 GB of MySQL state, including the per-leerling (per-pupil) voortgangs-history the publisher is legally required to keep for the UAVG-bewaartermijn, the retention period under the Dutch GDPR implementation act. 22,400 lespakketten ranging from a 40 KB HTML+JS shell to a 280 MB media-heavy wiskunde (maths) module.
The brief was not "make it nicer." The brief was: it cannot go down during a school week. Ever.
Why Directus and Remix
Joomla's content model assumes pages with bolted-on fields. The publisher's actual model is closer to: pakket → module → asset → SCORM-manifest → leerlingvoortgang. That shape wants a headless CMS sitting directly on relational tables, and Directus is exactly that. We have shipped on it eleven times and the row-level permission model maps cleanly onto Basispoort's school+leerling+rol tuple.
We also weighed Sanity and Payload. Sanity's GROQ is elegant, but the publisher's data team reports out of Metabase against raw SQL — they were not giving that up. Payload would have meant Postgres or MongoDB underneath; Directus runs cleanly on MySQL, which kept our infrastructure footprint flat against what the operations team already monitored at 03:00.
Remix on the frontend, because the existing PHP routes embedded SCORM players inside server-rendered pages with cookie-bound sessions. Remix's nested routes and loaders made the one-to-one port boring, which is what you want during a migration. We did not consider Next.js seriously — the App Router was still churning in early 2026 and we wanted a stable Remix v2.
Week one — inventory before code
The first week we did not write production code. We wrote a SCORM auditor.
find /var/www/joomla/scorm -name imsmanifest.xml \
  | xargs -P 8 -n 1 ./bin/audit-scorm.js \
  > audit-2026-02-09.ndjson
The auditor walked every imsmanifest.xml, validated it against the SCORM 1.2 XSD, resolved every <resource href> against the actual filesystem, and hashed each asset. It also flagged any manifest whose <resources> block referenced a file the unpacker had silently renamed, the failure mode Joomla extensions love to leave behind. The output was one NDJSON row per pakket, which made the rest of the week a shell pipeline rather than a JIRA board.
Of the 22,400 pakketten:
- 21,803 were well-formed SCORM 1.2 manifests we could parse cleanly.
- 412 had non-ASCII filenames inside the ZIP that the legacy PHP unpacker had silently URL-encoded on disk. They worked. They had to keep working.
- 147 referenced absolute URLs pointing back at the Joomla install. Those were going to break the moment we changed the domain.
- 38 were corrupt, uploaded somewhere between 2014 and 2016 and never opened since.
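The checks behind these buckets can be sketched roughly as follows. This is not the audit-scorm.js the team ran: the names AuditRow, extractHrefs and auditManifest are hypothetical, a regex stands in for a real XML parser, and XSD validation and asset hashing are left out.

```typescript
// Sketch only: a regex stands in for a real XML parser, and only
// double-quoted href attributes are recognised.
interface AuditRow {
  manifestPath: string;
  hrefCount: number;
  missingFiles: string[];  // relative hrefs with no matching file on disk
  absoluteUrls: string[];  // hrefs that name a host instead of a relative path
  nonAsciiNames: string[]; // hrefs containing characters outside ASCII
}

// Collect the distinct href values on <resource> and <file> elements.
function extractHrefs(manifestXml: string): string[] {
  const pattern = /<(?:resource|file)\b[^>]*?\bhref\s*=\s*"([^"]*)"/g;
  const hrefs = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(manifestXml)) !== null) {
    const href = match[1];
    if (href !== undefined && href !== "") hrefs.add(href);
  }
  return Array.from(hrefs);
}

function auditManifest(
  manifestPath: string,
  manifestXml: string,
  filesOnDisk: ReadonlySet<string>, // paths relative to the pakket root
): AuditRow {
  const hrefs = extractHrefs(manifestXml);
  const isAbsolute = (href: string): boolean =>
    /^(?:[a-z][a-z0-9+.-]*:)?\/\//i.test(href);
  // Launch hrefs may carry a query string or fragment; compare the path only.
  const pathOnly = (href: string): string => href.split(/[?#]/)[0] ?? href;
  return {
    manifestPath,
    hrefCount: hrefs.length,
    missingFiles: hrefs.filter((h) => !isAbsolute(h) && !filesOnDisk.has(pathOnly(h))),
    absoluteUrls: hrefs.filter(isAbsolute),
    nonAsciiNames: hrefs.filter((h) => /[^\x00-\x7F]/.test(h)),
  };
}

// One NDJSON line per pakket.
const toNdjsonLine = (row: AuditRow): string => JSON.stringify(row);
```

Keeping the audit a pure function of manifest text plus a file listing means it can be re-run against a snapshot without touching the live install.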
The 38 corrupt pakketten became their own decision: we shipped a list to the redacteur and let her mark which ones to bury. Three came back as "actually we still need that for de kweekschool" and we rebuilt those from the original Word sources. The other 35 stayed in a legacy_burials table with a tombstone.
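For the 412 pakketten whose non-ASCII filenames were URL-encoded on disk, a lookup has to try more than one spelling of the same path. A minimal sketch of that idea, assuming the legacy encoding matches what encodeURIComponent produces per path segment (PHP's own encoders differ for some characters, so this would need checking against real files); candidateAssetPaths and resolveAsset are hypothetical names.

```typescript
// Return the on-disk spellings to try, in order, for a path requested by a SCORM player.
function candidateAssetPaths(requested: string): string[] {
  // Authoring tools on macOS often emit decomposed (NFD) names; normalise first.
  const nfc = requested.normalize("NFC");
  const encoded = nfc
    .split("/")
    .map((segment) => encodeURIComponent(segment))
    .join("/");
  // A Set keeps insertion order and collapses duplicates for plain ASCII paths.
  return Array.from(new Set([requested, nfc, encoded]));
}

// Pick the first candidate that actually exists, or undefined if none does.
function resolveAsset(
  requested: string,
  filesOnDisk: ReadonlySet<string>,
): string | undefined {
  return candidateAssetPaths(requested).find((candidate) => filesOnDisk.has(candidate));
}
```

Trying the requested spelling first keeps the common ASCII case to a single lookup.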
Week two — the Directus data model
Directus collections, one-to-one with the domain:
-- simplified; foreign keys and timestamps omitted
pakketten         (id, ecknummer, titel, vakgebied, niveau, manifest_url)
pakket_assets     (id, pakket_id, path, size_bytes, mime, sha256)
leerlingen        (id, basispoort_eckid, school_id, geboortejaar)
voortgang         (id, leerling_id, pakket_id, sco_id,
                   status, score, suspend_data, last_seen)
voortgang_archive (id, ...same as voortgang...,
                   archived_at, retention_until)
Two things to notice. First, voortgang_archive is separate from voortgang. The UAVG-bewaartermijn for onderwijsvoortgang is two years after the leerling leaves the school. Active progress and retained-for-compliance progress have different access rules. Mixing them in one table is the kind of decision you regret on day 700.
Second, suspend_data stays a TEXT column. SCORM 1.2's cmi.suspend_data is an opaque string of up to 4,096 characters that each authoring tool encodes differently. Do not try to be clever with it. Keep it verbatim.
Alongside these sits a migration_map table mapping joomla_leerling_id ↔ directus_leerling_id and joomla_pakket_id ↔ directus_pakket_id, with a confidence column for rows we joined fuzzily on geboortejaar plus schoolcode rather than ECK-iD. Roughly 200 of 38,000 leerlingen needed manual review; the redacteur cleared them in a morning with a paged list and two SQL views.
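The matching behind migration_map can be sketched as an exact join on ECK-iD with a fuzzy fallback. This is an illustration under assumptions: the interface and function names are hypothetical, and the three-level confidence value ("exact", "fuzzy", "review") is one possible encoding of the confidence column, not necessarily the one used in the project.

```typescript
// Field names are illustrative; only geboortejaar, schoolcode and the ECK-iD come from the text.
interface LegacyLeerling { joomlaId: number; eckid: string | null; geboortejaar: number; schoolcode: string; }
interface NewLeerling { directusId: number; eckid: string; geboortejaar: number; schoolcode: string; }
interface MapRow {
  joomlaLeerlingId: number;
  directusLeerlingId: number | null;
  confidence: "exact" | "fuzzy" | "review";
}

function buildLeerlingMap(legacy: LegacyLeerling[], target: NewLeerling[]): MapRow[] {
  const byEckid = new Map<string, NewLeerling>();
  const byYearAndSchool = new Map<string, NewLeerling[]>();
  for (const t of target) {
    byEckid.set(t.eckid, t);
    const key = `${t.geboortejaar}|${t.schoolcode}`;
    const bucket = byYearAndSchool.get(key) ?? [];
    bucket.push(t);
    byYearAndSchool.set(key, bucket);
  }
  return legacy.map((l): MapRow => {
    const exact = l.eckid !== null ? byEckid.get(l.eckid) : undefined;
    if (exact) {
      return { joomlaLeerlingId: l.joomlaId, directusLeerlingId: exact.directusId, confidence: "exact" };
    }
    const candidates = byYearAndSchool.get(`${l.geboortejaar}|${l.schoolcode}`) ?? [];
    // A fuzzy key is only trusted when it points at exactly one leerling.
    const only = candidates.length === 1 ? candidates[0] : undefined;
    if (only) {
      return { joomlaLeerlingId: l.joomlaId, directusLeerlingId: only.directusId, confidence: "fuzzy" };
    }
    // Zero or several candidates: leave the row for manual review.
    return { joomlaLeerlingId: l.joomlaId, directusLeerlingId: null, confidence: "review" };
  });
}
```

Rows marked "review" are the ones that would end up on the paged list for the redacteur.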
Week three — shadow traffic, not blue-green
The interesting decision in week three was rejecting blue-green. With blue-green you flip a load balancer and pray the new stack handles the real load. We did not have a staging school willing to be the canary.
Instead we ran shadow traffic. Every request to the live Joomla portal got mirrored to the new Directus + Remix stack at nieuw.[client].nl, with all writes going to a sandbox schema. Reads were real. Writes were thrown away after diffing.
The mirror lived in nginx:
location /scorm/ {
    mirror /shadow;
    proxy_pass http://joomla_php_upstream;
}
location = /shadow {
    internal;
    proxy_pass http://remix_shadow_upstream$request_uri;
    proxy_set_header X-Shadow-Request "1";
}
A Node worker then tailed both access logs and diffed three things: HTTP status codes, response body hash for the SCORM API calls (LMSGetValue, LMSSetValue, LMSCommit), and response latency p50/p95.
In week three we saw a 0.4% diff rate. By week five it was 0.02% — almost entirely the 412 non-ASCII pakketten, which needed a separate normalisation pass in the Remix loader. None of this would have surfaced from synthetic load tests.
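The comparison the worker performs can be sketched as below. The LoggedResponse shape, the pairing of log lines by a shared requestId, and the function names are assumptions for illustration; the real worker parsed nginx access logs, which is not shown here.

```typescript
interface LoggedResponse { requestId: string; status: number; bodyHash: string; latencyMs: number; }

// Nearest-rank percentile; returns NaN for an empty sample.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1] ?? NaN;
}

function summarise(legacy: LoggedResponse[], shadow: LoggedResponse[]) {
  const shadowById = new Map<string, LoggedResponse>();
  for (const s of shadow) shadowById.set(s.requestId, s);

  let compared = 0;
  const diffs: string[] = [];
  for (const l of legacy) {
    const s = shadowById.get(l.requestId);
    if (!s) continue; // no mirrored counterpart logged; not counted as a diff in this sketch
    compared += 1;
    if (s.status !== l.status || s.bodyHash !== l.bodyHash) diffs.push(l.requestId);
  }

  const latencies = (rows: LoggedResponse[]) => rows.map((r) => r.latencyMs);
  return {
    compared,
    diffs,
    diffRate: compared === 0 ? 0 : diffs.length / compared,
    legacyLatency: { p50: percentile(latencies(legacy), 50), p95: percentile(latencies(legacy), 95) },
    shadowLatency: { p50: percentile(latencies(shadow), 50), p95: percentile(latencies(shadow), 95) },
  };
}
```

Reporting content diffs and latency percentiles separately matches the distinction drawn in the text between content bugs and infrastructure noise.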
A typical diff looked like this. A leerling opens module 6 of vmbo-2 wiskunde. The legacy player sends LMSSetValue('cmi.core.lesson_status', 'completed') with a session cookie path of /scorm/. Remix stored it correctly, but the diff worker flagged a p95 latency of 280 ms on the shadow side against 85 ms on legacy. That was nginx buffering the mirror request, not the application. We set proxy_buffering off on the mirror path and the shadow latency settled within 40 ms of the legacy stack. The class of diff we were hunting for was content; the class we kept finding was infrastructure noise — which is itself useful, because it told us where the real cutover would hurt.
Shadow traffic costs more than blue-green, but it gives you something blue-green cannot: the bug report writes itself, in production, while the legacy system is still serving real users.
Week four — the Basispoort ECK-koppeling
This is where most migrations die. Basispoort is the central authentication and licensing service for Dutch primary and secondary education. The EduStandaard ECK-koppeling is a SAML-based handshake plus a licentierechten lookup — straightforward in the abstract, painful in the real world.
The legacy PHP layer did three things in one endpoint: SAML assertion validation, licentierechten lookup, and session bootstrap. We split them into three Remix routes:
// app/routes/eck.saml.ts
import { redirect, type ActionFunctionArgs } from '@remix-run/node';
// parseAndVerifySamlResponse and samlSession are project helpers (imports not shown).

export async function action({ request }: ActionFunctionArgs) {
  const assertion = await parseAndVerifySamlResponse(request, {
    metadataUrl: process.env.BASISPOORT_METADATA_URL!,
    clockSkew: 60, // seconds
  });
  // Primair onderwijs sends nlEduPersonProfileId; parts of voortgezet onderwijs send nlEduPersonRealId.
  const eckid =
    assertion.attributes['nlEduPersonProfileId'] ??
    assertion.attributes['nlEduPersonRealId'];
  if (!eckid) throw new Response('Missing ECK-iD', { status: 400 });
  return redirect(`/eck/licence?eckid=${encodeURIComponent(eckid)}`, {
    headers: { 'Set-Cookie': await samlSession.serialize({ eckid }) },
  });
}
We kept clockSkew at 60 seconds because Basispoort's reference servers drift, and a stricter check failed roughly twice an hour during the test phase. The legacy site allowed 300 seconds; we cut that down once we confirmed our infrastructure clocks were tight against NTP.
Two Basispoort quirks worth flagging. First, the IdP signs assertions but not the response envelope; if you require envelope signing you will fail roughly one handshake in three, depending on which reference instance you hit. Verify the assertion signature and move on. Second, the IdP returns nlEduPersonProfileId for primair onderwijs but nlEduPersonRealId for parts of voortgezet onderwijs. Our resolver reads both and prefers the first; the legacy code only read one and silently 500'd on the rest, which is why a handful of bovenbouw-classes had been logging in as their docent for the last two years.
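The second quirk comes down to a small resolver. A minimal sketch, assuming the SAML library exposes attributes as strings or string arrays (that shape and the name resolveEckId are assumptions; only the two attribute names come from the text):

```typescript
type SamlAttributes = Record<string, string | string[] | undefined>;

// Order matters: the first attribute that carries a value wins.
const ECKID_ATTRIBUTES = ["nlEduPersonProfileId", "nlEduPersonRealId"] as const;

function resolveEckId(attributes: SamlAttributes): string | null {
  for (const attributeName of ECKID_ATTRIBUTES) {
    const raw = attributes[attributeName];
    // Some SAML libraries return multi-valued attributes as arrays.
    const value = Array.isArray(raw) ? raw[0] : raw;
    if (typeof value === "string" && value.trim() !== "") return value.trim();
  }
  return null; // the caller decides how to fail; do not fall through to another identity
}
```

Returning null instead of throwing leaves the HTTP response to the route, which avoids the silent 500 the legacy code produced.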
Licentierechten came back as a list of ecknummer values, which we joined directly against the pakketten.ecknummer column. No mapping table. No translation layer. If the publisher renumbers a pakket they re-issue ECK-nummers — that is the contract.
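In production this is a plain SQL join; the in-memory equivalent below is only meant to show the shape of it, plus one check the join alone does not give you: which licensed ECK-nummers have no pakket at all. The Pakket shape and resolveLicences are hypothetical names.

```typescript
interface Pakket { id: number; ecknummer: string; titel: string; }

function resolveLicences(pakketten: Pakket[], licentierechten: string[]) {
  const byEcknummer = new Map<string, Pakket>();
  for (const p of pakketten) byEcknummer.set(p.ecknummer, p);

  const licensed: Pakket[] = [];
  const unknown: string[] = []; // licensed numbers with no matching pakket
  const seen = new Set<string>();
  for (const ecknummer of licentierechten) {
    if (seen.has(ecknummer)) continue; // ignore duplicate entries in the licence list
    seen.add(ecknummer);
    const pakket = byEcknummer.get(ecknummer);
    if (pakket) licensed.push(pakket);
    else unknown.push(ecknummer);
  }
  return { licensed, unknown };
}
```

Logging the unknown list is a cheap way to notice a renumbered pakket before a school does.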
Week five — voortgangsmigratie under bewaartermijn rules
4.1 million voortgang-rows. Of those, 1.2 million belonged to leerlingen who had already left their school. Under the UAVG they had to stay readable to the publisher for two years, but they did not need to be hot.
The migration happened in two SQL passes:
-- Active leerlingen: into the hot table
INSERT INTO directus.voortgang
  (leerling_id, pakket_id, sco_id, status, score, suspend_data, last_seen)
SELECT
  m.directus_leerling_id,
  m.directus_pakket_id,
  v.sco_id, v.status, v.score, v.suspend_data, v.last_seen
FROM joomla.voortgang v
JOIN migration_map m
  ON m.joomla_leerling_id = v.leerling_id
  AND m.joomla_pakket_id = v.pakket_id
WHERE v.last_seen > NOW() - INTERVAL 2 YEAR;
-- Archive leerlingen: into the cold table, with retention stamps
INSERT INTO directus.voortgang_archive
  (..., archived_at, retention_until)
SELECT
  ..., NOW(), DATE_ADD(v.last_seen, INTERVAL 2 YEAR)
FROM joomla.voortgang v
WHERE v.last_seen <= NOW() - INTERVAL 2 YEAR;
A cron then runs nightly to drop rows where retention_until < NOW(). The publisher's Functionaris Gegevensbescherming signed off on the policy before we wrote a line of it. That sign-off lives in version control next to the migration.
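One thing to watch in the simplified SQL above: for rows whose last_seen is already more than two years old, last_seen plus two years is never in the future, so a production version presumably anchors retention on the date the leerling left the school, as the policy states. A minimal sketch of that rule, assuming a hypothetical leftSchoolOn date that is not part of the schema shown:

```typescript
// Retention ends a fixed number of years after the leerling left the school.
function retentionUntil(leftSchoolOn: Date, years: number = 2): Date {
  const until = new Date(leftSchoolOn.getTime());
  // Note: JavaScript rolls 29 February over to 1 March in a non-leap target year.
  until.setUTCFullYear(until.getUTCFullYear() + years);
  return until;
}

// Mirrors the nightly cron's condition: retention_until < NOW().
function isExpired(retentionUntilDate: Date, now: Date): boolean {
  return retentionUntilDate.getTime() < now.getTime();
}
```

Computing the stamp once at archive time, rather than on every read, keeps the nightly job a single indexed comparison.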
Week six — cutover, by minutes
The cutover ran on a Friday in March, 16:30 to 17:45, after the school day ended.
16:30 Set Joomla to read-only via a plugin we wrote in week two.
16:32 Final voortgang delta sync. About 9,400 rows since 06:00.
16:38 DNS TTL on portaal.[client].nl already at 60s since Monday.
16:40 Flip DNS to the Remix edge.
16:43 First real LMSCommit lands on Directus. Worker confirms.
17:05 Smoke-test 40 random ECK-nummers via Basispoort acceptatie.
17:30 Disable shadow mirror in nginx.
17:45 Keep legacy Joomla online, still read-only, as a fallback.
By 18:00 we were watching dashboards. The first school Monday peaked at 1,840 concurrent leerlingen, well within what the Remix loaders and Directus connection pool were sized for. The legacy stack stayed dormant for two more weeks as a rollback target, then went into a frozen archive.
What we got wrong
Three things, since the post-mortem is the post.
We underestimated how much the Joomla session cookies leaked into the SCORM iframe via cross-origin reads. The fix was a SameSite=None; Secure; Partitioned cookie on the Remix side, scoped to the SCORM player route. We caught it in week four because of the shadow diff, not because we thought of it.
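The cookie attributes named above can be written out by hand. A minimal sketch, assuming a hypothetical cookie name and the /scorm/ path as the player scope; it builds the header string directly rather than relying on any framework helper supporting Partitioned:

```typescript
// Build a Set-Cookie value for a cookie that must work inside a cross-site SCORM iframe.
function scormPlayerCookie(cookieName: string, value: string, path: string = "/scorm/"): string {
  return [
    `${cookieName}=${encodeURIComponent(value)}`,
    `Path=${path}`,    // scope the cookie to the player route only
    "Secure",          // required for both SameSite=None and Partitioned
    "SameSite=None",   // allow the cookie in a cross-site iframe
    "Partitioned",     // key the cookie to the embedding top-level site
  ].join("; ");
}
```

For example, scormPlayerCookie("scorm_session", "abc 123") yields "scorm_session=abc%20123; Path=/scorm/; Secure; SameSite=None; Partitioned".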
We over-engineered the asset CDN in week one. A signed-URL scheme with 5-minute expiry sounded sensible until a docent tried to embed a pakket in a printed worksheet via QR code. We loosened it to a per-school-per-day token. Watch what your real users do, not what your threat model imagines they should do.
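A per-school-per-day token can be as small as an HMAC over the school and the calendar day. The sketch below is an assumption about how such a scheme might look, not the project's implementation: the function names, the UTC day boundary, and the secret handling are all hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Derive a token that is stable for one school for one UTC calendar day.
function assetToken(secret: string, schoolId: string, day: Date): string {
  const dayKey = day.toISOString().slice(0, 10); // e.g. "2026-03-13"
  return createHmac("sha256", secret).update(`${schoolId}:${dayKey}`).digest("hex");
}

// Recompute the expected token for "now" and compare in constant time.
function isValidAssetToken(secret: string, schoolId: string, token: string, now: Date): boolean {
  const expected = Buffer.from(assetToken(secret, schoolId, now), "hex");
  const given = Buffer.from(token, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

A Dutch school day sits comfortably inside one UTC day, but a deployment that cares about the midnight boundary would want to key on local Amsterdam time instead.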
We trusted the leerling-id join too long. About 600 leerlingen had been merged in Basispoort between exports — same person, two ECK-iDs across the years. The old Joomla code fell back to a fuzzy match on geboortejaar plus schoolcode; our clean Directus join did not. We added a basispoort_id_aliases table the night before cutover, and the merge cases lit up on Monday morning as dual-row warnings rather than missing-progress complaints. Not painful, but they would have been if we had not been watching.
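The alias lookup and the dual-row warning can be sketched as follows. The VoortgangRef shape and both function names are hypothetical, and the sketch assumes single-hop aliases (an old ECK-iD points directly at the current one).

```typescript
interface VoortgangRef { eckid: string; pakketId: number; }

// aliases maps a superseded ECK-iD to the one currently in use.
function canonicalEckId(aliases: ReadonlyMap<string, string>, eckid: string): string {
  return aliases.get(eckid) ?? eckid;
}

// Returns "canonicalEckId|pakketId" keys whose progress is spread over more than one ECK-iD.
function findDualRows(rows: VoortgangRef[], aliases: ReadonlyMap<string, string>): string[] {
  const idsByKey = new Map<string, Set<string>>();
  for (const row of rows) {
    const key = `${canonicalEckId(aliases, row.eckid)}|${row.pakketId}`;
    const ids = idsByKey.get(key) ?? new Set<string>();
    ids.add(row.eckid);
    idsByKey.set(key, ids);
  }
  const dual: string[] = [];
  idsByKey.forEach((ids, key) => {
    if (ids.size > 1) dual.push(key);
  });
  return dual;
}
```

Surfacing these keys as warnings, rather than merging rows automatically, leaves the decision about which progress record wins to a person.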
Where ABN sits in this
When we shipped this legacy migration for the Deventer onderwijsuitgever, the hardest part wasn't the data. It was the four-week stretch where the new stack served shadow traffic while the old one stayed authoritative. That double-running window is what lets you migrate without a maintenance weekend, and it is the part most teams skip.
If you have a school week to protect, start the audit before you write any code. The 38 corrupt pakketten we found in week one would have surfaced as ouder-klachten (complaints from parents) in week eight. The smallest thing you can do today: find . -name imsmanifest.xml | wc -l, then read three at random and confirm they still parse.
Key takeaway
Shadow traffic costs more than blue-green, but it lets you migrate a live school portal without a single maintenance window.
FAQ
Why not upgrade to Joomla 4 or 5?
The custom PHP 7.3 layer for SCORM and ECK was the real cost centre, not Joomla itself. Upgrading Joomla would have left the riskiest code untouched. A clean break onto Directus retired both at once.
Why Directus over Strapi or Payload?
Directus speaks SQL natively, so the leerling-voortgang tables stay queryable from outside the CMS. Strapi and Payload abstract the database in ways that complicate UAVG-bewaartermijn audits and bulk reports.
How do you handle SCORM 1.2's suspend_data field?
Keep it as opaque text. Different authoring tools encode different things there. Never try to parse or normalise it. Copy it verbatim from old LMS to new and let the player decode it.
Can you migrate a school portal during the school year?
Yes, if you run shadow traffic for at least four weeks and cut over on a Friday after school hours. We have never cut over during school hours for an education client and would not recommend it.
What about the Basispoort acceptatie-omgeving during shadow traffic?
We used acceptatie for SAML round-trips and licentierechten only. Real ECK-iDs from productie were never sent to the shadow stack. The diff worker compared assertions structurally, not by ECK-iD value.