Joomla migration war story: when K2 ate 8,700 affiliations
Twelve days into a Joomla 3.10 to Astro migration, every K2 author affiliation on staging read 'unknown'. A JParameter blob the J4 upgrader had silently flattened.

Eleven at night, a Slack message from the operations lead at a 25-person academic publisher in Leiden. "The migration is rolled back. Again. Every author affiliation reads unknown." Their old site ran Joomla 3.10 with K2 as the content engine. 8,700 article items, roughly 14 years of peer-reviewed back catalogue. The plan was clean on paper: bump to Joomla 4 on a staging clone, export to JSON, import into Payload CMS, render with Astro. Day twelve of a sprint we had quoted at five days.
The visible symptom was small and brutal. Every author block on staging showed the institution name as unknown. Every single one of 8,700 items. The same field on production Joomla 3.10 rendered correctly. Something in the J4 upgrade was eating the data, silently, with no error in the migrator log and no exception in the K2 upgrade output.
The client publishes art-history and Dutch-literature journals. Author affiliations are not cosmetic for them. Affiliation drives the editorial board approval flow, the indexing on Google Scholar, the institutional licensing reports, and the ORCID-to-author resolution that two of their journals require by submission policy. Losing it was not an option, and shipping a site that said "unknown" 8,700 times was not a soft fail.
How we got into the mess
We had inherited the site from the publisher's previous agency, a one-person shop that had retired. The Joomla install was on 3.10.11 with K2 v2.10 as the content engine and a small constellation of paid extensions for citation export, ORCID, and DOI minting. The hosting was a single 4 GB VPS at a Dutch provider that the previous agency had not touched in three years. Joomla 3 reached end-of-life in August 2023, and the publisher's external IT audit had flagged the site as a security risk. They had a December deadline to be off Joomla 3 entirely.
The cleanest path was not Joomla 4. It was Astro + Payload, for three reasons. The editors only ever touched the article editor, never the rest of Joomla's admin sprawl. The public site was mostly read paths that benefited from static rendering. And the new ORCID flow they wanted required a modern fetch and queue model that was painful in Joomla and trivial in Payload.
To migrate the content out cleanly we still needed Joomla 4 on the way through. K2's data lives in K2 tables, and the cleanest export tools all assume the J4 data model. The plan: upgrade Joomla 3.10 to 4.4 on staging, run K2's J4 migrator, export JSON, import into Payload, point Astro at it. Then archive the old VPS.
That was the plan. Day twelve, we still had not exported a single clean record.
What K2 was actually storing
K2's "extra fields" system is one of those features that worked beautifully in 2014 and quietly aged into a liability. For this publisher, author affiliations were not in K2's first-class author profile fields. They had been jammed into #__k2_items.extra_fields as a JSON blob years ago. The affiliation metadata itself was nested inside one field as a serialized JParameter string. INI-shaped, escaped, wrapped in JSON, wrapped in the row.
If you ever inherited a Joomla 1.5 or 2.5 codebase, you have seen this format. JParameter was Joomla's pre-Registry way of storing key=value config blobs:
institution=Universiteit Leiden
department=Faculty of Humanities
orcid=0000-0002-1825-0097
country=NL
That string lived inside a JSON array, inside the row, inside the table. It worked because every K2 item display template knew to call JParameter::loadString() on the way out. The previous agency had added a small content plugin that did exactly that. The site had run on that shim for a decade.
Nobody documented it. There was no comment in the template. There was no entry in the operations runbook. The only reason we found it was that one of our developers had built Joomla 1.5 sites in 2010 and recognised the shape from the corner of his eye.
Where Joomla 4 went silent
Joomla 4 removed JParameter. It had been deprecated since Joomla 1.6 in favour of Joomla\Registry\Registry, which is a much better class but speaks a different dialect. Registry expects JSON or PHP-array shapes, not the bare INI strings JParameter happily ate. The Joomla 4 migration guide covers the swap at the component level, but it cannot help you when the legacy format is sitting inside user data.
The K2 upgrade that runs on top of Joomla 4 (K2 v2.11+) does an honest attempt at parsing extra-field blobs. When it hits a value it cannot decode as JSON or Registry, it does what a lot of well-meaning upgraders do. It falls back to a safe default and writes "unknown" into the field rather than crash the whole import. No exception. No warning. The K2 admin log records the upgrade as successful.
Just 8,700 rows of unknown.
If a vendor upgrader claims "no data loss" but you cannot find unit tests for the parser on legacy formats, treat that promise as marketing copy. Diff a sample of rows before and after. Assume nothing.
We caught it on day two of the upgrade. The next ten days were spent figuring out where the real data still lived.
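The failure mode in this section is easier to reason about in code. The TypeScript below is a minimal sketch of the opposite policy, a decode step that keeps the raw value and reports the failure instead of writing a placeholder. It is not K2's code and does not reproduce its parser. The names DecodeResult, decodeStrict, and collectFailures are hypothetical, and it assumes each stored value is expected to be a JSON object.

```typescript
// Result of a decode attempt. A failure keeps the raw value instead of
// replacing it with a placeholder such as "unknown".
type DecodeResult =
  | { kind: 'ok'; data: Record<string, unknown> }
  | { kind: 'failed'; raw: string; reason: string };

// Try to decode a stored value as a JSON object; never substitute a default.
function decodeStrict(raw: string): DecodeResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { kind: 'failed', raw, reason: 'not valid JSON' };
  }
  if (parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed)) {
    return { kind: 'ok', data: parsed as Record<string, unknown> };
  }
  return { kind: 'failed', raw, reason: 'JSON, but not an object' };
}

// Hypothetical input shape: one stored value per item id.
interface StoredValue {
  id: number;
  value: string;
}

interface Failure {
  id: number;
  raw: string;
  reason: string;
}

// Collect every value that could not be decoded so the run can fail loudly.
function collectFailures(rows: StoredValue[]): Failure[] {
  const failures: Failure[] = [];
  for (const row of rows) {
    const result = decodeStrict(row.value);
    if (result.kind === 'failed') {
      failures.push({ id: row.id, raw: result.raw, reason: result.reason });
    }
  }
  return failures;
}
```

A migration step built this way would stop, or at least write a report, whenever collectFailures returns a non-empty list, and the raw strings would still be there to parse by hand.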
The rollback that polluted production
The first instinct was right. Roll back the staging DB, re-import a fresh dump from production, try again. But the publisher had a problem we did not see until day four. The previous agency's "staging" environment had a cron job that pushed staging content edits back to production every six hours for a feature they had stopped using two years ago. Nobody had disabled it.
Every staging upgrade attempt had re-saved K2 items, which serialised the now-coerced extra_fields through K2's J3 model on the way back to production. By day three, production also had unknown written into about 1,400 rows.
This is the kind of thing you only catch when you diff a SELECT id, extra_fields FROM jos_k2_items between snapshots taken at known times. It took us a morning to write the diff script and another morning to realise that the cron was the source. We killed the cron, restored production from the most recent clean nightly dump, and then we had a real plan.
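For readers who want to run the same kind of check, the snapshot diff can be sketched roughly as follows. This is a TypeScript illustration, not the script we ran. It assumes each snapshot has already been loaded into memory as an array of id and extra_fields pairs from the query above, and the names K2Row, SnapshotDiff, and diffSnapshots are hypothetical.

```typescript
// Hypothetical row shape: one entry per K2 item, as returned by
// SELECT id, extra_fields FROM jos_k2_items.
interface K2Row {
  id: number;
  extra_fields: string | null;
}

interface SnapshotDiff {
  changed: number[]; // ids in both snapshots whose extra_fields differ
  missing: number[]; // ids present before but absent after
  added: number[];   // ids absent before but present after
}

// Compare two snapshots taken at known times and report which rows moved.
function diffSnapshots(before: K2Row[], after: K2Row[]): SnapshotDiff {
  const beforeById = new Map<number, string | null>();
  for (const row of before) beforeById.set(row.id, row.extra_fields);

  const diff: SnapshotDiff = { changed: [], missing: [], added: [] };
  const seen = new Set<number>();

  for (const row of after) {
    seen.add(row.id);
    if (!beforeById.has(row.id)) {
      diff.added.push(row.id);
    } else if (beforeById.get(row.id) !== row.extra_fields) {
      diff.changed.push(row.id);
    }
  }

  beforeById.forEach((_value, id) => {
    if (!seen.has(id)) diff.missing.push(id);
  });

  return diff;
}
```

Running it over consecutive snapshots and plotting the size of changed against the snapshot times is what makes a periodic writer, such as a six-hourly cron, stand out.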
Recovering data from the pre-upgrade backup
The escape hatch was the nightly InnoDB dump the publisher's host kept in cold storage. We pulled a snapshot from before the first upgrade attempt and restored it into an isolated MySQL 5.7 container on our own infrastructure. Then we wrote a small extractor that re-parsed the JParameter strings the old way, mapped them to the Payload schema, and emitted JSON.
<?php
// extract-k2-affiliations.php
// Reads K2 extra_fields, finds the JParameter blob, and prints one JSON
// object per line with the decoded affiliation. Requires PHP 8.0+.
declare(strict_types=1);

$pdo = new PDO(
    'mysql:host=127.0.0.1;dbname=k2_legacy;charset=utf8mb4',
    'root',
    getenv('DB_PASS') ?: null, // getenv() returns false when the variable is unset
    [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]
);

// Parse an INI-style JParameter string into a key => value array.
function parseJParameter(string $blob): array {
    $out = [];
    foreach (preg_split('/\r?\n/', $blob) as $line) {
        $line = trim($line);
        if ($line === '' || str_starts_with($line, ';')) continue; // blank line or comment
        if (!str_contains($line, '=')) continue;                   // not a key=value pair
        [$k, $v] = explode('=', $line, 2);
        $out[trim($k)] = trim($v);
    }
    return $out;
}

$stmt = $pdo->query('SELECT id, title, extra_fields FROM jos_k2_items');
foreach ($stmt as $row) {
    // extra_fields can be NULL or malformed; skip anything that is not a JSON array.
    $fields = json_decode((string) $row['extra_fields'], true);
    if (!is_array($fields)) continue;
    foreach ($fields as $field) {
        // Cast before comparing: the id may be stored as the string "17".
        if (!is_array($field) || (int) ($field['id'] ?? 0) !== 17) continue; // affiliation field id
        $aff = parseJParameter((string) ($field['value'] ?? ''));
        echo json_encode([
            'k2_id'       => (int) $row['id'],
            'title'       => $row['title'],
            'institution' => $aff['institution'] ?? null,
            'department'  => $aff['department'] ?? null,
            'orcid'       => $aff['orcid'] ?? null,
            'country'     => $aff['country'] ?? null,
        ], JSON_UNESCAPED_UNICODE) . "\n";
    }
}
Twelve minutes of execution. 8,700 rows, 8,612 with valid affiliation data. 88 rows with genuinely empty values, which we confirmed by spot-check against the production frontend. Those articles really had no affiliation on file. Not a single row of unknown.
We validated three ways before we trusted it. We checked 30 random rows by hand against the live production rendering. We compared the institution distribution to a CSV the publisher had exported six months earlier for an internal audit. And we ran a fuzzy match against the ROR institutional identifier registry to flag any institution name that did not match a known ROR entry within edit distance 2. Three flags came back, all of them genuinely obscure Dutch theological seminaries that ROR does not cover.
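The third check, flagging names with no registry entry within edit distance 2, can be sketched as below. This is a TypeScript illustration under stated assumptions, not the script we ran. It assumes the ROR names have already been exported to a plain list of strings and does not call any ROR API. Comparing names lowercased and trimmed is also an assumption, and the function names are hypothetical.

```typescript
// Classic dynamic-programming Levenshtein distance between two strings.
function editDistance(a: string, b: string): number {
  // prev[j] is the distance between the processed prefix of a and the first j chars of b.
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);

  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution
      ));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumption: names are compared case-insensitively with surrounding whitespace removed.
function normalise(name: string): string {
  return name.trim().toLowerCase();
}

// Return every institution name that has no known name within maxDistance edits.
function flagUnmatched(
  institutions: string[],
  knownNames: string[],
  maxDistance: number = 2
): string[] {
  const known = knownNames.map(normalise);
  return institutions.filter((name) => {
    const candidate = normalise(name);
    return !known.some((k) => editDistance(candidate, k) <= maxDistance);
  });
}
```

The comparison is quadratic in the two list sizes, so deduplicating the institution names before calling flagUnmatched keeps the run manageable against a large registry export.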
Schema for Payload, written to never need a shim again
The reason this got messy in the first place was that affiliations had been jammed into a free-text extra field. The Payload schema treats them as first-class.
// payload/collections/Article.ts
import type { CollectionConfig } from 'payload/types'

export const Article: CollectionConfig = {
  slug: 'articles',
  fields: [
    { name: 'title', type: 'text', required: true },
    { name: 'legacyK2Id', type: 'number', unique: true, index: true },
    {
      name: 'authors',
      type: 'array',
      fields: [
        { name: 'name', type: 'text', required: true },
        { name: 'institution', type: 'text' },
        { name: 'department', type: 'text' },
        { name: 'orcid', type: 'text' },
        { name: 'country', type: 'text' },
      ],
    },
    { name: 'body', type: 'richText', required: true },
    { name: 'publishedAt', type: 'date', index: true },
  ],
}
We kept legacyK2Id because the editors still navigate by the old article numbers in their internal Trello board, and 14 years of inbound links from Google Scholar point at the old URL slugs. The Astro front-end's /article/[slug] route falls back to a /k2/[legacyK2Id] redirect for the long tail.
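The lookup behind that redirect can be sketched in a few lines. The TypeScript below is an illustration only. It does not use any Astro or Payload API, it assumes a list of legacyK2Id and slug pairs has already been exported from the CMS, and the names ArticleRef, buildLegacyIndex, and resolveLegacyPath are hypothetical.

```typescript
// Hypothetical minimal view of an article: only the two fields the redirect needs.
interface ArticleRef {
  legacyK2Id: number;
  slug: string;
}

// Build a lookup from legacy K2 id to the new canonical path.
function buildLegacyIndex(articles: ArticleRef[]): Map<number, string> {
  const index = new Map<number, string>();
  for (const article of articles) {
    index.set(article.legacyK2Id, `/article/${article.slug}`);
  }
  return index;
}

// Resolve a path such as "/k2/4211" to its redirect target, or null when the
// path is not a legacy URL or the id is unknown.
function resolveLegacyPath(path: string, index: Map<number, string>): string | null {
  const match = /^\/k2\/(\d+)\/?$/.exec(path);
  if (!match) return null;
  const target = index.get(Number(match[1]));
  return target === undefined ? null : target;
}
```

Returning null for unknown ids leaves the decision about a 404 or a search page to the caller instead of guessing a target.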
The five-minute audit you can run tonight
If you are about to upgrade a Joomla 3 K2 site to Joomla 4, do this before you touch anything.
- Open phpMyAdmin or your favourite MySQL client against a copy, not production.
- Run SELECT id, extra_fields FROM jos_k2_items WHERE extra_fields LIKE '%=%' LIMIT 50; to surface candidate JParameter rows. The equals sign inside an extra-field value is the tell.
- If any row's extra_fields contain INI-style key=value lines rather than nested JSON objects, you have legacy JParameter data. The J4 upgrader will eat it.
- Take a full dump with mysqldump --single-transaction before the K2 J4 migrator runs.
- After the upgrade, diff the same query. If the offending rows now read unknown, restore from the pre-upgrade dump and write a parser.
That takes longer to read than to run. It will save you twelve days.
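If you would rather script the eyeball step, the heuristic can be sketched as follows. This TypeScript is a rough illustration, not a tool we ship. It assumes extra_fields holds a JSON array of objects with id and value keys, as in the extractor above. The names looksLikeJParameter and findLegacyFields are hypothetical, and the pattern will produce some false positives on free text that happens to contain an equals sign near the start of a line.

```typescript
// A line counts as INI-style when it starts with a bare key followed by "=".
const INI_LINE = /^[A-Za-z_][A-Za-z0-9_.-]*\s*=/;

// Heuristic: a value looks like a JParameter blob when it is not itself JSON
// and every non-empty, non-comment line has the key=value shape.
function looksLikeJParameter(value: string): boolean {
  const trimmed = value.trim();
  if (trimmed === '' || trimmed[0] === '{' || trimmed[0] === '[') return false;
  const lines = trimmed
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== '' && line[0] !== ';');
  return lines.length > 0 && lines.every((line) => INI_LINE.test(line));
}

// Return the ids of the extra fields in one row whose value looks like legacy INI data.
function findLegacyFields(extraFieldsJson: string | null): string[] {
  if (!extraFieldsJson) return [];
  let parsed: unknown;
  try {
    parsed = JSON.parse(extraFieldsJson);
  } catch {
    return []; // malformed JSON is a separate problem from JParameter data
  }
  if (!Array.isArray(parsed)) return [];
  const hits: string[] = [];
  for (const field of parsed) {
    if (field && typeof field.value === 'string' && looksLikeJParameter(field.value)) {
      hits.push(String(field.id));
    }
  }
  return hits;
}
```

Any row for which findLegacyFields returns a non-empty list is worth opening by hand before the upgrade runs.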
When we built this migration for the Leiden publisher, the real win was not the parser. It was insisting on the pre-upgrade dump as the source of truth instead of trusting the half-upgraded staging DB. If you have a legacy Joomla, Drupal, or Magento site staring down an end-of-life deadline, our legacy migration work usually starts with that same boring backup check before any new stack gets discussed.
Key takeaway
When a CMS upgrader 'succeeds' but writes 'unknown' into your fields, the data is not gone. It is still in the pre-upgrade backup. Restore from there first.
FAQ
What is JParameter and why did it cause this?
JParameter was Joomla's pre-Registry config class for INI-style key=value blobs. Joomla 4 removed it. Data still in that format gets coerced or replaced with a safe default during the upgrade, often silently.
Does K2 v2.11 fix legacy JParameter data automatically?
It tries. When it cannot parse a value as JSON or Registry, it writes a safe default like 'unknown' rather than crash. The upgrade log records success, but the original data is gone from the live database.
How do I check if my Joomla 3 K2 site has the same risk?
Query jos_k2_items.extra_fields on a backup. Look for rows whose values contain INI-style key=value lines instead of JSON objects. Any match means the Joomla 4 upgrader will flatten that data on its way through.
Why migrate off Joomla 3 in 2026 if it still works?
Joomla 3 reached end-of-life in August 2023. No security patches since. Any site still on 3.x is carrying unpatched vulnerabilities that compound every month, and most insurance audits will flag it.