Spot a CMS by its folders: a legacy migration field guide

You just got FTP credentials and nobody on the client side remembers what their site runs on. You have ninety seconds. Read the folder tree.

Jacob Molkenboer · Founder · A Brand New Company · 11 Mar 2024 · 6 min

You just got FTP credentials from a client who cannot remember what their site runs on. The login still works. The site loads. Nobody on their side knows whether it is WordPress, Drupal 7, or some bespoke PHP a freelancer wrote in 2014 and then disappeared.

You have ninety seconds before the call resumes. You do not need ninety seconds. You need to read the folder tree.

Why folders tell on a CMS faster than markup

The rendered page lies. A custom theme can make WordPress look like Squarespace, Drupal look like WordPress, and Magento look like a brochure. View-source has been scrubbed by half the optimisation plugins on Earth. Wappalyzer guesses, and in our experience it guesses wrong about a third of the time on legacy infrastructure.

The directory tree on disk does not lie. Every CMS we deal with in migration work leaves a distinct fingerprint in the filesystem long before it renders a byte. Once you can read those fingerprints at a glance, you cut the discovery phase of a legacy audit from an afternoon to a single SSH session.

WordPress: the wp- prefix tells on itself

The easiest read in the legacy web. Three sibling directories at the docroot, all prefixed with wp-, plus a config file:

$ ls /var/www/html
index.php  wp-admin/  wp-config.php  wp-content/  wp-includes/  wp-login.php

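Inside wp-content/ you will find plugins/, themes/, uploads/, sometimes mu-plugins/. Confidence: very high. Forks and heavily customised stacks (ClassicPress, a forked WordPress under another name) keep the wp- prefix because too much core code hardcodes those paths. Bedrock is the exception to watch for: it moves core into web/wp/ and replaces wp-content/ with web/app/, so the wp- directories sit two levels below the project root.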

Read wp-includes/version.php for the exact build. Anything below 6.x means you are looking at a security debt the client has not paid in years. The WordPress security page lists which majors still receive backports; pre-5.0 sites are running on borrowed time.
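Reading the build number can be scripted. The sketch below assumes the stock layout, where wp-includes/version.php assigns the version to $wp_version on a single line; the function name wp_version is ours, not part of WordPress.

```shell
# wp_version DOCROOT - print the WordPress core version recorded on disk.
# Assumes a stock wp-includes/version.php containing a line such as:
#   $wp_version = '6.4.3';
wp_version() {
  local file="${1:-.}/wp-includes/version.php"
  [ -f "$file" ] || return 1
  # Print the first quoted value assigned to $wp_version.
  sed -n "s/^[[:space:]]*[$]wp_version[[:space:]]*=[[:space:]]*['\"]\([^'\"]*\)['\"].*/\1/p" "$file" | head -n 1
}

# Example: wp_version /var/www/html
```

It only reads a file, so it is safe to run on a production box before anyone has agreed to anything.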

Drupal: core/ versus sites/all/

Two eras of Drupal share almost no filesystem layout.

Drupal 7 and earlier put contributed modules under sites/all/modules, themes under sites/all/themes, and the site config in sites/default/settings.php. There is no core/ directory. A CHANGELOG.txt sits at the docroot and brags about which Drupal version shipped when.

# Drupal 7
$ ls
CHANGELOG.txt  includes/  misc/  modules/  profiles/  scripts/  sites/  themes/  index.php

Drupal 8 introduced a top-level core/ folder and a vendor/ directory managed by Composer, and Drupal 9, 10, and 11 keep that layout. Contributed code conventionally lives under modules/contrib and themes/contrib:

# Drupal 10
$ ls
composer.json  core/  modules/  profiles/  sites/  themes/  vendor/  web.config

If you see Drupal 7, treat it as a migration emergency. Drupal 7 reached end-of-life on 5 January 2025 after several extensions. The project no longer ships security patches, and the contrib ecosystem has gone cold. We still find Drupal 7 in production at universities, municipalities, and any company whose original agency vanished.
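The two layouts also record the exact core version in different files. This is a minimal sketch, assuming unmodified core: Drupal 8 and later declare const VERSION in core/lib/Drupal.php, and Drupal 7 defines VERSION in includes/bootstrap.inc. The function name drupal_version is hypothetical.

```shell
# drupal_version DOCROOT - print the Drupal core version found on disk.
# Assumes stock core files; a patched or stripped core may not match.
drupal_version() {
  local root="${1:-.}"
  if [ -f "$root/core/lib/Drupal.php" ]; then
    # Drupal 8+:  const VERSION = '10.2.4';
    sed -n "s/^[[:space:]]*const VERSION[[:space:]]*=[[:space:]]*'\([^']*\)'.*/\1/p" \
      "$root/core/lib/Drupal.php" | head -n 1
  elif [ -f "$root/includes/bootstrap.inc" ]; then
    # Drupal 7:  define('VERSION', '7.98');
    sed -n "s/^[[:space:]]*define('VERSION',[[:space:]]*'\([^']*\)').*/\1/p" \
      "$root/includes/bootstrap.inc" | head -n 1
  else
    return 1
  fi
}

# Example: drupal_version /var/www/html
```

A result starting with 7 confirms the emergency without relying on CHANGELOG.txt, which some hosts delete.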

Joomla: configuration.php at the root

The dead giveaway is a configuration.php file sitting next to an administrator/ directory. No other CMS uses that exact filename at the docroot.

$ ls
administrator/  components/  configuration.php  htaccess.txt  index.php
language/       libraries/    modules/           plugins/      templates/

That configuration.php holds database credentials in plaintext PHP. Open it carefully and never commit it. Joomla 3.x reached end-of-life in August 2023 and is still in the wild on a startling number of forgotten subdomains.
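One way to open the file carefully is to print only the database settings with the password masked. The sketch below assumes the usual Joomla config format of one "public $name = 'value';" line per setting (older releases used var instead of public); joomla_db_settings is a name we made up for illustration.

```shell
# joomla_db_settings DOCROOT - show database settings from configuration.php
# with the password masked. Assumes one setting per line in the form
#   public $host = 'localhost';
joomla_db_settings() {
  local file="${1:-.}/configuration.php"
  [ -f "$file" ] || return 1
  # Keep only the database-related lines, then mask the password value.
  grep -E "^[[:space:]]*(public|var) [$](dbtype|host|user|password|db|dbprefix) " "$file" \
    | sed -E "s/([$]password *= *)'[^']*'/\1'***'/"
}

# Example: joomla_db_settings /var/www/html
```

The masked output is safe to paste into a ticket or a call transcript; the raw file is not.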

Magento: bin/magento or the mage script

Magento 1 and Magento 2 share a name and little else. They are different stacks with different filesystem layouts.

# Magento 1 (EOL June 2020)
$ ls
app/  errors/  index.php  js/  lib/  mage  media/  skin/  var/

# Magento 2
$ ls
app/  bin/  generated/  lib/  pub/  var/  vendor/

The tells: Magento 1 has a mage CLI executable at the root and a skin/ directory with frontend assets. Magento 2 has bin/magento, no skin/, and an app/etc/env.php instead of M1's app/etc/local.xml. If you find Magento 1, you are not migrating, you are rebuilding. Adobe stopped patching it in June 2020 and the Adobe Commerce lifecycle policy has the receipts.
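The tells above translate into a small check. This is a sketch rather than a definitive test: it treats bin/magento as the Magento 2 marker, and for Magento 1 requires skin/ plus either the mage script or app/etc/local.xml. The function name and the output labels are illustrative.

```shell
# magento_generation DOCROOT - print "magento-2" or "magento-1" based on
# the filesystem tells described above; return 1 if neither matches.
magento_generation() {
  local root="${1:-.}"
  if [ -f "$root/bin/magento" ]; then
    echo "magento-2"
  elif [ -d "$root/skin" ] && { [ -f "$root/mage" ] || [ -f "$root/app/etc/local.xml" ]; }; then
    echo "magento-1"
  else
    return 1
  fi
}

# Example: magento_generation /var/www/html
```

Requiring two Magento 1 markers instead of one keeps a stray skin/ folder in an unrelated project from producing a false positive.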

The long tail

You will see all of these in the wild. Memorise the smallest unique signature for each.

  • TYPO3: typo3/, typo3conf/, typo3temp/, fileadmin/. Unmistakable. Common at German-speaking universities and B2B publishers.
  • Concrete CMS (formerly concrete5): concrete/ and application/ side by side.
  • Craft CMS: a craft executable in the root, plus config/, templates/, web/. Composer-driven and modern enough that it is usually fine.
  • ModX: core/, manager/, connectors/, assets/. The connectors/ folder is the giveaway.
  • SilverStripe: framework/ + cms/ + mysite/ on 3.x; app/ + public/ + vendor/ on 4+.
  • PrestaShop: classes/, controllers/, override/, and an admin folder renamed to something like admin1234/ in the name of security through obscurity.
  • ExpressionEngine: system/ee/ and themes/ee/.

Warning

Do not trust the X-Powered-By header, the <meta name="generator"> tag, or the favicon. Themes lie. Reverse proxies lie. The filesystem is the only honest witness, and even it can be confused by a Bedrock-style WordPress install that hoists the docroot into web/wp/.
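When the docroot has been hoisted, searching a few levels down usually finds core anyway. The sketch below looks for wp-includes/version.php under a starting directory and prints the folder that holds it. The depth limit of 4 and the name find_wp_core are our assumptions, and -maxdepth is a GNU/BSD find extension rather than strict POSIX.

```shell
# find_wp_core START_DIR - print each directory under START_DIR that
# contains WordPress core, e.g. ./web/wp on a Bedrock-style install.
find_wp_core() {
  find "${1:-.}" -maxdepth 4 -type f -path '*/wp-includes/version.php' 2>/dev/null \
    | sed 's|/wp-includes/version\.php$||'
}

# Example: find_wp_core /var/www/project
```

If this prints a path ending in web/wp, check for a sibling web/app/ directory, which is where such installs keep plugins, themes, and uploads.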

The five-minute fingerprint

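Save this as fingerprint.sh on the server and run it with the docroot as its argument. Do not paste it straight into an interactive SSH session, because the first exit it reaches will close that session. It is dumb on purpose: no dependencies, no network calls, and it prints a single label.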

#!/usr/bin/env bash
# fingerprint.sh - point at a docroot, get a one-word CMS guess
ROOT="${1:-.}"
cd "$ROOT" || exit 1

[ -d wp-admin ]        && [ -d wp-includes ]     && echo "wordpress"     && exit
[ -f configuration.php ] && [ -d administrator ] && echo "joomla"        && exit
[ -f core/lib/Drupal.php ]                       && echo "drupal-8plus"  && exit
[ -f CHANGELOG.txt ]   && [ -d sites/all ]      && echo "drupal-7"      && exit
[ -f bin/magento ]                               && echo "magento-2"     && exit
[ -f mage ]            && [ -d skin ]           && echo "magento-1"     && exit
[ -d typo3 ]           && [ -d typo3conf ]      && echo "typo3"         && exit
[ -d concrete ]        && [ -d application ]    && echo "concrete-cms"  && exit
[ -f craft ]           && [ -d templates ]      && echo "craft"         && exit
[ -d manager ]         && [ -d connectors ]     && echo "modx"          && exit
[ -d classes ]         && [ -d override ]       && echo "prestashop"    && exit

echo "unknown - dig deeper"
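The script does not cover SilverStripe or ExpressionEngine from the long-tail list. A hedged sketch of those two checks follows, written as a function so it can sit beside the script. The output labels are ours, and the vendor/silverstripe/framework test is an assumption we added because app/ + public/ + vendor/ alone also matches other PHP frameworks.

```shell
# fingerprint_long_tail DOCROOT - extra checks for two systems from the
# long-tail list. Labels and the vendor/ path are illustrative assumptions.
fingerprint_long_tail() {
  local root="${1:-.}"
  if [ -d "$root/system/ee" ]; then
    echo "expressionengine"
  elif [ -d "$root/framework" ] && [ -d "$root/cms" ] && [ -d "$root/mysite" ]; then
    echo "silverstripe-3"
  elif [ -d "$root/app" ] && [ -d "$root/public" ] \
       && [ -d "$root/vendor/silverstripe/framework" ]; then
    echo "silverstripe-4plus"
  else
    return 1
  fi
}

# Example: fingerprint_long_tail /var/www/html || echo "unknown - dig deeper"
```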

Knowing what you are looking at decides everything that comes next. A Drupal 7 site at end-of-life with a Composer-free filesystem is a rebuild on Drupal 10 or a sideways move to a static front-end with a headless CMS. A Joomla 3.x site is almost always cheaper to port than to upgrade. A WordPress 6.x with thirty active plugins is a hardening job, not a migration. Reading the tree first stops you from quoting the wrong project.

When we run a legacy migration, the fingerprint script above is the first thing we run after the credentials land. On a recent Joomla rebuild, the audit that followed told us inside ten minutes that the template was unsalvageable and that the database carried three abandoned components leaking data. That is the kind of read you want before you quote.

Open a terminal, ssh into the next client server you have access to, and run the script. Whatever it prints decides the next conversation.

Key takeaway

Markup lies, headers lie, but the directory tree on disk never does. Read the folders first; decide the migration plan second.

FAQ

Can I tell a CMS from the URL alone?

Sometimes. Paths like /wp-admin or /administrator leak the engine. But many sites rename or hide admin URLs, so a filesystem check stays the only reliable test.

What if there is no SSH or FTP access?

Probe for known asset paths in the browser (e.g. /wp-content/uploads/, /sites/default/files/). If the host blocks those too, fall back to response headers and the generator meta tag.
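The browser probe can be done from a terminal with curl. The sketch below requests a short list of paths taken from this article and prints the HTTP status for each. Reading a 200 or 403 as "the path probably exists" and a 404 as "probably not" is our rule of thumb, not a guarantee, and probe_cms_paths is a hypothetical name. Only run it against sites you are authorised to test.

```shell
# probe_cms_paths BASE_URL - print "<status> <path>" for a few well-known
# CMS paths. The path list is a small illustrative sample.
probe_cms_paths() {
  local base="${1%/}" path code
  for path in /wp-content/uploads/ /sites/default/files/ /administrator/; do
    # -s silent, -o discard body, -m 10 second timeout, -w print status only
    code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$base$path")
    printf '%s %s\n' "$code" "$path"
  done
}

# Example: probe_cms_paths https://example.com
```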

Is Drupal 7 really dead?

Drupal.org stopped shipping core security patches in early 2025. A few third-party vendors still sell extended support, but the contrib ecosystem has gone cold and most modules no longer get updates.

How do I fingerprint a headless or static site?

A pure static export has no CMS fingerprint on disk. Look for the build pipeline instead: package.json, next.config.js, gatsby-config.js, astro.config.mjs, or a Hugo config file (hugo.toml on newer releases, config.toml on older ones).
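The same first-match approach works for build pipelines. This sketch checks a source checkout for the config files named above; the output labels and the function name static_generator are ours, and the hugo.toml check is an addition for newer Hugo releases. A bare config.toml is a weak signal on its own, so treat a "hugo" result as a hint.

```shell
# static_generator PROJECT_DIR - guess the static site generator from the
# config files present. First match wins; return 1 if nothing matches.
static_generator() {
  local root="${1:-.}"
  if   [ -f "$root/next.config.js" ];   then echo "next"
  elif [ -f "$root/gatsby-config.js" ]; then echo "gatsby"
  elif [ -f "$root/astro.config.mjs" ]; then echo "astro"
  elif [ -f "$root/hugo.toml" ] || [ -f "$root/config.toml" ]; then echo "hugo"
  elif [ -f "$root/package.json" ];     then echo "node-unknown"
  else return 1
  fi
}

# Example: static_generator ~/clients/acme-site
```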

Does the script handle multisite installs?

Partly. WordPress multisite still has wp-admin and wp-includes, so it identifies correctly. Drupal multisite hides per-site config under sites/<domain>/, so check sites/ for more than just default/.
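Checking sites/ for more than default/ can be sketched as a loop over every directory that carries its own settings.php. The function name drupal_sites is hypothetical; the only assumption about Drupal is the sites/<name>/settings.php convention described above.

```shell
# drupal_sites DOCROOT - list every directory under sites/ that has its own
# settings.php. More than one line of output suggests a multisite install.
drupal_sites() {
  local root="${1:-.}" f
  for f in "$root"/sites/*/settings.php; do
    [ -f "$f" ] || continue          # skip the unexpanded glob when nothing matches
    basename "$(dirname "$f")"
  done
}

# Example: drupal_sites /var/www/html
```

On Drupal 7, sites/all/ holds shared modules and themes but no settings.php, so it is left out of the list.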
