Inheriting a PHP 5.6 ERP: 180 procs and a full disk
On a Friday in April the production planner could not log in to the ERP. By Monday neither could the workshop. The backup script had been silent for ten weeks.

The metal shop runs two shifts. The first cuts and welds frame sections from 07:00. The second packs and stages from 15:00. On a Friday in April the planner came in at 06:45, opened the ERP, and got a white screen. By the time the welders had their first coffee, no one in the office could log in either. We were on the call by 10.
Twenty-one people, one ERP, custom-built in 2011 by a freelancer who moved to Adelaide three years later. Nobody at the company had a contract with him. Nobody had ever seen a schema diagram. There was no schema diagram. The two managing directors had a Dropbox folder with three PDFs, all titled Final_handover_v2, and a 7-Zip archive of the codebase from 2015 that no one could remember the password for.
The login page returned a white screen because the session table had grown to 4.2 million rows and the box had run out of memory mid-INSERT. That was the surface symptom. The real problem was older and quieter, and it took us most of a week to map it.
The stack we inherited
The application was PHP 5.6.40 on Debian 8 (jessie). Both have been out of support for years: PHP 5.6 reached end of life on 31 December 2018, and Debian 8 LTS ended in June 2020. The server was reachable on port 80 only because UFW had been disabled in 2019 by someone troubleshooting a printer.
The codebase was 84,000 lines of PHP across 612 files. No autoloader, no namespaces, no dependency manager. Composer was not installed. Three included files (db.php, session.php, helpers.php) defined 240 global functions between them. The MVC was not so much absent as actively refused.
But the PHP was not where the business logic lived. The business logic lived in the database.
One hundred and eighty stored procedures
MySQL 5.6 was running on the same box. The schema had 94 tables and 180 stored procedures. We pulled them out with the obvious query:
SELECT ROUTINE_NAME, ROUTINE_TYPE, CREATED, LAST_ALTERED
FROM information_schema.ROUTINES
WHERE ROUTINE_SCHEMA = 'erp'
ORDER BY LAST_ALTERED DESC;
The newest procedure had been altered in 2019. The oldest dated to 2012. About forty of them shared the prefix sp_calc_ and did pricing math against a costs table that was itself populated by another procedure that read from a CSV imported nightly by yet another procedure. The dependency graph, once we drew it, looked like a plate of capellini.
Procedures called procedures. One of them, sp_order_close_v3, was 1,400 lines of MySQL with a 41-step IF/ELSE ladder and three nested cursors. It handled order completion, invoice generation, stock decrement, commission split, and an email trigger that fired through UDF_SENDMAIL, a user-defined function compiled against MySQL 5.6 that nobody had source code for.
We did not try to read all 180 in the first week. We ranked them by how often they were called. The MySQL general log had been on (probably accidentally) for three years and the file had rolled, but the last 60 days were still on disk. Ten procedures accounted for 92% of calls. Those ten got read first. The other 170 got a one-line comment header (UNREAD, last altered YYYY-MM-DD) and went into a backlog. In a system this old, the stored procedures often are the application; the call-frequency log is the right place to start reading.
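A minimal sketch of that ranking, assuming a general log in which client calls appear as plain CALL statements. The function names rank_calls and top_share are invented for illustration, and the log path in the usage line is a placeholder. The pattern relies on GNU grep's \< word boundary, and calls made from inside other procedures may not show up in the log at all, so treat the counts as a floor.

```shell
#!/usr/bin/env bash
# Rank stored procedures by how often they are CALLed in a MySQL general log.

rank_calls() {   # $1 = path to the general log
    # -o prints only the matched text, -i ignores case, -E enables extended regex.
    grep -oiE '\<CALL[[:space:]]+`?[A-Za-z0-9_]+' "$1" \
        | sed -E 's/^[A-Za-z]+[[:space:]]+`?//' \
        | tr '[:upper:]' '[:lower:]' \
        | sort | uniq -c \
        | sort -k1,1nr -k2,2
}

top_share() {    # $1 = log path, $2 = N; prints the % of calls made by the top N
    rank_calls "$1" | awk -v n="$2" '
        { total += $1; if (NR <= n) top += $1 }
        END { if (total > 0) printf "%.0f\n", 100 * top / total }'
}
```

Something like `rank_calls /path/to/general.log | head -n 10` gives the read queue, and `top_share /path/to/general.log 10` says how much of the traffic that queue covers.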
A session table that pretended to be a session store
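PHP's default session handler, files, writes one file per session under session.save_path (the system temp directory unless configured otherwise). This one did not. session.save_handler was set to user and pointed at a class called SessionDb that wrote to a table called sess_users with this shape: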
CREATE TABLE sess_users (
    id INT AUTO_INCREMENT PRIMARY KEY,
    session_id VARCHAR(128),
    user_id INT,
    payload TEXT,
    created DATETIME,
    last_seen DATETIME
) ENGINE=InnoDB;
No index on session_id. No TTL. No cleanup job. Every page load did a full scan of 4.2 million rows to find one session row, and then wrote a new row. Login was the slowest endpoint in the system because login wrote twice.
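The first fix was a short migration:
-- Assumes session_id values are already unique; ADD UNIQUE fails on duplicates.
ALTER TABLE sess_users
    ADD UNIQUE INDEX idx_session_id (session_id),
    ADD INDEX idx_last_seen (last_seen);
-- Prune every session idle for more than 30 days.
DELETE FROM sess_users WHERE last_seen < NOW() - INTERVAL 30 DAY;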
That deleted 4.18 million rows and brought average login time from 9.2 seconds to 140ms. We added a nightly cleanup procedure to the cron and left the schema otherwise alone. The proper fix, moving sessions to Redis or to a signed cookie, is on the week-four list, not the week-one list. OWASP's Session Management Cheat Sheet is the right reference when you get there; we will be using it.
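A nightly cleanup along those lines might look like the sketch below. It is not the procedure we installed: it swaps in a batched DELETE ... LIMIT loop so that no single transaction grows large, and it assumes the mysql client plus a root-only option file. The names prune_sessions and MYSQL_DEFAULTS and the file path are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical option file holding the MySQL user and password (chmod 600).
MYSQL_DEFAULTS=${MYSQL_DEFAULTS:-/root/.erp-maint.cnf}

prune_sessions() {   # $1 = max idle days, $2 = rows per batch
    local days=${1:-30} batch=${2:-10000} deleted total=0
    while :; do
        # DELETE ... LIMIT keeps each transaction small; ROW_COUNT() reports
        # how many rows the DELETE on the same connection removed.
        deleted=$(mysql --defaults-extra-file="$MYSQL_DEFAULTS" -N -B erp -e "
            DELETE FROM sess_users
            WHERE last_seen < NOW() - INTERVAL ${days} DAY
            LIMIT ${batch};
            SELECT ROW_COUNT();")
        (( deleted > 0 )) || break
        total=$(( total + deleted ))
    done
    echo "$total"   # total rows removed in this run
}
```

A cron script built on this would end with a call such as `prune_sessions 30 10000` and log the number it prints.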
The backup that wrote to a closed door
The cron had one backup job. It ran nightly at 02:30. It looked like this:
#!/bin/bash
DATE=$(date +%Y%m%d)
mysqldump -u root -p'...' erp > /var/backup/erp-$DATE.sql
tar -czf /var/backup/files-$DATE.tar.gz /var/www/erp
find /var/backup -mtime +30 -delete
Four problems, in order of cruelty.
First, no set -e. The mysqldump command had been failing since March because the disk holding /var/backup was full. Bash did not care. The next line ran. The line after that ran. The cron emailed nothing because cron mail had been disabled in 2017 when the SMTP relay changed.
Second, find -mtime +30 -delete works fine, but it was deleting from the same disk that was full because old, half-written backups from 2024 still counted as files. The cleanup could not even finish.
Third, tar -czf exited with code 1 every night because some files in /var/www/erp were owned by www-data and the cron ran as root over an NFS mount that had been remounted read-only at some point. Exit code 1 was logged nowhere.
Fourth, the database password was in plain text in a world-readable cron script. We will not dwell on that one; you can picture it.
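The first problem is easy to reproduce on any machine. In this small demonstration, false stands in for the failing mysqldump, and run_variant is a helper name invented here; it runs a snippet in a child shell and prints the snippet's output followed by its exit status.

```shell
#!/usr/bin/env bash
# The same two commands, with and without strict mode.
lenient='false > /dev/null; echo "next line ran"'
strict='set -euo pipefail; false > /dev/null; echo "next line ran"'

run_variant() {
    # Run the snippet in a child bash; print "<output>|<exit status>".
    local out status=0
    out=$(bash -c "$1" 2>&1) || status=$?
    printf '%s|%s\n' "$out" "$status"
}

run_variant "$lenient"   # prints: next line ran|0
run_variant "$strict"    # prints: |1
```

Without strict mode the script carries on and reports success, which is exactly what cron saw every night.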
The fix was, again, the smallest reasonable thing. New disk mounted at /srv/backup. New script with set -euo pipefail, exit codes checked, output piped to a logfile that rotates, and a healthcheck ping to healthchecks.io at the end. If the ping does not arrive by 03:30, someone gets a text. We use this pattern across legacy hand-offs because the cost of failure is too high to leave to "I'll just check the logs sometimes."
A backup script that does not actively page someone when it fails is not a backup. It is a story you tell yourself about a backup.
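The shape of such a script, as a sketch under stated assumptions rather than the script we deployed: the log path, the option file holding the credentials, and the names backup_erp, log, MYSQL_DEFAULTS and PING_URL are all hypothetical. It adds --routines because mysqldump does not dump stored procedures unless asked, and --single-transaction for a consistent InnoDB snapshot.

```shell
#!/usr/bin/env bash
# -e: exit on a failed command; -u: unset variables are errors;
# pipefail: a pipeline fails if any stage in it fails.
set -euo pipefail

BACKUP_DIR=${BACKUP_DIR:-/srv/backup}
LOG_FILE=${LOG_FILE:-/var/log/erp-backup.log}            # hypothetical path
MYSQL_DEFAULTS=${MYSQL_DEFAULTS:-/root/.erp-backup.cnf}  # hypothetical, chmod 600
PING_URL=${PING_URL:-}                                   # heartbeat URL, if any

log() { printf '%s %s\n' "$(date '+%F %T')" "$*" >> "$LOG_FILE"; }

backup_erp() {
    local stamp dump
    stamp=$(date +%Y%m%d)
    dump="$BACKUP_DIR/erp-$stamp.sql.gz"

    log "dump started"
    # Write to a .partial name first so a failed dump never looks like a good one.
    mysqldump --defaults-extra-file="$MYSQL_DEFAULTS" \
        --single-transaction --routines erp | gzip > "$dump.partial"
    mv "$dump.partial" "$dump"

    log "archiving application files"
    tar -czf "$BACKUP_DIR/files-$stamp.tar.gz" -C /var/www erp

    # Prune only after tonight's backup succeeded.
    find "$BACKUP_DIR" -type f -mtime +30 -delete

    log "backup finished"
    # Reached only if every step above succeeded; a missing ping is the alarm.
    if [[ -n "$PING_URL" ]]; then
        curl -fsS -m 10 --retry 3 -o /dev/null "$PING_URL"
    fi
}

# backup_erp   # uncomment when this is saved as the cron script
```

The design choice is that the ping is the last line: any failure above it exits the script early, the ping never goes out, and the monitoring service does the paging.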
Triage in week one
By the end of the first week we had not migrated anything off PHP 5.6, not refactored any stored procedures, and not introduced any new infrastructure beyond the backup disk, its off-box copy, and a proxy in front of the box. We had:
- Indexed and pruned the session table.
- Stood up a proper backup with off-box copy to S3 and a verified restore (we restored to a staging box and logged in as a real user).
- Put the production box behind Cloudflare so we could rate-limit and shut off the world if we needed to.
- Enabled MySQL slow-query logging at 500ms and pointed it at a logfile we actually read.
- Inventoried the 180 stored procedures with last-altered dates and call frequency, and put the top ten into a read queue.
The ERP stayed up. Production never paused again. That was the deliverable.
Migration over rewrite
The temptation with a system like this is to declare it bankrupt and start over. Sometimes that is right. Usually it is not, because the business logic in those 180 procedures encodes fifteen years of edge cases nobody remembers and nobody will re-discover in a kickoff workshop. Commission gets split 60/40 unless the order ships to Belgium, in which case it is 70/30, unless the customer is on the legacy contract from 2014. That rule lives in a stored procedure. It does not live in anyone's head.
What we proposed for month two through six was modest. Pull the database off the application box onto a managed MySQL 8 instance, with the stored procedures intact. MySQL 8's stored-procedure semantics are close enough to 5.6's that most of them port without changes; the ones that do not usually break on stricter SQL modes (ONLY_FULL_GROUP_BY, NO_ZERO_DATE) and surface immediately on a staging copy.
Wrap the PHP 5.6 application in a thin PHP 8.3 facade that proxies new endpoints while old endpoints still hit the legacy code. Strangler-fig, not big-bang. Every new feature ships on the modern stack. Every old screen stays alive until it is rewritten or retired.
Read and document the top ten procedures first, then the next twenty, then triage the rest. We have been running an agent that chews through procedure source and produces annotated summaries with input/output schemas and observed call sites; it is faster than reading them by hand and the diffs are easy to audit. The model is rarely the interesting part of that work. What matters is the loop around it: what you feed in, what schema you force the output into, and how you verify what comes back before it touches the documentation.
A five-minute audit
Three checks take under an hour and tell you more than any architecture review. Run php -v on the server: if the version starts with 5 or 7, or is 8.0, you are running unsupported software with known CVEs. Read your backup script and verify the last successful restore was within the last 90 days, not the last 900. Query information_schema.ROUTINES and count what is there; if the number surprises you, your application logic is partly in the database, and any migration plan that pretends otherwise will be off by six months.
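The mechanical parts of those checks can be scripted. This is a rough sketch with invented function names; MIN_PHP is an assumption (set it to the oldest branch that php.net's supported-versions page still lists), version_lt needs GNU sort -V, and count_routines assumes the mysql client can already authenticate. The restore check has no shortcut: someone has to restore a backup and log in.

```shell
#!/usr/bin/env bash
# MIN_PHP is an assumption; adjust it to the oldest branch still supported.
MIN_PHP=${MIN_PHP:-8.1}

version_lt() {   # true when version $1 sorts strictly before $2 (GNU sort -V)
    [ "$1" != "$2" ] && [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

check_php() {
    local v
    v=$(php -r 'echo PHP_VERSION;')
    if version_lt "$v" "$MIN_PHP"; then
        echo "php $v: unsupported"
    else
        echo "php $v: ok"
    fi
}

check_backup_script() {   # $1 = path to the backup script
    # Looks for a set line whose short-option group includes e (set -e, set -euo ...).
    if grep -Eq '^[[:space:]]*set[[:space:]]+-[A-Za-z]*e' "$1"; then
        echo "set -e: present"
    else
        echo "set -e: MISSING"
    fi
}

count_routines() {   # $1 = schema name
    mysql -N -B -e "SELECT COUNT(*) FROM information_schema.ROUTINES WHERE ROUTINE_SCHEMA = '$1'"
}
```

Run as `check_php`, `check_backup_script /path/to/backup.sh` and `count_routines erp`, with the path and schema name replaced by your own.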
When we took over the Tilburg ERP, the first week's win was not replacing it. It was buying the business eighteen months of stability to plan a real legacy migration instead of an emergency one.
Open your own backup script tonight and grep for set -e. If it is not there, the war story is already written.
Key takeaway
In a legacy PHP system the business logic usually lives in the database. Read the stored procedures by call frequency, not alphabetically.
FAQ
Is PHP 5.6 still safe to run on a hardened server?
No. PHP 5.6 stopped getting security patches at the end of 2018. Fixes for CVEs in PHP, its core extensions, or the OpenSSL build it links against will not be backported. Network hardening does not fix that.
Should we rewrite a legacy PHP ERP or migrate it?
Migrate first, rewrite by surface area. Business rules in stored procedures encode years of edge cases. A strangler-fig facade lets revenue keep flowing while you replace screens one at a time.
How do you read 180 stored procedures without losing a month?
Sort them by call frequency in the slow-query or general log. Ten usually account for most traffic. Read those first, document inputs and outputs, and triage the rest into a backlog.
What is the cheapest meaningful upgrade for a legacy backup script?
Add set -euo pipefail, check exit codes, write to off-box storage, and ping a heartbeat service that pages a human when the ping does not arrive on time.