Security
Legacy PHP portal audit: scoring chat-agent retrofit risk
A €14M industrial-cleaning company asks if we can put a chat agent on their legacy PHP portal. The answer starts with a four-page audit, not a quote.

A Tuesday morning in May. The operations lead at a €14M industrial-cleaning company in Eindhoven sends us a link to their twelve-year-old PHP customer portal and one line: "Can we put a chat agent on this?"
The login page loads. View-source shows jQuery 1.12.4 minified at the bottom and a PHP session cookie with no SameSite attribute. The login form posts to /login.php. There is no CSRF token anywhere in the DOM. We do not quote yet. We run the audit.
Why we audit before we quote
A chat-agent retrofit on a legacy portal is not a UI project. It is an authorization project with a chat box on top. The agent will eventually call a tool. The tool will eventually hit one of those 12-year-old endpoints. If that endpoint was written when CSRF protection meant "check the Referer header," everything the agent does inherits that trust model.
We have audited 31 of these portals in the last 14 months. Twenty-three were PHP 7.4 or earlier. Eighteen still shipped jQuery 1.x. Seven had no CSRF token on any state-changing endpoint. Four had a session-fixation hole on the SSO bridge. We do not quote a retrofit before we know which bucket the codebase is in.
PHP 7.4 reached end of life in November 2022. That alone is not a blocker. Plenty of profitable Dutch SMEs run on EOL PHP and will keep doing so until the box dies. It does change how we score everything else, because no upstream security patch is coming.
The portal we keep seeing
The shape is almost identical every time. LDAP or Active Directory as the source of truth for employee logins. A separate klantportaal table in MySQL for customer logins, with the klantnummer as the join key. An SSO bridge from the marketing site (often WordPress) that sets a PHPSESSID before redirecting. Forms built with jQuery 1.12 $.ajax() calls, no token, no SameSite cookie. A handful of endpoints that read or change customer data: order-status, invoice-download, address-change, password-reset, sometimes a quote-request.
The LDAP layer is the part that bites first. In roughly half the audits the LDAP bind happens inside the same PHP file that renders the login form, so a single inclusion bug exposes the service-account credentials to every authenticated user. Before a chat agent gets anywhere near a tool call, the bind has to move behind an internal service, or the agent's tool gains the LDAP service account's reach by accident the first time someone smuggles a path into a template parameter.
The chat agent the client wants always touches three of these endpoints. The audit decides which three.
CSRF-token hygiene, scored 0 to 3
We do not give a portal a yes-or-no. We give it a score per state-changing endpoint:
- 0. No token, no SameSite, accepts GET for state changes.
- 1. No token, Lax SameSite on PHPSESSID, accepts POST only.
- 2. Per-session token in a hidden input, never rotates.
- 3. Per-request token, rotated on issue, validated server-side with hash_equals.
A score of 0 or 1 means the endpoint cannot be exposed to a tool-call wrapper without a sidecar. A score of 2 is workable if the agent runs server-side and we proxy each tool call through a fresh authenticated request. A score of 3 is what we quote against without a security supplement. The OWASP CSRF prevention cheat sheet is the reference we hand the client's engineer when they push back on the scoring.
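The rubric is small enough to write down as a function, which keeps two auditors from scoring the same endpoint differently. What follows is a minimal sketch, not our audit tooling: the EndpointObservation fields and both function names are hypothetical, and the tie-break for cases the rubric does not name (a token on an endpoint that still accepts GET) is our own conservative assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointObservation:
    """What the auditor saw for one state-changing endpoint (hypothetical fields)."""
    has_token: bool            # a CSRF token travels with the request
    token_per_request: bool    # the token rotates every time it is issued
    constant_time_check: bool  # validated server-side with hash_equals
    samesite: str              # "", "Lax", "Strict" or "None" on PHPSESSID
    accepts_get: bool          # state changes succeed over GET


def csrf_score(obs: EndpointObservation) -> int:
    """Map an observation onto the 0-3 rubric, erring toward the lower score."""
    # Assumption: a mutation that works over GET scores 0 whatever else is present.
    if obs.accepts_get:
        return 0
    if obs.has_token:
        if obs.token_per_request and obs.constant_time_check:
            return 3
        return 2
    if obs.samesite in ("Lax", "Strict"):
        return 1
    return 0


def needs_sidecar(score: int) -> bool:
    """Scores 0 and 1 cannot sit behind a tool-call wrapper without a sidecar."""
    return score <= 1
```

One row of the scoring table is then one EndpointObservation plus the number csrf_score returns for it.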
The first pass is grep. We are not trying to be clever, we are trying to count:
cd /var/www/portal
# How many POST forms exist in the templates (-i also catches method="POST")
grep -rEin 'method="post"' templates/ | wc -l
# How many template lines mention any kind of token
grep -rEin '(csrf|_token|authenticity)' templates/ | wc -l
# Endpoints that still read $_GET in code that looks like a mutation
grep -rEn '\$_GET\[' src/ | grep -iE '(save|update|delete|set)' | head -50
If the second number is meaningfully smaller than the first, the score across the board is 0 or 1 and the conversation changes. We are no longer scoping a chat agent. We are scoping a CSRF retrofit with a chat agent on the other side of it.
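What that retrofit has to deliver at score 3 fits in a few lines. The sketch below is a Python stand-in for the PHP side, not the portal's code: the session is modelled as a plain dict, the key name csrf_token and both function names are hypothetical, and hmac.compare_digest plays the role hash_equals plays in PHP.

```python
import hmac
import secrets


def issue_token(session: dict) -> str:
    """Create a fresh token for the next form render and store it in the session."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token  # overwrites any earlier token: rotated on issue
    return token


def validate_token(session: dict, submitted: str) -> bool:
    """Check the submitted token once, in constant time, then discard it."""
    expected = session.pop("csrf_token", None)  # single use: gone after one check
    if expected is None or not submitted:
        return False
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

Popping the token before comparing means a replayed request fails even when the first attempt succeeded, which is the difference between a score of 2 and a score of 3.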
Session-fixation on the SSO bridge
The classic failure mode is simple. The marketing site (WordPress) issues a PHPSESSID to an anonymous visitor. The visitor clicks "Klantportaal." The portal accepts the existing session ID. The SSO callback writes the customer's identity into that session. Anyone who can persuade a customer to follow a link with a pre-set PHPSESSID is now logged in as that customer.
The fix is one line at the top of the SSO callback, before any identity is written:
<?php
// /sso/callback.php
session_start();
session_regenerate_id(true); // delete the old session file
// only now do we write the identity
$_SESSION['klantnummer'] = $validatedUser['klantnummer'];
$_SESSION['logged_in_at'] = time();
The audit step is a one-liner: grep -rn "session_regenerate_id" sso/. If it returns nothing, the bridge is fixable in five minutes and we add it to the scope. The OWASP session management cheat sheet covers the rest of the bridge: cookie flags, idle timeout, absolute timeout.
If the SSO bridge spans two hosts (portal.client.nl and www.client.nl), session_regenerate_id alone is not enough. Regenerating the ID protects the portal's own session, but it does nothing about cookies the marketing site sets. Check that the marketing site cannot write a cookie the portal will read, either through a cookie scoped to the shared parent domain or through a cross-subdomain SSO library.
Picking three flows that survive a tool-call wrapper
The wrapper question is not "can the LLM call this endpoint." It is "can it call this endpoint without a klantnummer ending up in a prompt log."
A klantnummer in a prompt log is a GDPR finding. Hosted providers typically keep prompts for up to 30 days for abuse review, even with the no-train flag set; check the retention terms of the one you use. That window is enough to make it a notifiable breach if the klantnummer is bound to other PII in the same prompt body (name, address, order total). Self-hosting an open-weights model removes the retention question but adds an operational one (patching, monitoring, scaling, jailbreak triage) that almost no SME has the headcount for. Either way, the security model has to assume the prompt body is logged somewhere, so you design redaction into the wrapper.
We score each candidate flow on three axes:
- Identifier flow. Does the klantnummer enter the prompt body, or only the tool arguments? Tool arguments are not logged in the prompt; the body is.
- Determinism. Can the tool be expressed as a typed schema with three fields, or does the agent have to parse free text from the customer?
- Reversibility. If the tool call is wrong, can the customer or a human operator undo it inside one business day?
The three flows that almost always survive:
- Order-status lookup. Read-only. Deterministic. Klantnummer travels as a typed tool argument resolved from the session. The agent never sees the raw value in its context.
- Invoice PDF retrieval. Same shape. The tool returns a signed short-lived URL; the agent forwards the link, never the PDF body.
- Address-change with explicit confirmation. Mutating but reversible. The agent proposes the new address as a tool argument, the portal sends a confirmation email with a one-click revert link, the write happens after click.
The flows that almost never survive a first pass:
- Password reset. Irreversible if hijacked, and the recovery path usually leans on an email address the agent can also see.
- Order placement. Financial, with no good rollback on B2B terms.
- Anything that drafts an email to the customer. The email body becomes the next user turn in the prompt and the klantnummer leaks through that route.
Klantnummer redaction at the wrapper
We do not trust the model to redact. We do it in the wrapper, before the prompt is assembled, and we verify with a test that runs in CI:
import re

# Klantnummers in this portal follow K + 7 digits.
# Tune the pattern to the actual format on the client's database.
KLANTNUMMER = re.compile(r"\bK\d{7}\b")


def scrub_for_prompt(text: str) -> str:
    """Replace klantnummers in any free text that touches the prompt."""
    return KLANTNUMMER.sub("[KLANT]", text)


def test_scrub():
    assert scrub_for_prompt("Hoi, ik ben K1234567.") == "Hoi, ik ben [KLANT]."
    assert scrub_for_prompt("Mijn nr K1234567 en K7654321.") == \
        "Mijn nr [KLANT] en [KLANT]."
    assert scrub_for_prompt("nothing here") == "nothing here"
The actual klantnummer travels as a tool argument keyed by session id, not by prompt content. The agent sees [KLANT] in its context window. The tool resolves the identifier server-side from the session. If the redaction regex misses a format, the test fails before the wrapper ships.
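The tool side of that arrangement can be sketched as a handler that takes the session id from the wrapper and only the order reference from the model. Everything named here is hypothetical: SESSIONS and ORDERS stand in for the PHP session store and the orders table, and OrderStatusArgs is a one-field example of the typed schema, not a real client's contract.

```python
from dataclasses import dataclass

# Hypothetical server-side stores; in the portal these would be the PHP
# session and the orders table.
SESSIONS = {"sess-abc": {"klantnummer": "K1234567"}}
ORDERS = {("K1234567", "ORD-1001"): "shipped"}


@dataclass(frozen=True)
class OrderStatusArgs:
    """The only field the model is allowed to fill in."""
    order_reference: str


def order_status_tool(session_id: str, args: OrderStatusArgs) -> dict:
    """Resolve the customer from the session, never from model output."""
    session = SESSIONS.get(session_id)
    if session is None:
        return {"ok": False, "error": "not_authenticated"}
    klantnummer = session["klantnummer"]  # stays on the server
    status = ORDERS.get((klantnummer, args.order_reference))
    if status is None:
        return {"ok": False, "error": "order_not_found"}
    # The result that goes back into the context carries no klantnummer.
    return {"ok": True, "order_reference": args.order_reference, "status": status}
```

Because the lookup key is built from the session, a customer who asks about someone else's order reference gets order_not_found rather than a cross-customer read.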
Redaction at the wrapper is only half the job. The application logs are the other half. PHP error logs, the web-server access log, and any APM stack catch the klantnummer the moment it crosses a boundary, and most of those streams ship off-host to a SaaS aggregator. Either pre-process the log line at the boundary, or scope the aggregator's retention so the klantnummer ages out faster than the 72-hour GDPR notification window. Whichever you pick, write a test for it the same way you wrote one for the prompt scrubber, or the next deploy that adds a debug log will silently undo the work.
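For a wrapper written in Python, pre-processing the log line at the boundary can be a filter on the log handler, with a test next to it. This is a sketch under assumptions: the logger name portal.wrapper and the build_logger helper are hypothetical, the pattern is the same assumed K + 7 digits format, and the filter covers formatted messages only, not exception tracebacks or logs written by PHP and the web server.

```python
import io
import logging
import re

KLANTNUMMER = re.compile(r"\bK\d{7}\b")  # same assumed format as the prompt scrubber


class KlantnummerFilter(logging.Filter):
    """Rewrite each record so the klantnummer never reaches a handler's output."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its args first, then scrub the final string.
        record.msg = KLANTNUMMER.sub("[KLANT]", record.getMessage())
        record.args = None
        return True  # keep the record, only its text has changed


def build_logger(stream) -> logging.Logger:
    """Return a logger whose only handler scrubs before writing to stream."""
    logger = logging.getLogger("portal.wrapper")  # hypothetical logger name
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.addFilter(KlantnummerFilter())
    logger.addHandler(handler)
    return logger


def test_log_scrub():
    stream = io.StringIO()
    log = build_logger(stream)
    log.info("address change for %s", "K1234567")
    assert "K1234567" not in stream.getvalue()
    assert "[KLANT]" in stream.getvalue()
```

Attaching the filter to the handler rather than to one logger means a debug line added in a later deploy goes through the same scrub, as long as it is written through that handler.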
The 30-minute version you can run today
If you do not want to hire an auditor, here is the shortest pass that still tells you something useful.
Open the portal in Chrome. View source. Find the jQuery version near the bottom. If it is 1.x or 2.x, assume every form is one stored-XSS bug away from a credential leak. Log in. Open the Network tab. Submit any form that changes state. Look at the request payload. If there is no token field, the score on that endpoint is 0 or 1.
Inspect the session cookie in DevTools. If SameSite is empty or set to None without Secure, score the portal at 0 across the board until proven otherwise. Click the SSO link from the marketing site in an incognito window. Note the PHPSESSID value before login. Log in. Check whether the PHPSESSID changed. If it did not, the SSO bridge is fixable in one line of PHP but it is broken today.
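If you would rather script the two cookie checks than read them off DevTools, copy the Set-Cookie header values from before and after login and compare them. The sketch below uses Python's standard http.cookies parser; the function names are hypothetical, and it assumes the session cookie is called PHPSESSID as it is throughout this audit.

```python
from http.cookies import SimpleCookie
from typing import Optional


def session_id_from(set_cookie_header: str, name: str = "PHPSESSID") -> Optional[str]:
    """Pull the session ID out of one Set-Cookie header value, if present."""
    cookie = SimpleCookie()
    cookie.load(set_cookie_header)
    morsel = cookie.get(name)
    return morsel.value if morsel is not None else None


def session_rotated(before_login: str, after_login: str) -> bool:
    """True only if login produced a new, different session ID."""
    before = session_id_from(before_login)
    after = session_id_from(after_login)
    # No new cookie after login means the pre-login ID is still in use.
    return before is not None and after is not None and before != after


def samesite_is_weak(set_cookie_header: str, name: str = "PHPSESSID") -> bool:
    """True if SameSite is missing, or set to None without Secure."""
    cookie = SimpleCookie()
    cookie.load(set_cookie_header)
    morsel = cookie.get(name)
    if morsel is None:
        raise ValueError(f"no {name} cookie in header")
    samesite = morsel["samesite"].lower()
    secure = bool(morsel["secure"])
    return samesite == "" or (samesite == "none" and not secure)
```

A False from session_rotated is the scripted version of "the PHPSESSID did not change": the bridge needs the one-line fix before anything else is scoped.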
List every endpoint the agent might call and note which ones mutate state. Mark the read-only ones in green. Those are the safest candidates for the first three flows. Pick the one that is most boring (almost always order status). Build that one first.
This takes 30 minutes. It is not a substitute for a code audit. It is a substitute for quoting work you cannot deliver safely.
What we hand to the client at the end
One PDF, four pages. Page one: the scoring table, one row per endpoint, with the score and the gap. Page two: the SSO bridge findings, with the one-line fix and the test that proves it. Page three: the three flows we recommend wrapping, with the redaction strategy and a small architecture diagram. Page four: the price for the security retrofit and the price for the chat agent, separately, so the client can defer either half.
The split matters. Half the clients we audit decide to do the security retrofit first and add the agent six months later. Half decide to do both at once. We have never had a client decide to skip the retrofit, because the audit hands them the evidence in their own language.
When we built the order-status agent for a Brabant industrial-cleaning client last spring, we found that the legacy /orders.php endpoint still accepted GET requests with the klantnummer in the query string. We solved it with a small Symfony bridge in front of the legacy code that issued per-request tokens and stripped the klantnummer from any log line before it left the application server. The AI agent never saw a raw klantnummer in its prompt, and the legacy PHP did not have to change. Two endpoints rewrapped, one cookie flag flipped, and the retrofit shipped in the same sprint as the agent.
The smallest thing you can do today: open your portal in Chrome, submit one form, and search the request body for the string token. If it is not there, the answer to "can we put a chat agent on this" is not no. It is "not until the form is fixed."
Key takeaway
If your CSRF score is below 2 and your SSO callback does not regenerate the session ID, the chat agent is the smallest of your problems.
FAQ
When should we audit a legacy portal before adding a chat agent?
Always before quoting. The agent's tool calls inherit the endpoint's trust model. If CSRF and session handling are weak, the agent's authorization is weak too, and no amount of prompt engineering will fix that.
Why does PHP 7.4 end-of-life matter for a chat-agent retrofit?
It does not block the retrofit, but no upstream security patch is coming. That changes how strictly we score CSRF tokens and session handling, because the surrounding code cannot be assumed safe by default.
How do you keep a klantnummer out of an LLM prompt log?
Pass it as a tool argument, never inside the prompt body. Redact the same identifier in any free text using a regex, and verify the redaction in CI before the wrapper ships to production.
Which three flows usually survive a tool-call wrapper?
Order-status lookup, invoice PDF retrieval, and address-change with an explicit confirmation step. They are read-only or reversible, and the identifier never has to enter the prompt body.
What is the cheapest way to fix session fixation on a legacy SSO bridge?
Add session_regenerate_id(true) at the top of the SSO callback, before any identity is written into the session. It is one line of PHP and one grep to verify it shipped.