
Cloudflare WAF audit for chat agents: the pre-launch checklist

We watched two chat pilots quietly bleed traffic through Cloudflare's managed challenge. Here is the WAF and bot-management audit we now run before a single message reaches production.

Jacob Molkenboer · Founder · A Brand New Company · 10 Jun 2026 · 7 min read


The 14% we couldn't see

It was a Tuesday afternoon when a client's marketing lead pinged us. The chat widget on their checkout page was performing fine in tests. Conversations from internal QA worked. The pilot dashboard showed steady volume. But checkout conversion was down a hair and they couldn't explain why.

We pulled the Cloudflare analytics. Out of around 11,400 distinct visitors that day, 1,610 had been silently issued a managed challenge before the chat widget ever loaded. Most cleared it. Some did not. The ones who failed it never saw the widget. They never knew it existed. From the chat-agent dashboard's point of view, everything was fine.

This was the second pilot in two months where we caught the same pattern. Different industries, different stacks, same root cause: a WAF posture tuned for the legacy site was now silently filtering the chat path. We wrote down what we should have checked before turn-on. This is that list.

What the audit actually covers

Cloudflare is not one product. It is a stack of overlapping security layers, each of which can interfere with a chat agent in a different way. The audit walks through six of them, in order, before we point a single token at production traffic.

Bot Fight Mode and its bigger sibling

The first thing we look at is whether Bot Fight Mode or Super Bot Fight Mode is on. Bot Fight Mode is the free tier's blunt instrument. It challenges anything that looks like a likely automated connection. The problem is that this bucket includes a lot of legitimate things: server-to-server webhooks from your own backend, streamed responses from an LLM provider you proxy through Workers, the chat widget's own fetch to a stats endpoint.

Super Bot Fight Mode (Pro plans and up) adds knobs. You get separate settings for Definitely automated, Likely automated (on Business plans and up), Verified bots, and Static resource protection. For a chat agent, we usually set the first to Block, the second to Managed Challenge only on auth-adjacent paths, and we explicitly allow our own user agent on the streaming endpoint.

If Bot Management proper is enabled (the Enterprise feature with the 1-99 score), we sample the actual cf.bot_management.score distribution against the chat endpoint over a few hours of real traffic before going live. Cloudflare's docs describe the score, but they don't tell you what your particular widget looks like to it.
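A minimal sketch of that sampling step, assuming you have already exported the scores for requests to the chat endpoint (from Logpush or the dashboard) into a plain list. The function name and the sample values are hypothetical; the bands follow Cloudflare's documented reading of the scale (1 automated, 2 to 29 likely automated, 30 to 99 likely human, 0 when no score was computed).

```python
from collections import Counter


def bucket_bot_scores(scores):
    """Return the percentage of requests falling in each bot-score band."""
    counts = Counter()
    for score in scores:
        if not 1 <= score <= 99:
            counts["unscored"] += 1          # e.g. 0 when no score was computed
        elif score == 1:
            counts["automated"] += 1
        elif score < 30:
            counts["likely_automated"] += 1
        else:
            counts["likely_human"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {band: round(100 * n / total, 1) for band, n in counts.items()}


# Hypothetical sample, not real traffic.
sample = [1, 12, 45, 88, 97, 29, 30, 64, 0, 91]
print(bucket_bot_scores(sample))
```

If a large share of real widget traffic lands in the likely-automated band, any rule keyed on a score threshold needs revisiting before launch.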

WebSockets and Server-Sent Events

Most modern chat agents use either WebSockets or SSE to stream tokens. Cloudflare supports both, but the defaults bite.

For WebSockets, the upgrade request can be subject to Browser Integrity Check, which inspects headers before letting the upgrade through. If your widget connects from a slightly older mobile browser, the check can fire and the connection gets a 403. The widget then falls back to polling, which kills the streaming feel and triples your token cost per session.
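To tell a healthy upgrade from one stopped at the edge, it helps to classify the handshake response rather than eyeball it. The sketch below assumes you already captured the status code and response headers from a test client. The Sec-WebSocket-Accept calculation follows RFC 6455; treating a 403, 429, or 503 with a Server: cloudflare header as "blocked at the edge" is our own heuristic, not a Cloudflare API.

```python
import base64
import hashlib

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed value from RFC 6455


def expected_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value a server must return."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def classify_upgrade(status: int, headers: dict, sec_websocket_key: str) -> str:
    """Classify a captured handshake response (heuristic, see lead-in)."""
    lowered = {k.lower(): v for k, v in headers.items()}
    if status == 101:
        if lowered.get("sec-websocket-accept") == expected_accept(sec_websocket_key):
            return "upgraded"
        return "bad-handshake"
    # Assumption: an error status served by Cloudflare itself means the
    # request was stopped before it reached the origin.
    if status in (403, 429, 503) and "cloudflare" in lowered.get("server", "").lower():
        return "blocked-at-edge"
    return "rejected"


key = "dGhlIHNhbXBsZSBub25jZQ=="  # sample key from RFC 6455
print(classify_upgrade(403, {"Server": "cloudflare"}, key))
```

Running the same classification from each test browser and from curl makes it obvious when only one client class is being turned away.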

For SSE, the issue is buffering. If Cloudflare's edge, or a reverse proxy in front of your origin, holds the response until it has gathered a complete chunk, tokens arrive in clumps every few seconds instead of one at a time. The fix is to set Cache-Control: no-cache, no-transform and X-Accel-Buffering: no on the response (the second header is an nginx convention, so it matters most when nginx sits in front of your origin), then verify with a curl from outside Cloudflare's network.

curl -N -H "Accept: text/event-stream" \
  -H "Authorization: Bearer $TOKEN" \
  https://chat.example.com/api/stream

If tokens arrive one per line as they're produced, you're clear. If they arrive in batches, something on the path is buffering.
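Judging "batches" by eye gets unreliable on fast models, so we sometimes record the arrival time of each event and measure the clumping. This is a rough sketch under stated assumptions: arrival_times is a list of seconds you recorded yourself, and both the 10 ms gap and the 50% threshold are arbitrary starting points, not Cloudflare numbers.

```python
def clump_fraction(arrival_times, burst_gap=0.01):
    """Share of events that arrived within burst_gap seconds of the previous one."""
    if len(arrival_times) < 2:
        return 0.0
    gaps = [later - earlier for earlier, later in zip(arrival_times, arrival_times[1:])]
    clumped = sum(1 for gap in gaps if gap < burst_gap)
    return clumped / len(gaps)


def looks_buffered(arrival_times, burst_gap=0.01, threshold=0.5):
    """True when most events arrive back to back, which suggests buffering."""
    return clump_fraction(arrival_times, burst_gap) > threshold


# Hypothetical recordings: one steady stream, one delivered in two clumps.
steady = [0.00, 0.08, 0.15, 0.24, 0.31, 0.40]
batched = [2.000, 2.001, 2.002, 4.000, 4.001, 4.002]
print(looks_buffered(steady), looks_buffered(batched))
```

A model that genuinely emits tokens faster than the gap will trip this check too, so compare against a run that bypasses Cloudflare before blaming the edge.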

Rate limiting rules

Rate limiting is the second-biggest source of silent failures. The default rules a Cloudflare account ships with are tuned for HTTP page loads, not for chat sessions where a single user might fire 20 small POST requests in a minute.

We look at three things. First, whether there are account-level rate-limiting rules covering /api/* or wherever your chat namespace lives. Second, what the threshold and the window are. Third, what the rule does on a hit: log, challenge, or block.

A chat agent that emits a typing indicator, sends a message, polls for tool-call results, then asks for a follow-up can easily breach a thirty-requests-per-minute-per-IP rule from a corporate NAT where forty employees share a single egress. The fix is usually to scope the rule by session cookie instead of IP, or to exclude the chat path from the global rule and write a chat-specific one with a higher ceiling.

# Cloudflare rate limiting rule expression
(http.request.uri.path matches "^/api/chat/"
 and not cf.client.bot)
# Threshold: 120 requests per 60s, per session cookie
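To see why the counting key matters, here is a small simulation of the corporate-NAT case. It is a simplified sliding-window counter of our own, not Cloudflare's implementation (which also applies a mitigation timeout once a rule trips), and the IP address, session names, and traffic shape are made up for illustration.

```python
from collections import defaultdict, deque


def blocked_requests(events, key, limit, window=60.0):
    """Count requests rejected by a `limit` per `window` seconds rule.

    events: (timestamp_seconds, ip, session_id) tuples, sorted by time.
    key: "ip" or "session", the characteristic the rule counts by.
    """
    recent = defaultdict(deque)
    blocked = 0
    for ts, ip, session in events:
        bucket = recent[ip if key == "ip" else session]
        # Drop requests that have aged out of the window.
        while bucket and ts - bucket[0] >= window:
            bucket.popleft()
        if len(bucket) >= limit:
            blocked += 1
        else:
            bucket.append(ts)
    return blocked


# Forty sessions behind one NAT address, each sending 5 requests in under a minute.
events = sorted(
    (s * 0.1 + r * 10.0, "203.0.113.7", f"session-{s}")
    for s in range(40)
    for r in range(5)
)
print(blocked_requests(events, "ip", limit=30))        # thirty per minute per IP
print(blocked_requests(events, "session", limit=120))  # 120 per minute per session
```

In this toy run the per-IP rule rejects most of the office's traffic while the per-session rule rejects none, even though no single user did anything unusual.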

Managed challenges on the widget origin

The widget itself usually loads from a separate origin: a subdomain like chat.example.com, a Cloudflare Pages deployment, or your vendor's CDN. We check the WAF rules on whichever origin the widget script lives at, separately from the main site.

This is where the 14% in the story above came from. The chat widget was served from a chat. subdomain with a more aggressive WAF posture than the main site. Anyone on a stale mobile browser, a tracking-protection extension, or a corporate VPN that strips JavaScript got a managed challenge before the widget loaded. They closed the tab. The dashboard showed nothing because the widget never registered the visit.

Warning

A managed challenge is invisible to your chat analytics. If your chat dashboard counts sessions only when the widget initialises, you cannot see the users who were challenged before the widget ever loaded. Always cross-reference Cloudflare's Security Events against widget session counts for the same window.
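The cross-reference itself is simple arithmetic. A minimal sketch, using the visitor and challenge counts from the pilot described earlier; the function names are ours, and the count of solved challenges would have to come from your own Security Events export.

```python
def challenge_share(challenged: int, visitors: int) -> float:
    """Percentage of visitors who were issued a challenge."""
    if visitors == 0:
        return 0.0
    return round(100 * challenged / visitors, 1)


def unseen_visitors(challenged: int, solved: int) -> int:
    """Visitors who were challenged and never solved it, so no widget session exists."""
    return max(challenged - solved, 0)


# 1,610 challenged out of roughly 11,400 distinct visitors.
print(challenge_share(1610, 11400))
```

The share tells you how many people were exposed to the challenge; unseen_visitors is the part your chat dashboard can never report on its own.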

Cache rules and the POST trap

Cloudflare will not cache POST requests by default, but a Cache Rule can make it cache the response to a GET on the same path as a POST endpoint. If your widget calls GET /api/chat/session to bootstrap and then POST /api/chat/session to send a message, an aggressive Cache Rule can return a stale bootstrap to a new user, including another session's token.

We check every Cache Rule, Page Rule, and Tiered Cache configuration touching the chat namespace. Anything that says Cache Everything on a path that handles auth or session state gets rewritten or removed before launch.
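Reading the rules is half the check; the other half is looking at what the edge actually returns for the bootstrap request. The sketch below inspects captured response headers. CF-Cache-Status is Cloudflare's real response header, but the function name and the choice of which statuses count as "served from cache" are our own reading, and the Cache-Control check is only a second line of defence because an edge rule can override origin headers.

```python
SERVED_FROM_CACHE = {"HIT", "STALE", "UPDATING", "REVALIDATED"}


def bootstrap_cache_problems(headers: dict) -> list:
    """Return a list of reasons the bootstrap response looks cacheable or cached."""
    lowered = {k.lower(): v for k, v in headers.items()}
    problems = []

    status = lowered.get("cf-cache-status", "").upper()
    if status in SERVED_FROM_CACHE:
        problems.append(f"served from edge cache (CF-Cache-Status: {status})")

    directives = {d.strip().lower() for d in lowered.get("cache-control", "").split(",")}
    if not directives & {"no-store", "private"}:
        problems.append("Cache-Control lacks no-store or private")

    return problems


# Hypothetical captured headers from GET /api/chat/session.
print(bootstrap_cache_problems({"CF-Cache-Status": "HIT", "Cache-Control": "public, max-age=60"}))
```

An empty list on two consecutive requests from different browsers is the result we want before launch.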

We also check Transform Rules. If the team has a "remove sensitive headers" rule on the response side, it might be stripping X-Accel-Buffering: no and quietly bringing back the SSE buffering problem above.
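A quick way to catch a rule like that is to diff the headers the origin sets against the headers the client receives. A minimal sketch, assuming you captured the response headers at the client; the helper name is hypothetical and the required set mirrors the two headers recommended for SSE earlier in this article.

```python
REQUIRED = {
    "cache-control": {"no-cache", "no-transform"},
    "x-accel-buffering": {"no"},
}


def missing_stream_directives(headers: dict) -> list:
    """List the streaming directives that did not survive to the client."""
    lowered = {k.lower(): v for k, v in headers.items()}
    missing = []
    for name, wanted in REQUIRED.items():
        present = {part.strip().lower() for part in lowered.get(name, "").split(",")}
        for directive in sorted(wanted - present):
            missing.append(f"{name}: {directive}")
    return missing


# Hypothetical client-side capture where a Transform Rule removed one header.
print(missing_stream_directives({"Cache-Control": "no-cache, no-transform"}))
```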

Logs you can actually read

The last step is verifying that, once traffic is live, we can see what is happening. That means turning on Logpush (or at minimum confirming the team has access to Security Events for the relevant zone), filtering by the chat hostname, and setting up an alert on the ratio of challenges to clean requests.

The rule we use: if challenge volume on the chat path goes above two percent of total requests for fifteen minutes, page someone. Below that threshold we treat it as background noise. Above it, a rule has gone wrong somewhere.
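One way to express that rule in code, assuming per-minute counts of challenged and total requests on the chat path. Reading "for fifteen minutes" as fifteen consecutive one-minute buckets above the threshold is our interpretation; the function name and the input shape are hypothetical.

```python
def should_page(minutes, threshold=0.02, sustain=15):
    """Return True once the challenge ratio stays above threshold for `sustain` minutes.

    minutes: list of (challenged, total) tuples, one per minute, oldest first.
    """
    streak = 0
    for challenged, total in minutes:
        ratio = challenged / total if total else 0.0
        streak = streak + 1 if ratio > threshold else 0
        if streak >= sustain:
            return True
    return False


# Hypothetical window: fifteen straight minutes at 3% challenged.
print(should_page([(3, 100)] * 15))
```

A single quiet minute resets the streak, which keeps short spikes from paging anyone.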

The pre-launch checklist itself

We run this as a literal checklist on a shared doc, ticked off by name before flipping DNS or pointing the widget at the live endpoint.

Bot Fight Mode and Super Bot Fight Mode posture documented and tested against the chat path.
WebSocket upgrade verified from at least two browsers (one mobile Safari, one desktop Firefox) and one curl.
SSE endpoint streams unbuffered, confirmed with curl -N from outside Cloudflare.
Rate limiting rules cover the chat path explicitly, scoped by session cookie, with thresholds matched to the agent's traffic shape.
Managed challenge volume on the widget origin measured over a 24-hour window before launch, with the failure rate from challenge-issued sessions estimated.
Cache rules audited, with no Cache Everything rule touching the chat namespace.
Logpush or Security Events access confirmed, with an alert on challenge ratio.

After launch, the 48-hour watch

The audit is not the end of it. The first 48 hours after pointing live traffic at the agent are when you find the rules that looked fine in staging and break in production. We keep someone on the chat dashboard and Cloudflare Security Events in parallel for those two days. Roughly half the time, we end up adjusting a rule we thought was safe.

The two patterns we see most often: a Verified bots allowance that turns out not to cover your LLM provider's IPs (so the provider's first webhook call gets blocked), and a managed challenge issued to authenticated users from a specific region because their IP block has a low Cloudflare reputation score. Both are fixable in five minutes, once you can see them.
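For the first pattern, a coverage check against your allow list is cheap to run before launch. A small sketch using the standard library's ipaddress module; the addresses are documentation ranges, and in practice the source IPs would come from whatever your provider publishes or from your own logs.

```python
import ipaddress


def uncovered_sources(source_ips, allowed_cidrs):
    """Return the source IPs that no allowed CIDR range contains."""
    networks = [ipaddress.ip_network(cidr) for cidr in allowed_cidrs]
    return [
        ip
        for ip in source_ips
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    ]


# Hypothetical allow list and observed webhook sources.
allowed = ["192.0.2.0/24"]
observed = ["192.0.2.10", "198.51.100.5"]
print(uncovered_sources(observed, allowed))
```

Anything the function returns is a source that will hit your bot rules like any other unknown client.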

The smallest thing you can do today

If you have a chat widget already in production behind Cloudflare and you have not done this audit, open Security Events for your widget hostname, filter to the last 24 hours, and check the ratio of managed challenges to clean requests on the path the widget loads from. If it is above two percent, you are leaving real conversations on the table.

When we wired the chat agent for a Dutch retail client through Cloudflare earlier this year, the thing we ran into was a Bot Fight Mode setting the previous agency had left on for a year. We solved it by moving the chat namespace to its own subdomain with a clean WAF posture and re-enabling protections one rule at a time. It is the kind of work that sits under our AI agents practice.

Key takeaway

A Cloudflare WAF posture tuned for your legacy site will silently filter your chat agent. Audit bot rules, WebSocket upgrades, and challenge ratios before launch, not after.

FAQ

Why does Cloudflare's managed challenge break chat widgets?

The challenge runs before the widget script loads and asks the browser to solve a JavaScript puzzle. Older mobile browsers, tracking-protection extensions, and corporate VPNs that strip JavaScript can fail it silently.

Can I keep Bot Fight Mode on with a chat agent?

Yes, but only as Super Bot Fight Mode, with explicit allow rules for your chat path and your LLM provider's webhook IPs. The free tier's Bot Fight Mode is too blunt for streaming chat.

How do I test if SSE responses are being buffered?

Run curl with the -N flag and an Accept: text/event-stream header against the streaming endpoint from outside Cloudflare's network. Tokens should arrive line by line as produced.

What is a safe rate limit for a chat endpoint?

Scope by session cookie, not IP. Most agents need 90 to 150 requests per minute per session to handle typing indicators, tool calls, and follow-ups without false positives on corporate NAT.
