
GA4 attribution audit: the checklist we run before any agent

Three attribution pilots in a row produced confidently wrong revenue numbers. The data was bent before the agent ever saw it. Here's the audit we now run first.

Jacob Molkenboer · Founder · A Brand New Company · 9 Jun 2026 · 9 min


Week 11 of a marketing-attribution pilot. The ops lead opens the quarterly summary on the projector. The agent reports LinkedIn drove €312k of pipeline, paid search €188k, organic €74k. The finance director pulls the invoice ledger. Real numbers: LinkedIn €41k, paid search €230k, organic €119k. Nobody is lying. The agent read from GA4. GA4 was lying.

This happened to three of our clients in three consecutive quarters across late 2025 and early 2026. Three different verticals, three different attribution agents, three different attribution windows. Each pilot was wired correctly. Each underlying dataset was bent. We stopped quoting attribution work without a one-week audit in front of it.

Here is the checklist we run now. It is unglamorous. It catches every problem we saw in those three pilots, plus a few we have seen since.

Three pilots, three wrong stories

One client was a B2B SaaS doing roughly €4M ARR. One was a regional e-commerce brand under €15M. One was a professional-services group at about €9M. They had nothing in common except this: someone had touched their Google Tag Manager container in the previous eighteen months, marked the project done, and never gone back. Then a marketing-attribution agent sat on top of that container and started writing weekly memos that the CFO eventually printed out and circled in red.

The agents were not the problem. The data was the problem. The agents were extremely confident about the wrong answer, which is the worst failure mode of all.

Why GA4 reads as ground truth and isn't

GA4 is not a ledger. It is a measurement platform that models gaps when it cannot observe them. Two mechanisms do most of the lying.

The first is Google Consent Mode v2. When a visitor declines cookies in advanced consent mode, Google tags still send cookieless pings to GA4, which then uses modelling to infer what that visitor probably did. In basic mode the tags stay blocked until consent is granted, so GA4 has even less observed data to model from. In the EU, depending on cookie-banner design, you can be modelling 30 to 70 percent of your conversions. That fraction is not labelled "modelled" inside the standard GA4 reports. It sits in the same column as observed conversions.

The second is data-driven attribution on top of that. GA4 distributes credit across touchpoints using a model trained on your own conversions. If those conversions are themselves modelled, the credit weights are downstream of two layers of inference. A marketing-attribution agent reading the GA4 API gets the output of both layers, with no flag telling it which numbers are observed and which are guessed.

Warning

If your cookie-banner reject rate is above 40 percent and your GA4 reporting identity is set to "blended", you are showing the board a number that is partially synthetic. None of that is technically wrong. All of it is invisible.

The pre-quote audit, item by item

The audit takes about three working days for a normal mid-market setup. It produces a shared doc with a green, amber, or red verdict per item, plus a list of remediation tickets. We do not start the attribution-agent pilot until the doc is at least amber across the board.

1. Consent state and conversion modelling

Pull the GA4 explorations view and segment by consent state. If the "denied" segment is more than 20 percent of sessions and your reporting identity is "blended", flip the property to "observed" for the duration of the audit. You want to see what GA4 actually measured before you let anything sit on top of the modelled numbers.

Then compare observed conversions to modelled conversions for the same period. If the gap is more than 15 percent, the agent is going to be reading mostly inference. Note the percentage. Tell the agent in its system prompt. Do not let it default to assuming every revenue row is observed.
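The arithmetic behind that note is small. Here is a minimal Python sketch, assuming you have two conversion counts for the same period: one taken with the reporting identity on "observed" and one on "blended". We read "the gap" as the share of the blended total that GA4 inferred, which is our interpretation. The function names and the prompt wording are illustrative, not a fixed format.

```python
def modelled_share(observed: float, blended_total: float) -> float:
    """Fraction of reported conversions that GA4 inferred rather than measured."""
    if blended_total <= 0:
        raise ValueError("blended_total must be positive")
    # Clamp at zero in case the observed count is slightly higher than the blended one.
    return max(blended_total - observed, 0.0) / blended_total


def prompt_note(observed: float, blended_total: float, threshold: float = 0.15) -> str:
    """Build one sentence for the agent's system prompt (wording is hypothetical)."""
    share = modelled_share(observed, blended_total)
    note = f"About {share:.0%} of GA4 conversions in this period are modelled, not observed."
    if share > threshold:
        # Above the 15 percent line from the checklist, warn the agent explicitly.
        note += " Treat GA4 conversion counts as estimates and say so in every memo."
    return note


print(prompt_note(observed=700, blended_total=1000))
```

The point of returning a sentence rather than a number is that the percentage ends up in the prompt verbatim, where the agent cannot ignore it.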

2. Server container drift

If the client runs server-side Google Tag Manager, open the server container and check the deployed version against the workspace. Twice now we have found a workspace with six approved-but-undeployed changes, including a new purchase tag that the marketing team thought had been live for two months.

Then check the actual Cloud Run (or App Engine, or self-hosted) instance. Confirm the URL the web container is pinging matches the URL the server container is serving from. We once found a server container running on a preview URL for a year because the migration ticket was closed before the DNS change.
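Both checks reduce to comparing a handful of values you can copy by hand out of the GTM interface. A rough Python sketch, assuming you have noted the URL the web container sends to, the URL the server container serves from, and the number of workspace changes not yet published. The function and parameter names are ours and hypothetical; nothing here calls a Google API.

```python
from urllib.parse import urlparse


def container_drift(web_transport_url: str,
                    server_serving_url: str,
                    unpublished_changes: int) -> list[str]:
    """Return a list of human-readable problems; an empty list means no drift found."""
    problems = []

    # Both inputs are expected to be full URLs including the scheme (https://...).
    web_host = urlparse(web_transport_url).hostname
    server_host = urlparse(server_serving_url).hostname
    if web_host != server_host:
        problems.append(
            f"web container sends to {web_host}, server container serves from {server_host}"
        )

    if unpublished_changes > 0:
        problems.append(f"{unpublished_changes} workspace changes are not in the live version")

    return problems


for problem in container_drift("https://preview.example.com",
                               "https://sgtm.example.com",
                               unpublished_changes=6):
    print("RED:", problem)
```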

3. Identity stitching across domains

Most mid-market sites have at least two domains in the journey: marketing site and app, shop and checkout, main site and a hosted booking tool. Open the GA4 cross-domain configuration. Confirm both domains are listed and that the linker parameter is being passed in the URL between them. Open a private window. Click a CTA. Watch the network tab. If _gl= is not on the destination URL, your sessions are splitting.

Sessions that split look like a much larger top-of-funnel and a much smaller conversion rate. The agent will read that as a paid-channel problem and tell you to cut spend. The actual problem is one missing line of GTM config.
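If you are checking more than a couple of CTAs, the network-tab step can be scripted once you have the destination URLs. A minimal Python sketch, assuming the linker is passed in the query string and that you paste in URLs copied from the browser. The example URLs and the function name are made up.

```python
from urllib.parse import parse_qs, urlparse


def has_linker_param(destination_url: str) -> bool:
    """True if the cross-domain linker parameter _gl is present in the query string."""
    query = parse_qs(urlparse(destination_url).query, keep_blank_values=True)
    return "_gl" in query


# Hypothetical destination URLs copied from the network tab after clicking each CTA.
destinations = [
    "https://app.example.com/signup?_gl=1*abc123*_ga*MTIzNDU2",
    "https://booking.example.com/slots?utm_source=site",
]
for url in destinations:
    print("ok   " if has_linker_param(url) else "SPLIT", url)
```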

4. Revenue: GA4 versus the order system

This is the highest-value check. Pick a closed month. Pull total revenue from GA4 for that month. Pull total revenue from the source-of-truth system: the invoice table, the Shopify export, the Stripe dashboard, the ERP. Compare.

If the gap is over 8 percent in either direction, the attribution agent should not be running yet. Find out where the gap comes from before you spend a quarter believing the channel mix.

-- The query we always run on day one of an audit
-- Compares GA4 purchase events to the order system for the last 30 days
with ga4 as (
  -- event_date is a 'YYYYMMDD' string in the GA4 export, so parse it explicitly
  select parse_date('%Y%m%d', event_date) as day,
         sum(ecommerce.purchase_revenue) as ga4_rev
  from `project.analytics_xxxxxx.events_*`
  where event_name = 'purchase'
    and _table_suffix between
        format_date('%Y%m%d', date_sub(current_date(), interval 30 day))
        and format_date('%Y%m%d', current_date())
  group by day
),
orders as (
  select date(created_at) as day,
         sum(total_excl_tax) as order_rev
  from raw.orders
  -- compare dates to dates; created_at is assumed to be a timestamp
  where date(created_at) >= date_sub(current_date(), interval 30 day)
    and status in ('paid', 'shipped', 'fulfilled')
  group by day
)
-- full join so a day missing from either side still shows up, with a null gap
select day,
       ga4_rev,
       order_rev,
       safe_divide(ga4_rev - order_rev, order_rev) as gap
from ga4 g
full join orders o using (day)
order by day;

If the daily gap swings wildly, you have a tag that fires on confirmation-page reloads, or a thank-you template that lost its purchase tag during a theme update, or a one-page checkout that double-fires. Each of those is fixable. None of them is fixable from inside an attribution agent.
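Once the query results are exported, the verdict can be computed rather than eyeballed. A minimal Python sketch, assuming rows of (day, ga4_rev, order_rev). The 8 percent limit is the one from this checklist. The 10-point limit on daily swing is our own placeholder, because "swings wildly" is a judgment call; tune it to your own data.

```python
from statistics import pstdev


def gap(ga4_rev: float, order_rev: float):
    """Relative gap, or None when the order system shows no revenue."""
    if order_rev == 0:
        return None
    return (ga4_rev - order_rev) / order_rev


def audit_revenue(rows, monthly_limit: float = 0.08, swing_limit: float = 0.10) -> dict:
    """rows: iterable of (day, ga4_rev, order_rev) tuples from the comparison query."""
    rows = list(rows)
    monthly_gap = gap(sum(r[1] for r in rows), sum(r[2] for r in rows))

    # Standard deviation of the daily gaps is a crude but readable swing measure.
    daily = [g for g in (gap(r[1], r[2]) for r in rows) if g is not None]
    swing = pstdev(daily) if daily else 0.0

    return {
        "monthly_gap": monthly_gap,
        "monthly_ok": monthly_gap is not None and abs(monthly_gap) <= monthly_limit,
        "daily_swing": swing,
        "swings_wildly": swing > swing_limit,
    }


# Totals match exactly, yet the days are far apart: a double-fire one day, a lost tag the next.
print(audit_revenue([("2026-05-01", 150.0, 100.0), ("2026-05-02", 50.0, 100.0)]))
```

The example is deliberate: a month can pass the 8 percent check while the daily series shows two opposite tagging faults cancelling out, which is why we look at both.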

5. Currency, refunds, partial payments

GA4 reports ecommerce revenue in a single currency per property and converts each purchase using the currency parameter sent with the event. If the client sells in EUR and USD and GBP, ask which one is being passed. We have seen properties where the dataLayer was hardcoded to EUR on a USD checkout because the original implementation was Dutch and nobody changed it when the US store opened.

Refunds are even worse. Most GTM implementations do not send the refund event back to GA4 at all. Revenue keeps accumulating. The attribution agent thinks every euro that ever entered the funnel is still there. Three months in, the channel mix in GA4 has a phantom lift of several percent that finance cannot find.
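Both problems show up quickly when you line GA4 up against the order system by transaction. A minimal Python sketch, assuming you have exported plain dictionaries keyed by transaction_id from each side. The input shapes and function names are hypothetical.

```python
def currency_mismatches(ga4_currency: dict[str, str],
                        order_currency: dict[str, str]) -> list[str]:
    """Transaction ids where GA4 received a different currency code than the order system holds."""
    return sorted(
        tid
        for tid, cur in ga4_currency.items()
        if tid in order_currency and order_currency[tid].upper() != cur.upper()
    )


def phantom_revenue(refunds: dict[str, float], ga4_refunded_ids: set[str]) -> float:
    """Sum of refunds the order system knows about but GA4 never received."""
    return sum(amount for tid, amount in refunds.items() if tid not in ga4_refunded_ids)


# Made-up sample data: T2 was charged in USD but tagged as EUR; T2's refund never reached GA4.
print(currency_mismatches({"T1": "EUR", "T2": "EUR"}, {"T1": "EUR", "T2": "USD"}))
print(phantom_revenue({"T1": 50.0, "T2": 25.0}, ga4_refunded_ids={"T1"}))
```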

6. Bot, internal, and preview traffic

Open the data-stream settings. Check the internal-traffic IP filter. Most properties we audit have a filter that was set up four office moves ago. Then check whether the staging and preview environments are sending events to the production property. They usually are.

A surprisingly common pattern: the agency that built the site is running uptime checks from a German datacentre, and those checks are being counted as Direct / (none) sessions in Düsseldorf. The agent recommends doubling down on the German market. Nobody in Düsseldorf has heard of the brand.
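GA4 does not expose visitor IP addresses, so this check has to run against request logs from the server container or the web server. A rough Python sketch, assuming each log line gives you a client IP and the hostname that was requested. The CIDR ranges below are documentation ranges standing in for the real office and uptime-monitor addresses, and the labels are ours.

```python
from ipaddress import ip_address, ip_network


def classify_hit(ip: str, hostname: str,
                 internal_cidrs: list[str],
                 monitor_cidrs: list[str],
                 production_hosts: set[str]) -> str:
    """Label one request as non_production, internal, monitoring, or keep."""
    if hostname.lower() not in production_hosts:
        return "non_production"  # staging or preview sending into the production property

    addr = ip_address(ip)
    if any(addr in ip_network(cidr) for cidr in internal_cidrs):
        return "internal"
    if any(addr in ip_network(cidr) for cidr in monitor_cidrs):
        return "monitoring"  # e.g. the agency's uptime checks
    return "keep"


OFFICE = ["192.0.2.0/24"]        # placeholder for the current office range
UPTIME = ["198.51.100.0/24"]     # placeholder for the monitoring datacentre
PROD = {"www.example.com"}

print(classify_hit("198.51.100.7", "www.example.com", OFFICE, UPTIME, PROD))
```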

The data contract the agent actually needs

Once the audit comes back amber or green, we write a one-page contract between GA4 and the attribution agent. It looks like this.

Revenue source of truth is the order or invoice table, not GA4. The agent joins on transaction_id.
GA4 is used for behaviour only: sessions, source / medium, landing page, content group. No money flows from GA4 into the attribution model.
Consent state is a column, not a filter. The agent sees the modelled fraction and weights it down explicitly when it writes its weekly memo.
Refunds are subtracted at the line level, not netted off at the channel level, so attribution windows do not get distorted by late returns.
One canonical user key: usually a hashed email from the order system, joined back to the GA4 client_id via a server-side identity event.
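The first four clauses can be sketched as one join. This is a minimal Python illustration, not the agent's actual pipeline: the class and field names are hypothetical, and it assumes each transaction has already been reduced to a single attributed touch, so revenue is not double-counted. The canonical user key is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderLine:
    transaction_id: str
    amount: float          # from the order or invoice table, never from GA4
    refunded: float = 0.0  # refunds recorded per line


@dataclass(frozen=True)
class Ga4Touch:
    transaction_id: str
    source_medium: str     # behaviour only
    consent_state: str     # kept as a column: "granted" or "denied"


def channel_revenue(lines, touches) -> dict:
    """Net revenue per (source_medium, consent_state), with money taken from order lines."""
    net_by_tx: dict[str, float] = {}
    for line in lines:
        # Subtract refunds at the line level before anything is rolled up by channel.
        net = line.amount - line.refunded
        net_by_tx[line.transaction_id] = net_by_tx.get(line.transaction_id, 0.0) + net

    totals: dict[tuple[str, str], float] = {}
    for touch in touches:
        if touch.transaction_id not in net_by_tx:
            continue  # GA4 saw a purchase the order system does not have: no revenue
        key = (touch.source_medium, touch.consent_state)
        totals[key] = totals.get(key, 0.0) + net_by_tx[touch.transaction_id]
    return totals


lines = [OrderLine("T1", 100.0), OrderLine("T1", 50.0, refunded=50.0), OrderLine("T2", 80.0)]
touches = [Ga4Touch("T1", "google / cpc", "granted"), Ga4Touch("T2", "linkedin / paid", "denied")]
print(channel_revenue(lines, touches))
```

Keeping consent_state in the key means the weekly memo can report how much of each channel's revenue rests on denied-consent sessions instead of hiding it in a total.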

This is not glamorous. It is also the difference between an attribution agent that gives finance a number they trust and one that gets switched off after a quarter because the CFO printed the memo out and circled it in red.

Attribution agents do not invent bad data. They inherit it. If you wire one onto a tag setup that already misreads revenue by 12 percent, you have built a very confident liar.

What we do when the audit fails hard

About a third of the time the audit comes back too red to remediate inside the pilot window. In that case we do not start the agent. We quote the remediation as a separate two-to-four-week project, ship it, and only then turn the agent on. It is a less exciting first month for the client. It is a much more interesting second quarter, when the channel-mix numbers in the weekly memo actually match the bank account.

When we built the marketing-attribution stack for a Dutch B2B distributor earlier this year, we found that GA4 had been recording USD on an EUR checkout for nine months. We rebuilt the dataLayer, treated the order table as truth, and only then wired the AI agents on top. In the second quarter their numbers matched finance to within 1.4 percent.

The smallest thing you can do today: run the SQL above on your own GA4 and your own order table for the last thirty days. If the gap is over 8 percent, do not onboard an attribution agent yet. Fix the gap first.

Key takeaway

If GA4 revenue and your invoice ledger disagree by more than 8 percent over a closed month, fix the data before you quote an attribution agent.

FAQ

How long should a GA4 attribution audit take?

About three working days for a normal mid-market setup. Most of that is comparing GA4 revenue to the order system day by day and tracing the gaps.

What's an acceptable gap between GA4 revenue and the invoice ledger?

Under 8 percent in either direction over a closed month. Above that, the attribution model is sitting on bent data and an agent will read it as channel signal.

Do we need server-side tagging before attribution can work?

Not strictly. But if you have it, audit the deployed version against the workspace. Undeployed changes and stale preview URLs are two of the most common failures we see.

Why not just trust GA4's data-driven attribution model?

Because in EU traffic with high cookie-rejection rates, a large share of conversions are modelled, not observed. The attribution model is then built on top of inference, not measurement.

Tags: ai agents, integrations, workflow, architecture, operations, strategy
