GA4 and BigQuery: how to match the Console traffic grouping metrics
Google Analytics 4 offers excellent functionality with its GA4 BigQuery dataset. However, you might find it challenging to align your GA4 UI figures with the unnested sources, mediums, and campaigns in BigQuery.
So what happens in BigQuery compared to GA4?
When data is generated, it’s sent directly to BigQuery as raw data without any pre-processing. However, once it’s in GA4, Google applies various models and transformations. This includes joining data, modeling session traffic, adding cost data, and replacing ISO2 codes with language names.
Matching up your numbers
If you unnest the data in GA4 to extract the session source, you’ll notice significant discrepancies between your GA4 UI and GA4 BigQuery data, particularly if you have extensive paid and social activity.
The sample code below displays the raw traffic source, resulting in more users appearing as ‘direct’ or ‘organic’ than in the UI. This discrepancy arises because the data is unprocessed.
SELECT
DISTINCT
(select value.string_value from unnest(event_params) where key = 'source') AS source,
(select value.string_value from unnest(event_params) where key = 'medium') AS medium…