Integration: GA4 BigQuery

This article will help you learn about how Daasity replicates data from GA4 BigQuery, limitations to the data we can extract and where the data is stored in the GA4 BigQuery schema

Key Topics

Click on the links below to take you to the section where you can learn more about this Integration

Integration Overview

Google Analytics V4 is the newest version of Google Analytics, wherein raw event data from Google Analytics V4 properties can be exported to BigQuery and then fetched through a series of SQL queries.

This document provides context on what kind of data is being gathered through this extractor, which endpoints that data is coming from, and how the extracted tables relate to each other. 

Integration Availability

This integration is available for:

  • Pro V2

NOTE: You are considered V2 if...

  • Your account says Pro V2 in the Daasity App

Not sure? Reach out to Support@Daasity.com

BigQuery Export

The Daasity GA4 BigQuery extractor is built based on this GA4 BigQuery Export Documentation. Staging tables are extracted from BigQuery Exports via SQL and are then stitched together during the transformation process to create the standard GA report tables. The Daasity GA4 BigQuery extractor fetches reports in the following time periods:

  • Full day (tables ending in _stg1)
  • Intraday (tables ending in _stg2)
  • Hourly

Note: 

  • Daasity includes a cutover date for merchants that have both a GA3 and a GA4 integration. This allows merchants to link the integrations and set a date after which the UTS tables will be populated with data from GA4 instead of GA3.

  • Intraday reports are currently unavailable but will be optional for merchants soon.

Entity Relationship Diagram (ERD)

The embedded diagram houses the ERD for the Daasity GA4 BigQuery integration illustrating the different tables and keys to join across tables. Hover over the embedded diagram to reveal controls to zoom and scroll.

GA4 BigQuery Schema

The Daasity GA4 BigQuery extractor creates these tables using SQL and replication methods listed. The data is mapped from BigQuery to the table based on the mapping logic outlined in each table.

Base Landing Page

  • Endpoint: BigQuery Export
  • Update Method: UPSERT
  • Table Names: 
    • ga4_bq_stg.base_landing_page_stg1
    • ga4_bq_stg.base_landing_page_stg2
SQL Database Column
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(REGEXP_REPLACE(landing_page,r'\?.*',''),'')) id
stream_id stream_id
PARSE_TIMESTAMP('%Y%m%d', event_date) created_on
REGEXP_REPLACE(landing_page,r'\?.*','') landing_page_path
SUM(pageviews) pageviews
COUNT(DISTINCT ga_session_id) sessions
COUNT(DISTINCT CASE WHEN is_engaged = 1 THEN ga_session_id END) engaged_sessions
SUM(transactions) transactions
SUM(session_duration) session_duration
SUM(transaction_revenue) transaction_revenue
'events' __source_table
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) __file_date
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(REGEXP_REPLACE(landing_page,r'\?.*',''),'')) __sync_key
'<%= integration.account.id %>' _account_id
CURRENT_TIMESTAMP() __synced_at

Base PDP Performance

  • Endpoint: BigQuery Export
  • Update Method: UPSERT
  • Table Names: 
    • ga4_bq_stg.base_pdp_performance_stg1
    • ga4_bq_stg.base_pdp_performance_stg2
SQL Database Column
MD5('<%= integration.account.id %>'||':'||stream_id||':'||event_date||':'||item_name||':'||item_id||':'||IFNULL(currency,'')) id
stream_id stream_id
PARSE_TIMESTAMP('%Y%m%d', event_date) created_on
item_name product_name
item_id product_sku
currency currency
COUNT(CASE WHEN event_name = 'view_item' THEN CONCAT(user_pseudo_id,event_timestamp) END) product_detail_views
COUNT(CASE WHEN event_name = 'add_to_cart' THEN CONCAT(user_pseudo_id,event_timestamp) END) product_adds_to_cart
COUNT(CASE WHEN event_name = 'begin_checkout' THEN CONCAT(user_pseudo_id,event_timestamp) END) product_checkouts
SUM(CASE WHEN event_name = 'purchase' THEN item_revenue ELSE 0 END) product_sales
COUNT(DISTINCT CASE WHEN event_name = 'purchase' THEN transaction_id END) product_purchases
'events' __source_table
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) __file_date
MD5('<%= integration.account.id %>'||':'||stream_id||':'||event_date||':'||item_name||':'||item_id||':'||IFNULL(currency,'')) __sync_key
'<%= integration.account.id %>' _account_id
CURRENT_TIMESTAMP() __synced_at

Base Shopping Stage

  • Endpoint: BigQuery Export
  • Update Method: UPSERT
  • Table Names: 
    • ga4_bq_stg.base_shopping_stage_stg1
    • ga4_bq_stg.base_shopping_stage_stg2
SQL Database Column
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(source,'')||':'||IFNULL(medium,'')||':'||IFNULL(campaign,'')||':'||IFNULL(term,'')||':'||IFNULL(content,'')||':'||shopping_stage||':'||CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||device_category) id
stream_id stream_id
PARSE_TIMESTAMP('%Y%m%d', event_date) created_on
source source
medium medium
campaign campaign
shopping_stage shopping_stage
NULL channel_grouping
CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END user_type
device_category device_category
COUNT(DISTINCT ga_session_id) sessions
term keyword
content adcontent
NULL channel
NULL vendor
NULL sub_channel
NULL media_type
NULL attribution_channel
'events' __source_table
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) __file_date
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(source,'')||':'||IFNULL(medium,'')||':'||IFNULL(campaign,'')||':'||IFNULL(term,'')||':'||IFNULL(content,'')||':'||shopping_stage||':'||CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||device_category) __sync_key
'<%= integration.account.id %>' _account_id
CURRENT_TIMESTAMP() __synced_at

Base Transactions

  • Endpoint: BigQuery Export
  • Update Method: UPSERT
  • Table Names: 
    • ga4_bq_stg.base_transactions_stg1
    • ga4_bq_stg.base_transactions_stg2
SQL Database Column
MD5('<%= integration.account.id %>'||':'||stream_id||':'||transaction_id) id
stream_id stream_id
PARSE_TIMESTAMP('%Y%m%d', event_date) created_on
transaction_id transaction_id
source source
medium medium
campaign campaign
NULL channel_grouping
CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END user_type
NULL country_iso_code
device_category device_category
NULLIF(country,'') country
city city
full_referrer full_referrer
region region
browser browser
operating_system operating_system
term keyword
content adcontent
FIRST_VALUE(purchase_revenue_in_usd IGNORE NULLS) OVER (PARTITION BY transaction_id ORDER BY event_timestamp DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) transaction_revenue
NULL channel
NULL vendor
NULL sub_channel
NULL media_type
NULL attribution_channel
'events' __source_table
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) __file_date
MD5('<%= integration.account.id %>'||':'||stream_id||':'||transaction_id) __sync_key
'<%= integration.account.id %>' _account_id
CURRENT_TIMESTAMP() __synced_at

Base Traffic

  • Endpoint: BigQuery Export
  • Update Method: UPSERT
  • Table Names: 
    • ga4_bq_stg.base_traffic_stg1
    • ga4_bq_stg.base_traffic_stg2
SQL Database Column
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(source,'')||':'||IFNULL(medium,'')||':'||IFNULL(campaign,'')||':'||IFNULL(term,'')||':'||IFNULL(content,'')||':'||CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||':'||IFNULL(device_category,'')||':'||IFNULL(country_name,'')) id
stream_id stream_id
PARSE_TIMESTAMP('%Y%m%d', event_date) created_on
source source
medium medium
campaign campaign
NULL channel_grouping
CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END user_type
device_category device_category
NULL country_iso_code
NULLIF(country_name,'') country_name
SUM(pageviews) pageviews
COUNT(DISTINCT ga_session_id) sessions
COUNT(DISTINCT CASE WHEN is_engaged = 1 THEN ga_session_id END) engaged_sessions
SUM(transactions) transactions
SUM(session_duration) session_duration
NULL time_on_page
SUM(transaction_revenue) transaction_revenue
term keyword
content adcontent
NULL channel
NULL vendor
NULL sub_channel
NULL media_type
NULL attribution_channel
'events' __source_table
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) __file_date
MD5('<%= integration.account.id %>'||':'||sm.stream_id||':'||sm.created_on||':'||IFNULL(ts.source,'')||':'||IFNULL(ts.medium,'')||':'||IFNULL(ts.campaign,'')||':'||IFNULL(ts.term,'')||':'||IFNULL(ts.content,'')||':'||CASE WHEN ftu.ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||':'||IFNULL(sm.device_category,'')||':'||IFNULL(sm.country_name,'')) sync_key
'<%= integration.account.id %>' _account_id
CURRENT_TIMESTAMP() __synced_at

Base Traffic Hourly

  • Endpoint: BigQuery Export
  • Update Method: UPSERT
  • Table Names: 
    • ga4_bq_stg.base_traffic_hourly_history_stg1
    • ga4_bq_stg.base_traffic_hourly_history_stg2
SQL Database Column
MD5(sm.stream_id||':'||sm.created_on||':'||DATETIME_TRUNC(DATETIME(TIMESTAMP_MICROS(st.event_timestamp)),hour)||':'||IFNULL(ts.source,'')||':'||IFNULL(ts.medium,'')||':'||IFNULL(ts.campaign,'')||':'||IFNULL(ts.term,'')||':'||IFNULL(ts.content,'')||':'||IFNULL(ts.full_referrer,'')||':'||CASE WHEN ftu.ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||':'||IFNULL(sm.device_category,'')||':'||IFNULL(sm.country_name,'')) id
stream_id stream_id
PARSE_TIMESTAMP('%Y%m%d', event_date) created_on
DATETIME_TRUNC(DATETIME(TIMESTAMP_MICROS(event_timestamp)),hour) AS date_hour_utc date_hour_utc
source source
medium medium
campaign campaign
NULL channel_grouping
term keyword
content adcontent
full_referrer full_referrer
CASE WHEN ftu.ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END user_type
device_category device_category
NULL country_iso_code
NULLIF(sm.country_name,'') country_name
SUM(pageviews) pageviews
COUNT(DISTINCT  ga_session_id) sessions
COUNT(DISTINCT CASE WHEN is_engaged = 1 THEN ga_session_id END) engaged_sessions
SUM(transactions) transactions
SUM(transaction_revenue) transaction_revenue
NULL channel
NULL vendor
NULL sub_channel
NULL media_type
NULL attribution_channel
'events' __source_table
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) __file_date
MD5(sm.stream_id||':'||sm.created_on||':'||DATETIME_TRUNC(DATETIME(TIMESTAMP_MICROS(st.event_timestamp)),hour)||':'||IFNULL(ts.source,'')||':'||IFNULL(ts.medium,'')||':'||IFNULL(ts.campaign,'')||':'||IFNULL(ts.term,'')||':'||IFNULL(ts.content,'')||':'||IFNULL(ts.full_referrer,'')||':'||CASE WHEN ftu.ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||':'||IFNULL(sm.device_category,'')||':'||IFNULL(sm.country_name,'')) __sync_key
'<%= integration.account.id %>' _account_id
CURRENT_TIMESTAMP() __synced_at

Related Resources