This article will help you learn about how Daasity replicates data from GA4 BigQuery, limitations to the data we can extract and where the data is stored in the GA4 BigQuery schema
Key Topics
Click on the links below to take you to the section where you can learn more about this Integration
- Integration Overview
- Integration Availability
- BigQuery Export
- Entity Relationship Diagram
- GA4 BigQuery Schema
- Related Resources
Integration Overview
Google Analytics V4 is the newest version of Google Analytics, wherein raw event data from Google Analytics V4 properties can be exported to BigQuery and then fetched through a series of SQL queries.
This document provides context on what kind of data is being gathered through this extractor, which endpoints that data is coming from, and how the extracted tables relate to each other.
Integration Availability
This integration is available for:
- Pro V2
NOTE: You are considered V2 if...
- Your account says Pro V2 in the Daasity App
Not sure? Reach out to Support@Daasity.com
BigQuery Export
The Daasity GA4 BigQuery extractor is built based on this GA4 BigQuery Export Documentation. Staging tables are extracted from BigQuery Exports via SQL and are then stitched together during the transformation process to create the standard GA report tables. The Daasity GA4 BigQuery extractor fetches reports in the following time periods:
- Full day (tables ending in _stg1)
- Intraday (tables ending in _stg2)
- Hourly
Note:
-
Daasity includes a cutover date for merchants that have both a GA3 and a GA4 integration. This allows merchants to link the integrations and set a date after which the UTS tables will be populated with data from GA4 instead of GA3.
-
Intraday reports are currently unavailable but will be optional for merchants soon.
Entity Relationship Diagram (ERD)
The embedded diagram houses the ERD for the Daasity GA4 BigQuery integration illustrating the different tables and keys to join across tables. Hover over the embedded diagram to reveal controls to zoom and scroll.
GA4 BigQuery Schema
The Daasity GA4 BigQuery extractor creates these tables using SQL and replication methods listed. The data is mapped from BigQuery to the table based on the mapping logic outlined in each table.
- Base Landing Page
- Base PDP Performance
- Base Shopping Stage
- Base Transactions
- Base Traffic
- Base Traffic Hourly History
Base Landing Page
- Endpoint: BigQuery Export
- Update Method: UPSERT
- Table Names:
- ga4_bq_stg.base_landing_page_stg1
- ga4_bq_stg.base_landing_page_stg2
SQL | Database Column |
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(REGEXP_REPLACE(landing_page,r'\?.*',''),'')) | id |
stream_id | stream_id |
PARSE_TIMESTAMP('%Y%m%d', event_date) | created_on |
REGEXP_REPLACE(landing_page,r'\?.*','') | landing_page_path |
SUM(pageviews) | pageviews |
COUNT(DISTINCT ga_session_id) | sessions |
COUNT(DISTINCT CASE WHEN is_engaged = 1 THEN ga_session_id END) | engaged_sessions |
SUM(transactions) | transactions |
SUM(session_duration) | session_duration |
SUM(transaction_revenue) | transaction_revenue |
'events' | __source_table |
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) | __file_date |
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(REGEXP_REPLACE(landing_page,r'\?.*',''),'')) | __sync_key |
'<%= integration.account.id %>' | _account_id |
CURRENT_TIMESTAMP() | __synced_at |
Base PDP Performance
- Endpoint: BigQuery Export
- Update Method: UPSERT
- Table Names:
- ga4_bq_stg.base_pdp_performance_stg1
- ga4_bq_stg.base_pdp_performance_stg2
SQL | Database Column |
MD5('<%= integration.account.id %>'||':'||stream_id||':'||event_date||':'||item_name||':'||item_id||':'||IFNULL(currency,'')) | id |
stream_id | stream_id |
PARSE_TIMESTAMP('%Y%m%d', event_date) | created_on |
item_name | product_name |
item_id | product_sku |
currency | currency |
COUNT(CASE WHEN event_name = 'view_item' THEN CONCAT(user_pseudo_id,event_timestamp) END) | product_detail_views |
COUNT(CASE WHEN event_name = 'add_to_cart' THEN CONCAT(user_pseudo_id,event_timestamp) END) | product_adds_to_cart |
COUNT(CASE WHEN event_name = 'begin_checkout' THEN CONCAT(user_pseudo_id,event_timestamp) END) | product_checkouts |
SUM(CASE WHEN event_name = 'purchase' THEN item_revenue ELSE 0 END) | product_sales |
COUNT(DISTINCT CASE WHEN event_name = 'purchase' THEN transaction_id END) | product_purchases |
'events' | __source_table |
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) | __file_date |
MD5('<%= integration.account.id %>'||':'||stream_id||':'||event_date||':'||item_name||':'||item_id||':'||IFNULL(currency,'')) | __sync_key |
'<%= integration.account.id %>' | _account_id |
CURRENT_TIMESTAMP() | __synced_at |
Base Shopping Stage
- Endpoint: BigQuery Export
- Update Method: UPSERT
- Table Names:
- ga4_bq_stg.base_shopping_stage_stg1
- ga4_bq_stg.base_shopping_stage_stg2
SQL | Database Column |
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(source,'')||':'||IFNULL(medium,'')||':'||IFNULL(campaign,'')||':'||IFNULL(term,'')||':'||IFNULL(content,'')||':'||shopping_stage||':'||CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||device_category) | id |
stream_id | stream_id |
PARSE_TIMESTAMP('%Y%m%d', event_date) | created_on |
source | source |
medium | medium |
campaign | campaign |
shopping_stage | shopping_stage |
NULL | channel_grouping |
CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END | user_type |
device_category | device_category |
COUNT(DISTINCT ga_session_id) | sessions |
term | keyword |
content | adcontent |
NULL | channel |
NULL | vendor |
NULL | sub_channel |
NULL | media_type |
NULL | attribution_channel |
'events' | __source_table |
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) | __file_date |
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(source,'')||':'||IFNULL(medium,'')||':'||IFNULL(campaign,'')||':'||IFNULL(term,'')||':'||IFNULL(content,'')||':'||shopping_stage||':'||CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||device_category) | __sync_key |
'<%= integration.account.id %>' | _account_id |
CURRENT_TIMESTAMP() | __synced_at |
Base Transactions
- Endpoint: BigQuery Export
- Update Method: UPSERT
- Table Names:
- ga4_bq_stg.base_transactions_stg1
- ga4_bq_stg.base_transactions_stg2
SQL | Database Column |
MD5('<%= integration.account.id %>'||':'||stream_id||':'||transaction_id) | id |
stream_id | stream_id |
PARSE_TIMESTAMP('%Y%m%d', event_date) | created_on |
transaction_id | transaction_id |
source | source |
medium | medium |
campaign | campaign |
NULL | channel_grouping |
CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END | user_type |
NULL | country_iso_code |
device_category | device_category |
NULLIF(country,'') | country |
city | city |
full_referrer | full_referrer |
region | region |
browser | browser |
operating_system | operating_system |
term | keyword |
content | adcontent |
FIRST_VALUE(purchase_revenue_in_usd IGNORE NULLS) OVER (PARTITION BY transaction_id ORDER BY event_timestamp DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) | transaction_revenue |
NULL | channel |
NULL | vendor |
NULL | sub_channel |
NULL | media_type |
NULL | attribution_channel |
'events' | __source_table |
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) | __file_date |
MD5('<%= integration.account.id %>'||':'||stream_id||':'||transaction_id) | __sync_key |
'<%= integration.account.id %>' | _account_id |
CURRENT_TIMESTAMP() | __synced_at |
Base Traffic
- Endpoint: BigQuery Export
- Update Method: UPSERT
- Table Names:
- ga4_bq_stg.base_traffic_stg1
- ga4_bq_stg.base_traffic_stg2
SQL | Database Column |
MD5('<%= integration.account.id %>'||':'||stream_id||':'||created_on||':'||IFNULL(source,'')||':'||IFNULL(medium,'')||':'||IFNULL(campaign,'')||':'||IFNULL(term,'')||':'||IFNULL(content,'')||':'||CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||':'||IFNULL(device_category,'')||':'||IFNULL(country_name,'')) | id |
stream_id | stream_id |
PARSE_TIMESTAMP('%Y%m%d', event_date) | created_on |
source | source |
medium | medium |
campaign | campaign |
NULL | channel_grouping |
CASE WHEN ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END | user_type |
device_category | device_category |
NULL | country_iso_code |
NULLIF(country_name,'') | country_name |
SUM(pageviews) | pageviews |
COUNT(DISTINCT ga_session_id) | sessions |
COUNT(DISTINCT CASE WHEN is_engaged = 1 THEN ga_session_id END) | engaged_sessions |
SUM(transactions) | transactions |
SUM(session_duration) | session_duration |
NULL | time_on_page |
SUM(transaction_revenue) | transaction_revenue |
term | keyword |
content | adcontent |
NULL | channel |
NULL | vendor |
NULL | sub_channel |
NULL | media_type |
NULL | attribution_channel |
'events' | __source_table |
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) | __file_date |
MD5('<%= integration.account.id %>'||':'||sm.stream_id||':'||sm.created_on||':'||IFNULL(ts.source,'')||':'||IFNULL(ts.medium,'')||':'||IFNULL(ts.campaign,'')||':'||IFNULL(ts.term,'')||':'||IFNULL(ts.content,'')||':'||CASE WHEN ftu.ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||':'||IFNULL(sm.device_category,'')||':'||IFNULL(sm.country_name,'')) | sync_key |
'<%= integration.account.id %>' | _account_id |
CURRENT_TIMESTAMP() | __synced_at |
Base Traffic Hourly
- Endpoint: BigQuery Export
- Update Method: UPSERT
- Table Names:
- ga4_bq_stg.base_traffic_hourly_history_stg1
- ga4_bq_stg.base_traffic_hourly_history_stg2
SQL | Database Column |
MD5(sm.stream_id||':'||sm.created_on||':'||DATETIME_TRUNC(DATETIME(TIMESTAMP_MICROS(st.event_timestamp)),hour)||':'||IFNULL(ts.source,'')||':'||IFNULL(ts.medium,'')||':'||IFNULL(ts.campaign,'')||':'||IFNULL(ts.term,'')||':'||IFNULL(ts.content,'')||':'||IFNULL(ts.full_referrer,'')||':'||CASE WHEN ftu.ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||':'||IFNULL(sm.device_category,'')||':'||IFNULL(sm.country_name,'')) | id |
stream_id | stream_id |
PARSE_TIMESTAMP('%Y%m%d', event_date) | created_on |
DATETIME_TRUNC(DATETIME(TIMESTAMP_MICROS(event_timestamp)),hour) AS date_hour_utc | date_hour_utc |
source | source |
medium | medium |
campaign | campaign |
NULL | channel_grouping |
term | keyword |
content | adcontent |
full_referrer | full_referrer |
CASE WHEN ftu.ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END | user_type |
device_category | device_category |
NULL | country_iso_code |
NULLIF(sm.country_name,'') | country_name |
SUM(pageviews) | pageviews |
COUNT(DISTINCT ga_session_id) | sessions |
COUNT(DISTINCT CASE WHEN is_engaged = 1 THEN ga_session_id END) | engaged_sessions |
SUM(transactions) | transactions |
SUM(transaction_revenue) | transaction_revenue |
NULL | channel |
NULL | vendor |
NULL | sub_channel |
NULL | media_type |
NULL | attribution_channel |
'events' | __source_table |
PARSE_TIMESTAMP('%Y%m%d', _TABLE_SUFFIX) | __file_date |
MD5(sm.stream_id||':'||sm.created_on||':'||DATETIME_TRUNC(DATETIME(TIMESTAMP_MICROS(st.event_timestamp)),hour)||':'||IFNULL(ts.source,'')||':'||IFNULL(ts.medium,'')||':'||IFNULL(ts.campaign,'')||':'||IFNULL(ts.term,'')||':'||IFNULL(ts.content,'')||':'||IFNULL(ts.full_referrer,'')||':'||CASE WHEN ftu.ga_session_id IS NOT NULL THEN 'First-time' ELSE 'Returning' END||':'||IFNULL(sm.device_category,'')||':'||IFNULL(sm.country_name,'')) | __sync_key |
'<%= integration.account.id %>' | _account_id |
CURRENT_TIMESTAMP() | __synced_at |