Daasity Data Model: Overview & Design Philosophy

This article provides an outline of the general Daasity Data Model, why we designed the data model this way, and how the Daasity transformation layer works.

Design Philosophy

The Daasity Data Model was designed to be future-proof so that a complete rebuild of reporting would not be needed when either a source system changed or a new source system was added.

Our team's experience dealing with switching Email Service Providers (ESPs) and changing APIs led to this design: a normalized middle layer minimizes the impact of upstream changes, so we only need to make one set of changes when a source system is modified.

[Figure: Daasity data model concept]

Data moves through three steps (sketched in SQL after the list):

  1. Data is replicated from the source system into the Extractor Schema
  2. Data is transformed from the Extractor Schema into a Normalized Schema
  3. Data is transformed from the Normalized Schema into the Reporting Schema
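
To make the flow concrete, here is a minimal SQL sketch of one order moving through all three steps. The schema, table, and column names (shopify, uos, drp, order_total, and so on) are illustrative assumptions, not the exact names Daasity uses.

    -- Step 1: replication lands raw source data in the Extractor Schema
    -- (e.g., a table like shopify.orders; names here are hypothetical).

    -- Step 2: a source-specific transform maps it into the Normalized Schema
    CREATE TABLE uos.orders AS
    SELECT
        id          AS order_id,
        created_at  AS ordered_at,
        total_price AS order_total
    FROM shopify.orders;

    -- Step 3: business logic is applied on top of the Normalized Schema only
    CREATE TABLE drp.daily_sales AS
    SELECT
        CAST(ordered_at AS DATE) AS order_date,
        SUM(order_total)         AS gross_sales
    FROM uos.orders
    GROUP BY 1;

Note that step 3 never references the source system: if the source changes, only the step 2 transform needs to be rewritten.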

Extractor Schema

The extractor schema is the best representation of the source system (SaaS platform, database or other data source) that is possible in a traditional database structure. Thus, nested data sources (e.g., JSON) may be denested into multiple tables.
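As an example of that denesting, the sketch below (Postgres syntax, with hypothetical table and field names) splits a nested JSON order payload into one row per line item:

    -- Hypothetical example: raw_orders holds the source system's nested
    -- JSON in a jsonb column named payload; line items are denested.
    SELECT
        payload ->> 'id'           AS order_id,
        line ->> 'sku'             AS sku,
        (line ->> 'quantity')::int AS quantity
    FROM raw_orders,
         jsonb_array_elements(payload -> 'line_items') AS line;

Each element of the line_items array becomes its own row, so the extractor schema can store it as a conventional line-items table.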

This enables an ELT approach: transformation logic moves into a SQL/Python layer where it is easier to access and modify.

Although data replication increases our storage costs, data volumes in the Consumer Product Brand industry are relatively small, and storage costs are minimal compared to the cost of maintaining pipelines that transform directly from source to end reporting.

Normalization Schema

The normalization schemas are a core component of the Daasity platform. Developing a normalization schema has a significant impact on analytics development: it reduces the overall maintenance of the data model and allows you to plan for the future.

For example, our Unified Order Schema (UOS) is built to support a multi-shipment/multi-recipient framework across eCommerce, Marketplace, Retail, and Wholesale, which very few commerce platforms support.

This means that if a commerce platform were to add multi-shipment/multi-recipient functionality, you would only need to change the transformation code from the Extractor Schema to the Normalization Schema; none of the downstream data models or reports would be impacted. This greatly reduces maintenance, because there is a single data model to change, as the sketch below illustrates.
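As a hedged illustration (table and column names are ours, not Daasity's exact schema), a platform that only supports one shipment per order can still be mapped into a multi-shipment framework by emitting exactly one shipment row per order:

    -- A single-shipment platform still fits the multi-shipment model:
    -- emit one shipment row per order.
    INSERT INTO uos.shipments (order_id, shipment_number, recipient_id, shipped_at)
    SELECT
        id,          -- hypothetical source order key
        1,           -- always the first (and only) shipment today
        customer_id, -- single recipient
        fulfilled_at
    FROM shopify.orders;

If the platform later adds true multi-shipment support, only this Extractor-to-UOS transform changes; everything downstream that reads uos.shipments is untouched.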

Currently we have three normalized data models deployed:

  • Unified Order Schema (UOS): a data model built for multi-channel / multi-shipment / multi-recipient orders
  • Unified Notification Schema (UNS): a data model built to combine email, SMS, push, and app notifications (see the sketch after this list)
  • Unified Marketing Schema (UMS): a data model built to combine all paid marketing platforms together
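
For instance, a UNS-style table might union per-channel events into one shape. This is a sketch under assumed names (klaviyo and attentive stand in for arbitrary email and SMS extractor schemas):

    -- Hypothetical union of two channels into one normalized events table.
    CREATE TABLE uns.notification_events AS
    SELECT 'email' AS channel, message_id, recipient_email AS recipient, sent_at
    FROM klaviyo.email_events
    UNION ALL
    SELECT 'sms' AS channel, message_id, recipient_phone AS recipient, sent_at
    FROM attentive.sms_events;

Downstream reporting then works against one notification_events shape regardless of which providers feed it.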

We are currently designing two additional data models to support omnichannel and other complex business cases:

  • Unified Traffic Schema (UTS): a data model built to track retail, eCommerce, and Marketplace traffic and conversion
  • Unified Subscription Schema (USS): a data model built to normalize subscription data to enable easier switching between subscription platforms

Data Reporting Schema

The data reporting schema (DRP) is the source schema for reporting and the schema to which we connect a visualization tool such as Looker, Tableau, or Sigma Computing. Building the data reporting schema from the normalized schema lets us put the business logic in this transformation layer, so changes to it are driven by business logic rather than by source systems.

The data reporting schema is broken down into data marts (even though they are stored in a single schema). This lets us build the visualization layer for specific user groups, ensuring that users can build reports themselves while reducing the likelihood of incorrect results.
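Because the reporting layer reads only normalized tables, a mart-style table can bake in business logic without referencing any source system. A minimal sketch, assuming hypothetical UOS columns:

    -- Hedged sketch of a reporting-layer table for an Orders & Revenue mart.
    -- It depends only on uos.orders, so source-system changes never touch it.
    CREATE TABLE drp.orders_revenue AS
    SELECT
        order_id,
        ordered_at,
        order_total,
        order_total - discount_total - refund_total AS net_revenue
    FROM uos.orders;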

For consumer product brands, we construct these data marts into:

  • Visitor Traffic and Store Performance: providing the ability to understand traffic across eCommerce, Marketplace and Retail as well as conversion and site performance (eCommerce and Marketplace)
  • Channel and Attribution: providing the ability to understand where your customers came from and how different attribution methodologies change that
  • Marketing: providing the ability to understand how your acquisition marketing is performing
  • Orders & Revenue: providing the ability to understand the components of revenue and perform complex customer/product/order analytics
  • Customer & Lifetime Value: providing the ability to build customer segments and understand how customers perform over time (see the sketch after this list)
  • Subscription: providing the ability to understand the performance of businesses that offer subscriptions
  • Email & SMS: providing the ability to understand email/SMS performance from both an email/SMS and customer perspective
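
As an example of the kind of question the Customer & Lifetime Value mart answers, a lifetime-value rollup might look like the following (a sketch assuming the hypothetical uos.orders columns used above):

    -- Hypothetical lifetime-value rollup per customer.
    SELECT
        customer_id,
        MIN(ordered_at)  AS first_order_at,
        COUNT(*)         AS lifetime_orders,
        SUM(order_total) AS lifetime_revenue
    FROM uos.orders
    GROUP BY customer_id;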