Key information

Your data, in your warehouse – Access interaction data directly inside Google BigQuery. Run your own queries and build custom reports.
Unified data model – A clean, consistent hierarchy (Users → Tickets → Interactions → Events) that brings together data from across Zowie's AI Agent, Inbox, and CRM.
Zero-copy sharing via Analytics Hub – Data is shared securely through Google Analytics Hub private listings.
Full AI transparency – The Events table exposes every message, reasoning step, knowledge retrieval, and tool invocation, giving you complete visibility into how your AI Agent handles each conversation.
Cross-channel coverage – Review data from chat, email, and voice interactions in a single, consistent schema.

Database Schema & ERDs

Detailed technical documentation is maintained via dbdocs.io. This serves as our "source of truth" for database structures, providing interactive diagrams and field definitions for developers and analysts.

How it works

Unified Data Access exposes a dataset containing the customer's Zowie data within BigQuery through Analytics Hub.

Analytics Hub allows for private and secure data sharing between organizations.
Zowie pays for data storage, while the customer pays for data querying.
The Data team at Zowie is responsible for the data sharing configuration.

The shared dataset contains tables organized around a unified data model that normalizes data from Zowie's AI Supervisor, Inbox, CRM, GenAI engine, and Zowie Engine into a single, consistent hierarchy.

Data Model

The unified data model follows a clear hierarchy:

USER
 └── TICKET
      └── INTERACTION
           └── EVENT

A user can have many tickets.
A ticket can have many interactions (e.g., the AI Agent handles first, then transfers to a human agent).
An interaction can have many events (individual messages, actions, system events).

Users `dim_product_unified_users`

End users (your customers) from Zowie's CRM service. Contains profile fields (name, email, phone), authentication level (authenticated, recognized, or anonymous), external user ID, and arbitrary custom properties set by your system.

Field	Description
`environment_id`, `user_id`	Primary identifiers
`first_name`, `last_name`, `email`, `phone_number`	Profile fields
`authentication_level`	`authenticated`, `recognized`, or `anonymous`
`external_user_id`	Customer's own user ID (only for authenticated herochat users)
`custom_properties`	Arbitrary key-value properties set by your system

Tickets `fct_product_unified_tickets`

A ticket represents a single support request. Tickets are synthesized from two sources:

AI tickets – One per AI Supervisor interaction (covers chat, email, and voice channels).
Standalone inbox tickets – Inbox threads that were never handled by the AI Agent.

Field	Description
`ticket_id`	Primary identifier: the Supervisor interaction ID (AI tickets) or the inbox chat ID (standalone)
`user_id`	Links to the Users table
`channel_type`	`chat`, `email`, or `voice`
`status`	`completed` or `in_progress`
`subject`	Conversation title from inbox
`collected_feedback`	Post-chat survey feedback
`snapshot_*`	User profile fields as they were at ticket creation time

💡
Snapshot fields capture the user's profile at the moment the ticket was created. This lets you analyze interactions based on who the customer was at that point in time, even if their profile has since changed.

Interactions `fct_product_unified_interactions`

An interaction is a segment of a ticket owned by a single actor.

Interactions are synthesized from three sources:

AI Agent interactions – From the AI Supervisor. Includes contact reasons, intents, sentiment, and knowledge sources used.
Inbox synthetic interactions – Human agent and queue segments derived from Inbox chat metadata.
Standalone inbox fallbacks – A fallback `human_agent` interaction for standalone inbox tickets with no synthetic interaction data.

Field	Description
`interaction_id`, `ticket_id`	Primary key and link to Tickets
`owner`	`ai_agent`, `human_agent`, or `queue`
`owner_user_id`	Chatbot ID (AI) or resolving agent ID (human)
`state`, `status`	Lifecycle state and detailed status
`recognized_contact_reasons`, `final_contact_reason`	Contact reasons (AI Agent only)
`recognized_intents`, `started_processes`	AI enrichment fields
`final_topic`, `assigned_topics`	Topic classification (Inbox only)
`statistics_*`	First response time, resolution time, SLA metrics

Practical examples

Filter interactions by `owner = 'ai_agent'` and `status = 'transferred'` to find all cases where the AI Agent escalated to a human.
Join interactions with tickets on `ticket_id` to analyze how many interactions it took to resolve a support request.
Use `final_contact_reason` to build volume and automation rate breakdowns by topic.
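The first two examples above can be sketched as plain SQL. The snippet below runs them against an in-memory SQLite stand-in so it can be tried without BigQuery access. The sample rows are invented, and every interaction `status` value other than `transferred` is a placeholder. Only columns listed in the tables above are used. In BigQuery, the table names would need to be qualified with your own linked dataset.

```python
import sqlite3

# In-memory stand-in for the shared dataset; all rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fct_product_unified_tickets (
    ticket_id TEXT, channel_type TEXT, status TEXT);
CREATE TABLE fct_product_unified_interactions (
    interaction_id TEXT, ticket_id TEXT, owner TEXT, status TEXT);
INSERT INTO fct_product_unified_tickets VALUES
    ('t1', 'chat', 'completed'),
    ('t2', 'email', 'completed');
-- 'placeholder' stands for any status other than 'transferred'.
INSERT INTO fct_product_unified_interactions VALUES
    ('i1', 't1', 'ai_agent', 'transferred'),
    ('i2', 't1', 'queue', 'placeholder'),
    ('i3', 't1', 'human_agent', 'placeholder'),
    ('i4', 't2', 'ai_agent', 'placeholder');
""")

# Example 1: interactions where the AI Agent escalated to a human.
ESCALATIONS_SQL = """
SELECT interaction_id, ticket_id
FROM fct_product_unified_interactions
WHERE owner = 'ai_agent' AND status = 'transferred'
"""

# Example 2: how many interactions each ticket needed.
INTERACTIONS_PER_TICKET_SQL = """
SELECT t.ticket_id, COUNT(i.interaction_id) AS interaction_count
FROM fct_product_unified_tickets AS t
JOIN fct_product_unified_interactions AS i ON i.ticket_id = t.ticket_id
GROUP BY t.ticket_id
ORDER BY t.ticket_id
"""

escalations = conn.execute(ESCALATIONS_SQL).fetchall()
interactions_per_ticket = conn.execute(INTERACTIONS_PER_TICKET_SQL).fetchall()
```

With this sample data, ticket `t1` passes through the AI Agent, the queue, and a human agent, so it counts three interactions, while `t2` is handled in one.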

Events `fct_product_unified_events`

Individual events (messages, actions, system events) within an interaction. This is the most granular level of data available.

Events are synthesized from three sources:

Inbox events – Agent panel events from human agent interactions.
GenAI events – AI Agent session events (messages, actions, tool calls).
Zowie Engine events – Process session events from the legacy engine.

Field	Description
`event_id`, `interaction_id`	Primary key and link to Interactions
`event_type`	Type of event (e.g., `UserMessageV2`, `AgentMessage`, `LLMUsed`, `BlockEntered`)
`event_data`	JSON payload with event-specific details
`author`	`user`, `ai_agent`, or `human_agent`
`created_at`	Event timestamp

⚠️
The Events table can be large. Use the `created_at` timestamp in your `WHERE` clauses to limit the data scanned and control query costs.

Practical examples

Filter by `author = 'user'` and `event_type = 'UserMessageV2'` to extract all customer messages.
Parse the `event_data` JSON for `LLMUsed` events to analyze which knowledge sources your AI Agent relies on most.
Count events per interaction to measure conversation complexity.
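A rough sketch of these three examples follows, again using an in-memory SQLite stand-in with invented rows. The structure of `event_data` is not described here, so the JSON keys in the sample (`text`, `model`, `sources`) are purely hypothetical. For that reason the last step only inventories which top-level keys appear in `LLMUsed` payloads, which is a reasonable first look before writing a real analysis. The snippet also assumes `event_data` arrives as a JSON string.

```python
import json
import sqlite3
from collections import Counter

# In-memory stand-in; rows and JSON payload keys are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fct_product_unified_events (
    event_id TEXT, interaction_id TEXT, event_type TEXT,
    event_data TEXT, author TEXT, created_at TEXT);
INSERT INTO fct_product_unified_events VALUES
    ('e1', 'i1', 'UserMessageV2', '{"text": "Where is my order?"}', 'user', '2024-05-01 10:00:00'),
    ('e2', 'i1', 'LLMUsed', '{"model": "example", "sources": ["faq"]}', 'ai_agent', '2024-05-01 10:00:01'),
    ('e3', 'i1', 'AgentMessage', '{"text": "Let me check."}', 'ai_agent', '2024-05-01 10:00:02'),
    ('e4', 'i2', 'UserMessageV2', '{"text": "Thanks"}', 'user', '2024-05-01 11:00:00');
""")

# Customer messages, bounded by created_at to limit the data scanned.
USER_MESSAGES_SQL = """
SELECT event_id, event_data
FROM fct_product_unified_events
WHERE author = 'user'
  AND event_type = 'UserMessageV2'
  AND created_at >= '2024-05-01' AND created_at < '2024-05-02'
ORDER BY created_at
"""
user_messages = conn.execute(USER_MESSAGES_SQL).fetchall()

# Events per interaction as a simple measure of conversation complexity.
EVENTS_PER_INTERACTION_SQL = """
SELECT interaction_id, COUNT(*) AS event_count
FROM fct_product_unified_events
GROUP BY interaction_id
"""
events_per_interaction = dict(conn.execute(EVENTS_PER_INTERACTION_SQL).fetchall())

# Inventory the top-level keys found in LLMUsed payloads.
llm_payload_keys = Counter()
for (payload,) in conn.execute(
    "SELECT event_data FROM fct_product_unified_events WHERE event_type = 'LLMUsed'"
):
    llm_payload_keys.update(json.loads(payload).keys())
```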

Key Relationships

All tables share a consistent key hierarchy:

dim_product_unified_users.user_id
    = fct_product_unified_tickets.user_id
    = fct_product_unified_interactions.user_id
    = fct_product_unified_events.user_id

fct_product_unified_tickets.ticket_id
    = fct_product_unified_interactions.ticket_id

fct_product_unified_interactions.interaction_id
    = fct_product_unified_events.interaction_id

All tables are partitioned by timestamp and clustered by environment_id for optimal query performance.
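As an illustration of the key hierarchy, the sketch below walks from users down to events with one join per level and counts what it finds. It runs on an in-memory SQLite stand-in with invented rows and only a subset of columns. Against the shared dataset you would qualify the table names with your linked dataset, and probably add a timestamp filter and an `environment_id` filter.

```python
import sqlite3

# In-memory stand-in with a reduced column set; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product_unified_users (
    environment_id TEXT, user_id TEXT, authentication_level TEXT);
CREATE TABLE fct_product_unified_tickets (
    ticket_id TEXT, user_id TEXT, channel_type TEXT);
CREATE TABLE fct_product_unified_interactions (
    interaction_id TEXT, ticket_id TEXT, user_id TEXT, owner TEXT);
CREATE TABLE fct_product_unified_events (
    event_id TEXT, interaction_id TEXT, user_id TEXT);
INSERT INTO dim_product_unified_users VALUES
    ('env1', 'u1', 'authenticated'), ('env1', 'u2', 'anonymous');
INSERT INTO fct_product_unified_tickets VALUES ('t1', 'u1', 'chat');
INSERT INTO fct_product_unified_interactions VALUES
    ('i1', 't1', 'u1', 'ai_agent'), ('i2', 't1', 'u1', 'human_agent');
INSERT INTO fct_product_unified_events VALUES
    ('e1', 'i1', 'u1'), ('e2', 'i1', 'u1'), ('e3', 'i2', 'u1');
""")

# One join per level: user -> ticket -> interaction -> event.
# LEFT JOINs keep users who have no tickets yet.
HIERARCHY_SQL = """
SELECT u.user_id,
       COUNT(DISTINCT t.ticket_id)      AS tickets,
       COUNT(DISTINCT i.interaction_id) AS interactions,
       COUNT(DISTINCT e.event_id)       AS events
FROM dim_product_unified_users AS u
LEFT JOIN fct_product_unified_tickets AS t
       ON t.user_id = u.user_id
LEFT JOIN fct_product_unified_interactions AS i
       ON i.ticket_id = t.ticket_id
LEFT JOIN fct_product_unified_events AS e
       ON e.interaction_id = i.interaction_id
GROUP BY u.user_id
ORDER BY u.user_id
"""

activity_per_user = conn.execute(HIERARCHY_SQL).fetchall()
```

`COUNT(DISTINCT ...)` is used because each join multiplies rows: a ticket with two interactions would otherwise be counted twice.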

Setting up Unified Data Access

On Customer side

BigQuery data sharing configuration instruction

Data region

The linked dataset is created in the europe-west3 region.

If you need this data in another region, you have two options:

Create your own dataset in europe-west3, schedule a periodic job that copies the data into it, and add a secondary replica in your desired region.
Use a third-party tool to copy data to your desired region or service.

Data availability and update schedule

Refresh Timeline

The daily data refresh runs on a fixed schedule between 00:00 and 05:30 UTC.

Time (UTC)	Stage	Details
00:00–04:00	Data ingestion	Platform data is pulled into the data warehouse. EU ingestion starts at 00:00; US ingestion starts at 02:00.
04:00	Data reload	The data warehouse is reloaded with the newly ingested data.
05:30	Delivery confirmation	We verify that refreshed data is available for all customers. Data covers the full previous day (00:00–23:59).
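For scheduling downstream jobs, the timeline above can be turned into a small date calculation. The helper below is a hypothetical sketch, not part of any Zowie tooling. It assumes an on-time refresh and treats 05:30 UTC as the moment the previous day becomes available. Before that time it conservatively returns the day before.

```python
from datetime import date, datetime, time, timedelta, timezone

# Delivery confirmation time from the refresh timeline (UTC).
DELIVERY_CUTOFF_UTC = time(5, 30)


def latest_complete_day(now: datetime) -> date:
    """Return the most recent UTC day expected to be fully loaded.

    Assumes the daily refresh finished on schedule; delays are not modeled.
    """
    if now.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    now_utc = now.astimezone(timezone.utc)
    # From 05:30 UTC onward, yesterday's data is expected to be delivered;
    # earlier than that, fall back to the day before yesterday.
    days_back = 1 if now_utc.time() >= DELIVERY_CUTOFF_UTC else 2
    return now_utc.date() - timedelta(days=days_back)


after_refresh = latest_complete_day(datetime(2024, 5, 2, 6, 0, tzinfo=timezone.utc))
before_refresh = latest_complete_day(datetime(2024, 5, 2, 3, 0, tzinfo=timezone.utc))
```

A job that runs at 06:00 UTC on 2 May would read data for 1 May, while one that runs at 03:00 UTC would still read 30 April.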

Delayed Deliveries

If data is not available by 05:30 UTC, we will:

Take immediate internal action to resolve the issue.
Send you an email notifying you of the delay and outlining any actions needed on your end.

⚠️
Data delivery monitoring is not currently covered by an SLA. We work to resolve any issues as quickly as possible, but do not guarantee specific resolution times at this stage.

Querying tips

Control query costs

BigQuery charges based on the amount of data scanned. To keep costs down:

Always filter on the partitioning columns (`created_at`, `timestamp`) in your `WHERE` clause.
Select only the columns you need and avoid `SELECT *` on large tables.
Use the BigQuery query validator to preview the estimated data scanned before running a query.
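One way to make the first two tips hard to forget is to build queries through a helper that requires both a column list and a time window. The function below is a hypothetical sketch. The fully qualified table name is a placeholder for your own project and linked dataset, and `created_at` is assumed to be a TIMESTAMP column.

```python
from datetime import date


def build_events_query(table: str, columns: list[str], start: date, end: date) -> str:
    """Build a SELECT with explicit columns and a created_at range [start, end)."""
    if not columns:
        raise ValueError("list the columns you need instead of using SELECT *")
    if start >= end:
        raise ValueError("start must be before end")
    column_list = ", ".join(columns)
    # The half-open range keeps consecutive daily windows from overlapping.
    return (
        f"SELECT {column_list}\n"
        f"FROM `{table}`\n"
        f"WHERE created_at >= TIMESTAMP('{start.isoformat()}')\n"
        f"  AND created_at < TIMESTAMP('{end.isoformat()}')"
    )


# Placeholder table path; replace with your own project and linked dataset.
example_sql = build_events_query(
    "my-project.my_linked_dataset.fct_product_unified_events",
    ["event_id", "interaction_id", "event_type"],
    date(2024, 5, 1),
    date(2024, 5, 2),
)
```

The generated text can be pasted into the BigQuery console, where the query validator shows the estimated bytes scanned before you run it.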

FAQ

Can I modify the shared data? No. The shared dataset is read-only. You can query it and copy data into your own datasets for transformation, but the source tables are controlled by Zowie.

What happens if I delete my linked dataset? You can re-subscribe to the Analytics Hub listing at any time. No data is lost on Zowie's side.

Does this include personally identifiable information (PII)? Yes. The shared data includes end-user PII (name, email, phone number, custom properties). You are responsible for handling this data in accordance with your own privacy policies and applicable regulations (e.g., GDPR).

Can I connect tools other than BigQuery? Any tool that can read from BigQuery (Looker, Tableau, Power BI, dbt, Metabase, etc.) can work with the shared dataset. The data lives in BigQuery, so your downstream tools connect to it like any other BigQuery dataset.

What if I need the data in a region other than europe-west3? You can set up a cross-region dataset copy job in BigQuery or use a third-party replication tool. See the Data region section above for details.
