Unified Data Access
Unified Data Access is Zowie’s data sharing product that gives customers direct access to their own Zowie operational data inside Google BigQuery. It provides a structured, unified data model that normalizes data from across Zowie’s product services (AI Supervisor, Inbox, CRM, GenAI, and Zowie Engine) into a single, consistent hierarchy that customers can query, analyze, and integrate with their own data infrastructure.
Key information
- Customer's data, in customer's warehouse – Access to interaction data directly inside Google BigQuery. Run your own queries, build custom reports.
- Unified data model – A clean, consistent hierarchy (Users → Tickets → Interactions → Events) that brings together data from across Zowie's AI Agent, Inbox, and CRM.
- Zero-copy sharing via Analytics Hub – Data is shared securely through Google Analytics Hub private listings.
- Full AI transparency – The Events table exposes every message, reasoning step, knowledge retrieval, and tool invocation, giving you complete visibility into how your AI Agent handles each conversation.
- Cross-channel coverage – Review data from chat, email, and voice interactions in a single, consistent schema.
Database Schema & ERDs
Detailed technical documentation is maintained via dbdocs.io. This serves as our "source of truth" for database structures, providing interactive diagrams and field definitions for developers and analysts.
How it works
Unified Data Access exposes a dataset containing the customer's Zowie data within BigQuery through Analytics Hub.
- Analytics Hub allows for private and secure data sharing between organizations.
- Zowie pays for data storage, while the customer pays for the queries they run.
- The Data team at Zowie is responsible for the data sharing configuration.
The shared dataset contains tables organized around a unified data model that normalizes data from Zowie's AI Supervisor, Inbox, CRM, GenAI engine, and Zowie Engine into a single, consistent hierarchy.
Data Model
The unified data model follows a clear hierarchy:
USER
└── TICKET
    └── INTERACTION
        └── EVENT
- A user can have many tickets.
- A ticket can have many interactions (e.g., the AI Agent handles first, then transfers to a human agent).
- An interaction can have many events (individual messages, actions, system events).
See also -> Onepager Diagram
Users (dim_product_unified_users)
End users (your customers) from Zowie's CRM service. Contains profile fields (name, email, phone), authentication level (authenticated, recognized, or anonymous), external user ID, and arbitrary custom properties set by the customer.
| Field | Description |
|---|---|
| environment_id, user_id | Primary identifiers |
| first_name, last_name, email, phone_number | Profile fields |
| authentication_level | authenticated, recognized, or anonymous |
| external_user_id | Customer's own user ID (only for authenticated herochat users) |
| custom_properties | Arbitrary key-value properties set by your system |
Tickets (fct_product_unified_tickets)
A ticket represents a single support request. Tickets are synthesized from two sources:
- AI tickets – One per AI Supervisor interaction (covers chat, email, and voice channels).
- Standalone inbox tickets – Inbox threads that were never handled by the AI Agent.
| Field | Description |
|---|---|
| ticket_id | Primary identifier: the Supervisor interaction ID (AI tickets) or the inbox chat ID (standalone) |
| user_id | Links to the Users table |
| channel_type | chat, email, or voice |
| status | completed or in_progress |
| subject | Conversation title from inbox |
| collected_feedback | Post-chat survey feedback |
| snapshot_* | User profile fields as they were at ticket creation time |
Snapshot fields capture the user's profile at the moment the ticket was created. This lets you analyze interactions based on who the customer was at that point in time, even if their profile has since changed.
Interactions (fct_product_unified_interactions)
An interaction is a segment of a ticket owned by a single actor.
Interactions are synthesized from three sources:
- AI Agent interactions – From the AI Supervisor. Includes contact reasons, intents, sentiment, and knowledge sources used.
- Inbox synthetic interactions – Human agent and queue segments derived from Inbox chat metadata.
- Standalone inbox fallbacks – A fallback human_agent interaction for standalone inbox tickets with no synthetic interaction data.
| Field | Description |
|---|---|
| interaction_id, ticket_id | Primary key and link to Tickets |
| owner | ai_agent, human_agent, or queue |
| owner_user_id | Chatbot ID (AI) or resolving agent ID (human) |
| state, status | Lifecycle state and detailed status |
| recognized_contact_reasons, final_contact_reason | Contact reasons (AI Agent only) |
| recognized_intents, started_processes | AI enrichment fields |
| final_topic, assigned_topics | Topic classification (Inbox only) |
| statistics_* | First response time, resolution time, SLA metrics |
Practical examples
- Filter interactions by owner = 'ai_agent' and status = 'transferred' to find all cases where the AI Agent escalated to a human.
- Join interactions with tickets on ticket_id to analyze how many interactions it took to resolve a support request.
- Use final_contact_reason to build volume and automation rate breakdowns by topic.
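The first two examples can be sketched as BigQuery Standard SQL, built here with small Python helpers so the dataset path lives in one place. This is an illustrative sketch, not an official query set: the dataset path your_project.your_linked_dataset is a hypothetical placeholder for the linked dataset you created from the Analytics Hub listing, and the column names are taken from the tables in this article. The queries carry no time filter, so consider adding one on the table's partition column before running them on a large dataset.

```python
# Hypothetical placeholder: replace with your own project and linked dataset.
DATASET = "your_project.your_linked_dataset"


def escalations_query(dataset: str = DATASET) -> str:
    """Return SQL listing interactions the AI Agent transferred to a human."""
    return f"""
SELECT interaction_id, ticket_id, final_contact_reason
FROM `{dataset}.fct_product_unified_interactions`
WHERE owner = 'ai_agent'
  AND status = 'transferred'
""".strip()


def interactions_per_ticket_query(dataset: str = DATASET) -> str:
    """Return SQL counting how many interactions each ticket contains."""
    return f"""
SELECT
  t.ticket_id,
  t.channel_type,
  COUNT(i.interaction_id) AS interaction_count
FROM `{dataset}.fct_product_unified_tickets` AS t
JOIN `{dataset}.fct_product_unified_interactions` AS i
  ON i.ticket_id = t.ticket_id
GROUP BY t.ticket_id, t.channel_type
ORDER BY interaction_count DESC
""".strip()


if __name__ == "__main__":
    # Print the SQL so it can be pasted into the BigQuery console.
    print(escalations_query())
    print()
    print(interactions_per_ticket_query())
```

Both helpers only produce query text; paste the output into the BigQuery console or pass it to the client of your choice.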
Events (fct_product_unified_events)
Individual events (messages, actions, system events) within an interaction. This is the most granular level of data available.
Events are synthesized from three sources:
- Inbox events – Agent panel events from human agent interactions.
- GenAI events – AI Agent session events (messages, actions, tool calls).
- Zowie Engine events – Process session events from the legacy engine.
| Field | Description |
|---|---|
| event_id, interaction_id | Primary key and link to Interactions |
| event_type | Type of event (e.g., UserMessageV2, AgentMessage, LLMUsed, BlockEntered) |
| event_data | JSON payload with event-specific details |
| author | user, ai_agent, or human_agent |
| created_at | Event timestamp |
The Events table can be large. Use the created_at timestamp in your WHERE clauses to limit the data scanned and control query costs.
Practical examples
- Filter by author = 'user' and event_type = 'UserMessageV2' to extract all customer messages.
- Parse the event_data JSON for LLMUsed events to analyze which knowledge sources your AI Agent relies on most.
- Count events per interaction to measure conversation complexity.
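As a rough sketch of the first and third examples, the helpers below build BigQuery Standard SQL that always restricts created_at to a date window, in line with the cost advice for this table. Several things here are assumptions rather than documented behavior: the dataset path is a hypothetical placeholder, created_at is assumed to be a TIMESTAMP column, and the helper names are made up for illustration. The structure of event_data is not described in this article, so the sketch selects it as-is rather than guessing at its keys.

```python
from datetime import date

# Hypothetical placeholder: replace with your own project and linked dataset.
DATASET = "your_project.your_linked_dataset"


def created_at_window(start: str, end: str) -> str:
    """Build a half-open created_at filter from two ISO dates (YYYY-MM-DD)."""
    # Parsing validates both values before they are embedded in the SQL text.
    start_d = date.fromisoformat(start)
    end_d = date.fromisoformat(end)
    if start_d >= end_d:
        raise ValueError("start must be earlier than end")
    # Assumes created_at is a TIMESTAMP column.
    return (
        f"created_at >= TIMESTAMP('{start_d.isoformat()}') "
        f"AND created_at < TIMESTAMP('{end_d.isoformat()}')"
    )


def user_messages_query(start: str, end: str, dataset: str = DATASET) -> str:
    """Return SQL selecting customer messages inside the date window."""
    return f"""
SELECT event_id, interaction_id, created_at, event_data
FROM `{dataset}.fct_product_unified_events`
WHERE {created_at_window(start, end)}
  AND author = 'user'
  AND event_type = 'UserMessageV2'
ORDER BY created_at
""".strip()


def events_per_interaction_query(start: str, end: str, dataset: str = DATASET) -> str:
    """Return SQL counting events per interaction inside the date window."""
    return f"""
SELECT interaction_id, COUNT(*) AS event_count
FROM `{dataset}.fct_product_unified_events`
WHERE {created_at_window(start, end)}
GROUP BY interaction_id
ORDER BY event_count DESC
""".strip()


if __name__ == "__main__":
    print(user_messages_query("2024-01-01", "2024-01-08"))
```

The window is half-open (start inclusive, end exclusive), so consecutive windows do not count the same event twice.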
Key Relationships
All tables share a consistent key hierarchy:
dim_product_unified_users.user_id
  = fct_product_unified_tickets.user_id
  = fct_product_unified_interactions.user_id
  = fct_product_unified_events.user_id

fct_product_unified_tickets.ticket_id
  = fct_product_unified_interactions.ticket_id

fct_product_unified_interactions.interaction_id
  = fct_product_unified_events.interaction_id
All tables are partitioned by timestamp and clustered by environment_id for optimal query performance.
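To make the key hierarchy concrete, here is a minimal sketch that walks all four tables with the keys listed above and counts tickets, interactions, and events per user. The dataset path is a hypothetical placeholder, the helper name is invented for illustration, and the filter assumes the events table's created_at is a TIMESTAMP column.

```python
from datetime import date

# Hypothetical placeholder: replace with your own project and linked dataset.
DATASET = "your_project.your_linked_dataset"


def hierarchy_counts_query(since: str, dataset: str = DATASET) -> str:
    """Return SQL counting tickets, interactions, and events per user.

    `since` is an ISO date (YYYY-MM-DD); only events created on or after
    that date are considered.
    """
    since_d = date.fromisoformat(since)  # validates the date before use
    # Inner joins: users with no events in the window do not appear.
    return f"""
SELECT
  u.user_id,
  COUNT(DISTINCT t.ticket_id) AS ticket_count,
  COUNT(DISTINCT i.interaction_id) AS interaction_count,
  COUNT(DISTINCT e.event_id) AS event_count
FROM `{dataset}.dim_product_unified_users` AS u
JOIN `{dataset}.fct_product_unified_tickets` AS t
  ON t.user_id = u.user_id
JOIN `{dataset}.fct_product_unified_interactions` AS i
  ON i.ticket_id = t.ticket_id
JOIN `{dataset}.fct_product_unified_events` AS e
  ON e.interaction_id = i.interaction_id
WHERE e.created_at >= TIMESTAMP('{since_d.isoformat()}')
GROUP BY u.user_id
ORDER BY event_count DESC
""".strip()


if __name__ == "__main__":
    print(hierarchy_counts_query("2024-01-01"))
```

If your dataset covers more than one environment, you may also want to match on environment_id in each join; check the schema documentation to confirm the column is present on every table before relying on it.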
Setting up Unified Data Access
On Customer side
BigQuery data sharing configuration instruction
Data region
The linked dataset is created in the europe-west3 region.
If you need this data in another region, you have two options:
- Set up another dataset in europe-west3, copy data to that dataset (schedule a periodic copy job), and set a secondary replica to your desired region.
- Use a third-party tool to copy data to your desired region or service.
Data availability and update schedule
Refresh Timeline
The daily data refresh has a fixed schedule and runs between 00:00 and 05:30 UTC.
| Time (UTC) | Stage | Details |
|---|---|---|
| 00:00–04:00 | Data ingestion | Platform data is pulled into the data warehouse. |
| 04:00 | Data reload | The data warehouse is reloaded with the newly ingested data. |
| 05:30 | Delivery confirmation | We verify that refreshed data is available for all customers. |
Delayed Deliveries
If data is not available by 5:30 AM UTC, we will:
- Take immediate internal action to resolve the issue.
- Send you an email notifying you of the delay and outlining any actions needed on your end.
Data delivery monitoring is not currently covered by an SLA. We work to resolve any issues as quickly as possible, but do not guarantee specific resolution times at this stage.
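If you schedule downstream jobs against the shared dataset, one simple approach is to start them only after the 05:30 UTC delivery confirmation time. The sketch below shows such a guard; the function name is made up for illustration. Passing the check only means the scheduled window is over. It does not confirm that the refresh succeeded, so a delayed delivery can still leave the previous day's data in place.

```python
from datetime import datetime, time, timezone

# Delivery confirmation time from the refresh timeline, in UTC.
DELIVERY_CONFIRMED_AT = time(5, 30)


def refresh_window_passed(now: datetime) -> bool:
    """Return True once `now` is at or past 05:30 UTC on its own day."""
    if now.tzinfo is None:
        # Naive datetimes are ambiguous, so refuse them instead of guessing.
        raise ValueError("now must be timezone-aware")
    return now.astimezone(timezone.utc).time() >= DELIVERY_CONFIRMED_AT


if __name__ == "__main__":
    if refresh_window_passed(datetime.now(timezone.utc)):
        print("Refresh window is over; downstream jobs can start.")
    else:
        print("Refresh may still be running; try again after 05:30 UTC.")
```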
Querying tips
Control query costs
BigQuery charges based on the amount of data scanned. To keep costs down:
- Always filter on partition columns (created_at, timestamp) in your WHERE clause.
- Select only the columns you need; avoid SELECT * on large tables.
- Use the BigQuery query validator to preview estimated data scanned before running.
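The same estimate shown by the query validator can be obtained programmatically with a dry run. The sketch below assumes you use the google-cloud-bigquery Python client and have credentials configured; the helper names are illustrative. A dry run validates the query and reports the bytes it would process without executing it.

```python
def bytes_to_gib(num_bytes: int) -> float:
    """Convert a byte count to gibibytes for easier reading."""
    return num_bytes / 1024**3


def estimate_scanned_bytes(client, sql: str) -> int:
    """Dry-run `sql` and return the bytes BigQuery estimates it would scan.

    `client` is expected to be a google.cloud.bigquery.Client.
    """
    # Imported here so the helpers above load even where the client
    # library (google-cloud-bigquery) is not installed.
    from google.cloud import bigquery

    job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=job_config)  # validates only, no execution
    return job.total_bytes_processed


# Example usage (requires google-cloud-bigquery and valid credentials):
#
#   from google.cloud import bigquery
#   client = bigquery.Client()
#   scanned = estimate_scanned_bytes(client, "SELECT ...")
#   print(f"This query would scan about {bytes_to_gib(scanned):.2f} GiB")
```

Running the estimate before and after adding a partition filter is a quick way to confirm the filter actually reduces the data scanned.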
FAQ
Can I modify the shared data? No. The shared dataset is read-only. You can query it and copy data into your own datasets for transformation, but the source tables are controlled by Zowie.
What happens if I delete my linked dataset? You can re-subscribe to the Analytics Hub listing at any time. No data is lost on Zowie's side.
Does this include personally identifiable information (PII)? Yes. The shared data includes end-user PII (name, email, phone number, custom properties). You are responsible for handling this data in accordance with your own privacy policies and applicable regulations (e.g., GDPR).
Can I connect tools other than BigQuery? Any tool that can read from BigQuery (Looker, Tableau, Power BI, dbt, Metabase, etc.) can work with the shared dataset. The data lives in BigQuery — your downstream tools connect to it like any other BigQuery dataset.
What if I need the data in a region other than europe-west3? You can set up a cross-region dataset copy job in BigQuery or use a third-party replication tool. See the Data region section above for details.