Unified Data Access

Unified Data Access is Zowie’s data sharing product that gives customers direct access to their own Zowie operational data inside Google BigQuery. It provides a structured, unified data model that normalizes data from across Zowie’s product services (AI Supervisor, Inbox, CRM, GenAI, and Zowie Engine) into a single, consistent hierarchy that customers can query, analyze, and integrate with their own data infrastructure.

Key information

  • Your data, in your warehouse – Access your interaction data directly inside Google BigQuery, run your own queries, and build custom reports.
  • Unified data model – A clean, consistent hierarchy (Users → Tickets → Interactions → Events) that brings together data from across Zowie's AI Agent, Inbox, and CRM.
  • Zero-copy sharing via Analytics Hub – Data is shared securely through Google Analytics Hub private listings.
  • Full AI transparency – The Events table exposes every message, reasoning step, knowledge retrieval, and tool invocation, giving you complete visibility into how your AI Agent handles each conversation.
  • Cross-channel coverage – Review data from chat, email, and voice interactions in a single, consistent schema.

Database Schema & ERDs

Detailed technical documentation is maintained via dbdocs.io. This serves as our "source of truth" for database structures, providing interactive diagrams and field definitions for developers and analysts.


How it works

Unified Data Access exposes a dataset containing the customer's Zowie data within BigQuery through Analytics Hub.

  • Analytics Hub allows for private and secure data sharing between organizations.
  • Zowie pays for data storage, while the customer pays for data querying.
  • The Data team at Zowie is responsible for the data sharing configuration.

The shared dataset contains tables organized around a unified data model that normalizes data from Zowie's AI Supervisor, Inbox, CRM, GenAI engine, and Zowie Engine into a single, consistent hierarchy.


Data Model

The unified data model follows a clear hierarchy:

USER
 └── TICKET
      └── INTERACTION
           └── EVENT
  • A user can have many tickets.
  • A ticket can have many interactions (e.g., the AI Agent handles first, then transfers to a human agent).
  • An interaction can have many events (individual messages, actions, system events).

See also -> Onepager Diagram


Users dim_product_unified_users

End users (your customers) from Zowie's CRM service. Contains profile fields (name, email, phone), authentication level (authenticated, recognized, or anonymous), external user ID, and arbitrary custom properties set by the customer.

Field | Description
environment_id, user_id | Primary identifiers
first_name, last_name, email, phone_number | Profile fields
authentication_level | authenticated, recognized, or anonymous
external_user_id | Customer's own user ID (only for authenticated herochat users)
custom_properties | Arbitrary key-value properties set by your system

Tickets fct_product_unified_tickets

A ticket represents a single support request. Tickets are synthesized from two sources:

  • AI tickets – One per AI Supervisor interaction (covers chat, email, and voice channels).
  • Standalone inbox tickets – Inbox threads that were never handled by the AI Agent.

Field | Description
ticket_id | Primary identifier: Supervisor interaction ID (AI tickets) or inbox chat ID (standalone)
user_id | Links to the Users table
channel_type | chat, email, or voice
status | completed or in_progress
subject | Conversation title from inbox
collected_feedback | Post-chat survey feedback
snapshot_* | User profile fields as they were at ticket creation time

💡 Snapshot fields capture the user's profile at the moment the ticket was created. This lets you analyze interactions based on who the customer was at that point in time, even if their profile has since changed.


Interactions fct_product_unified_interactions

An interaction is a segment of a ticket owned by a single actor.

Interactions are synthesized from three sources:

  • AI Agent interactions – From the AI Supervisor. Includes contact reasons, intents, sentiment, and knowledge sources used.
  • Inbox synthetic interactions – Human agent and queue segments derived from Inbox chat metadata.
  • Standalone inbox fallbacks – A fallback human_agent interaction for standalone inbox tickets with no synthetic interaction data.

Field | Description
interaction_id, ticket_id | Primary key and link to Tickets
owner | ai_agent, human_agent, or queue
owner_user_id | Chatbot ID (AI) or resolving agent ID (human)
state, status | Lifecycle state and detailed status
recognized_contact_reasons, final_contact_reason | Contact reasons (AI Agent only)
recognized_intents, started_processes | AI enrichment fields
final_topic, assigned_topics | Topic classification (Inbox only)
statistics_* | First response time, resolution time, SLA metrics

Practical examples

  • Filter interactions by owner = 'ai_agent' and status = 'transferred' to find all cases where the AI Agent escalated to a human.
  • Join interactions with tickets on ticket_id to analyze how many interactions it took to resolve a support request.
  • Use final_contact_reason to build volume and automation rate breakdowns by topic.
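The three examples above can be sketched as SQL. The snippet below runs them against an in-memory SQLite database so that it works without a BigQuery project; the sample rows, the 'resolved' status value, and the contact reason names are invented for illustration. The same SELECT statements should carry over to BigQuery once the table names are qualified with your linked dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fct_product_unified_tickets (
    ticket_id TEXT, user_id TEXT, channel_type TEXT, status TEXT);
CREATE TABLE fct_product_unified_interactions (
    interaction_id TEXT, ticket_id TEXT, owner TEXT, status TEXT,
    final_contact_reason TEXT);
-- Invented sample rows.
INSERT INTO fct_product_unified_tickets VALUES
    ('t1', 'u1', 'chat', 'completed'),
    ('t2', 'u2', 'email', 'completed');
INSERT INTO fct_product_unified_interactions VALUES
    ('i1', 't1', 'ai_agent', 'transferred', 'refund'),
    ('i2', 't1', 'human_agent', 'resolved', NULL),
    ('i3', 't2', 'ai_agent', 'resolved', 'order_status');
""")

# 1. Interactions where the AI Agent escalated to a human.
escalated = conn.execute("""
    SELECT interaction_id, ticket_id
    FROM fct_product_unified_interactions
    WHERE owner = 'ai_agent' AND status = 'transferred'
""").fetchall()

# 2. Number of interactions it took to handle each ticket.
interactions_per_ticket = conn.execute("""
    SELECT t.ticket_id, COUNT(i.interaction_id) AS interaction_count
    FROM fct_product_unified_tickets AS t
    JOIN fct_product_unified_interactions AS i ON i.ticket_id = t.ticket_id
    GROUP BY t.ticket_id
    ORDER BY t.ticket_id
""").fetchall()

# 3. AI Agent volume and escalations per final contact reason.
volume_by_reason = conn.execute("""
    SELECT final_contact_reason,
           COUNT(*) AS total,
           SUM(CASE WHEN status = 'transferred' THEN 1 ELSE 0 END) AS transferred
    FROM fct_product_unified_interactions
    WHERE owner = 'ai_agent'
    GROUP BY final_contact_reason
    ORDER BY final_contact_reason
""").fetchall()
```

The third query reports raw counts only. How you turn them into an automation rate depends on your own definition, which this document does not prescribe.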


Events fct_product_unified_events

Individual events (messages, actions, system events) within an interaction. This is the most granular level of data available.

Events are synthesized from three sources:

  • Inbox events – Agent panel events from human agent interactions.
  • GenAI events – AI Agent session events (messages, actions, tool calls).
  • Zowie Engine events – Process session events from the legacy engine.

Field | Description
event_id, interaction_id | Primary key and link to Interactions
event_type | Type of event (e.g., UserMessageV2, AgentMessage, LLMUsed, BlockEntered)
event_data | JSON payload with event-specific details
author | user, ai_agent, or human_agent
created_at | Event timestamp

⚠️ The Events table can be large. Use the created_at timestamp in your WHERE clauses to limit the data scanned and control query costs.

Practical examples

  • Filter by author = 'user' and event_type = 'UserMessageV2' to extract all customer messages.
  • Parse the event_data JSON for LLMUsed events to analyze which knowledge sources your AI Agent relies on most.
  • Count events per interaction to measure conversation complexity.
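As a rough sketch of the first and last examples, the snippet below extracts customer messages within a time window and counts events per interaction. It uses an in-memory SQLite database as a stand-in for BigQuery, and the sample rows are invented. The "text" key inside event_data is a hypothetical payload shape, since the real structure of each event type is defined in the schema documentation.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fct_product_unified_events (
    event_id TEXT, interaction_id TEXT, event_type TEXT,
    event_data TEXT, author TEXT, created_at TEXT);
-- Invented sample rows; the "text" key is a hypothetical payload field.
INSERT INTO fct_product_unified_events VALUES
    ('e1', 'i1', 'UserMessageV2', '{"text": "Where is my order?"}', 'user', '2024-05-01 09:00:00'),
    ('e2', 'i1', 'AgentMessage', '{"text": "Let me check."}', 'ai_agent', '2024-05-01 09:00:05'),
    ('e3', 'i2', 'UserMessageV2', '{"text": "Thanks"}', 'user', '2024-05-02 10:00:00');
""")

# Customer messages for one day; the created_at bounds keep the scan narrow.
user_messages = conn.execute(
    """
    SELECT event_id, event_data
    FROM fct_product_unified_events
    WHERE created_at >= ? AND created_at < ?
      AND author = 'user'
      AND event_type = 'UserMessageV2'
    ORDER BY created_at
    """,
    ("2024-05-01", "2024-05-02"),
).fetchall()

# Decode each JSON payload into a Python dict.
payloads = [json.loads(event_data) for _, event_data in user_messages]

# Events per interaction as a simple measure of conversation complexity.
events_per_interaction = conn.execute("""
    SELECT interaction_id, COUNT(*) AS event_count
    FROM fct_product_unified_events
    GROUP BY interaction_id
    ORDER BY interaction_id
""").fetchall()
```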

Key Relationships

All tables share a consistent key hierarchy:

dim_product_unified_users.user_id
    = fct_product_unified_tickets.user_id
    = fct_product_unified_interactions.user_id
    = fct_product_unified_events.user_id

fct_product_unified_tickets.ticket_id
    = fct_product_unified_interactions.ticket_id

fct_product_unified_interactions.interaction_id
    = fct_product_unified_events.interaction_id

All tables are partitioned by timestamp and clustered by environment_id for optimal query performance.
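The key chain can be followed in a single query. The sketch below joins all four tables through user_id, ticket_id, and interaction_id and rolls the counts up per user. It runs on an in-memory SQLite database with invented rows; if your dataset spans several environments, you may also want to match on environment_id, which is an assumption not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product_unified_users (environment_id TEXT, user_id TEXT);
CREATE TABLE fct_product_unified_tickets (ticket_id TEXT, user_id TEXT);
CREATE TABLE fct_product_unified_interactions (interaction_id TEXT, ticket_id TEXT);
CREATE TABLE fct_product_unified_events (event_id TEXT, interaction_id TEXT);
-- Invented sample rows.
INSERT INTO dim_product_unified_users VALUES ('env1', 'u1'), ('env1', 'u2');
INSERT INTO fct_product_unified_tickets VALUES ('t1', 'u1'), ('t2', 'u1'), ('t3', 'u2');
INSERT INTO fct_product_unified_interactions VALUES ('i1', 't1'), ('i2', 't1'), ('i3', 't2');
INSERT INTO fct_product_unified_events VALUES ('e1', 'i1'), ('e2', 'i1'), ('e3', 'i3');
""")

# LEFT JOINs keep users and tickets that have no rows further down the chain.
rollup = conn.execute("""
    SELECT u.user_id,
           COUNT(DISTINCT t.ticket_id)      AS tickets,
           COUNT(DISTINCT i.interaction_id) AS interactions,
           COUNT(e.event_id)                AS events
    FROM dim_product_unified_users AS u
    LEFT JOIN fct_product_unified_tickets AS t
           ON t.user_id = u.user_id
    LEFT JOIN fct_product_unified_interactions AS i
           ON i.ticket_id = t.ticket_id
    LEFT JOIN fct_product_unified_events AS e
           ON e.interaction_id = i.interaction_id
    GROUP BY u.user_id
    ORDER BY u.user_id
""").fetchall()
```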


Setting up Unified Data Access

On Customer side

See the BigQuery data sharing configuration instructions.

Data region

The linked dataset is created in the europe-west3 region.

If you need this data in another region, you have two options:

  • Set up another dataset in europe-west3, copy data to that dataset (schedule a periodic copy job), and set a secondary replica to your desired region.
  • Use a third-party tool to copy data to your desired region or service.


Data availability and update schedule

Refresh Timeline

The daily data refresh has a fixed schedule and runs between 00:00 UTC and 05:30 UTC.

Time (UTC) | Stage | Details
00:00–04:00 | Data ingestion | Platform data is pulled into the data warehouse. EU ingestion starts at 00:00; US ingestion starts at 02:00.
04:00 | Data reload | The data warehouse is reloaded with the newly ingested data.
05:30 | Delivery confirmation | We verify that refreshed data is available for all customers. Data covers the full previous day (00:00–23:59).
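For downstream jobs, the schedule translates into a simple rule of thumb: after 05:30 UTC the most recent complete day should be yesterday, and before that it is the day before yesterday. The helper below is a minimal sketch of that arithmetic; the function name is hypothetical, and it assumes the refresh ran on time on the day in question.

```python
from datetime import date, datetime, time, timedelta, timezone

# Delivery confirmation time taken from the refresh schedule (UTC).
DELIVERY_CUTOFF_UTC = time(5, 30)


def latest_expected_day(now: datetime) -> date:
    """Return the most recent full day that should already be delivered.

    `now` must be timezone-aware; it is converted to UTC before comparing.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    now_utc = now.astimezone(timezone.utc)
    if now_utc.time() >= DELIVERY_CUTOFF_UTC:
        # Today's refresh has completed, so yesterday is fully covered.
        return now_utc.date() - timedelta(days=1)
    # Today's refresh may still be running; fall back one more day.
    return now_utc.date() - timedelta(days=2)


after_refresh = latest_expected_day(datetime(2024, 5, 2, 6, 0, tzinfo=timezone.utc))
during_refresh = latest_expected_day(datetime(2024, 5, 2, 3, 0, tzinfo=timezone.utc))
```

Scheduling dependent jobs shortly after 05:30 UTC, and checking that the expected day is actually present before processing, guards against delayed deliveries.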

Delayed Deliveries

If data is not available by 05:30 UTC, we will:

  1. Take immediate internal action to resolve the issue.
  2. Send you an email notifying you of the delay and outlining any actions needed on your end.

⚠️ Data delivery monitoring is not currently covered by an SLA. We work to resolve any issues as quickly as possible, but do not guarantee specific resolution times at this stage.


Querying tips

Control query costs

BigQuery charges based on the amount of data scanned. To keep costs down:

  • Always filter on partitioned columns (created_at, timestamp) in your WHERE clause.
  • Select only the columns you need — avoid SELECT * on large tables.
  • Use the BigQuery query validator to preview estimated data scanned before running.
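One way to make the first two tips hard to forget is to generate queries through a small helper that requires an explicit column list and a created_at window. The sketch below only builds the SQL text; the function name and the table path `my-project.zowie_shared.fct_product_unified_events` are placeholders, so substitute the path of your own linked dataset.

```python
from datetime import date


def build_events_query(table: str, columns: list[str], start: date, end: date) -> str:
    """Build a query that names its columns and filters on created_at."""
    if not columns:
        raise ValueError("list the columns you need instead of using SELECT *")
    if start >= end:
        raise ValueError("start must be earlier than end")
    column_list = ", ".join(columns)
    return (
        f"SELECT {column_list}\n"
        f"FROM `{table}`\n"
        f"WHERE created_at >= TIMESTAMP('{start.isoformat()}')\n"
        f"  AND created_at < TIMESTAMP('{end.isoformat()}')"
    )


query = build_events_query(
    "my-project.zowie_shared.fct_product_unified_events",  # placeholder path
    ["event_id", "interaction_id", "event_type"],
    date(2024, 5, 1),
    date(2024, 5, 2),
)
```

Pasting the generated text into the BigQuery console lets the query validator show the estimated data scanned before you run it.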


FAQ

Can I modify the shared data?
No. The shared dataset is read-only. You can query it and copy data into your own datasets for transformation, but the source tables are controlled by Zowie.

What happens if I delete my linked dataset?
You can re-subscribe to the Analytics Hub listing at any time. No data is lost on Zowie's side.

Does this include personally identifiable information (PII)?
Yes. The shared data includes end-user PII (name, email, phone number, custom properties). You are responsible for handling this data in accordance with your own privacy policies and applicable regulations (e.g., GDPR).

Can I connect tools other than BigQuery?
Any tool that can read from BigQuery (Looker, Tableau, Power BI, dbt, Metabase, etc.) can work with the shared dataset. The data lives in BigQuery, and your downstream tools connect to it like any other BigQuery dataset.

What if I need the data in a region other than europe-west3?
You can set up a cross-region dataset copy job in BigQuery or use a third-party replication tool. See the Data region section above for details.