
Announcing General Availability of Real-Time Reverse ETL, with Confluent Cloud and Apache Kafka® as sources

Will Voutier, January 17, 2024

Last October, when we announced Live Syncs and our intent to create the first-ever Real-Time Composable CDP, we were aware that it was beyond what our customers expected. What they were looking for was faster Reverse ETL syncs, not necessarily activation in single-digit seconds (aka “true” real-time). 

However, at Census we like to push the limits of what’s possible. We’re actively building towards a future where every customer interaction can be real-time. Imagine instantly sending a targeted notification based on a customer's real-time location, A/B testing the right incentive based on their profile data, and analyzing engagement in seconds — all powered by your single source of truth in the data warehouse.

The future of data is real-time, and today’s GA of Live Syncs is our first step towards enabling real-time speed for every single customer use case.

General availability of Live Syncs, enabling real-time data activation

Today, we’re thrilled to announce the general availability of Live Syncs with support for our first two sources — Apache Kafka and Confluent Cloud. We chose to support Confluent and Kafka natively from the ground up because Kafka is the gold standard in event-driven systems, and the authoritative source of truth for streaming event data at many enterprises. 

Live Syncs can activate from real-time streaming data sources with sub-second latency to unlock new high-speed use cases like geotargeted campaigns, abandoned cart notifications, and lead routing in over 200 business tools.

Apache Kafka is an open-source data streaming technology used by over 80% of the Fortune 100 for mission-critical, real-time applications. Confluent, founded by the original creators of Kafka, provides a cloud-native and complete data streaming platform available everywhere it’s needed to massively expand the value of real-time data while eliminating the costly operational burdens of infrastructure management.

“Enterprises around the world use Confluent’s data streaming platform to efficiently and securely implement a near unlimited number of real-time use cases to deliver high-value, differentiated customer engagement. Together, Census and Confluent make it possible for brands to operate around real-time, trusted data streams that equip business teams with the insights they need to build rich, personalized experiences.”
— Rob Taylor, Global Head of Technology Alliances, Confluent

Census now supports real-time, high-speed streaming data in addition to high-quality warehouse data

The advantages of Census Live Syncs

  1. Real-time use cases are no longer a limitation of the Composable CDP — Latency is one of the last barriers blocking some marketing teams from using the warehouse to power their operations. Now, marketers can unleash the full power of the data warehouse by leveraging Customer 360 profiles in real-time.
  2. Fully composable with no vendor lock-in — Census integrates seamlessly with any data stack and business tool, achieving maximum flexibility as a core tenet of the Composable CDP. Previous solutions like packaged CDPs required adoption of their entire system to even begin tackling high-speed use cases. Census prevents both vendor lock-in and unwieldy custom builds. 
  3. Works with existing event tracking infrastructure — Unlike packaged solutions, Census works wherever your source of event data is. The ability to “bring your own event collection” makes Census easier and faster to implement than any other real-time activation platform. Customers can leverage any variety of event collection solutions like CDIs, product analytics, or even custom tracking.

The journey to true real-time Reverse ETL

Leading brands like Sonos, Crocs, and Canva use Reverse ETL to sync customer data from their cloud data warehouse to downstream tools like CRMs and marketing platforms. This enables them to centralize business operations, customer data, and analytics across all teams in a single source of truth.

Ever since we created Reverse ETL in 2018, we’ve built an increasingly sophisticated and scalable data syncing engine that’s highly optimized for processing big batches of data. A Live Sync, by contrast, is a sync that runs forever: it syncs records from the source to the destination as they change, rather than relying on a periodic diff.

This is a big departure from how traditional Reverse ETL normally works. 

Traditional Reverse ETL, circa 2018

Traditional Reverse ETL: Batch-based diffing in the data warehouse

We built a new system to overcome the inherent latency in Reverse ETL. Because interacting with Kafka is very different from generating and diffing snapshots of a SQL query, building Live Syncs required us to re-engineer our sync architecture from the bottom up to activate data with sub-second latency.
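To make the batch-based approach concrete, here is a minimal sketch of snapshot diffing in Python. It is an illustration only, not Census's actual sync engine: the function name `diff_snapshots` and the sample rows are hypothetical, and each snapshot is assumed to be the result of a SQL query keyed by primary key.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]
Snapshot = Dict[str, Row]  # primary key -> row from the query result


def diff_snapshots(previous: Snapshot, current: Snapshot) -> Dict[str, List[Row]]:
    """Compare two snapshots of a query and report what changed between runs."""
    added = [current[k] for k in current if k not in previous]
    updated = [
        current[k] for k in current
        if k in previous and current[k] != previous[k]
    ]
    removed = [previous[k] for k in previous if k not in current]
    return {"added": added, "updated": updated, "removed": removed}


# Hypothetical snapshots taken on two consecutive scheduled runs.
previous = {
    "u1": {"id": "u1", "plan": "free"},
    "u2": {"id": "u2", "plan": "pro"},
}
current = {
    "u1": {"id": "u1", "plan": "pro"},    # changed since the last run
    "u3": {"id": "u3", "plan": "free"},   # new since the last run
}

changes = diff_snapshots(previous, current)
```

In this model, a change becomes visible only when the next snapshot is taken and compared, so latency is bounded by the sync schedule and the time it takes to run the query and the diff.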

Warehouse Streaming Reverse ETL: Near real-time
Warehouse Streaming Reverse ETL doesn’t yet meet the requirements of “true” real-time, i.e. activation in single-digit seconds

While many major data warehouses have made strides in recent years in ingesting streaming data, far fewer are able to stream out incremental query results. Streaming Reverse ETL is enabled by features from major data warehouse vendors, such as Snowflake’s Dynamic Tables and Databricks’s Streaming Tables. While it can be significantly faster than traditional data processing pipelines, it requires data teams to learn and implement an entirely new way of modeling their warehouse data, and the latency it offers is still limited. Realistically, Streaming Reverse ETL works on the order of 1-5 minutes, not 1-5 seconds, which may not be fast enough for some real-time applications.

Real-Time Streaming Reverse ETL: True real-time data activation

With today’s release of Confluent and Kafka as Live Sync sources, we’re the first Reverse ETL platform to offer true real-time data activation. Live Syncs can activate off a real-time streaming data source to trigger actions in downstream tools with sub-second latency, and they are built on top of the gold standard in streaming technology. We believe that any approach to true real-time data activation must involve Kafka, alongside or instead of streams available in cloud data warehouses.
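The event-driven model can be sketched as a loop that forwards each record the moment it is read, with no snapshot or diff in between. This is a simplified illustration under stated assumptions, not how Live Syncs are implemented: `fake_topic` is a hypothetical stand-in for a Kafka topic, `run_live_sync` and `deliver` are made-up names, and messages are assumed to be JSON-encoded.

```python
import json
from typing import Callable, Iterable, Iterator


def fake_topic() -> Iterator[bytes]:
    """Hypothetical stand-in for a Kafka topic: yields raw message values."""
    yield b'{"user_id": "u1", "event": "cart_abandoned"}'
    yield b'{"user_id": "u2", "event": "lead_created"}'


def run_live_sync(messages: Iterable[bytes], deliver: Callable[[dict], None]) -> int:
    """Forward every message to the destination as soon as it is read."""
    delivered = 0
    for raw in messages:
        record = json.loads(raw)  # assumes JSON-encoded message values
        deliver(record)           # e.g. a call to a downstream tool's API
        delivered += 1
    return delivered


# Collect deliveries in a list instead of calling a real destination.
sent = []
count = run_live_sync(fake_topic(), sent.append)
```

With a real Kafka consumer the message stream is unbounded, so the loop would keep running for as long as the sync is enabled. The stand-in generator ends after two messages only so the sketch terminates.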

Want to see it in action? We’re happy to show you live

Warehouse-Enriched Real-Time Reverse ETL: The best of both worlds

Census’s end goal — enriching real-time event streams with historical data in the data warehouse or lakehouse

That said, a streaming data source on its own lacks the historical data that you’ve worked so hard to model in your warehouse. Our ultimate goal is to offer Real-Time Streaming Reverse ETL with Warehouse Enrichment, allowing any Census customer to easily join their high-speed event streams with their high-quality warehouse data. This unlocks a new era of data activation, where you’ll be able to serve your customers better, instantly.
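Conceptually, warehouse enrichment is a join between an incoming event and a profile modeled in the warehouse. The sketch below shows the idea with an in-memory lookup. It is purely illustrative of the concept and not a description of a Census feature or API: `WAREHOUSE_PROFILES`, `enrich_event`, and all field names are hypothetical.

```python
from typing import Any, Dict

# Hypothetical profiles modeled in the warehouse, keyed by user ID.
WAREHOUSE_PROFILES: Dict[str, Dict[str, Any]] = {
    "u1": {"lifetime_value": 1250.0, "segment": "loyal"},
}


def enrich_event(event: Dict[str, Any],
                 profiles: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Attach the matching warehouse profile to a streaming event."""
    profile = profiles.get(event.get("user_id"), {})
    # Nest the profile under its own key so event fields are never overwritten.
    return {**event, "profile": profile}


known = enrich_event({"user_id": "u1", "event": "cart_abandoned"}, WAREHOUSE_PROFILES)
unknown = enrich_event({"user_id": "u9", "event": "cart_abandoned"}, WAREHOUSE_PROFILES)
```

An event for a user with no warehouse profile still flows through with an empty profile, so a missing history does not block activation.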

How Live Syncs work under the hood

Live Syncs are a new type of sync in Census, available to all organizations with access to Continuous syncs.

When users create a sync, they can choose a Run Mode for that sync, either Live or Triggered. 

  • Triggered Syncs can be run manually, via API or external trigger, or on a schedule. 
  • Live Syncs are always running while they are enabled, syncing data in real-time.

Live run mode is available for syncs from select sources, starting with Confluent Cloud and Kafka, and replaces continuous syncs for these sources. After a user connects their Confluent Cloud or Kafka cluster, they must define schemas for the Kafka topics they want to use via the Models tab.
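As a rough illustration of why a topic needs a schema before it can be synced, the sketch below models the two run modes and checks an incoming message against a declared set of fields. None of this reflects the actual Census configuration format: `RunMode`, `CART_EVENTS_SCHEMA`, `parse_message`, and the field names are assumptions made for the example, and messages are assumed to be JSON.

```python
import json
from enum import Enum
from typing import Any, Dict


class RunMode(Enum):
    LIVE = "live"            # always running while the sync is enabled
    TRIGGERED = "triggered"  # run manually, via API or trigger, or on a schedule


# Hypothetical schema for one topic: field name -> expected Python type.
CART_EVENTS_SCHEMA: Dict[str, type] = {
    "user_id": str,
    "cart_value": float,
    "abandoned_at": str,
}


def parse_message(raw: bytes, schema: Dict[str, type]) -> Dict[str, Any]:
    """Decode a JSON message and keep only the fields declared in the schema."""
    record = json.loads(raw)
    for field, expected in schema.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        if not isinstance(record[field], expected):
            raise ValueError(f"field {field} is not of type {expected.__name__}")
    return {field: record[field] for field in schema}


mode = RunMode.LIVE
row = parse_message(
    b'{"user_id": "u1", "cart_value": 42.5, "abandoned_at": "2024-01-17T10:00:00Z", "extra": 1}',
    CART_EVENTS_SCHEMA,
)
```

Kafka messages are opaque bytes, so declaring the expected fields up front is what lets them be treated as rows with known columns when mapping to a destination.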

Our Confluent Cloud source integration was verified by Confluent as part of their Connect with Confluent technology partner program.

Getting started with real-time

We strongly believe that composable and warehouse-native marketing is the future, and that warehouse latency should no longer be a barrier to real-time customer engagement. 

Next on our roadmap is support for more Live Sync sources such as Snowflake and Databricks, as well as the ability to enrich streaming events with warehouse data.

If you share our vision of the data warehouse as the center of the business, we’d love to help you start building your Real-Time Composable CDP. Get a demo or start a free trial today.

Not yet a Confluent customer? Start your free trial of Confluent Cloud. New sign-ups receive $400 to spend during their first 30 days. No credit card required.

We look forward to hearing from you!