Best Practices

The ideal analytics stack for founders

Sylvain Giuliani, February 25, 2021

Syl is the Head of Growth & Operations at Census. He's a revenue leader and mentor with a decade of experience building go-to-market strategies for developer tools.

Jamie Quint knows a thing or two about analytics. He’s one of the minds behind Reddit’s coin system, the creator of Notion’s analytics stack, and an alum of many household-name startups.

Last November, he tweeted about which tools make up the best analytics stack for founders.

Best stack is: @Amplitude_HQ @ModeAnalytics @segment @fivetran @SnowflakeDB @getdbt @getcensus
— Jamie Quint (@jamiequint) November 24, 2020

“Over the years, I've used probably basically every conceivable database or data system that's reasonably popular to use at companies of zero to 500 people,” he said.

To get a bit more insight into how Jamie used these seven tools (Amplitude, Mode, Segment, Fivetran, Snowflake, dbt, and Census), we sat down with him to discuss the philosophy behind his choice of stack and how he sees analytics stacks changing in the future.

Jamie’s Framework for Picking the Right Analytics Stack

When Jamie started at Notion, the company was running its analytics on Amplitude. While Jamie said it’s an “amazing product” that most companies should be using, he added that people often get into trouble when they try to stretch it past its limits.

“Once you need to dive deep into, say, a retention question … that doesn’t really fit what Amplitude is designed to do,” he said.

To answer this type of question, you’d need to combine different types of data, like purchasing data from Stripe and support data from Intercom, which Amplitude isn’t designed to do.

With these limitations in mind, Jamie thought about the functions he needed from his analytics stack. He describes his thought process as a question-and-answer format, based on different needs.

Need: We need a place to combine different types of data.
Jamie’s Question: How can we store our customer data in a way that allows it to be combined with the other data and queried directly?
Answer: We need a data warehouse that can store structured and semistructured data. Snowflake is our best option.

Need: We need a way to send different types of data to the data warehouse.
Jamie’s Question: How do we import all the data we need into the data warehouse in a way that allows us to transform it all later?
Answer: We need an ELT (extract, load, transform) tool that can import all our data. Fivetran can suit this purpose.

Need: We need a way to get our combined data out of that place.
Jamie’s Question: How can we push that data back out to Salesforce for light attribution modeling, Facebook and Google for custom audiences, or Google Sheets for growth modeling?
Answer: We need a tool that makes it simple to automate the process of pulling data out of our warehouse and putting it where it needs to go. Census is what we need here.

He repeated this process of identifying needs and functions for about three months, and at the end, was left with the exact analytics stack he wanted and needed. He said his combined growth-marketing and engineering background helped in his process, but that it’s not a prerequisite for anyone looking to do the same.

“At Notion, I built the entire analytics stack myself. I was the person doing almost all of the analysis and the person doing most of the marketing,” Jamie says.

“Building the analytics stack was singularly encapsulated in my job, but I don't think it's some huge technical undertaking. Usually, someone on the marketing side is going to have to partner with somebody who's a little bit more technical to figure out some of this stuff.”

The marketer presents the need, the technical person presents the question, and they both find the answer.

For example, if a growth-marketing-oriented founder is trying to find out the upgrade rates for customers who contact support, they might know that they need to combine Stripe and Intercom data, but they’d have to turn to their technical partner to make it happen. The engineer, in turn, would translate that need into a more specific question, like, “If you want to combine those types of data, shouldn’t we combine all our customer data and make it easy to query?” From there, both set out to find an answer: a tool that serves the specific function needed to answer the original question.
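Once the Stripe and Intercom data live side by side in the warehouse, that upgrade-rate question becomes a simple join. Here’s a minimal sketch in plain Python of the logic an analyst would express in SQL; the customer IDs and row shapes are invented for illustration:

```python
# Hypothetical exports: in practice these rows would already sit side by
# side in the warehouse after an ELT tool loads them.
stripe_rows = [
    {"customer_id": "c1", "upgraded": True},
    {"customer_id": "c2", "upgraded": False},
    {"customer_id": "c3", "upgraded": True},
]
intercom_rows = [
    {"customer_id": "c1"},  # c1 and c2 contacted support; c3 did not
    {"customer_id": "c2"},
]

contacted_support = {row["customer_id"] for row in intercom_rows}

def upgrade_rate(rows):
    """Share of customers in `rows` who upgraded."""
    return sum(r["upgraded"] for r in rows) / len(rows) if rows else 0.0

# Split Stripe customers by whether they appear in the Intercom data.
with_support = [r for r in stripe_rows if r["customer_id"] in contacted_support]
without_support = [r for r in stripe_rows if r["customer_id"] not in contacted_support]

print(f"upgrade rate, contacted support: {upgrade_rate(with_support):.0%}")    # → 50%
print(f"upgrade rate, no support contact: {upgrade_rate(without_support):.0%}")  # → 100%
```

The point isn’t the arithmetic; it’s that the question is only answerable at all once both datasets share one queryable home.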

The Seven Tools in Jamie's Analytics Stack

Jamie has honed his ideal modern data stack down to Amplitude, Mode, Segment, Fivetran, Snowflake, dbt, and Census. Right now, he says these are the best tools to fulfill the analytics functions every founder needs.

“You’ve got to get the data in, got to transform it to be useful, got to be able to analyze it and then have to be able to get it out into other platforms, where it can be used and utilized to add value to the business,” he said, describing the core needs this stack serves.

He broke his thoughts down for us tool by tool:

  1. Amplitude fulfills the need for easy, flexible data analysis. “People need to have simple analysis that doesn't require writing a bunch of SQL.”
  2. Mode fulfills the need for more complex analysis. “People also need visualization for analysis that's written in SQL.”
  3. Segment serves as a “meta analytics layer” which serves to “federate data across all the different tools ... without having to include a bunch of JavaScripts on the page.”
  4. Snowflake is the central place to store all your data. It’s best suited for this function because of its scalability and general ease of use.
  5. dbt helps you prepare this centralized data to make it easier to use. Jamie says dbt is “the transform layer, ... so people can just load all the data into the database/data warehouse and then figure out what to do with it.”
  6. Fivetran sends data into your data warehouse. “Everyone's using a lot more SaaS tools and they need to get data into their data warehouse.”
  7. Census helps you put your data to use. According to Jamie, people need to “get their data out of the warehouse and back into your other tools so you can make it usable.”

As comprehensive as this analytics stack is, Jamie is quick to admit that there is room for improvement. He says that data-quality monitoring is the one big gap that needs to get filled soon.

“There are a number of startups like Iteratively that are looking at doing stuff like that. Companies like Segment and Amplitude are also trying to do that at the same time.”

The Future of This Analytics Stack

Right now, Jamie’s analytics stack is the ideal stack for founders. But it may not always be that way. The key is understanding how Jamie landed on this particular set of tools so you can adapt as technology advances and changes.

“The functionality of all of these tools in my stack is something that people are going to continue to need for an indefinite amount of time,” Jamie says. “I think the question is always 'will something come along that just blows the other ones away?'”

Jamie has a lot more to say on the future of data and his experience working with analytics throughout the years. We think learning from the past is one of the best ways to prepare for the future, especially in an industry that moves as fast as the data and analytics industry.

“I don't know enough to make a prediction about where analytics tools will be in five years,” Jamie says. “I don't think five years ago anyone would have predicted that Google's BigQuery and Amazon Redshift would get beaten by Snowflake.”

For the foreseeable future, Jamie’s seven tools—Amplitude, Mode, Segment, Fivetran, Snowflake, dbt, and Census—are what you need. Schedule a demo with Census today and we’ll help you get started on your analytics stack.
