Building an Account Hierarchy Model | Census

Sean Lynch
October 20, 2020

Sean is co-founder and Chief Product Officer (CPO) at Census. He loves enabling data-driven organizations, so he's energized by introducing the world to Data Activation.

Bottoms-up growth is a buzzy go-to-market technique right now. From the exciting self-service adoption of products like Figma to the record-breaking growth of Zoom, letting individuals adopt a tool without first having to talk to sales is a winning strategy. Customers get started faster, and because they already understand the product, they don't need to be marketed or sold to in order to make a purchase decision.

But these companies are finding that self-service adoption is only the first step in a customer journey. For the business to grow, adoption by individuals and teams eventually needs to lead to enterprise sales. Sure, in the best case that happens naturally as excitement inside the company forces a CIO's decision. More often than not, though, it still requires traditional sales to step in.

The tricky part is that all that self-service usage data is messy. There’s nothing stopping people from signing up with personal emails or multiple people at the same company from signing up for their own teams, completely independently from each other. All of this easy adoption leaves you with multiple different teams and overlapping users, all working for the same employer. Cutting through the noise for Sales, Marketing, and Success teams requires bringing order to the chaos.

Today, we’ll talk through how you can structure product data into account hierarchies that are actionable, all within SQL and your Data Warehouse. Here's what we'll cover:

  • How to identify companies
  • Understanding your current product data model
  • Aggregating product data into companies
  • Using your new hierarchy

Identifying companies

It all starts by figuring out which real-world companies are using your product. And in order to identify a unique, real-world company, we need a reliable identifier. Unfortunately, that may not be immediately obvious if you're just looking at a list of users that have signed up to start using your product. But we have one very obvious hint: the domain names that appear in users' email addresses.

Domain names are the easiest place to start building an account hierarchy. They're easy for humans to understand and debug, and users often provide them directly to you as part of their email address. Alternatively, you can ask users to provide a domain as they sign up for your product.

To start building our account hierarchy of companies, we'll create a set of all the domain names that appear in our database. There are some obvious issues with this approach (gmail.com users being a very common example), and we'll address them along the way. So let's see what domains we have.
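As a minimal sketch of that first step, here's the domain set extracted with an in-memory SQLite database standing in for your warehouse. The `users` table and its columns are assumptions for illustration; warehouse dialects like Snowflake or Redshift would let you write `SPLIT_PART(email, '@', 2)` instead of the `substr`/`instr` combination.

```python
import sqlite3

# Hypothetical users table standing in for your warehouse;
# table and column names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "ada@acme.com"), (2, "bob@acme.com"), (3, "carol@gmail.com")],
)

# SQLite lacks SPLIT_PART, so take everything after the '@'.
# In Snowflake/Redshift: SPLIT_PART(email, '@', 2).
domains = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT substr(email, instr(email, '@') + 1) AS domain "
        "FROM users ORDER BY domain"
    )
]
print(domains)  # ['acme.com', 'gmail.com']
```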

Understanding your data model

We’re going to be translating one model to the other so before we dive in, let's get a handle on what your data looks like currently and where it’s located. It may be spread across a few different tables, it may not even be in your data warehouse (you do have a data warehouse, right?).

One thing to keep an eye out for: does your product have a workspace concept? You might call it an organization, a group, or a team. This group is a really powerful hint we can use for identifying companies. If you do have workspaces, you'll also need to know who the "owner" of a workspace is. This is typically the user that created it, but you may have a better identifier, such as an admin.

With your users and your workspaces, you can use some basic SQL to extract the domain from an email address. For each user and workspace, you now know which Company they belong to. We can use this to start aggregating our users.
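Here's a sketch of that extraction for workspaces, assuming hypothetical `users` and `workspaces` tables where each workspace stores an `owner_id`; the workspace inherits the domain of its owner's email.

```python
import sqlite3

# Assumed schema: users plus workspaces with an owner_id column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER, email TEXT);
CREATE TABLE workspaces (id INTEGER, name TEXT, owner_id INTEGER);
INSERT INTO users VALUES (1, 'ada@acme.com'), (2, 'bob@initech.com');
INSERT INTO workspaces VALUES (10, 'Acme Eng', 1), (11, 'Initech Ops', 2);
""")

# Tag each workspace with its owner's email domain; in a warehouse
# dialect you'd use SPLIT_PART(u.email, '@', 2) instead of substr/instr.
workspace_domains = conn.execute("""
    SELECT w.id, w.name, substr(u.email, instr(u.email, '@') + 1) AS domain
    FROM workspaces w
    JOIN users u ON u.id = w.owner_id
    ORDER BY w.id
""").fetchall()
print(workspace_domains)
```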

Pro tip: You may have situations where companies are represented by more than one domain. In this case, you’ll want to add the ability to associate secondary domains to a primary domain as part of rolling up workspaces together.
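One way to model those secondary domains is a small alias table; the `domain_aliases` table and `canonical_domain` helper here are hypothetical names. In pure SQL you'd LEFT JOIN the alias table and COALESCE back to the raw domain when no alias exists.

```python
import sqlite3

# Hypothetical alias table mapping secondary domains to a primary one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE domain_aliases (secondary_domain TEXT, primary_domain TEXT);
INSERT INTO domain_aliases VALUES ('acme.co.uk', 'acme.com');
""")

def canonical_domain(conn, domain):
    # Equivalent to: SELECT COALESCE(a.primary_domain, domain)
    # FROM ... LEFT JOIN domain_aliases a ON a.secondary_domain = domain
    row = conn.execute(
        "SELECT primary_domain FROM domain_aliases WHERE secondary_domain = ?",
        (domain,),
    ).fetchone()
    return row[0] if row else domain

print(canonical_domain(conn, "acme.co.uk"))   # acme.com
print(canonical_domain(conn, "initech.com"))  # initech.com
```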

Aggregating into Companies

The simplest hierarchy is

  • GROUP BY domain
  • COUNT(*) of users or workspaces
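In SQL terms, the two bullets above are a single aggregation query. A sketch with SQLite standing in for the warehouse (table and column names assumed):

```python
import sqlite3

# Assumed users table; one row per signed-up user.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER, email TEXT);
INSERT INTO users VALUES
  (1, 'ada@acme.com'), (2, 'bob@acme.com'), (3, 'eve@initech.com');
""")

# Simplest company rollup: one row per domain with a user count.
companies = conn.execute("""
    SELECT substr(email, instr(email, '@') + 1) AS domain,
           COUNT(*) AS user_count
    FROM users
    GROUP BY domain
    ORDER BY user_count DESC
""").fetchall()
print(companies)  # [('acme.com', 2), ('initech.com', 1)]
```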

We’ve got the start of our account hierarchy. We can roll up more metrics from here. For each metric, it’s a matter of thinking about what sort of aggregation makes sense at the company level, i.e. SUM, AVG, or MAX.
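For example, if you've already computed per-workspace metrics, rolling them up to the company level is one GROUP BY with a deliberately chosen aggregate per metric. The `workspace_stats` table and its columns are hypothetical:

```python
import sqlite3

# Hypothetical per-workspace metrics already keyed by domain.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE workspace_stats (domain TEXT, seats INTEGER, events INTEGER);
INSERT INTO workspace_stats VALUES
  ('acme.com', 5, 100), ('acme.com', 3, 40), ('initech.com', 2, 10);
""")

# Pick an aggregation per metric: SUM for volumes, MAX for "largest
# workspace", AVG where a typical value is more meaningful.
rollup = conn.execute("""
    SELECT domain,
           SUM(seats)  AS total_seats,
           SUM(events) AS total_events,
           MAX(seats)  AS largest_workspace
    FROM workspace_stats
    GROUP BY domain
    ORDER BY domain
""").fetchall()
print(rollup)  # [('acme.com', 8, 140, 5), ('initech.com', 2, 10, 2)]
```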

What about personal email domain users like gmail.com or outlook.com? Well, that depends a bit on your business. You have three options, and the right option for you depends on what percentage of your user base uses personal accounts.

1) Ignore them
This is a completely reasonable option and the best place to start. Chances are, users that provided a personal email address are using your product for personal reasons. For example, someone using Notion to track their sourdough recipes is not going to be interested in Enterprise Single Sign-On features. So, you may want to just ignore them. You can do that by simply excluding every domain in this comprehensive list of public email domains.
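A sketch of that filter, using a tiny stand-in for the public-domain list (in practice you'd load the full published list into a table):

```python
import sqlite3

# Assumed users table plus a small stand-in for the full list of
# public email domains.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER, email TEXT);
CREATE TABLE public_domains (domain TEXT);
INSERT INTO users VALUES
  (1, 'ada@acme.com'), (2, 'bob@gmail.com'), (3, 'eve@outlook.com');
INSERT INTO public_domains VALUES ('gmail.com'), ('outlook.com'), ('yahoo.com');
""")

# Keep only users whose email domain is not a known public domain.
business_users = conn.execute("""
    SELECT email FROM users
    WHERE substr(email, instr(email, '@') + 1) NOT IN
          (SELECT domain FROM public_domains)
""").fetchall()
print(business_users)  # [('ada@acme.com',)]
```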

2) Roll up to a single account
You could also translate all of the personal usage into a separate "Public Email" catch-all company/account. This is a great strategy if you plan to connect these users/leads to other parts of your marketing automation and don't need company information. If you're planning on publishing this data to a CRM, however, it's really only worthwhile if you plan on having a human review the users that end up associated with this account.
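One way to implement this is a CASE expression that collapses every public domain into one synthetic account; the 'public-email' label and the table names are assumptions:

```python
import sqlite3

# Assumed users table plus a stand-in public-domain list.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER, email TEXT);
CREATE TABLE public_domains (domain TEXT);
INSERT INTO users VALUES (1, 'ada@acme.com'), (2, 'bob@gmail.com');
INSERT INTO public_domains VALUES ('gmail.com'), ('outlook.com');
""")

# Collapse every public-email user into a single catch-all account.
accounts = conn.execute("""
    SELECT CASE
             WHEN substr(email, instr(email, '@') + 1) IN
                  (SELECT domain FROM public_domains)
             THEN 'public-email'
             ELSE substr(email, instr(email, '@') + 1)
           END AS account,
           COUNT(*) AS user_count
    FROM users
    GROUP BY account
    ORDER BY account
""").fetchall()
print(accounts)  # [('acme.com', 1), ('public-email', 1)]
```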

3) Use firmographic data
The most advanced option is to incorporate firmographic data, such as that provided by Clearbit, ZoomInfo, or FullContact, to translate a personal email address into an employer. Note that this data is not perfectly accurate, so you may end up linking users to a previous employer or to someone completely unrelated to them.

Using your new hierarchy

Now that you’ve got your Account Hierarchy up and running, you can start putting it to use. Census makes it easy to connect the data models you build in your Data Warehouse to the Sales and Marketing products you use to drive your Customer Journey, but how that journey unfolds is completely up to you.

What you’ve built is a Business Identity model that you can share across all of your products and tools. Census helps you associate this product model with all the other tools you use to serve your customer.

Here’s just a few ideas:

  • Enrich your existing accounts and contacts with usage data - Use this data to give account reps context on what users are doing in product without searching across all your tools
  • Identify net new logos - Use the data to identify companies that are actively using the product but are not yet in your CRM. Congratulations, you now have product qualified leads!
  • Orchestrate Sales and Marketing around the same account-oriented goals - Account-Based Marketing requires Sales and Marketing to target the same accounts, but not necessarily the same contacts within those accounts. Use your new account hierarchy to share target lists between sales and marketing.

The sky’s the limit! This is meant to be a data platform you can build upon. Add more product metrics, incorporate data from other SaaS tools, and more. We’re excited to see what you do with your account hierarchy. Keep an eye on this space for more customer success stories and other tricks to get the most out of your product data.

If you have any questions, contact me! I always love geeking out about these data problems.
