Introduction to Marketing Attribution Modeling (with Examples)

Sylvain Giuliani, March 06, 2021

Syl is the Head of Growth & Operations at Census. He's a revenue leader and mentor with a decade of experience building go-to-market strategies for developer tools.

Marketing has always been considered more of an art than a science … until recently. As marketing becomes increasingly digitized, data science has become a crucial partner for marketers, helping them better understand their customers. With modern technologies and marketing mediums, marketers know more about who their customers are, where their customers are coming from, and why they're making purchasing decisions.

In particular, we're going to focus on one of the most popular digital-era modeling techniques in marketing: attribution modeling.

Attribution modeling gives marketers a better understanding of the effectiveness of their marketing channels so they can optimize their sales funnels to bring in new sales and improve customer experiences. In this article, we'll cover attribution modeling: what it is, why you'd want to use it, and a few pointers on how to get started yourself.

What is attribution modeling?

We use attribution modeling to determine how much credit each marketing channel, or touchpoint, gets for a given conversion. It doesn't have to be specific to conversions (you can assign credit for any event: a click, an email sign-up, a purchase) but generally speaking, it's used for conversions.
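
To make this concrete, here's a minimal sketch in Python (the channel names and dates are hypothetical) of how a single converting customer's journey might be represented. The model examples below all operate on journeys shaped like this, abbreviated to just the ordered channel names.

```python
# A customer journey: touchpoints in chronological order, ending in a conversion.
# Channel names and dates are hypothetical examples.
journey = [
    {"channel": "youtube_ad", "timestamp": "2021-02-01"},
    {"channel": "organic_search", "timestamp": "2021-02-10"},
    {"channel": "email", "timestamp": "2021-02-20"},
    {"channel": "direct", "timestamp": "2021-03-01"},  # converts on this visit
]

# For brevity, the sketches below treat a journey as just its channel names:
channels = [touchpoint["channel"] for touchpoint in journey]
```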

Why is attribution modeling so useful?

Attribution modeling allows marketers to see which parts of their sales funnel, i.e. which touchpoints or marketing channels, were most effective in moving a prospect toward a purchase. This data gives marketers the insight to make better, more informed decisions about which tactics to pursue and how well a campaign is working.

With attribution modeling, you can learn which marketing channels are the most effective at reaching your customers, ultimately increasing your return on investment (ROI) and lowering your customer acquisition cost (CAC).

Attribution modeling is great for answering the following questions:

  • Which marketing channels lead to the most conversions?
  • Which marketing channels have the greatest or least impact in the upper funnel? The mid-funnel? The lower funnel?
  • Which marketing channels are the most consistent over time?

Types of attribution modeling, and a general overview

There are several commonly used types of attribution models, each with its own pros and cons. To this day, there's no consensus on which type is best, and in practice, many marketing scientists look at more than one at the same time.

Here are some of the most common types of attribution models:

1. First-click attribution model

In the first-click model, all the credit (100% of the value) for the sale is assigned to the first touchpoint, or first click, the customer makes. First-click is simple and gives you a good idea of your most effective upper-funnel channel, but it doesn't consider any touchpoints afterward.
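
As a minimal illustrative sketch (not a production implementation), first-click attribution over a journey like the one above is essentially a one-liner:

```python
def first_click(channels):
    """First-click: 100% of the conversion credit to the first touchpoint."""
    return {channels[0]: 1.0}

print(first_click(["youtube_ad", "organic_search", "email", "direct"]))
# {'youtube_ad': 1.0}
```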

2. Last-click attribution model

In the last-click model, you assign 100% of the value to the final touchpoint. Last-click is the simplest model to implement and evaluate, and it's often treated as the most accurate of the single-touch models. Like first-click, however, it doesn't consider any of the touchpoints that came before.
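
The mirror-image sketch for last-click, assuming the same journey shape:

```python
def last_click(channels):
    """Last-click: 100% of the conversion credit to the final touchpoint."""
    return {channels[-1]: 1.0}

print(last_click(["youtube_ad", "organic_search", "email", "direct"]))
# {'direct': 1.0}
```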

3: Linear attribution model

In a linear attribution model, you give equal credit to every touchpoint. Linear attribution gives you a more balanced look at your marketing strategy, but it assumes every touchpoint contributes equally to a conversion, which is likely not the case in reality.
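
A minimal sketch of the linear split, again over an ordered list of channel names; a channel that appears more than once accumulates its shares:

```python
def linear(channels):
    """Linear: split the conversion credit equally across all touchpoints."""
    share = 1.0 / len(channels)
    credit = {}
    for channel in channels:
        credit[channel] = credit.get(channel, 0.0) + share
    return credit

print(linear(["youtube_ad", "organic_search", "email", "direct"]))
# {'youtube_ad': 0.25, 'organic_search': 0.25, 'email': 0.25, 'direct': 0.25}
```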

4. Last non-direct click attribution model

The name says it all. In this model, credit is given to the last touchpoint that isn't direct traffic. This is useful because it helps you understand what drove your customer to go directly to your website. For example, if a customer saw a really catchy YouTube ad and later navigated straight to your site, it would be incredibly useful to know that it was the YouTube ad that drove the sale.
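
Here's a sketch of that rule in code, assuming direct traffic is tagged with the literal channel name "direct" (your tracking setup may label it differently):

```python
def last_non_direct_click(channels):
    """Credit the final touchpoint that isn't direct traffic; fall back to
    the last touchpoint if the entire journey was direct."""
    for channel in reversed(channels):
        if channel != "direct":
            return {channel: 1.0}
    return {channels[-1]: 1.0}

print(last_non_direct_click(["youtube_ad", "direct", "direct"]))
# {'youtube_ad': 1.0}  (the YouTube ad, not the direct visits, gets the credit)
```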

5. Time decay attribution model

In a time decay attribution model, the closer the interaction is to the time of conversion, the more value that interaction is assigned. Time decay attribution is especially useful for long sales cycles, as it's likely that advertisements from many months ago won't have as strong an impact as advertisements seen recently.
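
One possible sketch uses an exponential half-life over touchpoint positions. A real implementation would typically decay over elapsed days rather than positions, and the half-life of 2 here is an arbitrary illustrative choice:

```python
def time_decay(channels, half_life=2.0):
    """Time decay: weight each touchpoint by
    0.5 ** (steps_before_conversion / half_life), then normalize
    so the credit sums to 1."""
    n = len(channels)
    weights = [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]
    total = sum(weights)
    credit = {}
    for channel, weight in zip(channels, weights):
        credit[channel] = credit.get(channel, 0.0) + weight / total
    return credit

print(time_decay(["youtube_ad", "organic_search", "email", "direct"]))
# direct gets the largest share, youtube_ad the smallest
```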

6. Position-based attribution model

In position-based attribution, 40% of the credit is given to the first touchpoint and 40% to the last, while the remaining 20% is spread across the touchpoints in between. Position-based attribution was created with the belief that the first and last touchpoints are ultimately the most important.
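
A sketch of that U-shaped split (the handling of one- and two-touchpoint journeys is my own assumption; conventions vary):

```python
def position_based(channels):
    """Position-based: 40% to the first touchpoint, 40% to the last,
    and the remaining 20% split evenly across the ones in between."""
    if len(channels) == 1:
        return {channels[0]: 1.0}
    if len(channels) == 2:
        return {channels[0]: 0.5, channels[-1]: 0.5}
    credit = {channels[0]: 0.4}
    credit[channels[-1]] = credit.get(channels[-1], 0.0) + 0.4
    middle_share = 0.2 / (len(channels) - 2)
    for channel in channels[1:-1]:
        credit[channel] = credit.get(channel, 0.0) + middle_share
    return credit

print(position_based(["youtube_ad", "organic_search", "email", "direct"]))
# {'youtube_ad': 0.4, 'direct': 0.4, 'organic_search': 0.1, 'email': 0.1}
```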

Methods to conduct attribution modeling

There are a number of ways you can build an attribution model, each with its own pros and cons. Below are some of the most common methods:

🐍 Do it yourself (in Python!)

Python is used for a lot of data science work, including attribution modeling. It's a straightforward language with approachable syntax, a huge library ecosystem, and global usage. With Python, you can customize your attribution models as much as you like: programming in new functionality, making updates, and pulling in any number of libraries and packages to spice things up. That flexibility makes it a great choice if you're going to be your company's long-term "go-to" for building and evolving its models.

Still, the downside is that you'll need to have data pipelines already set up, and this can be a little tricky for beginners. Python is fun and anyone can learn it, sure, but the learning curve gets steep once you move past the basics.
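
As a taste of what the DIY route looks like once your data is flowing, here's a minimal pandas sketch (the DataFrame, user IDs, and channel names are all hypothetical) that applies last non-direct click attribution across many journeys at once. In practice, the events would come from your warehouse or event stream rather than being defined inline:

```python
import pandas as pd

# One row per touchpoint, in chronological order within each converting user.
events = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "channel": ["youtube_ad", "email", "direct", "organic_search", "direct"],
})

# Last non-direct click, aggregated into channel-level credit shares.
credit_shares = (
    events[events["channel"] != "direct"]   # drop direct visits
    .groupby("user_id")["channel"].last()   # last remaining touchpoint per user
    .value_counts(normalize=True)           # share of conversions per channel
)
print(credit_shares)
# email             0.5
# organic_search    0.5
```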

If you want to get started with Python, I recommend this great tutorial.

📊 Google Analytics

Google's analytics platform is used by millions of companies. It's practically an industry standard in marketing, and it allows anyone with so much as a page on the internet to measure metrics and touchpoints to see how they're performing. However, Google Analytics is targeted more toward the actual marketing end and less toward the technical data science end. So, if all you want is to measure metrics and see what you can do to improve, Google Analytics is for you. But if you want to carefully analyze your data and really get a deep understanding of what makes your business tick, it may be better to program your own models, or use a platform like Bizible.

🔨 3rd-party tools

There are tools on the market (such as Bizible) that give you marketing attribution "out of the box" once you plug your event stream (think Segment) into them. They're more powerful than Google Analytics, but a lot less flexible than Python or SQL, which let you fine-tune your model however you want.

What’s next?

I hope this introduction was useful for understanding the core concepts of attribution modeling. In our next post on the topic, we'll dive into a practical use case: building a marketing attribution model in SQL (maybe even dbt).

In the meantime, if you have questions, or want us to help you build your marketing attribution model, don’t hesitate to contact us.
