Marketing attribution: The challenges of current software solutions | Census

Sylvain Giuliani
Sylvain Giuliani May 09, 2023

Syl is the Head of Growth & Operations at Census. He's a revenue leader and mentor with a decade of experience building go-to-market strategies for developer tools. San Francisco, California, United States

Ah… Marketing attribution — the key to optimizing marketing strategies and improving marketing spend ROI. 😌

Sounds perfect, right?

There’s just one, teensy problem: Current marketing attribution software solutions have marketers feeling unsatisfied at best. In fact, according to a survey by QueryClick, 58% of marketers say their current attribution models are an obstacle to implementing marketing activities. And only 11% of marketers are happy with their attribution solutions. 😬 While these existing software solutions do provide some valuable insights, they fall short of meeting today’s dynamic marketing needs.

In this blog post, we'll explore how to overcome the limitations of current marketing attribution software solutions by modeling marketing data in plain SQL.

The limitations of current marketing attribution software solutions

Like we said: Most marketers are dissatisfied with the current marketing attribution software solutions.

One of their main limitations? They’re often too rigid and inflexible. Existing solutions rely on predefined rules and models to assign credit to each touchpoint, so they (typically) lead to inaccurate and incomplete results. They simply don’t provide enough transparency into how the credit is assigned, making it difficult for marketers to understand and interpret the results.

Another roadblock is that current marketing attribution solutions don’t integrate very well with other marketing tools and platforms. 🚧 Among other things, that disconnect makes it difficult for marketers to bring together data from different sources and gain a comprehensive view of their marketing efforts.

So… if marketing attribution — when done correctly — improves marketing campaigns and ROI, but current solutions fail to meet expectations, where do you turn? 👂 Pssst… You can model your marketing data in plain SQL.

Why use SQL for marketing attribution?

SQL is a powerful tool for uniting data from multiple sources and applying sophisticated algorithms to assign credit to each touchpoint.

And using it for marketing attribution has several advantages over traditional marketing attribution software solutions.👇

  1. SQL provides more flexibility and control over the attribution process. You can define your own attribution rules and models, and easily modify or adjust them as needed.
  2. SQL provides greater transparency into the attribution process. Now, you can see how credit is assigned to each touchpoint, so you can understand the underlying logic behind the attribution rules and models.
  3. SQL can help us integrate data from multiple sources and gain a comprehensive view of our marketing efforts. By joining your data from web analytics, CRMs, and marketing automation, you can analyze the effectiveness of different marketing channels and campaigns.

All about attribution modeling

Before we dive into the details of how to model marketing data in SQL, let's review the different types of attribution modeling:

  • First touch: Assigns all the credit to the first touchpoint that led to a conversion.
  • Last touch: Assigns all the credit to the last touchpoint that led to a conversion.
  • Linear: Assigns equal credit to all touchpoints that led to a conversion.
  • U shape: Assigns more credit to the first and last touchpoints, and less credit to the touchpoints in between.
  • Time decay: Assigns more credit to the touchpoints that occurred closer in time to the conversion.

Of course, each attribution model has its own strengths and weaknesses, so the "best” model will depend on the specific needs and goals of your organization.

Building an attribution model in SQL

Let’s cut to the chase: To build an attribution model in SQL, you need to bring together data from multiple sources, including web analytics, CRM, and marketing automation. You also need to define your own attribution rules and models in order to apply sophisticated algorithms to assign credit to each touchpoint.

Here’s a basic table representation of the data schema involved:


.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
 overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
 font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-cly1{text-align:left;vertical-align:middle}
.tg .tg-1wig{font-weight:bold;text-align:left;vertical-align:top}



 
   Table
   Fields
 


 
   sessions
   session_id, user_id, start_time, end_time, source_channel
 
 
   touches
   touch_id, user_id, timestamp, channel, campaign
 
 
   conversions
   conversion_id, user_id, conversion_time, revenue
 


In this example, we’ll use the schema in this table 👆 to build a view that assigns credit to each touchpoint based on our set of predefined rules. Then, once we analyze the effectiveness of different marketing channels and campaigns, we can optimize our marketing strategy and improve the ROI of our marketing spend.

Let’s write some SQL

Building a query takes four basic steps:

  1. Gather your required data sources
  2. Find all sessions before conversion
  3. Calculate the total sessions and the session index
  4. Allocate points

Here’s what each step looks like using actual SQL queries.

1. Gather your required data sources

The following snippet creates three CTEs for our three data sources — session, touch (can be any events), and conversion data — from their respective tables. We’re just prepping the data at this point, but this is what it looks like.

WITH sessions AS (
 SELECT
   session_id,
   user_id,
   start_time,
   end_time,
   source_channel
 FROM
   sessions_table
),

touches AS (
 SELECT
   touch_id,
   user_id,
   timestamp,
   channel,
   campaign
 FROM
   touches_table
),

conversions AS (
 SELECT
   conversion_id,
   user_id,
   conversion_time,
   revenue
 FROM
   conversions_table
)


2. Find all sessions before conversion

Now, we have our source data, so let’s get crunching. Start by creating a view that combines session and conversion data. The goal is to identify all sessions that occurred before a conversion, then join the session and conversion data together. Here’s how the query looks:

WITH sessions_before_conversion AS (
 SELECT
   s.session_id,
   s.user_id,
   s.start_time,
   s.end_time,
   s.source_channel,
   c.conversion_time
 FROM
   sessions s
   JOIN conversions c ON s.user_id = c.user_id AND s.start_time < c.conversion_time
)


3. Calculate the total sessions and the session index

With the session data in hand, we can assign the weight to each of these sessions based on the number of touches using the following query:

WITH touches_with_session AS (
 SELECT
   t.touch_id,
   t.user_id,
   t.timestamp,
   t.channel,
   t.campaign,
   s.session_id
 FROM
   touches t
   JOIN sessions_before_conversion s ON t.user_id = s.user_id AND t.timestamp BETWEEN s.start_time AND s.end_time
),

touchpoints AS (
 SELECT
   user_id,
   session_id,
   channel,
   campaign,
   COUNT(DISTINCT touch_id) AS touches
 FROM
   touches_with_session
 GROUP BY
   user_id,
   session_id,
   channel,
   campaign
),

touchpoint_weights AS (
 SELECT
   *,
   ROW_NUMBER() OVER (
     PARTITION BY user_id, session_id
     ORDER BY touches DESC
   ) AS touchpoint_rank
 FROM
   touchpoints
)


4. Allocate points

Allocate points to each touchpoint based on the model you want. Here, we are assigning 100% of the credit to the first touchpoint and only 50% of the credit to the second touchpoint with the following query.

WITH attribution AS (
 SELECT
   user_id,
   session_id,
   channel,
   campaign,
   touches,
   CASE touchpoint_rank
     WHEN 1 THEN 1
     WHEN 2 THEN 0.5
     ELSE 0
   END AS attribution_weight
 FROM
   touchpoint_weights
 WHERE
   touchpoint_rank <= 2
)


5. Join in revenue and ad spend data

Use the allocated credit to analyze the effectiveness of different marketing channels and campaigns based on the revenue they generated plus the cost of your ads. The final output calculates the ROI for each marketing channel and campaign based on the attributed credit and revenue data.

WITH attribution AS (
 ...
),

attribution_by_channel_campaign AS (
 SELECT
   channel,
   campaign,
   SUM(attribution_weight) AS attribution
 FROM
   attribution
 GROUP BY
   channel,
   campaign
),

revenue_by_channel_campaign AS (
 SELECT
   channel,
   campaign,
   SUM(revenue) AS revenue
 FROM
   conversions
 GROUP BY
   channel,
   campaign
)

SELECT
 abc.channel,
 abc.campaign,
 abc.attribution,
 rbc.revenue,
 abc.attribution / rbc.revenue AS roi
FROM
 attribution_by_channel_campaign abc
 JOIN revenue_by_channel_campaign rbc ON abc.channel = rbc.channel AND abc.campaign = rbc.campaign


And when you combine the above five steps, you get an attribution model in SQL that assigns credit to each touchpoint based on a set of predefined rules. 🙌  Putting it all together, you get the full query.


WITH sessions_before_conversion AS (
 SELECT
   s.session_id,
   s.user_id,
   s.start_time,
   s.end_time,
   s.source_channel,
   c.conversion_time
 FROM
   sessions s
   JOIN conversions c ON s.user_id = c.user_id AND s.start_time < c.conversion_time
),

touches_with_session AS (
 SELECT
   t.touch_id,
   t.user_id,
   t.timestamp,
   t.channel,
   t.campaign,
   s.session_id
 FROM
   touches t
   JOIN sessions_before_conversion s ON t.user_id = s.user_id AND t.timestamp BETWEEN s.start_time AND s.end_time
),

touchpoints AS (
 SELECT
   user_id,
   session_id,
   channel,
   campaign,
   COUNT(DISTINCT touch_id) AS touches
 FROM
   touches_with_session
 GROUP BY
   user_id,
   session_id,
   channel,
   campaign
),

touchpoint_weights AS (
 SELECT
   *,
   ROW_NUMBER() OVER (
     PARTITION BY user_id, session_id
     ORDER BY touches DESC
   ) AS touchpoint_rank
 FROM
   touchpoints
),

attribution AS (
 SELECT
   user_id,
   session_id,
   channel,
   campaign,
   touches,
   CASE touchpoint_rank
     WHEN 1 THEN 1
     WHEN 2 THEN 0.5
     ELSE 0
   END AS attribution_weight
 FROM
   touchpoint_weights
 WHERE
   touchpoint_rank <= 2
),

attribution_by_channel_campaign AS (
 SELECT
   channel,
   campaign,
   SUM(attribution_weight) AS attribution
 FROM
   attribution
 GROUP BY
   channel,
   campaign
),

revenue_by_channel_campaign AS (
 SELECT
   channel,
   campaign,
   SUM(revenue) AS revenue
 FROM
   conversions
 GROUP BY
   channel,
   campaign
)

SELECT
 abc.channel,
 abc.campaign,
 abc.attribution,
 rbc.revenue,
 abc.attribution / rbc.revenue AS roi
FROM
 attribution_by_channel_campaign abc
 JOIN revenue_by_channel_campaign rbc ON abc.channel = rbc.channel AND abc.campaign = rbc.campaign


Limitations of SQL-based attribution modeling

While modeling marketing data in SQL can help us overcome the limitations of current marketing attribution software solutions, there are still some limitations to be aware of.

  1. Positional attribution is not a perfect representation of human decision-making. In reality, customers may be influenced by multiple touchpoints, so their decision-making process may be more complex than can be captured by a simple attribution model.
  2. The quality of the source data is often low. Web traffic is often poorly tagged with UTM parameters, and CRM data may be incomplete or inaccurate. Poor source data quality can make assigning credit to each touchpoint difficult.

Despite these few limitations, modeling marketing data in SQL helps you gain deeper insights into the effectiveness of our marketing campaigns. And by experimenting with different attribution models and analyzing the results, you can continuously improve marketing attribution and drive better business outcomes. 🚀

The future of marketing attribution is SQL

Sure, marketing attribution is a game-changer for marketers. But if (and when) current software solutions fall short of meeting your evolving needs, start modeling your marketing data in plain SQL — and gain deeper insights into the effectiveness of your marketing campaigns along the way.

Remember: The SQL code example we ran through is just one way to model marketing data. There are so many other approaches you can use to gain deeper insights into the effectiveness of your marketing campaigns and improve ROI. It all just comes down to the specific needs of your organization.

💪 Want to learn how Census can empower your marketing attribution mission? Book a demo with a Census product specialist.