Tutorials

How to move data from Snowflake to Facebook Ads | Census

Michel Zurkirchen · May 11, 2022

Michel Zurkirchen is an Amsterdam-based digital analyst and a regular contributor to the Census blog. After starting his career in digital marketing, Michel started learning Python and has been an enthusiastic learner and practitioner of data analytics ever since.

Facebook boasts the world's biggest audience for you to promote your products. It's no surprise, then, that you'd want to run your advertisements on the Facebook Ads platform.

But even if you know you want to use Facebook Ads, you shouldn't throw any and every advertisement on the platform just to get the word out about your product quickly. It's not about arbitrarily putting your products out there; you have to do so with purpose. When you do advertise, you want to make sure that you're reaching the right people with the right message. 💬

We'll show you how to do just that by syncing data about (potential) customers between your Snowflake data warehouse and Facebook Ads. In this tutorial, we'll create an audience that we can target with ads based on a list of email addresses (though you can also sync offline event data similarly). We'll show you three ways to do it:

  1. A quick, manual method
  2. A more elaborate, automated method
  3. An easy and trustworthy method using Census reverse ETL

The manual method is great for prototyping and one-offs, but the automated method is preferred when you have to regularly update your data. Census is a great addition if you want to optimize and trust that your automatic updates will work without any errors or maintenance. 👏

Manually syncing data between Snowflake and Facebook Ads

If you don't have a Facebook audience to sync your data sources to yet, start by creating one in the Facebook Ads Manager. Click Create Audience > Custom Audience and select Customer list as your source. After clicking Next, Facebook will give you step-by-step instructions on how to proceed.

👀  Looking for some extra-helpful resources? Check out this CSV file template and the formatting guidelines for loading data.

When you proceed to the next screen, you'll be prompted to indicate whether your data contains a customer value – for example, the dollar amount a customer has spent with you.
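For illustration, a customer list with a value column might look something like the snippet below (hypothetical data and column names – check Facebook's CSV template and formatting guidelines for the exact identifiers it accepts):


email,value
jane.doe@example.com,250
john.smith@example.com,120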

After answering the prompt, you'll continue to the next screen, where you'll be asked to upload a CSV or TXT file. Now, it's showtime. 👐  We know exactly what Facebook expects from us, so all that's left is to extract that data from Snowflake.

In the Snowflake web UI, write a query that pulls the data you need. Here, we'll match customer data based on email address, and we'll rename the column to make it easy for Facebook to recognize. ✅


SELECT C_EMAIL_ADDRESS AS "Email"
FROM "SNOWFLAKE_SAMPLE_DATA"."TPCDS_SF100TCL"."CUSTOMER"
WHERE C_EMAIL_ADDRESS IS NOT NULL;

Once your query finishes running, download the resulting CSV file and inspect it for any data quality issues, such as missing rows, invalid email addresses, or anything else that might clog your data pipeline. If everything is good to go, return to Facebook and upload the file.
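If you'd rather script that inspection than eyeball the file, here's a minimal sketch (assuming you saved the export as customers.csv with the Email column from the query above):


import csv
import re

# Rough pattern to flag obviously malformed addresses; not a full RFC 5322 check
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

with open("customers.csv", newline="") as f:
    rows = list(csv.DictReader(f))

missing = sum(1 for r in rows if not r.get("Email"))
invalid = sum(1 for r in rows if r.get("Email") and not EMAIL_RE.match(r["Email"]))
print(f"{len(rows)} rows; {missing} missing emails; {invalid} invalid emails")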

Facebook will ask you to map the columns to their identifiers, so verify that the columns in your CSV file have been mapped to the correct identifier and then hit import. Just like that, you'll be able to use this audience for your ad targeting within minutes — provided, of course, that Facebook was able to match your data to its users.

Syncing from Snowflake to Facebook Ads with Facebook’s Python API

Both Snowflake and Facebook have well-documented APIs which allow us to automate the syncing process. We'll be using Python to do so, but no hard feelings if you decide to use SQL or another preferred language. 😜

You can immediately access the Snowflake API if you have an account. To simplify things, we'll use the Snowflake Connector for Python, which you can install with pip, like so:


pip3 install snowflake-connector-python

Once installed, import it along with the hashlib library, then create a connection to Snowflake.


import hashlib
import snowflake.connector

conn = snowflake.connector.connect(
    user="your-username",
    password="your-password",
    account="your-account"
)

You can use the same username, password, and account that you use to log in to the web UI.

👂 If you normally log into the classic Snowflake web UI, you'll find your account in the URL: https://<your-account>.snowflakecomputing.com.

You're now all set to query your data. 🙌 In this case, all we need is a list of email addresses, resulting in the below query.


cursor = conn.cursor().execute("""
    SELECT C_EMAIL_ADDRESS AS "Email"
    FROM "SNOWFLAKE_SAMPLE_DATA"."TPCDS_SF100TCL"."CUSTOMER"
    WHERE C_EMAIL_ADDRESS IS NOT NULL;
""")

The cursor contains the data as a list of tuples, which you can retrieve using fetchall. We'll restructure it into a "list of lists" to meet Facebook's format requirements. Facebook also demands that we normalize and hash email addresses, so you'll see that in the code below.


# Wrap each email address in its own list to match Facebook's expected format
data = [[i[0]] for i in cursor.fetchall()]

for i in data:
    # Normalize: lowercase and strip whitespace before hashing
    i[0] = i[0].lower().strip()
    # Hash with SHA-256, as Facebook requires
    i[0] = hashlib.sha256(i[0].encode('utf-8')).hexdigest()

Don't forget to close your connection to Snowflake once you've retrieved your data and verified that you got what you needed!


conn.close()

Now that we have our Snowflake data, we’re headed back over to Facebook. We'll take the easy route again, this time by utilizing Facebook's Python Business SDK.

Full transparency: Getting access to the API is the most time-consuming part of the process. You'll need to follow these instructions to set up a Facebook Developer Account, Facebook App, and Access Token. Once that's done, go to your app and choose "Add product" in the sidebar to add the Marketing API to your app.

The Marketing API will now appear in the sidebar, so you can click on it and then select Quickstart > Build Custom Audiences to go through the setup. It will even generate example code for you to download (if you want it), but let's get to the good stuff: The actual syncing of data. We'll start by installing the SDK.


pip3 install facebook_business

Next up: The imports.


from facebook_business.adobjects.adaccount import AdAccount
from facebook_business.adobjects.customaudience import CustomAudience
from facebook_business.api import FacebookAdsApi

Once you have the SDK installed and your imports set up, you’ll need to configure a few settings – all of which you can copy straight from this example code. 💃 You'll just need to come up with your own audience_name and decide on a retention period.


access_token = 'access-token'
app_secret = 'app-secret'
ad_account_id = 'ad-account-id'
audience_name = 'Census demo'
audience_retention_days = '30'
pixel_id = 'pixel-id'
app_id = 'app-id'

Now, initialize a connection to the API. If you run into any access errors due to an expired access token, you can use Graph API Explorer to create a new one.


FacebookAdsApi.init(access_token=access_token)

To be on the safe side, do an initial check to see whether the audience already exists, so you know whether to append your data to an existing audience or create a new one altogether.


fields = [
  'id',
  'name',
  'description',
  'customer_file_source'
]

AdAccount(ad_account_id).get_custom_audiences(fields=fields)
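To act on that check, you could search the returned list for your audience name and reuse its ID if it's there – a quick sketch, assuming audience names are unique in your ad account:


audiences = AdAccount(ad_account_id).get_custom_audiences(fields=fields)

# Reuse the existing audience's ID if one matches; otherwise we'll create it below
matches = [a for a in audiences if a['name'] == audience_name]
audience_id = matches[0]['id'] if matches else None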

Assuming that you didn't find an existing audience and you need to create a new one, you can do so with the below code.


params = {
  'name': audience_name,
  'subtype': 'CUSTOM',
  'description': 'Snowflake import',
  'customer_file_source': 'USER_PROVIDED_ONLY'
}

response = AdAccount(ad_account_id).create_custom_audience(params=params)

Storing the response gives us access to the ID of the audience we just created, which we'll need in the final step, where we finally upload our data.


params = {
  'payload': {
    'schema': ['EMAIL'],
    'data': data # The list of lists containing hashed email addresses
  }
}

CustomAudience(response['id']).create_user(
  params=params
)

Voila! Fin. ✨ Facebook will now get to work on matching your uploaded data to their user base.
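One caveat: Facebook limits how many users you can send per request (10,000 at the time of writing), so for larger lists you may want to upload in chunks. A minimal sketch:


BATCH_SIZE = 10000  # Facebook's per-request limit for customer list uploads

for start in range(0, len(data), BATCH_SIZE):
    batch = data[start:start + BATCH_SIZE]
    CustomAudience(response['id']).create_user(
        params={'payload': {'schema': ['EMAIL'], 'data': batch}}
    )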

These are the basics of keeping your data in sync, but to keep this article from running too long, we skipped the additional functionality you'd need in a production environment: scheduling the process to run on a regular cadence, logging whether the script succeeded or failed, rerunning it when it hits an error, and so on.
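As a starting point for that last piece, here's a rough sketch of a retry-and-logging wrapper (sync_audience is a placeholder for the steps above, not part of the SDK):


import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("snowflake_to_facebook")

def run_with_retries(sync_fn, max_attempts=3, backoff_seconds=60):
    """Run the sync, retrying on failure with a simple linear backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            sync_fn()
            logger.info("Sync succeeded on attempt %d", attempt)
            return
        except Exception:
            logger.exception("Sync failed on attempt %d", attempt)
            if attempt < max_attempts:
                time.sleep(backoff_seconds * attempt)
    raise RuntimeError("Sync failed after all retry attempts")

# run_with_retries(sync_audience)  # sync_audience wraps the steps above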

Automating advertising audience syncs with Census

The automated method is a great way to reduce the hassle of updating your audiences, but how do you know you can trust that these syncs are error-free? The short answer: You can’t – unless you automate your audience syncs with Census.

Check out our step-by-step video tutorial to guide you through the process.

You can also check out some additional Census documentation describing how to use Census with Facebook Ads (Audiences and Conversions).

Need help connecting to Facebook? Contact support@getcensus.com or start a conversation via the in-app chat.

Final advice

If your audiences change so infrequently that the manual method just makes sense for you, you're all set. If you prefer the ease and simplicity of the automated method, this article has given you everything you need to set it up. Keep in mind, though, that automation alone doesn't guarantee your syncs are correct.

Want to make sure that information is updated correctly with no maintenance required? Check out Census and start syncing today!
