
3 ways to sync data from Snowflake to Google Ads

Michel Zurkirchen, March 25, 2022

Michel Zurkirchen is an Amsterdam-based digital analyst and a regular contributor to the Census blog. After beginning his career in digital marketing, Michel picked up Python and has been an enthusiastic learner and practitioner of data analytics ever since.

In this article, Michel breaks down three methods for syncing data from Snowflake to Google Ads, including:

  • Manual CSV upload to Google Ads
  • A semi-automated sync using the Snowflake and Google Ads APIs
  • A fully automated sync using Census reverse ETL

Google Ads is one of the biggest advertising platforms in the world, and we're going to show you how to use it more effectively. Any digital marketer worth their salt knows you have to target the right message to the right audience, and that you need fresh, accurate data from your data team to do it.

Importing data into Google Ads allows your org to leverage your customer lists for lookalike audiences, exclusion lists, and direct targeting lists. If you, for example, have a list of people who have expressed interest in a particular product, you can upload that list and target those people with a tailored ad.

We'll show you how to export your data from Snowflake and import it into Google Ads using three different methods: one manual, one semi-automated, and one fully automated.

Let’s get going.

Snowflake to Google Ads method 1: Manually import data into Google Ads

Google Ads has a very specific format you need to adhere to when uploading your data, so we'll start there. Here’s a breakdown of the steps:

  1. Go to Google Ads and log into your account.
  2. Click on the Tools and settings button in the navigation bar.
  3. From the dropdown, choose Audience manager.
  4. On the left-hand side click on Segments.
  5. Click the + button labelled create remarketing list.
  6. From the dropdown, choose Customer list.

In the Data to upload section, you can choose (you guessed it) which type of data you'll upload. We'll go with the first option, which is Upload Emails, Phones and/or Postal Addresses. You can then download a template, which will show you exactly what your upload file needs to look like. And I do mean exactly. For example, if you choose to upload email addresses, you will need a CSV file with exactly one column called Email and nothing else.
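
For illustration, a valid email-only upload file contains nothing but the Email column header and one address per row, something like this (hypothetical addresses):


Email
jane.doe@example.com
john.smith@example.com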

Now that we know exactly which data we need, we can export it from Snowflake. Head over to the Snowflake web UI and write a query that will satisfy the template requirements. In our case, it's as simple as the query below.


SELECT C_EMAIL_ADDRESS AS "Email"
FROM "SNOWFLAKE_SAMPLE_DATA"."TPCDS_SF100TCL"."CUSTOMER";

Click the download button once your query finishes running. Then head back to Google Ads and upload your file. If everything went well, you'll get a confirmation screen. Note: It can take up to 24 hours for Google to match your data to users, so grab a snack and a fresh cup of coffee, and read through the rest of the tips at the end of this article. ☕

Snowflake to Google Ads method 2: Automatically import data into Google Ads

The automated method leverages the Snowflake and Google Ads APIs. We'll show you how to set up the export/import using Python, though you can use your language of choice here (e.g. Java). To use the Snowflake API, you only need the "regular" Snowflake credentials you use to log into the web UI.

For Google Ads, you'll have to jump through a number of hoops before you're up and running with their API. While I won’t cover that here, Google has dedicated entire pages of their docs to get you to that point if you need some extra support. And, as with all things tech, there is always a YouTuber with a video to make your life easier if Google’s docs feel a little unclear.

Once you're all set with API access, we'll start by exporting the data from Snowflake. If you're following along with Python and haven't installed the libraries yet, do so now with the following.


pip install snowflake-connector-python google-ads

Next, import the Snowflake connector.


import snowflake.connector

To export data from Snowflake, you'll need the same username, password, and account you use to log into the web UI. If you normally log into the classic Snowflake web UI, you'll find your account name in the URL: https://<your-account>.snowflakecomputing.com. Using these, create a Connection object with the following code:


conn = snowflake.connector.connect(
    user = "your-username",
    password = "your-password",
    account = "your-account"
    )
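
If your Snowflake user doesn't have a default warehouse set, you can also pass one (and, optionally, a database and schema) as extra keyword arguments; the values below are placeholders:


conn = snowflake.connector.connect(
    user = "your-username",
    password = "your-password",
    account = "your-account",
    warehouse = "your-warehouse",  # compute warehouse to run the query on
    database = "SNOWFLAKE_SAMPLE_DATA",
    schema = "TPCDS_SF100TCL"
    )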

You're now ready to query your data. Our query is straightforward since we only need to retrieve email addresses.


cursor = conn.cursor().execute("""
    SELECT C_EMAIL_ADDRESS AS "Email"
    FROM "SNOWFLAKE_SAMPLE_DATA"."TPCDS_SF100TCL"."CUSTOMER";
""")

The cursor contains our data as a list of tuples, which we can retrieve using fetchall. It also contains the column names, which, for our particular use case, we don't need. However, I'll show you how to access them in case you need them for a different use case down the road.


data = [i[0] for i in cursor.fetchall()]
columns = [i[0] for i in cursor.description]

If you got the data you wanted, you can close your connection to Snowflake.


conn.close()
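
Depending on the state of your customer table, you may want to do a little clean-up before uploading. A minimal sketch that drops empty values and duplicates:


# Optional clean-up: remove empty values and duplicates before uploading.
data = sorted({email for email in data if email})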

Unfortunately, importing our data into Google Ads using the Python client library is as challenging as getting access to the API. In the end, we used this example with a few modifications. Start by importing the necessary libraries.


from google.ads.googleads.client import GoogleAdsClient
from google.ads.googleads.errors import GoogleAdsException
import hashlib

You'll need a client. There are multiple ways to instantiate one, but you can follow along with this method:


client = GoogleAdsClient.load_from_storage()
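
By default, load_from_storage looks for a google-ads.yaml configuration file in your home directory (the one you set up while getting API access). If you keep the file elsewhere, or prefer to pass credentials programmatically, these alternatives should also work (all values below are placeholders):


# Load the configuration from an explicit path.
client = GoogleAdsClient.load_from_storage("/path/to/google-ads.yaml")

# Or build the client from a dictionary of credentials.
client = GoogleAdsClient.load_from_dict({
    "developer_token": "your-developer-token",
    "client_id": "your-oauth-client-id",
    "client_secret": "your-oauth-client-secret",
    "refresh_token": "your-refresh-token",
    "login_customer_id": "1234567890",  # manager account ID, if applicable
    "use_proto_plus": True,
})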

If you haven't created a segment list yet, now’s the time. We made some slight changes to this function to make it more useful (see comments in the code block below).


def _create_customer_match_user_list(client, customer_id, list_name, list_description, membership_life_span):
    """Creates a Customer Match user list.

    Args:
        client: The Google Ads client.
        customer_id: The Google Ads customer ID for which to add the user list.
        list_name: The name to give to the user list.
        list_description: The description of the user list.
        membership_life_span: Number of days a user should remain on the list.

    Returns:
        The string resource name of the newly created user list.
    """
    # Creates the UserListService client.
    user_list_service_client = client.get_service("UserListService")

    # Creates the user list operation.
    user_list_operation = client.get_type("UserListOperation")

    # Creates the new user list.
    user_list = user_list_operation.create
    user_list.name = list_name
    user_list.description = list_description
    user_list.crm_based_user_list.upload_key_type = (client.enums.CustomerMatchUploadKeyTypeEnum.CONTACT_INFO)
    user_list.membership_life_span = membership_life_span

    response = user_list_service_client.mutate_user_lists(customer_id=customer_id, operations=[user_list_operation])
    user_list_resource_name = response.results[0].resource_name
    print(f"User list with resource name '{user_list_resource_name}' was created.")

    return user_list_resource_name

user_list_resource_name = _create_customer_match_user_list(client, 'your-google-ads-customer-id', 'Census - demo', 'A demo by Census', 30)

If all went well, the variable user_list_resource_name will look something like customers/123456789/userLists/987654321, where 123456789 is your Google Ads customer ID and 987654321 is the ID of your segment. If you already have a segment list and need to find the ID, the quickest way to do so is through the web UI. You'll find your Google Ads customer ID in the navigation bar, next to the name of the account. To get the segment's ID, navigate into the segment you want to add users to; it'll be in the URL as the userListId query parameter. Use these values to manually create the user_list_resource_name string variable.
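
For example, with made-up IDs:


customer_id = "123456789"    # your Google Ads customer ID, digits only
user_list_id = "987654321"   # the userListId query parameter from the URL
user_list_resource_name = f"customers/{customer_id}/userLists/{user_list_id}"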

Now onto the actual adding of users. Again, we've made some slight modifications to the example code. Specifically, we've added emails as a parameter to _add_users_to_customer_match_user_list and looped over them to add the operations to request.operations. We’ve also added email as a parameter to _build_offline_user_data_job_operations.

Note: Google Ads requires you to hash your customers' email addresses using SHA-256 prior to uploading. This function takes care of that, along with cleaning the email addresses. Even a single character's difference will result in an entirely different hash and will derail Google Ads' ability to match the email address to one of its users. If you have reason to believe your data requires more cleaning than what's implemented in this function, now would be the time to do something about that.


def _normalize_and_hash(s):
    """Normalises and hashes a string with SHA-256.

    Args:
        s: The string to perform this operation on.

    Returns:
        A normalised (lowercase, whitespace removed) and SHA-256 hashed string.
    """

    return hashlib.sha256(s.strip().lower().encode()).hexdigest()
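
A quick sanity check (with made-up addresses): different capitalisation and surrounding whitespace produce the same hash once normalised.


assert _normalize_and_hash("  Jane.Doe@Example.COM  ") == _normalize_and_hash("jane.doe@example.com")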

Warning: wall of code incoming. It's great if you're able to understand the two functions below so you can modify them to fit your needs, but no worries if you don't. In short, they build the actual operations that add the hashed email addresses to your segment.


def _build_offline_user_data_job_operations(client, email):
    """Builds and returns offline user data job operations.

    Args:
        client: The Google Ads client.
        email: A single, unhashed email address.

    Returns:
        A list containing the operations.
    """
    # Creates the first user data based on an email address.
    user_data_with_email_address_operation = client.get_type(
        "OfflineUserDataJobOperation"
    )
    user_data_with_email_address = user_data_with_email_address_operation.create
    user_identifier_with_hashed_email = client.get_type("UserIdentifier")
    # Hash normalised email addresses based on SHA-256 hashing algorithm.
    user_identifier_with_hashed_email.hashed_email = _normalize_and_hash(
        email
    )
    user_data_with_email_address.user_identifiers.append(
        user_identifier_with_hashed_email
    )

    return [user_data_with_email_address_operation]

def _add_users_to_customer_match_user_list(
    client, customer_id, user_list_resource_name, skip_polling, emails
):
    """Uses Customer Match to create and add users to a new user list.

    Args:
        client: The Google Ads client.
        customer_id: The customer ID for which to add the user list.
        user_list_resource_name: The resource name of the user list to which to
            add users.
        skip_polling: A bool dictating whether to poll the API for completion.
        emails: A list of unhashed email addresses to add to the user list.
    """

    # Creates the OfflineUserDataJobService client.
    offline_user_data_job_service_client = client.get_service(
        "OfflineUserDataJobService"
    )
    
    # Creates a new offline user data job.
    offline_user_data_job = client.get_type("OfflineUserDataJob")
    offline_user_data_job.type_ = (
        client.enums.OfflineUserDataJobTypeEnum.CUSTOMER_MATCH_USER_LIST
    )
    offline_user_data_job.customer_match_user_list_metadata.user_list = (
        user_list_resource_name
    )

    # Issues a request to create an offline user data job.
    create_offline_user_data_job_response = (
        offline_user_data_job_service_client.create_offline_user_data_job(
            customer_id=customer_id, job=offline_user_data_job
        )
    )
    offline_user_data_job_resource_name = (
        create_offline_user_data_job_response.resource_name
    )
    print(
        "Created an offline user data job with resource name: "
        f"'{offline_user_data_job_resource_name}'."
    )

    request = client.get_type("AddOfflineUserDataJobOperationsRequest")
    request.resource_name = offline_user_data_job_resource_name
    request.operations = []

    for email in emails:
        operation = _build_offline_user_data_job_operations(client, email)
        request.operations.extend(operation)

    request.enable_partial_failure = True

    # Issues a request to add the operations to the offline user data job.
    response = offline_user_data_job_service_client.add_offline_user_data_job_operations(
        request=request
    )

    # Prints the status message if any partial failure error is returned.
    # Note: the details of each partial failure error are not printed here.
    # Refer to the error_handling/handle_partial_failure.py example to learn
    # more.
    # Extracts the partial failure from the response status.
    partial_failure = getattr(response, "partial_failure_error", None)
    if getattr(partial_failure, "code", None) != 0:
        error_details = getattr(partial_failure, "details", [])
        for error_detail in error_details:
            failure_message = client.get_type("GoogleAdsFailure")
            # Retrieve the class definition of the GoogleAdsFailure instance
            # in order to use the "deserialize" class method to parse the
            # error_detail string into a protobuf message object.
            failure_object = type(failure_message).deserialize(
                error_detail.value
            )

            for error in failure_object.errors:
                print(
                    "A partial failure at index "
                    f"{error.location.field_path_elements[0].index} occurred.\n"
                    f"Error message: {error.message}\n"
                    f"Error code: {error.error_code}"
                )

    print("The operations are added to the offline user data job.")
    
    # Issues a request to run the offline user data job for executing all
    # added operations.
    operation_response = (
        offline_user_data_job_service_client.run_offline_user_data_job(
            resource_name=offline_user_data_job_resource_name
        )
    )
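
    # Note: _check_job_status and _print_customer_match_user_list_info below
    # are helper functions from Google's Customer Match example; copy them
    # over as well (or remove these calls) before running this script.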

    if skip_polling:
        _check_job_status(
            client,
            customer_id,
            offline_user_data_job_resource_name,
            user_list_resource_name,
        )
    else:
        # Wait until the operation has finished.
        print("Request to execute the added operations started.")
        print("Waiting until operation completes...")
        operation_response.result()
        _print_customer_match_user_list_info(
            client, customer_id, user_list_resource_name
        )

After all that, the code below presses the proverbial red button, which uploads your customers' email addresses. 🙌


upload = _add_users_to_customer_match_user_list(client, 'your-google-ads-id', user_list_resource_name, False, data)

The output should indicate the operation is running, which will take a while. Now you can sit back and wait for your data to become available, which can take anywhere between six and 48 hours. ⏰

Snowflake to Google Ads method 3: Set and forget your Snowflake to Google Ads syncs with Census

Since I know you're still waiting for your data to hit Google Ads from Snowflake, I’ll take this opportunity to tell you about an easier (and faster) way to move your customer data from A to B.  

While the manual method is straightforward, if you modify your lists often enough, you'll quickly realize it's more efficient to automate the process. However, as you saw above, setting up syncs via the respective APIs isn't exactly quick or easy (and you'd still have to schedule or otherwise trigger your export/import process some other way). Plus, you'd need to figure out error handling for if/when something goes awry.

In short: Even that automated method is a bear. 🐻

Thankfully, there’s an automated method that doesn’t just create more work for you down the line: Reverse ETL with Census.

Once you’ve signed up for Census and have your account set up (you can grab a free trial here), you just have to hook up Snowflake as your data source and make sure your Google Ads account is ready to go.

From there, it’s just six quick steps to set up an automatic sync between Snowflake and Google Ads that will surface any issues via email or Slack if they occur:

  1. Navigate to Connections
  2. Click on the Add Service button
  3. Select Google Ads from the list
  4. Authenticate Google Ads access with your account
  5. Select the Google Ads account you want to send data to
  6. Take a break and go for a walk with all that extra time you just reclaimed. ☀️

If you want a full video walkthrough of the steps, check out the demo below.

Want to learn more about managing offline conversions in Google Ads via your data warehouse? Check out this webinar. 👇

Then, head on over to the trial sign up page and start syncing your data faster and easier than ever before.
