
How to automate custom Fantasy Football logic

Donny Flynn · November 03, 2021

Donny Flynn is a customer data architect at Census, based in the San Francisco Bay Area. He previously worked as a head of data and is (forever and always) a fanatic for Chicago sports. Donny enjoys working on startups and helping educate the community on data architecture, SQL, and Python.

In this article, you'll learn how to automate custom Fantasy Football logic with reverse ETL. I'll go over:

  • Getting data from ESPN
  • Setting up Airtable
  • Using SQL and configuring your syncs in Census
  • Why my team, Chicago Donster, definitely isn't finishing last this year (kidding, mostly)

To follow along, make sure you check out the accompanying GitHub repo here.

Hello! It’s the middle of football season, and I happen to be a big sports fan (as you’ve probably guessed already). A couple of buddies from college and I have belonged to a Fantasy Football league for the past seven years (yes, a touchdown and an extra point).

Until this year, the ESPN Fantasy app was our source of truth for standings, projected points, who would make the playoffs, and, perhaps most importantly, who was going to end up in the consolation bracket (AKA in jeopardy of the league punishment).

This year, we had an exciting idea: We’d play two games a week, one against our opponent and one against the other 11 members of the league.

We wanted to do this to smooth out some of the luck involved with head-to-head matchups. We didn't want the team with the second-highest score in a given week to walk away with nothing just because it happened to face the week's top scorer. So, instead, we decided to award an additional win to any team that finished in the top half of scorers for that week.

If composite wins were tied, head-to-head wins were the first tiebreaker, with total points as the second. Good news: The amendment to the league bylaws was approved by majority vote! Challenging news: We now needed a source of truth for monitoring projected points and standings. Since I talk a lot about setting up sources of truth here at Census, I figured this would be a fun real-world use case for pushing data from the data warehouse to operational tools.
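To make the rule concrete, here's a toy sketch of how one week's points wins might be awarded (team names and scores are invented for illustration):

```python
# Toy illustration of the "top half of scorers" rule -- all data invented.
week_scores = {
    "Chicago Donster": 112.4,
    "Team B": 118.9,
    "Team C": 96.1,
    "Team D": 88.0,
}

# Rank teams by score; the top half earn an extra "points win" that week.
ranked = sorted(week_scores, key=week_scores.get, reverse=True)
top_half = set(ranked[: len(ranked) // 2])

for team in ranked:
    bonus = 1 if team in top_half else 0
    print(f"{team}: +{bonus} points win")
```

In our 12-team league, that means the six highest scorers each week pick up an extra win on top of any head-to-head result.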

In this article, I’ll walk you through how you, too, can set up a source of truth for your own Fantasy Football league data using reverse ETL.

Step 1: Get data from ESPN

To ensure we have the current data from the Fantasy League, we first need to query ESPN's Fantasy Football API. Check out the links and follow along in the GitHub repo I’ve thrown together. I’ve also made the video explainer below if that format is more your jam.

This script queries the Fantasy API with your credentials and pulls down scores and stats from the league based on the objects you identify. If you want to update any of the logic for your league's specific rules, do so in fantasy_data_routing.py. The script has five phases, the first four of which take place in the run_el_script function. Here are those four steps, followed by a minimal sketch of the flow (we'll go over the fifth step later in this tutorial):

  1. run_el_script initializes the league object from ESPN Fantasy Football; this can remain unchanged.
  2. create_teams_and_times sets up the team objects, specifying which parameters you want to tie to a team. To customize the metrics you want to track, initialize them to zero here.
  3. create_rankings_and_scores modifies the team dataframes to set the metrics as you want them. For our case, this is where the points wins attribution takes place. Also, it pulls the current and projected scores for the ongoing Fantasy games.
  4. write_to_snowflake initializes the SQLAlchemy Snowflake engine, configures the Snowflake parameters (including database and schema) from the .env file, and writes to the table names specified in the function. You can swap this out for any SQL database that SQLAlchemy supports.

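Here's a minimal sketch of that flow, assuming the community espn_api package and a SQLAlchemy Snowflake connection. The environment-variable names, columns, and metrics below are illustrative placeholders rather than the repo's exact code:

```python
# Illustrative sketch only -- the real logic lives in fantasy_data_routing.py.
import os

import pandas as pd
from espn_api.football import League  # community ESPN Fantasy API wrapper
from sqlalchemy import create_engine


def run_el_script():
    # 1. Initialize the league object from ESPN Fantasy Football.
    #    espn_s2/SWID are the session cookies needed for private leagues.
    league = League(
        league_id=int(os.environ["LEAGUE_ID"]),
        year=2021,
        espn_s2=os.environ["ESPN_S2"],
        swid=os.environ["SWID"],
    )

    # 2. Create one row per team and initialize custom metrics to zero.
    teams = pd.DataFrame(
        {
            "team_name": [t.team_name for t in league.teams],
            "h2h_wins": [t.wins for t in league.teams],
            "points_wins": [0] * len(league.teams),  # filled in below
        }
    )

    # 3. Award a points win for each week a team scored in the top half.
    #    (The repo walks week-by-week scores; simplified here for brevity.)
    teams["composite_wins"] = teams["h2h_wins"] + teams["points_wins"]

    # 4. Write the finished dataframes to Snowflake via SQLAlchemy.
    engine = create_engine(os.environ["SNOWFLAKE_URL"])
    teams.to_sql("standings", engine, if_exists="replace", index=False)
```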
After you have your logic the way you want, you’re done with the Python portion!

Step 2: Configure Airtable

This part is quite simple: just connect Airtable to Snowflake however you'd like and share the base with your team members!

In my case, I wanted to show the current official standings, the current Fantasy matchup scoreboard, and the projected standings for the subsequent week based on what ESPN predicts.

Disclaimer: ESPN's prediction model has the Chicago Donster (AKA my team) in last place. But that's not going to happen. It just won't. We're in 8th place and coming off a nail-biter head-to-head win last week, and a trade is currently processing. An exciting push for the playoffs could make 2021 the year of the Donster.

Anyway, here’s my Airtable setup: 😅

Step 3: Use SQL and configure Census syncs

Now that we have the data in Snowflake, we can use SQL to pull the relevant statistics for Airtable. I've put the SQL I used in the repo; I copied it into Census models and configured the syncs. For your use case, you'll need to do the following in Census (a sketch of a model follows the list):

  • Configure your Snowflake (or other data warehouse) connection
  • Configure your Airtable connection
  • Write the SQL for each Airtable table into the models tab
  • Configure your sync (make sure you map all of the columns!)
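For instance, a standings model might look something like this. Hedged sketch: the table and column names follow the placeholder schema from the step 1 sketch (plus a hypothetical total_points column), not necessarily the repo's exact SQL:

```sql
-- Hypothetical standings model; table and column names are placeholders.
SELECT
    team_name,
    h2h_wins,
    points_wins,
    h2h_wins + points_wins AS composite_wins,
    total_points
FROM standings
ORDER BY
    composite_wins DESC,  -- composite wins first
    h2h_wins DESC,        -- head-to-head wins as first tiebreaker
    total_points DESC     -- total points as second tiebreaker
```

The ORDER BY mirrors the league bylaws: composite wins first, then head-to-head wins, then total points.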

We're going to trigger the Census syncs from our Python script, so we'll need some credentials to properly kick off these syncs at the end of the script. That's the fifth and last phase of fantasy_data_routing.py: the trigger_census_syncs function.

To get the necessary info, you need to click into your sync configuration, take the sync id from the URL, and grab the secret token from the API trigger portion. This secret token will be the same for your project, so you should have three sync ids and one secret token to copy into the .env file.
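Here's a minimal sketch of what trigger_census_syncs might look like, assuming the three sync IDs and the secret token live in your .env file. The endpoint follows Census's sync-trigger API; the environment-variable names are placeholders:

```python
# Illustrative sketch; variable names are placeholders for your .env values.
import os

import requests


def trigger_census_syncs():
    token = os.environ["CENSUS_SECRET_TOKEN"]
    sync_ids = os.environ["CENSUS_SYNC_IDS"].split(",")  # e.g. "101,102,103"
    for sync_id in sync_ids:
        resp = requests.post(
            f"https://app.getcensus.com/api/v1/syncs/{sync_id}/trigger",
            auth=("bearer", f"secret-token:{token}"),
        )
        resp.raise_for_status()
        print(f"Triggered Census sync {sync_id}")
```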

Step 4: Run the script 🎉

You can run this locally on your machine or orchestrate it however you’d like.

If everything is set up correctly, you’ll run:


```bash
python fantasy_data_routing.py
```

This should kick off the sync and pull the most current ESPN data into Airtable (pretty neat, huh?).

There you have it, a step-by-step way to send your ESPN Fantasy Football data into Airtable and take into account your league’s rules. Thanks so much for reading! If you run into any issues, or just want to talk about data and football, shoot me a line.

Oh, and one more thing: GO BEARS!!
