The Official Census Blog

[Part 2] You can build reverse ETL yourself. But you don’t want to.

Allie Beazell is director of developer marketing @ Census. She loves getting to connect with data practitioners about their day-to-day work, helping technical folks communicate their expertise through writing, and bringing people together to learn from each other.

This is part two of our series on evaluating reverse ETL tools. You can read part one here and grab the companion guide, The ultimate guide for evaluating reverse ETL tools.

In this installment, you’ll learn why reverse ETL is so challenging for many teams to build. We’ll break down:

  • The cost to build 🏗️
  • The cost to maintain 🔧
  • The opportunity cost 💸
  • Why reverse ETL isn’t just another data pipeline 🚰

You probably already have some talented data and engineering folks on staff. You may be asking yourself: Why can’t they just build reverse ETL for our team themselves?

We often see companies wrestle with this. On the one hand, you can spin up an MVP of a reverse ETL tool yourself without too much trouble using custom Python scripts and open source options. But the real expense isn’t the initial development, it’s the ongoing time and resources that maintaining the tool will cost you. A DIY build will lock your engineers right back in API purgatory.

On the flip side, when you use a pre-built tool like Census, you don’t have to spend time worrying about the ins and outs of each API your team uses (and you can reuse the data modeling and logic work you’ve already built for modern data stack tools like dbt and Airflow on top of your data warehouse). Even better, you can go from request to deployment on the same day. Sounds great, right?

But in case you’re not convinced, let’s take a look at some numbers behind the classic build vs. buy decision (since we all love numbers, right?). Here are three reasons you don’t want to build your own reverse ETL solution.

1. Cost to build 🏗️

If you choose to build reverse ETL yourself, the first figure you’ll have to reckon with is the labor cost of whoever builds your integrations. This could be a data engineer, a data analyst, or an integrations specialist. For this exercise, we’ll assume that the task would fall to a data engineer.

To start, let’s consider the average annual salary of a data engineer in San Francisco: $162,954.

Let’s do some back-of-the-napkin math to get a ballpark estimate of how much it would cost your organization to build a bespoke reverse ETL tool (not including the potentially infinite number of maintenance hours on that tool after it ships; we’ll get to that in a second):

Let’s say you’re starting out with three connectors: Salesforce, Marketo, and Zendesk. And let’s say, for the sake of napkin science, it takes a (proficient) engineer about 2 weeks to build a custom DIY connector. At $162,954 a year, that’s roughly $6,267 per connector ($162,954 ÷ 52 weeks × 2 weeks). That’s just for spinning one connector up. All three would run you about $18,800 and 6 weeks in initial engineering time alone. And that’s the cheap part: the expensive part is maintaining it and hoping that, if it fails, someone notices quickly and fixes it before your business teams spend months working off stale data.
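If you want to rerun the napkin math above with your own numbers, here’s a minimal Python sketch. It uses the salary and the two-week estimate from this post; spreading the salary evenly over 52 weeks and ignoring benefits and overhead are simplifying assumptions, so treat the result as a floor rather than a quote.

```python
# Back-of-the-napkin connector build cost, using the figures from this post.
ANNUAL_SALARY = 162_954      # average SF data engineer salary cited above
WEEKS_PER_YEAR = 52          # assumption: salary spread evenly over 52 weeks
WEEKS_PER_CONNECTOR = 2      # estimate for a proficient engineer, from the text
CONNECTORS = ["Salesforce", "Marketo", "Zendesk"]


def connector_cost(annual_salary: float, weeks: float) -> float:
    """Labor cost of building one connector (no benefits, no overhead)."""
    return annual_salary / WEEKS_PER_YEAR * weeks


per_connector = connector_cost(ANNUAL_SALARY, WEEKS_PER_CONNECTOR)
total_cost = per_connector * len(CONNECTORS)
total_weeks = WEEKS_PER_CONNECTOR * len(CONNECTORS)

print(f"Per connector: ${per_connector:,.0f}")
print(f"All {len(CONNECTORS)} connectors: ${total_cost:,.0f} over {total_weeks} weeks")
```

Swap in your own salary figure, connector list, and build-time estimate to see how quickly the initial build adds up before maintenance even enters the picture.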

Compare that to a pre-built reverse ETL tool, where setting up a sync with your operational systems and BI tools typically takes less than 15 minutes. You can get started for free, and a yearly subscription with unlimited connectors is available for less than $10K per year.

2. Cost to maintain 🔧

We probably don’t have to tell you this, but the cost of DIY software doesn’t stop when the integration goes live. Instead, it gets steadily more expensive to deal with as it falls out of date with new tools and API updates, and as new team members have to take the time to make heads or tails of it.

This cost is much harder to estimate with a high-level formula. People will inevitably leave your company or move teams, APIs will change, new features will be needed, and the organization will grow. On top of that, there’s pressure to fix these pipelines quickly, since they’re transporting business-critical customer data. If these updates are relegated to the backburner, there are real-world, real-time impacts on customer experience, like new trials not receiving welcome emails or clunky customer support.

As your data sources, integrations, and organization become more complex, you’ll end up with a dangerous, interdependent spider’s web of connectors to maintain and debug.

So how does a pre-built, dedicated reverse ETL tool eliminate this risk and stress? There are three main ways:

  1. Harm reduction: Your business depends on data. If your data integrations aren’t performing as needed, or aren’t consistently built using up-to-date best practices, you’ll lose time and money quickly. For example, you might not be automatically deduping your ad platform audiences, or your sync might be running up your API bills by syncing unnecessary fields.
  2. Data quality assurance: Trust in real-time data (pulled from a single source of truth) is an integral part of building a modern, data-driven organization. If your data stack isn’t solving for this (or, at the very least, not making it worse), you’re going to come out net negative. For example, if your sales team consistently gets out-of-date or incorrect contact information for prospects, they’ll hesitate to leverage the data at their disposal and fail to perform as well as they could.
  3. Scalable features: The cost of building and maintaining custom features for your tooling will quickly add up. A quality, pre-built reverse ETL tool will come OOTB with free core features to get you up and running (and running some more) from day one. For example, our free plan includes 10 destination fields, unlimited destination connections, unlimited SQL and dbt Models, incremental syncing, and more from day one.
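To make the deduping, unnecessary-field, and incremental syncing points above a bit more concrete, here’s a rough Python sketch of the kind of logic a DIY sync would need to own. This is not how Census (or any particular tool) implements it; `prepare_sync`, its parameters, and the sample rows are all hypothetical names made up for illustration.

```python
import hashlib
import json


def prepare_sync(rows, key, fields, previous_hashes):
    """Return (records_to_send, new_hashes) for one sync run.

    rows: list of dicts, e.g. the result of a warehouse query
    key: field that uniquely identifies a record in the destination
    fields: the only other fields the destination actually needs
    previous_hashes: {key_value: hash} saved from the last run
    """
    deduped = {}
    for row in rows:
        # Keep only the key and mapped fields, so unneeded columns
        # never reach the destination API.
        trimmed = {key: row[key], **{f: row.get(f) for f in fields}}
        # Later rows overwrite earlier rows with the same key (dedupe).
        deduped[row[key]] = trimmed

    to_send, new_hashes = [], {}
    for key_value, record in deduped.items():
        digest = hashlib.sha256(
            json.dumps(record, sort_keys=True, default=str).encode()
        ).hexdigest()
        new_hashes[key_value] = digest
        # Incremental: skip records unchanged since the last run.
        if previous_hashes.get(key_value) != digest:
            to_send.append(record)
    return to_send, new_hashes


rows = [
    {"email": "a@example.com", "plan": "free", "internal_note": "x"},
    {"email": "a@example.com", "plan": "pro", "internal_note": "y"},
    {"email": "b@example.com", "plan": "free", "internal_note": "z"},
]
first_run, state = prepare_sync(rows, "email", ["plan"], {})
second_run, _ = prepare_sync(rows, "email", ["plan"], state)
print(len(first_run), len(second_run))
```

The first run sends two deduped, trimmed records; the second run sends nothing because nothing changed. Even this toy version has to persist state between runs, and it still says nothing about retries, rate limits, or partial failures, which is where the maintenance hours tend to go.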

If you weigh all these considerations and decide that it’s worth the ongoing cost to build your reverse ETL tooling in-house, more power to you. But, if you’re like us, you’d rather spend that time and those resources building your core product.

3. Opportunity cost 💸

What could your talent build if all the busy work and distractions of maintenance were taken off their plate? Ultimately, the real cost of building a reverse ETL in-house and dedicating engineering effort to maintain your own connectors isn’t the dollar amount of the time spent coding the project, it’s the cost of the work they’re not doing instead.

Not only that, it’s the cost of the work your analysts could do with the right tooling, but are blocked from doing otherwise. After all, tools like Census don’t just let data engineers do better work. They give your analysts an avenue to create greater business impact by building data pipelines themselves with just SQL, rather than the complex skills required to instrument Airflow or custom scripts. This delivers a measurably positive lifetime value, instead of a lifetime deficit of tool maintenance.

On the data engineering side, practitioners want to spend less time writing integrations to move data and more time unlocking value in data through models and applications. But they can’t do that if all their time is spent troubleshooting a complicated integration they inherited from another engineer. An uphill battle doesn’t scale for anyone.

Reverse ETL is not just another pipeline 🚰

Now that we’ve weighed the costs of building an in-house reverse ETL tool, the final point we want to make is that there is significant complexity in building a reverse ETL tool that goes beyond just building pipelines. It’s easy to think that reverse ETL is “just another data pipeline,” but here are a few reasons why that’s not true:

  • Reverse ETL is designed for a different endgame. Syncing data to downstream operational tools has a different purpose than data landing in a cloud data warehouse via traditional ETL tools. Also, with reverse ETL you’re not just putting a data tool in place for the data team, you’re bridging the organizational gap between that data team and your business users. Operational data is used to power workflows and drive decision-making, and any new data infrastructure needs to be carefully designed with these end-use cases and users in mind. This affects many requirements such as data quality, alerting, and usability for non-technical users.
  • Writing data is a different animal than reading it. Reverse ETL writes business-critical data into hundreds of destination databases at scale, all with their own nuances. Read APIs on most SaaS tools tend to be relatively straightforward, whereas write APIs are much more complex (e.g. each tool handles deletes differently).
  • Reverse ETL requires a new data governance strategy. Reverse ETL tools interact with any application they’re connected to and you must govern them well to ensure there’s a tight feedback loop throughout the organization. Good data stewardship is a uniquely important virtue for reverse ETL tools.
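As a small illustration of the point above that each tool handles deletes differently, here’s a hedged Python sketch. The two destination classes, the request shapes, and the `/records/...` paths are entirely hypothetical and don’t describe any real SaaS API; the point is only that the same “this record left the audience” event needs a per-destination translation.

```python
from abc import ABC, abstractmethod


class Destination(ABC):
    """Hypothetical write adapter; real APIs differ in many more ways."""

    @abstractmethod
    def delete_payload(self, record_id: str) -> dict:
        """Describe the request that removes one record."""


class HardDeleteDestination(Destination):
    # Some tools remove the record outright.
    def delete_payload(self, record_id: str) -> dict:
        return {"method": "DELETE", "path": f"/records/{record_id}"}


class SoftDeleteDestination(Destination):
    # Others only let you flag the record as archived.
    def delete_payload(self, record_id: str) -> dict:
        return {
            "method": "PATCH",
            "path": f"/records/{record_id}",
            "body": {"archived": True},
        }


def plan_deletes(destination: Destination, previous_ids, current_ids) -> list:
    """Build delete requests for ids present last run but missing now."""
    removed = sorted(set(previous_ids) - set(current_ids))
    return [destination.delete_payload(record_id) for record_id in removed]


previous = {"c1", "c2", "c3"}
current = {"c1", "c3"}
hard_plan = plan_deletes(HardDeleteDestination(), previous, current)
soft_plan = plan_deletes(SoftDeleteDestination(), previous, current)
print(hard_plan)
print(soft_plan)
```

Every new destination means another adapter like these to write, test, and keep current as the vendor’s API changes.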

We know that’s a lot to think about when deciding if a managed reverse ETL solution is for you. So here’s a recap:

  • The cost of building your own tool is often more than makes sense for many companies to take on. Our best-case, back-of-the-napkin estimate puts it at almost $19K and six weeks to build only three connectors, and that doesn’t come close to factoring in the maintenance costs.
  • The cost of maintaining a bespoke reverse ETL tool often balloons well beyond initial estimates and never really stops. Outsourcing your reverse ETL tool to a vendor like Census lets you do higher leverage work with that time and gain some peace of mind.
  • The opportunity cost of having your talented data engineers focus on building and maintaining a DIY tool (and the wasted potential of your analysts) is more than enough to offset the annual cost of a dedicated reverse ETL tool.
  • Reverse ETL is not just another pipeline, and there is significant complexity in trying to build a tool that can satisfy business intelligence use cases and reliably meet technical requirements like dealing with write APIs.

Hopefully, by now you’re in the camp of folks who know they want to invest in a reverse ETL tool from a dedicated, expert vendor like Census. If you’re ready to start the journey toward operational analytics, AKA doing more with your data (and your time), there are some key considerations to keep in mind as you search for the best reverse ETL tool.

Ready to learn more? Check out part 3.1 of our series on evaluating reverse ETL tools: 4 key considerations for finding the best reverse ETL tool. Or, if you want to skip ahead, grab The ultimate guide to evaluating reverse ETL tools.
