Built With Census Embedded: Labelbox Becomes Data Warehouse-Native

Jeff Sloan April 08, 2024

Jeff is a Senior Data Community Advocate at Census, previously a Customer Data Architect and a Product Manager. Jeff has strong opinions on LEFT JOINs, data strategy, and the order in which you add onions and garlic to a hot pan. Based in New York City.

Every business’s best source of truth is in their cloud data warehouse. If you’re a SaaS provider, your customers’ best data is in their cloud data warehouses, too.

That’s why Labelbox, an AI platform, implemented Census Embedded to enable rapid data onboarding from their customers’ data warehouses, all within a user-friendly and Labelbox-branded experience. 

With Census Embedded, Labelbox is one of the first applications to become fully data warehouse-native. Labelbox customers can now onboard their data from their cloud data warehouse and power workflows with fresh data 24/7.

Want to learn how they helped their application speak to data warehouses? Read on.

Data Onboarding: Labelbox enriches data warehouse data

Labelbox is an AI SaaS company that sits at the heart of machine learning development workflows at companies like Walmart, NASA, and Warner Brothers. Good, fresh data is critical to the value they deliver to those customers.

Specifically, Labelbox automatically tags images, text, and audio to describe the contents of the files. Data scientists use this labeled data to develop their machine-learning models. In the bad old days, this work took teams of interns, outsourcing, and Mechanical Turk surveys. It was a slow, manual, and expensive process.

But Labelbox faces a fundamental challenge: to add value for their customers, they need access to their customers’ data. This data is often already sitting in a cloud data platform like Snowflake or Databricks, or a data lake built on GCS or S3.

Previously, customers were forced to build integrations from their data sources directly against Labelbox’s API. Labelbox wanted better for their users.

Census Embedded in Action: How it works

Labelbox uses Census Embedded to import data continuously from their customers’ data warehouses and 20+ other sources. 

"The benefit of this approach is that our customers can now sign up for Labelbox, rapidly onboard, connect to their warehouse, and they're done. Customers no longer have to write data pipelines into our API to get the most out of our platform." — Kahveh Saramout, Lead Data Engineer at Labelbox

The process is:

  1. Labelbox customers log into Labelbox, click to connect their data warehouse, and then add their credentials in a secure portal provided by Census Embedded. 
  2. Labelbox customers select the dataset they would like to import into Labelbox and map columns from their data to available fields in Labelbox.
  3. Finally, customers schedule their import jobs to run on a desired cadence.

All of the necessary connections and scheduled syncs are created by the Labelbox application via the Census Embedded API. Within the Census Embedded user interface, Kahveh and his team have deep observability and alerting capabilities to manage these imports at scale.
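
To make the three steps concrete, here is a minimal Python sketch of the information an application might collect before creating a connection and a scheduled sync on a customer's behalf. Every name in it (WarehouseConnection, ImportSync, credential_ref, the asset_url and external_id fields) is hypothetical and chosen for illustration. None of it is taken from the Census Embedded API or from Labelbox's schema, and it makes no network calls. It only models a connection, a column mapping, and a cadence, and checks that the mapping is usable.

```python
from dataclasses import dataclass

# Hypothetical names throughout: nothing here comes from the Census Embedded
# API or from Labelbox. The classes only model the three steps listed above.


@dataclass(frozen=True)
class WarehouseConnection:
    """Step 1: the customer connects a warehouse."""
    customer_id: str
    warehouse_type: str  # e.g. "snowflake" or "databricks"
    # Credentials are entered in the secure portal, so the application
    # keeps only an opaque reference to them (an assumption of this sketch).
    credential_ref: str


@dataclass
class ImportSync:
    """Steps 2 and 3: pick a dataset, map columns, choose a cadence."""
    connection: WarehouseConnection
    source_dataset: str
    column_mapping: dict[str, str]  # warehouse column -> destination field
    cadence_minutes: int

    def validate(self, available_fields: set[str]) -> list[str]:
        """Return a list of problems; an empty list means the sync is usable."""
        problems = []
        if not self.column_mapping:
            problems.append("no columns mapped")
        for column, target in self.column_mapping.items():
            if target not in available_fields:
                problems.append(
                    f"column {column!r} maps to unknown field {target!r}"
                )
        if self.cadence_minutes <= 0:
            problems.append("cadence must be a positive number of minutes")
        return problems


# Example: one customer, one hourly import with two mapped columns.
conn = WarehouseConnection("acme", "snowflake", "cred-123")
sync = ImportSync(
    connection=conn,
    source_dataset="analytics.images",
    column_mapping={"image_url": "asset_url", "image_id": "external_id"},
    cadence_minutes=60,
)
print(sync.validate({"asset_url", "external_id"}))  # prints []
```

Validating the mapping before anything is scheduled keeps a mistyped field name from surfacing later as a failed import. In a real integration, the validated values would be passed to whatever connection and sync endpoints the embedded platform exposes.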

The Future is Data Warehouse-Native

Labelbox is a pioneer in data warehouse-native applications, but they certainly won’t be the last.

At Census, we believe every SaaS company will need to integrate with the cloud data warehouse. As SaaS providers integrate AI into their platforms, access to good customer data will become the critical difference between valuable, personalized features and generic fluff. 

Census Embedded is the best way to access your customers’ rich datasets, whether you’re importing data into or exporting it from your SaaS application.

Interested in learning more about becoming data warehouse-native with Census Embedded? Request a demo today and learn how an embedded integration platform can help you onboard your customers faster and deliver more value to them.
