The 5 SQL date functions biz ops teams need to know | Census

Terence Shin
7 September 2021

Key learnings from this article: In this post, you'll get a breakdown of the syntax for each of the date functions below, as well as an example of their use cases.

  • DATE_TRUNC()
  • DATE_DIFF()
  • DATE_ADD()/ DATE_SUB()
  • EXTRACT()
  • CURRENT_DATE()

What makes a good SQL expert? There are a lot of things (we’re pretty awesome), but one (among many) of our superpowers is the ability to manipulate data in any way that we like, especially dates.

Date manipulation allows the businesses we support to compare and assess performance across different periods of time. It should come as no surprise this ability informs a variety of top-tier business ops and reporting. Why? Different business functions have different needs across their datasets, including dates. Typically, finance cares about numbers on a quarterly basis, while customer success might care about numbers on a daily, or even hourly, basis. Thus, being able to manipulate dates for different business users is essential.

In this article, we’ll dive into five of the most important and most useful DATE functions in SQL (along with some of the practical business cases you can use them for) including:

  • DATE_TRUNC()
  • DATE_DIFF()
  • DATE_ADD()/ DATE_SUB()
  • EXTRACT()
  • CURRENT_DATE()

To help illustrate our point and anchor our examples into the real data and business worlds, we’ll be using an example scenario throughout. Let’s say we’re an analyst working for a company called Sandy Shores (if you caught our webinar on operational analytics at every stage, you know this company well) that rents beach chairs. We might want to manipulate the dates in our data warehouse so that we can run analysis on the amount of time between a customer’s rentals, and other important data points to help us measure growth.

Let’s dive in.

1. DATE_TRUNC()

DATE_TRUNC() shortens the date to the specified date part so you can aggregate numbers based on different durations (i.e. daily, monthly, annually). DATE_TRUNC is a necessity when conducting cohort analyses, where you typically group users by month.

It has the following syntax:


DATE_TRUNC(date_expression, date_part)

Thinking back to our example company, our fictional analyst might use the code below to truncate the date, June 28, 2021, to the beginning of June, June 1, 2021. In the example below,  we specified date_part = MONTH so the date_trunc() will truncate the date to get the first day of the month. Here’s the code:


DATE_TRUNC('2021-06-28', MONTH) = '2021-06-01'

To give another example, if our analyst specified date_part = YEAR, then they’d get the first day of the year of the specified date. Here’s the code again with this change:


DATE_TRUNC('2021-06-28', YEAR) = '2021-01-01'

Let’s look at some more examples Sandy Shores might run into.

Example 1: Suppose our analyst wanted to get a weekly sum of rental sales from the table below:

To do so, they would run the following code. In the block below, we’re truncating all dates to the beginning of each week, so that we can group together all sales that happen in the same week:


with sales_data as (
SELECT DATE_TRUNC(date, WEEK) as week_date,
daily_sales
FROM sales
)
SELECT week_date,
SUM(daily_sales) as weekly_sales
FROM sales_data
GROUP BY week_date

This would result in:

Now our analyst can confidently tell their business teams which week had the highest profits from chair rentals, which can be used in later analysis. Now, what if our analyst wanted to compare dates to understand the time between rentals? That’s where DATE_DIFF() comes into play.

2. DATE_DIFF()

DATE_DIFF() compares two dates and returns the difference in date_part between them.

For example, if date_part = DAY, then DATE_DIFF() returns the number of DAYs between the two dates. If date_part = MONTH, then DATE_DIFF() returns the number of MONTHs between the two dates.

In the code below, our date_part = DAY, so we’re told that one day passes between January 1, 2021 and January 2, 2021.


DATE_DIFF('2021-01-02', '2021-01-01', DAY) = 1

DATE_DIFF() is useful when you want to compare two dates, for example, when a package was shipped and when a package was delivered, or when a user registered and when they canceled. DATE_DIFF() is also useful in the WHERE clause if you want to filter dates that happened X periods ago (eg. 5 days ago, 2 weeks ago, or last month).

It has the following syntax:


DATE_DIFF(date_expression_1, date_expression_2, date_part)

Let’s take a look at an example that Sandy Shores might run into.

Example 1: Suppose you wanted to get the time between when a rental chair was picked up and when it was returned to a given Sandy Shores location.

To do so, you would run the following query:


SELECT order_id
, DATE_DIFF(date_returned, date_picked_up, DAY) as shipping_time
FROM orders

This would result in:

Example 2: Using the same table above, suppose you wanted to get all orders where the return time was less than 10 days:


SELECT *
FROM orders
WHERE DATE_DIFF(date_returned, date_picked_up, DAY) < 10

This time your results table would look something like this:

Now, our Sandy Shores data analyst has a better understanding of how long different customers use their rental chairs each time. This information can help the company better tailor its prices to the needs of customers, or project how many rental chairs they should keep on hand for new customers on a given day.

Now, what if they want to add a time period to the dates they’re working with? Let’s take a look.

3. DATE_ADD() / DATE_SUB()

DATE_ADD() adds a specified number of date parts to a date. Conversely, DATE_SUB subtracts a specified number of date parts to a date.


DATE_ADD('2021-01-01', INTERVAL 3 DAY) = '2021-01-04'


DATE_SUB('2021-01-04', INTERVAL 3 DAY) = '2021-01-01'


DATE_ADD('2021-01-01', INTERVAL 1 MONTH) = '2021-02-01'


DATE_ADD() and DATE_SUB() can be similarly used like DATE_DIFF() in the WHERE clause to filter dates that happened X periods ago or X periods in the future.

It has the following syntax:


DATE_ADD(date_expression, INTERVAL int64 date_part)


DATE_SUB(date_expression, INTERVAL int64 date_part)

Let’s take another look at a use case that might be helpful for Sandy Shores.

Example 1: Suppose you wanted to add 7 days to the date_picked_up values to get an estimated return date (assuming that Sandy Shores determined that vacationers rent beach chairs for an average of 7 days).

To do so, you would run the following query:


SELECT order_id
, date_picked_up
, DATE_ADD(date_picked_up, INTERVAL 7 DAY) as estimated_date_returned
FROM orders

This would result in the table below:

This function helps Sandy Shores understand when they should expect to see their chairs back, and follow up with customers if their return window exceeds that date. But what if they wanted to get specific about the number of rentals they had on a specific date for a specific location? That’s where EXTRACT() comes in.

4. EXTRACT()

EXTRACT() returns the value that corresponds to the specified date part.

For example, we can see below how we can use this function to pull out a specific part of a full date, such as a month, day, or year.


EXTRACT(DAY FROM '2021-01-03') = 3


EXTRACT(MONTH FROM '2021-01-03') = 1


EXTRACT(YEAR FROM '2021-01-03') = 2021

It has the following syntax:


EXTRACT(part FROM date_expression)

Using EXTRACT() is an easy way to get specific components of a date you’re interested in. For example, if your company does reporting by week numbers, you can use EXTRACT() to get the week number for a given date for a given record. EXTRACT() also allows you to get the month number or year from the date, both of which are useful for machine learning models.

Let’s see how Sandy Shores can use EXTRACT() to get specific about their rental records.

If Sandy Shores wanted to get more insights on the number of sales in each year, or sales by month, EXTRACT() can come in handy because it provides an easy way to group data.

Consider the following table:

To get the year and month from each date, you could run the following query:


SELECT order_id
, date_picked_up
, EXTRACT(YEAR FROM date_picked_up) as year
, EXTRACT(MONTH FROM date_picked_up) as month
FROM orders

This would result in the following output:

With the four date functions above, the analyst at Sandy Shores can now manipulate dates however she likes for business reporting and insights. However, there’s one more useful date function that will allow our analyst to write dynamic queries, which is CURRENT_DATE().

5. CURRENT_DATE()

CURRENT_DATE() returns the (you guessed it) current date in which the query was executed in the specified timezone. Note that the timezone parameter is optional and does not need to be specified, as the default timezone is in UTC.

Using CURRENT_DATE() offers an easier way of referencing today’s date as opposed to a hard-coded date, which is especially useful if it’s a query scheduled on Airflow or a query you use often.

It has the following syntax:


CURRENT_DATE([time_zone])

Here’s an example that Sandy Shores might use.

Example 1: Suppose today is January 14, 2021, and Sandy Shores wanted to get all rentals that were picked up in the past week:

To do so, they would run the following query:


SELECT order_id
, date_picked_up
, date_returned
FROM orders
WHERE DATE_DIFF(CURRENT_DATE(), date_picked_up, DAY) < 7

This would result in the following output:

As we can see in the table above, the function gave us a list of each chair rental that was picked up in the last seven days so we can see which customers are active and how many of our chairs are out enjoying a beach day somewhere.

Following in the steps of Sandy Shores: What next?

Now that you understand how the top five DATE functions in SQL work as well as our fine analysts at Sandy Shores, it’s time to put your knowledge to the test. For you adventurous few, there are two resources I’d recommend checking out:

  1. Leetcode is a website that has hundreds of coding problems that you can solve. One problem called ‘Rising Temperature’ is a good problem that challenges your ability to work with dates.
  2. w3resource is another great resource — in this link, there are 21 problems focused on datetime.

Looking for even more ways to improve your SQL skills? Check out some of the other great resources from the census team here.