The Distributed SQL Blog

Thoughts on distributed databases, open source, and cloud native

PostgreSQL Timestamps and Timezones: How to Navigate the Interval Minefield

Technical Product Manager

This is the second of a two part blog post series about the date-time data types that PostgreSQL, and therefore YSQL, support. The first part dealt with the basic business of representing moments (when things happen). The relevant data types here are time, date, and timestamp—where the latter has a without time zone and a with time zone variant. I’ll use, hereinafter, the short spellings (plain) timestamp and timestamptz, respectively, for these—and timestamp[tz] to denote either one of these.

I’ll assume, here in the second part, that you’ve read the first part. I’ll also assume that you downloaded and installed the companion code kit for the YSQL date-time documentation.

This second part deals with durations (how long things last). The relevant data type here is interval. When you subtract one timestamp[tz] value from another, you get an interval value. And when you add (or subtract) an interval value to (or from) a timestamp[tz] value, you get a timestamp[tz] value. This sounds as if it couldn’t be simpler. But it turns out that the rules here are complex and confusing. I struggled with all this while I was doing the required study to prepare for my documentation task. I asked endless questions on the pgsql-general email list and received enormous help from that source. One reply did say this:

Looking for logic in dates/times/calendars is a recipe for a continuous pounding headache.

I really think that I’ve found a way to constrain the complexity and the paradoxes. And it’s my aim here to provide you with a powerful headache analgesic. This explains my choice of this post’s title.

Why are durations conceptually hard?

The conceptual difficulties stem from these phenomena:

  • Humans have evolved, and developed language and thought, on planet Earth—which both rotates on its own axis and orbits the sun. And its axis of rotation isn’t normal to the plane of its solar orbit. As a result, we see the sun rise and set—in other words, we experience day and night. And, distinct from this, we experience the cycle of changing length of daylight and changing weather—in other words seasons. We call the duration from when a spot on planet Earth sees the sun at its zenith until it next sees it there one day. And we call the duration from when a spot on planet Earth experiences its longest day until it next experiences this one year.
  • There is no astronomical reason for the ratio of these two durations, one year to one day, to be an integer—and it isn’t. See the Wikipedia article Year. It says that the average length of one year, over a 400 year sample, is about 365.2425 days (to four digits of decimal precision). Humans have adopted a convention to accommodate this: a standard year lasts 365 days and a leap year lasts 366 days; leap years usually come when the year number is divisible by four (2012, 2016, 2020, and so on); but when it’s a century year (1900, 2000, 2100, and so on) it will not be a leap year unless it’s divisible by 400. So 2000 was a leap year, but 1900 wasn’t and 2100 won’t be.
  • There’s also the phenomenon of leap seconds to accommodate the fact that the length quoted by Wikipedia, 365.2425 days, is an approximation. There’s a rule for when one minute occasionally becomes 61 seconds. See the Wikipedia article Leap second. The calendar implementation that PostgreSQL uses, and that YSQL therefore inherits, does honor the ordinary rule for when leap years occur. But it does not honor leap seconds. Therefore, developers who use these two systems are at least spared that complexity.
  • On top of all this, there’s the Moon—clearly visible from Earth. It orbits the Earth in what’s known as one lunar month. But there are two different figures for the duration of a lunar month: about 27.32 days or about 29.53 days. This reflects the fact that there are different criteria for deciding when an orbit starts and ends (relative to the earth’s surface or relative to the position of the sun). For a simple life, people usually say that a lunar month lasts 28 days. But this hardly matters because, these days, this duration is never used as a conventional measure. (The lunar month was once so used—and this is where the word comes from. So important are the phases of the moon to early human cultures that in Chinese, for example, the word 月 means both moon and month. And in English, month and moon are clearly etymologically related.) Rather, the international standard is now the calendar month. And the duration of a calendar month is famously idiosyncratic—variously 30 days or 31 days with February, as a special case, having 28 days in a standard year and 29 days in a leap year. This convention means that the total number of days summed over the twelve calendar months in any year, either 365 or 366, is always identical to the number of days in any particular year.
  • Finally there’s hours, minutes, and seconds. These are arbitrarily defined subdivisions of a day—entirely independent of astronomical phenomena. We could just as well have ended up with twenty hours in a day, one hundred minutes in an hour, and one hundred seconds in a minute. After all, “minute” comes from the Latin pars minuta prima, meaning “first small part”; and “second” comes from pars minuta secunda, meaning “second small part”. There’s no “sixty” to be seen in that etymology. One second in our actual scheme is about two-and-a-quarter times longer than one second in the scheme that has 20*100*100 of these units.

Of course, you all know this. It’s second nature to us in the modern world—even though, over a time scale of centuries, it took a long time for the world’s population to agree on a single calendar convention. And we still have vestiges of outdated conventions bringing outcomes like Easter and the Chinese New Year occurring on a different calendar date each year.

The reason that I spelled out this sorry, but inescapable, mess is that humans conventionally measure durations in different units in different situations. I decided to use the terms “clock time” and “calendar time” in the YSQL date-time documentation. (I introduce the terms in the section Conceptual background.) The PostgreSQL documentation has no such terms of art—and its explanations are correspondingly impoverished.

Clock time durations

The best way to understand clock time durations is as what a reliable stopwatch would show when it’s carried by an observer between two events—no matter where they happen. I’ll assume that you follow my advice, given in Part One, to represent the moments at which the events happen always as timestamptz values. Arithmetic that uses clock time interval values and timestamptz values is independent of the session timezone. However, when, for example, you view a timestamptz value, it must be converted to text—and this conversion is sensitive to the session timezone.

Clock time is measured in hours, minutes, and seconds—which units are mutually reliably convertible, just as are yards, feet and inches. These days, a second is not defined as a fraction of a day. Rather it’s defined with respect to the Caesium standard: one second is defined to be 9,192,631,770 caesium standard units. Notice that days, and longer units, are simply not used to measure clock time durations. The durations of long, multi-segment, international flights are given in clock time—even when they exceed twenty four hours. For example, for San Francisco SFO to Kathmandu KTM, Internet search finds a three-segment option departing SFO at 22:55 local time on Sunday evening, 21-Nov-2021 and arriving at KTM at 15:10 local time on the Tuesday afternoon. The total journey time, as measured by the stopwatch that you start at take-off from SFO and stop at the final touch-down in KTM, is 26:30 hours.

The apparent paradox that departing five minutes before the hour and arriving ten minutes after the hour results in an elapsed clock time of an integral number of hours and thirty minutes is explained by the timezone change. At that time of year, 22:55 on Sunday evening in San Francisco is 12:40 on Monday in Kathmandu. It’s exactly because Kathmandu is, psychologically at least, a day ahead of San Francisco that saying that the journey time is one day two hours and thirty minutes would be confusing—to say the least.

The critical point with this scenario, and others like it that call for durations to be measured in clock time, is this: you know the duration of a journey that starts in one timezone and ends in another; Daylight Saving Time transitions might be encountered en route; and, of course, you want to know the exact local time of arrival. Only if you know this can you, for example, arrange for somebody to meet you at the airport. The person who’ll meet you doesn’t care where, or when you started. In fact, after you’ve spent more than twenty six hours en route from San Francisco to Kathmandu, even you will barely be able to express usefully to yourself when you started! Normal experience just doesn’t prepare you for dramatic changes in what the clock reads, and what the day is named, especially if you’re wide awake, or maybe just fitfully dozing, and eating the wrong kinds of food at random wrong moments, while these changes happen.

Calendar time durations

Calendar time is measured for some purposes in days, and for other purposes in months.

Years are something of a red herring here, because every year is twelve months. Having said this, there are some anomalous programming situations where defining a duration as a non-integral number of years produces a different effect from defining it as an integral number of years and the corresponding number of non-integral number of months for the remainder. I discuss this in the YSQL section How does YSQL represent an interval value? But I’ll keep well away from that particular landmine in this post. As long as you stick to my advice and use the apparatus that I describe in this post’s section “Avoid the risks of the interval minefield with custom domains” (below), then you’ll never hit this quirk.

Arithmetic that uses calendar time interval values takes account of human conventions. Here are two examples:

  • Birthdays are celebrated on the same date each year (leaving aside, here, the annoyance that people born on 29-Feb suffer). This means that the actual clock time duration between two moments that are one year apart using calendar time semantics is sometimes 365 days and sometimes 366 days—which figures translate to different numbers of clock time units.
  • What does it typically mean to postpone an appointment by one day? Suppose that you live in LA and had arranged to ring a friend in San Francisco at eight on Saturday evening, 13-Mar-2021. (You’re both in the same America/Los_Angeles timezone—and the spring forward moment here is in the small hours of the immediately following Sunday morning.) Then something came up and you had to message your friend to say “Sorry, I have to push out our call by a day. Same time tomorrow. OK?” There’s actually only twenty-three clock duration hours between eight on Saturday evening, 13-Mar-2021 and eight on Sunday evening, 14-Mar-2021 in this timezone—yet, by convention, the duration is nevertheless understood to be one calendar time day so that your friend understands that you mean that the new time for the call is eight on Sunday evening,

The only way to convert between clock time and calendar time is to assert an ad hoc rule like, for example, that one day is twenty-four hours and that one month is thirty days. These rules, of course, embody a lie:

  • There are only twenty-three hours between noon on Saturday 13-Mar-2021 and noon on Sunday 14-Mar-2021 in Los Angeles because this period spans the spring-forward moment; and there are as many as twenty-five hours between noon on Saturday 6-Nov-2021 and noon on Sunday 7-Nov-2021 because this period spans the fall-back moment there. It’s different in other timezones. (This lines up with the popular interpretation that you lose an hour at the start of summer time and get it back at the start of winter time.)
  • As we’ll presently see, PostgreSQL and YSQL say that one month is thirty days. So a year of twelve thirty-day months is 360 days. Yet a calendar year is not this many days. Rather, it’s either 365 days or 366 days.

The only way to accommodate these apparent self-contradictions is to realize that calendar time durations are inexact, but sufficient for the purpose. When Ewan McGregor and his crew set out to ride their motorbikes from London to New York City on 14-Apr-2004, they might’ve said that they expected that it’d take them three-to-four months. Let’s say that they started at noon. You might’ve counted on your fingers “April-to-May, May-to-June, June-to-July, July-to-August”. So you might’ve guessed that they’d finish their expedition some time between the middle of July and the middle of August. The planned route was via Western and Central Europe, Ukraine, Western Russia, Kazakhstan, Mongolia, Siberia and Canada—and anything could happen along the way. So it’d certainly never cross your mind to wonder what time of day they’d arrive. In fact, they arrived on 29 July. So “three-to-four months” was a fair estimate. And that’s as good as it gets, and needs to get, for such estimates.

It’s the same story with people’s ages and birthdays. The Internet abounds with stuff to tell you about the development milestones for newborn babies. You’re told what to expect at three months, at five months, at seven months, and so on. But if your baby is born on, say 12-Jan-2019, then you shouldn’t expect a sudden change from the characteristics of a six-month-old from 12-June through 11-July to the characteristics of a seven-month-old from 12-July through 11-August in her first year. Then, later, she’ll celebrate her twelfth, thirteenth, fourteenth, and so on birthdays in 2031, 2032, 2033, and so on—on the same 12-January date each year—and on each birthday she’s a year older than on the last. Nobody thinks that it’s wrong to talk like this just because there’s 365 days between 12-Jan-2031 and 12-Jan-2032 while there’s 366 days between 12-Jan-2032 and 12-Jan-2033. It’s built-in to the notion of a year that it’s an approximate measure and that birthdays always occur on the same date each year—with the well-known special rules for somebody born on 29-February in a leap year.

Notice, by-the-way, that one light year is spoken of as the distance that light travels in a vacuum in however many seconds there are in a year. In fact, one light year is simply defined to be 9,460,730,472,580,800 meters. (One meter, in turn, is defined to be 1,650,763.73 times the wavelength of the orange-red emission line in the electromagnetic spectrum of the krypton-86 atom in a vacuum.) The speed of light is measured to be 299,792,458 meters per second. Putting the two definitions and the measurement together lets you calculate that one year is 31,557,600 seconds. However, it you take one year to be 365.2425 days on average, and one day to be 24*60*60 seconds, then you get that one year is 31,556,952 seconds—so not quite the figure you get from the light year and the speed of light. This shouldn’t surprise you. Rather, it just reminds you of the arbitrariness, rooted in history, that underlies the choice and definition of units that are used for the conventional measurement of time.

Some basic interval arithmetic SQL examples

I’ll simply show you now, as yet without stepping on any mines, the SQL that corresponds to the scenarios that I just described.

Flying from San Francisco to Kathmandu

The example that I described above used this itinerary:

  • depart SFO at  22:55 on Sunday evening, 21-Nov-2021
  • total duration 26:30 hours
  • arrive KTM at 15:10 on the Tuesday afternoon, 23-Nov-2021

The arrival time is calculated thus:

with c as (
  select
    '2021-11-21 22:55 America/Los_Angeles' ::timestamptz as depart_time,
    '26:30'                                ::interval    as journey_time)
select
  at_timezone('Asia/Kathmandu', depart_time + journey_time)::text as arrive_time
from c;

See the YSQL documentation section Recommended practice for specifying the UTC offset for the definition of the user-defined at_timezone() function—and the rationale for preferring it to the raw at time zone SQL operator. This is the result:

 2021-11-23 15:10:00

It’s just what the advertised schedule said. Notice that America/Los_Angeles observes Daylight Savings Time, that Asia/Kathmandu does not, and that 21-Nov-2021 is during the winter time in San Francisco.

Suppose that this flight leaves SFO at the same local time on a Sunday evening throughout the year and that the journey time is always the same. Look how the arrival time changes when the flight leaves SFO during its summer time:

with c as (
  select
    '2022-04-03 22:55 America/Los_Angeles' ::timestamptz as depart_time,
    '26:30'                                ::interval    as journey_time)
select
  at_timezone('Asia/Kathmandu', depart_time + journey_time)::text as arrive_time
from c;

This is the new result:

 2022-04-05 14:10:00

The flight now arrives at 14:10 local time on the Tuesday afternoon—one hour earlier than when it leaves SFO during its winter time. This outcome is easily explained by this query:

with c as (
  select
    '2022-04-03 22:55 America/Los_Angeles' ::timestamptz as summer_dep_time,
    '2021-11-21 22:55 America/Los_Angeles' ::timestamptz as winter_dep_time)
select
  at_timezone('Asia/Kathmandu', summer_dep_time)::text as summer_dep_time,
  at_timezone('Asia/Kathmandu', winter_dep_time)::text as winter_dep_time
from c;

This is the result:

   summer_dep_time   |   winter_dep_time
---------------------+---------------------
 2022-04-04 11:40:00 | 2021-11-22 12:40:00

It shows that the local departure time from SFO in its summer corresponds to one hour earlier in Kathmandu’s local time than does the departure from SFO in its winter.

Postponing a meeting by one day

This scenario highlights the fact that calendar time implements human convention. Try this:

set timezone = 'America/Los_Angeles';
with c as (
  select '2021-03-13 20:00 America/Los_Angeles'::timestamptz as original_appointment)
select
  original_appointment                         ::text,
  (original_appointment + '1 day'   ::interval)::text as postponed_by_1_day,
  (original_appointment + '24 hour' ::interval)::text as postponed_by_24_hours
from c;

This is the result:

  original_appointment  |   postponed_by_1_day   | postponed_by_24_hours
------------------------+------------------------+------------------------
 2021-03-13 20:00:00-08 | 2021-03-14 20:00:00-07 | 2021-03-14 21:00:00-07

You can see from the fact that the postponement changes the timezone offset from minus eight hours to minus seven hours that it crossed the spring-forward moment. The calendar time result, postponed_by_1_day, meets the conventional expectation. And the clock time result, postponed_by_24_hours, violates it.

Cycling from Los Angeles to San Francisco

Suppose, instead, that a strong cyclist sets out at eight on Saturday evening, 13-Mar-2021 on a non-stop road trip from LA to San Francisco. It’s about 740 km by the shortest bikeable route. World class athletes manage an average speed of about 40 km per hour over the twenty-three grueling days of the Tour de France. So our strong amateur cyclist should be able to manage a bit more than 30 km per hour on an overnight endurance challenge—resulting in an exactly twenty-four hour ride. Now the same query that I just showed you tells you that, to calculate the correct arrival time of nine on the Sunday evening, you need clock time semantics.

The analysis of the two examples, postponing a meeting by one day and cycling from Los Angeles to San Francisco, shows you that you have to know the purpose of your calculation so that you can use an interval value with the right semantics. You can see that one day is not semantically the same as twenty-four hours.

(The example of flying from San Francisco to Kathmandu needs clock time semantics, just like cycling from Los Angeles to San Francisco. They are simply two variants of the one scenario. I described both just to hammer the point home.)

Ewan McGregor’s motorcycle expedition from London to New York

Create this formatting function to make the results easier to read:

drop function if exists fmt(text, timestamptz) cascade;
create function fmt(tz in text,t in timestamptz)
  returns text
  language sql
as $body$
  select to_char(at_timezone(tz, t), 'dd-Mon-yyyy');
$body$;

Calculate the arrival time thus:

with
  c1 as (
    select '2004-04-14 12:00:00 Europe/London'::timestamptz as t0),
  c2 as (
    select
       t0,
      (t0 + '3 month'::interval) + '15 day'::interval  as t1
  from c1)
select
  fmt('Europe/London',    t0) as depart_time,
  fmt('America/New_York', t1) as arrive_time
from c2;

This is the result:

 depart_time | arrive_time
-------------+-------------
 14-Apr-2004 | 29-Jul-2004

I’ll explain presently why I wrote this seemingly unnecessarily verbose expression:

(t0 + '3 month'::interval) + '15 day'::interval

rather than what might seem to be equivalent:

t0 + '3.5 months'::interval

(In this example, both ways of writing it bring the same result.)

A newborn’s milestones—seven-days, fourteen-days, one-month, six-months, one year, and the like

Recall that in Part One of this blog post pair, I recommended that you always use timestamptz to persist moment data in tables. I don’t need to use tables in the code examples here. But you know now why I use timestamptz rather than date to define a so-called date of birth. (These days, birth certificates for US citizens, for example, show both the date and the time of birth. And the timezone is implied by the place of birth.)

I’ll take the example that I sketched already a bit further and say that the baby is born at noon on 12-Feb-2022 in New York and that the family moves to Los Angeles just before her six-month milestone. I’ll use the same formatting function, fmt(text, timestamptz) that I defined for the “motorcycle expedition” example, above. First, make sure that it’s in place:

drop function if exists fmt(text, timestamptz) cascade;
create function fmt(tz in text,t in timestamptz)
  returns text
  language sql
as $body$
  select to_char(at_timezone(tz, t), 'dd-Mon-yyyy');
$body$;

Then try this:

deallocate all;
prepare stmt(text) as
with
  c1 as (
    select
      'America/New_York'    as NY,
      'America/Los_Angeles' as LA),
  c2 as (
    select
      NY,
      LA,
      ($1||' '||NY)::timestamptz as t
    from c1)
select
  fmt(NY,  t                         ) as "born",
  fmt(NY, (t + '1 week'   ::interval)) as "1 week",
  fmt(NY, (t + '2 week'   ::interval)) as "2 week",
  fmt(NY, (t + '1 month'  ::interval)) as "1 month",
  fmt(LA, (t + '3 month'  ::interval)) as "3 month",
  fmt(LA, (t + '6 month'  ::interval)) as "6 month",
  fmt(LA, (t + '12 month' ::interval)) as "12 month",
  fmt(LA, (t + '13 month' ::interval)) as "13 month",
  fmt(LA, (t + '5 year'   ::interval)) as "5 year"
from c2;
\x on
execute stmt('2019-01-12 12:00');
\x off

This is the result:

born     | 12-Jan-2019
1 week   | 19-Jan-2019
2 week   | 26-Jan-2019
1 month  | 12-Feb-2019
3 month  | 12-Apr-2019
6 month  | 12-Jul-2019
12 month | 12-Jan-2020
13 month | 12-Feb-2020
5 year   | 12-Jan-2024

These results agree with intuition. One week is simply a synonym for seven days; and one year is simply a synonym for twelve months. However, one month is not a synonym for thirty days. This is because a week always has the same number of days and a year always has the same number of months. The catch is that one month doesn’t always have the same number of days.

  • When you’re adding days, you count the days forward on your fingers from the start date and you get the finish date. You cross the month boundaries according to how many days the current month has, and you go from the last day of some month to the first day of the next month.
  • When you’re adding months, you count the months forward on your fingers from the start month until the finish month, leaving the day in the month the same for each successive month, no matter how many days each month happens to have.

The rule for adding months is adjusted in a natural way when you start on the thirty-first of some month and you reach a month with fewer than thirty-one days. For such a short month, you simply end up on the last day of that month. (It’s the same if you start on the thirtieth of some month and end up in February. You’ll finish on either the twenty-eighth (in a standard year) or the twenty-ninth (in a leap year). Try this:

\x on
execute stmt('2019-01-31 12:00');
\x off

This is the new result:

born     | 31-Jan-2019
1 week   | 07-Feb-2019
2 week   | 14-Feb-2019
1 month  | 28-Feb-2019
3 month  | 30-Apr-2019
6 month  | 31-Jul-2019
12 month | 31-Jan-2020
13 month | 29-Feb-2020
5 year   | 31-Jan-2024

This rule adjustment leads to an outcome that might have seemed wrong had I not introduced you to it as I just did. Try this:

\x on
with
  c1 as (
    select
      'America/New_York' as NY),
  c2 as (
    select
      NY,
      ('2019-01-27 12:00'||NY) ::timestamptz as t1,
      ('2019-01-28 12:00'||NY) ::timestamptz as t2,
      ('2019-01-29 12:00'||NY) ::timestamptz as t3,
      ('2019-01-30 12:00'||NY) ::timestamptz as t4,
      ('2019-01-31 12:00'||NY) ::timestamptz as t5,
      '1 month'                ::interval    as i
    from c1)
select
  fmt(NY, (t1  + i)) as t1,
  fmt(NY, (t2  + i)) as t2,
  fmt(NY, (t3  + i)) as t3,
  fmt(NY, (t4  + i)) as t4,
  fmt(NY, (t5  + i)) as t5
from c2;
\x off

This is the result:

t1 | 27-Feb-2019
t2 | 28-Feb-2019
t3 | 28-Feb-2019
t4 | 28-Feb-2019
t5 | 28-Feb-2019

So when you go one month forward from each of the different starting dates from 28-Jan through 31-Jan, in a non-leap-year, you always get to the same 28-Feb end date.

What do these interval arithmetic SQL examples tell us?

Without yet considering the internal implementation, and how you might go wrong, I hope that I’ve convinced you that, however it’s done, interval arithmetic must support three kinds of interval values:

  • seconds intervals — This kind of interval value is specified using hours, minutes, and seconds and obeys clock time semantics—in other words it honors what a stopwatch shows when you go from one timestamptz value to another even when you cross a spring-forward or fall-back moment (respectively one hour less, or one hour more, than when you don’t cross such a transition).
  • days intervals — This kind of interval value is specified using days (and, if you like, weeks) and obeys the “day” flavor of calendar time semantics so that adding or subtracting one day to a timestamptz value simply changes the date by one day in the specified direction without changing the time-of-day even when you cross a spring-forward or fall-back moment.
  • months intervals — This kind of interval value is specified using months and years (and, if you like, decades, centuries, and so on) and obeys calendar time semantics so that adding or subtracting one month to a timestamptz value simply changes the month to the next or previous one without changing the day-in-month (subject to the caveat that I already explained about moving from, say, the thirty-first in some month to a month that’s shorter than thirty-one days). As with days interval values, the time-of-day stays unchanged even when you cross a spring-forward or fall-back moment.

The internal representation of an interval value

The PostgreSQL documentation, just under Table 8.17. Interval Input, says that the internal representation of an interval value is a three-field tuple representing months, days, and seconds. This seems to line up nicely with the requirement that I outlined above to support three kinds of interval values: months intervals, days intervals, and seconds intervals. I’ll write this internal value as [mm, dd, ss]. I introduced this notation in the YSQL documentation section How does YSQL represent an interval value? Notice this wording from the PostgreSQL documentation:

This is done because the number of days in a month varies, and a day can have 23 or 25 hours if a Daylight Savings Time adjustment is involved… Because intervals are usually created from constant strings or timestamp subtraction, this storage method works well in most cases, but can cause unexpected results.

“Unexpected results”? In other words, “You’re in a minefield. Take care where you step.” This might have been expressed more explicitly, thus:

  • The rules that govern adding or subtracting an interval value to a timestamptz value are different for each of the three fields of the [mm, dd, ss] internal representation of an interval value. They differ to reflect the critical difference between clock time durations and calendar time durations. The mm and dd fields are governed by, respectively, months calendar time semantics and days calendar time semantics. And the ss field is governed by seconds clock time semantics. You can choose the semantics that you want by specifying the interval value appropriately.
  • However, the rules that govern producing an interval value by subtracting one timestamptz value from another give you no discretion. In general, you get a [mm, dd, ss] value where both the dd and ss components are non-zero. (The mm component is always zero.) You should not add or subtract such an interval value to a timestamptz value because it would mix days and seconds semantics in a meaningless way.

(The PostgreSQL documentation doesn’t define the rules for the outcomes when you add an interval value where both the dd and ss components are non-zero. This, presumably, is what it means by “unpredictable”.)

How to access the individual fields of a [mm, dd, ss] tuple

There is no native functionality that lets you access the individual fields of a [mm, dd, ss] tuple directly. But it’s easy to write such a function, interval_mm_dd_ss(), using the extract() built-in function. I show you how to do this here in the YSQL documentation section User-defined interval utility functions. This is its signature:

function interval_mm_dd_ss(interval) returns interval_mm_dd_ss_t

Of course, interval_mm_dd_ss_t is a user-defined type with the required three fields. If you followed my recommendation and downloaded and installed the companion code kit for the YSQL date-time documentation, then you’ll find that the interval_mm_dd_ss() function and the type that it depends on are already available for use. Create and execute the months_days_seconds_intervals_comparison() table function to show the effects of, the internal representations of, and the comparisons between these three nominally equal one month interval values:

  • make_interval(months=>1)
  • make_interval(days=>30)
  • make_interval(hours=>30*24)
drop function if exists months_days_seconds_intervals_comparison() cascade;
create function months_days_seconds_intervals_comparison()
  returns table(z text)
  language plpgsql
as $body$
declare
  t0 constant timestamptz not null := '2021-03-13 20:00 America/Los_Angeles';
  i_months      constant interval not null := make_interval(months =>     1);
  i_days        constant interval not null := make_interval(days   =>    30);
  i_secs        constant interval not null := make_interval(hours  => 30*24);
  i_months_rep  constant interval_mm_dd_ss_t not null := interval_mm_dd_ss(i_months);
  i_days_rep    constant interval_mm_dd_ss_t not null := interval_mm_dd_ss(i_days);
  i_secs_rep    constant interval_mm_dd_ss_t not null := interval_mm_dd_ss(i_secs);
begin
  z := 't0:            '||(t0)                                   ::text; return next;
  z := 't0 + i_months: '||(t0 + i_months)                        ::text; return next;
  z := 't0 + i_days:   '||(t0 + i_days)                          ::text; return next;
  z := 't0 + i_secs:   '||(t0 + i_secs)                          ::text; return next;
  z := '';                                                               return next;
  z := 'i_months: '||rpad(i_months ::text, 10)||i_months_rep     ::text; return next;
  z := 'i_days:   '||rpad(i_days   ::text, 10)||i_days_rep       ::text; return next;
  z := 'i_secs:   '||rpad(i_secs   ::text, 10)||i_secs_rep       ::text; return next;
  z := '';                                                               return next;
  z := 'i_months = i_days: '||
       (i_months = i_days)                                       ::text; return next;
  z := 'i_days   = i_secs: '||
       (i_days   = i_secs)                                       ::text; return next;
  z := 'i_months = i_secs: '||
       (i_months = i_secs)                                       ::text; return next;
  z := '';                                                               return next;
  z := 'i_months_rep = i_days_rep: '||
       (i_months_rep = i_days_rep)                               ::text; return next;
  z := 'i_days_rep   = i_secs_rep: '||
       (i_days_rep   = i_secs_rep)                               ::text; return next;
  z := 'i_months_rep = i_secs_rep: '||
       (i_months_rep = i_secs_rep)                               ::text; return next;
end;
$body$;
     
set timezone = 'America/Los_Angeles';
select z from months_days_seconds_intervals_comparison();

This is the result:

t0:            2021-03-13 20:00:00-08
t0 + i_months: 2021-04-13 20:00:00-07
t0 + i_days:   2021-04-12 20:00:00-07
t0 + i_secs:   2021-04-12 21:00:00-07
     
i_months: 1 mon     (1,0,0)
i_days:   30 days   (0,30,0)
i_secs:   720:00:00 (0,0,2592000)
     
i_months = i_days: true
i_days   = i_secs: true
i_months = i_secs: true
     
i_months_rep = i_days_rep: false
i_days_rep   = i_secs_rep: false
i_months_rep = i_secs_rep: false

The results for “t0 + i_months”, “t0 + i_days”, and “t0 + i_secs” show clearly how the semantics of these three different kinds of interval values differ—exactly in line with the requirements set out in the section “What do these interval arithmetic SQL examples tell us?” above.

The ::text typecasts of the internal representations of i_months, i_days, and i_secs show clearly that each has a non-zero field value for just, respectively, mm, dd, and ss.

The results for the comparisons “i_months = i_days”, “i_days = i_secs”, and “i_months = i_secs” might surprise you. I was certainly surprised when I first did these kinds of comparisons. How can they all compare as equal when each produces a different result when added to a timestamptz value? The only way to make sense of this is to understand that the overload of the “=” operator for a pair of interval values is actually an approximate equality that uses the rule of thumb that there are always twenty-four hours in one day (and that Daylight Savings Time transitions are of no consequence) and that there are always thirty days in one month—notwithstanding the fact that this is simply not the case.

The results for the comparisons of the internal representations of the three approximately equal interval values of i_months, i_days, and i_secs (each pairwise comparison shows that the internal representations are different) are consistent with the different semantic effect that each different value has. The YSQL documentation section User-defined interval utility functions shows you, here, how to create the user-defined “strict equals” interval-interval “==” operator. If you downloaded and installed the companion code kit for the YSQL date-time documentation, then you’ll find that the “==” operator is already available for use. Use it like this:

\x on
with c as (
  select
    make_interval(months =>     1) as i_months,
    make_interval(days   =>    30) as i_days,
    make_interval(hours  => 30*24) as i_secs)
select
  (i_months =  i_days)::text as "i_months =  i_days",
  (i_months == i_days)::text as "i_months == i_days",
  ''                         as " ",
  (i_days   =  i_secs)::text as "i_days   =  i_secs",
  (i_days   == i_secs)::text as "i_days   == i_secs",
  ''                         as " ",
  (i_months =  i_secs)::text as "i_months =  i_secs",
  (i_months == i_secs)::text as "i_months == i_secs"
from c;
\x off

This is the result:

i_months =  i_days | true
i_months == i_days | false
                   | 
i_days   =  i_secs | true
i_days   == i_secs | false
                   | 
i_months =  i_secs | true
i_months == i_secs | false

It agrees with the result given by the months_days_seconds_intervals_comparison() table function above.

Defining the term hybrid interval value

I define the term hybrid interval value in the YSQL documentation in terms of the [mm, dd, ss] internal representation. A hybrid interval value has a non-zero value for more than one of the fields of its internal representation. It’s a useful term because it supports the discussion that lets you understand what lies behind that quote I reproduced, above, from the PostgreSQL documentation that I’ll paraphrase here thus:

  • Interval values produced by subtracting one timestamptz value from another can cause unexpected results when you add or subtract them to timestamptz values because, in general, they are hybrid and therefore mix the rules of days calendar time durations and seconds clock time durations in an undefined way.

The PostgreSQL documentation doesn’t use the term hybrid interval value—and defines no other term for the notion. Its explanations are therefore correspondingly impoverished.

Moment arithmetic using seconds intervals, days intervals, and months intervals

I’ll first show you what I consider to be a shocking result: timestamptz-interval addition and subtraction, using the native operators, violate the normal rules of algebra.

Then, in the following section, I’ll show you how you can conquer this mess with three kinds of user-defined interval domains (months, days, and seconds) each with its own user-defined implementations of interval arithmetic.

Interval addition and subtraction, using the native operators, violate the normal rules of algebra

Try this:

drop function if exists normal_rules_of_algebra_violation() cascade;
create function normal_rules_of_algebra_violation()
  returns table(z text)
  language plpgsql
as $body$
declare
  LA  constant text not null := 'America/Los_Angeles';
     
  t1  constant timestamptz not null := '2021-03-01 12:00 '||LA;
  i1  constant interval    not null := '20 days';
  t2  constant timestamptz not null := t1 + i1;
  i2  constant interval    not null := t2 - t1;
  t3  constant timestamptz not null := t2 - i2;
     
  t1_ constant text not null := rpad('t1:',                           30);
  i1_ constant text not null := rpad('i1:',                           30);
  t2_ constant text not null := rpad('t1 + i1:',                      30);
  i2_ constant text not null := rpad('(t1 + i1) - t1:',               30);
  t3_ constant text not null := rpad('(t1 + i1) - ((t1 + i1) - t1):', 30);
     
  i1_rep constant interval_mm_dd_ss_t not null := interval_mm_dd_ss(i1);
  i2_rep constant interval_mm_dd_ss_t not null := interval_mm_dd_ss(i2);
begin
  z := t1_||t1::text;                          return next;
  z := i1_||i1::text||' = '||i1_rep::text;     return next;
  z := '';                                     return next;
  z := t2_||t2::text;                          return next;
  z := i2_||i2::text||' = '||i2_rep::text;     return next;
  z := '';                                     return next;
  z := t3_||t3::text;                          return next;
end;
$body$;
     
set timezone = 'America/Los_Angeles';
select z from normal_rules_of_algebra_violation();

This is the result:

t1:                           2021-03-01 12:00:00-08
i1:                           20 days = (0,20,0)
     
t1 + i1:                      2021-03-21 12:00:00-07
(t1 + i1) - t1:               19 days 23:00:00 = (0,19,82800)
     
(t1 + i1) - ((t1 + i1) - t1): 2021-03-01 13:00:00-08

The normal rules of algebra say this:

  x ◄— a + b implies a ◄— x - b

But they don’t hold here. Here’s why:

  • There are different kinds of interval values with different rules for moment-interval arithmetic.
  • You can create the kind of interval value that you want by saying, for example, ‘1 day’::interval or ‘24 hours’::interval.
  • But, using only the native functionality, you cannot ask for the kind of interval value that you want as the result of subtracting one moment value from another. Subtracting one timestamptz value from another gives you, in general, a hybrid days-seconds interval value.

I consider this to be a shocking mistake by the original PostgreSQL designers and implementers. Fortunately, the user-defined domain feature (see below) comes to the rescue and gives you full control.

Avoid hybrid interval values—they bring only confusion

Look back at the results produced by executing the table function months_days_seconds_intervals_comparison(), and it particular these lines:

t0:            2021-03-13 20:00:00-08
t0 + i_months: 2021-04-13 20:00:00-07
t0 + i_days:   2021-04-12 20:00:00-07
t0 + i_secs:   2021-04-12 21:00:00-07

The results also show that i_months has only its mm internal field non-zero, i_days has only its dd internal field non-zero, and i_secs has only its ss internal field non-zero. In other words, these interval values are, respectively, pure months, pure days, and pure seconds; none is hybrid. And, as I’ve explained, each kind of interval value has different semantics on addition/subtraction to/from a timestamptz value—just as is required.

What might it mean to add a hybrid interval value to a timestamptz value? Try this:

\x on
set timezone = 'America/Los_Angeles';
with c as (
  select
    '2021-03-14 01:30 America/Los_Angeles'::timestamptz as t0,
    make_interval(days=>1)                              as one_day,
    make_interval(hours=>1)                             as one_hour)
select
  t0::text,
  ((t0 + one_day) + one_hour)::text as "(t0 + one_day) + one_hour",
  ((t0 + one_hour) + one_day)::text as "(t0 + one_hour) + one_day"
from c;
\x off

This is the result:

t0                        | 2021-03-14 01:30:00-08
(t0 + one_day) + one_hour | 2021-03-15 02:30:00-07
(t0 + one_hour) + one_day | 2021-03-15 03:30:00-07

So the order in which each kind of interval value is added to the starting timestamptz value affects the result. Notice that the starting value is half-an-hour before the spring-forward moment.

When one day is added first, you advance to the same time of day on the next date: 2021-03-15 01:30:00-07, even though the UTC offset has changed. Now you’re safely past the spring-forward moment—so that adding one hour simply advances you to 2021-03-15 02:30:00-07.

When one hour is added first, you advance to what the self-adjusting clock on your smartphone will now show after a stopwatch has shown that one hour has elapsed. Because you crossed the spring-forward moment, the clock now reads 2021-03-14 03:30:00-07. Now, too, you’re safely past the spring-forward moment—so that adding one day, again, simply advances you to the same time on the next day, 2021-03-15 03:30:00-07.

This test raises the obvious question: what does it mean to add the hybrid interval value make_interval(days=>1, hours=>1) to a timestamptz value? Feel free to test this for yourself. But whatever you see doesn’t imply that this is the fixed rule of precedence. The PostgreSQL documentation simply doesn’t define the outcome for such hybrid interval arithmetic—and if it did, it wouldn’t necessarily be what you want. If you really think that you can express requirements that lead to one, or the other, of the two possible outcomes that I just demonstrated, then you should simply write the explicit order that you want—just as I did.

Notice that hybrid interval values can arise inadvertently. Try this:

drop function if exists fmt(interval) cascade;
create function fmt(i in interval)
  returns text
  language plpgsql
as $body$
declare
  i_rep constant interval_mm_dd_ss_t not null := interval_mm_dd_ss(i);
begin
  return
    '['||
    to_char(i_rep.mm,    '99')        ||', '||
    to_char(i_rep.dd,    '99')        ||', '||
    to_char(i_rep.ss, '99999.999999') ||']';
end;
$body$;
   
\x on
with c as (
  select
    '1.23456 months'::interval       as i1,
    make_interval(months=>19)/1.2345 as i2)
select
  rpad(i1::text, 40)||fmt(i1) as i1,
  rpad(i2::text, 40)||fmt(i2) as i2
from c;
\x off

This is the result:

i1 | 1 mon 7 days 00:52:59.52                [  1,   7,   3179.520000]
i2 | 1 year 3 mons 11 days 17:24:34.128      [ 15,  11,  62674.128000]

The PostgreSQL documentation doesn’t describe the algorithm that derives an interval value from an interval literal. I guessed at what it might be and implemented my guess in PL/pgSQL. See the YSQL documentation section Modeling the internal representation and comparing the model with the actual implementation. Not surprisingly, my first guess disagreed with the actual outcome in some edge-case tests. But, after a few rounds of thought and experimentation, I arrived at a program that agreed with the actual outcomes across a wide range of tests. Of course, this doesn’t mean that my program models the actual PostgreSQL implementation exactly. But it certainly gives a reliable sense of the complexity of the algorithm and of the arbitrariness of some of its features. Anyway, it’s crystal clear that careless definition of an interval literal can produce a hybrid interval value.

The exercise convinced me beyond doubt that nobody could write an interval literal that has non-integral values for any of its fields (apart from the seconds) and know what they’re going to get. That conclusion was one of my reasons to implement the custom interval domains, with safe value constructor functions, that I describe below.

Similarly, the PostgreSQL documentation doesn’t describe the algorithms for multiplying and dividing an interval value by a real number. I guessed at, and tested, model PL/pgSQL implementations here too. See the YSQL documentation section Interval arithmetic and its children. It’s crystal clear here too that interval multiplication and division can produce a hybrid interval value. This gave me another reason to implement the custom interval domains with safe functions to multiply and divide their values.

Avoid the ‘interval month’, ‘interval day’, ‘interval second’ declarations and their cousins—they don’t do what you want

See the YSQL documentation section Declaring intervals. It examines the semantics of the different spellings of interval declarations—in other words, how these different spellings differently constrain the interval values that can be represented. Presumably, the design intention behind the syntax, and the semantics that it brings, was to differentiate between calendar months semantics, calendar days semantics, and clock seconds semantics and to prohibit hybrid interval values. But the mechanisms simply don’t achieve this. And what they do achieve brings no useful value. This is why I implemented the domains that the next section explains. Their definitions use only “bare” interval declarations and shield you from writing interval declarations explicitly.

Avoid the risks of the interval minefield with custom domains

The YSQL documentation section Custom domain types for specializing the native interval functionality tells you how to create these domains:

  • interval_months_t
  • interval_days_t
  • interval_seconds_t

If you want a detailed understanding of the definitions of these domains and to see the tests and a range of code examples, then you should read the YSQL documentation carefully. This blog post section gives a high-level overview and just a few examples.

Brief overview of the three domains

User-defined PL/pgSQL functions implement appropriate specific functionality for each domain. If you downloaded and installed the companion code kit for the YSQL date-time documentation, then you’ll find that this apparatus is already available for use. Briefly, each domain constrains its values so that only, respectively, the mm field, the dd field, or the ss field of the [mm, dd, ss] internal representation tuple of an interval value can be non-zero. Yugabyte recommends that you never use native interval values but, rather, use interval_months_t values, interval_days_t values, or interval_seconds_t values. This way, you will never see hybrid interval values and you’ll avoid the risks that these bring.

Each of the three domains, X, has a corresponding set of functions thus:

  • function interval_X_ok(i in interval)
    returns boolean

This function checks that only the appropriate one field of the [mm, dd, ss] tuple is non-zero and that its value is within the limits that the YSQL documentation section Interval value limits specifies for the field. It is the basis of a check constraint that the create domain statement defines.

  • function interval_X (<appropriate parameterizing values>)
    returns interval_X_t

This function creates a value of the chosen domain using actual arguments to specify, respectively: years and months; days; or hours, minutes and seconds. This design therefore prevents you even from attempting to define a hybrid interval value that would, without this prevention, go on to cause a constraint violation error.

  • function interval_X (t_finish in timestamptz, t_start in timestamptz)
    returns interval_X_t

This function creates a value of the chosen domain by subtracting the second actual timestamptz argument from the first. Each uses a rule that’s appropriate to the semantics of the kind of interval that the domain models. The months domain’s function considers only the year and month of the input arguments. The days domain’s function considers only the year, month, and day (i.e. the date typecast) of the input arguments. And the seconds domain’s function considers the exact values of each argument by using extract(epoch from …) on each input and computing the difference as the YSQL documentation section Demonstrating the rule for displaying a timestamptz value in a timezone-insensitive way shows. Contributors on the pgsql-general email list have written that they use this method when they need reliable clock time semantics for the difference between two timestamptz values. See, for example, this post.

  • function interval_X (i in interval_X_t, f in double precision)
    returns interval_X_t

This function creates a value of the chosen domain by multiplying the specified value of that domain by the specified real number. It uses an appropriate rounding rule for each of the months and days domains to avoid a hybrid result. Because there is no such hybrid risk when you start with a pure seconds interval value, it calculates the result here exactly.

Demonstrating the functionality of the three domains

This code is copied from the YSQL documentation section Basic demonstration using one month expressed as a months interval, a days, interval, and a seconds interval>. It models the same scenario as this blog post’s section “How to access the individual fields of a [mm, dd, ss] tuple” used. There, the native functionality was used. But here, the custom interval domains are used. Moreover, the demonstration also shows that, when you use these domains, the normal rules of algebra now hold. These rules say this:

  x ◄— a + b implies a ◄— x - b

Create the table function test_results() thus:

drop function if exists test_results() cascade;
create function test_results()
  returns table(z text)
  language plpgsql
as $body$
declare
  t0                    constant timestamptz not null :=
                          '2021-03-13 20:00 America/Los_Angeles';
  i_months              constant interval_months_t not null :=
                          interval_months(months=>1);
  i_days                constant interval_days_t not null :=
                          interval_days(days=>30);
  i_seconds             constant interval_seconds_t not null :=
                          interval_seconds(hours=>30*24);
  t0_plus_i_months      constant timestamptz         not null := t0 + i_months;
  t0_plus_i_days        constant timestamptz         not null := t0 + i_days;
  t0_plus_i_seconds     constant timestamptz         not null := t0 + i_seconds;
  calculated_i_months   constant interval_months_t   not null :=
                          interval_months (t0_plus_i_months,  t0);
  calculated_i_days     constant interval_days_t     not null :=
                          interval_days   (t0_plus_i_days,    t0);
  calculated_i_seconds  constant interval_seconds_t  not null :=
                          interval_seconds(t0_plus_i_seconds, t0);
begin
  assert calculated_i_months = i_months,
         'calculated_i_months <> i_months';
     
  assert calculated_i_days = i_days,
         'calculated_i_days <> i_days';
     
  assert calculated_i_seconds = i_seconds,
         'calculated_i_seconds <> i_seconds';
     
  z := 't0'||rpad(' ', 40)||t0::text;
        return next;
     
  z := 'i_months:  '||rpad(interval_mm_dd_ss(i_months)::text,  15)||
       't0 + i_months:  '||t0_plus_i_months::text;
        return next;
     
  z := 'i_days:    '||rpad(interval_mm_dd_ss(i_days)::text,    15)||
       't0 + i_days:    '||t0_plus_i_days::text;
       return next;
     
  z := 'i_seconds: '||rpad(interval_mm_dd_ss(i_seconds)::text, 15)||
       't0 + i_seconds: '||t0_plus_i_seconds::text;
       return next;
end;
$body$;

Execute it using a timezone where the interval values cross the spring-forward moment:

set timezone = 'America/Los_Angeles';
select z from test_results();

This is the result:

t0                                        2021-03-13 20:00:00-08
i_months:  (1,0,0)        t0 + i_months:  2021-04-13 20:00:00-07
i_days:    (0,30,0)       t0 + i_days:    2021-04-12 20:00:00-07
i_seconds: (0,0,2592000)  t0 + i_seconds: 2021-04-12 21:00:00-07

Each test result is different from the other two and is consistent, respectively, with the semantic definitions of months calendar time durations, days calendar time durations, and seconds clock time durations:

  • The test that uses the months domain advances the month by one while keeping the day number the same, even though it starts from a date in March which has thirty-one days. And it keeps the local time the same even though the timezone offset has sprung forward from minus eight hours to minus seven hours.
  • The test that uses the days domain advances the day by thirty days. Because it starts from the thirteenth of March, which has thirty-one days, it finishes on the twelfth of April. It keeps the local time the same even though the timezone offset has sprung forward from minus eight hours to minus seven hours.
  • The test that uses the seconds domain advances the day by thirty days to finish on the twelfth of April. It started at 20:00 local time. But because it has crossed the spring forward moment, it finishes at 21:00 local time.

Each of these results is exactly what you want (and the other two are exactly what you don’t want) according to whether you want months calendar time semantics, days calendar time semantics, or seconds clock time semantics. Critically, the assert statements show that the normal rules of arithmetic do hold now. You cannot get these results without using the interval domains whose implementations the YSQL documentation shows unless you program the logic that they implement explicitly, time and again, and case by case. In summary:

  • The custom domains implement sensible rules for deriving the desired kind of interval value by subtracting one timestamptz value from another that guarantee that the result is never a hybrid value.
  • The native interval functionality implements crazy rules for subtracting one timestamptz value from another that, in general, bring a hybrid days-seconds value and offer no possibility of getting a months value—hybrid or otherwise.

The PostrgeSQL documentation characterizes the native interval functionality by saying that it brings unpredictable (and therefore, crazy) results. In contrast, the custom interval domains functionality predictably brings sensible results.

However, a caveat about fidelity to the normal rules of algebra must be made.

  • If you first add an interval domain value to a timestamptz value, and then subtract the start timestamptz value from the finish timestamptz value, then you do recover the interval domain value that you added.
  • But if you reverse that order and start by subtracting the timestamptz value t1 from the timestamptz value t2 to get the interval domain value i_mm, i_dd, or i_ss, then t1 + i_mm and t1 + i_dd will not, in general, get you back to t2. This is because the subtraction algorithms to get i_mm or to get i_dd use each their own rounding scheme. However, because the subtraction to get i_ss doesn’t do any rounding, you are guaranteed to get back to t2 here.

The possibility that the second bullet explains is no more remarkable that what this little test shows:

drop function if exists f() cascade;
create function f()
  returns table(z text)
  language plpgsql
as $body$
declare
  t2 constant numeric not null := 9.6;
  t1 constant numeric not null := 4;
  i  constant numeric not null := round(t2 - t1);
  t3 constant numeric not null := t1 + i;
begin
  z := 't2:                           '||t2        ::text;   return next;
  z := 't1:                           '||t1        ::text;   return next;
  z := 'round(t2 - t1):               '||i         ::text;   return next;
  z := 't1 + i:                       '||t3        ::text;   return next;
  z := 'Get back to where to started? '||(t3 = t2) ::text;   return next;
end;
$body$;
select z from f();

This is the result:

t2:                           9.6
t1:                           4
round(t2 - t1):               6
t1 + i:                       10
Get back to where to started? false

Now probe the maximum and minimum usable timestamptz values. First use the native functionality “as is”:

select (
  '294276-12-01 00:00:00 AD UTC'::timestamptz -
    '4713-01-01 00:00:00 BC UTC'::timestamptz)::text;

This is the result:

 -104300523 days -08:01:49.551616

Not only is this a hybrid interval value; it’s also self-evidently wrong. You shouldn’t get a negative interval value when you select an earlier timestamptz value from a later one. This is discussed in the YSQL documentation subsection Practical limit for the ss field — ±7,730,941,132,799.

Now repeat the test using the months and days domains:

with c as (
  select
    '294276-12-01 00:00:00 AD UTC'::timestamptz as t_max,
      '4713-01-01 00:00:00 BC UTC'::timestamptz as t_min)
select
  interval_months(t_max, t_min)::text as "months interval",
  interval_days  (t_max, t_min)::text as "days interval"
from c;

This is the result:

   months interval    | days interval
----------------------+----------------
 298988 years 11 mons | 109203459 days

Notice that (298988*12*30 + 11*30) is 107636010. This uses the rule of thumb that one month is always thirty days. So it gives an answer that’s bound to be smaller than the real number of days between the two timestamptz values. The ratio of 107636010 to 109203459 is about 98.5%. The better way to compare the results is to understand that nobody considers months for a duration that’s about 300 thousand years, and to use the Gregorian average year length of 365.2425 days. (This takes account both of ordinary four-yearly leap years and those that might, or might not, occur every one hundred years.) So the better comparison is between (298989*365.2425) and 109203459. These two values are equal to within about three parts in ten million.

Finally, try to evaluate the difference between the two timestamptz values as a seconds interval:

with c as (
  select
    '294276-12-01 00:00:00 AD UTC'::timestamptz as t_max,
      '4713-01-01 00:00:00 BC UTC'::timestamptz as t_min)
select
  interval_seconds(t_max, t_min)
from c;

It causes this error:

ERROR:  23514: value for domain interval_seconds_t violates check constraint "interval_seconds_ok".
HINT:  Bad ss: 9435178857600. Must be in [-7730941132799, 7730941132799].

This is a far, far better outcome than a silent wrong result! It reflects the practical limit for the ss field that the YSQL documentation section referred to above explains.

Add the biggest legal seconds interval domain to the earliest practical timestamptz value:

set timezone = 'UTC';
select
  '4713-01-01 00:00:00 BC UTC'::timestamptz +
  interval_seconds(secs=>7730941132799);

This is the result:

240271-10-10 07:59:58.999552+00

Now subtract the starting timestamptz value from this resulting timestamptz value:

with c as (
  select
    interval_seconds(
      '240271-10-10 07:59:58.999552 AD UTC'::timestamptz,
        '4713-01-01 00:00:00        BC UTC'::timestamptz) as i_ss)
select
  interval_mm_dd_ss(i_ss)::text as "interval_mm_dd_ss(i_ss)",
  i_ss::text                    as "i_ss"
from c;

This is the result:

 interval_mm_dd_ss(i_ss) |          i_ss
-------------------------+-------------------------
 (0,0,7730941132799)     | 2147483647:59:58.999552

It all agrees nicely.

Conclusion

In this post, I’ve claimed, and backed this up with convincing examples, that PostgreSQL’s interval apparatus, inherited by YSQL, is surprisingly large and complex. This is partly a direct consequence of inescapable facts of the history of human convention. But it’s also due to some unfortunate design choices, made by PostgreSQL implementers so long ago that their quirks will never now be changed: they decided to allow hybrid interval values and to let the results that you get when you use these simply emerge from the implementation. The size and complexity, and those unfortunate design choices, mean that you can easily go wrong.

I’ve shown you here that hope is not lost. You can avoid going wrong by realizing that for new work you need only a small subset of the interval apparatus together with some user-defined utilities that enforce safety.

I’ll finish by reminding you that YugabyteDB’s YSQL subsystem uses PostgreSQL’s SQL processing C code as is. I’ve run all the code examples in this post, and all the code examples in the date-time section of the YSQL documentation, in both environments. All the results from the code examples are the same in both YugabyteDB and PostgreSQL. I hope, therefore, that users of PostgreSQL will find this post interesting and will come to value the documentation that I wrote as a useful resource to complement the PostgreSQL documentation.

If you haven’t already, take YugabyteDB for a spin by downloading the latest version of the open source. And if you have any questions, please don’t hesitate to ask them in the YugabyteDB community Slack channel.

Related Posts

Technical Product Manager