{question}
How are time and date data types (like TIMESTAMP, DATETIME) within SingleStore affected by the timezone setting on the Linux hosts within my cluster?
{question}
{answer}
In SingleStore, you can set the timezone in two different ways:
- From the host level (default). The timezone of the Linux host which holds the node will influence the timezone of those nodes. This can be changed by changing the timezone on the host.
- From the cluster level, a timezone is set across the cluster irrespective of the setting on the hosts. This can be changed by altering the value of a SingleStore engine variable.
How does this affect your data? Well, let's first take a look at a cluster where the nodes are set to PST on the host level, and default_time_zone is still the default setting of SYSTEM (which will use the system timezone on node launch).
The table in the following example contains two columns that correspond to the TIMESTAMP data type and the DATETIME data type:
singlestore> select * from tztest order by id;
+----+---------------------+---------------------+
| id | timeStamp | dateTime |
+----+---------------------+---------------------+
| 1 | 2021-11-10 16:57:55 | 2021-11-10 16:57:55 |
| 2 | 2021-11-10 16:57:55 | 2021-11-10 16:57:55 |
| 3 | 2021-11-10 16:57:55 | 2021-11-10 16:57:55 |
| 4 | 2021-11-10 16:57:55 | 2021-11-10 16:57:55 |
| 5 | 2021-11-10 16:57:55 | 2021-11-10 16:57:55 |
| 6 | 2021-11-10 16:57:55 | 2021-11-10 16:57:55 |
+----+---------------------+---------------------+
6 rows in set (0.01 sec)
Now we'll change the system timezone to UTC on just one of the hosts and restart all relevant nodes on that host. Then, when we query the table again, we can see how this changed the way the results are output:
singlestore> select * from tztest order by id;
+----+---------------------+---------------------+
| id | timeStamp | dateTime |
+----+---------------------+---------------------+
| 1 | 2021-11-10 16:57:55 | 2021-11-10 16:57:55 |
| 2 | 2021-11-11 00:57:55 | 2021-11-10 16:57:55 |
| 3 | 2021-11-11 00:57:55 | 2021-11-10 16:57:55 |
| 4 | 2021-11-10 16:57:55 | 2021-11-10 16:57:55 |
| 5 | 2021-11-10 16:57:55 | 2021-11-10 16:57:55 |
| 6 | 2021-11-10 16:57:55 | 2021-11-10 16:57:55 |
+----+---------------------+---------------------+
6 rows in set (0.03 sec)
The data held on the host, which had the timezone change, now reports a different value for timeStamp by 8 hours. This makes sense as we changed from PST to UTC on that host. Therefore, the timeStamp is held as UTC and displayed as the timezone SingleStore is set to.
So to have consistency in TIMESTAMPS, we'll need to have a consistent timezone set across ALL of the nodes in the cluster. As there are two ways to do this, you can either:
- RECOMMENDED: Update the timezone on all hosts (from the Linux/host level) to be consistent and reboot the nodes on those hosts.
- Update the timezone as the cluster sees by updating the default_time_zone on all nodes.
If we connect to the other host in this cluster and update its timezone to UTC, we can achieve conformity in our results. However, the values are still different from our original select before changing any timezone settings:
singlestore> select * from tztest order by id;
+----+---------------------+---------------------+
| id | timeStamp | dateTime |
+----+---------------------+---------------------+
| 1 | 2021-11-11 00:57:55 | 2021-11-10 16:57:55 |
| 2 | 2021-11-11 00:57:55 | 2021-11-10 16:57:55 |
| 3 | 2021-11-11 00:57:55 | 2021-11-10 16:57:55 |
| 4 | 2021-11-11 00:57:55 | 2021-11-10 16:57:55 |
| 5 | 2021-11-11 00:57:55 | 2021-11-10 16:57:55 |
| 6 | 2021-11-11 00:57:55 | 2021-11-10 16:57:55 |
+----+---------------------+---------------------+
6 rows in set (0.03 sec)
This is because the DATETIME datatype is not affected by the UTC setting on the host. Therefore, whatever it is set to when it is inputted into the database will remain regardless of the timezone on the host that holds that data.
If we add new entries to this table now that all the hosts are on UTC, we can see that the new DateTime entries match the host timezone:
singlestore> select * from tztest order by id;
+----+---------------------+---------------------+
| id | timeStamp | dateTime |
+----+---------------------+---------------------+
| 1 | 2021-11-11 00:57:55 | 2021-11-10 16:57:55 |
| 2 | 2021-11-11 00:57:55 | 2021-11-10 16:57:55 |
| 3 | 2021-11-11 00:57:55 | 2021-11-10 16:57:55 |
| 4 | 2021-11-11 00:57:55 | 2021-11-10 16:57:55 |
| 5 | 2021-11-11 00:57:55 | 2021-11-10 16:57:55 |
| 6 | 2021-11-11 00:57:55 | 2021-11-10 16:57:55 |
| 7 | 2021-11-11 01:02:21 | 2021-11-11 01:02:21 |
| 8 | 2021-11-11 01:02:21 | 2021-11-11 01:02:21 |
| 9 | 2021-11-11 01:02:21 | 2021-11-11 01:02:21 |
+----+---------------------+---------------------+
9 rows in set (0.00 sec)
It's for this reason that we recommend hosts be on UTC. That way, no matter what information you have about the hosts timezone setting you know the data will always be UTC 0 and can be adapted to any timezone from there.
{answer}