“Eppur si muove!”* – Dealing with Timezones in Python

Because our world is not a flat disc but a rotating geoid, and our solar system has only one sun, different parts of the world have different times of day at precisely the same moment. Everybody learns that in school these days and is well aware of the effects on human life (“Call your aunt overseas and she will pick up at an odd time”, jet lag etc.). Unfortunately, the whole timezone thing is only partially based on constraints our world gave us, and in computing we have to deal with these oddities as well.

* “And yet it moves” is what Galileo Galilei is said to have uttered upon leaving the courtyard after being forced to recant his belief that the Earth revolves around the Sun. Which unfortunately is the case and gives us these wonderful timezone problems.

What does this article have to do with Galileo? Not really much, I am afraid, because even if the Earth were at the center of the universe you would still have timezones. Consider the title a mistake on my part which I cannot correct now, can I :-)

What's a Timezone?

What's your timezone? If you respond with “UTC+X”, that will be correct for this very moment but not necessarily true over time. If you look at the timezone info database you will find that Berlin and Vienna, even though they are both in “UTC+1”, have different timezones (Europe/Berlin vs Europe/Vienna). Why is that? The reasons are differences in daylight saving time and historical dates. Even if those two countries and cities nowadays have the same DST configuration, a hundred years ago that was not the case. Both Austria and Germany, for instance, went without DST for periods of time. Austria stopped in 1920, Germany did in 1918. During WWII both countries unsurprisingly had the same DST configuration, but afterwards there were a few unsynchronized years again. Germany abolished DST in 1949 and reintroduced it in 1979; Austria abolished it in 1948 and reintroduced it in 1980. What's worse is that they did not even select the same date for the switch. And this pattern is quite common all around the world.

For computing, DST is a huge problem. The reason is that we usually assume that time advances monotonically. With daylight saving time, during the switch-over each year we either get an hour twice or we skip an entire hour. One result is log entries that appear out of order if you log in local time. To quote the pytz documentation: “For example, 1:30am on 27th Oct 2002 happened twice in the US/Eastern timezone when the clocks where put back at the end of Daylight Savings Time, similarly, 2:30am on 7th April 2002 never happened at all in the US/Eastern timezone, as the clocks where put forward at 2:00am skipping the entire hour”

But timezones have more than just DST settings. Some countries have switched their means of measuring time altogether, in some cases even without entering or leaving DST. For example, in 1915 Warsaw switched from Warsaw time to Central European time. So at the stroke of midnight on August 5th 1915 the clocks were wound back 24 minutes. In neither case was DST active.

Much fun can be had with timezones in general. There was at least one country that at one point had a timezone that differed per day because they synchronized 0:00 with the time of the sunrise.
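The repeated hour from the pytz quote above can be reproduced with nothing but fixed offsets. This is only a sketch: the `EDT` and `EST` objects are hand-written stand-ins for US/Eastern before and after the switch, not values looked up in a timezone database, and all names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hand-written stand-ins for US/Eastern before and after the switch
# (assumption: a real program would get these from a timezone database).
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# Two log entries written one hour apart, measured in UTC.
first_utc = datetime(2002, 10, 27, 5, 30, tzinfo=timezone.utc)
second_utc = datetime(2002, 10, 27, 6, 30, tzinfo=timezone.utc)

# What the wall clock showed at those two instants.
first_wall = first_utc.astimezone(EDT).replace(tzinfo=None)
second_wall = second_utc.astimezone(EST).replace(tzinfo=None)

print(first_wall)              # 2002-10-27 01:30:00
print(second_wall)             # 2002-10-27 01:30:00
print(second_utc - first_utc)  # 1:00:00
```

Sorted by wall clock, these two entries cannot be told apart. Sorted by their UTC values, they are an hour apart and in the right order.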

Where is the Sanity?

The sanity right now is called UTC. UTC is a timezone without daylight saving time and without configuration changes in the past. However, because our world is again this rotating geoid and not something we really have under control, there is the problem of leap seconds. Leap seconds are irregular and with that a problem for computing. UTC inserts them from time to time, and whether it should keep doing so has long been under debate. However, right now UTC is the safest bet. From UTC you can convert into any local time; the reverse, of course, is not always possible because of what was shown above.

So here is the rule of thumb which shall never be broken: Always measure and store time in UTC. If you need to record where the time was taken, store that separately. Do not store the local time + timezone information!
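As a rough sketch of that rule of thumb: the record below keeps the instant in UTC and the place as a separate piece of metadata, and only converts to local time for display. The function names, the dictionary layout, and the fixed `CEST` offset are assumptions made for this example. A real application would look the zone up in a timezone database instead of hard-coding an offset.

```python
from datetime import datetime, timedelta, timezone

def make_record(event, zone_name):
    """Store the instant in UTC; keep the zone name as separate metadata."""
    return {
        "event": event,
        "recorded_at_utc": datetime.now(timezone.utc),
        "recorded_in": zone_name,  # e.g. "Europe/Vienna", never used for storage math
    }

def render_local(utc_dt, tz):
    """Convert a stored UTC datetime to a local time string for display only."""
    return utc_dt.astimezone(tz).strftime("%Y-%m-%d %H:%M %Z")

record = make_record("login", "Europe/Vienna")

# Hypothetical stand-in for Vienna in summer; not taken from a tz database.
CEST = timezone(timedelta(hours=2), "CEST")
stored = datetime(2011, 7, 15, 8, 30, tzinfo=timezone.utc)
print(render_local(stored, CEST))  # 2011-07-15 10:30 CEST
```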

Where is the Problem?

Now in theory this blog post should end here and we could all go on with our lives. Unfortunately, in Python there are a couple more things to keep in mind due to some design decisions that were made long ago and were not thought through well. The motivation was sound, the implications however were not. At one time the following decisions were apparently made for the datetime module in the standard library:

- The datetime module should not ship timezone information because timezones change too often.
- The datetime module should, however, provide an API to attach timezone information to a datetime object.
- It should provide these objects: date, time, date+time, timedelta.

Unfortunately a few things went wrong. The biggest problem is that a datetime object with timezone information attached and a datetime object without that timezone information don't work together at all:

```pycon
>>> import pytz, datetime
>>> a = datetime.datetime.utcnow()
>>> b = datetime.datetime.utcnow().replace(tzinfo=pytz.utc)
>>> a < b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't compare offset-naive and offset-aware datetimes
```

Ignoring the horrible API you have to use to attach timezone information to a datetime object, this leads to quite a few problems. If you are dealing with datetime objects in Python you will sooner or later start attaching and removing tzinfo objects all over the place. Another problem is that there are two ways to create a datetime object for the current time in Python:

```pycon
>>> datetime.datetime.utcnow()
datetime.datetime(2011, 7, 15, 8, 30, 55, 375010)
>>> datetime.datetime.now()
datetime.datetime(2011, 7, 15, 10, 30, 57, 70767)
```

One gives the time in UTC, the other in local time. However, it will not tell you what local time is (because it does not have a timezone information object, at least before 3.3), and it does not give you a way to know which one was UTC.
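One way to keep the attaching and removing of tzinfo objects in a single place is a small normalizing helper. The sketch below uses only the standard library's `datetime.timezone`. The helper name `ensure_aware_utc` is made up, and it rests on a loud assumption: any naive datetime passed in is already in UTC, which nothing in the object itself can confirm.

```python
from datetime import datetime, timezone

def ensure_aware_utc(dt):
    """Return dt as an aware datetime in UTC.

    Assumption: a naive datetime is taken to already be in UTC.
    """
    if dt.tzinfo is None or dt.utcoffset() is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)

naive = datetime(2011, 7, 15, 8, 30, 55)
aware = datetime(2011, 7, 15, 8, 30, 57, tzinfo=timezone.utc)

# Mixing the two kinds still fails, exactly as in the session above.
try:
    _ = naive < aware
    mixed_comparison_failed = False
except TypeError:
    mixed_comparison_failed = True

# After normalizing, the comparison works.
normalized_is_earlier = ensure_aware_utc(naive) < aware

# An aware "now" in UTC, so nobody has to guess which clock it came from.
now_utc = datetime.now(timezone.utc)
```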
If you convert from a UNIX timestamp into a datetime object you also have to be very careful to use the datetime.datetime.utcfromtimestamp method, because the normal one (datetime.datetime.fromtimestamp) will give you the result in local time.

On top of that, the library provides a time object and a date object, both of which are close to useless when timezones are involved. The former cannot be shifted to other timezones because that would require the date component. The date itself also only makes sense local to a timezone, because what's today for me could be tomorrow or yesterday for you, thanks to the wonderful world of timezones.
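Both points can be sketched with the standard library alone. Passing an explicit `tz` to `fromtimestamp` returns an aware UTC datetime instead of a naive one, and the second half shows the same instant falling on two different calendar dates. The helper name and the two fixed offsets are hypothetical, chosen only for the example.

```python
from datetime import datetime, timedelta, timezone

def from_unix_timestamp(ts):
    """Turn a UNIX timestamp into an aware datetime in UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)

moment = from_unix_timestamp(1310718655)
print(moment.isoformat())  # 2011-07-15T08:30:55+00:00

# The calendar date depends on where you ask.
late = datetime(2011, 7, 15, 23, 30, tzinfo=timezone.utc)
plus_two = timezone(timedelta(hours=2))     # hypothetical fixed offsets,
minus_five = timezone(timedelta(hours=-5))  # not real zone definitions
print(late.astimezone(plus_two).date())     # 2011-07-16
print(late.astimezone(minus_five).date())   # 2011-07-15
```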