In a very globalised world, where your developers can be working in a multitude of countries, I've found that the only sane thing to do is a) run all your boxes with a TZ of Zulu and b) the only date format allowed in logs is RFC-3339.
Sure, you can format datetimes in localised formats for end users, but standardising on UTC+0 and RFC-3339 for programming and maintenance purposes just prevents so much confusion and hassle.
Not that I've ever had to spend a day trying to troubleshoot a bug that was alerted by our monitoring system at 2020-11-09T09:11:05Z (as it was defaulting to UTC), but the relevant logs were timestamped at 10:11:05 09/11/2020 (I only wished they'd used the normal German period separators in the date to give me a clue) because of course your German colleagues want their timestamps in German format and defaulting to CET...
When you have colleagues in Europe, Oceania, and distributed from coast to coast in the US, doing everything in UTC means everyone only has to do one mental conversion from their local timezone to UTC, as opposed to trying to remember whether a colleague is in LA or New York or, if they're in Mountain Time, whether they're in one of the parts of the US that are on Mountain Time but don't observe daylight saving.
> Not that I've ever had to spend a day trying to troubleshoot a bug that was alerted by our monitoring system at 2020-11-09T09:11:05Z (as it was defaulting to UTC), but the relevant logs were timestamped at 10:11:05 09/11/2020
It gets even more fun when such events happen during the switch between daylight saving time (like CEST) and standard time (like CET).
When I worked closely with Germans there was a special time of year where our daylight savings changes overlapped, so for a few weeks we'd have a 10 hour time difference, then an 11 hour time difference, then 12. We tried to minimise meetings during that period, as someone always got the timing wrong.
CloudWatch at least lets you choose in which timezone to display datetimes. Other AWS services don't let you choose and don't even show in which timezone they display datetimes, which is somewhat infuriating to me, when I'm in a hurry anyway and suddenly have to wonder which timezone the shown dates are in.
I also remember that AWS used to display some datetimes in Pacific Standard Time, which is completely useless to somebody outside the US, though I haven't seen that in a while.
Similar to the global language switch I wish the AWS Management Console would allow setting in which timezone to display datetimes globally as well.
Or zero mental conversions because it's trivial to write a simple script that ingests RFC-3339 and spits it out in your local time.
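For what it's worth, a minimal sketch of such a script in Python could look like the following. The function name `rfc3339_to_zone` is made up for illustration, and it leans on `datetime.fromisoformat`, which handles the common RFC 3339 shapes but is not a full RFC 3339 parser on older Python versions.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Optional


def rfc3339_to_zone(stamp: str, tz: Optional[tzinfo] = None) -> str:
    """Parse an RFC 3339 timestamp and render it in `tz`.

    With tz=None the system's local timezone is used.
    """
    # fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so normalise it to an explicit offset first.
    if stamp.endswith(("Z", "z")):
        stamp = stamp[:-1] + "+00:00"
    parsed = datetime.fromisoformat(stamp)
    if parsed.tzinfo is None:
        raise ValueError("timestamp has no UTC offset")
    return parsed.astimezone(tz).isoformat()


# A fixed +01:00 offset standing in for CET in November (illustration only).
cet = timezone(timedelta(hours=1))
print(rfc3339_to_zone("2020-11-09T09:11:05Z", cet))  # 2020-11-09T10:11:05+01:00
```

Calling it without the second argument prints the timestamp in whatever zone the machine is configured for.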
"2020-11-09T09:11:05Z" uniquely identifies a point in time. "10:11:05 09/11/2020" is some time 11 minutes past the hour on either September 11th or November 9th. It doesn't identify anything.
The first format also makes it easy to count the number of entries within November 2020. The second makes life considerably more difficult (though not impossible).
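To illustrate the counting point: because RFC 3339 puts the largest unit first, a plain string prefix is enough to select a month. This is only a sketch; the log lines are invented, and it assumes every line starts with a timestamp that is already in UTC.

```python
# Hypothetical log lines, all timestamped in UTC (trailing "Z").
log_lines = [
    "2020-10-31T23:59:59Z service stopped",
    "2020-11-01T00:00:00Z service started",
    "2020-11-09T09:11:05Z alert fired",
    "2020-12-01T00:00:01Z service stopped",
]

# Year-month is a fixed-width prefix, so no date parsing is needed.
november = sum(1 for line in log_lines if line.startswith("2020-11-"))
print(november)  # 2
```

The same prefix trick works with grep or a database `LIKE`, which is a large part of the appeal.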
Am I the only one who would expect the behavior in the top example?
I don't want my equality operator doing timezone conversions. If one type has no timezone attached and one does, then you probably shouldn't be able to compare them at all. Likewise, for a type that has a timezone attached, if the timezone differs between a and b, then I want a == b to return false, even if it represents the same instant in time.
Equality can be a tricky thing. Named functions are the way to go when two different developers might intuit different behaviors.
The problem is not that the equality behavior is wrong, it’s that I would expect “utcnow()” to return a timestamp with a UTC timezone. The problem is that it does not: it has no timezone at all, and is therefore generally less compatible than a timestamp created with an explicit UTC timezone object.
I work on the Python API of a timeseries database and it’s very frustrating to have to accept timestamp objects without time zone because you don’t know what to do with them. People aren’t aware of this constraint, which leads to unexpected behavior.
Django solves this by having the default timezone in the framework settings. Any datetime without a timezone is assumed to be using the default one. If you don't set a default timezone, the server timezone is used, and you are responsible for making sure your data is congruent.
If I’m calling .utcfromtimestamp then by the name I expect it to be doing _more_ than just attaching a time zone (which it also doesn’t do). The only other behaviour that seems sensible is that it’s doing… some sort… of conversion. Exactly what I’d need to look up at the time I wanted to use it.
I have, however, been using Python so long that I’ve gotten used to the way things sort-of-work and developed a healthy caution around its unintuitive built-in time zone handling.
The comparison is being done between x_ts, which is a float, and x, which is a datetime object. Call me crazy, but I think allowing this comparison to begin with is the problem.
> Likewise, for a type that has a timezone attached, if the timezone differs between a and b, then I want a == b to return false, even if it represents the same instant in time.
Really? So if you want to compare that two times are really the same time for two different timezones, you'd want to convert them to UTC first? What's the use case for this?
You don't need to convert them to UTC, just one to the timezone of the other, but yes. "a.is_same_instant(b)" is a lot more clear to me than "a == b", especially in a dynamic language where it might not be obvious what the types involved are.
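A rough sketch of what such named comparisons might look like in Python. `is_same_instant` is the name used above; `is_identical` and `_require_aware` are names I made up, and none of these exist in the standard library.

```python
from datetime import datetime, timedelta, timezone


def _require_aware(dt: datetime) -> None:
    # A datetime is naive when it has no tzinfo or its tzinfo yields no offset.
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError("naive datetime: no timezone attached")


def is_same_instant(a: datetime, b: datetime) -> bool:
    """True if a and b denote the same point in time, whatever their zones."""
    _require_aware(a)
    _require_aware(b)
    return a.astimezone(timezone.utc) == b.astimezone(timezone.utc)


def is_identical(a: datetime, b: datetime) -> bool:
    """Stricter: same instant and the same UTC offset."""
    return is_same_instant(a, b) and a.utcoffset() == b.utcoffset()


utc_time = datetime(2020, 11, 9, 9, 11, 5, tzinfo=timezone.utc)
cet_time = datetime(2020, 11, 9, 10, 11, 5, tzinfo=timezone(timedelta(hours=1)))

print(is_same_instant(utc_time, cet_time))  # True
print(is_identical(utc_time, cet_time))     # False
print(utc_time == cet_time)                 # True: built-in == compares instants
```

The last line shows what the built-in operator does today for two aware datetimes, which is the behaviour being debated.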
Strongly implied but not explicitly stated unless I missed it: the actual surprising behaviour here is that utcnow returns a naive datetime, not a timezone aware one, despite the appropriate timezone to use being obvious.
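To make the naive/aware distinction concrete, here is a small sketch. It avoids calling `utcnow()` itself (newer Python versions warn about it) and instead strips the zone from an aware value, which gives an object of the same shape as what `utcnow()` returns.

```python
from datetime import datetime, timezone

# Timezone-aware "now": tzinfo is timezone.utc.
aware = datetime.now(timezone.utc)

# Same calendar fields with the zone stripped; this mimics the naive
# value that utcnow() hands back.
naive = aware.replace(tzinfo=None)

print(aware.tzinfo)  # UTC
print(naive.tzinfo)  # None

# Ordering a naive against an aware datetime is refused outright...
try:
    naive < aware
except TypeError as exc:
    print("ordering refused:", exc)

# ...while equality quietly answers False.
print(naive == aware)  # False
```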
I was about to argue for your initial version! I would say that utcnow() is marginally acceptable (though only on backwards-compatibility grounds), but only if the conversion from a naive time to an aware time raises an error when done without an explicit timezone specification. It is the combination of keeping utcnow() as-was (or, more generally, the concept of naive time at all) while introducing a default for the conversion that turns it into a subtle trap.
Unfortunately, now that the default has been introduced, there seems to be no way to get out of this situation while preserving backwards-compatibility. The lesson here, perhaps, is to not make it easier to use that which should have been deprecated.
Technically correct (best kind of correct) but irrelevant here. "utc" (lowercase) is a timezone in the sense that it is a named instance of python's `timezone` class provided by the standard library, and the obvious choice if utcnow() were to return a timezone-aware datetime.
Pendulum is compatible with the datetime API, but it has sane defaults, nice tools to convert between timezones, some cool date adjustment stuff, and can humanize time in several languages.
Basically, datetime has the same problem as text vs raw bytes in python 2.7, except it has never been fixed.
Timezones aren’t really the problem, implicit locales are the problem. All this stuff would be a lot easier if timezones always had to be stated explicitly.
> All this stuff would be a lot easier if timezones always had to be stated explicitly.
On Java, you can use the forbidden-apis build plugin (https://github.com/policeman-tools/forbidden-apis) to fail the build whenever a timezone or locale or charset is not specified explicitly (it forbids the methods from the Java API which use an implicit timezone/locale/charset). I don't know whether there's something similar for Python; it might be harder because Python is much more dynamic (though it might be possible to use monkeypatching to warn whenever the bad methods are used).
The real problem is that datetime.datetime.now() returns a naïve datetime.
If only it returned something that contained the UTC offset in which it was created, then all of this would be much easier to deal with. Conversions to timestamps would just work, and comparisons would just work.
Note that I said "UTC offset", not "timezone". datetime doesn't deal in UTC offsets, it deals in timezones. And timezones are a zillion times more complicated, which is why no one wants to deal with them on the critical path, and no one has ever proposed having now() return an object with .tzinfo set. Logical though it may be.
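For what it's worth, something close to an offset-carrying "now" can be sketched with the standard library alone: calling `astimezone()` with no argument attaches the system's current UTC offset as a fixed-offset `datetime.timezone`, not a full rule-based zone. The helper name `now_with_offset` is made up here.

```python
from datetime import datetime, timezone


def now_with_offset() -> datetime:
    """Current time carrying the system's current UTC offset (hypothetical helper)."""
    # Start from an aware UTC value, then let astimezone() pick the local offset.
    return datetime.now(timezone.utc).astimezone()


stamp = now_with_offset()
print(stamp.isoformat())    # ends in the local offset, such as +01:00
print(type(stamp.tzinfo))   # <class 'datetime.timezone'>
print(stamp.timestamp())    # round-trips to the real epoch time
```

Because the offset travels with the value, `timestamp()` and comparisons against other aware datetimes behave as expected.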
This kind of stuff really needs to be made explicitly clear in any/all API documentation as it inevitably leads to underlying issues later down the line. It's amazing how something as "simple" as official time/date libraries still have footguns like these.
I'm not bashing anyone that's involved as I realize we're all human and prone to mistakes, I just wish we could all do better.
Both datetime.utcnow() and datetime.utcfromtimestamp() have red box warnings in the documentation. People copying code from StackOverflow won’t see them, of course.
I should mention that the warnings are relatively recent, added in 3.8.
I suspect that the author of Jodatime (which effectively became the java.time API), who was notorious for being almost Linus Torvalds like in his attitudes and approaches, was probably once a very nice and kind individual. Until he started implementing a datetime library.
I mean, he nailed it, but at what cost? java.time.* is my second favourite datetime API, the first being Postgres'.
I'm actually surprised the source code of Jodatime isn't entirely in Zalgotext, technically it'd be valid Java code (I'm pretty sure, anyway).
Most languages have something similar, that's why it's important to use static analysers as part of your process to identify usage errors. Everyone is human, mistakes happen. Having tooling as a backup is great.
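As a toy illustration of the tooling idea, the standard `ast` module is enough to flag calls to the two naive-UTC constructors. This is a sketch, not a real linter: `find_naive_utc_calls` is a made-up name, and it matches on the method name only, so it will also flag unrelated objects that happen to have a `utcnow` method.

```python
import ast

# Method names whose results are naive datetimes despite the "utc" prefix.
FLAGGED = {"utcnow", "utcfromtimestamp"}


def find_naive_utc_calls(source: str) -> list:
    """Return sorted (line number, method name) pairs for flagged calls."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in FLAGGED
        ):
            hits.append((node.lineno, node.func.attr))
    return sorted(hits)


sample = (
    "from datetime import datetime\n"
    "x = datetime.utcnow()\n"
    "y = datetime.utcfromtimestamp(0)\n"
    "z = datetime.now()\n"
)
print(find_naive_utc_calls(sample))  # [(2, 'utcnow'), (3, 'utcfromtimestamp')]
```

Since it only parses the source and never runs it, a check like this can sit in CI without caring how dynamic the code is.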
"If self is naive, it is presumed to represent time in the system timezone."
presumed is kind of a bad word in Python. I would never write an API today that "presumed" something that can be lots of other things and I would not allow any tz conversion on a naive datetime without requiring the existing known timezone be passed.
As mentioned in the article, this was a deliberate change in Python 3. If you read this article: https://blog.ganssle.io/articles/2022/04/naive-local-datetim... you will see why this actually makes quite a bit of sense, given the design constraints the authors were working with.
I tried to read it, will try again later but I wasn't really getting it. the idea would be, .astimezone() is bad, and should be something equivalent to .convert_between_timezones(from, to), where there's some convenient way for "from" to indicate "the system time zone", but you still have to be explicit that you consider this arbitrary number to be in a particular time zone.
not following why the "this is the system timezone" must be hardcoded to be an invisible assumption, as opposed to something somewhat explicit. I mean this is literally not far off from a simple namechange of ".astimezone()".
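A sketch of the explicit variant described above, under the assumption that the timezones are standard-library `tzinfo` objects (fixed offsets or `zoneinfo`). `convert_between_timezones` is the hypothetical name from the comment, not an existing method.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def convert_between_timezones(naive: datetime, from_tz: tzinfo, to_tz: tzinfo) -> datetime:
    """Interpret a naive datetime as being in from_tz, then convert it to to_tz."""
    if naive.tzinfo is not None:
        raise ValueError("expected a naive datetime; this one already has a timezone")
    # The caller states the source zone; nothing is presumed from the system.
    return naive.replace(tzinfo=from_tz).astimezone(to_tz)


# Fixed +01:00 offset standing in for CET (illustration only).
cet = timezone(timedelta(hours=1), "CET")
wall_clock = datetime(2020, 11, 9, 10, 11, 5)

converted = convert_between_timezones(wall_clock, cet, timezone.utc)
print(converted.isoformat())  # 2020-11-09T09:11:05+00:00
```

The only difference from `.astimezone()` is that the source zone is an argument instead of an invisible assumption.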
Ah, daylight savings.
(I always set CW back to UTC, for consistency)
"2020-11-09T09:11:05Z" uniquely identifies a point in time. "10:11:05 09/11/2020" is some time 11 minutes past the hour on either September 11th or November 9th. It doesn't identify anything.
Edit: Or 41 minutes past the hour.
> ts = 1571595618.0
This is a timestamp, there is no TZ data required.
> x = datetime.utcfromtimestamp(ts)
Now for some reason the developer cares about UTC all of a sudden?
If the developer doesn't care about the TZ, they should use:
datetime.fromtimestamp(ts)
Nobody argues for that. Especially since we’re comparing timestamps (floats), not datetimes (objects). Let’s go line by line:
> ts = 1571595618.0
This is 2019-10-20 18:20:18.000 UTC
> x = datetime.utcfromtimestamp(ts)
I would therefore expect x to be 2019-10-20 18:20:18.000 UTC
> x_ts = x.timestamp()
I would therefore expect x_ts to be 1571595618.0. Which, surprisingly, it is not.
Nowhere was the equality operator involved in the surprise.
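The round trip can be sketched like this. To stay clear of the deprecated call, the naive value is built by stripping the zone from an aware one, which gives the same fields `utcfromtimestamp(ts)` would; how far the naive result drifts depends on the machine's local timezone, so no fixed number is shown for it.

```python
from datetime import datetime, timezone

ts = 1571595618.0

# Aware conversion: the value knows it is in UTC, so the round trip is exact.
aware = datetime.fromtimestamp(ts, tz=timezone.utc)
print(aware.isoformat())        # 2019-10-20T18:20:18+00:00
print(aware.timestamp() == ts)  # True

# Naive value with the same fields, as utcfromtimestamp(ts) would produce.
naive = aware.replace(tzinfo=None)

# timestamp() on a naive datetime reads the fields as *local* time, so the
# result is shifted by the local UTC offset (zero only on a UTC machine).
local_offset = naive.astimezone().utcoffset()
print(naive.timestamp() - ts)   # equals minus the local offset, in seconds
print(naive.timestamp() == ts - local_offset.total_seconds())
```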
Really? So if you want to compare that two times are really the same time for two different timezones, you'd want to convert them to UTC first? What's the use case for this?
Seems like the less-used way to me.
Thanks for your clear and to-the-point article!
This is why for anything that uses timezone, I use pendulum: https://pendulum.eustace.io/
I should mention that the warnings are relatively recent, added in 3.8.
https://docs.python.org/3/library/datetime.html#datetime.dat...
It's surprising how poorly the situation with timezones is understood in the industry.