Uncategorized

The Case Against Storing Date & Times as UTC or Why The Dilemma is Bogus Anyway

While many people have talked about about storing dates as UTC as if it were some kind of “best practice” – which I have also been guilty of – it’s actually not that important. In fact it’s a bit silly.

Think about a sleep tracking app.

The users of the app are spread out all over the world in different timezones. Then perhaps you have gathered a lot of data now and want to generate a report of “how many people sleep during the hours of 6am to 10am on a weekday”, or something like that.

If you stored the times in UTC that query is going to be more involved than if the times were stored locally, because just inspecting a date in the database won’t tell you what time it actually was for any particular person. Yes – I see the example is contrived and SQL or NoSQL ninjas are going to say “oh well, that’s easy”, but give me this point.

You not only need to look at what the user’s timezone is in their user settings, but also what the time was at the very time the sleep record was registered. After all, people go on holidays, move to different countries and it’s widely accepted that air travel exists and has made human movement between parts of the planet easier.

Now you’ve ended up with the awkward situation where you need to store the timezone (likely in the Olson timezone format) in each and every row anyway, so as to know keep the user’s time at that specific point in history (i.e. their timezone could have changed multiple times over the years).

In addition it’s actually incredibly easy to convert from one arbitrary timezone to another anyway, given that tzdata is a fact, and great libraries such as Carbon (PHP) or moment (JS) already exist. For most web development it’s going to be easy to make such conversions. The vague idea that having stored stuff as UTC is automagically going to make timezone conversions and life in general easier is magical thinking.

But let’s just stop here. It’s easy to see how silly the dilemma is, because it actually isn’t even one. The storage of times is only a very small part of the whole story, and looking at it in isolation is not only uninteresting but dangerous.

We may as well store UTC, local times, timezones and as much metadata as we care to. Why is it one or the other? If you have a UTC column, you can easily use it to order events correctly, if you have local times, you can more easily do something else, if you have both, so what? If a black swan happens, at least you have more to work with. Either way, how times are stored isn’t really important.

It’s how all the intersection parts work together – storage, display (UI), reporting, parsing, manipulation, converting, third party libraries, etc that can make timezones annoying to work with. It’s as much about educating yourself on the correct terminology and communicating with team members as it is about technology. It’s like the little-big endian thingo – decide on some things that don’t really matter, but it really matters to stick to it!

*Yes, I suppose in theory you might not want to store redundant data. Sure, it can be potentially confusing to store data that isn’t used or maybe you’re worried about storage space. It’s a valid discussion but maybe a bit of a yawn šŸ™‚

TL;DR: It makes no sense to talk about how to store date and times in isolation. Context is important. Shared conventions between team members is important. Communication is important. The technology and the boring arithmetic behind timezones…it comes in last by a big margin.

PHP – Some Date Time “Best Practices”

Your best practices will vary of course. This article makes a lot of assumptions about what app is being built.

Database: Save the user timezone in the database. Save this as in the Olson standard or something similar against the user. Don’t store numeric timezone offsets or daylight savings dates. The user may move countries and timezones so it might also be a good idea to save a history of timezone changes. Also store the current user timezone next to datetime columns, so we know his/her timezone at that particular point in history.

Database: Store UTC and local time, as well as the user’s timezone in each row Why is it a this or that proposition? As mentioned above, storing the user’s timezone in each row (as opposed to only stored against the user’s profile) gives higher contextual specificity and hopefully is lossless for your purposes.

Database: Store the time component and not just the date. Or store the UNIX timestamp. For example store 2019-01-01 00:00:00 and not 2019-01-01. Without a time component, it would probably be reasonable to assume that 2019-01-01 implies 2019-01-01 00:00:00, but I’d rather be explicit and consistent. In addition, it makes no sense to apply a timezone when there is no time component.

Operating System: SetĀ theĀ OS timezoneĀ toĀ UTC.Ā If files and logs are being stored across multiple machines or containers, it is useful to have a common date time when debugging problems to reduce potential confusion.

Code: Set the applicationĀ timezoneĀ toĀ UTCĀ regardlessĀ ofĀ whereĀ theĀ customersĀ are.Ā If the OS timezone is set to UTC, you can use that directly, but don’t make an implicit assumption so always set it manually. In PHP frameworks such as Laravel, configuration files allow you to set application wide timezones

Code: Create presentation layer helper functions to convert the date stored in the database to times into the user’s timezone (or whatever target timezone is desired).

ļ»æGeneral: Use existing databases and libraries. The classic reinventing the wheel syndrome. Timezone databases i.e. tzdata and plenty of high quality date & time parsing tools already exist. There may be good reasons to do thing manually such as calculating offsets, whether it’s for performance reasons or some other, but

Code: Don’t abuse UNIX timestamps. Abusing UNIX timestamps means deliberately making false UNIX timestamps. This is probably far easier to unintentionally do in Javascript or another client side language for the very reason that the browser timezone varies by client.

In the context of PHP, strtotime is a function that returns the UNIX timestamp when given a string. A UNIX timestamp is by its very definition, the number of seconds elapsed since Jan 1, 1970 in UTC. It is not the seconds elapsed since Jan 1, 1970 in any other timezone. If you give strtotime a string e.g. “2019-01-01 09:00:00”, it will return the number of seconds since Jan 1, 1970 UTC relative to the timezone defined in the PHP.