Now, I'm an old luddite who still looks askance at cloud services, but I'm willing to recognize their value. Still, I worry about whether I can trust that vendor to actually deliver the services I need, without them suddenly shoving out breaking changes which screw me, but maybe not their "whale" customers where the real money is.

That's something which "Sleeper" got to grapple with last fall. They use Amazon's Athena service, which allows software to query S3 buckets using SQL syntax. Late last year, "Sleeper" got an email which they forwarded to us with the subject "This should be fun". The email started like this:

Athena has been constantly making improvements to its SQL query syntax to make it more logical and user friendly, and during this process we have identified some query syntax that could have behaved in a more reasonable manner.

Now, just from that opening sentence, you know this means "breaking changes ahoy". And you wouldn't be wrong. There are three syntax changes.

  1. Evaluation of comparison between CHAR type column and a string literal will become easier. No more padding is needed in the string literal.

Example: Suppose table t has a column named col5 with data type CHAR(5), and it contains a row with value 'abcd ' (note the trailing space) for column col5. Currently, you have to use the exact same string literal in order to find a match:

SELECT 1 FROM t WHERE col5 = 'abcd '; -- the trailing space in the string literal is needed to get a match

With the new behavior, you will no longer need to provide the trailing spaces for such queries to match the condition:

SELECT 1 FROM t WHERE col5 = 'abcd'; -- there is no need to have a trailing space in the string literal to get a match

Now, this behavior probably makes more sense, but if, like "Sleeper", you have a bunch of deployed queries that depend on this behavior…. Well, as the old saying goes: "every change breaks somebody's workflow".

  1. Accessing a non-existing key in a MAP type will cause an error instead of having NULL returned as the result.

This is a behavior that could be positive or negative, depending on your point of view. But once again, this is the kind of change that could break somebody's workflow, quite significantly.

There is a way to get the original behavior, sort of:

If you still prefer the query to return NULL for such cases, you can leverage the TRY function to achieve that:

SELECT TRY(MAP(array['foo', 'bar'], array[1, 2])['abc']);

This does mean that if you relied on this behavior, you've got to find every use of MAP and wrap it in a TRY.

These are awkward changes, but it's not like an entire feature got removed, right?

  1. Accessing anonymous row fields in the form of “.field[n]” is no longer supported.

Example: Suppose you want to access a field of an anonymous row. Currently, Athena returns the first field, which is 'a', for the following query. With the new behavior, Athena will report an error indicating "Column 'ROW ('a', 'b').field0' cannot be resolved".

SELECT ROW('a', 'b').field0;

It will not impact the way you access a non-anonymous row. For example, the following query still works the same way and returns 'a'.

SELECT CAST(row('a', 'b') AS ROW(col0 char, col1 char)).col0;

Now, I'm going to assume that Amazon had usage statistics and had a pretty good picture of how widely these features were used and, for them, it was a simple calculus of "the greatest good for the greatest number". But if you're in the sacrificial pool of users, that's still gonna hurt.

This may not be a full on WTF, but I share it because I feel this pain: updates outside of your control that break your software. "Sleeper" got three months of warning to make the required changes, but the entire experience introduced costs and risks into "Sleeper"'s projects that didn't really benefit "Sleeper"'s team at all.

[Advertisement] Utilize BuildMaster to release your software with confidence, at the pace your business demands. Download today!