I hate databases


Today, I had further distractions in the shape of database functions. Every database is different, and each has their adherents. Often these are people who know no better, because that’s all they’ve ever used, but we must not forget the counterpoint to that, the forlorn Database Administrator’s Lament: "I prefer beauty, but I’ve only ever had ugliness".

There is a standard for SQL, a set of features that nobody ever implements exactly. While broadly speaking, Microsoft’s dialect of SQL is mostly equivalent to IBM’s to Oracle’s to anyone else’s, the devil is in the detail (and the benefit is in the huge expense contractors can charge you to migrate from one platform to another, or in the huge license fees each manufacturer can extract from its captive customers).

Each one is different, and I wonder if each one displays some of the different characteristics of their maker. Take date arithmetic, for example.

IBM’s code is simple and clear to follow:

('2014-07-01') + 7 days

gives you the eighth of July, 2014, seven days greater than what you input into the formula.

Microsoft’s version is needlessly complicated and vaguely arbitrary

DATEADD('2014-07-01', dd, 7)

but you can basically see what it’s doing there.

And Big Red makes you do

TO_DATE('2014-07-01', 'YYYY-MM-DD') + 7

… 7 what, exactly? Wouldn’t it help yo specify if that was days, weeks, centuries or seconds? And why can’t you recognise a date, as a date?

As Arthur C Clarke should have said, sufficiently complicated code is indistinguishable from bullshit.

Or try another: for a particular date, tell me what was the first date in the week that date is in:

Big Blue:

DAYOFWEEK_ISO('2014-09-09')

yields the 8th of September, 2014. As does

TRUNC(TO_DATE('2014-09-09', 'YYYY-MM-DD'), 'wi')

Now which one of those do you think might have a more obvious purpose, to somebody who doesn’t have an enormous block of documentation learned off by heart, and isn’t distracted by all the other stresses of modern life, or who might just like code that doesn’t feel wilfully obscure? And which one is going to be easier to maintain?

Similarly, almost all databases have a function to get the first x letters in a string. You write

LEFT('string of letters', 4)

and you get

‘stri’

Everyone’s happy.

Except Big Red, who say no, you’ve got

SUBSTRING('string of letters', 1, 4)

which does the same thing and so you should be happy with that. Maybe they didn’t study enough philosophy to understand the differences between intensionality and extensionality. Or maybe they’re just bad people. Occam’s Razor, anyone?

I quite understand that you want the minimum number of functions necessary, to reduce the time spent debugging and testing, but that sort of complexity seems just wilful. If I wanted to do things like that, I’d bin the database and do everything in sed, thanks very much.

,

2 responses to “I hate databases”

  1. Substring’s not so bad, bit like mid really but the date format stuff is pretty hard to follow. Big Red – who are they? Linux/oracle Db?

    • Big Red = those happy go lucky people at Oracle. I wouldn’t mind the date formatting except for the horrible code it forces you to write if you’re trying to build a case statement to clump together everything before a certain date; in DB2 you’d do
      CASE WHEN date_column < '2014-09-01' THEN '2014-09-01' ELSE date_column END
      but in Oracle it appears they want you to write
      CASE WHEN date_column < TO_DATE('2014-09-01', 'yyyy-mm-dd') THEN TO_DATE('2014-09-01', 'yyyy-mm-dd') ELSE date_column END

      I feel the latter sacrifices readability, for, well, what, exactly?

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.