• (nodebb)

    Could be worse. The database probably isn't stored in sparse files, and it probably isn't stored in files with alternate data streams (assuming NTFS for the database location).

  • (nodebb)

    Very poor quality. 1) Good software would prevent backups way before it fills up the disk, lest it causes OS instability (if backup location is OS disk). 2) And what about overwriting the previous backup? 3) It looks like they're using the size of the database itself as a substitute for backup size, which is rarely accurate.

    Addendum 2021-02-18 06:59: Just noticed division by 2, which has got to be the ASSumption of the century.

  • Taily Wuttif (unregistered)

    I wonder if "getTotalSpace" is badly named, as "space" could easily mean (and in this case, was taken to mean) "free space". Isn't space that's taken up with stored stuff no longer space, after all? A better name for that function would be "getTotalCapacity" which I think is less ambiguous.

    (Having said that, it's still a WTF that this got through to production, of course.)

  • Jay (unregistered)

    The divide by 2 part makes me think he assumed 50% compression. It also assumes that only a single backup is kept (at least in that location).

  • Cheats Never Prosper (unregistered) in reply to Mr. TA

    Well spotted... the least significant points of the code! This software returns an "I can fit backup onto disk" in cases where it certainly cannot. In case you missed reading the article, it's becuase it is testing for disk size, not space.

  • Cheats Never Prosper (unregistered) in reply to Taily Wuttif

    "Capacity" is just as ambiguous, as in this context it means the exact same thing. My 1Tb Capacity disk has 1Tb Space and almost 1Tb UsableSpace. It also happens to be full.

    If the code monkey didn't check what "space" meant, or know the common English word "Total" does not always equal "Remaining", then the function could equally be called "GetShoeSizeOfCodeMonkey" and the eejit would have used it anyway.

  • Cheats Never Prosper (unregistered) in reply to Jay

    Almost.

    On the mistaken assumption that "space" meant "free space" then it doesn;t matter if there were N previous backups there; IF free space >= space required THEN good.

    50% compression for the weird-ass database might be right or wrong. It's a low-order problem given the previous assumption.

  • MiserableOldGit (unregistered) in reply to Taily Wuttif

    Yeah, I thought that, but then it also occurred to me that if the programmer had made any effort to check what his bit of code actually did it should have been easy to spot. The name is dumb but the (implied) careless use of the property is dumber.

  • Gummy Gus (unregistered)

    aaaand division tends to be slower than multiplication, so at the very least save a nanosecond when doing that wrong test by multiplying the left side instead of dividing the right side :)

  • 🤷 (unregistered)

    My favorite part of this is the division by 2. Let's just assume that the "getTotalSpace" part was actually correct. It's pretty bold to assume that a 50% compression rate can always be achieved.

  • (nodebb) in reply to Jay

    Did the developer assume 50% compression? Or did they intend to ensure that the database was less than half the remaining disk space (leaving a buffer in case any operation requires a temporary file copy or something) but get the division backwards?

  • Best Of 2021 (unregistered) in reply to Mr. TA
    1. Good software would prevent backups way before it fills up the disk, lest it causes OS instability (if backup location is OS disk)

    That's what the 'divide by 2' is aimed at I think - the intent of this code is to return true if there's at least twice as much free space as it looks like we're going to need, imo. Of course they also got the /2 on the wrong side if that was what they meant, but we've all done that.

    The WTF here is all in the lack of testing, for me. There's two dumb but easy to make mistakes in the code ('total space' != 'total free space' and half vs double) - but it would have failed any kind of cursory verification.

  • tbo (unregistered)

    Seems as though they assumed they had exclusive use of the total space on whatever the datastore is.

    So if you're currently using less than half of it, you have enough space for another copy of it. But the / 2 is on the wrong side.

    It was probably meant to be:

    return backupFreeSpace / 2 > databaseSpace

  • Taily Wuttif (unregistered) in reply to Cheats Never Prosper

    Maybe. "Space" could mean empty, unused. (If it's used, then there's no space, right? "This hard drive is full, there's no space left. The space is all used up.") Whereas "capacity" can only mean the total amount, whether free or used.

  • Taily Wuttif (unregistered) in reply to MiserableOldGit

    Oh absolutely, never trust a function name, always read the docs. As I am currently cursed to working on Wordpress sites, I'm very familiar with the concept of badly-named functions…

  • Rob (unregistered)

    TRWTF is using a Java 7 API link, seeing as Java 7 has been EOL for quite a while already, and we're currently at Java 15.

    So instead of https://docs.oracle.com/javase/7/docs/api/java/nio/file/FileStore.html, it should have been one of these:

    • https://docs.oracle.com/javase/8/docs/api/java/nio/file/FileStore.html (Java 8)
    • https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/nio/file/FileStore.html (Java 11, latest LTS)
    • https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/nio/file/FileStore.html (Java 15, latest overall)
  • Old timer (unregistered)

    I'm not familiar with the language, so I'm struggling to see why they expect the filestore of the root of the working directory to be the same filestore as that containing the database folder of the working directory. That's what links are for: so that you can put link the database folder physically located in a different filestore.

  • Some Guy (unregistered)

    I read this code and thought they were assuming that the backup will be about the same size as the database, so if the database would fit twice on the storage, then there's enough space for the database and the backup. Except that they put /2 instead of *2.

  • 🤷 (unregistered)

    Moderation held for comment.

  • (nodebb) in reply to 🤷

    Moderation held for moderation.

  • (nodebb) in reply to Old timer

    That's what links are for: so that you can put link the database folder physically located in a different filestore.

    Derby is a bit of a special snowflake, in that it really wants to know the physical directory that it's using to hold the database (and that really kinda needs to be on a local physical disk too, because file locking and network filesystems have a long history of being painful in combination). Having used it, these days I can say without a shadow of a doubt that you should use SQLite instead.

  • Lőrinczy, Zsigmond (github)

    Here totalsize/2 means the total size is enough for the actual database and its backup. Assuming you overwrite the previous backup with the new one. Also assuming there's no other files on the same filesystem.

  • (nodebb)

    This reminds me a thing happened about two years ago... I reported to Tech Lead I couldn't complete a single database restore. After several days of discussion with the "DB administrator" boy they all agreed I had no clue what I was doing and offered support.

    Fast forward two weeks. I don't recall ever getting any help, nor any scrutiny on the process I carried out but what I recall is an email confirming "the restored files, for some reason, don't work".

    The following Friday, after the hardware guys blasted their 15 minute time limit by 400% I still had the will to note "we have reached conclusion the database restore procedure doesn't work". I stressed we had paying customers for this service and in the event of a disaster recovery we would end with the wrong end of the stick. Management agreed to allocate one day of work ( 1 ) to "further investigate the issue". OFC they did so with maximum priority.

    ... which means the project I was working on basically got on hold while I delved back into "legacy code" written about 15 years ago (I still insist that legacy == does work reliably, not just "sometimes works if we conjecture the right config and troubleshoot it for hours"). At least it was quick: it all boiled down to binary data being saved by System.IO.File.WriteAllText method which AFAIK carries out quite a few transformations... in the following few hours I also produced a list of 5 other issues, 2 of which were very relevant.

    The Tech Lead was absolutely interested as he readily read the email and agreed "this should be fixed ASAP... when we make a new version". No resources have ever been allocated for this fix and the software still ships mutilating all the databases as well as garbling and overwriting whatever (by mere luck) might go through.

  • (nodebb) in reply to Lőrinczy, Zsigmond

    Here totalsize/2 means the total size is enough for the actual database and its backup. Assuming you overwrite the previous backup with the new one. Also assuming there's no other files on the same filesystem.

    Yep. Except that instead of checking databaseSize < totalSize / 2, they effectively checked databaseSize < totalSize * 2, because they put the divide on the wrong side.

    UserK:

    I still insist that legacy == does work reliably

    No, it just means "so old that nobody remembers how it works". Legacy bugs are common and, in the worst cases, must be preserved because systems have been built around that exact behaviour.

  • MiserableOldGit (unregistered) in reply to UserK
    The Tech Lead was absolutely interested as he readily read the email and agreed "this should be fixed ASAP... when we make a new version". No resources have ever been allocated for this fix and the software still ships mutilating all the databases as well as garbling and overwriting whatever (by mere luck) might go through.

    We had a complete dingbat of a StsAdmin at one gig I was one. His failings were plentiful and comical, but one I remember was somebody reporting complaints to the department meeting that requests to get files restored (whether from a week ago or longer) were always denied/failed.

    His answer? "Our system has been set up to back files up, it was never designed for restoring them"

    I laughed so hard I got kicked out of the meeting, not for the first or last time.

  • Free Bird (unregistered) in reply to Rob

    That's not TRWTF. It's either a minor imperfection, or outright correct if the code base uses Java 7.

Leave a comment on “Spacious Backup”

Log In or post as a guest

Replying to comment #:

« Return to Article