Admin
Possible, but clearly the said developer didn't bother trying to find out either.
Admin
Come on, Remy, everyone knows that the more times you hash something, the more random it becomes.
Admin
The proper way in PHP to generate a UUID is actually by using the Random\Randomizer class. Then you just have to set a few bits to make it an RFC 4122-conformant v4 UUID.
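For anyone wondering which "few bits" those are, here is a minimal sketch in Python rather than PHP. It assumes 16 bytes from a cryptographically secure source (`secrets.token_bytes` stands in for whatever Random\Randomizer would hand you), and the function name `make_uuid4` is made up for illustration.

```python
import secrets
import uuid


def make_uuid4() -> str:
    """Build a version 4 UUID by hand from 16 random bytes (illustrative only)."""
    raw = bytearray(secrets.token_bytes(16))
    # Byte 6: the high nibble carries the version number, 4 for random UUIDs.
    raw[6] = (raw[6] & 0x0F) | 0x40
    # Byte 8: the top two bits carry the RFC 4122 variant, binary 10.
    raw[8] = (raw[8] & 0x3F) | 0x80
    return str(uuid.UUID(bytes=bytes(raw)))


print(make_uuid4())
```

In real Python code `uuid.uuid4()` already does all of this; the manual version is only here to show where the six fixed bits go, which leaves 122 random ones.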
Admin
Why not allow S, C, or K? I can think of a pretty good number of words that couldn't accidentally show up in one of these "tokens" if those letters weren't used...
Admin
I'm sure that having a token read "FUQQ" or "QOQQ" would be perfectly safe and not offend anybody.
Admin
The real WTF is that PHP's uniqid function doesn't actually guarantee uniqueness... But I guess it's par for the course for PHP.
Admin
"$extra is going to be the actual filename, which if it were me, I'd append the unique fields so the name remains sortable, not prepend them- "
errr - sorry, but they "are" appending the filename....
wtf?
Admin
Of course, UUIDs are only probabilistically "unique"... If you're very, very unlucky, you'll end up with a collision (admittedly, winning the lottery is likelier... until Finagle's Law steps in).
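To put a rough number on "very, very unlucky", here is a sketch of the usual birthday-bound approximation. It assumes version 4 UUIDs with 122 random bits drawn from a good random source; a weak or misseeded generator makes collisions far more likely than this estimate. The function name is hypothetical.

```python
import math

RANDOM_BITS = 122  # a v4 UUID has 128 bits, 6 of which are fixed


def collision_probability(count: int) -> float:
    """Approximate chance that `count` random v4 UUIDs contain a duplicate."""
    space = 2 ** RANDOM_BITS
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / (2N)).
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(-count * (count - 1) / (2 * space))


for n in (10**6, 10**9, 10**12):
    print(f"{n:>16,} UUIDs -> p ~= {collision_probability(n):.3e}")
```

The estimate only reaches roughly 40% once you have generated about 2^61 UUIDs.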
Admin
Right, but they should be prepending the filename. Or, more to the point, splitting the filename and the extension, and inserting the hashes into the middle. E.g., foo.txt becomes foo.QQF523ABC43.txt.
Admin
I actually ran into a GUID collision in a production DB earlier this year. Brought the whole system down too. Well, not so much crashing as in seemingly unexplainable but very wrong behavior regarding monetary transactions of significant sizes. That one was not fun to track down. I mean, it's just not something you'd ever think about until it happens.
Admin
I'd have gone with saving the file with just a (rendered) UUID as the name, keeping the mapping from that to the "real" name in a database. Like that, nobody can send a file in and have it arrive somewhere that they have any control over at all (unless the server-side code subsequently decides to allow it).
And I'd have also used a multi-layer directory structure Just In Case™ because you really don't want to put thousands of files in the same directory on most filesystems deployed out there.
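A rough sketch of that layout, in Python for brevity. The in-memory dict stands in for the database table, and `storage_path`, `register_upload`, and the `/srv/uploads` root are all invented names for illustration.

```python
import uuid
from pathlib import PurePosixPath

# Stand-in for a database table mapping stored IDs to the uploader's filename.
name_map: dict[uuid.UUID, str] = {}


def storage_path(root: str, file_id: uuid.UUID, depth: int = 2) -> PurePosixPath:
    """Shard by the leading hex pairs of the ID: root/ab/cd/abcd...."""
    hex_id = file_id.hex
    shards = [hex_id[i * 2:(i + 1) * 2] for i in range(depth)]
    return PurePosixPath(root, *shards, hex_id)


def register_upload(root: str, original_name: str) -> PurePosixPath:
    """Pick a fresh ID, remember the real name, return where to store the bytes."""
    file_id = uuid.uuid4()
    name_map[file_id] = original_name  # the untrusted name never touches the disk path
    return storage_path(root, file_id)


print(register_upload("/srv/uploads", "../../etc/passwd"))
```

With two levels of two hex characters each, files spread over up to 65,536 directories, and nothing the uploader typed ends up in the path.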
Admin
That'd be my general leaning, too, though I do see a rationale for including the input filename (but also risks, especially if someone is sending you a carefully malformed filename).
Admin
I was involved with a remediation where around 125000 files were dumped into a single directory on a Windows server. It was seriously time-consuming.
Please folks, no matter how much you decide to screw up filenames, don't dump all those files in one directory.
Admin
I don't think appending is as straightforward as you think. My first instinct follows the code: the ID should be prepended to the filename.
Consider, what do you do if a file ends in .tar.gz? Both of those extensions are significant to the file type. So maybe you decide to put the ID before the first dot in the name. But now consider, there are also a lot of files that use dots outside of the extension! So now "myapp-1.0.zip" would become "myapp-1.iasdf823.0.zip". You just can't win if you try to insert an ID between a file name and extension. Maybe hardcode some known extension values, but that complexity is just not worth it. Just prepend it.
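The three options being argued about can be compared side by side. This is a Python sketch with made-up helper names, not the article's PHP; the underscore separator in `prepend` is an arbitrary choice.

```python
import os.path


def insert_before_last_ext(name: str, token: str) -> str:
    """foo.txt -> foo.TOKEN.txt, treating only the final suffix as the extension."""
    stem, ext = os.path.splitext(name)
    return f"{stem}.{token}{ext}"


def insert_before_first_dot(name: str, token: str) -> str:
    """Treat everything after the first dot as the extension."""
    head, sep, tail = name.partition(".")
    return f"{head}.{token}{sep}{tail}"


def prepend(name: str, token: str) -> str:
    """Leave the original name untouched and put the token in front."""
    return f"{token}_{name}"


for sample in ("foo.txt", "backup.tar.gz", "myapp-1.0.zip", "archive.2021.tar.lz4"):
    print(sample)
    print("  last ext :", insert_before_last_ext(sample, "ID"))
    print("  first dot:", insert_before_first_dot(sample, "ID"))
    print("  prepend  :", prepend(sample, "ID"))
```

Splitting on the last extension breaks up `.tar.gz`, splitting on the first dot breaks up version numbers, and only prepending leaves every sample name intact.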
Admin
Erm... in maketoken(), the $scratch variable is only modified inside the "if" statement-- as written, this code will only append the letter "Q" based on random number magic and will not append any other characters.
Admin
Yes, obviously.... my wtf was the use of the opposing terms in the article...
Admin
Also, tempnam() is a thing. Generates a file name guaranteed to be unique.
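The same idea exists in most standard libraries. As a sketch, Python's `tempfile.mkstemp` creates the file atomically, so the name is reserved the moment it is returned. The wrapper name and the `upload_` prefix here are hypothetical.

```python
import os
import tempfile


def reserve_unique_file(directory: str, prefix: str = "upload_") -> str:
    """Create an empty, uniquely named file in `directory` and return its path."""
    fd, path = tempfile.mkstemp(prefix=prefix, dir=directory)
    os.close(fd)  # the caller reopens the path when it is ready to write
    return path


with tempfile.TemporaryDirectory() as scratch:
    first = reserve_unique_file(scratch)
    second = reserve_unique_file(scratch)
    print(os.path.basename(first), os.path.basename(second))
```

Because the file is actually created, two concurrent callers cannot be handed the same name, which a "generate a name, then check it is free" approach cannot promise.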
Admin
QQ moar, noob!
Admin
Come on, split archive.2021.tar.lz4 into a filename and extension
Admin
At least their UUID approach was more than just "use a timestamp" - imagine the fun that one hour of DST "rewinding" would create (cf. "Falsehoods programmers believe about time").
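A small sketch of that "rewinding" problem, assuming filenames built from local wall-clock time. The example uses the US Eastern fall-back on 2021-11-07 with hard-coded fixed offsets, so it does not depend on a timezone database; `local_stamp` is a made-up helper.

```python
from datetime import datetime, timedelta, timezone

EDT = timezone(timedelta(hours=-4), "EDT")  # offset before clocks go back
EST = timezone(timedelta(hours=-5), "EST")  # offset after clocks go back


def local_stamp(instant: datetime, tz: timezone) -> str:
    """Format an instant as a local wall-clock timestamp for use in a filename."""
    return instant.astimezone(tz).strftime("%Y%m%d%H%M%S")


before = datetime(2021, 11, 7, 5, 30, tzinfo=timezone.utc)  # 01:30 local, still EDT
after = before + timedelta(hours=1)                         # 01:30 local again, now EST

print(local_stamp(before, EDT), local_stamp(after, EST))
print(local_stamp(before, timezone.utc), local_stamp(after, timezone.utc))
```

The two local stamps are identical even though the uploads are an hour apart, while the UTC stamps differ. UTC avoids this particular trap but still collides when two uploads land in the same second.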
Admin
Meh. On Windows (Server) you can work fairly easily with a directory containing a few million files… (the file system will hate you)
Admin
Since the 'Q' token generation function would be executed before the time hashing is computed, it seems like the selection of the 3 letters is meaningless except to attempt to introduce some random time into the later computed time segment of the filename.
It's all pointless though, since, well, UUIDs and all (as pointed out already).