• Oliver Jones (google)

    Wow. Haven't seen the fat-directory problem since I had to troubleshoot a NNTP newsfeed on a SparcStation running SunOS 3 in 1992.

  • Björn Tantau (unregistered)

    That should teach them to review code before letting it lose on a production server.

  • Red Fromage (unregistered)

    An additional WTF is the line replacing dots in the IP address, which makes 215.21.52.15 equals to 215.215.2.15. How is that supposed to be unique?

  • Greg (unregistered)
    <quote> Dimitry claims there are even more WTFs to be found if you look closely! - EM </quote>

    For one thing, while ($file = readdir($handle))
    should be while( ($file = readdir($handle)) !== false )

  • Syntaxerror (unregistered)

    One of the WTFs is, that the script is obviously running on a Windows server? Why else would you check for . .. and Thumbs.db? . and .. are excuseable, but not Thumbs.db

    While-loop could easily be rewritten into a for-loop, although it would be smarter to save the cookies with the cstring-result as the filename and try to open THAT SPECIFIC FILE

  • Church (unregistered)

    "But Dimitry wasn't to ask questions, only test and deploy." It doesn't seem like the test environmnt was aligned to production if testing didn't reproduce this. Even if it didn't lock things up, testing should have shown a huge increase in disk activity.

  • Black Mantha (unregistered) in reply to Church

    @Churck: Well, the disk activity is quadratic in user numbers, so unless they did heavy load testing it wouldn't have been bad.

  • v (unregistered) in reply to Greg

    ... while( ($file = readdir($handle)) !== false ) ... Why, are you worried that there may be a file with an empty string as a filename?

  • Church (unregistered) in reply to Black Mantha

    Sure, but if you aren't going to look at the code and you know your site gets millions of hits, why wouldn't you do heavy load testing?

  • Ashley Sheridan (google) in reply to v

    The check against false using the !== ensures that it's actually returning false and not something that equates to false with PHPs type switching. A file could be named 0 which would equate to false when converted to a boolean. Depending on how the natural sorting order was on the system this was being run, this could crap out before actually checking any file.

  • Russell Judge (google)

    Apparently the concept of a database is completely alien to the agency.

  • Willaien (unregistered) in reply to Church

    @Church: Do you do your load testing from millions of different IPs to make sure that the server isn't doing something stupid with the IP address? I know I typically don't.

  • (nodebb) in reply to Syntaxerror

    it would be smarter to save the cookies with the cstring-result as the filename and try to open THAT SPECIFIC FILE

    It wouldn't have helped hugely much; most filesystems use linear lists for directory structures, and start to really creak once you get a hundred thousand files in the same directory. (There are some exceptions, but they're just that: exceptions.) In particular, the cost of just opening one file can be linear in the size of the directory.

    Good practice is to have no more than 1000 files per directory.

  • Chris Werner (google) in reply to Syntaxerror

    "this script is obviously on a windows server" Unless it's on a samba shared directory or otherwise pulls from a common directory...

  • (nodebb)

    I know I should be outraged at the ignorance of the whole scheme but I'm really concerned about the three loops that iterate over the list of files. The intermediate full-length list could have been avoided easily by combining the loops. Funny thing how they spent extra predicates on even "Thumbs.db" even though they later match for explicit ID. Now I have mental indigestion from copypasta.

  • (nodebb)

    Fast fix:

    function _getcookie($config) {
            return $cookies;
    }
    
  • (nodebb)

    Rats, ruined the joke by page flipping. Try again...

    function _getcookie($config) {
        return array();
    }
    
  • Church (unregistered) in reply to Willaien

    @Willaien No, but I do find that copying the production environment, then patching changes does get me a cookies directory with lots and lots of files. A "clean" test environment doesn't tell me about how something will work in production as mush as a test environment that was a real duplicate of production.

  • Joseph Osako (google)

    So TRWTF is developing live? I would say that this should be obvious, but seeing what one of the 'genius' ShillFarce consultants did at my last commercial IT job (which if I have any say in it is going to remain the last time I've dipped into the pool of evil that is commercial IT), that's clearly not the case.

    <rant> Sorry, but I still get angry about that jackass. That anyone - even an insultant infected with SFDC's particular brand of brainwashing - can be so ignorant of the software he is supposedly an expert in, so blithely dismissive of SOP and basic programming practices, so utterly contemptuous of the client's needs and procedures - can create such a disaster and then calmly walk away with a fat check, is an indictment of the abysmal standards in the industry. </rant>
  • Hanzo'd but posting anyway (unregistered) in reply to Black Mantha

    Which would fall under "It doesn't seem like the test environmnt was aligned to production if testing didn't reproduce this." If your prod. side gets "millions of unique visitors" (I assume per day), test environment should run workload tests with at least 30-50 concurrent requests. And that's lowballing as it assumes 1 million visitors in 8 hours, and targets mean average instead of the more statistically sane mode or median values (whichever of the 2 is highest).

  • Plloi (unregistered) in reply to Syntaxerror

    It's code from an ad agency, likely generic on purpose to run on either/or. Equally likely that other client had this same problem.

  • TheCommonGood (unregistered)

    darn thing is an AD -- send it back as destructive, and let the kindergarteners fix it themselves.

  • isthisunique (unregistered)

    This is not unique but a poor man's fingerprint:

          $ip = preg_replace("/\./", "", $_SERVER['REMOTE_ADDR']);
          $agent = preg_replace("/\s/", "", $_SERVER['HTTP_USER_AGENT']);
          $cstring = base64_encode($ip . $agent);
    

    The IP address has to be the worst.

          if ($file!="." && $file!=".." && $file!="Thumbs.db") {
    

    You really shouldn't be doing things with windows but this is a script to be deployed so it needs portability. However any sane person would ensure the files follow a pattern and match that instead.

    Wait a minute, they later do that instead of merely going straight to the file? And now it's starts with so even more fuzzy. The file name is base64 of those values? What about /?

    With so much fuzzy it has multiple values so it has to apply them all to get all cookies? With the IP overlap and the most popular useragent people's cookies as read by the script are going to get very large.

    It looks like to me as though you almost have "cookie recovery" perhaps intended for those who deleted theirs by importing from anyone who looks vaguely similar in some often irrelevant way.

    What are they doing with $key in this iterators? Why is that $i there instead of the push operator or array_push? Wouldn't you use explode first rather than starts with? Wouldn't you just put all the cookies in one file? Poor mans herpy derpy parallel request support/IO minimisation? Why bother unsetting $cookietemp?

    No wait, the cookie key appear to be part of the file name. My god. Check that code.

    On the bright side, for PHP, at least it's almost indented properly.

  • isthisunique (unregistered)

    v (unregistered) in reply to Greg 2016-04-12 08:54 ... while( ($file = readdir($handle)) !== false ) ... Why, are you worried that there may be a file with an empty string as a filename?

    It's good practice in general. Why run the risk of side effects? Better to specify exactly what you mean when you can and then you don't need to worry about exceptions. My immediate concern is that there is no check for directories although I suppose all the stats going out wouldn't be ideal.

  • (nodebb)

    Ugh, the problem with this comment system is that I have no idea whether anyone replied to my comment asking how to do formatting in this comment system.

  • Syntaxerror (unregistered) in reply to dkf

    You could always use the first 2 symbols of $cstring as the name of a sub-directory

  • Syntaxerror (unregistered) in reply to Chris Werner

    I sure hope not. Nobody should need access to this Directory via a SMB-share.

  • Anonymous (unregistered)

    They put code which was written by some ad company and which they hadn't even reviewed or trivially tested on the same server as their main site? There's the WTF.

    Granted, when your ad server goes down you might have a few other unhappy advertisers, but at least your site's still up.

  • Dave Bacher (github) in reply to TheCommonGood

    Presumably, if you're working for an advertising supported company, refusing to include whatever script serves the ads would not be a popular choice with management. Just saying.

Leave a comment on “Tough Cookies”

Log In or post as a guest

Replying to comment #:

« Return to Article