• Vera (unregistered)

    Looks like Java. In which case, there is a dedicated URL object with some useful functions for getting those parts by name. I suppose this is on the level of rolling your own date-handling code.
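
    A minimal sketch of what that looks like, assuming the code in the article really is Java. It uses java.net.URI (java.net.URL offers similar getters); the class name UrlParts, the helper hostOf, and the example URL are made up for illustration.

```java
import java.net.URI;

class UrlParts {
    // Returns the host component of a URL string, or null if the URL has no host.
    static String hostOf(String url) {
        return URI.create(url).getHost();
    }

    public static void main(String[] args) {
        URI uri = URI.create("https://www.example.co.uk:8443/some/path?x=1");
        System.out.println(uri.getScheme()); // https
        System.out.println(uri.getHost());   // www.example.co.uk
        System.out.println(uri.getPort());   // 8443
        System.out.println(uri.getPath());   // /some/path
        System.out.println(uri.getQuery());  // x=1
    }
}
```

    This only gets you the host by name. Working out which part of that host is the "domain" is a separate problem.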

  • (nodebb)

    Why are they passing "\." to the split function? Does it treat any string as regex?

  • (nodebb)

    Why they chose to use a loop that only executes once, I'll never understand.

    And more to the point, why they convert (almost) all British domains into co.uk or ac.uk or gov.uk or whatever. (www.nhs.uk and www.parliament.uk and a few recent things would be OK, but not the rest.)

  • (nodebb) in reply to Steve_The_Cynic

    Possibly, because the code-monkey thought "Not my problem."

  • (nodebb) in reply to Domin Abbus

    You're assuming that the code monkey thought.

  • supermagle (unregistered)

    Note that finding the authoritative domain part of a domain is not a trivial task. The best way is to use a public resource like the Mozilla Public Suffix List: https://wiki.mozilla.org/Public_Suffix_List

  • NoLand (unregistered)

    My guess: as the all-parts-but-the-first approach didn't work out as the intended TLD+1 result, they went for TLD only. And the break just did the job. Another high-performance hero was born… ("See, they just added a single word and fixed the entire code base!")

  • dkf (unregistered) in reply to Steve_The_Cynic

    It's really difficult to know how many pieces to chop off the front of a hostname to get the top-level domain name, so difficult that it's done by reference to a set of rules that changes in a semi-regular fashion and that you need to re-download periodically. You really can't guess it correctly any other way; there are too many complicated exceptions. (Without that list, you simply can't evaluate the safety constraints for cookies correctly.)

    Those rules are written in Javascript, of course. People whose code runs in a web-browser may not realise they exist.
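
    To make the "set of rules" idea concrete, here is a rough sketch of suffix-list lookup. Everything in it is an assumption for illustration: the class RegistrableDomain, the method registrableDomain, and above all SAMPLE_SUFFIXES, which is a tiny hand-picked stand-in and not the real list. It also leaves out the wildcard and exception rules that the real list contains.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;

class RegistrableDomain {
    // Hypothetical sample rules only; the real list is far longer and changes over time.
    private static final Set<String> SAMPLE_SUFFIXES =
            Set.of("com", "uk", "co.uk", "ac.uk", "gov.uk");

    // Returns the matched suffix plus one more label, or null if the host is itself a suffix.
    static String registrableDomain(String host) {
        String[] labels = host.toLowerCase(Locale.ROOT).split("\\.");
        // Try the longest candidate suffix first so that the longest matching rule wins.
        for (int i = 0; i < labels.length; i++) {
            String candidate = String.join(".", Arrays.copyOfRange(labels, i, labels.length));
            if (SAMPLE_SUFFIXES.contains(candidate)) {
                if (i == 0) {
                    return null; // nothing in front of the suffix
                }
                return labels[i - 1] + "." + candidate;
            }
        }
        // No rule matched: fall back to treating the last label as the suffix.
        if (labels.length < 2) {
            return null;
        }
        return labels[labels.length - 2] + "." + labels[labels.length - 1];
    }

    public static void main(String[] args) {
        System.out.println(registrableDomain("www.example.co.uk"));  // example.co.uk
        System.out.println(registrableDomain("www.parliament.uk"));  // parliament.uk
        System.out.println(registrableDomain("ftp.example.com"));    // example.com
    }
}
```

    The point is that the number of labels to keep depends on which rule matches, which is why a fixed "take the last N parts" approach can't be right for every host.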

  • Hal (unregistered) in reply to Vera

    It is not as bad as doing your own date parsing, in that it is not so complex that a decent developer with some careful attention to detail can't get it right. However, it's certainly true that URL parsing is complex enough, with optional components, N subdomains, rules about implied schemes, and alternate host representations, that doing it correctly is a lot of work and you almost certainly shouldn't.

    But let's be real here: 9/10 devs out there are going to be faced with a URL parsing problem and are going to just say 'regex derp derp'.

  • (nodebb) in reply to Mr. TA

    Assuming this is Java, yes. String.split() treats the parameter as a regex, so you always have to remember to escape special characters like '.'. That particular WTF is on Java, not whoever wrote this code.
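
    A quick demonstration, again assuming Java. The class and method names are made up for illustration. In Java source the escaped dot is written "\\." because the backslash itself has to be escaped in the string literal; Pattern.quote is an alternative that avoids hand-escaping.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // Escaped dot: the regex matches a literal '.' character.
    static String[] splitEscapedDot(String host) {
        return host.split("\\.");
    }

    // Bare dot: the regex matches ANY character, so every character is a delimiter.
    static String[] splitBareDot(String host) {
        return host.split(".");
    }

    // Pattern.quote builds a regex that matches the given text literally.
    static String[] splitQuotedDot(String host) {
        return host.split(Pattern.quote("."));
    }

    public static void main(String[] args) {
        String host = "www.example.com";
        System.out.println(Arrays.toString(splitEscapedDot(host))); // [www, example, com]
        System.out.println(splitBareDot(host).length);              // 0
        System.out.println(Arrays.toString(splitQuotedDot(host)));  // [www, example, com]
    }
}
```

    The bare-dot version returns an empty array because every piece between delimiters is an empty string, and split() drops trailing empty strings.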

  • TheCPUWizard (unregistered)

    My sites have many sub-domains, one per protocol: ftp.example.com, www.example.com, etc.

  • (nodebb) in reply to Steve_The_Cynic

    Why are those two very very clearly governmental organisations not in the .gov.uk part?!

  • (nodebb)

    Because any system involving human judgment must NOT be completely systematic; that's why.

    Devs get all twitchy whenever a rule has exceptions. The rest of humanity prefers a world of mostly exceptions with a few (and partly contradictory) organizing rules loosely followed sometimes. It has been, is, and will be ever thus.

  • (nodebb) in reply to WTFGuy

    Devs get all twitchy whenever a rule has exceptions.

    In my experience, it's more when the rules to go by are too vague or otherwise poorly defined. Exceptions aren't a problem when the specification is clear.

  • Ban Sincler (unregistered)

    Hello, everyone. I was wondering if there are any game developers here? I need some help with code and graphic design.


Leave a comment on “Not My Domain”
