• Juans (unregistered) in reply to Chris

    Welcome to Spain! 4 official languages and two more big dialects, 5 very different ways to pronounce the biggest Spanish language (castellano), not forgetting about Latin America... J.

  • (cs) in reply to Gordon

    Forgive me, but I thought the Swiss and Belgians spoke chocolate?!!

  • CrankyPants (unregistered) in reply to Guybrush
    Guybrush:
    In Germany they speak deutsch, shortened to "deu" in the table.
    Thank you, Mr. Obvious.
  • NeilMc (unregistered) in reply to Okayyy, hes in school ffs.
    Okayyy:
    Bah. Europa. I prefer scandinavia. ;) Btw, there is a very small and odd language apart from nederlands. ;) dunno what its called in english though. Flamländska in swedish. Cool language.

    I think in English, we call that language 'Flemish'.

  • szukuro (unregistered)

    Don't mean to start a flame, but sure looks like an ASP.NET coder with heavy PHP background. ASP.NET provides easy-to-use localization features, with no need for code duplication and/or url rewriting. Then again it's not the technology/language, it's how you use it.

  • John Cowan (unregistered)

    In fact, these codes are all standard ISO 639-2 codes for the various languages. ISO 639-2 provides a 3-letter code for almost 500 of the world's languages. (ISO 639-3 isn't quite a standard yet, but when it is, it will provide upwardly compatible 3-letter codes for all 7000+ languages.)

    The reason for the "dut/nld" discrepancy is that 22 languages have two different 3-letter codes in 639-2 for backward compatibility. Unfortunately, the official list at http://www.loc.gov/standards/iso639-2/php/English_list.php doesn't explain this very well, and lists the backward-compatibility code (which it calls the "bibliographic code") first in its tables.

    As for the 2-letter codes, they belong to the closely related ISO 639-1; the table above includes all of them. Every language (almost 200) with a 2-letter code also has a 3-letter code.
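    The two-code situation described above can be sketched as a small normalization table. This is only an illustration: the class and method names are hypothetical, and just three of the paired codes are listed.

```csharp
using System;
using System.Collections.Generic;

public static class Iso639
{
    // Bibliographic (B) code -> terminology (T) code.
    // Only three of the paired ISO 639-2 codes are shown here.
    private static readonly Dictionary<string, string> BibliographicToTerminology =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "dut", "nld" }, // Dutch
            { "ger", "deu" }, // German
            { "fre", "fra" }, // French
        };

    // Returns the terminology code for a listed bibliographic code,
    // or the lower-cased input unchanged when no mapping is listed.
    public static string Normalize(string code)
    {
        if (code == null) throw new ArgumentNullException("code");
        string mapped;
        return BibliographicToTerminology.TryGetValue(code, out mapped)
            ? mapped
            : code.ToLowerInvariant();
    }
}
```

    With this sketch, Iso639.Normalize("dut") yields "nld", while a code without a listed pair, such as "eng", passes through unchanged.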

  • Grant (unregistered)

    The language is C#, and the syntax String[index] is basically syntactic sugar for the Property String.Chars(index) (http://msdn2.microsoft.com/en-us/library/system.string.chars.aspx), which requires a METHOD CALL.

    So to complete the test of all three characters requires SIX (6) method calls, not none as some people seem to be implying.

    Heck, (url.IndexOf(str, 1) == 1) might be faster. The reasoning? In Java at least, the source code of String.indexOf() shows that it tests against the char[]'s that back the Strings being tested. Assuming the implementation of String.IndexOf() in the .NET Class Library is similar, it may be more efficient to let the underlying class perform the char-to-char comparisons, rather than extract each char with a method call and perform the comparison yourself.

  • MadMike (unregistered) in reply to mbvlist

    In Switzerland we have 4 official languages (http://en.wikipedia.org/wiki/Switzerland), which are German, French, Italian and Romansch.

    Since English is so prevalent in the world, we are considering teaching our children English as the second language. So for a mere 7 million population, we are proud of our multi-culturated-ness (sp?) ;)

  • Taggy (unregistered)

    Java and C# developers don't know the meaning of optimal.

  • rien (unregistered) in reply to snoofle
    snoofle:
    Forgive me, but I thought the Swiss and Belgians spoke chocolate?!!

    in fact, in belgium we speak chocolate AND drink beer, while in swiss they speak chocolate AND eat cheese: that's two different dialects...

  • anon (unregistered)

    @Derrick Pallas: The article doesn't display properly in Firefox... although it is fine in IE.

  • JD (unregistered) in reply to snoofle
    snoofle:
    Forgive me, but I thought the Swiss and Belgians spoke chocolate?!!
    I thought Belgians spoke waffles
  • Smith (unregistered) in reply to tchize
    tchize:
    rien:
    worst, they may speak different languages within the SAME country !
    Welcome to Belgium. Here you can find people speaking dutch, others speaking french, another part speaking german. Not to mention unofficial local dialects that may sometimes be difficult to understand, the numerous different cultures in big cities and all languages spoken by various european governments representatives when they are around :)
    Gawd!! All in 30K sq km
  • (cs) in reply to rien
    rien:
    Guybrush:
    In Germany they speak deutsch, shortened to "deu" in the table.

    you mean, in deutschland ? they speak german of course ! and when not speaking, they are driving big fast cars...

    Derrick Pallas:
    Apparently, they all speak different languages too.

    worst, they may speak different languages within the SAME country !

    In Germany, apart from German we have Frisian, Danish and at least one more separate language (which I can't remember right now) spoken by minorities. That does not account for all the dialects of German.

    German is also a minority language spoken in Switzerland, France (Alsace-Lorraine) and Italy (Southern Tyrol). The Austrians speak German, too, of course.

    Other noteworthy minority languages are Basque (Spain and France), Catalan (Spain), Gaelic (both Irish and Scottish "versions") (UK), Welsh (UK).

    Switzerland is a case for itself: the official languages there include German (Swiss German dialect), Italian, French and one or two more languages spoken by very small minorities (can't remember the names right now).

    In Belgium they speak French in the South and Flemish (essentially Dutch - don't flame me Flemish readers) in the North.

    The list is by no means complete.

  • Harrow (unregistered)

    Without knowing the exact requirements and circumstances, it is impossible to say that there is anything wrong with the SOD. For example, why use a three-letter language designator where no software would require more than a single character? What is the result of this function used for? Does the result end up as part of a dynamic URL?

    It's always difficult to derive the module requirements or the programmer's intention solely from reading the code, but the following seems to do the same thing as the SOD, and might be faster, especially if more countries were added.

    I have assumed that the function must return a three-letter string representing the ISO language, and I have added the requirement that each page preparer memorize and use a single letter representing his target language. This means the URLs would be "http://www.company.tld/e/products" for English and "http://www.company.tld/g/products" for Deulish. Of course, if the returned value is used to construct further URLs, then the code would be different.

      /// <summary>
      /// A list of keyed supported folder names
      /// </summary>
      public static string[] VALID_FOLDERS = new string[]
      {
        "eeng", // English
        "gdeu", // German
        "ffra", // French
        "jjpn", // Japanese
        "kkor", // Korean
        "ddan", // Danish
        "ufin", // Finnish (Ugric)
        "vswe", // Swedish (Svensk)
        "nnor", // Norwegian
        "hdut", // Dutch (Hollander)
        "sspa"  // Spanish
      };
    
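      /// <summary>
      /// Returns the ISO language from URLs of the form "/lang/foo/bar.aspx"
      /// where lang is a single letter designating the language of the page.
      /// </summary>
      /// <param name="url">URL of the form "/lang/foo/bar.aspx"</param>
      /// <returns>The three-letter ISO code, or an empty string if no key matches</returns>
      public static string IsoFromUrl (string url)
      {
        // Guard against null or too-short input before indexing url[1].
        if (url == null || url.Length < 2)
        {
          return string.Empty;
        }
        foreach (string str in VALID_FOLDERS)
        {
          // The first character of each entry is the single-letter key.
          if (str[0] == url[1])
          {
            return str.Substring(1,3);
          }
        }
        return string.Empty;
      }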
    

    Since Substring may be slower than two subscripted comparisons, only a performance comparison will reveal which version is optimal.
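    A rough way to run such a comparison is sketched below with System.Diagnostics.Stopwatch. The class name, the shortened tables, and the ByChars variant are assumptions made for illustration; ByChars is not the article's original code, and timings from a loop like this are only indicative.

```csharp
using System;
using System.Diagnostics;

public static class LookupBenchmark
{
    // Keyed table in the style of the comment above: key letter + ISO code.
    private static readonly string[] KeyedFolders = { "eeng", "gdeu", "ffra" };

    // Plain three-letter table for the character-by-character variant.
    private static readonly string[] PlainFolders = { "eng", "deu", "fra" };

    // Single-letter key lookup followed by Substring.
    public static string BySubstring(string url)
    {
        if (url == null || url.Length < 2) return string.Empty;
        foreach (string str in KeyedFolders)
        {
            if (str[0] == url[1]) return str.Substring(1, 3);
        }
        return string.Empty;
    }

    // Hypothetical three-character comparison; returns the table entry
    // itself, so no new string is allocated.
    public static string ByChars(string url)
    {
        if (url == null || url.Length < 4) return string.Empty;
        foreach (string str in PlainFolders)
        {
            if (str[0] == url[1] && str[1] == url[2] && str[2] == url[3]) return str;
        }
        return string.Empty;
    }

    // Times `iterations` calls of `lookup` and returns elapsed milliseconds.
    public static long Time(Func<string, string> lookup, string url, int iterations)
    {
        lookup(url); // warm-up call so JIT compilation is not measured
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) lookup(url);
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}
```

    For example, compare LookupBenchmark.Time(LookupBenchmark.BySubstring, "/g/products", 10000000) with LookupBenchmark.Time(LookupBenchmark.ByChars, "/deu/products", 10000000) on the target machine.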

    -Harrow.

  • ratboy666 (unregistered) in reply to JL

    Assuming the table is sorted, represent as an INT (pack the characters).

    Then, use the following WTF code (C-ish, not checked):

    #define CV(c0, c1, c2) (((c0) << 16) | ((c1) << 8) | (c2))

    char *verify(char *s)
    {
      int v = CV(s[0], s[1], s[2]);
      if (v < CV('.', '.', '.')) {
        if (v < ...) {
          ...
          return "...";
          ...
        } else { /* ge */
          ...
        }
      } else { /* ge */
        ...
      }
    }

    etc. Nested binary search. Unrolled. Given that there are 11 entries in the table, worst case:

    4 if statements (compare/branch) 1 return.

    So, the function executes in 9 machine instructions (worst case).

    As a "double plus" bonus: the full implementation would be completely ugly WTF code -- including mis-using types, horrible looking nesting. Implicit sorting. Hard to maintain (can't add a new country without rewriting the WHOLE function). 4 levels of 'if' nesting.

    I would put this into a program, just for laughs.
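    For comparison, the packed-integer idea can be sketched in C# without the unrolling. This version swaps the hand-written nested ifs for Array.BinarySearch over a sorted table; the class and method names are hypothetical, and the packing assumes ASCII characters.

```csharp
using System;

public static class PackedLookup
{
    // Packs three characters into one int: c0 in bits 16-23, c1 in bits 8-15,
    // c2 in bits 0-7. Assumes ASCII input; wider characters would overlap.
    public static int Pack(char c0, char c1, char c2)
    {
        return (c0 << 16) | (c1 << 8) | c2;
    }

    // The eleven codes from the table, sorted so their packed values ascend.
    private static readonly string[] Codes =
    {
        "dan", "deu", "dut", "eng", "fin", "fra", "jpn", "kor", "nor", "spa", "swe"
    };

    // Packed form of Codes, built once, in the same order.
    private static readonly int[] Packed = BuildPacked();

    private static int[] BuildPacked()
    {
        int[] packed = new int[Codes.Length];
        for (int i = 0; i < Codes.Length; i++)
        {
            packed[i] = Pack(Codes[i][0], Codes[i][1], Codes[i][2]);
        }
        return packed;
    }

    // Returns the matching code for the first three characters of s,
    // or an empty string when they are not in the table.
    public static string Verify(string s)
    {
        if (s == null || s.Length < 3) return string.Empty;
        int index = Array.BinarySearch(Packed, Pack(s[0], s[1], s[2]));
        return index >= 0 ? Codes[index] : string.Empty;
    }
}
```

    The table still has to stay sorted when a code is added, but at least the function itself does not need rewriting.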

  • (cs) in reply to hardcorewizard
    hardcorewizard:
    The real WTF is that the article isn't escaped properly, so the C# comments are parsed by my browser, and break the layout of the page.
    Yeah, my layout is broken, too.

    Derrick, could you possibly make it not-broken? That would be greeeeeat, thanks. [takes sip of coffee]

    I'm using FireFox 2, by the way.

  • (cs) in reply to PS
    PS:
    What is Europe? And why don't they speak English like the rest of us?
    [image]
  • (cs) in reply to modelnine

    I guess he didn't know about Array.Exists?

    string lang = url.Substring(1,3);
    
    // Array.Exists takes a predicate, not a value to search for.
    if (Array.Exists(VALID_FOLDERS, delegate(string s) { return s == lang; }))
        return lang;
    else
        return String.Empty;
    
  • (cs) in reply to Frederik
    Frederik:
    PS:
    Jens:
    PS:
    What is Europe? And why don't they speak English like the rest of us?

    Btw. I'm European and don't even own a car... Not even a small one...

    So what kind of bicycle do you own?

    Generally depends on what kind of bike the person who forgot to use his lock owned. Works pretty well.

    You must be from Amsterdam.

  • (cs) in reply to rien
    rien:
    snoofle:
    Forgive me, but I thought the Swiss and Belgians spoke chocolate?!!

    in fact, in belgium we speak chocolate AND drink beer, while in swiss they speak chocolate AND eat cheese: that's two different dialects...

    Hmmm... chocolate-cheese, as in chocolate-cheese-cake I can see. Chocolate beer is a new one on me, although...

  • (cs) in reply to JD
    JD:
    snoofle:
    Forgive me, but I thought the Swiss and Belgians spoke chocolate?!!
    I thought Belgians spoke waffles
    Sooo... chocolate-cheese-beer waffles?!! Imagine the possibilities?
  • Alan (unregistered) in reply to snap2grid
    snap2grid:
    It includes an insignificant island called England - the place English comes from
    England's an island now??? Scotland and Wales must have become detached somehow...

    The island is called Great Britain, as it is the largest island in the British Isles. Britain is used as a short name for the United Kingdom of Great Britain and Northern Ireland, a supernational sovereign state including the countries of England, Scotland, Wales and Northern Ireland.

  • (cs)

    Looks like someone was still thinking in C while programming C#. Anyway, even C uses short-circuit boolean operators, so there is no need to nest the "if"s. Anyway, this function has "premature optimization" written all over it.
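    A minimal sketch of the flattened form, assuming the check is "the three characters after the leading slash spell the code" (the class and method names are hypothetical):

```csharp
public static class ShortCircuit
{
    // Returns true when the three characters after the leading '/' spell `code`.
    // && stops at the first false operand, so the later indexers are never
    // evaluated once a mismatch (or a too-short string) is found.
    public static bool MatchesAt1(string url, string code)
    {
        return url != null
            && code != null
            && code.Length == 3
            && url.Length >= 4
            && url[1] == code[0]
            && url[2] == code[1]
            && url[3] == code[2];
    }
}
```

    The evaluation order is the same as with nested ifs; only the indentation goes away.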

  • lingist (unregistered) in reply to Rabbi

    I thought they spoke European.

  • Mixu Lauronen (unregistered) in reply to Gordon
    Gordon:

    'Europe' Swedish? You mean Finnish

    Tuota... Europe on ihan varmasti ruotsalainen. Och samma på svenska.

  • JL (unregistered) in reply to Grant
    Grant:
    The language is C#, and the syntax String[index] is basically syntactic sugar for the Property String.Chars(index) (http://msdn2.microsoft.com/en-us/library/system.string.chars.aspx), which requires a METHOD CALL.

    So to complete the test of all three characters requires SIX (6) method calls, not none as some people seem to be implying.

    Heck, (url.IndexOf(str, 1) == 1) might be faster. The reasoning? In Java at least, the source code of String.indexOf() shows that it tests against the char[]'s that back the Strings being tested. Assuming the implementation of String.IndexOf() in the .NET Class Library is similar, it may be more efficient to let the underlying class perform the char-to-char comparisons, rather than extract each char with a method call and perform the comparison yourself.

    This is true. One would hope that the JIT compiler would optimize the Chars function call away, but you're right that it may not. In this case, it would be fastest to store the country names as arrays of characters rather than as String objects; then you could extract the three relevant URL characters outside of the loop. This would mean three total function calls per invocation rather than one function call per loop (11 per invocation), as you might have when using IndexOf().

    (Also, looking at Reflector, it looks like IndexOf() does not perform a simple binary comparison in .NET... The results depend on the current culture and comparison options, and it performs several nested function calls and parameter validations to handle this correctly. The Chars property, on the other hand, calls a native function. So calling Chars twice per loop may still be faster than calling IndexOf once per loop.)
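    If a single library call is wanted without the culture-sensitive behaviour, String.CompareOrdinal compares raw character values over a given range. The sketch below assumes the code starts at index 1 of the URL; the class and method names are hypothetical, and whether this beats the indexer version has not been measured here.

```csharp
using System;

public static class OrdinalMatch
{
    // True when `code` occurs in `url` starting at index 1, compared by raw
    // character values (no culture rules are applied).
    public static bool HasCodeAt1(string url, string code)
    {
        if (url == null || string.IsNullOrEmpty(code)) return false;
        if (url.Length < 1 + code.Length) return false;
        return string.CompareOrdinal(url, 1, code, 0, code.Length) == 0;
    }
}
```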

  • Yo (unregistered) in reply to Rabbi
    Rabbi:
    PS:
    What is Europe? And why don't they speak English like the rest of us?

    Europe is the little place on the far side of the pond where most of your ancestors came from (not counting the few native Americans that were allowed to survive and interbreed). It includes an insignificant island called England - the place English comes from

    As for English, we don't speak it like the rest of you because we speak it as it should be spoken!

    I could not agree more hahahaha

  • (cs)

    ... is premature optimization. I highly doubt this operation makes any serious change in the speed of the app.

  • Corporate Cog (unregistered)

    Come on, that's not that bad. I really don't understand why the function is named IsoFromUrl, though.

  • Spiceweasel (unregistered) in reply to Yo

    Yep. Big case of premature optimization.

    The fact that the programmer thinks that this is a case where optimization is more important than clarity, extensibility, and maintainability shows that he doesn't really understand what's happening in his system.

    Granted, that's not so much a WTF mistake as a rookie mistake. Unless the guy was the chief architect or something.

  • DiscoVincent (unregistered)

    We speak Italian in Italy (would you ever guess???), broken english on the net, and... of course ...the language of love abroad ;-)

    Respectively ITA, BRE, & FCK

    CAPTCHA pinball: All around the world baby....

  • Corporate Cog (unregistered) in reply to Grant
    Grant:
    So to complete the test of all three characters requires SIX (6) method calls, not none as some people seem to be implying.

    Maybe not; compiler may inline those methods.

  • (cs) in reply to nuclear_eclipse

    Well, globalised pop culture and the Internet are actually causing parts of the language to converge again, especially slang.

    But they're mere dialects anyway. As if the entire USA has the same word for coke, soda and pop.

  • Jno (unregistered) in reply to Alan
    Alan:
    snap2grid:
    It includes an insignificant island called England - the place English comes from
    England's an island now??? Scotland and Wales must have become detached somehow...

    The island is called Great Britain, as it is the largest island in the British Isles. Britain is used as a short name for the United Kingdom of Great Britain and Northern Ireland, a supernational sovereign state including the countries of England, Scotland, Wales and Northern Ireland.

    It's called Great Britain [Britannia Major] to distinguish it from Britannia Minor, i.e. the French province of Brittany.

    James I styled himself King of Great Britain, France and Ireland, when he upgraded from just being James VI of Scotland. They didn't have a good grip of version numbering back then.

  • THC (unregistered)

    The whole optimisation is stupid anyway. The function is called once per request WOW, so that is like, every minute when the user clicked another link? Yeah, that really needs ultra-optimisation, unlike that while( ) loop that displays our 10000 products.

    Wouldn't the best way be to actually parse the URL String once into a URL class, with an int country, and only pass that class around? ^^
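    A minimal sketch of that idea follows, using an enum (which is int-backed) in place of a bare int. The type names, the three-language table, and the "/lang/rest" URL shape are assumptions for illustration.

```csharp
using System;

// Hypothetical language identifiers; the numeric values are arbitrary.
public enum SiteLanguage { Unknown = 0, English, German, French }

public sealed class ParsedUrl
{
    public SiteLanguage Language { get; private set; }
    public string Path { get; private set; }

    private ParsedUrl(SiteLanguage language, string path)
    {
        Language = language;
        Path = path;
    }

    // Parses "/lang/rest/of/path" once; callers pass the result around
    // instead of re-inspecting the raw string.
    public static ParsedUrl Parse(string url)
    {
        if (url == null) throw new ArgumentNullException("url");

        // For "/eng/products" this gives { "", "eng", "products" }.
        string[] parts = url.Split(new[] { '/' }, 3);
        string folder = parts.Length > 1 ? parts[1] : string.Empty;
        string rest = parts.Length > 2 ? "/" + parts[2] : "/";

        switch (folder)
        {
            case "eng": return new ParsedUrl(SiteLanguage.English, rest);
            case "deu": return new ParsedUrl(SiteLanguage.German, rest);
            case "fra": return new ParsedUrl(SiteLanguage.French, rest);
            default: return new ParsedUrl(SiteLanguage.Unknown, url);
        }
    }
}
```

    The string inspection then happens once per request, and everything downstream works with ParsedUrl.Language.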

  • (cs) in reply to Corporate Cog
    Corporate Cog:
    Come on, that's not that bad. I really don't understand why the function is named IsoFromUrl, though.

    Call me crazy, but I'm guessing it's because it gets the ISO code from the URL.

  • DryTyler (unregistered) in reply to Chris

    To be really pedantic, in Germany we speak Deutsch. You may call it German, but I know the language I am speaking. It is Deutsch.

    ;-)

  • nooblar (unregistered) in reply to mbvlist

    mbvlist:
    joerbanno:
    Guybrush:
    [...] and in Switzerland they speak German, Italian or some other language I forgot, and so on.

    In Switzerland they speak:

    German (64%) in the north and centre; French (20.4%) to the west; Italian (6.5%)

  • (cs) in reply to Juans

    So, given everyone's comments that we speak this, that or the other thing here, there and everywhere, the language used to post comments and short-hand here in wtf-land is wtf-ish?

      public static string[] VALID_FOLDERS = new string[]
      {
        "eng", // English
        "deu", // German
        "fra", // French
        "jpn", // Japanese
        "kor", // Korean
        "dan", // Danish
        "fin", // Finnish
        "swe", // Swedish
        "nor", // Norwegian
        "dut", // Dutch
        "spa", // Spanish
        "wtf"  // wtf-ish
      };
    
  • bofh69 (unregistered) in reply to nuclear_eclipse
    In all reality though, American English is so different these days, in pronunciation, slang, and meaning, that one could certainly consider it a whole language beyond the Queen's native tongue.

    George Bernard Shaw said it best:

    England and America are two countries separated by a common language.
  • Tom Dibble (unregistered)

    Okay, aside from the likelihood that there's no fathomable reason for this optimization, and in and of itself it doesn't work, here's another:

    By returning an ISO code string they are requiring another piece of code somewhere to be doing a switch statement (aka string of equality tests) based on strings, which is highly non-optimal.

    If they had an enumeration for the countries, and returned that, then the switch on the ISO code returned would be supported at the hardware level directly. Adding a simple wrapper that translates the enum value to a string would be simple enough for the handful of cases where the caller really wants the string.

    So, the problems here are:

    1. The code is likely not a bottleneck to begin with
    2. If it is, it is calling three methods instead of just calling one, optimized, method
    3. It isn't checking for the last '/' character after the country code and so equates "eng" and "englehammer"
    4. It is causing other areas of the code to do string comparisons to act on its return value.
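    Points 3 and 4 can be sketched roughly as follows. The enum, the class name, and the three-language table are hypothetical; the string switch still runs once inside the lookup, but callers then switch on the enum value rather than on strings.

```csharp
using System;

public enum IsoLanguage { None = 0, Eng, Deu, Fra }

public static class IsoFolder
{
    // Returns the language for URLs of the form "/xxx/..." and None otherwise.
    // The character after the code must be '/' (or the end of the string),
    // so "/englehammer" does not match "eng".
    public static IsoLanguage FromUrl(string url)
    {
        if (url == null || url.Length < 4 || url[0] != '/') return IsoLanguage.None;
        if (url.Length > 4 && url[4] != '/') return IsoLanguage.None;

        switch (url.Substring(1, 3))
        {
            case "eng": return IsoLanguage.Eng;
            case "deu": return IsoLanguage.Deu;
            case "fra": return IsoLanguage.Fra;
            default: return IsoLanguage.None;
        }
    }

    // Wrapper for the handful of callers that really want the string form.
    public static string ToCode(IsoLanguage language)
    {
        return language == IsoLanguage.None
            ? string.Empty
            : language.ToString().ToLowerInvariant();
    }
}
```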
  • Sam (unregistered) in reply to mbvlist
    mbvlist:
    Yeah and in Belgium they speak either (a sort of) Dutch or French,

    And German, just for good measure.

  • S|ic3 X (unregistered) in reply to rien
    rien:
    worst, they may speak different languages within the SAME country !

    Thankfully none of the North American countries do that </sarcasm>

  • gl (unregistered) in reply to snap2grid
    snap2grid:
    England's an island now??? Scotland and Wales must have become detached somehow...

    Shoddy workmanship....

  • Martin Plamondon (unregistered) in reply to mathew
    mathew:
    No, the REAL WTF is that the user's web browser already has a preference for what language they want to see the web site in, so the web server should just be using that.

    I hate it when sites do that! I'm French Canadian (Québécois, we too have multi-language countries in America) and I hated it the first time I went to Google with my English Windows but French Canadian preferences loaded, like keyboard layout, and Google redirected me to a French version of the page... Took me a while to make it remember I wanted the English page.

  • Jelmer (unregistered) in reply to cklam
    cklam:
    Frederik:
    PS:
    Jens:
    PS:
    What is Europe? And why don't they speak English like the rest of us?

    Btw. I'm European and don't even own a car... Not even a small one...

    So what kind of bicycle do you own?

    Generally depends on what kind of bike the person who forgot to use his lock owned. Works pretty well.

    You must be from Amsterdam.

    No one forgets his lock in Amsterdam, it's like forgetting to flush the toilet.

  • (cs) in reply to modelnine
    CodeSOD:
    In Europe, they do things a little bit differently. From what I understand, it boils down to this: they work less and play more; when not working or playing, they drive tiny little cars. Apparently, they all speak different languages too.

    This is the greatest comparison of Europe to America in 50 words or less that I have ever read. Actually, it's brilliant. Should I reference "CodeSOD" whenever I quote this excerpt?

  • BW (unregistered) in reply to JL

    The JIT does in fact optimize the calls away. The call instructions still appear in the IL, but they aren't in the generated X86.

  • Mario (unregistered) in reply to Smith
    Smith:
    tchize:
    rien:
    worst, they may speak different languages within the SAME country !
    Welcome to Belgium. Here you can find people speaking dutch, others speaking french, another part speaking german. Not to mention unofficial local dialects that may sometimes be difficult to understand, the numerous different cultures in big cities and all languages spoken by various european governments representatives when they are around :)
    Gawd!! All in 30K sq km
    Yes, and it seems quite a few of us are here too... Most people in Belgium speak English too, and a lot in the north speak a local dialect. There's a lot of foreigners, who mostly brought their own language and keep it alive in their family. I used to work in a library where we had a lot of European Parliament books, and they all had to be entered into the computer in all the official languages (12 at that time)

    We are an interesting country: when Czechoslovakia was about to break up, they came over to see how we had our state organized, and promptly made 2 separate countries. Haha.

    I speak Dutch (Flemish), btw.

Leave a comment on “Laying the Foundation for i18n, Brick by Brick”
