• MacFrog (unregistered)

    \nFrist line

  • (nodebb)

    Now, it's important to note that CSV usually is expressed as a "comma separated values" file, but the initialism is actually "character separated values".

    It's usually called "comma separated values" because that's what it originally was, a file where values were separated by commas.

    Then the localising folks working on Excel got hold of it, and it became ceparator-separated values.

  • (nodebb)

    I had to test this out, because I was curious what csv.reader would do for line terminators with delimiter="\n". Turns out that the representative line does work, though there's one important difference. With something like f.readlines, each line in the list is a string, but with csv.reader, each line in the list is a list of strings (in this case, they're all one-element lists of strings).
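
    For anyone who wants to reproduce the shape difference, here is a minimal sketch. The sample text is made up, and the `try`/`except` is there because, as reported elsewhere in this thread, some Python versions reject a newline delimiter with `ValueError`.

```python
import csv
import io

text = "alpha\nbeta\ngamma\n"  # hypothetical sample input

# Plain line reading: each element is a string.
lines = [line.rstrip("\n") for line in io.StringIO(text).readlines()]

# csv.reader: each element is a list of fields, even with one field per row.
try:
    rows = list(csv.reader(io.StringIO(text, newline=""), delimiter="\n"))
except ValueError:
    # Some Python versions refuse a newline as the field delimiter.
    rows = None

print(lines)  # list of str
print(rows)   # list of lists, or None if the delimiter was rejected
```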

  • (nodebb) in reply to Steve_The_Cynic

    "Then the localising folks working on Excel got hold of it, and it became ceparator-separated values."

    Once upon a time there was a system which used an Excel COM interop to read files from a third party, and all was well until the company expanded across the border. Then some users just got garbage out of the import system, because the CSV files were comma-separated but the new users had MS Office installed in Spanish, and for whatever reason the geniuses at Microsoft decided that the default separator for COMMA-separation in Spanish was going to be a semicolon instead.

  • Hanzito (unregistered) in reply to AGlezB

    That reason is that in Spanish, like everywhere(?) in continental Europe, the decimal separator is a comma, and for some reason they thought that allowing that in the file was important. Users could see and edit numbers as they were used to. But who edits csv by hand? Not your average Joe. So, just like storing localized function names, and interpreting strings as dates, that lofty thought was a mistake.

  • Industrial Automation Engineer (unregistered)

    and then unicode happened

  • Smithers (unregistered) in reply to Dragnslcr

    Well, that's been fixed in 3.13, because when I tried it, it raised "ValueError: bad delimiter value" which is exactly what I would do if told to read a table where the cell and row delimiters were the same.

    Interestingly, this change is not mentioned in either the csv module docs or the "what's new in 3.13" page, so if they ever upgrade (hah) that's a breakage they won't see coming.

  • (nodebb) in reply to AGlezB

    for whatever reason the geniuses at Microsoft decided that the default separator for COMMA-separation in Spanish was going to be a semicolon instead

    Well, they had to do something to resolve the conflict between 4,7 being a 4 and a 7 separated by a VSC(1) and 4,7 being 4.7 written European-style, with a decimal comma instead of a decimal point.

    And of course they chose the "the content of a ceparator-separated values file is for humans to read" version of the solution, which is 99.999% nonsense.

    (1) Value-separating comma
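
    The conflict is easy to show in a few lines. This is only a sketch with invented sample data; `parse_decimal_comma` is a hypothetical helper that assumes no thousands separators appear in the numbers.

```python
import csv
import io

# With a comma delimiter, "4,7" can only be read as two values.
ambiguous = next(csv.reader(["4,7"]))

# Hypothetical semicolon-separated export with decimal commas.
data = "item;price\nwidget;4,7\ngadget;12,25\n"

def parse_decimal_comma(s: str) -> float:
    # Assumption: no thousands separators, so a plain swap is enough.
    return float(s.replace(",", "."))

reader = csv.DictReader(io.StringIO(data, newline=""), delimiter=";")
prices = {row["item"]: parse_decimal_comma(row["price"]) for row in reader}

print(ambiguous)
print(prices)
```

    The reader has to be told both the field separator and the decimal convention; nothing in the file itself says which one applies.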

  • (nodebb)

    Everyone wants to blame Excel for CSV being garbage,[0] but localization is HARD, and CSV gives you the ILLUSION that it's easy. Plain CSV isn't even easy: you can't naively split on commas; you MUST handle quotes, especially the weird "escaped" quotes. CSV tricks you into thinking structured/tabular data is easy, but it's not.

    [0]: Excel has a lot of flaws, like trying to say any number is a date, but that's unrelated to CSVs.
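
    A quick illustration of the quoting point, using a made-up line with an embedded comma and doubled quotes:

```python
import csv

# Hypothetical record: a quoted field containing a comma, and one
# containing doubled ("escaped") quote characters.
line = 'id,"Smith, John","He said ""hi"""'

naive = line.split(",")            # splits inside the quoted field
parsed = next(csv.reader([line]))  # honours quotes and doubled quotes

print(len(naive), naive)
print(len(parsed), parsed)
```

    The naive split yields four pieces with stray quote characters in them, while the csv module returns the three intended fields.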

  • staircase27 (unregistered)

    The other difference between the two versions is that the csv based version will support quoting of "line"s, so you could have a "line" that actually contains multiple lines if they were all wrapped in quote characters.
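
    A small sketch of that quoting behaviour, with invented sample text. It uses the default comma delimiter rather than `delimiter="\n"`, since the newline delimiter is reportedly rejected on newer Python versions; the quoting logic shown is the same.

```python
import csv
import io

# Hypothetical input: the second "line" is quoted and spans two physical lines.
text = 'first\n"second\nstill second"\nthird\n'

physical_lines = text.splitlines()
records = list(csv.reader(io.StringIO(text, newline="")))

print(len(physical_lines))  # counts physical lines
print(records)              # the quoted block is one record
```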

  • (nodebb) in reply to Steve_The_Cynic

    "And of course they chose the "the content of a ceparator-separated values file is for humans to read" version of the solution, which is 99.999% nonsense."

    And thanks to that CSV went from a very useful way to exchange data between systems to something nobody uses unless they absolutely have to, leaving most of us with the choice between XML and binary until JSON joined the party a while later.

    Heck, the issue with the dot vs comma numbers was so annoying that some text-only systems dealt with it by just writing all numbers as integers with fixed decimals, i.e. 6.54 written as 65400. I still have to deal with one of those from time to time.
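
    For illustration, a minimal sketch of that implied-decimal encoding. The scale of four places is an assumption inferred from the 6.54 to 65400 example, and `encode`/`decode` are hypothetical helper names, not anything from the system described.

```python
from decimal import Decimal

SCALE = 4  # assumed number of implied decimal places

def encode(value: Decimal) -> str:
    """Write a number as an integer string with SCALE implied decimals."""
    scaled = value * (10 ** SCALE)
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{value} needs more than {SCALE} decimal places")
    return str(int(scaled))

def decode(text: str) -> Decimal:
    """Read an implied-decimal integer string back into a Decimal."""
    return Decimal(int(text)) / (10 ** SCALE)

print(encode(Decimal("6.54")))
print(decode("65400"))
```

    Using `Decimal` rather than `float` avoids rounding surprises, and no decimal separator ever appears in the file.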

  • Bogolese (unregistered)

    Brilliant! Or should I say Brillant?

  • some guy (unregistered) in reply to AGlezB

    "something nobody uses"

    If only. A mere three years ago some "business" "people" in my old company were still like "Just do a CSV import for this, bro, it can't be that hard, bro, it's what the customer wants, man!". Which only shows that the customer is always an idiot.

  • PedanticRobot (unregistered)

    So TRWTF is the existence of decimal commas and some people's stubborn insistence to keep using them thereby complicating everything.

  • Argle (unregistered)

    0x1f. There. Fixed it.
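
    A sketch of what that looks like with Python's csv module, assuming the ASCII unit separator (0x1f) is used as the field delimiter. The sample row is invented.

```python
import csv
import io

US = "\x1f"  # ASCII unit separator

buf = io.StringIO(newline="")
writer = csv.writer(buf, delimiter=US)
writer.writerow(["4,7", "Smith; John", 'say "hi"'])
raw = buf.getvalue()

rows = list(csv.reader(io.StringIO(raw, newline=""), delimiter=US))
print(repr(raw))
print(rows)
```

    Commas and semicolons in the data no longer need quoting. Fields containing quote characters or line breaks still do, so it narrows the problem rather than removing it.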

  • quiteACharacter (unregistered) in reply to colejohnson66

    Literally no CSV file I ever needed to open opened right in Excel. I thought it was just Excel being shit in general; it's a bummer that it's just specifically shit to me because I live in a particular country. Thank you for thinking about me, valiant Microsoft localizers, please think of me less next time.

  • (nodebb)

    "some text-only systems dealt with it by just writing all numbers as integers with fixed decimals, i.e. 6.54 written as 65400." Tell me you've never programmed in COBOL without telling me you've never programmed in COBOL.

  • OldCoder (unregistered) in reply to AGlezB

    Some customers of ours got round the problem not by making everything decimals but simply by "quoting" every single field. Every. Single. Field.
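
    In Python's csv module that approach corresponds to `csv.QUOTE_ALL`. A minimal sketch with a made-up row:

```python
import csv
import io

buf = io.StringIO(newline="")
writer = csv.writer(buf, quoting=csv.QUOTE_ALL)
writer.writerow(["widget", 4.7, "Smith, John"])
raw = buf.getvalue()

# Reading back: every field comes out as a string, quotes removed.
row = next(csv.reader(io.StringIO(raw, newline="")))
print(raw, end="")
print(row)
```

    Note that the number is quoted too, so the reader gets the text "4.7" and still has to know which decimal convention the writer used.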

  • (nodebb) in reply to n9ds

    LOL. I haven't touched COBOL in a while. That example was from a 3rd party system that uses TXT files with fixed-length fields to exchange data. And you might not believe it but it's a relatively modern accounting system that uses a relational DB as backend with very well normalized tables, so no COBOL there.

    That said, about two years ago one of our clients wanted us to reverse engineer and modify an old COBOL system, but fortunately we managed to convince them it would be simpler and cheaper to just implement the whole thing from scratch without the RE, since the old system is still functional.

    Work as a dev long enough and anyone can open their own WTF site. :D
