Admin
comments = []
headComments = "Record ID, FRIST,UNIQUE_GENERATED_ID".split( "," )
for currentComment in headComments:
    comments.append(currentComment)
Admin
This is one more example of the results we get with "learn how to do things" approach of un-educated developers searching the web to get stuff done, as opposed to "understand what's going on" as part of AT LEAST a crash course on computing in general.
Admin
There are two kinds of programmers: thoze who uze as little string manipulation as possible, and thoze who go out of their way to find excuses to manipulate strings. I've never understood the latter, but they exist.
Admin
If this is the worst you have to show as an example, the code base must be pretty neat. Sure, the split and print is somewhat pointless, but something akin to this can be found everywhere, and at least you can change your delimiter without messing around with the table header. Someone started with a different idea and that's what it ended up being. Not a WTF in my book: someone didn't care, and I wouldn't even be bothered myself.
CSV files themselves are very simple, that's the point of them. The only way to break them is to mess up newlines or delimiters in your input. Encoding etc. applies the same way to all simple formats. The main problem with CSV generation is that clients often use Excel -- and that's a real sitty wore when it comes to reading in plain-format data (interpreting decimals as dates and stuff). Sadly that mess will never go away without a new file extension (so that old hacked CSV files keep behaving in Excel). Go on, try it yourself. Write a bunch of telephone numbers in a raw CSV file and see what happens.
About multiple identifiers: that's what happens if a CSV survives multiple systems. I believe one of the systems I'm taking care of prints "TODO" in a CSV column, because even the client couldn't say what was supposed to be in there (but the format is predefined in another system).
Admin
I take this code as being the cover by which you can definitely judge the book.
Admin
He didn't mention the WTF of not trimming (strip() in Python) the surrounding spaces around some of the column names.
Admin
This is partially because, as Mr. TA points out, "developers" find stuff on the web and copypasta without understanding the code or what they're supposed to accomplish. But I think it's mostly due to the fact that most "Intro to Programming with <language>" classes spend the entire semester manipulating strings, and then the "Advanced Programming with <language>" classes "teach" how to create lists and arrays (of strings) and never talk about how to actually use these correctly.
Could be worse -- and I'm sure there is much worse elsewhere in the code.
Admin
Wanna bet that the eccentric spacing was to get the column heading to line up nice and neat on the output?
Admin
And it wasn't even mentioned that Python includes the "json" module in its standard library, which would have taken care of all the complications. (You who think CSV is simple have yet to deal with a CSV file using vertical pipes for field separators and which includes XML for some of the field values.)
And a pet peeve, Python programmers who loop over an iterable just to create another iterable. "columns = HeadColumns" is enough. Of course since columns is never read the whole thing should be elided.
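A minimal sketch of that point, reusing the header string from the snippet quoted at the top of the thread. The snake_case names are made up for illustration and are not from the original script.

```python
# Header string as quoted in the first comment of this thread.
head_columns = "Record ID, FRIST,UNIQUE_GENERATED_ID".split(",")

# Looping just to append every element builds a copy the long way round.
columns_loop = []
for name in head_columns:
    columns_loop.append(name)

# list() makes the same shallow copy in one step. Plain assignment
# (columns = head_columns) would only bind a second name to the same list.
columns_copy = list(head_columns)

# Stripping the stray spaces around the column names.
columns_clean = [name.strip() for name in head_columns]

print(columns_clean)
```

Whether a copy is needed at all depends on whether the list is modified later; if it is only read, the assignment is enough.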
Admin
Improper database schema, huh? A few years ago, I started in on berating a fellow programmer over the schema in a new project for a client. I recently had to deal with the mess because of a request from the client. I had to forgive the programmer. "Their IT guy insisted it be this way." sigh THAT jerk. He threw his weight around on the project just because he could. Then retired. Some days I lack enough profanity in my vocabulary to comment on this.
Admin
I thought that's what the comment about arbitrary spaces was getting at.
Admin
It's worse than that. Here's a very small sample .CSV file:
1,27
2,54
Is that two columns or one?
The answer is, "Yes, it is two columns or one, if you import it into Excel."
In a US or UK English (YMMV for other varieties of English) version of Excel, it's two columns containing (1, 2) in the first and (27, 54) in the second. In French (and probably some other European) localised versions, it's one column containing (1.27, 2.54) because the fine people who wrote Excel's CSV handling were farble-faced farble-nuts, so it uses the localised number representations for both export and import. In France, they have a decimal comma, not a decimal point, so the column separator in CSV files is a semicolon.
Addendum 2022-06-22 11:37: EDIT: the column separator in CSV files produced or consumed by Excel, that is.
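To make the ambiguity concrete, here is a rough sketch that reads the same text under both conventions with Python's csv module. The two-line sample is inferred from the values described in this comment, and the `parse` helper is a name invented for this example. It only imitates the two interpretations; it does not reproduce Excel's actual import logic.

```python
import csv
import io

# Sample inferred from the values described above (an assumption).
sample = "1,27\n2,54\n"

def parse(text, delimiter):
    """Read CSV text using the given column delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

# US/UK-style reading: the comma separates columns.
us_rows = parse(sample, ",")

# French-style reading: the semicolon separates columns,
# and the comma is the decimal mark.
fr_rows = parse(sample, ";")
fr_values = [float(cell.replace(",", ".")) for row in fr_rows for cell in row]

print(us_rows)    # [['1', '27'], ['2', '54']]
print(fr_values)  # [1.27, 2.54]
```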
Admin
It's so great when you can paint a picture of the entire script, the programmer, the boss and the entire organization with just 6 lines of Python.
Admin
I don't think you can blame the Excel hacks for choosing wrongly, because every option is wrong.
What we need to do -- and by "we" I mean everyone in a collective effort -- is to deprecate the CSV format and replace it with TabSV. (I'm now waiting for everyone to tell me all the things wrong with that proposal.)
Admin
This makes me wonder - is there a designated "delimiter" character in Unicode?
Admin
"This makes me wonder - is there a designated "delimiter" character in Unicode?" Yes, inherited from ASCII: 0x1E is the record separator and 0x1F is the unit separator, which is the one meant for separating fields.
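As a sketch of how those control characters could be used, the snippet below joins fields with 0x1F and records with 0x1E. The `dump` and `load` helpers are hypothetical names, and the approach assumes no field ever contains either control character.

```python
UNIT_SEP = "\x1f"    # ASCII 31, placed between fields
RECORD_SEP = "\x1e"  # ASCII 30, placed between records

def dump(rows):
    """Serialise a list of rows (lists of strings) into one string."""
    return RECORD_SEP.join(UNIT_SEP.join(fields) for fields in rows)

def load(text):
    """Split a string produced by dump() back into rows."""
    return [record.split(UNIT_SEP) for record in text.split(RECORD_SEP)]

rows = [
    ["Record ID", "Note"],
    ["1", "commas, semicolons; and\nnewlines need no quoting"],
]

encoded = dump(rows)
decoded = load(encoded)
print(decoded == rows)
```

The catch is practical rather than technical: few spreadsheet tools or text editors handle these characters comfortably, which is part of why comma- and tab-separated files persist.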
Admin
No, TRWTF is that Python has a perfectly good CSV library that has a single function to write out a 2D array.
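For reference, a small sketch of the standard-library route: `csv.writer` has `writerow` for one row and `writerows` for a list of rows. The header comes from the snippet quoted earlier in the thread; the data rows are invented for illustration.

```python
import csv
import io

header = ["Record ID", "FRIST", "UNIQUE_GENERATED_ID"]

# Hypothetical data rows, including values that need quoting.
rows = [
    [1, "first, with comma", "a1"],
    [2, 'says "hi"', "b2"],
]

# An in-memory buffer keeps the example self-contained; a real script
# would typically use open(path, "w", newline="") instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)   # one row
writer.writerows(rows)    # all data rows in a single call

csv_text = buffer.getvalue()
print(csv_text)
```

With the default dialect, fields containing the delimiter or a double quote are quoted automatically and embedded quotes are doubled.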
Admin
...do you mean "rows"?
Admin
the moment you realise import csv would not solve any problems here...
Admin
Wanna bet the spacing to get the column headings lining up in the output was only relevant in some version of some long deprecated application used many years ago, and has been kept because changing it would be too much work and likely break many things?
Admin
The worst part is that they had it right initially. The separator was a setting in Excel. But some brillant mind decided to change that and use the localized separator that is set in Windows (in Excel 97 maybe, I don't remember the exact version). Always fun (not) when dealing with multiple Excel / CSV files with different localizations at the same time, as is quite common in every country outside of the US, in my experience.
Admin
Most definitely not. The column separator in Excel's version of CSV is localised, as is the representation of numbers. If we interpret that CSV file as US-English Excel would interpret it, then save as a .xlsx file, then load the .xlsx file into French Excel, and export as CSV, the result is:
1;27
2;54
Note that the commas have become semicolons.
French Excel treats the original file as having two rows of one column each, using the comma as a decimal "point". (The French use a "decimal comma" instead of a "decimal point".)
Admin
Criticising Excel because you don't know how to use it is daft. It offers import options for all the scenarios mentioned and many more. It's impressive it can get it right automatically as often as it does, but where it doesn't the correct settings are configurable.
Along with such settings are options to do useful things on import like setting column types, transposing tables, and so on.
I get the impression that a lot of people last tried using Excel in about 2000 or earlier, and have no real experience of the improvements made since.
Ultimately Excel is a lot like Wetherspoons pubs - you may not be in the market for what they offer, but it's bloody hard to find anything they're trying to do that could be done better.
Admin
The issue with Excel is generally not that Excel can't do it.
It's that the users who barely know more than simple formulas and raw data entry cannot use it correctly. If there's something like an options tab on a dialog box, rest assured 95% of your users have never clicked it and have no idea what wonders it holds, much less when and how to use them. Heck, 50% of your users have never even noticed the tab exists.
Admin
You know that end-users open a .csv file by double-clicking, which gives zero access to the import options? And that using import with options can be a lot of work, if you have to import dozens of files?
Admin
Quote: "Like so many things, the CSV files are actually way more complicated than people think."
There's nothing complicated about CSV-files. Just escape appropriately (strings containing ",", "\n", "\r", "\l" for sure, and perhaps "\t" and ";" as a nicety towards applications that are overly nice by being flexible about what separator is actually used). There is the issue of Excel interpreting a "," as a decimal separator depending on locales, but that's its problem (not CSV's). There used to be an issue with ASCII vs unicode, but any application that still fails reading CSV's on account of that should be put in a home for the elderly.
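A rough sketch of what "escape appropriately" can look like, following the common convention of wrapping a field in double quotes and doubling any embedded quotes. The `escape_field` helper is hypothetical and deliberately minimal; the result is cross-checked against the standard library's reader.

```python
import csv
import io

def escape_field(value, delimiter=","):
    """Quote a field if it contains the delimiter, a double quote, CR or LF."""
    if any(ch in value for ch in (delimiter, '"', "\r", "\n")):
        return '"' + value.replace('"', '""') + '"'
    return value

fields = ["plain", "a,b", 'say "hi"', "two\nlines"]
line = ",".join(escape_field(f) for f in fields)

# Cross-check: the csv module should recover the original fields,
# including the one that spans two physical lines.
parsed = next(csv.reader(io.StringIO(line, newline="")))
print(parsed == fields)
```

Note that the double quote itself has to be on the list of characters that trigger quoting, otherwise a field that merely contains one is written out ambiguously.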
Admin
That sounds like a feature discoverability issue.
Admin
"The main problem with CSV generation is, that clients often use Excel"
What really made my head explode: when you export/save as CSV from Excel, it's not all valid Excel CSV. Excel can't load it.