- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
At least that predecessor has the "honor" of supplying a good WTF to this site
Admin
The only nice things to say about this mess are a) the guy is consistent; brutally consistent. And b) whoever laid out their data model can both spell & CamelCase consistently.
I give it 2 stars: Not recommended, but I've seen worse.
Separately, I have to say I've seen some "novel" uses of stringly language features before, but iteratively using the C# string interpolation operator as a "buildulator" (great term!) is a new one on me. Whodathunkit?!?11?!
Admin
In fact there is so much consistency that I was wondering whether this code was written by machine rather than a human - they re-run the generator when they change the schema.
Admin
Modern optimisers can use constant folding to evaluate these expressions at compile time.
Admin
And the whole design fails miserably with non-US data. Both names and addresses are best handled as free-form text.
Admin
OK, a WTF... but: "but I suspect this is going to be pretty much the most expensive option." -- Does it matter? What is the overall impact to observed application performance between this implementation and an "ideal" (no such thing) one? What is the performance specification relative to this observed behaviour?
Admin
@Brian the schema looks remarkably like the fields of a
Contact
record in MSFT Outlook. Which of course originally dates from the US-centric days of PC programming and has since been extended in various ways for internationalization, interoperability, storage in Exchange and later the MSFT cloud, etc.I could sure imagine a US-centric programmer or data designer simply firingup Outlook and cloning their data model with the thought that "If it's good enough for MSFT's products, it's good enough for us."
I'm not aware of any specific agreed multi-vendor multi-national standard on contact record schemas. That's just not in my area of practice. If anyone knows more about such a thing I'd be curious to get a link to get started reading.
Admin
Even if this isn't an abysmal way to build a string, it is definitely an abysmal way to build a CSV file.
Admin
In this situation, shouldn’t the choice of concatenation method be left to the
String.join(array)
method (or whatever equivalent in your language) ?Admin
I often choose readability over maximum execution speed in cases where speed isn't critical to the application.
Admin
Unless the input is being sanitized, then the string will happily accept commas in the concatenated values and no doubt break the file format.
This to me appears to be a CSV file, in which case I can't recommend enough that the "CsvHelper" NuGet package be used. Not only can it automatically escape values, but it has a whole range of configurable options and will almost certainly work for whatever your use case is.
I was once flabbergasted when a bank we were integrating with used a form of CSV files and in all their extensive documentation they failed to mention what to do about possible values of the separator character in a field.
Admin
TRWTF is what happens if any of those fields contains a comma. NEVER build a CSV file (or XML or JSON or any or "structured text" format) by concatenating strings, ALWAYS use a mechanism that automatically handles the necessary quoting and escaping.
Admin
Very timely - just this morning, I adapted a function to create a complete list of all financial data in a certain system to a new use.
The original, written by my predecessors, creates a set of HTML tables for display in the user's browser, and did so by concatenating strings. It's usually a bit slow, but not annoyingly so, except that we have this one customer who has tens of thousands of bookings a month in a specific ledger, and thus creates a table of almost a hundred thousand rows. All using string concatenation. Either the server or the browser does not like that, and crashes.
The new one creates a CSV file for import into Excel or some analysis package, and does so line by line. Still takes ten seconds, but then, it does create a 10 MB CSV...
(In other news, I hate Excel localisation.)
Admin
In the following centuries this will be recognised as one of the earliest attempts of PDD - %PATH% Driven Development.
Admin
So in other words, now the analyst's computer crashes due to excel trying to open an 10MB csv file.... :)
Been there and all that, which is one of the reasons i like Linux which comes with a ehome range of tols for data analysis (sed, awk, head, tail, grep, gnuplot etc).
Yazeran
Admin
I would expect the optimizer to clean this up, all those references to the previous string are going to go away--a StringBuilder will keep the garbage down but the extra lines aren't causing garbage. Write for readability unless you're in a loop that's going to execute many, many, many times. And most optimization is a matter of the right choice of algorithm in the first place.
This is going to be IO bound anyway, even if it is a bit slow nobody is going to notice.
The else block is because somebody's expecting the lines to be there even if they're blank.
Admin
His header is full of content, but his head doesn't seem to be.
Admin
I'm saddened to learn and inform you all that even the latest version of .NET is unable to optimize this way:
SharpLab
And no, anti-performance atrocities like these should not be allowed. Sentence: life of writing HTML/CSS without parole.
Addendum 2022-05-19 12:25: PS. Will post this to .NET Github
Admin
meh barf
Admin
It is a well know trait of incompetence.
Admin
Hmmm.... I wrote a quick script in python to write a 100,000-line csv averaging 105 bytes per line and even when the row is composed entirely of randomly generated numbers which are individually converted to strings and even when the output is written to a network drive, the script still manages to complete in under 1 second...
Admin
About the first example:
In the current version of .net (6) the compiler will optimize interpolation literals as well. This example will maybe not be optimized due to it's recursive syntax tree nature, but I don't know the details - however they devs stated often that this implementation was only the first iteration. So I wouldn't be surprised it the compiler would unwind expressions like that in the future.
Personally I never understand why something like that is needed at all; just clear a POCO, annotate it fluently or via attributes, serialize with header with whatever nuget library fits best, done. There is pretty much never the need for a construct like that, especially since code generators are maturing more and more since their introduction and integration with .net 5.
Admin
Maintainability is more often important that worrying about performance for something which runs rarely.
Admin
Re: #3, those are columns, not rows. And it could have just been a single statement appending 7 commas in a row instead of 7 statements each appending one comma.
Admin
I don't know that I would go so far as to call strict consistency a sign of incompetence, so much as a lack of imagination. After all, this code probably does its job just fine. And I think consistency has value in high-risk scenarios. E.g., I would expect NASA code to be brutally consistent; but NASA code would probably also be a little more thoughtful, and not just consistency for consistency's sake.
Admin
But yes, TRWTF is, as usual, a homebrew solution for a well-solved problem (converting a data structure into a CSV).
Admin
if this is c# 👇 string s = getSomeString(); s = s + "some suffix";
Then it is correct, the cost to create a string builder is more expensive than the cost to create 3 strings
Admin
Ah, but the author defended against commas by wrapping every interpolated value in quotes
So now the code will break on quotes instead.
Admin
Looks like the String Buildinator should meet Trogdor the Burninator.
Admin
His problem - he asked for it!
(Actually, Excel on my computer did open it. It gets a bit sluggish, true, but I can't really blame it.
Admin
There may well be some network lag in my numbers. And getting all the data out of an admittedly over-complicated database (which, no, I can't change).
10 seconds is acceptable to me, given that a. this guy's one of our largest users, and b. they're only going to need this data once or twice a year, for the yearly audit.