Admin
Obviously you have not seen that old Star Trek episode where they encounter a parallel evolution of Nazis on an M-class planet...
Admin
On a momentary tangent, it still surprises me how many people ignore the true cost of things (usually under the guise of "premature optimization" being the root of all evil).
In engineering school, the students are usually taught about not just the cost in terms of economic dollars, but in some cases, the amount of energy expended, the number of CPU cycles clicked, etc. In cases of abstraction where a lot of work for developers is now being handled by: a) the framework; and b) the compiler, the developer should still be cognizant of what is going on under the hood, since the abstraction of modern development tools merely shields people from the complexities of various algorithms as well as assists in the generation of boring and boilerplate code.
For anyone who is even remotely interested in using .NET, Lutz Roeder's Reflector should be the #1 tool in the toolchest.
I created four simple C# static methods on the 'Main' object of a console application, using the various string concatenation methods that I've seen discussed here, and weighed their cost using criteria such as instructions used, function calls made, size of code, etc. I invite anyone using Reflector to check my findings as well; make sure you disassemble to "IL" code so you can see what the compiler truly generates.
I found, after examining the generated IL of a release build, that Test2 and Test3 generated identical code. That's because using the '+' operator becomes a String.Concat() call. In that regard, using either method in Test2() or Test3() is a matter of coding standards or what might be considered most readable/understandable to the developer. Test1(), while also very readable, is highly inefficient, since every instance of Environment.NewLine results in a function call to get the newline string for that environment.
Test4(), by the way, is very readable, looks nice, sweet and compact. However, look at it under the hood with Reflector. See that objects are created and constructed, functions called, and processing done that's not visible unless one actually looks under the hood. String.Format() does indeed use a StringBuilder under the hood, because it is most efficient when dealing with a homogenous set of values that it might expect to find in a format specifier.
StringBuilder, btw, is not a stigma. It has some very good advantages when used to build large strings, or strings that are non-deterministic due to logic within the flow of a program, as well as being a conduit to marshal string data back and forth between .NET and non-.NET code.
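As a rough illustration of the large-string case, here is a minimal sketch in Java rather than C# (java.lang.StringBuilder plays much the same role as the .NET class). The class and method names are made up for the example, and this is not one of the Test1() to Test4() methods discussed above.

```java
import java.util.List;

class JoinLinesDemo {
    // Hypothetical helper: joins lines, ending each with the platform separator.
    static String joinLines(List<String> lines) {
        // Look the separator up once instead of once per line.
        String newline = System.lineSeparator();
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            // Appends into one growing buffer; no intermediate String per iteration.
            sb.append(line).append(newline);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(joinLines(List.of("Line1", "Line2", "Line3")));
    }
}
```

For three fixed strings this buys nothing over plain concatenation; the buffer only starts to pay off when the number of pieces is decided at run time.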
Anyhow, my point is this: Never jump to conclusions based on readability of the code alone. Readable code is not necessarily the most optimal and premature optimization (which can make less-readable code) is also not necessarily the best path to choose. However, unless a developer is lazy and does not look at all the options to weigh them, they will not find a "happy medium" to choose between the two. Also, one has to understand the platform on which they are writing code. This includes not just your code, but also the operating system and in some cases, the CPU itself.
Anyhow, in the grand scheme of things, most people who write code as professionals are not necessarily writing code for themselves, they are developing an application for a user. The end user doesn't care about how readable your code is (you should document confusing code), they just care about getting results and getting them as soon as humanly possible. <grin>
Anyhow, I tend to like the code presented in Test3() myself.
Admin
Perhaps you didn't notice the date of the Petzold article.
Admin
Programmers should not need to worry about how to write a newline. All students of programming go through these same newline and StringBuilder issues, and that is a waste of humankind's time. I bet that in no other field of expertise do people spend 3 pages of posts and over a hundred comments thinking about newlines...
Why not create a character for newline? The empty character is "space". Then the newline character would be "new line", and you would get it by pressing a certain key on the keyboard, maybe with Ctrl held down or something. I mean, we have the at character, the dollar character, and even the exclamation character! Why not a newline character? How about ASCII 10 (LF) bound to a key somewhere on the keyboard, with the system looking up the system-specific combination in a lookup table?
The same goes for adding strings. If there is a need for StringBuilder, why doesn't the programming language implement the "+" operation with StringBuilder in the first place?
Admin
shouldn't this constant have been named cpCRLF?
(cp = C Pound).
Admin
My company does extensive java development on VMS and let me tell you, it is a true joy to write the C code that the JNI calls to convert files to stream_lf...so java can read it.
Admin
there's yet still another obvious way:
myString = "Line1 Line2 Line3";
Admin
Nah, that's just the reason OSX and Windows are wrong.
Admin
I tend to like the code presented in Test4(), myself.
Got any other pointless tests we can laugh at?
Like the soapbox thing, btw. Come down to Hyde Park Corner, some time. I'll be the gent with the carnation in my lapel and a ball peen hammer to take the rest of the world out of your misery.
Admin
Why make it harder than it is? If "\n" works, go with it. If it works, don't fix it. Especially not by making it more difficult.
Admin
Someone (Grovesy?) posted this on page 1, complete with the IL output.
By the way, you can use ildasm.exe to view the IL output; it comes with the .NET Framework.
Admin
You mean, there's a way other than "\n"? Yeah, I knoes that Windows uses \r\n but PHPGTK works as expected with just \n. Since PHPGTK is the only way I bother writing code that will never run on Windows, this is very much a NON ISSUE.
Seriously, why don't we programmers have a constant LF like:
$a="This is some test".LF."This is the next line";
And let LF be a constant that changes with whatever O/S? So stupid to be worrying about this stuff that was all but resolved in 1976!
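For what it's worth, PHP does have a predefined PHP_EOL constant along these lines. The same idea is sketched below in Java; the class name Eol and the constant name LF are made up for the example, and System.lineSeparator() is the standard call that supplies the platform's value.

```java
class Eol {
    // Hypothetical constant: resolved once from whatever platform the code runs on.
    static final String LF = System.lineSeparator();

    public static void main(String[] args) {
        String a = "This is some test" + LF + "This is the next line";
        System.out.println(a);
    }
}
```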
Admin
Why still bother with newlines? Just get a video card that allows you to put enough monitors next to each other so everything fits on a single line...
I will lead the way: To start with, I replaced my old 1280x1024 CRT with a 1440x900 LCD screen. Roughly an equal number of pixels, but longer and fewer lines. I have been looking for a 65536x16 pixel screen, but I have been unable to find one yet.
And as for the line separators used in this comment: Although soon I will no longer need them myself, I have strong hopes that everyone's viewer (even Notepad) will be able to grasp them. Otherwise you'll end up right-scrolling a very long way...
Admin
Why is "\n" non-portable? I'm not a .NET guy, but I've used "\n" on DOS/Windows for ages, and whatever's outputting the string takes care of it ... if you're outputting to the console or a file in text mode, it'll nicely change it to Newline ("\r\n") for you. If you're outputting to a file in binary mode, it'll output "\n" literally.
Admin
The world went predictably downhill after that.
Admin
[quote user="real_aardvark"] Exquisite!
I tend to like the code presented in Test4(), myself.
Got any other pointless tests we can laugh at?
Like the soapbox thing, btw. Come down to Hyde Park Corner, some time. I'll be the gent with the carnation in my lapel and a ball peen hammer to take the rest of the world out of your misery.[/quote]
Actually, they weren't tests, they were examinations of what the compiler generated from the different ways to write code. If you aren't one of those people who care how much things cost "under the hood" then, of course, you would find it pointless.
And as an aside, I believe nobody should take a hammer to their balls, no matter how peen they are. If that's what one expects in Hyde Park Corner, there's a good reason why that corner has been hidden.
Admin
There are only so many meaningful cases of Hungarian notation to go round.
Admin
I know, it is quite an annoyance, but it is still the best tool I could find that shows what is going on under the hood when something doesn't work as expected.
To give an example, a person once complained that when they changed the opacity property of their form, dragging it around the screen was slow, but when they set the opacity back to full (after it had already been full), the dragging was still slow.
After digging into the framework using reflector, I found that there was a 'bug' in which an API call wasn't made to make the window unlayered.
Over time, it still amazes me how people don't think that either the third party libraries or even the documentation are flawless, because they don't have the skills to investigate it.
Admin
I program in C++. I have no problem with people programming in C#. I don't even have a problem with people programming in Java, although I occasionally allow myself the luxury of wondering, "Why?"
Every now and again, I wonder quite how much CPU might be taken up by a particular loop, or invocation of the OS kernel, or an inadvertent call to the assembly instruction "Halt and catch fire."
I do not, as a matter of course, trouble my few remaining grey cells by exercising them with the concept of converting EOL into CIL. The poor little buggers are still trying to work out whether Hyde Park Corner disappeared between pages 600 and 650, or perhaps even later.
On the whole, I prefer to reserve them for more important computational tasks. Like writing code that actually makes some goddamn sense.
Which, I believe, was my original point.
Admin
An inspiration for us all, I don't think.
Admin
I suppose you weren't referring to Brian Kernighan and Dennis Ritchie, those who birthed the 'C' Programming language.
Admin
Touché... I have expanded it out. "People do not think that either the third party libraries or even the documentation are flawless, because they do not have the skills to investigate it."
Yep, I made a mistake. The "do not think" and "are flawless" contradict my point. I should have said: "People think, at times, that third party libraries or the documentation are flawless, yet they do not have the skills to investigate it."
Thank you for the correction.
Admin
Well, I think code that makes 'sense' changes over time. A person who is used to writing console applications gets a bit confused about the whole "event-driven" paradigm when they use a different operating system.
What used to make 'sense' to me from one realm has changed when I entered the next.
However, I had, at one point, stopped to do things the way I was used to doing them as a developer and consider the user who was using what I had written. (Writing code for myself was never a problem, writing code for other persons introduced new things.)
I like to write code that makes sense, however, there are times I have to write code which doesn't. If that is ever the case, I document it thoroughly within comments in the code.
Admin
Actually, in Java, "\n" IS a Unicode newline character, which gets translated to the right sequence of bytes (#13, #10, or #13#10) when outputting it through an OutputStream. Strings are always Unicode (2 bytes/char) in Java.
I'm kinda surprised it doesn't work the same way in .NET...
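One way to check this is to look at the bytes that actually come out. Below is a minimal sketch, assuming a plain PrintWriter over an in-memory stream (the class and method names are made up). As far as I can tell, print() passes "\n" through unchanged as a single byte 10, and it is println() that appends the platform's line separator, so the translation may be less automatic than described.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class NewlineBytesDemo {
    // Bytes written by print(): the text goes out exactly as given.
    static byte[] printed(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        PrintWriter pw = new PrintWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
        pw.print(text);
        pw.flush();
        return out.toByteArray();
    }

    // Bytes written by println(): the text followed by the platform line separator.
    static byte[] printedLine(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        PrintWriter pw = new PrintWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
        pw.println(text);
        pw.flush();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(printed("a\nb")));   // [97, 10, 98]
        System.out.println(Arrays.toString(printedLine("a")));  // varies by platform
    }
}
```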
Admin
Never mind the earlier funny remarks about not using newlines altogether:
I seem to remember MS-DOS (versions 1.xx through 6.xx) was the only OS around at the time that somehow needed the redundant CHR(10) (in style, this is GW-BASIC ;-) ) extension to get printers (you know, the old needles-and-pins-things-that-drove-you-crazy-with-their-incredible-noise ones) to output documents correctly. Like using a backslash instead of a forward slash as directory separator, I suspect this was done by Microsoft to prevent compatibility between the then-existing OSes and their new kiddy.
Can anyone confirm this? It's been a long time...
Admin
For very old printers, CRLF makes sense because you sometimes want to do only CR to print two lines on top of each other, for underlining, bold print and various composed characters. (One could also use BACKSPACE for this)
You might want to use a naked LF too, though that would be a rarer case. (When writing a narrow column of text down the middle of the paper, without having the print head moving all the way to the margin for each line)
Since, in the very early days, you didn't want to waste CPU on processing files when printing them, you stored the files with CRLF in them. This meant a file could contain underlining etc. in a way the printer handled directly.
When Unix was created, they decided that you would always use a device driver of some sort to talk to printers, and that driver could convert things as needed. Given that, it made more sense to use a single character for newline.
They decided that a lone LF should be converted to CRLF for output, while a lone CR would just be sent through. This way you could still do overprinting like you used to, without having to waste that extra character on lines that didn't need it. However, you lost the possibility of sending naked LFs to the printer, so this was a trade-off.
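The conversion rule described above can be sketched in a few lines of Java. This only illustrates the rule as stated in this comment and is not actual Unix driver code; the class and method names are hypothetical.

```java
class PrinterNewlines {
    // Hypothetical filter: a lone LF becomes CR LF, everything else
    // (including a lone CR used for overprinting) is sent through as-is.
    static String toPrinter(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        char prev = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\n' && prev != '\r') {
                out.append('\r'); // supply the carriage return the file left out
            }
            out.append(c);
            prev = c;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "bold" is overprinted via a lone CR, then the line ends with a lone LF.
        String printed = toPrinter("bold\rbold\nnext");
        System.out.println(printed.replace("\r", "<CR>").replace("\n", "<LF>"));
    }
}
```

Note how the trade-off mentioned above shows up here: once every lone LF is expanded, there is no way left to send a naked LF to the printer.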
MS-DOS came later than UNIX, but UNIX was still a small player in a field with very many players. MS-DOS decided to go with CRLF, which was not uncommon in those days. (I must admit I have no idea how common, I was still a wee lad at the time)
As for which slash to use as a path separator, there were a lot of different practices for that too. The whole concept of directories and paths was quite new and was being reinvented all over the place with different syntaxes. Again, there was no reason for MS to copy Unix in particular, so they chose backslash as the character least likely to appear in file names.
Microsoft could have changed this later on, but breaking backwards compatibility is not done lightly.
Admin
The format specifiers which do not correspond to arguments have the following syntax: %[flags][width]conversion
And, later:
'n' line separator The result is the platform-specific line separator
Which is to say, since neither flags nor width apply to newline, %n is a platform-specific line separator.
Not sure why it doesn't work for you.
-fred
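A minimal sketch of the documented behaviour, for anyone who wants to try it. The class and method names are made up; String.format and System.lineSeparator() are the standard library calls.

```java
class FormatNewlineDemo {
    // %n takes no argument and expands to the platform-specific line separator.
    static String twoLines() {
        return String.format("Line1%nLine2%n");
    }

    public static void main(String[] args) {
        System.out.print(twoLines());
        // Compare against the separator the JVM reports for this platform.
        String nl = System.lineSeparator();
        System.out.println(twoLines().equals("Line1" + nl + "Line2" + nl)); // true
    }
}
```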
Admin
"make sure you disassemble to "IL" code so you can see what the compiler truly generates." Except, as I've already pointed out, what the compiler generates isn't what gets executed because of JITing. It's perfectly possible for the JITer to chose to inline the property, meaning that Test1 would be better than Test2. In my preliminary investigations, it seems that it's not inlined, while all literature suggests it should be :S
"The end user doesn't care about how readable your code is (you should document confusing code), they just care about getting results and getting them as soon as humanly possible." Not true. Depends on the type of consumer. If it's a home user, they care not only that it's fast, but also that it is to a degree stable. If it's a business or power user then they also tend to care that bugs get fixed quickly. Both of these (stability and maintainability) require clear, clean and easy to understand code.
Admin
Wow! For a brief moment I thought that this was directed at me... Just convert the '\n' legacy code (that I wrote) to a system-compatible equivalent string programmatically and, come on, that's not so bad, is... WHAT?!?!
Oh yeah, this is thedailywtf.com. Don't prognosticate, because the very reason to read this is that these stories defy all reason and justifiability. Just remember: CD tray as cup holder! CD tray as cup holder...
Admin
I just hope that the people here who keep using '\n' know what they're talking about. The '\n' is a C construct that is LF while in code/memory, and expands to CR, LF or CRLF when written to disk (depending on the platform).
This code
will fail spectacularly on Windows, producing CRCRLF once it's written :) (I wonder, would it produce CRCR on Apple?)
Admin
No, for a small number of strings String.Concat is the correct way to do it; it doesn't parse the string for arguments.
Admin
what other platforms are people running .net on?
Admin
Except that IL (being the Intermediate Language), of course, isn't executed. It's basically just a preparsed C# that's quicker to compile by the JITter. If you're not looking at x86 (or x64, or IA-64, etc.) assembly, then you've also missed the "true" cost.
Oh - and I'd be curious as to what kind of braindead program would have string concats of Environment.NewLine as its bottleneck. Profile, and then optimize. Anything else is just bound to be wrong.
Admin
"To predict or forecast, especially through the application of skill."
Entirely relevant to the OP, I would say.
Admin
I have the impression that things are going this way:
In fact, CSAML is able to rid itself of every symbol used in old-syntax C#. For example, consider the following old-syntax C# assignment statement:
This statement translates without much fuss into the following chunk of CSAML:
I first thought that this guy was making a joke, but it seems that he is bloody serious about it. Look here: http://www.charlespetzold.com/etc/CSAML.html
Anybody out there who wants to fight against such insaneness? I will stop programming when I have to write code like that.
Somebody has to stop M$'s stupidity. What's the future of programming? In the next phase we will also have to put XML in another format like eXtended XML (XXML)? Where is the end? Writing code like
A + B = C
should be 20 lines of code (or more) with several hundred characters? The guys who promote such code should be shot before they create greater damage!!! Unbelievable!
Admin
Well, this is not to defend M$: before computers there were typewriters and they had the possibility to Return the Carriage and to Feed a Line. Both were separate "processes" and hence the CRLF instead of a \n. Doesn't make much sense in our days but explains the Why.
Admin
What about:
Environment.Characters.L + Operation.Concatenate + Environment.Characters.i + Operation.Concatenate + Environment.Characters.n + Operation.Concatenate + Environment.Characters.e + Operation.Concatenate + Environment.NewLine
This must be THE ONLY RIGHT WAY!!!
Admin
Pleeze sent me teh codez to create a new line
Admin
Hehe, wonder how the source for Environment.NewLine looks... >:o)
Admin
Or you can just do this
#define CRLF "\r\n"
So much typing...ahhh, I think I am going to break my fingers!
Admin
and therein is proof positive of Windows inherent bloat and redundancy.
Admin
Nah, only one of the three is wrong. The bloated and redundant one. OSX and Linux are just different flavours.
Admin