Admin
No. Just no. It does not make sense, or at least it's a case of top-tier first-stripe POLA violation. It is surprising that null + "some string" doesn't explode with a null pointer exception, especially if it's a naked null.
Or maybe that's too many years doing C++, where calling a member function (like string::operator +(const string &other_string)) via a null pointer is automatically UB.
Either way, it might be useful that it works like that, but it doesn't make sense.
Admin
Why not? Null means nothing. As in, not missing or unknown value or anything like that but literally nothing. An empty void without a type or a size or a value or anything.
And nothing + anything else = that anything else.
Admin
it works because in this case the compiler converts the "+" into a call to String::Concat, not into a call to an overloaded operator. And if you write string x = null + "something"; the compiler converts it to just string x = null + "something" Of course, don't ask me to judge if it makes sense or not :D
Addendum 2023-08-01 07:18: even better (or worse, depending on the point of view): there is no overloaded operator + or += AT ALL in the .NET "String" class, so I suppose all compilers for all supported languages always convert concatenation operators to calls to the static method String::Concat.
Addendum 2023-08-01 07:19: p.s. when I wrote:
the compiler converts it to just string x = null + "something""
I meant
the compiler converts it to just string x = "something"
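A minimal C# sketch of the behaviour described in this comment, assuming a reasonably recent compiler. The class and method names (ConcatDemo, WithOperator, WithConcat) are made up for illustration; only string.Concat is a real framework method.

```csharp
public static class ConcatDemo
{
    // String + on two strings: a null operand contributes nothing to the result.
    public static string WithOperator(string prefix, string suffix)
    {
        return prefix + suffix;
    }

    // The static call that, per the comment above, the compiler emits for string +.
    public static string WithConcat(string prefix, string suffix)
    {
        return string.Concat(prefix, suffix);
    }
}
```

Calling either method with a null prefix and "something" as the suffix should give back "something" without throwing, because no instance member is ever invoked on the null reference.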
Admin
And if you prefer Java:
p.s. TRWTF is of course using Hungarian notation in C#, such as "string strName" (someone may say TRWTF is using Hungarian notation at all, but YMMV... it looked like a good idea 20+ years ago, and in some... places... was almost enforced...)
Admin
Another side effect, if strName is empty, the method returns null. And if strName is null, it outright crashes.
Admin
I don't really speak C#, or Java - but I'm actually pretty surprised taking .length on a null value is apparently OK. Hell, even in Javascript this would blow up with a runtime error, and in Typescript it wouldn't compile (assuming the argument is typed correctly to allow null, and isn't any).
Admin
No I'm with Steve. It's horrible. Firstly, the declaration should have been
i.e. an empty string, which is more self-documenting. Secondly (and I'm not an expert in C#), isn't null untyped? Does this make sense?
It's all just horrible, and null is definitely not a string, so string concatenation should not work.
Addendum 2023-08-01 08:03: sorry s/x/Twitter/g
Admin
Urgh - This is the kind of code you would write in C in the 90's before we had libraries to do that sort of thing
I'm with Remy on this - it's one of the rare (in this forum) cases where a regex would be the most appropriate solution, but if you're insistent on not using regex, something like this:
while (s.contains("  ")) { s = s.replace("  ", " "); }
would be simpler than the minefield of iterating through a string
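For comparison, here is a hedged C# sketch of both approaches mentioned in this comment: the replace-until-stable loop and a regex. SpaceCollapser and its method names are hypothetical; the regex only targets runs of literal spaces, which is an assumption about what the original code was meant to do.

```csharp
using System.Text.RegularExpressions;

public static class SpaceCollapser
{
    // Loop version: keep replacing double spaces until none remain.
    public static string WithLoop(string s)
    {
        while (s.Contains("  "))
        {
            s = s.Replace("  ", " ");
        }
        return s;
    }

    // Regex version: replaces each run of two or more spaces with a single space.
    public static string WithRegex(string s)
    {
        return Regex.Replace(s, " {2,}", " ");
    }
}
```

The loop allocates a new string on every pass, so the regex (or a single pass with a StringBuilder) is likely the better fit for long inputs.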
Admin
There is so much wrong with this code, from not using StringBuilder over indexing the string multiple times to using char.Parse(" ") over a simple ' '. Whoever wrote this code has zero experience writing C# code obviously.
Admin
And now you have two problems...
Admin
Directly accessing a member that is null won't work in C#. The operator works because it's basically
so the code turns into:
Admin
The fact null + "some string" evaluates to "some string" is clearly saner than in JS (where it would evaluate to "nullsome string"), but it is still not great.
null is not a string value. You can see it as a reference to the absence of a string if you want, but it is still not a string value.
It is not the empty string. It is not the so-called "null character" (\0 in C). In fact, null is not a character at all.
null is neither the abstract idea of an empty string (string with zero characters) nor the C idea of an empty string (string made of a '\0' character followed by garbage in the rest of the allocated memory).
So don't concatenate a string with null. Assign a string value to a string-typed variable that has the value null.
But don't concatenate a string with null.
The next time you suggest that it is okay to concatenate a string with null, you won't need a code review, you'll outright need an exorcist.
Admin
And what would be that + operation that we can apply to anything?
And what would be the truth table of that = comparison operator?
Can you geniusly add or concatenate any abstract data type?
Are you really 100% sure the comparison would always be true without a surprise? (even with tricky values like NaN?)
Think carefully before you answer, or we'll happily replace you with ChatGPT.
Admin
Well, that plus a whole bunch of bunches of people who failed to grasp what Simonyi had in mind with the original Hungarian notation, where the wart on the front of the variable name indicated "flavour", what kind of data was in the int or string or char * or whatever, rather than the implementation type. It was a way to work around the inadequacies of the language's type system, since all you have in C for strings is char *, and you can't tell from the language type whether that particular char * contains HTML-encoded, base64-encoded or raw characters, for example. At least with "applications" Hungarian, you have a chance.
But lots of folks thought he meant "describe the language type" ("systems Hungarian"), leading to WORD wParam (in Windows message procedures, where "wParam" was literally the "word parameter") and similar sins. Curiously, though, there are still lots of AHN parameters in Windows, e.g. all those integers that are named cbSomething - a Count of Bytes.
Admin
It's not OK, in either C# or Java. It throws a NullReferenceException in C#, or NullPointerException in Java
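A small C# sketch of that point, with a hypothetical helper name (NullLengthDemo.LengthThrows). It only reports whether reading Length on the given reference throws.

```csharp
using System;

public static class NullLengthDemo
{
    // Returns true when reading Length on the given reference throws.
    public static bool LengthThrows(string value)
    {
        try
        {
            _ = value.Length; // instance member access on a null reference throws
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```

Unlike the + operator, Length is an instance property, so there is nothing to shield the caller from the null reference.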
Admin
Oh yeah, I only just realised I was getting mixed up between strName, which is the input and presumably can't be null (not sure if a string in C# is nullable or not but in context I assume not, which makes me happy), and newStrName, which is initialised as null and what the discussion in the article is actually about.
Clearly I am TRWTF here - it's quite a common occurrence...
Admin
string is a reference type and so nullable by default. This is a private method so if you're feeling charitable you can assume that everywhere that calls it has already vetted strName for nullness. If you're not then all bets are off.
Newer versions of C# let you turn on warnings that sort of get close to making reference types non nullable, where it keeps track of whether an object can be null at any given point and warns you if you do anything that assumes it isn't. It's a bolted on, late addition to the language and it shows, but it can be quite useful.
Admin
cd 'somewhere'
for file in *; do
  sed -i 's/  */ /g' "$file"
done
Admin
It's not an "operation that is applied to an object". Operators in C# are ALWAYS static methods, e.g. on the string class in this case, and thus CANNOT (and should not) apply a special meaning to the first parameter. You may argue that it feels weird to concatenate something with null, but that is not really what happens from a language perspective - if you rewrite the operator as a static function call it feels less weird. If my memory serves me right, language projections onto the CLR like VB.NET only ever see the methods, anyway (I guess strings might be handled differently).
Admin
Yes.... but string concatenation works differently from every other operator on every other type. For performance reasons, the string concatenation operator is compiled down to calls to various overloads of Concat, and the documentation for Concat says: "String.Empty is used in place of any null argument."
Addendum 2023-08-01 13:08: ... in addition, Concat happens to be a static method, so it will not throw a null reference exception.
Admin
Replacing any \s+ with a space does mean that if a string ends with a space and a newline, you keep the space and toss the newline. That may not be what is wanted, even if it's "according to the letter of the spec".
Admin
I always love the reflexive "two problems" response to any use of regex.
Sure, regexes can absolutely be mis-used, or be difficult to interpret, but if I've learned anything from reading the DailyWTF, it's that everything (string concatenation, in this case) gets misused and can be difficult to interpret.
Admin
While I agree the regex is the solution for the job, if you're wedded to the parsing approach there's no need to go to the hassle of starting at 1. Rather, chop one off the loop iteration and compare the character to the next character (or, more simply, just compare both characters to a space) rather than the previous. At the end, copy the last character outside the loop.
Result: It performs as it should, all duplicate whitespace is removed, no extra conditional in the loop.
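One possible reading of that description, sketched in C#. LookaheadCollapser is a hypothetical name, and the sketch assumes only literal spaces need collapsing and that an empty or null input should be returned unchanged (the description does not say).

```csharp
using System.Text;

public static class LookaheadCollapser
{
    public static string Collapse(string s)
    {
        // Assumption: nothing to do for null or empty input.
        if (string.IsNullOrEmpty(s)) return s;

        var sb = new StringBuilder(s.Length);
        // Stop one short of the end so s[i + 1] is always valid.
        for (int i = 0; i < s.Length - 1; i++)
        {
            // Skip a space that is immediately followed by another space.
            if (s[i] == ' ' && s[i + 1] == ' ') continue;
            sb.Append(s[i]);
        }
        // The last character is never skipped, so copy it outside the loop.
        sb.Append(s[s.Length - 1]);
        return sb.ToString();
    }
}
```

Each run of spaces keeps only its final space, and the work stays linear in the length of the input.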
Admin
My legilimency suggests the original task was this: 'remove leading and trailing whitespace, change inner whitespace sequences to a single space', so String.Trim would have been a good start.
Admin
I'm fairly sure that "neat" solution runs in O(n^2) time where the original runs in linear time.
Admin
Coercing null to an empty string is a bit of a code smell - it is hiding a bug (setting newStrName to null instead of String.Empty).
I am a bit anti-regex, but in this case it is the best solution. Although I would be questioning the specification - probably \h is better than \s, as they probably do not want to replace newlines/paragraphs with spaces.
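To make the newline concern concrete, here is a hedged C# sketch of what replacing \s+ with a single space does. WhitespaceRegexDemo is a hypothetical name; Regex.Replace is the standard .NET call.

```csharp
using System.Text.RegularExpressions;

public static class WhitespaceRegexDemo
{
    // Replaces every run of whitespace, newlines included, with one space.
    public static string Collapse(string s)
    {
        return Regex.Replace(s, @"\s+", " ");
    }
}
```

With this pattern, line breaks inside the text are folded into spaces, and an input ending in a space followed by a newline keeps the space and loses the newline.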
Admin
Someone would argue that throwing a regular expression in doesn't make things simpler. I think \s+ is simple enough that it won't cause issues, but it's also simple enough that you could've done without it.
Admin
Not to mention masterpieces such as the "lpsz" prefix: Long (that is, you know, 32-bit) Pointer to Zero(null)-terminated String, a bombastic name for humble char*. Conventions that mysteriously changed when moving from a Hungarian prefix to its related typedef, e.g. "sz" became STR, such as in LPCTSTR (itself another masterpiece). The apex (no, not that APEX!) was MFC, where Hungarian notation infected even the class names: they had to start with C, such as CString, CFile, CPtrList (*) ... because God forbid someone mistakes them for filthy structs! :D
(*) & (Argh! the memories! CPtrList and its sibling CObList, those dreadful WTF bringers! the horror! the horror! )
Admin
I would like to thank you.
I started writing a reply to your post and it got ever longer and more complicated as I had to think more and more about it. And eventually I came to the conclusion that my original opinion was wrong.
For reference, the opinion I held was that since null = nothing the compiler should just safely optimize it away. As in 1 + 2 + null + 3 = 1 + 2 + 3. But I now see how this could lead to interesting behavior and bugs.
It's not often that sort of thing happens to me any more these days.
Admin
I'm pretty sure that premature optimization is the root of all evil https://stackify.com/premature-optimization-evil/
Admin
The code would benefit from at least a null check and an empty check for strName, so null.Length() and "".Length() (as in "for (int i = 0; i < 0; i++)") would not show unwanted behaviour. On the other hand, the code itself is unwanted, so it should probably replace itself with one space.
Admin
And you overstated your case so much that you became WTF yourself. You want to distinguish Interfaces (I series of names) from implementation Classes (C series of names). There were pretty good reasons for this naming rule.
Admin
Meh. null + whatever is a choice. One can argue for and against any choice. In SQL, CONCAT(NULL, 'string') evaluates to NULL. Is that more or less surprising than it evaluating to 'string'? I'd say less in SQL, apparently more in C#. Neither is inherently wrong.
Admin
null is the absence of a value, but not really "untyped". So if you try to call anything with your "x" that expects a parameter that isn't of type "MyClassThatIsNotAString", the compiler will yell at you to fix that.
I originally wanted to say that for the code you've suggested the compiler would yell at you, but it actually doesn't. The plus sign operator has a function defined for the parameters object and string that will return object.ToString() concatenated to the string, with null handling that treats it the same as the empty string. So in your case the compiler is happy, and at run time after the snippet your "someString" will just be the empty string (since it's just two empty strings concatenated).
Unless you overload the plus sign operator to do something different in case the compiler encounters string + MyClassThatIsNotAString. e.g. you could have the following defined:
public static string operator +(string x, MyClassThatIsNotAString b) => "hi";
in which case your someString would end up holding the value "hi". (or gloriously crash in case b is null and you try to work with it)
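A self-contained C# sketch of that overload, assuming the operator is declared inside MyClassThatIsNotAString (a user-defined operator has to live in one of its operand types). OperatorDemo and its method names are hypothetical helpers added for illustration.

```csharp
public class MyClassThatIsNotAString
{
    // User-defined overload chosen for string + MyClassThatIsNotAString.
    public static string operator +(string x, MyClassThatIsNotAString b) => "hi";
}

public static class OperatorDemo
{
    // Resolves to the user-defined operator above.
    public static string Combine(string x, MyClassThatIsNotAString b) => x + b;

    // Built-in string + object: a null object contributes nothing.
    public static string CombineWithObject(string x, object o) => x + o;
}
```

Because this overload never touches b, it returns "hi" even when b is null; an overload that dereferenced b would throw in that case.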
Admin
You're right, it may be a bias on my part: since the "I" prefix made its way to .NET (and it is still ubiquitous there today) while the "C" prefix did not, I may have been tricked into considering the former "normal" and the latter "evil".
Anyway, I'm not sure about the other thing you say: the two prefixes may have different origins and so different rationales. At least in the Microsoft world, and as far as I can remember (please correct me if I'm wrong!), the "C" prefix originates from MFC, which never used the "I" prefix for interfaces (hmmm... maybe MFC had few or NO interfaces at all? I mean, of course, "C++ classes with all pure virtual member functions and no data", since there is no "interface" keyword in C++). Instead, the "I" prefix originates from the COM Specification and its IUnknown interface ... did COM advocate the "C" prefix for concrete objects as well, to distinguish them from interfaces? It does now (look for "Coding Style Conventions win32"... yes, unfortunately, COM is still alive), but I'm not sure it did THEN...
On the other hand, Java was born in the Nineties, like COM and MFC, but it never felt the need for a prefix to distinguish interfaces and classes... so ... well, I don't know ... there may have been "pretty good reasons for this naming rule" as you say ... or not ?
Sometimes I wonder if the entire programming universe is just a big WTF :D
Admin
As an alternative to regex, instead of c == ' ', one could use char.IsWhiteSpace(c). If you want to exclude line separators from removal, you can do an additional check for char.IsControl(c).
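A rough C# sketch of that idea, with a hypothetical class name (CharClassCollapser). It assumes each run of non-control whitespace should become one plain space; note that the char.IsControl check also exempts tabs, since '\t' is a control character just like '\n' and '\r'.

```csharp
using System.Text;

public static class CharClassCollapser
{
    public static string Collapse(string s)
    {
        var sb = new StringBuilder(s.Length);
        bool previousWasBlank = false;
        foreach (char c in s)
        {
            // Collapsible: whitespace that is not a control character such as '\n'.
            bool isBlank = char.IsWhiteSpace(c) && !char.IsControl(c);
            if (isBlank)
            {
                // Emit one space for the first blank of a run, skip the rest.
                if (!previousWasBlank) sb.Append(' ');
            }
            else
            {
                sb.Append(c);
            }
            previousWasBlank = isBlank;
        }
        return sb.ToString();
    }
}
```

Newlines pass through untouched and break a run, so spaces on either side of a line break are each kept as a single space.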
Admin
My favorite part of this comment is all the way at the end, where it implies that a language that is useful doesn't make sense.