Admin
The PHP docs are a joke anyway. Sometimes not quite right (but you won't figure that out until some hours of testing and debugging), sometimes plainly wrong (at least you notice that very quickly) and sometimes they flat out lie to you (by saying "if this and that flag isn't set, the return value is FALSE", which is clearly a lie).
I know, I know. "You can build horrible stuff with any language!". It's just that PHP makes it so incredibly easy to build horrible stuff, because of how the language is constructed (is '0' a valid return value of the function or does it indicate an error?) and how unprofessional some of the developers/maintainers of the language sometimes act ("You don't need a 'finally' because we say so!").
Admin
Your main error here is assuming that PHP is constructed. It's not, it's accreted. The docs reflect this.
(Seriously, though, there are good bits in PHP. It's just that the accreted nature of the language makes it far too easy to use the bad bits.)
Admin
"there are good bits in PHP"
There are. It's easy to use, runs (almost?) anywhere and it costs nothing. Can't get much better than that.
"It's just that the accreted nature of the language makes it far too easy to use the bad bits."
Well, it doesn't really help that it's broken by design. ("foobar" == true) is true. And ("foobar" == 0) is also true. But (0 == true) is false. Furthermore (null == 0) is true. Conclusion: Don't use "==", because it breaks things (but at the same time, you are stuck with it because there's no >== or <==).
Admin
"The PHP docs are a joke anyway"
True, but the whole "language" is a joke already so it's kinda fitting. Quite why so many developers seem to think it's a real language and actually choose to work with it, is one of the great mysteries of the industry.
Admin
PHP is an interpreter and a thin wrapper on top of the mostly C libraries that deliver its functionality. If you think of it that way and treat it that way, it's fine. The biggest mistake with PHP is thinking of it as a proper language with its own conventions and constructs. It's not; it's glue for using a bunch of C libraries and a way to save some resources over spawning CGI processes over and over. When you think about it in those terms you can write decent code.
Admin
I have to admit, I did a double take when I read the "eval" in your text. Yes, I didn't notice it at first, hiding so well among the echos.
Admin
"I informed you thusly."
Admin
To be fair, languages like C# and Java also have problems with ==. Java locks you into, and C# defaults to, referential equality of heap objects. Who thought that was a good idea? It's a completely useless feature which causes great confusion and many bugs. Both have a virtual Equals method on the Object class, which is nonsensical: for 99% of classes, the concept of equality doesn't make sense. I posted a suggestion to remove Equals from Object in .NET and was dismissed and laughed at. Perhaps it's a huge breaking change given the proliferation of existing code, but I maintain it's a necessary improvement.
Admin
To be fair, the first comment on the page is:
"Keep the following Quote in mind:
If eval() is the answer, you're almost certainly asking the wrong question. -- Rasmus Lerdorf, BDFL of PHP"
posted 15 years ago.
Admin
That's not really broken. It works exactly as one would expect for a loosely typed language. Other interpreted languages, such as JavaScript, are equally bonkers if you're expecting strong types.
See also: http://www.jsfuck.com
Admin
Everyone: XHTML5: exists.
Admin
With all due respect, that was a very naive suggestion. It's not just a breaking change for code written in those languages - it's a breaking change for essential elements of their own standard libraries. And I have to disagree with your assertion that "for 99% of classes, the concept of equality doesn't make sense". It's more that "for some classes, reference equality is the only kind of equality that makes sense". And that's an important distinction...
Let's pretend we didn't have universal equality, and look at good ol' Set<T> (which I think would be ISet<T> in Microsoft-land?). Well, a set is a collection of zero or more elements such that no two elements are equal. Right away, we know we need some general concept of "equality". If we don't want universal equality, we'd need an interface that means "elements of this class can be compared for equality". And that's not a wrong approach, per se. It's what Haskell does. But let's look at what that means for us. It means our Set<T> would become a Set<T extends Eq<T>>. That's kind of inelegant, but it's still not wrong.
But... now we can't create heterogeneous Sets (actually, we can't even parameterize Set that way). And I haven't demonstrated it, but I think you can see how the same argument means you can't create a Map with a heterogeneous key set. So now you can't memoize a function whose domain doesn't implement that Eq interface. And you can't maintain a pool of objects of a class that doesn't implement it. And the kind of objects you'd want to maintain a pool of, by and large, are exactly the objects where reference equality is the only kind that makes sense - after all, if you're going to yank one out of a pool and completely replace its identity for a new purpose, you'd better hope nothing else ever thought it was equal to it!
So, there are two common use cases that just don't work without universal equality. I'm sure you can come up with more if you think about it.
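For readers who want to see the constraint being argued about, here is a minimal Java sketch of a set whose element type must opt in to equality. Everything in it is hypothetical: Eq, EqSet and Vec2 are illustrative names, not JDK types, and the linear scan is only there to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

final class EqSketch {
    /** Hypothetical opt-in equality interface; nothing like it exists in the JDK. */
    interface Eq<T> {
        boolean eq(T other);
    }

    /** A deliberately naive set: linear scan, no hashing. */
    static final class EqSet<T extends Eq<T>> {
        private final List<T> items = new ArrayList<>();

        /** Adds the item unless an element equal to it is already present. */
        boolean add(T item) {
            for (T existing : items) {
                if (existing.eq(item)) {
                    return false;
                }
            }
            items.add(item);
            return true;
        }

        int size() {
            return items.size();
        }
    }

    /** An example element type that defines its own equality. */
    static final class Vec2 implements Eq<Vec2> {
        final int x;
        final int y;

        Vec2(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean eq(Vec2 other) {
            return x == other.x && y == other.y;
        }
    }
}
```

With these definitions, EqSet<Vec2> works, but EqSet<Object> is rejected by the compiler because Object does not implement Eq<Object>. That is the "can't even parameterize Set that way" point in concrete form.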
Admin
Hi, my name is Robin <?php $db->exec('drop table users')
Admin
"it's a breaking change for essential elements of their own standard libraries."
I understand, which is why it would be a multi phase transition over a few major versions, with compiler warnings and obsolete attributes pushing people away from the legacy approach over years.
"And I have to disagree with your assertion that "for 99% of classes, the concept of equality doesn't make sense". It's more that "for some classes, reference equality is the only kind of equality that makes sense". And that's an important distinction..."
Reference equality is by definition the wrong approach. How and where objects are stored says nothing about the logic of the program. I can count on one hand the times I needed to use it in more than a decade. And object.ReferenceEquals should still be there, for those edgiest of edge cases.
Regarding collections like Set<T>, what you're describing as inelegant is exactly the right, elegant approach. Either you have an equality comparer, which operates on T x and y, or T implements IEquatable. Allowing any class to be T creates all sorts of undefined behaviors. What if the author of the type adds an Equals override? GetHashCode?
I'm not quite certain what you mean by "pool" and needing reference equality, but the version of a generic collection which uses an equality comparer can be created with an instance of referential comparer, satisfying your requirement. The advantage is that you explicitly told the compiler and future developers that you're using referential comparison, and not leaving it to implicit default behavior and others having to possess a higher than average understanding of the platform to be able to read the code.
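The comment above is about .NET, but Java's standard library has a rough analogue of "a collection created with an explicit referential comparer". A minimal sketch follows; the class and method names IdentitySetSketch and newIdentitySet are made up for illustration, while Collections.newSetFromMap and IdentityHashMap are real JDK APIs.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

final class IdentitySetSketch {
    /**
     * Creates a set that compares elements with == rather than equals(),
     * so the choice of reference equality is visible at the call site.
     */
    static <T> Set<T> newIdentitySet() {
        return Collections.newSetFromMap(new IdentityHashMap<T, Boolean>());
    }
}
```

Two distinct String instances with the same characters count as two elements in such a set, whereas an ordinary HashSet would keep only one.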
Admin
Regarding heterogeneous collections, once again, the items either have a shared base type, or you can supply a comparer which handles object class comparison, thereby explicitly communicating your intent.
Admin
(And then Naomi ran into the character limit!)
And there's another subtler problem - what if you have an interface you know you'll need to compare for equality (perhaps because you'll put them in a set) - but you don't want to specify how implementations define equality? You can't let the implementations implement the hypothetical Eq interface, because you need a set of the interface type itself - so you'd have to make your interface extend Eq, and now your implementations have to define their equality in terms of what's available on the interface, or do a runtime type check - just like you would with universal equality.
This one's not the end of the world, which is why I'm bringing it up last, but it would be pretty weird to have to architect a class differently because someone else might want to put it in a set. It's certainly more strange than universal equality would be.
In conclusion, Object#equals(Object) might look strange. It is strange. But it supports some very important use cases, notably object pooling, and meeting those use cases without it would be even stranger. Reference equality is a sensible default for types that don't need to override it. And types that do need to override it (mathematical vectors come to mind) can use instanceof (IIRC, you'd use is or as in C#) in their equals implementations. I don't believe that there is an acceptable replacement, and removing a useful feature because it seems somehow impure is a bad idea.
(Okay... there are other approaches, but they're not appropriate for Java or C#. C++ uses templates as compile-time duck typing, and Haskell does have an Eq typeclass (typeclasses are the Haskell analogue of interfaces) which works the way I described, but object pools aren't a sensible concept in Haskell, and you usually let the compiler memoize things for you.)
Admin
I can't see how this wouldn't be an intentional backdoor, maybe for a crude way of debugging. There's nothing eval("?>". somestring); would do, if used naïvely, that a simple echo wouldn't provide. If, however, the submitted string contained something like <?php echo "alert('$myvar: ".$myvar."');";, we're talking…
Admin
Your first response got held for moderation, so I'll reply to it when I can see it. But I don't think "supply your own equivalence relation" covers it specifically because we need to support heterogeneous collections. I mean, it's one thing if you know your set is going to contain, I dunno, nothing but two-dimensional and three-dimensional vectors. But if you don't know the contents ahead of time, the only sensible relation would be "use equals if they implement it, and reference equality if they don't", and now we're right back where we started, with an extra step along the way.
Admin
(I've actually been writing some frameworky algorithms that rely pretty heavily on heterogeneous memoization tables, as it happens. So if I keep coming back to that example, it's because it's right at the forefront of my brain.)
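A heterogeneous memoization table of the kind mentioned here might look roughly like the following in Java. This is only a sketch under assumptions: MemoSketch and memoize are invented names, and the real framework code is not shown in the thread. It leans on exactly the behaviour under debate, namely that every key has equals() and hashCode(), overridden or not.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class MemoSketch {
    /**
     * Wraps fn so each distinct argument is computed once. Keys may be of
     * any type: classes that override equals()/hashCode() are matched by
     * value, all others fall back to reference equality.
     * Caveats: null results are not cached, and fn must not call back into
     * the memoized function.
     */
    static <R> Function<Object, R> memoize(Function<Object, R> fn) {
        Map<Object, R> cache = new HashMap<>();
        return key -> cache.computeIfAbsent(key, fn);
    }
}
```

Calling the memoized function with "a" and then with new String("a") hits the cache the second time, while two separate new Object() arguments are computed separately.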
Admin
I'm confused by your interface example. If you have IFoo, and you need Set<IFoo>, in my world, you can either a) create FooEqualityComparer : IEqualityComparer<IFoo>, which allows you, the set creator, to control the equality. Yes, you can only rely on interface members in this case, but that's the whole point of having interfaces, isn't it? Or b) IFoo extends IEquatable<IFoo>, then implementors of IFoo, such as Foo1 and Foo2, have to implement Equals(IFoo o) as they see fit.
Now I'll take it even further, let's say you have a unique set of requirements where you have a bunch of types which like you said may or may not define equality and you need a very heterogeneous collection, say containing Cats and Tractors. (Hard to imagine why, but I'll grant you that.) Even in that case, you can use Set<IEquatable<object>>. All your items have to implement IEquatable<object>, even those which depend on reference equality (becomes just a wrapper around object.ReferenceEquals). So, Cat.Equals calls ReferenceEquals, whereas Tractor.Equals compares the VIN (after casting the other object to Tractor).
Let's go even further. Let's say, you cannot rely on a base class, or IEquatable implementation, or IEqualityComparer<T> because there's no good T to cover all scenarios, and you can't mandate all items have to implement IEquatable<object>. We're talking the most edge case of all in the history of programming here. I'd then implement IEqualityComparer<object>, which takes object x and y. In C# 7.3 (I think), we can do this: switch(x) { case Cat cat: return object.ReferenceEquals(cat, y); case Tractor tractor: return tractor.SerialNumber == ((Tractor)y).SerialNumber; } etc. You have casting, but then you would with object.Equals override too. It's inelegant, but this substandard solution to extremely rare challenge is limited to that one solution. The other 99.99% are not affected.
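A Java rendering of that last-resort comparer could look like the sketch below. Cat and Tractor come from the comment; the field names and the sameItem method are assumptions made for illustration, and plain instanceof checks stand in for the C# pattern-matching switch.

```java
final class MixedEqualitySketch {
    /** Compared by reference only. */
    static final class Cat {
        final String name;

        Cat(String name) {
            this.name = name;
        }
    }

    /** Compared by serial number. */
    static final class Tractor {
        final String serialNumber;

        Tractor(String serialNumber) {
            this.serialNumber = serialNumber;
        }
    }

    /**
     * One equality relation for a mixed collection: tractors match on
     * serial number, everything else falls back to reference equality.
     */
    static boolean sameItem(Object x, Object y) {
        if (x instanceof Tractor && y instanceof Tractor) {
            return ((Tractor) x).serialNumber.equals(((Tractor) y).serialNumber);
        }
        return x == y;
    }
}
```

Checking the type of both arguments before casting means comparing a Tractor with a Cat simply yields false instead of throwing.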
"(actually, we can't even parameterize Set that way)"
C#/.NET supports generic type constraints specifically for this purpose.
Admin
If you like, you could use PHP's idea of object equality. It has two: === is comparable to Java and C# in that it compares the two objects as equal iff they're both one and the same instance.
But == checks that the two objects have the same properties and that their corresponding values are equal. It uses == to make those equality tests, so it recurses. No, it doesn't check for circular references.
Admin
This is predicated on the assumption that we need direct, convenient, built-in support for heterogeneous collections. I maintain that we do not. I further maintain that heterogeneous collections are a very bad idea and are ripe for misuse... with one generic exception (see below).
As Mr TA points out, in C# you can use a HashSet<object> with an EqualityComparer that uses Object.ReferenceEquals(), which is a one-off grobble and doesn't really tax the imagination. It's quite elegant, really -- a bit like the C++ STL separation between containers, iterators, and algorithms. I've never encountered an "object pool" and I find it hard to imagine a concrete use, but if I needed one, then a HashSet<object> it would be. If you removed the virtual equality operator from the base object class, I really wouldn't care or notice.
(As an aside, if you're desperate for heterogeneous collections, may I recommend Microsoft.VisualBasic.Collection? It's the bee's knees in the .NET world for this kind of thing. It's also the poster child for abuse of heterogeneous collections. Trust me. I've just translated 1.5 million lines of VB.Net into C#.)
That "below" above? OK, I'll concede that there is a Pattern for using heterogeneous collections. (Virtually nobody in the OO world, as opposed to Haskell, uses it.) In fact, it's quite literally Pattern Matching, which is a nice high-level way of mapping your heterogeneous collection into a set of homogeneous collections. (Plus a bag of shit on the default side. We shall call that bag "PHP".)
But still and all, the only rational reason for having explicit (ie multi-typed) heterogeneous collections in a language is that they are already there. I wish they weren't.
Admin
What if you make the set of objects, and have to create your own equality comparer as you say. That still requires you to know all the object types that will ever be used by that set. Sure, you could have a default case that uses reference equality, but maybe we want to use it with the Tractor type, which has its own way of assessing equality, but the set creator didn't know about?
Basically your idea here is to make the use / implementation of equality comparison between objects more clear, which has its merits. But it also throws out objects having a default equality comparison (referential), for little benefit, but at the cost of making the code more verbose and at times awkward. I'm also not sure of why it ever wouldn't make sense to compare two object instances with referential equality, except where overridden?
Admin
.NET has Object.ReferenceEquals for when someone wants to compare reference equality, but often reference equality is not what someone needs. Add to the confusion that value types may be compared by reference as well, and we have the WTF of unintentional reference comparisons.
Admin
Sorry, that's Perl. (Or should that be @{$#perl}?)
Admin
To be fair, the same is true for every other language.
Admin
No matter how loosely typed a language is, I expect it to not return true if I compare null to 0 or an empty string, though.
Admin
That's basically it, yeah. Ditching universal equality is theoretically elegant, but it has practical ramifications and the workarounds are inelegant. I'll trade theoretical elegance for practical elegance any day.
This was a fun debate, though, Mr. TA! We should do this again sometime. But if you'll permit me the last word:
Mutable data says otherwise ;)
Admin
Tractor can implement IEquatable<object>, then Set<T> where T : IEquatable<object>
Admin
Ref Naomi & Mr. TA. A fine and interesting discussion. Thank you.
I think that had .Net had generics from the git-go then IComparable<T> & IEquatable<T> would have been baked into the Object type and Object.Equals in all its forms would not exist while Object.ReferenceEquals would exist for use in the places it's relevant. Sadly there's no time machine to go fix that.
Unfortunately even that solution has its problems. Just like IDisposable, the interface itself is barely half the story. When implementing the interface properly includes the need to use the same half-page of boilerplate everywhere every time in every inheritance of every class with just a minor tweak in some, that's a recipe for obscure defective behavior. This situation cries out for some better technology that's stronger than an interface but less hairy than multiple inheritance. C# v8.0 now offers optional default implementations for interfaces that might be just the needed tech. Only 17 years after C# v1.0 was loosed into the wild. Too late.
Oh well.
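To give a feel for the boilerplate being complained about, here is the Java analogue for a small value type. The comment concerns .NET, so this is only a sketch of the same pattern on another platform; Vector2 is an illustrative class, not taken from the thread.

```java
import java.util.Objects;

final class Vector2 {
    private final double x;
    private final double y;

    Vector2(double x, double y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;                      // same instance
        }
        if (!(other instanceof Vector2)) {
            return false;                     // also covers null
        }
        Vector2 that = (Vector2) other;
        return Double.compare(x, that.x) == 0
                && Double.compare(y, that.y) == 0;
    }

    @Override
    public int hashCode() {
        // Must agree with equals(): equal vectors yield equal hashes.
        return Objects.hash(x, y);
    }
}
```

Forgetting hashCode(), or letting it drift out of sync with equals() when a field is added, is the kind of obscure defect this pattern invites once it is repeated across many classes.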
Admin
The first thing I noticed was the line
what would happen with nested indices? E. g.