• (disco)
    static public bool RegExp(ref String buffer, ref String compare)
    

    Andy why use ref? The code never assigns a new string to buffer or compare.

  • (disco) in reply to NedFodder
    NedFodder:
    Andy why use ref?

    I'm assuming to "Avoid Overhead: We don't need another copy of the arguments!"

  • (disco)

    Why don't they use matchString? Why do they call the RE string compare?

  • (disco) in reply to Tsaukpaetra

    Assuming this is C#, the arguments are already passed by reference, not by value. You only use the ref keyword if you want to change the string and have the caller see the change. This method didn't change either string, so no reason for that keyword here.

  • (disco) in reply to Tsaukpaetra
    Tsaukpaetra:
    NedFodder:
    Andy why use ref?

    I'm assuming to "Avoid Overhead: We don't need another copy of the arguments!"

    Using ref would actually add overhead, right? Because of pointer indirection.

  • (disco)

    Other thread: https://what.thedailywtf.com/t/the-apple-genius/53652

  • (disco) in reply to NedFodder

    C# does not pass by reference, it passes by value. Of course, the fact that most types in C# are reference types confuses people here. https://msdn.microsoft.com/en-us/library/0f66670z.aspx "To pass a parameter by reference, use the ref or out keyword."

    For that matter, you can't "change a string" in .NET, as the string type is immutable. Yes, I'm being pedantic, but understanding these details is crucial to being able to code in these languages.

  • (disco)
    if ((buffer.Length == 0) || (compare.Length == 0))
                {
                    return bMatch;
                }
    

    ... so if both strings are of zero length, they don't match? That's kind of bollocks.

  • (disco) in reply to Matt_Westwood

    Yeah, the empty regex matches all strings in most implementations.
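
    For what it's worth, this is easy to check. A minimal sketch in C++ with std::regex, assuming its default ECMAScript grammar is representative (the code under discussion is C#, and contains_match is a made-up helper, not anything from the article):

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `pattern` matches anywhere inside `text`.
bool contains_match(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);        // default ECMAScript grammar
    return std::regex_search(text, re);  // search, not a full-string match
}
```

    With this, contains_match("abc", "") and contains_match("", "") both report a match, which is the case the quoted length check appears to rule out.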

  • (disco) in reply to wekempf

    There was a flamewar on here a while back about this. I'm using the general definition of "pass by reference", not the pedantic C# one. If I pass a List<int> to a method, the list does not get copied. Obviously I passed a "reference" to the list, no matter what you want to call it. The ref keyword is more like "reference to a reference".

    And all this pedantry doesn't invalidate my point, there's no reason for ref in this code.

    Filed Under: 500 Internal Server Error

  • (disco)

    The C# regex engine caches the 15 most recently used compiled regexes, so even if you try to recompile the same regex, you'll pay the performance penalty once, as long as you don't need more than 15. (You can also tune this cache size if you want.)

    Python does the same thing, with the difference that it does not have the option to not compile regexes — if you use one of the many convenience methods without explicitly compiling, it compiles it behind the scenes for you.

    Not sure about other languages, but I suspect most other modern high-level languages do the same.
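
    To make the idea concrete, here is a rough sketch of such a cache in C++. It is not how .NET or Python implement theirs; RegexCache, get and compilations are names invented for illustration, and the eviction policy (least recently used) is an assumption:

```cpp
#include <cstddef>
#include <list>
#include <regex>
#include <string>
#include <unordered_map>
#include <utility>

// Illustrative only: a small least-recently-used cache of compiled patterns.
// Assumes a capacity of at least 1.
class RegexCache {
public:
    explicit RegexCache(std::size_t capacity) : capacity_(capacity) {}

    // Returns the compiled regex for `pattern`, compiling it only on a miss.
    const std::regex& get(const std::string& pattern) {
        auto hit = index_.find(pattern);
        if (hit != index_.end()) {
            // Move the entry to the front so it counts as most recently used.
            entries_.splice(entries_.begin(), entries_, hit->second);
            return hit->second->second;
        }
        if (!entries_.empty() && entries_.size() >= capacity_) {
            // Evict the least recently used entry (the back of the list).
            index_.erase(entries_.back().first);
            entries_.pop_back();
        }
        entries_.emplace_front(pattern, std::regex(pattern));
        index_[pattern] = entries_.begin();
        ++compilations_;
        return entries_.front().second;
    }

    std::size_t compilations() const { return compilations_; }
    std::size_t size() const { return entries_.size(); }

private:
    using Entry = std::pair<std::string, std::regex>;
    std::size_t capacity_;
    std::size_t compilations_ = 0;
    std::list<Entry> entries_;
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

    A RegexCache cache(15) would mirror the figure quoted above: asking for the same pattern twice compiles it once, and a sixteenth distinct pattern pushes out whichever one was used least recently.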

  • (disco) in reply to Protoman
    Protoman:
    Not sure about other languages, but I suspect most other modern high-level languages do the same.

    Tcl uses a couple of caches with different lifetime management rules; there's a per-thread cache that keeps the most recent 20 (IIRC) compilations, and a way of caching the compilation in the input RE string's metadata so that if that exact value comes back again (a common case) then that cached compilation is reused. There's no option to not compile; the matching strategy is to compile the RE first…

  • (disco) in reply to wekempf

    This is a useless distinction for both Java and C#. By this logic, there is no such thing as "pass by reference" in any context ever. There is only passing of addresses by value. These languages are not doing anyone any favors by teaching them that their language is strictly "pass by value".

    In Java, you pass objects by reference. In C#, you can pass an object by reference or a variable by reference. Suggesting that these are "pass by value" falsely implies that the callee cannot modify the objects passed in; it very much can.

  • (disco) in reply to TheBuzzSaw
    TheBuzzSaw:
    In Java, you pass objects by reference.

    You pass the object by reference. You pass the object's handle by value. This is the old philosophical difference between a thing and the name of the thing.

  • (disco) in reply to TheBuzzSaw
    function foo(arg)
    {
       arg = b;
    }
    
    a = ...
    
    foo(a);
    

    If the value of a has changed after the call, you're passing by reference. If it has not, you're passing by value, or you're passing a reference by value.
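
    The same test, sketched in C++ because it offers both modes side by side. The function names are hypothetical, and b is passed in explicitly rather than left as a global:

```cpp
// Parameter is a copy: assigning to it changes only the callee's local.
void assign_by_value(int arg, int b) {
    arg = b;
    (void)arg;  // silence "set but not used" warnings
}

// Parameter is an alias for the caller's variable: the assignment is visible.
void assign_by_reference(int& arg, int b) {
    arg = b;
}
```

    Starting from int a = 1, calling assign_by_value(a, 2) leaves a at 1, while assign_by_reference(a, 2) leaves it at 2.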

  • (disco) in reply to PleegWat

    It's absurd to suggest that passing the address/handle of an object is "passing by value".

    function foo(arg)
    {
        arg.add(1);
    }
    
    a = collection();
    
    foo(a);
    

    The size of a most certainly changed. Again, I understand the difference between passing an object by reference and passing a reference by reference. Suggesting that the former is "pass by value" is misleading at best. Everything is pass by value at superficial levels.
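
    A C++ sketch of the contrast being drawn here, with made-up function names. C++ can copy the whole collection, which is roughly what people expect "pass by value" to mean, or copy only a handle to it:

```cpp
#include <vector>

// The vector itself is copied: the callee's push_back is invisible to the caller.
void add_one_to_copy(std::vector<int> arg) {
    arg.push_back(1);
}

// Only the pointer (the "handle") is copied; both copies point at the same
// vector, so the caller sees the new element.
void add_one_via_handle(std::vector<int>* arg) {
    arg->push_back(1);
}
```

    Given an empty std::vector<int> a, add_one_to_copy(a) leaves its size at 0 and add_one_via_handle(&a) raises it to 1. Java's behaviour for objects corresponds to the second function.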

  • (disco)

    If you think that passing by value and by reference is fun to deal with in most languages, wait until you see Apex by Salesforce:

    https://developer.salesforce.com/blogs/developer-relations/2012/05/passing-parameters-by-reference-and-by-value-in-apex.html

  • (disco) in reply to TheBuzzSaw
    TheBuzzSaw:
    Suggesting that the former is "pass by value" is misleading at best. Everything is pass by value at superficial levels.

    No, it's not misleading, it's being pedantically correct. Pass by value means the arguments are copied as if by assignment.

    Strictly speaking, yes, everything is pass by value at some level — in particular, C and Java only have pass by value. Conversely, C++ and C# have both pass by value and pass by reference, since a function which takes a reference parameter can change which object is referred to by a name. At the assembly level, the reference gets converted to a pointer, and that pointer is passed by value in the ABI (e.g. in a register or on the stack), but that's an irrelevant implementation detail, since what matters is the semantics of the language being compiled.

    The concepts of "modify an object" and "modify which object a name refers to" are fundamentally different. Pass by reference can do either, but pass by value can only do the former.

  • (disco) in reply to Protoman

    But my argument is simply that there is no such thing as truly passing by value in Java except with regard to primitives or immutable objects. Can I make a function that accepts a collection but promises not to modify it? No, documentation does not count. It suddenly becomes the caller's responsibility to produce a copy to ensure that the original is protected, which is dumb.
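
    By way of contrast only, and not as a claim about Java: C++ lets the signature carry that promise. A small sketch, where total is a hypothetical function:

```cpp
#include <numeric>
#include <vector>

// The const reference lets the function read the caller's vector without
// copying it; the compiler rejects attempts to modify it through `values`.
int total(const std::vector<int>& values) {
    // values.push_back(1);  // would not compile: values is const here
    return std::accumulate(values.begin(), values.end(), 0);
}
```

    The caller passes its own collection and gets it back untouched, without making a defensive copy first. A determined callee can still cheat with const_cast, so this is a compiler-checked promise rather than an absolute guarantee.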

  • (disco) in reply to TheBuzzSaw
    TheBuzzSaw:
    It's absurd to suggest that passing the address/handle of an object is "passing by value".
    function foo(arg)
    {
        arg.add(1);
    }
    
    a = collection();
    
    foo(a);
    

    The size of a most certainly changed. Again, I understand the difference between passing an object by reference and passing a reference by reference. Suggesting that the former is "pass by value" is misleading at best. Everything is pass by value at superficial levels.

    Yes, pass by reference is just syntax for pass by value. Yes, for many purposes it doesn't matter what's happening, particularly if you're dealing with objects. However, the distinction does matter in certain cases (probably more as you get closer to the bare metal).

  • (disco)

    The 'public' and 'static' keywords are round the wrong way !

  • (disco) in reply to SimpleSimon

    Do you want your methods to be statically public, or publicly static?

  • (disco) in reply to NedFodder

    Good thing that nobody needs RegExes with obscure options like CultureInvariant ("é" == "e"), CaseSensitive, or Multiline. Especially as there is no way to mask flags in C#, not even by InputFlag & !Compiled.

    PleegWat:
    the empty regex matches all strings in most implementations.

    Every string contains the empty string. So I think that's perfectly justifiable.

    Maybe the Apple® Genius™ disliked C# so much because the # reminds too much of an apple slicer?

  • (disco)

    I hope he's checking his strings for null references somewhere else. We've had String.IsNullOrEmpty since .NET 2.0 for a reason. No one should ever write buffer.Length == 0...

  • (disco) in reply to NedFodder

    I'm not particularly fussed, as long as they compile !

  • (disco) in reply to smallshellscript

    Yes, but in 2.0 before the service pack it was broken because the compiler would optimize it wrong. Thus, when it was executed, the null check didn't occur first. Now one should always be using String.IsNullOrEmpty because it reads cleaner, is safer, and compiles well.

    And don't forget we now have String.IsNullOrWhiteSpace in 4.0+. I love that function.

  • (disco) in reply to The_Bytemaster
    The_Bytemaster:
    And don't forget we now have String.IsNullOrWhiteSpace in 4.0+

    Given that it took me the better part of a decade to get us off 1.1, it may be a while before I can use that. It's just like calling IsNullOrEmpty(mystring.Trim()), right :innocent:

  • (disco)

    RegEx are bullshit anyway. You should only ever use them to parse XML.

    Please don't kill me.

  • (disco) in reply to Michael_Mahn
    Michael_Mahn:
    RegEx are bullshit anyway. You should only ever use them to parse XML.

    Please don't kill me.

    s/bullshit/awesome/
    s/to parse XML/all the time!/
    s/don't//
    
  • (disco) in reply to Michael_Mahn
    Michael_Mahn:
    RegEx are bullshit anyway. You should only ever use them to parse XML.

    10/10 masterful troll

    http://images.newschoolers.com/images/17/00/74/99/56/749956.png

  • (disco)

    What is the purpose of bMatch -- did someone spend too much time with Pascal? Simply returning false would be clearer than having to track bMatch through the code.

  • (disco) in reply to TheBuzzSaw
    TheBuzzSaw:
    But my argument is simply that there is no such thing as truly passing by value in Java except with regard to primitives or immutable objects.

    It's the opposite actually, there's no such thing as truly passing by reference in Java. If you use "pass by reference" to refer to what Java does, what do you call it when you actually pass by reference in C# using the "ref" keyword?

    Can I make a function that accepts a collection but promises not to modify it?

    That isn't what "pass by value" means.

    It suddenly becomes the caller's responsibility to produce a copy to ensure that the original is protected, which is dumb.

    Perhaps, but that is a different issue.

  • (disco) in reply to na5ch
    na5ch:
    It's the opposite actually, there's no such thing as truly passing by reference in Java. If you use "pass by reference" to refer to what Java does, what do you call it when you actually pass by reference in C# using the "ref" keyword?

    What is the difference between a thing and the handle to the thing? Many languages are slippery when it comes to that, but it's very clear in Java: the handles to objects are always passed by value (with the resulting effect that objects are always passed by reference). C# allows handles themselves to be passed by reference, so they're really passing a reference to the place where the reference is stored so that it can be written back to.

    C# also allows passing entities themselves by value, provided they are declared as struct types. There is guidance on when to use them; it seems pretty sensible, once you buy into whether to use compound-by-value types in the first place at all.
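
    A rough C++ analogue of passing a compound value type, for anyone who hasn't met the idea. Date and next_year are invented for illustration; in C++ every struct passed this way is copied, whereas in C# it depends on the type being declared as a struct:

```cpp
// Hypothetical value type, standing in for something like a date.
struct Date {
    int year;
    int month;
    int day;
};

// Takes its argument by value: `d` is a full copy of the caller's object.
Date next_year(Date d) {
    d.year += 1;  // modifies the copy only
    return d;
}
```

    The caller's Date is unchanged after the call; only the returned copy has the later year.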

  • (disco) in reply to na5ch
    na5ch:
    It's the opposite actually, there's no such thing as truly passing by reference in Java. If you use "pass by reference" to refer to what Java does, what do you call it when you actually pass by reference in C# using the "ref" keyword?

    Already addressed that. Java and C# pass objects by reference. C# allows you to pass references by reference by way of the ref and out keywords. C# also allows you to actually pass by value thanks to struct definitions. If I receive a DateTime parameter, I can do whatever I want with it with full confidence that I am not accidentally modifying anything outside. I receive a proper copy. If Java were to add the struct keyword, what would it call the passing of values? Still pass by value? Again, it's easy to argue that "pass by reference" doesn't exist in any language anywhere: the passing of something "by reference" compiles down to the copying of some number (address, ID, handle, whatever). The term was invented for a reason. Java made the terrible choice of confusing matters.

    void Process(int value);
    void Process(int* value);
    void Process(int& value);
    

    Regardless of the way C++ names them, Java would consider all these arguments pass by value. Again, I find that incredibly misleading and unhelpful for anyone. These are worth distinguishing. The latter two calls have the ability to modify data you passed in. They pass references to data that the caller is probably holding. But no, Java would rather hand-wave and say, "It's so wonderfully consistent! Java is always pass by value!"

    na5ch:
    That isn't what "pass by value" means.

    I didn't say it did, but it is a useful attribute of actually passing something by value. On that note, answer the question, please. Or admit that Java is fundamentally broken.

  • (disco) in reply to TheBuzzSaw
    TheBuzzSaw:
    Or admit that Java is fundamentally broken.

    So you ignored all the valid things there are to criticise Java about and instead went with this? You're confused and wrong, that doesn't make Java confusing or wrong. It's completely consistent and easy to understand, but first you need to understand the difference between reference types and value types. Which, again, is really easy in Java because only the few primitive types are value types and everything else is reference.

  • (disco) in reply to another_sam
    another_sam:
    It's completely consistent and easy to understand

    No, it isn't. It is very confusing that primitives and class types have completely different treatment with the same syntax. It would be much less confusing if the same syntax had the same behavior universally.

  • (disco) in reply to LB_
    LB_:
    It is very confusing that primitives and class types have completely different treatment

    Primitives and classes are fundamentally different.

    LB_:
    same syntax had the same behavior universally

    It does. It's always pass by value. But when I say "pass by value" you need to understand which value is being passed. If you don't understand, you better learn. It's not complicated.

  • (disco) in reply to NedFodder
    NedFodder:
    The ref keyword is more like "reference to a reference".

    That's a rather confusing statement considering each use of the word "reference" means something different.

    Maybe instead of calling it "passing by reference", call it "copying the variable instead of the value".

    another_sam:
    It does. It's always pass by value. But when I say "pass by value" you need to understand which value is being passed. If you don't understand, you better learn. It's not complicated.

    Java doesn't have actual pass-by-reference like C#, so we don't get the two meanings of "reference" problem.

    Java comes out ahead on this one.

  • (disco) in reply to another_sam
    another_sam:
    It does. It's always pass by value.

    Primitives are pass by value. Class types are pass reference by value.

    Those are very different to me.

  • (disco) in reply to LB_

    The issue as I see it is that the terms "pass by value" and "pass by reference" were in common use well before languages that exposed references in any other way first appeared.

    This has led a lot of people to believe that the right way to test whether your language is doing pass-by-value or pass-by-reference is to check whether subroutines can modify the values of things you pass into them, as seen from the caller after the subroutine returns; if you can, the assumption is that the subroutine arguments are being passed by reference.

    But in languages where certain kinds of variable are always references, and the values they refer to don't actually have their own names, this assumption doesn't work. As you correctly note, the subroutine call mechanism can indeed pass all arguments by value, and if the subroutine can modify the values of the objects you pass in, that's because what you're actually passing is a reference.

    This is not pass-by-reference, because the reference is not generated by the subroutine call's argument passing mechanism - the argument was already a reference before it was passed.

    So I like your "pass reference by value" description. It sums up what's actually going on very neatly.

    In respectable languages like C and Perl, where you need to use explicit de-referencing operators if your variables actually hold pointers or references, all this machinery is fully exposed at source code level and you can easily see what's going on. I blame all the confusion that exists around this issue on these fancy schmancy new languages where the . operator might mean dereference-and-select-member or just select-member, depending on the type of the thing to its left.

  • (disco) in reply to flabdablet
    flabdablet:
    In respectable languages like C and Perl, where you need to use explicit de-referencing operators if your variables actually hold pointers or references, all this machinery is fully exposed at source code level and you can easily see what's going on.

    Languages where all reference types are immutable can pretend that they're passing everything by value always, provided they don't have any way to see what the reference identity is. (Of course, there's not really any reason to have such a capability if everything is immutable except for debugging; you can't actually do anything useful with the knowledge.) Once everything is like that, you don't need explicit dereferencing either.

    flabdablet:
    I blame all the confusion that exists around this issue on these fancy schmancy new languages where the . operator might mean dereference-and-select-member or just select-member, depending on the type of the thing to its left.

    FWIW, . in Java always means dereference-and-select-member (except where it is a namespace separator and a few other things that don't matter for this argument). There are a few cases where things can get confused, but they usually only come up when a programmer sets out to be deliberately malicious (http://stackoverflow.com/questions/24572214/java-name-hiding-the-hard-way/24575207#24575207).

  • (disco) in reply to Bort
    Bort:
    That's a rather confusing statement considering each use of the word "reference" means something different.

    No they don't.

  • (disco) in reply to dkf
    dkf:
    FWIW, . in Java always means dereference-and-select-member

    Educate an ignorant old tortoise: does Java not allow the creation of objects with other objects, as opposed to object references, as members?

  • (disco) in reply to flabdablet
    flabdablet:
    does Java not allow the creation of objects with other objects, as opposed to object references, as members?

    The only things that you can put directly as members of an object are primitives (char, boolean, byte, short, int, long, float, double) and references to other objects. While arrays are a bit special (and also a bit “special”) they still adhere to this rule. The rule is enforced at the level of the bytecode; there's literally no way to talk about putting a compound object directly inside another one.

    The JVM has become very good at managing dynamic memory allocation at high speed.

  • (disco) in reply to flabdablet

    I think a better way to explain would be to talk about whether you can change the identity of passed-in objects or not, where my definition of identity would be: for values up to the size of a word, the value, otherwise the address of the object (or in a GCed language, the object identity provided by the GC). The litmus test for whether a call was pass by reference would then be a = b; modify(ref b); return a == b;. By this standard, C doesn't have pass by reference, only pass-reference-by-value.
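
    That litmus test, sketched in C++ with raw pointers standing in for object identity. All function names here are hypothetical, and the replacement pointer is an extra parameter added so the example is self-contained:

```cpp
// Handle passed by value: the callee can only reseat its own copy.
void modify_by_value(int* b, int* replacement) {
    b = replacement;
    (void)b;  // silence "set but not used" warnings
}

// Handle passed by reference: the callee can reseat the caller's variable.
void modify_by_reference(int*& b, int* replacement) {
    b = replacement;
}

// The test from the post: true means the identity of b survived the call.
bool identity_kept_by_value(int* b, int* replacement) {
    int* a = b;
    modify_by_value(b, replacement);
    return a == b;
}

bool identity_kept_by_reference(int* b, int* replacement) {
    int* a = b;
    modify_by_reference(b, replacement);
    return a == b;
}
```

    The by-value version always reports that identity was kept; the by-reference version reports that it was lost whenever the replacement differs from the original.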

  • (disco) in reply to Buddy

    I prefer to just ask "Can you write std::swap in your language?"
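
    For anyone who hasn't tried it, here is roughly what that looks like in C++. This is a hand-rolled sketch, not the standard library's implementation, and my_swap and broken_swap are invented names:

```cpp
#include <utility>

// Works because a and b are aliases for the caller's variables.
template <typename T>
void my_swap(T& a, T& b) {
    T tmp = std::move(a);
    a = std::move(b);
    b = std::move(tmp);
}

// Same body, but the parameters are copies, so the caller sees no change.
template <typename T>
void broken_swap(T a, T b) {
    T tmp = std::move(a);
    a = std::move(b);
    b = std::move(tmp);
}
```

    In a language that only passes handles by value, the most you can write is broken_swap: it shuffles the callee's copies of the handles and leaves the caller's variables pointing where they were.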

  • (disco) in reply to LB_
    LB_:
    std swap

    :giggity:

  • (disco) in reply to wekempf
    wekempf:
    C# does not pass by reference, it passes by value. Of course, the fact that most types in C# are reference types confuses people here. https://msdn.microsoft.com/en-us/library/0f66670z.aspx "To pass a parameter by reference, use the ref or out keyword."

    I've found that this kind of discussion about the implementation mechanism the language uses (whether pointers to pointers or whatever underneath) is a great source of pointless confusion when trying to teach .NET to anyone who has had significant exposure to C++.

    Unfortunately, just about all C# code written by seasoned C++ programmers in their first year of writing C# seems to misuse the ref keyword in this way. It's a natural assumption after years of conditioning whereby the & annotation is needed in C++ if you want it to behave in the way that C#'s standard parameter passing mechanism works for all instances of classes.

    Worse than that is when the same programmers expect that passing an instance of a class into a function without the ref keyword means the function cannot modify its member variables and will get a new deep copy of the instance.

    This is even more dangerous, because by the time it's discovered these people have usually wasted a lot of time implementing a completely broken design that will be a lot more costly than the negligible overhead of possibly having an unnecessary layer of indirection.

    For variables that hold instances of types defined with the class keyword, all ref means is that assignments made with the = operator inside the called function assign to the calling function's variable. Without it, the assignment is scoped to the called function. out, on the other hand, mandates an assignment to the variable in the calling function.

    What I find a heap more concerning is that this person managed to find out about some obscure interoperability feature of the language (that @variable thing) but didn't use that same time to understand the subtleties of something as basic as the parameter passing convention.

  • (disco) in reply to chubertdev

    I know defending Salesforce won't win me any street cred, but I seriously see no problem in this description.

    It's the same subtle distinction as is present in C#'s default parameter passing mechanism.

    All they're doing is outlining a limitation of their language that may not be apparent given its syntactic similarity to the other curly bracket languages. I think it's a good bit of doco, and that the author did a good job explaining the problem.
