If you absolutely must use a function, how about
sub split_at_tabs { return (split('\t', $_[0])); }
Why wouldn't you return a reference? If you return the actual array, it forces a copy of all the data, rather than passing a single value referencing the original. If this is called often or with sizable records, this can be considerably slower.
Besides, if you're too lazy to deal with the reference, you can always force it back to a copy with:
@records = @{split_at_line($line)};
Three more characters, but the option for better performance if you need it. If you want to get all fancy (and harder to read), you can also use wantarray to determine whether the caller wants the scalar (reference) or the actual array in the first place, but this is getting way over-enterprisey for a function that doesn't really need to exist, anyhow.
The java.util.Vector class is a WTF itself. The Java designers wanted to implement a resizable array-based container but did it wrong, ending up with O(n) amortized time complexity for adding a single element at the end. When they realized this (or maybe the Java users were realizing it because things were so painfully slow), they saw that they couldn't change this behavior without altering Vector's interface. They decided against that and wrote a new class called ArrayList which does essentially the same thing, only in a remotely sane way, and so has O(1) time complexity for "add".
C++ and the STL are often cited as a bad example of "design by committee" but if design by committee is the alternative to "design by ad-hoc rectal extraction", then I, for one, welcome our new committee of overlords.
A pointer is an abstraction within the language as well. A C-style pointer hides stuff like segments, virtual memory and memory-mapped IO from you.
So if you're going to be precise, you need to speak of "C pointers" for what you mean and then you might as well also speak of "Java pointers", or just "pointers" if you're being sloppy anyway. The main reason not to do this is that "reference" is what the Java Language Specification calls it.
Vector CAN be O(n) for adding an element if you set a capacityIncrement because then the size of the underlying array is increased by a constant when said array is full. This is only "painfully slow" if you built a very large Vector without setting an appropriately large increment. However, the default behaviour is to double the array size, which is the same thing ArrayList does. It is possible that some ancient (pre 1.1) implementation of Vector did not have that default behaviour, but obviously the problem WAS fixed without having to change the interface.
ArrayList was created for two reasons: first, to implement the List interface as part of the Java 1.2 Collections framework without the legacy methods present in Vector, and second, to offer an implementation without the speed penalty of Vector's inbuilt synchronization, which is usually unneeded or insufficient.
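The two growth policies described above can be observed through Vector's public capacity() method. This is only a minimal sketch: the class name VectorGrowthDemo and the helper capacityAfterAppends are made up for illustration, and the starting capacity of 10 and the increment of 5 are arbitrary choices.

```java
import java.util.Vector;

// Hypothetical demo class; not taken from any post in this thread.
final class VectorGrowthDemo {
    // Appends `count` integers to the vector and returns its capacity afterwards.
    static int capacityAfterAppends(Vector<Integer> v, int count) {
        for (int i = 0; i < count; i++) {
            v.add(i);
        }
        return v.capacity();
    }

    public static void main(String[] args) {
        // No capacityIncrement given: the backing array doubles when it is full.
        System.out.println(capacityAfterAppends(new Vector<Integer>(10), 11)); // prints 20

        // capacityIncrement of 5: the backing array grows by a constant 5 when it is full.
        System.out.println(capacityAfterAppends(new Vector<Integer>(10, 5), 11)); // prints 15
    }
}
```

With doubling, the number of reallocations grows logarithmically with the number of appended elements; with a fixed increment it grows linearly, which is where the extra copying cost comes from.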
Appending N elements to either a Vector or an ArrayList takes O(N) time in total, which is amortized O(1) per add. The exception is a Vector whose capacityIncrement is set to a fixed positive value, in which case a single add becomes amortized O(N), with N being the number of elements in the array.
You're also ignoring the fact that in both cases "add" can be used to insert into the middle of the list as well as append to the list, which with both classes is an O(N) operation, with N being the size of the array.
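To make the distinction between the two "add" overloads concrete, here is a small sketch. The class AddOverloadsDemo and its build method are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not taken from any post in this thread.
final class AddOverloadsDemo {
    static List<String> build() {
        List<String> list = new ArrayList<>();
        list.add("a");          // append: no existing element has to move
        list.add("c");          // append
        list.add(1, "b");       // insert at index 1: "c" is shifted one slot to the right
        list.add(0, "start");   // insert at the front: every existing element is shifted
        return list;
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints [start, a, b, c]
    }
}
```

The shifting is what makes add(int, E) linear in the list size, regardless of how the backing array grows.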
I'm not a Java expert, but it looks to me like they got into trouble because they made Vector threadsafe. At some point they realized that 90% of coders didn't need the threadsafe behavior, but that they couldn't change it without breaking code for the other 10%, so they made a new class which wasn't threadsafe instead. So it wasn't really the interface that was the problem, it was the defined behavior.
Doh! Beaten...
Ok, so they did change the interface, but didn't care to update the documentation properly.
"The capacity is always at least as large as the vector size; it is usually larger because as components are added to the vector, the vector's storage increases in chunks the size of capacityIncrement."
Only in the details about the protected field capacityIncrement do they mention the size doubling behavior. Maybe I'm a purist, but first of all, a protected member is, for obvious reasons, not a part of the public interface, and second, "declaring data members protected is usually a design error" (Stroustrup, The C++ Programming Language, section
We had performance problems using java.util.Vector back in 2000 which we solved by switching to ArrayList. Borland JBuilder incidentally suffered from a similar performance problem: If you concatenated hundreds of string literals using "+", it would take a long time to compile the code. After doing some performance measurements (showing that there was quadratic growth depending on the number of consecutive literals), it occurred to me that the Java compiler must have concatenated the string literals using java.lang.String's + operator instead of a StringBuilder...
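The quadratic pattern suspected here is easy to reproduce in ordinary code. The sketch below is not JBuilder's implementation, which nobody in this thread has seen; ConcatDemo and its two methods are hypothetical names that simply contrast the two approaches.

```java
// Hypothetical demo class; not taken from any post in this thread.
final class ConcatDemo {
    // Quadratic in total length: each += copies everything accumulated so far
    // into a brand-new String.
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    // Linear (amortized): StringBuilder appends into a growable buffer
    // and only builds the final String once.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }
}
```

Both methods return the same string; only the amount of copying differs, which becomes noticeable once there are hundreds of pieces.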
Back to java.util.Vector: I'm quite certain that the documentation back then was largely the same as the documentation mentioned above, which contradicts Vector's modern behavior. It may not be a big contradiction and there probably were few programs noticeably affected by the change, but it's still odd.
And you'll also notice that this time they decided not to include a "capacityIncrement" property and wrote in ArrayList's documentation: "The add operation runs in amortized constant time, that is, adding n elements requires O(n) time.", something which they didn't bother to do for java.util.Vector. Guess why?
(I agree with you regarding the pointlessness of implicit synchronization. Unfortunately, many inexperienced programmers are vulnerable to this kind of snake oil.)
It's also interesting that ArrayList doesn't have any protected fields except one inherited from AbstractList. I suppose Sun would be more than happy to remove the badly designed Vector from the API if that weren't such a huge compatibility problem.
Thanks for being pedantic^W precise. I was talking about the add(Object o) method, not the add(int index, Object element) method which almost every other container library calls insert.
Don't buy the fairy tales - java.util.Vector is synchronized, not thread-safe. There is no such thing as built-in thread safety. Anyway, I guess Sun had several reasons for writing ArrayList (i.e. several design flaws in Vector).
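A typical case where per-method synchronization is not enough is a check-then-act sequence. The following is a sketch only; CompoundActionDemo and both method names are hypothetical. It relies on the fact that Vector's synchronized methods lock the Vector instance itself.

```java
import java.util.Vector;

// Hypothetical demo class; not taken from any post in this thread.
final class CompoundActionDemo {
    // contains() and add() are each synchronized, but the pair is not atomic:
    // another thread may add `item` between the two calls, producing a duplicate.
    static boolean addIfAbsentRacy(Vector<String> v, String item) {
        if (!v.contains(item)) {
            v.add(item);
            return true;
        }
        return false;
    }

    // Holding the Vector's own monitor across both calls makes the pair atomic
    // with respect to the Vector's other synchronized methods.
    static boolean addIfAbsent(Vector<String> v, String item) {
        synchronized (v) {
            if (!v.contains(item)) {
                v.add(item);
                return true;
            }
            return false;
        }
    }
}
```

In single-threaded use both versions behave identically; the difference only shows up when several threads call them on the same Vector at once.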
But I guess Sun considers the much-touted (and admittedly quite impressive) compatibility of Java code and even bytecode across platforms and versions too important to ever break it, even for things that are not only ugly or performance-degrading but outright dangerous, like Thread.stop().
I'd rather program in Perl than VB. Both have ugly syntaxes, but Perl at least allows me to do quick and dirty things with a very concise syntax. VB's syntax is not only clumsy, but also more irregular. Also VB's libraries are so useless and inconsistent. Perl is much better and the documentation is much clearer.
As for GUI, there is Perl/Tk, which is quite nice. (I find TCL's syntax much more terrible than Perl.) When will VB have a widget library that does AUTOMATIC geometry management?
I can't decide if the first or second comment is a better example of comment WTFs on this site. Both excellent examples.
If you try hard you can make an unreadable program in any language. With Perl, remove "hard". But you still need to work on it.
P.S. Too many Perl examples are still written like in Perl 4. I prefer 5.10.
Say rather you can program readably in any language. With Perl add hard, but you still need to work on it.
Can't say I ever recall anyone complaining about how hard any language was to obfuscate.
Addendum (2007-05-07 05:05): Oh well, my reply was just about as grammatical as the original post, upon second reading. If you try you can write readably in any language. With me add hard ;}
I will note that there may be a good reason to not use strtok...
From the GNU C manpage:
And don't provide an equivalent that does the same thing. strtok is very useful, after all...
If someone is reading that page and runs across that, well, they may just write their own. After all, the C library guys know what they're talking about, and if they say not to use it, well, why should we use it?
No, he's not using PERL. He's using Perl.
Bad perl code competing with bad java code?
Why not
@stuff = split /\s+/,$some_string;
then there is no need to trim each element of stuff; it has been done, automagically, by the split operation. If your split does not do this, then your split function is rather borked. Based upon all the java comments I see going on here, it appears either java's split is borked, or people don't know how to use it.
If you want perl code approximating the java code, why not
$some_string =~ s/\s+/ /g; # make multiple white space become a single white space
@stuff = split /\s/, $some_string;
That is, make the perl be more verbose, and less obvious. Iterate this enough times, add in function calls with funny names, and a few long chained method calls, and you will eventually get the Java code.
You shouldn't use StringTokenizer. If it is speed you are worried about, use Pattern. It should outperform StringTokenizer unless you have put the compile part into the loop as well.
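A minimal sketch of that advice, with the pattern compiled once outside any loop. SplitDemo, WHITESPACE and fields are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not taken from any post in this thread.
final class SplitDemo {
    // Compiled once and reused; calling Pattern.compile inside the loop
    // would repeat the compilation for every line.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Splits a line into whitespace-separated fields.
    static String[] fields(String line) {
        // trim() first: a separator at the very start of the input would
        // otherwise produce an empty leading element.
        return WHITESPACE.split(line.trim());
    }

    public static void main(String[] args) {
        for (String field : fields("  foo   bar\tbaz ")) {
            System.out.println(field); // prints foo, bar and baz on separate lines
        }
    }
}
```

Since the separator is a run of whitespace, the individual fields need no further trimming.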