Admin
Uh no, it's with the size of the problem being a variable named 'n'. As the input approaches infinity, an O(1) problem is going to take infinite time as well.
captcha: gotcha
Admin
If I missed the sarcasm, I'm sorry, but I'm going to have to say you need to hit up your algorithms and complexity textbook again. O(1) means constant time: no matter how large the input, the operation takes a fixed amount of time. Like removing the first element of a linked list: even if the list is infinitely long, it's still just changing a couple of pointers.
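For illustration, here is a minimal Pascal sketch of that constant-time head removal (the type and names are made up for the example, nothing from the article):

type
  PNode = ^TNode;
  TNode = record
    Data: Integer;
    Next: PNode;
  end;

{ Remove the first element of a singly linked list. The work is the
  same whether the list holds three nodes or three million: one pointer
  read, one pointer write, one Dispose. That is what O(1) means. }
procedure RemoveHead(var Head: PNode);
var
  Old: PNode;
begin
  if Head <> nil then
  begin
    Old := Head;
    Head := Head^.Next;
    Dispose(Old);
  end;
end;

var
  Head, N2: PNode;
begin
  New(Head); New(N2);
  Head^.Data := 1; Head^.Next := N2;
  N2^.Data := 2;   N2^.Next := nil;
  RemoveHead(Head);        { constant work, list length irrelevant }
  WriteLn(Head^.Data);     { prints 2 }
end.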
Admin
I'm pretty sure it's O(n^2). (assuming Read is O(1), of course)
To read a specific line, all lines before it must be read. To read a line, all characters in it must be read. Thus, to read a line, all characters up to and including the CR must be ReadC'd. ReadC calls Read for every character up to and including the one it needs. The function
f(0) = 0, f(n) = f(n - 1) + n
(i.e., f(n) is the sum of all numbers from 0 to n) equals the number of required Read calls, where n is the number of characters up to and including the CR of that line. f(n) = n(n + 1)/2 = (1/2)n^2 + (1/2)n, as can easily be seen from a visual representation of f(n), hence O(n^2).
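Written out as a sum (just restating the recurrence above, with a small numeric check):

$$f(n) = \sum_{k=1}^{n} k = \frac{n(n+1)}{2} = \tfrac{1}{2}n^2 + \tfrac{1}{2}n \in O(n^2), \qquad \text{e.g. } f(4) = 1+2+3+4 = 10 = \frac{4 \cdot 5}{2}.$$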
Admin
Yes, but as the variable n approaches infinity (substitute "extremely large" if you wish).
The point being that an algorithm is not O(n^2 + n) - it's simply O(n^2), since the "n" term is negligible for a suitably large number.
Admin
Dunno if Pascal compilers had this protection at the time, but Delphi (and, if I recall right, Turbo Pascal 5.5 too) doesn't allow you to use a global variable to control a for loop, so the code as shown wouldn't even compile.
Even if the compiler did allow a GLOBAL var as a for-loop control variable, this app wouldn't even work. As shown, the I variable is used in two for loops, one inside the other. If the variable is global, the value controlling the outer loop would reset, jump, or simply exceed its limits.
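For illustration, a minimal sketch of the shape being described (hypothetical names, not the article's actual code); a compiler that enforces the local-control-variable rule, as mentioned above, will refuse it outright:

var
  I: Integer;               { one global used as the control variable of both loops }

procedure Demo;
begin
  for I := 1 to 3 do        { outer loop }
  begin
    for I := 1 to 5 do      { inner loop clobbers the very same I }
      Write('.');
    WriteLn;
    { back here, I is whatever the inner loop left behind, so the outer
      loop's own count is gone: it can terminate early, repeat, or jump,
      exactly as described above }
  end;
end;

begin
  Demo;
end.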
Admin
seems even this task was too hard to accomplish...
Admin
“I work for a computer company; So I am really getting a kick out of most of these replies. Some of you guys are very good at making it sound like you know what you are talking about. But trust me.... You don't. I think you just want to make yourself sound smart, when in reality you don't know what you are talking about. This is how bad info gets passed around. If you don't know about the topic....Don't make yourself sound like you do. Cos some people believe anything they hear.”
( http://en.wikipedia.org/wiki/Big_O_notation )
( and for the uninitiated, http://forums.fark.com/cgi/fark/comments.pl?IDLink=1688267 )
Admin
This tedious software reminds me of our experiences with the infamous Commodore plus/4. It came with built-in software, a text processor and a database. We tried the database: to sort 10 simple data records, it took a few seconds. To sort 20 records, it took minutes. To sort 90 records, it needed an hour. It came with a notice on red paper not to use databases with more than 90 records. Must have also been something in the O(n^4) or O(n^5) range.
Admin
I used to work with a guy who wrote an algorithm for finding a new, available radar frequency as his final year project at university. He mentioned in his paper that there were "some performance issues", but it's fortunate that the professor never actually ran it; my friend told me he'd calculated the algorithm was O(n^n), and that it would take "several times the current age of the universe" to produce a single result.
Admin
Admin
Yes, the original Mac's API was close to Standard Pascal. It had no notion of GUI Classes or OOP. I still have five volumes of "Inside Macintosh" holding down my bookshelf.
Admin
For a non-programmer, this is actually pretty good. She naturally got the idea of modularization and figured out that character 13 would be at the end of the line. The big issue is that she built an O(n^3) algo rather than O(n) (or O(1)).
In attempting to understand Sheena's thinking, I focused on the ReadC procedure first. Essentially, she built an O(n) algo to get the nth character. For a noob, not knowing about random file access is not a horrible mistake, so we can cut her some slack on that one.
The next thing she did was build a wrapper that read from the current file position till the next character 13. In and of itself, this is not a bad thing. The thing that caught her out was that ReadC reset the file and re-read every character from 0 to n. This is where things start to go bad, as it changed the algo from O(n) to O(n^2) (0 + 1 + 2 + ... + n = n * (n + 1) / 2 = O(n^2)), where n is the number of characters read.
Moving to the next level of encapsulation, the outer loop is O(m) where m is the number of lines. Each time a line was read, there is O(n^2) characters read, so the algo is O(m * n^2). As such, to determine the order of the overarching algo, we would have to consider what function would map m to n (or vice versa). As this is a help file, it is not a bad assumption that m is proportional to n (all lines fairly similar in length) so this would result in O(n^3) where n is the number of characters read from the file.
Now, again, Sheena was checking her algos for accuracy and they returned valid results. The issue is just one of timing. Yes, we all recognize that O(n^3) is pretty crummy, but this is a fairly advanced programming topic. As such, I would argue that, for a Cinema person, this was not so bad.
Now for the fun part: rebuilding the algo for efficiency. Here is my general pseudocode (a rough Pascal sketch follows below):
- open the file
- read the file till the (m-1)th (assuming 1-based indexing) newline character is found (read characters ignored)
- read the file, storing the read characters to a string, till the mth newline character is found (not storing the newline character in the string)
- close the file
- return the resulting string
Clearly, this is O(m) (again presuming a linear mapping between m & n) for each call. If we wanted to speed this up for random line access to the file - O(1) per lookup - we would have to add caching to the first call: one run of O(n), where n is the file length, with O(m) memory being used.
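And here is that pseudocode as a rough Pascal sketch; the file name, the function name, and the use of ReadLn (which handles the read-to-end-of-line step for you) are my own assumptions, not anything from the article:

{ Return the mth line (1-based) of a text file.
  One sequential pass per call: O(m) ReadLn calls, no re-reading. }
function GetLine(const FileName: string; m: Integer): string;
var
  f: Text;
  i: Integer;
  s: string;
begin
  Assign(f, FileName);
  Reset(f);                 { open the file }
  s := '';
  for i := 1 to m do        { lines 1..m-1 are read and discarded; line m is kept }
    ReadLn(f, s);
  Close(f);                 { close the file }
  GetLine := s;
end;

The caching idea would then just be a one-time pass that records where each line starts, so later lookups can jump straight to line m.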
Admin
I had a friend who, soon after graduating, got a job in a QA department. The manager was a complete moron. He'd occasionally end his rants about her with "and get this, her college degree is in health!"
Admin
The algorithm can be expressed as x * n^2 - y, where x is the number of lines, n is the total number of characters from the beginning of the file to the end of that line, and y is the accumulation of z in (n-z) + ... + (n-4) + (n-3) + (n-2) + (n-1) + n (her character-reading algorithm).
x < y < n^2, so the important part is the power as n approaches large numbers. It is more than O(n^2), but less than O(n^3).
So, for the 100th character, it would take approximately 10,000 passes through the algorithm (disk reads). Let's say the average help line is 50 characters and you wanted to get the info from the 5th line: it would be 250^2 or 62,500 reads. The 6th line would have been 300^2 or 90,000 reads. That doesn't seem like a lot given modern processors with gigabytes of RAM and fast disk access.
However, the original Mac was limited to 128 kilobytes of RAM, the aforementioned 8 MHz processor, and a 400-kilobyte single-sided floppy drive. Given the speed of floppy drives at that time, and the number of operations performed, I can see quickly running out of time for any lookups beyond a trivial number. I am surprised the disk did not melt or fail after the initial test.
Admin
We call that O(Crap)
Captcha: muhahaha! nice!!
Admin
Absolutely wrong. An O(1) problem's running time is completely unrelated to the problem size; that's why it's O(1) and not O(n).
Admin
I have to disagree with your analysis on this point. m has to be less than n in all cases. When n goes to large numbers, m becomes meaningless. The algorithm falls between O(n^2) and O(n^3), leaning towards the lower end of the spectrum as n approaches large numbers.
e.g. assuming each help line has 50 characters on average, we get:
5th line = 62,500; 6th line = 90,000 for the n^2, and:
5th line = 15,625,000; 6th line = 27,000,000 for the n^3 - which is way too large.
Admin
Admin
Actually quite a bit of code I see out there is O(0) - it does absolutely nothing useful.
Admin
I'm sure you'll figure it out.
Wait for it...
...and you've answered your own question.
See, I knew you could do it. :D
Admin
While I agree that m must be less than n, I specifically made the assumption that m was proportional to n, i.e. m = a * n. In this case, a would be fairly small; let's assume 1/10,000,000 just to prove a point. Therefore m * n^2 = (1/10,000,000) * n^3, which in big O notation is just O(n^3). Remember, constant factors pull out with big O.
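In symbols, that is all this is saying (for any fixed constant a > 0):

$$m = a \cdot n \;\Longrightarrow\; m \cdot n^2 = a \cdot n^3 \in O(n^3).$$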
Admin
The Microsoft Win95 installer tool did this with fonts!
To load a font during install, it first had to load all the fonts before it. This little issue wasn't noticed with most apps that installed a couple of fonts.
But then along came Publisher 95 (I think) with its hundreds of fonts.
The installer team stuck to their guns and refused to fix it, thus Publisher took around an hour to install for no good reason.
End result: so many tech support calls on install that net profit on Publisher was zero.
Admin
Could an algorithm whose running time is inversely proportional to the input size exist? If so, it would also be better than O(1), though I admit not as good as O(0).
Admin
A basic example would be splitting strings into equal parts by length: you have a string of length K and an integer N giving the part length. Such an algorithm (assuming you were working in a language like C where you can manipulate pointers directly instead of re-initializing strings) would require O(K/N) space and time. The higher N is, the fewer new pointers need to be made.
But that's not really practical...
Also, now that I've thought about it more, this algorithm's best case is still just O(1), when N > K. If you treat K as constant, then you can think of it as O(1/N), I guess. This really isn't a very good example...
Admin
Admin
I'm sure there are some great programmers out there with degrees in film. There are plenty of good programmers without any degrees at all -- were any of them to pick up a degree in film, I'm sure it wouldn't diminish their ability to program.
One of the lead programmers at a company I once interned at had a Ph.D in chemistry.
Admin
The asymptotic behavior of functions works something like this, in order of increasing complexity:
- O(1) => constant time
- O(log n) => logarithmic time
- O(n) => linear time
- O(n log n) => (I forget the fancy name, but most good sorts are like this)
- O(n^2) => quadratic time ** This function! ** <-- by the above analyses
- O(n^3) => polynomial time (cubic)
- O(2^n) => exponential time
- O(n!) => factorial time (ouch)
These are some of the most notable complexity classes, anyway. All of these are asymptotic complexities: think calculus; think "what happens as the size of the data set approaches infinity" - this is why you can throw away constants and such. There may be significant differences between two O(n) algorithms, and a particular O(n^2) algorithm may be more effective than an O(n) algorithm - for sufficiently small values of n.
Few computer science problems that deal with enormous streams of data are O(1). Just copying a parameter into a function is an O(n) operation. (Which is why big things are passed by reference...)
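A quick Pascal illustration of that last point (a made-up example, not anything from the article):

var
  Msg: string;

{ Pass by value: the caller's string is copied into s, an O(n) copy. }
procedure ByValue(s: string);
begin
  WriteLn(Length(s));
end;

{ Pass by reference: only an address is passed, O(1) no matter how long
  the string is, which is why big things get passed this way. }
procedure ByReference(var s: string);
begin
  WriteLn(Length(s));
end;

begin
  Msg := 'hello';
  ByValue(Msg);        { copies all of Msg }
  ByReference(Msg);    { passes just a reference }
end.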
Admin
I've searched through both pages for the phrase, and have found it lacking, so I suppose I must be the one to present the congratulations:
Brilliant!
Admin
Did Sheena leave for a big aerospace company in the PNW, by any chance? I saw some production code very much like this when I got called in by support, because they claimed that our software was too slow. After lots of bitching about security, and that they could only show us non-classified examples, they showed us one of the main bottlenecks. Basically, what they did to read a 50,000-entry data file was:
Set counter to 0, load file read package, open file, read data point, close file, drop file read package, add to array, increment counter, load file read package, open file, read all data points until you reach counter, read data point, close file....
The code was beautifully structured and documented, though.
Admin
O(n^1.5) is a rather coarse estimate of the running time of your example. You need knowledge of the distribution of the smallest prime divisors of numbers to say anything sensible.
In any case, there are lots of algorithms with complexity given by non-integral powers of the input size: just grab a copy of TAOCP.
Btw, using Excel notation for writing formulas hurts my eyes!
Admin
I'm starting to think the authors of these articles purposely make mistakes like "O(4)" just so everybody will squabble over it.
Admin
I think I got it now:
- read everything until the 1st char
- read everything until the 2nd char
- read everything until the 3rd char
- etc.
In total, if we read n lines with a total of k chars, we read k*(k+1)/2 chars, so it's O(k^2). If the number of chars per line is approximately constant, we can say it's O(n^2).
Admin
good code re-use! and re-use and re use and re-use and re use and re-use and re use ...
Admin
Wrong and horrible. TRWTF is how many people on this site do not understand big O notation. Big O is complexity, not running time.
Also, you're completely ignoring the fact that it continues to loop even after isPrime = false. As is, it will iterate almost precisely n^1.5 times.
Admin
Big O - our dear friend.
I haven't actually calculated the complexity of this program, so I won't comment on its value. At the same time, I feel confident enough to say it is O(2^n). This is because big O notation is used as an upper bound on the complexity. If you want to be more precise, you should use big omega notation.
Admin
See, now I'm confused. Not big Omega, use big Theta. Big omega is for lower bounds.
Admin
The unwritten fact is that the algorithm is running on a quantum computer :)
Admin
ReadLine is O(n^2) with the size of the file (as some people mentioned earlier; the original poster should be ashamed of not knowing what O(f(n)) means).
This isn't entirely Sheena's fault. One of those factors of n is due to Sheena's WTF ReadC, but the other is due to the file format. With variable-length lines and no index, you can't know where line n starts until you've read line n-1.
If you're iterating over every line of the file with ReadLine, that would be O(n^3). This API wouldn't make much sense then.
Admin
This is one thing I've never understood. Please help me try to understand what goes through someone's mind:
"I need to write a function that reads a single character." "How do I do that? Well, I have this 'reset' function that puts me at the start..." "Now what? Well, I know I want the nth character... and I just issued a 'reset,' so I need to loop from 1 to n. Then the next character is what I want." "But how do I do that? Ohhh, I'll just call this system function that 'reads a single character'" "Brilant"
Reminds me of a puzzle a friend wrote that we were having people beta test. It was Scrabble-oriented, with the Scrabble words buried in the puzzle text. We overheard one of them mutter, "I need a seven-letter word for QUALITY."
Admin
Are you sure about this number? If so, there must have been some very fast development. I remember when word of the new Macintosh hit the press. I rushed down to the local outlet to check it out. What I remember most of course, was my disappointment when the saleslady started showing me all the windows and pull-down menus, etc. I was looking for something that I could write and run Pascal on, and all this precooked "software" crap didn't look like it would get me any closer to the goal. I do remember, though, that one of the features she gushed about was "it has a MEG of RAM!!!" On that score, I know that memory does not fail me. Perhaps, though, earlier Macs had slipped by under my radar.
Admin
I wonder if anyone ever took any time to actually help train Sheena or did they simply assume that as a woman she was clearly untrainable.
Just askin'.
Admin
O(n^2) and O(n^2 + n) do represent the same class. But the latter is a more informative approximation for regular size inputs. Consider the difference between an algorithm that runs (exactly) as n^2 + n, versus one that runs (exactly) as n^2 + 500,000,000,000n. Which would you prefer?
Nobody really cares how an algorithm will run with gigantic inputs anyway. There isn't much practical difference between an O(n) and an O(n^2) algorithm if n is 2^1024. You'll be dead before either is done. People only do Landau approximations to figure out how well the algorithm will scale on reasonable inputs, which is why more precise approximations are a good thing.
captcha: tesla (zaaaaaap)
Admin
Was he an expert on the chemistry of lead?
Admin
Well the kids are all hopped up and ready to go They're ready to go now They've got their surfboards And they're going to the discotheque a go go But she just couldn't stay She had to break away Well New York City really has it all Oh yeah, oh yeah
Admin
Yes, the earliest Mac (Jan '84) had 128KB, followed by a 512KB model (Sep '84), followed by the "Mac Plus" (Jan '86) with 1MB of memory (expandable to 4MB) and a then-new disk interface named SCSI.
Apple Computer created its own Lisa Pascal for the Lisa Workshop in 1982 and ported this compiler to the Apple Macintosh and MPW in 1985. In 1985 Larry Tesler, in consultation with Niklaus Wirth, defined Object Pascal and these extensions were incorporated in both the Lisa Pascal and Mac Pascal compilers.
Perhaps news travels quite slowly in your part of the world.
Captcha: howdy
Admin
The Big O would be O(n*(n+1)/2), where n is the index of the first character on the nth line, counted from the beginning of the file.
Admin
Many people care greatly how algorithms run with inputs large enough for even O(n lg n) vs. O(n^2) to matter. Some perspective: consider f(n) = Cn log n and g(n) = Cn^2. If both take one second at n=2^8, f takes nine minutes at n=2^16 while g takes over 18 hours.
It doesn't make sense to use confusing notation like O(n^2 + n) - people will just think you're an idiot who doesn't know what asymptotic notation means. Instead, consider the following options:
- talk in terms of performance, not scalability - "algorithm A takes ~100 ms at production input sizes (n=2^16), while algorithm B takes two hours".
- be explicit - "the algorithm's running time, while technically O(n^2), is dominated by an O(n) term at reasonable input sizes."
- be even more explicit - "the algorithm's running time is f(n) = C(n^2 + 500,000,000,000n)"
In practice, I think it's pretty rare for an algorithm's running time to be both dominated by a lower-order term and long enough to be worth measuring in any fashion.
Admin
Now, I do often see algorithms described as something like O(n^2 + m*n) - several terms, where which one is largest varies by situation. It doesn't really make sense to have a constant like 500,000,000, though - that's really a variable determined by something.
Admin
Just looking through this code again, I've realized that this won't work, assuming the variable i is global: i is changed to the current position in ReadC, so when control returns to ReadLine, ReadLine now thinks it is at the ith line.
Admin
Usually the only WTFs on this site are the comments by the editors and the users.
For a change, the article is the real WTF. Great one!