• (cs) in reply to Anonymouse
    Anonymous:
    Did you consider that the stored procedure might be called by more than just this one application?

    This is also the problem with putting logic in stored procedures: you start to spread the application's logic around, so tracking down problems can be a bit of a hunt.

    Stored procedures are nice for their performance, but horrible for a clean architecture with a clearly defined business-logic layer.



    Both were "owned" and developed by the same programmer. Both exclusively call each other. This, of course, raises the age-old question: what came first, the stored proc or the COM?


    Did you consider that you need to read more than the first three words?
  • AC (unregistered) in reply to Porsche3000
    Anonymous:

    RevMike:
    Anonymous:
    Why?  WHY?!  Why is he returning the value as a recordset AND an output parameter?  WHY, GOD?!


    Some development environments prefer to deal with recordsets from databases, while others can comfortably work with scalars.  He wanted to leave that paradigm choice up to any future caller.

    Or, more likely, because it was his first time writing in T-SQL, he had a few fragments of example code, and kept trying things until it worked.  This is the same methodology that I use with VB.  I write java, perl, C, and PL/SQL.  I tune SQL.  I've designed good relational schemas.  I've architected effective J2EE.  Plus I grok regex better than 95%.  If I actually learned VB, it would make me stupider, and I wouldn't be able to do my job.

    I work with VB and earn 6 figures. 




    You don't count the numbers after the decimal. (Sorry, had to be said.)
  • (cs) in reply to Porsche3000
    Anonymous:

    RevMike:
    Anonymous:
    Why?  WHY?!  Why is he returning the value as a recordset AND an output parameter?  WHY, GOD?!


    Some development environments prefer to deal with recordsets from databases, while others can comfortably work with scalars.  He wanted to leave that paradigm choice up to any future caller.

    Or, more likely, because it was his first time writing in T-SQL, he had a few fragments of example code, and kept trying things until it worked.  This is the same methodology that I use with VB.  I write java, perl, C, and PL/SQL.  I tune SQL.  I've designed good relational schemas.  I've architected effective J2EE.  Plus I grok regex better than 95%.  If I actually learned VB, it would make me stupider, and I wouldn't be able to do my job.

    I work with VB and earn 6 figures. 

    Me too. But they've offered to raise it to 6 1/2 if I promise to never program in any language ever again.

    --Rank

  • (cs) in reply to Frost
    Anonymous:
    emptyset:

    Anonymous:
    O(1)?  Dude, can I have a copy of your magic pixie-dust quantum regex library?  Pretty please?

    it's what perl uses.  that's why so many people use perl:  because regexes are faster than looping/parsing through a string.  the regexes use 'caching' to quickly detect if a substring is present in a string by polling a constant number of characters.


    Perhaps you don't understand what O(1) means?  You are saying that a regex scans a 10-character string in the same amount of time as it scans a million-character string.



    I know nothing of how perl's regexes are implemented, but my guess is that, if substring matches are somehow optimized, they achieve something more like O( log(n) ).  Better than O(n), certainly not as great as O(1).

    O(log(n)) is what most good search algorithms that work on sorted data get.  But if a cache is somehow involved, performance could potentially be better than that (or probably significantly worse under the right conditions).

    Also, for speed of search to be that good, the cache must use huge amounts of memory (proportional to the data being searched).  Imagine the size of this cache on a 4GB file.  If you are not on an awesome computer, you're probably going to run the cache into your swap file, at which point you can say goodbye to any kind of performance gain the cache offers.

    But again, I don't know how perl does it so I am only speculating here.
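    To make the O(log(n)) point concrete: that's the cost of a binary search over sorted data, nothing regex-specific. A minimal Python sketch (illustrative only, not anyone's actual code):

        import bisect

        # O(log n) membership test on *sorted* data -- the kind of search
        # referred to above. A raw string has no such ordering, so this
        # trick doesn't apply to substring matching directly.
        def contains_sorted(sorted_items, x):
            i = bisect.bisect_left(sorted_items, x)
            return i < len(sorted_items) and sorted_items[i] == x

        data = sorted(["apple", "banana", "cherry", "plum"])
        print(contains_sorted(data, "cherry"))  # True, after ~log2(len(data)) probes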
  • (cs) in reply to RevMike
    RevMike:
    fregas:

    RevMike:
    Anonymous:
    The "trouble of fixing" involves deleting a "- 1" and a "+ 1".  Come on, how lazy have we become?


    That depends on where you are working.  For many of my clients it means getting a QA resource.  Creating and agreeing upon a QA plan.  Implementing that plan.  Getting sign-offs from the user representatives.  Scheduling a release.  Writing a release plan.  Validating the release plan against the integration testing system.  And finally deploying.

    Deploying a change like this can easily take 40 man hours spread amongst several groups.  No one bothers with this crap except that someone probably has it in the back of their mind that if they ever go into this section of code for another reason, they'll clean this up while they're doing it.

    Dude, if you have that big of a fiasco for such a simple change, then management needs to take a step back.  Personally, I would roll that in as part of another change and put it into that QA/Release cycle, rather than having the +1/-1 released on its own.

    Or I would just change it in the source code, not tell anyone and let the chips fall where they may.  :)



    Take a step back?  Depends on the environment.  I'm doing a lot of work with three of the world's top five investment banking houses.  The practices that are appropriate to managing the web site for your local dry cleaner don't cut it when hundreds of billions of dollars are on the line.  Code doesn't go into production that doesn't have a verifiable audit trail.

    If the risk of losing money weren't enough motivation, SarbOx has made the CTO/CIO criminally liable for these systems.  In other words, if the CIO fails to take reasonable steps to secure his systems against both incompetence and malice, and that damages the clients of that firm, he can be arrested, convicted, and imprisoned.

    The process that surrounds a system needs to be proportional to the potential damage that the failure of that system will cause.


    I just had to go through hours of SarbOx audit trailing for accidentally re-saving a program I had opened to see what exactly it was doing... Opened it, looked at it, fixed the formatting on one complex part so it was easier to read, then, LIKE A MORON, I saved instead of cancelling.

    Every piece of code is monitored. When that one piece changed, it got flagged on our "needs documentation" list, and I had to hunt up someone who could authorise me to make a change on that program, post facto, so I could get documentation for the auditors.

    That's the modern business world. Don't expect to be able to modify code to yer heart's content.

  • (cs) in reply to felix
    felix:
    RevMike:

    The process that surrounds a system needs to be proportional to the potential damage that the failure of that system will cause.


    I hope you're not actually advocating that kind of policy. Imagine this:

    "Central here.... Yes.... There's a fire? Sure, we'll dispach a firemen crew ASAP, but first we need you to file a Fire Control Form in triplicate to... No, sir, I'm afraid we can't skip this step.... I'll fax the form to you right... Oh, your fax machine has melted? Ok, I'll mail it to you, what was the address again?..... Sir?...... Sir?!......."

    Oh man, when are we going to learn to just DO THE RIGHT THING?



    Your analogy sucks.

    Think of it as fixing a key structural support in a building that's 200 stories high. Would you rather some schmuck maintenance guy just randomly saw it, and decided to fix it, knowing NOTHING about the true nature of the problem? Or would you rather that there was a lot of consultation and preparation and thought that went into the problem, and a good team assembled to fix it, and the building evacuated, etc.?

    Emergencies are emergencies, and sometimes you've got to violate procedure in an emergency. But that sort of thing almost never happens, and when it does, generally you're not doing programming when it does.

    If someone working under me took it upon themselves to fix a problem in such a way that they shut down the system, or broke some critical app, or introduced a MATH ERROR, I would come down on them so hard that it'd be ten years before they could look at a keyboard without puking in fear. Then I'd fire them. Even if nothing broke, and I found out they had made a programming logic change without authorisation, there would be hell to pay. Then I would fire them. That is not the way you work in the professional world.

  • (cs) in reply to kipthegreat
    kipthegreat:

    I know nothing of how perl's regex's are implemented, but my guess is that, if substring matches are somehow optimized, they achieve something more like O( log(n) ).  Better than O(n), certainly not as great as O(1).

    O(log(n)) is what most good search algorithms that work on sorted data get.  But if a cache is somehow involved, performance could potentially be better than that (or probably significantly worse under the right conditions).


    No. A general substring search can never be faster than O(n), except if you parallelize it  - you can do it in O(1) time with O(n) CPUs, of course.

    This is easy to prove: consider the regexp /A/, i.e. matching any string that contains an A anywhere. Consider a string of length n consisting entirely of B's which may or may not have a single A thrown in at a random location. If you can, at the lowest level, only examine one character at a time (which is the case with a single-CPU computer), then you have to look at every single character to be certain there is no A, and if there is one you will find it, on average, after n/2 comparisons.

    Search algorithms require, as you said, sorted (or otherwise specially ordered) data. A string is generally not sorted or ordered in a special way.
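    A minimal Python sketch of that argument (hypothetical code, just to illustrate the proof above):

        import random

        # Any single-CPU scan must be able to inspect every character:
        # worst case O(n), and ~n/2 comparisons on average for the
        # adversarial input described above.
        def contains_A(s):
            for i, ch in enumerate(s):   # up to len(s) character examinations
                if ch == 'A':
                    return True, i + 1   # found it; report how many chars we looked at
            return False, len(s)         # had to look at every single character

        # The adversarial family: all B's, with one A at a random spot.
        n = 1_000_000
        chars = ['B'] * n
        chars[random.randrange(n)] = 'A'
        found, examined = contains_A(''.join(chars))
        print(found, examined)  # True, ~n/2 on average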
  • (cs) in reply to felix
    felix:


    Oh man, when are we going to learn to just DO THE RIGHT THING?



    When we all agree what the right thing is and none of us ever make mistakes.
  • (cs) in reply to Anonymouse
    Anonymous:
    Did you consider that the stored procedure might be called by more than just this one application?

    This is also the problem with putting logic in stored procedures: you start to spread the application's logic around, so tracking down problems can be a bit of a hunt.

    Stored procedures are nice for their performance, but horrible for a clean architecture with a clearly defined business-logic layer.


    Then this is an even bigger WTF!  All the calling code will need to know about this -1, and most of it will have to account for it.  If this -1 is needed in one place, it will be obvious in the other places that it is needed and why.  As a programmer working on the code where it is needed, I will be wondering why they didn't do it, and how it can possibly work.  When I discover it in the stored procedure I will go WTF: you are making the sp less generic by doing something for me that I can do easily myself.
  • (cs) in reply to Ruakh
    Ruakh:

    Dude, even better: in Perl, a regex can scan a 10-character string in even less time than it can a million-character string. :-)


    This is, of course, only true for the special "pixie dust edition" of Perl that emptyset uses. The license requires you to give your firstborn to the sidhe and put a saucer of milk on your doorstep every night.
  • (cs) in reply to Satanicpuppy
    Satanicpuppy:

    If someone working under me took it upon themselves to fix a problem in such a way that they shut down the system, or broke some critical app, or introduced a MATH ERROR, I would come down on them so hard that it'd be ten years before they could look at a keyboard without puking in fear. Then I'd fire them. Even if nothing broke, and I found out they had made a programming logic change without authorisation, there would be hell to pay. Then I would fire them. That is not the way you work in the professional world.


    Actually, that's exactly the way you work in my professional world :).  My professional world is run by cowboys.  Everyone does their own thing, we've never even heard of code reviews, and testing?  Sure, we do that, because the company I contract for requires it.  But they have no guidelines for it except that it be done;  so long as a developer says, "yeah, I tested it," it gets moved to production.

    As harsh as you come off in your post, I think I'd like to work in someplace like you describe.  I have this awful feeling in the back of my mind that I'm learning a lot of really bad habits where I'm at.
  • (cs) in reply to Anonymouse

    Anonymouse:
    Stored procedures are nice for their performance, but horrible for a clean architecture with a clearly defined business-logic layer.

    On a good system, there is no such thing as (nor can there ever be) a "clearly defined business-logic layer." It's a fallacy to believe that you can encompass all business logic in one place. If you try, you will create a horrendously ugly, un-maintainable system that will be full of nasty bugs. I've been there, seen it tried, and watched it fail.

    Business logic needs to be placed at all layers of the system, often in duplicate, triplicate, or (painfully) more. The key is placing it thoughtfully, in places that will be easy to find and easy to maintain.

    Stored procedures are a right step in this direction; they provide a limited interface with strong cohesion. This is what you want. The "who knows what's accessing what" free-for-all that inline SQL provides is a recipe for disaster.

  • (cs) in reply to UncleMidriff

    UncleMidriff:

    I have this awful feeling in the back of my mind that I'm learning a lot of really bad habits where I'm at.

    Good Practices (everything from Source Control to Change Management to Quality Assurance) make better software cheaper. You should take the opportunity to sell this fact (we have 40-some years of studies on this) to management, and take it upon yourself to work to implement said practices. I'd like to say, "think of how much better your company will be." But, screw that -- think of the ROR*!

    * Like ROI, ROR is Return on Resume

  • (cs) in reply to Sceptic

    Anonymous:
    Regexes can detect substrings in O(1)? This is an argument I need to see.

    i've described it before, and have written a very successful implementation of it in FiPL.

    your basic regex has to look at every character, providing no added benefit to the loop/parse method.  the way to achieve O(1) performance is by 'caching'.  when a substring to find in a string is provided as input, the regex 'caches' all the possible substrings in the string.  it is then a trivial matter to determine that the substring exists, with a constant number of characters.  simply poll a few cleverly chosen characters in the string and voilà!  there you have it.  now, i hear what you're saying.  yes, an adversary may construct a string that can defeat this algorithm.  that's why you poll a constant number of well-chosen characters and a constant number of randomly chosen characters.  comparing which characters are found and their position in the strings against the 'cached' values produces an answer in O(1).

    granted, somebody mentioned 4gb strings.  the good news is that the algorithm takes things like that into account.  after a certain string length, you can't necessarily guarantee constant time performance.  so, you do the next best thing: approximation.  by varying the constant number of random queries into the string, you can better approximate finding the substring.  perl programmers know this.  having a fast but 65% reliable method of finding substrings in constant time is pretty good.  as my complexity teacher once put it, "you can find out anything with a 51% approximate solution" - of course, he was talking in the more theoretical sense, but it still applies.

  • (cs) in reply to brazzy

    brazzy:
    No. A general substring search can never be faster than O(n), except if you parallelize it  - you can do it in O(1) time with O(n) CPUs, of course.

    you don't need O(n) CPUs for my solution.  i was able to program this algorithm in FiPL, and you can too.

  • (cs) in reply to Lenny

    I work for a medium-sized insurance company.  It would take 5 minutes to make the change and rebuild.  Then I need to fill out a change control form, meet with QA (hope I could keep them to 1 hour), create the implementation schedule, set up the change in 3 test environments, obtain sign-off from the testers, and finally coordinate the deployment with the IT staff.

  • (cs) in reply to emptyset
    emptyset:

    Anonymous:
    Regexes can detect substrings in O(1)? This is an argument I need to see.

    i've described it before, and have written a very successful implementation of it in FiPL.

    your basic regex has to look at every character, providing no added benefit to the loop/parse method.  the way to achieve O(1) performance is by 'caching'.  when a substring to find in a string is provided as input, the regex 'caches' all the possible substrings in the string.  it is then a trivial matter to determine that the substring exists, with a constant number of characters.  simply poll a few cleverly chosen characters in the string and voilà!  there you have it.  now, i hear what you're saying.  yes, an adversary may construct a string that can defeat this algorithm.  that's why you poll a constant number of well-chosen characters and a constant number of randomly chosen characters.



    And here I thought that saying "it's trivial" as a method of proof was a joke...

    You are spewing bullshit. O(1) for a general substring search is impossible. I provided proof of that a few postings above. Caching and randomization are completely irrelevant. And that's true for many common cases, not freak accidents as with a randomized quicksort.

    As slamb wrote, clever substring searches can make the search independent of the length of the needle and return very fast with negative results for large needles. Maybe that's what you mean by O(1), but what everyone else meant is the length of the haystack, and there, anything faster than O(n) is impossible.
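    For reference, here's roughly what such a "clever" search looks like: a Boyer-Moore-Horspool sketch in Python (a simplified illustration, not Perl's actual implementation). On a mismatch it skips ahead by up to the needle length, so it's fast in practice, but the worst case is still O(n) in the haystack:

        def horspool_find(haystack, needle):
            """Return the index of the first occurrence of needle, or -1."""
            n, m = len(haystack), len(needle)
            if m == 0:
                return 0
            if m > n:
                return -1
            # Bad-character table: how far the window may slide on a mismatch.
            shift = {c: m - 1 - i for i, c in enumerate(needle[:-1])}
            i = m - 1  # haystack index aligned with the needle's last character
            while i < n:
                if haystack[i - m + 1 : i + 1] == needle:
                    return i - m + 1
                i += shift.get(haystack[i], m)  # skip up to m characters at once
            return -1

        print(horspool_find("BBBBABBB", "AB"))  # 4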
  • (cs) in reply to UncleMidriff
    UncleMidriff:
    Satanicpuppy:

    If someone working under me took it upon themselves to fix a problem in such a way that they shut down the system, or broke some critical app, or introduced a MATH ERROR, I would come down on them so hard that it'd be ten years before they could look at a keyboard without puking in fear. Then I'd fire them. Even if nothing broke, and I found out they had made a programming logic change without authorisation, there would be hell to pay. Then I would fire them. That is not the way you work in the professional world.


    Actually, that's exactly the way you work in my professional world :).  My professional world is run by cowboys.  Everyone does their own thing, we've never even heard of code reviews, and testing?  Sure, we do that, because the company I contract for requires it.  But they have no guidelines for it except that it be done;  so long as a developer says, "yeah, I tested it," it gets moved to production.

    As harsh as you come off in your post, I think I'd like to work in someplace like you describe.  I have this awful feeling in the back of my mind that I'm learning a lot of really bad habits where I'm at.


    If that is the way you work, then there are two possibilities.  First, your management may be idiots.  But the alternative is that there is relatively little penalty for mistakes in production.  Right now I'm working for a lot of financial services customers, helping them generate reporting for their prime brokerage customers.  If a report is wrong, it may cause a billion dollar hedge fund to make the wrong trades.  The penalty for a mistake is high, and so there are a lot of controls.

    Previously, however, I worked on an internal financial system for a large professional services firm.  I've made fifty million dollar mistakes.  The penalty was a call from the CFO, a 10 minute code fix, and re-running a batch process that evening.  That environment doesn't require excessive controls.

    Systems should be divided into three categories based on the impact of mistakes: those that cause minor inconvenience, those that cause significant loss of money, and those that cause loss of life.  The controls around a system should be based on this.
  • (cs) in reply to Porsche3000
    Anonymous:

    I work with VB and earn 6 figures. 



    You must be one heck of a programmer. Switch to a real programming language, and you'll earn 7 figures.

  • (cs) in reply to Alexis de Torquemada
    Alexis de Torquemada:
    Anonymous:

    I work with VB and earn 6 figures. 



    You must be one heck of a programmer. Switch to a real programming language, and you'll earn 7 figures.

    he probably meant six figures in yen.

  • (cs) in reply to emptyset
    emptyset:

    your basic regex has to look at every character, providing no added benefit to the loop/parse method.  the way to achieve O(1) performance is by 'caching'.

    Hmm.. I don't think you actually understand what O(x) means...

    emptyset:
    after a certain string length, you can't necessarily guarantee constant time performance.

    And now I'm sure of it. If your algorithm is O(1), then it runs in constant time regardless of the input. Period. Otherwise the algorithm is not O(1).

    Implementation does not affect the O(x) expression of the algorithm. I suggest you read up on Big-O before talking about things you clearly do not understand.

  • (cs) in reply to brazzy

    brazzy:
    You are spewing bullshit. O(1) for a general substring search is impossible. I provided proof of that a few postings above. Caching and randomization are completely irrelevant. And that's true for many common cases, not freak accidents as with a randomized quicksort.

    look.  if you 'cache' all possible substrings of a string, you can perform the lookup in constant time.  the substrings have a well-defined order on them.

    brazzy:
    As slamb wrote, clever substring searches can make the search independant from the length of the needle and return very fast with negative results for large needles. Maybe that's what you mean with O(1), but what everyone else meant is the length of the haystack, and there, anything faster than O(n) is impossible.

    no, a generalized O(1) substring search is possible, and that's what perl uses.  if it wasn't, nobody would program in perl because then it wouldn't be faster than other languages.

  • (cs) in reply to Otto

    Oh, and brazzy is correct. A general substring search cannot be done more efficiently than O(n). Not under any circumstances, and no matter what your implementation is. If you do not understand why, then you really do not fully grasp the concept of Big-O algorithm measurement in the first place.

  • (cs) in reply to Otto

    Otto:
    Hmm.. I don't think you actually understand what O(x) means...

    i understand it quite well.

    Otto:
    And now I'm sure of it. If your algorithm is O(1), then it runs in constant time regardless of the input. Period. Otherwise the algorithm is not O(1).

    well, it's an issue with the 'caching' algorithm.  generating all possible substrings in constant time is still a work in progress.  only approximate constant time solutions exist.

    Otto:
    Implementation does not affect the O(x) expression of the algorithm. I suggest you read up on Big-O before talking about things you clearly do not understand.

    i would suggest any perl programmer do the same.

  • (cs) in reply to emptyset
    emptyset:

    Anonymous:
    Regexes can detect substrings in O(1)? This is an argument I need to see.

    i've described it before, and have written a very successful implementation of it in FiPL.

    your basic regex has to look at every character, providing no added benefit to the loop/parse method.  the way to achieve O(1) performance is by 'caching'.  when a substring to find in a string is provided as input, the regex 'caches' all the possible substrings in the string.  it is then a trivial matter to determine that the substring exists, with a constant number of characters.  simply poll a few cleverly chosen characters in the string and voilà!  there you have it.  now, i hear what you're saying.  yes, an adversary may construct a string that can defeat this algorithm.  that's why you poll a constant number of well-chosen characters and a constant number of randomly chosen characters.  comparing which characters are found and their position in the strings against the 'cached' values produces an answer in O(1).

    granted, somebody mentioned 4gb strings.  the good news is that the algorithm takes things like that into account.  after a certain string length, you can't necessarily guarantee constant time performance.  so, you do the next best thing: approximation.  by varying the constant number of random queries into the string, you can better approximate finding the substring.  perl programmers know this.  having a fast but 65% reliable method of finding substrings in constant time is pretty good.  as my complexity teacher once put it, "you can find out anything with a 51% approximate solution" - of course, he was talking in the more theoretical sense, but it still applies.

    Sounds like emptyset is talking about O(1) with respect to the size of the regex pattern. Everyone else seems to be talking about O(n) where n = length of the string to be searched.

  • (cs) in reply to emptyset
    emptyset:

    look.  if you 'cache' all possible substrings of a string, you can perform the lookup in constant time.  the substrings have a well-defined order on them.

    The number of all possible substrings of a string can be expressed as n!. Searching through them, if well indexed, can be done in log n time. Therefore your method can be, at best, O(log (n!)), which is still greater than O(1).

    emptyset:

    no, a generalized O(1) substring search is possible, and that's what perl uses.  if it wasn't, nobody would program in perl because then it wouldn't be faster than other languages.

    Perl is not faster than other languages. Why would it be? Algorithm measurement is not language dependent.

  • (cs) in reply to OneFactor

    OneFactor:
    Sounds like empty set is talking about O(1) with respect to the size of the regex pattern. Everyone else seems to be talking about O(n) where n = length of the string to be searched.

    no.

  • (cs) in reply to emptyset
    emptyset:

    well, it's an issue with the 'caching' algorithm.  generating all possible substrings in constant time is still a work in progress.  only approximate constant time solutions exist.

    Doesn't really matter how long it takes to generate those substrings. Searching through them cannot be O(1) either. At best that search of any kind can only be O(log n), where n is the number of substrings you're searching through.

    And since there are n! possible substrings of a string of length n, the total time for your substring search algorithm is O(log n!). Now, this might be faster in terms of seconds for your specific case, but the Big-O representation of it is not O(1) no matter how you munge it about.

  • (cs) in reply to Otto

    Otto:
    The number of all possible substrings of a string can be expressed as n!. Searching through them, if well indexed, can be done in log n time. Therefore your method can be, at best, O(log (n!)), which is still greater than O(1).

    as a great man once said, O(log(n)) <= 64.  QED.

  • (cs) in reply to emptyset
    emptyset:

    as a great man once said, O(log(n)) <= 64.  QED.

    If this guy was all that great, he probably would have tried to make sense. Comparing an O(x) expression to a real number doesn't make sense in any world I know of (including this one).

  • (cs) in reply to emptyset
    emptyset:

    no, a generalized O(1) substring search is possible, and that's what perl uses.  if it wasn't, nobody would program in perl because then it wouldn't be faster than other languages.

    It's not very often I want to be that message-board jackass who flames people by laughing at them without explanation, but...

    HAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHA

    Ok, sorry about that.

  • (cs) in reply to emptyset
    emptyset:

    Otto:
    The number of all possible substrings of a string can be expressed as n!. Searching through them, if well indexed, can be done in log n time. Therefore your method can be, at best, O(log (n!)), which is still greater than O(1).

    as a great man once said, O(log(n)) <= 64.  QED.

    Oh, now I get it.  In your own little world, O(1) = O(log(n)), but you took three days and paragraphs of posts to tell us that.  Nice...

  • (cs) in reply to Otto
    Otto:
    emptyset:

    as a great man once said, O(log(n)) <= 64.  QED.

    If this guy was all that great, he probably would have tried to make sense. Comparing an O(x) expression to a real number doesn't make sense in any world I know of (including this one).

    It makes sense --  think about how big N has to be for log(N) to be more than 64.  He's making a point about how slowly an O(log n)-time cost increases.

    Nevertheless, you can't just throw out stuff like "O(1) substring matching" and then back off to "O(log n), but that's not really very bad!" when people call you on it.
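    A quick sanity check of that point in Python: log2(N) doesn't pass 64 until N exceeds 2^64, i.e. roughly 1.8 x 10^19 items.

        import math

        # log2(N) only exceeds 64 once N is larger than 2**64.
        print(2 ** 64)             # 18446744073709551616  (~1.8e19)
        print(math.log2(2 ** 64))  # 64.0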

  • (cs) in reply to zip

    zip:
    It makes sense --  think about how big N has to be for log(N) to be more than 64.  He's making a point about how slowly an O(log n)-time cost increases.

    Yeah, but 64 what? What is his measurement in? Comparing O(anything) to "64" doesn't make any sense at all. Measuring actual things in terms of numbers is not what Big-O does. So comparing it to anything measured in numbers doesn't really make any sense. I mean, saying O(log n) doesn't even specify the base for "log", for example.


  • (cs) in reply to emptyset
    emptyset:

    OneFactor:
    Sounds like empty set is talking about O(1) with respect to the size of the regex pattern. Everyone else seems to be talking about O(n) where n = length of the string to be searched.

    no.

    So umm... given a string of length n, how in the world can you find out if it contains the letter 'a' in O(1) time? Are you talking worst-case scenario or simply the mean case?

  • (cs) in reply to Otto
    Otto:
    emptyset:

    look.  if you 'cache' all possible substrings of a string, you can perform the lookup in constant time.  the substrings have a well-defined order on them.

    The number of all possible substrings of a string can be expressed as n!. Searching through them, if well indexed, can be done in log n time. Therefore your method can be, at best, O(log (n!)), which is still greater than O(1).


    Quite apart from the amount of storage it would require (which is O(n!*n)) and the amount of time it would take to generate all those substrings (which is proportional to the storage). You could probably cut that down massively by reusing sub-substrings, but it's still going to be at least O(n^2).

    Otto:

    emptyset:

    no, a generalized O(1) substring search is possible, and that's what perl uses.  if it wasn't, nobody would program in perl because then it wouldn't be faster than other languages.

    Perl is not faster than other languages. Why would it be? Algorithm measurement is not language dependent.



    As I said: he's spewing bullshit, probably intentionally so. Let's just stop taking him seriously just because he makes the bullshit sound vaguely coherent in this thread.
  • (cs) in reply to OneFactor

    OneFactor:
    So umm... given a string of length n, how in the world can you find out if it contains the letter 'a' in O(1) time? Are you talking worst-case scenario or simply the mean case?

    "caching".

  • bozo (unregistered) in reply to emptyset

    emptyhead, why don't you post your algorithm, to show us how it is O(1).

    There had better not be any loops in it.
    You're not even allowed to count the length of the string.

    Don't forget to include the substring caching!

    Judging by the other posts,
    the caching is O(n!) and the searching is O(log(n)) at best, so your algorithm would then be:
    (big surprise here:) O(log(n)) + O(n!)

    Try not to be so stupid.

  • (cs) in reply to Otto
    Otto:

    zip:
    It makes sense --  think about how big N has to be for log(N) to be more than 64.  He's making a point about how slowly an O(log n)-time cost increases.

    Yeah, but 64 what? What is his measurement in? Comparing O(anything) to "64" doesn't make any sense at all. Measuring actual things in terms of numbers is not what Big-O does. So comparing it to anything measured in numbers doesn't really make any sense. I mean, saying O(log n) doesn't even specify the base for "log", for example.


    Ok, more accurately, it doesn't make SENSE but it does make a POINT.

  • (cs) in reply to brazzy
    brazzy:
    your method can be, at best, O(log (n!)), which is still greater than O(1).


    In fact, O(log (n!)) is O(n).

  • (cs) in reply to Otto
    Otto:

    The number of all possible substrings of a string can be expressed as n!. Searching through them, if well indexed, can be done in log n time. Therefore your method can be, at best, O(log (n!)), which is still greater than O(1).


    The number of substrings of an n-character string is not n!, it is n+(n-1)+...+2+1 = n(n+1)/2 (there are n 1-character substrings, n-1 2-character substrings, and so on, up to 2 (n-1)-character substrings and 1 n-character "substring", i.e. the entire string).  So the search can be, at best, O(log n^2), which is better than O(log n!), but still worse than O(1).
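    A quick brute-force check of that count in Python (throwaway code, just to verify the formula):

        # Every (contiguous, non-empty) substring is a pair of cut points
        # 0 <= i < j <= n, so there are n(n+1)/2 of them.
        def count_substrings(n):
            return sum(1 for i in range(n) for j in range(i + 1, n + 1))

        for n in range(1, 8):
            assert count_substrings(n) == n * (n + 1) // 2
        print("n(n+1)/2 confirmed for n = 1..7")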
  • (cs) in reply to emptyset
    emptyset:

    OneFactor:
    So umm... given a string of length n, how in the world can you find out if it contains the letter 'a' in O(1) time? Are you talking worst-case scenario or simply the mean case?

    "caching".



    Ah, I understand! emptyset calls his bag of magic pixie dust a "cache". Sprinkling it over the data to execute an O(n) algorithm in O(1) is therefore "caching".
  • (cs) in reply to Anonymouse

    Anonymous:
    Did you consider that the stored procedure might be called by more than just this one application?

    Read Alex's post.

  • (cs) in reply to AJR

    AJR:
    The number of substrings of an n-character string is not n!, it is n+(n-1)+...+2+1 = n(n+1)/2

    Doh! You're correct, sir.

  • (cs) in reply to bozo

    Anonymous:
    emptyhead, why don't you post your algorithm, to show us how it is O(1).

    i can't post my algorithm because it was developed in FiPL.  you have to interpret the aftermath of the scales in order to see if the substring was found or not:

    S1 is flip-flop of BLACK and WHITE;
    BlackEvent is incident when S1 becomes BLACK;
    Cycle1 is aftermath of BlackEvent as follows [
         S50 is haphazard; /* change s50's color */
    ]

    WhiteEvent is incident when S1 becomes WHITE;
    Cycle2 is aftermath of WhiteEvent as follows [
         S50 is haphazard;
    ]

    TomatoEvent is incident when S50 becomes TOMATO;

    EndProgram is aftermath of TomatoEvent as follows [
         thanks for all the fish; /* stop the FVM */
    ]


  • (cs) in reply to Maurits

    Maurits:
    In fact, O(log (n!)) is O(n).

    Putting aside the fact that it's not a factorial as I thought, do you still think this is true? If it were O(log (2^n)), then I'd agree with you. But n! grows much faster than an exponential function like 2^n. So in fact, O(log (n!)) is not the same as O(n).


  • (cs) in reply to emptyset
    emptyset:

    i can't post my algorithm because it was developed in FiPL.  you have to interpret the aftermath of the scales in order to see if the substring was found or not:

    S1 is flip-flop of BLACK and WHITE;
    BlackEvent is incident when S1 becomes BLACK;
    Cycle1 is aftermath of BlackEvent as follows [
         S50 is haphazard; /* change s50's color */
    ]

    WhiteEvent is incident when S1 becomes WHITE;
    Cycle2 is aftermath of WhiteEvent as follows [
         S50 is haphazard;
    ]

    TomatoEvent is incident when S50 becomes TOMATO;

    EndProgram is aftermath of TomatoEvent as follows [
         thanks for all the fish; /* stop the FVM */
    ]

    Umm.. That just changes the state of scale 50 randomly until it becomes TOMATO. It uses S1 as a BLACK/WHITE timer.

    Don't act like using esoteric languages wins you any points here. I can deal with the Fish as well as anybody.

    Oh, and posting bits of code off of Wikipedia is just lame, man.


  • (cs) in reply to Otto
    Otto:
    Umm.. That just changes the state of scale 50 randomly until it becomes TOMATO. It uses S1 as a BLACK/WHITE timer.

    Don't act like using esoteric languages wins you any points here. I can deal with the Fish as well as anybody.

    Oh, and posting bits of code off of Wikipedia is just lame, man.

    what can i say?  i was pressed for time.  time is money.  i'm a very busy person.

  • Zack (unregistered) in reply to emptyset
    emptyset:
    what can i say?  i was pressed for time.  time is money.  i'm a very busy person.


    Uh...yeah. Right.
  • (cs) in reply to Otto
    Otto:

    Maurits:
    In fact, O(log (n!)) is O(n).

    Putting aside the fact that it's not a factorial as I thought, do you still think this is true? If it were O(log (2^n)), then I'd agree with you. But n! grows much faster than an exponential function like 2^n. So in fact, O(log (n!)) is not the same as O(n).



    For small values of n, log(n!) is approximately linear (but, of course, big-O notation doesn't really care about small values of n).  For large n, a better approximation is ln(n!) ~= n ln(n) - n, which is O(n log(n)) (http://en.wikipedia.org/wiki/Factorial#Logarithm_of_the_factorial).
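    A quick numeric check of that approximation in Python (math.lgamma(n+1) computes ln(n!) without the huge intermediate factorial):

        import math

        for n in (10, 1_000, 1_000_000):
            exact = math.lgamma(n + 1)      # ln(n!)
            approx = n * math.log(n) - n    # Stirling-style approximation
            print(n, exact, approx, exact / approx)
        # The ratio tends to 1 as n grows, consistent with ln(n!) ~= n ln(n) - n.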
