• (cs) in reply to Sean

    Sean:


    I'd say it's a WTF on both ends.  You'd think the developer would maybe, just maybe try to either determine the cause of the various forms of strings-containing-nulls that come or at least try to do some sort of simple data cleansing so he/she doesn't have to add a new conditional every time a new issue comes in.

    In a perfect world, yes but sometimes the developer knows exactly why this is happening and can't do anything about it.  If you are working with a 3rd party API, you might know that the API is causing the issue but you don't have a choice but to deal with it.  Or if you are getting this data from another team that can't or won't deal with the issue so you just suck it up and deal.

    I think this could be written better, though.  In the HashSet example I gave above, you could load the nullStrings from a file or something so that you don't have to change the code everytime a new one of these is found.

  • (cs) in reply to Rick
    Rick:

    Arguably the best symphony in the country and a fantastic antique car museum, though
      Nonsense!  The TCO doesn't hold a candle to the BugTussle String and Jug Philharmonic. 
  • (cs) in reply to rogthefrog

    <font size="1">

    rogthefrog:
    A Wizard A True Star:
    </font>

    <font size="1">

    rogthefrog:
    There are, however, possible explanations. Say the app is tracking tasks, each of which returns some string (which may be null). You could store each task's result in some session variable, and when your user is done write it to your db as a single string separated by _. This makes it extensible: if you add another task, you don't have to add a field to your db.
    </font>

    <font size="1">Storing delimited strings in database fields makes baby Jesus cry.</font>

    <font size="1">

    </font>

    <font size="1">Baby Jesus deserves it. Sometimes that's actually a good way to do things. </font>

    <font size="1">

    I agree, </font><font size="1">sometimes. Not this time though.

    One time I did it was when implementing a hierarchy in SQL. One of my columns was similar to an XPath expression representing the path to the row's node in the hierarchy. It looked like: "/1/2/3/" but the values were the PK's of other rows (nodes)...  "/GrandDad/Dad/Me/".

    To find all descendants of an arbitrary node, I could query it like "WHERE path like '%/GrandDad/%'" (replace "GrandDad" with an actual int PK value, of course).

    Worked like a charm.
    </font>
  • Jan (unregistered) in reply to rogthefrog
    rogthefrog:
    Rick:
    rogthefrog:

    Gawd... I think I know where __null and null__ and all other variants comes from.

    It looks like a string is used to hold values for several fields at once, like a comma-separated line, except using __ as the delimiter, or like a bitmask.

    So if the semantics of the strings is

    lastname__firstname__zipcode

    you could have

    papadimoulis__alex__12345



    This would explain why no test for "" or "null" since the input would always have the underscores added before the test.

    Correction to my post: the separator is _ not __. The occurrence of __ signifies that all 3 possible values in the string are blank.

    If this is right (which it probably is, sadly) this design is pretty moronic.

    There are, however, possible explanations. Say the app is tracking tasks, each of which returns some string (which may be null). You could store each task's result in some session variable, and when your user is done write it to your db as a single string separated by _. This makes it extensible: if you add another task, you don't have to add a field to your db. If each task's result uniquely identifies the task, this schema allows you to data-mine your db for questions like "how many users did task X right before task Y and didn't return null for either task"; "how many users stopped the whole process immediately after task X"; etc. You could store one record per task for each user but that would mean a lot more rows in your db and make said data mining possibly harder.

    I shudder at the mere thought, yet you seem to be right: /^(null_|_)(null_|_)(null)?$/ seems to indeed match all 8 cases (the number 8 is itself a clue)

    My eyes, my eyes

    Jan

  • (cs) in reply to rogthefrog
    rogthefrog:
    A Wizard A True Star:

    rogthefrog:
    There are, however, possible explanations. Say the app is tracking tasks, each of which returns some string (which may be null). You could store each task's result in some session variable, and when your user is done write it to your db as a single string separated by _. This makes it extensible: if you add another task, you don't have to add a field to your db.

    Storing delimited strings in database fields makes baby Jesus cry.

    Baby Jesus deserves it. Sometimes that's actually a good way to do things.

    I don't believe that is *ever* a good way to do things...

    The design of your database should be able to expand to accomodate multiple tasks, without overloading the meaning of a particular column or storing multiple values in a single column. What kind of monstrous SQL statement are you going to write if you need to do something simple such as count the sub-fields of a column? Are you going to load the data into your application and parse it, or would you attempt to write a parser in SQL?

    Please submit some of your database designs so that we don't run out of material any time soon...

     

  • Jan (unregistered) in reply to Jan
    Anonymous:

    I shudder at the mere thought, yet you seem to be right: /^(null_|_)(null_|_)(null)?$/ seems to indeed match all 8 cases (the number 8 is itself a clue)

    My eyes, my eyes

    Jan

    /^(null)?_(null)?_(null)?$/ would have been marginally more readable

  • (cs)

    haveworld:
    Bustaz Kool:
    Is anyone else surprise that there are at least #126185 issues reported so far?

    And how did you come to that conclusion?

    The comment of "Added for Issue #126185" is most likely a reference to a problem report number; implying that there were 126,184 previous errors (and counting).

    John Bigboote:

    Interestingly, an empty string will return false. Not sure if that's an oversight or an f'ed up twist on the rule.

    Maybe it's my database centricity but strictly steaking an empty string is not Null.  And if this is what's involved in determining if a value is Null, I am dying to see how they go about detecting an empty string.

    Anonymous:
    Reloacate to Cleveland! +o(   Now there is a WTF!

    There was a major tornado that hit Cleveland last night... It is estimated that it did $25,000,000.00 worth of improvements...  (Sorry, Alex, I just souldn't resist). 

    Great orchestra though and the Rock'n'Roll Hall of Fame.

  • (cs) in reply to Miszou
    rogthefrog:
    A Wizard A True Star:

    rogthefrog:
    There are, however, possible explanations. Say the app is tracking tasks, each of which returns some string (which may be null). You could store each task's result in some session variable, and when your user is done write it to your db as a single string separated by _. This makes it extensible: if you add another task, you don't have to add a field to your db.

    Storing delimited strings in database fields makes baby Jesus cry.

    Baby Jesus deserves it. Sometimes that's actually a good way to do things.

    I'm afraid I'll have to disagree as well.  I've yet to see an example of when it would be good and would love to see one!

    At my last job, one of my coworkers supported a years old system (which he wrote and only he really understood).  One of the key tables used a natural key as its primary key. Inevitably, the business model changed so that the entities defined by the table no longer could be handled by his key.  Redesigning the schema was clearly beyond him because his solution was to turn the PK into a concatenated string -- now every join to the table involves parsing the key.  Hmmm . . . wonder why the system's running so much slower now? 

     

  • woot (unregistered) in reply to El Duderino

    Try looking materialized paths for representing trees in SQL.

  • (cs) in reply to rogthefrog
    rogthefrog:
    A Wizard A True Star:

    rogthefrog:
    There are, however, possible explanations. Say the app is tracking tasks, each of which returns some string (which may be null). You could store each task's result in some session variable, and when your user is done write it to your db as a single string separated by _. This makes it extensible: if you add another task, you don't have to add a field to your db.

    Storing delimited strings in database fields makes baby Jesus cry.

    Baby Jesus deserves it. Sometimes that's actually a good way to do things.

    It is a great way to do things if you have no idea how databases work or any desire to use indexes, constraints, relationships, and so on.

  • (cs) in reply to John Smallberries
    John Smallberries:
    <FONT size=1>
    rogthefrog:
    A Wizard A True Star:
    </FONT>

    <FONT size=1>

    rogthefrog:
    There are, however, possible explanations. Say the app is tracking tasks, each of which returns some string (which may be null). You could store each task's result in some session variable, and when your user is done write it to your db as a single string separated by _. This makes it extensible: if you add another task, you don't have to add a field to your db.
    </FONT>

    <FONT size=1>Storing delimited strings in database fields makes baby Jesus cry.</FONT>

    <FONT size=1>

    </FONT>

    <FONT size=1>Baby Jesus deserves it. Sometimes that's actually a good way to do things. </FONT>

    <FONT size=1>


    I agree, </FONT><FONT size=1>sometimes. Not this time though.

    One time I did it was when implementing a hierarchy in SQL. One of my columns was similar to an XPath expression representing the path to the row's node in the hierarchy. It looked like: "/1/2/3/" but the values were the PK's of other rows (nodes)...  "/GrandDad/Dad/Me/".

    To find all descendants of an arbitrary node, I could query it like "WHERE path like '%/GrandDad/%'" (replace "GrandDad" with an actual int PK value, of course).

    Worked like a charm.
    </FONT>

    Exactly what I meant.

  • (cs) in reply to Jeff S
    Jeff S:
    rogthefrog:
    A Wizard A True Star:

    rogthefrog:
    There are, however, possible explanations. Say the app is tracking tasks, each of which returns some string (which may be null). You could store each task's result in some session variable, and when your user is done write it to your db as a single string separated by _. This makes it extensible: if you add another task, you don't have to add a field to your db.

    Storing delimited strings in database fields makes baby Jesus cry.

    Baby Jesus deserves it. Sometimes that's actually a good way to do things.

    It is a great way to do things if you have no idea how databases work or any desire to use indexes, constraints, relationships, and so on.

    I'm not saying it's a good way to do things all the time, or even 1% of the time. All I'm saying is that you occasionally run into one ad-hoc problem where this is the optimal solution. See John Dingleberries' example above.

  • (cs) in reply to rogthefrog

    It's A solution.  I wouldn't call it an optimal solution -- and for the task at hand it might be the fastest-to-implement solution.  I admit that I've taken lots of shortcuts like this when writing throw away code or am prototyping functionality not related to the DB schema but in a production database I'd use a properly indexed xref table.

  • (cs) in reply to dubwai
    dubwai:
    I don't find the single exit point to make anything clearer.  It makes the code path merge back on itself artifically (making the code more complex) instead of a clean branch.  It keeps the compiler from ensuring that all the code paths actually return something meaningful.  I like to be able to follow the code and easily see what it returns.  I don't like to have to walk through a bunch of extra code to see whether the returnValue variable is changed later on in the method.


    Depends on the code.  Sometimes it makes the code more complex and that would be one of those instances where you don't use a single exit point.  Other cases, the code is much cleaner if you restrict yourself to one exit point.  It's not a hard and fast rule you have to apply to every single function written.
  • (cs) in reply to rogthefrog
    rogthefrog:
    Jeff S:
    rogthefrog:
    A Wizard A True Star:

    rogthefrog:
    There are, however, possible explanations. Say the app is tracking tasks, each of which returns some string (which may be null). You could store each task's result in some session variable, and when your user is done write it to your db as a single string separated by _. This makes it extensible: if you add another task, you don't have to add a field to your db.

    Storing delimited strings in database fields makes baby Jesus cry.

    Baby Jesus deserves it. Sometimes that's actually a good way to do things.

    It is a great way to do things if you have no idea how databases work or any desire to use indexes, constraints, relationships, and so on.

    I'm not saying it's a good way to do things all the time, or even 1% of the time. All I'm saying is that you occasionally run into one ad-hoc problem where this is the optimal solution. See John Dingleberries' example above.

    Why wouldn't you just create a table that with a parent column which would point to the parent key or null if it was the root?

  • (cs) in reply to mizhi

    mizhi:

    Depends on the code.  Sometimes it makes the code more complex and that would be one of those instances where you don't use a single exit point.  Other cases, the code is much cleaner if you restrict yourself to one exit point.  It's not a hard and fast rule you have to apply to every single function written.

    OK, we are of the same mind on this.  I thought you were one of those 'multiple exit points are evil' people.

  • (cs) in reply to El Duderino
    El Duderino:
    It's A solution.  I wouldn't call it an optimal solution -- and for the task at hand it might be the fastest-to-implement solution.  I admit that I've taken lots of shortcuts like this when writing throw away code or am prototyping functionality not related to the DB schema but in a production database I'd use a properly indexed xref table.

    There are many facets to "optimal". Are talking, the most performant? the most scalable? the most maintainable? the most elegent? the most dogmatic?

    I modeled and prototyped a bunch of techniques (self-referential tables,  nested sets...) and this technique (known as materialized path) was the fastest, most flexible and was implemented with the simplest and least opaque code. In my estimation, it was optimal for this case.
  • (cs) in reply to dubwai
    dubwai:
    Why wouldn't you just create a table that with a parent column which would point to the parent key or null if it was the root?

    The app had to support multiple hierarchies with arbitrary depths. To build a tree of nodes with the schema you suggest would require n (the depth) number of joins (or worse, calls), but n is not known a priori. Your model makes it very easy to get all the children of a given node, but it's not as easy to find all the descendants or ancestors of an arbitrary node. If anyone is interested, I can dig out the code.

    My thinking is: when a row is inserted, we know its position in the hierarchy. Why not record that rather than recomputing it for each subsequent query? I know, I know...don't store computed data in the DB, but this kind of computation is very expensive.
  • (cs) in reply to rogthefrog
    rogthefrog:
    I'm not saying it's a good way to do things all the time, or even 1% of the time. All I'm saying is that you occasionally run into one ad-hoc problem where this is the optimal solution. See John Dingleberries' example above.

    Hey! No name calling!
  • (cs) in reply to John Smallberries

    John Smallberries:
    rogthefrog:
    I'm not saying it's a good way to do things all the time, or even 1% of the time. All I'm saying is that you occasionally run into one ad-hoc problem where this is the optimal solution. See John Dingleberries' example above.

    Hey! No name calling!

    That was meant as a compliment. [:D]

    Your solution sounds a lot like what I mentioned in my original post that had all the naysayers going WTF at me.

    Another advantage to this kind of solution is that you can easily spot long-distance dependencies. For example if you're compiling lists of tasks it's very easy to see who did both task A and task B in that order, adjacent or not, etc.

    Doesn't seem to be a lot of data miners on this board. A damn shame.

  • (cs) in reply to dubwai
    dubwai:
    rogthefrog:
    Jeff S:
    rogthefrog:
    A Wizard A True Star:

    rogthefrog:
    There are, however, possible explanations. Say the app is tracking tasks, each of which returns some string (which may be null). You could store each task's result in some session variable, and when your user is done write it to your db as a single string separated by _. This makes it extensible: if you add another task, you don't have to add a field to your db.

    Storing delimited strings in database fields makes baby Jesus cry.

    Baby Jesus deserves it. Sometimes that's actually a good way to do things.

    It is a great way to do things if you have no idea how databases work or any desire to use indexes, constraints, relationships, and so on.

    I'm not saying it's a good way to do things all the time, or even 1% of the time. All I'm saying is that you occasionally run into one ad-hoc problem where this is the optimal solution. See John Dingleberries' example above.

    Why wouldn't you just create a table that with a parent column which would point to the parent key or null if it was the root?

    As Bradford Lingonberries mentioned, your solution doesn't address arbitrary-length sequences (which in my case are the rule rather than the exception) and there's no easy way to traverse the tree/path/hierarchy in both directions.

  • (cs) in reply to Bustaz Kool
    Bustaz Kool:

    haveworld:
    Bustaz Kool:
    Is anyone else surprise that there are at least #126185 issues reported so far?

    And how did you come to that conclusion?

    The comment of "Added for Issue #126185" is most likely a reference to a problem report number; implying that there were 126,184 previous errors (and counting).


    But not necessarily all in the one product.  If it's a company-wide (or group with the company) bug tracking system, the numbers could well be unique across many different products.  Consider, for example, mozilla.org's bugzilla, which has bug numbers up to 297252 (the highest number shown on the "new reports" page atm.)  Bug number 297252 is listed as a Firefox bug, but that doesn't mean FF has had 297,252 bugs - that database includes many duplicates, bugs with other Mozilla products (like the Bugzilla system itself, for example), and other stuff, and the numbers cover all of those.
    Who knows what's in the 126,184 previous "issues" in the system that the author of the WTF was using?

  • (cs) in reply to AJR
    AJR:
    Bustaz Kool:

    haveworld:
    Bustaz Kool:
    Is anyone else surprise that there are at least #126185 issues reported so far?

    And how did you come to that conclusion?

    The comment of "Added for Issue #126185" is most likely a reference to a problem report number; implying that there were 126,184 previous errors (and counting).


    But not necessarily all in the one product.  If it's a company-wide (or group with the company) bug tracking system, the numbers could well be unique across many different products.  Consider, for example, mozilla.org's bugzilla, which has bug numbers up to 297252 (the highest number shown on the "new reports" page atm.)  Bug number 297252 is listed as a Firefox bug, but that doesn't mean FF has had 297,252 bugs - that database includes many duplicates, bugs with other Mozilla products (like the Bugzilla system itself, for example), and other stuff, and the numbers cover all of those.
    Who knows what's in the 126,184 previous "issues" in the system that the author of the WTF was using?

    You are absolutely correct... (My original remark was just kinda a joke...)

  • Speaker (unregistered) in reply to AJR

    It's probably just a bug in the shareware. Maybe the management came up
    with a new processor to test the validity of their sql-statement.
    On the other hand, it seems likely that the bug is within the grasp
    of a properly formatted text-string. It's pretty common. Maybe it could even
    be downloaded to a handheld system for greater endurance!!!! Both pros and cons.
    It's probably not even sold to thousands of households worldwide :(
    Sad but that's how it goes. Anyway, cool code!

  • (cs) in reply to rogthefrog
    rogthefrog:
    dubwai:
    rogthefrog:
    Jeff S:
    rogthefrog:
    A Wizard A True Star:

    rogthefrog:
    There are, however, possible explanations. Say the app is tracking tasks, each of which returns some string (which may be null). You could store each task's result in some session variable, and when your user is done write it to your db as a single string separated by _. This makes it extensible: if you add another task, you don't have to add a field to your db.

    Storing delimited strings in database fields makes baby Jesus cry.

    Baby Jesus deserves it. Sometimes that's actually a good way to do things.

    It is a great way to do things if you have no idea how databases work or any desire to use indexes, constraints, relationships, and so on.

    I'm not saying it's a good way to do things all the time, or even 1% of the time. All I'm saying is that you occasionally run into one ad-hoc problem where this is the optimal solution. See John Dingleberries' example above.

    Why wouldn't you just create a table that with a parent column which would point to the parent key or null if it was the root?

    As Bradford Lingonberries mentioned, your solution doesn't address arbitrary-length sequences (which in my case are the rule rather than the exception) and there's no easy way to traverse the tree/path/hierarchy in both directions.



    Oh God, please stop quoting. ;_;
    (=p)
  • (cs) in reply to John Smallberries

    John Smallberries:
    The app had to support multiple hierarchies with arbitrary depths. To build a tree of nodes with the schema you suggest would require n (the depth) number of joins (or worse, calls), but n is not known a priori. Your model makes it very easy to get all the children of a given node, but it's not as easy to find all the descendants or ancestors of an arbitrary node. If anyone is interested, I can dig out the code.

    My thinking is: when a row is inserted, we know its position in the hierarchy. Why not record that rather than recomputing it for each subsequent query? I know, I know...don't store computed data in the DB, but this kind of computation is very expensive.

    You mean my textbooks lied to me!?!?!?

    [:'(]

    'K I'll buy it.  I'm certainly one who'll try anything once and I'm a big proponent of making the best decision regardless of conventions.  Truth is, this sort of thing flys in the face of all the standards I've ever read about but I'll certainly keep it under my cap for an occasion when it might come in handy -- might even dazzle some DBAs with it! [;)]

  • (cs)

    This is caused by using this:

    String mungTogether= obj1 + "_" + obj2  + "_" + obj3;

    to implement equals() and hashCode().

    The correct way to do this kind of thing is to create a small object holding the potential tuples (triples), to treat the objects as objects (with types), and to write equals and hashCode properly. If you do it right, you only need to do it once. 

  • (cs)

    Somehow I miss values like "completely_null", "real_full_null", "believe_me_this _is_so_null", "dull_null" or "hyper_null".

  • Jimmy (unregistered)

    What scares me the most is that the "null_null_null" thing is obviously a symptom of something in the code being passed a "null_" string, which then looks at it, decides it's not a NULL string, and acts upon it.
    Expect a new addition to this function soon, as "null_null_null_null" is bound to arrive as soon as some function is passed a "null_" string 4 times ;)

  • (cs)

    This time I don't think it's the developer himself , but the previous developers who failed to clean and management who fail to put maintenance money in the project.
    I've seen stuff like this before.  Main reason is: "no time, project must not crash".

  • bleugh (unregistered) in reply to dubwai

    John Smallberries:

    True, str.equals("blah") || str.equals("blah") really makes me scratch my head trying to figure out the operator precendence.  You'd think it would evaluate to the result of the first method call or'd with the result of the second method call.

    Not necessarily. What if the || operator has higher precedence to the . operator? Then it could mean (str.equals("blah") || str).equals("blah")

    [^o)]

  • Tom (unregistered) in reply to AJR
    AJR:
    Bustaz Kool:

    haveworld:
    Bustaz Kool:
    Is anyone else surprise that there are at least #126185 issues reported so far?

    And how did you come to that conclusion?

    The comment of "Added for Issue #126185" is most likely a reference to a problem report number; implying that there were 126,184 previous errors (and counting).


    But not necessarily all in the one product.  If it's a company-wide (or group with the company) bug tracking system, the numbers could well be unique across many different products.  Consider, for example, mozilla.org's bugzilla, which has bug numbers up to 297252 (the highest number shown on the "new reports" page atm.)  Bug number 297252 is listed as a Firefox bug, but that doesn't mean FF has had 297,252 bugs - that database includes many duplicates, bugs with other Mozilla products (like the Bugzilla system itself, for example), and other stuff, and the numbers cover all of those.
    Who knows what's in the 126,184 previous "issues" in the system that the author of the WTF was using?



    Also its entirely possible they started at bug 100,000 so they could have 900,000 bugs before entering this one:

    bug #1,000,000: Bug numbers no longer sort properly as text.

  • (cs) in reply to dubwai
    dubwai:

    I thought you were one of those 'multiple exit points are evil' people.

    In certain situations gotos are a very elegant solution; emulating them with if/while is a pain. A safe goto could easily be added to Java or C#, too bad it's not there.

  • George Jempty (unregistered)

    This reminds me of something I read in a PHP 5 book earlier this year.  The issue was how to treat the string "false" inside of XML tags as a boolean in the context of a PHP application.  Soon the author was making this much harder than it had to be, applying a regular expression that not only tested for the existence of the string "false", but also of "no", "off", etcetera.  Immediately two simple solutions jumped out at me as to how to address the original issue of converting the string "false" to boolean: 1) use eval() on the string, or better yet 2) making the "false" implicit, and only emitting the XML tag when the attribute is true.  So it's no wonder people are writing code such as in your example, when "experts" who write books can hardly do better!

  • monkey (unregistered) in reply to A Wizard A True Star
    A Wizard A True Star:

    rogthefrog:
    There are, however, possible explanations. Say the app is tracking tasks, each of which returns some string (which may be null). You could store each task's result in some session variable, and when your user is done write it to your db as a single string separated by _. This makes it extensible: if you add another task, you don't have to add a field to your db.

    Storing delimited strings in database fields makes baby Jesus cry.

     



    another WTF in the making !!
  • (cs) in reply to Charles Nadolski
    Charles Nadolski:
    WTF. Why not just use str.Find("null") and str.Find("__").  That would handle all of these stupid cases, including Issue #eventy-billion.  That is, unless for some damn reason "null" is a valid string.

            //Added for Issue #107724
    //Added for Issue #126185

    now THAT is priceles. What is exactly issue number X? Reading too much slashdot on the job?


    Because somebody might be feeding "annull" or "nullify" to the function, and trigger false positives.
    A regular expression to make the test more precise might be worth considering.
  • Sadburger (unregistered) in reply to John Smallberries
    John Smallberries:
    dubwai:
    Why wouldn't you just create a table that with a parent column which would point to the parent key or null if it was the root?

    The app had to support multiple hierarchies with arbitrary depths. To build a tree of nodes with the schema you suggest would require n (the depth) number of joins (or worse, calls), but n is not known a priori. Your model makes it very easy to get all the children of a given node, but it's not as easy to find all the descendants or ancestors of an arbitrary node. If anyone is interested, I can dig out the code.

    My thinking is: when a row is inserted, we know its position in the hierarchy. Why not record that rather than recomputing it for each subsequent query? I know, I know...don't store computed data in the DB, but this kind of computation is very expensive.

    This reminds me of one of my very first projects I ever worked on. It was a fairly basic online shopping system written in PHP/MySQL. The category table had three columns, id, pid, and name, with the pid (parent-id) set to zero if it was a "top-level category."  Everything was going well until I got to the part where the client added new items to inventory...the select box needed to have a higher level of indentation based on the subcategory (no leading spaces for top-level item, 2 for each level of nested-ness(?) )
    Anyways, I ended up doing this:
    $maintext.="<select name='cid'>";
        $sql = "SELECT * FROM `categories` WHERE `pid` = \"0\";";
        $toplevelcats = $db->query($sql) or send_error_page($db->error());
        //add extra whitespace for subcategories based on current indentationlevel
        $currentindentlevel=0;
        while($thistoplevelcat = $db->getrow($toplevelcats)){
          //no parents for this level
          $thistoplevelcatname=prepString($thistoplevelcat['name']);
          $thistoplevelcatid = $thistoplevelcat['id'];
          $maintext.="<option value='$thistoplevelcatid'>$thistoplevelcatname</option>";
          $sql = "SELECT * FROM `categories` WHERE `pid` = \"$thistoplevelcatid\" ORDER BY `name` ASC;";
          //store results index in array...increment to go to a new, deeper sublevel
          $subcatresults[++$currentindentlevel]=$db->query($sql) or send_error_page($db->error());
          while($currentindentlevel>0){
            while($thissubcatrow = $db->getrow($subcatresults[$currentindentlevel])){
              $thissubcatname = prepString($thissubcatrow['name']);
              $thissubcatid = $thissubcatrow['id'];
              $maintext.="<option value='$thissubcatid' ";
              if($thissubcatid == $cid){
                $maintext.="SELECTED";
              }
              $maintext.=">";
              for($i=0;$i<$currentindentlevel;$i++){
                $maintext.="    ";
              }
              $maintext.="$thissubcatname</option>";
              $sql = "SELECT * FROM `categories` WHERE `pid` = \"$thissubcatid\" ORDER BY `name` ASC;";
              $subcatresults[++$currentindentlevel]=$db->query($sql) or send_error_page($db->error());
            }
            $currentindentlevel--;
          }
        }
        $maintext.="</select></td></tr>";

    Sadburger
  • (cs) in reply to John Smallberries

    Bobby Loganberries:
    The app had to support multiple hierarchies with arbitrary depths. To build a tree of nodes with the schema you suggest would require n (the depth) number of joins (or worse, calls), but n is not known a priori. Your model makes it very easy to get all the children of a given node, but it's not as easy to find all the descendants or ancestors of an arbitrary node. If anyone is interested, I can dig out the code.

    You can build a stored procedure to do this.  I belive Oracle has a built in recursive query for this type of thing.  As another person also mentioned, you can also create a cross-reference table that does this kind of thing.

    Giovanni Huckleberries:

    My thinking is: when a row is inserted, we know its position in the hierarchy. Why not record that rather than recomputing it for each subsequent query? I know, I know...don't store computed data in the DB, but this kind of computation is very expensive.

    It seems to me that your solution requires retrieving the data, parsing it, and then going back to the DB for the rows specified in the parsed data.  Maybe I am misunderstanding the solution.

  • (cs) in reply to rogthefrog

    rogthefrog:
    As Bradford Lingonberries mentioned, your solution doesn't address arbitrary-length sequences (which in my case are the rule rather than the exception) and there's no easy way to traverse the tree/path/hierarchy in both directions.

    I'm not sure how it doesn't address arbitrary length sequences.

    To find all the children, search for rows with the node as the parent and recurse.  To find the ancestors, you just find the parent row, find it's parent row until there is no parent.

    This was just a simple example of the idea, you can get much fancier with a xref table as mentioned by someone else.

  • Anon (unregistered) in reply to Miszou

    Why evaluate a boolean to return the same boolean?

    return (blah || bleah || yadda || bubba);

    should be more than enough.

  • (cs) in reply to mizhi

    mizhi:
    dubwai:
    I don't find the single exit point to make anything clearer.  It makes the code path merge back on itself artifically (making the code more complex) instead of a clean branch.  It keeps the compiler from ensuring that all the code paths actually return something meaningful.  I like to be able to follow the code and easily see what it returns.  I don't like to have to walk through a bunch of extra code to see whether the returnValue variable is changed later on in the method.


    Depends on the code.  Sometimes it makes the code more complex and that would be one of those instances where you don't use a single exit point.  Other cases, the code is much cleaner if you restrict yourself to one exit point.  It's not a hard and fast rule you have to apply to every single function written.

    I make no effort to avoid multiple exit points.  In fact, sometimes I'll move a block of code into its own procedure so that I can use an extra exit point rather than nesting the if's deeper.  However, I can nearly always keep those exits within a few lines of the beginning of the procedure, so they're obvious.

  • (cs) in reply to PstScrpt
    PstScrpt:

    I make no effort to avoid multiple exit points.  In fact, sometimes I'll move a block of code into its own procedure so that I can use an extra exit point rather than nesting the if's deeper.  However, I can nearly always keep those exits within a few lines of the beginning of the procedure, so they're obvious.

    Exactly.  People will sit there and tell you that 'return false' is more difficult to understand than an extra level of nesting and setting a variable to false and retuning it at the end of the method.

  • (cs) in reply to dubwai
    dubwai:
    You can build a stored procedure to do this.  I belive Oracle has a built in recursive query for this type of thing.  As another person also mentioned, you can also create a cross-reference table that does this kind of thing.

    I'm not sure about Oracle, but in SQL Server, recursive queries are hard to write and don't perform well. There is also a 32 level limit on the recursion.  The xref table (nested sets technique) is also difficult to implement and requires many joins for deep hierarchies

    dubwai:
    It seems to me that your solution requires retrieving the data, parsing it, and then going back to the DB for the rows specified in the parsed data.  Maybe I am misunderstanding the solution.

    Not at all. The query only has a LIKE clause and no joins or recursion.
  • monkey (unregistered) in reply to John Smallberries

    I love this site.  It helps keep me calm when I am in despair, at work as one dirty hack is suggested to sort out the problem caused by another dirty hack.  Ok there is a time and a place for a dirty hack but there is also a time and place for actually doing things 'right' in the first place or at least fixing them when they come to light.  Anyway I digress.
    Currently I am finding solace in the fact that one of my colleagues who I sent a link to this site in the hope that he might learn something useful seems to find it hilarious whilst remaining blissfully unaware that if I started sending his code in you wouldn't need any extra content for at least several months.

  • monkey (unregistered) in reply to monkey

    times new roman??

    that's what comes of using word to check your spelling

  • (cs) in reply to John Smallberries

    Nigel Ginberries:
    dubwai:
    You can build a stored procedure to do this.  I belive Oracle has a built in recursive query for this type of thing.  As another person also mentioned, you can also create a cross-reference table that does this kind of thing.

    I'm not sure about Oracle, but in SQL Server, recursive queries are hard to write and don't perform well. There is also a 32 level limit on the recursion.  The xref table (nested sets technique) is also difficult to implement and requires many joins for deep hierarchies

    OK, what about a stored-procedure?


    Wesley Raspberries:
    dubwai:
    It seems to me that your solution requires retrieving the data, parsing it, and then going back to the DB for the rows specified in the parsed data.  Maybe I am misunderstanding the solution.

    Not at all. The query only has a LIKE clause and no joins or recursion.

    But don't you need to retrieve the rows at some point?  Perhaps you are dealing with smaller data sets that I have to work with.  Generally, we can't rely on like queries (especially unbounded ones) because they can take hours, if not days to return.

  • (cs) in reply to monkey

    Anonymous:
    there is also a time and place for actually doing things 'right' in the first place or at least fixing them when they come to light.

    Hmm.  This is highly unorthodox.  Doing things right in the first place?  Fixing the root problem?  I don't know what planet you are from but this is earth, buddy.  We don't do things that way around here.

  • monkey (unregistered) in reply to dubwai
    dubwai:

    Anonymous:
    there is also a time and place for actually doing things 'right' in the first place or at least fixing them when they come to light.

    Hmm.  This is highly unorthodox.  Doing things right in the first place?  Fixing the root problem?  I don't know what planet you are from but this is earth, buddy.  We don't do things that way around here.


    sorry that comment was out of order
  • (cs) in reply to PstScrpt
    PstScrpt:

    I make no effort to avoid multiple exit points.  In fact, sometimes I'll move a block of code into its own procedure so that I can use an extra exit point rather than nesting the if's deeper.  However, I can nearly always keep those exits within a few lines of the beginning of the procedure, so they're obvious.



    That's usually when I feel having an extra exit point is fine, at the beginning of the function where it's obvious.  It's usually to check input to make sure that it's kosher.  In that case, though, you really haven't gotten to the meat of the function itself.
  • (cs) in reply to dubwai
    dubwai:

    OK, we are of the same mind on this.  I thought you were one of those 'multiple exit points are evil' people.



    I'm a 'whatever makes the code elegant and easy to understand' person.  :-)

Leave a comment on “There's More Than One Way to Null a String”

Log In or post as a guest

Replying to comment #:

« Return to Article