- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
In a perfect world, yes but sometimes the developer knows exactly why this is happening and can't do anything about it. If you are working with a 3rd party API, you might know that the API is causing the issue but you don't have a choice but to deal with it. Or if you are getting this data from another team that can't or won't deal with the issue so you just suck it up and deal.
I think this could be written better, though. In the HashSet example I gave above, you could load the nullStrings from a file or something so that you don't have to change the code everytime a new one of these is found.
Admin
Admin
<font size="1">
I agree, </font><font size="1">sometimes. Not this time though.
One time I did it was when implementing a hierarchy in SQL. One of my columns was similar to an XPath expression representing the path to the row's node in the hierarchy. It looked like: "/1/2/3/" but the values were the PK's of other rows (nodes)... "/GrandDad/Dad/Me/".
To find all descendants of an arbitrary node, I could query it like "WHERE path like '%/GrandDad/%'" (replace "GrandDad" with an actual int PK value, of course).
Worked like a charm.
</font>
Admin
I shudder at the mere thought, yet you seem to be right: /^(null_|_)(null_|_)(null)?$/ seems to indeed match all 8 cases (the number 8 is itself a clue)
My eyes, my eyes
Jan
Admin
I don't believe that is *ever* a good way to do things...
The design of your database should be able to expand to accomodate multiple tasks, without overloading the meaning of a particular column or storing multiple values in a single column. What kind of monstrous SQL statement are you going to write if you need to do something simple such as count the sub-fields of a column? Are you going to load the data into your application and parse it, or would you attempt to write a parser in SQL?
Please submit some of your database designs so that we don't run out of material any time soon...
Admin
/^(null)?_(null)?_(null)?$/ would have been marginally more readable
Admin
The comment of "Added for Issue #126185" is most likely a reference to a problem report number; implying that there were 126,184 previous errors (and counting).
Maybe it's my database centricity but strictly steaking an empty string is not Null. And if this is what's involved in determining if a value is Null, I am dying to see how they go about detecting an empty string.
There was a major tornado that hit Cleveland last night... It is estimated that it did $25,000,000.00 worth of improvements... (Sorry, Alex, I just souldn't resist).
Great orchestra though and the Rock'n'Roll Hall of Fame.
Admin
I'm afraid I'll have to disagree as well. I've yet to see an example of when it would be good and would love to see one!
At my last job, one of my coworkers supported a years old system (which he wrote and only he really understood). One of the key tables used a natural key as its primary key. Inevitably, the business model changed so that the entities defined by the table no longer could be handled by his key. Redesigning the schema was clearly beyond him because his solution was to turn the PK into a concatenated string -- now every join to the table involves parsing the key. Hmmm . . . wonder why the system's running so much slower now?
Admin
Try looking materialized paths for representing trees in SQL.
Admin
It is a great way to do things if you have no idea how databases work or any desire to use indexes, constraints, relationships, and so on.
Admin
Exactly what I meant.
Admin
I'm not saying it's a good way to do things all the time, or even 1% of the time. All I'm saying is that you occasionally run into one ad-hoc problem where this is the optimal solution. See John Dingleberries' example above.
Admin
It's A solution. I wouldn't call it an optimal solution -- and for the task at hand it might be the fastest-to-implement solution. I admit that I've taken lots of shortcuts like this when writing throw away code or am prototyping functionality not related to the DB schema but in a production database I'd use a properly indexed xref table.
Admin
Depends on the code. Sometimes it makes the code more complex and that would be one of those instances where you don't use a single exit point. Other cases, the code is much cleaner if you restrict yourself to one exit point. It's not a hard and fast rule you have to apply to every single function written.
Admin
Why wouldn't you just create a table that with a parent column which would point to the parent key or null if it was the root?
Admin
OK, we are of the same mind on this. I thought you were one of those 'multiple exit points are evil' people.
Admin
There are many facets to "optimal". Are talking, the most performant? the most scalable? the most maintainable? the most elegent? the most dogmatic?
I modeled and prototyped a bunch of techniques (self-referential tables, nested sets...) and this technique (known as materialized path) was the fastest, most flexible and was implemented with the simplest and least opaque code. In my estimation, it was optimal for this case.
Admin
The app had to support multiple hierarchies with arbitrary depths. To build a tree of nodes with the schema you suggest would require n (the depth) number of joins (or worse, calls), but n is not known a priori. Your model makes it very easy to get all the children of a given node, but it's not as easy to find all the descendants or ancestors of an arbitrary node. If anyone is interested, I can dig out the code.
My thinking is: when a row is inserted, we know its position in the hierarchy. Why not record that rather than recomputing it for each subsequent query? I know, I know...don't store computed data in the DB, but this kind of computation is very expensive.
Admin
Hey! No name calling!
Admin
That was meant as a compliment. [:D]
Your solution sounds a lot like what I mentioned in my original post that had all the naysayers going WTF at me.
Another advantage to this kind of solution is that you can easily spot long-distance dependencies. For example if you're compiling lists of tasks it's very easy to see who did both task A and task B in that order, adjacent or not, etc.
Doesn't seem to be a lot of data miners on this board. A damn shame.
Admin
As Bradford Lingonberries mentioned, your solution doesn't address arbitrary-length sequences (which in my case are the rule rather than the exception) and there's no easy way to traverse the tree/path/hierarchy in both directions.
Admin
But not necessarily all in the one product. If it's a company-wide (or group with the company) bug tracking system, the numbers could well be unique across many different products. Consider, for example, mozilla.org's bugzilla, which has bug numbers up to 297252 (the highest number shown on the "new reports" page atm.) Bug number 297252 is listed as a Firefox bug, but that doesn't mean FF has had 297,252 bugs - that database includes many duplicates, bugs with other Mozilla products (like the Bugzilla system itself, for example), and other stuff, and the numbers cover all of those.
Who knows what's in the 126,184 previous "issues" in the system that the author of the WTF was using?
Admin
You are absolutely correct... (My original remark was just kinda a joke...)
Admin
It's probably just a bug in the shareware. Maybe the management came up
with a new processor to test the validity of their sql-statement.
On the other hand, it seems likely that the bug is within the grasp
of a properly formatted text-string. It's pretty common. Maybe it could even
be downloaded to a handheld system for greater endurance!!!! Both pros and cons.
It's probably not even sold to thousands of households worldwide :(
Sad but that's how it goes. Anyway, cool code!
Admin
Oh God, please stop quoting. ;_;
(=p)
Admin
You mean my textbooks lied to me!?!?!?
[:'(]
'K I'll buy it. I'm certainly one who'll try anything once and I'm a big proponent of making the best decision regardless of conventions. Truth is, this sort of thing flys in the face of all the standards I've ever read about but I'll certainly keep it under my cap for an occasion when it might come in handy -- might even dazzle some DBAs with it! [;)]
Admin
This is caused by using this:
String mungTogether= obj1 + "_" + obj2 + "_" + obj3;
to implement equals() and hashCode().
The correct way to do this kind of thing is to create a small object holding the potential tuples (triples), to treat the objects as objects (with types), and to write equals and hashCode properly. If you do it right, you only need to do it once.
Admin
Somehow I miss values like "completely_null", "real_full_null", "believe_me_this _is_so_null", "dull_null" or "hyper_null".
Admin
What scares me the most is that the "null_null_null" thing is obviously a symptom of something in the code being passed a "null_" string, which then looks at it, decides it's not a NULL string, and acts upon it.
Expect a new addition to this function soon, as "null_null_null_null" is bound to arrive as soon as some function is passed a "null_" string 4 times ;)
Admin
This time I don't think it's the developer himself , but the previous developers who failed to clean and management who fail to put maintenance money in the project.
I've seen stuff like this before. Main reason is: "no time, project must not crash".
Admin
Not necessarily. What if the || operator has higher precedence to the . operator? Then it could mean (str.equals("blah") || str).equals("blah")
[^o)]
Admin
Also its entirely possible they started at bug 100,000 so they could have 900,000 bugs before entering this one:
bug #1,000,000: Bug numbers no longer sort properly as text.
Admin
In certain situations gotos are a very elegant solution; emulating them with if/while is a pain. A safe goto could easily be added to Java or C#, too bad it's not there.
Admin
This reminds me of something I read in a PHP 5 book earlier this year. The issue was how to treat the string "false" inside of XML tags as a boolean in the context of a PHP application. Soon the author was making this much harder than it had to be, applying a regular expression that not only tested for the existence of the string "false", but also of "no", "off", etcetera. Immediately two simple solutions jumped out at me as to how to address the original issue of converting the string "false" to boolean: 1) use eval() on the string, or better yet 2) making the "false" implicit, and only emitting the XML tag when the attribute is true. So it's no wonder people are writing code such as in your example, when "experts" who write books can hardly do better!
Admin
another WTF in the making !!
Admin
Because somebody might be feeding "annull" or "nullify" to the function, and trigger false positives.
A regular expression to make the test more precise might be worth considering.
Admin
This reminds me of one of my very first projects I ever worked on. It was a fairly basic online shopping system written in PHP/MySQL. The category table had three columns, id, pid, and name, with the pid (parent-id) set to zero if it was a "top-level category." Everything was going well until I got to the part where the client added new items to inventory...the select box needed to have a higher level of indentation based on the subcategory (no leading spaces for top-level item, 2 for each level of nested-ness(?) )
Anyways, I ended up doing this:
$maintext.="<select name='cid'>";
$sql = "SELECT * FROM `categories` WHERE `pid` = \"0\";";
$toplevelcats = $db->query($sql) or send_error_page($db->error());
//add extra whitespace for subcategories based on current indentationlevel
$currentindentlevel=0;
while($thistoplevelcat = $db->getrow($toplevelcats)){
//no parents for this level
$thistoplevelcatname=prepString($thistoplevelcat['name']);
$thistoplevelcatid = $thistoplevelcat['id'];
$maintext.="<option value='$thistoplevelcatid'>$thistoplevelcatname</option>";
$sql = "SELECT * FROM `categories` WHERE `pid` = \"$thistoplevelcatid\" ORDER BY `name` ASC;";
//store results index in array...increment to go to a new, deeper sublevel
$subcatresults[++$currentindentlevel]=$db->query($sql) or send_error_page($db->error());
while($currentindentlevel>0){
while($thissubcatrow = $db->getrow($subcatresults[$currentindentlevel])){
$thissubcatname = prepString($thissubcatrow['name']);
$thissubcatid = $thissubcatrow['id'];
$maintext.="<option value='$thissubcatid' ";
if($thissubcatid == $cid){
$maintext.="SELECTED";
}
$maintext.=">";
for($i=0;$i<$currentindentlevel;$i++){
$maintext.=" ";
}
$maintext.="$thissubcatname</option>";
$sql = "SELECT * FROM `categories` WHERE `pid` = \"$thissubcatid\" ORDER BY `name` ASC;";
$subcatresults[++$currentindentlevel]=$db->query($sql) or send_error_page($db->error());
}
$currentindentlevel--;
}
}
$maintext.="</select></td></tr>";
Sadburger
Admin
You can build a stored procedure to do this. I belive Oracle has a built in recursive query for this type of thing. As another person also mentioned, you can also create a cross-reference table that does this kind of thing.
It seems to me that your solution requires retrieving the data, parsing it, and then going back to the DB for the rows specified in the parsed data. Maybe I am misunderstanding the solution.
Admin
I'm not sure how it doesn't address arbitrary length sequences.
To find all the children, search for rows with the node as the parent and recurse. To find the ancestors, you just find the parent row, find it's parent row until there is no parent.
This was just a simple example of the idea, you can get much fancier with a xref table as mentioned by someone else.
Admin
Why evaluate a boolean to return the same boolean?
return (blah || bleah || yadda || bubba);
should be more than enough.
Admin
I make no effort to avoid multiple exit points. In fact, sometimes I'll move a block of code into its own procedure so that I can use an extra exit point rather than nesting the if's deeper. However, I can nearly always keep those exits within a few lines of the beginning of the procedure, so they're obvious.
Admin
Exactly. People will sit there and tell you that 'return false' is more difficult to understand than an extra level of nesting and setting a variable to false and retuning it at the end of the method.
Admin
I'm not sure about Oracle, but in SQL Server, recursive queries are hard to write and don't perform well. There is also a 32 level limit on the recursion. The xref table (nested sets technique) is also difficult to implement and requires many joins for deep hierarchies
Not at all. The query only has a LIKE clause and no joins or recursion.
Admin
I love this site. It helps keep me calm when I am in despair, at work as one dirty hack is suggested to sort out the problem caused by another dirty hack. Ok there is a time and a place for a dirty hack but there is also a time and place for actually doing things 'right' in the first place or at least fixing them when they come to light. Anyway I digress.
Currently I am finding solace in the fact that one of my colleagues who I sent a link to this site in the hope that he might learn something useful seems to find it hilarious whilst remaining blissfully unaware that if I started sending his code in you wouldn't need any extra content for at least several months.
Admin
times new roman??
that's what comes of using word to check your spelling
Admin
OK, what about a stored-procedure?
But don't you need to retrieve the rows at some point? Perhaps you are dealing with smaller data sets that I have to work with. Generally, we can't rely on like queries (especially unbounded ones) because they can take hours, if not days to return.
Admin
Hmm. This is highly unorthodox. Doing things right in the first place? Fixing the root problem? I don't know what planet you are from but this is earth, buddy. We don't do things that way around here.
Admin
sorry that comment was out of order
Admin
That's usually when I feel having an extra exit point is fine, at the beginning of the function where it's obvious. It's usually to check input to make sure that it's kosher. In that case, though, you really haven't gotten to the meat of the function itself.
Admin
I'm a 'whatever makes the code elegant and easy to understand' person. :-)