The Daily WTF: Curious Perversions in Information Technology

duck1123 · 2005-09-10 Reply Admin

recursion is also using goto. We need programs that do the same thing every time, without fail. None of this conditional processing mumbo-jumbo. That's how you write error proof code, by not allowing it to respond to anything.

2005-09-11 Reply Admin

MISRA (Motor Industry Software Reliability Association) specify in their coding standards that a function must only have a single exit point. This approach dates from the days of 'structured programming'. Many of their standards are good common sense, but some are daft. I once worked with a team who used MISRA, and they delighted in writing 1000 line functions, full of nested if statements, with dozens of flags. It was hell.

All I can say is 'guard clauses'.

I suspect that jed was having fun with these functions. The date of the code sample is only 3 days before his departure.

technites · 2005-09-11 Reply Admin

Smalltalk seems to manage fine without that stuff.

technites · 2005-09-11 Reply Admin

WIldpeaks:
So you advise to write programs without loops or conditional structures ? How ? (more exactly: what's the program doing besides saying Hello World then ?)

Damn forum/my incompetence. I was trying to say Smalltalk does it with code blocks instead.

AndrewVos · 2005-09-11 Reply Admin

// need to set the result if it hasn't been set yet, we'll randomly decide true or false
// because there should be no bias.

MY GOD! BEST WTF EVER!!!

johnl · 2005-09-11 Reply Admin

I actually agree with the "single exit point" idea - I don't like looking at a return statement that might never be executed under a normal situation. I prefer

if (something is true) { r = someValue; } else { r = someOtherValue; }

return r;

It might not be the 100% fastest way to program (an extra assignment, which may slow things noticeably depending on the value which is returned) but I do find it more readable. When debugging a method or function's return value, it's easier to find the (single) exit point and work backwards from there. Find out what is being returned, find out how that value was generated, find the problem in that generation code.

The only exception (ha!) is an exception, which represents a deviation from the "normal situation".

brazzy · 2005-09-11 Reply Admin

johnl:
I actually agree with the "single exit point" idea - I don't like looking at a return statement that might never be executed under a normal situation. I prefer if (something is true) { r = someValue; } else { r = someOtherValue; } return r; It might not be the 100% fastest way to program (an extra assignment, which may slow things noticeably depending on the value which is returned) but I do find it more readable. When debugging a method or function's return value, it's easier to find the (single) exit point and work backwards from there.

An assignment costs next to nothing, independently of what is assigned, unless it does an implicity copy of the entire object (which it doesn't in a sane language).

As for the rest, I disagree 100%. Early returns are MUCH clearer. Anything else creates uncertainly and potential for bugs because the value might get changed again later, or there might be hidden side effects. Generally, local variables introduce additional state, and any additional state makes the program more complex and harder to understand. Therefore, the amount of state (and thus, local variables) should be minimized.

Quinnum · 2005-09-11 Reply Admin

brazzy:
johnl:
I actually agree with the "single exit point" idea - I don't like looking at a return statement that might never be executed under a normal situation. I prefer if (something is true) { r = someValue; } else { r = someOtherValue; } return r; It might not be the 100% fastest way to program (an extra assignment, which may slow things noticeably depending on the value which is returned) but I do find it more readable. When debugging a method or function's return value, it's easier to find the (single) exit point and work backwards from there.

An assignment costs next to nothing, independently of what is assigned, unless it does an implicity copy of the entire object (which it doesn't in a sane language).

As for the rest, I disagree 100%. Early returns are MUCH clearer. Anything else creates uncertainly and potential for bugs because the value might get changed again later, or there might be hidden side effects. Generally, local variables introduce additional state, and any additional state makes the program more complex and harder to understand. Therefore, the amount of state (and thus, local variables) should be minimized.

I agree. Early returns help readability a lot. The only argument I see from people for a single return point is to aid debugging, but.with today's IDEs, is it really that much of an issue to set a couple of extra breakpoints ???

joost · 2005-09-11 Reply Admin

Anonymous:
>> In x86 assembler MOV AX, 0 used one more byte than XOR AX, AX.
>> Therefore any coder worth his salt would always use XOR AX, AX
>> to put the zero into the AX register.

Hmmm, is this so? "I dont think so, Tim" - The second one (xor) does some more (if I recall correctly), it resets a flag-bit, too. So I personally try to be very careful when programmers do things that are not obvious in first sight.

Just my 0,02 cent, jerkWood

C:\>debug
-a
0CD3:0100 mov ax,0
0CD3:0103 xor ax,ax
0CD3:0105
-d 100 104
0CD3:0100 B8 00 00 31 C0 ...1.
-

What have we learned?
- Two debug.com commands, you sissies
- Opcode for "mov ax, 0" = B80000
- Opcode for "xor ax, ax" = 31C0
- Further research confirms that the XOR version also sets the CF and OF to 0, modifies SF, ZF and PF according to the XOR operation and leaves AF undefined. (http://pdos.csail.mit.edu/6.828/2004/readings/i386/appc.htm)

I can also confirm that any coder worth his salt uses the XOR version. It's a whole byte shorter, dammit! Also, any coder worth his salt won't destroy the flags if he needs em.

tSQL · 2005-09-11 Reply Admin

Well as we enter yet another WTF religious battle – which are as much fun as the initial WTF – I need to sound off as a maintenance programmer. I due wear that hat more and more these days. I have little respect for developers that “write and leave” that give NO respect for the maintenance programmer.

GOTO code is not cool, it is right there with no comments and declaring variables where ever and when ever a sudden urge for a new strTemp comes to mind. This forum puts a lot of energy into performance concerns. With this emphases a bContinue lover cannot win support. Your 200 line batch method with 20 exit points will perform faster than my 200 line method with 20 bContinue conditionals sections, because it can exit on the first exit point instead of executing my additional 19 if (bContinue=True).

The point I wish to explore is comparing that performance gain with lifetime maintenance of the code. If we concede for the moment that the performance difference is very minimal and the slower method is easier to maintain and read, which method should we implement? As a developer I think most would choose the faster but more cryptic method. Other developers that appreciate “the hand that feeds us” might acknowledge that business does not appreciate software maintenance. It costs them money and takes away from the time that could be spent developing new products.

For example, I was once faced with implementing some very cool bit arithmetic logic or a more standard array list. I choose the standard array list approach and it served me well. Months later business had an enhancement to this method, and I found I could hand this maintenance off to an eager intern who had the time to complete the request. He did so, and didn’t even ask me one question about the method.

So I ask … Tortoise or the Hare?

2005-09-11 Reply Admin

Unless assingning 'IsTrue' causes the function to return, it will be infinitely recursive. Otherwise, note how the guy uses 'False' directly to initialize ResultHasBeenSet at the begninning (which will prevent infinite recursion), but then does weird shit like 'IsTrue(1=0)' later (after the function has returned). Either way, it's obviously a joke.

2005-09-11 Reply Admin

Being a Delphi developer for so many years, I can only say 'Jed' is so brilliant [6] Just wonder did he get paid by the number of lines of codes he produced???

As for the goto statement, personally I try to avoid it by all means... just sometimes 'goto' itself can comes in handy... Last time I used it was trying to port (or translate to be exact) some FORTRAN routines, some typical FORTRAN 77 codes: all uppercase, short identifiers' name, arrays everywhere and LOTS of goto [^o)]

2005-09-12 Reply Admin

Maurits:
Alex Papadimoulis:
there should be 2^2 [4] possible outcomes

Glad he included the "4". Here I was thinking that 2^2 == 0.

Reminds me of the time I asked a delphi programmer if he could convert a string to an array of bytes so I could see what it contains (I was trying to figure out if he was passing my dll a unicode or ascii string) and his answer (after looking in the help) was: "I don't see how that will help you. Bytes are numeric."

ammoQ · 2005-09-12 Reply Admin

tSQL:
The point I wish to explore is comparing that performance gain with lifetime maintenance of the code. If we concede for the moment that the performance difference is very minimal and the slower method is easier to maintain and read, which method should we implement? As a developer I think most would choose the faster but more cryptic method. Other developers that appreciate “the hand that feeds us” might acknowledge that business does not appreciate software maintenance. It costs them money and takes away from the time that could be spent developing new products.

For example, I was once faced with implementing some very cool bit arithmetic logic or a more standard array list. I choose the standard array list approach and it served me well. Months later business had an enhancement to this method, and I found I could hand this maintenance off to an eager intern who had the time to complete the request. He did so, and didn’t even ask me one question about the method.

So I ask … Tortoise or the Hare?

I always choose readability over performance gain (except those extremely rare cases where the performance gain is a must), since many of my programs are used for many years, and in many cases I am the one who has to maintain them.

That said, i can hardly see how a bContinue could improve readability.

SysRq2000 · 2005-09-12 Reply Admin

stewie:

My favourite line:

 // count the number of true expressions (WARNING: COMPLEX CODE AHEAD!!)

By "complexity" Jed obviously means simple code with loads of WTFs in it! This WTF is imo on the same level as another "brillant" WTF posted on this site.

2005-09-12 Reply Admin

TrueCount:=(1-1); // get 'real zero'

WTF?? Everybody knows the real zero is 'Z'!

dhromed · 2005-09-12 Reply Admin

In the skirmish between Single Exit and Multiple/Early Exit, I am in the Multiple Exit camp.

But what about using return; to abort a function? Is that use or misuse? What other way of ending a function is there, besides an empty return;?

Essay questions to be on my desk in two days, 9:30 sharp. And remember: Those who are tardy do not get fruit cup.[sic]

tSQL · 2005-09-12 Reply Admin

ammoQ:

That said, i can hardly see how a bContinue could improve readability.

And that is the truly beautiful thing about readability - boil it down and it is personal preference and perhaps more importantly a shop preference. If I spent a year in a shop that allowed multiple exit points, I am sure I would find that code readable soon or later.

But adjusting to our environment isn't any forum fun, so I say - repent from your sinful ways you early exit'r <evil laugh>

2005-09-12 Reply Admin

You forgot:

function MostlyHarmless(Sender: TObject) : Boolean;

ammoQ · 2005-09-12 Reply Admin

tSQL:
And that is the truly beautiful thing about readability - boil it down and it is personal preference and perhaps more importantly a shop preference. If I spent a year in a shop that allowed multiple exit points, I am sure I would find that code readable soon or later.

True: Consistent style throughout the whole project is the most important thing to maintain readability - even if the style is somehow "ugly". That's why I dislike working a certain coworker, who has a notorous habit of breaking style - but we are not talking about "early exit/single exit", "uppercase vs. lowercase" or "how many spaces for indention" here. No, what he does is hardcore style breaking, like "incorporating incomprehensible object orientation into a structured (not oo) program". And when I say "incomprehensible", I mean it. Arbitrary classes lacking a recognizeable object model. No wonder most (maybe all) of his classes have exactly one instance - the sentence "an instance of this class represents..." cannot be finished, because his classes don't have such a meaning.

2005-09-12 Reply Admin

Could, by any chance, that person used to work in a company, named F..., doing billing systems for telecom operators?

emptyset · 2005-09-12 Reply Admin

dubwai:
Is it possible that neck-beards are going to come back into style? I don't have much facial hair but my neck is like a forest.

dude, i'm banking on it!

OMG · 2005-09-12 Reply Admin

Anonymous:
Ohh my f***ing god !

I just love this part:!
// need to set the result if it hasn't been set yet, we'll randomly decide true or false
// because there should be no bias.

Wow, a whole new form of Fuzzy Logic!!!

emptyset · 2005-09-12 Reply Admin

dhromed:

What other way of ending a function is there, besides an empty return;?

)

ammoQ · 2005-09-12 Reply Admin

Anonymous:
Could, by any chance, that person used to work in a company, named F..., doing billing systems for telecom operators?

AFAIK not. But he doesn't work much for us any more. Anyway, I'm sure there's a lot of his kind.

2005-09-12 Reply Admin

This is in regards to Jed being easily distracted and being off-task. The fact that he decided to spend some time posting on a newsgroup doesn't necessarily have a reflection on how much effective time he is spending on a project. It seems entirely possible that he spent his free time posting, or, as many people do, needed a mental distraction in order to better concentrate when on the task. I happen to do both and have been accused of being "off-task" when all i really needed was to take a mental break and was doing so in my own time after work.

kipthegreat · 2005-09-12 Reply Admin

Alex Papadimoulis:
Jed also had a tendency to get off-task a bit (a failing for which he was eventually canned). During one week of our rushed development schedule he managed to hold a position in the top ten most prolific posters in the borland.offtopic newsgroups.

Hope my boss doesn't read dailywtf... ;)

Mike R · 2005-09-12 Reply Admin

Anonymous:
Zorlen:
dubwai:

And lo, the lord said unto the flock, "let thy code have one exit point and only one exit point only. Any exit point beyond the one true exit point shall be called 'goto'"

[/blockquote]

Like I said: QED.

Still amazes me when people buy into this. Hopefully your followers are dwindling.

If you worked for me, I'd fire you. Immediately. That kind of cavalier attitude toward proven, sound programming principles has no place in a commercial development environment.

I'd fire you immediately for encouraging the "best practices are rules written in stone" mentality.

But, thats just me.

2005-09-12 Reply Admin

// count the number of true expressions (WARNING: COMPLEX CODE AHEAD!!)

I have to start adding that to all my most f'd up code...

tSQL · 2005-09-12 Reply Admin

ammoQ:
tSQL:

And that is the truly beautiful thing about readability - boil it down and it is personal preference and perhaps more importantly a shop preference. If I spent a year in a shop that allowed multiple exit points, I am sure I would find that code readable soon or later.

True: Consistent style throughout the whole project is the most important thing to maintain readability - even if the style is somehow "ugly". That's why I dislike working a certain coworker, who has a notorous habit of breaking style - but we are not talking about "early exit/single exit", "uppercase vs. lowercase" or "how many spaces for indention" here. No, what he does is hardcore style breaking, like "incorporating incomprehensible object orientation into a structured (not oo) program". And when I say "incomprehensible", I mean it. Arbitrary classes lacking a recognizeable object model. No wonder most (maybe all) of his classes have exactly one instance - the sentence "an instance of this class represents..." cannot be finished, because his classes don't have such a meaning.

Dead On. One of the best coding shops I've worked practiced consistency as a religion. I could walk into any method and never skip a beat, I couldn't even tell who wrote it. All the developers were in such sync, even variable names were reused. Management and process control were another story, sometimes even good code isn't worth staying for.

Mike R · 2005-09-12 Reply Admin

Mike R:

I'd fire you immediately for encouraging the "best practices are rules written in stone" mentality.

But, thats just me.

Having said that, in using early exits, it depends. If theres no risk of leaving resources strewn about, I tend torward the early return camp.

brazzy · 2005-09-12 Reply Admin

tSQL:
ammoQ:

That said, i can hardly see how a bContinue could improve readability.

And that is the truly beautiful thing about readability - boil it down and it is personal preference and perhaps more importantly a shop preference. If I spent a year in a shop that allowed multiple exit points, I am sure I would find that code readable soon or later.

I think there are (at least) two aspects to code readability, One is following conventions, things like formatting, (in Java) variable names beginning with a lowercase letter and class names with an uppercase letter, or using common idioms. In these things, consistence is the most important thing. But there certainly are also objective criteria. Meaningful variable names increase code readability no matter whether you're used to them or not.

And as I said above, I think allowing multiple exit points makes for objectively more readable code because it reduces the amount of state you have to keep in mind while retracing the code.

christoofar · 2005-09-12 Reply Admin

Anonymous:

>> TrueCount:=(1-1); // get 'real zero'

John Smallberries and I came across an interesting WTF with (surprise) Microsoft's own implementation of ADO.NET with respect to the SQL Server client.

Let's suppose you have a T-SQL stored procedure that accepts a single parameter of type Integer. You would think the simplest overload of the SqlParameter ctor() to provide a value, the (string Name, object Value) version, when passed zero (0) for the Value would cause the SqlParameter the 2nd parameter to the Integer datatype.

But, it doesn't! Apparently this also maps to the SqlDbType enum 0 value type (however, it mapped to Int for value when I used a 1, so it's only doing this in the case of zero).

The workaround for this is obviously, don't pass a constant of zero to this parameter... either box it or cast zero to an object. Calling Convert.ToInt32() works as well but introduces an extra line of IL to call the static method on the Convert class.

Observe the IL:

//INCORRECT!

new SqlParameter("@transaction_id", 0)

IL_0055: newobj instance void [System.Data]System.Data.SqlClient.SqlParameter::.ctor(string,
valuetype [System.Data]System.Data.SqlDbType)

//cast to object

new SqlParameter("@transaction_id", (object)0)

IL_0055: box [mscorlib]System.Int32
 IL_005a: newobj instance void [System.Data]System.Data.SqlClient.SqlParameter::.ctor(string,
 object)

//try the f'ed up way of doing same cast

new SqlParameter("@transaction_id", 1-1)
 IL_0055: box [mscorlib]System.Int32
 IL_005a: newobj instance void [System.Data]System.Data.SqlClient.SqlParameter::.ctor(string,
 object)

//explicitly call convert

new SqlParameter("@transaction_id", Convert.ToInt32(0))

IL_0055: call int32 [mscorlib]System.Convert::ToInt32(int32)
 IL_005a: box [mscorlib]System.Int32
 IL_005f: newobj instance void [System.Data]System.Data.SqlClient.SqlParameter::.ctor(string,
 object)
//

christoofar · 2005-09-12 Reply Admin

brazzy:
tSQL:
ammoQ:

That said, i can hardly see how a bContinue could improve readability.

And that is the truly beautiful thing about readability - boil it down and it is personal preference and perhaps more importantly a shop preference. If I spent a year in a shop that allowed multiple exit points, I am sure I would find that code readable soon or later.

I think there are (at least) two aspects to code readability, One is following conventions, things like formatting, (in Java) variable names beginning with a lowercase letter and class names with an uppercase letter, or using common idioms. In these things, consistence is the most important thing. But there certainly are also objective criteria. Meaningful variable names increase code readability no matter whether you're used to them or not.

And as I said above, I think allowing multiple exit points makes for objectively more readable code because it reduces the amount of state you have to keep in mind while retracing the code.

Hear hear. The "one exit only" per method mantra is just so silly. That's only useful in non-OO languages such as C where you spend most of your life deciphering symbols and managing state and all the moving parts associated with such.

In OO languages, you should really stop and think once you've crossed over 200 lines of implementation code within a method:

Is my design flawed?
Objects should be tracking state, so why does my method have so much state checking littered about? (maybe that work should be shifted)
Have I sufficiently put the right layers of abstraction in my classes so that my methods aren't overwhelmed with implementation minutae?

Typically, the design pattern a lot of procedural programmers get themselves into when working in an OO language is a lightweight set of classes with one giant God object that does most of the work... and boy is it a pain to fix. :-(

KenW · 2005-09-12 Reply Admin

Anonymous:
Maurits:
Alex Papadimoulis:
there should be 2^2 [4] possible outcomes

Glad he included the "4". Here I was thinking that 2^2 == 0.
Reminds me of the time I asked a delphi programmer if he could convert a string to an array of bytes so I could see what it contains (I was trying to figure out if he was passing my dll a unicode or ascii string) and his answer (after looking in the help) was: "I don't see how that will help you. Bytes are numeric."

Not much of a Delphi programmer, if you ask me. Converting a string to a byte array is quite simple.

2005-09-12 Reply Admin

Great stuff! I wish I could pull a stunt like this. I'm just too damn nice.

dubwai · 2005-09-12 Reply Admin

Single exit makes a lot of sense when you don't have a langauage with exceptions (try/finally, try/catch/finally) and don't want to or can't use gotos.

Even if there is no cleanup in the method with initially written, it may be needed later. It it risky to rewrite the entire method.

In a language that allows exceptions and try/finally blocks, it doesn't make sense to force methods to artifically end at a single point. There's no functional difference and it makes it hard to tell what a method does after a certain condition. If it returns, you know it ends there. If it continues, a lot of other things could occur. You can't know without checking. Also, it opens up a possiblity for bugs. A deleted bContinue or an added OR condition could cause the code to behave very differently. It's much easier to know that a certain branch verifiably ends when it hits the return.

Also, if you can't see the entire method on the screen in a normal font size/ resolution, it's probably too long.

emptyset · 2005-09-12 Reply Admin

dubwai:
Also, if you can't see the entire method on the screen in a normal font size/ resolution, it's probably too long.

all functions should fit into neat, 80x24 blocks, for readability and faster compilation times.

for example, change this:

for (i = 0; i < POS; i++)
{
bzero(wing[i]) ;
}

into this:

for(i=0;i<POS;i++){bzero(wing[i]);}

the speed of compilation is inversely proportional to the amount of whitespace in the program. since standard UNIX terminals are 80x24, compilers drop the program into an const char *terminal[80][24] array. each time whitespace is encountered, the corresponding block of memory is free'd. then, as you walk across the array, catch all the SIGSEGV signals that are raised. this is how the tokens are collected prior to the parsing phrase, by keeping track of the index at the last SIGSEGV and then looking at the current index at the present SIGSEGV. so, for each block of text in your program that exceeds 80x24, the compiler has to re-read the file, load into the array, and find the tokens. it also has to go back and realloc the whitespace blocks it free'd! exceeding 80 columns is bad because then the compiler has to wrap the text around into the next row. keeping your files a maximum of 80x24 characters and without whitespace minimizes file access, the number of SIGSEGV encountered, and utilizes the environment's pre-allocated memory.

ammoQ · 2005-09-12 Reply Admin

dubwai:
Also, if you can't see the entire method on the screen in a normal font size/ resolution, it's probably too long.

It's not always that easy. Imagine a method that has to cut a long fixed-format string into pieces, where "pieces" might mean: 150 fields.

SomeObject string2someObject(String s) {
  SomeObject so = new SomeObject();
  so.someField = s.substring(0,3);
  so.anotherField = Integer.parseInt(s.substring(3,7));
  ...
  return so;
}

The method is 154 LOC long, but creating 5 sub-methods, each handling 30 fields, would be arbitrary and not improve readability at all. Sometimes a job requires more lines than fit the screen.

dubwai · 2005-09-12 Reply Admin

ammoQ:
dubwai:

Also, if you can't see the entire method on the screen in a normal font size/ resolution, it's probably too long.

It's not always that easy. Imagine a method that has to cut a long fixed-format string into pieces, where "pieces" might mean: 150 fields.
SomeObject string2someObject(String s) {
  SomeObject so = new SomeObject();
  so.someField = s.substring(0,3);
  so.anotherField = Integer.parseInt(s.substring(3,7));
  ...
  return so;
}
The method is 154 LOC long, but creating 5 sub-methods, each handling 30 fields, would be arbitrary and not improve readability at all. Sometimes a job requires more lines than fit the screen.

Well, I would be tempted to make one or utility fuction that would reduce that to a series of single method calls but that's not really here or there.

Key word is 'probably'. It's a rule of thumb. I've seen some really nasty code come from rules like: you must indent 8 spaces and the lines must be no longer than 80 chars. All I'm saying is that most methods should be easily viewable without scrolling in a typical IDE. Some (trivial) methods are going to be longer, of course. But it's helpful to others to keep it in line. Also, not overusing whitespace can help a lot. Some people put a blank line (or two or three) between each statement. This is excessive. I use blank lines to group sections of the method into logical groups. I find it to be a better use of real-estate.

2005-09-12 Reply Admin

well the code is funny :D
I don't know if someon can be that stupid, but I suppose not.
So the guy was probably joking.
Anyway he cheked true, false but... what about the maybe?
This is a serious matter, just take a loo at
http://www.boost.org/doc/html/tribool.html
http://www.boost.org/doc/html/tribool/tutorial.html#id1567990
It's kind of fun :D
Actually boost is a very interesting set of libs.
chears

2005-09-12 Reply Admin

ammoQ:
Imagine a method that has to cut a long fixed-format string into pieces, where "pieces" might mean: 150 fields.
SomeObject string2someObject(String s) {
  SomeObject so = new SomeObject();
  so.someField = s.substring(0,3);
  so.anotherField = Integer.parseInt(s.substring(3,7));
  ...
  return so;
}
The method is 154 LOC long, but creating 5 sub-methods, each handling 30 fields, would be arbitrary and not improve readability at all. Sometimes a job requires more lines than fit the screen.

You shouldn't be putting the field offsets in your code like that. Define a schema file with the field definitions in it and then load that and use it to process the data. Or better yet, don't reinvent the wheel, use an existing tool to process the data. It's not like nobody has ever had to solve that particular problem before (of course, if all you can find are 80 inch wheels and you need a 20 inch wheel, feel free). I know, its just an example, but the point is, if you are writing code like that, don't be surprised when your coworkers post it on some website that ridicules bad code.

ammoQ · 2005-09-13 Reply Admin

David Knaack:

You shouldn't be putting the field offsets in your code like that.

The code of this particular method might be generated e.g. from an excel sheet. But the topic of my post was not: what is the best way to break a long string into pieces.
In real life, I would probably not use substring() directly, but write a specialised class:

SomeObject string2someObject(String s) {
  SomeObject so = new SomeObject();
  StringBreaker sts = new StringBreaker(s);
  so.someField = sts.getString(3);
  so.anotherField = sts.getInteger(4);
  ...
 return so;
}

But that's still 155 lines of code.

Define a schema file with the field definitions in it and then load that and use it to process the data. Or better yet, don't reinvent the wheel, use an existing tool to process the data. It's not like nobody has ever had to solve that particular problem before (of course, if all you can find are 80 inch wheels and you need a 20 inch wheel, feel free). I know, its just an example, but the point is, if you are writing code like that, don't be surprised when your coworkers post it on some website that ridicules bad code.

IMO sometimes it's much better to write a straight-forward 150 LOC method that simply does what it has to. Using external tools for every single little task means a lot of dependencies and you might end debugging that tools, too. The performance of an external tool might be noticably worse than the straight-forward way. While I can expect my coworkers to be able to read and understand plain code, I cannot expect that they know these tools.

tSQL · 2005-09-13 Reply Admin

ammoQ:
David Knaack:

You shouldn't be putting the field offsets in your code like that.

The code of this particular method might be generated e.g. from an excel sheet. But the topic of my post was not: what is the best way to break a long string into pieces.
In real life, I would probably not use substring() directly, but write a specialised class:
SomeObject string2someObject(String s) {
  SomeObject so = new SomeObject();
  StringBreaker sts = new StringBreaker(s);
  so.someField = sts.getString(3);
  so.anotherField = sts.getInteger(4);
  ...
 return so;
}
But that's still 155 lines of code.

Define a schema file with the field definitions in it and then load that and use it to process the data. Or better yet, don't reinvent the wheel, use an existing tool to process the data. It's not like nobody has ever had to solve that particular problem before (of course, if all you can find are 80 inch wheels and you need a 20 inch wheel, feel free). I know, its just an example, but the point is, if you are writing code like that, don't be surprised when your coworkers post it on some website that ridicules bad code.

IMO sometimes it's much better to write a straight-forward 150 LOC method that simply does what it has to. Using external tools for every single little task means a lot of dependencies and you might end debugging that tools, too. The performance of an external tool might be noticably worse than the straight-forward way. While I can expect my coworkers to be able to read and understand plain code, I cannot expect that they know these tools.

I've got your back ammoQ. I do a fair amount of batch type data loads. The data load is 150 steps, which are 150 individual lines of code. There isn't much of anything that can be done. I saw one attempt to make 150 one line functions and a load functin calling 150 functions, but nothing was gained - althought it was funny.

Sometimes it just takes 150 lines, but i'll be sure to watch my white space next time

2005-09-13 Reply Admin

Nah, that's Jeremy North. The guy we're talking about here is Jed Mattson.

2005-09-13 Reply Admin

Anonymous:
I hope it wasn't this JED http://www.jed-software.com/

He contributes a lot to the Delphi community.

Oops !

Nah, that's Jeremy North. The guy we're talking about here is Jed Mattson.

joost · 2005-09-13 Reply Admin

ammoQ:

SomeObject string2someObject(String s) {

  SomeObject so = new SomeObject();
  StringBreaker sts = new StringBreaker(s);
  so.someField = sts.getString(3);
  so.anotherField = sts.getInteger(4);
  ...
 return so;
}

But that's still 155 lines of code.

I'm too lazy to spell it all out right now, but using a couple of lists and some introspection, you can do:

import java.lang.reflect;

const int FT_INT = 1, FT_STR = 2;
String[] fieldNames = { 'someField', 'anotherField' };
int[] fieldTypes = { FT_STR, FT_INT };
int[] fieldOffsets = { 3, 4 };

SomeObject string2SomeObject(String s) {
SomeObject so = new SomeObject();
StringBreaker sts = new StringBreaker(s);
Field f = so.getField(fieldNames[i]);
switch(fieldTypes[i]): {
    case FT_INT:
      Field f = so.getField(fieldNames[i]);
      f.setInt(sts.getInteger(fieldOffsets[i]));
      break;
....
}
return so;
}

This is probably what the other guy meant with 'use a schema'. If you don't like the parallel arrays (error-prone), curse Java for not having Tuples and create a light-weight class that has one string and two ints as public fields, and make sure you give it a short name:

FieldInfo[] fieldInfo = {
new FieldInfo('someField', FT_INT, 3),
new FieldInfo('anotherField', FT_STR, 4)
}

This reduces 155 lines of code to ~20 lines of code 155 lines of data. They taught me it's better to have smart data structures and stupid code than to have stupid data structures and smart code.

joost · 2005-09-13 Reply Admin

Of course I forgot to put a for loop around the "Field f" assignment and the "switch" statement that assigns to "int i". But that's fairly obvious.

joost · 2005-09-13 Reply Admin

And the "Field f" assignment should only be outside the "switch" statement, but I blame that on this edit box's copy/paste WTF-ery.

ammoQ · 2005-09-13 Reply Admin

In my eyes,

so.someField = sts.getString(3);

and

new FieldInfo('someField', FT_INT, 3),

are equally complex, so IMO there is no gain in readablity; on the other hand, the need for introspection costs some performance and you have less checks during compiling, more during runtime. Anyway, it's likely that there is some spreadsheet where the record structure is documented so I try to would create either version out of this spreadsheet.

Happy (Belated) Jed Day!

Leave a comment on “Happy (Belated) Jed Day!”