Comment On Self Documenting

"A little while back, someone introduced the concept of 'self-documenting' code to our team," writes Ryan L. "It was certainly a step forward, but it's somehow taken us two steps backwards. Consider, for example, the following code from an MVC controller."

Re: Self Documenting

2012-05-02 15:20 • by Matt Westwood
380257 in reply to 380245
Some Damn Yank:
Rodd:
Matt Westwood:
You're doing it wrong. A phone number should be defined by a class in its own right with its own internal validation. If you're holding it as a string and banking on the class you're holding it in validating it, you've got it backarsewards.


Obviously. And inside that class, you store a PhoneNumberDigitArray class containing 9 instances of the PhoneNumberDigit class.
You're all wrong. Assuming the US telephone system developed by TPC (The Phone Company, for those of you who somehow missed "The President's Analyst"), what the public refers to as a "phone number" is actually a concatenation of three numbers: the Area Code (3 digits), the Central Office (exchange) code (3 digits), and the Subscriber Number (four digits).

So you need an AreaCode class, a CentralOffice class, and a SubscriberNumber class to go along with your PhoneNumberDigit class.


Only if you've allowed a subclass of the PhoneNumber class to be AmericanPhoneNumber ... hang on, that can't be the correct approach - where's me gang-of-four, I feel a Pattern coming on ...

Re: Self Documenting

2012-05-02 15:30 • by Some Damn Yank (unregistered)
380258 in reply to 380245
Some Damn Yank:
So you need an AreaCode class, a CentralOffice class, and a SubscriberNumber class to go along with your PhoneNumberDigit class.
From a practical standpoint this actually isn't a bad idea. For example, no value in the AreaCode and CentralOffice classes should be of the form N11 as those are reserved for 411, 911, etc. Also, the allowable values in the AreaCode class are well-known, as new area codes are introduced in a controlled fashion by a central authority, much like state/province codes and zip/postal codes, so ideally you'd make the user select from a list, or at least validate what they type. Then there's the whole country code business, too, which is also a well-known fixed set, so if you're doing international phone numbers you need a CountryCode class.

There's no reason for the PhoneNumber class to know how to validate all these parts. When you're looking at the PhoneNumber class you can easily see what it does without getting into the details of validating an area code, and if you want to see that you just look at the AreaCode class.
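A rough sketch of that split, for what it's worth. The class names follow the ones in this thread; the helper `_check_three_digit_code`, the use of frozen dataclasses, and the output format are assumptions made only for illustration. The list of assigned area codes is deliberately left out, so this checks only the digit count and the N11 rule mentioned above.

```python
import re
from dataclasses import dataclass


def _check_three_digit_code(value, label):
    """Hypothetical shared check: exactly three digits, and not of the form N11."""
    if not re.fullmatch(r"[0-9]{3}", value):
        raise ValueError(f"{label} must be exactly 3 digits: {value!r}")
    if value.endswith("11"):
        raise ValueError(f"{label} may not be of the form N11: {value!r}")


@dataclass(frozen=True)
class AreaCode:
    value: str

    def __post_init__(self):
        _check_three_digit_code(self.value, "area code")


@dataclass(frozen=True)
class CentralOffice:
    value: str

    def __post_init__(self):
        _check_three_digit_code(self.value, "central office code")


@dataclass(frozen=True)
class SubscriberNumber:
    value: str

    def __post_init__(self):
        if not re.fullmatch(r"[0-9]{4}", self.value):
            raise ValueError(f"subscriber number must be exactly 4 digits: {self.value!r}")


@dataclass(frozen=True)
class PhoneNumber:
    # PhoneNumber only composes the parts; each part validates itself.
    area_code: AreaCode
    central_office: CentralOffice
    subscriber: SubscriberNumber

    def __str__(self):
        return f"({self.area_code.value}) {self.central_office.value}-{self.subscriber.value}"


number = PhoneNumber(AreaCode("212"), CentralOffice("555"), SubscriberNumber("0123"))
formatted = str(number)  # (212) 555-0123
```

With this layout, `AreaCode("411")` raises `ValueError` before a `PhoneNumber` is ever built, which is the point being made: the outer class never needs to know the rules for its parts.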

Re: Self Documenting

2012-05-02 15:36 • by Some Damn Yank (unregistered)
380259 in reply to 380257
Matt Westwood:
Only if you've allowed a subclass of the PhoneNumber class to be AmericanPhoneNumber
That's NorthAmericanPhoneNumber - per Wikipedia, the United States and its territories, Canada, Bermuda, and 17 nations of the Caribbean use that format.

Re: Self Documenting

2012-05-02 15:42 • by Ello (unregistered)
ThisFunctionReturnsTrueIfInputParameterArgumentIsOneAndFalseIfInputParameterArgumentIsZero(int InputParameterArgument)
{}

Re: Self Documenting

2012-05-02 16:38 • by The Mr. T. Experience (unregistered)
380261 in reply to 380257
Your mom had to call Maury because she was quite the gang-of-four whore. The worst thing is you still don't know who's your daddy, and you got to see four dudes dance a celebratory jig.

Re: Self Documenting

2012-05-02 18:12 • by gratuitous_arp (unregistered)
380262 in reply to 380247
Your Name:
gratuitous_arp:
Your Name:
KattMan:
CAPTCHA:sino:

Sorry that was an offtopic rant. Also I hate setter and getter functions that only consist of "this.value = value" or "return this.value". Anyone who creates a language where they are necessary is a dickweed arsehole.


Look into encapsulation and the "law" of Demeter and you will know why these simple get/set methods exist. It's not the language that requires them; you can make your variables public, but then the thing that owns them has no control over when and how they are updated.


No, this is an artifact of the languages you use, and you're so used to it that you can't imagine it any other way. Look at the way Python, for instance, implements "properties". You have a public variable, and then later, if you need access to that public variable to go through a method, you can change it to do that transparently, so the code that uses that class doesn't need to change.


As KattMan said, you can do the same thing in C++ using a public variable and later writing a get function for it if desired. People don't do this because it's bad design -- in any language -- for both of the reasons given by KattMan. As you know, in Python, there is an idiom to denote properties that should not be accessed directly by adding a leading underscore to the property's name. This access control "suggestion" is used to provide the private membership given by C++ (though not enforcing it) while retaining the language's flexibility. In PEP-8, Guido gives further examples of why this is a good idea. See: http://www.python.org/dev/peps/pep-0008/#designing-for-inheritance

If you're writing small Python scripts it doesn't really matter (as long as you *know* they'll stay small). If you're writing large programs, it does matter.

Only in C++ and the languages derived from it (e.g. Java) do you see everyone "defensively" creating getters and setters for everything right off the bat.


You're talking about defensive programming, something that is hopefully common in code written in any language. Setters are just as important in Python as they are in C++ if you want any kind of assurance that anything valid is getting passed to your object. Getters hide implementation details. Other than taking some extra time to write, what specifically do you find undesirable about get/set functions?


No, you're still not getting it.

Look, in C++ or Java, you have a class and some information encapsulated in it.

class foo
{
    //...
public:
    int bar;
};

Now, code uses objects of this class and accesses the variable foo.bar directly. It's all good until you decide that accesses to foo.bar need to do something else -- maybe there's some internal counter you want to increment or you want to validate or log or you want to calculate bar on the fly or whatever. Now all the code that USES class foo needs to be rewritten to call foo.getBar() and foo.setBar(int newBar). After getting bitten by this once, from then on you always make all member variables private and waste time (both writing and compiling) and code size (both source and, if your optimizing compiler isn't sufficiently good, binary size) on getters and setters that might not actually be necessary (e.g. if in the lifespan of your project they never do anything but act as passthroughs for the private member variable, you did that work for nothing).

In a language with properties (or whatever the feature ends up getting called), this changeover is transparent to the code using class foo. So, you start with a class, as before, with a public member variable.

class foo:
    def __init__(self):
        self.bar = 0


Other code accesses bar directly.


obj = foo()
obj.bar = 10


Later, you change your mind and want foo.bar to go through a getter or whatever. So you implement it as a property in the foo class definition.


class foo(object):  # new-style class; property setters need this on Python 2
    def __init__(self):
        self._private_bar = 0
    def set_bar(self, bar):
        self._private_bar = bar
        #insert side effects here
    def get_bar(self):
        #insert side effects here
        return self._private_bar

    bar = property(get_bar, set_bar)  # getter first, then setter


That last line there is key. Now, the code USING the foo class, the stuff you already wrote, doesn't need to change. You don't even need to recompile it if all you have is the compiled bytecode! All those calls to directly access foo.bar are transparently converted into function calls to the property. (The syntax is ugly by Python standards but that still puts it head and shoulders above C++.) Because of this, you can have publicly accessible member variables and only write getters and setters when you actually need them, rather than write a bunch of pointless ones right off the bat just in case they become necessary later. (As a useful bonus, you can tell at a glance if some Python code you're looking at was written by someone who doesn't really know the language, because it will have trivial getters and setters everywhere.)

Do you get it yet?


Ah, I thought you were using "properties" as a general description for class attributes -- I wasn't aware that Python had a property keyword. I still have a hard time understanding the disagreement expressed at having some unneeded but trivial get/set functions, and how this is worse than having code use multiple ways of accessing a member variable in the same way. The difference between obj.setVar() and obj.var isn't clear without looking through the class file. The "property" keyword seems like a get out of jail free card -- it shouldn't be part of the plan, but it's very handy when you need it.

Maybe as I think about it more, I'll see the light that you see. In any case, thank you for taking the time to explain.

Re: Self Documenting

2012-05-03 04:02 • by me (unregistered)
380286 in reply to 380080
What is "frist", and which purpose did your post serve?

Re: Self Documenting

2012-05-03 05:25 • by Sam (unregistered)
380291 in reply to 380140
It scares me that I had to get half way down page two of the comments before reading a comment by someone with a brain.

Re: Self Documenting

2012-05-03 06:46 • by eric (unregistered)
380295 in reply to 380084
Perhaps he expected validation checking to be more complicated, but it didn't end up that way.

Re: Self Documenting

2012-05-03 08:07 • by Hmmmm (unregistered)
380301 in reply to 380295
eric:
Perhaps he expected validation checking to be more complicated, but it didn't end up that way.

I'm not sure if you meant to reply to my comment as I don't see the connection and I assume you haven't read all the comments yet but if you have then can you explain how the complexity of the required validation checking has any bearing on why he felt it necessary to write the bool wrapper functions when they are only used after an explicit, fairly well named, test, e.g.

if (NoActivityTypesAreAttachedToThisEvent(activityTypes))
    return TrueBecauseThereAreNoActivityTypesToFilterOnThisEvent();

The Because... stuff is implicit in the fact that it is being done as a result of the test.

The only thing that could possibly induce me to change my opinion about this code is if the original author gave a different reason (and I would still find it very hard to believe).

Re: Self Documenting

2012-05-03 08:30 • by clive (unregistered)
380303 in reply to 380250
Jay:
clive:
If you can't rename your function when you change what it does, you've got serious problems with your codebase or processes anyway.


Well, if the function is only used within one project, and only used in standard ways, then at least some IDEs -- Eclipse, for example -- can make renaming it pretty easy. I've only used a handful of IDEs so I can't say if this is a common feature.

In any case, if a function is used across many projects, e.g. it's part of a library, then this might not be quite so simple: You'd have to identify every project that uses it and update all of them.

If you use reflection to access functions, renaming it may not be trivial. Where does the system get the name from? A simple hard-coded string? An XML file? A database record? You'd have to investigate all these possibilities and more. Even if this particular function is never accessed via reflection, you would have to know that or be able to prove that.

If it's part of a published API used by many people outside your organization, then changing a function name should be a step taken with extreme caution if you don't want to have lots of unhappy clients.


And in that case, changing what the function does so it no longer agrees with the name should also be done with a similar amount of caution.

Adding some logging code to a "GetTransaction" method is fine, adding some validation is probably fine, and neither requires a name change. If you're adding something which requires a name change, you're back in 'serious problems' territory.

Re: Self Documenting

2012-05-03 12:24 • by Lurch (unregistered)
380325 in reply to 380251
Re: Self Documenting
2012-05-02 13:40 • by Jay

Re: Self Documenting:

Rodd:

Matt Westwood:
You're doing it wrong. A phone number should be defined by a class in its own right with its own internal validation. If you're holding it as a string and banking on the class you're holding it in validating it, you've got it backarsewards.



Obviously. And inside that class, you store a PhoneNumberDigitArray class containing 9 instances of the PhoneNumberDigit class.



For North American phone numbers, ten instances would be more effective - but good point anyway.



That's why the PhoneNumber class needs to be subclassed into AmericanPhoneNumber, BritishPhoneNumber, RussianPhoneNumber, etc.


===========================

NO NO NO

What about Skype?

You need a VoiceCommunicationsFactoryFactoryFactory returning an InternationalPhoneNumberFactoryFactory returning a PhoneNumberFactory, returning a PhoneNumber object.

Yeah, that's a good start.

Re: Self Documenting

2012-05-03 13:02 • by BillClintonIsTheMan (unregistered)
Three things could be at play here

1) idiocy
2) The developer thought the evaluations would need to be more than they actually turned out to be (Kind of YAGNI - see #1)
3) There is some injected tracing/diagnostic code that gets put into the product at a later state. AOP or something like that.

But it's probably #1.

Re: Self Documenting

2012-05-03 13:03 • by AY (unregistered)
380329 in reply to 380094
What does a 50-character identifier have to do with memory? Much less performance?

Compiler theory indeed....

Re: Self Documenting

2012-05-03 18:26 • by Joe (unregistered)
380346 in reply to 380106
I believe it originated on Slashdot. At least, that's the first place *I* saw it, although it may have evolved elsewhere, or evolved multiple times in parallel.

Whenever a new article would go up, some lamer would try to claim "First post!" in the comments section, without adding anything to the conversation. On Slashdot, I believe they started moving such comments randomly down the page, so the next step naturally was to mis-spell it, leading to "frist post" or "frost pist" or any other tortured locutions.

On FARK, the phrase "first post" gets turned into "boobies", and "last post" gets turned into "wieners", if I recall correctly. Its filter was inspired by similar problems.

Re: Self Documenting

2012-05-03 18:30 • by Joe (unregistered)
380347 in reply to 380329
What does a 50-character identifier have to do with memory? Much less performance?

Compiler theory indeed....


For interpreted code (remember BASIC?), it ate up your variable store. Depending on your flavor of BASIC, it could also lead to lower program speeds for variable lookup.

For compiled languages, shorter variables and terser comments lead to smaller source files and therefore faster floppy disk access. (Remember those?) I'm always surprised how large my source files are, and then I remember I use spaces instead of tabs, have full-paragraph comments when needed, and write things out long-hand when it helps clarity.

When I look at a floppy's worth of source code from 30 years ago, I think to myself "wow, that's terse." And then I realize "But if it were less terse, the source probably wouldn't have fit on that floppy alongside the executable."

Re: Self Documenting

2012-05-03 18:38 • by Joe (unregistered)
380349 in reply to 380190
The length of a variable name is a particularly silly thing to criticize on the basis of performance, as it will be tokenized almost immediately by the compiler or interpreter. The only machine resource it wastes is developers' disk space.


That's true these days. There was a time not that long ago when such a loquacious program might not fit cleanly in the system memory, though, or would fill your floppy prematurely. I have written compiled programs on an Apple ][, where the computer had 128K and the floppy was only 140K.

One valid criticism against such verbosity is that my ability to read and discriminate between strings falls off quickly with length. If faced with two long identifiers that differ only in one word in the middle, it's sometimes difficult to keep them straight.

Re: Self Documenting

2012-05-03 21:19 • by Luiz Felipe (unregistered)
380353 in reply to 380127
me:
I remember when Engineers had to have degrees and built tangible things.

But we build intangible things, that is the problem.

Re: Self Documenting

2012-05-04 02:18 • by Kuba
380357 in reply to 380258
Some Damn Yank:
Some Damn Yank:
So you need an AreaCode class, a CentralOffice class, and a SubscriberNumber class to go along with your PhoneNumberDigit class.
From a practical standpoint this actually isn't a bad idea.
And hey, it's a built-in time bomb, too. If the customer doesn't update, and new area codes get introduced: poof -- they can't handle customer phones in those.

Re: Self Documenting

2012-05-04 02:45 • by Kuba
380358 in reply to 380262
gratuitous_arp:
Your Name:
[...]
Later, you change your mind and want foo.bar to go through a getter or whatever. So you implement it as a property in the foo class definition.


class foo(object):  # new-style class; property setters need this on Python 2
    def __init__(self):
        self._private_bar = 0
    def set_bar(self, bar):
        self._private_bar = bar
        #insert side effects here
    def get_bar(self):
        #insert side effects here
        return self._private_bar

    bar = property(get_bar, set_bar)  # getter first, then setter


That last line there is key. Now, the code USING the foo class, the stuff you already wrote, doesn't need to change. You don't even need to recompile it if all you have is the compiled bytecode! All those calls to directly access foo.bar are transparently converted into function calls to the property. (The syntax is ugly by Python standards but that still puts it head and shoulders above C++.) Because of this, you can have publicly accessible member variables and only write getters and setters when you actually need them, rather than write a bunch of pointless ones right off the bat just in case they become necessary later. (As a useful bonus, you can tell at a glance if some Python code you're looking at was written by someone who doesn't really know the language, because it will have trivial getters and setters everywhere.)

Do you get it yet?
Ah, I thought you were using "properties" as a general description for class attributes -- I wasn't aware that Python had a property keyword.
I think you still don't get it, though! Python properties are completely generic attributes of a class, so you thought right and it doesn't make the property system somehow impossible, not at all. All that is special about the properties is that they are bound to an interesting object. The property "keyword" is a plain old built-in library function. It returns a customized property object, that's all. Said object has getter, setter and deleter methods that invoke the relevant functions passed as arguments in the property() call. AFAIK you can write your own reimplementation of the built-in property(), there's nothing magic about it.
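To make the "nothing magic" claim concrete, here is a minimal sketch of such a reimplementation using the descriptor protocol (`__get__` / `__set__`). The name `simple_property` is made up for this example, it skips the deleter and docstring handling that the real built-in has, and it is not the actual implementation of `property()`.

```python
class simple_property:
    """Hypothetical stand-in for the built-in property(); getter and setter only."""

    def __init__(self, fget=None, fset=None):
        self.fget = fget
        self.fset = fset

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # looked up on the class itself, not on an instance
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)


class foo:
    def __init__(self):
        self._private_bar = 0

    def get_bar(self):
        return self._private_bar

    def set_bar(self, bar):
        self._private_bar = bar

    # An ordinary class attribute that happens to be bound to a descriptor object.
    bar = simple_property(get_bar, set_bar)


obj = foo()
obj.bar = 10          # routed through set_bar by __set__
value = obj.bar       # routed through get_bar by __get__
```

Because `bar` is just a class attribute holding an object with `__get__` and `__set__`, the attribute lookup machinery does the rest; no special syntax is involved.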

Python has a few fairly powerful mechanisms like that. Another interesting one is the decorator system. A decoration is just you telling the compiler to pass the decorated function as an argument to a decorator, and binding the return value of such a call to the decorated name. A decorator is just a function, only that it expects a callable object (say, a function) as an argument, and should return yet another callable object. The latter can wrap the invocation of the former in some clever way -- say, by building up a property object. This is why property getters/setters/deleters can be just as well declared by using decorators.

Here's a pretty example from StackOverflow:
def makebold(fn):
    def wrapped():
        return "<b>" + fn() + "</b>"
    return wrapped

def makeitalic(fn):
    def wrapped():
        return "<i>" + fn() + "</i>"
    return wrapped

@makebold
@makeitalic
def hello():
    return "hello world"

print(hello())  # prints <b><i>hello world</i></b>
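And since property getters and setters can be declared with decorators too, here is a small sketch of the earlier foo class written that way. The type check in the setter is an assumption added only to show where a side effect would go; it is not part of the original example.

```python
class foo:
    def __init__(self):
        self._private_bar = 0

    @property
    def bar(self):
        # Getter: callers still write obj.bar, with no parentheses.
        return self._private_bar

    @bar.setter
    def bar(self, value):
        # Setter: illustrative validation, standing in for "side effects here".
        if not isinstance(value, int):
            raise TypeError("bar must be an int")
        self._private_bar = value


obj = foo()
obj.bar = 10          # goes through the setter
current = obj.bar     # goes through the getter
```

Client code that was written against a plain public `bar` attribute keeps working unchanged; only assignments that fail the (hypothetical) check now raise.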

Ye gods, it's functional programming!

2012-05-04 06:17 • by Watson
380362 in reply to 380358
Kuba:
A decorator is just a function, only that it expects a callable object (say, a function) as an argument, and should return yet another callable object.

Callable objects? Passing functions as arguments and return values?! WHAT MANNER OF SORCERY IS THIS?!!

Re: Self Documenting

2012-05-04 07:30 • by tox (unregistered)
380363 in reply to 380111
Geoffrey T. Buchanan:
That 50 character length function name is surely a sign of the times. I learned my craft back in the day when software engineering was a serious discipline, before the kids turned up with their IPad/Facebook/Web2.0 bullshit.

Back then 16kb was considered a LUXURY and we had to learn to write highly efficient and compact code so that it would PERFORM. Well I am sure you've all heard this before and I don't want to bang on about some "golden age" - it wasn't perfect I am not saying that. It's just that an engineer who wasted programme memory so frivolously creating a 50 character identifier back then would have been fired on the spot.

I suppose modern businesses tolerate the complete waste of resources today because computers have got more powerful and the customer rarely complains. But you have to wonder how much faster typical software would run if it was written as per the old ways. There's also a slippery slope in play - out there now are a load of upstart "programmers" who are making all the software we use but they have no idea about how computers work let alone basic compiler theory. And people wonder why there are so many security holes and viruses in software these days...

/rant



What do identifiers have to do with performance?

Re: Self Documenting

2012-05-04 07:32 • by tox (unregistered)
380364 in reply to 380363
tox:
Geoffrey T. Buchanan:
That 50 character length function name is surely a sign of the times. I learned my craft back in the day when software engineering was a serious discipline, before the kids turned up with their IPad/Facebook/Web2.0 bullshit.

Back then 16kb was considered a LUXURY and we had to learn to write highly efficient and compact code so that it would PERFORM. Well I am sure you've all heard this before and I don't want to bang on about some "golden age" - it wasn't perfect I am not saying that. It's just that an engineer who wasted programme memory so frivolously creating a 50 character identifier back then would have been fired on the spot.

I suppose modern businesses tolerate the complete waste of resources today because computers have got more powerful and the customer rarely complains. But you have to wonder how much faster typical software would run if it was written as per the old ways. There's also a slippery slope in play - out there now are a load of upstart "programmers" who are making all the software we use but they have no idea about how computers work let alone basic compiler theory. And people wonder why there are so many security holes and viruses in software these days...

/rant



What do identifiers have to do with performance?



Don't feed the trolls. He's a Web 2.0/Facebook/iPad kid posing as a Real Programmer who uses Real Compilers.

Re: Self Documenting

2012-05-04 09:14 • by no laughing matter
380370 in reply to 380347
Joe:
What does a 50-character identifier have to do with memory? Much less performance?

Compiler theory indeed....

For compiled languages, shorter variables and terser comments lead to smaller source files and therefore faster floppy disk access.

Actually quite a lot of the modern compiled languages (Java, C#, to name the most popular) provide a feature called Reflection that allows attributes and methods of an object to be accessed by name.

To be able to do this, the names of the attributes and methods must be present in the runtime environment - the compiler will not erase them!

You can use byte code obfuscators to replace your long identifiers with shorter ones, but reflection will not work for the obfuscated code.
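Python is not one of the languages named above, but it shows the same idea in a few lines, so here is a sketch using it rather than Java or C#. The class `EventFilter` and its method name are invented for this example. The name is kept at runtime and can be looked up from a plain string, and a lookup by a shortened, obfuscator-style name simply fails.

```python
class EventFilter:
    # Hypothetical class with a deliberately long, "self-documenting" method name.
    def no_activity_types_are_attached_to_this_event(self, activity_types):
        return len(activity_types) == 0


# The method name arrives as data (a config file, a database row, ...).
method_name = "no_activity_types_are_attached_to_this_event"

event_filter = EventFilter()
method = getattr(event_filter, method_name)   # lookup by name at runtime
result = method([])                           # True: no activity types attached

# If a tool had renamed the method to "a", only lookups by the new name would work.
found_by_short_name = hasattr(event_filter, "a")   # False here, nothing was renamed
```

This is also why renaming is risky when names are looked up this way: the string holding the name lives outside anything a rename refactoring can see.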

Re: Self Documenting

2012-05-04 16:30 • by Jack (unregistered)
bling blong blang, this isn't spam...

TheOnlyProblemWithThisIsThatAfterThreeToFourWordsTheReadabilityOfAMethodIsCompletelyDestroyedOhAndReturnTrue();

Re: Self Documenting

2012-05-04 23:50 • by me (unregistered)
Looks like COBOL to me!!

Re: Self Documenting

2012-05-05 04:10 • by pressureman (unregistered)
380427 in reply to 380112
CAPTCHA:sino:
Then you try to get an MD5 hash in Python and you have to do this:


import hashlib

m = hashlib.md5()
m.update(b"mahstring")  # update() takes bytes on Python 3
hash = m.hexdigest()


Hey! Since MD5 is a function (string -> hash), why not make it, you know, a function! (yes you can do it in one line but that's not the point)


Most languages implement message digests (and crypto functions) like this, because they're designed to be fed input in chunks. Consider generating a hash of a 4.7GB DVD ISO. Do you want to feed all 4.7GB at once to your hash function?
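A minimal sketch of that chunked usage with `hashlib`. The helper name `md5_of_stream` and the 1 MiB default chunk size are arbitrary choices for illustration, and an in-memory `io.BytesIO` stands in for the DVD image so the example is self-contained.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=1024 * 1024):
    """Hash a binary stream piece by piece instead of loading it all at once."""
    digest = hashlib.md5()
    # read() returns b"" at end of stream, which ends the iteration.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


data = b"mahstring" * 1000

# Feed the data 64 bytes at a time...
chunked = md5_of_stream(io.BytesIO(data), chunk_size=64)

# ...and compare with hashing everything in one call.
one_shot = hashlib.md5(data).hexdigest()
```

Both values come out identical, so the object-with-`update()` design costs nothing for short strings while keeping memory use flat for large files. For a real file, the stream would be something like `open(path, "rb")`.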

Re: Self Documenting

2012-05-11 06:55 • by LeChuck (unregistered)
380877 in reply to 380080
return "Fail";

Re: Self Documenting

2012-05-13 21:14 • by Andrew (unregistered)
381003 in reply to 380358
The rule to always encapsulate your fields exists because you need to be able to change the implementation without the change propagating to clients, and that rule stands unchanged.

In C++, if you exposed a field, and then some client depended on it, there is no way you can change that to a property without touching client code. So you better write the accessor methods.

In C#, property syntax makes life easier. When code is changed from a field to a property, client source code does not need to change, but clients do need to be recompiled. To avoid even that, it is better to expose a property to begin with.

In Python, it is possible now. No code change, no recompile.

Still, the rule doesn't quite change. We should always encapsulate the field to make it possible to change, it is just that the authoring needed to accomplish this changed.

The key is to keep the capability to add logic at field access.

Re: Self Documenting

2012-05-15 14:51 • by A. Nonymous Coward (unregistered)
381233 in reply to 380193
KattMan:
Matt Westwood:

You're doing it wrong. A phone number should be defined by a class in its own right with its own internal validation. If you're holding it as a string and banking on the class you're holding it in validating it, you've got it backarsewards.


Wow, did you even read what I said then read what you said? Can you even get the concept in your head without forgetting some other rudimentary thought?
Let's hope you don't forget how to wipe your own nose when I say it this way.
Yes you have a phone number class, but you need to set the value in it somewhere, and that is where you validate. You essentially just said what I did, because the setter you use to get your phone number into your phone number class is where you should have your validation call, not someplace later in the class, this is part of what setters are for! Validate when it comes in, not sit on it and maybe validate later.


You're both doing it wrong. Anyone who creates a class for a phone number should stick to astronaut architecting at home, and stay away from production code.

Re: Self Documenting

2012-05-15 19:26 • by Gedhead (unregistered)
381245 in reply to 380098
Your Name:
Botia:
I remember when methods were verbs.

I remember when programmers were engineers.


I remember when engineers built bridges, and programmers wrote code.

Re: Self Documenting

2012-06-20 13:16 • by comment'appelle tu (unregistered)
383559 in reply to 381245
I was always told not to rely on remembering, but to use comments.
"I'm going to get some tea now."