Admin
complex_t frist(i, wouldnt, want, to, support, this, heap, of, parameters, either);
Admin
Ah, yes, that classic song from The Who, "I can scroll for miles and miles"
Admin
Decades ago I worked in a corporate research lab with some amazingly intelligent PhD types. It's not spaghetti code, it's code that they understand in a higher-dimensional topology than you are capable of visualizing. Why bother organizing files in subfolders when it's no trouble to remember what each file is among hundreds of others with cryptic 8-letter names?
All you can do is stare in awe and be grateful they are using their powers for good.
Admin
I counted 161 parameters.
C is powerful but hard and not very suited for quick-and-dirty prototyping. 161 arguments is insane. So it sounds like a terrible idea to make an experimental program like that, especially in C.
Also, let's take a closer look at some of the parameters: char* excludename, char* keepname, char* removename, (wtf?)
char* mergename1, char* mergename2, char* mergename3, (now that's outright bad!)
There's also this interesting data type that hints at a deeper WTF: Two_col_params*
There are also some network-related params in the middle: char* rplugin_host_or_socket (I'm not sure I want to know what they exactly do with that)
Some parameter just has a weird name full of consonants uint32_t allelexxxx,
There's a variable that'd look more out of Photoshop than from a genetic experiment: lasso_select_covars_range_list_ptr
There's also sex in several parameters: uint32_t sex_missing_pheno, uint32_t update_sex_col, (why not, since it's genetics. But there are countless other genetic aspects that'd be worth taking into account, here the code feels just perverted)
Plus, at the end, some file-deletion-related parameter: Ll_str** file_delete_list_ptr (now that really looks like a God function)
Admin
There is no "w" in "holistic"
Admin
I'd like to see the body of this function. Wait, I changed my mind, I'd rather not.
Admin
I'm thinking The Proclaimers: "And I would write a hundred args, and I would write a hundred more..."
Admin
I am going to disagree. There is plenty of use and need for relatively dirty prototyping in data science, and for cases where the initial work has to be done by a subject matter expert who is perhaps not a subject matter expert in software design. However, even someone whose only knowledge of C comes from "Teach yourself C 24 Hours" or similar, and who has no other programming background, should recognize that the above is a situation that has got out of hand. Actually, they probably should have spotted that 60 parameters ago...
At some point a truly smart person would have recognized they don't know how to do what they are trying to do and engaged someone who does -or- at least decided to stop, shift their focus from the analysis problem to the architecture problem, and do some refactoring within the limits of their own ability. With this many inputs there is zero chance the author fully understands the control flow of the method/function body at this point, and they certainly can't show anyone else that their process is correct. Even if by some miracle this does what it's supposed to do, the value of whatever it produces is greatly diminished by the lack of verifiability (and independent reproduction potential), and by the impossibility of tweaking or improving it at this point without breakage.
Admin
Easy: Wrap it all into a struct instead!
Admin
"Whitespace added for readability." <<< THERE's your WTF!
I have worked in IT supporting research for over 25 years. This is "normal", and usually these programs are maintained by the folks who write them. Genetics programs are as complicated as the topic they are meant to understand. Plink is extra-complicated in that it does a lot of things that incorporate the functionality of several other program suites. Most of these have "setup scripts" to get the parameters correct in a file, and few of these types of programs take more than a few command line parameters (one or two files, maybe 3, a database, a few keywords, etc.) so the user never has to worry about 100 parameters being in the correct order.
That said, this is pretty WTFy because A LOT of those parameters are flags and filenames -- and I'm fairly certain that the output from this function would not change if the filenames were referenced by pointers rather than passed as parameters. (I've never used Plink, but based on my experience with science-y stuffs, you probably can't change the name of the output file, so why is it a parameter here??) This was definitely written with a lot of addon modules to expand what the program does. That's how Science works.
So, I guess what I'm saying is, Science is TRWTF?
Admin
There should be an "alco" hidden somewhere in this article you need to prepend for the "holistic" to make sense.
Admin
That's slightly less goofy than you think, given that "allele" is a word (meaning, roughly and simplistically, "variants of a gene").
This parameter clearly thinks that Castlemaine Perkins makes beers that are worth drinking.
Admin
To a degree. Given the incentives of "publish or perish", there's never the time to think about code quality in science. Results are out the door, so the code is good enough. From experience in my environment, these projects tend to eventually come to a point where the code becomes a maintenance issue that slows down new publications, and a CS guy is brought in, only to throw in the towel in less than a year.
Admin
My immediate thought was, if they had to bundle all this crap up and feed it to a function, why bother? Why not just inline the code that was in the function?
Answer, probably because it does error returns from all over the place but I wouldn't bet on it.
This whole thing should at least be an object, but that would have been in the second lecture and they went to the pub instead.
Admin
I think this is unironically a good first step towards refactoring.
Admin
Once they get to 127 parameters they're bumping up against the C standard's 'guaranteed max # params' limit
Admin
Holy shit. This is what you get when people go to coding "boot camps" where, even if they bother to teach what a struct is, they don't teach why you would want to use one, and they certainly don't have the time to explain the importance of encapsulation.
Most of that stuff should be added with separate functions that set related parameters, not this monstrosity that requires you to count your commas OR ELSE. And the summary calls it a "method", which means that it's an object, so it shouldn't need everything set at once.
In embedded, the annoying anti-pattern I see is putting everything into arrays of structs of structs, then using stacked blocks of long lines of code that reference them explicitly, like ten lines of "thing[index].shelf.box.drawer[drawer_num].item.size.x = size_x;" etc., in a function with the integer indexes as parameters, because they don't understand how to (or are even scared of) pass a pointer to just the thing being worked with. If I have to maintain code like that, the inner block gets something like "item_t *item = &thing[index].shelf.box.drawer[drawer_num].item;", then "item->size.x = size_x;" etc. so I only have to see that bullshit once. DRY, dammit.
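For anyone who hasn't seen the fix spelled out, here is a rough sketch of that pointer approach. The type layout, the array sizes, and the helper names `item_at` and `item_set_size` are all made up for illustration; only the access path comes from the lines quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes and layout, just deep enough to reproduce the path. */
#define NUM_THINGS  4
#define NUM_DRAWERS 3

typedef struct { int x, y, z; } size3_t;
typedef struct { size3_t size; } item_t;
typedef struct { item_t item; } drawer_t;
typedef struct { drawer_t drawer[NUM_DRAWERS]; } box_t;
typedef struct { box_t box; } shelf_t;
typedef struct { shelf_t shelf; } thing_t;

static thing_t thing[NUM_THINGS];

/* Resolve the long path once and hand back a pointer to the item. */
static item_t *item_at(size_t index, size_t drawer_num)
{
    assert(index < NUM_THINGS && drawer_num < NUM_DRAWERS);
    return &thing[index].shelf.box.drawer[drawer_num].item;
}

/* Works on one item; it neither knows nor cares where the item lives. */
static void item_set_size(item_t *item, int size_x, int size_y, int size_z)
{
    assert(item != NULL);
    item->size.x = size_x;
    item->size.y = size_y;
    item->size.z = size_z;
}
```

A call then reads `item_set_size(item_at(index, drawer_num), size_x, size_y, size_z);`, so the nested path appears in exactly one place.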
Admin
I don't see a WTF here.
Excludename: If you see this name in what you are scanning, ignore it. Keepname: If you see this name in what you are scanning, retain it. RemoveName: If you see this name in what you are scanning, delete the file.
This looks like a set of filters which would be unlikely to all be used at once.
No worse than other magic values that we run into. "Allele" is a common term in genetics, and alleles are often named.
God function? My first guess is that it's a reporting function--these are the files that were deleted by this run--note the RemoveName parameter above.
Actually a sane starting point. In my experience when the number of parameters gets unreasonable there are usually a bunch of them that are rarely used. Initialize a structure to safe null values and then fill in the ones you actually need.
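Something like the sketch below is what I have in mind. The field names are borrowed from the parameters quoted earlier in the thread, but the grouping into `filter_opts_t`, the init function, and the `count_active_filters` stand-in are my own assumptions, not anything from the actual Plink source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical options struct: field names come from the quoted
   parameter list, the grouping is an assumption for illustration. */
typedef struct {
    const char *excludename;    /* NULL means "no exclude filter" */
    const char *keepname;       /* NULL means "no keep filter" */
    const char *removename;     /* NULL means "no remove filter" */
    uint32_t sex_missing_pheno;
    uint32_t update_sex_col;
} filter_opts_t;

/* Set every field to a safe "unused" value (null pointers, zeros). */
static void filter_opts_init(filter_opts_t *opts)
{
    assert(opts != NULL);
    *opts = (filter_opts_t){0};
}

/* Stand-in for the real work: counts how many name filters are set. */
static int count_active_filters(const filter_opts_t *opts)
{
    assert(opts != NULL);
    return (opts->excludename != NULL)
         + (opts->keepname != NULL)
         + (opts->removename != NULL);
}
```

The caller initializes once, sets the one or two fields it cares about by name, and passes a single pointer. Nobody has to count commas, and adding a field later doesn't break existing call sites.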
Admin
The parameter that caught my eye was "Oblig_missing_info* om_ip". Is this where you add in the "obligatory missing information"?! Probably needs to be a null pointer or it doesn't work.
Admin
Hum, how would this be used? Probably you'd have a few screen pages' worth of variables defining these parameters and then the standard call. Maybe using a scripting language with C function bindings. Or just have a script reading in these definitions from a config file. So you'd set it up once and just modify a few lines for your needs. I guess this would do the job in a manageable manner.
Admin
The code was hard to write, it should be hard to read. Nothing to see here.
Admin
When I was at university there were three kinds of people:
A) People that could code B) People that could develop & code C) Everyone else
Just because some scientist could write code didn't really make him a developer, obviously, so there was a lot of horror code flying around, mostly in FORTRAN. And I get it: a baker usually makes a lousy fisherman, and that's true the other way around. The main issue was the annoying competition and race to the first publication in the scientific space, which resulted in a lot of category A people because they didn't want to share credit with category B people :-)
Admin
They should have all just been global variables. That way the function call is really clean.
At least, that's the way the sciency people I worked with thought.
Admin
Port the program to the whitespace language and it will look a lot cleaner.
Admin
You're invoking my trauma here ^^'
I am currently in a project that works like that. Lots of mutable global state that made sense for the original purpose of the code, but the lack of clean separation now comes crashing down on attempts to reuse the existing subroutines in a new context. Every field of every mutable global variable is a hidden input and/or output parameter that may or may not affect the wanted result or future invocations.
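A toy illustration of what I mean, with invented names (`g_scale`, `g_calls`, `scale_ctx_t`) that have nothing to do with the real project. The first function looks like it takes one argument, but it really has two inputs and two outputs; the second does the same work with the state passed in explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Style 1: mutable globals acting as hidden parameters. */
static double g_scale = 1.0;   /* hidden input */
static int g_calls = 0;        /* hidden output */

static double scaled_global(double v)
{
    g_calls++;
    return v * g_scale;
}

/* Style 2: the same computation with its state made explicit. */
typedef struct {
    double scale;
    int calls;
} scale_ctx_t;

static double scaled_ctx(scale_ctx_t *ctx, double v)
{
    assert(ctx != NULL);
    ctx->calls++;
    return v * ctx->scale;
}
```

With the second style, two callers can each hold their own `scale_ctx_t` and never step on each other, which is exactly what breaks when you try to reuse the first style in a new context.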
Admin
Been doing that gig for a long time now. The usual limitation is how much money there is for cleaning things up; scientists are often good at getting enough cash for the initial development but not enough for turning the proof of concept into something more supportable.
Actually taking stuff to the quality of being a proper product can take a team of software engineers a few years, particularly as there will probably be a lot of scope creep and a lot more handling of complicated edge cases to do. And the software engineers can quite often be paid more than the scientists, which can be really shocking to the scientists.
Admin
More like "no interest". Unless supporting the software results in publications and citations, there is no incentive for scientists to take a look at the long-term. The career model simply isn't built that way.
Admin
Especially because, speaking from experience here, the software engineers don't have to have PhDs.
I don't want to be paid less, but I do think it's messed up that scientists aren't paid more.
Admin
Shooting myself in the knee here, but it is sort-of relatable. Demand and supply. Science attracts people for the sake of it, though conditions vary wildly across countries. People will take a decent-but-not-optimal income, if the work is attractive.
I've heard of people turning down positions in the UK though, due to having the same salary level but much higher costs of living. As one colleague put it: "I'm not going to take a PostDoc position that gives me a lower quality of life than my PhD position." But there are usually plenty of talented people from the local populations or less-rich countries to fill the gaps, so a loss of quality will probably take forever to become visible, and then take forever to repair.