The Daily WTF: Curious Perversions in Information Technology

2017-08-16 Reply Admin

Hang on...

IList<HJFRate> rates = db.HJFRates.where(rate=>rate.typeOfUse = typeOfUse)

Shouldn't that be

rate=>rate.typeOfUse == typeOfUse?

2017-08-16 Reply Admin

Oh, and is there a guide to TheDailyWTFs Markwhatever?

I cant seem to get that anywhere near 'right'...

2017-08-16 Reply Admin

Each call to .First() re-executes the query, meaning this takes three round trips to the database.

That's unfortunate (I say this as an understatement). It should cache it on first access. Users should expect .First() to return the same value every time for all sequence-ish objects, without side-effects. I think LINQ does all this correctly.

2017-08-16 Reply Admin

It will return the same value. But it will query to database for this value (Entity Framework gives you the same object, with same reference, for row in table). LINQ have operator "ToArray" or "ToList" that runs the queue and STORE result in local array/list. Then you can access this structure without limits ;)

2017-08-16 Reply Admin

It won't be the same value every time, because the value in the database can change between invocations.

2017-08-16 Reply Admin

You should probably find a guide to basic SQL, too.

2017-08-16 Reply Admin

Also yes, you get ToArray and ToList on every class implementing IEnumerable, which hopefully includes Entity Framework queries!

2017-08-16 Reply Admin

without side-effects. I think LINQ does all this correctly.

Nope. If you have side-effects (for instance in a select) they will get executed every time you iterate over the result.

The same thing goes for caching. LINQ (on IEnumerable<T>) does nothing for you on both cases.

2017-08-16 Reply Admin

Well, color me corrected. That is fairly silly, but nothing I shouldn't have come to expect, I suppose.

2017-08-16 Reply Admin

I know SQL uses =, but the Code that I attempted to post is the query in C# using EntityFramework, so == should be the correct choice there...

Or am I completely missing your point and responding to something entirely else?

2017-08-16 Reply Admin

" I can’t ask him what his reasoning was.. but I did send him an email with the text “WHY!?!?!?” and a screenshot of this code. No response."

Bob confused passive-aggressive rants (which won't ever work, but may gain a few social points with other socially inept people when they are told about the "hilarious" interaction) with aggressive-aggressive rants. These just proves that you are an a-hole. To quote the king of aggressive ranters: Sad!

DocMonster · 2017-08-16 Reply Admin

The issue with ORMs is because they abstract away SQL, it's easy to pretend SQL isn't something you have to worry about. But I often also see the opposite mindset: SQL is so great, that you never want to do anything in code that could be done in SQL/database instead. Thus you have the push to make things "table driven" where things are just stuffed in a database somewhere and retrieved and acted upon, with minimal code needed. That's also wrong.

I feel a hybrid approach is the best way: Use a minimally-invasive ORM (like Dapper or, my new favorite, Insight.Database) that lets you use your SQL of choice whether it's a raw (parameterized of course) SQL query, a function or a stored procedure. But I also feel that you should not just 100% decide to, for example, always use a stored proc or a function. Sometimes the query is super basic and an inline SQL query (again, parameterized) is just fine. When you start to apply blind dogma by, for example, telling a developer that a trivial query such as "select IsActive from Customers where CustomerID=@CustomerID" needs to be put into a function simply because you declared everything has to be in a function no inline queries allowed in code, you're missing the point and just doing a different type of WTF.

2017-08-16 Reply Admin

I use the db.HJFRates.SqlQuery(query); direct SQL query for the case of "DELETE FROM HJFRates WHERE .....", where the LINQ generates a an SQL query for each item to be deleted unless a torturous, unintuitive, and unreadable LINQ statement is used.

2017-08-16 Reply Admin

Unfortunately if you get consistent results or not is a "It depends" situation.

You'll get consistent results if, and only if

Your database engine supports and is using consistent snapshots
You don't run any queries that update the results
Your query doesn't have side effects
You're not sorting or filtering on an expression that changes(like filtering on the current time or sorting randomly)
You're within the same transaction
No one does something to the database(like restarting the service, dropping tables, revoking permissions, etc)
You're actually sorting or have only one result(without sorting DBs are allowed to return in any order they like, this normally means sorted based on their order in an index, but if the index is a complex one and gets rebuilt you may find you get differently ordered results)

If you keep to all of these you'll get the same results, but if you violate any one of them what you get has no guarantee of consistency.

TL;DR: You can't ever really guarantee consistency without controlling the entire database environment, but it is possible to ensure that you always get the same results.

2017-08-16 Reply Admin

"Easy Reader Version: It's the sum of badness that's the WTF. Each individual problem is relatively minor."

Am I the only one who thinks that because of the SQL injection vulnerability, nothing else really matters, especially if there's no usable backup?

2017-08-16 Reply Admin

Given that the database design includes per-year columns, I'm not surprised the code would be this terrible.

2017-08-16 Reply Admin

Well yes, you could stop reading right there, but we are taking everyone's word for it that we have an unsanitised bit of UI (or similar input) being handed along in that parameter.

To me it looks like a procedure that was maybe written in a legacy version pre-ORM which then got adapted but not refactored when some bright spark wanted to retro-fit an ORM. Hard to say if the developer didn't understand the ORM basics or just couldn't be arsed to do any more than kick the function until it started working. Might be total bollocks, but there's a certain smell about it.

And I don't buy the whole object relational impedance nonsense. You have front end developers who don't get databases and database developers who are a bit too perfectionist and the two won't work together (or worse still, a DB being thrown together by developers who don't understand them at all), so people are marketing this ORM silver bullet that will glue the two together. It might ease your pain, for a cost, but the real answer is to sort your team out.

2017-08-16 Reply Admin

It seems to me that most of those properties are something an ORM either must do (because otherwise it is badly written) or cannot do (because some things come back to user code).

Although my point was more to say that this is an unhelpful way of hiding how multi-process systems work. The DB is the state of the world; it will change on each access, and an ORM is evil if it gives you different behavior from something innocuous-seeming like accessing the first element of a returned sequence more than once.

Anyway, I've already settled on only accessing the DB (both queries and updates) by using stored procedures via ORM (in my current case, LINQ and DBML). I think it eases most of these problems if it doesn't entirely solve them.

2017-08-16 Reply Admin

ORMs are fantastic in that you don't have to re-write a DAO for every project. I'm moving from an environment with an ORM to an environment with a handwritten DAO and I'm baffled why we're putting a lot of effort into manually performing a SELECT and mapping it onto fields.

DocMonster · 2017-08-16 Reply Admin

I definitely think if you are doing mapping by hand, you're wasting time. If you are trying to roll your own ORM, you're wasting time as this problem has been solved plenty of times by now, with virtually every possible situation covered by some better-than-your-own tool out there.

The biggest thing though IMHO is knowing when to apply SQL and when to not. A complex query or report can be put into a stored procedure and optimized fully; your code should be able to work with domain objects and such (I firmly believe in 99% of cases if you are passing around raw DataTables or DataSets, you're doing it wrong. The exception can be actual tabular data or data that doesn't need to map to anything e.g. a data dump where data is just being read from the Database and spit out into a file, for example). Sometimes the answer is to call a stored proc. Sometimes it's to select from a function. Sometimes a basic query is good enough. Different solutions for different problems, but nowadays a solid ORM (usually a so-called "micro ORM") is able to work with SQL and not have to go through hefty configurations or even code to set up a database; simply create a connection and you can query it.

2017-08-16 Reply Admin

All of this confirms my belief that databases are a royal pain the ass. Especially if you want to get it right, which doesn't happen very often it appears.

Now where was that deck of cards that is my Fortran program????

2017-08-16 Reply Admin

"And I don't buy the whole object relational impedance nonsense. You have front end developers who don't get databases and database developers who are a bit too perfectionist and the two won't work together..."

I suspect you can insert the word "impedance" at the relevant points of your own scenario here. (It also applies to various leaky abstractions, as pointed out in the OP.)

So, basically, you have "social" impedance -- front end, back end, blah blah -- and "technical" impedance -- an object model is not a relational model, nuff said -- and omigod PHB and WTF and even QA and testing impedance!

Dial my impedance down to -1. Other bands go from zero to ten, but, hey! My impedance is basically imaginary!.

2017-08-16 Reply Admin

And btw ... the "effort" in the OP would be a WTF if written by anybody at all, let alone a "senior dev," for all the reasons given and more. Try to think about it outside of the context of an ORM.

It doesn't need that context. It's several kinds of cretinous bollocks, all on its own.

2017-08-16 Reply Admin

but I did send him an email with the text “WHY!?!?!?” and a screenshot of this code. No response.

You realize this guy almost certainly knew your boss, right?

2017-08-16 Reply Admin

.first() can run the entire, possibly very expensive multi-gigabyte query and store it, and thus never change. Or it can return quickly without potentially fully running and storing a big expensive query, and possibly change each time you read it. Or it can cache the answer, and then you run .second() on it, and get an inconsistent answer if the data changed under you; that's obviously the worst answer. The only reasonable response is to let the programmer decide whether to run and store the whole query for stability, or have cheap, quick references to a live database to minimize database usage and maximize currency of the answer.

Watson · 2017-08-17 Reply Admin

"You have front end developers who don't get databases and database developers who are a bit too perfectionist and the two won't work together ...[hence ORM]... the real answer is to sort your team out."

You do that by sack both groups and replacing them with full-stack developers.

2017-08-17 Reply Admin

The main issue is that an ORM query is using a generic sequence type to hide side-effects. It would be fine if it returned a specialized result type instead, with precisely-defined semantics, rather than a type for which every other instance is something on which you can call .First() and .Second() as many times as you like (without worrying about the current state of a database) and return to user code (accidentally or otherwise) without worrying about how many times it will attempt to read the values.

What I'm saying is, if you're given a generic type and told you have to use it in specific ways, what you really want is a less generic type.

(Post-script: if you're running a very expensive multi-gigabyte query and only looking at the first result, you're running the wrong query, but I understand that was not the crux of your point.)

2017-08-17 Reply Admin

ORMs are all right by me: let the front-end guys use whatever works from them, and let the DBA check that they're not generating hideous SQL queries.

What drives me nuts is when I'm expected to deal with some abject CodeFirst monstrosity of a database, created by "full-stack" developers who don't have a clue. The complaint is usually "reporting doesn't work well". My answer is "dig a deep hole, chuck the vile thing down it, pour in 500L of petrol, then chuck in a match. Then start again with some clue about normalisation and database design".

If I turned up for a front-end dev job, I'd make a fool of myself very quickly and wouldn't last a week, because I don't know squat about it. Why do front-end devs (even if they call themselves "full-stack") assume that they won't make an equal fool of themselves with databases?

2017-08-17 Reply Admin

"an ORM is evil if it gives you different behavior from something innocuous-seeming like accessing the first element of a returned sequence more than once."

The problem with that reasoning is just a lack of your understanding of how EF works. The code is not returning the first element of a returned sequence (IEnumerable), it is returning the first element of a query (IQueryable). EF will hit the database any time it has to materialize an IQueryable. The solution is to save the result of FirstOrDefault() to a variable and then access the properties.

2017-08-17 Reply Admin

OK, perhaps that is its intention, but the naming of IEnumerable vs IQueryable aside, it still seems like counter-intuitive and counterproductive behavior. This solution you propose is more of a workaround. If an IQueryable represents a query, why does it perform more than one query when you look at it? Why not have it such that one query must be consumed at most once, at once?

2017-08-17 Reply Admin

This is an instance where you would want to execute the same IQueryable more than once.

var page = 2; var pageCount = 20; var usersQuery = DB.Users.Where(u => u.Type == "Admin"); var userCount = usersQuery.Count(); // <--- runs query var userList = usersQuery.Skip(page).Take(pageCount).ToList(); // <--- also runs query

Storing the result of FirstOrDefault() is not a workaround, that is how you tell EF to run the query.

2017-08-17 Reply Admin

This is an instance where you would want to execute the same IQueryable more than once.

var page = 2; var pageCount = 20; var usersQuery = DB.Users.Where(u => u.Type == "Admin"); var userCount = usersQuery.Count(); // <--- runs query var userList = usersQuery.Skip(page).Take(pageCount).ToList(); // <--- also runs query

Storing the result of FirstOrDefault() is not a workaround, that is how you tell EF to run the query.

2017-08-17 Reply Admin

You don't want to run the same query twice there, you run one query, then you have both the results and the length from it. Doing what you're doing could give incorrect results if a new admin user is added between those two query operations, as this comments thread has already covered.

I am aware that this is how you tell EF to run a query. I'm saying that it is a workaround for a problem that wouldn't exist in a better-designed ORM.

2017-08-17 Reply Admin

Ah, the famous Bobby Tables is featured yet again :)

2017-08-17 Reply Admin

The code would not run the same query twice, it would generate 2 different sql queries based off the same IQueryable object. The point of doing Count() before Skip() and Take() was to show you the benefit of using the same IQueryable more than once, which is the counter argument to your comment about why you would not want restrict in IQueryable from being executed more than once.

You need to execute 2 sql queries to fetch the correct data efficiently. Getting the Count() of userList after doing userQuery.Skip().Take() would only return the pageSize at most, which would not be the total number of admin users in the entire database. Creating 2 separate, identical IQueryable objects is counter-productive when you only need to define the criteria once.

I am not trying to sound rude, but it doesn't seem like you are grasping exactly what an IQueryable is and what it is used for, so its hard to take the points of your argument seriously. If you can tolerate vb.net, this article might help: https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/ef/language-reference/query-execution

2017-08-17 Reply Admin

You're not wasting your time if your set of tables/classes is relatively small. That's where ORM evangelists and their minions always get lost. They create a poor design with thousands of placeholder classes so they can say that it would take forever to do the mapping any other way. Then the side effect happen. A column has to be altered and everybody has to stop and wait for hours while it regenerates the entire thousand-class model. Then you run into something that needs a special attribute or non-identity key and now you can't have it because the generator wipes out everything. Then you go to production and there are performance problems because what used to be a recordset with four columns is now a list of objects with forty properties. An ORM never really saves much time in the long run, outside of being able to turn down user requests because "it can't" instead of "I can't" but user requests don't just go away, they either find another way around the problem or they push it up the chain of command.

2017-08-17 Reply Admin

Yeah, I think that's the kind of situation I had in mind with my grumpy bit of controversy, as it describes what I've seen with ORMs and before that rancid over-complicated data abstraction layers. Automation doesn't fix bad design decisions, it merely embeds it for future pain.

As you say, the ORM will stop certain stupidity, but odds are some brillant genus will write an eye-watering WTF to circumvent the constraint and you've now no longer even got an ORM helping you.

I'll poke a stick at full-stack developers, it's very trendy, but most are one bit or the other who got forced to rebrand their skillset when an employer took the latest trendy idea pill. Fair play to those who genuinely are, in a sense I am, but that gets blown someone swaps the technology somewhere in the stack and we're left twisting in the wind. I tried to leave a project (without prejudice) as it had become mostly a stack I was unfamiliar with and I knew I was poor value for money ... the PHB was desperate to keep me and I still don't get why.

It's honesty about strengths and weaknesses and teamwork that makes this thing work. And the guts to tear layers up and rewrite them when stuff goes wrong (not that I ever saw that from a PHB).

2017-08-18 Reply Admin

I understand now we are arguing about an example rather than an idea, but your code still has the race condition problem. You need a single query returning the total count and the skip/take subsequence at the same time. Creating a second IQueryable object vs. having it generate two different SQL queries is not a worthwhile distinction in this case.

I trust you that you are correctly describing how IQueryable behaves and should be used, and thank you, but I again stress that I am not concerned with implementation details, especially when it is so prone to mistakes and performance issues. I am arguing that this is not how it should be done.

2017-08-18 Reply Admin

"controllers shouldn’t be doing data access directly"

No, but with EF and MVC.NET doing so is practically shoved down the dev's throat. Even most (all?) documentation and examples show this sort of anti-pattern. Especially the ones from MSDN.

2017-08-18 Reply Admin

That sounds like more of an issue with bad developers than with ORMs.

2017-08-19 Reply Admin

If you're complaining about the ORM deriving from a certain type with certain invariants and doesn't follow those invariants, then that's problematic. But you said "sequence-ish objects", and there's good reasons for an ORM to offer a rich, sequence-ish view of a query.

"if you're running a very expensive multi-gigabyte query and only looking at the first result, you're running the wrong query"

If "SELECT ... LIMIT 1" is reasonable to run, then making a query object with the same semantics and calling .first() on it, with an ORM with the right semantics, should be equally reasonable.

2017-08-19 Reply Admin

ORMs enable bad developers. Just like ASP.NET web controls enable bad developers. They get used to dragging and dropping and then when a problem that isn't so Mickey Mouse comes up, they freeze or do something ridiculous within the framework that sortof sometimes works at a huge cost of performance, complexity, or both. Because it never occurs to them to take the direct route. They need a drag and drop easy button abstraction.

Then, at interview time, these places throw around buzzwords and frameworks like crazy. You can be a SQL wizard but you won't get a second look unless you talk about how great Entity is. You can know Javascript inside and out but forget it unless you've hitched you wagon to the framework of the day.

No doubt some smart people write some of these tools but they deliberately limit what smart people outside of their group can do. That's why so much software today is missing so many basic features and/or churns out "oops, something happened, try later" for no apparent reason. Behind the scenes there's a stone tablet with "thou shalt useth TrendyTool, and only TrendyTool, because buzzword" inscribed upon it.

2017-08-19 Reply Admin

Operating systems enable bad developers. In C on Unix you can open a file and start writing to it, and not know whether it's a line printer or on a RAM disk or hard disk or on a server across the room with a direct wire connection, or a file on the dark side of the Moon with a connection running across WiFi, under the sea, and being bounced off two satellites. Or, even, being dumped directly in /dev/null. It never occurs to most developers to take the direct route and move the hard drive head where they need it.

I've written my fair share of SQL, and it's pretty easy to write something that sort of works at a huge cost of performance. It hides from you what the database is doing. But it makes it much more portable and frequently quicker when you let the server do it. ORMs are likely more performant in many cases, because they just do it right, instead of depending on the programmer to wire up the system in just the right way.

In many places, the problem is a stone tablet with "thou shalt use JavaScript and only JavaScript" inscribed on it. In other places, there is no tablet, and the problem is we have a system running some 40 year old Fortran that's interfacing to C++, Swift, Idris, Node.JS, Clojure, F# and Prolog.

2017-08-19 Reply Admin

You joke but your world is one where all drivers are stuffed into go-karts. There are no trucks because you can just transport a ton of dirt one bucket at a time. There are no offroad vehicles because mountains and snow don't exist. There are no sports cars because precision engineering is too hard. There are no boats or planes because there's nothing beyond our shores. You may be happy puttering through stop and go traffic from Starbucks to Starbucks but not everybody wants to be locked inside that prison.

2017-08-19 Reply Admin

And a disastrous failure when you use .First() .. apparently we can leave the "or default" for a separate discussion ... three times over.

What are you, young man? Some sort of masochist? Or simply an evangelist for vicarious masochism in general?

2017-08-21 Reply Admin

If you're complaining about the ORM deriving from a certain type with certain invariants and doesn't follow those invariants, then that's problematic.

That's the one thing I've said in pretty much every one of my comments so far. I'm glad you've understood.

But you said "sequence-ish objects", and there's good reasons for an ORM to offer a rich, sequence-ish view of a query.

I said "sequence-ish" and I also said "without side-effects". That is the main issue.

"if you're running a very expensive multi-gigabyte query and only looking at the first result, you're running the wrong query" If "SELECT ... LIMIT 1" is reasonable to run, then making a query object with the same semantics and calling .first() on it, with an ORM with the right semantics, should be equally reasonable.

What are the right semantics? Future-vision semantics? .First() on a 1000-result query is not the same as .First() on a limit 1 query. How on earth is the ORM supposed to know that you're not going to try and read the remaining 999 rows?

2017-08-21 Reply Admin

SQL / ORM failures aside, I really, really, really hate it when people declare their variables separately, only to initialize them to some values a few lines (often a lot lines) below. There is absolutely no reason why it shouldn't say:
string TypeOfUse = selectedUse.FirstOrDefault().TypeOfUse; decimal unitValue2016 = selectedUse.First().FieldUnitValueRate2016; decimal unitValue2017 = selectedUse.First().FieldUnitValueRate2017;
The only thing worse, is initializing variables to null or 0 or empty just for the sake of initializing it.

2017-08-21 Reply Admin


string TypeOfUse = selectedUse.FirstOrDefault().TypeOfUse;

decimal unitValue2016 = selectedUse.First().FieldUnitValueRate2016;

decimal unitValue2017 = selectedUse.First().FieldUnitValueRate2017;

2017-08-21 Reply Admin

Agreed. Whenever I read code like that it seems like someone can't get over ANSI C 1881.

Object Relational Mangling

Leave a comment on “Object Relational Mangling”