The Daily WTF: Curious Perversions in Information Technology

2022-08-15 Reply Admin

If the poor sap who did take the position (or any of her potential co-workers) reads this site, please finish the story, preferable with a violent and painful ending for the antagonist.

2022-08-15 Reply Admin

Fundamental Rule of Optimization: Don't do it.

Advanced Rule of Optimization: Don't do it.

Expert Rule of Optimization: Don't do it. Yet.

Kamil Podlesak · 2022-08-15 Reply Admin

Technically speaking, it IS faster to use a tree. Or rather, the tree in the index.

TheCPUWizard · 2022-08-15 Reply Admin

Again, so much out of context..... If the tree is build once and then used millions of times (without the data changeing)....

Also if you have hundreds of front ends and need immediate consistence; attempting to have the synchronization anywhere lower than the lowest common error is suicide. Given Web->SQL that would be SQL : There is no mention of WEB->FBL->BBL->SUS->SQL

2022-08-15 Reply Admin

Using a "StartsWith" tree makes sense, if you do pattern matching. The article sounded like it is used to find things from a fixed string, which would not make sense at all. Also I'd argue, that this is a technical optimization and not business logic at all.

2022-08-15 Reply Admin

I once had a case where I added an index to a certain column to improve a certain very slow query, and when I did I noticed that the amount of time it took to build the tree + the amount of time it took to run the query once the index was built was less than the amount of time it took to run the query without the index. I always thought that was weird.

Mr. TA · 2022-08-15 Reply Admin

I'm confused about what a tree is for in this case. The search string is converted into a tree? How does that help?

Insert your favorite tree hugger joke here

2022-08-15 Reply Admin

@RussellF

Consider what happens if you have two tables with 10.000 rows each that you want to join. if you do this in a very naive way, then you can make 10.000*10.000 = 100.000.000 combinations.

If it turns out that both tables are using only 300 distinct values. Then the database engine will only have to create 300*300 = 90.000 combinations. It's easy to realise that 90.000 takes up way less resources (time/memory/cpu) than 100.000.000 It's also easy to see that creating an index with 300 entries can be done fairly quick.

Database/query optimisation is (generally) all about eliminating as many combinations as soon as possible

2022-08-15 Reply Admin

Hey a rare case where both sides didn't know what they were doing.

2022-08-15 Reply Admin

Not sure how his own tree is faster than the index's binary search... I'm actually very curious to see what this was all really about.

2022-08-15 Reply Admin

The work to build an index is proportional to the number of rows and equivalent to k full table scans for a small value of k: read the indexed portion of every row (possibly not all fields), then write a smaller amount of data for the index files. For the common type of slow query that does a join without an index and, for each of n relevant row in a table, repeats a full table scan of another table of M rows, convincing the query to use an index on the second table improves the searching cost (not including the incompressible cost of producing the final query result) from M*n to "very little"*n, easily amortizing index creation because we can expect k<<n.

2022-08-15 Reply Admin

This sounds like a particularly common kind of WTF in interviewing in which the interviewer wants the candidate to be exactly like themselves. The logic seems to go something like:

If I was a candidate, what would I want the interviewer to ask me about? What's the coolest/cleverest solution to a problem I've come up with? I know, I'll give the candidate that problem and we'll both feel great when they come up with my solution!

I think the interviewer's fear in this scenario is that if they don't center the interview on the subject matter they know best, then they won't have a solid understanding of how to evaluate the candidate. That said, it's always a WTF to actively reject people who would broaden your perspective.

2022-08-15 Reply Admin

I'm suddenly feeling very naïve here. I'm not really a database/SQL expert despite my years of programming, but I'm failing to see why putting business logic into a stored procedure is a problem. Some years back, almost against my will, I was dragged into a project (a project worthy of many WTFs) where I discovered there was not a single stored procedure to be found. The most egregious bit of code involved placing users on a leaderboard. Ranking on demand was essentially a bubble sort. As I encountered it, the procedure was: 1) open database connection 2) get the user score 3) see where the user was in relation to everyone else 4) close the database. But step 3 required doing that procedure with everyone. With 500 users in the system (a tiny faction of what they anticipated) this amounted to 500x500=250,000 times you opened and closed the DB connection and fetched data. The operation, unsurprisingly, timed out. I created a single complicated stored procedure that had the brains of the operation. It operated in a negligible few milliseconds no matter what the user count. Why would anyone NOT put such business logic in a stored procedure?

Steve_The_Cynic · 2022-08-15 Reply Admin

It's strongly dependent on where you draw the boundaries of "business logic".

2022-08-15 Reply Admin

By far the most difficult thing to deal with in having business logic in stored procedures is that it's not unit-testable. There's no way to exercise that code without having it in a real database. You can't mock anything, so testing performance is dismal.

BatConley · 2022-08-15 Reply Admin

Trees FAIL faster, no doubt., when written by a dud like the interviewer.

2022-08-15 Reply Admin

Front-end devs don’t like business logic in stored procedures, because they can’t change it directly. That’s what the DBA is for.

2022-08-15 Reply Admin

Seems like the problem was not about stored procedures, but opening and closing the database for every single query. Why would they do that? And more importantly, running one query per user, rather than getting the scores of all users and then "ORDER BY" that score. That's very basic SQL and the server should do it more efficiently than bubble sort by default. I'm not against stored procedures, but I don't think this story has any relevance to their use.

2022-08-16 Reply Admin

You really shouldn't look for programming tips in Daily WTF articles :)

2022-08-16 Reply Admin

As others have said, it depends a bit on where you draw the line between data logic, business logic and application logic.

Your example of “return sorted users for a leaderboard” is a single operation which, on its own, makes sense as a stored procedure. However, in general, people will implement a bunch of procedures, and then require that the be called in a particular order to complete a business operation. At this point, your business logic is then SPLIT between DB and App, and understanding any of what’s happening requires context switching between different programming languages mid-operation. Not ideal.

Another issue is scalability - if you design it this way from the start, it can be cheaper to scale using multiple application servers, each with relatively little CPU/RAM… compared to all your processing being in the DB itself, so you have to keep throwing more CPU into the same machine. Not necessarily a problem with DB clustering etc, but a consideration.

The final issue that drives people away from Business Logic in stored procedures is vendor tie-in. Sure, for a small SaaS product, once you’ve decided on a DB layer you’re unlikely to ever change it - but in the enterprise world, we still sometimes get customers who say “Sorry, we’re a <Vendor> exclusive shop, you have to install it on prem for us, and use <DB of choice>“. At this point, if you have a lot of stored procedures, you’re stuck converting them to that vendor’s ever-so-slightly-different variant of SQL. Better, in that scenario, to have all your logic in the app, and just use the RDBMS for storing data and enforcing referential integrity.

nerd4sale · 2022-08-16 Reply Admin

Why would anyone NOT put such business logic in a stored procedure?

There are lots of reasons, some good and many bad.

The main 3 reasons are:

It is out of fashion. The current trend is microservices (often implemented badly, but that's a whole subject on its own).
Most developers have too little knowledge about databases.
For some weird reason, it is often thought that stored procedures are to be written and maintained by DBAs instead of developers. That is incorrect, although input from a DBA should always be appreciated.

jkshapiro · 2022-08-16 Reply Admin

Would version control also be an issue?

Jaime · 2022-08-16 Reply Admin

Don't forget:

Stored procedures are written in a language that is great for set-based operations, but is absolutely atrocious for traditional logic operations.
If the business logic relies on data sets from multiple sources, putting that logic in a stored procedure is both much more difficult to implement and even more difficult to secure properly.
Caching. Q. What's faster than getting your data five hundred times via stored procedure? A. Not even getting it 499 of those times.
Debugging, error handling, logging... all of the things that programming runtimes have done well for a very long time, but stored procedures have barely functional capabilities of.

nerd4sale · 2022-08-16 Reply Admin

You are repeatedly confirming my second point.

emurphy · 2022-08-16 Reply Admin

I've been working heavily in SQL for about 14 years and I still don't know enough about it. (The finer points of how multi-column indexes get used, mainly. I generally just extract small subsets into temp tables if I think it's gonna be an issue.)

Caching is a valid point if the need for it is strong enough to outweigh (a) the extra complexity, (b) checking for stale data when relevant (e.g. recording and looking up last-modified timestamps), and (c) needing to remember to use the clear-cache option during testing. If you still feel the need, then caching can be done in some outer layer and/or within SQL itself.

jkshapiro: There are tools for maintaining version control on stored procedures. You do need to keep the front and back end in sync (and/or provide backward compatibility), but that's not more of a problem than the type of problem that Argle described.

Argle: I hope you discovered RANK() at some point.

One example of business logic outside stored procedures: If you have an API layer in between UI and SQL, then the API layer can run some simple sanity checks like "did the UI miss passing a required parameter" or "is this parameter value clearly invalid (e.g. ID is supposed to be a positive integer)", and if it fails one of those, then don't even bother SQL (and also don't risk SQL doing something dumb in response). Then SQL can run other sanity checks like "is this ID actually in the relevant table".

2022-08-16 Reply Admin

Trees in programming are like trees in nature: You need significantly more time to grow them than it takes to use them.

2022-08-16 Reply Admin

I could recommend the antagonist to hang themselve from that tree, but that would be declarative, yet it's clear they need an imperative employee.

2022-08-17 Reply Admin

That's all very true. The "weird reason" for point 3 is point 2, which also explains the popularity of ORM tools. A lot depends on what you call "business logic". To me "Here's some data - store it away for me" or "Here are some parameters - get me the relevant data" don't really count. Stored procs provide an interface which I've always found to be hugely useful for all the usual reasons that an interface is a good thing.. you can make changes to the database schema without taking everything down as long as the interface still works. You can secure them properly, and a database specialist who actually understands it can optimize them. I also agree with your response to Jaime's comment - Sorry Jaime; you know you can still cache what comes back from a stored proc, and nobody is talking about moving the whole application into the database (I've seen some horrible things done with mod_owa!)

2022-08-22 Reply Admin

There is no problem in unit testing as well. Begin transaction, setup data, call stored procedure under test, select and validate results, revert transaction.

The interface thing that @Rich mentioned above is also very true. Design your business logic against interfaces, not some random queries generated by whatever ORM is fashionable this month.

jkshapiro · 2022-08-23 Reply Admin

the poor sap

Oh my goodness I only just now noticed this pun. Nice work @Industrial Automation Engineer!

Eric Cartman · 2022-10-27 Reply Admin

cool really

Addendum 2022-10-27 16:34: You now require a capable investment if you want to launch a business. The significance of this has increased. As a result, I urge everyone to look and investigate thoroughly with pokiesman.com/online-casinos/ ; they assisted me in choosing a reputable casino and helped me discover where you can actually begin playing for real money. I'm hoping you'll be helpful. I wish you luck in that.

Trees are Faster

Leave a comment on “Trees are Faster”