• P (unregistered)

    Is Frank responsible for writing the queries for WTDWTF? That'd explain all those cooties...

  • Little Bobby Tables (unregistered)

    I've never understood the mindset that the Franks of the world have. You're a software developer. You have a job to develop software. Part of that job entails putting things into and getting them out of storage. It is incumbent upon you to learn how to do this. Unless of course your job is to provide the top-level pseudocode that the code monkeys then convert into a working design. But in that case you're not a software developer, just a business analyst.

  • A.Riese (unregistered)

    A 3800% decrease equals factor -37. J.T. would gain 74 seconds from running this query instead of losing 2 seconds.
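    A.Riese's arithmetic, as a quick sanity check (a sketch assuming the article's 2000ms to 52ms figures; the real change is a ~97.4% decrease, i.e. a ~38.5x speedup, not a "3800% decrease"):

```python
old_ms, new_ms = 2000, 52

# A literal "3800% decrease" subtracts 38x the original value:
literal = old_ms - 38 * old_ms        # -74000 ms: "gaining 74 seconds"
assert literal == -74_000

# What actually happened: a ~97.4% decrease, or a ~38.5x speedup.
decrease_pct = (old_ms - new_ms) / old_ms * 100
speedup = old_ms / new_ms
assert round(decrease_pct, 1) == 97.4
assert round(speedup, 1) == 38.5
```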

  • Brian (unregistered)

    Some of the bad behavior is due to SQL Server having a less-than-informative error when data is too long for a field. For decades, over multiple releases, it has given a truncation error with no table, field, or value references provided. It pisses off developers enough that they start just making every field really long.

  • (nodebb)

    Some of the bad behaviour is because developers can't be bothered with actually learning how to design database structures. I remember one time I was asked to design a subsystem, and I included an ERM for a properly normalised set of tables which would allow for future functionality changes. The 'senior' developer who was implementing it didn't want to deal with the few extra joining tables that would entail, so he cobbled together something that worked for the immediate project and screwed us over for all future work.

  • (nodebb) in reply to A.Riese

    This comment is 1000% correct.

  • Juli (unregistered)

    "databases that stored flexible documents with no real schema and could just be used as a data dump"

    You just described MongoDB and similar NoSQL systems. Face it, you lost this fight.

  • (nodebb) in reply to A.Riese

    A 3800% decrease equals factor -37. J.T. would gain 74 seconds from running this query instead of losing 2 seconds.

    In a universe where going from 2000ms to 52ms is described as an increase, it seems perfectly reasonable to me.

  • Raj (unregistered) in reply to Little Bobby Tables

    Forcing obsolete data storage technologies where everything needs a schema and a data type is TRWTF.

    Even if for some mysterious reason the project truly requires a RDBMS, using something like PostgreSQL where you don't have to specify data size (you can just say if it's text or numbers) would already make more sense than sticking with products that haven't evolved much in 25 years.

    If you want to see how much more advanced PostgreSQL is, just look at the precision they support for numbers compared to Squirrel Server or Slowracle.

  • (nodebb) in reply to Raj

    God where do I begin. "Old" doesn't mean "wrong". Type safety is paramount to good software design. Column sizes allow the engine to optimize storage better. In short, you're way off base here. As in, by 250,000 miles.

  • Rodd Tundgren (unregistered) in reply to Mr. TA

    Lol at Raj.

    Is data integrity a 'mysterious reason'? Or storage and retrieval efficiency?

  • Little Bobby Tables (unregistered) in reply to Raj

    Be that as it may, it's an appalling attitude for a developer to take:

    "I don't like the technology that some PHB has chosen to implement the database in, so I'm going to completely ignore all good design practices and do just the bare minimum to get the code to compile."

    That's you, that is.

  • (nodebb)

    Why bother storing stuff in a structured database when you can store it in, say, text files?

    All you have to do is a little optimization: say, use fixed-length fields, some way of offsetting inside the file contents by a multiple of the sum of the field lengths, some way of finding data lines based on offset (you can keep track of the ways to find them by keeping lists in the correct order for a particular field in separate files), a metadata language that will retrieve your data in a workable way, and some way to make sure your text data files and offset data files do not get corrupted. For a bonus, if you have a network, you can write a wrapper service for availability everywhere.

    Unfortunately I do not have the money to apply for a patent.
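    The fixed-length-record scheme described above can be sketched in a few lines (a toy illustration with made-up field widths and file name; a real database does vastly more):

```python
# Toy fixed-width "database": two fields, name (10 chars) and qty (5 chars).
FIELD_WIDTHS = (10, 5)
RECORD_LEN = sum(FIELD_WIDTHS) + 1  # +1 for the newline

def format_record(name: str, qty: int) -> bytes:
    return f"{name:<10}{qty:>5}\n".encode("ascii")

def read_record(f, index: int):
    # Jump straight to record `index`: offset is a multiple of the record length.
    f.seek(index * RECORD_LEN)
    line = f.read(RECORD_LEN).decode("ascii")
    return line[:10].strip(), int(line[10:15])

with open("toy.db", "wb") as f:
    f.write(format_record("widgets", 42))
    f.write(format_record("gadgets", 7))

with open("toy.db", "rb") as f:
    assert read_record(f, 1) == ("gadgets", 7)
```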

  • Klimax (unregistered) in reply to Raj

    Thank you for providing the perfect TRWTF of the day. Have a nice day hunting the most bonkers bugs while dealing with a slow and unhelpful storage implementation that's likely a good candidate for TDWTF.

    And you just showed how little you know about MS-SQL or Oracle...

  • (nodebb) in reply to Juli

    Where's that Nicholas Cage "You Don't Say" reaction shot?

  • Andreas Ringlstetter (github) in reply to Juli

    Couldn't be further off. Using an object oriented database doesn't relieve you from thinking about structure. Only if you actually intend to use it as a write-only or read-only dump. The second you need to scale it up, beyond what a relational database can reasonably provide, you have to think hard how to establish working (distributed) indexes, and how to ensure transactional guarantees in a database which is by design only eventually consistent. No chance of ACID. You suddenly have to deal with per-entity migrations (as there is no globally consistent schema after an update), you have to deal with grouping data into entities by access pattern in order to minimize write conflicts, you have to deal with a database where you have to worry about the actual cost of accessing linked entities as you don't have such nice things as explicit joins.

    Addendum 2019-11-28 17:35: Yes, the perceived entry level is lower. If you only have a single node and you just dump everything in there, there's zero risk of transactional conflicts in your dev environment. But once you leave your sandbox, things get hairy. And what you know about relational databases no longer applies, while new practical design constraints emerge.

  • Johnson Thunderclap (unregistered) in reply to Little Bobby Tables

    When you're developing code and you test it in your Dev database it works because you have 3 sample rows to operate on. Nice and fast.

    Nobody calculates that throughput is 667ms per row because the entire query completes in 2 seconds. Excellent. Works for me! Ship it!

    Then in production where the database has a hundred thousand rows it is very confusing that the query now takes over an hour. How can this be possible? We need a new database server that is faster!

    Here I'm assuming O(n) performance, but knowing these sorts of A-grade 10x developers it will probably be O(n**2). Good luck.
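    The back-of-the-envelope extrapolation above, sketched out using the comment's 3-row / 2-second dev figures (the O(n^2) estimate is my own extension of the same assumption):

```python
dev_rows, dev_seconds = 3, 2.0
per_row = dev_seconds / dev_rows               # ~0.667 s per row
prod_rows = 100_000

# O(n): the same per-row cost over 100,000 rows is ~18.5 hours.
linear_s = per_row * prod_rows
assert round(per_row, 3) == 0.667
assert round(linear_s / 3600, 1) == 18.5

# O(n^2): cost scales with the square of the row-count ratio -- decades.
quadratic_s = dev_seconds * (prod_rows / dev_rows) ** 2
assert quadratic_s / (3600 * 24 * 365) > 70
```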

  • Raj (unregistered) in reply to Klimax

    2006 called, they want their varchars back.

  • RLB (unregistered) in reply to Raj

    2006 called, they want their varchars back.

    If you want to dump everything into a BLOB (a.k.a. glorified coredump), go ahead. I prefer my database manageable and useable, even with hundreds of users and hundreds of thousands of records.

  • David Mårtensson (unregistered) in reply to Raj

    We sure do :)

    And int and datetime and ... — you definitely do not want to use just varchar, that's even worse ;)

    For hobby projects, or small startups whose only goal is to be bought out before they have to face the consequences, no structure might work.

    But once the tables start to grow and you have to fetch things fast you can either throw a couple of million dollars on massive fast servers, or design things that are fast to begin with.

    I have worked with MsSql, MySql, Xml, redis, MongoDB and a few other databases over the last 20 years, and no matter the hardware or solution, once it starts to grow you find all kinds of performance bottlenecks that require better structure of the data, good indexes and good planning.

    MongoDB is schemaless, but once you start having millions of documents you need indexes, and once you have indexes you're still locked into how those documents need to be designed.

    Changing one will require building new indexes, rebuilding all existing documents, and/or changing all the code that queries them.

    You just push it to a later date when it will cost much much more to fix.

  • WTFguy (unregistered)

    Ultimately, both your code and your data storage are meant to model the business domain. Strict object type checking, ERM mapping, and all the other SQL-typical methods are excellent PROVIDED the business domain has a known, relatively logical, and relatively static structure.

    In a world where there are no business rules, only "business habits", that idea falls pretty flat. If the business answer to any proposed data model constraint is "we usually abide by that but have to be able to process transactions / records / sales / etc. without it", then we dev (or DBA) types are doomed to WTFery. Or, more commonly, if the business answer is "What? I don't understand the question. It doesn't matter, just make it work. La La La La I can't hear you!"

    I agree (as usual) with LBT that rampant unprofessionalism is ruining the industry for the rest of us. But it's not solely the Dev / DBA / IT layers of business that are rampantly unprofessional. As David Mårtensson just said, when your business layer's only goal is to fool the public long enough to fool some investors and cash out, your product's correctness & quality are simply project priority numbers 672,456 & subsequent.

    IOW, in a gold rush town there are no zoning regulations.

  • (nodebb) in reply to WTFguy

    In a world where some business processes aren't strictly defined, those that are defined should be coded correctly, and those that aren't should be handled using normal productivity tools such as MS Office. These organizations are likely to be small and to have a relatively low volume, further negating the need to have a software engineering team.

  • (nodebb)

    This is exactly the reason that siloing departments, with crude communication methods such as email, slows you down. I started my working life writing 8051 assembler to achieve a very narrow goal. Now that systems have become so broad and diverse, it's kinda irresponsible for a senior C# coder to have absolutely zero knowledge of relational databases, even if they are using an abstraction like Entity Framework. Our code-base is deployed against SQL Server, PostgreSQL and Oracle, with all manner of performance-optimizing shims for each. These are managed by the developers, and I've never met our dedicated DBA (if we even have one). The only person more dedicated to database structure and optimization than the rest of us is an ex-Oracle DBA, who deals with the fact that Oracle is "special" (like the kid you need to read slowly to).

  • (nodebb)

    Oh, and IMHO, PostgreSQL strikes a great balance between a flexible relational database and an object-oriented data store. It performs great as both, it's free, its High Availability works as well as the CAP theorem permits, and it generally causes no trouble... SQL Server, on the other hand, is not so easy. You need a domain controller, Windows Clustering, and half a dozen other things that cost $10,000 each. So a PostgreSQL installation has ZERO license fees; Microsoft runs around $350,000 for the same thing, and Oracle $500,000 plus ongoing maintenance costs.

    Addendum 2019-12-01 05:34: PostgreSQL can handle the same workload, with the same HA requirements, and normally within a smaller footprint than its proprietary counterparts. The important thing is that every developer on the team has more than a rudimentary understanding of the persistence technology which underlies their application layer, and that there is no "war" or "malevolence" between those who maintain the storage clusters and those who develop the applications.

    As a manager, I've found that breaking down those traditional organizational barriers between application development, database administration and maintenance, optimization and even backups, as well as smooth, well-tested deployments, is a huge advantage (with a really low capital cost).

    To change a company culture from one where your coworkers are enemies who "stifle" your work to one where every member is behind the same goal is something worth investing in. The term DevOps has been abused a lot, but every member of the team having at least a basic understanding of, and respect for, the domain they share with others can not only improve your team's productivity and product quality, but also provide an opportunity for learning and a cohesive environment where everyone sees the "bigger picture". There is less friction and less factionalism. My recommendation would be less "restriction" and "personal ownership", and more continuous education relating to the interfacing points between different disciplines.

  • David Mårtensson (unregistered) in reply to idzy

    A correction regarding MS SQL Server: it does not require all the extras, as it can run under Linux.

    https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-overview?view=sql-server-ver15

    And this is not a different server.

    SQL Server started as a collaboration with Sybase and Ashton-Tate as a database for OS/2, and was based on Sybase's relational database on Unix.

    To avoid having to rebuild it completely, they implemented a proxy layer handling all calls to the underlying OS (I assume to be able to continue supporting Unix alongside OS/2), and when Windows NT came along they just created a new proxy for Windows.

    Now they have done it again on Linux, and it requires no Windows server.

    There are some tools that are not ported, but all main functionality is supposed to work.

    https://en.wikipedia.org/wiki/History_of_Microsoft_SQL_Server

    https://blog.heroix.com/blog/how-does-sql-2017-work-on-linux

  • jay (unregistered)

    Frank sounds like a guy who works mostly with Excel and only uses a database when he has to make a report a certain way. They just dump their files into tables using a wizard in SSMS and then write one-time report queries which they never need to re-run. Sadly, I've dealt with other "Franks". The ego makes things far worse.

  • (nodebb)

    Why do you have to submit a script period? Version-controlled database deployments have been a thing for a long time, and you can just use standard Pull Requests to approve schema changes at that point....

    This is SQL Server, and it has an excellent one built in (SQL Server Database Projects), though other databases have generic migration tools such as Flyway, DbUp, etc...

  • (nodebb) in reply to Raj

    using something like PostgreSQL where you don't have to specify data size (you can just say if it's text or numbers) would already make more sense

    Only if you don't want to pick up errors like the one we found this morning, where someone tried to stuff a six-digit number into a field that is generously defined as allowing up to three digits (the actual maximum value in there is 6, so I should probably drop it down to two digits).
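    The kind of check a sized column buys you can be mimicked in application code, but the point is that the database does it for free (a minimal sketch; `fits_numeric` is a hypothetical helper mirroring a NUMERIC(p, 0) column's range):

```python
def fits_numeric(value: int, precision: int) -> bool:
    """Mimic a NUMERIC(precision, 0) column: at most `precision` digits."""
    return abs(value) < 10 ** precision

assert fits_numeric(6, 3)           # the real data fits easily
assert not fits_numeric(123456, 3)  # the six-digit value is rejected
```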

  • isthisunique (unregistered)

    Forcing obsolete data storage technologies where everything needs a schema and a data type is TRWTF.

    TRWTF is this ridiculous statement which tells me your code is going to end up on here one day or another.

    Specifying schemas and datatypes has been around for a long time, but it's neither archaic nor obsolete. Certain implementations might be flawed, and the slow evolution of data definition languages is sometimes a bit sad.

    It's simply another way to do things, like the difference between strictly typed languages and dynamically typed languages that are usually not very strict. Data is often sacred, and it's a fact that you're going to have problems playing it fast and loose, not bothering to type things or verify data.

    Some cases and some people can handle schemaless. People with your flippant attitude should not be allowed to touch a system that requires self discipline at all.

    It's a strange irony that often the people who flip off all the safeties, especially based on stupid non-arguments, are the last people who should be operating technology with the safeties off.

    If your moronic comments aren't an effort to troll then you're not someone who should be working with technology.

    A database often becomes the most crucial and sensitive asset once the software is up and running. People are going to be fixing your messes for years.

  • Some Ed (unregistered) in reply to A.Riese

    I managed a 100.001% speed increase once, but I didn't trust the results, since it was giving them to me right before I hit return.

  • Some Ed (unregistered) in reply to hornet30064

    Do people actually use pull requests outside of open source? It was always my impression that the reason for pull requests was to avoid having code in your repository from thousands of volunteers that won't go anywhere, because they got in, made some changes, did a few commits, but then couldn't get it all working the way they wanted and got distracted by something else.

    If you use pull requests, then you don't have any code in the central repository for reasons you don't know from people you've never heard of. But in closed source development, it's my impression that we know all of the people working on a project. We still want to control who can merge to branches that matter, but locking down a select few branches is easy.

    So why would we use pull requests in closed-source development, or for internal configuration management repos?

    Of course, that having been said, the submission process should be light. Also, most of the processes I've encountered where one needed to submit things like this for review and acceptance through email had some major WTF element or two. I feel like it's theoretically possible to have a good email based process, but I'm not sure I've seen one.

    It feels like it'd be easier to set something reasonable up using ssh to push to your version control software for the submission, with each person having one or more ssh keys used to authenticate them to this system. There are still ways that could also be a WTF, such as any system where you have a general shell account on the version control host with access to run the version control commands directly, because you just know somebody's going to abuse that in some way.

Leave a comment on “Classic WTF: Manager of the Data Dump”
