Divine Comedy

"Code should be clear and explain what it does, comments should explain why it does that." This aphorism is a decent enough guideline, though like any guidance short enough to fit on a bumper sticker, it can easily be overapplied or misapplied.

Today, we're going to look at a comment Salagir wrote. This comment does explain what the code does, can't hope to explain why, and instead serves as a cautionary tale. We're going to take the comment in sections, because it's that long.

This is about a stored procedure in MariaDB. Think of Salagir as our Virgil, a guide showing us around the circles of hell. The first circle? A warning that the dead code will remain in the code base:

	/************************** Dead code, but don't delete!

	  What follows if the history of a terrible, terrible code.
	  I keep it for future generations.
	  Read it in a cold evening in front of the fireplace.

My default stance is "just delete bad, dead code". But it does mean we get this story out of it, so for now I'll allow it.

	  **** XXX ****   This is the story of the stored procedure for getext_fields.   **** XXX ****

	Gets the english and asked language for the field, returns what it finds: it's the translation you want.
		   Called like this:
		   " SELECT getext('$table.$field', $key, '$lang') as $label "
		   The function is only *in the database you work on right now*.

Okay, this seems like a pretty simple function. But why does this say "the function is only in the database you work on right now"? That's concerning.

		***** About syntax!!
			The code below can NOT be used by copy and paste in SQL admin (like phpmyadmin), due to the multiple-query that needs DELIMITER set.
			The code that works in phpmyadmin is this:
DELIMITER $$
DROP FUNCTION IF EXISTS getext$$
CREATE FUNCTION (...same...)
		LIMIT 1;
	RETURN `txt_out`;
END$$
			However, DELIMITER breaks the code when executed from PHP.

Am I drowning in the river Styx? Why would I be copy/pasting SQL code into PhpMyAdmin from my PHP code? Is… is this a thing people were doing? Or was it going the opposite way, and people were writing delimited statements and hoping to execute them as a single query? I'm not surprised that didn't work.

		***** About configuration!!!
			IMPORTANT: If you have two MySQL servers bind in Replication mode in order to be able to execute this code, you (or your admin) should set:
			SET GLOBAL log_bin_trust_function_creators = 1;
			Without that, adding of this function will fail (without any error).

I don't know the depths of MariaDB, so I can't comment on if this is a WTF. What leaps out to me though, is that this likely needs to be in a higher-level form of documentation, since this is a high-level configuration flag. Having it live here is a bit buried. But, this is dead code, so it's fine, I suppose.

		***** About indexes!!!!
			The primary key was not used as index in the first version of this function. No key was used.
			Because the code you see here is modified for it's execution. And
				`field`=my_field
			becomes
				`field`= NAME_CONST('my_field',_ascii'[value]' COLLATE 'ascii_bin')
			And if the type of my_field in the function parameter wasn't the exact same as the type of `text`, no index is used!
			At first, I didn't specify the charset, and it became
				`field`= NAME_CONST('my_field',_utf8'[value]' COLLATE 'utf8_unicode_ci')
			Because utf8 is my default, and no index was used, the table `getext_fields` was read entirely each time!
			Be careful of your types and charsets... Also...

Because the code you see here is modified for its execution. What? NAME_CONST is meant to create synthetic columns not pulled from tables, e.g. SELECT NAME_CONST("foo", "bar") would create a result set with one column ("foo"), with one row ("bar"). I guess this is fine as part of a join- but the idea that the code written in the function gets modified before execution is a skin-peelingly bad idea. And if the query is rewritten before being sent to the database, I bet that makes debugging hard.

		***** About trying to debug!!!!!
			To see what the query becomes, there is *no simple way*.
			I literally looped on a SHOW PROCESSLIST to see it!
			Bonus: if you created the function with mysql user "root" and use it with user "SomeName", it works.
			But if you do the show processlist with "SomeName", you won't see it!!

Ah, yes, of course. I love running queries against the database without knowing what they are, and having to use diagnostic tools in the database to hope to understand what I'm doing.

		***** The final straw!!!!!!
			When we migrated to MariaDB, when calling this a lot, we had sometimes the procedure call stucked, and UNKILLABLE even on reboot.
			To fix it, we had to ENTIRELY DESTROY THE DATABASE AND CREATE IT BACK FROM THE SLAVE.
			Several times in the same month!!!

This is the 9th circle of hell, reserved for traitors and people who mix tabs and spaces in the same file. Unkillable even on reboot? How do you even do that? I have a hunch about the database trying to retain consistency even after failures, but what the hell are they doing inside this function creation statement that can break the database that hard? The good news(?) is the comment(!) contains some of the code that was used:

		**** XXX ****    The creation actual code, was:   **** XXX ****

		// What DB are we in?
		$PGf = $Phoenix['Getext']['fields'];
		$db = $PGf['sql_database']? : (
				$PGf['sql_connection'][3]? : (
						$sql->query2cell("SELECT DATABASE()")
					)
				);

		$func = $sql->query2assoc("SHOW FUNCTION STATUS WHERE `name`='getext' AND `db`='".$sql->e($db)."'");

		if ( !count($func) ) {
			$sql->query(<<<MYSQL
				CREATE FUNCTION {$sql->gt_db}getext(my_field VARCHAR(255) charset {$ascii}, my_id INT(10) UNSIGNED, my_lang VARCHAR(6) charset {$ascii})
				RETURNS TEXT DETERMINISTIC
				BEGIN
					DECLARE `txt_out` TEXT;
					SELECT `text` INTO `txt_out`
						FROM {$sql->gt_db}`getext_fields`
						WHERE `field`=my_field AND `id`=my_id AND `lang` IN ('en',my_lang) AND `text`!=''
						ORDER BY IF(`lang`=my_lang, 0, 1)
						LIMIT 1;
					RETURN `txt_out`;
				END;
MYSQL
			);
			...
		}

I hate doing string munging to generate SQL statements, but I especially hate it when the very name of the object created is dynamic. The actual query doesn't look too unreasonable, but everything about how we got here is terrifying.

		**** XXX ****    Today, this is not used anymore, because...   **** XXX ****

		Because a simple sub-query perfectly works! And no maria-db bug.

		Thus, in the function selects()
		The code:
			//example: getext('character.name', `character_id`, 'fr') as name
			$sels[] = $this->sql_fields->gt_db."getext('$table.$field', $key, '$lang') as `$label`";

		Is now:
			$sels[] = "(SELECT `text` FROM {$this->sql_fields->gt_db}`getext_fields`
				WHERE `field`='$table.$field' AND `lang` IN ('en', '$lang') AND `id`=$key AND `text`!=''
				ORDER BY IF(`lang`='$lang', 0, 1) LIMIT 1) as `$label`";

		Less nice to look at, but no procedure, all the previous problems GONE!


		**** XXX   The end.
*/

Of course a simple subquery (or heck, probably a join!) could handle this. Linking data across two tables is what databases are extremely good at. I agree that, at the call site, this is less readable, but there are plenty of ways one could clean this up to make it more readable. Heck, with this, it looks a heck of a lot like you could have written a much simpler function.

Salagir did not provide the entirety of the code, just this comment. The comment remains in the code, as a warning sign. That said, it's a bit verbose. I think a simple "Abandon all hope, ye who enter here," would have covered it.

[Advertisement] BuildMaster allows you to create a self-service release management platform that allows different teams to manage their applications. Explore how!