Admin
Well-written article, and all points that I agree with! Thank you for sharing.
We have a lovely EAV-style table simply called the MASTER table. In it, you will find all sorts of things... validations, connectivity info sometimes, mapping data, even full-blown tables hidden away, perhaps because people were too lazy (I'm guessing?) to put them into their own tables. We even have a column called "table". :D
Anyway, it's a nice conversation piece to have around when introducing the "new guy" to the leviathan!
Admin
This looks cool... Thanks for the link! It looks like it's worth reading more about this.
Admin
Was this article written from a SQL Server point of view? Some of the "missing features" it describes actually exist in other databases, like Oracle.
"The only way to change a database is by running a SQL script against that database, and once you’ve done that, there’s no going back. You can run another SQL script to change the database again, but the only way to truly rollback changes is by restoring the entire database from back-up." - wrong. Oracle has "Flashback Transaction", in which every statement you run to make data changes auto-generates an "undosql" statement for you. Use the LogMiner to review and execute these statements to undo your changes. No need to restore the entire database.
Oracle also has "Flashback Database" when you can roll back the entire database to a single point in time - no need to restore from a backup (unless the point in time is bigger than the flashback window).
"once you’ve dropped that column, it’s gone forever; you could certainly restore your database from back-up, but that’s not exactly an undo, especially when you lose all the intermediate data" - wrong. Oracle has "flashback table" to restore a dropped column or table, meaning "instant undo" on the table.
These features exist so that an experienced DBA can quickly undo bad changes with a minimum of downtime. If you didn't hate Oracle so much, then perhaps you'd learn what it has to offer and benefit from it, rather than moan about having to do all this additional work instead.
Admin
BTW, the batch file we use for executing our scripts in order is as follows:
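It boils down to something like this (a rough sketch rather than our exact file - the paths and options are assumptions; the important bit is sqlcmd's -b flag, which turns script errors into a non-zero exit code):

```bat
@echo off
rem Run every .sql file in the current folder in name order.
rem sqlcmd -b sets a non-zero ERRORLEVEL when a script raises an error,
rem which lets us stop the run dead at the first failure.
for /f "delims=" %%f in ('dir /b /o:n *.sql') do (
    echo Running %%f
    sqlcmd -S . -E -b -i "%%f"
    if errorlevel 1 (
        echo FAILED on %%f
        exit /b 1
    )
)
echo All scripts completed.
```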
This script could be modified to accept the instance name as a parameter, but in practice we all have a default instance installed on our dev machines. The advantage of the script is that it stops executing if it encounters an error in any of the scripts, although the severity must be 10 or above to register in the program's exit code. This isn't an issue if you're using TRY/CATCH with BEGIN TRAN/COMMIT/ROLLBACK. We use the following sample in all stored procs and update scripts:
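The shape of that sample is roughly this (a sketch, not our production boilerplate):

```sql
BEGIN TRY
    BEGIN TRANSACTION;

    -- schema and data changes go here

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    -- Re-raise so the failure registers in the batch runner's exit code
    -- (severity 16 is comfortably above the threshold of 10).
    DECLARE @msg nvarchar(2048) = ERROR_MESSAGE();
    RAISERROR(@msg, 16, 1);
END CATCH;
```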
All this, along with a script that checks the db version (and throws an error if it doesn't match), named "001) Check db version.sql", means that (in theory) it's not possible to run the scripts for a particular patch against the wrong version of the database.
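The version check itself can be as simple as something like this (the table and column names here are invented - adapt to whatever your versioning table looks like):

```sql
-- 001) Check db version.sql
DECLARE @expected int = 41;  -- the version this patch applies on top of
DECLARE @actual   int;

SELECT TOP 1 @actual = VersionNumber
FROM dbo.DatabaseVersion
ORDER BY AppliedOn DESC;

IF @actual <> @expected
    RAISERROR('Wrong database version: expected %d, found %d.',
              16, 1, @expected, @actual);
```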
Admin
Mmmmm... troll food!
Admin
You've gotta accept that you're a tiny minority, not worthy of mention compared to the massive hordes of Oracle/DB2/Postgres/MySQL/etc. developers, admins, and users... It's sad but true.
You aren't working at Petco, are you?
(Not a lot of other big U2 shops out there.)
Admin
I've found dbMaintain to be an awesome tool to replace the batch file you talked about. It also enforces a few basic rules.
http://www.dbmaintain.org/
Admin
"Since these object scripts simply create objects, they can be run time and time again without any real consequences."
If your database has, say, 100 procs, you are going to drop and recreate all of them any time any one of them changes? What if you have thousands of procs in a large environment with many servers (I do)? Deploying every proc even if it didn't change is not going to scale, since deployment would take hours. What if the procedure is replicated (then you won't be able to drop it)?
Perhaps my environment is significantly different from yours, but mine is an environment with thousands of procs on dozens of servers and a deployment system that goes directly into production and does 40+ deployments per day (for a website). I use an SVN repository structure of ServerName\DatabaseName\ObjectType\ObjectName.sql and a powershell script for deployment. The script runs an SVN update to get the latest version, parses the output (if any files changed), and then deploys the changed objects only at the appropriate server and database. If it fails it sends the error via e-mail to the committer, and tries again several times (in case the failure was due to a timeout or something, i.e. caused by blocking). If it keeps failing and doesn't get fixed, eventually the DBAs get an e-mail asking us to look into it or roll it back (since it's vital that SVN is always in sync with production).
Since you can't drop a replicated procedure (and we use lots of replication), a better method than "if exists, drop, then create" is "if not exists, create (as select 1), then alter". This will work in all cases. For functions it's slightly more annoying, as you have to be aware of whether you're doing a table-valued function or a scalar-valued function when creating that dummy object.
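Concretely, the pattern looks something like this (the proc name, parameters, and body are placeholders):

```sql
-- Create a stub only if the proc doesn't exist yet; replication
-- never sees a DROP, so the subscription stays happy.
IF OBJECT_ID('dbo.usp_GetCustomer', 'P') IS NULL
    EXEC ('CREATE PROCEDURE dbo.usp_GetCustomer AS SELECT 1;');
GO

-- ALTER is safe whether we just created the stub or the proc
-- already existed.
ALTER PROCEDURE dbo.usp_GetCustomer
    @CustomerId int
AS
    SELECT CustomerId, Name
    FROM dbo.Customer
    WHERE CustomerId = @CustomerId;
GO
```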
Also I'm not sure if you use SQLCLR assemblies or SSIS, but I deploy those directly from source control too which takes a ton of tedious work off the DBA team's plate.
I completely agree about tables though - they should be deployed manually always by a DBA. Unless you really really trust that nobody is ever going to commit an alter that adds a not-nullable column with a default to your largest and most important table (thereby bringing your application to its knees), you really do want a person doing a deployment for tables and being conscious of the impact on the application, the time of day, etc. To keep source control in sync though I set up a job that automatically scripts out changed table definitions and commits them to SVN, which is quite helpful.
Admin
I'm not fond of databases, even though I agree with all the change control steps listed in this article.
I'm not fond of them because they require a completely specialized set of tools to deal with, and the nature of those specialized tools is different for every single stupid database.
I like SQLite for development work because copying the database file works. Maybe the file is a stupidly opaque blob without specialized tools to read it, but at least copying it works. And the OS permissions operate as they should. And a whole host of other things work the way they're supposed to.
But otherwise, databases are just incredibly stupid. They have their own authentication and permission system with their own obscure syntax (different from vendor to vendor of course) and you need to use specialized tools to even make a copy of the stupid things.
Admin
A) You will never require a major structure change once your data is correctly modelled, i.e. a logical reflection of the concepts you are addressing.
B) Your system shouldn't have any issues with new item types / new attributes / new whatever.
It's not about development process but about application architecture.
If your app can't handle such tiny changes, it should be rewritten.
Admin
Well, if your DBMS can't wrap dropping a column inside a transaction, it's just wrong, and that's it.
If you say BEGIN / whatever / ROLLBACK and you're not back in the initial state, your DBMS should be trashed instantly for not doing its job.
Furthermore, while your system is pretty nice (modelling those changes and associating rollback code is not bad), I really think it should rely more on your DBMS and less on hand-coded rollback plans (since in most cases a transaction rollback does exactly that).
Admin
[quote user="Joe Smith"][quote user="HappyEngineer"] No, it would not. DDL doesn't take place as part of a transaction, so rollback wouldn't give you your column back. [/quote]
That depends entirely on the database. Postgres has perfect DDL transactions, which are completely invisible to other transactions unless and until the commit line is successfully reached.
Microsoft's SQL Server does DDL transactions reasonably well, although not everything is supported, and there are a few cases where in-progress DDL transactions are visible to other transactions.
I'm guessing you are talking about Oracle, which is objectively a rather terrible database. Sure, it is scalable, but it lacks critical things like distinguishing between null and empty strings, and even minimal DDL transaction support.[/quote]
Amen brother.
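For anyone who hasn't seen transactional DDL in action, this is perfectly legal in Postgres (table and column names invented):

```sql
BEGIN;
ALTER TABLE orders DROP COLUMN legacy_code;
-- run your checks here; if anything looks wrong...
ROLLBACK;  -- ...and the column is back, data and all
```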
Admin
Well, obviously you're an ignoramus who shouldn't have anything to do with databases, and also a very successful troll.
Let's all go back to the madness of ad-hoc file formats and XLS; it's all so much better than actual queryable information in a logical format.
Admin
Software history is all about programmers who outsmart giant enterprises. If that's your argument, I'll back NoSQL to the hilt...
... well, except the enterprises will probably deploy an army of lawyers, lobbyists, and steak-dinner-buying salespeople to ensure that their crappy solutions will sell...
Admin
We're dealing with a survey system. Hundreds of different surveys, thousands of questions, with a method for skipping questions. And the survey owners can change the question orders and add and remove questions from any part of the survey at any time, and the survey questions themselves are versioned (so if you change the question text, a new version of the question is created).
So if I wanted to ditch EAV, I would need to write something to create new tables on the fly, insert columns at any position in the table on the fly, and reorder the columns. If a user comes along and deletes 30 previous questions and adds 25 more, then the table becomes full of dead space - the first N1000 patients don't have answers to the 25 new questions, and the next N1000 patients don't have answers to the 30 that were removed.
Or instead, when the user ditches 30 questions and adds 25 more, we could create a new table. Now instead of hundreds of survey tables we would have thousands, and the research analysts would still need to combine their data in some fashion.
I'm not saying I have a great solution or even the optimal solution. But implementing a non-EAV solution looks incredibly complex.
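For what it's worth, the nearest thing to a middle ground I've seen is the row-per-answer shape sketched below - still generic, but with real keys and versioned questions (all names invented):

```sql
CREATE TABLE question_version (
    question_version_id INT PRIMARY KEY,
    question_id         INT NOT NULL,          -- stable across versions
    version_no          INT NOT NULL,
    question_text       VARCHAR(4000) NOT NULL
);

CREATE TABLE response (
    response_id INT PRIMARY KEY,
    survey_id   INT NOT NULL,
    patient_id  INT NOT NULL
);

CREATE TABLE answer (
    response_id         INT NOT NULL REFERENCES response (response_id),
    question_version_id INT NOT NULL REFERENCES question_version (question_version_id),
    answer_text         VARCHAR(4000),
    PRIMARY KEY (response_id, question_version_id)
);
```

Adding or removing questions is then just rows, not columns, and there's no dead space for patients who never saw a question.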
Admin
No, it wouldn't. Their tool appears to allow promotion/demotion, e.g. promote my app to v1.1, or demote my app to v1.0 by executing the rollback clause.
I might be wrong as well.
I find the whole rollback of a database upgrade a little ridiculous. Unless your app is just a toy, the only way you can be certain is by restoring.
If upgrading the application was merely adding or renaming a column, that's one thing, but don't forget about the data migration scripts. If you drop a column, how do you effectively roll back?
Admin
The very friggin idea of having a column for each question shows you should be shot on sight if seen programming. But whatever, just stop working in the field and I'll be happy.
Damn dimwits seriously...
Admin
The only reason you would roll back is if the intended action failed, and that's what you get from SQL transactions. The rest is for kids playing around with shiny toys: woo, see how I can switch versions faster than light... oops, broken again.
And if you drop columns using their thingy, the data is just dead, that's it.
Admin
I had to say it twice... seriously, how dense do you have to be to write that as "not a troll", which is what it looks like?
HOW THE F*CK COULD YOUR BRAIN ACTUALLY GO THROUGH SEVERAL SECONDS OF THOUGHT WITHOUT NOTICING HOW DUMB YOUR IMPLEMENTATION WAS?????
And don't say you didn't think when writing your n1k w/e bullcrap.
And no i'm not giving you the solution, that would prevent you from ever becoming smart enough to actually do your job.
Admin
Any guidelines or tips for keeping a DVCS repo in sync and avoiding file naming conflicts?
We've got the following setup:
I'd like to have a Live and a Dev SQL schema repo, in that anything pushed into the Live repo will be included in the live deployment build. My main concern is keeping the Dev repo in sync between all developers. So far the best solution I can think of is to use a common network share that gets auto-committed and pushed every n minutes, so we avoid Dev 1 and Dev 2 creating change script 00135.sql at nearly the same time and both trying to push it. This really shouldn't happen all that often, as schema changes are few and far between.
I feel like I'm missing something obvious here, so any suggestions?
Admin
I am another user and fan of liquibase:
http://www.liquibase.org/
The tool provides the framework and rigour one needs to do db deltas properly, as expressed in your initial article, Alex.
Kind regards, Conor
Admin
Yes, yes, yes. Exactly right.
Some other things I've found useful:
This can help avoid a reload of the database -- only subtractive changes require a restore to fix, but that can be mitigated by a policy of not applying subtractive changes until well after nothing uses that data.
(I like to split out constraints and index-creation changes as well, but that's probably a personal quirk.)
(It can be useful to randomly re-order the columns in a table in the testing/QA environments. Pathological applications will likely fail messily. This is good.)
This lets you modify the database without breaking applications, and to roll back applications without having to rebuild/restore your database. Subtractive changes can be applied well after the application is deployed without worry that the application will blow up if an unused column is removed.
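As a concrete (hypothetical) example of splitting a change into an additive and a subtractive half:

```sql
-- Phase 1: additive, ships alongside the new application version.
ALTER TABLE dbo.Customer ADD FullName nvarchar(200) NULL;
GO
UPDATE dbo.Customer SET FullName = FirstName + ' ' + LastName;
GO

-- Phase 2: subtractive, a separate script run weeks later,
-- once nothing reads the old columns anymore.
ALTER TABLE dbo.Customer DROP COLUMN FirstName;
ALTER TABLE dbo.Customer DROP COLUMN LastName;
```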
Admin
A tool external to the database should probably be used to ensure change scripts are deployed once and once only, to prevent erroneous SQL from ever being parsed by the database engine.
I work on a system where change versioning is performed by checking the database versioning table from the change script SQL. Imagine a scenario where the following changes occur within a change cycle (meaning the scripts cannot yet be discarded):
1 - add column A
2 - modify column A
3 - delete column A
4 - any subsequent change
Changes 1-3 have previously been deployed; then we go to build with change 4. Whilst the change scripts contain code to check whether changes 1-3 have already been deployed, change 2 will still fail pre-execution on parsing, as it references a non-existent column. The only option I've found in this scenario is to retrospectively modify the SQL for change 2 to compile the modify statement dynamically at runtime, bypassing the parser.
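i.e. something along these lines (names invented):

```sql
IF NOT EXISTS (SELECT 1 FROM dbo.ChangeLog WHERE ChangeId = 2)
BEGIN
    -- EXEC defers name resolution to run time, so this script still
    -- parses cleanly even after change 3 has dropped column A.
    EXEC ('ALTER TABLE dbo.Widget ALTER COLUMN A int NOT NULL;');
    INSERT INTO dbo.ChangeLog (ChangeId) VALUES (2);
END
```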
Admin
No, he's rooting FOR his product. But in principle you're allowed to re-implement. And that might cost more, if it's inconsistent with your core business.
Admin
Wow, thanks a lot!!! I've been searching for such an honest answer for quite a while - and was dubious every time I read about the diff tools... I couldn't help but think that something was wrong with using them (and honestly was panicked about losing data). Indeed, writing a SQL script and running it against all servers (dev, staging & prod) seems to be the best way... This is definitely the process I'll set up for my website (www.mysoftwaregate.com). Thanks again for this article!!
Admin