The Daily WTF: Curious Perversions in Information Technology

2011-06-07 Reply Admin

Alex Papadimoulis:
Keep in mind that the term "version" is really just an alias for a specific build, and all of the same rules apply (immutable, etc). From a release management standpoint, it doesn't matter what the build contains, more that it's a wholly deployable component.

This is certainly a nice ideal, and does make everybody's life a couple orders of magnitude easier when it can be enforced. But what when it cannot? Scenario:

Say we have a large, complex production environment running. On day D it all runs a beautiful, immutable, monolithic build where every little bit of code was compiled in one big operation from one precise point in version-control code.

On day D+1 it's suddenly revealed that the VP of Sales has committed the company to being able to interact with a particular out-of-house web service. And the web service turns out to be incompatible on some deep level with the existing code -- most of the required middleware is already there, but the web service insists, for perverse reasons known only to its developers, that it will sometimes send a "305 Use proxy" response, and the HTTP engine deep down at the bottom of the stack is not prepared to deal with that.

Now some developer spends a number of days recoding the HTTP engine such that it can react properly to 305. This involves rather a lot of hand-refactoring of code that underlies 75% of everything directly or indirectly, and who knows what little corner case may have accidentally been broken in the process. There is no way a complete rebuild based on that change will end up in production without at least a month of meticulous QA.

But the salesman promised that we'd have the web service integration up and running by tomorrow. What to do? Take just the server that contains the new feature, compile it against the bleeding-edge semi-trusted source tree, and then install a single instance of that in production, where it will interact with the existing, tested servers from the old build. The new server is still high risk, of course, but at least if it fails the failure will be contained; we can configure everything such that existing customers without insane web services will not be handled by that instance at all. Then start working a fresh complete build through QA at a saner pace.

In that particular case I posit that such a piecewise deployment is the best one can make of a bad situation. However, it does mean that Production now runs a chimera of software, where different components come from different times of the source control history. It's not a single monolithic build anymore.

If I understand you correctly, your very vocabulary will refuse to deal with the resulting situation at all, so the process you advocate (whether it is based on whiteboards, or spreadsheets, or specialized tracking software) will be unable to help, not even with managing the task of getting Production back to a purer, less confusing state.

(The problem is of course not limited to production environments. For example, if one has the time to spare, one would want to do some integration testing of the chimera configuration before letting loose in production so we'll need to speak about running a chimera in a staging environment).

2011-06-07 Reply Admin

For f%^*'s sake, shoot the salesman. Now.

Alex Papadimoulis · 2011-06-07 Reply Admin

Henning Makholm:
Alex Papadimoulis:
Keep in mind that the term "version" is really just an alias for a specific build, and all of the same rules apply (immutable, etc). From a release management standpoint, it doesn't matter what the build contains, more that it's a wholly deployable component.
This is certainly a nice ideal, and does make everybody's life a couple orders of magnitude easier when it can be enforced. But what when it cannot?

To summarize the scenario you described: a "highly experimental" feature needs to be introduced in production to solve the problem for a known subset of users.

This is not an uncommon scenario, and is actually a SOP at some places. There are two good ways to address this (isolation by configuration and isolation by instance) and both are compatible with this process.

The scenario you described is isolation by instance, and requires multiple instances of the software running (load balanced environment, downloaded software, etc). Some instances become "stable", others become "highly experimental"; this yields two, distinct production environments as well. To do this, two versions are created from two different source trees:

  # /branches/highly_experimental/mycodefile.xyz
  GO_CRAZY()

  # /trunk/mycodefile.xyz
  DO_IT_NORMALLY()

Eventually (after sufficient testing), the changes are merged in and the "highly experimental" instances can be shut down. From a deployment standpoint, builds of "highly experimental" releases can be promoted through a different set of environments ("highly experimental dev", "highly experimental testing", etc) or just be promoted straight to "highly experimental prod". The latter is obviously reserved for emergencies.
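
As a rough sketch of what isolation by instance could look like at the routing layer: the customer list, pool names, and both functions below are hypothetical and only illustrate the idea that a known subset of users is sent to the "highly experimental" instances while everyone else stays on the stable ones.

```python
# Hypothetical routing table: customers who need the experimental feature.
# Everyone not listed here is served by the stable instances.
HE_CUSTOMERS = {"customer-with-odd-web-service"}

# Hypothetical instance pools, one per production "environment".
POOLS = {
    "stable": ["prod-1", "prod-2"],
    "highly_experimental": ["he-prod-1"],
}


def pick_pool(customer_id: str) -> str:
    """Return the name of the pool that should serve this customer."""
    if customer_id in HE_CUSTOMERS:
        return "highly_experimental"
    return "stable"


def pick_instance(customer_id: str, request_number: int) -> str:
    """Pick an instance round-robin within the customer's pool."""
    pool = POOLS[pick_pool(customer_id)]
    return pool[request_number % len(pool)]
```

Shutting the experimental instances down after the merge then amounts to emptying `HE_CUSTOMERS`, so that every customer falls back to the stable pool.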

The other (and slightly easier to manage) isolation looks like this.

  # /trunk/mycodefile.xyz
  IF (GET_SYS_CONFIG("MY_HIGHLY_EXPERIMENTAL_FEATURE_ENABLED")) THEN
     GO_CRAZY()
  ELSE
     DO_IT_NORMALLY()
  END IF

This allows "highly experimental" features to be enabled/disabled/tested through configuration instead of through a release process. Of course, it also requires a relatively sane code base to pull off, which is why isolation by instance is used just as often.
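
For illustration only, here is the same toggle as runnable Python. It assumes the configuration lives in an environment variable, which is just one possible source; `get_sys_config`, `go_crazy` and `do_it_normally` are stand-ins for whatever the real code base provides.

```python
import os


def get_sys_config(name: str, default: bool = False) -> bool:
    """Read a boolean flag from the environment (an assumed config source)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def go_crazy() -> str:
    # Placeholder for the highly experimental code path.
    return "experimental path"


def do_it_normally() -> str:
    # Placeholder for the existing, tested code path.
    return "normal path"


def handle_request() -> str:
    # The flag is read on every call rather than cached at import time.
    if get_sys_config("MY_HIGHLY_EXPERIMENTAL_FEATURE_ENABLED"):
        return go_crazy()
    return do_it_normally()
```

Defaulting to `False` when the setting is missing keeps the stable path as the fallback, which seems the safer choice for a feature nobody has fully tested yet.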

Isolation by instance requires some mad release management skillz, but you end up with a "dashboard" that looks like this:

 ENVIRONMENT    BLD#     DEPLOYED
 Integration    v3.8.1   (deployed mm/dd by ...)
 Testing        v3.8.1   (deployed mm/dd by ...)
 Staging        v3.7.4   (deployed mm/dd by ...)
 Production     v3.7.4   (deployed mm/dd by ...)
 HE Testing     v4.1.3   (deployed mm/dd by ...)
 HE Production  v4.0.9   (deployed mm/dd by ...)
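
A dashboard like that can be produced from nothing more than a list of deployment records. The sketch below is one hypothetical way to do it; the `Deployment` fields and the column widths are assumptions, not a description of any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    environment: str
    build: str
    deployed_by: str
    deployed_on: str  # kept as free text, e.g. "mm/dd"


def render_dashboard(deployments: list[Deployment]) -> str:
    """Format a header plus one row per environment, in the order given."""
    rows = [f"{'ENVIRONMENT':<15}{'BLD#':<9}DEPLOYED"]
    for d in deployments:
        rows.append(
            f"{d.environment:<15}{d.build:<9}"
            f"(deployed {d.deployed_on} by {d.deployed_by})"
        )
    return "\n".join(rows)
```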

da Doctah · 2011-06-07 Reply Admin

anonymous_coward:
And to the guy who said to make an installer, that may work for desktop software or websites. Once you get into the legacy enterprise systems (think TPF assembler and AS400/mainframes) that tends to fall apart fairly quickly.

Rolling out to 27 sites, each with its own customized configuration and set of features, we always had an installer to keep contention at bay.

His name was Doug.

2011-06-08 Reply Admin

In your own words, a build might be something that "can’t be built in the first place". Have you considered that the problem might lie in your own confusing nomenclature, rather than just your colleague's insistence on using the more usual definition?

2011-06-08 Reply Admin

We don't have builds, and all those components, is that bad? only one big executable that comes with everything needed compiled in...

2011-06-08 Reply Admin

C-Octothorpe:
boog:
RMDD: Release Management Driven Development.

How do I subscribe to your newsletter/religion?

Send $10 to the first person on the list, and $5 to the second. THIS IS ENTIRELY LEGAL! Remove the first person, and add yourself to the end of the list. THIS IS ENTIRELY LEGAL!

2011-06-08 Reply Admin

some guy:
Look, I'm about as cynical as it gets, but this article is hardly spamtastic. Of course he's going to mention the product he works on. But it's hardly the typical "advertisement disguised as a technical article" that plagues magazines and the web alike.

Whoooooosh!

2011-06-08 Reply Admin

C-Octothorpe:
anonymous_coward:
And this is why you need a file list... to tell the non-developer moving the files what to do. This is common. Keep the change as small as possible (minimum number of files) and provide install directions (you can now audit the actual changes against the provided instructions).
The easy way would be to zip the entire filesystem image and dump it onto the production server. You just have to hope someone didn't modify something unrelated in dev or intentionally slip in malware. E.g. When you see the change list at an atomic file level, common sense says changes to the insurance premium rate file shouldn't also include a change to the login security library.

But you're contradicting yourself here: include a file list of where the files should go, then you say to include a zip of the root folder and dump the whole thing (which is what I would do).

The big problem with the first deployment method is this: we're humans and we screw up... If you include everything, there is no room for error; less moving parts and less chances for us monkeys to futz things up.

This is compounded once you start including more monkeys ("the separation of duty" deployment guys) in the production line who are told to perform quite a complex set of instructions who know nothing about your application nor the framework it was developed on.

Anonymous_coward said that dumping a zipfile of the entire filesystem in production is the easy way, not that it is the correct way.

2011-06-08 Reply Admin

I work in a small environment where we do everything ourselves, so I'm really having trouble following your stages:
“integration” - Writing the code?
“testing” - Unit tests
“staging” - No idea
“production” - Making it live

Can anyone explain this for me please.

2011-06-08 Reply Admin

boog:
Perhaps if it had a fancy name with initials, more people would find the subject interesting?

And put it in job applications.

You know what I hate? Going through every acronym in a job application that I'm not familiar with, just to find out it's something I've been doing all my life. Am I too old?

frits · 2011-06-08 Reply Admin

If there was an appropriate time for Nagesh to be crowing about CMM Level 5, that time would be now.

2011-06-08 Reply Admin

Alex Papadimoulis:
as part of the process, they will "diff" compiled (.NET) assemblies before deploying just to be extra sure. Every now and then, they catch code that was accidentally compiled in.

I just threw up a little in my mouth.

2011-06-08 Reply Admin

justsomedudette:
Can anyone explain this for me please.

Sure.

Integration: This is where you put everyone's little pieces together, and realize that they don't fit.
Testing: This is where you prod it and poke it until it falls apart. Then, when it's fallen apart, you realize that it won't fit together again.
Staging: This is where you make sure the gaffa tape holds, pretend to the customer that it falling apart is a feature if it doesn't.
Production: The customer gets to keep all the pieces when it breaks.

C-Octothorpe · 2011-06-08 Reply Admin

Anonymous Cow-Herd:
justsomedudette:
Can anyone explain this for me please.
Sure.
Integration: This is where you put everyone's little pieces together, and realize that they don't fit.
Testing: This is where you prod it and poke it until it falls apart. Then, when it's fallen apart, you realize that it won't fit together again.
Staging: This is where you make sure the gaffa tape holds, pretend to the customer that it falling apart is a feature if it doesn't.
Production: The customer gets to keep all the pieces when it breaks.

You forgot the last step: PROFIT!

2011-06-08 Reply Admin

Blogging done right

2011-06-08 Reply Admin

Anonymous Cow-Herd:
justsomedudette:
Can anyone explain this for me please.
Sure. <snip whimsical definition>

Or, mapping concepts to your likely environment

Integration: Build all projects that make up your product (may be only one). Run the unit tests.
Testing: Take that build and try it out.
Staging: You likely don't have that.
Production: Put it on the live server.

Better would be:
Integration: Run an automated build of everything on the build server, including a run of all unit tests.
Testing: Take that build and do whatever manual testing you want to get in before hitting customers. Run automated tests (that are not unit tests) if you have them.
Staging: Maybe have an extra server to let select customers try the new version before general deployment.
Production: Put it on the live server.
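
To make the ordering of those stages concrete, here is a small sketch of promoting one build through them. It assumes a simple rule that is not stated above: a build may only be deployed to an environment if the same build is already in the previous one. The function names and the `deployed` dictionary are hypothetical.

```python
from typing import Optional

# Environments in promotion order.
ENVIRONMENTS = ["Integration", "Testing", "Staging", "Production"]


def next_environment(current: str) -> Optional[str]:
    """Return the environment that follows `current`, or None at the end."""
    i = ENVIRONMENTS.index(current)
    return ENVIRONMENTS[i + 1] if i + 1 < len(ENVIRONMENTS) else None


def promote(deployed: dict, build: str, target: str) -> None:
    """Record `build` as deployed to `target`.

    Raises ValueError unless the same build is already deployed to the
    environment immediately before `target` (Integration has no predecessor).
    """
    i = ENVIRONMENTS.index(target)
    if i > 0 and deployed.get(ENVIRONMENTS[i - 1]) != build:
        raise ValueError(f"{build} has not been deployed to {ENVIRONMENTS[i - 1]}")
    deployed[target] = build
```

The point of the check is that the thing reaching Production is the same build that was tested, not a fresh compile of roughly the same source.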

2011-06-08 Reply Admin

Thanks Mark I understand now. Anonymous Cow-Herd I kind of got yours, apart from the staging bit, but now it makes more sense. Cheers.

boog · 2011-06-08 Reply Admin

cappeca:
You know what I hate? Going through every acronym in a job application that I'm not familiar with, just to find out it's something I've been doing all my life.

Indeed; a job app has a minimum requirement of 4 years working with FRESH (Fully-REinvented Software Hodgepodge - which I'll add has only been around for 3 years), but when you look it up you see it's really just an impoverished version of RELIC (RELiable Industry Components), which you've used for more than a decade (or two).

The part I hate is going in for the job interview. The culture around FRESH has not only reinvented the technology of RELIC, but reinvented the (spoken) language around it as well. So how do you convince these FRESH-fluent interviewers that you understand the concepts, when you don't know how to speak their language?

I can be a good bullshitter from time to time, but never when it really counts.

boog · 2011-06-08 Reply Admin

frits:
If there was an appropriate time for Nagesh to be crowing about CMM Level 5, that time would be now.

I haven't seen anything from him in a while (registered version, that is). Maybe he got bored, since we seemed to enjoy his trolling more than he did.

2011-06-08 Reply Admin

Whimsy aside,

Mark:
Integration: Run an automated build of everything on the build server, including a run of all unit tests.
Testing: Take that build and do whatever manual testing you want to get in before hitting customers. Run automated tests (that are not unit tests) if you have them.
Staging: Maybe have an extra server to let select customers try the new version before general deployment.
Production: Put it on the live server.

That's a pretty sensible set of definitions. If you're doing something customer-specific, staging might be where the customer does user acceptance testing before approving it for production.

2011-06-08 Reply Admin

Very strange article.

In my line we are heavily audited/examined (we have Deloitte and various state/federal entities looking through our work item tracking/source control/releases/QA process/etc at least once a year) and need to have an accountable system.

When a release goes through initial QA and a defect is found, we do another build. A release may have 20 builds or 2000.

However, once the release makes it past a certain checkpoint, defects found in a release are addressed in subsequent release(s). This is during the final QA and deployment process and may occur before or after a release is deployed to customers. Normally, low impact defects found in this period are not show stoppers and the release proceeds forward. Sometimes, with a significant defect, a release is finished but never deployed and a new (minor or point) release which addresses the defects is created - in which case go back to the top of the previous paragraph and start over.

So imagine you are having lunch with me. My phone rings. I answer. You hear half the conversation ... “Fine! I guess we’ll just do a new release for QA.” ... Alex's head explodes ....

Take a chill pill dude.

2011-06-08 Reply Admin

boog:
Indeed; a job app has a minimum requirement of 4 years working with FRESH

Am I the only one that finds temporal requirements nonsense? "Essential: 5 years experience of $language" - does something magical happen on the 5th anniversary that couldn't have happened after, say, 2 years and 324 days? Any reason why people that write these job descriptions seem to think that IT people are singularly incapable of learning new things? I don't recall ever seeing an advert for an accountant that required the candidate to have "3 years experience of writing in two-page A4 ledgers using blue ink", yet I've heard of a Java shop that turned down several candidates purely because they hadn't used the shop's framework of choice.

boog · 2011-06-08 Reply Admin

Anonymous Cow-Herd:
Am I the only one that finds temporal requirements nonsense? "Essential: 5 years experience of $language" - does something magical happen on the 5th anniversary that couldn't have happened after, say, 2 years and 324 days? Any reason why people that write these job descriptions seem to think that IT people are singularly incapable of learning new things?

No, I think the real reason for such requirements is to simply assign numbers to the candidate. Screeners/interviewers don't know technology or how to evaluate the quality of a technical candidate. But they do know numerical comparisons. They know the crap out of numerical comparisons.

frits · 2011-06-08 Reply Admin

boog:
Anonymous Cow-Herd:
Am I the only one that finds temporal requirements nonsense? "Essential: 5 years experience of $language" - does something magical happen on the 5th anniversary that couldn't have happened after, say, 2 years and 324 days? Any reason why people that write these job descriptions seem to think that IT people are singularly incapable of learning new things?
No, I think the real reason for such requirements is to simply assign numbers to the candidate. Screeners/interviewers don't know technology or how to evaluate the quality of a technical candidate. But they do know numerical comparisons. They know the crap out of numerical comparisons.

As long as it's apples and apples. Throw them a pear and they'll die of hunger.

2011-06-08 Reply Admin

I'd just like to thank you for this article - nothing much else to say, I just wanted to give you some feedback and there is no "like" button ;-)

I spent countless hours of meetings and more annoying "if you have a minute" meetings interrupting my zone, ending in more confusion on this subject, starting the conversation from scratch over and over again.

2011-06-08 Reply Admin

I advocate using the ARGH process

Apply Random Guesswork Heuristics

xtremezone · 2011-06-09 Reply Admin

s/shameless/shameful/gi

cloudberry · 2011-06-09 Reply Admin

This is an excellent article, explaining in a concise way the basic concepts of release management. With a little tweaking, it could, and maybe should, be an introductory chapter to any explanation of release management.

2011-06-09 Reply Admin

Henning Makholm:
This is certainly a nice ideal, and does make everybody's life a couple orders of magnitude easier when it can be enforced. But what when it cannot? Scenario:

You do the experimental work on a Branch in your source control system, and give that customer a release from the Branch.

Our version numbering system consists of two numbers:

An external, ever-incrementing version that marketing increments as they see politically fit to keep customers happy (e.g. do we go from 3.5 to 3.5.1, 3.6 or 4.0? Depends on how we want customers to perceive the new release).

An internal version: major.branch.subbranch.build. It uniquely identifies the release in a meaningful way, and we can query source control for the exact version of code files which produced it (first step of the build machine? Add a label to every file with the version of the build we're doing).
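
A sketch of how such an internal version might be represented, assuming all four parts are plain integers. The class name and the `build-` label prefix are made up for illustration; the actual labelling depends on the source control system in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class InternalVersion:
    major: int
    branch: int
    subbranch: int
    build: int

    @classmethod
    def parse(cls, text: str) -> "InternalVersion":
        """Parse 'major.branch.subbranch.build' into its four numeric parts."""
        parts = text.split(".")
        if len(parts) != 4:
            raise ValueError(f"expected major.branch.subbranch.build, got {text!r}")
        return cls(*(int(p) for p in parts))

    def label(self) -> str:
        """Label to apply in source control (the 'build-' prefix is an assumption)."""
        return f"build-{self.major}.{self.branch}.{self.subbranch}.{self.build}"
```

Comparing parsed versions rather than raw strings avoids the usual trap where "4.1.0.10" sorts before "4.1.0.9".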

2011-06-09 Reply Admin

Alex Papadimoulis:
The scenario you described is isolation by instance, and requires multiple instances of the software running (load balanced environment, downloaded software, etc). Some instances become "stable", others become "highly experimental"; this yields two, distinct production environments as well. To do this, two versions are created from two different source trees:

Fair enough. But notice that even though the two production environments may be "distinct" in some appropriate sense, they are not independent. The HE instances need to work together with (exchange data, report billing information, redirect clients to/from, etc) the usual stable instances. Therefore from the point of view of the software that manages the run-time configuration of the entire production environment, there cannot be more than one "production" ensemble, even though it may happen to run a mixture of code from different builds.

There's nothing inherently wrong in using different terminology between release management and configuration management. But they'd better not be so different that the terminology itself invites confusion ...

The other (and slightly easier to manage) isolation looks like this.
  # /trunk/mycodefile.xyz
  IF (GET_SYS_CONFIG("MY_HIGHLY_EXPERIMENTAL_FEATURE_ENABLED")) THEN
     GO_CRAZY()
  ELSE
     DO_IT_NORMALLY()
  END IF

But if the crazy new way requires changes to basic framework/utility code that is also used by things that are not supposed to crash, you will need to copy-paste clone all of that code in order to be sure that DO_IT_NORMALLY() indeed does it exactly normally. That results in a source tree with rampant code duplication in it, which entails maintenance problems all of its own.

2011-06-09 Reply Admin

foo:
Henning Makholm:
This is certainly a nice ideal, and does make everybody's life a couple orders of magnitude easier when it can be enforced. But what when it cannot? Scenario:

You do the experimental work on a Branch in your source control system, and give that customer a release from the Branch.

In the scenario I describe, customers are not given releases at all. The software is developed and deployed in-house in order to provide a particular service to customers.

How and where to branch is a source-control issue, which is distinct from release management (though they obviously inform one another). The point here is that getting the experimental feature to work at all requires deep source changes that threaten the stability of the existing non-experimental features. Because everything that serves non-experimental customers has to remain stable, there must be a run-time interface between code from the stable build and an experimental build somewhere.

Alex Papadimoulis · 2011-06-09 Reply Admin

Henning Makholm:
even though the two production environments may be "distinct" in some appropriate sense, they are not independent. The HE instances need to work together with (exchange data, report billing information, redirect clients to/from, etc) the usual stable instances. Therefore from the point of view of the software that manages the run-time configuration of the entire production environment, there cannot be more than one "production" ensemble, even though it may happen to run a mixture of code from different builds.

Good point; from a configuration management standpoint, they don't need to be separate environments. I guess in this case, I would set the deployment plan like...

IF (RELEASE_WORKFLOW = "STANDARD") THEN
   GET_SOURCE_CODE("/TRUNK")
   COMPILE()
   DEPLOY_COMPONENT("CORELIB", "MAIN_SERVERS")
ELSE IF (RELEASE_WORKFLOW = "HIGHLY_EXPERIMENTAL") THEN
   GET_SOURCE_CODE("/BRANCHES/" + RELEASE_NUMBER)
   COMPILE()
   DEPLOY_COMPONENT("CORELIB", "HIGHLY_EXPERIMENTAL_SERVERS")
END IF
DEPLOY_ALL_COMPONENTS_EXCEPT("CORELIB")

Obviously for later environments, you could simply use artifacts for that build instead of compiling.
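
For anyone who prefers something runnable, the plan above translates roughly to the Python below. It only returns the steps as strings and executes nothing; the step wording is invented, and unlike the pseudocode it rejects an unknown workflow instead of silently skipping the CORELIB deployment, which is a choice made for this sketch.

```python
def deployment_plan(release_workflow: str, release_number: str) -> list:
    """Return the ordered deployment steps for one release as plain strings."""
    if release_workflow == "STANDARD":
        source, servers = "/TRUNK", "MAIN_SERVERS"
    elif release_workflow == "HIGHLY_EXPERIMENTAL":
        source = "/BRANCHES/" + release_number
        servers = "HIGHLY_EXPERIMENTAL_SERVERS"
    else:
        # Assumption: an unrecognized workflow is an error, not a no-op.
        raise ValueError(f"unknown workflow: {release_workflow!r}")
    return [
        f"get source {source}",
        "compile",
        f"deploy CORELIB to {servers}",
        "deploy all components except CORELIB",
    ]
```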

Henning Makholm:
If the crazy new way requires changes to basic framework/utility code that is also used by things that are not supposed to crash, you will need to copy-paste clone all of that code in order to be sure that DO_IT_NORMALLY() indeed does it exactly normally. That results in a source tree with rampant code duplication in it, which entails maintenance problems all of its own.

Indeed. There are a ton of architectural patterns that can help, but they need to be in place early on. IOW, you shouldn't add a factory pattern for "emergency" isolations - that can come later (and be done carefully).

2011-06-14 Reply Admin

If there is so much confusion between the contextual definitions of an ambiguously defined word, then the best solution isn't to write a lengthy article discussing the difference between the meanings. It is to assign a DIFFERENT word the less common / more formal definition. No confusion thereafter.

My brain was constantly confused by the article. I liked it, but I had to read it twice. Just because my already trained mind kept assigning the default definition to each and every instance of the word "build" - the common place vernacular one, the (I get the feeling) "wrong" one. But that's just what we've used since we advanced from Qbasic to C, way before high school and college.... that's the way it is for most of us my age...

2011-06-14 Reply Admin

Every SVN commit should be a release. You should release 50 times a day! How else can you get immediate feedback from customers on what you're working on?

Don't believe me? We've done it for years! http://engineering.imvu.com/ If you actually get your ENTIRE code base under test, it works, and it's fantastic!

2011-06-15 Reply Admin

xdiv0x:
My brain was constantly confused by the article. I liked it, but I had to read it twice. Just because my already trained mind kept assigning the default definition to each and every instance of the word "build" - the common place vernacular one, the (I get the feeling) "wrong" one.

Which default definition do you mean?

The common-place vernacular sense of "build" has to do with assembling pieces of timber and fired clay into a house. Any other meaning you might want is idiosyncratic jargon, and you're not going to make yourself understood by waving phrases like "the default definition" about without actually specifying which meaning you're going for.

2011-07-07 Reply Admin

omg, do you work where I work? wait no, we aren't fixing our deployment process.

2011-07-26 Reply Admin

I suggest you have a look at the book on Continuous Delivery:

http://continuousdelivery.com/

the build pipeline pretty much addresses most of the points you talk about by introducing visibility in the release process and tying all these aspects together: building, testing, deploying.

2012-04-26 Reply Admin

this is one of the best articles i've read in a long time, very well written and full of meat, not just all talk. thanks!

2014-07-12 Reply Admin

I like this article, any thoughts on using tools such as Plutora for release management? http://www.plutora.com/release-management-software/

Release Management Done Right
