Flash back to the early 1990s when David S. was working for a large health care provider, maintaining an application that passed data files between two servers across a leased line.

Normally, data transfers went off without a hitch - the applications responsible for shuffling data between the servers were well written, and while the hardware on either end wasn't exactly bleeding edge, or even cutting edge, it was mature enough that all the kinks had been worked out.  For David, day-to-day maintenance was a piece of cake.

Until some problems showed up one summer morning.

The System Was (Almost) Down

David came into work to find an alert email stating that network connectivity had been interrupted during the night and that transfers between 01:30 and 04:30 were taking almost two hours to complete.  This caused great concern because many time-critical processes ran only AFTER the files were completely sent. David fired up an FTP session from the server on his end, and the 40 MB test file went across at a comfortable 500 kb/s, finishing in less than a minute.
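For readers curious what that quick sanity check looks like in practice, here is a minimal sketch of timing an FTP upload of a test file to estimate the link's effective throughput. It is written in modern Python rather than anything David would have had on hand in the early 1990s, and the host name, credentials, and file name are placeholders, not details from the story.

    import os
    import time
    from ftplib import FTP

    HOST = "remote.example.com"   # placeholder for the far-end server
    USER = "testuser"             # placeholder credentials
    PASSWORD = "testpass"
    TEST_FILE = "testfile.dat"    # e.g. ~40 MB of throwaway data

    def measure_upload(host, user, password, path):
        """Upload one file over FTP and return the observed rate in KB/s."""
        with open(path, "rb") as fh:
            ftp = FTP(host)
            ftp.login(user, password)
            start = time.monotonic()
            ftp.storbinary("STOR " + os.path.basename(path), fh)
            elapsed = time.monotonic() - start
            ftp.quit()
        size_kb = os.path.getsize(path) / 1024
        return size_kb / elapsed

    if __name__ == "__main__":
        rate = measure_upload(HOST, USER, PASSWORD, TEST_FILE)
        print(f"Effective throughput: {rate:.0f} KB/s")

Run by hand during the day, a check like this would show the comfortable daytime rate David saw - which is exactly why it shed no light on what was happening between 01:30 and 04:30.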

He reviewed the code, checked the application logs, asked the infrastructure team for input, and even asked whether the cleaning staff had unplugged a network switch to run the vacuum cleaner at night - none of it turned up a lead.

After several days of file transfer delays, management had become acutely aware of the connectivity problem. So, with the heat mounting, David decided that the only way to find the source of the problem would be to come into work at 2am and catch the leased line connection in the act of failing.

Echo...Echo?

The night of David's early arrival, the air was sticky and thick with fog.  He had to run his wipers the whole way into the parking area just to see the ten feet or so of road in front of him. The good news was that, despite risking his life driving at night through near-zero visibility, he had hit pay dirt - at the time he checked, network speed was running at around a tenth of what it should be.  Elated that network connectivity had been brought to a crawl, David immediately phoned network support at AT&V to report the problem.

The tech on the other end, who actually seemed quite perky for 2am, stated that the problem was a known issue due to an old line splice between the two ends of the leased line.  The fog had caused some moisture to build up, interrupting the connection.  By the time the sun was up, the moisture, and with it the connectivity issues, would have evaporated.

However, David was told not to fret, as there was good news - the section of the line with the splice was due to be replaced - in another six months.
