It was a mandatory change control meeting. Steven S.’s department, a research branch of the Ministry of Social Affairs and Health in Belgium, assembled in a cramped meeting room without enough chairs for everyone. Camille, head of IT, was nonplussed.

“These orders come directly from Security,” she began. “Just last month, we monitored over a hundred attempts to break into the HCP.” The Home Care Platform was a database of citizens’ requests for doctors’ visits, prescription coverage, etc. Steven’s team had developed a mobile app that gave citizens access to HCP’s records.

“An automated script,” she continued, “purged our server logs before Security could investigate. Now we have little information on what these attackers were trying to access, or whether they were able to find a breach.”

Steven could guess what was coming next.

“Under no circumstances is any member of this department to delete logs from the servers without the consent of IT. That is all.”

The First Drops

The first support calls came a few days later. Some app users complained that they weren’t able to access their records. When they entered their credentials into the app, the login screen would display a spinner indefinitely.

At first, Steven didn’t think much of it, as some users would refresh their app so much that the firewall would block the IP for a bit. He entered the details into a new ticket, assigned it to IT, and marked it low priority. He always had something better to do.

But the calls kept coming. He escalated the ticket to medium, then high, then critical. Meanwhile, no one from IT had touched it.

Steven groaned. He opened the department’s internal API tool in a browser window and tried out a few requests. They all timed out.

Then, all of a sudden, the requests started going through again.

The HCP backend was remarkably robust, with request caching and multiple middleware servers. If the entire API had failed, it had to be more serious than a network configuration change or a temporary server outage. He marked the ticket as “In Progress” and kept it assigned to himself.

The Flood

The next day, the API went down again, and this time it wasn’t recovering.

Steven stormed to the IT office. Camille would know what took the servers down yesterday, and she would know what was happening now. He found her hovering over a monitor, furtively typing into a terminal window.

He read her command prompt: srm /var/log/*.

“Are you purging the logs?” Steven asked.

Camille closed the terminal window. “Of course not.”

Steven pressed the issue. “The API servers are down, and I can’t keep up with all the support calls.”

Camille sighed. “After we disabled the script that was purging the logs, the hard disk kept running out of space. I was stuck on the metro and couldn’t get here in time to purge it manually. We miscalculated how many requests these servers were processing.”

“So … why don’t you just turn the script back on?”

“Security has expressly forbidden automatic server log deletion. We have to do it ourselves.” With that, Camille re-opened the terminal and re-entered the command.

Plugging the Holes

This went on for another few months. Every few days the API would fail, typically early in the morning, until someone from IT could go in and purge the logs. Steven even wrote a phone script to use for the inevitable, predictable support calls.

Finally, he had had enough. He emailed a representative from Security, the department that started this ball rolling, about the issue. He asked if the automated script could be re-enabled.

The representative emailed back a few minutes later. They said that IT had been given the authorization to re-enable the script only a week after the ban went into effect.

The API had been going down almost every day for months because Camille never read the request to turn the script back on.

It was the end of his shift. After forwarding the email to Camille, he left the office to look for a nearby pub. He needed a good lambic to soothe his soul. Months of support calls could have been avoided if anyone in IT had checked their email.
