WidCo was a victim of its own success. It had started as a small purveyor of widgets: assembling, storing, transporting, and shipping the widgets to their small client base in their podunk state. They'd once had the staff to fill orders placed by phone. As they'd begun to make a name for themselves in the surrounding tri-state region, however, their CEO had caught wise to the value of "this Internet thing."
Within a decade, they were not only the country's foremost producer of widgets, but their entire staff makeup had changed. They now had more employees in IT than the other departments combined, and relied on in-house software to manage inventory, take orders, and fulfill them in a timely fashion. If the software went down, they could no longer fall back on a manual process.
And—as the IT manager was fond of pointing out in budget meetings—they had a QA department of 0 and no automated tests.
Bug reports piled up. Who had time to fix bugs? Monday was for scrambling to remedy production incidents after the weekend's jobs ran, Tuesday was for slapping together something sensible out of requirements documents, Wednesday for coding, Thursday for manual testing, and Friday for shoving half-tested code into production. Agile!
Finally, over the post-Christmas slump, the IT manager convinced the CTO to bring in a trainer to teach developers about "improving our unit tests." All 0 of them.
Adding tests to existing code was deemed a waste of time. It compiles, therefore it works. Begrudgingly, the CTO admitted that unit tests might be a good idea for new applications. Sometime in the next decade, they were bound to build a new application. They'd do this testing thing then.
Desperate, the IT manager put in place a code review policy: before anything could be deployed, someone had to look at it. But they were checking only the changes, the CTO pointed out. It was a waste of time to examine working code in production. Standards were seen as just documentation, and documentation was waste. Look at the existing code and do more of that.
"Think lean," the CTO said.
The IT manager sighed and hung his head.
And then a lucrative new contract was signed: WidCo would be selling widgets in Canada as well. When their distribution partner expressed concern about the lack of a QA department, the CTO loudly proclaimed that all their developers had QA training, and they were bringing in an engineer to streamline the testing process. A position promptly opened under the IT manager, who was seen with an actual, honest-to-God smile on his face for the first time all year.
Interview after interview was conducted. The first engineer was a bright young chap, a brunet with an easy smile and big round glasses. He started on a Tuesday, bringing in donuts to share with the team and promising to have things cleaned up within the week.
Two days later, he handed in his resignation, crawled into a bottle, and refused to answer his phone, muttering about raw pointers and RAII. Later, the team found him waiting tables at the local pub, his bright smile turned into a sullen sneer.
The second QA engineer was made of sterner stuff. She had the benefit of already being familiar with the code: she'd been brought in as a development contractor to save a project that was running over deadline earlier that year, and while she hadn't managed to work a miracle, she did impress the IT manager.
By now, the CTO was so over the whole QA engineer thing. He was onto this newfangled "DTAP" standard, declaring that he'd stand up a Development server, a Test server, an Acceptance server, and a Production server, and all code would be promoted between them instead of going right from the developers' machines to prod.
And so the QA engineer rolled up her sleeves and tried to develop a sane promotions process. She set up an instance of Subversion and stuffed all the code into it. In order to comply with the standard, she made four branches: Dev, QA, Acceptance, and Trunk. Code would be written in Dev, merged into QA for testing, merged into Acceptance for acceptance testing, and merged into Trunk to deploy to production. A daily cron job would push the code onto each server automatically. Continuous Integration!
The build system complete, she could get to her main job: testing code on the Testing server. She sat the project managers in a room and explained how to test on the Acceptance server. Within a month, however, she was doing it for them instead.
At least we're testing, she told herself, resigned to switching hats between "Try to break it" mode and "Is it actually nice to use?" mode when she changed servers.
Of course, there were 30 developers and only 1 QA engineer, so she didn't have time to go through changes one by one. Instead, she'd test a batch of them on a weekly schedule. But by the time they were merged together and pushed to the servers, she had no idea which change was responsible for a test failing. So all the tickets in that batch would have to be failed, and the whole thing reverted to the dev branch. Sometimes multiple times per batch.
It was getting better, though. Every batch had fewer issues. Every code review had fewer comments. The QA engineer was beginning to see the light at the end of the tunnel. Maybe this would get to be so low-key she could handle it. Maybe she'd even be rewarded for her herculean efforts. Maybe she'd get a second helper. Maybe things would be okay.
Then the CTO read an article about how some big-name companies had "staging servers," and declared that they ought to have one to improve quality. Once code was accepted, it would be deployed to the staging server for a round of regression testing before it could go live. Of course, since they had a dedicated tester, that'd simply be her responsibility.
On the way out of that meeting, her heart sinking, the QA engineer was stopped in the hall by a project manager. "We've noticed a lot of tickets are failing," he began with a stern look. "That looks bad on our reports, so we're going to be running an extra round of testing on the dev server before it gets to you."
"Oh, good, so maybe there'll be less tickets I have to send back," she said, raking a hand through her hair.
"Exactly," he said with a hint of a smile. "I'll forward the invite to you so you can sit in and give your feedback. It'll be 4 hours on Monday afternoon."
Her efforts to weasel out of it were to no avail. She was the QA expert, after all, and her job was to train anyone who needed training. Henceforth, code would be checked into dev, tested by the project management group and the QA engineer, moved to QA, checked by the QA engineer, moved to Acceptance, checked by the QA engineer, moved to Staging, checked by the QA engineer, and then finally moved into production.
After a few failure cycles in QA (somehow, the project managers weren't very effective at finding bugs), she was staying late into the evening on Friday nights, testing code as fast as possible so it could go to production before it was technically Saturday and thus overdue.
Of course, because only she knew when the code was ready to move, the QA engineer found herself in charge of merging between branches. And she had no idea what the code even did anymore. She never got to work with it, only seeing things from the UI level. And she was usually exhausted when she had to merge things. So a new class of error began to appear: merge errors, introduced by sloppy testers.
The CTO had a brillant solution to that as well. He split up the shared libraries, making multiple copies of the repository. Each team would have their own stack of DTASP servers and their own version of source code. That way, they could deploy without having to merge code belonging to different teams. In addition to their existing separate dev instances, all 10 3-man teams would get 8 servers each: test, acceptance, staging, and prod, plus a database for each. And the QA engineer would have to test all of them.
These efforts failed to make a dent in WidCo's bug backlog. However, the QA engineer made a killing running a betting pool in the breakroom. The odds were calculated fresh every Monday morning, and the payout went out when metrics were pulled on Friday afternoon. There was only one question to gamble on.
Which was larger this week: the average time required to deploy a fix, or the average time for a developer to leave for a new job?