Wojciech has to maintain a C++ service that, among other things, manages a large pool of TCP sockets. These sockets are, as one might expect, for asynchronous communication. They mostly sit idle for a long time, wait for data to arrive, then actually do work.

As one might expect, this is handled via threads. Spin up a thread for the sockets, have the thread sleep until data arrives (usually by blocking in a select call), then wake up and process the data. This kind of approach keeps CPU usage down, ensures that the rest of the program remains responsive to other events, and is basically the default practice for this kind of problem. And that's exactly what the threads in Wojciech's program did.
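For readers who haven't written this kind of code, here is a minimal sketch of the "sleep until data arrives" step, assuming a POSIX system. The helper name `waitReadable` and the timeout parameter are hypothetical; nothing here is taken from Wojciech's actual code base.

```cpp
#include <cassert>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical helper: blocks until `fd` has data to read or `timeoutMs` elapses.
// Returns true if the descriptor is readable, false on timeout or error.
bool waitReadable(int fd, int timeoutMs) {
    // select() can only watch descriptors below FD_SETSIZE.
    assert(fd >= 0 && fd < FD_SETSIZE);

    fd_set readSet;
    FD_ZERO(&readSet);
    FD_SET(fd, &readSet);

    timeval timeout;
    timeout.tv_sec = timeoutMs / 1000;
    timeout.tv_usec = (timeoutMs % 1000) * 1000;

    // The calling thread sleeps inside the kernel here and burns no CPU
    // until the descriptor becomes readable or the timeout expires.
    int ready = select(fd + 1, &readSet, nullptr, nullptr, &timeout);
    return ready > 0 && FD_ISSET(fd, &readSet);
}
```

A worker thread would typically call something like this in a loop and only read from the socket when it returns true.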

So where's the WTF? Well, it's not in the child threads, it's in the parent thread that kicks them off.

while (childThreadIsWorking) {}

Instead of maybe polling the child thread, or sleeping the parent thread, or doing any of the many other options, this just busy-waits until the child thread is done, defeating the entire purpose of spinning up child threads.
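As one illustration of those "many other options", the parent can block on a condition variable until the child signals completion. This is only a sketch under assumed names: `Completion` and `runAndWait` are hypothetical and do not come from the original program.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical completion signal shared by parent and child.
struct Completion {
    std::mutex mutex;
    std::condition_variable cv;
    bool done = false;

    // Called by the child when its work is finished.
    void signal() {
        {
            std::lock_guard<std::mutex> lock(mutex);
            done = true;
        }
        cv.notify_all();
    }

    // Called by the parent; sleeps until signal() has been called.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex);
        cv.wait(lock, [this] { return done; });
    }
};

// Runs `work` on a child thread and blocks the caller, without spinning,
// until the child reports completion. Returns the child's result.
int runAndWait(int (*work)()) {
    assert(work != nullptr);
    Completion completion;
    int result = 0;

    std::thread child([&] {
        result = work();
        completion.signal();
    });

    completion.wait();  // parent sleeps here instead of busy-waiting
    child.join();
    return result;
}
```

If the parent only needs to know that the thread has ended, `child.join()` on its own does the same job; the condition variable earns its keep when the thread outlives a single task.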

Unfortunately for Wojciech, this pattern was spammed all through the code base. The easiest way to minimize its consequences without completely restructuring how the program handled asynchronous operations was to add 10ms sleeps into the body of the while loop. That alone took things from 100% CPU utilization down to 10% at peak usage.
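The stopgap described above might look roughly like the following. It assumes the flag is a `std::atomic<bool>`, which the original snippet does not show; the function name `waitForChild` and the poll counter are hypothetical additions for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Polls the flag, sleeping between checks so the parent no longer pins a core.
// Assumption: the flag is atomic. Reading a plain bool written by another
// thread in a loop like this would be a data race.
// Returns the number of times the parent slept before the child finished.
int waitForChild(const std::atomic<bool>& childThreadIsWorking,
                 std::chrono::milliseconds pollInterval = std::chrono::milliseconds(10)) {
    // A zero interval would turn this back into a busy-wait.
    assert(pollInterval.count() > 0);

    int polls = 0;
    while (childThreadIsWorking.load()) {
        std::this_thread::sleep_for(pollInterval);
        ++polls;
    }
    return polls;
}
```

The trade-off is latency: the parent may notice completion up to one poll interval late, which is the price of not restructuring the code around a proper wait.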
