Noam and a few friends decided it was time for them to launch their own product. They were young, optimistic about their career, and had some opinions about the best way to handle some basic network monitoring and scanning tasks. So they iterated on the idea a few times, until one day the program just started hanging. At first, Noam thought it was just a hang, but after walking away from the machine for a few minutes in frustration, he discovered that it was just running really slow.
After a little investigation, he tracked down the problem to the function responsible for checking if an IP matched a filter. That filter could contain globs, which made things a bit tricky, but his partner had some ideas about how best to handle them.
def ip_check(ip, rule):
ret_value = False # Return Value
if ip == rule['host']: # Compare against rule
ret_value = True
elif '*' in rule['host']: # Handle wildcards
mask = rule['host'].split('.')
length = mask.count('*')
final = []
for subset in itertools.permutations(range(256), length):
final.append(list(subset))
for item in final:
address = rule['host'].split('.')
for index in range(length):
address[address.index('*')] = str(item[index])
address = '.'.join(address)
if address == ip:
ret_value = True
return ret_value
This code takes a long way around.
We start with for subset in itertools.permutations(range(256), length):
. itertools.permutations
does exactly what you think- in this case, it creates every possible permutation of the numbers in the range 0–255, taken length
at a time- where length is the number of wildcards. So, for example, 10.1.*.*
, is a mere 65,280 entries. *.*.*.*
, which is what Noam was doing when testing, is a lot more. 4,195,023,360 entries, to be exact.
Then we iterate across every possible combination to put them into the final
list. The permutations
method is smart, it lazily evaluates the permutations, calculating the next one when you need it. As you can see, Python does allow you to iterate across it. So we don’t need the final
variable at all, we could have simply done for item in itertols.permutations(…)
and that would have been fine. Well, not fine, none of this is fine.
So, we populate a list with every possible permutation, then we iterate across every permutation. We incorrectly slam the permuted values into the test string, and if that test string matches our IP, we set the ret_value
to True
. And then we keep looping. This block doesn’t even take the simplest optimization of simply quitting when it finds what it’s looking for.
Noam rewrote this function, replacing it with a much simpler 3-line function using built-in methods. Then Noam went on to have a long conversation with his team about how something like this happened.