PyCharm warns that a local variable might be referenced before assignment

Question:

Why does PyCharm highlight the variable boolean in the return statement with the warning Local variable "boolean" might be referenced before assignment?

This code checks whether a number is prime or not:

import random
import math
import time


def prime_t(x):
    print(x)  # debug output: show the number being tested
    if x < 2:
        return False
    if x == 2:
        return True
    if x == 3:
        return True
    for i in range(2, int(math.sqrt(x)) + 1):
        if x % i == 0:
            boolean = False
            break
        else:
            boolean = True
    return boolean  # PyCharm highlights "boolean" here


random.seed()
how_much = int(input())  # number of random values to test
start = time.time()
for i in range(0, how_much):
    print(prime_t(random.randint(0, 1000)))
print(time.time() - start)  # elapsed seconds

I’ve read that this warning can point to a problem with global variables, but prime_t() doesn’t use any. I did have a related problem earlier, an exception while executing the code, but I think I eliminated it by adding the if x == 2 and if x == 3 checks.

What else might be the problem?

Asked By: Mikeros

Answers:

PyCharm is not certain that boolean will be set. Its static analysis does not follow the flow of your code closely enough to work out that your for loop will always run at least one iteration (for an integer x, x > 3 holds by that point).

Instead, it assumes that variables bound in a for loop could potentially never be set, and thus raises this warning.
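To make that concrete, here is a small check (a sketch in Python 3; loop_range is a hypothetical helper that only rebuilds the range from the question). For every integer from 4 to 1000 the loop runs at least once, but a non-integer such as math.pi gives an empty range, which is the kind of case a conservative checker has to allow for.

```python
import math


def loop_range(x):
    # Hypothetical helper: the same range the question's for loop iterates over.
    return range(2, int(math.sqrt(x)) + 1)


# Every integer x >= 4 (checked here up to 1000) yields at least one iteration.
assert all(len(loop_range(x)) >= 1 for x in range(4, 1001))

# A float between 3 and 4 passes the earlier checks but yields no iterations.
print(len(loop_range(math.pi)))  # 0
```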

The workaround is, of course, to set boolean = False before the loop, just to silence the warning. It is only a warning, so you could also ignore it: the IDE is trying to help you but has misread the code.

Answered By: Martijn Pieters

For those who would rather suppress the warning, put

# noinspection PyUnboundLocalVariable

above the line that triggers it.

Thanks to: https://github.com/whitews/pc-inspection-suppression-list/blob/master/suppress-inspection.csv
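As a sketch of where the comment would go (written for Python 3, assuming the highlighted line is the return statement as in the question, and with the debug print dropped), the function might look like this. The comment only silences the inspection in the IDE; it does not change what happens at run time, so an input that skips the loop still fails.

```python
import math


def prime_t(x):
    if x < 2:
        return False
    if x == 2:
        return True
    if x == 3:
        return True
    for i in range(2, int(math.sqrt(x)) + 1):
        if x % i == 0:
            boolean = False
            break
        else:
            boolean = True
    # noinspection PyUnboundLocalVariable
    return boolean


print(prime_t(7))  # True
print(prime_t(9))  # False
```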

Answered By: Joshua Wolff

In general, code that’s inside a for or while loop doesn’t have to run. The condition for a while loop could be unmet as soon as the loop is reached initially. The for loop could be trying to iterate over something that’s empty. If the loop doesn’t run, and the code in the loop is the only place that a particular variable gets set, then it wouldn’t get set. Trying to use it would cause an UnboundLocalError (a subtype of NameError) to be raised.
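A minimal illustration of both cases (a Python 3 sketch; last_item and count_down are made-up names, not from the question):

```python
def last_item(items):
    # "last" is only ever bound inside the for loop body.
    for item in items:
        last = item
    return last  # unbound if items was empty


def count_down(n):
    # "steps" is only ever bound inside the while loop body.
    while n > 0:
        steps = n
        n -= 1
    return steps  # unbound if n <= 0 on entry


print(last_item([1, 2, 3]))  # 3
print(count_down(2))         # 1

# Both functions fail when their loop body never runs.
for func, arg in ((last_item, []), (count_down, 0)):
    try:
        func(arg)
    except UnboundLocalError as exc:
        print(func.__name__, "raised UnboundLocalError:", exc)
```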

IDEs often try to detect this situation and offer warnings. Because they are warnings, they will be conservative. Python is a highly dynamic language and there’s often very little that you can prove about the code’s behaviour before it runs. So pretty well any loop that doesn’t use constant, literal data (for x in [1, 2, 3]:) needs to be treated as "might not run at all".

And, indeed, if I try out the example function at the interpreter prompt, I can easily get that UnboundLocalError:

>>> prime_t(math.pi)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 14, in prime_t
UnboundLocalError: local variable 'boolean' referenced before assignment

After all, nothing else in the code type-checked the value; and pi isn’t equal to 3, nor to 2, nor is it less than 2.


There are several reasonable approaches to the problem, in context.

  1. We can simply assign boolean = True before the loop:
    boolean = True
    for i in range(2, int(math.sqrt(x))+1):
        if x % i == 0:
            boolean = False
            break
    return boolean
    
  2. We can use an else clause on the loop to set the "default" value:
    for i in range(2, int(math.sqrt(x))+1):
        if x % i == 0:
            boolean = False
            break
    else:
        boolean = True
    return boolean
    

    The else block code runs whenever there isn’t a break out of the loop; in particular, it runs if the loop doesn’t run (since there was nothing to break out of). So static checking tools should be able to verify that our bases are covered. Many people don’t like this syntax because it’s confusing and often unintended (it looks like a typo, right?). However, this use pattern is pretty much exactly why it’s in the language. (That said, I think we can do better; keep reading.)

  3. Since there is nothing more to do after this loop in the function, we don’t need a flag at all; we can return False as soon as we find a factor, and return True at the end of the function (since we didn’t find one if we got this far):
    for i in range(2, int(math.sqrt(x))+1):
        if x % i == 0:
            return False
    return True
    

(Notice that in all of these cases, I removed the else from the if – because it doesn’t serve a purpose here. It doesn’t make a lot of logical sense to keep reminding ourselves that we didn’t find a factor yet – we wouldn’t still be looping if we did.)
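The for/else behaviour described in approach 2 is easy to check in isolation. This is a small Python 3 sketch with a made-up scan function; the point is that result is bound on every path, including the one where the loop body never runs.

```python
def scan(values, target):
    for value in values:
        if value == target:
            result = "found"
            break
    else:
        # Runs when the loop finishes without break, including zero iterations.
        result = "not found"
    return result


print(scan([1, 2, 3], 2))  # found
print(scan([1, 2, 3], 9))  # not found
print(scan([], 9))         # not found
```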

All of that said, for "searching" loops like these I prefer to avoid explicit for loops altogether. The built-in any and all functions are practically designed for the purpose – especially when paired with a generator expression, which ensures their short-circuiting behaviour stays relevant:

return not any(x % i == 0 for i in range(2, int(math.sqrt(x))+1))

Equivalently (following de Morgan’s law):

return all(x % i != 0 for i in range(2, int(math.sqrt(x))+1))
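
Put together, a complete function along these lines might look like the sketch below. Two things here are my own assumptions rather than part of the original code: the isinstance check, which rejects inputs like math.pi up front, and math.isqrt (Python 3.8+) in place of int(math.sqrt(x)). Because all() of an empty iterable is True, 2 and 3 no longer need special cases.

```python
import math


def prime_t(x):
    """Return True if the integer x is prime, using trial division."""
    # Assumption: non-integer input is treated as a caller error.
    if not isinstance(x, int):
        raise TypeError("prime_t expects an int")
    if x < 2:
        return False
    # For x in (2, 3) the range is empty, so all() returns True.
    return all(x % i != 0 for i in range(2, math.isqrt(x) + 1))


print([n for n in range(30) if prime_t(n)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```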

Answered By: Karl Knechtel