all vs and AND any vs or

Question:

I would like to know the difference between Python's all and and, as well as between any and or. For example:

status1 = 100
status2 = 300
status3 = 400

Which is better to use:

if status1 == 100 and status2 == 300 and status3 == 400:

or

if all([status1 == 100, status2 == 300, status3 == 400]):

Similarly for any and or:

if status1 == 100 or status2 == 300 or status3 == 400:

or

if any([status1 == 100, status2 == 300, status3 == 400]):

Which one is more efficient: the built-in functions or the primitive and and or operators?

Asked By: Nishant Kashyap

||

Answers:

The keywords and and or follow Python’s short-circuit evaluation rules. all and any are functions, so when the conditions are passed in a list, every condition is evaluated before the function is even called. You can therefore get different behaviour if some of the conditions are function calls.
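
A minimal sketch of that difference, assuming a hypothetical helper named check that records each call (check and calls are illustrative names, not from the question):

```python
calls = []

def check(name, result):
    """Hypothetical condition: record that it ran, then return its result."""
    calls.append(name)
    return result

# `and` stops at the first falsy operand, so "b" and "c" never run.
del calls[:]
and_result = check("a", False) and check("b", True) and check("c", True)
and_calls = list(calls)

# The list is built before all() is called, so every condition runs.
del calls[:]
all_result = all([check("a", False), check("b", True), check("c", True)])
all_calls = list(calls)

print(and_result, and_calls)  # False ['a']
print(all_result, all_calls)  # False ['a', 'b', 'c']
```

Both forms give the same answer here; they differ only in how many of the conditions actually run, which matters when a condition is slow or has side effects.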

Answered By: eduffy

tl;dr

From what I know, all is better when you are checking a varying number of Boolean conditions, while and is much better for a fixed set of conditions written out in the code. When you do use all, try to pass it a generator expression.

Explanation in detail

Edit (for clarity on the use of the term short-circuit)
and and or are preferred in fixed statements because Python short-circuits the evaluation: it stops evaluating the Boolean operands as soon as the truth value of the whole statement can be determined. See the end of the answer for proof and a detailed example of this.

Since any statement composed of successive and operands is False if at least one operand is False, the interpreter only needs to check until it reaches the first False operand:

status1 == 100 and status2 == 300 and status3 == 400

It checks status1 == 100 first. If that is False, it immediately stops processing the statement; if it is True, it goes on to check status2 == 300, and so on.

This kind of logic can be demonstrated using a loop:

Imagine we were writing the behavior of and ourselves. We would check each statement along the line and either find that all of them are True and return True, or find a False value and return False. You can save time by quitting immediately after reaching the first False statement.

# "and" is a reserved word, so the function is named and_ here
def and_(statements):
    for statement in statements:
        if not statement:
            return False  # first False decides the result
    return True

For or, we would write logic that exits as soon as a True statement is found, since that makes all the remaining or operands irrelevant to the truth of the statement as a whole:

# "or" is a reserved word, so the function is named or_ here
def or_(statements):
    for statement in statements:
        if statement:
            return True  # first True decides the result
    return False

This logic is of course combined appropriately, obeying order of operations, when and and or are mixed together.
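
As a small illustration of that ordering: and binds more tightly than or, so a or b and c is read as a or (b and c). The sketch below uses a hypothetical trace helper (not part of the original answer) to show which operands actually get evaluated:

```python
seen = []

def trace(label, value):
    """Hypothetical helper: remember which operands were evaluated."""
    seen.append(label)
    return value

# Parsed as: a or (b and c). "a" is True, so nothing else runs.
del seen[:]
default_result = trace("a", True) or trace("b", False) and trace("c", False)
default_seen = list(seen)

# Explicit parentheses change the grouping: (a or b) and c.
del seen[:]
grouped_result = (trace("a", True) or trace("b", False)) and trace("c", False)
grouped_seen = list(seen)

print(default_result, default_seen)  # True ['a']
print(grouped_result, grouped_seen)  # False ['a', 'c']
```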

The all and any functions serve to avoid this situation:

collection_of_numbers = [100,200,300,400,500,600,.....]
if collection_of_numbers[0] == 100 and collection_of_numbers[1] == 200 and .......:
    print "All these numbers make up a linear set with slope 100"
else:
    print "There was a break in the pattern!!!"

Similarly with or:

collection_of_numbers = [100,200,300,400,500,600,.....]
if collection_of_numbers[0] == 100 or collection_of_numbers[1] == 200 or .......:
    print "One of these numbers was a multiple of 100"
else:
    print "None of these numbers were multiples of 100"

For example, with all:

temp = []
itr = 100  # expected value of the first entry
for i in collection_of_numbers:
    temp.append(i == itr)
    itr += 100
if all(temp):
    print "The numbers in our collection represent a linear set with slope 100"
else:
    print "The numbers in our collection do not represent a linear set with slope 100"

A kind of silly example, but I think it demonstrates the type of scenario when all might be of some use.

A similar argument is made for any:

temp = []
for i in collection_of_numbers:
    temp.append(i%3 == 0)
if any(temp):
    print "There was at least one number in our collection that was divisible by three"
else:
    print "There were no numbers in our collection divisible by three"

Though it could be argued that you will save a lot more time implementing this kind of logic with plain loops, because the temp list above is filled in completely before all or any ever looks at it.

A loop that behaves like and, instead of all:

itr = 100  # expected value of the first entry
result = True
for i in collection_of_numbers:
    if i != itr:
        result = False
        break  # stop at the first mismatch
    itr += 100
if result:
    print "The numbers in our collection represent a linear set with slope 100"
else:
    print "The numbers in our collection do not represent a linear set with slope 100"

The difference is that this breaks before checking every single entry, which saves a lot of time in large sets where an early entry breaks your condition.

A loop that behaves like or, instead of any:

result = False
for i in collection_of_numbers:
    if i%3 == 0:
        result = True
        break  # stop at the first match
if result:
    print "There was at least one number in our collection that was divisible by three"
else:
    print "There were no numbers in our collection divisible by three"

This checks only until it finds one entry that meets the condition, since nothing after that can change the result.
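
The generator approach mentioned in the tl;dr gets the same early exit without a hand-written loop, because all and any pull one value at a time from a generator expression and stop as soon as the result is known. A minimal sketch, assuming made-up sample data and a hypothetical is_expected helper that logs which entries were inspected:

```python
# Sample data for illustration only; the third entry breaks the pattern.
collection_of_numbers = [100, 200, 250, 400, 500, 600]

checked = []

def is_expected(index, value):
    """Hypothetical check: entry `index` should equal 100 * (index + 1)."""
    checked.append(value)
    return value == 100 * (index + 1)

# all() stops consuming the generator at the first False result.
linear = all(is_expected(i, v) for i, v in enumerate(collection_of_numbers))

# any() likewise stops at the first True result.
has_multiple_of_three = any(v % 3 == 0 for v in collection_of_numbers)

print(linear)                 # False
print(checked)                # [100, 200, 250]
print(has_multiple_of_three)  # True
```

Wrapping the generator expression in square brackets would turn it back into a list, and every entry would be checked before all ran.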

** Edit ** Example for the above use of the short-circuit phrasing, and proof of the statement.
Consider

1 == 2 and 2 == 2

and

all([1 == 2, 2 == 2])

The first statement evaluates 1 == 2 to False, and the statement as a whole immediately short-circuits and evaluates to False. The second statement evaluates 1 == 2 to False and 2 == 2 to True while building the list, and only then enters the function, which returns False. That extra step of evaluating every condition first is why, for a small fixed set of Boolean checks, it is preferable not to use the function.

While inconsequential with two statements, an extreme example shows what I mean when I say the evaluation of the Boolean statements is short-circuited. The test below evaluates 1000 Boolean conditions in three different ways and times each one. In every case the first condition is False, which decides the result immediately, but only the explicit and statement and the generator version get to skip evaluating the rest.

test.py

import timeit

explicit_and_test = "1 == 0 and " + " and ".join(str(i) + " == " + str(i) for i in range(1000))

t = timeit.Timer(explicit_and_test)
print t.timeit()

function_and_test = "all([1 == 0, " + ", ".join(str(i) + " == " + str(i) for i in range(1000)) + "])"

t = timeit.Timer(function_and_test)
print t.timeit()

setup = """def test_gen(n):
    yield 1 == 0
    for i in xrange(1,n):
        yield i == i"""

generator_and_test = "all(i for i in test_gen(1000))"

t = timeit.Timer(generator_and_test,setup=setup)
print t.timeit()

And when run:

$ python test.py
0.0311999320984      # explicit and statement
26.3016459942        # List of statements using all()
0.795602083206       # Generator using all()

The effect of short-circuit evaluation is clearly evident here, by an exorbitant factor. Even so, the best approach for any fixed Boolean statement is an explicit and or or expression. As I stated at the beginning of my lengthy answer, these functions exist for cases where you may not know how many Boolean conditions you need to evaluate.
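
The test script above is written for Python 2 (print statements, xrange). A rough Python 3 port might look like the sketch below. The constants N_TERMS and REPEATS are my own names, and REPEATS is set far below timeit's default of 1,000,000 runs just to keep the script quick. Absolute timings depend on the machine and interpreter version, so no expected output is shown.

```python
import timeit

N_TERMS = 1000  # always-true comparisons that follow the failing one
REPEATS = 1000  # assumption: far fewer runs than timeit's default

comparisons = ["{0} == {0}".format(i) for i in range(N_TERMS)]

# Same three expressions as the original script, built as source strings.
explicit_and_test = "1 == 0 and " + " and ".join(comparisons)
list_all_test = "all([1 == 0, " + ", ".join(comparisons) + "])"
generator_all_test = "all(test_gen({0}))".format(N_TERMS)

setup = """
def test_gen(n):
    yield 1 == 0
    for i in range(1, n):
        yield i == i
"""

timings = {
    "explicit and": timeit.timeit(explicit_and_test, number=REPEATS),
    "all() on a list": timeit.timeit(list_all_test, number=REPEATS),
    "all() on a generator": timeit.timeit(
        generator_all_test, setup=setup, number=REPEATS
    ),
}

for label, seconds in timings.items():
    print("{0:<22} {1:.6f} s".format(label, seconds))
```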

Answered By: Farmer Joe