What are acceptable use-cases for python's `assert` statement?

Question:

I often use Python’s assert statement to check user input and fail fast if we’re in a corrupt state. I’m aware that assert gets removed when Python runs with the -O (optimized) flag. I personally don’t run any of my apps in optimized mode, but it feels like I should stay away from assert just in case.

It feels much cleaner to write

assert filename.endswith('.jpg')

than

if not filename.endswith('.jpg'):
    raise RuntimeError

Is this a valid use case for assert? If not, what would a valid use-case for python’s assert statement be?

Asked By: Gattster


Answers:

If being graceful is impossible, be dramatic

Here is the correct version of your code:

if filename.endswith('.jpg'):
    # convert it to the PNG we want
    try:
        filename = convert_jpg_to_png_tmpfile(filename)
    except PNGCreateError:
        # Tell the user their jpg was crap
        print("Your jpg was crap!")

It is a valid case, IMHO when:

  1. The error is totally, 100% fatal, and dealing with it would be too grim to comprehend
  2. The assert should only fail if something causes the laws of logic to change

Otherwise, deal with the eventuality because you can see it coming.

ASSERT == “This should never occur in reality, and if it does, we give up”

Of course, this isn’t the same as

#control should never get here

But I always do

#control should never get here
#but i'm not 100% putting my money where my mouth
#is
assert False

That way I get a nice error. In your example, I would use the if version and convert the file to jpg!

Answered By: Aiden Bell

Personally, I use assert for unexpected errors, or things you don’t expect to happen in real world usage. Exceptions should be used whenever dealing with input from a user or file, as they can be caught and you can tell the user “Hey, I was expecting a .jpg file!!”
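A minimal sketch of that split (the `resize_image` function and its parameters are made up for illustration): an explicit exception guards external input, while assert guards an internal assumption that should already hold:

```python
def resize_image(filename, scale):
    # user-facing input: validate with an explicit exception that survives `python -O`
    if not filename.endswith('.jpg'):
        raise ValueError("Hey, I was expecting a .jpg file, got %r" % filename)
    # internal invariant: should be impossible to violate if our own code is correct
    assert scale > 0, "caller bug: scale should have been validated already"
    return filename, scale

try:
    resize_image("photo.png", 2)
except ValueError as e:
    print(e)
```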

Answered By: Corey D

assert is best used for checks that only need to be active during testing, when you can be sure the code runs without -O.

You may personally never run with -O, but what happens if your code ends up in a larger system and the admin wants to run it with -O?

It’s possible that the system will appear to run fine, but there are subtle bugs which have been switched on by running with -O.
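You can demonstrate the stripping without restarting the interpreter: the built-in `compile()` accepts an `optimize` argument, and `optimize=1` removes assert statements just like running under `python -O` (a small demo, with made-up names):

```python
src = """
def check(x):
    assert x > 0, "x must be positive"
    return x
"""
ns_debug, ns_opt = {}, {}
exec(compile(src, "<demo>", "exec", optimize=0), ns_debug)  # normal run
exec(compile(src, "<demo>", "exec", optimize=1), ns_opt)    # like `python -O`

try:
    ns_debug["check"](-1)
except AssertionError as e:
    print("normal run:", e)            # the assert fires

print("under -O:", ns_opt["check"](-1))  # assert stripped: -1 passes silently
```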

Answered By: John La Rooy

Assertions should be used for expressing invariants, or preconditions.
In your example, you are using them for checking unexpected input – and that’s a completely different class of exceptions.

Depending on the requirements, it may be perfectly OK to raise an exception on wrong input, and stop the application; however the code should always be tailored for expressiveness, and raising an AssertionError is not that explicit.
Much better would be to raise your own exception, or a ValueError.
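A sketch of what that might look like (the exception name and function are hypothetical): a dedicated subclass of ValueError tells the caller exactly what went wrong and, unlike assert, is never stripped by -O:

```python
class NotAJpegError(ValueError):
    """Hypothetical app-specific exception -- far more explicit than a
    bare AssertionError, and it survives `python -O`."""

def validate_filename(filename):
    if not filename.endswith('.jpg'):
        raise NotAJpegError("expected a .jpg filename, got %r" % filename)
    return filename
```

Callers can then catch `NotAJpegError` specifically, or `ValueError` generically.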

Answered By: rob

Totally valid. The assertion is a formal claim about the state of your program.

You should use them for things which can’t be proven. However, they’re handy for things you think you can prove as a check on your logic.

Another example.

def fastExp(a, b):
    assert isinstance(b, int), "This algorithm raises to an integer power"
    etc.

Yet another. The final assertion is a bit silly, since it should be provable.

# Not provable, essential.
assert len(someList) > 0, "Can't work with an empty list."
x = someList[:]
while len(x) != 0:
    startingSize = len(x)
    ... some processing ...
    # Provable.  May be redundant.
    assert len(x) < startingSize, "Design flaw of the worst kind."

Yet another.

def sqrt(x):
    # This may not be provable and is essential.
    assert x >= 0, "Can't cope with negative numbers"
    ...
    # This is provable and may be redundant.
    assert abs(n * n - x) < 0.00001
    return n

There are lots and lots of reasons for making formal assertions.

Answered By: S.Lott

The Python Wiki has a great guide on using assertions effectively.

The answers above don’t necessarily address the -O objection. To quote the above page:

If Python is started with the -O option, then assertions will be stripped out and not evaluated.

Answered By: johnpaulhayes

S.Lott’s answer is the best one, but this was too long to add as a comment on his, so I put it here. It’s how I think about assert: basically, it’s just a shorthand way to do #ifdef DEBUG.
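The #ifdef DEBUG analogy can be made literal with the `__debug__` flag, which is True except under `python -O` (a sketch; `process` is a made-up function):

```python
def process(items):
    if __debug__:
        # like the body of an #ifdef DEBUG block: this branch is compiled
        # out under `python -O`, exactly as a bare assert would be
        assert all(isinstance(i, int) for i in items), "non-int item in input"
    return [i * 2 for i in items]

print(process([1, 2, 3]))  # [2, 4, 6]
```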

There are two schools of thought about input checking: you can do it at the target, or you can do it at the source.

Doing it at the target is inside the code:

def sqrt(x):
    if x < 0:
        raise ValueError('sqrt requires positive numbers')
    root = <do stuff>
    return root

def some_func():
    y = float(input('Type a number:'))
    try:
        print('Square root is %f' % sqrt(y))
    except ValueError:
        # User did not type valid input
        print('%f must be a positive number!' % y)

Now, this has a lot of advantages. Probably, the guy who wrote sqrt knows the most about what values are valid for his algorithm. In the above, I have no idea whether the value I got from the user is valid. Someone has to check it, and it makes sense to do it in the code that knows the most about what’s valid – the sqrt algorithm itself.

However, there is a performance penalty. Imagine code like this:

def sqrt(x):
    if x < 0:
        raise ValueError('sqrt requires positive numbers')
    root = <do stuff>
    return root

def some_func(maxx=100000):
    all_sqrts = [sqrt(x) for x in range(maxx)]
    i = sqrt(-1.0)
    return all_sqrts

Now, this function is going to call sqrt 100k times. And every time, sqrt is going to check that the value is non-negative. But we already know every value is valid, because of how we generated those numbers – those extra validity checks are just wasted execution time. Wouldn’t it be nice to get rid of them? Then there’s one call that will throw a ValueError, so we’ll catch it and realize we made a mistake. I write my program relying on the subfunction to check for me, and so I just worry about recovering when it doesn’t work.

The second school of thought is that, instead of the target function checking its inputs, you add constraints to the definition and require the caller to ensure that it’s calling with valid data. The function promises that, given good data, it will return what its contract says. This avoids all those checks, because the caller knows much more about the data it’s sending – where it came from and what constraints it inherently has – than the target function does. The end result of this is code contracts and similar structures, but originally it was just by convention, i.e. in the design, like the comment below:

# This function is valid if x >= 0
def sqrt(x):
    root = <do stuff>
    return root

def long_function(maxx=100000):
    # This is a valid call - every x I pass to sqrt is valid
    sqrtlist1 = [sqrt(x) for x in range(maxx)]
    # This one is a program error - calling the function with incorrect
    # arguments - but one that can't be statically determined.
    # It will throw an exception somewhere in the sqrt code above.
    i = sqrt(-1.0)

Of course, bugs happen, and the contract might get violated. But so far, the result is about the same – in both cases, if I call sqrt(-1.0), I will get an exception inside the sqrt code itself, can walk up the exception stack, and figure out where my bug is.

However, there are much more insidious cases. Suppose, for instance, my code generates a list index, stores it, later looks up the index, extracts a value, and does some processing – and let’s say we get a -1 list index by accident. All those steps may actually complete without any errors, but we have wrong data at the end of the test, and we have no idea why.
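A tiny demonstration of that failure mode (the off-by-one bug is contrived): a -1 index produced by accident silently reads the wrong element instead of raising:

```python
data = ['a', 'b', 'c']
idx = data.index('a') - 1   # contrived off-by-one bug: idx is now -1
value = data[idx]           # no error! Python wraps -1 around to the last element
print(value)                # prints 'c' -- wrong data, not a single exception

# an assert near where the index is computed catches the bug immediately:
# assert 0 <= idx < len(data), "computed index out of intended range"
```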

So why assert? It would be nice to have something with which we could get closer-to-the-failure debug information while we are testing and proving our contracts. This is pretty much exactly the same as the first form – after all, it’s doing the exact same comparison – but it is syntactically neater and slightly more specialized towards verifying a contract. A side benefit is that once you are reasonably sure your program works, and are optimizing and trading debuggability for performance, all these now-redundant checks can be removed.

Answered By: Corley Brigman

Bottom line:

  • assert and its semantics are a heritage from earlier languages.
  • Python’s drastic shift of focus has proven to make the traditional use case irrelevant.
  • No other use cases have been officially proposed as of this writing, though there are ideas out there to reform it into a general-purpose check.
  • It can be used (and is indeed used) as a general-purpose check as it is now if you’re okay with the limitations.

The intended use case that assert statements were originally created with in mind is:

  • to make sanity checks that are relevant but considered too expensive or unnecessary for production use.
  • E.g. these can be input checks for internal functions, checks that free() is passed a pointer previously obtained from malloc(), internal structure integrity checks after non-trivial operations on them, etc.
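A Python rendering of that "expensive sanity check" pattern (the merge function is illustrative): an O(n) integrity check after a non-trivial operation, useful while developing and stripped by `python -O` in production:

```python
def merge_sorted(a, b):
    """Merge two already-sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    # O(n) integrity check -- too expensive for production, so it lives
    # in an assert and disappears under `python -O`
    assert all(out[k] <= out[k + 1] for k in range(len(out) - 1))
    return out

print(merge_sorted([1, 3, 5], [2, 4]))  # [1, 2, 3, 4, 5]
```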

This was a big deal in older days and still is in environments designed for performance. That’s the entire reason for its semantics in, say, C++/C#:

  1. Being cut out in a release build
  2. Terminating a program immediately, ungracefully and as loudly as possible.

Python, however, consciously and intentionally sacrifices code performance for programmer’s performance (believe it or not, I recently got a 100x speedup by porting some code from Python to Cython – without even disabling boundary checks!). Python code runs in a “safe” environment in that you cannot utterly “break” your process (or an entire system) to an untraceable segfault/BSoD/bricking – the worst you’ll get is an unhandled exception with a load of debugging information attached, all gracefully presented to you in a readable form.

  • Which implies that runtime and library code should include all kinds of checks at appropriate moments – at all times, whether it’s “debug mode” or not.
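For instance, the checks built into the runtime and standard library are never compiled out – this ValueError is raised whether or not the interpreter runs with -O:

```python
# library-level checks stay active at all times, unlike assert statements
try:
    [1, 2, 3].index(99)
except ValueError as e:
    print(e)
```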

In addition, Python’s strong emphasis on providing the sources at all times (transparent compilation, source lines in tracebacks, the stock debugger expecting them alongside the .pycs to be of any use) very much blurs the line between “development” and “use” (that’s one reason why setuptools‘ self-contained .eggs created a backlash, and why pip always installs packages unpacked: if one is installed packed, the source is no longer readily available and problems with it no longer diagnosable).

Combined, these properties all but destroy any use cases for “debug-only” code.

And, you guessed it, an idea about repurposing assert as a general-purpose check eventually surfaced.

Answered By: ivan_pozdeev