Counting positive integer elements in a list with Python list comprehensions

Question:

I have a list of integers and I need to count how many of them are > 0.
I’m currently doing it with a list comprehension that looks like this:

sum([1 for x in frequencies if x > 0])

It seems like a decent comprehension but I don’t really like the “1”; it seems like a bit of a magic number. Is there a more Pythonish way to do this?

Asked By: fairfieldt


Answers:

You could use len() on the filtered list:

len([x for x in frequencies if x > 0])
Answered By: sth

A slightly more Pythonic way would be to use a generator instead:

sum(1 for x in frequencies if x > 0)

This avoids generating the whole list before calling sum().
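
As a rough illustration of the difference (a sketch only; the sample data and the names `as_list` and `as_gen` are made up here and are not part of the original answer), both forms give the same count, but only the list version materializes every element first:

```python
import sys

# Made-up sample data: values cycle through 0, 1, 2
frequencies = [i % 3 for i in range(10_000)]

as_list = [1 for x in frequencies if x > 0]   # builds the full list in memory
as_gen = (1 for x in frequencies if x > 0)    # yields items lazily, one at a time

list_size = sys.getsizeof(as_list)  # grows with the number of matches
gen_size = sys.getsizeof(as_gen)    # small, fixed-size generator object

count_from_list = sum(as_list)
count_from_gen = sum(as_gen)
print(count_from_list, count_from_gen)
print(list_size > gen_size)
```

The exact sizes depend on the interpreter, so the sketch only compares them rather than quoting numbers.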

Answered By: Greg Hewgill

If you want to reduce the amount of memory used, you can avoid generating a temporary list by using a generator:

sum(x > 0 for x in frequencies)

This works because bool is a subclass of int:

>>> isinstance(True,int)
True

and True’s value is 1:

>>> True==1
True
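
A small sketch of what that means in practice (the sample list is made up for illustration): each comparison yields a bool, and sum() adds those bools as 0 or 1.

```python
frequencies = [3, 0, -1, 7, 0, 2]  # made-up sample data

flags = [x > 0 for x in frequencies]  # one bool per element
print(flags)            # [True, False, False, True, False, True]
print(True + True)      # 2, since bools take part in integer arithmetic

positive_count = sum(flags)
print(positive_count)   # 3
```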

However, as Joe Golton points out in the comments, this solution is not very fast. If you have enough memory to use a temporary list, then sth’s solution may be faster. Here are some timings comparing various solutions:

>>> frequencies = [random.randint(0,2) for i in range(10**5)]

>>> %timeit len([x for x in frequencies if x > 0])   # sth
100 loops, best of 3: 3.93 ms per loop

>>> %timeit sum([1 for x in frequencies if x > 0])
100 loops, best of 3: 4.45 ms per loop

>>> %timeit sum(1 for x in frequencies if x > 0)
100 loops, best of 3: 6.17 ms per loop

>>> %timeit sum(x > 0 for x in frequencies)
100 loops, best of 3: 8.57 ms per loop

Beware that timeit results may vary depending on the version of Python, the OS, and the hardware.
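
To rerun the comparison outside IPython, something like the following sketch with the standard timeit module could be used. The list size, seed, repeat counts, and the `candidates` dictionary are arbitrary choices made for this example, not part of the original measurements:

```python
import random
import timeit

random.seed(0)  # fixed seed so the data is repeatable (an arbitrary choice)
frequencies = [random.randint(0, 2) for _ in range(10**4)]

# Hypothetical labels mapped to the four pure-Python variants
candidates = {
    "len of list": lambda: len([x for x in frequencies if x > 0]),
    "sum of list": lambda: sum([1 for x in frequencies if x > 0]),
    "sum of generator": lambda: sum(1 for x in frequencies if x > 0),
    "sum of bools": lambda: sum(x > 0 for x in frequencies),
}

# Sanity check: every variant should report the same count
results = {name: fn() for name, fn in candidates.items()}

for name, fn in candidates.items():
    # Best of 3 repeats, 20 calls per repeat
    best = min(timeit.repeat(fn, number=20, repeat=3)) / 20
    print(f"{name:>18}: {best * 1e3:.3f} ms per call")
```

Taking the minimum over several repeats follows the usual timeit advice, since slower runs mostly reflect interference from other processes.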

Of course, if you are doing math on a large list of numbers, you should probably be using NumPy:

>>> frequencies = np.random.randint(3, size=10**5)
>>> %timeit (frequencies > 0).sum()
1000 loops, best of 3: 669 us per loop

The NumPy array requires less memory than the equivalent Python list, and the calculation can be performed much faster than any pure Python solution.

Answered By: unutbu

How about this?

reduce(lambda x, y: x+1 if y > 0 else x, frequencies, 0)

EDIT:
With inspiration from the accepted answer from @~unutbu:

reduce(lambda x, y: x + (y > 0), frequencies, 0)
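
One detail worth checking with reduce-based counting: if no initial value is passed, reduce() uses the first element as the starting accumulator instead of testing it. A minimal sketch with made-up data (the names `without_init` and `with_init` are just for illustration):

```python
from functools import reduce  # reduce is not a builtin in Python 3

frequencies = [2, 0, 1]  # made-up data; the first element is greater than 1

# No initial value: the first element (2) becomes the starting accumulator.
without_init = reduce(lambda acc, y: acc + (y > 0), frequencies)

# Explicit initial value 0: every element, including the first, is tested.
with_init = reduce(lambda acc, y: acc + (y > 0), frequencies, 0)

print(without_init, with_init)  # 3 2
```

Only two elements are positive, so the version with the explicit 0 is the one that gives the right answer here.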

Answered By: Peter Jaric

This works, but adding bools as ints may be dangerous. Please take this code with a grain of salt (maintainability goes first):

sum(k > 0 for k in x)
Answered By: Escualo

If the array only contains elements >= 0 (i.e. all elements are either 0 or a positive integer) then you could just count the zeros and subtract this number from the length of the array:

len(arr) - arr.count(0)
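
A quick sketch of why the non-negative assumption matters (both lists are made up for illustration):

```python
arr = [0, 4, 0, 1, 7]  # made-up data with no negative values
shortcut = len(arr) - arr.count(0)
direct = sum(1 for x in arr if x > 0)
print(shortcut, direct)  # 3 3

with_negative = [0, 4, -2, 1]  # a negative value breaks the assumption
shortcut_neg = len(with_negative) - with_negative.count(0)  # also counts -2
direct_neg = sum(1 for x in with_negative if x > 0)
print(shortcut_neg, direct_neg)  # 3 2
```
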
Answered By: ben_nuttall

I would like to point out that everything said so far applies to lists. If we have a NumPy array, there are solutions that will be at least forty times faster…

Here is a summary of all the solutions given, tested for efficiency, plus a few more (I had to modify the reduce code to be able to run it in Python 3). Note that the last result is in microseconds, not milliseconds.
(The timing results were posted as an image.)

Code in copy-pastable format:

import random
import functools
from collections import Counter

import numpy as np

frequencies = [random.randint(0, 2) for i in range(10**5)]

%timeit len([x for x in frequencies if x > 0])   # sth
%timeit sum([1 for x in frequencies if x > 0])
%timeit sum(1 for x in frequencies if x > 0)
%timeit sum(x > 0 for x in frequencies)
# initial value 0 so the first element is tested rather than used as the accumulator
%timeit functools.reduce(lambda x, y: x + (y > 0), frequencies, 0)
# Counter tallies every value; the positive count still has to be derived from it
%timeit Counter(frequencies)

# ------- NumPy (including list-to-array conversion) -------
%timeit (np.array(frequencies) > 0).sum()

# ------- NumPy without conversion -------
npf = np.array(frequencies)
%timeit (npf > 0).sum()
Answered By: ntg

You can also use numpy.count_nonzero. It counts every nonzero element, so it matches the positive count only when the data has no negative values:

import numpy as np
xs = [1, 0, 4, 0, 7]
print(np.count_nonzero(xs))  # 3
Answered By: KyleMit