Set numpy array elements to zero if they are above a specific threshold

Question:

Say I have a numpy array consisting of 10 elements, for example:

a = np.array([2, 23, 15, 7, 9, 11, 17, 19, 5, 3])

Now I want to efficiently set all values of a that are higher than 10 to 0, so I'll get:

[2, 0, 0, 7, 9, 0, 0, 0, 5, 3]

I currently use a for loop, which is very slow:

import numpy as np

# Zero values above "threshold value".
def flat_values(sig, tv):
    """
    :param sig: signal.
    :param tv: threshold value.
    :return: sig, with values above tv set to 0 in place.
    """
    for i in np.arange(np.size(sig)):
        if sig[i] > tv:
            sig[i] = 0
    return sig

How can I achieve that in the most efficient way, bearing in mind big arrays of, say, 10^6 elements?

Asked By: bluevoxel


Answers:

In [7]: a = np.array([2, 23, 15, 7, 9, 11, 17, 19, 5, 3])

In [8]: a[a > 10] = 0

In [9]: a
Out[9]: array([2, 0, 0, 7, 9, 0, 0, 0, 5, 3])
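
The same mask assignment can replace the loop in the question's function. This is a minimal sketch, assuming the goal is to zero the values above the threshold; the name zero_above is made up for illustration:

```python
import numpy as np

def zero_above(sig, tv):
    """Set every element of sig greater than tv to 0, in place (hypothetical helper)."""
    sig[sig > tv] = 0  # the boolean mask selects the elements to overwrite
    return sig

a = np.array([2, 23, 15, 7, 9, 11, 17, 19, 5, 3])
print(zero_above(a, 10))  # [2 0 0 7 9 0 0 0 5 3]
```

Like the original loop, this modifies the array that is passed in and returns it.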
Answered By: unutbu

Generally, list comprehensions are faster than equivalent for loops in Python (the interpreter can skip some of the bookkeeping that a regular for loop needs). Note that the result here is a plain list rather than a numpy array:

a = [0 if a_ > thresh else a_ for a_ in a]

but, as @unutbu correctly pointed out, numpy supports indexing with a boolean array, and an element-wise comparison gives you exactly such an array, so:

super_threshold_indices = a > thresh
a[super_threshold_indices] = 0

would be even faster.
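
To make the intermediate step visible, here is a small sketch that inspects the mask and checks it against the list comprehension. The value of thresh and the variable names mask and as_list are assumptions for illustration:

```python
import numpy as np

thresh = 10  # assumed threshold, taken from the question
a = np.array([2, 23, 15, 7, 9, 11, 17, 19, 5, 3])

mask = a > thresh        # element-wise comparison: one bool per element
print(mask.dtype)        # bool
print(int(mask.sum()))   # 5 elements exceed the threshold

# Pure-Python equivalent, for comparison only
as_list = [0 if x > thresh else x for x in a.tolist()]

a[mask] = 0              # in-place assignment through the mask
print(a.tolist() == as_list)  # True
```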

Generally, when applying operations to vectors of data, have a look at numpy's universal functions (ufuncs), which often perform much better than Python functions that you map using any native mechanism.
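
As one possible illustration of working with ufuncs directly, the sketch below calls np.greater (the ufunc behind the > operator on arrays) with a preallocated output buffer and then uses np.copyto to write zeros through the mask. The buffer name mask and the threshold value are assumptions; whether this beats plain mask assignment depends on the workload.

```python
import numpy as np

thresh = 10  # assumed threshold, taken from the question
a = np.array([2, 23, 15, 7, 9, 11, 17, 19, 5, 3])

# Preallocated boolean buffer that np.greater fills via its out= argument
mask = np.empty(a.shape, dtype=bool)
np.greater(a, thresh, out=mask)

# Write the scalar 0 into a only where mask is True
np.copyto(a, 0, where=mask)
print(a)  # [2 0 0 7 9 0 0 0 5 3]
```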

Answered By: Marcus Müller

If you don't want to change your original array, use np.where, which returns a new array:

In [2]: a = np.array([2, 23, 15, 7, 9, 11, 17, 19, 5, 3])
      
In [3]: b = np.where(a > 10, 0, a)

In [4]: b
Out[4]: array([2, 0, 0, 7, 9, 0, 0, 0, 5, 3])

In [5]: a
Out[5]: array([ 2, 23, 15,  7,  9, 11, 17, 19,  5,  3])
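
np.where works element-wise on arrays of any shape, so the same call applies to multi-dimensional data. A short sketch with a made-up 2-D array (the names grid and zeroed are illustrative):

```python
import numpy as np

grid = np.array([[2, 23, 15],
                 [7, 9, 11]])

zeroed = np.where(grid > 10, 0, grid)  # new array; grid is left untouched
print(zeroed.tolist())  # [[2, 0, 0], [7, 9, 0]]
print(grid.tolist())    # [[2, 23, 15], [7, 9, 11]]
```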
Answered By: fabda01

In the Neural Networks from Scratch series by sentdex on YouTube, np.maximum(0, [your array]) is used to set all values less than 0 to 0.

For your question I tried np.minimum(10, [your array]) and it ran incredibly fast. Note, though, that it caps values above 10 at 10 rather than setting them to 0, so it only fits if clipping is acceptable. I even ran it on an array of 10e6 elements (uniform distribution generated using 50 * np.random.rand(10000000)), and it finished in 0.039571 seconds. I hope this is fast enough.
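
A small sketch comparing np.minimum with np.where on the question's array; the two results differ for values above the threshold. The timing part is only a rough check on an assumed array size of 10^6, and the numbers will vary by machine:

```python
import timeit
import numpy as np

a = np.array([2, 23, 15, 7, 9, 11, 17, 19, 5, 3])
print(np.minimum(10, a).tolist())       # [2, 10, 10, 7, 9, 10, 10, 10, 5, 3]
print(np.where(a > 10, 0, a).tolist())  # [2, 0, 0, 7, 9, 0, 0, 0, 5, 3]

# Rough timing on a larger array (10 runs each)
big = 50 * np.random.rand(1_000_000)
t_min = timeit.timeit(lambda: np.minimum(10, big), number=10)
t_where = timeit.timeit(lambda: np.where(big > 10, 0, big), number=10)
print(f"np.minimum: {t_min:.4f} s, np.where: {t_where:.4f} s")
```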

Answered By: Matthew Kozubov