How to count values in a certain range in a NumPy array?

Question:

I have a NumPy array of values. I want to count how many of these values fall in a specific range, say 25 < x < 100. I have read about Counter, but it seems to work only for specific values, not ranges of values. I have searched but have not found anything regarding my specific problem. If someone could point me towards the proper documentation, I would appreciate it. Thank you.

I have tried this

   X = array(X)
   for X in range(25, 100):
       print(X)

But it just gives me the numbers in between 25 and 99.

EDIT
The data I am using was created by another program. I then used a script to read the data and store it as a list. I then took the list and turned it into an array using array(r).

Edit

The result of running

 >>> a[0:10]
 array(['29.63827346', '40.61488812', '25.48300065', '26.22910525',
   '42.41172923', '20.15013315', '34.95323355', '13.03604098',
   '29.71097606', '9.53222141'], 
  dtype='<U11')
Asked By: Stripers247


Answers:

If your array is called a, the number of elements fulfilling 25 < x < 100 is

((25 < a) & (a < 100)).sum()

The expression (25 < a) & (a < 100) results in a Boolean array with the same shape as a with the value True for all elements that satisfy the condition. Summing over this Boolean array treats True values as 1 and False values as 0.
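To make the mask-then-sum idea concrete, here is a minimal sketch on a small made-up array (the values are illustrative, not the asker's data). It shows the intermediate Boolean array and the resulting count; note that the strict inequalities exclude the endpoints 25 and 100.

```python
import numpy as np

# Illustrative values only; 25.0 and 100.0 sit exactly on the boundaries.
a = np.array([10.0, 25.0, 30.5, 99.9, 100.0, 150.0])

# Element-wise comparisons give Boolean arrays; & combines them element-wise.
mask = (25 < a) & (a < 100)

# Summing counts the True entries (True counts as 1, False as 0).
count = int(mask.sum())

print(mask)
print(count)
```

The parentheses around each comparison are needed because & binds more tightly than < in Python.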

Answered By: Sven Marnach

You could use histogram. Here’s a basic usage example:

>>> import numpy
>>> a = numpy.random.random(size=100) * 100 
>>> numpy.histogram(a, bins=(0.0, 7.3, 22.4, 55.5, 77, 79, 98, 100))
(array([ 8, 14, 34, 31,  0, 12,  1]), 
 array([   0. ,    7.3,   22.4,   55.5,   77. ,   79. ,   98. ,  100. ]))

In your particular case, it would look something like this:

>>> numpy.histogram(a, bins=(25, 100))
(array([73]), array([ 25, 100]))
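One caveat worth checking before relying on the single-bin call: numpy.histogram treats the last bin as closed on both sides, so with bins=(25, 100) the values 25 and 100 themselves are counted, unlike the strict 25 < x < 100 test. The following sketch, on made-up integers, compares the two counts.

```python
import numpy as np

# Illustrative values; 25 and 100 lie exactly on the bin edges.
a = np.array([10, 25, 30, 99, 100, 150])

# Single bin covering [25, 100]; values outside the bin range are ignored.
hist_count = int(np.histogram(a, bins=(25, 100))[0][0])

# Strict inequalities exclude both endpoints.
strict_count = int(((25 < a) & (a < 100)).sum())

print(hist_count, strict_count)
```

If the endpoints can occur in the data, the two approaches may therefore disagree by the number of boundary values.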

Additionally, when you have a list of strings, you have to specify the dtype explicitly so that NumPy produces an array of floats instead of an array of strings.

>>> strings = [str(i) for i in range(10)]
>>> numpy.array(strings)
array(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], 
      dtype='|S1')
>>> numpy.array(strings, dtype=float)
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.])
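
Applied to an array that already holds strings, like the one shown in the question, a conversion with astype(float) should work as well. This is a sketch using a few of the values from the question's a[0:10] output; the variable names raw and values are placeholders.

```python
import numpy as np

# A few of the string values shown in the question (dtype '<U11').
raw = np.array(['29.63827346', '40.61488812', '25.48300065',
                '20.15013315', '9.53222141'])

# Convert the strings to floats before doing numeric comparisons.
values = raw.astype(float)

# Count values strictly between 25 and 100.
count = int(((25 < values) & (values < 100)).sum())
print(count)
```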
Answered By: senderle

Sven’s answer is the way to do it if you don’t wish to further process matching values.
The following two examples build copies containing only the matching values and then take their size:

np.compress((25 < a) & (a < 100), a).size

Or:

a[(25 < a) & (a < 100)].size

Example interpreter session:

>>> import numpy as np
>>> a = np.random.randint(200,size=100)
>>> a
array([194, 131,  10, 100, 199, 123,  36,  14,  52, 195, 114, 181, 138,
       144,  70, 185, 127,  52,  41, 126, 159,  39,  68, 118, 124, 119,
        45, 161,  66,  29, 179, 194, 145, 163, 190, 150, 186,  25,  61,
       187,   0,  69,  87,  20, 192,  18, 147,  53,  40, 113, 193, 178,
       104, 170, 133,  69,  61,  48,  84, 121,  13,  49,  11,  29, 136,
       141,  64,  22, 111, 162, 107,  33, 130,  11,  22, 167, 157,  99,
        59,  12,  70, 154,  44,  45, 110, 180, 116,  56, 136,  54, 139,
        26,  77, 128,  55, 143, 133, 137,   3,  83])
>>> np.compress((25 < a) & (a < 100),a).size
34
>>> a[(25 < a) & (a < 100)].size
34

The above examples use a “bit-wise and” (&) to combine, element by element, the two boolean arrays produced by the comparisons.
Another way to write Sven’s excellent answer, for example, is:

np.bitwise_and(25 < a, a < 100).sum() 

The boolean arrays contain True values when the condition matches, and False when it doesn’t.
A bonus aspect of boolean values is that True is equivalent to 1 and False to 0.
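
As a small illustration of the two points above, the sketch below (on made-up values) checks that the & operator, np.bitwise_and, and np.logical_and give the same count on boolean arrays, and that summing booleans counts the True entries.

```python
import numpy as np

# Illustrative values only.
a = np.array([10, 30, 60, 120])

# Three equivalent ways to combine the two boolean arrays.
with_operator = int(((25 < a) & (a < 100)).sum())
with_bitwise = int(np.bitwise_and(25 < a, a < 100).sum())
with_logical = int(np.logical_and(25 < a, a < 100).sum())

# True behaves like 1 and False like 0 when summed.
flags_total = int(np.array([True, False, True]).sum())

print(with_operator, with_bitwise, with_logical, flags_total)
```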

Answered By: mechanical_meat

I think @Sven Marnach’s answer is quite nice, because it operates on the NumPy array itself, which is fast and efficient (C implementation).

I like to put the test into one condition like 25 < x < 100, so I would probably do something like this:

len([x for x in a.ravel() if 25 < x < 100])
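
A note on why the chained form needs the loop: 25 < x < 100 works on individual scalars, but writing 25 < a < 100 directly on an array with more than one element raises a ValueError, because Python tries to reduce the first comparison to a single truth value. The sketch below, on a made-up 2-D array, shows both behaviours and checks the loop against the vectorized count.

```python
import numpy as np

# Illustrative 2-D array; ravel() flattens it for the Python-level loop.
a = np.array([[10, 30], [60, 120]])

# Chained comparison applied to each scalar element in turn.
loop_count = len([x for x in a.ravel() if 25 < x < 100])

# Vectorized equivalent for comparison.
vector_count = int(((25 < a) & (a < 100)).sum())

# The chained comparison on the whole array is ambiguous and raises.
try:
    25 < a < 100
    chained_ok = True
except ValueError:
    chained_ok = False

print(loop_count, vector_count, chained_ok)
```

The list comprehension iterates in Python, so it is likely to be noticeably slower than the vectorized form on large arrays.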

Answered By: wim

Building on Sven’s good approach, you can also do the slightly more explicit:

numpy.count_nonzero((25 < a) & (a < 100))

This first creates an array of booleans with one boolean for each input number in array a, and then counts the number of non-False (i.e. True) values (which gives the number of matching numbers).

Note, however, that this approach was twice as slow as Sven’s .sum() approach on an array of 100k numbers (NumPy 1.6.1, Python 2.7.3): about 300 µs versus 150 µs.
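
Since the figures above come from one specific NumPy and Python version, it may be worth re-measuring on your own setup. The following is a rough timing sketch using the standard timeit module; the array size, the number of repetitions, and the helper names with_sum and with_count_nonzero are arbitrary choices for illustration, and the results will vary with NumPy version and hardware.

```python
import timeit
import numpy as np

# Random test data: 100k floats in [0, 200). Requires NumPy >= 1.17 for default_rng.
rng = np.random.default_rng(0)
a = rng.random(100_000) * 200

def with_sum():
    # Count by summing the Boolean mask.
    return ((25 < a) & (a < 100)).sum()

def with_count_nonzero():
    # Count by counting the non-False entries of the mask.
    return np.count_nonzero((25 < a) & (a < 100))

# Total seconds for 200 calls of each variant.
t_sum = timeit.timeit(with_sum, number=200)
t_cnz = timeit.timeit(with_count_nonzero, number=200)

print("sum():           %.1f us per call" % (t_sum / 200 * 1e6))
print("count_nonzero(): %.1f us per call" % (t_cnz / 200 * 1e6))
```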

Answered By: Eric O Lebigot