# How do I count the occurrence of a certain item in an ndarray?

## Question:

How do I count the number of `0`s and `1`s in the following array?

``````y = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
``````

`y.count(0)` gives:

`numpy.ndarray` object has no attribute `count`

Using `numpy.unique`:

``````import numpy
a = numpy.array([0, 3, 0, 1, 0, 1, 2, 1, 0, 0, 0, 0, 1, 3, 4])
unique, counts = numpy.unique(a, return_counts=True)

>>> dict(zip(unique, counts))
{0: 7, 1: 4, 2: 1, 3: 2, 4: 1}
``````

Non-numpy method using `collections.Counter`;

``````import collections, numpy
a = numpy.array([0, 3, 0, 1, 0, 1, 2, 1, 0, 0, 0, 0, 1, 3, 4])
counter = collections.Counter(a)

>>> counter
Counter({0: 7, 1: 4, 3: 2, 2: 1, 4: 1})
``````
``````y = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
``````

If you know that they are just `0` and `1`:

``````np.sum(y)
``````

gives you the number of ones. `np.sum(1-y)` gives the zeroes.

For slight generality, if you want to count `0` and not zero (but possibly 2 or 3):

``````np.count_nonzero(y)
``````

gives the number of nonzero.

But if you need something more complicated, I don’t think numpy will provide a nice `count` option. In that case, go to collections:

``````import collections
collections.Counter(y)
> Counter({0: 8, 1: 4})
``````

This behaves like a dict

``````collections.Counter(y)
> 8
``````

Convert your array `y` to list `l` and then do `l.count(1)` and `l.count(0)`

``````>>> y = numpy.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
>>> l = list(y)
>>> l.count(1)
4
>>> l.count(0)
8
``````

For your case you could also look into numpy.bincount

``````In : a = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])

In : np.bincount(a)
Out: array([8, 4])  #count of zeros is at index 0, i.e. 8
#count of ones is at index 1, i.e. 4
``````

I’d use np.where:

``````how_many_0 = len(np.where(a==0.))
how_many_1 = len(np.where(a==1.))
``````

It involves one more step, but a more flexible solution which would also work for 2d arrays and more complicated filters is to create a boolean mask and then use .sum() on the mask.

``````>>>>y = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
8
``````

What about using `numpy.count_nonzero`, something like

``````>>> import numpy as np
>>> y = np.array([1, 2, 2, 2, 2, 0, 2, 3, 3, 3, 0, 0, 2, 2, 0])

>>> np.count_nonzero(y == 1)
1
>>> np.count_nonzero(y == 2)
7
>>> np.count_nonzero(y == 3)
3
``````

# Filter and use `len`

Using `len` could be another option.

``````A = np.array([1,0,1,0,1,0,1])
``````

Say we want the number of occurrences of `0`.

``````A[A==0]  # Return the array where item is 0, array([0, 0, 0])
``````

Now, wrap it around with `len`.

``````len(A[A==0])  # 3
len(A[A==1])  # 4
len(A[A==7])  # 0, because there isn't such item.
``````

Personally, I’d go for:
`(y == 0).sum()` and `(y == 1).sum()`

E.g.

``````import numpy as np
y = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
num_zeros = (y == 0).sum()
num_ones = (y == 1).sum()
``````

`y.tolist().count(val)`

with val 0 or 1

Since a python list has a native function `count`, converting to list before using that function is a simple solution.

If you don’t want to use numpy or a collections module you can use a dictionary:

``````d = dict()
a = [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1]
for item in a:
try:
d[item]+=1
except KeyError:
d[item]=1
``````

result:

``````>>>d
{0: 8, 1: 4}
``````

Of course you can also use an if/else statement.
I think the Counter function does almost the same thing but this is more transparant.

Yet another simple solution might be to use numpy.count_nonzero():

``````import numpy as np
y = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
y_nonzero_num = np.count_nonzero(y==1)
y_zero_num = np.count_nonzero(y==0)
y_nonzero_num
4
y_zero_num
8
``````

Don’t let the name mislead you, if you use it with the boolean just like in the example, it will do the trick.

This can be done easily in the following method

``````y = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
y.tolist().count(1)
``````

Honestly I find it easiest to convert to a pandas Series or DataFrame:

``````import pandas as pd
import numpy as np

df = pd.DataFrame({'data':np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])})
print df['data'].value_counts()
``````

Or this nice one-liner suggested by Robert Muil:

``````pd.Series([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1]).value_counts()
``````

A general and simple answer would be:

``````numpy.sum(MyArray==x)   # sum of a binary list of the occurence of x (=0 or 1) in MyArray
``````

which would result into this full code as exemple

``````import numpy
MyArray=numpy.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])  # array we want to search in
x=0   # the value I want to count (can be iterator, in a list, etc.)
numpy.sum(MyArray==0)   # sum of a binary list of the occurence of x in MyArray
``````

Now if MyArray is in multiple dimensions and you want to count the occurence of a distribution of values in line (= pattern hereafter)

``````MyArray=numpy.array([[6, 1],[4, 5],[0, 7],[5, 1],[2, 5],[1, 2],[3, 2],[0, 2],[2, 5],[5, 1],[3, 0]])
x=numpy.array([5,1])   # the value I want to count (can be iterator, in a list, etc.)
temp = numpy.ascontiguousarray(MyArray).view(numpy.dtype((numpy.void, MyArray.dtype.itemsize * MyArray.shape)))  # convert the 2d-array into an array of analyzable patterns
xt=numpy.ascontiguousarray(x).view(numpy.dtype((numpy.void, x.dtype.itemsize * x.shape)))  # convert what you search into one analyzable pattern
numpy.sum(temp==xt)  # count of the searched pattern in the list of patterns
``````

To count the number of occurrences, you can use `np.unique(array, return_counts=True)`:

``````In : boo = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])

# use bool value `True` or equivalently `1`
In : uniq, cnts = np.unique(boo, return_counts=1)
In : uniq
Out: array([0, 1])   #unique elements in input array are: 0, 1

In : cnts
Out: array([8, 4])   # 0 occurs 8 times, 1 occurs 4 times
``````

Since your ndarray contains only 0 and 1,
you can use sum() to get the occurrence of 1s
and len()-sum() to get the occurrence of 0s.

``````num_of_ones = sum(array)
num_of_zeros = len(array)-sum(array)
``````

No one suggested to use `numpy.bincount(input, minlength)` with `minlength = np.size(input)`, but it seems to be a good solution, and definitely the fastest:

``````In : choices = np.random.randint(0, 100, 10000)

In : %timeit [ np.sum(choices == k) for k in range(min(choices), max(choices)+1) ]
100 loops, best of 3: 2.67 ms per loop

In : %timeit np.unique(choices, return_counts=True)
1000 loops, best of 3: 388 µs per loop

In : %timeit np.bincount(choices, minlength=np.size(choices))
100000 loops, best of 3: 16.3 µs per loop
``````

That’s a crazy speedup between `numpy.unique(x, return_counts=True)` and `numpy.bincount(x, minlength=np.max(x))` !

Numpy has a module for this. Just a small hack. Put your input array as bins.

``````numpy.histogram(y, bins=y)
``````

The output are 2 arrays. One with the values itself, other with the corresponding frequencies.

If you know exactly which number you’re looking for, you can use the following;

``````lst = np.array([1,1,2,3,3,6,6,6,3,2,1])
(lst == 2).sum()
``````

returns how many times 2 is occurred in your array.

For generic entries:

``````x = np.array([11, 2, 3, 5, 3, 2, 16, 10, 10, 3, 11, 4, 5, 16, 3, 11, 4])
n = {i:len([j for j in np.where(x==i)]) for i in set(x)}
ix = {i:[j for j in np.where(x==i)] for i in set(x)}
``````

Will output a count:

``````{2: 2, 3: 4, 4: 2, 5: 2, 10: 2, 11: 3, 16: 2}
``````

And indices:

``````{2: [1, 5],
3: [2, 4, 9, 14],
4: [11, 16],
5: [3, 12],
10: [7, 8],
11: [0, 10, 15],
16: [6, 13]}
``````

You can use dictionary comprehension to create a neat one-liner. More about dictionary comprehension can be found here

``````>>> counts = {int(value): list(y).count(value) for value in set(y)}
>>> print(counts)
{0: 8, 1: 4}
``````

This will create a dictionary with the values in your ndarray as keys, and the counts of the values as the values for the keys respectively.

This will work whenever you want to count occurences of a value in arrays of this format.

You have a special array with only 1 and 0 here. So a trick is to use

``````np.mean(x)
``````

which gives you the percentage of 1s in your array. Alternatively, use

``````np.sum(x)
np.sum(1-x)
``````

will give you the absolute number of 1 and 0 in your array.

take advantage of the methods offered by a Series:

``````>>> import pandas as pd
>>> y = [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1]
>>> pd.Series(y).value_counts()
0    8
1    4
dtype: int64
``````
``````using numpy.count

\$ a = [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1]

\$ np.count(a, 1)
``````
``````dict(zip(*numpy.unique(y, return_counts=True)))
``````

Just copied Seppo Enarvi’s comment here which deserves to be a proper answer

Try this:

``````a = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
list(a).count(1)
``````

here I have something, through which you can count the number of occurrence of a particular number:

``````count_of_zero=list(y[y==0]).count(0)

print(count_of_zero)

// according to the match there will be boolean values and according
// to True value the number 0 will be return.
``````

If you are interested in the fastest execution, you know in advance which value(s) to look for, and your array is 1D, or you are otherwise interested in the result on the flattened array (in which case the input of the function should be `np.ravel(arr)` rather than just `arr`), then Numba is your friend:

``````import numba as nb

@nb.jit
def count_nb(arr, value):
result = 0
for x in arr:
if x == value:
result += 1
return result
``````

or, for very large arrays where parallelization may be beneficial:

``````@nb.jit(parallel=True)
def count_nbp(arr, value):
result = 0
for i in nb.prange(arr.size):
if arr[i] == value:
result += 1
return result
``````

These can be benchmarked against `np.count_nonzero()` (which also has a problem of creating a temporary array — something that is avoided in the Numba solutions) and a `np.unique()`-based solution (which is actually counting all unique value values contrarily to the other solutions).

``````import numpy as np

def count_np(arr, value):
return np.count_nonzero(arr == value)
``````
``````import numpy as np

def count_np_uniq(arr, value):
uniques, counts = np.unique(a, return_counts=True)
counter = dict(zip(uniques, counts))
return counter[value] if value in counter else 0
``````

Since the support for "typed" dicts in Numba, it is also possible to have a function counting all occurrences of all elements.
This competes more directly with `np.unique()` because it is capable of counting all values in a single run. Here is proposed a version which eventually only returns the number of elements for a single value (for comparison purposes, similarly to what is done in `count_np_uniq()`):

``````@nb.jit
def count_nb_dict(arr, value):
counter = {arr: 1}
for x in arr:
if x not in counter:
counter[x] = 1
else:
counter[x] += 1
return counter[value] if value in counter else 0
``````

The input is generated with:

``````def gen_input(n, a=0, b=100):
return np.random.randint(a, b, n)
``````

The timings are reported in the following plots (the second row of plots is a zoom on the faster approaches):

Showing that the simple Numba-based solution is fastest for smaller inputs and the parallelized version is fastest for larger inputs.
They NumPy version is reasonably fast at all scales.

When one wants to count all values in an array, `np.unique()` is more performant than a solution implemented manually with Numba for sufficiently large arrays.

EDIT: It seems that the NumPy solution has become faster in recent versions. In a previous iteration, the simple Numba solution was outperforming NumPy’s approach for any input size.

Full code available here.

if you are dealing with very large arrays using generators could be an option. The nice thing here it that this approach works fine for both arrays and lists and you dont need any additional package. Additionally, you are not using that much memory.

``````my_array = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
sum(1 for val in my_array if val==0)
Out: 8
``````

This funktion returns the number of occurences of a variable in an array:

``````def count(array,variable):
number = 0
for i in range(array.shape):
for j in range(array.shape):
if array[i,j] == variable:
number += 1
return number
``````

The simplest,do comment if not necessary

``````import numpy as np
y = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
count_0, count_1 = 0, 0
for i in y_train:
if i == 0:
count_0 += 1
if i == 1:
count_1 += 1
count_0, count_1
``````
Categories: questions
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.