python – remove array column if it contains at least one 0

Question:

Let’s suppose I have a np.array like:

array([[1., 1., 0., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 0.],
       [1., 1., 1., 1., 0.]])

I would like to know if there is a Pythonic way to find all the columns that contain at least one occurrence of 0. In the example I would like to retrieve the indexes 2 and 4.

I need to remove those columns, but I also need to know how many columns I have removed (the indexes are not strictly necessary).
So in the end I simply need the result:

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
Asked By: Sala


Answers:

If you simply want to remove the columns, you can use np.all (or its ndarray method variant, arr.all) to find the columns you want to keep. A column passes only if every entry in it is nonzero. Use the resulting boolean mask to index the 2nd axis:

>>> arr[:, arr.all(axis=0)]
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
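
For anyone who wants to run this outside the REPL, here is a minimal self-contained sketch of the same idea, with the mask pulled out into its own variable. The names `keep` and `result` are just illustrative choices, not part of the original snippet:

```python
import numpy as np

# The array from the question.
arr = np.array([[1., 1., 0., 1., 1.],
                [1., 1., 1., 1., 1.],
                [1., 1., 1., 1., 0.],
                [1., 1., 1., 1., 0.]])

# all(axis=0) reduces over the rows, giving one boolean per column:
# True where every entry is nonzero, False where at least one entry is 0.
keep = arr.all(axis=0)

# Boolean indexing on the second axis keeps only the True columns.
result = arr[:, keep]
```

This works because 0 is the only falsy value in a numeric array, so "all entries truthy" means the same thing as "no zeros in this column".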

If you want to find the indices of those columns with at least one zero, you can use np.any in conjunction with np.nonzero (or np.flatnonzero if you prefer):

>>> np.any(arr == 0, axis=0).nonzero()
(array([2, 4], dtype=int64),)
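
The np.flatnonzero variant mentioned above might look like the sketch below. It returns a plain 1-D index array instead of a one-element tuple, which is usually more convenient for a single axis. The variable names `has_zero` and `zero_cols` are only illustrative:

```python
import numpy as np

# The array from the question.
arr = np.array([[1., 1., 0., 1., 1.],
                [1., 1., 1., 1., 1.],
                [1., 1., 1., 1., 0.],
                [1., 1., 1., 1., 0.]])

# One boolean per column: True if the column contains at least one 0.
has_zero = (arr == 0).any(axis=0)

# Indices of the True entries, as a flat 1-D integer array.
zero_cols = np.flatnonzero(has_zero)
```

Here `zero_cols` holds the column indices 2 and 4. Whether a `dtype=...` suffix shows up when printing it depends on the platform.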

If you want to count them, you can sum the boolean mask directly:

>>> np.any(arr == 0, axis=0).sum()
2
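
Since the question asks for both the trimmed array and the number of removed columns, the two steps can be combined around a single mask. The following is only a sketch: `drop_zero_columns` is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

def drop_zero_columns(arr):
    """Return (arr without zero-containing columns, number of columns removed).

    Hypothetical helper for illustration; expects a 2-D array.
    """
    # One boolean per column: True if the column contains at least one 0.
    has_zero = (arr == 0).any(axis=0)
    # Keep the columns without zeros; summing the mask counts the dropped ones.
    return arr[:, ~has_zero], int(has_zero.sum())

# The array from the question.
arr = np.array([[1., 1., 0., 1., 1.],
                [1., 1., 1., 1., 1.],
                [1., 1., 1., 1., 0.],
                [1., 1., 1., 1., 0.]])

trimmed, n_removed = drop_zero_columns(arr)
```

Computing the mask once avoids scanning the array twice, and `~has_zero` is the same keep-mask that `arr.all(axis=0)` would produce for numeric data.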
Answered By: Chrysophylaxs