R's which() and which.min() Equivalent in Python
Question:
I read the similar topic here. I think my question is different, or at least .index()
could not solve my problem.
Here is a simple piece of R code and its output:
x <- c(1:4, 0:5, 11)
x
#[1] 1 2 3 4 0 1 2 3 4 5 11
which(x==2)
# [1] 2 7
min(which(x==2))
# [1] 2
which.min(x)
#[1] 5
which() simply returns the indices of the items that meet the condition.
If x is the input in Python, how can I get the indices of the elements that meet the criterion x==2,
and the index of the smallest element in the array (what which.min gives in R)?
import numpy as np
x = [1,2,3,4,0,1,2,3,4,11]
x = np.array(x)
x[x>2].index()
# AttributeError: 'numpy.ndarray' object has no attribute 'index'
Answers:
A simple loop will do:
res = []
x = [1,2,3,4,0,1,2,3,4,11]
for i in range(len(x)):
    # check_condition stands for whatever test you need, e.g. x[i] == 2
    if check_condition(x[i]):
        res.append(i)
A one-liner with a list comprehension:
res = [i for i, v in enumerate(x) if check_condition(v)]
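To make the two snippets above runnable, here is a minimal sketch with a concrete predicate. check_condition is not defined in the answer, so the version below (a test for equality with 2, mirroring R's x==2) is only an assumption for illustration.

```python
def check_condition(value):
    # Hypothetical predicate, standing in for R's x == 2.
    return value == 2

x = [1, 2, 3, 4, 0, 1, 2, 3, 4, 11]

# Explicit loop: collect every 0-based index whose value passes the check.
loop_result = []
for i in range(len(x)):
    if check_condition(x[i]):
        loop_result.append(i)

# Equivalent list comprehension.
comprehension_result = [i for i, v in enumerate(x) if check_condition(v)]

print(loop_result)           # [1, 6]
print(comprehension_result)  # [1, 6]
```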
NumPy does have built-in functions for this:
import numpy as np
x = [1,2,3,4,0,1,2,3,4,11]
x = np.array(x)
np.where(x == 2)
# Out[9]: (array([1, 6], dtype=int64),)
np.min(np.where(x == 2))
# Out[10]: 1
np.argmin(x)
# Out[11]: 4
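Note that these results are 0-based, while R's are 1-based, which is why R reports 2 7 and 5 where NumPy reports 1 6 and 4. A rough sketch of R-like wrappers follows; the names which and which_min and the one_based flag are made up for this illustration and are not part of NumPy.

```python
import numpy as np

def which(mask, one_based=False):
    # Indices of the True entries of a 1-D boolean mask.
    idx = np.flatnonzero(mask)
    return idx + 1 if one_based else idx

def which_min(values, one_based=False):
    # Index of the first minimum, similar to R's which.min().
    idx = int(np.argmin(values))
    return idx + 1 if one_based else idx

x = np.array([1, 2, 3, 4, 0, 1, 2, 3, 4, 11])

print(which(x == 2))                  # [1 6]
print(which(x == 2, one_based=True))  # [2 7]
print(which_min(x))                   # 4
print(which_min(x, one_based=True))   # 5
```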
You could also use heapq to find the index of the smallest element. It also lets you find several at once (for example, the indices of the 2 smallest).
import heapq
import numpy as np
x = np.array([1,2,3,4,0,1,2,3,4,11])
heapq.nsmallest(2, range(len(x)), key=x.take)
This returns:
[4, 0]
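If the data is already a NumPy array, a stable np.argsort is another way to get the indices of the k smallest values. This is an alternative to the heapq call above, not part of the original answer; the variable k is introduced here only for illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 0, 1, 2, 3, 4, 11])
k = 2  # how many of the smallest elements to locate

# A stable sort keeps tied values in their original order,
# so the first 1 (index 0) comes before the second 1 (index 5).
smallest_idx = np.argsort(x, kind="stable")[:k]

print(smallest_idx.tolist())  # [4, 0]
```

Sorting the whole array costs more than heapq.nsmallest when k is small and the array is large, so this is mainly a convenience.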
NumPy for R gives you NumPy equivalents for a number of R functions.
As for your specific question:
import numpy as np
x = [1,2,3,4,0,1,2,3,4,11]
arr = np.array(x)
print(arr)
# [ 1 2 3 4 0 1 2 3 4 11]
print(arr.argmin(0)) # R's which.min()
# 4
print((arr==2).nonzero()) # R's which()
# (array([1, 6]),)
A method based on pandas positional indexing and NumPy, which returns the value of one column (column2) in the row where another column (column1) has its minimum (use np.argmax for the maximum):
df.iloc[np.argmin(df['column1'].values)]['column2']
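As a sketch of how that line behaves, assume a small pandas DataFrame. The column names column1 and column2 come from the answer, but the data below is invented for illustration. pandas also offers Series.idxmin, which returns the index label of the minimum and can be combined with .loc.

```python
import numpy as np
import pandas as pd

# Made-up data, only to have something to index into.
df = pd.DataFrame({"column1": [3, 1, 2], "column2": ["a", "b", "c"]})

# Positional lookup, as in the answer: row position of the minimum of column1.
by_position = df.iloc[np.argmin(df["column1"].values)]["column2"]

# Label-based lookup with idxmin.
by_label = df.loc[df["column1"].idxmin(), "column2"]

print(by_position)  # b
print(by_label)     # b
```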
For a plain Python list, the built-in list method index can be used for this purpose:
x = [1,2,3,4,0,1,2,3,4,11]
print(x.index(min(x)))
#4
print(x.index(max(x)))
#9
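x.index(min(x)) scans the list twice, once for the minimum and once for its position. A single-pass variant, shown here as one possible alternative, takes the minimum over the indices with the values as the key. Like index, it returns the first position when there are ties.

```python
x = [1, 2, 3, 4, 0, 1, 2, 3, 4, 11]

# min/max over the indices, comparing by the value stored at each index.
min_idx = min(range(len(x)), key=x.__getitem__)
max_idx = max(range(len(x)), key=x.__getitem__)

print(min_idx)  # 4
print(max_idx)  # 9
```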
However, for indices based on a condition, np.where or a manual loop with enumerate may work:
index_greater_than_two1 = [idx for idx, val in enumerate(x) if val>2]
print(index_greater_than_two1)
# [2, 3, 7, 8, 9]
# OR
index_greater_than_two2 = np.where(np.array(x)>2)
print(index_greater_than_two2)
# (array([2, 3, 7, 8, 9], dtype=int64),)
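np.where returns a tuple because it yields one index array per dimension. For a 1-D array the indices are the first (and only) item; for a 2-D array the row and column indices come back separately. A short sketch, where the matrix m is made up for illustration:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 0, 1, 2, 3, 4, 11])

# 1-D case: unpack the single index array from the tuple.
idx = np.where(x > 2)[0]
print(idx.tolist())  # [2, 3, 7, 8, 9]

# 2-D case: one index array per axis.
m = np.array([[1, 3],
              [4, 2]])
rows, cols = np.where(m > 2)
print(rows.tolist(), cols.tolist())  # [0, 1] [1, 0]
```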