Pandas 'min' aggregate, but how to set 'min'+1?
Question:
I’m trying to aggregate my pandas dataframe with minimum and also minimum+1. Let me explain.
Suppose I have a dataframe:
   distance  vertex type
0         8     104    A
1         1     114    A
2         1     103    B
3         2     102    A
4         3      18    A
5         3     108    B
I get the minimum distance with a groupby on type as follows:
mask = df['distance'].isin(df.groupby(['type'])['distance'].agg('min').values)
df[mask]
This gives me the minimum distance for each type.
   distance  vertex type
1         1     114    A
2         1     103    B
My question: how do I get the rows that satisfy the following formula:
distance = minimum(distance) + 1
This is what I’m trying to get:
   distance  vertex type
1         1     114    A
2         1     103    B
3         2     102    A
Answers:
The exact logic is unclear, but assuming you want to filter per group to keep the values equal to the minimum or minimum + 1, use groupby.transform:
mask = df['distance'].le(df.groupby(['type'])['distance'].transform('min').add(1))
out = df[mask]
Output:
   distance  vertex type
1         1     114    A
2         1     103    B
3         2     102    A
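For anyone who wants to reproduce this, here is a minimal self-contained sketch of the transform approach, using the sample data from the question. The second variant (`out_exact`) is an assumption on my part about what "minimum or minimum + 1" might mean for non-integer distances; it is not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "distance": [8, 1, 1, 2, 3, 3],
    "vertex": [104, 114, 103, 102, 18, 108],
    "type": ["A", "A", "B", "A", "A", "B"],
})

# Minimum of each row's own group, broadcast back to the original index.
group_min = df.groupby("type")["distance"].transform("min")

# Variant 1 (as in the answer): keep everything up to min + 1 within the group.
out_le = df[df["distance"].le(group_min.add(1))]

# Variant 2 (assumption: only exact matches are wanted):
# keep rows whose offset from the group minimum is exactly 0 or 1.
out_exact = df[df["distance"].sub(group_min).isin([0, 1])]

print(out_le)
```

With integer distances the two variants select the same rows. They would differ only if a group contained values strictly between its minimum and minimum + 1, such as 1.5.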
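This is what I wrote at first, but it's wrong; I'm leaving it here as "dead code" for people to see and be aware of :). Among other problems, .values[0] takes only the first group's minimum, and the column is named 'type', not 'Type'. See the comments for why it's wrong.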
mask = df['distance'].isin([df.groupby(['Type'])['distance'].agg('min').values[0], df.groupby(['Type'])['distance'].agg('min').values[0]+1])
Another possible solution:
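import numpy as np
# unique group minima, pooled across all types
m = df.groupby(['type'])['distance'].min().unique()
# keep rows whose distance equals any group's minimum or minimum + 1
mask = df['distance'].isin(np.concatenate([m, m + 1]))
df[mask]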
Output:
   distance  vertex type
1         1     114    A
2         1     103    B
3         2     102    A
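One caveat worth checking: the `isin` approach pools the minima of all groups, so it matches the sample output only because both types happen to share the same minimum (1). The sketch below uses hypothetical data (`df2`, invented here for illustration) where the group minima differ, and compares the pooled mask with a per-group mask built via `transform`.

```python
import numpy as np
import pandas as pd

# Hypothetical data (not from the question): group minima differ (A -> 1, B -> 5).
df2 = pd.DataFrame({
    "distance": [1, 5, 6, 5, 9],
    "type": ["A", "A", "A", "B", "B"],
})

# Pooled approach: candidate values are {1, 5} plus {2, 6}, regardless of group.
m = df2.groupby("type")["distance"].min().unique()
pooled_mask = df2["distance"].isin(np.concatenate([m, m + 1]))

# Per-group approach: compare each row only against its own group's minimum.
group_min = df2.groupby("type")["distance"].transform("min")
per_group_mask = df2["distance"].le(group_min + 1)

print(df2[pooled_mask])     # also keeps the type A rows with distance 5 and 6
print(df2[per_group_mask])  # keeps only rows within 1 of their own group's minimum
```

If the filter is meant to be per type, the `transform` version is probably the safer choice; if pooling the minima across types is intended, the `isin` version does that.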