Ignore nan elements in a list using loc pandas

Question:

I have 2 different dataframes: df1, df2

df1:
index a
0     10    
1     2    
2     3
3     1
4     7
5     6

df2:
index a
0     1    
1     2
2     4
3     3
4     20
5     5

I want to find the index of the maximum value within a rolling window (lookback) in df1; let’s use lookback=3 in this example. To do this, I use the following code:

tdf['a'] = df1.rolling(lookback).apply(lambda x: x.idxmax())

And the result would be:

id    a
0     nan    
1     nan
2     0
3     2
4     4
5     4

Now, for each index found by idxmax(), I need to store the corresponding value from df2 in tdf['b'].

So if tdf['a'].iloc[3] == 2, I want tdf['b'].iloc[3] == df2['a'].iloc[2]. I expect the final result to look like this:

id    b
0     nan    
1     nan
2     1
3     4
4     20
5     20

I’m guessing that I can do this with the .loc indexer, like this:

tdf['b'] = df2.loc[tdf['a']]

But it throws an exception because there are NaN values in tdf['a']. If I call dropna() on tdf['a'] before passing it to .loc, the indices no longer line up (for example, index 0 of tdf['b'] should be NaN, but it ends up with a value after dropna()).

Is there any way to get what I want?

Asked By: Masih Bahmani

Answers:

Simply use map, which looks up each value of s in the index of df2['a'] and leaves NaN wherever there is no match:

lookback = 3
s = df1['a'].rolling(lookback).apply(lambda x: x.idxmax())

s.map(df2['a'])

Output:

0     NaN
1     NaN
2     1.0
3     4.0
4    20.0
5    20.0
Name: a, dtype: float64
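
For reference, here is a self-contained version of the snippet above that can be pasted and run as-is. It is only a sketch: the construction of df1, df2 and tdf is an assumption based on the tables in the question (default integer index, single column a), and it relies on the index being numeric so that rolling().apply() can return the label from idxmax().

```python
import pandas as pd

# Sample data rebuilt from the tables in the question (assumed layout:
# default RangeIndex 0..5 and a single column named "a").
df1 = pd.DataFrame({"a": [10, 2, 3, 1, 7, 6]})
df2 = pd.DataFrame({"a": [1, 2, 4, 3, 20, 5]})
lookback = 3

tdf = pd.DataFrame(index=df1.index)

# Index label of the maximum inside each rolling window.
# The first lookback - 1 rows have no full window, so they are NaN.
tdf["a"] = df1["a"].rolling(lookback).apply(lambda x: x.idxmax())

# map() looks each label up in df2["a"]; NaN labels simply stay NaN,
# and the result keeps the index of tdf["a"].
tdf["b"] = tdf["a"].map(df2["a"])

print(tdf)
```

Because map() returns a Series with the same index as tdf['a'], the result can be assigned straight back to tdf['b'] without any realignment.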
Answered By: mozway
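
If you would rather stay with dropna() and .loc as attempted in the question, one possible workaround is to remember the original row labels and reindex afterwards. This is a sketch under the same assumptions about the sample data as in the question; the names valid, picked and b are purely illustrative.

```python
import pandas as pd

# Sample data rebuilt from the tables in the question (assumed layout).
df1 = pd.DataFrame({"a": [10, 2, 3, 1, 7, 6]})
df2 = pd.DataFrame({"a": [1, 2, 4, 3, 20, 5]})
lookback = 3

idx = df1["a"].rolling(lookback).apply(lambda x: x.idxmax())

# Drop the NaN rows, but keep their original row labels in valid.index.
valid = idx.dropna().astype(int)

# Look the labels up in df2 and discard the index that .loc returns,
# since it holds the looked-up labels rather than the original rows.
picked = df2["a"].loc[valid].to_numpy()

# Re-attach the original row labels, then restore the dropped rows as NaN.
b = pd.Series(picked, index=valid.index).reindex(idx.index)

print(b)
```

The map() approach does the same thing in one step, so this variant is mainly useful for seeing where the index misalignment comes from.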