How to do a linear interpolation in pandas taking values of X into account?

Question:

I have a data-frame with two columns: X and Y. Some of the values in Y are missing (np.nan).

I would like to fill the NaNs using linear interpolation. In more detail, I want to order the data frame by X, and any missing value of Y should be a “linear mixture” of the two neighbouring values of Y (one corresponding to a smaller X and the other to a larger X).

If the value of X corresponding to a missing Y is closer to one of the two X values with an available Y, then the filled value of Y should be closer to the corresponding Y. How can I do this efficiently and elegantly in pandas?

Please note that pandas.Series.interpolate does not do what I need, as far as I understand.

Asked By: Roman


Answers:

Setting up a dataframe:

import numpy as np
import pandas as pd

x = [0, 1, 3, 4, 7, 9, 11, 122, 123, 128]
y = [2, 8, 12, np.nan, 22, 31, 34, np.nan, 43, 48]

df = pd.DataFrame({"x": x, "y": y})
print(df)

     x     y
0    0   2.0
1    1   8.0
2    3  12.0
3    4   NaN
4    7  22.0
5    9  31.0
6   11  34.0
7  122   NaN
8  123  43.0
9  128  48.0

Set column ‘x’ as the index:

df = df.set_index('x')

Then interpolate with the method set to ‘index’, which uses the index values (here, x) as the positions to interpolate along:

df.y = df.y.interpolate(method='index')

This results in:

df

        y
x   
0      2.000000
1      8.000000
3     12.000000
4     14.500000
7     22.000000
9     31.000000
11    34.000000
122   42.919643
123   43.000000
128   48.000000
Answered By: run-out
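
As a cross-check on the numbers in the answer above, here is a minimal sketch that computes the same fill with numpy.interp while leaving the frame's original index in place. It is an alternative, not the answer's own method. It assumes x has no missing or duplicate values; the names known and filled are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "x": [0, 1, 3, 4, 7, 9, 11, 122, 123, 128],
    "y": [2, 8, 12, np.nan, 22, 31, 34, np.nan, 43, 48],
})

# Rows where y is known, sorted by x (np.interp needs increasing x points).
known = df.dropna(subset=["y"]).sort_values("x")

# Evaluate the piecewise-linear function at every x.
# Known points map back to their own y values.
filled = df.copy()
filled["y"] = np.interp(df["x"], known["x"], known["y"])
print(filled)
```

The value at x = 4 is 12 + (4 - 3) / (7 - 3) * (22 - 12) = 14.5, matching the output above. Note that numpy.interp holds the first and last known y constant for any x outside the known range.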

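method='linear' : fills linearly between neighbouring values, treating the rows as equally spaced (the index is ignored).

limit_direction='both' : fills in both directions, so NaNs in the first and last rows are covered as well.

limit : the maximum number of consecutive NaNs to fill, not the total number replaced. Check what percentage of the series is NaN, and how long the gaps are, before deciding on a suitable limit.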

df['column_Name'] = df['column_Name'].interpolate(
                        method='linear', limit_direction='both', limit=45)
Answered By: Arpan Saini
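
To see how method='linear' differs from method='index' on the data from the first answer, and what limit does, here is a small sketch. The names s, linear, by_index, gappy and limited are illustrative only.

```python
import numpy as np
import pandas as pd

x = [0, 1, 3, 4, 7, 9, 11, 122, 123, 128]
y = [2, 8, 12, np.nan, 22, 31, 34, np.nan, 43, 48]
s = pd.Series(y, index=x, dtype="float64")

# 'linear' treats rows as equally spaced; 'index' weights by the x values.
linear = s.interpolate(method="linear")
by_index = s.interpolate(method="index")
print(pd.DataFrame({"linear": linear, "index": by_index}))

# limit caps how many consecutive NaNs are filled within a gap.
gappy = pd.Series([1.0, np.nan, np.nan, np.nan, 5.0])
limited = gappy.interpolate(method="linear", limit=1)
print(limited)
```

At x = 4, 'linear' gives 17.0 (the midpoint of 12 and 22) while 'index' gives 14.5, so only 'index' takes the spacing of X into account. With limit=1, only the first NaN of the three-NaN gap is filled.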