Shift column in pandas dataframe up by one?
Question:
I’ve got a pandas dataframe and I want to ‘lag’ one of my columns: for example, shift the entire ‘gdp’ column up by one, then remove the excess rows at the bottom so that all columns are the same length again.
df =
   y  gdp  cap
0  1    2    5
1  2    3    9
2  8    7    2
3  3    4    7
4  6    7    7
df_lag =
   y  gdp  cap
0  1    3    5
1  2    7    9
2  8    4    2
3  3    7    7
Is there any way to do this?
Answers:
In [44]: df['gdp'] = df['gdp'].shift(-1)
In [45]: df
Out[45]:
   y  gdp  cap
0  1    3    5
1  2    7    9
2  8    4    2
3  3    7    7
4  6  NaN    7
In [46]: df[:-1]
Out[46]:
   y  gdp  cap
0  1    3    5
1  2    7    9
2  8    4    2
3  3    7    7
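For reference, the session above can be reproduced end to end as a small script. This is a minimal sketch: the frame is rebuilt from the question's sample data, the name df_lag is taken from the question, and the final cast back to int is an extra step that is not part of the original answer (the NaN introduced by shift turns gdp into a float column).

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "y":   [1, 2, 8, 3, 6],
    "gdp": [2, 3, 7, 4, 7],
    "cap": [5, 9, 2, 7, 7],
})

# shift(-1) moves every gdp value up one row and leaves NaN in the last row.
df["gdp"] = df["gdp"].shift(-1)

# Keep everything except the last row, which now holds the NaN.
# Note that df[:-1] on its own only displays the result; assign it to keep it.
df_lag = df.iloc[:-1].copy()

# Optional: the NaN forced gdp to float, so cast back to int once it is gone.
df_lag["gdp"] = df_lag["gdp"].astype(int)

print(df_lag)
```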
Shift the gdp column up by one, then remove the last row from the dataframe (dropping it from df.gdp alone would not remove the row from df):
df.gdp = df.gdp.shift(-1) ## shift up
df.drop(df.index[-1], inplace=True) ## remove the last row, which now holds NaN
To shift by more than one row, for example 5, and get rid of the resulting NaN rows without having to keep track of how many rows you shifted by:
df['gdp'] = df['gdp'].shift(-5)
df = df.dropna()
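One caveat with this approach: dropna() with no arguments removes every row that has a NaN in any column, not just the rows emptied by the shift. A possible variation, sketched below, restricts the check to the shifted column with the subset parameter. The data is hypothetical: the question's frame with one missing value added to 'cap', and a shift of 2 so the effect is visible on five rows.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'cap' already has a missing value unrelated to the shift.
df = pd.DataFrame({
    "y":   [1, 2, 8, 3, 6],
    "gdp": [2, 3, 7, 4, 7],
    "cap": [5, np.nan, 2, 7, 7],
})

df["gdp"] = df["gdp"].shift(-2)

# A plain dropna() also drops row 1 because of the NaN in 'cap'.
everything = df.dropna()

# Restricting the check to 'gdp' removes only rows where gdp is NaN.
only_shifted = df.dropna(subset=["gdp"])

print(len(everything), len(only_shifted))
```

Even with subset, rows where gdp was already missing before the shift are dropped as well, so this only matches "trim the shifted rows" when the column had no gaps to begin with.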
First, shift the column:
df['gdp'] = df['gdp'].shift(-1)
Second, remove the last row, which now contains a NaN cell:
df = df[:-1]
Third, reset the index:
df = df.reset_index(drop=True)
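The three steps above could be wrapped into one function. This is only a sketch: shift_column_up and its periods parameter are hypothetical names introduced here, not part of pandas, and the function works on a copy so the caller's frame is left unchanged.

```python
import pandas as pd

def shift_column_up(df, column, periods=1):
    """Hypothetical helper: shift one column up and trim the emptied rows."""
    if periods < 1:
        raise ValueError("periods must be a positive integer")
    out = df.copy()                             # leave the caller's frame untouched
    out[column] = out[column].shift(-periods)   # step 1: shift the column up
    out = out.iloc[:-periods]                   # step 2: drop the trailing NaN rows
    return out.reset_index(drop=True)           # step 3: renumber the index from 0

# Usage with the question's sample data.
df = pd.DataFrame({
    "y":   [1, 2, 8, 3, 6],
    "gdp": [2, 3, 7, 4, 7],
    "cap": [5, 9, 2, 7, 7],
})
result = shift_column_up(df, "gdp", periods=2)
print(result)
```

Trimming by position rather than with dropna() keeps rows that have unrelated missing values in other columns.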
Time has passed since the earlier answers. According to this answer, the current pandas documentation recommends assigning through .loc:
df.loc[:, 'gdp'] = df.gdp.shift(-1)