Transforming every n rows of pandas Series into n columns of a DataFrame
Question:
I’m trying to transform a pandas Series like:
Date | Value |
---|---|
2020-01-01 | -1175 |
2020-01-02 | -475 |
2020-01-03 | 1945 |
2020-01-06 | -1295 |
2020-01-07 | -835 |
2020-01-08 | -785 |
2020-01-09 | 895 |
2020-01-10 | -665 |
into a pandas DataFrame like:
date | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|
2020-01-01 | -1175 | -475 | 1945 | -1295 | -835 |
2020-01-02 | -475 | 1945 | -1295 | -835 | -785 |
2020-01-03 | 1945 | -1295 | -835 | -785 | 895 |
2020-01-06 | -1295 | -835 | -785 | 895 | -665 |
Each window of 5 (or n) consecutive rows of the Series, starting at a given date, forms one row in the DataFrame.
Sample data along with my current (ugly but working) code is as follows:
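import pandas as pd
srs = pd.Series(index=pd.DatetimeIndex(pd.date_range(start="2020-01-01", end="2020-1-10", freq="B")),
                data=[-1175, -475, 1945, -1295, -835, -785, 895, -665])
n = 5
# Column i holds the Series shifted up by i rows; dropna() removes the incomplete windows at the end.
df = pd.concat({i: srs.shift(-i) for i in range(n)}, axis=1).dropna()
df = df[range(n)]
# Note: the index here is already a single-level DatetimeIndex, so no droplevel() call is needed
# (calling droplevel on a single-level index raises a ValueError).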
I was wondering if there is a better/neater/nicer way to do this?
Answers:
Try with sliding_window_view from numpy:
n = 5
v = np.lib.stride_tricks.sliding_window_view(srs.to_numpy(), n)
df = pd.DataFrame(v, index=srs.index[:v.shape[0]])
df:
0 1 2 3 4
2020-01-01 -1175 -475 1945 -1295 -835
2020-01-02 -475 1945 -1295 -835 -785
2020-01-03 1945 -1295 -835 -785 895
2020-01-06 -1295 -835 -785 895 -665
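For intuition, here is a small sketch of what sliding_window_view returns on a plain array. The array values are made up for illustration. The function is available from NumPy 1.20 onward, and the result is a read-only view into the original data, so copy it before modifying it in place.

```python
import numpy as np

a = np.arange(6)  # illustrative data: [0, 1, 2, 3, 4, 5]
windows = np.lib.stride_tricks.sliding_window_view(a, 3)

# One row per window of 3 consecutive values: len(a) - 3 + 1 = 4 rows.
print(windows.shape)  # (4, 3)
print(windows)
```

Each row starts one element later than the previous one, which is why the DataFrame above has len(srs) - n + 1 rows.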
Complete Working Example:
import numpy as np
import pandas as pd
srs = pd.Series(index=pd.DatetimeIndex(
pd.date_range(start="2020-01-01", end="2020-1-10", freq="B")
), data=[-1175, -475, 1945, -1295, -835, -785, 895, -665])
n = 5
v = np.lib.stride_tricks.sliding_window_view(srs.to_numpy(), n)
df = pd.DataFrame(v, index=srs.index[:v.shape[0]])
print(df)
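If this is needed in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the function name windows_to_frame is hypothetical (not part of pandas or NumPy), and the copy is an optional choice so the resulting frame does not depend on the read-only view. The last lines cross-check the result against the shift-based approach from the question.

```python
import numpy as np
import pandas as pd


def windows_to_frame(srs: pd.Series, n: int) -> pd.DataFrame:
    """Return a DataFrame whose row i holds srs[i], ..., srs[i + n - 1]."""
    v = np.lib.stride_tricks.sliding_window_view(srs.to_numpy(), n)
    # Copy so the frame does not share memory with the read-only view.
    return pd.DataFrame(v.copy(), index=srs.index[: v.shape[0]])


srs = pd.Series(
    [-1175, -475, 1945, -1295, -835, -785, 895, -665],
    index=pd.date_range(start="2020-01-01", end="2020-01-10", freq="B"),
)
df = windows_to_frame(srs, 5)
print(df)

# Cross-check against the shift-based approach from the question.
shifted = pd.concat({i: srs.shift(-i) for i in range(5)}, axis=1).dropna()
print((df.to_numpy() == shifted.to_numpy()).all())
```

Note that sliding_window_view raises a ValueError when n is larger than the length of the Series, so a caller may want to handle that case separately.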