Rolling or sliding window for transpose of a matrix
Question:
I have the following data:
I want to obtain it this way:
What we did manually here was take each 3×1 block of a column and transpose it into a 1×3 row, for all 4 columns: A1, A2, A3 and A4. I need to automate this process in Python, and I believe a rolling window is the solution. It needs to iterate through all rows (a little over 300).
Reproducible input:
import pandas as pd

df = pd.DataFrame({'A1': [1, 1, 0, 1],
                   'A2': [1, 0, 0, 0],
                   'A3': [1, 0, 1, 1],
                   'A4': [1, 0, 0, 0],
                   })
Answers:
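You can build the windows with NumPy's sliding_window_view (available since NumPy 1.20) and then reshape them. A window shape of (N, 1) takes N consecutive rows from one column at a time, and the reshape lays the windows of all columns side by side in a single row: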
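import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

N = 3  # window length (rows per window)

# one N×1 window per start row and column, flattened to one row per start row
out = pd.DataFrame(
    sliding_window_view(df, (N, 1))
    .reshape(-1, N * df.shape[1]),
)
print(out)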
Output:
   0  1  2  3  4  5  6  7  8  9  10  11
0  1  1  0  1  0  0  1  0  1  1   0   0
1  1  0  1  0  0  0  0  1  1  0   0   0
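To see why the reshape produces this layout, it can help to inspect the intermediate array. The following is a minimal sketch using the same input; the names windows and flat are only illustrative, and df.to_numpy() is used here to make the conversion to an array explicit.

```python
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

df = pd.DataFrame({'A1': [1, 1, 0, 1],
                   'A2': [1, 0, 0, 0],
                   'A3': [1, 0, 1, 1],
                   'A4': [1, 0, 0, 0]})
N = 3

# Axes: (start row, column, position inside the window, 1)
windows = sliding_window_view(df.to_numpy(), (N, 1))
print(windows.shape)           # (2, 4, 3, 1)

# The window starting at row 0 of column A1 is that column's first N values
print(windows[0, 0].ravel())   # [1 1 0]

# Collapsing the last three axes gives one row per start row,
# with N consecutive values for each column in turn
flat = windows.reshape(-1, N * df.shape[1])
print(flat.shape)              # (2, 12)
```

With 4 input rows and N = 3 there are 4 - 3 + 1 = 2 windows, which is why the result has two rows.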
With MultiIndex:
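import numpy as np

# reuses df, N and sliding_window_view from the snippet above;
# the first column level is the source column, the second the position in the window
out = pd.DataFrame(
    sliding_window_view(df, (N, 1))
    .reshape(-1, N * df.shape[1]),
    columns=pd.MultiIndex.from_product([df.columns, np.arange(N)])
)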
Output:
  A1       A2       A3       A4
   0  1  2  0  1  2  0  1  2  0  1  2
0  1  1  0  1  0  0  1  0  1  1  0  0
1  1  0  1  0  0  0  0  1  1  0  0  0
Used input:
df = pd.DataFrame({'A1': [1, 1, 0, 1],
                   'A2': [1, 0, 0, 0],
                   'A3': [1, 0, 1, 1],
                   'A4': [1, 0, 0, 0],
                   })
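As a cross-check, the same table can be built with pandas alone by shifting each column. This is a swapped-in alternative rather than the method above, sketched under the assumption that the data are integers (astype(int) is used to undo the float conversion that shift causes); the names shifted and alt are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A1': [1, 1, 0, 1],
                   'A2': [1, 0, 0, 0],
                   'A3': [1, 0, 1, 1],
                   'A4': [1, 0, 0, 0]})
N = 3

# For each column, position k of the window is the value k rows further down.
# Tuple keys become a two-level column index: (source column, position).
shifted = {(col, k): df[col].shift(-k)
           for col in df.columns
           for k in range(N)}

# Only the first len(df) - N + 1 rows have a complete window
alt = pd.DataFrame(shifted).iloc[:len(df) - N + 1].astype(int)
print(alt)
```

This builds N shifted copies of every column, so for a few hundred rows the cost is negligible, though the strided NumPy view avoids the intermediate copies.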