Transition Matrix from Pandas Dataframe That Has Two Periods in Python

Question:

I have a dataset that shows customers' payment information for their debt.
Each customer has two periods of data.

CUST_ID  PERIOD  Delinquency Value
100729   1       1
100729   2       3
100888   1       2
100888   2       1
137300   1       0
137300   2       1

I need to compute the transition ratios between delinquency values from period 1 to period 2 and create a table that stores this matrix.

Expected output is:

    0   1   2   3
0   0   1   0   0
1   0   0   0   1
2   0   1   0   0
3   0   0   0   0
Asked By: Burak


Answers:

You can use pivot to put the two periods side by side for each customer, then crosstab to count the transitions:

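import pandas as pd

# One row per customer, one column per period (1 and 2)
tmp = df.pivot(index='CUST_ID', columns='PERIOD', values='Delinquency Value')
# Number of delinquency states, assuming they run from 0 to the observed maximum
M = df['Delinquency Value'].max() + 1

# Count period 1 -> period 2 transitions, then add unobserved states as zero rows/columns
out = (pd.crosstab(tmp[1], tmp[2])
         .reindex(index=range(M), columns=range(M), fill_value=0)
       )

# Drop the axis names for a cleaner display
print(out.rename_axis(index=None, columns=None))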

Output:

   0  1  2  3
0  0  1  0  0
1  0  0  0  1
2  0  1  0  0
3  0  0  0  0
Answered By: mozway
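
The output above holds transition counts. If ratios are wanted, as the question's wording suggests, one possible variation is to pass normalize='index' to crosstab so that each row is divided by its total. The sketch below rebuilds the sample data from the question; the variable name ratios is only illustrative, and it assumes the delinquency states run from 0 to the observed maximum.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'CUST_ID': [100729, 100729, 100888, 100888, 137300, 137300],
    'PERIOD': [1, 2, 1, 2, 1, 2],
    'Delinquency Value': [1, 3, 2, 1, 0, 1],
})

# One row per customer, one column per period.
tmp = df.pivot(index='CUST_ID', columns='PERIOD', values='Delinquency Value')

# Assumption: states run from 0 up to the largest observed value.
M = df['Delinquency Value'].max() + 1

# normalize='index' divides each row by its row total, giving the share of
# customers that moved from each period 1 state to each period 2 state.
ratios = (pd.crosstab(tmp[1], tmp[2], normalize='index')
            .reindex(index=range(M), columns=range(M), fill_value=0.0)
            .rename_axis(index=None, columns=None))

print(ratios)
```

With this sample every observed starting state has a single customer, so the ratios match the counts (1.0 where the count is 1). States that never occur in period 1, such as 3 here, stay as all-zero rows after reindex, so those rows do not sum to 1.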