Sorting correlation matrix
Question:
I want to convert the correlation matrix into a pandas table in which each column is sorted from the largest value to the smallest, as in the image. How can I do it?
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 15, size=(20, 6)), columns=["Ply_1", "Ply_2", "Ply_3", "Ply_4", "Ply_5", "Ply_6"])
df['date'] = pd.date_range('2000-1-1', periods=20, freq='D')
df = df.set_index(['date'])
cor = df.corr()
print(cor)
Answers:
pd.concat([cor[col_name].sort_values(ascending=False)
                        .rename_axis(col_name.replace('Ply', 'index'))
                        .reset_index()
           for col_name in cor],
          axis=1)
Explanation:
- pd.concat([df_1, ..., df_6], axis=1) concatenates 6 DataFrames side by side (each one is already sorted and has 2 columns: 'index_i' and 'Ply_i').
- [cor[col_name] for col_name in cor] would create a list of 6 Series, one for each column of cor (iterating over a DataFrame yields its column names).
- ser.sort_values(ascending=False) sorts the values of a Series ser in descending order (the index labels move together with their values).
- col_name.replace('Ply', 'index') creates a new string from the string col_name by replacing 'Ply' with 'index'.
- ser.rename_axis(name).reset_index() renames the index axis and extracts the index (with its name) as a new column, converting the Series into a DataFrame. The new index of this DataFrame is the default range index (from 0 to 5).
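To make the individual steps concrete, here is a minimal sketch that applies them to a single column. The seeded generator and the intermediate variable names (sorted_ser, new_name, step) are assumptions added for illustration; the original code uses unseeded np.random.randint and chains the calls directly.

```python
import numpy as np
import pandas as pd

# Seeded data so the sketch is reproducible (the seed is an assumption,
# not part of the original question).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 15, size=(20, 6)),
                  columns=[f"Ply_{i}" for i in range(1, 7)])
cor = df.corr()

col_name = "Ply_1"
ser = cor[col_name]                             # one column of cor, as a Series
sorted_ser = ser.sort_values(ascending=False)   # labels travel with their values
new_name = col_name.replace("Ply", "index")     # "Ply_1" -> "index_1"

# Name the index axis, then move it into a regular column.
step = sorted_ser.rename_axis(new_name).reset_index()

print(step.columns.tolist())   # the two column names: index_1 and Ply_1
print(step)
```

The answer's one-liner builds one such two-column DataFrame per column of cor and passes the list to pd.concat with axis=1.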
Result:
(with my randomly generated numbers)
| | index_1 | Ply_1 | index_2 | Ply_2 | index_3 | Ply_3 | index_4 | Ply_4 | index_5 | Ply_5 | index_6 | Ply_6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Ply_1 | 1 | Ply_2 | 1 | Ply_3 | 1 | Ply_4 | 1 | Ply_5 | 1 | Ply_6 | 1 |
| 1 | Ply_2 | 0.387854 | Ply_1 | 0.387854 | Ply_1 | 0.258825 | Ply_1 | 0.337613 | Ply_4 | 0.0618012 | Ply_1 | 0.058282 |
| 2 | Ply_4 | 0.337613 | Ply_4 | 0.293496 | Ply_4 | 0.0552454 | Ply_2 | 0.293496 | Ply_2 | 0.060881 | Ply_3 | -0.207621 |
| 3 | Ply_3 | 0.258825 | Ply_5 | 0.060881 | Ply_2 | -0.0900126 | Ply_5 | 0.0618012 | Ply_3 | -0.110885 | Ply_2 | -0.22012 |
| 4 | Ply_6 | 0.058282 | Ply_3 | -0.0900126 | Ply_5 | -0.110885 | Ply_3 | 0.0552454 | Ply_1 | -0.390893 | Ply_4 | -0.291842 |
| 5 | Ply_5 | -0.390893 | Ply_6 | -0.22012 | Ply_6 | -0.207621 | Ply_6 | -0.291842 | Ply_6 | -0.394074 | Ply_5 | -0.394074 |
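Because the numbers above come from unseeded random data, they cannot be reproduced exactly. The following sketch wraps the answer's expression in a helper and checks the two properties visible in the table: every Ply_i column is in descending order, and row 0 pairs each column with itself. The helper name sorted_correlations and the fixed seed are assumptions made for this example only.

```python
import numpy as np
import pandas as pd


def sorted_correlations(cor):
    """Sort each column of cor in descending order, keeping the row
    labels next to the values (hypothetical helper, not a pandas API)."""
    parts = [
        cor[col_name].sort_values(ascending=False)
                     .rename_axis(col_name.replace("Ply", "index"))
                     .reset_index()
        for col_name in cor
    ]
    return pd.concat(parts, axis=1)


# Same shape of data as in the question, but seeded (an assumption).
rng = np.random.default_rng(42)
df = pd.DataFrame(rng.integers(0, 15, size=(20, 6)),
                  columns=[f"Ply_{i}" for i in range(1, 7)])
df["date"] = pd.date_range("2000-01-01", periods=20, freq="D")
df = df.set_index("date")

result = sorted_correlations(df.corr())

for col_name in df.columns:
    # Each value column is sorted from largest to smallest.
    assert result[col_name].is_monotonic_decreasing
    # The largest entry is the column's correlation with itself.
    assert result.loc[0, col_name.replace("Ply", "index")] == col_name

print(result)
```

The replace('Ply', 'index') step assumes the column names contain 'Ply'; for other names the label columns would need a different naming rule to stay distinct from the value columns.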