Correlation heatmap between two columns from different datasets in Jupyter Notebook


I would like help building a correlation matrix for two different datasets and displaying it as a heatmap.

Listed below is the sample data:

Expression  PR    Metrics
Engagement  0.33  0.70
Excitement  0.33  0.15
Focus       0.33  0.36
Interest    0.67  0.47
Relaxation  0.55  0.20
Stress      0.44  0.40

This data is not imported from a CSV file (it will need to be modified in future), so it is created directly as a DataFrame, and the values are converted to float using astype(float).

This is how I created the df and converted the types:

import pandas as pd

data = {
    'Expression': ['Engagement', 'Excitement', 'Focus', 'Interest', 'Relaxation', 'Stress'],
    'PR': ['0.33', '0.33', '0.33', '0.67', '0.55', '0.44'],
    'Metrics': ['0.70', '0.15', '0.36', '0.47', '0.20', '0.40']
}
df = pd.DataFrame(data)

df['PR'] = df['PR'].astype(float)  # Converts object dtype to float
df['Metrics'] = df['Metrics'].astype(float)  # Converts object dtype to float

After that, df.corr() only gives the following correlation result:

                      PR        Metrics
PR              1.000000       -0.048189
Metrics        -0.048189        1.000000

However, what I would like to generate is a correlation matrix showing the correlation between EACH expression across PR and Metrics, as in the snipped image, including both Metrics and PR.


How should I go about this?

If there is any error in the above code, please point it out as well.

Asked By: Jun



Use DataFrame.dot with the transposed DataFrame, then plot the result with seaborn.heatmap:

import seaborn as sb

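# Keep only the numeric columns, indexed by expression name
df1 = df.set_index('Expression')[['PR','Metrics']]
# Dot product of every row with every other row: PR_i*PR_j + Metrics_i*Metrics_j
df = df1.dot(df1.T).rename_axis(index='name1', columns='name2')
print(df)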

name2       Engagement  Excitement   Focus  Interest  Relaxation  Stress
Engagement      0.5989      0.2139  0.3609    0.5501      0.3215  0.4252
Excitement      0.2139      0.1314  0.1629    0.2916      0.2115  0.2052
Focus           0.3609      0.1629  0.2385    0.3903      0.2535  0.2892
Interest        0.5501      0.2916  0.3903    0.6698      0.4625  0.4828
Relaxation      0.3215      0.2115  0.2535    0.4625      0.3425  0.3220
Stress          0.4252      0.2052  0.2892    0.4828      0.3220  0.3536

sb.heatmap(df, annot=True)
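
For reference, the steps above can be put together into one self-contained sketch. It assumes the sample data from the question, entered as floats directly so no astype(float) step is needed. The names gram and plot_heatmap are illustrative only and not part of the original code. Note that the resulting matrix holds row-by-row dot products, which is what the printed values show, rather than Pearson correlation coefficients.

```python
import pandas as pd

# Sample data from the question, entered as floats directly.
data = {
    'Expression': ['Engagement', 'Excitement', 'Focus', 'Interest', 'Relaxation', 'Stress'],
    'PR': [0.33, 0.33, 0.33, 0.67, 0.55, 0.44],
    'Metrics': [0.70, 0.15, 0.36, 0.47, 0.20, 0.40],
}
df = pd.DataFrame(data)

# One row per expression, numeric columns only.
df1 = df.set_index('Expression')[['PR', 'Metrics']]

# 6x6 matrix of dot products: entry (i, j) = PR_i*PR_j + Metrics_i*Metrics_j.
gram = df1.dot(df1.T)


def plot_heatmap(matrix):
    """Draw an annotated heatmap of a square DataFrame.

    seaborn is imported here (hypothetical helper) so that the matrix
    computation above does not depend on the plotting libraries.
    """
    import seaborn as sb
    return sb.heatmap(matrix, annot=True)
```

In a notebook cell, calling plot_heatmap(gram) should draw the same annotated heatmap as sb.heatmap(df, annot=True) above.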
Answered By: jezrael
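
A side note on why a Pearson correlation between individual expressions is not very informative with this data: each expression has only two values (PR and Metrics), and the correlation of two series with two points each is always +1 or -1 whenever both series vary. The sketch below illustrates this under the assumption that the sample data from the question is used; row_corr is an illustrative name.

```python
import pandas as pd

# Sample data from the question, indexed by expression.
df1 = pd.DataFrame(
    {
        'PR': [0.33, 0.33, 0.33, 0.67, 0.55, 0.44],
        'Metrics': [0.70, 0.15, 0.36, 0.47, 0.20, 0.40],
    },
    index=['Engagement', 'Excitement', 'Focus', 'Interest', 'Relaxation', 'Stress'],
)

# Transposing turns each expression into a column, so corr() compares
# expressions pairwise. Each column then has only two observations.
row_corr = df1.T.corr()

# With two observations per column, every coefficient is +1 or -1:
# the sign only says whether Metrics is above or below PR for both expressions.
print(row_corr.round(2))
```

This is why a dot-product matrix, or a correlation computed over more than two measurements per expression, may give a more useful heatmap.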