correlation

Correlation between two non-numeric columns in a Pandas DataFrame

Question: I get my data from an SQL query from the table to my pandas Dataframe. The data looks like: group phone_brand 0 M32-38 小米 1 M32-38 小米 2 M32-38 小米 3 M29-31 小米 4 M29-31 小米 5 F24-26 OPPO 6 M32-38 酷派 7 M32-38 小米 …

Total answers: 3

Error Msg: replace with Series.rolling(window=5).corr(other=<Series>)

Question: I am trying to find the rolling correlation of 5 periods between columns ['High'] and ['Low']. I managed to calculate it, but I get a warning: FutureWarning: pd.rolling_corr is deprecated for Series and will be removed in a future version, replace with Series.rolling(window=5).corr(other=<Series>) Tried replacing it but it doesn't …

Total answers: 1
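
For reference, the replacement call named in the warning can be sketched as below. This is a minimal sketch, not the accepted answer: the DataFrame `df` and its values are made up for illustration, and only the column names 'High' and 'Low' come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the column names 'High' and 'Low' come from the question.
df = pd.DataFrame({
    "High": [10.0, 11.0, 12.5, 12.0, 13.0, 14.5, 14.0, 15.0],
    "Low": [9.0, 9.5, 11.0, 11.5, 12.0, 13.0, 13.5, 13.0],
})

# Rolling 5-period Pearson correlation between the two columns,
# using the method form that the FutureWarning recommends.
rolling_corr = df["High"].rolling(window=5).corr(other=df["Low"])

# The first window - 1 entries are NaN because the window is not yet full.
print(rolling_corr)
```

The result is a Series aligned to the index of `df`, so it can be assigned back as a new column if that is convenient.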

How to check for correlation among continuous and categorical variables?

Question: I have a dataset including categorical (binary) variables and continuous variables. I’m trying to apply a linear regression model for predicting a continuous variable. Can someone please let me know how to check for correlation among the categorical variables and the continuous target variable? Current …

Total answers: 3

Use .corr to get the correlation between two columns

Question: I have the following pandas dataframe Top15: I create a column that estimates the number of citable documents per person: Top15['PopEst'] = Top15['Energy Supply'] / Top15['Energy Supply per Capita'] Top15['Citable docs per Capita'] = Top15['Citable documents'] / Top15['PopEst'] I want to know the correlation between …

Total answers: 10

Correlation heatmap

Question: I want to represent a correlation matrix using a heatmap. There is something called a correlogram in R, but I don’t think there’s such a thing in Python. How can I do this? The values go from -1 to 1, for example: [[ 1. 0.00279981 0.95173379 0.02486161 -0.00324926 -0.00432099] [ 0.00279981 1. 0.17728303 0.64425774 …

Total answers: 7

Pandas: How to drop self correlation from correlation matrix

Question: I’m trying to find the highest correlations for different columns with pandas. I know I can get the correlation matrix with df.corr(). I know I can get the highest correlations after that with df.sort() df.stack() df[-5:]. The problem is that these correlations also contain values for column with …

Total answers: 4

numpy corrcoef – compute correlation matrix while ignoring missing data

Question: I am trying to compute a correlation matrix of several values. These values include some ‘nan’ values. I’m using numpy.corrcoef. For element (i, j) of the output correlation matrix I’d like to have the correlation calculated using all values that exist for both variable i and …

Total answers: 3

Computing the correlation coefficient between two multi-dimensional arrays

Question: I have two arrays that have the shapes N X T and M X T. I’d like to compute the correlation coefficient across T between every possible pair of rows n and m (from N and M, respectively). What’s the fastest, most pythonic way to do …

Total answers: 3

Pandas Correlation Groupby

Question: Assuming I have a dataframe similar to the below, how would I get the correlation between 2 specific columns and then group by the ‘ID’ column? I believe the Pandas ‘corr’ method finds the correlation between all columns. If possible I would also like to know how I could find the …

Total answers: 6