Why does the Pandas .divide() method add the divisor as a column with NaN values?
Question:
I am trying to divide a Pandas time-series DataFrame by another DataFrame that has an exactly matching datetime index, using the DataFrame.divide() method. Instead of the expected element-wise division, the divisor column is added as a column to the DataFrame being divided, and every value is NaN. Why is that? I can't figure it out. Note that dividing by a scalar works without any problem. Any input is appreciated. Details of the two DataFrames are below.
print(week1_range)
            Max TemperatureF  Min TemperatureF
Date
2013-07-01                79                66
2013-07-02                84                66
2013-07-03                86                71
2013-07-04                86                70
2013-07-05                86                69
2013-07-06                89                70
2013-07-07                77                70
-----------------------------------------
print(week1_range.info())
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 7 entries, 2013-07-01 to 2013-07-07
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Max TemperatureF 7 non-null int64
1 Min TemperatureF 7 non-null int64
dtypes: int64(2)
memory usage: 168.0 bytes
None
-----------------------------------------------
print(week1_mean)
            Mean TemperatureF
Date
2013-07-01                 72
2013-07-02                 74
2013-07-03                 78
2013-07-04                 77
2013-07-05                 76
2013-07-06                 78
2013-07-07                 72
----------------------------------------------------
print(week1_mean.info())
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 7 entries, 2013-07-01 to 2013-07-07
Data columns (total 1 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Mean TemperatureF 7 non-null int64
dtypes: int64(1)
memory usage: 112.0 bytes
None
-------------------------------------------
print(week1_range.divide(week1_mean, axis='rows'))
            Max TemperatureF  Mean TemperatureF  Min TemperatureF
Date
2013-07-01               NaN                NaN               NaN
2013-07-02               NaN                NaN               NaN
2013-07-03               NaN                NaN               NaN
2013-07-04               NaN                NaN               NaN
2013-07-05               NaN                NaN               NaN
2013-07-06               NaN                NaN               NaN
2013-07-07               NaN                NaN               NaN
Answers:
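That's because you're dividing two DataFrames with different column names. When both operands are DataFrames, pandas aligns them on column labels as well as on the index. None of the column labels match here, so the result contains the union of the columns, filled with NaN.
You need to squeeze the second DataFrame into a Series so pandas does not try to align the columns: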
out = week1_range.div(week1_mean.squeeze(), axis='rows')
# or simply select the column
# out = week1_range.div(week1_mean['Mean TemperatureF'], axis='rows')
# or use to_numpy() as per @Corralien
# out = week1_range.div(week1_mean.to_numpy(), axis='rows')
Output:
print(out)
            Max TemperatureF  Min TemperatureF
Date
2013-07-01          1.097222          0.916667
2013-07-02          1.135135          0.891892
2013-07-03          1.102564          0.910256
2013-07-04          1.116883          0.909091
2013-07-05          1.131579          0.907895
2013-07-06          1.141026          0.897436
2013-07-07          1.069444          0.972222
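For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds both frames from the values printed in the question and contrasts the two calls. The way the frames are constructed (pd.date_range plus dict literals) is my assumption, since the question does not show how they were loaded, and the names misaligned and ratio are only illustrative.

```python
import pandas as pd

# Rebuild the two frames from the values shown in the question
# (assumed construction; the original loading code is not shown).
idx = pd.date_range("2013-07-01", periods=7, freq="D", name="Date")
week1_range = pd.DataFrame(
    {
        "Max TemperatureF": [79, 84, 86, 86, 86, 89, 77],
        "Min TemperatureF": [66, 66, 71, 70, 69, 70, 70],
    },
    index=idx,
)
week1_mean = pd.DataFrame(
    {"Mean TemperatureF": [72, 74, 78, 77, 76, 78, 72]}, index=idx
)

# DataFrame / DataFrame: pandas aligns on the index AND the column labels.
# No column label is shared, so every cell of the result is NaN.
misaligned = week1_range.div(week1_mean)

# DataFrame / Series with axis="index": the Series index is matched against
# the row index, and each column is divided by the same Series.
ratio = week1_range.div(week1_mean["Mean TemperatureF"], axis="index")

print(misaligned)
print(ratio)
```

Selecting the column (or calling squeeze()) keeps the datetime index, so the rows are still matched by date. Converting to a NumPy array drops the labels entirely and relies on position, which is fine here only because both frames are in the same row order.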