How to divide two columns element-wise in a pandas dataframe
Question:
I have two columns in my pandas dataframe. I’d like to divide column A by column B, value by value, and show it as follows:
import pandas as pd
csv1 = pd.read_csv('auto$0$0.csv')
csv2 = pd.read_csv('auto$0$8.csv')
df1 = pd.DataFrame(csv1, columns=['Column A', 'Column B'])
df2 = pd.DataFrame(csv2, columns=['Column A', 'Column B'])
dfnew = pd.concat([df1, df2])
The columns:
Column A Column B
12 2
14 7
16 8
20 5
And the expected result:
Result
6
2
2
4
How do I do this?
Answers:
Just divide the columns:
In [158]:
df['Result'] = df['Column A']/df['Column B']
df
Out[158]:
Column A Column B Result
0 12 2 6.0
1 14 7 2.0
2 16 8 2.0
3 20 5 4.0
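For a copy-paste version of the above, here is a minimal sketch that builds the question's sample table inline instead of reading the CSV files. The `Result_int` column is an illustrative addition, not part of the original answer: it assumes every quotient is a whole number, since `astype(int)` truncates any fractional part.

```python
import pandas as pd

# Sample data copied from the question's table
df = pd.DataFrame({"Column A": [12, 14, 16, 20], "Column B": [2, 7, 8, 5]})

# "/" is true division, so two integer columns give a float64 result
df["Result"] = df["Column A"] / df["Column B"]

# Hypothetical extra column: cast to int to match the question's
# expected output (only safe when all quotients are whole numbers)
df["Result_int"] = df["Result"].astype(int)

print(df)
```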
Series.div()
Equivalent to the / operator but with support to substitute a fill_value for missing data in either one of the inputs.
So normally div() is the same as /:
df['C'] = df.A.div(df.B)
# df.A / df.B
But div()’s fill_value is more concise than calling fillna() twice:
df['C'] = df.A.div(df.B, fill_value=-1)
# like df.A.fillna(-1) / df.B.fillna(-1), except that rows where both values are missing stay NaN
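To make the effect of fill_value concrete, here is a small sketch with made-up data (the values and the choice of `fill_value=1` are assumptions for illustration). It covers the four cases: both values present, only A missing, only B missing, and both missing.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [10.0, np.nan, 30.0, np.nan],
    "B": [2.0, 5.0, np.nan, np.nan],
})

# Plain division: the result is NaN wherever either operand is missing
plain = df["A"] / df["B"]

# div() with fill_value: a value missing on one side only is replaced
# by 1 before dividing; a row missing on both sides stays NaN
filled = df["A"].div(df["B"], fill_value=1)

print(pd.DataFrame({"plain": plain, "filled": filled}))
```

Only the last row, where both inputs are missing, remains NaN in `filled`.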
And div()’s method chaining is more idiomatic:
df['C'] = df.A.div(df.B).cumsum().add(1).gt(10)
# ((df.A / df.B).cumsum() + 1) > 10
Note that when dividing a DataFrame by another DataFrame or Series, DataFrame.div() also supports broadcasting across an axis or MultiIndex level.
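As a rough illustration of the axis argument, the sketch below normalizes a small made-up frame by its row totals and by its column totals. The frame and the variable names are assumptions for the example; the MultiIndex `level` case is not shown.

```python
import pandas as pd

df = pd.DataFrame({"x": [2.0, 4.0], "y": [10.0, 20.0]}, index=["r1", "r2"])

row_totals = df.sum(axis=1)  # Series indexed by row label: r1, r2
col_totals = df.sum(axis=0)  # Series indexed by column label: x, y

# axis=0 matches the Series index against the row labels,
# so each row is divided by its own total
by_row = df.div(row_totals, axis=0)

# axis=1 (the default, 'columns') matches against the column labels,
# so each column is divided by its own total
by_col = df.div(col_totals, axis=1)

print(by_row)
print(by_col)
```

With the / operator only the column-wise matching is available, which is why div() is needed for the row-wise case.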
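div() or / doesn’t give the expected result if the indices don’t match: pandas aligns the two operands on their index labels, so any label present in only one of them yields NaN. This is common when the columns come from different dataframes, or when some rows are divided by other rows of the same dataframe. In that case, convert the divisor column into a NumPy array so that the division is done by position: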
df1 = pd.DataFrame({'A': range(5)})
df2 = pd.DataFrame({'B': range(10,15)}, index=range(10,15))
df1['C'] = df1['A'] / df2['B'] # <---- Bunch of NaNs
df1['C'] = df1['A'] / df2['B'].values # <---- OK
df1['C'] = df1['A'].div(df2['B'].values) # <---- OK
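The sketch below reuses the same two frames to show what alignment does, and two possible ways around it. `to_numpy()` is the method form of `.values`; `reset_index(drop=True)` is an alternative not mentioned in the answer, and it assumes the positional order of the rows is what matters.

```python
import pandas as pd

df1 = pd.DataFrame({"A": range(5)})
df2 = pd.DataFrame({"B": range(10, 15)}, index=range(10, 15))

# Label alignment: the result covers the union of both indexes
# (0..4 and 10..14) and no label has a value on both sides
aligned = df1["A"] / df2["B"]

# Positional division: a NumPy array carries no index to align on
positional = df1["A"] / df2["B"].to_numpy()

# Alternative: give the divisor a fresh 0..n-1 index so labels match
via_reset = df1["A"] / df2["B"].reset_index(drop=True)

print(aligned)
print(positional)
```

Both workarounds require the two columns to have the same length.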