Divide multiple columns by another column in pandas
Question:
I need to divide all but the first column in a DataFrame by the first column.
Here’s what I’m doing, but I wonder whether this is the “right” pandas way:
df = pd.DataFrame(np.random.rand(10,3), columns=list('ABC'))
df[['B', 'C']] = (df.T.iloc[1:] / df.T.iloc[0]).T
Is there a way to do something like df[['B','C']] / df['A']? (That just gives a 10×12 DataFrame of NaN.)
Also, after reading some similar questions on SO, I tried df['A'].div(df[['B', 'C']]), but that gives a broadcast error.
Answers:
I believe df[['B','C']].div(df.A, axis=0) and df.iloc[:,1:].div(df.A, axis=0) both work.
Do:
df.iloc[:,1:] = df.iloc[:,1:].div(df.A, axis=0)
This divides every column other than the first by column ‘A’. The result is the first column, unchanged, followed by each remaining column divided by the divisor column.
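As a minimal sketch of the div(..., axis=0) approach, the snippet below uses a small hand-written frame (the values are illustrative and not from the question) so the result can be checked by eye. Passing axis=0 asks pandas to match the divisor's index against the row labels instead of the column labels.

```python
import pandas as pd

# Illustrative data (not from the question): A is the divisor column.
df = pd.DataFrame({
    "A": [1.0, 2.0, 4.0],
    "B": [2.0, 4.0, 8.0],
    "C": [3.0, 8.0, 2.0],
})

# Divide every column after the first by A, row by row.
# axis=0 aligns df["A"] with the row index rather than the column labels.
df.iloc[:, 1:] = df.iloc[:, 1:].div(df["A"], axis=0)

print(df)
# A stays [1.0, 2.0, 4.0]; B becomes [2.0, 2.0, 2.0]; C becomes [3.0, 4.0, 0.5]
```

Selecting by position with iloc keeps the code independent of the column names, which helps when the frame has more than three columns.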
This is not a matrix multiplication: the "/" operator divides elementwise. When a DataFrame is divided by a Series, pandas by default matches the Series index against the DataFrame's column labels, so the shapes (and, more precisely, the labels) need to match.
e.g.
df['A'].shape
–> (10,)
df[['B','C']].shape
–> (10,2)
Transpose the DataFrame so that they match as (2, 10) and (10,):
df[['B','C']].T.shape, df['A'].shape
–> ((2, 10), (10,))
But then the result has shape:
( df[['B','C']].T / df['A'] ).shape
–> (2, 10)
Therefore, transpose back:
( df[['B','C']].T / df['A'] ).T
The shape is (10, 2), and it gives you the results that you wanted!
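The following sketch, which assumes the same 10×3 random frame as in the question (seeded here only for repeatability), illustrates why the plain "/" gives NaN and why the transpose trick works: pandas lines up the Series index with the DataFrame's column labels. It also checks that the transpose route and div(..., axis=0) agree.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
df = pd.DataFrame(rng.random((10, 3)), columns=list("ABC"))

# Plain "/" matches the Series index (0..9) against the column labels (B, C).
# No labels match, so the result carries the union of both label sets, all NaN.
misaligned = df[["B", "C"]] / df["A"]
print(misaligned.shape)  # (10, 12)

# After transposing, the columns are 0..9, which do match df["A"].index.
via_transpose = (df[["B", "C"]].T / df["A"]).T

# The same result without transposing: align on the row index explicitly.
via_div = df[["B", "C"]].div(df["A"], axis=0)

print(np.allclose(via_transpose.to_numpy(), via_div.to_numpy()))  # True
```

The div(..., axis=0) form avoids two transposes and states the intended alignment directly.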