Adding calculated column in Pandas

Question:

I have a dataframe with 10 columns. I want to add a new column ‘age_bmi’ which should be a calculated column multiplying ‘age’ * ‘bmi’. age is an INT, bmi is a FLOAT.

That then creates the new dataframe with 11 columns.

Something I am doing isn’t quite right. I think it’s a syntax issue. Any ideas?

Thanks

df2['age_bmi'] = df(['age'] * ['bmi'])
print(df2)
Asked By: JD2775


Answers:

Try: df2['age_bmi'] = df.age * df.bmi

You’re trying to call the dataframe as a function, when what you need are the values of the columns. You can access a column by key, like a dictionary (df['age']), or as an attribute (df.age) if the name is a valid Python identifier (no spaces, for example) and doesn’t clash with an existing DataFrame attribute or method.
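
To make the difference concrete, here is a minimal sketch. The sample values are made up for illustration; only the column names 'age' and 'bmi' come from the question. It shows that key access and attribute access return the same column, and that both parts of the original expression fail on their own.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"age": [25, 40, 61], "bmi": [22.5, 27.1, 30.4]})

# Key access and attribute access return the same column (a Series).
by_key = df["age"]
by_attr = df.age
same_column = by_key.equals(by_attr)

# Calling the DataFrame like a function raises TypeError,
# because DataFrame objects are not callable.
try:
    df(["age"])
    call_failed = False
except TypeError:
    call_failed = True

# ['age'] * ['bmi'] multiplies two plain Python lists, which also
# raises TypeError before pandas is involved at all.
try:
    ["age"] * ["bmi"]
    list_mul_failed = False
except TypeError:
    list_mul_failed = True

print(same_column, call_failed, list_mul_failed)
```

Key access (df['age']) is the safer default, since it keeps working for column names that contain spaces or that match a DataFrame method name.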

Someone linked this in a comment the other day and it’s pretty awesome. I recommend giving it a watch, even if you don’t do the exercises: https://www.youtube.com/watch?v=5JnMutdy6Fw

Answered By: Cory Madden

As Cory pointed out, you’re calling a dataframe as a function, which won’t work as you expect. Here are 4 ways to multiply two columns; in most cases you’d use the first method.

df['age_bmi'] = df.age * df.bmi

or,

df['age_bmi'] = df.eval('age*bmi')

or,

df['age_bmi'] = pd.eval('df.age*df.bmi')

or,

df['age_bmi'] = df.age.mul(df.bmi)
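
As a quick check that the four forms agree, here is a small sketch on made-up data (the values are an assumption; only 'age' and 'bmi' come from the question). It assumes pandas is imported as pd, which the pd.eval form requires.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 'age' is an integer column, 'bmi' a float column.
df = pd.DataFrame({"age": [25, 40, 61], "bmi": [22.5, 27.1, 30.4]})

a = df.age * df.bmi                # plain operator
b = df.eval("age * bmi")           # expression evaluated against df's columns
c = pd.eval("df.age * df.bmi")     # top-level eval; looks up df in the calling scope
d = df.age.mul(df.bmi)             # method form of the * operator

# All four produce the same values.
all_match = all(np.allclose(a, other) for other in (b, c, d))

# Assigning any of them adds the new column; int * float yields floats.
df["age_bmi"] = a
result_dtype = str(df["age_bmi"].dtype)

print(all_match, result_dtype)
print(df)
```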
Answered By: Zero

You have wrapped age and bmi together inside one pair of parentheses, which treats df as a function rather than a dataframe. Instead, index df once for each column and multiply the two results:

df2['age_bmi'] = df['age'] * df['bmi']
Answered By: UG007

You can also use assign:

df2 = df.assign(age_bmi = df['age'] * df['bmi'])
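
A short sketch of how assign behaves, using made-up sample values (an assumption; only the column names come from the question). It returns a new DataFrame and leaves the original untouched, which fits the question's goal of ending up with a separate df2.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"age": [25, 40, 61], "bmi": [22.5, 27.1, 30.4]})

# assign returns a copy with the extra column; df itself is not modified.
df2 = df.assign(age_bmi=df["age"] * df["bmi"])

# A callable also works: it receives the DataFrame being assigned to,
# which is convenient in method chains.
df3 = df.assign(age_bmi=lambda d: d["age"] * d["bmi"])

print(list(df.columns))   # original columns only
print(list(df2.columns))  # original columns plus 'age_bmi'
```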
Answered By: rachwa