Change Series inplace in DataFrame after applying function on it

Question:

I’m trying to use pandas to change one of my columns in place, using a simple function.

After reading the whole DataFrame, I tried to apply a function to one Series:

wanted_data.age.apply(lambda x: x+1)

And it’s working great. The only problem occurs when I try to put it back into my DataFrame:

wanted_data.age = wanted_data.age.apply(lambda x: x+1)

or:

wanted_data['age'] = wanted_data.age.apply(lambda x: x+1)

Both throw the following warning:

> C:\Anaconda\lib\site-packages\pandas\core\generic.py:1974:
> SettingWithCopyWarning: A value is trying to be set on a copy of a
> slice from a DataFrame. Try using .loc[row_indexer,col_indexer] =
> value instead
> 
> See the the caveats in the documentation:
> http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
>   self[name] = value

Of course, I can set the column using the long form:

wanted_data.loc[:, 'age'] = wanted_data.age.apply(lambda x: x+1)

But is there no other, easier and syntactically nicer way to do it?

Thanks!

Asked By: Yam Mesicka


Answers:

Use loc:

wanted_data.loc[:, 'age'] = wanted_data.age.apply(lambda x: x + 1)

Answered By: Alexander

I would suggest

wanted_data['age'] = wanted_data['age'].apply(lambda x: x+1)

and then saving the file with

wanted_data.to_csv(fname, index=False)

where “fname” is the name of the file to be updated.

Answered By: Irfanullah

I cannot comment, so I’ll leave this as an answer.

Because of the way chained indexing is handled internally, you may get back a copy instead of a view of your initial DataFrame (for more, see the pandas documentation on chained assignment, which is a very good source; an assignment through a single .loc[] call is guaranteed to operate on the original object). Thus, you may end up assigning not to your DataFrame but to a copy of it. On the other hand, the same syntax may return a view of your initial DataFrame, in which case mutating it mutates the initial DataFrame too. pandas emits this warning to draw attention to the situation, so that the user can decide whether this is the intended behavior or not.
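
The question does not show how wanted_data was created, so the following is only a sketch under the assumption that it was filtered out of a larger DataFrame. The names df and the filter condition are made up for illustration. Taking an explicit .copy() at that point makes it unambiguous that wanted_data is an independent object, and the plain column assignment then no longer triggers the warning:

```python
import pandas as pd

# Hypothetical source data; the real DataFrame is not shown in the question.
df = pd.DataFrame({"name": ["a", "b", "c"], "age": [10, 20, 30]})

# Assumption: wanted_data is a filtered subset of df.
# .copy() makes it an independent DataFrame rather than a possible view.
wanted_data = df[df["age"] > 10].copy()

# Plain column assignment now clearly targets wanted_data only.
wanted_data["age"] = wanted_data["age"].apply(lambda x: x + 1)

print(wanted_data)
print(df)  # the original DataFrame is left unchanged
```

Whether this matches your case depends on where wanted_data actually comes from.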

If you know what you’re doing, you can silence the warning using:

import pandas as pd

with pd.option_context("mode.chained_assignment", None):
    wanted_data.age = wanted_data.age.apply(lambda x: x+1)

If you think that this is an important matter (e.g. there is the possibility of unintentionally mutating the initial DataFrame), you can set the above option to "raise", so that an error is raised instead of a warning.

Also, I think the term "inplace" is not quite correct here. "inplace" is an argument of some methods that makes them mutate the object directly, without the result having to be assigned back (the assignment is handled internally), and apply() does not support this feature.
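
To illustrate the difference, here is a small sketch with made-up data. It contrasts apply(), which returns a new Series, with sort_values(), chosen here only as one example of a method that accepts inplace=True:

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({"age": [30, 10, 20]})

# apply() has no inplace argument: it returns a new Series
# and leaves the column in df as it was.
result = df["age"].apply(lambda x: x + 1)
ages_after_apply = df["age"].tolist()

# sort_values() accepts inplace=True: it mutates df and returns None.
returned = df.sort_values("age", inplace=True)
ages_after_sort = df["age"].tolist()

print(result.tolist())
print(ages_after_apply)
print(returned)
print(ages_after_sort)
```

So with apply() the result always has to be assigned back explicitly, as in the question.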

Answered By: Thanasis Mattas