Returning a copy versus a view warning when using Python pandas dataframe

Question:

My goal is to convert the date column in DataFrame df from object type to datetime type, but I keep getting view-versus-copy warnings when running the program.

I’ve found some useful information from link: https://stackoverflow.com/a/25254087/3849539

I tested the following three solutions. All of them work as expected, but each produces different warning messages. Could anyone explain their differences and why I still get the warning about returning a view versus a copy? Thanks.

Solution 1: df['date'] = df['date'].astype('datetime64')

test.py:85: SettingWithCopyWarning: A value is trying to be set on a
copy of a slice from a DataFrame. Try using
.loc[row_indexer,col_indexer] = value
instead

See the caveats in the documentation:
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df['date'] = df['date'].astype('datetime64')

Solution 2: df['date'] = pd.to_datetime(df['date'])

~/report/lib/python3.8/site-packages/pandas/core/frame.py:3188:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using
.loc[row_indexer,col_indexer] = value
instead

See the caveats in the documentation:
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]
test.py:85: SettingWithCopyWarning: A value is
trying to be set on a copy of a slice from a DataFrame. Try using
.loc[row_indexer,col_indexer] = value
instead

See the caveats in the documentation:
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

Solution 3: df.loc[:, 'date'] = pd.to_datetime(df.loc[:, 'date'])

~/report/lib/python3.8/site-packages/pandas/core/indexing.py:1676:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value
instead

See the caveats in the documentation:
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self._setitem_single_column(ilocs[0], value, pi)

Asked By: x86_64

Answers:

Changing how you do the datetime conversion will not fix the SettingWithCopyWarning. You get it because the df you are working with is already a slice of some larger DataFrame. Pandas is warning you that your assignment affects only the slice, not the original data. For example, if you create a new column in df, you will get the warning, and the column will exist in your slice but not in the original data set.
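A minimal sketch of that point, assuming a made-up larger frame (df_full and its columns are hypothetical, not the asker's data). Whether the warning is printed depends on the pandas version, but in either case the new column lands only in the derived frame:

```python
import pandas as pd

# Hypothetical stand-in for the larger frame that df was sliced from.
df_full = pd.DataFrame({
    "date": ["2021-01-01", "2021-01-02", "2021-01-03"],
    "value": [1, 2, 3],
})

# df is derived from df_full by row selection, without an explicit .copy().
df = df_full[df_full["value"] > 1]

# Older pandas versions may emit SettingWithCopyWarning on this line.
df["date_parsed"] = pd.to_datetime(df["date"])

print(list(df.columns))       # ['date', 'value', 'date_parsed']
print(list(df_full.columns))  # ['date', 'value']
```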

If you know what you are doing, you can turn off these warnings with pd.options.mode.chained_assignment = None # default='warn'

Answered By: Darina

I got similar warnings recently. After several tries, I found that, at least in my case, the problem was not related to your three solutions. It might be your ‘df’.

If your df was a slice of another pandas df, such as:

df = dfOrigin.loc[slice, :] or
df = dfOrigin[[some columns]] or
df = dfOrigin[one column]

Then, if you assign anything to df, that warning will appear. Try using df = dfOrigin[[some columns]].copy() instead.

Code to reproduce this:

import numpy as np
import pandas as pd
np.random.seed(2021)
dfOrigin = pd.DataFrame(np.random.choice(10, (4, 3)), columns=list('ABC'))
print("Original dfOrigin")
print(dfOrigin)
#    A  B  C
# 0  4  5  9
# 1  0  6  5
# 2  8  6  6
# 3  6  6  1
df = dfOrigin[['B', 'C']]  # No explicit copy: pandas tracks df as derived from dfOrigin
df.loc[:, 'B'] = df['B'].astype(str)  # Triggers SettingWithCopyWarning

df2 = dfOrigin[['B', 'C']].copy() #Returns a copy
df2['B'] = df2['B'].astype(str) #OK
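
The same fix applied to the date conversion from the question can be sketched as follows. The frame df_origin and its contents are invented for illustration, and the warning check matches on the class name only so that it does not depend on where a given pandas version defines SettingWithCopyWarning:

```python
import warnings

import pandas as pd

# Hypothetical data standing in for the asker's larger frame.
df_origin = pd.DataFrame({
    "date": ["2021-03-01", "2021-03-02", "2021-03-03"],
    "sales": [10, 20, 30],
})

# Explicit copy: df is an independent frame, not tracked as a slice.
df = df_origin[["date", "sales"]].copy()

# Record every warning raised during the assignment.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df["date"] = pd.to_datetime(df["date"])  # Solution 2 from the question

copy_warnings = [
    w for w in caught if w.category.__name__ == "SettingWithCopyWarning"
]

print(len(copy_warnings))                                       # 0
print(pd.api.types.is_datetime64_any_dtype(df["date"]))         # True
print(pd.api.types.is_datetime64_any_dtype(df_origin["date"]))  # False
```

The original frame keeps its string dates, so if the conversion is also wanted there, it has to be applied to df_origin directly.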

Answered By: Raymond