Using a backslash to break a line into multiple lines for readability

Question:

I am very bothered by the fact I can’t do something like this in Python.

df['File name'].replace(False, attachment_name, inplace=True) \
               .replace(True, np.nan, inplace=True)

instead of this

df['File name'].replace(False, attachment_name, inplace=True)
df['File name'].replace(True, np.nan, inplace=True)

Can someone brighter than me correct me?

Asked By: Puzzlemaster


Answers:

You can chain the calls like that if you don't replace the values in place:

df['File name'] = df['File name'].replace(False, attachment_name).replace(True, np.nan)
Answered By: Guy

In Python, putting a backslash at the end of the line creates a line continuation, which means "ignore the newline character and the next line’s indent". So, the code:

object.do_something(False) \
      .do_something(True)

will be interpreted as

object.do_something(False).do_something(True)
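
To make the continuation rule concrete, here is a small sketch that uses plain str.replace instead of pandas (the string and the replacement values are made up for illustration). It also shows the parenthesised form, which gives the same result without any backslashes:

```python
text = "True or False"

# Backslash continuation: the backslash must be the very last character
# on the line, with no trailing space after it.
with_backslash = text.replace("False", "report.pdf") \
                     .replace("True", "missing")

# Inside parentheses Python continues the expression implicitly,
# so no backslash is needed and the indentation is free-form.
with_parens = (
    text.replace("False", "report.pdf")
        .replace("True", "missing")
)

print(with_backslash)  # missing or report.pdf
print(with_backslash == with_parens)  # True
```

Both spellings are one logical line to the interpreter; the parenthesised one is just less fragile, since a stray space after a backslash is a syntax error.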

Now, this may actually be what you want to do. It’s often what you want to do with line continuation – you want to break one line for the machine into several lines for a human reader. So, what your code actually does is:

  1. Call the replace() method of df, with the specified parameters, and hold onto the value returned by replace.
  2. Call the replace() method of whatever the return value was, with the specified parameters.

So, this depends on the workings of df.replace. Given the name and parameters, I am going to assume that df is a pandas.DataFrame object. (The question's code actually calls replace on a single column, df['File name'], which is a pandas.Series, but the same reasoning applies.) The documentation for pandas.DataFrame.replace says that it returns the caller (i.e. df), and the description of the inplace parameter doesn't mention changing the return value. However, testing this out in an interactive shell, we see:

>>> df=pandas.DataFrame()
>>> df.replace()
Empty DataFrame
Columns: []
Index: []
>>> df.replace(inplace=True)
>>> type(df.replace())
<class 'pandas.core.frame.DataFrame'>
>>> type(df.replace(inplace=True))
<class 'NoneType'>

So, when we get to the second line, we aren't calling the replace method of a DataFrame, we are attempting to call the replace method of None. None does not have a replace method, so Python raises an AttributeError. There are various reasons for a method that modifies its object in place to return None, but the documentation definitely should have stated that inplace=True changes the return value.
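
The failure can be reproduced without pandas. The sketch below uses a hypothetical stand-in class, Column, which is not part of pandas and only mimics the return-value behaviour seen in the shell session above: replace returns a new object by default and None when inplace=True.

```python
class Column:
    """Hypothetical stand-in, NOT pandas: it only mimics the return values."""

    def __init__(self, values):
        self.values = list(values)

    def replace(self, old, new, inplace=False):
        # Identity comparison keeps True/False distinct from 1/0 in this toy.
        replaced = [new if v is old else v for v in self.values]
        if inplace:
            self.values = replaced
            return None  # nothing is returned, so nothing can be chained
        return Column(replaced)


col = Column([True, False, True])

try:
    # The first call returns None, so the second .replace is looked up on None.
    col.replace(False, "report.pdf", inplace=True) \
       .replace(True, None, inplace=True)
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

Note that the first replacement has already happened by the time the exception is raised, so the object is left half-updated.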

How do we get what you want then? Well, the easiest way is to make sure that df.replace returns a value with a replace method, which we can call and make sense of. Luckily, we have already found a way to do this – by not using inplace!

df = df.replace(False, attachment_name) \
       .replace(True, np.nan)

With the default inplace=False, each call returns a pandas.DataFrame, which has a replace method that accepts these parameters. Since nothing is modified in place any more, the result has to be assigned back to df, which is why the statement above starts with df =.

You may be tempted to set inplace=True on the final replace. This will not work, because the final replace is not being called on df. It is being called on the return value of the first df.replace, which is a different object. It won't raise an exception, but it will not modify the original df, and since a call with inplace=True returned None in the session above, keeping the df = assignment would leave df set to None.
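
Putting the pieces together for the column from the question, a minimal sketch might look like this. The sample data and the value of attachment_name are assumptions made up for the example, and parentheses are used instead of backslashes:

```python
import numpy as np
import pandas as pd

attachment_name = "report.pdf"  # hypothetical value, not from the question
df = pd.DataFrame({"File name": [True, False, True]})  # made-up sample data

# Without an assignment the chained result is simply discarded,
# and the column in df stays as it was.
df["File name"].replace(False, attachment_name).replace(True, np.nan)
unchanged = df["File name"].tolist()

# Chain with the default inplace=False and assign the result back.
df["File name"] = (
    df["File name"]
    .replace(False, attachment_name)
    .replace(True, np.nan)
)

print(df)
```

After the assignment, the False entry holds the attachment name and the True entries are NaN.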

Answered By: IntoAMuteCrypt