How to trim a string inside a query string passed to the Pandas query function?

Question:

I need to trim the ‘My Pet’ column inside the query string. How can that be done?

# Note that one cat literal has a trailing space.
testDF = pd.DataFrame([{"My Pet":"cat ",   "Cost":"$10 ", "Weight":"10 pounds", "Name":"Violet"},
                       {"My Pet":"cat",    "Cost":"$10 ", "Weight":"15 pounds", "Name":"Sirius"},
                       {"My Pet":"dog",    "Cost":"$0 ",  "Weight":"50 pounds", "Name":"Sam"},
                       {"My Pet":"turtle", "Cost":"$5 ",  "Weight":"20 ounces", "Name":"Tommy"},
                      ])
# We try to filter on cat. 
catDF = testDF.query("`My Pet` == 'cat'")  # This yields only one row because one cat cell has a trailing space
catDF.head()

Output is only one row but I would like to get both rows with cat in them

    My Pet  Cost    Weight  Name
1   cat     $10     15 pounds   Sirius
Asked By: nicomp

||

Answers:

Use Series.str.strip, but need engine="python":

catDF = testDF.query("`My Pet`.str.strip() == 'cat'", engine="python")
print (catDF)
  My Pet  Cost     Weight    Name
0   cat   $10   10 pounds  Violet
1    cat  $10   15 pounds  Sirius
Answered By: jezrael

You can access string methods on string series. This means that you can use the Series.str.strip function to clean remove whitespace.

testDF["My Pet"] = testDF["My Pet"].str.strip()

If you want to only clean trailing spaces then use rstrip (and lstrip for leading spaces)

You can optionally specify the set of characters you want to remove explicitly. E.g. to remove only spaces and tabs:

testDF["My Pet"] = testDF["My Pet"].str.strip(" t")
Answered By: Matteo Zanoni
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.