Difference between pandas.Series.str.match and pandas.Series.str.contains

Question:

What’s the difference between pandas.Series.str.contains and pandas.Series.str.match? Why do the two calls below give different results?

import pandas as pd

s1 = pd.Series(['house and parrot'])
s1.str.contains(r"\bparrot\b", case=False)

I got True, but when I do

s1.str.match(r"\bparrot\b", case=False)

I got False. Why is that the case?

Asked By: Jay


Answers:

The documentation for str.contains() states:

Test if pattern or regex is contained within a string of a Series or
Index.

The documentation for str.match() states:

Determine if each string matches a regular expression.

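The difference between these two methods is that str.contains() uses re.search, which looks for the pattern anywhere in the string, while str.match() uses re.match, which only tries the pattern at the beginning of the string.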
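The distinction can be checked directly with the standard-library re module. This is a minimal sketch, assuming the pattern in the question was meant to be r"\bparrot\b"; the variable names are only illustrative.

```python
import re

text = "house and parrot"
pattern = r"\bparrot\b"  # assumed pattern: the whole word "parrot"

# re.search scans the entire string for the first place the pattern matches.
found_anywhere = re.search(pattern, text, flags=re.IGNORECASE)

# re.match only attempts the pattern at position 0 of the string.
found_at_start = re.match(pattern, text, flags=re.IGNORECASE)

print(found_anywhere is not None)  # True
print(found_at_start is not None)  # False
```

Here re.search finds "parrot" near the end of the string, while re.match gives up because the string starts with "house".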

As per the documentation of re.match():

If zero or more characters at the beginning of string match the
regular expression pattern, return a corresponding match object.
Return None if the string does not match the pattern; note that this
is different from a zero-length match.

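Since "parrot" does not occur at the beginning of "house and parrot", re.match finds nothing there and str.match() returns False, whereas str.contains() searches the whole string and returns True. A pattern for "house", by contrast, does match at the beginning of the string, so str.match() would return True for it.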
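The same behaviour can be reproduced in pandas. The following is a sketch, again assuming the intended pattern was r"\bparrot\b"; the last line shows one possible way to make str.match() succeed, by letting the pattern consume the leading text with ".*".

```python
import pandas as pd

s1 = pd.Series(["house and parrot"])

# str.contains: True if the pattern is found anywhere in the string.
contains_parrot = s1.str.contains(r"\bparrot\b", case=False)

# str.match: True only if the pattern matches starting at the first character.
match_parrot = s1.str.match(r"\bparrot\b", case=False)
match_house = s1.str.match(r"\bhouse\b", case=False)

# Allowing any leading characters lets str.match reach "parrot".
match_with_prefix = s1.str.match(r".*\bparrot\b", case=False)

print(contains_parrot.tolist())    # [True]
print(match_parrot.tolist())       # [False]
print(match_house.tolist())        # [True]
print(match_with_prefix.tolist())  # [True]
```

If the goal is simply to test whether a word appears somewhere in each string, str.contains() is usually the more direct choice.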

Answered By: Mack123456