How to count the number of rows with a specific string value in a column using pandas?

Question:

I have a pandas column with dtype 'object' that contains numeric values and the value '?'.

How should I proceed to count the number of rows that have the value '?' ?

I’m trying to run:

question_mark_count = df['column'].str.contains('?').sum()

on a column that has numeric values and some question marks ‘?’, but I’m getting the error:

AttributeError: Can only use .str accessor with string values!

When I run df.dtypes, I can see that the column is 'object' type.

I’ve also tried to convert the column to string:

df["column"] = df["column"].astype("string")

But I’m still getting the same error.

Asked By: jimmy


Answers:

How about this? Note that '?' is a regex metacharacter, so pass regex=False to match it literally:

>>> (df["column"].str.contains('?', regex=False)).astype('int').sum()
Answered By: Danail Petrov

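To explore further, you can count both outcomes at once:

df["column"].str.contains('?', regex=False).value_counts()

On an object column, non-string entries (np.nan, pd.NA, ints, floats) give a missing result, and value_counts() drops missing values by default, so they do not affect the True/False counts.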
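
As a rough illustration of the value_counts() approach, here is a minimal sketch on a made-up column (the sample values are an assumption, not the asker's data). It assumes the column has object dtype with at least some string entries, so the .str accessor is available:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: numbers, '?' strings, a numeric string, and a missing value.
df = pd.DataFrame({"column": [1, "?", 2.5, "?", np.nan, "7"]})

# regex=False treats '?' as a literal character rather than a regex quantifier.
mask = df["column"].str.contains("?", regex=False)

# Non-string entries give a missing result in the mask;
# value_counts() leaves missing values out by default.
counts = mask.value_counts()
print(counts)
```

The True entry of the result is the number of rows whose string contains '?'.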

Answered By: eliu

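In my case the previous answer is almost correct. Try adding na=False to the call to contains, so that non-string entries count as False instead of NaN, and regex=False, so that '?' is matched literally:

df["column"].str.contains('?', regex=False, na=False).astype('int').sum()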
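
For comparison, a small self-contained sketch on made-up data (the sample values are an assumption) that puts the substring count next to a plain equality check. The equality check is an alternative that is not part of the answers above; it may be simpler when the cell is expected to be exactly '?':

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mixing numbers, '?' strings, and a missing value.
df = pd.DataFrame({"column": [1, "?", 2.5, "?", np.nan, "7"]})

# Substring match: literal '?', with non-string entries treated as False.
contains_count = int(
    df["column"].str.contains("?", regex=False, na=False).sum()
)

# Exact match: compares each value with '?' directly, no .str accessor needed.
exact_count = int((df["column"] == "?").sum())

print(contains_count, exact_count)
```

The two counts can differ on real data: the substring version would also count a value such as '5?', while the equality version only counts cells that are exactly '?'.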
Answered By: Morita Ichika