Group rows based on a string in a column in pandas and count the number of unique rows that contain the string

Question:

I have a dataset with a few columns. I would like to slice the data frame by finding the string "M22" in the column "RUN_NUMBER", which I am able to do. However, I would also like to count the number of unique run numbers that contain the string "M22".

Here is what I have done for the table below (example):

RUN_NUMBER  DATE_TIME   CULTURE_DAY  AGE_HRS  AGE_DAYS
335991M     6/30/2022   0            0        0
M220621     7/1/2022    1            24       1
M220678     7/2/2022    2            48       2
510091M     7/3/2022    3            72       3
M220500     7/4/2022    4            96       4
335991M     7/5/2022    5            120      5
M220621     7/6/2022    6            144      6
M220678     7/7/2022    7            168      7
335991M     7/8/2022    8            192      8
M220621     7/9/2022    9            216      9
M220678     7/10/2022   10           240      10

Here are the results I got:

RUN_NUMBER
335991M      0
510091M      0
335992M      0
M220621      3
M220678      3
M220500      1

Now I need to count the distinct run numbers that contain "M22", so I need to get 3 as the output.

Asked By: zizoo

||

Answers:

Use the following approach with the pd.Series.unique function:

df[df['RUN_NUMBER'].str.contains("M22")]['RUN_NUMBER'].unique().size
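As a self-contained sketch of the same idea, the snippet below rebuilds only the RUN_NUMBER column from the question's sample table and counts the distinct values containing "M22". The variable names (mask, matching, unique_count, per_run) are illustrative, and using nunique() with regex=False is an assumed variation on the one-liner, not part of the original answer.

```python
import pandas as pd

# RUN_NUMBER values copied from the sample table in the question
df = pd.DataFrame({
    "RUN_NUMBER": [
        "335991M", "M220621", "M220678", "510091M", "M220500", "335991M",
        "M220621", "M220678", "335991M", "M220621", "M220678",
    ]
})

# Boolean mask of rows whose run number contains the literal text "M22"
# (regex=False treats the pattern as a plain substring, not a regex)
mask = df["RUN_NUMBER"].str.contains("M22", regex=False)

# Keep only the matching run numbers
matching = df.loc[mask, "RUN_NUMBER"]

# Number of distinct matching run numbers
unique_count = matching.nunique()

# Rows per matching run number, for comparison with the grouped counts
per_run = matching.value_counts()

print(unique_count)
print(per_run)
```

Here nunique() plays the same role as .unique().size, and value_counts() shows how many rows each matching run number contributes.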

Or a faster alternative using the numpy.char.find function (with numpy imported as np):

(np.char.find(df['RUN_NUMBER'].unique().astype(str), 'M22') != -1).sum()

3
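A minimal runnable sketch of the NumPy variant, again using only the RUN_NUMBER values from the question. The names unique_runs, hits, and count are illustrative, and converting with np.asarray(..., dtype=str) is an assumption made here so that np.char.find receives a plain NumPy string array.

```python
import numpy as np
import pandas as pd

# RUN_NUMBER values copied from the sample table in the question
df = pd.DataFrame({
    "RUN_NUMBER": [
        "335991M", "M220621", "M220678", "510091M", "M220500", "335991M",
        "M220621", "M220678", "335991M", "M220621", "M220678",
    ]
})

# Deduplicate first, so the substring search runs once per distinct value
unique_runs = np.asarray(df["RUN_NUMBER"].unique(), dtype=str)

# np.char.find returns the index of the first match, or -1 if absent
hits = np.char.find(unique_runs, "M22") != -1

# Count the distinct run numbers that contain "M22"
count = int(hits.sum())
print(count)
```

Deduplicating before the search means the substring test is applied to five distinct values rather than all eleven rows, which is presumably where any speed advantage comes from on larger data.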
Answered By: RomanPerekhrest