How to delete rows per category in pandas based on a specific range, where the range is defined by string values?
Question:
I have a dataframe like this,
Date Info
2022-01-01 egg price
2022-01-01 Central Java
2022-01-01 East Java
2022-01-01 chicken price
2022-01-01 Central Java
2022-01-01 East Java
2022-01-02 egg price
2022-01-02 Central Java
2022-01-02 East Java
2022-01-02 chicken price
2022-01-02 Central Java
2022-01-02 East Java
How can I delete the rows starting from egg price up to the row before chicken price, separately for each date?
I want the result to look like this:
Date Info
2022-01-01 chicken price
2022-01-01 Central Java
2022-01-01 East Java
2022-01-02 chicken price
2022-01-02 Central Java
2022-01-02 East Java
Answers:
If there is only one egg price and one chicken price value per Date, and egg price comes before chicken price, it is possible to create a mask and get the values between them with GroupBy.cummax:
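# Flag the marker rows
m1 = df['Info'].eq('egg price')
m2 = df['Info'].eq('chicken price')
# Rows at or after 'egg price' and before 'chicken price' within each Date
mask = m1.groupby(df['Date']).cummax() & m2.iloc[::-1].groupby(df['Date']).cummax() & ~m2
# Keep everything outside that range
df = df[~mask]
print(df)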
Date Info
3 2022-01-01 chicken price
4 2022-01-01 Central Java
5 2022-01-01 East Java
9 2022-01-02 chicken price
10 2022-01-02 Central Java
11 2022-01-02 East Java
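To make the mask easier to follow, here is a self-contained sketch that rebuilds the sample data from the question and splits the one-line mask into named steps. The names from_egg, up_to_chicken and result are only illustrative and are not part of the original answer, and the sort_index() call is an addition that puts the reversed Series back in its original order for readability. It relies on the same assumption as above: exactly one egg price and one chicken price per Date, with egg price first.

```python
import pandas as pd

# Sample data reconstructed from the question (default RangeIndex 0..11).
df = pd.DataFrame({
    "Date": ["2022-01-01"] * 6 + ["2022-01-02"] * 6,
    "Info": ["egg price", "Central Java", "East Java",
             "chicken price", "Central Java", "East Java"] * 2,
})

m1 = df["Info"].eq("egg price")
m2 = df["Info"].eq("chicken price")

# True from the "egg price" row to the end of each Date group.
from_egg = m1.groupby(df["Date"]).cummax()

# True from the start of each Date group up to and including "chicken price".
# Reversing the Series makes cummax run bottom-up; sort_index() restores the order.
up_to_chicken = m2.iloc[::-1].groupby(df["Date"]).cummax().sort_index()

# Rows inside the range, excluding the "chicken price" row itself.
mask = from_egg & up_to_chicken & ~m2

result = df[~mask]
print(result)
```

Because the grouping is done on df["Date"], the cumulative maximum restarts for every date, so a marker row in one date never affects another date.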