Pandas: Combine consecutive months having same values in other column

Question:

I have monthly performance of students for several years for all subjects. DataFrame has following columns: [Name, Subject, Month, Year, Marks] as given in following image 1:

  Name  Month  Year Subject  Marks
0    A      1  2022    Math     80
1    A      2  2022    Math     80
2    A      3  2022    Math     80
3    A      4  2022    Math     70
4    A      5  2022    Math     80
5    A      6  2022    Math     80
6    A      7  2022    Math     80

Now I want combine consecutive rows having same performance for given student and subject. As given in following image:

  Name Subject  Marks    Time_Period
0    A    Math     80  1.2022-3.2022
1    A    Math     70  4.2022-4.2022
2    A    Math     80  5.2022-7.2022

I have tried to group dataframe and extract Min/Max(Month) and Min/Max(Year). But it will give wrong result if student has different performance in month in between.

Asked By: RVD

||

Answers:

You can use a custom groupby.agg:

# identify consecutive marks
group = df['Marks'].ne(df['Marks'].shift()).cumsum()

out = (df.assign(Time_Period=lambda d: d['Month'].astype(str)
                                  +'.'+d['Year'].astype(str))
         .groupby(['Name', 'Subject', 'Marks', group],
                  sort=False, as_index=False)
         ['Time_Period']
         .agg(lambda x: '-'.join([x.iloc[0], x.iloc[-1]]))
      )

If you want to start a new group when a month is missing:

# identify consecutive marks
group1 = df['Marks'].ne(df['Marks'].shift()).cumsum()

# group by successive months
group2 = df.groupby('Year')['Month'].diff().ne(1)

out = (df.assign(Time_Period=lambda d: d['Month'].astype(str)
                                  +'.'+d['Year'].astype(str))
         .groupby(['Name', 'Subject', 'Marks', group1, group2],
                  sort=False, as_index=False)
         ['Time_Period']
         .agg(lambda x: '-'.join([x.iloc[0], x.iloc[-1]]))
      )

Output:

  Name Subject  Marks    Time_Period
0    A    Math     80  1.2022-3.2022
1    A    Math     70  4.2022-4.2022
2    A    Math     80  5.2022-7.2022
Answered By: mozway