Iterate dataframe and sum transactions by condition

Question:

I have the following sample of data:

    id   year  type  num
     1   1994   A     0
     2   1950   A  2333
     3   1977   B  4444
     4   1995   B   555
     1   1994   A     0
     6   1955   A   333
     7   2006   B  4123
     6   1975   A     0
     9   1999   B   123
     3   1950   A  1234

I'm looking for the easiest way to sum the column 'num' over the rows where type == 'A' and year < 1999.

I’m iterating through the dataframe df with the data:

    data = pd.read_csv('data.csv')
    df = pd.DataFrame(data)
    df_sum = pd.DataFrame
    
    for index, row in df.iterrows():
        if row['type'] == 'A' and row['year'] < 1999:
            df_sum = df_sum.append(row)  # This doesn't work

and trying to store the rows that match the conditions in df_sum, where I'd then sum num by id. I have no idea how to iterate and store the rows that match a condition in a new dataframe.

The desired output would be:

    id  num_sum
    1   0
    2   2333
    6   333
    .....
Asked By: vloubes

Answers:

You can use df.query() to accomplish that.

    filtered_df = df.query('type == "A" and year < 1999')
    sum_df = filtered_df.groupby("id")["num"].sum().reset_index()
    print(sum_df)

Output:

       id      num
    0   1        0
    1   2     2333
    2   3     1234
    3   6      333
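
A self-contained sketch of the same approach, assuming the sample rows from the question are built inline instead of being read from data.csv. The named aggregation is an addition of mine, used only to get the num_sum column name from the desired output:

```python
import pandas as pd

# Sample data copied from the question (assumption: built inline, not read from data.csv)
df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4, 1, 6, 7, 6, 9, 3],
        "year": [1994, 1950, 1977, 1995, 1994, 1955, 2006, 1975, 1999, 1950],
        "type": ["A", "A", "B", "B", "A", "A", "B", "A", "B", "A"],
        "num": [0, 2333, 4444, 555, 0, 333, 4123, 0, 123, 1234],
    }
)

# Keep only the rows that satisfy both conditions
filtered_df = df.query('type == "A" and year < 1999')

# Sum num per id and name the result column num_sum
sum_df = filtered_df.groupby("id", as_index=False).agg(num_sum=("num", "sum"))
print(sum_df)
```

Passing as_index=False keeps id as a regular column, so no reset_index() call is needed.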
Answered By: Jamiu Shaibu

For summarised data, you could filter with a boolean mask, group by id, sum num, then reset the index:

    df_sum = df[(df['type'] == 'A') & (df['year'] < 1999)].groupby('id')['num'].sum().reset_index()
    df_sum
    Out[276]: 
       id   num
    0   1     0
    1   2  2333
    2   3  1234
    3   6   333
Answered By: Surjit Samra