Create function to loop through a pandas dataframe

Question:

I have a dataframe that looks like this. enter image description here

Now i want to create a function to get the sum of the ‘Total’ column for each of the name ‘ABC’,’XYZ’… with a pre-defined condition of the Currency.
In other words, i want my output to be the same as the followingenter image description here

Name_list = list(df['Name'])
Currency_list = list(df['Currency'])

def Calculation(a,b):
    for a in Name_list:
        for b in Currency_list:
            return df['Total'].sum()

result = Calculation('ABC','VND')
print(result) 

So far what I have got is just the sum of the whole column ‘Total’. Anyone has ideas how to get the desired result for this?

Asked By: Anthony Nguyen

||

Answers:

In your code there is actually nothing that force the function to use only the lines you want to select in your dataframe.
This can be done by using the loc function to locate the lines you want to sum on.

In your example, if you replace le the line return df['Total'].sum() by this one :
return df.loc[(df['Name'] == a) & (df['Currency'] == b), 'Total').apply(numpy.sum)
It should return what you want.

Answered By: Haeden

I think pandas’ groupby method may help you here:

df.groupby(['Currency', 'Name'])['Total'].sum()

Link to official docs:
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html

Answered By: Flursch