I want to sum the values of a column file and append each sum to the new column in the same csv file
Question:
I’m new to coding, can you please share detailed code
This is the given data, I want to iterate in the amount column and each sum to a new column
Like this:
after adding all these amounts should be in the new column
and also I have to find the minimum in the sum and the date of it, maximum in the sum and the date of it.
Actually my question is asked like this
Please apply credit transactions first to calculate balance on a given day.
Input CSV Format:
There are multiple customer IDs in the csv file
CustomerID, Date, Amount
Output Format:
CustomerID, MM/YYYY, Min Balance, Max Balance, Ending Balance"
Is this the best method to address the query?
Thank you
Answers:
You can use cumsum
to add a cumulative sum column. The sum
function just adds up all the values in a single column, so it won’t fulfil your current requirement.
So instead of d.['Amounts'].sum()
it would be:
d['sum'] = d['Amounts'].cumsum()
From there, you can search that sum column with the .idxmin()
and .idxmax()
functions to grab the index of the rows containing the min and max, respectively. Finally, you can index those rows in d
and grab the associated dates.
Multiple customers
df['sum']=df.groupby(['Date','Customer ID'],as_index=False)['Amount'].cumsum()
df
Out[23]:
Customer ID Date Amount sum
0 C231 11-01-2022 9232 9232
1 C231 11-02-2022 6061 6061
2 C231 11-03-2022 5108 5108
3 C231 11-04-2022 9086 9086
4 C232 11-03-2022 100 100
5 C232 11-03-2022 200 300
Min, Max and sum by Customer
df.groupby('Customer ID').agg({'Amount': ['min', 'max','sum']})
Out[25]:
Amount
min max sum
Customer ID
C231 5108 9232 29487
C232 100 200 300
Monthly Min, Max and sum by Customer
df.groupby([df.Date.dt.strftime('%b/%Y'),'Customer ID']).agg({'Amount':['min','max','sum']}).reset_index()
Out[111]:
Date Customer ID Amount
min max sum
0 Nov/2022 C231 5108 9232 29487
1 Nov/2022 C232 100 200 300
I’m new to coding, can you please share detailed code
This is the given data, I want to iterate in the amount column and each sum to a new column
Like this:
after adding all these amounts should be in the new column
and also I have to find the minimum in the sum and the date of it, maximum in the sum and the date of it.
Actually my question is asked like this
Please apply credit transactions first to calculate balance on a given day.
Input CSV Format:
There are multiple customer IDs in the csv file
CustomerID, Date, Amount
Output Format:
CustomerID, MM/YYYY, Min Balance, Max Balance, Ending Balance"
Is this the best method to address the query?
Thank you
You can use cumsum
to add a cumulative sum column. The sum
function just adds up all the values in a single column, so it won’t fulfil your current requirement.
So instead of d.['Amounts'].sum()
it would be:
d['sum'] = d['Amounts'].cumsum()
From there, you can search that sum column with the .idxmin()
and .idxmax()
functions to grab the index of the rows containing the min and max, respectively. Finally, you can index those rows in d
and grab the associated dates.
Multiple customers
df['sum']=df.groupby(['Date','Customer ID'],as_index=False)['Amount'].cumsum()
df
Out[23]:
Customer ID Date Amount sum
0 C231 11-01-2022 9232 9232
1 C231 11-02-2022 6061 6061
2 C231 11-03-2022 5108 5108
3 C231 11-04-2022 9086 9086
4 C232 11-03-2022 100 100
5 C232 11-03-2022 200 300
Min, Max and sum by Customer
df.groupby('Customer ID').agg({'Amount': ['min', 'max','sum']})
Out[25]:
Amount
min max sum
Customer ID
C231 5108 9232 29487
C232 100 200 300
Monthly Min, Max and sum by Customer
df.groupby([df.Date.dt.strftime('%b/%Y'),'Customer ID']).agg({'Amount':['min','max','sum']}).reset_index()
Out[111]:
Date Customer ID Amount
min max sum
0 Nov/2022 C231 5108 9232 29487
1 Nov/2022 C232 100 200 300