Counting the occurrences of country codes over each week in a Python DataFrame containing daily sales

Question:

I am using Python to analyse data from a sales dataset.
This dataset consists of two columns: the transaction date and the country code of the country where the purchase was made. The dataset is a pandas DataFrame and was built from the columns of two smaller datasets with different column names, as seen here:

    frames_f, frames_l = get_sales_frames(True)
    sales_f = frames_f[['Transaction Date', 'Buyer Country']]
    sales_l = frames_l[['Order Charged Date', 'Country of Buyer']]
    
    sales_l.columns = sales_f.columns
    sales = pd.concat([sales_f, sales_l])
    sales['Transaction Date'] = pd.to_datetime(sales['Transaction Date'], infer_datetime_format=True)

The dataset contains purchase data from every day within a 7-month period. To analyse the data, I need to count the number of purchases made in each country for every week.

First, I looked for similar problems and found an answer that helped somewhat: it suggested adding an extra column of 1’s and calling .sum(), which gave me the following:

    sales['Purchases'] = 1
    
    purchases = sales.groupby(['Transaction Date', 'Buyer Country'])['Purchases'].sum()
    print(purchases)

This gives the following output:

    Transaction Date  Buyer Country
    2021-06-01        US               2
    2021-06-02        GB               1
                      IL               1
                      SE               1
                      US               7
                                      ..
    2021-12-29        US               7
    2021-12-30        CA               1
                      US               4
    2021-12-31        GB               1
                      US               7

This helped me a lot. However, I need the counts for every week instead of every day. I would like the result to look just like this, but with the purchases per country counted over each week of the Transaction Date.

What would be the most efficient way to achieve this?

Besides the suggestion to use the added ‘Purchases’ column, I have tried (maybe in the wrong way) using df.groupby(pd.Grouper(key='Transaction Date', freq="W")).count(), but without any luck: it counts all purchases per week, regardless of country.

Asked By: Dinkleberg

||

Answers:

Try this approach: first assign each transaction its week number, then sum the purchases per week and country.

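    # Label each row with its ISO week number, then sum per week and country.
    # Week numbers repeat every year, so also group by year if the data spans several years.
    sales['Week'] = sales['Transaction Date'].dt.isocalendar().week
    weekly_purchases = sales.groupby(['Week', 'Buyer Country'])['Purchases'].sum()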
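
As a possible alternative, the pd.Grouper attempt from the question can probably be made to work by passing the country column as a second grouping key. The following is only a minimal sketch: the sample rows are made up to stand in for the real `sales` frame, and `weekly_counts` is just an illustrative name. It uses .size(), so the helper ‘Purchases’ column is not needed.

```python
import pandas as pd

# Hypothetical sample data standing in for the `sales` frame from the question.
sales = pd.DataFrame({
    "Transaction Date": pd.to_datetime([
        "2021-06-01", "2021-06-02", "2021-06-02", "2021-06-07", "2021-06-08",
    ]),
    "Buyer Country": ["US", "GB", "US", "US", "US"],
})

# Group by calendar week (freq="W" uses weekly bins ending on Sunday)
# and by country, then count the rows in each group.
weekly_counts = (
    sales.groupby([pd.Grouper(key="Transaction Date", freq="W"), "Buyer Country"])
    .size()
    .rename("Purchases")
)
print(weekly_counts)
```

Here each group is labelled with the date on which its week ends rather than with a week number, so weeks from different years stay separate.
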
Answered By: Jamiu S.