Get the last date of each month in a list of dates in Python

Question:

I’m using Python 2.7, PyCharm and Anaconda.

I have a list of dates and I’d like to retrieve the last date of each month present in the array.

Are there any functions or libraries that could help me to do this?

I read the dates from a CSV file and stored them as datetime.

I have the following code:

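import datetime
import xlrd

# dates, startDate and endDate are defined elsewhere in my script
Dates = []
Dates1 = []
for date in dates:
    # convert the Excel serial number to a (year, month, day, ...) tuple
    temp = xlrd.xldate_as_tuple(int(date), 0)
    Dates1.append(datetime.datetime(temp[0], temp[1], temp[2]))

# keep only the dates inside [startDate, endDate]
for date in Dates1:
    if startDate <= date <= endDate:
        Dates.append(date)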

To make it clear, suppose I have:

Dates = [2015-01-20, 2015-01-15, 2015-01-17, 2015-02-21, 2015-02-06] 

(Consider it being in datetime format.)

The list I’d like to retrieve is:

[2015-01-20, 2015-02-21]

So far I’ve googled around, especially on Stack Overflow, but I could only find answers on how to get the last calendar date of each month, not the last date of each month within a user-specified list.

Asked By: Pedro Braz


Answers:

For year y and month m, calendar.monthrange(y, m)[1] returns the day number of the last day of the month.
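As a quick illustration of that call, here is a minimal sketch (the variable names are only for this example). calendar.monthrange returns a pair: the weekday of the first day of the month and the number of days in the month, so index [1] is the last day number.

```python
import calendar

# monthrange returns (weekday of the 1st, number of days in the month);
# index [1] is therefore the day number of the month's last day.
feb_2015 = calendar.monthrange(2015, 2)[1]   # non-leap year
feb_2016 = calendar.monthrange(2016, 2)[1]   # leap year
aug_2015 = calendar.monthrange(2015, 8)[1]
print((feb_2015, feb_2016, aug_2015))  # (28, 29, 31)
```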

The following script takes a list of datetime objects called dates and makes a new list, month_last_dates, containing datetime objects for the last calendar day of each month in which the members of dates fall.

import datetime
import calendar

tuples = [(2015, 8, 1), (2015, 9, 16), (2015, 10, 4)]
dates = [datetime.datetime(y, m, d) for y, m, d in tuples]

month_last_dates = len(dates) * [None]
for i, date in enumerate(dates):
  y, m, d = date.year, date.month, date.day
  last = calendar.monthrange(y, m)[1]
  print y, m, last  # Output for testing purposes.
  month_last_dates[i] = datetime.datetime(y, m, last)

Here is an equivalent script written more concisely with the help of a list comprehension:

import datetime
import calendar

tuples = [(2015, 8, 1), (2015, 9, 16), (2015, 10, 4)]
dates = [datetime.datetime(y, m, d) for y, m, d in tuples]

month_last_dates = [datetime.datetime(date.year, date.month,
      calendar.monthrange(date.year, date.month)[1]) for date in dates]

# Output for testing purposes.
for date in month_last_dates:
  print date.year, date.month, date.day

In your case, given the list Dates, you can make a new list like this:

last_dates = [datetime.datetime(date.year, date.month,
      calendar.monthrange(date.year, date.month)[1]) for date in Dates]
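
The snippets above produce the calendar end of each month. If the goal is instead the latest date that actually appears in the list, as in the question's example ([2015-01-20, 2015-02-21]), one possible sketch keeps a running maximum per (year, month) pair. The function name last_present_date_per_month is hypothetical, not part of any library.

```python
import datetime

def last_present_date_per_month(dates):
    """Map each (year, month) to the latest date found in `dates`."""
    latest = {}
    for d in dates:
        key = (d.year, d.month)
        # Keep the date if it is the first or the latest seen for its month.
        if key not in latest or d > latest[key]:
            latest[key] = d
    # Return the per-month winners in chronological order.
    return [latest[key] for key in sorted(latest)]

# The example list from the question.
Dates = [datetime.datetime(2015, 1, 20), datetime.datetime(2015, 1, 15),
         datetime.datetime(2015, 1, 17), datetime.datetime(2015, 2, 21),
         datetime.datetime(2015, 2, 6)]
result = last_present_date_per_month(Dates)
print(result)
```

Keying on (year, month) rather than on the month alone keeps dates from different years apart, and the standard library is all it needs.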
Answered By: Michael Laszlo

Pandas can handle this task well. Load your CSV into a DataFrame, group the rows by month, and take the maximum date in each group with an aggregate function:

import pandas as pd

# Load the file and parse the date column as datetimes
df = pd.read_csv('/path/to/file/', parse_dates=['my_date_field'])
df.index = df['my_date_field']              # index the DataFrame by date
# Group by calendar month ('M' = month end, spelled 'ME' in newer pandas; 'MS' = month start).
# pd.Grouper replaces pd.TimeGrouper, which was removed from later pandas releases.
dfg = df.groupby(pd.Grouper(freq='M'))

# Finally, find the max date in each month
dfg.agg({'my_date_field': 'max'})

# To specifically coerce the results of the groupby to a list
# (months with no rows yield NaT, which dropna() discards):
dfg.agg({'my_date_field': 'max'})['my_date_field'].dropna().tolist()
Answered By: leroyJr

This can be done in one line with pandas, provided all the dates fall within a single year (the grouping key is the month number alone):

import datetime
import pandas as pd

dates = [datetime.date(2023, 3, 21), datetime.date(2023, 3, 1),
         datetime.date(2023, 2, 22), datetime.date(2023, 2, 14)]
eom_dates = set(pd.Series(dates, index=dates).groupby(lambda x: x.month).max())

This results in:

{datetime.date(2023, 2, 22), datetime.date(2023, 3, 21)}
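
If the dates span more than one year, a variation is to group by a year-month period so that, for example, March 2022 and March 2023 stay separate. This is only a minimal sketch under that assumption; the function name last_dates_per_month is hypothetical.

```python
import datetime
import pandas as pd

def last_dates_per_month(dates):
    """Return the latest date present in `dates` for each year-month."""
    s = pd.Series(pd.to_datetime(dates))
    # Group by monthly period (year and month together), then take the
    # maximum timestamp within each group.
    latest = s.groupby(s.dt.to_period("M")).max()
    # Convert the pandas Timestamps back to plain datetime.date objects.
    return [ts.date() for ts in latest]

dates = [datetime.date(2023, 3, 21), datetime.date(2023, 3, 1),
         datetime.date(2022, 3, 5), datetime.date(2023, 2, 14)]
result = last_dates_per_month(dates)
print(result)
```

The result is a list ordered by period, so it stays chronological without an extra sort.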
Answered By: Emmanuel Wildiers