How to create a pandas DatetimeIndex with year as frequency?
Question:
Using the pandas.date_range(startdate, periods=n, freq=f) function, you can create a range of pandas Timestamp objects, where the optional freq parameter denotes the frequency (second, minute, hour, day…) in the range.
The documentation does not mention the literals that are expected to be passed in, but after a few minutes you can easily find most of them.
- ‘s’ : second
- ‘min’ : minute
- ‘H’ : hour
- ‘D’ : day
- ‘w’ : week
- ‘m’ : month
However, none of ‘y’, ‘Y’, ‘yr’, etc. create dates with year as frequency.
Does anybody know what to pass in, or if it is possible at all?
Answers:
You are able to use multiples for the frequency strings. For example:
pd.date_range('01/01/2010', periods=10, freq='365D')
This code will give you a series starting 01/01/2010, 01/01/2011, etc., which I think is what you are looking for. Of course, the issue here is leap years: each one is 366 days long, so every date after it lands one day earlier (01/01/2012 is followed by 12/31/2012).
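To make the leap-year drift concrete, here is a minimal sketch that steps in 365-day increments across 2012. The variable names are only illustrative.

```python
import pandas as pd

# Step in fixed 365-day increments, starting on 1 January 2010.
fixed_step = pd.date_range("2010-01-01", periods=4, freq="365D")

# 2012 is a leap year (366 days), so the fourth date falls one day
# short of 1 January 2013.
dates_as_text = [d.strftime("%Y-%m-%d") for d in fixed_step]
print(dates_as_text)
# ['2010-01-01', '2011-01-01', '2012-01-01', '2012-12-31']
```

Every later leap year shifts the dates by one more day, so this is only safe for short ranges.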
You can use month and then pick every 12th month:
import datetime
import pandas
months=pandas.date_range(start=datetime.datetime.now(),periods=120,freq='M')
year=[months[12*i] for i in range(10)]
You can also do:
usingDays=pandas.date_range(start=datetime.datetime.now(),periods=10,freq='365D')
but that won’t work so well with leap years.
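The "every 12th month" idea can also be written as a slice. This is only a sketch: it uses a fixed start date so the result is reproducible, and it passes the pd.offsets.MonthEnd() offset object instead of a string alias, on the assumption that you want month-end dates as in the snippet above.

```python
import pandas as pd

# 120 month-end dates, i.e. ten years of months starting January 2010.
months = pd.date_range(start="2010-01-01", periods=120, freq=pd.offsets.MonthEnd())

# Take every 12th entry: one date per year, always the end of January here.
yearly = months[::12]

print(len(yearly))   # 10
print(yearly[1])     # 2011-01-31 00:00:00
```

Unlike the 365D version, this does not drift, because each step is a whole number of calendar months.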
Annual indexing to the beginning or end of the year
Frequency is freq='A' for end of year frequency, 'AS' for start of year (newer pandas releases rename these aliases to 'YE' and 'YS'). Check the aliases in the documentation.
eg. pd.date_range(start=pd.Timestamp(2000, 1, 1), periods=4, freq='A')
returns
DatetimeIndex(['2000-12-31', '2001-12-31', '2002-12-31', '2003-12-31'], dtype='datetime64[ns]', freq='A-DEC', tz=None)
Annual indexing to the beginning of an arbitrary month
If you need it to be annual from a particular time use an anchored offset,
eg. pd.date_range(start=pd.Timestamp(2000, 1, 1), periods=4, freq='AS-AUG')
returns
DatetimeIndex(['2000-08-01', '2001-08-01', '2002-08-01', '2003-08-01'], dtype='datetime64[ns]', freq='AS-AUG', tz=None)
Annual indexing from an arbitrary date
To index from an arbitrary date, begin the series on that date and use a custom DateOffset object.
eg. pd.date_range(start=pd.Timestamp(2000, 9, 10), periods=4, freq=pd.DateOffset(years=1))
returns
DatetimeIndex(['2000-09-10', '2001-09-10', '2002-09-10', '2003-09-10'], dtype='datetime64[ns]', freq='<DateOffset: kwds={'years': 1}>', tz=None)
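Since the string aliases have changed between pandas versions, here is a sketch of the same three cases written with offset objects, which date_range also accepts for freq. The variable names are just for illustration, and the printed repr may look different depending on your pandas version.

```python
import pandas as pd

# End of each calendar year (YearEnd defaults to December).
year_end = pd.date_range("2000-01-01", periods=4, freq=pd.offsets.YearEnd())

# Start of a chosen month each year, here 1 August.
aug_start = pd.date_range("2000-01-01", periods=4, freq=pd.offsets.YearBegin(month=8))

# The same calendar date each year, counted from the start date itself.
anniversary = pd.date_range("2000-09-10", periods=4, freq=pd.DateOffset(years=1))

print(list(year_end.strftime("%Y-%m-%d")))
# ['2000-12-31', '2001-12-31', '2002-12-31', '2003-12-31']
print(list(aug_start.strftime("%Y-%m-%d")))
# ['2000-08-01', '2001-08-01', '2002-08-01', '2003-08-01']
print(list(anniversary.strftime("%Y-%m-%d")))
# ['2000-09-10', '2001-09-10', '2002-09-10', '2003-09-10']
```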
With all those hacks, there is a clear way:
pd.date_range(start=datetime.datetime.now(),periods=5,freq='A')
A: Annually.
365D? Really? What about leap years?