How to convert a TimeSeries object in pandas into an integer?
Question:
I’ve been working with Pandas to calculate the age of a sportsman on a particular fixture, although it’s returned as a TimeSeries type.
I’d now like to be able to plot age (in days) against the fixture dates, but can’t work out how to turn the TimeSeries object to an integer. What can I try next?
This is the shape of the data.
squad_date['mean_age']
2008-08-16 11753 days, 0:00:00
2008-08-23 11760 days, 0:00:00
2008-08-30 11767 days, 0:00:00
2008-09-14 11782 days, 0:00:00
2008-09-20 11788 days, 0:00:00
This is what I would like:
2008-08-16 11753
2008-08-23 11760
2008-08-30 11767
2008-09-14 11782
2008-09-20 11788
Answers:
The way I did it:
def conv_delta_to_int(dt):
    # Take the leading day count from text such as "11753 days, 0:00:00"
    return int(str(dt).split(" ")[0].replace(",", ""))

squad_date['mean_age'] = squad_date['mean_age'].map(conv_delta_to_int)
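One caveat with parsing the string form, sketched here with plain datetime.timedelta values (the sample ages are taken from the question; the list names are made up for illustration): the text of a zero-day delta has no "days" part, so the parse fails there, while the .days attribute gives the same number without going through a string.

```python
from datetime import timedelta


def conv_delta_to_int(dt):
    # Same string-parsing helper as in the answer above
    return int(str(dt).split(" ")[0].replace(",", ""))


# Hypothetical sample values matching the question's data
ages = [timedelta(days=11753), timedelta(days=11760)]

parsed = [conv_delta_to_int(a) for a in ages]  # via the string form
direct = [a.days for a in ages]                # via the attribute

# str(timedelta(0)) is "0:00:00", which int() cannot parse
try:
    conv_delta_to_int(timedelta(0))
    zero_parse_ok = True
except ValueError:
    zero_parse_ok = False
```

If the values are guaranteed to be at least one day long, both approaches agree.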
You need to be on master for this (0.11-dev):
In [40]: x = pd.date_range('20130101',periods=5)
In [41]: td = pd.Series(x,index=x)-pd.Timestamp('20130101')
In [43]: td
Out[43]:
2013-01-01 00:00:00
2013-01-02 1 days, 00:00:00
2013-01-03 2 days, 00:00:00
2013-01-04 3 days, 00:00:00
2013-01-05 4 days, 00:00:00
Freq: D, Dtype: timedelta64[ns]
In [44]: td.apply(lambda x: x.item().days)
Out[44]:
2013-01-01 0
2013-01-02 1
2013-01-03 2
2013-01-04 3
2013-01-05 4
Freq: D, Dtype: int64
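For readers on a much newer pandas than 0.11, a minimal sketch of the same conversion using the .dt accessor on a timedelta Series. This assumes a pandas release that ships the .dt accessor, which the 0.11-dev build discussed above did not have.

```python
import pandas as pd

# Rebuild the same timedelta Series as in the session above
x = pd.date_range("20130101", periods=5)
td = pd.Series(x, index=x) - pd.Timestamp("20130101")

# .dt.days gives the whole-day component of each timedelta as integers
days = td.dt.days
```

Note that .dt.days drops any sub-day remainder rather than rounding.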
For people who find this post via Google: if you have numpy >= 1.7 and pandas 0.11, these solutions will not work. What does work:
squad_date['mean_age'].apply(lambda x: x / np.timedelta64(1,'D'))
The official pandas documentation can be confusing here. It suggests calling “x.item()”, where x is already a timedelta object.
x.item() returns the difference as a plain int in the object's own unit; if that unit is ‘ns’, for example, you get the number of nanoseconds. Dividing that int by a timedelta then raises an error. Dividing one timedelta by another does work, and the ‘D’ in np.timedelta64(1,'D') is what converts the result to days.
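A small self-contained sketch of the division approach, using three of the ages from the question. The variable names are made up for illustration, and the cast to int64 is my own addition for when whole numbers are wanted, since the division itself returns floats.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for squad_date['mean_age'], built from the question's values
dates = pd.to_datetime(["2008-08-16", "2008-08-23", "2008-08-30"])
mean_age = pd.Series(pd.to_timedelta([11753, 11760, 11767], unit="D"), index=dates)

# Dividing a timedelta Series by a one-day timedelta gives days as floats
age_days_float = mean_age / np.timedelta64(1, "D")

# Cast to integers if fractional days are not needed
age_days_int = age_days_float.astype("int64")
```

The whole Series can be divided at once, so the apply with a lambda is not required for this to work.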
I hope this will help someone in the future!