Expand Pandas date range

Question:

I have data that looks like this. Each row represents a value of that ID at some date.

ID   Date         Value
A    2012-01-05   50
A    2012-01-08   100
A    2012-01-10   200
B    2012-07-01   10
B    2012-07-03   20

I need to expand this so that I have rows for all days. The value of each day should be the value of the day before (i.e., think of the data above as updates of values, and the data below as a timeseries of values).

ID   Date         Value
A    2012-01-05   50
A    2012-01-06   50
A    2012-01-07   50
A    2012-01-08   100
A    2012-01-09   100
A    2012-01-10   200
B    2012-07-01   10
B    2012-07-02   10
B    2012-07-03   20

Currently, I have a solution that amounts to the following:

  • Group by ID
  • For each group, figure out the min and max date
  • Create a pd.date_range
  • Iterate simultaneously through the rows and through the date range, filling the values in the date range and incrementing the index-pointer to the rows if necessary
  • Append all these date ranges to a final dataframe

It works, but seems like a pretty bad bruteforce solution. I wonder if there’s a better approach supported by Pandas?

Asked By: gberger

||

Answers:

Using resample on Date indexed dataframe with ID groups and ffill on value

In [1725]: df.set_index('Date').groupby('ID').resample('1D')['Value'].ffill().reset_index()
Out[1725]:
  ID       Date  Value
0  A 2012-01-05     50
1  A 2012-01-06     50
2  A 2012-01-07     50
3  A 2012-01-08    100
4  A 2012-01-09    100
5  A 2012-01-10    200
6  B 2012-07-01     10
7  B 2012-07-02     10
8  B 2012-07-03     20
Answered By: Zero

Or you can try this one (Notice : this can be used for expend numeric column too ).

df.Date=pd.to_datetime(df.Date)
df=df.set_index(df.Date)
df.set_index(df.Date).groupby('ID')
   .apply(lambda x : x.reindex(pd.date_range(min(x.index), max(x.index),freq='D')))
     .ffill().reset_index(drop=True)

Out[519]: 
  ID       Date  Value
0  A 2012-01-05   50.0
1  A 2012-01-05   50.0
2  A 2012-01-05   50.0
3  A 2012-01-08  100.0
4  A 2012-01-08  100.0
5  A 2012-01-10  200.0
6  B 2012-07-01   10.0
7  B 2012-07-01   10.0
8  B 2012-07-03   20.0
Answered By: BENY

This format is not working. It said code cannot reindex on an axis with duplicate labels.Here Picture

Answered By: SHRADDHA VASA
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.