Get previous and next index values in DataFrame should they exist

Question:

Suppose I have a DataFrame

import numpy as np
import pandas as pd
df = pd.DataFrame(dict(vals=np.random.randint(0, 10, 10)),
                  index=pd.date_range('20170401', '20170410'))

>>> df
               vals
2017-04-01     9
2017-04-02     8
2017-04-03     4
2017-04-04     5
2017-04-05     9
2017-04-06     9
2017-04-07     5
2017-04-08     3
2017-04-09     3
2017-04-10     1

and a particular date which I know is in my index but do not know the position of, for example

cur_dt = df.index[np.random.randint(0, df.index.size)]

>>> cur_dt
Timestamp('2017-04-05 00:00:00', freq='D')

Given cur_dt, I want to determine what the previous and next values in my index are. Should cur_dt be the first (last) value in my index, then the previous (next) element should be cur_dt itself.

To recap, my question is, what is the easiest way to find the previous and next value in my index (or my current value itself if it is an endpoint) given my current value?


My current approach seems rather roundabout, which is my motivation for asking.

cur_iloc = df.index.get_loc(cur_dt)
prev = cur_dt if cur_iloc == 0 else df.index[cur_iloc-1]
next = cur_dt if cur_iloc == df.index.size-1 else df.index[cur_iloc+1]

>>> prev
Timestamp('2017-04-04 00:00:00', freq='D')
>>> next
Timestamp('2017-04-06 00:00:00', freq='D')

If there’s no more straightforward way after all then my apologies. I’m imagining being able to just “shift” my index from my current value once forwards and once backwards (with some nice treatment for endpoints), but am not sure if this is possible.

Asked By: Eric Hansen

Answers:

Assuming that the index is sorted, you can use numpy.searchsorted:

Source data:

In [185]: df
Out[185]:
            vals
2017-04-01     5
2017-04-02     3
2017-04-03     9
2017-04-04     8
2017-04-05     1
2017-04-06     0
2017-04-07     4
2017-04-08     5
2017-04-09     1
2017-04-10     8

In [186]: cur_dt
Out[186]: Timestamp('2017-04-02 00:00:00', freq='D')

Solution:

In [187]: idx = np.searchsorted(df.index, cur_dt)

In [188]: df.index[max(0, idx-1)]
Out[188]: Timestamp('2017-04-01 00:00:00', freq='D')

In [189]: df.index[min(idx+1, len(df)-1)]
Out[189]: Timestamp('2017-04-03 00:00:00', freq='D')
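
The steps above can be wrapped in a small helper. This is only a minimal sketch: it assumes a sorted, unique index that actually contains the key, the function name neighbours is hypothetical, and the sample frame uses fixed values instead of random ones so the result is reproducible.

```python
import numpy as np
import pandas as pd

def neighbours(index, key):
    """Return (previous, next) labels around key, clamped at the endpoints.

    Assumes index is sorted and unique and that key is present in it.
    """
    pos = index.searchsorted(key)            # position of key in the sorted index
    prev_pos = max(pos - 1, 0)               # stay at 0 for the first label
    next_pos = min(pos + 1, len(index) - 1)  # stay at the end for the last label
    return index[prev_pos], index[next_pos]

df = pd.DataFrame({"vals": range(10)},
                  index=pd.date_range("20170401", "20170410"))

prev_dt, next_dt = neighbours(df.index, pd.Timestamp("2017-04-05"))
first_prev, _ = neighbours(df.index, df.index[0])
_, last_next = neighbours(df.index, df.index[-1])
```

If the key might be missing from the index, searchsorted returns an insertion point rather than the key's position, so the "previous" label computed this way would need separate handling.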

Reset your index, then use a boolean mask to find the position of cur_dt and clamp the neighbouring positions to the valid range:

df = df.reset_index()
# Position of cur_dt in the former index, which is now the 'index' column
cur_pos = df.index[df['index'] == cur_dt][0]
prev_dt = df['index'].iloc[max(cur_pos - 1, 0)]
next_dt = df['index'].iloc[min(cur_pos + 1, df.shape[0] - 1)]
Answered By: Grr

Create a new Series ts with the same index as df whose values are the index labels themselves, make sure it is sorted, and then shift it by 1. Each entry of ts then holds the previous index label (NaT for the first one).

ts = pd.Series(df.index, index=df.index).sort_index().shift(1)

(This might be slower if you only need to find the previous index once, but is faster if you need to do this multiple times.)
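
As a rough illustration of the same idea extended to both directions, the sketch below builds one lookup Series for the previous label and one for the next. The names labels, prev_ts and next_ts are hypothetical, and filling the endpoints with the label itself is an assumption taken from the behaviour the question asks for.

```python
import pandas as pd

index = pd.date_range("20170401", "20170410")
labels = pd.Series(index, index=index).sort_index()

# shift(1) moves values down, so each row holds the previous label;
# shift(-1) moves values up, so each row holds the next label.
# The endpoints become NaT and are filled with the row's own label.
prev_ts = labels.shift(1).fillna(labels)
next_ts = labels.shift(-1).fillna(labels)

cur_dt = pd.Timestamp("2017-04-05")
prev_dt = prev_ts[cur_dt]
next_dt = next_ts[cur_dt]
```

Once the two Series exist, every further lookup is a plain label access, which is where the repeated-use advantage mentioned above comes from.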

Answered By: Sucharit Sarkar
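
def get_next_idx(df, current_idx):
    # Keep rows from current_idx onwards (truncate needs a sorted index),
    # then drop the first row, which is current_idx itself
    after = df.truncate(before=current_idx).iloc[1:]
    return after.index[0] if len(after) > 0 else None

def get_prev_idx(df, current_idx):
    # Keep rows up to current_idx, then drop the last row (current_idx itself)
    before = df.truncate(after=current_idx).iloc[:-1]
    return before.index[-1] if len(before) > 0 else None

# Fall back to cur_dt itself at either endpoint
print(get_next_idx(df, cur_dt) or cur_dt)
print(get_prev_idx(df, cur_dt) or cur_dt)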
Answered By: MathKid