Most efficient way to check if timestamp in list exists between two other timestamps?

Question:

I am trying to search a list of datetimes to check if there is a timestamp C that exists between timestamp A and B. I found the bisect stdlib but am not sure how to apply it here with datetime types.

My setup is similar to this:

insulin_timestamps = ['2017/07/01 13:23:42', '2017/11/01 00:56:40', '2018/02/18 22:01:09']

start_time = '2017/10/31 22:00:11'
end_time = '2017/11/01 01:59:40'

I want to check if a timestamp in the list exists between my two variable times. I am doing this multiple times in a loop. The only solution I can come up with is another loop within it that checks the whole list. I would rather not do that for efficiency purposes as it is a lot of data. Any tips?

Asked By: Tianna Wrona


Answers:

Since your datetime values are in a fixed-width, zero-padded Y/m/d H:i:s format, they sort correctly under plain string comparison. You can therefore use the bisect module (assuming insulin_timestamps is sorted) to efficiently check whether any timestamp in insulin_timestamps lies between start_time and end_time:

from bisect import bisect_left, bisect_right

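# sti is the index of the first timestamp >= start_time; the second search starts from there
sti = bisect_left(insulin_timestamps, start_time)
present = sti < bisect_right(insulin_timestamps, end_time, sti)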
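
Since the question asks about datetime types specifically, here is a minimal sketch of the same idea with parsed datetime objects, which bisect handles because datetimes support ordering. The format string FMT and the helper parse are assumptions based on the sample data, not part of the original answer.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime

FMT = "%Y/%m/%d %H:%M:%S"  # assumed format, matching the sample strings


def parse(ts):
    """Hypothetical helper: convert one timestamp string to a datetime."""
    return datetime.strptime(ts, FMT)


insulin_timestamps = ['2017/07/01 13:23:42', '2017/11/01 00:56:40', '2018/02/18 22:01:09']

# Parse once, outside any loop; sort in case the input is not already ordered.
insulin_times = sorted(parse(ts) for ts in insulin_timestamps)

start_time = parse('2017/10/31 22:00:11')
end_time = parse('2017/11/01 01:59:40')

lo = bisect_left(insulin_times, start_time)   # first index with value >= start_time
hi = bisect_right(insulin_times, end_time)    # first index with value > end_time
present = lo < hi                             # non-empty slice means a match exists
print(present)  # True
```

Parsing is only needed if the strings are not already in a sortable format; with the format shown in the question, comparing the raw strings gives the same answer.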

Note that if your outer loop iterates over start and end times in sorted order, it might be more efficient to walk through the timestamps list in the same pass rather than searching it from scratch each time.
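
One way to read that suggestion is a single forward sweep over both sequences. The sketch below is an assumption about what is meant, not the answerer's code: the function name intervals_with_timestamp is hypothetical, and it requires both the timestamps and the (start, end) pairs to be sorted (the pairs by start time).

```python
def intervals_with_timestamp(timestamps, intervals):
    """For each (start, end) pair, report whether some timestamp t
    satisfies start <= t <= end.

    Assumes `timestamps` is sorted and `intervals` is sorted by start.
    """
    results = []
    i = 0
    n = len(timestamps)
    for start, end in intervals:
        # Skip timestamps that fall before this interval. Because starts are
        # non-decreasing, the pointer never has to move backwards.
        while i < n and timestamps[i] < start:
            i += 1
        # timestamps[i], if it exists, is the first one >= start.
        results.append(i < n and timestamps[i] <= end)
    return results


insulin_timestamps = ['2017/07/01 13:23:42', '2017/11/01 00:56:40', '2018/02/18 22:01:09']
windows = [
    ('2017/06/01 00:00:00', '2017/06/30 23:59:59'),
    ('2017/10/31 22:00:11', '2017/11/01 01:59:40'),
    ('2018/03/01 00:00:00', '2018/03/02 00:00:00'),
]
print(intervals_with_timestamp(insulin_timestamps, windows))  # [False, True, False]
```

The sweep touches each timestamp at most once across all windows, so the total cost is linear in the two lengths combined. If the windows arrive in arbitrary order, the bisect approach is the simpler choice.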

Answered By: Nick

If you just want to check whether a timestamp exists between start_time and end_time, and the list is sorted, then you only need to look at the first timestamp at or after start_time (if there is one). If that timestamp is less than end_time, the result is true. This adds a few checks to the code but roughly halves the search work by removing the need for bisect_right. The code would look as follows:

from bisect import bisect_left

left_index = bisect_left(insulin_timestamps, start_time)
present = (
    len(insulin_timestamps) != 0 # if the list is empty the result is False
    and left_index != len(insulin_timestamps)
    and insulin_timestamps[left_index] < end_time
)

Here left_index != len(insulin_timestamps) checks that start_time is not greater than every element in the list: if it is, then the result is False.
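
For reuse inside a loop, the single-bisect check can be wrapped in a small function. This is a sketch under stated assumptions: the name has_timestamp_between is hypothetical, and it uses <= so that end is inclusive (matching the bisect_right version in the other answer), whereas the snippet above uses < and therefore excludes a timestamp exactly equal to end_time.

```python
from bisect import bisect_left


def has_timestamp_between(sorted_timestamps, start, end):
    """Return True if some t in sorted_timestamps satisfies start <= t <= end."""
    i = bisect_left(sorted_timestamps, start)  # first index with value >= start
    # i == len(...) covers both an empty list and a start past the last element.
    return i < len(sorted_timestamps) and sorted_timestamps[i] <= end


insulin_timestamps = ['2017/07/01 13:23:42', '2017/11/01 00:56:40', '2018/02/18 22:01:09']

print(has_timestamp_between(insulin_timestamps, '2017/10/31 22:00:11', '2017/11/01 01:59:40'))  # True
print(has_timestamp_between(insulin_timestamps, '2018/03/01 00:00:00', '2018/03/02 00:00:00'))  # False
print(has_timestamp_between([], '2017/10/31 22:00:11', '2017/11/01 01:59:40'))                  # False
```

Change <= to < if a timestamp exactly equal to end should not count.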

Answered By: Kraigolas