Pandas dataframe – find last timestamp with valid values

Question:

I have a pandas dataframe in which the index is the timestamp and I have a column that contains a value per timestamp, like this:

                           Values
timestamp
2022-03-17 13:21:00+00:00    15.2
2022-03-22 13:24:00+00:00    17.8
2022-03-27 13:27:00+00:00     NaN
2022-03-30 13:30:00+00:00     NaN

The Values column sometimes contains a number and sometimes NaN.

I am trying to get a new dataframe that contains the values from the last week, for which I am using the following code:

dataW = data.loc[(pd.Timestamp.utcnow() - pd.Timedelta(days=7)):pd.Timestamp.utcnow()]

This works fine, except when the data from the last week happens to be all NaN: then I get an error.
To solve this, I would like dataW to contain the data from the seven days leading up to the last timestamp at which Values is not NaN. That is, for the example dataframe above, instead of getting the data for

2022-03-30 13:30:00+00:00 - 7 days

I would like to get the data for

2022-03-22 13:24:00+00:00 - 7 days

Does anybody have an idea of how I could do this?

Asked By: Sara.SP92


Answers:

You can use last_valid_index, which returns the index label of the last non-NaN value:

last = data['Values'].last_valid_index()
# or to consider all columns
# last = data.last_valid_index()

data.loc[last - pd.Timedelta(days=7):last]

output:

                           Values
timestamp                        
2022-03-17 13:21:00+00:00    15.2
2022-03-22 13:24:00+00:00    17.8

Here, last is Timestamp('2022-03-22 13:24:00+0000', tz='UTC').
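
For anyone who wants to try this end to end, here is a minimal, self-contained sketch that rebuilds the example frame from the question and applies the same idea. The helper name last_valid_window and its parameters are hypothetical (not part of pandas), and the guard for an all-NaN column is an assumption about how that case should be handled: last_valid_index returns None there, so the sketch returns an empty frame instead of failing on the subtraction.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question with a UTC-aware DatetimeIndex.
index = pd.to_datetime(
    [
        "2022-03-17 13:21:00+00:00",
        "2022-03-22 13:24:00+00:00",
        "2022-03-27 13:27:00+00:00",
        "2022-03-30 13:30:00+00:00",
    ],
    utc=True,
).rename("timestamp")
data = pd.DataFrame({"Values": [15.2, 17.8, np.nan, np.nan]}, index=index)


def last_valid_window(df, column="Values", days=7):
    """Hypothetical helper: rows in the `days` days up to the last non-NaN value."""
    last = df[column].last_valid_index()
    if last is None:
        # Assumption: an empty or all-NaN column should yield an empty frame.
        return df.iloc[0:0]
    # Label-based slicing on a sorted DatetimeIndex includes both endpoints.
    return df.loc[last - pd.Timedelta(days=days):last]


dataW = last_valid_window(data)
print(dataW)
```

Note that the window can still contain NaN rows if they fall between its start and the last valid timestamp; chaining .dropna() would remove them if that is not wanted.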

Answered By: mozway