Assign Date to Missing Cell Based on Dataframe

Question:

I have a dataframe as below. My question is, i want to assign the missing dates based on the first day of the Month available in the dataset. The dates will only be for a single month per file. It wont be multiple months in a single file. One cell will always have a date.

Please note that the files could be from a previous month so i cant use Today’s date. I have used

fillna(method='ffill')

But this only works if the first cell has a date in it. If its blank itll make the rest blank as well.

EDIT: The filled dates can be the first of the month or ANY previously populated dates in the column. As long as they are from the same month, it should be fine.

Sample Data:

Other Columns Date Invoice Other Columns
ABC232
5-27-2022 ABC232
5-27-2022 ANBFN2323
SADNF343
1232HHH

Expected Output:

Other Columns Date Invoice Other Columns
5-01-2022 ABC232
5-27-2022 ABC232
5-27-2022 ANBFN2323
5-01-2022 SADNF343
5-01-2022 1232HHH

Please let me know if you have any questions. I cant find anything to even get me started.

Thanks

Asked By: Waleed Mahmood

||

Answers:

import re

# Get the index of the first non-NA Date
idx = df["Date"].first_valid_index()
# Replace the day with 01
fill_value = re.sub(r"(d+)-(d+)-(d+)", r"1-01-3", df.loc[idx, "Date"])
# Fill
df["Date"] = df["Date"].fillna(fill_value)
Answered By: Code Different
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.