Python: sorting time-interval data into two-day chunks based on an index event

Question:

I have the following data:

df = 
id date_medication medication index_date
1  2000-01-01      A          2000-01-04
1  2000-01-02      A          2000-01-04
1  2000-01-05      B          2000-01-04
1  2000-01-06      B          2000-01-04
2  2000-01-01      A          2000-01-05
2  2000-01-03      B          2000-01-05
2  2000-01-06      A          2000-01-05
2  2000-01-10      B          2000-01-05

and I would like to transform the data into two-day chunks around the index event (IE), that is, create new columns representing the time intervals, such as:

df =
id -4 -2 0  2  4  6 
1  A  A  IE B  0  0
2  A  B  IE A  A  B
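
To reproduce this, the input frame above could be built roughly as follows. This is a minimal sketch that keeps the dates as plain strings, exactly as they appear in the table:

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2, 2, 2],
    "date_medication": ["2000-01-01", "2000-01-02", "2000-01-05", "2000-01-06",
                        "2000-01-01", "2000-01-03", "2000-01-06", "2000-01-10"],
    "medication": ["A", "A", "B", "B", "A", "B", "A", "B"],
    "index_date": ["2000-01-04"] * 4 + ["2000-01-05"] * 4,
})
```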

Answers:

Use:

import pandas as pd

#convert columns to datetimes
df['date_medication'] = pd.to_datetime(df['date_medication'])
df['index_date'] = pd.to_datetime(df['index_date'])

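#get 2-day chunks: day offset from the index date, floored to a multiple of 2
s = df['date_medication'].sub(df['index_date']).dt.days // 2 * 2
#shift values greater than or equal to 0 up by 2, leaving 0 free for the IE column
s.loc[s.ge(0)] += 2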

#pivoting columns
df1 = df.assign(g = s).pivot(index='id', columns='g', values='medication')
#add the 0 column for the index event
df1.loc[:, 0] = 'IE'
#add any missing 2-day columns in order and fill the gaps with 0
df1 = (df1.rename_axis(columns=None)
         .reindex(columns=range(df1.columns.min(), df1.columns.max() + 2, 2), fill_value=0)
         .fillna(0)
         .reset_index())
print (df1)
   id -4 -2   0  2  4  6
0   1  A  A  IE  B  B  0
1   2  A  B  IE  A  0  B

Details:

print (s)
0   -4
1   -2
2    2
3    4
4   -4
5   -2
6    2
7    6
dtype: int64
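
The chunking arithmetic can be checked on its own. The sketch below applies the same two steps to a hypothetical series of day offsets (the name `offsets` and its values are made up for illustration, not taken from the question):

```python
import pandas as pd

# Hypothetical day offsets relative to the index date (negative = before it)
offsets = pd.Series([-4, -3, -2, -1, 0, 1, 2, 3, 5])

# Floor division rounds each offset down to a multiple of 2
chunks = offsets // 2 * 2
# Shift the non-negative chunks up by 2 so that label 0 stays free for 'IE'
chunks.loc[chunks.ge(0)] += 2

print(chunks.tolist())
# [-4, -4, -2, -2, 2, 2, 4, 4, 6]
```

Note that an offset of 0 (a medication given on the index date itself) ends up in chunk 2 under this scheme.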
Answered By: jezrael

You can use:

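import numpy as np
import pandas as pd

# Compute the offset in days from the index date
# (assumes both columns were already converted with pd.to_datetime)
days = df['date_medication'].sub(df['index_date']).dt.days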

# Create desired bins and labels
bins = np.arange(days.min() - days.min() % 2, days.max() + days.max() % 2 + 1, 2)
lbls = bins[bins != 0]  # Exclude 0
df['interval'] = pd.cut(days, bins, labels=lbls, include_lowest=True, right=False)

# Reshape your dataframe
out = (df.pivot(index='id', columns='interval', values='medication')
         .reindex(bins, fill_value='IE', axis=1).fillna(0)
         .rename_axis(columns=None).reset_index())

Output:

>>> out
   id -4 -2   0  2  4  6
0   1  A  A  IE  B  B  0
1   2  A  B  IE  A  0  B
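
To see how the half-open bins map offsets to labels, here is a small sketch on hypothetical offsets. The bin edges are hard-coded to the values the `np.arange` call produces for the sample data, and `include_lowest` is left out for brevity:

```python
import numpy as np
import pandas as pd

# Hypothetical day offsets, chosen to hit every bin once
days = pd.Series([-3, -2, 1, 2, 5])

bins = np.arange(-4, 7, 2)   # edges: -4, -2, 0, 2, 4, 6
lbls = bins[bins != 0]       # five labels for five bins: -4, -2, 2, 4, 6

# right=False makes each bin closed on the left: [-4, -2), [-2, 0), [0, 2), ...
interval = pd.cut(days, bins, labels=lbls, right=False)
print(interval.tolist())

# An offset equal to the last edge falls outside the final bin [4, 6)
on_edge = pd.cut(pd.Series([6]), bins, labels=lbls, right=False)
print(on_edge.isna().all())
```

Because the last bin excludes its right edge, data whose largest offset is even may need one extra bin edge. This does not affect the sample data, where the largest offset is 5.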
Answered By: Corralien