Forward fill or backfill of a string column with resample() in pandas
Question:
Is there a way to ffill() or bfill() an object (string) column while resampling with resample()?
Suppose we have:
Date | Sort | Value |
---|---|---|
2022-10-23 15:40:41 | A | 1 |
2022-10-23 18:43:13 | B | 2 |
2022-10-24 15:40:41 | C | 3 |
2022-10-24 18:43:13 | D | 4 |
I would like to get the following result with:
df.resample("15min").mean()
Date | Sort | Value |
---|---|---|
2022-10-23 15:45:00 | A | 1 |
2022-10-23 16:00:00 | A | 1 |
2022-10-23 16:15:00 | A | 1 |
2022-10-23 16:35:00 | A | 1 |
… | … | … |
2022-10-23 18:00:00 | D | 1 |
2022-10-23 18:15:00 | D | 1 |
2022-10-23 18:30:00 | D | 1 |
2022-10-23 18:45:00 | D | 1 |
but it always drops the "Sort" column.
It would be nice if anyone here could help!
Best,
M.
Answers:
You can specify the aggregation functions for your columns separately, for example:
df = df.resample("15min").agg({"Sort": "min", "Value": "mean"}).ffill()
Output:
Sort Value
Date
2022-10-23 15:30:00 A 1.0
2022-10-23 15:45:00 A 1.0
2022-10-23 16:00:00 A 1.0
2022-10-23 16:15:00 A 1.0
2022-10-23 16:30:00 A 1.0
... ... ...
2022-10-24 17:30:00 C 3.0
2022-10-24 17:45:00 C 3.0
2022-10-24 18:00:00 C 3.0
2022-10-24 18:15:00 C 3.0
2022-10-24 18:30:00 D 4.0
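For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the per-column aggregation. It assumes Date has been parsed to datetimes and set as the index (the question does not show how the frame was built), and the variable name resampled is only for illustration. The point is that mean() is only defined for numeric data, so the string column needs its own aggregation before the empty bins can be forward filled.

```python
import pandas as pd

# Sample data from the question; building the frame this way is an assumption.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            [
                "2022-10-23 15:40:41",
                "2022-10-23 18:43:13",
                "2022-10-24 15:40:41",
                "2022-10-24 18:43:13",
            ]
        ),
        "Sort": ["A", "B", "C", "D"],
        "Value": [1, 2, 3, 4],
    }
).set_index("Date")  # resample() needs a datetime-like index (or on="Date")

# Aggregate each column with a function that suits its dtype, then
# forward fill the 15-minute bins that contained no rows.
resampled = (
    df.resample("15min")
    .agg({"Sort": "min", "Value": "mean"})
    .ffill()
)

print(resampled.head())
```

Bins that contain no rows come out of agg() as missing values in both columns, and the trailing ffill() is what carries the last seen Sort and Value forward into them.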
If you need the first value per bin for Sort and the mean per bin for Value, followed by forward filling, use:
df = df.resample("15min").agg({'Sort':'first', 'Value':'mean'}).ffill()
print(df)
Sort Value
Date
2022-10-23 15:30:00 A 1.0
2022-10-23 15:45:00 A 1.0
2022-10-23 16:00:00 A 1.0
2022-10-23 16:15:00 A 1.0
2022-10-23 16:30:00 A 1.0
... ...
2022-10-24 17:30:00 C 3.0
2022-10-24 17:45:00 C 3.0
2022-10-24 18:00:00 C 3.0
2022-10-24 18:15:00 C 3.0
2022-10-24 18:30:00 D 4.0
[109 rows x 2 columns]
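The question also mentions bfill(). A possible variant, sketched under the same assumptions about how the frame is built (Date parsed and set as the index, names such as backfilled chosen only for illustration), takes the last value per bin and back fills, so that empty bins pick up the next observed row instead of the previous one.

```python
import pandas as pd

# Sample data from the question; the construction itself is an assumption.
df = pd.DataFrame(
    {
        "Sort": ["A", "B", "C", "D"],
        "Value": [1, 2, 3, 4],
    },
    index=pd.DatetimeIndex(
        [
            "2022-10-23 15:40:41",
            "2022-10-23 18:43:13",
            "2022-10-24 15:40:41",
            "2022-10-24 18:43:13",
        ],
        name="Date",
    ),
)

# "last" keeps the final Sort seen in each bin; bfill() then copies the
# next non-missing row backwards into the empty bins.
backfilled = (
    df.resample("15min")
    .agg({"Sort": "last", "Value": "mean"})
    .bfill()
)

print(backfilled.head())
```

With this data every bin holds at most one row, so "first", "last" and "min" all pick the same Sort value; the choice only matters once several rows fall into one 15-minute bin.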