Add missing rows per group to fill a series

Question

I have the following df:

Name	Time Elapsed	Amount
A	2	$2
A	5	$1
A	7	$6
B	3	$3
B	5	$5

I would like to expand this per group so that I get all the times in between the min and max, and then fill downwards (assuming it’s sorted by name then by time elapsed) to produce the following output

Name	Time Elapsed	Amount
A	2	$2
A	3	$2
A	4	$2
A	5	$1
A	6	$1
A	7	$6
B	3	$3
B	4	$3
B	5	$5

I tried the following:

df = df(df.set_index('Time Elapsed')
          .groupby('Name')['Amount']
          .apply(lambda x: x.reindex(range(x.index.min(), x.index.max()+1))
          .ffill().fillna(0)).reset_index)

Feel free to offer a solution that does not use any of my code.

Asked By: lakadibo

||

Source

Answer 1

try this:

def func(g: pd.DataFrame):
    tmp = g.set_index('Time Elapsed')
    res = tmp.reindex(
        np.arange(tmp.index.min(), tmp.index.max()+1),
        method='ffill')
    return res

grouped = df.groupby('Name', as_index=False)
result = grouped.apply(func).reset_index().reindex(columns=df.columns)
print(result)
>>>
Name    Time   Elapsed   Amount
0       A      2         $2
1       A      3         $2
2       A      4         $2
3       A      5         $1
4       A      6         $1
5       A      7         $6
6       B      3         $3
7       B      4         $3
8       B      5         $5

Answered By: ziying35

Answer 2

One option is with complete from pyjanitor, to expose the missing rows:

# pip install pyjanitor
import pandas as pd
import janitor

# build a dictionary of all the possible times
# the key of the dictionary should be the column to be expanded
times = {"Time Elapsed" : lambda df: range(df.min(), df.max() + 1)}
df.complete(times, by = 'Name').ffill()
  Name  Time Elapsed Amount
0    A             2     $2
1    A             3     $2
2    A             4     $2
3    A             5     $1
4    A             6     $1
5    A             7     $6
6    B             3     $3
7    B             4     $3
8    B             5     $5

Answered By: sammywemmy

Add missing rows per group to fill a series

Question:

Answers: