Add missing rows per group to fill a series

Question:

I have the following df:

Name Time Elapsed Amount
A 2 $2
A 5 $1
A 7 $6
B 3 $3
B 5 $5

I would like to expand this per group so that I get every time between the group's min and max, and then forward-fill the missing values (assuming the frame is sorted by name, then by time elapsed) to produce the following output:

Name Time Elapsed Amount
A 2 $2
A 3 $2
A 4 $2
A 5 $1
A 6 $1
A 7 $6
B 3 $3
B 4 $3
B 5 $5

I tried the following:

df = (df.set_index('Time Elapsed')
        .groupby('Name')['Amount']
        .apply(lambda x: x.reindex(range(x.index.min(), x.index.max() + 1))
                          .ffill().fillna(0))
        .reset_index())

Feel free to offer a solution that does not use any of my code.

Asked By: lakadibo


Answers:

Try this:

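import numpy as np
import pandas as pd

def func(g: pd.DataFrame) -> pd.DataFrame:
    # Index by time so the group can be reindexed over its full range
    tmp = g.set_index('Time Elapsed')
    # Every integer time from the group's min to its max; forward-fill the gaps
    res = tmp.reindex(
        np.arange(tmp.index.min(), tmp.index.max() + 1),
        method='ffill')
    return res

grouped = df.groupby('Name', as_index=False)
# reset_index turns 'Time Elapsed' back into a column; reindex restores the column order
result = grouped.apply(func).reset_index().reindex(columns=df.columns)
print(result)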

Output:

  Name  Time Elapsed Amount
0    A             2     $2
1    A             3     $2
2    A             4     $2
3    A             5     $1
4    A             6     $1
5    A             7     $6
6    B             3     $3
7    B             4     $3
8    B             5     $5

Answered By: ziying35
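
For anyone who wants to run the approach above end to end, here is a minimal, self-contained sketch. It rebuilds the question's frame and applies the same reindex-with-ffill step, but loops over the groups and concatenates the pieces instead of calling groupby.apply. That swap is my own variation, not part of the answer: as far as I know, how apply treats the grouping column has changed between pandas releases, and an explicit loop sidesteps the difference. The helper name fill_group is hypothetical.

```python
import numpy as np
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    "Name": ["A", "A", "A", "B", "B"],
    "Time Elapsed": [2, 5, 7, 3, 5],
    "Amount": ["$2", "$1", "$6", "$3", "$5"],
})

def fill_group(g: pd.DataFrame) -> pd.DataFrame:
    """Reindex one group over its full time range and forward-fill (hypothetical helper)."""
    tmp = g.set_index("Time Elapsed")
    full_range = np.arange(tmp.index.min(), tmp.index.max() + 1)
    # method="ffill" needs a sorted index, which the question guarantees
    filled = tmp.reindex(full_range, method="ffill")
    # Name the index explicitly so reset_index yields a "Time Elapsed" column
    return filled.rename_axis("Time Elapsed").reset_index()

# Iterating a groupby yields (key, sub-frame) pairs; each sub-frame keeps "Name"
pieces = [fill_group(g) for _, g in df.groupby("Name")]
result = pd.concat(pieces, ignore_index=True)[df.columns]
print(result)
```

Because every sub-frame still carries its Name column, the forward-fill restores it on the new rows along with Amount.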

One option is the complete function from pyjanitor, which exposes the missing rows so they can then be forward-filled:

# pip install pyjanitor
import pandas as pd
import janitor

# build a dictionary of all the possible times
# the key of the dictionary should be the column to be expanded
times = {"Time Elapsed" : lambda df: range(df.min(), df.max() + 1)}
df.complete(times, by = 'Name').ffill()
  Name  Time Elapsed Amount
0    A             2     $2
1    A             3     $2
2    A             4     $2
3    A             5     $1
4    A             6     $1
5    A             7     $6
6    B             3     $3
7    B             4     $3
8    B             5     $5
Answered By: sammywemmy
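
If installing pyjanitor is not an option, the same "expose the missing rows, then forward-fill" idea can be sketched in plain pandas. This is only an illustration under the question's assumptions (integer times, data sorted by name and time). The names bounds, grid and filled are mine, and this is not how complete is implemented.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    "Name": ["A", "A", "A", "B", "B"],
    "Time Elapsed": [2, 5, 7, 3, 5],
    "Amount": ["$2", "$1", "$6", "$3", "$5"],
})

# Per-group minimum and maximum time
bounds = df.groupby("Name")["Time Elapsed"].agg(["min", "max"])

# One row for every (Name, time) pair between each group's min and max
grid = pd.DataFrame(
    [(name, t) for name, lo, hi in bounds.itertuples() for t in range(lo, hi + 1)],
    columns=["Name", "Time Elapsed"],
)

# A left merge exposes the missing rows as NaN in "Amount"
filled = grid.merge(df, on=["Name", "Time Elapsed"], how="left")

# Forward-fill within each group so values never leak across names
filled["Amount"] = filled.groupby("Name")["Amount"].ffill()
print(filled)
```

The grouped ffill is a defensive choice. Since each group's first row always exists in the original data, a plain ffill over the whole frame gives the same result here.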