How do I extend a pandas DataFrame by repeating the last row?

Question:

I have a DataFrame, and would like to extend it by repeating the last row n times.

Example code:

import pandas as pd
import numpy as np
dates = pd.date_range('1/1/2014', periods=4)
df = pd.DataFrame(np.eye(4, 4), index=dates, columns=['A', 'B', 'C', 'D'])
n = 3
for i in range(n):
    # DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent call
    df = pd.concat([df, df[-1:]])

so df is

            A  B  C  D
2014-01-01  1  0  0  0
2014-01-02  0  1  0  0
2014-01-03  0  0  1  0
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1

Is there a better way to do this without the for loop?

Asked By: k107

Answers:

You could use nested concat operations: the inner one concatenates the last row 3 times, and the outer one concatenates that result with your original df:

In [181]:

dates = pd.date_range('1/1/2014', periods=4)
df = pd.DataFrame(np.eye(4, 4), index=dates, columns=['A', 'B', 'C', 'D'])
pd.concat([df,pd.concat([df[-1:]]*3)])
Out[181]:
            A  B  C  D
2014-01-01  1  0  0  0
2014-01-02  0  1  0  0
2014-01-03  0  0  1  0
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1

This could be put into a function like so:

In [182]:

def repeatRows(d, n=3):
    return pd.concat([d]*n)

pd.concat([df,repeatRows(df[-1:], 3)])
Out[182]:
            A  B  C  D
2014-01-01  1  0  0  0
2014-01-02  0  1  0  0
2014-01-03  0  0  1  0
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
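
The two steps above can also be folded into a single helper. This is only a minimal sketch: the name extend_with_last_row and the handling of an empty frame or n <= 0 are assumptions for illustration, not part of the answer.

```python
import numpy as np
import pandas as pd

def extend_with_last_row(frame, n=3):
    """Return a new frame with the last row of `frame` repeated n more times.

    Hypothetical helper; it only wraps the nested concat shown above.
    """
    if n <= 0 or frame.empty:
        # Nothing to repeat: hand back an unchanged copy (an assumed choice).
        return frame.copy()
    last_row = frame.iloc[[-1]]  # one-row DataFrame, keeps the index label
    return pd.concat([frame, pd.concat([last_row] * n)])

dates = pd.date_range('1/1/2014', periods=4)
df = pd.DataFrame(np.eye(4, 4), index=dates, columns=['A', 'B', 'C', 'D'])

extended = extend_with_last_row(df, 3)
print(extended.shape)  # (7, 4)
```

The original df is left untouched, since pd.concat returns a new object.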
Answered By: EdChum

Here’s an alternate (fancy indexing) way to do it:

# DataFrame.append was removed in pandas 2.0, so pd.concat is used here
pd.concat([df, df.iloc[[-1]*3]])

Out[757]: 
            A  B  C  D
2014-01-01  1  0  0  0
2014-01-02  0  1  0  0
2014-01-03  0  0  1  0
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
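
To see why this works: [-1]*3 is the plain Python list [-1, -1, -1], and iloc with a list of positions returns one row per entry, repeats included. A small sketch of the intermediate steps follows. It assumes the same df as in the question and uses pd.concat so that it also runs on pandas versions without DataFrame.append.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2014', periods=4)
df = pd.DataFrame(np.eye(4, 4), index=dates, columns=['A', 'B', 'C', 'D'])

positions = [-1] * 3           # [-1, -1, -1]
repeated = df.iloc[positions]  # three copies of the last row
result = pd.concat([df, repeated])

print(len(repeated), len(result))  # 3 7
```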
Answered By: JohnE

Another way, without explicit indexing or nested concat calls, is to use tail() and the unpack operator. Note that DataFrame.append is deprecated (and was removed in pandas 2.0), so pd.concat is preferred.

pd.concat([df, *[df.tail(1)]*3]) 

More generally, to repeat the block of the last n rows d times:

pd.concat([df, *[df.tail(n)]*d]) 

tail(n) returns the last n rows (n defaults to 5).
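
As a worked example of the general form, here is a small sketch with n=2 and d=3, assuming the same df as in the question. Note that the last n rows are repeated as a block (row 3, row 4, row 3, row 4, ...), not each row d times in a row.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2014', periods=4)
df = pd.DataFrame(np.eye(4, 4), index=dates, columns=['A', 'B', 'C', 'D'])

n, d = 2, 3
result = pd.concat([df, *[df.tail(n)] * d])

print(len(result))                   # 10 (4 original rows + 2 * 3 repeated rows)
print(result['D'].iloc[4:].tolist())  # [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
```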

The unpack operator (‘*’) expands a sequence or iterable into separate items, such as the positional arguments of a function call, for example:

def sum_var(a, b, c):
    return a + b + c

numbers = [1, 2, 3]

sum_result = sum_var(*numbers)
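
The same operator also works inside a list literal, which is how it is used in the pd.concat calls above. A small sketch with plain strings standing in for the DataFrames (the variable names are illustrative only):

```python
last = ['last']

# The list is multiplied first, and * then unpacks the resulting three-item list.
combined = ['first', *last * 3]

print(combined)  # ['first', 'last', 'last', 'last']
```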
Answered By: Rosario Scavo