# Python Pandas – How to create a dataframe from a sequence

## Question:

I’m trying to create a dataframe populated by repeating rows based on an existing steady sequence.

For example, if I had a sequence increasing in 3s from 6 to 18, the sequence could be generated using `np.arange(6, 18, 3)` to give `array([ 6,  9, 12, 15])`.

How would I go about generating a dataframe in this way?

How could I get the below if I wanted 6 repeated rows?

```
     0    1     2     3
0  6.0  9.0  12.0  15.0
1  6.0  9.0  12.0  15.0
2  6.0  9.0  12.0  15.0
3  6.0  9.0  12.0  15.0
4  6.0  9.0  12.0  15.0
5  6.0  9.0  12.0  15.0
6  6.0  9.0  12.0  15.0
```

The reason for creating this matrix is that I then wish to add a `pd.Series` row-wise to this matrix.

## Answers:

```
pd.DataFrame([np.arange(6, 18, 3)]*7)
```

Alternatively,

```
pd.DataFrame(np.repeat([np.arange(6, 18, 3)], 7, axis=0))
```

```
   0  1   2   3
0  6  9  12  15
1  6  9  12  15
2  6  9  12  15
3  6  9  12  15
4  6  9  12  15
5  6  9  12  15
6  6  9  12  15
```
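For reference, here is a self-contained sketch of the two one-liners above with their imports, followed by the row-wise addition the question mentions. The `offsets` Series and its values are hypothetical, and the sketch assumes "row-wise" means one Series value per row, which is what `DataFrame.add(..., axis=0)` does.

```python
import numpy as np
import pandas as pd

row = np.arange(6, 18, 3)  # the values 6, 9, 12, 15

# List multiplication: a Python list holding 7 references to the same row.
df_list = pd.DataFrame([row] * 7)

# np.repeat: builds a full (7, 4) array first, then wraps it in a DataFrame.
df_repeat = pd.DataFrame(np.repeat([row], 7, axis=0))

# Hypothetical Series with one value per row (index 0..6 matches the frame).
offsets = pd.Series([0, 10, 20, 30, 40, 50, 60])

# axis=0 aligns the Series with the row index, so offsets[i] is added
# to every column of row i.
shifted = df_list.add(offsets, axis=0)

print(df_list.shape)
print(shifted)
```

If the Series should instead be added across the columns of every row, `axis=1` (the default for `df + series`) aligns it with the column labels.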

Here is a solution using NumPy broadcasting, which avoids Python loops and lists as well as the extra memory allocation that `np.repeat` performs:

```
pd.DataFrame(np.broadcast_to(np.arange(6, 18, 3), (6, 4)))
```

To understand why this is more efficient than the other solutions, refer to the `np.broadcast_to()` docs (https://numpy.org/doc/stable/reference/generated/numpy.broadcast_to.html), which note that "more than one element of a broadcasted array may refer to a single memory location."

This means that no matter how many rows you create before passing to Pandas, you’re only really allocating a single row, then a 2D array which refers to the data of that row multiple times.
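The memory sharing can be checked at the NumPy level, before pandas is involved. This is a minimal sketch; the variable names `row`, `view`, and `copy` are only for illustration.

```python
import numpy as np

row = np.arange(6, 18, 3)

view = np.broadcast_to(row, (6, 4))  # read-only view, no data copied
copy = np.repeat([row], 6, axis=0)   # independent (6, 4) allocation

# A row stride of 0 means stepping to the next row does not move in memory:
# every row of the view maps to the same bytes.
print(view.strides[0])
print(copy.strides[0])

# The view shares its buffer with the original 1D array; the copy does not.
print(np.shares_memory(view, row))
print(np.shares_memory(copy, row))

# The broadcast view is read-only, because writing to one element would
# change that column in every row at once.
print(view.flags.writeable)
```

Because the view is read-only, code that needs to modify the result in place should make an explicit copy first, for example with `view.copy()`.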

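If you assign the above to `df`, you can see that `df.values.base` is a single row; this is the only storage required no matter how many rows appear in the DataFrame. This holds as long as pandas keeps a view of the input array rather than copying it, which can depend on the pandas version and settings.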