How to insert a pre-initialized dataframe or several columns into another dataframe at a specified column position?

Question

Suppose we have the following dataframe.

  col1 col2   col3
0  one  two  three
1  one  two  three
2  one  two  three
3  one  two  three
4  one  two  three

We seek to introduce 31 columns into this dataframe, each column representing a day in the month.

Let’s say we want to insert them precisely between columns col2 and col3.

How do we achieve this?

To make it simple, the introduced columns can be numbered from 1 to 31.

Starting source code

import pandas as pd

src = pd.DataFrame({'col1': ['one', 'one', 'one', 'one','one'],    
                    'col2': ['two', 'two', 'two', 'two','two'],    
                    'col3': ['three', 'three', 'three', 'three','three'],
                    })

Asked By: Laurent B.

Answer 1

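For illustrative purposes, one approach is to build the 31 columns as a separate dataframe, extend it to the length of src, and concatenate it between two slices of src: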

import pandas as pd
import numpy as np

src = pd.DataFrame({'col1': ['one', 'one', 'one', 'one','one'],    
                    'col2': ['two', 'two', 'two', 'two','two'],    
                    'col3': ['three', 'three', 'three', 'three','three'],
                    })

m = np.matrix([0]*31) # Builds a 1 x 31 numpy matrix of zeros
df = pd.DataFrame(m) # Converts the matrix to a one-row dataframe
df.columns = df.columns+1 # Shifts the column labels from 0..30 to 1..31

# Extends the dataframe to the length of src, resets the index and replaces NaN with 0
df = (df.reindex(list(range(0, len(src))))
        .reset_index(drop=True)
        .fillna(0))

df = pd.concat([src.iloc[:, :2], df, src.iloc[:, 2:]], axis=1) # Inserts the new columns by slicing src into two parts

Result

  col1 col2    1    2    3    4    5  ...   26   27   28   29   30   31   col3
0  one  two  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  three
1  one  two  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  three
2  one  two  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  three
3  one  two  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  three
4  one  two  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  three

[5 rows x 34 columns]

Answered By: Laurent B.

Answer 2

You can use pd.concat and split the columns of src with iloc, as shown below:

import numpy as np
import pandas as pd

# Create a dataframe of zeros with 31 columns and 5 rows (src is the dataframe from the question)
tmp = pd.DataFrame(np.zeros((5, 31)), columns=range(1, 32))

# Concatenate the first two columns of src, tmp, and the remaining columns of src
df = pd.concat([src.iloc[:,:2], tmp, src.iloc[:, 2:]], axis=1)

Output:

  col1 col2    1    2    3    4    5    6    7    8  ...   23   24   25   26  
0  one  two  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0   
1  one  two  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0   
2  one  two  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0   
3  one  two  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0   
4  one  two  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0   

    27   28   29   30   31   col3  
0  0.0  0.0  0.0  0.0  0.0  three  
1  0.0  0.0  0.0  0.0  0.0  three  
2  0.0  0.0  0.0  0.0  0.0  three  
3  0.0  0.0  0.0  0.0  0.0  three  
4  0.0  0.0  0.0  0.0  0.0  three  

[5 rows x 34 columns]
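
The same slicing idea can be wrapped in a small helper so that the insertion position is a parameter. This is only a sketch: the name insert_frame and its arguments are hypothetical, and it assumes the new dataframe has the same number of rows as the target.

```python
import numpy as np
import pandas as pd


def insert_frame(df, new, pos):
    """Return a new dataframe with the columns of `new` placed at position `pos`.

    Hypothetical helper for illustration; assumes len(new) == len(df).
    """
    # Give the new block the same index as df so that concat aligns row by row
    new = new.set_axis(df.index, axis=0)
    return pd.concat([df.iloc[:, :pos], new, df.iloc[:, pos:]], axis=1)


src = pd.DataFrame({'col1': ['one'] * 5,
                    'col2': ['two'] * 5,
                    'col3': ['three'] * 5})
days = pd.DataFrame(np.zeros((len(src), 31)), columns=range(1, 32))

result = insert_frame(src, days, 2)
print(result.shape)  # (5, 34)
```

Re-indexing the new block before concatenating matters when src does not have a default 0..n-1 index; otherwise pd.concat aligns on index labels and can introduce NaN rows.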

Answered By: I'mahdi

Answer 3

I would assign values to the original dataframe and reorder the columns using column selection.

src[list(range(1, 32))] = 0
src = src[[*src.columns[:2], *range(1, 32), src.columns[2]]]

Or, for an entirely new copy, use assign (the new column labels are strings here, because assign takes keyword arguments):

cols = list(map(str, range(1, 32)))
new_df = (
    src
    .assign(**dict.fromkeys(cols, 0))
    .reindex(columns=[*src.columns[:2], *cols, *src.columns[2:]])
)
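
To make the difference between the two variants visible, here is a small sketch that runs both on the question's src and checks the type of the resulting column labels. The names int_df and str_df are made up for this illustration, and the first variant works on a copy so that src itself stays unchanged.

```python
import pandas as pd

src = pd.DataFrame({'col1': ['one'] * 5,
                    'col2': ['two'] * 5,
                    'col3': ['three'] * 5})

# Variant 1: direct assignment, integer labels 1..31
int_df = src.copy()
int_df[list(range(1, 32))] = 0
int_df = int_df[[*src.columns[:2], *range(1, 32), *src.columns[2:]]]

# Variant 2: assign + reindex, string labels '1'..'31'
cols = [str(d) for d in range(1, 32)]
str_df = src.assign(**dict.fromkeys(cols, 0)).reindex(
    columns=[*src.columns[:2], *cols, *src.columns[2:]])

print(1 in int_df.columns, '1' in int_df.columns)  # True False
print(1 in str_df.columns, '1' in str_df.columns)  # False True
```

In other words, a day column is selected with int_df[1] in the first case and str_df['1'] in the second.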

Answered By: cottontail

Answer 4

If your purpose is to add and initialize new columns, use reindex:

cols = list(src)
cols[2:2] = range(1,31+1)

df = src.reindex(columns=cols, fill_value=0)

Output:


  col1 col2  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31   col3
0  one  two  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  three
1  one  two  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  three
2  one  two  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  three
3  one  two  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  three
4  one  two  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  three
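
If the insertion point is known by column name rather than by position, the index used in the slice assignment can be looked up first. A minimal sketch, assuming the column labels are unique (so that Index.get_loc returns a single integer) and that the new columns go immediately before col3:

```python
import pandas as pd

src = pd.DataFrame({'col1': ['one'] * 5,
                    'col2': ['two'] * 5,
                    'col3': ['three'] * 5})

cols = list(src.columns)
pos = src.columns.get_loc('col3')  # integer position of the column to insert before
cols[pos:pos] = range(1, 32)       # splice the day labels in at that position

df = src.reindex(columns=cols, fill_value=0)
print(df.columns[:4].tolist())     # ['col1', 'col2', 1, 2]
```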

Answered By: mozway

Answer 5

Another possible solution:

pd.concat([src.iloc[:, :2].assign(
    **{str(col): 0 for col in range(1, 32)}), src['col3']], axis=1)

Output:

  col1 col2  1  2  3  4  5  6  7  8  ...  23  24  25  26  27  28  29  30  31  
0  one  two  0  0  0  0  0  0  0  0  ...   0   0   0   0   0   0   0   0   0   
1  one  two  0  0  0  0  0  0  0  0  ...   0   0   0   0   0   0   0   0   0   
2  one  two  0  0  0  0  0  0  0  0  ...   0   0   0   0   0   0   0   0   0   
3  one  two  0  0  0  0  0  0  0  0  ...   0   0   0   0   0   0   0   0   0   
4  one  two  0  0  0  0  0  0  0  0  ...   0   0   0   0   0   0   0   0   0   

    col3  
0  three  
1  three  
2  three  
3  three  
4  three  

[5 rows x 34 columns]

Answered By: PaulS
