How to insert a pre-initialized dataframe or several columns into another dataframe at a specified column position?

Question:

Suppose we have the following dataframe.

  col1 col2   col3
0  one  two  three
1  one  two  three
2  one  two  three
3  one  two  three
4  one  two  three

We want to insert 31 columns into this dataframe, one for each day of the month.

The new columns should go exactly between columns col2 and col3.

How do we achieve this?

To keep it simple, the new columns can be labelled 1 to 31.

Starting source code

import pandas as pd

src = pd.DataFrame({'col1': ['one', 'one', 'one', 'one','one'],    
                    'col2': ['two', 'two', 'two', 'two','two'],    
                    'col3': ['three', 'three', 'three', 'three','three'],
                    })
Asked By: Laurent B.

Answers:

For illustrative purposes

import pandas as pd
import numpy as np

src = pd.DataFrame({'col1': ['one', 'one', 'one', 'one','one'],    
                    'col2': ['two', 'two', 'two', 'two','two'],    
                    'col3': ['three', 'three', 'three', 'three','three'],
                    })

m = np.zeros((1, 31), dtype=int) # Builds a 1-row, 31-column NumPy array (np.matrix is no longer recommended)
df = pd.DataFrame(m) # Converts the array to a dataframe
df.columns = df.columns+1 # Shifts the column labels from 0..30 to 1..31

# Operations on the dataframe: extend to len(src) rows, reset the index, replace NaN with 0
df = (df.reindex(list(range(0, len(src))))
        .reset_index(drop=True)
        .fillna(0))

df = pd.concat([src.iloc[:, :2], df, src.iloc[:, 2:]], axis=1) # inserts by slicing source in two parts

Result

  col1 col2    1    2    3    4    5  ...   26   27   28   29   30   31   col3
0  one  two  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  three
1  one  two  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  three
2  one  two  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  three
3  one  two  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  three
4  one  two  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  three

[5 rows x 34 columns]
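
The slicing idea works for any insertion position. A minimal sketch, assuming both frames share the same index; the helper name insert_frame_at and its parameters are hypothetical and not part of pandas:

```python
import pandas as pd


def insert_frame_at(left, block, pos):
    """Return a new frame with the columns of `block` placed at column position `pos` of `left`."""
    # Columns before `pos`, then the block, then the remaining columns
    return pd.concat([left.iloc[:, :pos], block, left.iloc[:, pos:]], axis=1)


src = pd.DataFrame({'col1': ['one'] * 5,
                    'col2': ['two'] * 5,
                    'col3': ['three'] * 5})

# 31 zero-filled day columns labelled 1..31, built on the same index as src
days = pd.DataFrame(0, index=src.index, columns=range(1, 32))

out = insert_frame_at(src, days, 2)
print(out.shape)  # (5, 34)
```

Because the block is created with index=src.index, no reindex or fillna step is needed and the zeros stay integers.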
Answered By: Laurent B.

You can use pd.concat and slice the source columns with iloc, as shown below:

import numpy as np
import pandas as pd

# Create a dataframe with 31 columns and one row per row of src (5 here)
tmp = pd.DataFrame(np.zeros((len(src), 31)), columns=range(1, 32))

# Concatenate the first two columns of src, tmp, and the remaining columns of src
df = pd.concat([src.iloc[:,:2], tmp, src.iloc[:, 2:]], axis=1)

Output:

  col1 col2    1    2    3    4    5    6    7    8  ...   23   24   25   26  
0  one  two  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0   
1  one  two  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0   
2  one  two  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0   
3  one  two  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0   
4  one  two  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0   

    27   28   29   30   31   col3  
0  0.0  0.0  0.0  0.0  0.0  three  
1  0.0  0.0  0.0  0.0  0.0  three  
2  0.0  0.0  0.0  0.0  0.0  three  
3  0.0  0.0  0.0  0.0  0.0  three  
4  0.0  0.0  0.0  0.0  0.0  three  

[5 rows x 34 columns]
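
One caveat with pd.concat(axis=1) is that it aligns rows on index labels, so the approach relies on src having the default 0..4 index. A minimal sketch of what happens otherwise, assuming a hypothetical src indexed 10..14 (that index is an illustration, not part of the question):

```python
import numpy as np
import pandas as pd

# Hypothetical variant of src with a non-default index
src = pd.DataFrame({'col1': ['one'] * 5,
                    'col2': ['two'] * 5,
                    'col3': ['three'] * 5},
                   index=[10, 11, 12, 13, 14])

# A block with the default RangeIndex (0..4) does not line up with src's labels,
# so the outer join on the index produces extra rows filled with NaN
misaligned = pd.DataFrame(np.zeros((len(src), 31)), columns=range(1, 32))
bad = pd.concat([src.iloc[:, :2], misaligned, src.iloc[:, 2:]], axis=1)

# Building the block on src.index keeps the rows aligned
tmp = pd.DataFrame(np.zeros((len(src), 31)), index=src.index, columns=range(1, 32))
df = pd.concat([src.iloc[:, :2], tmp, src.iloc[:, 2:]], axis=1)

print(bad.shape)  # (10, 34)
print(df.shape)   # (5, 34)
```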
Answered By: I'mahdi

I would assign values to the original dataframe and reorder the columns using column selection.

src[list(range(1, 32))] = 0
src = src[[*src.columns[:2], *range(1, 32), src.columns[2]]]

Or, to get an entirely new copy, use assign (the labels must be strings here because they are passed as keyword arguments):

cols = list(map(str, range(1, 32)))
new_df = (
    src
    .assign(**dict.fromkeys(cols, 0))
    .reindex(columns=[*src.columns[:2], *cols, *src.columns[2:]])
)
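
The two variants differ in the type of the new column labels, which matters when selecting a day later. A small sketch that runs both on copies of the question's src; the names a and b are only for illustration:

```python
import pandas as pd

src = pd.DataFrame({'col1': ['one'] * 5,
                    'col2': ['two'] * 5,
                    'col3': ['three'] * 5})

# Variant 1: integer labels 1..31, applied to a copy so src stays untouched
a = src.copy()
a[list(range(1, 32))] = 0
a = a[[*src.columns[:2], *range(1, 32), *src.columns[2:]]]

# Variant 2: string labels '1'..'31', since assign takes keyword arguments
cols = [str(day) for day in range(1, 32)]
b = (
    src
    .assign(**dict.fromkeys(cols, 0))
    .reindex(columns=[*src.columns[:2], *cols, *src.columns[2:]])
)

print(type(a.columns[2]))  # <class 'int'>
print(type(b.columns[2]))  # <class 'str'>
```

With the first variant a day is selected as a[15], with the second as b['15'].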

Answered By: cottontail

If your purpose is to add and initialize new columns, use reindex:

cols = list(src)
cols[2:2] = range(1,31+1)

df = src.reindex(columns=cols, fill_value=0)

Output:


  col1 col2  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31   col3
0  one  two  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  three
1  one  two  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  three
2  one  two  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  three
3  one  two  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  three
4  one  two  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  three
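
Since each column stands for a day of the month, the number of columns could also be derived from the calendar instead of being fixed at 31. A minimal sketch using the standard library's calendar.monthrange; the year and month values are hypothetical placeholders:

```python
import calendar

import pandas as pd

src = pd.DataFrame({'col1': ['one'] * 5,
                    'col2': ['two'] * 5,
                    'col3': ['three'] * 5})

# Hypothetical month chosen for illustration: February 2023
year, month = 2023, 2

# monthrange returns (weekday of the first day, number of days in the month)
n_days = calendar.monthrange(year, month)[1]

cols = list(src.columns)
cols[2:2] = range(1, n_days + 1)  # splice the day labels in at position 2

df = src.reindex(columns=cols, fill_value=0)
print(df.shape)  # (5, 31), i.e. 3 original columns + 28 days
```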
Answered By: mozway

Another possible solution:

pd.concat([src.iloc[:, :2].assign(
    **{str(col): 0 for col in range(1, 32)}), src['col3']], axis=1)

Output:

  col1 col2  1  2  3  4  5  6  7  8  ...  23  24  25  26  27  28  29  30  31  
0  one  two  0  0  0  0  0  0  0  0  ...   0   0   0   0   0   0   0   0   0   
1  one  two  0  0  0  0  0  0  0  0  ...   0   0   0   0   0   0   0   0   0   
2  one  two  0  0  0  0  0  0  0  0  ...   0   0   0   0   0   0   0   0   0   
3  one  two  0  0  0  0  0  0  0  0  ...   0   0   0   0   0   0   0   0   0   
4  one  two  0  0  0  0  0  0  0  0  ...   0   0   0   0   0   0   0   0   0   

    col3  
0  three  
1  three  
2  three  
3  three  
4  three  

[5 rows x 34 columns]
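
If the frame should be changed in place rather than rebuilt, DataFrame.insert is a different technique that places one column at a given position. A minimal sketch, assuming string labels as in the snippet above; it works on a copy named out (a name chosen for illustration):

```python
import pandas as pd

src = pd.DataFrame({'col1': ['one'] * 5,
                    'col2': ['two'] * 5,
                    'col3': ['three'] * 5})

out = src.copy()

# Insert in reverse order: each new column goes to position 2 and pushes
# the previously inserted ones to the right, so the final order is 1..31
for day in range(31, 0, -1):
    out.insert(2, str(day), 0)

print(out.shape)  # (5, 34)
```

This avoids hardcoding 'col3', at the cost of one insert call per column.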
Answered By: PaulS