How to insert a pre-initialized dataframe or several columns into another dataframe at a specified column position?
Question:
Suppose we have the following dataframe.
col1 col2 col3
0 one two three
1 one two three
2 one two three
3 one two three
4 one two three
We seek to introduce 31 columns into this dataframe, each column representing a day in the month.
Let's say we want to introduce them precisely between columns col2 and col3.
How do we achieve this?
To make it simple, the introduced columns can be numbered from 1 to 31.
Starting source code
import pandas as pd
src = pd.DataFrame({'col1': ['one', 'one', 'one', 'one','one'],
'col2': ['two', 'two', 'two', 'two','two'],
'col3': ['three', 'three', 'three', 'three','three'],
})
Answers:
For illustrative purposes
import pandas as pd
import numpy as np
src = pd.DataFrame({'col1': ['one', 'one', 'one', 'one','one'],
'col2': ['two', 'two', 'two', 'two','two'],
'col3': ['three', 'three', 'three', 'three','three'],
})
m = np.zeros((1, 31), dtype=int)  # Builds a 1 x 31 array of zeros (np.matrix is discouraged by NumPy)
df = pd.DataFrame(m)  # Converts the array to a one-row dataframe
df.columns = df.columns + 1  # Renumbers the columns from 1 to 31
# Extend to the length of src, reset the index, and replace NaN with 0
# (the NaN introduced by reindex turns the zeros into floats)
df = (df.reindex(list(range(0, len(src))))
.reset_index(drop=True)
.fillna(0))
df = pd.concat([src.iloc[:, :2], df, src.iloc[:, 2:]], axis=1)  # Inserts by slicing the source into two parts
Result
col1 col2 1 2 3 4 5 ... 26 27 28 29 30 31 col3
0 one two 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 three
1 one two 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 three
2 one two 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 three
3 one two 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 three
4 one two 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 three
[5 rows x 34 columns]
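The same layout can probably be reached in fewer steps by building the zero-filled block directly on the index of src. The sketch below is a variant, not the answerer's code: the name `days` is illustrative, and it assumes the `pd.DataFrame` constructor broadcasts a scalar over the given index and columns, which also keeps the zeros as integers rather than floats.

```python
import pandas as pd

src = pd.DataFrame({'col1': ['one'] * 5,
                    'col2': ['two'] * 5,
                    'col3': ['three'] * 5})

# Zero-filled block sharing src's index; columns labeled 1..31 (integers)
days = pd.DataFrame(0, index=src.index, columns=range(1, 32))

# Slice src around the insertion point and join the three pieces column-wise
df = pd.concat([src.iloc[:, :2], days, src.iloc[:, 2:]], axis=1)
```

Because `days` is created with the same index as src, no reindex or fillna step is needed before the concat.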
You can use pd.concat and slice the columns with iloc, as below:
import numpy as np
# Create a dataframe with 31 columns and 5 rows
tmp = pd.DataFrame(np.zeros((5, 31)), columns=range(1, 32))
# Concatenate the three pieces in the desired column order
df = pd.concat([src.iloc[:,:2], tmp, src.iloc[:, 2:]], axis=1)
Output:
col1 col2 1 2 3 4 5 6 7 8 ... 23 24 25 26
0 one two 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
1 one two 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
2 one two 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
3 one two 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
4 one two 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
27 28 29 30 31 col3
0 0.0 0.0 0.0 0.0 0.0 three
1 0.0 0.0 0.0 0.0 0.0 three
2 0.0 0.0 0.0 0.0 0.0 three
3 0.0 0.0 0.0 0.0 0.0 three
4 0.0 0.0 0.0 0.0 0.0 three
[5 rows x 34 columns]
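If the insertion position varies, the slice-and-concat idea can be wrapped in a small function. This is only a sketch: `insert_frame_at` is a hypothetical helper name, not a pandas API, and it assumes the block to insert already shares the index of the target dataframe.

```python
import pandas as pd

def insert_frame_at(frame, block, position):
    """Return a new dataframe with `block` placed before column number
    `position` of `frame`. Hypothetical helper; requires a shared index."""
    if not frame.index.equals(block.index):
        raise ValueError("frame and block must share the same index")
    return pd.concat(
        [frame.iloc[:, :position], block, frame.iloc[:, position:]],
        axis=1,
    )

src = pd.DataFrame({'col1': ['one'] * 5,
                    'col2': ['two'] * 5,
                    'col3': ['three'] * 5})

# Pre-initialized block of 31 float columns labeled 1..31
tmp = pd.DataFrame(0.0, index=src.index, columns=range(1, 32))

df = insert_frame_at(src, tmp, 2)  # insert between col2 and col3
```

The index check is a design choice: pd.concat aligns on the index, so a mismatched block would silently produce NaN rows instead of failing.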
I would assign values to the original dataframe and reorder the columns using column selection.
src[list(range(1, 32))] = 0
src = src[[*src.columns[:2], *range(1, 32), src.columns[2]]]
or, for an entirely new copy that leaves src unchanged, use assign (the new labels here are strings rather than integers):
cols = list(map(str, range(1, 32)))
new_df = (
src
.assign(**dict.fromkeys(cols, 0))
.reindex(columns=[*src.columns[:2], *cols, *src.columns[2:]])
)
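One detail worth checking with these two snippets is the type of the new column labels: assign takes keyword arguments, so its labels have to be strings, whereas the in-place version uses integers. The sketch below illustrates the difference; the names `by_int` and `by_str` are illustrative, and the integer-labeled frame is built with reindex only to keep the example short.

```python
import pandas as pd

src = pd.DataFrame({'col1': ['one'] * 5,
                    'col2': ['two'] * 5,
                    'col3': ['three'] * 5})

int_cols = list(range(1, 32))
str_cols = [str(c) for c in int_cols]

# Integer labels: 1, 2, ..., 31
by_int = src.reindex(
    columns=[*src.columns[:2], *int_cols, *src.columns[2:]], fill_value=0)

# String labels: '1', '2', ..., '31' (assign requires string keyword names)
by_str = (
    src
    .assign(**dict.fromkeys(str_cols, 0))
    .reindex(columns=[*src.columns[:2], *str_cols, *src.columns[2:]])
)

print(1 in by_int.columns, '1' in by_int.columns)  # True False
print(1 in by_str.columns, '1' in by_str.columns)  # False True
```

Whichever label type is chosen, later lookups such as `df[1]` or `df['1']` have to use the same one.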
If your purpose is to add and initialize new columns, use reindex:
cols = list(src)
cols[2:2] = range(1,31+1)
df = src.reindex(columns=cols, fill_value=0)
Output:
col1 col2 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 col3
0 one two 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 three
1 one two 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 three
2 one two 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 three
3 one two 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 three
4 one two 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 three
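The hard-coded position 2 can also be derived from a column name. The sketch below is a variation on the reindex approach, assuming the column labels are unique so that `Index.get_loc` returns a single integer; the variable `pos` is illustrative.

```python
import pandas as pd

src = pd.DataFrame({'col1': ['one'] * 5,
                    'col2': ['two'] * 5,
                    'col3': ['three'] * 5})

# Position of the column that the new columns should precede
pos = src.columns.get_loc('col3')

cols = list(src.columns)
cols[pos:pos] = range(1, 32)  # splice the labels 1..31 in front of col3

# Columns missing from src are created and filled with 0
df = src.reindex(columns=cols, fill_value=0)
```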
Another possible solution:
pd.concat([src.iloc[:, :2].assign(
**{str(col): 0 for col in range(1, 32)}), src['col3']], axis=1)
Output:
col1 col2 1 2 3 4 5 6 7 8 ... 23 24 25 26 27 28 29 30 31
0 one two 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0
1 one two 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0
2 one two 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0
3 one two 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0
4 one two 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0
col3
0 three
1 three
2 three
3 three
4 three
[5 rows x 34 columns]
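For completeness, pandas also offers `DataFrame.insert`, which places a single column at a given position in place. None of the answers above use it, so the following is only a sketch of an alternative: it loops over the 31 labels and, unlike the snippet above, does not depend on how many columns follow the insertion point. The variable `start` is illustrative.

```python
import pandas as pd

src = pd.DataFrame({'col1': ['one'] * 5,
                    'col2': ['two'] * 5,
                    'col3': ['three'] * 5})

df = src.copy()  # insert modifies the dataframe in place, so work on a copy
start = 2        # position of the first new column (between col2 and col3)

for offset, day in enumerate(range(1, 32)):
    # Each new column goes one slot further right than the previous one
    df.insert(start + offset, str(day), 0)
```

Repeated inserts are usually slower than a single concat or reindex, so this form is mainly useful when only a few columns are added.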