How do you pop multiple columns off a Pandas dataframe, into a new dataframe?
Question:
Suppose I have the following:
import pandas as pd
df = pd.DataFrame({'a':range(2), 'b':range(2), 'c':range(2), 'd':range(2)})
I’d like to “pop” two columns (‘c’ and ‘d’) off the dataframe, into a new dataframe, leaving ‘a’ and ‘b’ behind in the original df. The following does not work:
df2 = df.pop(['c', 'd'])
Here’s my error:
TypeError: '['c', 'd']' is an invalid key
Does anyone know a quick, classy solution, besides doing the following?
df2 = df[['c', 'd']]
df3 = df[['a', 'b']]
I know the above code is not that tedious to type, but this is why DataFrame.pop was invented: to save us a step when popping one column off a dataframe.
Answers:
This will have to be a two-step process (you cannot get around this because, as rightly mentioned, pop works for a single column and returns a Series).
First, slice df (step 1), and then drop those columns (step 2).
df2 = df[['c', 'd']].copy()
df = df.drop(['c', 'd'], axis=1)
And here’s the one-liner version using pd.concat:
df2 = pd.concat([df.pop(x) for x in ['c', 'd']], axis=1)
This is still a two-step process, but you’re doing it in one line. Afterwards, df and df2 look like this:
df
a b
0 0 0
1 1 1
df2
c d
0 0 0
1 1 1
With that said, I think there’s value in allowing pop to take a list-like of column headers and return a DataFrame of the popped columns. This would make a good feature request for GitHub, assuming one has the time to write one up.
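Until something like that exists, the slice-then-drop steps can be wrapped in a small helper. This is only a sketch: pop_columns is a hypothetical name, not part of pandas, and it assumes every requested column is present in the frame.

```python
import pandas as pd


def pop_columns(frame, columns):
    """Hypothetical helper (not a pandas API): remove `columns` from
    `frame` in place and return them as a new DataFrame."""
    columns = list(columns)
    popped = frame[columns].copy()              # step 1: slice the columns out
    frame.drop(columns=columns, inplace=True)   # step 2: drop them from the original
    return popped


df = pd.DataFrame({'a': range(2), 'b': range(2), 'c': range(2), 'd': range(2)})
df2 = pop_columns(df, ['c', 'd'])

print(df.columns.tolist())
print(df2.columns.tolist())
```

After the call, df keeps only 'a' and 'b', and df2 holds 'c' and 'd', which mirrors what pop does for a single column.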
Here’s an alternative, but I’m not sure if it’s more classy than your original solution:
df2 = pd.DataFrame([df.pop(x) for x in ['c', 'd']]).T
df3 = pd.DataFrame([df.pop(x) for x in ['a', 'b']]).T
Output:
print(df2)
# c d
#0 0 0
#1 1 1
print(df3)
# a b
#0 0 0
#1 1 1
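One thing worth checking with the transpose approach is what happens when the popped columns have different dtypes. The sketch below uses a made-up frame (the name mixed and its values are assumptions for illustration, not from the question) to compare it against pd.concat with axis=1:

```python
import pandas as pd

# Illustrative frame with one integer column and one string column.
mixed = pd.DataFrame({'a': [1, 2], 'c': [1, 2], 'd': ['x', 'y']})

# Column-wise concat keeps each Series' own dtype.
via_concat = pd.concat([mixed[col] for col in ['c', 'd']], axis=1)

# Building rows from the Series and transposing mixes the values of
# both columns together before they are split apart again.
via_transpose = pd.DataFrame([mixed[col] for col in ['c', 'd']]).T

print(via_concat.dtypes)
print(via_transpose.dtypes)
```

With columns of a single dtype, as in the question, the two approaches give the same result; with mixed dtypes, the concat version is the safer choice because the integer column stays an integer column.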
new_df = old_df.loc[:,pop_columns]
If you don’t want to copy your original pd.DataFrame, a list comprehension keeps the code compact. It removes the columns from df in place and returns a list of the popped Series:
list_to_pop = ['a', 'b']
[df.pop(col) for col in list_to_pop]