Remove element from cell when cell contents is a list
Question:
Using the dataframe posted below, I need to remove element 0 from each cell in column Var2 (each cell is a list), but only for rows where Var1 > 0.
import numpy as np
import pandas as pd
df = pd.DataFrame({'Var1': [1,0,3,1],
                   'Var2': [[0,8],[6,0],[1,3,0],[5,0,3]]})
I tried this, but the output is not what I expected – it seems to remove all elements in the cell.
df['Var2'] = df.apply(lambda x: x['Var2'].remove(0) if x['Var1']>0 else x['Var2'], axis = 1)
╔══════════════╗
║ Var1 Var2 ║
╠══════════════╣
║ 1 None ║
║ 0 [6, 0] ║
║ 3 None ║
║ 1 None ║
╚══════════════╝
The desired output is:
╔══════════════╗
║ Var1 Var2 ║
╠══════════════╣
║ 1 [8] ║
║ 0 [6, 0] ║
║ 3 [1, 3] ║
║ 1 [5, 3] ║
╚══════════════╝
What am I doing wrong? Also, I wonder whether this could be done without using apply.
Answers:
remove works in place (it returns None), so you need a list comprehension with filtering instead:
f = lambda x: [y for y in x['Var2'] if y != 0] if x['Var1']>0 else x['Var2']
df['Var2'] = df.apply(f, axis = 1)
print (df)
Var1 Var2
0 1 [8]
1 0 [6, 0]
2 3 [1, 3]
3 1 [5, 3]
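To see why the original attempt filled the column with None, here is a minimal sketch in plain Python, independent of pandas. The variable names are only illustrative.

```python
values = [0, 8]

# list.remove mutates the list in place and returns None
result = values.remove(0)
print(result)  # None
print(values)  # [8]

# A list comprehension builds and returns a new list instead
filtered = [v for v in [0, 8] if v != 0]
print(filtered)  # [8]
```

In the question's lambda, the return value of remove (None) is what ends up in the cell, even though the underlying list was modified.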
You can use pd.Series.apply with a list comprehension. Your code doesn't work because list.remove is an in-place operation that returns None.
df = pd.DataFrame({'Var1': [1,0,3,1],
'Var2': [[0,8],[6,0],[1,3,0],[5,0,3]]})
def remove_zero(x):
return [i for i in x if i != 0]
df.loc[df['Var1'] > 0, 'Var2'] = df['Var2'].apply(remove_zero)
print(df)
Var1 Var2
0 1 [8]
1 0 [6, 0]
2 3 [1, 3]
3 1 [5, 3]
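On the second part of the question (doing this without apply), one possible sketch is a plain list comprehension over the two columns with zip. This is an assumption about what "without apply" should look like, not something taken from the answers above. It still loops in Python, since list-valued cells cannot be vectorised.

```python
import pandas as pd

df = pd.DataFrame({'Var1': [1, 0, 3, 1],
                   'Var2': [[0, 8], [6, 0], [1, 3, 0], [5, 0, 3]]})

# Walk both columns together; filter zeros only where Var1 > 0
df['Var2'] = [
    [v for v in values if v != 0] if flag > 0 else values
    for flag, values in zip(df['Var1'], df['Var2'])
]
print(df)
```

Rows where Var1 is not positive keep their original list object untouched.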
Try this: inside the if branch (Var1 not equal to zero), find the index of the first 0 in the list, delete the element at that index, convert the result back to a list, and save it back to the column.
df['Var2'] = df.apply(lambda x: list(np.delete(x['Var2'], x['Var2'].index(0))) if x['Var1'] != 0 else x['Var2'], axis=1)
print(df)
Input:
Var1 Var2
0 1 [0, 8]
1 0 [6, 0]
2 3 [1, 3, 0]
3 1 [5, 0, 3]
Output:
Var1 Var2
0 1 [8]
1 0 [6, 0]
2 3 [1, 3]
3 1 [5, 3]
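Two caveats with the index-based approach are worth checking before using it on other data: list.index only finds the first 0, and it raises ValueError when the list contains no 0 at all. A small sketch with made-up lists (not from the question's dataframe):

```python
import numpy as np

values = [0, 8, 0]
# list.index returns the position of the first 0 only
first_zero = values.index(0)
trimmed = np.delete(values, first_zero).tolist()
print(trimmed)  # [8, 0]: the second 0 is still there

# With no 0 in the list, list.index raises ValueError
try:
    [5, 3].index(0)
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```

For the sample data this does not matter, because every affected list holds exactly one 0. The list comprehension answers handle both cases without special treatment.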