Is there a way to add multiple elements to a Python set in one line?
Question:
I am iterating over a pandas dataframe and would like to add unique elements to a set from multiple columns of the dataframe. Currently I do it like this:
list_a = set([])
for i, row in df.iterrows():
    list_a.add(row.a)
    list_a.add(row.b)
I tried this:
list_a = set([])
for i, row in df.iterrows():
    list_a.add(row.a, row.b)
But it results in the following error message:
TypeError: add() takes exactly one argument (2 given)
Is there a more elegant way to do this operation than the way I did it (consider the case when there are more than 2 columns to add values from)?
Answers:
You can use the union method, assigning the result back to the set inside the loop:
list_a = list_a.union([row.a, row.b])
See the Python sets documentation for more: https://docs.python.org/2/library/sets.html
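Note that union returns a new set and leaves the original untouched, which is why the result has to be assigned back. A minimal sketch of the difference between union and the in-place update, using made-up values (the names seen and merged are only for illustration):

```python
# Illustrative values only; not taken from the question's dataframe.
seen = {1}

# union() returns a new set and accepts any number of iterables.
merged = seen.union([2, 3], (3, 4))
print(seen)    # the original set is unchanged by union()
print(merged)

# update() (or the |= operator) modifies the set in place instead.
seen.update([2, 3], (3, 4))
print(seen == merged)
```

Inside a loop, the in-place form avoids building a new set on every iteration.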
You can use the union of sets:
list_a = set()
for i, row in df.iterrows():
    list_a |= {row.a, row.b}
IIUC then the following should work:
df[['a','b']].stack().unique()
Example:
In [60]:
df = pd.DataFrame({'a': [0,1,2,2,3], 'b':np.arange(5), 'c':[-1,2,2,54,6]})
df
Out[60]:
   a  b   c
0  0  0  -1
1  1  1   2
2  2  2   2
3  2  3  54
4  3  4   6
In [61]:
df[['a','b']].stack().unique()
Out[61]:
array([0, 1, 2, 3, 4], dtype=int64)
You can cast to a set if necessary:
In [63]:
set(df[['a','b']].stack().unique())
Out[63]:
{0, 1, 2, 3, 4}
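A similar loop-free result can probably be had by going through NumPy directly. This is only a sketch, assuming the selected columns share a numeric dtype and contain no missing values; the names values and unique_set are illustrative:

```python
import numpy as np
import pandas as pd

# Same example frame as above.
df = pd.DataFrame({"a": [0, 1, 2, 2, 3], "b": np.arange(5), "c": [-1, 2, 2, 54, 6]})

# to_numpy() gives a 2-D array of the selected columns;
# np.unique flattens it and returns the sorted distinct values.
values = np.unique(df[["a", "b"]].to_numpy())

# tolist() converts NumPy scalars to plain Python ints before building the set.
unique_set = set(values.tolist())
print(unique_set)
```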
You can simply use the update method of the set data type, which takes an iterable and adds each of its elements:
list_a = set()
for i, row in df.iterrows():
    list_a.update((row.a, row.b))
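Since update accepts any number of iterables, the row loop can arguably be dropped altogether by passing whole columns, which also covers the "more than 2 columns" case. A minimal sketch, assuming a small example frame (the data and the cols list are made up for illustration):

```python
import pandas as pd

# Hypothetical example data; column names follow the question.
df = pd.DataFrame({"a": [0, 1, 2, 2, 3], "b": [0, 1, 2, 3, 4], "c": [-1, 2, 2, 54, 6]})

cols = ["a", "b"]  # columns to collect values from (illustrative)

list_a = set()
# update() takes any number of iterables, so each column is passed as one argument.
list_a.update(*(df[col] for col in cols))

print(list_a)
```

Adding another column then only means extending cols.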