Is there a way to add multiple elements to a Python set in one line?

Question:

I am iterating over a pandas dataframe and would like to add unique elements to a set from multiple columns of the dataframe. Currently I do it like this:

list_a = set([])
for i, row in df.iterrows():
    list_a.add(row.a)
    list_a.add(row.b)

I tried this:

list_a = set([])
for i, row in df.iterrows():
    list_a.add(row.a, row.b)

But it results in the following error message:

TypeError: add() takes exactly one argument (2 given)

Is there a more elegant way to do this operation than the way I did it (consider the case when there are more than 2 columns to add values from)?

Asked By: Balázs Fehér


Answers:

You can use the set's union method:

list_a = list_a.union([row.a, row.b])

See the Python sets documentation for more: https://docs.python.org/2/library/sets.html
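
One detail worth spelling out: union builds a new set rather than modifying the existing one, which is why the result is assigned back to list_a. A minimal sketch, assuming hypothetical values a_value and b_value in place of row.a and row.b:

```python
# Hypothetical stand-ins for row.a and row.b
a_value, b_value = 1, 2

list_a = {0}
result = list_a.union([a_value, b_value])  # builds a new set

# The original set is unchanged until the result is reassigned.
unchanged = set(list_a)
list_a = result
```

Without the reassignment, the loop would leave list_a as it started.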

Answered By: Tom Ron

You can use the set union operator:

list_a = set()
for i, row in df.iterrows():
    list_a |= {row.a, row.b}
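
The operator form differs from the union method in one respect that may matter here: it expects a set on the right-hand side, while the method accepts any iterable. A small sketch, where values is a hypothetical stand-in for the row's values:

```python
list_a = set()
values = [1, 2, 2]  # hypothetical stand-in for [row.a, row.b, ...]

# The method form accepts any iterable.
via_method = list_a.union(values)

# The operator form requires a set operand, so a list raises TypeError.
try:
    list_a |= values
    operator_failed = False
except TypeError:
    operator_failed = True

list_a |= set(values)  # converting to a set first works
```

This is why the answer wraps the values in braces rather than square brackets.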

Answered By: Daniele Pantaleone

If I understand correctly, the following should work:

df[['a','b']].stack().unique()

Example:

In [60]:
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [0, 1, 2, 2, 3], 'b': np.arange(5), 'c': [-1, 2, 2, 54, 6]})
df

Out[60]:
   a  b   c
0  0  0  -1
1  1  1   2
2  2  2   2
3  2  3  54
4  3  4   6

In [61]:    
df[['a','b']].stack().unique()

Out[61]:
array([0, 1, 2, 3, 4], dtype=int64)

You can cast to a set if necessary:

In [63]:
set(df[['a','b']].stack().unique())

Out[63]:
{0, 1, 2, 3, 4}

Answered By: EdChum

You can use the set's update method, which adds every element of an iterable to the set in place.

list_a = set()
for i, row in df.iterrows():
    list_a.update((row.a, row.b))
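
For the case of more than two columns, update can be fed a generator over the column names. One thing to watch: a bare string is itself an iterable of characters, so values should stay wrapped in a tuple, list, or generator. A minimal sketch, where Row, rows, and columns are hypothetical stand-ins for the dataframe rows and column names:

```python
from collections import namedtuple

# Hypothetical stand-in for the rows yielded by df.iterrows()
Row = namedtuple("Row", ["a", "b", "c"])
rows = [Row("x", "y", "z"), Row("x", "yy", "z")]
columns = ["a", "b", "c"]

list_a = set()
for row in rows:
    # Add the row's value for every column of interest
    list_a.update(getattr(row, col) for col in columns)

# Pitfall: passing a string directly adds its characters, not the string.
chars = set()
chars.update("yy")
```

Here list_a ends up holding the whole strings, while chars holds only the single character "y".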

Answered By: Pedram