Set difference versus set subtraction
Question:
What distinguishes - and .difference() on sets? Obviously the syntax is not the same: one is a binary operator, the other is an instance method. What else?
>>> s1 = set([1,2,3])
>>> s2 = set([3,4,5])
>>> s1 - s2
set([1, 2])
>>> s1.difference(s2)
set([1, 2])
Answers:
set.difference, set.union and the other named methods can take any iterable as their argument, while both operands need to be sets (or frozensets) to use -. When both operands are sets, there is no difference in the output.
Operation          Equivalent   Result
s.difference(t)    s - t        new set with elements in s but not in t
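To make the iterable-versus-set point concrete, here is a minimal sketch in Python 3 syntax. The variable names are only illustrative. It shows the method accepting a plain list while the operator rejects it:

```python
s1 = {1, 2, 3}

# The method accepts any iterable (list, tuple, generator, string, ...).
method_result = s1.difference([3, 4, 5])
print(method_result)

# The operator needs a set or frozenset on both sides; a list raises TypeError.
operator_error = None
try:
    s1 - [3, 4, 5]
except TypeError as exc:
    operator_error = exc
    print("TypeError:", exc)

# Converting the right-hand side to a set first makes the operator work.
operator_result = s1 - set([3, 4, 5])
print(operator_result == method_result)
```

The explicit set() conversion in the last step is the extra work the method form saves you.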
With .difference you can do things like:
s1 = set([1,2,3])
print(s1.difference(*[[3],[4],[5]]))
{1, 2}
It is also more efficient when you pass several iterables using the *iterables unpacking syntax, because you don't create intermediate sets. You can see some comparisons here.
The documentation gives the signature as difference(*others), so it accepts multiple sets (or other iterables) in a single call. It is possible that this is more efficient and clearer for things like:
s1 = set([1, 2, 3, 4])
s2 = set([2, 5])
s3 = set([3, 6])
s1.difference(s2, s3) # instead of s1 - s2 - s3
but I would suggest some testing to verify.
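One way to run that test is sketched below. It assumes Python 3 and uses arbitrary example sets and an arbitrary repeat count. The timings depend on the machine and on the data, so treat the numbers as indicative only:

```python
import timeit

# Arbitrary example data; real workloads may behave differently.
s1 = set(range(1000))
s2 = set(range(0, 1000, 2))
s3 = set(range(0, 1000, 3))

# Both spellings should produce the same set.
chained = s1 - s2 - s3
single_call = s1.difference(s2, s3)
same_result = chained == single_call
print("same result:", same_result)

# Time each spelling; the chained form builds an intermediate set for s1 - s2.
t_chained = timeit.timeit(lambda: s1 - s2 - s3, number=2000)
t_single = timeit.timeit(lambda: s1.difference(s2, s3), number=2000)
print(f"s1 - s2 - s3:          {t_chained:.4f}s")
print(f"s1.difference(s2, s3): {t_single:.4f}s")
```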
At a quick glance it may not be evident from the documentation, but buried deep inside it is a paragraph dedicated to distinguishing the method calls from the operator versions:
Note, the non-operator versions of union(), intersection(),
difference(), and symmetric_difference(), issubset(), and issuperset()
methods will accept any iterable as an argument. In contrast, their
operator based counterparts require their arguments to be sets. This
precludes error-prone constructions like set('abc') & 'cbs' in favor
of the more readable set('abc').intersection('cbs').
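The quoted note covers more than difference(). As a small illustration (Python 3, variable names are only for the example), the other named methods also accept a plain string as the iterable, while the & operator from the documentation's example raises TypeError:

```python
letters = set("abc")

# Each named method treats the string as an iterable of characters.
common = letters.intersection("cbs")
combined = letters.union("cbs")
either_not_both = letters.symmetric_difference("cbs")
is_subset = letters.issubset("abcdef")
is_superset = letters.issuperset("ab")

# The operator form from the documentation's example is rejected.
try:
    letters & "cbs"
except TypeError:
    operator_rejected = True
else:
    operator_rejected = False

print(sorted(common), sorted(combined), sorted(either_not_both))
print(is_subset, is_superset, operator_rejected)
```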