Difference Between Two Lists with Duplicates in Python

Question:

I have two lists that contain many of the same items, including duplicate items. I want to check which items in the first list are not in the second list. For example, I might have one list like this:

l1 = ['a', 'b', 'c', 'b', 'c']

and one list like this:

l2 = ['a', 'b', 'c', 'b']

Comparing these two lists, I would want to get back a third list like this:

l3 = ['c']

I am currently using some terrible code that I wrote a while ago, shown below, and I’m fairly certain it doesn’t even work properly.

def list_difference(l1,l2):
    for i in range(0, len(l1)):
        for j in range(0, len(l2)):
            if l1[i] == l1[j]:
                l1[i] = 'damn'
                l2[j] = 'damn'
    l3 = []
    for item in l1:
        if item!='damn':
            l3.append(item)
    return l3

How can I better accomplish this task?

Asked By: Paul


Answers:

You didn’t specify whether the order matters. If it does not, you can do this in Python 2.7 or later:

l1 = ['a', 'b', 'c', 'b', 'c']
l2 = ['a', 'b', 'c', 'b']

from collections import Counter

c1 = Counter(l1)
c2 = Counter(l2)

diff = c1-c2
print(list(diff.elements()))  # ['c']
Answered By: Jochen Ritzel

Counters are new in Python 2.7.
For a general solution that subtracts a from b:

def list_difference(b, a):
    c = list(b)
    for item in a:
        try:
            c.remove(item)
        except ValueError:
            pass  # item is not in c; or maybe you want to keep a's values here
    return c
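
As a quick check, here is a minimal sketch that runs the same function on the lists from the question. Note that list.remove scans the list each time, so this approach takes quadratic time on long inputs.

```python
def list_difference(b, a):
    """Return a copy of b with one occurrence removed per item in a."""
    c = list(b)  # copy, so the caller's list is left untouched
    for item in a:
        try:
            c.remove(item)  # removes the first matching occurrence only
        except ValueError:
            pass  # item occurs more often in a than in b; ignore it
    return c


l1 = ['a', 'b', 'c', 'b', 'c']
l2 = ['a', 'b', 'c', 'b']

print(list_difference(l1, l2))  # ['c']
print(list_difference(l2, l1))  # []
```

The order of the surviving items follows their order in the first argument.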
Answered By: joaquin

Create Counters for both lists, then subtract one from the other.

from collections import Counter

a = [1,2,3,1,2]
b = [1,2,3,1]

c = Counter(a)
c.subtract(Counter(b))
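
After subtract, c still holds every key, including those whose count dropped to zero or below, whereas the - operator keeps only positive counts. A minimal sketch of reading the result back as a list, reusing the a and b above (the name remaining is only for illustration):

```python
from collections import Counter

a = [1, 2, 3, 1, 2]
b = [1, 2, 3, 1]

c = Counter(a)
c.subtract(Counter(b))  # in place; zero and negative counts are kept

# elements() skips any key whose count is less than one
remaining = list(c.elements())
print(remaining)  # [2]

# The - operator returns a new Counter with only the positive counts
print(Counter(a) - Counter(b))  # Counter({2: 1})
```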
Answered By: Matt Fenwick

To take into account both duplicates and the order of elements:

from collections import Counter

def list_difference(a, b):
    count = Counter(a) # count items in a
    count.subtract(b)  # subtract items that are in b
    diff = []
    for x in a:
        if count[x] > 0:
            count[x] -= 1
            diff.append(x)
    return diff

Example

print(list_difference("z y z x v x y x u".split(), "x y z w z".split()))
# -> ['y', 'x', 'v', 'x', 'u']

Python 2.5 version:

from collections import defaultdict 

def list_difference25(a, b):
    # count items in a
    count = defaultdict(int) # item -> number of occurrences
    for x in a:
        count[x] += 1

    # subtract items that are in b
    for x in b: 
        count[x] -= 1

    diff = []
    for x in a:
        if count[x] > 0:
            count[x] -= 1
            diff.append(x)
    return diff
Answered By: jfs

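You can try this. It removes each item of l2 from l1 in place, and raises ValueError if l2 contains an item that is no longer in l1:

list(filter(lambda x: l1.remove(x), l2))
print(l1)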

Answered By: Akash Gupta

Try this one:

from collections import Counter
from typing import Sequence

def duplicates_difference(a: Sequence, b: Sequence) -> Counter:
    """
    >>> duplicates_difference([1,2],[1,2,2,3])
    Counter({2: 1, 3: 1})
    """
    shorter, longer = sorted([a, b], key=len)
    return Counter(longer) - Counter(shorter)
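
A short usage sketch with the lists from the question, assuming the function exactly as defined above. Because the arguments are sorted by length, their order only matters when both have the same length; in that case the stable sort keeps a as the shorter one and the result is Counter(b) - Counter(a).

```python
from collections import Counter
from typing import Sequence


def duplicates_difference(a: Sequence, b: Sequence) -> Counter:
    # Always subtract the shorter sequence's counts from the longer one's
    shorter, longer = sorted([a, b], key=len)
    return Counter(longer) - Counter(shorter)


l1 = ['a', 'b', 'c', 'b', 'c']
l2 = ['a', 'b', 'c', 'b']

# Same result regardless of argument order, since l1 is longer
print(duplicates_difference(l1, l2))  # Counter({'c': 1})
print(duplicates_difference(l2, l1))  # Counter({'c': 1})

# Convert back to a list if that is what you need
print(list(duplicates_difference(l1, l2).elements()))  # ['c']

# Equal lengths: argument order decides the direction of the subtraction
print(duplicates_difference([1, 1], [2, 2]))  # Counter({2: 2})
print(duplicates_difference([2, 2], [1, 1]))  # Counter({1: 2})
```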
Answered By: RafalS