How to check if one dictionary is a subset of another larger dictionary?

Question

I’m trying to write a custom filter method that takes an arbitrary number of kwargs and returns a list containing the elements of a database-like list that contain those kwargs.

For example, suppose d1 = {'a':'2', 'b':'3'} and d2 = the same thing. d1 == d2 results in True. But suppose d2 = the same thing plus a bunch of other things. My method needs to be able to tell if d1 in d2, but Python can’t do that with dictionaries.

Context:

I have a Word class, and each object has properties like word, definition, part_of_speech, and so on. I want to be able to call a filter method on the main list of these words, like Word.objects.filter(word='jump', part_of_speech='verb-intransitive'). I can’t figure out how to manage these keys and values at the same time. But this could have larger functionality outside this context for other people.

Asked By: Jamey

||

Source

Answer 1

Convert to item pairs and check for containment.

all(item in superset.items() for item in subset.items())

Optimization is left as an exercise for the reader.

Answered By: Ignacio Vazquez-Abrams

Answer 2

>>> d1 = {'a':'2', 'b':'3'}
>>> d2 = {'a':'2', 'b':'3','c':'4'}
>>> all((k in d2 and d2[k]==v) for k,v in d1.iteritems())
True

context:

>>> d1 = {'a':'2', 'b':'3'}
>>> d2 = {'a':'2', 'b':'3','c':'4'}
>>> list(d1.iteritems())
[('a', '2'), ('b', '3')]
>>> [(k,v) for k,v in d1.iteritems()]
[('a', '2'), ('b', '3')]
>>> k,v = ('a','2')
>>> k
'a'
>>> v
'2'
>>> k in d2
True
>>> d2[k]
'2'
>>> k in d2 and d2[k]==v
True
>>> [(k in d2 and d2[k]==v) for k,v in d1.iteritems()]
[True, True]
>>> ((k in d2 and d2[k]==v) for k,v in d1.iteritems())
<generator object <genexpr> at 0x02A9D2B0>
>>> ((k in d2 and d2[k]==v) for k,v in d1.iteritems()).next()
True
>>> all((k in d2 and d2[k]==v) for k,v in d1.iteritems())
True
>>>

Answered By: Rusty Rob

Answer 3

for keys and values check use:
set(d1.items()).issubset(set(d2.items()))

if you need to check only keys:
set(d1).issubset(set(d2))

Answered By: kashchey

Answer 4

My function for the same purpose, doing this recursively:

def dictMatch(patn, real):
    """does real dict match pattern?"""
    try:
        for pkey, pvalue in patn.iteritems():
            if type(pvalue) is dict:
                result = dictMatch(pvalue, real[pkey])
                assert result
            else:
                assert real[pkey] == pvalue
                result = True
    except (AssertionError, KeyError):
        result = False
    return result

In your example, dictMatch(d1, d2) should return True even if d2 has other stuff in it, plus it applies also to lower levels:

d1 = {'a':'2', 'b':{3: 'iii'}}
d2 = {'a':'2', 'b':{3: 'iii', 4: 'iv'},'c':'4'}

dictMatch(d1, d2)   # True

Notes: There could be even better solution which avoids the if type(pvalue) is dict clause and applies to even wider range of cases (like lists of hashes etc). Also recursion is not limited here so use at your own risk. 😉

Answered By: Alois Mahdal

Answer 5

Note for people that need this for unit testing: there’s also an assertDictContainsSubset() method in Python’s TestCase class.

http://docs.python.org/2/library/unittest.html?highlight=assertdictcontainssubset#unittest.TestCase.assertDictContainsSubset

It’s however deprecated in 3.2, not sure why, maybe there’s a replacement for it.

Answered By: gitaarik

Answer 6

This function works for non-hashable values. I also think that it is clear and easy to read.

def isSubDict(subDict,dictionary):
    for key in subDict.keys():
        if (not key in dictionary) or (not subDict[key] == dictionary[key]):
            return False
    return True

In [126]: isSubDict({1:2},{3:4})
Out[126]: False

In [127]: isSubDict({1:2},{1:2,3:4})
Out[127]: True

In [128]: isSubDict({1:{2:3}},{1:{2:3},3:4})
Out[128]: True

In [129]: isSubDict({1:{2:3}},{1:{2:4},3:4})
Out[129]: False

Answered By: timthelion

Answer 7

For completeness, you can also do this:

def is_subdict(small, big):
    return dict(big, **small) == big

However, I make no claims whatsoever concerning speed (or lack thereof) or readability (or lack thereof).

Update: As pointed out by Boris’ comment, this trick does not work if your small dict has non-string keys and you’re using Python >= 3 (or in other words: in the face of arbitrarily typed keys, it only works in legacy Python 2.x).

If you are using Python 3.9 or newer, though, you can make it work both with non-string typed keys as well as get an even neater syntax.

Provided your code already has both dictionaries as variables, it’s very concise to check for this inline:

if big | small == big:
    # do something

Otherwise, or if you prefer a reusable function as above, you can use this:

def is_subdict(small, big):
    return big | small == big

The working principle is the same as the first function, only this time around making use of the union operator that was extended to support dicts.

Answered By: blubberdiblub

Answer 8

In Python 3, you can use dict.items() to get a set-like view of the dict items. You can then use the <= operator to test if one view is a “subset” of the other:

d1.items() <= d2.items()

In Python 2.7, use the dict.viewitems() to do the same:

d1.viewitems() <= d2.viewitems()

In Python 2.6 and below you will need a different solution, such as using all():

all(key in d2 and d2[key] == d1[key] for key in d1)

Answered By: augurar

Answer 9

This seemingly straightforward issue costs me a couple hours in research to find a 100% reliable solution, so I documented what I’ve found in this answer.

“Pythonic-ally” speaking, small_dict <= big_dict would be the most intuitive way, but too bad that it won’t work. {'a': 1} < {'a': 1, 'b': 2} seemingly works in Python 2, but it is not reliable because the official documention explicitly calls it out. Go search “Outcomes other than equality are resolved consistently, but are not otherwise defined.” in this section. Not to mention, comparing 2 dicts in Python 3 results in a TypeError exception.
The second most-intuitive thing is small.viewitems() <= big.viewitems() for Python 2.7 only, and small.items() <= big.items() for Python 3. But there is one caveat: it is potentially buggy. If your program could potentially be used on Python <=2.6, its d1.items() <= d2.items() are actually comparing 2 lists of tuples, without particular order, so the final result will be unreliable and it becomes a nasty bug in your program. I am not keen to write yet another implementation for Python<=2.6, but I still don’t feel comfortable that my code comes with a known bug (even if it is on an unsupported platform). So I abandon this approach.
I settle down with @blubberdiblub ‘s answer (Credit goes to him):

def is_subdict(small, big): return dict(big, **small) == big

It is worth pointing out that, this answer relies on the == behavior between dicts, which is clearly defined in official document, hence should work in every Python version. Go search:
- “Dictionaries compare equal if and only if they have the same (key, value) pairs.” is the last sentence in this page
- “Mappings (instances of dict) compare equal if and only if they have equal (key, value) pairs. Equality comparison of the keys and elements enforces reflexivity.” in this page

Answered By: RayLuo

Answer 10

Here’s a general recursive solution for the problem given:

import traceback
import unittest

def is_subset(superset, subset):
    for key, value in subset.items():
        if key not in superset:
            return False

        if isinstance(value, dict):
            if not is_subset(superset[key], value):
                return False

        elif isinstance(value, str):
            if value not in superset[key]:
                return False

        elif isinstance(value, list):
            if not set(value) <= set(superset[key]):
                return False
        elif isinstance(value, set):
            if not value <= superset[key]:
                return False

        else:
            if not value == superset[key]:
                return False

    return True


class Foo(unittest.TestCase):

    def setUp(self):
        self.dct = {
            'a': 'hello world',
            'b': 12345,
            'c': 1.2345,
            'd': [1, 2, 3, 4, 5],
            'e': {1, 2, 3, 4, 5},
            'f': {
                'a': 'hello world',
                'b': 12345,
                'c': 1.2345,
                'd': [1, 2, 3, 4, 5],
                'e': {1, 2, 3, 4, 5},
                'g': False,
                'h': None
            },
            'g': False,
            'h': None,
            'question': 'mcve',
            'metadata': {}
        }

    def tearDown(self):
        pass

    def check_true(self, superset, subset):
        return self.assertEqual(is_subset(superset, subset), True)

    def check_false(self, superset, subset):
        return self.assertEqual(is_subset(superset, subset), False)

    def test_simple_cases(self):
        self.check_true(self.dct, {'a': 'hello world'})
        self.check_true(self.dct, {'b': 12345})
        self.check_true(self.dct, {'c': 1.2345})
        self.check_true(self.dct, {'d': [1, 2, 3, 4, 5]})
        self.check_true(self.dct, {'e': {1, 2, 3, 4, 5}})
        self.check_true(self.dct, {'f': {
            'a': 'hello world',
            'b': 12345,
            'c': 1.2345,
            'd': [1, 2, 3, 4, 5],
            'e': {1, 2, 3, 4, 5},
        }})
        self.check_true(self.dct, {'g': False})
        self.check_true(self.dct, {'h': None})

    def test_tricky_cases(self):
        self.check_true(self.dct, {'a': 'hello'})
        self.check_true(self.dct, {'d': [1, 2, 3]})
        self.check_true(self.dct, {'e': {3, 4}})
        self.check_true(self.dct, {'f': {
            'a': 'hello world',
            'h': None
        }})
        self.check_false(
            self.dct, {'question': 'mcve', 'metadata': {'author': 'BPL'}})
        self.check_true(
            self.dct, {'question': 'mcve', 'metadata': {}})
        self.check_false(
            self.dct, {'question1': 'mcve', 'metadata': {}})

if __name__ == "__main__":
    unittest.main()

NOTE: The original code would fail in certain cases, credits for the fixing goes to @olivier-melançon

Answered By: BPL

Answer 11

I know this question is old, but here is my solution for checking if one nested dictionary is a part of another nested dictionary. The solution is recursive.

def compare_dicts(a, b):
    for key, value in a.items():
        if key in b:
            if isinstance(a[key], dict):
                if not compare_dicts(a[key], b[key]):
                    return False
            elif value != b[key]:
                return False
        else:
            return False
    return True

Answered By: NutCracker

Answer 12

If you don’t mind using pydash there is is_match there which does exactly that:

import pydash

a = {1:2, 3:4, 5:{6:7}}
b = {3:4.0, 5:{6:8}}
c = {3:4.0, 5:{6:7}}

pydash.predicates.is_match(a, b) # False
pydash.predicates.is_match(a, c) # True

Answered By: Zaro

Answer 13

A short recursive implementation that works for nested dictionaries:

def compare_dicts(a,b):
    if not a: return True
    if isinstance(a, dict):
        key, val = a.popitem()
        return isinstance(b, dict) and key in b and compare_dicts(val, b.pop(key)) and compare_dicts(a, b)
    return a == b

This will consume the a and b dicts. If anyone knows of a good way to avoid that without resorting to partially iterative solutions as in other answers, please tell me. I would need a way to split a dict into head and tail based on a key.

This code is more usefull as a programming exercise, and probably is a lot slower than other solutions in here that mix recursion and iteration. @Nutcracker’s solution is pretty good for nested dictionaries.

Answered By: Frederik Baetens

Answer 14

Here is a solution that also properly recurses into lists and sets contained within the dictionary. You can also use this for lists containing dicts etc…

def is_subset(subset, superset):
    if isinstance(subset, dict):
        return all(key in superset and is_subset(val, superset[key]) for key, val in subset.items())
    
    if isinstance(subset, list) or isinstance(subset, set):
        return all(any(is_subset(subitem, superitem) for superitem in superset) for subitem in subset)

    # assume that subset is a plain value if none of the above match
    return subset == superset

When using python 3.10, you can use python’s new match statements to do the typechecking:

def is_subset(subset, superset):
    match subset:
        case dict(_): return all(key in superset and is_subset(val, superset[key]) for key, val in subset.items())
        case list(_) | set(_): return all(any(is_subset(subitem, superitem) for superitem in superset) for subitem in subset)
        # assume that subset is a plain value if none of the above match
        case _: return subset == superset

Answered By: Frederik Baetens

Answer 15

Use this wrapper object that provides partial comparison and nice diffs:


class DictMatch(dict):
    """ Partial match of a dictionary to another one """
    def __eq__(self, other: dict):
        assert isinstance(other, dict)
        return all(other[name] == value for name, value in self.items())

actual_name = {'praenomen': 'Gaius', 'nomen': 'Julius', 'cognomen': 'Caesar'}
expected_name = DictMatch({'praenomen': 'Gaius'})  # partial match
assert expected_name == actual_name  # True

Answered By: kolypto

Answer 16

Most of the answers will not work if within dict there are some arrays of other dicts, here is a solution for this:

def d_eq(d, d1):
   if not isinstance(d, (dict, list)):
      return d == d1
   if isinstance(d, list):
      return all(d_eq(a, b) for a, b in zip(d, d1))
   return all(d.get(i) == d1[i] or d_eq(d.get(i), d1[i]) for i in d1)

def is_sub(d, d1):
  if isinstance(d, list):
     return any(is_sub(i, d1) for i in d)
  return d_eq(d, d1) or (isinstance(d, dict) and any(is_sub(b, d1) for b in d.values()))

print(is_sub(dct_1, dict_2))

Taken from How to check if dict is subset of another complex dict

Answered By: Marius

Answer 17

Another way of doing this:

>>> d1 = {'a':'2', 'b':'3'}
>>> d2 = {'a':'2', 'b':'3','c':'4'}
>>> d3 = {'a':'1'}
>>> set(d1.items()).issubset(d2.items())
True
>>> set(d3.items()).issubset(d2.items())
False

Answered By: Mike Lane

Answer 18

With Python 3.9, this is what I use:

def dict_contains_dict(small: dict, big: dict):    
   return (big | small) == big

Answered By: Andreas Profous

How to check if one dictionary is a subset of another larger dictionary?

Question:

Answers: