How to remove key-value pairs from dictionary 1 whose keys are not in dictionary 2?

Question:

I have two large dictionaries that use the same kind of keys (image file names) but hold different values.

The first dict, train_descriptions, looks like this:

{'15970.jpg': 'Turtle Check Men Navy Blue Shirt',
 '39386.jpg': 'Peter England Men Party Blue Jeans',
 '59263.jpg': 'Titan Women Silver Watch',
 ....
 ....
 '1855.jpg': 'Inkfruit Mens Chain Reaction T-shirt'}

The second dict, train_features, looks like this:

{'31973.jpg': array([[0.00125694, 0.        , 0.03409385, ..., 0.00434341, 0.00728011,
         0.01451511]], dtype=float32),
 '30778.jpg': array([[0.0174035 , 0.04345186, 0.00772929, ..., 0.02230316, 0.        ,
         0.03104496]], dtype=float32),
 ...,
 ...,

 '38246.jpg': array([[0.00403965, 0.03701203, 0.02616892, ..., 0.02296285, 0.00930257,
         0.04575242]], dtype=float32)}

The lengths of the two dictionaries are as follows:

len(train_descriptions) is 44424 and len(train_features) is 44441

As you can see, train_descriptions is shorter than train_features, so train_features has more key-value pairs than train_descriptions. How do I remove the keys from train_features that are not in train_descriptions, so that both dictionaries have the same length?

Asked By: Sticky


Answers:

Using a for loop:

feat = train_features.keys()
desc = train_descriptions.keys()
# keys that are in train_features but not in train_descriptions
extra = [i for i in feat if i not in desc]

for i in extra:
    del train_features[i]

Edit: See below

The code above works, but the same thing can be done more efficiently, without building a list from the dict_keys, as follows:

for i in train_features.keys() - train_descriptions.keys(): del train_features[i]

Subtracting one dict_keys view from another returns a set of the keys found only in the first. The first version built a list, which was neither efficient nor necessary.
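
As a small illustration of the key-view subtraction, here is a minimal sketch using toy dictionaries. The sample values are made up, with plain lists standing in for the NumPy arrays, and it assumes every key in train_descriptions also exists in train_features, as the question implies.

```python
# Toy stand-ins for the real dictionaries; values are made up for illustration.
train_descriptions = {
    "15970.jpg": "Turtle Check Men Navy Blue Shirt",
    "59263.jpg": "Titan Women Silver Watch",
}
train_features = {
    "15970.jpg": [0.1, 0.2],
    "59263.jpg": [0.3, 0.4],
    "31973.jpg": [0.5, 0.6],  # no matching description
}

# Subtracting key views returns a plain set, so it is safe to delete
# from train_features while looping over the result.
extra_keys = train_features.keys() - train_descriptions.keys()
for key in extra_keys:
    del train_features[key]

print(sorted(train_features))  # ['15970.jpg', '59263.jpg']
print(len(train_features) == len(train_descriptions))  # True
```

Because the result of the subtraction is a separate set rather than a live view, the loop does not hit the "dictionary changed size during iteration" error.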

Answered By: Prabhas Kumar

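Just pop() every key of train_features that does not exist in train_descriptions.

# iterate over a copy of the keys, because the dict changes size during the loop
for key in list(train_features):
    if key not in train_descriptions:
        train_features.pop(key)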
Answered By: JayPeerachai

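Use xor (^) to get the symmetric difference between the key sets, that is, the keys present in only one of the two dictionaries.

diff = train_features.keys() ^ train_descriptions.keys()
for k in diff:
    # pop with a default avoids a KeyError for keys that exist only in train_descriptions
    train_features.pop(k, None)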
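
The choice between ^ and - matters when train_descriptions also has keys that are missing from train_features. The following is a minimal sketch with toy dictionaries (hypothetical values, not the asker's data) showing what each operator returns in that case.

```python
# Toy dictionaries (hypothetical values) where each side has one key the other lacks.
train_descriptions = {"a.jpg": "shirt", "b.jpg": "watch"}
train_features = {"a.jpg": [0.1], "c.jpg": [0.2]}

# ^ is the symmetric difference: keys that appear in exactly one dictionary.
sym_diff = train_features.keys() ^ train_descriptions.keys()
# - is the one-sided difference: keys that appear only in train_features.
only_in_features = train_features.keys() - train_descriptions.keys()

print(sorted(sym_diff))          # ['b.jpg', 'c.jpg']
print(sorted(only_in_features))  # ['c.jpg']

# 'b.jpg' is not in train_features, so `del train_features[k]` would raise
# KeyError for it; pop with a default skips missing keys instead.
for k in sym_diff:
    train_features.pop(k, None)

print(sorted(train_features))  # ['a.jpg']
```

When train_features is a strict superset of train_descriptions, as the lengths in the question suggest, both operators give the same result.
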
Answered By: Guy