Dictionary merge by updating but not overwriting if value exists
Question:
If I have 2 dicts as follows:
d1 = {'a': 2, 'b': 4}
d2 = {'a': 2, 'b': ''}
In order to ‘merge’ them:
dict(d1.items() + d2.items())
results in
{'a': 2, 'b': ''}
But what should I do if I would like to compare each value of the two dictionaries and only update d2 into d1 if values in d1 are empty/None/''?
When the same key exists, I would like to only maintain the numerical value (either from d1 or d2) instead of the empty value. If both values are empty, then no problems maintaining the empty value. If both have values, then the d1 value should stay.
i.e.
d1 = {'a': 2, 'b': 8, 'c': ''}
d2 = {'a': 2, 'b': '', 'c': ''}
should result in
{'a': 2, 'b': 8, 'c': ''}
where 8 is not overwritten by ''.
Answers:
Just switch the order:
z = dict(d2.items() + d1.items())
By the way, you may also be interested in the potentially faster update method.
In Python 3, you have to cast the view objects to lists first:
z = dict(list(d2.items()) + list(d1.items()))
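To see what the reordering does and does not cover, here is a small check using the dictionaries from the question plus one hypothetical extra pair (e1, e2, not in the original post) where d1 holds the empty value. It is only a sketch of the Python 3 behaviour:

```python
d1 = {'a': 2, 'b': 8, 'c': ''}
d2 = {'a': 2, 'b': '', 'c': ''}

# Later items win, so putting d1 last gives d1 priority on every shared key.
z = dict(list(d2.items()) + list(d1.items()))

# Hypothetical extra case: the first dict holds the empty value.
e1 = {'x': ''}
e2 = {'x': 5}
z2 = dict(list(e2.items()) + list(e1.items()))

print(z)
print(z2)
```

Plain reordering always lets d1 win, even when its value is empty, so it only solves the question's example and not the general "fill in empty values" case.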
If you want to special-case empty strings, you can do the following:
def mergeDictsOverwriteEmpty(d1, d2):
    res = d1.copy()  # start from d1 so that its non-empty values are kept
    for k, v in d2.items():
        if k not in d1 or d1[k] == '':
            res[k] = v
    return res
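As a usage sketch for this special-casing idea, the variant below also treats None as empty. The names is_empty and merge_prefer_nonempty are made up for illustration and are not part of the answer above:

```python
def is_empty(value):
    # Assumption: only None and '' count as "empty"; 0 and False do not.
    return value is None or value == ''


def merge_prefer_nonempty(d1, d2):
    """Return a new dict in which d1 wins unless its value is missing or empty."""
    res = dict(d1)
    for k, v in d2.items():
        if k not in res or is_empty(res[k]):
            res[k] = v
    return res


result = merge_prefer_nonempty({'a': 2, 'b': 8, 'c': ''},
                               {'a': 2, 'b': '', 'c': ''})
print(result)
```

Neither input dict is modified, because the function works on a shallow copy of d1.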
When both dictionaries have the same keys, you can use the following code (Python 2; use items() instead of iteritems() in Python 3):
dict((k,v if k in d2 and d2[k] in [None, ''] else d2[k]) for k,v in d1.iteritems())
d2.update(d1) instead of dict(d2.items() + d1.items())
This updates d2 with d1 key/value pairs, but only if the d1 value is not None or '' (i.e. not falsy):
>>> d1 = dict(a=1, b=None, c=2)
>>> d2 = dict(a=None, b=2, c=1)
>>> d2.update({k: v for k, v in d1.items() if v})
>>> d2
{'a': 1, 'c': 2, 'b': 2}
(Use iteritems() instead of items() in Python 2.)
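One thing worth checking with the "if v" filter is that it drops every falsy value, including 0 and False. The comparison below is a sketch with made-up data; the explicit test for None and '' is an assumption about what "empty" should mean:

```python
d1 = dict(a=0, b=None, c='')
d2 = dict(a=5, b=2, c=1)

# Truthiness filter: 0 is falsy, so d1['a'] is silently ignored.
by_truthiness = dict(d2)
by_truthiness.update({k: v for k, v in d1.items() if v})

# Explicit filter: only None and '' are treated as empty.
by_explicit = dict(d2)
by_explicit.update({k: v for k, v in d1.items() if v is not None and v != ''})

print(by_truthiness)
print(by_explicit)
```

If numeric zeros are meaningful in your data, the explicit filter is the safer choice.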
Here’s an in-place solution (it modifies d2):
# assumptions: d2 is a temporary dict that can be discarded
# d1 is a dict that must be modified in place
# the modification is adding keys from d2 into d1 that do not exist in d1.
def update_non_existing_inplace(original_dict, to_add):
    to_add.update(original_dict) # to_add now holds the "final result" (O(n))
    original_dict.clear() # erase original_dict in-place (O(1))
    original_dict.update(to_add) # original_dict now holds the "final result" (O(n))
    return
Here’s another in-place solution, which is less elegant but potentially more efficient, as well as leaving d2 unmodified:
# assumptions: d2 cannot be modified
# d1 is a dict that must be modified in place
# the modification is adding keys from d2 into d1 that do not exist in d1.
def update_non_existing_inplace(original_dict, to_add):
    for key in to_add:  # iterating a dict yields its keys (Python 2 and 3)
        if key not in original_dict:
            original_dict[key] = to_add[key]
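The same "add only missing keys, in place" behaviour can also be sketched with dict.setdefault, which stores a value only when the key is absent. The function name add_missing_inplace and the sample data are illustrative:

```python
def add_missing_inplace(original_dict, to_add):
    """Add keys from to_add that original_dict lacks; existing keys are untouched."""
    for key, value in to_add.items():
        original_dict.setdefault(key, value)


d1 = {'a': 1, 'b': ''}
d2 = {'a': 100, 'c': 3}
alias = d1  # a second reference, to show the update really is in place

add_missing_inplace(d1, d2)
print(d1)
```

Note that an existing empty value such as d1['b'] == '' is kept, since this approach only looks at whether the key exists.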
To add to d2 keys/values from d1 which do not exist in d2 without overwriting any existing keys/values in d2:
temp = d2.copy()
d2.update(d1)
d2.update(temp)
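If a new dict is acceptable instead of updating d2 in place, the same priority rule can be sketched with collections.ChainMap, which looks keys up in its first mapping before falling back to later ones. The sample data is made up:

```python
from collections import ChainMap

d1 = {'a': 1, 'b': 2}
d2 = {'b': 20, 'c': 30}

# d2 is searched first, so its values win; d1 only fills in missing keys.
merged = dict(ChainMap(d2, d1))
print(merged)
```

Unlike the copy/update/update sequence, this leaves both d1 and d2 unchanged.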
Python 3.5+ Literal Dict
Unless you are using an obsolete version of Python, you are better off using this.
A Pythonic and faster way to merge, using dict unpacking:
d1 = {'a':1, 'b':1}
d2 = {'a':2, 'c':2}
merged = {**d1, **d2} # priority from right to left
print(merged)
{'a': 2, 'b': 1, 'c': 2}
It's simpler and also faster than the dict(list(d2.items()) + list(d1.items())) alternative:
d1 = {i: 1 for i in range(1000000)}
d2 = {i: 2 for i in range(2000000)}
%timeit dict(list(d1.items()) + list(d2.items()))
402 ms ± 33.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit {**d1, **d2}
144 ms ± 1.12 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
More on this from PEP 448:
The keys in a dictionary remain in a right-to-left priority order, so {**{'a': 1}, 'a': 2, **{'a': 3}} evaluates to {'a': 3}. There is no restriction on the number or position of unpackings.
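A quick sketch of that priority rule, applied to dictionaries shaped like the ones in the question (the extra key 'd' is added here purely for illustration):

```python
d1 = {'a': 2, 'b': 8, 'c': ''}
d2 = {'a': 2, 'b': '', 'c': '', 'd': 7}

d2_wins = {**d1, **d2}  # rightmost unpacking has priority
d1_wins = {**d2, **d1}  # swap the order to let d1 win

pep_example = {**{'a': 1}, 'a': 2, **{'a': 3}}

print(d2_wins)
print(d1_wins)
print(pep_example)
```

As with the items() concatenation, unpacking only decides which dict wins on a shared key; it does not look at whether the winning value is empty.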
Merging Only Non-zero values
To do this, we can build the merged dict with a comprehension that takes the value from d1 when it is non-empty and falls back to d2 otherwise:
d1 = {'a':1, 'b':1, 'c': '', 'd': ''}
d2 = {'a':2, 'c':2, 'd': ''}
merged_non_zero = {
    k: (d1.get(k) or d2.get(k))
    for k in set(d1) | set(d2)
}
print(merged_non_zero)
outputs:
{'a': 1, 'b': 1, 'c': 2, 'd': ''}
a -> prefer first value from d1 as 'a' exists on both d1 and d2
b -> only exists on d1
c -> non-zero on d2
d -> empty string on both
Explanation
The above code creates a dictionary using a dict comprehension.
If d1 has the key and its value is non-zero (i.e. bool(val) is True), it uses the d1[k] value; otherwise it takes d2.get(k), which is None when the key is missing from d2.
Notice that we also merge all keys of the two dicts using set union, set(d1) | set(d2), as they may not have exactly the same keys.
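The same idea can be extended to any number of dicts while treating only None and '' as empty, so that a legitimate 0 is not replaced. This is a sketch under that assumption; merge_first_non_empty is a hypothetical name:

```python
def merge_first_non_empty(*dicts):
    """For each key, keep the first value that is neither None nor ''."""
    merged = {}
    for d in dicts:
        for key, value in d.items():
            current = merged.get(key)
            # Fill the slot if the key is new or still holds an empty value.
            if key not in merged or current is None or current == '':
                merged[key] = value
    return merged


d1 = {'a': 1, 'b': 1, 'c': '', 'd': ''}
d2 = {'a': 2, 'c': 2, 'd': ''}
merged = merge_first_non_empty(d1, d2)
print(merged)
```

With the "or" expression, {'n': 0} merged with {'n': 5} would yield 5; this version keeps the 0.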
If you want to ignore empty values, so that, for example, merging:
a = {"a": 1, "b": 2, "c": ""}
b = {"a": "", "b": 4, "c": 5}
c = {"a": "aaa", "b": ""}
d = {"a": "", "w": ""}
results in: {'a': 'aaa', 'b': 4, 'c': 5, 'w': ''}
You can use these 2 functions:
def merge_two_dicts(a, b, path=None):
    "merges b into a"
    if path is None:
        path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge_two_dicts(a[key], b[key], path + [str(key)])
            elif a[key] == b[key]:
                pass # same leaf value
            else:
                if a[key] and not b[key]:
                    a[key] = a[key]
                else:
                    a[key] = b[key]
        else:
            a[key] = b[key]
    return a
def merge_multiple_dicts(*a):
    output = a[0]
    if len(a) >= 2:
        for n in range(len(a) - 1):
            output = merge_two_dicts(output, a[n + 1])
    return output
So you can just use merge_multiple_dicts(a,b,c,d)
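Because merge_two_dicts writes into its first argument, merge_multiple_dicts(a, b, c, d) also changes a. If the inputs must stay untouched, a non-mutating variant can be sketched as follows; merge_two and merge_many are hypothetical names, and the rule (a later non-empty value wins, an empty one never overwrites a non-empty one) is meant to mirror the functions above:

```python
from copy import deepcopy
from functools import reduce


def merge_two(a, b):
    """Return a new dict: b merged into a copy of a."""
    out = deepcopy(a)
    for key, b_val in b.items():
        a_val = out.get(key)
        if isinstance(a_val, dict) and isinstance(b_val, dict):
            out[key] = merge_two(a_val, b_val)
        elif key not in out or b_val or not a_val:
            # Overwrite unless a holds a truthy value and b holds a falsy one.
            out[key] = deepcopy(b_val)
    return out


def merge_many(*dicts):
    # Starting from {} means even a single input is copied, never returned as-is.
    return reduce(merge_two, dicts, {})


a = {"a": 1, "b": 2, "c": ""}
b = {"a": "", "b": 4, "c": 5}
c = {"a": "aaa", "b": ""}
d = {"a": "", "w": ""}
result = merge_many(a, b, c, d)
print(result)
```

The deep copies cost extra time and memory, so the in-place version may still be preferable for large inputs.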
I have a solution if you want to have more freedom to choose when a value should be overwritten in the merged dictionary. Maybe it’s a verbose script, but it’s not hard to understand its logic.
Thanks fabiocaccamo and senderle for sharing the benedict package, and the nested iteration logic in lists, respectively. This knowledge was fundamental to the script development.
Python Requirements
pip install python-benedict==0.24.3
Python Script
Definition of the Dict class.
from __future__ import annotations
from collections.abc import Mapping
from benedict import benedict
from typing import Iterator
from copy import deepcopy
class Dict:
    def __init__(self, data: dict = None):
        """
        Instantiates a dictionary object with nested keys-based indexing.
        Parameters
        ----------
        data: dict
            Dictionary.
        References
        ----------
        [1] 'Dict' class: https://stackoverflow.com/a/70908985/16109419
        [2] 'Benedict' package: https://github.com/fabiocaccamo/python-benedict
        [3] Dictionary nested iteration: https://stackoverflow.com/a/10756615/16109419
        """
        self.data = deepcopy(data) if data is not None else {}
    def get(self, keys: [object], **kwargs) -> (object, bool):
        """
        Get dictionary item value based on nested keys.
        Parameters
        ----------
        keys: [object]
            Nested keys to get item value based on.
        Returns
        -------
        value, found: (object, bool)
            Item value, and whether the target item was found.
        """
        data = kwargs.get('data', self.data)
        path = kwargs.get('path', [])
        value, found = None, False
        # Looking for item location on dictionary:
        for outer_key, outer_value in data.items():
            trace = path + [outer_key]
            # Getting item value from dictionary:
            if trace == keys:
                value, found = outer_value, True
                break
            if trace == keys[:len(trace)] and isinstance(outer_value, Mapping): # Recursion cutoff.
                value, found = self.get(
                    data=outer_value,
                    keys=keys,
                    path=trace
                )
        return value, found
    def set(self, keys: [object], value: object, **kwargs) -> bool:
        """
        Set dictionary item value based on nested keys.
        Parameters
        ----------
        keys: [object]
            Nested keys to set item value based on.
        value: object
            Item value.
        Returns
        -------
        updated: bool
            Whether the target item was updated.
        """
        data = kwargs.get('data', self.data)
        path = kwargs.get('path', [])
        updated = False
        # Looking for item location on dictionary:
        for outer_key, outer_value in data.items():
            trace = path + [outer_key]
            # Setting item value on dictionary:
            if trace == keys:
                data[outer_key] = value
                updated = True
                break
            if trace == keys[:len(trace)] and isinstance(outer_value, Mapping): # Recursion cutoff.
                updated = self.set(
                    data=outer_value,
                    keys=keys,
                    value=value,
                    path=trace
                )
        return updated
    def add(self, keys: [object], value: object, **kwargs) -> bool:
        """
        Add dictionary item value based on nested keys.
        Parameters
        ----------
        keys: [object]
            Nested keys to add item based on.
        value: object
            Item value.
        Returns
        -------
        added: bool
            Whether the target item was added.
        """
        data = kwargs.get('data', self.data)
        added = False
        # Adding item on dictionary:
        if keys[0] not in data:
            if len(keys) == 1:
                data[keys[0]] = value
                added = True
            else:
                data[keys[0]] = {}
        # Looking for item location on dictionary:
        for outer_key, outer_value in data.items():
            if outer_key == keys[0]: # Recursion cutoff.
                if len(keys) > 1 and isinstance(outer_value, Mapping):
                    added = self.add(
                        data=outer_value,
                        keys=keys[1:],
                        value=value
                    )
        return added
    def remove(self, keys: [object], **kwargs) -> bool:
        """
        Remove dictionary item based on nested keys.
        Parameters
        ----------
        keys: [object]
            Nested keys to remove item based on.
        Returns
        -------
        removed: bool
            Whether the target item was removed.
        """
        data = kwargs.get('data', self.data)
        path = kwargs.get('path', [])
        removed = False
        # Looking for item location on dictionary:
        for outer_key, outer_value in data.items():
            trace = path + [outer_key]
            # Removing item from dictionary:
            if trace == keys:
                del data[outer_key]
                removed = True
                break
            if trace == keys[:len(trace)] and isinstance(outer_value, Mapping): # Recursion cutoff.
                removed = self.remove(
                    data=outer_value,
                    keys=keys,
                    path=trace
                )
        return removed
    def items(self, **kwargs) -> Iterator[object, object]:
        """
        Get dictionary items based on nested keys.
        Returns
        -------
        keys, value: Iterator[object, object]
            List of nested keys and list of values.
        """
        data = kwargs.get('data', self.data)
        path = kwargs.get('path', [])
        for outer_key, outer_value in data.items():
            if isinstance(outer_value, Mapping):
                for inner_key, inner_value in self.items(data=outer_value, path=path + [outer_key]):
                    yield inner_key, inner_value
            else:
                yield path + [outer_key], outer_value
    @staticmethod
    def merge(dict_list: [dict], overwrite: bool = False, concat: bool = False, default_value: object = None) -> dict:
        """
        Merges dictionaries, with value assignment based on order of occurrence. Overwrites values if and only if:
        - The key does not yet exist on merged dictionary;
        - The current value of the key on merged dictionary is the default value.
        Parameters
        ----------
        dict_list: [dict]
            List of dictionaries.
        overwrite: bool
            Overwrites occurrences of values. If false, keep the first occurrence of each value found.
        concat: bool
            Concatenates occurrences of values for the same key.
        default_value: object
            Default value used as a reference to override dictionary attributes.
        Returns
        -------
        md: dict
            Merged dictionary.
        """
        dict_list = [d for d in dict_list if d is not None and isinstance(d, dict)] if dict_list is not None else []
        assert len(dict_list), f"no dictionaries given."
        # Keeping the first occurrence of each value:
        if not overwrite:
            dict_list = [Dict(d) for d in dict_list]
            for i, d in enumerate(dict_list[:-1]):
                for keys, value in d.items():
                    if value != default_value:
                        for j, next_d in enumerate(dict_list[i+1:], start=i+1):
                            next_d.remove(keys=keys)
            dict_list = [d.data for d in dict_list]
        md = benedict()
        md.merge(*dict_list, overwrite=True, concat=concat)
        return md
Definition of the main method to show examples.
import json
def main() -> None:
    dict_list = [
        {1: 'a', 2: None, 3: {4: None, 5: {6: None}}},
        {1: None, 2: None, 3: {4: 'c', 5: {6: {7: None}}}},
        {1: None, 2: 'b', 3: {4: None, 5: {6: {7: 'd'}}}},
        {1: None, 2: 'b', 3: {4: None, 5: {6: {8: {9: {10: ['e', 'f']}}}}}},
        {1: None, 2: 'b', 3: {4: None, 5: {6: {8: {9: {10: ['g', 'h']}}}}}},
    ]
    d = Dict(data=dict_list[-1])
    print("Dictionary operations test:\n")
    print(f"data = {json.dumps(d.data, indent=4)}\n")
    print(f"d = Dict(data=data)")
    keys = [11]
    value = {12: {13: 14}}
    print(f"d.get(keys={keys}) --> {d.get(keys=keys)}")
    print(f"d.set(keys={keys}, value={value}) --> {d.set(keys=keys, value=value)}")
    print(f"d.add(keys={keys}, value={value}) --> {d.add(keys=keys, value=value)}")
    keys = [11, 12, 13]
    value = 14
    print(f"d.add(keys={keys}, value={value}) --> {d.add(keys=keys, value=value)}")
    value = 15
    print(f"d.set(keys={keys}, value={value}) --> {d.set(keys=keys, value=value)}")
    keys = [11]
    print(f"d.get(keys={keys}) --> {d.get(keys=keys)}")
    keys = [11, 12]
    print(f"d.get(keys={keys}) --> {d.get(keys=keys)}")
    keys = [11, 12, 13]
    print(f"d.get(keys={keys}) --> {d.get(keys=keys)}")
    keys = [11, 12, 13, 15]
    print(f"d.get(keys={keys}) --> {d.get(keys=keys)}")
    keys = [2]
    print(f"d.remove(keys={keys}) --> {d.remove(keys=keys)}")
    print(f"d.remove(keys={keys}) --> {d.remove(keys=keys)}")
    print(f"d.get(keys={keys}) --> {d.get(keys=keys)}")
    print("\n-----------------------------\n")
    print("Dictionary values match test:\n")
    print(f"data = {json.dumps(d.data, indent=4)}\n")
    print(f"d = Dict(data=data)")
    for keys, value in d.items():
        real_value, found = d.get(keys=keys)
        status = "found" if found else "not found"
        print(f"d{keys} = {value} == {real_value} ({status}) --> {value == real_value}")
    print("\n-----------------------------\n")
    print("Dictionaries merge test:\n")
    for i, d in enumerate(dict_list, start=1):
        print(f"d{i} = {d}")
    dict_list_ = [f"d{i}" for i, d in enumerate(dict_list, start=1)]
    print(f"dict_list = [{', '.join(dict_list_)}]")
    md = Dict.merge(dict_list=dict_list)
    print("\nmd = Dict.merge(dict_list=dict_list)")
    print("print(md)")
    print(f"{json.dumps(md, indent=4)}")
if __name__ == '__main__':
    main()
Output
Dictionary operations test:
data = {
    "1": null,
    "2": "b",
    "3": {
        "4": null,
        "5": {
            "6": {
                "8": {
                    "9": {
                        "10": [
                            "g",
                            "h"
                        ]
                    }
                }
            }
        }
    }
}
d = Dict(data=data)
d.get(keys=[11]) --> (None, False)
d.set(keys=[11], value={12: {13: 14}}) --> False
d.add(keys=[11], value={12: {13: 14}}) --> True
d.add(keys=[11, 12, 13], value=14) --> False
d.set(keys=[11, 12, 13], value=15) --> True
d.get(keys=[11]) --> ({12: {13: 15}}, True)
d.get(keys=[11, 12]) --> ({13: 15}, True)
d.get(keys=[11, 12, 13]) --> (15, True)
d.get(keys=[11, 12, 13, 15]) --> (None, False)
d.remove(keys=[2]) --> True
d.remove(keys=[2]) --> False
d.get(keys=[2]) --> (None, False)
-----------------------------
Dictionary values match test:
data = {
    "1": null,
    "3": {
        "4": null,
        "5": {
            "6": {
                "8": {
                    "9": {
                        "10": [
                            "g",
                            "h"
                        ]
                    }
                }
            }
        }
    },
    "11": {
        "12": {
            "13": 15
        }
    }
}
d = Dict(data=data)
d[1] = None == None (found) --> True
d[3, 4] = None == None (found) --> True
d[3, 5, 6, 8, 9, 10] = ['g', 'h'] == ['g', 'h'] (found) --> True
d[11, 12, 13] = 15 == 15 (found) --> True
-----------------------------
Dictionaries merge test:
d1 = {1: 'a', 2: None, 3: {4: None, 5: {6: None}}}
d2 = {1: None, 2: None, 3: {4: 'c', 5: {6: {7: None}}}}
d3 = {1: None, 2: 'b', 3: {4: None, 5: {6: {7: 'd'}}}}
d4 = {1: None, 2: 'b', 3: {4: None, 5: {6: {8: {9: {10: ['e', 'f']}}}}}}
d5 = {1: None, 2: 'b', 3: {4: None, 5: {6: {8: {9: {10: ['g', 'h']}}}}}}
dict_list = [d1, d2, d3, d4, d5]
md = Dict.merge(dict_list=dict_list)
print(md)
{
    "1": "a",
    "2": "b",
    "3": {
        "4": "c",
        "5": {
            "6": {
                "7": "d",
                "8": {
                    "9": {
                        "10": [
                            "e",
                            "f"
                        ]
                    }
                }
            }
        }
    }
}
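For readers who would rather not add a dependency, the "first non-default value wins" behaviour shown in the merge test can be approximated with the standard library only. This is a rough sketch, not a replacement for the Dict class above: the function names are made up, there is no concat option, and it is only checked against the sample dict_list:

```python
def merge_first_wins(dicts, default=None):
    """Merge nested dicts; the first non-default leaf found for a path is kept."""
    merged = {}
    for d in dicts:
        _merge_into(merged, d, default)
    return merged


def _merge_into(target, source, default):
    for key, value in source.items():
        current = target.get(key, default)
        if isinstance(value, dict):
            if isinstance(current, dict):
                _merge_into(current, value, default)
            elif current == default:
                # Missing key or placeholder value: build a fresh nested dict.
                target[key] = {}
                _merge_into(target[key], value, default)
            # Otherwise an earlier non-default leaf wins and is left alone.
        elif key not in target or current == default:
            target[key] = value  # stored by reference, not copied


dict_list = [
    {1: 'a', 2: None, 3: {4: None, 5: {6: None}}},
    {1: None, 2: None, 3: {4: 'c', 5: {6: {7: None}}}},
    {1: None, 2: 'b', 3: {4: None, 5: {6: {7: 'd'}}}},
    {1: None, 2: 'b', 3: {4: None, 5: {6: {8: {9: {10: ['e', 'f']}}}}}},
    {1: None, 2: 'b', 3: {4: None, 5: {6: {8: {9: {10: ['g', 'h']}}}}}},
]
md = merge_first_wins(dict_list)
print(md)
```

The keys stay integers here; they only appear as strings in the output above because json.dumps converts them.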