How to recursively replace character in keys of a nested dictionary?

Question:

I’m trying to create a generic function that replaces dots in keys of a nested dictionary. I have a non-generic function that goes 3 levels deep, but there must be a way to do this generic. Any help is appreciated! My code so far:

output = {'key1': {'key2': 'value2', 'key3': {'key4 with a .': 'value4', 'key5 with a .': 'value5'}}} 

def print_dict(d):
    new = {}
    for key,value in d.items():
        new[key.replace(".", "-")] = {}
        if isinstance(value, dict):
            for key2, value2 in value.items():
                new[key][key2] = {}
                if isinstance(value2, dict):
                    for key3, value3 in value2.items():
                        new[key][key2][key3.replace(".", "-")] = value3
                else:
                    new[key][key2.replace(".", "-")] = value2
        else:
            new[key] = value
    return new

print print_dict(output)

UPDATE: to answer my own question, I made a solution using json object_hooks:

import json

def remove_dots(obj):
    for key in obj.keys():
        new_key = key.replace(".","-")
        if new_key != key:
            obj[new_key] = obj[key]
            del obj[key]
    return obj

output = {'key1': {'key2': 'value2', 'key3': {'key4 with a .': 'value4', 'key5 with a .': 'value5'}}}
new_json = json.loads(json.dumps(output), object_hook=remove_dots) 

print new_json
Asked By: Bas Tichelaar

||

Answers:

Yes, there exists better way:

def print_dict(d):
    new = {}
    for k, v in d.iteritems():
        if isinstance(v, dict):
            v = print_dict(v)
        new[k.replace('.', '-')] = v
    return new

(Edit: It’s recursion, more on Wikipedia.)

Answered By: horejsek

You have to remove the original key, but you can’t do it in the body of the loop because it will throw RunTimeError: dictionary changed size during iteration.

To solve this, iterate through a copy of the original object, but modify the original object:

def change_keys(obj):
    new_obj = obj
    for k in new_obj:
            if hasattr(obj[k], '__getitem__'):
                    change_keys(obj[k])
            if '.' in k:
                    obj[k.replace('.', '$')] = obj[k]
                    del obj[k]

>>> foo = {'foo': {'bar': {'baz.121': 1}}}
>>> change_keys(foo)
>>> foo
{'foo': {'bar': {'baz$121': 1}}}
Answered By: bk0

I used the code by @horejsek, but I adapted it to accept nested dictionaries with lists and a function that replaces the string.

I had a similar problem to solve: I wanted to replace keys in underscore lowercase convention for camel case convention and vice versa.

def change_dict_naming_convention(d, convert_function):
    """
    Convert a nested dictionary from one convention to another.
    Args:
        d (dict): dictionary (nested or not) to be converted.
        convert_function (func): function that takes the string in one convention and returns it in the other one.
    Returns:
        Dictionary with the new keys.
    """
    new = {}
    for k, v in d.iteritems():
        new_v = v
        if isinstance(v, dict):
            new_v = change_dict_naming_convention(v, convert_function)
        elif isinstance(v, list):
            new_v = list()
            for x in v:
                new_v.append(change_dict_naming_convention(x, convert_function))
        new[convert_function(k)] = new_v
    return new
Answered By: jllopezpino

Here’s a simple recursive solution that deals with nested lists and dictionnaries.

def change_keys(obj, convert):
    """
    Recursivly goes through the dictionnary obj and replaces keys with the convert function.
    """
    if isinstance(obj, dict):
        new = {}
        for k, v in obj.iteritems():
            new[convert(k)] = change_keys(v, convert)
    elif isinstance(obj, list):
        new = []
        for v in obj:
            new.append(change_keys(v, convert))
    else:
        return obj
    return new
Answered By: ngenain

Actually all of the answers contain a mistake that may lead to wrong typing in the result.

I’d take the answer of @ngenain and improve it a bit below.

My solution will take care about the types derived from dict (OrderedDict, defaultdict, etc) and also about not only list, but set and tuple types.

I also do a simple type check in the beginning of the function for the most common types to reduce the comparisons count (may give a bit of speed in the large amounts of the data).

Works for Python 3. Replace obj.items() with obj.iteritems() for Py2.

def change_keys(obj, convert):
    """
    Recursively goes through the dictionary obj and replaces keys with the convert function.
    """
    if isinstance(obj, (str, int, float)):
        return obj
    if isinstance(obj, dict):
        new = obj.__class__()
        for k, v in obj.items():
            new[convert(k)] = change_keys(v, convert)
    elif isinstance(obj, (list, set, tuple)):
        new = obj.__class__(change_keys(v, convert) for v in obj)
    else:
        return obj
    return new

If I understand the needs right, most of users want to convert the keys to use them with mongoDB that does not allow dots in key names.

Answered By: baldr

While jllopezpino’s answer works but only limited to the start with the dictionary, here is mine that works with original variable is either list or dict.

def fix_camel_cases(data):
    def convert(name):
        # https://stackoverflow.com/questions/1175208/elegant-python-function-to-convert-camelcase-to-snake-case
        s1 = re.sub('(.)([A-Z][a-z]+)', r'1_2', name)
        return re.sub('([a-z0-9])([A-Z])', r'1_2', s1).lower()

    if isinstance(data, dict):
        new_dict = {}
        for key, value in data.items():
            value = fix_camel_cases(value)
            snake_key = convert(key)
            new_dict[snake_key] = value
        return new_dict

    if isinstance(data, list):
        new_list = []
        for value in data:
            new_list.append(fix_camel_cases(value))
        return new_list

    return data
Answered By: James Lin

Here’s a 1-liner variant of @horejsek ‘s answer using dict comprehension for those who prefer:

def print_dict(d):
    return {k.replace('.', '-'): print_dict(v) for k, v in d.items()} if isinstance(d, dict) else d

I’ve only tested this in Python 2.7

Answered By: ecoe

You can dump everything to a JSON
replace through the whole string and load the JSON back

def nested_replace(data, old, new):
    json_string = json.dumps(data)
    replaced = json_string.replace(old, new)
    fixed_json = json.loads(replaced)
    return fixed_json

Or use a one-liner

def short_replace(data, old, new):
    return json.loads(json.dumps(data).replace(old, new))
Answered By: Ariel Voskov

I am guessing you have the same issue as I have, inserting dictionaries into a MongoDB collection, encountering exceptions when trying to insert dictionaries that have keys with dots (.) in them.

This solution is essentially the same as most other answers here, but it is slightly more compact, and perhaps less readable in that it uses a single statement and calls itself recursively. For Python 3.

def replace_keys(my_dict):
    return { k.replace('.', '(dot)'): replace_keys(v) if type(v) == dict else v for k, v in my_dict.items() }
Answered By: Mats Bengtsson