How to convert a nested OrderedDict to dict?

Question:

I have a nested OrderedDict I would like to convert to a dict. Applying dict() on it apparently only converts the outermost layer of the last entry.

from collections import OrderedDict

od = OrderedDict(
    [
        (u'name', u'Alice'),
        (u'ID', OrderedDict(
            [
                (u'type', u'card'),
                (u'nr', u'123')
            ]
        )),
        (u'name', u'Bob'),
        (u'ID', OrderedDict(
            [
                (u'type', u'passport'),
                (u'nr', u'567')
            ]
        ))
    ]
)

print(dict(od))

Output:

{u'name': u'Bob', u'ID': OrderedDict([(u'type', u'passport'), (u'nr', u'567')])}

Is there a direct method to convert all the occurrences?

Asked By: WoJ


Answers:

This should work:

import collections

def deep_convert_dict(layer):
    to_ret = layer
    if isinstance(layer, collections.OrderedDict):
        to_ret = dict(layer)

    try:
        for key, value in to_ret.items():
            to_ret[key] = deep_convert_dict(value)
    except AttributeError:
        pass

    return to_ret

Although, as jonrsharpe mentioned, there’s probably no reason to do this — an OrderedDict (by design) works wherever a dict does.
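
To illustrate the point that an OrderedDict works wherever a dict does, here is a minimal sketch. The sample data is made up for illustration and is not the question's exact input. It relies only on the fact that OrderedDict is a subclass of dict, so isinstance checks, equality against a plain dict, and json.dumps all accept it unchanged.

```python
import json
from collections import OrderedDict

# Illustrative data, not taken verbatim from the question.
od = OrderedDict([
    ("name", "Alice"),
    ("ID", OrderedDict([("type", "card"), ("nr", "123")])),
])

is_dict = isinstance(od, dict)  # OrderedDict subclasses dict
same_content = od == {"name": "Alice", "ID": {"type": "card", "nr": "123"}}
as_json = json.dumps(od)        # serialized like a plain dict

print(is_dict, same_content, as_json)
```

If all three behave as expected for your use case, the conversion may not be needed at all.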

Answered By: Patrick Collins

NOTE: This answer is only partially correct. See https://stackoverflow.com/a/25057250/1860929 to understand why the two dicts report the same size.

Original Answer

This doesn’t answer the question of the conversion; it’s more about what needs to be done.

The basic assumption that an OrderedDict is twice the size of a dict is flawed. Check this:

import sys
import random
from collections import OrderedDict

test_dict = {}
test_ordered_dict = OrderedDict()

for key in range(10000):
    test_dict[key] = random.random()
    test_ordered_dict[key] = random.random()

sys.getsizeof(test_dict)
786712

sys.getsizeof(test_ordered_dict)
786712

Basically, sys.getsizeof reports the same size for both.
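
A caveat when reading these numbers, sketched below with made-up data: sys.getsizeof reports only the memory of the object itself, not of the objects it refers to. Equal results therefore do not by themselves show that two containers use the same total memory. The variable names are hypothetical.

```python
import sys

payload = "x" * 10_000
container = {"payload": payload}

# getsizeof is shallow: the dict's size does not include the string it holds.
container_size = sys.getsizeof(container)
payload_size = sys.getsizeof(payload)

print(container_size, payload_size)
```

On CPython, the container reports far fewer bytes than the single string it contains.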

However, the time taken for the operations is not the same. In fact, creating a large dictionary (with 100-10000 keys) is around 7-8x faster than creating an OrderedDict with the same keys. (Verified using %timeit in IPython.)

import sys
import random
from collections import OrderedDict


def operate_on_dict(r):
    test_dict = {}
    for key in range(r):
        test_dict[key] = random.random()

def operate_on_ordered_dict(r):
    test_ordered_dict = OrderedDict()
    for key in range(r):
        test_ordered_dict[key] = random.random()

%timeit for x in range(100): operate_on_ordered_dict(100)
100 loops, best of 3: 9.24 ms per loop

%timeit for x in range(100): operate_on_dict(100)
1000 loops, best of 3: 1.23 ms per loop

So, IMO, you should focus on reading the data directly into a dict and operating on it, rather than first creating an OrderedDict and then repeatedly converting it to a dict.
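
Outside IPython, the same comparison can be sketched with the standard timeit module. This is a rough sketch, not a rigorous benchmark: the helper name fill is hypothetical, and the ratio depends heavily on the Python version and machine, so no timings are quoted here.

```python
import random
import timeit
from collections import OrderedDict

def fill(mapping_type, n):
    """Build a mapping of the given type holding n random values."""
    mapping = mapping_type()
    for key in range(n):
        mapping[key] = random.random()
    return mapping

# Run each builder 1000 times and report the total elapsed seconds.
dict_seconds = timeit.timeit(lambda: fill(dict, 100), number=1000)
ordered_seconds = timeit.timeit(lambda: fill(OrderedDict, 100), number=1000)

print(f"dict: {dict_seconds:.4f}s  OrderedDict: {ordered_seconds:.4f}s")
```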

Answered By: Anshul Goyal

The simplest solution is to use json.dumps and json.loads:

from json import loads, dumps
from collections import OrderedDict

def to_dict(input_ordered_dict):
    return loads(dumps(input_ordered_dict))

NOTE: The above code works for dictionaries whose contents json knows how to serialize. The list of default object types can be found here

So, this should be enough if the ordered dictionary does not contain special values.
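
To make that caveat concrete, here is a small sketch with made-up data. Even for values json can serialize, the round trip is not an exact copy: non-string keys come back as strings and tuples come back as lists.

```python
from collections import OrderedDict
from json import dumps, loads

def to_dict(input_ordered_dict):
    return loads(dumps(input_ordered_dict))

# Illustrative input with an int key and a tuple value.
od = OrderedDict([(1, ("a", "b")), ("inner", OrderedDict([("x", 1)]))])
result = to_dict(od)

print(result)  # prints {'1': ['a', 'b'], 'inner': {'x': 1}}
```

The nested OrderedDict is converted as intended, but the key 1 became "1" and the tuple became a list.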

EDIT: Based on the comments, let us improve the above code. Suppose input_ordered_dict might contain custom class objects that json cannot serialize by default.
In that case, we should use the default parameter of json.dumps with a custom serializer of our own.

For example:

from collections import OrderedDict as odict
from json import loads, dumps

class Name(object):
    def __init__(self, name):
        name = name.split(" ", 1)
        self.first_name = name[0]
        self.last_name = name[-1]

a = odict()
a["thiru"] = Name("Mr Thiru")
a["wife"] = Name("Mrs Thiru")
a["type"] = "test" # This is by default serializable

def custom_serializer(obj):
    if isinstance(obj, Name):
        return obj.__dict__

try:
    b = dumps(a)
except TypeError:
    # Raised because the Name objects are not serializable by default
    pass
b = dumps(a, default=custom_serializer)
# Produces the desired output

This example can be extended much further. We can even add filters or modify values as needed. Just add an else branch to the custom_serializer function:

def custom_serializer(obj):
    if isinstance(obj, Name):
        return obj.__dict__
    else:
        # Will get into this if the value is not serializable by default 
        # and is not a Name class object
        return None

The function that is given at the top, in case of custom serializers, should be:

from json import loads, dumps
from collections import OrderedDict

def custom_serializer(obj):
    if isinstance(obj, Name):
        return obj.__dict__
    else:
        # Will get into this if the value is not serializable by default 
        # and is also not a Name class object
        return None

def to_dict(input_ordered_dict):
    return loads(dumps(input_ordered_dict, default=custom_serializer))

Answered By: thiruvenkadam

I wrote a recursive method to convert an OrderedDict to a simple dict.

from collections import OrderedDict

def recursive_ordered_dict_to_dict(ordered_dict):
    simple_dict = {}

    for key, value in ordered_dict.items():
        if isinstance(value, OrderedDict):
            simple_dict[key] = recursive_ordered_dict_to_dict(value)
        else:
            simple_dict[key] = value

    return simple_dict

Note: OrderedDicts and dicts are usually interchangeable, but I ran into an issue when running an assert between the two types using pytest.
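
The answer does not say what the pytest issue was. One difference that can surface in assertions, shown here as an assumption about the kind of problem meant rather than the author's actual case, is that equality between two OrderedDicts is order-sensitive, while equality between an OrderedDict and a plain dict ignores order.

```python
from collections import OrderedDict

a = OrderedDict([("x", 1), ("y", 2)])
b = OrderedDict([("y", 2), ("x", 1)])

# Two OrderedDicts with the same items in a different order are not equal.
ordered_vs_ordered = a == b
# Against a plain dict, only the contents are compared.
ordered_vs_dict = a == dict(b)

print(ordered_vs_ordered, ordered_vs_dict)  # prints False True
```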

Answered By: vpontis

Here’s a version that also handles lists and tuples. In a comment, the OP mentions that lists of dicts are also a case to handle.

Note that this also converts tuples to lists. Preserving tuples is left as an exercise for the reader 🙂

from collections import OrderedDict

def od2d(val):
    if isinstance(val, (OrderedDict, dict)):
        return {k: od2d(v) for k, v in val.items()}
    elif isinstance(val, (tuple, list)):
        return [od2d(v) for v in val]
    else:
        return val
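
One possible take on the tuple-preserving exercise, as a sketch rather than the answer author's own solution: handle tuples in a separate branch and rebuild them with tuple(). The name od2d_keep_tuples and the sample data are hypothetical.

```python
from collections import OrderedDict

def od2d_keep_tuples(val):
    """Like od2d, but rebuild tuples as tuples instead of lists."""
    if isinstance(val, dict):  # also matches OrderedDict, a dict subclass
        return {k: od2d_keep_tuples(v) for k, v in val.items()}
    elif isinstance(val, tuple):
        return tuple(od2d_keep_tuples(v) for v in val)
    elif isinstance(val, list):
        return [od2d_keep_tuples(v) for v in val]
    else:
        return val

# Illustrative data: an OrderedDict inside a tuple and inside a list.
data = OrderedDict([
    ("pair", (OrderedDict([("a", 1)]), 2)),
    ("items", [OrderedDict([("b", 3)])]),
])
converted = od2d_keep_tuples(data)

print(converted)  # prints {'pair': ({'a': 1}, 2), 'items': [{'b': 3}]}
```

Note that this sketch turns namedtuples into plain tuples, since tuple() discards the subclass.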

Answered By: rrauenza

This code should work with nested lists.

import collections

def nested_convert_to_dict(input: [dict, collections.OrderedDict]):
    if isinstance(input, collections.OrderedDict):
        res = dict(input)
    else:
        res = input
    try:
        for key, value in res.items():
            res[key] = nested_convert_to_dict(value)
            if isinstance(value, list):
                new_value = []
                for item in value:
                    if isinstance(item, collections.OrderedDict):
                        item = nested_convert_to_dict(item)
                    new_value.append(item)
                res[key] = new_value
    except AttributeError:
        pass
    return res

You should leverage Python’s builtin copy mechanism.

You can override copying behavior for OrderedDict via Python’s copyreg module (also used by pickle). Then you can use Python’s builtin copy.deepcopy() function to perform the conversion.

import copy
import copyreg
from collections import OrderedDict

def convert_nested_ordered_dict(x):
    """
    Perform a deep copy of the given object, but convert
    all internal OrderedDicts to plain dicts along the way.

    Args:
        x: Any pickleable object

    Returns:
        A copy of the input, in which all OrderedDicts contained
        anywhere in the input (as iterable items or attributes, etc.)
        have been converted to plain dicts.
    """
    # Temporarily install a custom pickling function
    # (used by deepcopy) to convert OrderedDict to dict.
    orig_pickler = copyreg.dispatch_table.get(OrderedDict, None)
    copyreg.pickle(
        OrderedDict,
        lambda d: (dict, ([*d.items()],))
    )
    try:
        return copy.deepcopy(x)
    finally:
        # Restore the original OrderedDict pickling function (if any)
        del copyreg.dispatch_table[OrderedDict]
        if orig_pickler:
            copyreg.dispatch_table[OrderedDict] = orig_pickler

Merely by using Python’s builtin copying infrastructure, this solution is superior to all other answers presented here, in the following ways:

  • Works for more than just JSON data.

  • Does not require you to implement special logic for each possible element type (e.g. list, tuple, etc.)

  • deepcopy() will properly handle duplicate references within the collection:

    x = [1,2,3]
    d = {'a': x, 'b': x}
    assert d['a'] is d['b']
    
    d2 = copy.deepcopy(d)
    assert d2['a'] is d2['b']
    

    Since our solution is based on deepcopy() we’ll have the same advantage.

  • This solution also converts attributes that happen to be OrderedDict, not only collection elements:

    class C:
        def __init__(self, a):
            self.a = a
    
        def __repr__(self):
            return f"C(a={self.a})"
    
    c = C(OrderedDict([(1, 'one'), (2, 'two')]))
    print("original: ", c)
    print("converted:", convert_nested_ordered_dict(c))
    
    original:  C(a=OrderedDict([(1, 'one'), (2, 'two')]))
    converted: C(a={1: 'one', 2: 'two'})
    
Answered By: Stuart Berg