Max string recursion exceeded when using str.format_map()
Question:
I am using str.format_map to format some strings but I encounter a problem when this string contains quotes, even escaped. Here is the code:
class __FormatDict(dict):
    def __missing__(self, key):
        return '{' + key + '}'

def format_dict(node, template_values):
    template_values = __FormatDict(template_values)
    for key, item in node.items():
        if isinstance(item, str):
            node[key] = item.format_map(template_values)
For regular strings (that do not include brackets or quotes) it works; however, for strings like "{"libraries":[{"file": "bonjour.so", "modules":[{"name": "hello"}]}]}" it crashes with the message ValueError: Max string recursion exceeded.
Escaping the quotes using json.dumps(item) before formatting does not solve the issue. What should be done to fix this problem? I am modifying strings I get from JSON files and I would prefer to fix the Python code instead of updating the JSON documents I use.
Answers:
You can’t use your __missing__ trick on JSON data. There are multiple problems, because the text within {...} replacement fields is not just treated as a plain string. Take a look at the syntax grammar:
replacement_field ::= "{" [field_name] ["!" conversion] [":" format_spec] "}"
field_name ::= arg_name ("." attribute_name | "[" element_index "]")*
Within a replacement field, !... and :... have meaning too! What goes into those sections has strict limits as well.
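To see how this grammar carves up a JSON-like string, the standard library's string.Formatter().parse() can be used to list the pieces the formatter finds. This is only a small sketch; the sample string is made up for illustration.

```python
from string import Formatter

# Formatter.parse() yields (literal_text, field_name, format_spec, conversion)
# tuples, one per replacement field found in the format string.
json_like = '{"file": "bonjour.so"}'  # illustrative sample, not from a real file
fields = list(Formatter().parse(json_like))

for literal_text, field_name, format_spec, conversion in fields:
    # The JSON key ends up as the field name, and everything after the
    # first ':' is treated as a format spec rather than as plain text.
    print(repr(field_name), repr(format_spec), repr(conversion))
```

The JSON key is taken as the field name and the JSON value as a format specification, which is not what the data means.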
The recursion error comes from the multiple nested {...} placeholders inside placeholders inside placeholders; str.format() and str.format_map() support only a very shallow depth of nesting (a placeholder inside a format specification works, but one level deeper does not):
>>> '{foo:{baz: {ham}}}'.format_map({'foo': 'bar', 'baz': 's', 'ham': 's'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Max string recursion exceeded
but there are other problems here:
- The : colon denotes a formatting specification, which is then passed to the object (looked up by key) from the part before the :. You’d have to give your __missing__ return values a wrapper object with a __format__ method if you wanted to recover those.
- Field names with . or [...] in them have special meaning too; "bonjour.so" will be parsed as the "bonjour key, with a so" attribute. Ditto for [...] in the field name, but for item lookups.
Those last two can be approached by returning a wrapper object with __format__, __getitem__ and __getattr__ methods, but only in limited cases:
>>> class FormatWrapper:
...     def __init__(self, v):
...         self.v = v
...     def __format__(self, spec):
...         return '{{{}{}}}'.format(self.v, (':' + spec) if spec else '')
...     def __getitem__(self, key):
...         return FormatWrapper('{}[{}]'.format(self.v, key))
...     def __getattr__(self, attr):
...         return FormatWrapper('{}.{}'.format(self.v, attr))
...
>>> class MissingDict(dict):
...     def __missing__(self, key):
...         return FormatWrapper(key)
...
>>> '{"foo.com": "bar[baz]", "ham": "eggs"}'.format_map(MissingDict())
'{"foo.com": "bar[baz]", "ham": "eggs"}'
>>> '{"foo .com": "bar [ baz ]", "ham": "eggs"}'.format_map(MissingDict())
'{"foo .com": "bar [ baz ]", "ham": "eggs"}'
This fails for 'empty' attributes:
>>> '{"foo...com": "bar[baz]", "ham": "eggs"}'.format_map(MissingDict())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Empty attribute in format string
In short, the format syntax makes too many assumptions about what is contained inside {...} curly braces, assumptions that JSON data easily breaks.
I suggest you look at using the string.Template class instead, a simpler templating system that can be subclassed; the default is to look for and replace $identifier strings. The Template.safe_substitute() method does exactly what you want: it replaces known $identifier placeholders, but leaves unknown names untouched.
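As a rough sketch of what that could look like, here is the question's format_dict rewritten around string.Template. It assumes the placeholders in the JSON strings are written as $name rather than {name}; the sample node and the names libname and unknown are invented for illustration.

```python
from string import Template


def format_dict(node, template_values):
    """Substitute $name placeholders in every string value of node, in place."""
    for key, item in node.items():
        if isinstance(item, str):
            # safe_substitute() leaves placeholders it has no value for
            # untouched instead of raising KeyError.
            node[key] = Template(item).safe_substitute(template_values)
    return node


# Hypothetical input: a JSON document stored as a string value.
node = {
    "config": '{"libraries":[{"file": "$libname.so", "modules":[{"name": "$unknown"}]}]}'
}
format_dict(node, {"libname": "bonjour"})
print(node["config"])
```

Braces, quotes, colons and dots in the JSON text are left alone, because Template only reacts to the $ delimiter.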
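import ast

my_dict = {'outer_key': {"inner1_k1": "iv_some_string_{xyz}"}, "inner1_k2": {'inner2_k2': '{abc}'}}
s = str(my_dict)  # turn the whole dict into its repr string
maps = {'{xyz}': 'is_cool', '{abc}': 123}
for k, v in maps.items():
    s = s.replace(k, str(v))  # plain text replacement, no format parsing
my_dict = ast.literal_eval(s)  # parse the repr back into a dict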
- This works if you are okay with the replaced values ending up as strings in the resulting dict.
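A variant of the same replace-based idea, sketched here as an assumption rather than as part of the answer above, walks the nested structure directly instead of round-tripping through str() and ast.literal_eval(). The helper name replace_placeholders is hypothetical.

```python
def replace_placeholders(node, maps):
    """Return a copy of node with every placeholder in maps replaced in strings."""
    if isinstance(node, dict):
        return {k: replace_placeholders(v, maps) for k, v in node.items()}
    if isinstance(node, list):
        return [replace_placeholders(v, maps) for v in node]
    if isinstance(node, str):
        for placeholder, value in maps.items():
            # Plain substring replacement; replaced values become text.
            node = node.replace(placeholder, str(value))
        return node
    return node  # numbers, booleans and None pass through unchanged


my_dict = {'outer_key': {"inner1_k1": "iv_some_string_{xyz}"}, "inner1_k2": {'inner2_k2': '{abc}'}}
maps = {'{xyz}': 'is_cool', '{abc}': 123}
result = replace_placeholders(my_dict, maps)
print(result)
```

Only string values are touched, so quotes or braces inside them cannot break a later parsing step.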