Convert a String representation of a Dictionary to a dictionary
Question:
How can I convert the str
representation of a dict
, such as the following string, into a dict
?
s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"
I prefer not to use eval
. What else can I use?
The main reason for this, is one of my coworkers classes he wrote, converts all input into strings. I’m not in the mood to go and modify his classes, to deal with this issue.
Answers:
If the string can always be trusted, you could use eval
(or use literal_eval
as suggested; it’s safe no matter what the string is.) Otherwise you need a parser. A JSON parser (such as simplejson) would work if he only ever stores content that fits with the JSON scheme.
You can use the built-in ast.literal_eval
:
>>> import ast
>>> ast.literal_eval("{'muffin' : 'lolz', 'foo' : 'kitty'}")
{'muffin': 'lolz', 'foo': 'kitty'}
This is safer than using eval
. As its own docs say:
>>> help(ast.literal_eval)
Help on function literal_eval in module ast:
literal_eval(node_or_string)
Safely evaluate an expression node or a string containing a Python
expression. The string or node provided may only consist of the following
Python literal structures: strings, numbers, tuples, lists, dicts, booleans,
and None.
For example:
>>> eval("shutil.rmtree('mongo')")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 1, in <module>
File "/opt/Python-2.6.1/lib/python2.6/shutil.py", line 208, in rmtree
onerror(os.listdir, path, sys.exc_info())
File "/opt/Python-2.6.1/lib/python2.6/shutil.py", line 206, in rmtree
names = os.listdir(path)
OSError: [Errno 2] No such file or directory: 'mongo'
>>> ast.literal_eval("shutil.rmtree('mongo')")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/Python-2.6.1/lib/python2.6/ast.py", line 68, in literal_eval
return _convert(node_or_string)
File "/opt/Python-2.6.1/lib/python2.6/ast.py", line 67, in _convert
raise ValueError('malformed string')
ValueError: malformed string
https://docs.python.org/library/json.html
JSON can solve this problem, though its decoder wants double quotes around keys and values. If you don’t mind a replace hack…
import json
s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"
json_acceptable_string = s.replace("'", """)
d = json.loads(json_acceptable_string)
# d = {u'muffin': u'lolz', u'foo': u'kitty'}
NOTE that if you have single quotes as a part of your keys or values this will fail due to improper character replacement. This solution is only recommended if you have a strong aversion to the eval solution.
More about json single quote: jQuery.parseJSON throws “Invalid JSON” error due to escaped single quote in JSON
using json.loads
:
>>> import json
>>> h = '{"foo":"bar", "foo2":"bar2"}'
>>> d = json.loads(h)
>>> d
{u'foo': u'bar', u'foo2': u'bar2'}
>>> type(d)
<type 'dict'>
Use json
. the ast
library consumes a lot of memory and and slower. I have a process that needs to read a text file of 156Mb. Ast
with 5 minutes delay for the conversion dictionary json
and 1 minutes using 60% less memory!
To OP’s example:
s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"
We can use Yaml to deal with this kind of non-standard json in string:
>>> import yaml
>>> s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"
>>> s
"{'muffin' : 'lolz', 'foo' : 'kitty'}"
>>> yaml.load(s)
{'muffin': 'lolz', 'foo': 'kitty'}
string = "{'server1':'value','server2':'value'}"
#Now removing { and }
s = string.replace("{" ,"")
finalstring = s.replace("}" , "")
#Splitting the string based on , we get key value pairs
list = finalstring.split(",")
dictionary ={}
for i in list:
#Get Key Value pairs separately to store in dictionary
keyvalue = i.split(":")
#Replacing the single quotes in the leading.
m= keyvalue[0].strip(''')
m = m.replace(""", "")
dictionary[m] = keyvalue[1].strip('"'')
print dictionary
no any libs are used (python2):
dict_format_string = "{'1':'one', '2' : 'two'}"
d = {}
elems = filter(str.isalnum,dict_format_string.split("'"))
values = elems[1::2]
keys = elems[0::2]
d.update(zip(keys,values))
NOTE: As it has hardcoded split("'")
will work only for strings where data is "single quoted".
NOTE2: In python3 you need to wrap filter()
to list()
to get list.
To summarize:
import ast, yaml, json, timeit
descs=['short string','long string']
strings=['{"809001":2,"848545":2,"565828":1}','{"2979":1,"30581":1,"7296":1,"127256":1,"18803":2,"41619":1,"41312":1,"16837":1,"7253":1,"70075":1,"3453":1,"4126":1,"23599":1,"11465":3,"19172":1,"4019":1,"4775":1,"64225":1,"3235":2,"15593":1,"7528":1,"176840":1,"40022":1,"152854":1,"9878":1,"16156":1,"6512":1,"4138":1,"11090":1,"12259":1,"4934":1,"65581":1,"9747":2,"18290":1,"107981":1,"459762":1,"23177":1,"23246":1,"3591":1,"3671":1,"5767":1,"3930":1,"89507":2,"19293":1,"92797":1,"32444":2,"70089":1,"46549":1,"30988":1,"4613":1,"14042":1,"26298":1,"222972":1,"2982":1,"3932":1,"11134":1,"3084":1,"6516":1,"486617":1,"14475":2,"2127":1,"51359":1,"2662":1,"4121":1,"53848":2,"552967":1,"204081":1,"5675":2,"32433":1,"92448":1}']
funcs=[json.loads,eval,ast.literal_eval,yaml.load]
for desc,string in zip(descs,strings):
print('***',desc,'***')
print('')
for func in funcs:
print(func.__module__+' '+func.__name__+':')
%timeit func(string)
print('')
Results:
*** short string ***
json loads:
4.47 µs ± 33.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
builtins eval:
24.1 µs ± 163 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
ast literal_eval:
30.4 µs ± 299 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
yaml load:
504 µs ± 1.29 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
*** long string ***
json loads:
29.6 µs ± 230 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
builtins eval:
219 µs ± 3.92 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
ast literal_eval:
331 µs ± 1.89 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
yaml load:
9.02 ms ± 92.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Conclusion:
prefer json.loads
Optimized code of Siva Kameswara Rao Munipalle
s = s.replace("{", "").replace("}", "").split(",")
dictionary = {}
for i in s:
dictionary[i.split(":")[0].strip(''').replace(""", "")] = i.split(":")[1].strip('"'')
print(dictionary)
My string didn’t have quotes inside:
s = 'Date: 2022-11-29T10:57:01.024Z, Size: 910.11 KB'
My solution was to use str.split
:
{k:v for k, v in map(lambda d: d.split(': '), s.split(', '))}
How can I convert the str
representation of a dict
, such as the following string, into a dict
?
s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"
I prefer not to use eval
. What else can I use?
The main reason for this, is one of my coworkers classes he wrote, converts all input into strings. I’m not in the mood to go and modify his classes, to deal with this issue.
If the string can always be trusted, you could use eval
(or use literal_eval
as suggested; it’s safe no matter what the string is.) Otherwise you need a parser. A JSON parser (such as simplejson) would work if he only ever stores content that fits with the JSON scheme.
You can use the built-in ast.literal_eval
:
>>> import ast
>>> ast.literal_eval("{'muffin' : 'lolz', 'foo' : 'kitty'}")
{'muffin': 'lolz', 'foo': 'kitty'}
This is safer than using eval
. As its own docs say:
>>> help(ast.literal_eval) Help on function literal_eval in module ast: literal_eval(node_or_string) Safely evaluate an expression node or a string containing a Python expression. The string or node provided may only consist of the following Python literal structures: strings, numbers, tuples, lists, dicts, booleans, and None.
For example:
>>> eval("shutil.rmtree('mongo')")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 1, in <module>
File "/opt/Python-2.6.1/lib/python2.6/shutil.py", line 208, in rmtree
onerror(os.listdir, path, sys.exc_info())
File "/opt/Python-2.6.1/lib/python2.6/shutil.py", line 206, in rmtree
names = os.listdir(path)
OSError: [Errno 2] No such file or directory: 'mongo'
>>> ast.literal_eval("shutil.rmtree('mongo')")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/Python-2.6.1/lib/python2.6/ast.py", line 68, in literal_eval
return _convert(node_or_string)
File "/opt/Python-2.6.1/lib/python2.6/ast.py", line 67, in _convert
raise ValueError('malformed string')
ValueError: malformed string
https://docs.python.org/library/json.html
JSON can solve this problem, though its decoder wants double quotes around keys and values. If you don’t mind a replace hack…
import json
s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"
json_acceptable_string = s.replace("'", """)
d = json.loads(json_acceptable_string)
# d = {u'muffin': u'lolz', u'foo': u'kitty'}
NOTE that if you have single quotes as a part of your keys or values this will fail due to improper character replacement. This solution is only recommended if you have a strong aversion to the eval solution.
More about json single quote: jQuery.parseJSON throws “Invalid JSON” error due to escaped single quote in JSON
using json.loads
:
>>> import json
>>> h = '{"foo":"bar", "foo2":"bar2"}'
>>> d = json.loads(h)
>>> d
{u'foo': u'bar', u'foo2': u'bar2'}
>>> type(d)
<type 'dict'>
Use json
. the ast
library consumes a lot of memory and and slower. I have a process that needs to read a text file of 156Mb. Ast
with 5 minutes delay for the conversion dictionary json
and 1 minutes using 60% less memory!
To OP’s example:
s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"
We can use Yaml to deal with this kind of non-standard json in string:
>>> import yaml
>>> s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"
>>> s
"{'muffin' : 'lolz', 'foo' : 'kitty'}"
>>> yaml.load(s)
{'muffin': 'lolz', 'foo': 'kitty'}
string = "{'server1':'value','server2':'value'}"
#Now removing { and }
s = string.replace("{" ,"")
finalstring = s.replace("}" , "")
#Splitting the string based on , we get key value pairs
list = finalstring.split(",")
dictionary ={}
for i in list:
#Get Key Value pairs separately to store in dictionary
keyvalue = i.split(":")
#Replacing the single quotes in the leading.
m= keyvalue[0].strip(''')
m = m.replace(""", "")
dictionary[m] = keyvalue[1].strip('"'')
print dictionary
no any libs are used (python2):
dict_format_string = "{'1':'one', '2' : 'two'}"
d = {}
elems = filter(str.isalnum,dict_format_string.split("'"))
values = elems[1::2]
keys = elems[0::2]
d.update(zip(keys,values))
NOTE: As it has hardcoded split("'")
will work only for strings where data is "single quoted".
NOTE2: In python3 you need to wrap filter()
to list()
to get list.
To summarize:
import ast, yaml, json, timeit
descs=['short string','long string']
strings=['{"809001":2,"848545":2,"565828":1}','{"2979":1,"30581":1,"7296":1,"127256":1,"18803":2,"41619":1,"41312":1,"16837":1,"7253":1,"70075":1,"3453":1,"4126":1,"23599":1,"11465":3,"19172":1,"4019":1,"4775":1,"64225":1,"3235":2,"15593":1,"7528":1,"176840":1,"40022":1,"152854":1,"9878":1,"16156":1,"6512":1,"4138":1,"11090":1,"12259":1,"4934":1,"65581":1,"9747":2,"18290":1,"107981":1,"459762":1,"23177":1,"23246":1,"3591":1,"3671":1,"5767":1,"3930":1,"89507":2,"19293":1,"92797":1,"32444":2,"70089":1,"46549":1,"30988":1,"4613":1,"14042":1,"26298":1,"222972":1,"2982":1,"3932":1,"11134":1,"3084":1,"6516":1,"486617":1,"14475":2,"2127":1,"51359":1,"2662":1,"4121":1,"53848":2,"552967":1,"204081":1,"5675":2,"32433":1,"92448":1}']
funcs=[json.loads,eval,ast.literal_eval,yaml.load]
for desc,string in zip(descs,strings):
print('***',desc,'***')
print('')
for func in funcs:
print(func.__module__+' '+func.__name__+':')
%timeit func(string)
print('')
Results:
*** short string ***
json loads:
4.47 µs ± 33.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
builtins eval:
24.1 µs ± 163 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
ast literal_eval:
30.4 µs ± 299 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
yaml load:
504 µs ± 1.29 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
*** long string ***
json loads:
29.6 µs ± 230 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
builtins eval:
219 µs ± 3.92 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
ast literal_eval:
331 µs ± 1.89 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
yaml load:
9.02 ms ± 92.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Conclusion:
prefer json.loads
Optimized code of Siva Kameswara Rao Munipalle
s = s.replace("{", "").replace("}", "").split(",")
dictionary = {}
for i in s:
dictionary[i.split(":")[0].strip(''').replace(""", "")] = i.split(":")[1].strip('"'')
print(dictionary)
My string didn’t have quotes inside:
s = 'Date: 2022-11-29T10:57:01.024Z, Size: 910.11 KB'
My solution was to use str.split
:
{k:v for k, v in map(lambda d: d.split(': '), s.split(', '))}