Python dict.update vs. subscript to add a single key/value pair
Question:
Every semester I have at least one Python student who uses dict.update()
to add a single key/value pair, viz.:
mydict.update({'newkey':'newvalue'})
instead of
mydict['newkey'] = 'newvalue'
I don’t teach this method and I don’t know where they’re finding examples of it, but I tell them not to do it because it’s less efficient (presumably it creates a new, temporary one-item dict) and because it’s nonstandard.
Honestly, I can understand the desire to use a visible method rather than this syntax – it perhaps feels more consistent with other method calls. But I think it looks like a newbie approach.
Is there any wisdom anyone has to offer on this point?
Answers:
Updating the key directly is thrice as fast, but YMMV:
$ python -m timeit 'd={"k":1}; d.update({"k":2})'
1000000 loops, best of 3: 0.669 usec per loop
$ python -m timeit 'd={"k":1}; d["k"] = 2'
1000000 loops, best of 3: 0.212 usec per loop
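The same comparison can also be run from inside Python with the standard timeit module. This is only a minimal sketch: the helper name time_stmt and the loop count are assumptions chosen for illustration, the dict is created in the setup step so that only the assignment is timed, and absolute numbers will differ by machine and Python version.

```python
import timeit

def time_stmt(stmt, setup="d = {'k': 1}", number=200_000):
    """Return best-of-3 seconds per loop for stmt (hypothetical helper)."""
    best = min(timeit.repeat(stmt, setup=setup, repeat=3, number=number))
    return best / number

subscript = time_stmt("d['k'] = 2")
update = time_stmt("d.update({'k': 2})")

# Report in microseconds per loop, like the command-line tool does.
print(f"subscript: {subscript * 1e6:.3f} usec per loop")
print(f"update:    {update * 1e6:.3f} usec per loop")
```

Taking the minimum of several repeats is the usual way to reduce noise from other processes on the machine.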
There are people who feel that []= is not a valid syntactic element in an object-oriented language, or any other for that matter. I remember hearing this argument decades ago when I worked in APL language development. That syntax is a holdover from Fortran…
I don’t personally subscribe to that view and am quite happy with indexed assignment. But there are those that would claim that a real method call is better. And of course it’s always good to have more than one solution.
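On the "real method call" point, it may help to know that indexed assignment in Python is itself dispatched to a method: d[k] = v calls __setitem__ on the object's type. The sketch below uses a hypothetical LoggingDict subclass (not part of any library) purely to make that call visible.

```python
class LoggingDict(dict):
    """Hypothetical dict subclass that records every __setitem__ call."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.calls = []

    def __setitem__(self, key, value):
        # Record the call, then delegate to the normal dict behaviour.
        self.calls.append((key, value))
        super().__setitem__(key, value)

d = LoggingDict()
d['newkey'] = 'newvalue'          # subscript syntax
d.__setitem__('other', 'value')   # explicit method call, same effect

print(d.calls)
print(dict(d))
```

Both lines go through the same method, so the subscript form is syntax for a method call rather than something outside the object model.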
Edit:
The real issue here is readability, not performance. Indexed assignment has endured because many people find it easier to read, even if it is less theoretically correct.
A benchmark suggests that your suspicion about the performance impact is correct:
$ python -m timeit -s 'd = {"key": "value"}' 'd["key"] = "value"'
10000000 loops, best of 3: 0.0741 usec per loop
$ python -m timeit -s 'd = {"key": "value"}' 'd.update(key="value")'
1000000 loops, best of 3: 0.294 usec per loop
$ python -m timeit -s 'd = {"key": "value"}' 'd.update({"key": "value"})'
1000000 loops, best of 3: 0.461 usec per loop
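One way to see where the extra cost comes from is to compare the bytecode of the two forms. The following is a sketch using the standard dis module and assumes CPython; opcode names vary between versions, so it only looks for the general pattern. The helper name opnames is made up for this example.

```python
import dis

def opnames(source):
    """Return the opcode names for a compiled statement (hypothetical helper)."""
    code = compile(source, "<example>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]

subscript_ops = opnames('d["key"] = "value"')
update_ops = opnames('d.update({"key": "value"})')

def builds_map(ops):
    # Matches BUILD_MAP and BUILD_CONST_KEY_MAP, depending on the version.
    return any(op.startswith("BUILD_") and "MAP" in op for op in ops)

# Subscript assignment compiles to a direct store into the container.
print("STORE_SUBSCR" in subscript_ops)
# The update form first builds a temporary dict, then makes a method call.
print(builds_map(update_ops))
```

The subscript form needs no temporary dict and no attribute lookup for the method, which is consistent with the timings above.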
That is, update() with a dict literal is about six times slower than subscript assignment on my machine, and about four times slower with a keyword argument. However, Python is already not a language you’d use if you need top performance, so I’d just recommend using whatever is most readable in the situation. For many things, that would be the [] way, though update could be more readable in a situation like this:
configuration.update(
    timeout=60,
    host='example.com',
)
…or something like that.
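To make the multi-key case concrete, here is a small sketch. The configuration name and the timeout and host keys come from the snippet above; the starting values, the port key, and the retry-count key are assumptions added for illustration. Note that the keyword form only works for keys that are valid Python identifiers; any other key has to be passed in a mapping (or an iterable of pairs).

```python
# Starting values are made up for this example.
configuration = {'timeout': 30, 'port': 8080}

# One call replaces several separate subscript assignments.
configuration.update(
    timeout=60,
    host='example.com',
)
print(configuration)

# 'retry-count' is not a valid identifier, so it cannot be a keyword
# argument; pass a mapping instead.
configuration.update({'retry-count': 3})
print(configuration)
```

For a single key, configuration['timeout'] = 60 says the same thing with less machinery.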