I looked through the official documentation to find the difference between json.dump() and json.dumps() in Python. It is clear that the difference has to do with writing to a file.
But what exactly is the difference between them, and in which situations is one preferable to the other?
There isn't much to add beyond what the docs say. If you want to write the JSON to a file, a socket, or any other file-like object, use dump(). If you only need it as a string (for printing, further processing, and so on), use dumps() ("dump string").
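To make the distinction concrete, here is a minimal Python 3 sketch. The `data` dictionary is made up for illustration, and an `io.StringIO` buffer stands in for an open file so the example runs without touching the disk.

```python
import io
import json

data = {"name": "example", "values": [1, 2, 3]}  # illustrative data

# dumps() returns the JSON text as a str.
text = json.dumps(data)

# dump() writes the JSON text to a file-like object and returns None.
buffer = io.StringIO()  # stands in for an open text file
result = json.dump(data, buffer)

print(text)
print(buffer.getvalue() == text)
```

With a real file you would pass the object returned by `open(path, "w")` instead of the buffer.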
As mentioned by Antti Haapala in this answer, there are some minor differences in the ensure_ascii behaviour. This is mostly due to how the underlying write() function works: it operates on chunks rather than on the whole string. Check his answer for more details on that.
The documentation for dump() says:
Serialize obj as a JSON formatted stream to fp (a .write()-supporting file-like object).
If ensure_ascii is False, some chunks written to fp may be unicode instances.
And for dumps():
Serialize obj to a JSON formatted str.
If ensure_ascii is False, the result may contain non-ASCII characters and the return value may be a unicode instance.
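The effect of ensure_ascii on the output text can be seen with a short sketch. It assumes Python 3, where dumps() always returns a str; the mention of unicode instances in the quoted text applies to Python 2. The `data` value is made up for illustration.

```python
import json

data = {"city": "Zürich"}  # illustrative data with one non-ASCII character

# ensure_ascii=True is the default: non-ASCII characters are escaped.
escaped = json.dumps(data)

# ensure_ascii=False keeps the characters as they are.
literal = json.dumps(data, ensure_ascii=False)

print(escaped)  # {"city": "Z\u00fcrich"}
print(literal)  # {"city": "Zürich"}
```

Both strings decode back to the same object, so the choice only affects how the text looks and which encodings can store it.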
The functions ending in s work with strings: dumps() returns a string and loads() takes one. The others take file streams (file-like objects) instead.
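The same split applies on the reading side. A minimal sketch, again using an `io.StringIO` buffer in place of a real file and a made-up JSON text:

```python
import io
import json

text = '{"a": 1, "b": [true, null]}'  # illustrative JSON text

# loads() parses a string.
from_string = json.loads(text)

# load() parses the contents of a file-like object.
stream = io.StringIO(text)  # stands in for an open text file
from_stream = json.load(stream)

print(from_string == from_stream)
```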
One notable difference in Python 2 is that if you're using ensure_ascii=False, dump will properly write UTF-8 encoded data into the file (unless you used 8-bit strings with extended characters that are not UTF-8).
dumps, on the other hand, with ensure_ascii=False can produce either a str or a unicode, depending only on what types you used for strings. As the documentation puts it:
Serialize obj to a JSON formatted str using this conversion table. If ensure_ascii is False, the result may contain non-ASCII characters and the return value may be a unicode instance.
(emphasis mine). Note that it may still be a str instance as well.
Thus you cannot use its return value to save the structure into a file without checking which type was returned and possibly encoding it yourself.
This is of course no longer a concern in Python 3, which does not have this 8-bit/Unicode confusion.
load treats the whole file as a single JSON document, so you cannot use it to read multiple newline-delimited JSON documents from a single file.
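One common workaround, sketched here as an assumption rather than something the answer itself proposes, is to read the file line by line and pass each line to json.loads(). The helper name `read_documents` and the sample content are hypothetical.

```python
import io
import json

# Illustrative newline-delimited content; a real file object works the same way.
ndjson = io.StringIO('{"id": 1}\n{"id": 2}\n\n{"id": 3}\n')

def read_documents(fp):
    """Yield one parsed object per non-blank line of fp (hypothetical helper)."""
    for line in fp:
        line = line.strip()
        if line:
            yield json.loads(line)

documents = list(read_documents(ndjson))
print(documents)  # [{'id': 1}, {'id': 2}, {'id': 3}]

# json.load() on the same kind of content rejects everything after the first document.
load_failed = False
try:
    json.load(io.StringIO('{"id": 1}\n{"id": 2}\n'))
except json.JSONDecodeError:
    load_failed = True
print(load_failed)  # True
```

This only works when each document fits on one line; pretty-printed documents spanning several lines would need a different approach.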
The difference is in memory usage and speed.
When you call jsonstr = json.dumps(mydata), it first builds a full serialized copy of your data in memory, and only then do you file.write(jsonstr) it to disk. This is the faster method, but it can be a problem if you have a large amount of data to save.
When you call json.dump(mydata, file) (without the "s"), that extra memory is not needed, because the data is written out in chunks. But the whole process is about 2 times slower.
Source: I checked the source code of json.dumps() and also tested both variants, measuring the time with time.time() and watching the memory usage in htop.
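The chunk-wise behaviour can be observed with a small sketch. `CountingWriter` is a hypothetical file-like class written only for this illustration; it records every write() call that json.dump() makes. How many chunks you see depends on the Python version and on whether the C-accelerated encoder is in use, so no specific count is claimed here.

```python
import json

class CountingWriter:
    """Hypothetical file-like object that records each write() call."""

    def __init__(self):
        self.chunks = []

    def write(self, chunk):
        self.chunks.append(chunk)
        return len(chunk)

mydata = {"numbers": list(range(5)), "label": "demo"}  # illustrative data

# dump() hands the output to write() piece by piece.
writer = CountingWriter()
json.dump(mydata, writer)

# dumps() builds the complete string before returning it.
jsonstr = json.dumps(mydata)

print(len(writer.chunks), "write() call(s)")
print("".join(writer.chunks) == jsonstr)
```

Either way the resulting text is the same; only the way it reaches the file differs.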