Insert documents into MongoDB using the `insert_many` method of PyMongo, ignoring "InvalidDocument"

Question:

I have an extremely large data set (extracted from nginx log files), and some of the document keys contain a dot (.), which can lead to an InvalidDocument error.

Instead of filtering out these invalid documents or replacing the dots inside the keys, I would prefer to just ignore these documents. Is there any way to ignore the invalid documents when calling insert_many with pymongo?

Asked By: Woods Chen


Answers:

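Normally you can “ignore” errors in an insert_many call by passing ordered=False, which lets the remaining documents be attempted after a failed write. However, this does not help with an invalid document, apparently by design: the InvalidDocument exception is raised by the driver while it encodes the documents, rather than being reported by the server as a write error.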

You can, however, do something like this:

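import pymongo
import bson.errors

# Connects to a MongoDB server on localhost with the default port
db = pymongo.MongoClient()['mydatabase']

data_to_load = [{"ok": 1},
                {"ok": 2},
                {"not.ok": 3},  # the dot in this key makes the document invalid
                {"ok": 4},
                {"ok": 5}]

# Insert one document at a time so a bad document only affects itself
for item in data_to_load:
    try:
        db.testdata.insert_one(item)
    except bson.errors.InvalidDocument:
        # Skip the invalid document and carry on with the next one
        pass

# Print what was stored, leaving out the generated _id field
for item in db.testdata.find({}, {'_id': 0}):
    print(item)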

Result:

{'ok': 1}
{'ok': 2}
{'ok': 4}
{'ok': 5}
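
For an extremely large data set, one round trip per document may be slow. A possible alternative, sketched below, is to drop the offending documents on the client first and then send the rest in bulk. This is only a sketch under assumptions: the helper names `has_invalid_key` and `valid_documents` are hypothetical, and the check assumes the only problem is keys that are not strings, contain '.', or start with '$'. Which keys are actually rejected depends on the PyMongo and server versions in use.

```python
from collections.abc import Mapping


def has_invalid_key(value):
    """Return True if any key, at any nesting depth, is not a string,
    contains '.', or starts with '$' (assumed rejection rules)."""
    if isinstance(value, Mapping):
        for key, child in value.items():
            if not isinstance(key, str) or "." in key or key.startswith("$"):
                return True
            if has_invalid_key(child):
                return True
        return False
    if isinstance(value, (list, tuple)):
        # Embedded documents can also sit inside arrays
        return any(has_invalid_key(child) for child in value)
    return False


def valid_documents(documents):
    """Yield only the documents that pass the key check above."""
    for doc in documents:
        if not has_invalid_key(doc):
            yield doc


data_to_load = [{"ok": 1},
                {"ok": 2},
                {"not.ok": 3},
                {"ok": 4},
                {"nested": {"a.b": 5}}]

clean = list(valid_documents(data_to_load))
print(clean)  # [{'ok': 1}, {'ok': 2}, {'ok': 4}]

# With a live connection, the filtered list could then go through one bulk call:
# db.testdata.insert_many(clean, ordered=False)
```

Because `valid_documents` is a generator, it can also be consumed in chunks when the full data set does not fit in memory. The try/except loop above remains the simpler option when the exact rejection rules are not known.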
Answered By: Belly Buster