PyMongo not inserting full document on update_one

Question:

I have a lot of documents to update. I want to write an insert timestamp when a document is first written, and an update timestamp when a duplicate is found later.
So I found this answer and am attempting it for MongoDB 6.0
https://stackoverflow.com/a/17533368/3300927

My model also stores, as searchable, the name of the field to use when looking for duplicates.

If a query has no searchable, I insert the results without checking for duplicates, then add a timestamp to each inserted document:

data_inserted = collection.insert_many(results)
for doc_id in data_inserted.inserted_ids:
    collection.update_many(
        filter={'_id': doc_id},
        update={'$set': {'insert_date': now, }, },
        upsert=True)

No issues there:

{
  "_id": {
    "$oid": "321654987654"
  },
  "IR NUMBER": "ABC784",
  "Plate": " ",
  "Plate State": " ",
  "Make": "TOYOTA",
  "Model": "TACOMA",
  "Style": "  ",
  "Color": "SIL /    ",
  "Year": "2008",
  "insert_date": {
    "$date": {
      "$numberLong": "1660000808176"
    }
  }
}

If there is a searchable, I upsert on that field. What ends up in MongoDB, however, is only the searchable field and the timestamp, not the full document:

# q_statement.searchable == 'IR NUMBER'

for document in results:
    collection.update_one(
        filter={q_statement.searchable: document[q_statement.searchable], },
        update={'$setOnInsert': {'insert_date': now, }, '$set': {'update_date': now, }},
        upsert=True)

result:

{
  "_id": {
    "$oid": "62f19d981aa321654987"
  },
  "IR NUMBER": "ABC784",
  "insert_date": {
    "$date": {
      "$numberLong": "1660001688126"
    }
  }
}

EDIT

Inspecting the pymongo.results.UpdateResult, by changing the loop body to updates = collection.update_one(...) followed by print(updates.raw_result), shows about 10k results like:

  {
    "n": 1,
    "upserted": ObjectId("62f27ae21aa62fbfa734f01d"),
    "nModified": 0,
    "ok": 1.0,
    "updatedExisting": False
  },
  {
    "n": 1,
    "nModified": 0,
    "ok": 1.0,
    "updatedExisting": True
  },
  {
    "n": 1,
    "nModified": 0,
    "ok": 1.0,
    "updatedExisting": True
  }

(python==3.10.3, Django==4.0.4, pymongo==4.2.0)

Asked By: Bammer

Answers:

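To "upsert" a full document and additional fields using Python, you can use MongoDB's "$setOnInsert" with a merged Python dictionary. When an upsert inserts, MongoDB builds the new document from the equality fields in the filter plus whatever the update operators set. That is why only the searchable field and the timestamp were written: the rest of document was never passed in the update.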

From the Python library docs, here's how you merge dictionaries with the | operator (available in Python 3.9 and later). It's similar to MongoDB's "$mergeObjects".

d | other

Create a new dictionary with the merged keys and values of d and other,
which must both be dictionaries. The values of other take priority
when d and other share keys.
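
As a quick illustration of the merge behaviour quoted above, here is a small sketch. The sample values are made up for the example; only the field names are borrowed from the question.

```python
from datetime import datetime, timezone

# Illustrative values only; field names are borrowed from the question.
document = {"IR NUMBER": "ABC784", "Make": "TOYOTA", "Model": "TACOMA"}
now = datetime(2022, 8, 9, tzinfo=timezone.utc)

# d | other builds a new dict; neither operand is modified.
merged = document | {"insert_date": now}

# When both sides share a key, the right-hand value wins.
overridden = {"Make": "TOYOTA", "Year": "2008"} | {"Year": "2009"}

print(merged)
print(overridden)
```

Because the merge returns a new dictionary, the original document is left without an insert_date key.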

So, to insert the full document, using your python code, it just needs a minor addition – merge document with your other object.

...
update={'$setOnInsert': document | {'insert_date': now}, '$set': {'update_date': now, }}
...
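
Putting it together, the update argument could be built by a small helper like the one below. This is only a sketch: build_upsert_update is a hypothetical name, not part of PyMongo, and the sample document is invented. It also drops any update_date key from the document, on the assumption that such a key could be present, because MongoDB rejects an update in which two operators target the same field.

```python
from datetime import datetime, timezone


def build_upsert_update(document, now):
    """Build the update argument for update_one(..., upsert=True).

    Hypothetical helper for illustration; not part of PyMongo.
    """
    # Drop keys that '$set' already writes: MongoDB rejects an update in
    # which two operators target the same field.
    on_insert = {k: v for k, v in document.items() if k != "update_date"}
    return {
        # Applied only when the upsert inserts a new document.
        "$setOnInsert": on_insert | {"insert_date": now},
        # Applied whether the upsert inserts or updates.
        "$set": {"update_date": now},
    }


now = datetime(2022, 8, 9, tzinfo=timezone.utc)
document = {"IR NUMBER": "ABC784", "Make": "TOYOTA", "Model": "TACOMA"}
update = build_upsert_update(document, now)
print(update)

# With a live collection (assumed, as in the question), the loop would become:
# for document in results:
#     collection.update_one(
#         filter={q_statement.searchable: document[q_statement.searchable]},
#         update=build_upsert_update(document, now),
#         upsert=True)
```

Note that with this shape, the other fields of an already-existing document are left untouched on a duplicate; only update_date changes.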
Answered By: rickhg12hs