MongoDB using JavaScript Pattern with $in operator
Question:
Trying to create a MongoDB query with the PyMongo driver that searches a database for some dog breeds, but I need to do pattern matching because the data is of differing quality. I know I cannot use Regex in a $in style query so I am using JS patterns.
I have this so far:
df = shelter.getRecords({
"breed": {"$in": [/labrador/i, /chesa/i, /newfound/i]}
})
But I get syntax errors. The command seems to work in the Mongo shell. Is this a limitation of PyMongo?
Answers:
You could use "$regex" like this.
N.B.: Your PyMongo query filter could be a Python dictionary like {'breed': {'$regex': '(labrador)|(chesa)|(newfound)', '$options': 'i'}}.
db.shelter.find({
  "breed": {
    "$regex": "(labrador)|(chesa)|(newfound)",
    "$options": "i"
  }
})
Example output:
[
  {
    "_id": 21,
    "breed": "Chesapeake Retriever"
  },
  {
    "_id": 80,
    "breed": "Newfoundland"
  }
]
Try it on mongoplayground.net.
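For anyone who wants the same filter from Python, here is a minimal sketch that builds the "$regex" dictionary from a list of search terms. The helper name build_breed_filter is made up for illustration, and the sketch assumes the terms are plain substrings; re.escape uses Python's escaping rules, which should be acceptable to MongoDB's regex engine for simple terms like these, but that is an assumption rather than something tested against a server.

```python
import re


def build_breed_filter(terms):
    """Build a case-insensitive $regex filter matching any of the given terms.

    Hypothetical helper, not part of PyMongo.
    """
    # Escape each term so regex metacharacters are treated literally,
    # then join the terms into a single alternation.
    pattern = "|".join(re.escape(term) for term in terms)
    return {"breed": {"$regex": pattern, "$options": "i"}}


breed_filter = build_breed_filter(["labrador", "chesa", "newfound"])

# With a real PyMongo collection (assumed here to be named `collection`),
# the dictionary would be passed straight to find():
#     collection.find(breed_filter)
```

The alternation does not need the parentheses used in the shell example; "labrador|chesa|newfound" matches the same documents.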
In the end this was my hybrid approach…
import re
import pandas as pd

# Compile case-insensitive patterns; PyMongo sends these as BSON regular expressions
labRegex = re.compile(".*lab.*", re.IGNORECASE)
chesaRegex = re.compile(".*chesa.*", re.IGNORECASE)
newRegex = re.compile(".*newf.*", re.IGNORECASE)
df = pd.DataFrame.from_records(shelter.getRecordCriteria({
    '$or': [  # $regex expressions aren't allowed inside $in, so use $or
        {"breed": {'$regex': newRegex}},  # pass the compiled regex to the filter
        {"breed": {'$regex': chesaRegex}},
        {"breed": {'$regex': labRegex}},
    ],
    "sex_upon_outcome": "Intact Female",
    "age_upon_outcome_in_weeks": {"$gte": 26.0, "$lte": 156.0}
}))
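A note on the original syntax error: /labrador/i is a JavaScript regex literal, which the Mongo shell understands but Python does not, so the error comes from Python itself rather than from PyMongo. As far as I know, the restriction on "$in" only applies to "$regex" operator expressions; regular expression objects are allowed, and PyMongo encodes patterns from re.compile as BSON regular expressions. A rough sketch of the Python equivalent of the shell query, with a local check that needs no database (matches_locally is a made-up helper for illustration):

```python
import re

terms = ["labrador", "chesa", "newfound"]

# Python counterpart of the shell literals /labrador/i, /chesa/i, /newfound/i.
patterns = [re.compile(term, re.IGNORECASE) for term in terms]

# Filter dictionary that would be passed to a PyMongo find() call.
breed_filter = {"breed": {"$in": patterns}}


def matches_locally(breed):
    """Approximate the server-side match in plain Python (illustration only)."""
    return any(pattern.search(breed) for pattern in patterns)
```

Whether the asker's shelter.getRecords wrapper passes such a dictionary through unchanged is an assumption, since its implementation is not shown.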