Django: Filter matching objects using 2 fields
Question:
I have the following classes:
from django.db.models import CASCADE, CharField, ForeignKey, Model

class Event(Model):
    ...

class IOCType(Model):
    name = CharField(max_length=50)

class IOCInfo(Model):
    event = ForeignKey(Event, on_delete=CASCADE, related_name="iocs")
    ioc_type = ForeignKey(IOCType, on_delete=CASCADE)
    value = CharField(max_length=50)
Each event has one or several IOCs associated with it, which are stored in the IOCInfo table.
This is what my IOCInfo table looks like after creating some events:
id | value | event_id | ioc_type_id |
---|---|---|---|
1 | some-value1 | eventid1 | 4 |
2 | some-value2 | eventid1 | 8 |
3 | some-value3 | eventid1 | 8 |
4 | some-value4 | eventid1 | 1 |
5 | some-value3 | eventid2 | 8 |
6 | some-value1 | eventid2 | 1 |
7 | some-value2 | eventid3 | 8 |
8 | some-value3 | eventid4 | 8 |
What I want to do is to take an event, compare its IOCInfo with those of other events and get back those events that match.
This is what I have done so far. It works, but I fear that as the database grows and the app gains more users, this query will end up being a bottleneck:
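def search_matches(event):
    matches = Event.objects.none()
    for ioc in event.iocs.all():
        # OR each IOC's (type, value) condition into the combined queryset.
        matches |= Event.objects.filter(
            iocs__ioc_type=ioc.ioc_type, iocs__value=ioc.value
        )
    # Exclude the event itself (Event has no "event" field, so filter on pk).
    return matches.exclude(pk=event.pk)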
If I pass eventid2 to the above function, it returns eventid1 and eventid4, as expected.
Any suggestions for improving this function using any Django method would be appreciated.
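To make the expected result concrete, the matching rule can be sketched in plain Python over the sample rows above. This is only an illustration of the intended semantics (two events match when they share at least one (ioc_type_id, value) pair), not a replacement for the ORM query; `ROWS` and `expected_matches` are hypothetical names introduced here.

```python
# Sample rows from the IOCInfo table above: (id, value, event_id, ioc_type_id).
ROWS = [
    (1, "some-value1", "eventid1", 4),
    (2, "some-value2", "eventid1", 8),
    (3, "some-value3", "eventid1", 8),
    (4, "some-value4", "eventid1", 1),
    (5, "some-value3", "eventid2", 8),
    (6, "some-value1", "eventid2", 1),
    (7, "some-value2", "eventid3", 8),
    (8, "some-value3", "eventid4", 8),
]


def expected_matches(event_id, rows=ROWS):
    """Return ids of other events sharing at least one (ioc_type_id, value) pair."""
    wanted = {(type_id, value) for _, value, ev, type_id in rows if ev == event_id}
    return sorted(
        {
            ev
            for _, value, ev, type_id in rows
            if ev != event_id and (type_id, value) in wanted
        }
    )


print(expected_matches("eventid2"))  # ['eventid1', 'eventid4']
```

Any ORM-based rewrite of search_matches should agree with this reference on the sample data.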
Answers:
If I understand correctly:
1. Filter the IOCs that belong to the given event:
iocs = IOCInfo.objects.filter(event=event)
2. Get the events that have the same IOCs (excluding the given one):
events = Event.objects.filter(iocs__in=iocs).exclude(pk=event.pk)
That will result in a single, pretty efficient query.
UPD:
To compare the fields of each IOC rather than the IOCInfo rows themselves, replace point 2 with the following:
events = Event.objects.filter(
    iocs__ioc_type__in=iocs.values_list('ioc_type', flat=True),
    iocs__value__in=iocs.values_list('value', flat=True)
).exclude(pk=event.pk)
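One caveat that may be worth checking with this variant: the two `__in` lookups test the type and the value independently, so an IOC whose type comes from one of the event's IOCs and whose value comes from another would presumably still match. The plain-Python sketch below illustrates the difference; the `candidate` row is hypothetical and does not appear in the question's table.

```python
# (ioc_type_id, value) pairs of the event being searched (eventid2 in the question).
source_pairs = {(8, "some-value3"), (1, "some-value1")}

# Hypothetical IOC on some other event: its type and its value each occur on
# eventid2, but never together on the same row.
candidate = (1, "some-value3")

types = {type_id for type_id, _ in source_pairs}
values = {value for _, value in source_pairs}

# Two independent membership tests, mirroring the pair of __in lookups.
independent_match = candidate[0] in types and candidate[1] in values

# Pairwise test: the exact (type, value) combination must exist.
pairwise_match = candidate in source_pairs

print(independent_match, pairwise_match)  # True False
```

On the sample data in the question both tests give the same answer, so the difference only shows up once such mixed rows exist.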
Let’s see if I understand…
def search_matches(event):
    get_values = lambda data, key: [x[key] for x in data]
    data_to_filter = event.iocs.all().values('ioc_type', 'value').order_by('id')
    return Event.objects.filter(
        iocs__ioc_type__in=get_values(data_to_filter, 'ioc_type'),
        iocs__value__in=get_values(data_to_filter, 'value')
    ).exclude(pk=event.id)
First, get the ioc_type and value of each IOCInfo that belongs to the specified event:
event_filters = IOCInfo.objects.filter(event=event).values(iocs__ioc_type=F("ioc_type"), iocs__value=F("value"))
Then construct a Q object that matches an event when any one of those (ioc_type, value) pairs is satisfied. Finally, use this Q object to filter your events:
filtering_kwargs = Q()
for filter_ in event_filters:
    filtering_kwargs |= Q(**filter_)
matching_events = Event.objects.filter(filtering_kwargs).exclude(id=event.id)
Hope that solves your problem, since the database is queried only twice no matter how large your data is.
Combining everything into a single function:
from django.db.models import F, Q

def search_matches(event):
    event_filters = IOCInfo.objects.filter(event=event).values(iocs__ioc_type=F("ioc_type"), iocs__value=F("value"))
    filtering_kwargs = Q()
    for filter_ in event_filters:
        # Each filter_ holds one (ioc_type, value) pair; OR the pairs together.
        filtering_kwargs |= Q(**filter_)
    matching_events = Event.objects.filter(filtering_kwargs).exclude(id=event.id)
    return matching_events
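For intuition about what the database is being asked to do, here is a rough hand-written SQL equivalent of the pairwise match, run against an in-memory SQLite copy of the sample rows. It is a sketch, not the SQL that Django generates: the table name `iocinfo`, the text event ids, and the helper `matching_event_ids` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name and layout; Django derives the real table name
# from the app label and model name.
conn.execute(
    "CREATE TABLE iocinfo ("
    "id INTEGER PRIMARY KEY, value TEXT, event_id TEXT, ioc_type_id INTEGER)"
)
conn.executemany(
    "INSERT INTO iocinfo VALUES (?, ?, ?, ?)",
    [
        (1, "some-value1", "eventid1", 4),
        (2, "some-value2", "eventid1", 8),
        (3, "some-value3", "eventid1", 8),
        (4, "some-value4", "eventid1", 1),
        (5, "some-value3", "eventid2", 8),
        (6, "some-value1", "eventid2", 1),
        (7, "some-value2", "eventid3", 8),
        (8, "some-value3", "eventid4", 8),
    ],
)


def matching_event_ids(conn, event_id):
    """Self-join: find other events with a row sharing (ioc_type_id, value)."""
    sql = """
        SELECT DISTINCT other.event_id
        FROM iocinfo AS mine
        JOIN iocinfo AS other
          ON other.ioc_type_id = mine.ioc_type_id
         AND other.value = mine.value
        WHERE mine.event_id = ?
          AND other.event_id <> ?
        ORDER BY other.event_id
    """
    return [row[0] for row in conn.execute(sql, (event_id, event_id))]


print(matching_event_ids(conn, "eventid2"))  # ['eventid1', 'eventid4']
```

Seen this way, the cost depends mostly on how quickly rows can be looked up by (ioc_type_id, value), so a composite index on those two columns is probably the first thing to try if the query ever does become a bottleneck.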