Django: Filter matching objects using 2 fields

Question:

I have the following classes:


class Event(Model):
    ...

class IOCType(Model):
    name = CharField(max_length=50)

class IOCInfo(Model):
    event = ForeignKey(Event, on_delete=CASCADE, related_name="iocs")
    ioc_type = ForeignKey(IOCType, on_delete=CASCADE)
    value = CharField(max_length=50)

Each event has one or several IOCs associated with it, which are stored in the IOCInfo table.

This is what my IOCInfo table looks like after creating some events:

id  value        event_id  ioc_type_id
1   some-value1  eventid1  4
2   some-value2  eventid1  8
3   some-value3  eventid1  8
4   some-value4  eventid1  1
5   some-value3  eventid2  8
6   some-value1  eventid2  1
7   some-value2  eventid3  8
8   some-value3  eventid4  8

What I want to do is to take an event, compare its IOCInfo with those of other events and get back those events that match.

This is what I have done so far. It works, but I fear that as the database grows and the app gains users, this query will become a bottleneck:


def search_matches(event):
    # Start from an empty queryset and OR in one filtered queryset per IOC.
    matches = Event.objects.none()
    for ioc in event.iocs.all():
        matches |= Event.objects.filter(
            iocs__ioc_type=ioc.ioc_type, iocs__value=ioc.value
        )
    # Drop the event we started from.
    return matches.exclude(pk=event.pk)

If I pass eventid2 to the above function, it returns eventid1 and eventid4, as expected.
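
For reference, the matching rule can be modeled in plain Python, without Django, on the rows of the sample table. This is only a sketch of the intended semantics (two events match when they share at least one exact (ioc_type, value) pair); the names ROWS and matching_events are made up for illustration and are not part of the app.

```python
# Rows mirror the sample IOCInfo table: (id, value, event_id, ioc_type_id).
ROWS = [
    (1, "some-value1", "eventid1", 4),
    (2, "some-value2", "eventid1", 8),
    (3, "some-value3", "eventid1", 8),
    (4, "some-value4", "eventid1", 1),
    (5, "some-value3", "eventid2", 8),
    (6, "some-value1", "eventid2", 1),
    (7, "some-value2", "eventid3", 8),
    (8, "some-value3", "eventid4", 8),
]


def matching_events(rows, event_id):
    """Return ids of other events sharing at least one (ioc_type, value) pair."""
    # Pairs owned by the event we are comparing against.
    wanted = {
        (ioc_type, value)
        for _, value, ev, ioc_type in rows
        if ev == event_id
    }
    # Any other event holding one of those exact pairs is a match.
    return {
        ev
        for _, value, ev, ioc_type in rows
        if ev != event_id and (ioc_type, value) in wanted
    }


print(sorted(matching_events(ROWS, "eventid2")))  # ['eventid1', 'eventid4']
```

Any ORM version of the query should return the same events for the same input.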

Any suggestions for improving this function with Django's ORM would be appreciated.

Asked By: wisvem

||

Answers:

If I understand correctly:

  1. Filter the IOCs that belong to the given event:

    iocs = IOCInfo.objects.filter(event=event)
    
  2. Get the events that have the same IOCs (except the given one):

    events = Event.objects.filter(iocs__in=iocs).exclude(pk=event.pk)
    

That results in a single, fairly efficient query.

UPD:

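The query in point 2 compares IOCInfo rows themselves, and each row belongs to exactly one event. To compare the fields of the IOCs instead, replace point 2 with the following: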

events = Event.objects.filter(
    iocs__ioc_type__in=iocs.values_list('ioc_type', flat=True),
    iocs__value__in=iocs.values_list('value', flat=True)
).exclude(pk=event.pk)
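
One thing to keep in mind with two separate `__in` conditions: both apply to the same related IOCInfo row, but the type and the value are checked independently of each other, so a row that combines the type of one IOC with the value of another also passes. The plain-Python sketch below illustrates the difference on made-up rows (eventid5 is hypothetical and does not appear in the question's table); it models the logic only and is not the SQL Django generates.

```python
# Hypothetical (event_id, ioc_type_id, value) rows, made up for illustration.
rows = [
    ("eventid2", 8, "some-value3"),
    ("eventid2", 1, "some-value1"),
    ("eventid4", 8, "some-value3"),  # shares the exact pair (8, some-value3)
    ("eventid5", 8, "some-value1"),  # type from one pair, value from the other
]

# IOCs of the event we are searching matches for.
own = [(t, v) for ev, t, v in rows if ev == "eventid2"]
types = {t for t, _ in own}
values = {v for _, v in own}
pairs = set(own)

others = [r for r in rows if r[0] != "eventid2"]

# Two independent membership tests, like type__in=... plus value__in=...
independent = {ev for ev, t, v in others if t in types and v in values}
# A single test on the (type, value) pair as a whole.
pairwise = {ev for ev, t, v in others if (t, v) in pairs}

print(sorted(independent))  # ['eventid4', 'eventid5']
print(sorted(pairwise))     # ['eventid4']
```

If such mixed combinations cannot occur in the data, both checks give the same result; otherwise the conditions need to be built per pair.
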
Answered By: Egor Wexler

Let’s see if I understand…

def search_matches(event):
    # Pull a single column out of a list of dicts.
    get_values = lambda data, key: [x[key] for x in data]
    data_to_filter = event.iocs.all().values('ioc_type', 'value').order_by('id')

    return Event.objects.filter(
        iocs__ioc_type__in=get_values(data_to_filter, 'ioc_type'),
        iocs__value__in=get_values(data_to_filter, 'value')
    ).exclude(pk=event.id)
Answered By: SoundWave

First, get the ioc_type and value of every IOCInfo row that belongs to the given event:

event_filters = IOCInfo.objects.filter(event=event).values(
    iocs__ioc_type=F("ioc_type"), iocs__value=F("value")
)

Then build a Q object that ORs together one condition per (ioc_type, value) pair, and finally use that Q object to filter the events:

filtering_kwargs = Q()
for filter_ in event_filters:
    filtering_kwargs |= Q(**filter_)
matching_events = Event.objects.filter(filtering_kwargs).exclude(id=event.id)

Hope that solves your problem: the database is queried only twice, no matter how large your data is.

Combining everything in a single function gives:

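from django.db.models import F, Q


def search_matches(event):
    # Query 1: one dict of filter kwargs per IOC of the given event.
    event_filters = IOCInfo.objects.filter(event=event).values(
        iocs__ioc_type=F("ioc_type"), iocs__value=F("value")
    )
    # OR the per-IOC conditions together.
    filtering_kwargs = Q()
    for filter_ in event_filters:
        filtering_kwargs |= Q(**filter_)
    # Query 2: events with a matching IOC, minus the given event.
    matching_events = Event.objects.filter(filtering_kwargs).exclude(id=event.id)
    return matching_events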
Answered By: Dante