Is it possible to find the lists with the maximum number of common integers?

Question:

I have a list containing several lists of integers, and I would like to find the lists that have the most elements in common.

I tried using the intersection, but it returns an empty set, because the intersection here is taken over all the lists at once and no integer appears in every one of them.
I would like my code to show me the lists that share a given number of integers. For example, if I ask for lists with 3 integers in common, it should display those lists.
I have searched a lot online, but I can only find ways to determine whether two lists are identical.

Here is the code for intersection:

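# named "lists" rather than "list" to avoid shadowing the built-in type
lists = [[3,5,9], [4,6,6], [4,7], [2,7], [2,1,4,5], [1,2,4,6], [3,3], [3,3], [3,2,1], [3,2]]
# intersection of ALL the lists at once
result = set.intersection(*map(set, lists))
print(result)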

Here is the result:

set()

but what I want is:

[2,1,4,5],[1,2,4,6]
Asked By: Lea

Answers:

What you want is to filter the pairs of lists, keeping only those whose intersection contains exactly 3 items.

You can get all the pairs by using itertools.combinations and filter them with a list comprehension:

from itertools import combinations

list_ = [[3,5,9], [4,6,6], [4,7], [2,7], [2,1,4,5], [1,2,4,6], [3,3], [3,3], [3,2,1], [3,2]]

# keep each pair of lists whose sets share exactly 3 distinct integers
print([c for c in combinations(list_, r=2) if len(set(c[0]) & set(c[1])) == 3])

The output is as requested:

[([2, 1, 4, 5], [1, 2, 4, 6])]
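
If the required number of common integers should be a parameter rather than hard-coded, the same idea can be wrapped in a function. This is only a minimal sketch: the name `pairs_with_common` and its parameter `k` are hypothetical and not part of the original answer.

```python
from itertools import combinations


def pairs_with_common(lists, k):
    """Return every pair of lists that shares exactly k distinct integers."""
    return [
        (a, b)
        for a, b in combinations(lists, 2)
        if len(set(a) & set(b)) == k  # sets ignore duplicates within a list
    ]


list_ = [[3, 5, 9], [4, 6, 6], [4, 7], [2, 7], [2, 1, 4, 5],
         [1, 2, 4, 6], [3, 3], [3, 3], [3, 2, 1], [3, 2]]

print(pairs_with_common(list_, 3))  # [([2, 1, 4, 5], [1, 2, 4, 6])]
```

Because each list is converted to a set before comparing, a repeated value such as the two 6s in [4, 6, 6] counts only once.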
Answered By: Caridorc

First answer, which is incorrect (I misunderstood the requirements): it finds the lists with the most unique elements.

data = [[3, 5, 9], [4, 6, 6], [4, 7], [2, 7], [2, 1, 4, 5], [1, 2, 4, 6], [3, 3], [3, 3], [3, 2, 1], [3, 2]]

max_unique_elements = 0
holding = []
for data_list in data:
    unique_elements = len(set(data_list))

    if unique_elements > max_unique_elements:
        holding = [data_list]
        max_unique_elements = unique_elements

    elif unique_elements == max_unique_elements:
        holding.append(data_list)

print(holding)

Second (I believe) correct answer. Please note that this is not optimal and, as noted in the comments, will give an incorrect answer if two or more groups of sets share the maximum intersection size (the greatest number of common elements). Also, because the method uses sets, only one occurrence of each element is printed and order is not preserved, e.g. [2, 3, 3, 4, 6] will be output as [3, 2, 4, 6]. I will fix these problems as soon as I have time, but I am on holiday at the moment, and this should give you the gist of how to solve the problem.

data = [[3, 5, 9], [4, 6, 6], [4, 7], [2, 7], [2, 1, 4, 5], [1, 2, 4, 6], [3, 3], [3, 3], [3, 2, 1], [3, 2]]
# largest intersection size seen so far, and the sets that reach it
max_intersection = 0
sets_with_max_intersection = []

# sets remove any duplicates, as duplicates only count once
# (e.g. [4, 6, 6] and [6, 2, 6] only have one element in common)
# this makes processing easier
data_sets = [set(data_list) for data_list in data]

# compare every pair of sets exactly once
for index, data_set_1 in enumerate(data_sets):
    for data_set_2 in data_sets[index + 1:]:
        common = data_set_1.intersection(data_set_2)

        # new greatest intersection found
        if len(common) > max_intersection:
            max_intersection = len(common)
            sets_with_max_intersection = [data_set_1, data_set_2]

        # equal to the current maximum: assume part of the same group and add
        # note: will give an erroneous result if two or more groups of sets
        # have the same number of elements in common
        elif len(common) == max_intersection:
            if data_set_1 not in sets_with_max_intersection:
                sets_with_max_intersection.append(data_set_1)
            if data_set_2 not in sets_with_max_intersection:
                sets_with_max_intersection.append(data_set_2)

print(max_intersection)
print(sets_with_max_intersection)
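
One possible way around the two limitations mentioned above (ties between separate groups, and losing duplicates and order) is to record pairs of the original lists instead of merging sets into one group. The following is a sketch under that assumption, not the author's promised fix; the function name `pairs_with_max_common` is hypothetical.

```python
from itertools import combinations


def pairs_with_max_common(lists):
    """Return (best, pairs): the largest number of distinct integers shared
    by any two lists, and every pair of original lists that reaches it."""
    best = 0
    pairs = []
    for a, b in combinations(lists, 2):
        common = len(set(a) & set(b))
        if common > best:
            # strictly better: discard the pairs recorded so far
            best, pairs = common, [(a, b)]
        elif common == best and best > 0:
            # tie with the current maximum: keep this pair as well
            pairs.append((a, b))
    return best, pairs


data = [[3, 5, 9], [4, 6, 6], [4, 7], [2, 7], [2, 1, 4, 5],
        [1, 2, 4, 6], [3, 3], [3, 3], [3, 2, 1], [3, 2]]

best, pairs = pairs_with_max_common(data)
print(best)   # 3
print(pairs)  # [([2, 1, 4, 5], [1, 2, 4, 6])]
```

Sets are used only to count the common integers, so the lists in the result keep their original order and duplicates. Pairs with nothing in common are never reported, and tied pairs are listed separately rather than merged.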


Answered By: Pioneer_11