Compare and iterate over two 2D arrays

Question:

I have two 2D arrays like these:

mix = [[1,'Blue'],[2,'Black'],[3,'Black'],[4,'Red']]

possibilities = [[1,'Black'],[1,'Red'],[1,'Blue'],[1,'Yellow'],
         [2,'Green'],[2,'Black'],
         [3,'Black'],[3,'Pink'],
         [4,'White'],[4,'Blue'],[4,'Yellow'],
         [5,'Purple'],[5,'Blue']
        ]

I want to loop through the possibilities list, find the exact index at which it matches the mix list, then append the correct mix into a new list. If it does not match the mix list, append it into a "bad list" and then move on to the next iteration. For now this is the idea I had (note it totally DOES NOT work! 🙂):

i = 0
j = 0
bad_guess = []
correct_guess = []
while i < len(mix):
    while possibilities[j] != mix[i]:
        j += 1
    if possibilities[j] == mix[i]:
        i +=1
        correct_guess.append(possibilities[j])
        j = 0
    elif possibilities[j] != mix[i]:
        bad_guess.append(mix[i])
        break

Output
Basically the output I would want for this example is:

correct_guess = [[1,'Blue'],[2,'Black'],[3,'Black'],[5,'Purple']]
bad_guess = [4,'Red']
Asked By: Romain_NewPython

Answers:

This should do the job:

mix = [[1,'Blue'],[2,'Black'],[3,'Black'],[4,'Red']]

possibilities = [[1,'Black'],[1,'Red'],[1,'Blue'],[1,'Yellow'],
         [2,'Green'],[2,'Black'],
         [3,'Black'],[3,'Pink'],
         [4,'White'],[4,'Blue'],[4,'Yellow'],
         [5,'Purple'],[5,'Blue']
        ]

bad_guess = []
correct_guess = []
for i in mix:
    if i in possibilities:
        correct_guess.append(i)
    else:
        bad_guess.append(i)
Answered By: şamil arınç

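mix = {1: 'Blue', 2: 'Black', 3: 'Black', 4: 'Red'}

possibilities = [[1,'Black'],[1,'Red'],[1,'Blue'],[1,'Yellow'],
     [2,'Green'],[2,'Black'],
     [3,'Black'],[3,'Pink'],
     [4,'White'],[4,'Blue'],[4,'Yellow'],
     [5,'Purple'],[5,'Blue']
    ]

good_guess = []
bad_guess = []

for poss in possibilities:
    number, color = poss
    # mix.get() returns None for numbers missing from mix (such as 5),
    # so those entries fall through to bad_guess instead of raising KeyError
    if mix.get(number) == color:
        if poss not in good_guess:
            good_guess.append(poss)
    elif poss not in bad_guess:
        bad_guess.append(poss)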

You could try making the mix list a dictionary instead, so you don’t need to iterate over it.

Answered By: Christian Trujillo

There are a number of ways of solving this. The simple way is to loop through the lists, but that’s not very pythonic. You can remove the inner loop using a containment check:

bad_guess = []
correct_guess = []
for item in mix:
    if item in possibilities:
        correct_guess.append(item)
    else:
        bad_guess.append(item)

The in operator is going to do a linear search through possibilities at every iteration. For a small list like this, it’s probably fine, but for something larger, you will want a faster lookup.

Faster lookup is offered by sets. Unfortunately, sets cannot contain non-hashable types such as lists. The good news is that they can contain tuples:

mix = [tuple(item) for item in mix]
possibilities = {tuple(item) for item in possibilities}
bad_guess = []
correct_guess = []
for item in mix:
    if item in possibilities:
        correct_guess.append(item)
    else:
        bad_guess.append(item)
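
To make the linear-search versus set-lookup difference described above concrete, here is a rough timing sketch. The enlarged data set and the names possibilities_list, possibilities_set, needle_list and needle_tuple are made up for illustration; the numbers will vary by machine, so treat it as a way to measure, not as a benchmark result.

```python
import timeit

# Hypothetical larger data set, built only so the difference is measurable.
possibilities_list = [[n, color]
                      for n in range(2000)
                      for color in ('Black', 'Red', 'Blue')]
possibilities_set = {tuple(item) for item in possibilities_list}

# The last element is the worst case for a linear search through the list.
needle_list = [1999, 'Blue']
needle_tuple = tuple(needle_list)

# Time 200 membership checks against each container.
list_time = timeit.timeit(lambda: needle_list in possibilities_list, number=200)
set_time = timeit.timeit(lambda: needle_tuple in possibilities_set, number=200)

print(f"list lookup: {list_time:.6f} s")
print(f"set lookup:  {set_time:.6f} s")
```

On a list this size the set lookup should come out well ahead, and the gap widens as possibilities grows.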

Another way to get the same result is to first sort mix by whether an item appears in possibilities or not, and then use itertools.groupby to create the output lists. This approach is fun to parse, but is not particularly legible, and therefore not recommended:

import itertools

# False sorts before True, so the "bad" group comes first
key = lambda item: item in possibilities
bad_guess, correct_guess = (list(g) for k, g in itertools.groupby(sorted(mix, key=key), key=key))

This last method is also algorithmically slower than the set-based loop: sorting is an O(N log N) operation, whereas each lookup in a set is O(1) on average, so the loop over mix is O(N) overall.
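
One more caveat with the groupby version: the two-name unpacking only works when both groups are non-empty. If every item matches (or none does), groupby yields a single group and the unpacking raises ValueError. Here is a minimal sketch of a more forgiving variant; the helper name split_by_membership and the shortened possibilities set are made up for illustration.

```python
import itertools

def split_by_membership(items, allowed):
    """Return (bad, correct) lists; either one may be empty."""
    key = lambda item: item in allowed
    # Pre-fill both groups so a missing one simply stays empty.
    groups = {False: [], True: []}
    # sorted() is stable, so items keep their original order within a group.
    for matched, group in itertools.groupby(sorted(items, key=key), key=key):
        groups[matched] = list(group)
    return groups[False], groups[True]

mix = [(1, 'Blue'), (2, 'Black'), (3, 'Black'), (4, 'Red')]
possibilities = {(1, 'Blue'), (2, 'Black'), (3, 'Black'), (4, 'White')}

bad_guess, correct_guess = split_by_membership(mix, possibilities)
print(correct_guess)  # [(1, 'Blue'), (2, 'Black'), (3, 'Black')]
print(bad_guess)      # [(4, 'Red')]
```

The plain for loop is still easier to read; this only shows what it takes to make the groupby approach safe.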

Answered By: Mad Physicist

As is often the case in Python, you don’t actually have to muck about with indices at all to do this.

First, here is a simple solution using your existing data structure. Iterate over mix, and append each item to the appropriate list, depending on whether it’s in possibilities or not. (This is the same idea presented in this answer to "How to split a list based on a condition?")

mix = [
    [1, 'Blue'],
    [2, 'Black'],
    [3, 'Black'],
    [4, 'Red'],
]

possibilities = [
    [1, 'Black'], [1, 'Red'], [1, 'Blue'], [1, 'Yellow'],
    [2, 'Green'], [2, 'Black'],
    [3, 'Black'], [3, 'Pink'],
    [4, 'White'], [4, 'Blue'], [4, 'Yellow'],
    [5, 'Purple'], [5, 'Blue'],
]

correct_guesses = []
bad_guesses = []

for item in mix:
    if item in possibilities:
        correct_guesses.append(item)
    else:
        bad_guesses.append(item)

print(correct_guesses)
print(bad_guesses)

Output:

[[1, 'Blue'], [2, 'Black'], [3, 'Black']]
[[4, 'Red']]

However, this does a lot of unnecessary looping. Each time you check item in possibilities, the code has to iterate over possibilities (which is a list) to see whether or not item is there.

As others have commented, the issue here is your data structure. Instead of a list, possibilities could be a dictionary. Checking whether a dictionary has a given key, or accessing the value associated with a given key, is O(1) on average; essentially it’s "instant", because Python doesn’t have to go looking for it.

possibilities = {
    1: ['Black', 'Red', 'Blue', 'Yellow'],
    2: ['Green', 'Black'],
    3: ['Black', 'Pink'],
    4: ['White', 'Blue', 'Yellow'],
    5: ['Purple', 'Blue']
}

Here each key is an integer, and each value is a list of the colors that number allows. Then your for loop would look like this, checking whether the color is one allowed for that number:

for item in mix:
    number, color = item
    if color in possibilities[number]:
        correct_guesses.append(item)
    else:
        bad_guesses.append(item)

Do you see the problem, though? We’re still doing the same thing: using in on a list. We could turn each of those lists into a set instead, which can much more efficiently check whether or not it contains something:

possibilities = {
    1: {'Black', 'Red', 'Blue', 'Yellow'},
    2: {'Green', 'Black'},
    3: {'Black', 'Pink'},
    4: {'White', 'Blue', 'Yellow'},
    5: {'Purple', 'Blue'}
}

The for loop would remain the same.


With all that in mind, here’s a complete solution. I’ve also changed the two-item lists to tuples, which makes no functional difference in this case, but is more idiomatic.

mix = [
    (1, 'Blue'),
    (2, 'Black'),
    (3, 'Black'),
    (4, 'Red'),
]

possibilities = {
    1: {'Black', 'Red', 'Blue', 'Yellow'},
    2: {'Green', 'Black'},
    3: {'Black', 'Pink'},
    4: {'White', 'Blue', 'Yellow'},
    5: {'Purple', 'Blue'}
}

correct_guesses = []
bad_guesses = []

for item in mix:
    number, color = item
    if color in possibilities[number]:
        correct_guesses.append(item)
    else:
        bad_guesses.append(item)

print(correct_guesses)
print(bad_guesses)

Output:

[(1, 'Blue'), (2, 'Black'), (3, 'Black')]
[(4, 'Red')]
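
One thing the loop above assumes is that every number in mix is also a key of possibilities; if it isn’t, possibilities[number] raises KeyError. Here is a minimal sketch, assuming you’d want unknown numbers treated as bad guesses. The classify helper, the (6, 'Green') entry and the shortened possibilities dictionary are made up for illustration.

```python
def classify(mix, possibilities):
    """Split mix into (correct, bad), tolerating numbers missing from possibilities."""
    correct_guesses, bad_guesses = [], []
    for item in mix:
        number, color = item
        # .get() falls back to an empty tuple for unknown numbers,
        # so the membership test is simply False instead of raising KeyError.
        if color in possibilities.get(number, ()):
            correct_guesses.append(item)
        else:
            bad_guesses.append(item)
    return correct_guesses, bad_guesses

mix = [(1, 'Blue'), (4, 'Red'), (6, 'Green')]
possibilities = {
    1: {'Black', 'Red', 'Blue', 'Yellow'},
    4: {'White', 'Blue', 'Yellow'},
}

correct_guesses, bad_guesses = classify(mix, possibilities)
print(correct_guesses)  # [(1, 'Blue')]
print(bad_guesses)      # [(4, 'Red'), (6, 'Green')]
```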

P.S. Ideally you’d change how you generate possibilities in the first place, but if you’re faced with a situation where you have the lists of lists, and want to convert it into the corresponding dictionary of sets, this would work:

possibilities_list = [
    [1, 'Black'], [1, 'Red'], [1, 'Blue'], [1, 'Yellow'],
    [2, 'Green'], [2, 'Black'],
    [3, 'Black'], [3, 'Pink'],
    [4, 'White'], [4, 'Blue'], [4, 'Yellow'],
    [5, 'Purple'], [5, 'Blue'],
]

possibilities_dict = {}

for num, color in possibilities_list:
    if num in possibilities_dict:
        possibilities_dict[num].add(color)
    else:
        possibilities_dict[num] = {color}

Or using defaultdict to simplify the logic:

from collections import defaultdict

possibilities_dict = defaultdict(set)

for num, color in possibilities_list:
    possibilities_dict[num].add(color)
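
A small caveat with defaultdict: indexing it with a number that isn’t there silently creates an empty entry. If that matters to you, one option (a sketch, not the only way) is to convert it to a plain dict once it’s built and use .get() for lookups. The shortened possibilities_list and the name grouped are made up for illustration.

```python
from collections import defaultdict

possibilities_list = [[1, 'Black'], [1, 'Blue'], [2, 'Black'], [4, 'White']]

# Group the colors by number, as in the defaultdict version above.
grouped = defaultdict(set)
for num, color in possibilities_list:
    grouped[num].add(color)

# A plain dict no longer auto-creates entries for missing keys.
possibilities_dict = dict(grouped)

print('Blue' in possibilities_dict.get(1, set()))  # True
print('Red' in possibilities_dict.get(6, set()))   # False
print(6 in possibilities_dict)                     # False
```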
Answered By: CrazyChucky