Counting how many elements of list A appear earlier than they do in list B, a shuffled version of the same list

Question:

A=[2,3,4,1] B=[1,2,3,4]
I need to find how many elements of list A appear at an earlier position in A than the same element does in list B. In this case those are the values 2, 3 and 4, so the expected return value would be 3.

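def count(a, b):
    muuttuja = 0
    # range(len(a)) visits every index, including the last one
    for i in range(len(a)):
        # a[i] counts if b has not reached that value by position i
        if a[i] != b[i] and a[i] not in b[:i]:
            muuttuja += 1

    return muuttuja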

I have tried this kind of solution, but it is very slow on lists with a large number of values. I would appreciate suggestions for alternative ways of doing the same thing more efficiently. Thank you!

Asked By: Iltsukka

Answers:

You can build a prefix count of A: an array that, for each index, records how many times each element has occurred in A up to and including that index.

You can then use it to look up the counts efficiently while looping over B:

import collections

A=[2,3,4,1]
B=[1,2,3,4]

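# prefix_count[i] maps each value to its number of occurrences in A[:i + 1]
prefix_count = [collections.defaultdict(int) for _ in range(len(A))]
prefix_count[0][A[0]] += 1
for i, n in enumerate(A[1:], start=1):
    prefix_count[i] = collections.defaultdict(int, prefix_count[i-1])
    prefix_count[i][n] += 1

# For each B[i], use the counts for A[:i] (one step back), so an element at
# the same index in both lists is not counted as appearing earlier.
prefix_count_b = sum(prefix_count[i-1][n] for i, n in enumerate(B) if i > 0)
print(prefix_count_b)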

This outputs 3.

This could still be O(N²), because building the prefix_count array copies the dictionary from the previous index at every step. If someone knows a better way to do this, please let me know.
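
One possible way to avoid the per-index copy, sketched here under the assumption that only the final total is needed: walk both lists in step and keep a single running count of the values of A seen so far. The name count_earlier is made up for this sketch. It counts occurrences strictly before each index, so an element at the same index in both lists is not counted.

```python
from collections import Counter


def count_earlier(a, b):
    """Sum, over each index i, how often b[i] occurs in a[:i]."""
    seen = Counter()  # occurrences of each value in a[:i]
    total = 0
    for x, y in zip(a, b):
        total += seen[y]  # a Counter returns 0 for missing keys
        seen[x] += 1      # only now does a[i] become part of the prefix
    return total


print(count_earlier([2, 3, 4, 1], [1, 2, 3, 4]))  # 3
```

Because only one Counter is updated in place, each step does a constant amount of work on average, so the loop as a whole is linear in the length of the lists.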

Answered By: Tom McLean

If both lists contain only unique elements, you can build a map from each element (as key) to its index (as value). In Python this can be done with a dictionary. Since a dictionary lookup takes O(1) time on average, this code has a time complexity of O(n).

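A = [2, 3, 4, 1]
B = [1, 2, 3, 4]
d = {}
count = 0
# Map each element of A to its index in A
for i, ele in enumerate(A):
    d[ele] = i
# An element counts if its index in B is larger than its index in A
for i, ele in enumerate(B):
    if i > d[ele]:
        count += 1
print(count)  # 3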
Answered By: Nehal Birla

This only works if the values in your lists are hashable (for example numbers, strings or tuples), because they are used as dictionary keys.


Your method is slow because it has a time complexity of O(N²): checking if an element exists in a list of length N is O(N), and you do this N times. We can do better by using up some more memory instead of time.

First, iterate over b and create a dictionary mapping the values to the first index that value occurs at:

b_map = {}
for index, value in enumerate(b):
    if value not in b_map:
        b_map[value] = index

b_map is now {1: 0, 2: 1, 3: 2, 4: 3}

Next, iterate over a, counting how many elements have an index in a that is less than the index stored for that element in the dictionary we just created:

result = 0
for index, value in enumerate(a):
    if index < b_map.get(value, -1):
        result += 1

Which gives the expected result of 3.

b_map.get(value, -1) is used to protect against the situation when a value in a doesn’t occur in b, and you don’t want to count it towards the total: .get returns the default value of -1, which is guaranteed to be less than any index. If you do want to count it, you can replace the -1 with len(a).
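
To see the effect of the two defaults, here is a small sketch with a value (5) that occurs in a but not in b. The helper name count_with_default and the sample lists are illustrative only and not part of the code above.

```python
def count_with_default(a, b, default):
    # Map each value of b to the first index at which it occurs.
    b_map = {}
    for index, value in enumerate(b):
        if value not in b_map:
            b_map[value] = index
    # Count values whose index in a is smaller than their first index in b;
    # values missing from b are compared against `default` instead.
    return sum(index < b_map.get(value, default)
               for index, value in enumerate(a))


a = [2, 5, 1]
b = [1, 2]

print(count_with_default(a, b, -1))      # 1: the missing value 5 is ignored
print(count_with_default(a, b, len(a)))  # 2: the missing value 5 is counted
```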

The second snippet can be replaced by a single call to sum:

result = sum(index < b_map.get(value, -1) 
             for index, value in enumerate(a))
Answered By: Pranav Hosangadi

Use a set of already seen B-values.

def count(A, B):
    result = 0
    seen = set()
    for a, b in zip(A, B):
        seen.add(b)
        if a not in seen:
            result += 1
    return result
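
As a sanity check, the set-based loop can be compared against a slow but direct reading of the problem on random shuffles. This is only a sketch: it assumes both lists hold the same distinct values, and the names count_by_index and count_with_set as well as the generated test data are made up for the comparison.

```python
import random


def count_by_index(a, b):
    # Quadratic reference: compare each value's position in a and in b.
    return sum(a.index(v) < b.index(v) for v in a)


def count_with_set(a, b):
    # Same logic as the set-based function above.
    result = 0
    seen = set()
    for x, y in zip(a, b):
        seen.add(y)
        if x not in seen:
            result += 1
    return result


random.seed(0)  # fixed seed so the check is repeatable
for _ in range(1000):
    b = list(range(random.randint(0, 20)))  # distinct values
    a = b[:]
    random.shuffle(a)
    assert count_by_index(a, b) == count_with_set(a, b)

print(count_with_set([2, 3, 4, 1], [1, 2, 3, 4]))  # 3
```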
Answered By: Kelly Bundy