Find the first duplicate number for which the second occurrence has the minimal index

Question:

This is a question on codefights:

Given an array a that contains only numbers in the range from 1 to
a.length, find the first duplicate number for which the second
occurrence has the minimal index. In other words, if there are more
than 1 duplicated numbers, return the number for which the second
occurrence has a smaller index than the second occurrence of the other
number does.

I’ve been struggling to figure out how to complete this in Python. I’m not sure whether I’m on the right path, and if I am, I can’t figure out how to look up my index dictionary after finding the relevant values in my d dictionary. My plan is to take every key in d whose count is greater than one, look up those keys in index, and return the key whose value in index is smallest.

If I’m going about this completely wrong then please let me know.

def firstDuplicate(a):
    d = {}
    index = {}

    for i in a:
        if i in d:
            d[i] += 1
        else:
            d[i] = 1

    for i,e in enumerate(a):
        if e in d:
            index[e] = i
        else:
            index[e] = i

    for key,val in d.items():
        if val > 1:
Asked By: user2914144


Answers:

I think you can use an indexing technique here. Since the numbers are in the range 1 to a.length, each value can double as an index into the list: for the element at position i, go to index abs(l[i]) - 1 and negate the element stored there, i.e.

l[abs(l[i]) - 1] = -1 * l[abs(l[i]) - 1]

While doing this, if the element at that index is already negative, the value has been seen before, so return abs(l[i]). Let me know if you have a problem implementing this. Here is the code (I forgot to include it earlier):

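l = [7,4,5,6,4,2,3]
found = 0
for i in range(len(l)):  # visit every element, including the last one
    item = abs(l[i])  # original value, even if this slot was negated earlier
    if l[item - 1] > 0:
        l[item - 1] = -1 * l[item - 1]  # mark item as seen
    else:
        found = item  # slot already negative, so item is a repeat
        break
print(found)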


output : 4

Time complexity: O(n)

Space: O(1) extra, because the input list is modified in place
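
The same sign-marking idea can be wrapped in a function, as a minimal sketch. The function name and the -1 return value for "no duplicate" are assumptions (the -1 convention is borrowed from other answers in this thread). This version works on a copy so the caller's list stays intact, which costs O(n) extra space; drop the copy to keep the O(1) bound.

```python
def first_duplicate_sign_marking(a):
    """Return the value whose second occurrence comes first, or -1 if none.

    Assumes every value is in the range 1..len(a).
    """
    marks = list(a)  # copy, so the caller's list is not modified
    for i in range(len(marks)):
        item = abs(marks[i])  # recover the original value at position i
        if marks[item - 1] < 0:
            # The slot for this value was already negated: second occurrence.
            return item
        marks[item - 1] = -marks[item - 1]  # mark this value as seen
    return -1


print(first_duplicate_sign_marking([7, 4, 5, 6, 4, 2, 3]))
```

Because the values are at least 1, a negated slot can never be mistaken for an unmarked one.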

Answered By: CodeHunter

Your second for loop does the same thing in both the if and the else branch, so let’s change that loop. There’s also no need to store elements that occur fewer than two times, so let’s add that condition as well. The second loop below uses a list comprehension (a well-known idiom) to collect every index at which an element occurs, and stores that list in indexdic. Finally, both dictionaries are printed so you can see what they look like:

def firstDuplicate(a):
    d = {}
    indexdic = {}

    for element in a:
        if a.count(element) > 1:
            if element in d:
                d[element] += 1
            else:
                d[element] = 1

    for key in d:
        indexdic[key] = [i for i, x in enumerate(a) if x == key]

    print('d: ', d)
    print('indexdic: ', indexdic)

Running this:

>>> firstDuplicate(['a','b','c','a','d','d','b'])
d:  {'a': 2, 'd': 2, 'b': 2}
indexdic:  {'a': [0, 3], 'd': [4, 5], 'b': [1, 6]}

Now, with this hint, you need to work out which operations on the indexdic values give the output you want. I’ll let you work that out; it’s an exercise, after all. Let me know if any of the steps is not well described.
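
One possible way to finish the exercise, offered as a sketch rather than as the answerer's intended solution: collect the indices of every element in a single pass, then pick the element whose second index is smallest. The function name and the -1 fallback are assumptions for illustration.

```python
from collections import defaultdict


def first_duplicate_by_second_index(a):
    """Return the element whose second occurrence has the smallest index."""
    positions = defaultdict(list)
    for i, x in enumerate(a):
        positions[x].append(i)  # all indices at which x occurs, in order

    best_value, best_index = -1, len(a)
    for value, idxs in positions.items():
        # idxs[1] is the index of the second occurrence, if there is one.
        if len(idxs) > 1 and idxs[1] < best_index:
            best_value, best_index = value, idxs[1]
    return best_value


print(first_duplicate_by_second_index(['a', 'b', 'c', 'a', 'd', 'd', 'b']))
```

For the example list above, the second occurrences sit at indices 3 ('a'), 5 ('d') and 6 ('b'), so 'a' is returned. Building the index lists in one pass also avoids the repeated a.count(element) scans.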

Emm… What’s wrong with the simple approach?

def firstDuplicate(a):

   aset = set()
   for i in a:
       if i in aset:
           return i
       else:   
           aset.add(i)

print(firstDuplicate([7,4,5,6,4,6,3]))  

Dictionary version:

def firstDuplicate(a):
   adict = {}
   for i in a:
       if i in adict:
           return i
       else:
           adict[i] = 1
Answered By: MBo

As I read in another post, the key is to use a dictionary. Here is a solution for Python 3. The number itself is used as the dictionary key, so you can tell whether you have seen it before.

def firstDuplicate(a):
    oldies={}
    notfound=True
    for i in range(len(a)):
        try:
            if oldies[a[i]]==a[i]:
                notfound=False
                return a[i]
        except KeyError:  # first time this number is seen
            oldies[a[i]]=a[i]
    if notfound:
        return -1
Answered By: Diego Orellana

def firstDuplicate(a):
    lst = list()
    x = -1
    for i in a:
        if i in lst:
            x = i
            break
        else:
            lst.append(i)
    return x

This answer passes 20 of the 22 test inputs. Codefights reports that it exceeds the time limit.

Answered By: Anish

For a faster result, we need to use a set rather than a list, because a membership test on a set takes O(1) time on average instead of O(n).

def first_duplicate(arr):
    uniques = set()
    for item in arr:
        if item in uniques:
            return item
        else:
            uniques.add(item)

    return -1
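
To see the difference between the list and the set on your own machine, a rough timing sketch along these lines may help. The function names, the input size n and the helper compare are all assumptions for illustration; actual timings depend on the machine and are not asserted here.

```python
import timeit


def first_duplicate_list(arr):
    seen = []
    for item in arr:
        if item in seen:  # linear scan of the list on every iteration
            return item
        seen.append(item)
    return -1


def first_duplicate_set(arr):
    seen = set()
    for item in arr:
        if item in seen:  # hash lookup, constant time on average
            return item
        seen.add(item)
    return -1


def compare(n=2000):
    """Time both versions on an input whose only duplicate is the last element."""
    data = list(range(1, n)) + [n - 1]  # n values, all within 1..n
    t_list = timeit.timeit(lambda: first_duplicate_list(data), number=1)
    t_set = timeit.timeit(lambda: first_duplicate_set(data), number=1)
    return t_list, t_set


if __name__ == "__main__":
    t_list, t_set = compare()
    print(f"list: {t_list:.4f}s  set: {t_set:.4f}s")
```

Both functions return the same answer; only the cost of the membership test differs, which is what matters on the large inputs that hit the time limit.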
Answered By: HarutyunyanGG

Solution in JavaScript

function solution(a) {    
    return a.filter((n, i) => a.indexOf(n) !== i)[0] || -1;
}

console.log(solution([2, 1, 3, 5, 3, 2])); // 3
console.log(solution([2, 2])); // 2
console.log(solution([2, 4, 3, 5, 1])); // -1

Answered By: Shahid Nauman

This is a solution in Swift:

  1. create a set

  2. loop through the array

  3. check whether the item is already in the set; if so, return it, otherwise add it to the set

  4. return -1 if none of the items has a duplicate

    func solution(a: [Int]) -> Int {
        var mySet = Set<Int>()  // the element type is required; a bare Set() does not compile
        for num in a {
            if mySet.contains(num) {
                return num
            } else {
                mySet.insert(num)
            }
        }
        return -1
    }

Answered By: Meek

I tested the following Haskell solution, but it still takes more than 4 seconds on a very large list (length >= 100000):

firstRepeated :: [Int] -> [Int] -> Int
firstRepeated [] [] = -1
firstRepeated [] _  = -1
firstRepeated (x:xs) ys
  | any (x==) ys = x
  | otherwise = firstRepeated xs (x:ys)

To call it, just pass the input and an empty list:

firstRepeated a []
Answered By: Brayan David Arias