Find the smallest positive number not in list

Question:

I have a list in python like this:

myList = [1,14,2,5,3,7,8,12]

How can I easily find the first unused value? (in this case ‘4’)

Asked By: jul

Answers:

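This uses set difference: build the candidates 1..len(l)+1 and remove everything that is in the list. The upper bound of len(l)+1 keeps the result non-empty even when the list has no gap.

>>> l = [1,2,3,5,7,8,12,14]
>>> m = range(1, len(l) + 2)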
>>> min(set(m)-set(l))
4
Answered By: Bhargav Rao

My effort, no itertools. It assumes the list is sorted, and sets “current” to one less than the value you expect next.

values = [1,2,3,4,5,7,8]
current = values[0] - 1
for i in values:
    if i != current + 1:
        print(current + 1)
        break
    current = i
Answered By: Richard Green

I would suggest using a generator with enumerate over the sorted list to find the missing element:

>>> next(a for a, b in enumerate(sorted(myList), min(myList)) if a != b)
4

enumerate pairs each element with a running index, so the goal is to find the first index that differs from its element. This only works on a list in ascending order, so sort it first if needed.
Note that I am not assuming the elements start at a definite value (here, 1). If they always do, you can simplify the expression further:

>>> next(a for a, b in enumerate(sorted(myList), 1) if a != b)
4
Answered By: Abhijit

A solution that returns all those values is

free_values = set(range(1, max(L))) - set(L)

It does a full scan, but those loops are implemented in C, and unless the list or its maximum value is huge this will be a win over more sophisticated algorithms that loop in Python.

Note that if this search is needed to implement “reuse” of IDs, then keeping a free list around and maintaining it up-to-date (i.e. adding numbers to it when deleting entries and picking from it when reusing entries) is often a good idea.
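
As a rough illustration of the free-list idea, here is a minimal sketch. The class name IdAllocator and its methods are made up for this example, and it assumes callers only release IDs that are currently in use.

```python
import heapq


class IdAllocator:
    """Hypothetical helper: hands out the smallest free positive ID."""

    def __init__(self):
        self._free = []   # min-heap of IDs that were released
        self._next = 1    # smallest ID that has never been handed out

    def acquire(self):
        # Released IDs are always smaller than self._next, so prefer them
        if self._free:
            return heapq.heappop(self._free)
        new_id = self._next
        self._next += 1
        return new_id

    def release(self, id_):
        # Assumption: id_ is currently in use and is released only once
        heapq.heappush(self._free, id_)


ids = IdAllocator()
a, b, c = ids.acquire(), ids.acquire(), ids.acquire()
ids.release(b)
reused = ids.acquire()
print(a, b, c, reused)  # 1 2 3 2
```

Each acquire and release costs O(log k) for k free IDs, so the list never has to be rescanned.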

Answered By: 6502

The naive way is to traverse the list, which is an O(n) solution. However, if the list is sorted and has no duplicates, you can use a modified binary search. Basically, you are looking for the last position i where A[i] == i + 1 (with 0-based indices), i.e. the last element before the first gap.

The algorithm will be something like:

def binarysearch(A):
    start = 0
    end = len(A) - 1
    result = 0  # stays 0 when even A[0] is not 1, so 1 is returned
    while start <= end:
        mid = (start + end) // 2
        if A[mid] == mid + 1:
            result = A[mid]
            start = mid + 1
        else:  # A[mid] > mid + 1, since A[mid] cannot be less than mid + 1
            end = mid - 1
    return result + 1

This is an O(log n) solution. It assumes the values start at 1; adjust the offset accordingly if they do not.

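EDIT: if the list is not sorted, you can use Python's heapq library to turn a copy of the list into a min-heap and then pop the elements one by one (assuming distinct positive values):

from heapq import heapify, heappop

H = list(A)  # A is the list; heapify works in place and returns None
heapify(H)
count = 1
while H and heappop(H) == count:
    count += 1
print(count)  # first value that was not popped in sequence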
Answered By: Keeto

I came up with several different ways:

Iterate the first number not in set

I didn’t want to get the shortest code (which might be the set-difference trickery) but something that could have a good running time.

This might be one of the best approaches proposed here. My tests show that it can be substantially faster than the set-difference approach, especially if the hole is near the beginning:

from itertools import count, filterfalse # ifilterfalse on py2

A = [1,14,2,5,3,7,8,12]
print(next(filterfalse(set(A).__contains__, count(1))))

The list is turned into a set, whose __contains__(x) method corresponds to x in A. count(1) creates a counter that counts upwards from 1 without end. filterfalse consumes numbers from the counter until it reaches one that is not in the set, and next() returns that first number.

Timing for len(a) = 100000, randomized and the sought-after number is 8:

>>> timeit(lambda: next(filterfalse(set(a).__contains__, count(1))), number=100)
0.9200698399945395
>>> timeit(lambda: min(set(range(1, len(a) + 2)) - set(a)), number=100)
3.1420603669976117

Timing for len(a) = 100000, ordered and the first free is 100001

>>> timeit(lambda: next(filterfalse(set(a).__contains__, count(1))), number=100)
1.520096342996112
>>> timeit(lambda: min(set(range(1, len(a) + 2)) - set(a)), number=100)
1.987783643999137

(note that this is Python 3 and range is the py2 xrange)

Use heapq

The asymptotically good answer: heapq with enumerate

from heapq import heapify, heappop
from functools import partial

# A = [1,2,3] also works
A = [1,14,2,5,3,7,8,12]

end = 2 ** 61      # these are different and neither of them can be the 
sentinel = 2 ** 62 # first gap (unless you have 2^64 bytes of memory).

heap = list(A)
heap.append(end)
heapify(heap)

print(next(n for n, v in enumerate(
     iter(partial(heappop, heap), sentinel), 1) if n != v))

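Now, the one above could be the preferred solution if the whole loop were written in C. Here, though, every heappop call is driven from Python-level code (even though CPython ships a C accelerator for heapq), so it is most probably slower than alternatives that run almost entirely in C.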

Just sort and enumerate to find the first not matching

Or the simple answer with good constants for O(n lg n)

next(i for i, e in enumerate(sorted(A) + [ None ], 1) if i != e)

This might be the fastest of all if the list is almost sorted, because of how Python's Timsort works, but for randomized input the set-difference and “iterate the first number not in set” approaches are faster.

The + [ None ] is necessary for the edge cases of there being no gaps (e.g. [1,2,3]).

Don’t know how efficient, but why not use an xrange as a mask and use set minus?

>>> myList = [1,14,2,5,3,7,8,12]
>>> min(set(xrange(1, len(myList) + 1)) - set(myList))
4

You’re only creating a set as big as myList, so it can’t be that bad 🙂

This won’t work for “full” lists:

>>> myList = range(1, 5)
>>> min(set(xrange(1, len(myList) + 1)) - set(myList))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: min() arg is an empty sequence

But the fix to return the next value is simple (add one more to the masked set):

>>> min(set(xrange(1, len(myList) + 2)) - set(myList))
5
Answered By: Reut Sharabani

sort + reduce to the rescue!

from functools import reduce  # python3
myList = [1,14,2,5,3,7,8,12]
res = 1 + reduce(lambda x, y: x if y-x>1 else y, sorted(myList), 0)
print(res)

Unfortunately, it won’t stop after a match is found and will iterate over the whole list.

Faster (but less fun) is to use a for loop:

myList = [1,14,2,5,3,7,8,12]
res = 0
for num in sorted(myList):
    if num - res > 1:
        break
    res = num
res = res + 1
print(res)
Answered By: glyuck

import itertools as it

next(i for i in it.count(1) if i not in mylist)

I like this because it reads very closely to what you’re trying to do: “start counting, keep going until you reach a number that isn’t in the list, then tell me that number”. However, this is quadratic since testing i not in mylist is linear.

Solutions using enumerate are linear, but rely on the list being sorted and no value being repeated. Sorting first makes it O(n log n) overall, which is still better than quadratic. However, if you can assume the values are distinct, then you could put them into a set first:

myset = set(mylist)
next(i for i in it.count(1) if i not in myset)

Since set containment checks are roughly constant time, this will be linear overall.
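
A quick way to check the caveat about repeated values is the sketch below. The helper names via_enumerate and via_set are illustrative and not part of the answer above.

```python
import itertools as it


def via_enumerate(values):
    # Relies on sorted, distinct values starting at 1;
    # the trailing None covers the case where there is no gap
    return next(i for i, v in enumerate(sorted(values) + [None], 1) if i != v)


def via_set(values):
    seen = set(values)  # duplicates collapse and order is irrelevant
    return next(i for i in it.count(1) if i not in seen)


dupes = [1, 1, 2, 3]
print(via_enumerate(dupes))  # 2, which is wrong because 2 is in the list
print(via_set(dupes))        # 4
```

The enumerate version goes wrong as soon as a repeated value shifts the indices, while the set version is unaffected.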

Answered By: lvc

The following solution loops over all numbers between 1 and the length of the input list and breaks out of the loop as soon as a number is not found in it. Otherwise the result is the length of the list plus one.

listOfNumbers = [1,14,2,5,3,7,8,12]
for i in range(1, len(listOfNumbers) + 1):
    if i not in listOfNumbers:
        nextNumber = i
        break
else:
    nextNumber = len(listOfNumbers) + 1
Answered By: user9595914

A for loop with the list will do it.

l = [1,14,2,5,3,7,8,12]
for i in range(1, max(l) + 2):  # + 2 so a gap-free list still ends on max(l) + 1
    if i not in l: break
print(i) # result 4
Answered By: silgon

You can try this:

for i in range(1, max(arr1) + 2):
    if i not in arr1:
        print(i)
        break
Answered By: ravi tanwar

The easiest way would be to loop through the sorted list and check whether each index equals its value; if not, return the index as the solution.
This has complexity O(n log n) because of the sorting:

for index, value in enumerate(sorted(myList)):
    if index != value:  # compare values with !=; "is not" tests identity
        print(index)
        break

Another option is to use Python sets, which are roughly dictionaries with keys but no values. Membership lookup in a set takes constant time on average, which makes the whole solution linear, O(n):

mySet = set(myList)
for i in range(len(mySet)):
    if i not in mySet:
        print(i)
        break

Edit:
If the solution should also handle lists where no number is missing (e.g. [0,1]) by returning the next number, and should handle 0 correctly, then a complete solution would be:

def find_smallest_positive_number_not_in_list(myList):
    mySet = set(myList)
    # n distinct values cannot cover 1..n+1, so this range always contains the answer
    # (max(mySet) + 2 would fail on an empty list and return None for all-negative input)
    for i in range(1, len(mySet) + 2):
        if i not in mySet:
            return i
Answered By: gustavz

I just solved this in a probably non pythonic way

def solution(A):
    # Const-ish to improve readability
    MIN = 1
    if not A: return MIN
    # Save re-computing MAX
    MAX = max(A)
    # Check every candidate from 1 up to (but excluding) MAX
    for num in range(1, MAX):
        # Return the first, i.e. smallest, missing number as soon as it is found.
        # To find the greatest missing number instead, start at MAX and count backwards.
        if num not in A: return num
    # No gap below MAX: the answer is MAX + 1, or MIN when MAX is not positive
    return max(MIN, MAX+1)

I think it reads quite well

Answered By: MrMesees

Easy to read, easy to understand, gets the job done:

def solution(A):
    smallest = 1
    # Set iteration order is not guaranteed, so sort the unique values first
    for value in sorted(set(A)):
        if value == smallest:
            smallest += 1
    return smallest
Answered By: Dev

Keep incrementing a counter in a loop until you find the first positive integer that’s not in the list.

def getSmallestIntNotInList(number_list):
    """Returns the smallest positive integer that is not in a given list"""
    i = 0
    while True:
        i += 1
        if i not in number_list:
            return i    

print(getSmallestIntNotInList([1,14,2,5,3,7,8,12]))
# 4

On this eight-element list, I found that this was faster than the other answers on this post. Results on larger lists may differ, since membership tests on a list are linear. I tested using timeit in Python 3.10.8. My performance results can be seen below:

import timeit

def findSmallestIntNotInList(number_list):
    # Infinite while-loop until first number is found
    i = 0
    while True:
        i += 1
        if i not in number_list:
            return i

t = timeit.Timer(lambda: findSmallestIntNotInList([1,14,2,5,3,7,8,12]))
print('Execution time:', t.timeit(100000), 'seconds')
# Execution time: 0.038100800011307 seconds

import timeit

def findSmallestIntNotInList(number_list):
    # Loop with a range to len(number_list)+1 
    for i in range (1, len(number_list)+1):
        if i not in number_list:
            return i

t = timeit.Timer(lambda: findSmallestIntNotInList([1,14,2,5,3,7,8,12]))
print('Execution time:', t.timeit(100000), 'seconds')
# Execution time: 0.05068870005197823 seconds

import timeit

def findSmallestIntNotInList(number_list):
    # Loop with a range to max(number_list) (by silgon)
    # https://stackoverflow.com/a/49649558/3357935
    for i in range (1, max(number_list)):
        if i not in number_list:
            return i

t = timeit.Timer(lambda: findSmallestIntNotInList([1,14,2,5,3,7,8,12]))
print('Execution time:', t.timeit(100000), 'seconds')
# Execution time: 0.06317249999847263 seconds

import timeit
from itertools import count, filterfalse

def findSmallestIntNotInList(number_list):
    # iterate the first number not in set (by Antti Haapala -- Слава Україні)
    # https://stackoverflow.com/a/28178803/3357935
    return(next(filterfalse(set(number_list).__contains__, count(1))))

t = timeit.Timer(lambda: findSmallestIntNotInList([1,14,2,5,3,7,8,12]))
print('Execution time:', t.timeit(100000), 'seconds')
# Execution time: 0.06515420007053763 seconds

import timeit

def findSmallestIntNotInList(number_list):
    # Use property of sets (by Bhargav Rao)
    # https://stackoverflow.com/a/28176962/3357935
    m = range(1, len(number_list))
    return min(set(m)-set(number_list))

t = timeit.Timer(lambda: findSmallestIntNotInList([1,14,2,5,3,7,8,12]))
print('Execution time:', t.timeit(100000), 'seconds')
# Execution time: 0.08586219989228994 seconds
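
The timings above all use the same eight-element list. To repeat the comparison on larger inputs, something like the following sketch could be used. The helper make_input, the sizes and the repeat count are arbitrary choices for illustration, and no timings are claimed here.

```python
import random
import timeit
from itertools import count, filterfalse


def scan_list(number_list):
    # One linear membership test per candidate
    i = 0
    while True:
        i += 1
        if i not in number_list:
            return i


def scan_set(number_list):
    # One O(n) set build, then O(1) average membership tests
    return next(filterfalse(set(number_list).__contains__, count(1)))


def make_input(n, missing, seed=0):
    """Hypothetical test data: shuffled 1..n with `missing` removed."""
    values = [v for v in range(1, n + 1) if v != missing]
    random.Random(seed).shuffle(values)
    return values


data = make_input(n=2000, missing=1500)
for fn in (scan_list, scan_set):
    elapsed = timeit.timeit(lambda: fn(data), number=5)
    print(fn.__name__, elapsed, 'seconds')
```

Varying n and missing shows how each approach scales with the list size and the position of the gap.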
Answered By: Stevoisiak

I believe this is the best approach. After sorting, I deal with non-positive values (<= 0) and duplicates, then scan the array for the first missing integer. If there is no gap, the answer is the last integer in the array plus one.

def getSmallInt(A):
    # Work on a sorted copy that holds only the positive values
    A = sorted(x for x in A if x > 0)
    if not A or A[0] > 1:
        return 1
    # Compare neighbours: a difference of 0 is a duplicate, 1 is consecutive
    for j, nxt in zip(A, A[1:]):
        if nxt - j > 1:
            return j + 1
    # No gap: the answer follows the largest value
    return A[-1] + 1

if __name__ == "__main__":
    myList = [1,14,2,5,3,7,8,12]
    print(getSmallInt(myList))
Answered By: Bumho Nisubire