Most efficient way to filter prime numbers from a list of random numbers in Python

Question:

I have a list filled with random numbers and I want to return the prime numbers from this list. So I created these functions:

from math import sqrt

def is_prime(number):
    for i in range(2, int(sqrt(number)) + 1):
        if number % i == 0:
            return False

    return number > 1

And

def filter_primes(general_list):
    return set(filter(is_prime, general_list))

But I want to improve performance, so how can I achieve this?

Asked By: flpn

Answers:

How about this? I think it’s a little better:

def filter_primes(general_list):
   return filter(is_prime, set(general_list))

This way we don’t call is_prime() for the same number multiple times.
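One way to see the effect is to count how often the primality test runs. The following is only a sketch: `counting_is_prime` and the `sample` list are made up for illustration and are not part of the original answer.

```python
from math import isqrt

calls = 0  # number of times the primality test has been invoked

def counting_is_prime(number):
    """Trial division, instrumented to count how often it is called."""
    global calls
    calls += 1
    if number < 2:
        return False
    return all(number % i for i in range(2, isqrt(number) + 1))

sample = [7, 7, 7, 10, 10, 13]  # made-up input containing duplicates

# Question's order: test every element, then deduplicate the results.
calls = 0
primes_after = set(filter(counting_is_prime, sample))
calls_filter_then_set = calls

# This answer's order: deduplicate first, then test each distinct value once.
calls = 0
primes_before = set(filter(counting_is_prime, set(sample)))
calls_set_then_filter = calls
```

Both orders give the same primes, but the second tests only the distinct values. Note that in Python 3 `filter` returns a lazy iterator, so wrap it in `set(...)` or `list(...)` if you need an actual collection.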

Answered By: Masood Lapeh

  1. The Sieve of Eratosthenes is more efficient than trial division, the method you are using.

  2. Your trial division loop can be made more efficient, taking about half the time. Two is the only even prime number, so treat two as a special case and only deal with odd numbers thereafter, which will halve the work.

My Python is non-existent, but this pseudocode should make things clear:

def isPrime(num)

  // Low numbers.
  if (num <= 1)
    return false
  end if

  // Even numbers
  if (num % 2 == 0)
    return (num == 2)  // 2 is the only even prime.
  end if

  // Odd numbers
  for (i = 3 to sqrt(num) + 1 step 2)
    if (num % i == 0)
      return false
    end if
  end for

  // If we reach here, num is prime.
  return true

end def

That step 2 in the for loop is what halves the work. Having earlier eliminated all even numbers you only need to test with odd trial divisors: 3, 5, 7, …
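For reference, the pseudocode above could be written in Python roughly as follows. This is a sketch of a direct translation, assuming Python 3.8+ for `math.isqrt`; the function name `is_prime` is borrowed from the question.

```python
from math import isqrt

def is_prime(num):
    """Trial division that only tries odd divisors."""
    # Low numbers: 0, 1 and negatives are not prime.
    if num <= 1:
        return False

    # Even numbers: 2 is the only even prime.
    if num % 2 == 0:
        return num == 2

    # Odd numbers: try odd divisors 3, 5, 7, ... up to sqrt(num).
    for i in range(3, isqrt(num) + 1, 2):
        if num % i == 0:
            return False

    # No divisor found, so num is prime.
    return True
```

Using `isqrt` instead of `int(sqrt(num))` avoids floating-point rounding for very large inputs.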

Answered By: rossum

Sieve of Eratosthenes, taking about 0.17 seconds for primes under 10 million on PyPy 3.5 on my device:

from array import array
from math import isqrt

def primes(upper):
    numbers = array('B', [1]) * (upper + 1)

    for i in range(2, isqrt(upper) + 1):
        if numbers[i]:
            low_multiple = i * i
            numbers[low_multiple:upper + 1:i] = array('B', [0]) * ((upper - low_multiple) // i + 1)

    return {i for i, x in enumerate(numbers) if i >= 2 and x}

and the filter function:

filter_primes = primes(10_000_000).intersection

Answered By: Ry-

Three rounds of the Miller-Rabin test ( https://en.wikipedia.org/wiki/Miller%2dRabin_primality_test ) using bases 2, 7, and 61 are known to correctly classify every number below 4,759,123,141, which covers all 32-bit integers. (Python ints themselves are arbitrary-precision, so larger inputs would need a different set of bases.)

This is much, much faster than trial division or sieving if the numbers can be large.

If the numbers cannot be large (i.e., < 10,000,000 as you suggest in comments), then you may want to precompute the set of all primes < 10,000,000, but there are over 600,000 of those.
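The three-base Miller-Rabin test mentioned above might look like this in Python. It is a minimal sketch rather than a tuned implementation; the names `is_prime_mr` and `filter_primes_mr` are made up here, and the result is only guaranteed for inputs below 4,759,123,141.

```python
def is_prime_mr(n):
    """Miller-Rabin with bases 2, 7, 61; deterministic for n < 4,759,123,141."""
    if n < 2:
        return False
    # Handle small prime factors up front; this also covers the bases themselves.
    for p in (2, 3, 5, 7, 61):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in (2, 7, 61):
        x = pow(a, d, n)  # modular exponentiation
        if x == 1 or x == n - 1:
            continue  # this base does not prove n composite
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # base a is a witness that n is composite
    return True

def filter_primes_mr(general_list):
    """Deduplicate first, then keep the values that pass the test."""
    return {n for n in set(general_list) if is_prime_mr(n)}
```

For example, `filter_primes_mr([4, 7, 7, 2147483647])` keeps `7` and `2147483647`.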

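Answered By: Matt Timmermans

def primes_list(num_list):
    # Trial division by the primes below 10 is only sufficient for x < 121 (11 * 11);
    # larger composites such as 121 or 143 would pass this check.
    divs = [2, 3, 5, 7]
    return [x for x in set(num_list)
            if x > 1 and (x in divs or all(x % i != 0 for i in divs))]

This function takes a list, num_list, as a parameter. divs is a hard-coded list of the prime numbers less than 10. A list comprehension then filters the distinct values of num_list, keeping each number greater than 1 that is either in divs or not divisible by any element of divs. Because only the divisors 2, 3, 5, and 7 are tried, the result is correct only for inputs below 121.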

Answered By: Monaheng Ramochele

Here is one more way to find the prime numbers in a range. It is simple and easy, though it tests every divisor below n, so it is slower than stopping at the square root:

def find_prime(n):
    if n <= 1:
        return False
    for i in range(2, n):
        if n % i == 0:
            return False
    return True

n = 10
x = filter(find_prime, range(n))  # any list of numbers works here too
print(list(x))

Answered By: cloudyrathor
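
def is_prime(n):
    if n <= 1:
        return False
    # Only divisors up to sqrt(n) need to be checked.
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

# general_list is the list from the question.
print([x for x in general_list if is_prime(x)])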

Try this. There is no need for filter() at all, since a list comprehension does the same job. If the list contains duplicate elements, you can optimize further by applying set() to general_list first so that each value is tested only once.

Answered By: Sanketkumar7