Generating prime numbers using Numpy

I’ve been looking at generating primes, and using various element-wise operations in Numpy arrays to do so. To cut the the chase, prime6 below is the fastest implementation. The ones before that tell the story of how I got there.

The following function uses a for loop to generate primes less than 100k and takes about 7 secs on my laptop:

import numpy
import math
def prime2(upto=100000):
    return filter(lambda num: numpy.array([num % factor for factor in range(2,1+int(math.sqrt(num)))]).all(), range(2,upto+1))

If we remove the for loop we cut the time to about 1.3 secs:

import numpy
import math
def prime(upto=100000):
    return filter(lambda num: (num % numpy.arange(2,1+int(math.sqrt(num)))).all(), range(2,upto+1))

By not including even numbers, we can run in half the time again to about .7 secs:

import numpy
import math
def prime3(upto=100000):
    return [2]+filter(lambda num: (num % numpy.arange(3,1+int(math.sqrt(num)),2)).all(), range(3,upto+1,2))

The above is illustrative of efficiencies in using element-wise operations in Numpy arrays rather than looping.

A good way of speeding up the above code even further is by breaking after finding the first modulo equal zero condition. The code below runs in about .4 seconds:

import numpy
import math
def prime4(upto=100000):
    primes=[2]
    for num in range(3,upto+1,2):
        isprime=True
        for factor in range(3,1+int(math.sqrt(num)),2):
             if not num % factor: isprime=False; break
        if isprime: primes.append(num)
    return primes

Ok Scruffy Pete has laid down the challenge in the comments section so I need to get serious about optimization. The code below uses a few Numpy tricks for a zippy implementation and generates 1 million primes in about .03 secs, about 6X faster than Scruffy Pete’s.

import math
import numpy
def prime5(upto=100000):
    primes=numpy.arange(2,upto+1)
    isprime=numpy.ones(upto-1,dtype=bool)
    for factor in primes[:int(math.sqrt(upto))]:
        if isprime[factor-2]: isprime[factor*2-2::factor]=0
    return primes[isprime]

UPDATE 9/3 – did a half sieve style implementation of the above (see comments) which runs a little faster – under .02 secs for primes up to 1 million. Updated for Python 3 and incorporating suggestion in comments from Paul Weemaes

import math
import numpy
def prime6(upto=1000000):
    primes=numpy.arange(3,upto+1,2)
    isprime=numpy.ones((upto-1)//2,dtype=bool)
    for factor in primes[:int(math.sqrt(upto))//2]:
        if isprime[(factor-2)//2]: isprime[(factor*3-2)//2::factor]=0
    return numpy.insert(primes[isprime],0,2)
This entry was posted in programming and tagged , , . Bookmark the permalink.

12 Responses to Generating prime numbers using Numpy

  1. Peter says:

    The Sieve Of Eratosthenes is faster. One possible implementation:

    
    def FastPrimeSieve(max):
        possible_primes =  range(3,max+1, 2)
        curr_index = -1
        max_index = len(possible_primes)
        for latest_prime in possible_primes:
            curr_index +=1
            if not latest_prime : continue
            for index_variable_not_named_j in xrange((curr_index+latest_prime),max_index, latest_prime): possible_primes[index_variable_not_named_j]=0
        possible_primes.insert(0,2)
        return [x for x in possible_primes if x > 0]
    
    
  2. nolfonzo says:

    Very interesting – thanks Bob.

    I notice that the fastest pure python solution on that post runs at about .15 secs on my machine.

    My Numpy solution is about 3X better than that.

    The half/third-sieve Numpy solutions primesfrom3to(n) and primesfrom2to(n) from your link are almost twice as fast as mine, which is probably expected.

    At some point I’ll try to turn mine into a half sieve and get proper timings.

  3. nolfonzo says:

    ok – made the change for the 1/2 sieve (see update). The code now runs comparable to the 1/2 sieve impl in bob’s link. Though dare I say my impl is a little more elegant?

    I thought about having a shot at the 1/3 sieve but after a few minutes thought it may be time to stop the madness.

  4. nolfonzo says:

    I posted my 1/2 sieve implementation on

    http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188

    They run automated tests against the submitted prime scripts, and you can see that mine (nolfonzo_prime6) wasn’t the fastest but held its own against the other numpy implementations clocking in at 12ms for 1 million primes.

    You can look at the scripts and results here: http://ideone.com/oDg2Y

    Some of the faster implementations dispensed with the initial prime array I had, and backed into it from the sieve, or indexes, directly at the end. I guess that’s the way you’re supposed to do a sieve but it does get a little more complicated – more memory efficient though I’m guessing.

    Interesting stuff.

  5. Tom Luo says:

    Pete’s is the fatest my Mac. But I love your first numpy pythonic solution without loops.

  6. babsdoc says:

    Hi,
    I’m not a programmer, but I’m interested in learning python. I’m trying to write a program in python. Can you please help me out?

    I need to generate primes by using 2 lists,
    the first list will be have only one value [2].
    The second list will have odd numbers lets say from 3 to 20 [3,5,7,9,11,13,15,17,19]

    The code should will take the first number 3 and check if 3 % 2 ==0, if it isnt, then 3 should be added to the first list [2,3] the code will then go to the next number 5 and check divisibility in reverse, if 5 % 3 ==0 if not then check if 5 % 2==0 if not then add 5 to the list [2,3,5] and so on.

    May seem funny, but I’m finding it hard to program.

    Also, kindly explain what is happening in the following line of your code:
    return filter(lambda num: (num % numpy.arange(2,1+int(math.sqrt(num)))).all(), range(2,upto+1))

    Regards,
    babsdoc

  7. Apologies if this comment is useless but I tried my own routine for calculating primes under a certain number. It might not be the best but it’s far from the slowest and is easy to understand. It works by doing sieve on a boolean vector and converts all True values to indices. It’s odd because it avoids all division in the code.

    def ajs_primes3a(upto):
    mat = np.ones((upto), dtype=bool)
    mat[0] = False # 0 is not a prime
    mat[1] = False # 1 is (probably) not a prime
    mat[4::2] = False # neither is anything divisible by 2 except 2
    for idx in range(3, int(sqrt(upto)), 2): # go through odd numbers from 3 up
    mat[idx*2::idx] = False # sieve them out
    return np.where(mat == True) # produce a vector where elements = True which gives primes!

    I also wondered whether using a sparse matrix would be quicker.

  8. Pingback: ??N???????????|Python??

  9. Paul Weemaes says:

    Hi nolfonzo,

    Thanx for the wonderful code! I ran into your blog after my pure Python implemention of an atkin sieve for generating primes (slightly adapted to accepts both a min and a max prime as arguments) turned out to be (too) slow when generating > approx. 10^7 primes. Your Numpy implementation is way faster, so it helped me a lot.

    Your code assumes Python 2.x and this causes a minor issue when using Python 3.x: the integer division operator has changed from / in 2.x to // in 3.x. After making this minor change, it worked fine and fast (I tested with Python 3.9 and Numpy 1.19.3).

    Best regards,
    Paul

  10. Paul Weemaes says:

    The loop in Prime6 can be reduced to half the number of iterations:

    for factor in primes[:int(math.sqrt(upto)) // 2]:

    will do 😉

    This saves math.sqrt(upto)) // 2 iterations. The gain in speed is in the order of 10%, of course mostly when upto is relatively large (>10^6)

    Also, although perhaps not very practical, you could nible of a few percents if you don’t prepend 2 to the output. Then there is no more need for the numpy.insert call, which copies(!) the (possible large) narray.

  11. Pingback: Fastest way to list all primes below N

Leave a Reply

Your email address will not be published. Required fields are marked *