Improving the prime sieve code#
An inefficiency with our code is that we compute the prime numbers using two data structures. This wastes memory, but also means we need to check if a values already exists in crossed_out
twice. The second check is needed to make sure we aren’t needlessly appending an already crossed out prime number.
The function prime_sieve2
delivers the same results as prime_sieve
, but more efficiently. The primary source of this saving is by designing our code so that we required only one list
.
Show code cell source
# imports and test cases
import math
n_primes = {10 : 4,
100 : 25,
1000 : 168,
10000 : 1229,
100000 : 9592,
1000000 : 78498,
10000000 : 664579}
def prime_sieve2(n):
# We will use boolean's here instead of ints
candidates = [True] * (n + 1)
candidates[0] = candidates[1] = False
limit = int(math.sqrt(n)) + 1
for i in range(2, limit):
if candidates[i]:
candidates[i+i::i] = [False] * ((n - i) // i)
return [i for i in range(n+1) if candidates[i]]
print(len(prime_sieve2(100)) == n_primes[100])
print(len(prime_sieve2(1000)) == n_primes[1000])
True
True
%timeit prime_sieve2(1000)
36.7 μs ± 245 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
Does the time saving matter?#
You should see a substantial reduction in computational overhead of prime_sieve2
. We are now finding the primes under 1000 in microseconds (µs) as opposed milliseconds in our naive implementation (1 µs = 1000ms!).
Some of you might be thinking “both procedures give the same results and execution takes under a second. So does this type of optimisation matter?” To answer that question let’s compute primes under 10,000. You should now see a much bigger difference with our naive function returning in around a second while prime_sieve2
is still in µs (on my machine still below half a millsecond).
When first writing this section I first tested
prime_sieve
withn=1_000_000
. Needless to say, I quickly decided to use an example of 10,000 instead! When you fancy a cup of tea or coffee setn=1_000_000
and come back after your break to see an enourmous difference in performance!
Show code cell source
# original function
def prime_sieve(n):
'''
Compute the prime numbers between 1 and n.
Naive implementation of the Sieve of Eratosthenes.
Parameters:
----------
n: int
The upper limit
Returns:
-------
list
a list of primes
'''
# a list of candidate numbers to sieve
candidates = [i for i in range(n + 1)]
# a list of numbered eliminated from consideration
crossed_out = [0, 1]
# maximum iterations required
limit = int(math.sqrt(n)) + 1
for i in range(2, limit):
for j in range(i+i, n+1, i):
if not candidates[j] in crossed_out:
crossed_out.append(candidates[j])
return [i for i in candidates if i not in crossed_out]
TEN_THOUSAND = 10_000
%timeit len(prime_sieve(TEN_THOUSAND))
1.09 s ± 11.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit len(prime_sieve2(TEN_THOUSAND))
368 μs ± 1 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Taking a look at the redesign#
The redesigned implementation only has single list candidates
: its contains only bool
values. In python that won’t make much, if any, difference to memory usage, as bool
is really a special type of int. But its a nice simplification as we the index of the list item gives us our factor. This modification provides our biggest speed up. Each time we ‘cross out’ a number e.g. i
we simply set candidate[i] = False
. (We don’t bother checking as we are not appending to a second list). At the end of the function any items that are still True
are the primes.
Don’t believe me about
int
andbool
? Try executing(True + 4)
.
The second change might be considered a ‘micro optimisation’. I.e. list slicing in favour of an inner for
loop.
candidates[i+i::i] = [False] * ((n - i) // i)
Here we start at the current index + one factor ahead e.g. if i=2
then we start at index i+i=4
and select items in steps of i=2
all the way to the end candidates[i+i::i]
(Note we could also have used candidates[i+i:n+1:i]
). In general, a slice will be faster than looping over the list. Here’s a simpler example of list slicing with a step:
list_to_slice = [i for i in range(31)]
list_to_slice[4:len(list_to_slice):2]
[4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30]