Mind like a Sieve


Yes, ok, I like prime numbers. I admit it. So what?

One of the key features of prime numbers is their apparent randomness. Just because you know that one number ‘n’ is prime, doesn’t mean that you can predict that n+2 will also be prime, etc. And, given a 100-digit odd number, it’s going to take you a while to determine if it is prime or not. On the flip side though, non-prime numbers do follow specific patterns. That is, if you start out with n=2, every second number after that (4, 6, 8…) will be non-prime. And, starting with n=3, every third number will also be non-prime.

So, if you take a number at random, you can eventually determine if it is prime or not simply by dividing it by every value from 2 to n/2. Very straight-forward. Time-consuming, but straight-forward.

Now, if you want to speed things up, the patterns fall apart.

Say n=101. Sure, you could divide by 2 and see if there’s no remainder, then go to 3, 4, 5, 6, 7, 8, all the way up to 100. That wouldn’t make much sense, but you could. Instead, if you remove all the redundant numbers from the check list (4, 6, 8, 9, 10, 12, 14, 15, etc.), what does that leave you? Answer: With only the PRIME NUMBERS. Therefore, to find out if the non-primes have a pattern, we’re stuck with using apparently random values in the check list (2, 3, 5, 7, 11, etc.) On the other hand, with one more simplification, we can speed the test up.

101 / 2 = 50.5
101 / 3 = 33.67
101 / 5 = 20.2
101 / 7 = 14.4
101 / 11 = 9.2

Yup – we only need to use the prime numbers up to the one just before the square root of n. In this case, 2, 3, 5 and 7. (sqrt(101) = 10.05, and the largest prime less than 10 is 7.) If we were writing a program to do this for us, we could do something like: Look at ‘n’ and take its square root (‘sqr’). Find the first prime number smaller than sqr (pn0), and the next larger prime number (pn1). Square pn1 (pn1^2). And then say that for every number between n and pn1^2, use a prime number check list from 2 to pn0. When n equals pn1^2, pn0 = pn1, and pn1 is the next larger prime number.

Now, yes, I know that in normal cryptography and math applications there are much more elegant algorithms, and my approach isn’t practical, but I want ‘artsy’, not ‘craft’.

Anyway, a fun little pattern is to apply a sieve to all the numbers between 1 and 2 * 3 * 5. I hope it’ll become obvious why in a second.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

(I like visualizing primes as sinewaves that have a half-period zero-crossing at that specific point on the x-axis. Any integer points that don’t already have one or more existing sinewaves crossing there will be prime and therefore will spawn a new sinewave. Here, we have sines for 2, 3 and 5.)

Assuming that in this loop 2, 3 and 5 have already been identified as prime and added to our check list manually, what we now have is a symmetrical string that we can pre-load into an array and say “any location that is crossed-out is non-prime and therefore we don’t have to test it. Substituting the above values with ‘T’ and ‘F’, we get (T=test for prime; F=Number is known to be non-prime):

TFFFFFTFFFTFTFFFTFTFFFTFFFFFTF

We could try making an array 2x3x5x7 elements long, but that’s 210 elements, and adding 7 as a factor only eliminates 7×7, 7×11, 7×13, 7×17, 7×19, 7×23, and 7×29, or 7 extra array element positions of the 210 from the calculations. Depending on how important speed is to the algorithm, this may or may not be worth the work.


(2, 3, 5 and 7, but only for n=0 to 30. Note the sudden lack of symmetry.)

In other words, if we use the above 30-element test loop, set n>5, and pre-load our prime number check list with 2, 3 and 5, we only need to test a number if the test loop position is “true” for that value, and keep repeating the loop as n increments. And we only test with the prime numbers between 2 and whichever one is smaller than the square root of n. Further, if we start with an odd value of n, and increment by 2, then we can remove 2 from our prime number checks. That is, to find all primes between 1 and 100, we only need to test the odd numbers between 7 and 99, using a check list of 3, 5 and 7 (7 gets added to our primes list because it’s prime, and to our check list once n exceeds 5×5 = 25.)

Yes, yes, very boring. So what?

Well, this finally brings me to my art application. I wanted to know which check list prime numbers are the most common lowest prime common denominators for the non-primes.

Example:

n = 2: prime
n = 3: prime
n = 4: lowest prime factor = 2
n = 5: prime
n = 6: lowest prime factor = 2
n = 7: prime
n = 8: lowest prime factor = 2
n = 9: lowest prime factor = 3
n = 10: lowest prime factor = 2
n = 11: prime
n = 12: lowest prime factor = 2
n = 13: prime
n = 14: lowest prime factor = 2
n = 15: lowest prime factor = 3

For the numbers between 1 and 15, 2 shows up as the lowest factor in the non-primes 6 times, and 3 shows up twice. The next higher prime is 5 and it doesn’t show up in the list in this example.

The question then became “what happens between 1 and 100,000?” How long is the check list, and how nonlinear is the distribution?

I surprised myself. The check list only consists of the first 65 prime numbers (from 2 to 313), and the log of the distribution is an ugly “s” shape. Nothing really earthshattering, but it did keep me off the streets for the couple of hours it took to write the code and make the graph in excel.

As a comment, the point behind using the lowest prime factor of n is to remove redundancies from the sieve algorithm. If you write out the numbers from 1 to 100, and then ‘x’ out every second number, you’re identifying 100/2 – 1 = 49 numbers as non-prime. When you ‘x’ out every third number, half of those have already been eliminated when x=2. Sure, 100/3 – 1 = 32 of the numbers are divisible by 3 and therefore are non-prime, but 6, 12, 18, 24, etc. were already identified as non-prime. Who cares if n=3 catches them again? It’s really only useful if x=3 catches 9, 15, 21, 27, etc., because these are the non-primes that x=2 missed. So, if x=2 catches 49 non-primes, x=3 catches (100/3 – 1)/2 = 16 more. x=5 only locates 6 more non-primes, and x=7 gets the last 3. That’s 74 total non-primes, leaving 25 primes (1 doesn’t count), and all 74 non-primes were caught by only 4 prime numbers (2, 3, 5 and 7).

I’m seriously considering writing a Java applet to animate the ‘s’ curve for n=1 to 100,000, but at 100,000 frames, that would be 55 minutes long. That might require a bit of editing…

==============

Shh… Don’t anybody tell me, but I went behind my back and returned to the sieves program to bump the test range from 100,000 to 1,000,000. To keep things simple, I’d written the original program in vsbasic (because it’s easy to use, fast to program in, and free) and did the tests by brute force rather than by using my above algorithm. With the first 100,000 numbers, the program run time was maybe 1-2 minutes, so optimization was unnecessary. But at 1,000,000 numbers, I cancelled out after unsuccessfully waiting several hours. Figuring that the algorithm was now justified, I spent about an hour implementing it. (I’m kind of rusty, and debug took most of my time.) It was worth the work, though, since I got the program run time down to under 20 seconds. For n = 1 to 1,000,000, the non-primes have the first 168 primes (largest is 997) as lowest prime factors. The graph shows the s-curve looking like a nice, clean “s”, too.

It’s still a silly waste of time, but it is a nice looking s.

 

Advertisements
Leave a comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: