Computer programming contest update:
The second round (Consecutive Primes) has closed and I like my chances for winning it. More about that below. It may be awhile before we know the outcome because they still haven't finished judging the first round. Meanwhile, the third round has started and nobody stands a chance against my awesome program. It's another math problem that I've figured out to a greater degree than the problem designers imagined.
This will be the final round before they award the grand prize. Winning 2 out of 3 rounds would guarantee that Intel pays for my vacation this autumn.
Sums of consecutive primes are boring. Just program a sieve algorithm and you're pretty much done.
We have another programmer on THB! The rest of this message will be technical. Read on only if you understand programming and/or want a rare glimpse of stabby source code.
Okay, Phi, the problem I had to solve is simple to describe. Consider a run of k consecutive primes, such as [11083, 11087, 11093]. Is their sum a perfect power? (A number n is a perfect power if it can be expressed as n = mx where m and x are natural numbers greater than 1.) For example, 2 + 3 + 5 + 7 + 11 + 13 + 17 + 19 + 23 = 100 which is a perfect power because 102 = 100. Got it?
My program took as input two 32-bit integers representing the start and end of a range to consider. It then had to find all possible sub-runs of consecutive primes in the range that sum to a perfect power. For example, on the input [1, 29] it would print (in no particular order):
sum(3:5) = 8 = 2^3
sum(17:19) = 36 = 6^2
sum(2:23) = 100 = 10^2
sum(13:19) = 49 = 7^2
sum(5:13) = 36 = 6^2
So right away there's an O(n2) complexity going on. For any full run of n consecutive primes there are (n2 - n)/2 sub-runs contained within. [Note: They refer to this sum(x,y) notation as a "k-pair" where the pair are the bounds (x, y) and k is the number of consecutive primes in the run, including x and y.]
I broke the problem down into four tasks:
1. Generate primes in the range
2. Calculate sums of runs
3. Generate perfect powers in range of the sums
4. Find matches between the sums and the powers
You commented on step 1. Sieve of Eratosthenes is an efficient algorithm for generating the first primes up to 10 million. Unfortunately I had to go up to 4.3 billion (232) in the worst case. Also, I didn't need the primes below the start of the range. I ended up using the Miller-Rabin test. Instead of choosing random numbers (the way this algorithm usually works!) I used the basis {2, 7, and 61} because it's guaranteed to be correct up to 4.7 billion.
Step 2 is more interesting than it sounds. Instead of storing the raw prime numbers in a table and summing them up as needed, I processed the primes into a prefix sum table. To recover the original prime number at index i, use sum[i+1] - sum[i]. That's inconvenient but you get something big in return. The sum of any sequence of primes from index i to j is given by sum[j+1] - sum[i]. Beauty!
From what I gather on the contest forums, many competitors skipped step 3 and headed right into the search. Big mistake! They were trying to use math to solve the problem backwards, "I take the sum and calculate its nth root and see if that's an integer. Then I repeat for all values of n." That's a fool's errand. It's fast to build a table of perfect powers using an integer pow(m, x) function and a priority queue to keep them sorted. Hell, I assigned 39 threads to step 1 and only one thread to step 3 and it still finished first.
Step 4 is where the contest will be won or lost. On a big problem of a hundred million primes, steps 1 through 3 take one second and step 4 takes 15 minutes. Seriously.
I wrote five different search strategies only to throw four of them away when the other one emerged as the fastest on all inputs. I called it the "stride strategy". Each thread would take a stride of length k, starting with k=2. It would look at all runs of k consecutive primes from the beginning of the range to the end. The cool thing about this is that the sums gradually grow larger as you move downrange. Since the power table is also sorted, I can keep an index into it and do short linear searches (usually no more than 2 steps) for values that match the current sum.
That's about it other than multi-threading issues such as load balancing. There was one astonishing result: frequent barriers caused the program to run faster! The stride strategy has excellent cache coherency as long as all of the threads run together as a pack. If one thread fell behind (perhaps to output results) then I'd see some cache thrashing. So I added a "metronome" to resynchronize the threads. The optimum time between resyncs: 600 milliseconds. I was expecting it to be closer to a minute.
Anyway, here's an excerpt of my C++ code. It's the stride strategy, well optimized and commented.
We have another programmer on THB!
Only when I feel like it.
Okay, Phi, the problem I had to solve is simple to describe. Consider a run of k consecutive primes, such as [11083, 11087, 11093]. Is their sum a perfect power? (A number n is a perfect power if it can be expressed as n = mx where m and x are natural numbers greater than 1.) For example, 2 + 3 + 5 + 7 + 11 + 13 + 17 + 19 + 23 = 100 which is a perfect power because 102 = 100. Got it?
My name is the golden ratio. I'm officially a student mathematics. Of course I got it.
You commented on step 1. Sieve of Eratosthenes is an efficient algorithm for generating the first primes up to 10 million. Unfortunately I had to go up to 4.3 billion (232) in the worst case. Also, I didn't need the primes below the start of the range. I ended up using the Miller-Rabin test. Instead of choosing random numbers (the way this algorithm usually works!) I used the basis {2, 7, and 61} because it's guaranteed to be correct up to 4.7 billion.
Clever using the Miller-Rabin test instead of the sieve!
Step 2 is more interesting than it sounds. Instead of storing the raw prime numbers in a table and summing them up as needed, I processed the primes into a prefix sum table. To recover the original prime number at index i, use sum[i+1] - sum[i]. That's inconvenient but you get something big in return. The sum of any sequence of primes from index i to j is given by sum[j+1] - sum[i]. Beauty!
Really creative.
From what I gather on the contest forums, many competitors skipped step 3 and headed right into the search. Big mistake! They were trying to use math to solve the problem backwards, "I take the sum and calculate its nth root and see if that's an integer. Then I repeat for all values of n." That's a fool's errand. It's fast to build a table of perfect powers using an integer pow(m, x) function and a priority queue to keep them sorted. Hell, I assigned 39 threads to step 1 and only one thread to step 3 and it still finished first.
I'd have used the mathematical approach. A few years ago. Root functions are slow.
Step 4 is where the contest will be won or lost. On a big problem of a hundred million primes, steps 1 through 3 take one second and step 4 takes 15 minutes. Seriously.I wrote five different search strategies only to throw four of them away when the other one emerged as the fastest on all inputs. I called it the "stride strategy". Each thread would take a stride of length k, starting with k=2. It would look at all runs of k consecutive primes from the beginning of the range to the end. The cool thing about this is that the sums gradually grow larger as you move downrange. Since the power table is also sorted, I can keep an index into it and do short linear searches (usually no more than 2 steps) for values that match the current sum.
Anyway, here's an excerpt of my C++ code. It's the stride strategy, well optimized and commented.
I've never programmed C++, only a few lines of C. I have also never dealt with threading issues. I was, however, able to get the gist of it. I'd never have thought of doing it like that. I'd say you deserve to win the competition, but I haven't seen the others' contributions.
That's about it other than multi-threading issues such as load balancing. There was one astonishing result: frequent barriers caused the program to run faster! The stride strategy has excellent cache coherency as long as all of the threads run together as a pack. If one thread fell behind (perhaps to output results) then I'd see some cache thrashing. So I added a "metronome" to resynchronize the threads. The optimum time between resyncs: 600 milliseconds. I was expecting it to be closer to a minute.
Funny how those things sometimes work.
aurllcooljay wrote:
The last time I saw anyone in chat was the last stabby chat, over a month ago.We'll have another stabby chat this month. I've turned the corner on my heavy workload and have things under control.
Awesome.
We'll have another stabby chat this month. I've turned the corner on my heavy workload and have things under control.
Funny that you should mention it, I just chatted with someone on THB chat today and yesterday. But it will still be fun to have another stabby chat. Too bad it wasn't last month though, because it would have been on Friday the 13th, and you know what that means...
Slashy chat!
Oh well. I guess this month will just be regular stabby chat.
ITS 0VER 9000!!!!!!!!!!!!
ITS 0VER 9000!!!!!!!!!!!!
What?
dbz VAGETA
http://www.youtube.com/watch?v=SiMHTK15Pik
ezioaltair56 wrote:
http://www.youtube.com/watch?v=SiMHTK15Pik
lmfao
what is this golden ratio?
can it be explained to someone who isn't a math-genius?
what is this golden ratio?
can it be explained to someone who isn't a math-genius?
Wikipedia was the best explanation I found.
so... what now...
It's basically a number that is everywhere in nature, like in the pattern of sunflower seeds on the flower itself, or how the milky way spirals.
It also has a fun few properties, like:
1/phi = phi-1 and phi^2=phi+1.
There's a lot more to it, but I don't feel like explaining (I've been awake for literally ten minutes ).
I like math and statistics, my job revolves around numbers and their analysis, but there is nothing "fun" about an equation like that.
Uh oh, this post is cursed! Just look at the number ----------------------------------------------------------------------------^
So I guess I don't win (or in this case claim victory).
Gotta get rid of that 666...so here's to 667.
Gotta get rid of that 666...so here's to 667.
Whew, thanks.
GopherBlaine wrote:
Gotta get rid of that 666...so here's to 667.Whew, thanks.
Anytime.
Math is not so much about numbers as about abstract concepts.
Math is not so much about numbers as about abstract concepts.
That's true love.
Other people who find maths interesting!?! I didn't know they existed, although I do live in Newcastle
Other people who find maths interesting!?! I didn't know they existed, although I do live in Newcastle
Math is great, but I prefer practical applications to equations and programming feeds. I love statistical applications to real world issues. I could care less how beautiful an equation is if I will never need it for anything.
I'm the polar opposite of Gopher. I couldn't care less about the practical applications. All that matters is beauty and the concept of being free to do whatever you wish.
no, the polar opposite is me.
i hate math, all of it
I'm the polar opposite of Gopher. I couldn't care less about the practical applications. All that matters is beauty and the concept of being free to do whatever you wish.
There's nothing wrong with that at all. But I prefer to be able to see it and use it. I spend all day doing it.
My brain sploded...
My brain sploded...
That's because art and math don't mix at all.
stabguy wrote:
"I take the sum and calculate its nth root and see if that's an integer. Then I repeat for all values of n." That's a fool's errand.Root functions are slow.
What's worse is that they were trying to use the Standard C Library's double precision floating point math functions to calculate the integer nth root of an integer [there is no nroot function per se but one can use pow(x, 1.0/n) or exp(ln(x) / n)]. This is the worst of both worlds: slow and imprecise. It's slow because the math functions will calculate it out to the full 64 bit precision even if it's clear from the first few bits that the answer will not be an integer (e.g. 3.5490.... is not converging to an integer). Imprecise because even double precision floats can't represent all integers exactly. If the answer is 5.000000000001 is that an integer? Ceilings and floors don't help. The best way to know for sure is to turn around and double check if pow(x, k) = n.
Also, they were talking about checking the nth root for all values of n from 2 to 64. Practically the only thing Wikipedia has to say on the subject of detecting perfect powers is that you only have to consider exponents that are prime. Maybe these guys knew that and were playing it close to the vest for competitive advantage.
I'd have used the mathematical approach. A few years ago.
In all honesty, one of the five strategies I implemented and ultimately rejected was a mathematical perfect power detector. It wasn't my first approach but it was definitely the hardest work. I implemented the algorithms outlined in the paper Detecting Perfect Powers in Essentially Linear Time by Daniel J. Bernstein. It calculates nth roots a few bits at a time and aborts when the answer is clearly not going to be an integer. My thought was that it might scale better because it's CPU bound rather than memory bound like my table-based search strategies. No dice. It was by far the slowest of the five strategies (which is no offense to Mr. Bernstein - his paper is geared toward BigInts for which power tables would be impractical).
161803398874989 wrote:
stabguy wrote:
"I take the sum and calculate its nth root and see if that's an integer. Then I repeat for all values of n." That's a fool's errand.Root functions are slow.
What's worse is that they were trying to use the Standard C Library's double precision floating point math functions to calculate the integer nth root of an integer [there is no nroot function per se but one can use pow(x, 1.0/n) or exp(ln(x) / n)]. This is the worst of both worlds: slow and imprecise. It's slow because the math functions will calculate it out to the full 64 bit precision even if it's clear from the first few bits that the answer will not be an integer (e.g. 3.5490.... is not converging to an integer). Imprecise because even double precision floats can't represent all integers exactly. If the answer is 5.000000000001 is that an integer? Ceilings and floors don't help. The best way to know for sure is to turn around and double check if pow(x, k) = n.
Also, they were talking about checking the nth root for all values of n from 2 to 64. Practically the only thing Wikipedia has to say on the subject of detecting perfect powers is that you only have to consider exponents that are prime. Maybe these guys knew that and were playing it close to the vest for competitive advantage.
I'd have used the mathematical approach. A few years ago.In all honesty, one of the five strategies I implemented and ultimately rejected was a mathematical perfect power detector. It wasn't my first approach but it was definitely the hardest work. I implemented the algorithms outlined in the paper Detecting Perfect Powers in Essentially Linear Time by Daniel J. Bernstein. It calculates nth roots a few bits at a time and aborts when the answer is clearly not going to be an integer. My thought was that it might scale better because it's CPU bound rather than memory bound like my table-based search strategies. No dice. It was by far the slowest of the five strategies (which is no offense to Mr. Bernstein - his paper is geared toward BigInts for which power tables would be impractical).
Damn I hope you get paid well. This is the kind of stuff you do for fun? I'm impressed. Very impressed.
Muhahahahah EA56 sent me a message saying
"WHYYYYYYYYYYYYYYYYYYYYY DO YOU HAVE TO FUCKIN TALK ABOUT ME MISFUCKING SPELLING SHIT LEAVE ME THE FUCK ALONE U FUCKING DICK ASSHOLE DOUCH BAG .... ah feel soooo much better"
I couldn't notice that he misspelt "douche".
FUCK U
Time to bring out feathers?
Could we all stop having a go at ea56? Although writing properly woul be helping yourself.
Changing the subject in 3... 2... 1:
Soon the party will be ALL AROUND US!
Changing subject back in 3... 2... 1...:
I would leave him alone if he took 1 second and read his posts before posting them, I do it all the time.
Changing the subject in 3... 2... 1:Soon the party will be ALL AROUND US!
I HOPE THEY LET US GO INSIDE!
Doesn't everyone get annoyed with them repeating thoses phrases 9 million times during that mission?
LisaMurphy wrote:
Changing the subject in 3... 2... 1:Soon the party will be ALL AROUND US!
I HOPE THEY LET US GO INSIDE!
Doesn't everyone get annoyed with them repeating thoses phrases 9 million times during that mission?
i don't get it
Time to bring out feathers?
How about instead of threatening to send feathers we use this --->
(and I don't mean anything violent by it, just trying tell to people when to back off and respect others)
I HOPE THEY LET US GO INSIDE!Doesn't everyone get annoyed with them repeating thoses phrases 9 million times during that mission?
No, I don't get annoyed; I love this part of the game. My daughters and I will say these things to each other all the time and it always makes us laugh.
We're on track.
I feel you, 18, I really do. XD
He sent me a message when we were questioning his age.
He's annoying, but us getting annoyed by him is bothering everyone else even more, so we should stop. Bottle that anger. Use it to make destructive bombs is ACR.
so i geuss u guis dont cear if i to tipe liek im a retart.
Nope, I don't give a shit, because you're just being an asshat.
LET THE FLAMING COMMENCE
O hey, just posting some random stuff while completely ignoring the previous posts.
As long as ridiculous ailments are on the table...
http://en.wikipedia.org/wiki/Elephantitis
I win!!
because i am not good at spelling..................im..annoying ......really
Now can we please stop dwelling on who knows how to spell and who doesn't?
699...
...and 700!