Supercomputer

This was a four part challenge, in that the binary prints out four different keys, and you get a progressively higher number of points for each one. The program performs some calculations, which are very poorly implemented and run extremely slowly. To solve this challenge, you need to optimize the code to make it compute the answer quickly.

Part One

One of the most obvious speedups is to remove the calls to the sleep() function. Just NOP them out with a hex editor. However, it still runs too slow for even the first key. If you break into the debugger periodically you will see it is commonly stuck in the function at 0x400c6d:

images/1u.jpg

This loop is just performing four adds by incrementing in a loop. Replace these with proper ADD instructions, without the loop:

images/1p.jpg

Now the calculation finishes quickly, and gives us the first key. Challenge one of four solved:

Yay! The first key is 414e0d423f5fcd195a579f95f1ff6525

Part Two

Going back into the debugger waiting for the second key, now the slow function is 0x400edc:

images/2u.jpg

This function is performing a multiply by adding in a loop. Replace it with an actual multiply:

images/2p.jpg

Now we get the second key, and the second of four challenges is solved:

Hooray! The second key is f811f0e8a1f9196e27eef9e23eff6367

Part Three

To get the third key, we again break into the debugger to find out what function is slowing things down. This time it is the function at 0x4010b8:

images/3u.jpg

This function first does another multiply by repeated addition, this time by a constant. Replace this with a single IMUL instruction. The second loop, however, is different. This compares the number to a constant, then adds a negative number to it while it is above that constant. If you negate the negative number, you will see that this is actually just a slow way to do a modulus with a constant. Replace this loop with a DIV instruction:

images/3p.jpg

Now we get the third key, and part three of four is complete:

Hooray! The third key is c9e6d35ed6007b35f7d01a98f6d548fb

Part Four

Now for the final key. Earlier while reversing we noticed a function with some fairly obfuscated code, so we were pretty sure that was going to be the final challenge, but finding the slow loop is easiest by just periodically breaking into the debugger just like before. This will appear to make the slow function be the one at 0x400d18, but this is not really the slow function. Instead, it is called in a loop at 0x401387, and again at 0x401425:

images/4u.jpg

After spending some time trying to figure out how to optimize these loops, we decided to ask the question "If we branch jammed these loops to make them be instant, would any other functions cause it to go too slow?" We put NOPs over the branches at 0x4013c0 and 0x40145e, and it finished. Surprisingly, it also gave us a correct key. We still aren't sure why getting rid of this code was actually correct:

Congratz! The last key is f9b02fabc4b866288d7c4c5bbcd5507e