Saturday, August 3, 2024
HomeUncategorized4B If Statements

# 4B If Statements

I recently stumbled upon this screenshot while researching social media on the train. Of course, it was followed by a cascade of spiteful comments, criticizing this fresh programmer’s attempt to solve a classical problem in computer science. The modulus operation.

Now, in a world where AI is replacing programmers by the minute, taking their jobs and revolutionizing the way we think about code, maybe we should be more open to the thoughts of the fresh new blood of the industry? In fact, the above code is a perfect example of a time-memory tradeoff. You’re trading off your time and at the same time, the computers memory and time as well! Truly a marvelous algorithm!

So I went to work to explore this idea of checking if a number is odd or even by only using comparisons to see how well it works in a real world scenario. Since I’m a great believer in performant code I decided to implement this in the C programming language as it’s by far the fastest language on the planet to this day (thanks to the visionary genius Dennis Richie).

So I started composing

Beautiful! Lets compile the code, disabling optimizations with /Od to make sure that the pesky compiler doesn’t interfere with our algorithm. After compiling we can do a quick test of the program we get some positive results:

However, after doing some further testing I found some problems:

No output! It seems that the program only works for numbers under 11! Going back to the code we can find the issue right after the last if statement, we need more if statements!

Now, this is a time-memory tradeoff, but my time on this earth is limited so I decided to meta-program the if statements using a programmer program in a different programming language. To compensate for this cheating I decided to use the slowest language on the planet, Python (thanks to the visionary genius of Ross van der Gussom).

Nice! Now we can generate a program that solves the even-odd problem for all 8-bit integers!

Would you look at that! It works flawlessly! Now, let’s scale it up to 16 bit!

This gives a nice and thick c file of around 130k lines. Nothing really when looking back at some of the code bases I’ve worked on over the years. Let’s compile!

Beautiful! Our algorithm seems to scale with the data! The executable is around 2 MB, but that’s no match for my beefy gaming rig with a whopping 31.8 GB of memory.

Now, 16 bit is a very cool bitwidth, but as we all know, 32 bit is the holy grail of computing and is the final bitwidth that we need to solve all practical engineering and scientific problems. After all, IPv4 is still standing stronger than ever, 60 years after it was deemed deprecated due to so called “address exhaustion”.

So without further ado, lets scale to our final size. 32 bit is only 65536 times as many numbers as 16 bit, what could go wrong?

So I let the mighty snake do its work and after getting a cup of coffee and getting back to check on the program 48 hours later I was left with a beautiful c file, almost 330 GB in size! Almost certainly among the largest c files in history. My fingers were trembling when I entered the next command, surely MSVC had never before encountered such powerful source code. After abusing the pagefile of my poor, powerful computer for half an hour the following was spat out:

Pathetic!

And not only did the compiler fail us, but when looking into the limits of the Portable Executable format (.exe) for windows, I discovered that it cannot handle more than a measly 4 GB! With more than 4 billion comparisons needed to be encoded into the executable, this is a major obstacle for implementing our algorithm. Even if each comparison would use less than a single byte we would still be too heavy.

However, bad compilers and file formats should not stop us from achieving our dream. After all, all what a compiler does is writing some fancy machine code into a file and the file format is just some structure telling the OS how to put the binary code into memory. Really, we can do that ourselves.

Let’s start by writing an IsEven function in x86-64 assembly as it’s the native language of my Intel powered machine. It looks something like this:

Not really correct asm, but it doesn’t matter much, because we’re gonna compile it into machine code manually.

How did I do this? Well I jumped online, using a mix of my early life experience coding emulators and hacking and looked into the x86(-64) architecture manuals to figure out the correct opcodes and format for each instruction.

… Just kidding, that’s horrible. I asked ChatGPT what the correct opcode was for each instruction and lucky for us it didn’t hallucinate any new extensions to x86-64.

So now we just write a “compiler” to output this code. Note that we will write the opcodes we got from the AI for the instructions directly. Here’s how it looks in our friend python:

While we somewhat deviated from the original vision of the TikTok post, the essence remains the same. We create a long, long, long list of if-statements to determine if any number is even or odd, ignoring any arithmetic operation that would help out.

Running this gives us a nice 40 GB file which contains all 4.2 billion comparisons needed to determine if any 32 bit number is even or odd! Now we just need to write our host program that can load and use these instructions. For added performance (it is very important), I decided to map the file into the address space instead of reading all of it. By doing this, we can just pretend that the entire file is already in memory and let the poor OS deal with fitting a 40 GB blob into virtual memory. After mapping the file with READ and EXECUTE permissions we can call into the code by using a function pointer. It looks like this:

And there we go! We now have everything to check if any 32 bit number is even or odd. Let’s take it for a spin:

Almost! Seems like the algorithm has some issues with signedness, any value over 2^31 seems to give random results. Sad!

Let’s fix the final bug.

It turns out that atoi cannot deal with unsigned pureness, so it failed to parse our big boy numbers. Replacing it with strtoul fixes everything.

As a side note, the program is amazingly performant. For small numbers the results are instantaneous and for the large number close to the 2^32 limit the result is still returned in around 10 seconds. Considering the computer has to read 40 GB of data from disk, map it to physical memory and then let the CPU has a rip of it without many chances of caching is honestly quite mind blowing. For reference, the computer is a Core i5 12600K with 32 GB memory and the files are residing on a M.2 SSD disk. While calculating, the peak read speed I saw from the SSD was around 800 MB/s (which doesn’t really make sense as that should give execution speeds at 40+ seconds, but computers are magical so who knows what is going on).

And there we have it! The Internet proven wrong once again, not only can you actually write a fully functioning and performant program in the manner of the TikTok post, but it’s also very fun.

RELATED ARTICLES