4 billion if statements (2023)

This is time efficient* but rather wasteful of space.

The best way to save space is to use a Bloom Filter.

If we capture all the even numbers, that would sadly only give us "Definitely not Even" or "Maybe Even".

But for just the cost of doubling our space, we can use two Bloom filters!

So we can construct one bloom filter capturing even numbers, and another bloom filter capturing odd numbers.

Now we have "Definitely not Even" and "Maybe Even" but also "Definitely not Odd" and "Maybe Odd".

In this manner, we can use the "evens" filter to find the odd numbers and the "odds" filter to find the even numbers.

Having done this, we'll be left with just a handful of unlucky numbers that are recorded as both "Maybe even" and "Maybe odd". These will surely be few enough in number that we can special case these in our if/else block.

The filters as a first-pass will save gigabytes of memory!

gopalv · 2 months ago

> But for just the cost of doubling our space, we can use two Bloom filters!

We can optimize the hash function to make it more space efficient.

Instead of using remainders to locate filter positions, we can use a mersenne prime number mask (like say 31), but in this case I have a feeling the best hash function to use would be to mask with (2^1)-1.

AlotOfReading · 2 months ago

This produced strange results on my ternary computer. I had to use a recursive popcnt instead.

alexjurkiewicz · 2 months ago

You are absolutely write! Let me write a Go program to implement this idea. The bloom filters will take approximately 5gb (for a 1% error rate) and take a few minutes to populate on a modern Macbook Pro.

https://gist.github.com/alexjurkiewicz/1abf05f16fd98aabf380c...

nixpulvis · 2 months ago

How is this time efficient at all? It takes upwards of 40 seconds to compute on large 32bit values.

It's a joke post with some interesting bits and details.

xnorswap · 2 months ago

It's a constant number of lookups, and all good Computer Scientists know that it is therefore an O(1) algorithm.

It is hard to imagine better efficiency than O(1)!

Indeed we could improve it further by performing all evaluations even when we find the answer earlier, ensuring it is a true Constant Time algorithm, safe for use in cryptography.

Sohcahtoa82 · 2 months ago

How are you able to recognize a joke post but not a joke comment?

uplifter · 2 months ago

You're absolutely right. The obvious solution would have been to create a boolean table containing all the pre-computed answers, and then simply use the integer you are testing as the index of the correct answer in memory. Now your isEven code is just a simple array lookup! Such an obvious improvement, I can't believe the OP didn't see it.

And with a little extra work you can shrink the whole table's size in memory by a factor of eight, but I'll leave that as an exercise for the interested reader.

Maxatar · 2 months ago

The comment you're replying to is also a joke, with some interesting bits and details.

nrhrjrjrjtntbt · 2 months ago

r/whoosh

I took an ASIC design class in college, unfortunately with a heavy course load that didn't allow me to focus on it. For our final project we were given a numbered dictionary and asked to design a chip that would accept the characters on a 7 bit interface (ASCII), one character per clock cycle and output the dictionary number on an output interface but I can't remember how wide. We were graded on the size of the resulting ASIC and how many clock cycles it took from the last character in to the number on the output.

I started designing my modules, a ROM, a register with a ROM pointer, etc, etc, writing the Verilog and working out the clock sync between modules. Then I got 'lazy' and wrote a trie tree like implementation in Java, and have it spit out the whole tree in Verilog. It worked and just one clock cycle after the last letter my number would output. Fastest in the class! Also the most number of gates in the class. Surprised I got a 90 grade given I didn't use any of the advanced ASIC design the class taught. The TA didn't know what the hell they were looking at.

weli · 2 months ago

Yep! Something a bit counterintuitive on circuit design is that dedicated transistors will always beat reusing existing components. If we do reuse existing components like ALUs, multipliers, or state machines, we save on chip area but pay the penalty in clock cycles. Your approach was the extreme version of this tradeoff. You essentially unrolled the entire dictionary lookup into pure combinatorial logic (well, with registers for the input characters). One clock cycle latency because you weren't doing any sequential searching, comparing, or state machine transitions just racing electrons through logic gates.

retrac · 2 months ago

It's akin to a compiler unrolling a loop. Uses more RAM (area) but fewer cycles to execute. Hardware synthesis uses many of the same techniques as compilers use to optimize code.

It's a common pitfall for those learning hardware description languages like Verilog, when they think about them like programming languages. If you go "if (calc) res <= a * b;" If res is 32 bits wide then you have instantiated a 32 bit fast multiplier circuit dedicated just to that one operation. This is often not what was intended.

Despite how leaning on the analogy too closely can mislead in that way, the analogy between hardware and software is not a shallow one. A combinatorial circuit is akin to the pure function of functional programming. Anything that can be described as a pure function working on fixed integers or floating point or other discrete data types, can be transformed into a combinatorial circuit. And there are algorithms to do so automatically and often with reasonable efficiency.

Free software synthesis has come a long way in recent years, by the way. There's even several hobbyist projects that can take VHDL or Verilog and produce layouts using TTL chips or even discrete transistor logic with automatic circuit board layout. You can now compile your code directly to circuit board copper masks and a part list.

for i in range(0, 2**8, 2): print(" if (number == "+str(i)+")") print(" printf(\"even\\n\");") print(" if (number == "+str(i + 1)+")") print(" printf(\"odd\\n\");")

ZeroConcerns · 2 months ago

Ah, yes, exactly the pointless diversion I needed for my lunch break. For science: generating a C# switch statement for similar purposes took 7 minutes on similar-ish hardware, but the resulting 99.2GB file could not be opened or compiled ('Stream is too long'), which was slightly disappointing.

Optimization efforts included increasing the internal buffer size of the StreamWriter used to create the source code: this reduced the runtime to around 6 minutes, as well as running a non-debug build, as it was observed that the poor Visual Studio metrics gathering process was contributing significantly to disk activity as well, but that ultimately didn't matter much. So, ehhm, yes, good job on that I guess?

afandian · 2 months ago

Isn't the obvious thing to generate different classes for different ranges over the input? Then the classes could be loaded lazily.

And if you then make the ranges tree-shaped you get logarithmic complexity, which massively cuts down the O(n) of the rather naive chained `if` statements.

jiggawatts · 2 months ago

To match the article, you'd want to directly emit the Intermediate Language (IL) tokens with something like this: https://learn.microsoft.com/en-us/dotnet/api/system.reflecti...

I haven't found any authoritative source, but I strongly suspect that the .NET bytecode format has 32-bit limits all over the place. Maaaybe you could break up the code into functions less than 1 GB in size and then chain them together.

opticfluorine · 2 months ago

I wonder if you could generate it via a Roslyn incremental source generator instead of as a file to bypass this limit. I'm guessing not, but it does sound like fun.

tkapin · 2 months ago

You can totally use source generators for that.

szundi · 2 months ago

Diversion for your whole next week: just generate machine code binary for this

mft_ · 2 months ago

Similar humour if opposite directions to an old favourite: https://joelgrus.com/2016/05/23/fizz-buzz-in-tensorflow/

thih9 · 2 months ago

I expected some job interview meme[1][2] but I did not know this one and it looks like a real story too! Thanks for sharing, that was a fun read.

[1]: https://aphyr.com/posts/342-typing-the-technical-interview

[2]: https://www.richard-towers.com/2023/03/11/typescripting-the-...

waisbrot · 2 months ago

I love the Aphyr posts.

> “Can I use any language?” > > “Sure.” > > Move quickly, before he realizes his mistake.

wiz21c · 2 months ago

Gemini took 4 seconds to answer this prompt: "Here is a number 4200020010101. Think deeply about it and tell me if it is not or or not even."

So if you're concerned with privacy issues, you can run the assembly version proposed in the article locally and be well within the same order of performance.

Let's thank the author of the article for providing a decent alternative to Google.

ah, but the license is not that good we can't reproduce his code.

brap · 2 months ago

>Think deeply about it and tell me if it is not or or not even

I think I just experienced a segfault

sethaurus · 2 months ago

Why do they call it even when you of in the true number of out false odd the number?

iberator · 2 months ago

Hey, why segfault and not stack overflow?

bdangubic · 2 months ago

classified · 2 months ago

> if it is not or or not even

Did you want to test the LLM's grammatical comprehension?

When I'm tired my typing goes bad. I obvioulsy meant: "is that number even odd ?" :-)

moffkalast · 2 months ago

Finally a problem that Microsoft Phi can ace. Probably. Maybe. Some of the time at least.

kspacewalk2 · 2 months ago

Surely you mean Microsoft CoPhiLot 365

kayge · 2 months ago

> any value over 2^31 seems to give random results.

Wow he really lucked out... On his way to perfecting a fully functioning and performant Even/Odd Detector, he stumbled upon a fully functioning and performant Coin Flip Simulator!

thedougd · 2 months ago

blauditore · 2 months ago

This could be obviously done with much less code: Just add "if"s for all even number, and at the end just return "odd" if none of the evens matched. 50% less code!

Or even simpler: If it's 0, return "even". If not, do a recursive call to n-1, if that equals "even", return "odd", otherwise return "even".

But the best way is probably to just use a library. Yes, 500MB of additional dependencies, but then it's a one-liner.

majkinetor · 2 months ago

But then, even numbers will have the worst possible performance.

IncreasePosts · 2 months ago

Good point. Have two programs - one checking every even number and returning odd of not even. And then have a program checking every odd number and returning even if not. Then, a simple program to dispatch to either program randomly, so you end up in the long term with good performance for each.

bobbylarrybobby · 2 months ago

You brought up an important opportunity for optimization. If you know the distribution of your data, it may make more sense to implement it in terms of the odd numbers and leave even numbers as the fallback. It's important to profile with a realistic distribution of data to make sure you're targeting the correct parity of numbers.

layer8 · 2 months ago

You could save stack space by transforming it into a loop. It’s still only O(n)!

xg15 · 2 months ago

> Now, this is a time-memory tradeoff, but my time on this earth is limited so I decided to meta-program the if statements using a programmer program in a different programming language.

  for i in range(2*8):
    if i % 2 == 0:

No comment...

_ache_ · 2 months ago

Yeah... I come here to talk about that. Should have been

  for i in range(0, 2**8, 2):
      print(f"""    if (number == {i})
          puts("even");
      if (number == {i + 1})
          puts("odd");""")

NetMageSCW · 2 months ago

What happens when you try to compute 2**8+1 ?

PurpleRamen · 2 months ago

I think we can improve this. Just make a microservice who generates the code on the fly and streams it to the compiler. Then you also just have to create the necessary code and don't waste the SSD with unused code-paths.

travisgriggs · 2 months ago

I’m disappointed there is no docker image for this. How will I test it out?

cowsandmilk · 2 months ago

How horridly inefficient, he should have just flipped a Boolean on each iteration.

kroolik · 2 months ago

Horridly inefficient. Just unfold the loop.

projektfu · 2 months ago

Just output odd and even for each pass and increment by two. Need to make sure you have the right starting value, and check for off-by-one errors.

catigula · 2 months ago

Claude's version:

  even, odd = "even", "odd"
  for i in range(2\*32):
      print(f'    if (number == {i}) puts("{even}");')
      even, odd = odd, even

As usual, a non-marginally superior mind to commentators.

It should have used a flag that is being toggled.